GLM-5.2 crashes the coding frontier
Z.ai's MIT-licensed GLM-5.2 lands as the strongest open-weight coding model yet, dropped opportunistically into the vacuum left by Anthropic's government-forced Fable 5 takedown. Meanwhile SpaceX bought Cursor for $60B in stock days after its IPO, and the US government kept finding new ways to entangle itself with frontier labs. The week's quieter signal: post-training recipes and attention architectures are where the real engineering is happening.
GLM-5.2: a 744B open-weight model nips at Opus 4.8 on coding
Z.ai released GLM-5.2 under an MIT license: a 744B-parameter MoE (40B active) with a 1M-token context, high/max effort modes, and GLM-5.1 pricing ($1.4/$4.4 per million input/output tokens). It posts 81.0 on Terminal-Bench 2.1 (vs 85.0 for Opus 4.8) and 62.1 on SWE-bench Pro, and ranks as the top open model on FrontierSWE, Design Arena and Code Arena: Frontend. The headline architecture trick is IndexShare, which shares one sparse-attention indexer across each group of four layers to cut per-token FLOPs by 2.9x at 1M context, plus an improved MTP layer that lifts speculative-decoding acceptance by ~20%.
Why it matters: It is the first open-weight model that practitioners credibly call an Opus/GPT-class substitute for agentic coding, and MIT weights mean you can quantize, fine-tune and self-host it — an obvious hedge against the export-control turmoil hitting closed US frontier models.
- GLM-5.2: Built for Long-Horizon Tasks (Hugging Face)
- [AINews] GLM-5.2: the top Frontend Coding model in the world, IndexShare for Speculative Decoding (Latent Space (swyx))
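To make the IndexShare idea concrete, here is a minimal sketch of what "one indexer per four layers" could look like. It assumes the indexer's job is to pick a top-k set of positions for sparse attention and that the layers in a group simply reuse that set; the function names, the toy scoring function and the sizes (8 layers, 16 positions, k=4) are all hypothetical, and the sketch makes no attempt to reproduce the reported 2.9x figure.

```python
from typing import Callable, Dict, List, Tuple


def select_top_k(scores: List[float], k: int) -> List[int]:
    """Return the positions of the k highest scores, in ascending position order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def shared_selections(
    num_layers: int,
    share_every: int,
    score_fn: Callable[[int], List[float]],
    k: int,
) -> Tuple[Dict[int, List[int]], int]:
    """Run the (hypothetical) indexer once per group of layers and reuse its picks."""
    selections: Dict[int, List[int]] = {}
    indexer_calls = 0
    current: List[int] = []
    for layer in range(num_layers):
        if layer % share_every == 0:
            # First layer of a group: score every position and pick the top k.
            current = select_top_k(score_fn(layer), k)
            indexer_calls += 1
        # Remaining layers in the group reuse the same selection.
        selections[layer] = current
    return selections, indexer_calls


def toy_scores(layer: int) -> List[float]:
    """Stand-in for a learned indexer: deterministic scores over 16 positions."""
    return [((7 * pos + 3 * layer) % 11) / 10.0 for pos in range(16)]


selections, calls = shared_selections(num_layers=8, share_every=4, score_fn=toy_scores, k=4)
print(f"indexer ran {calls} times for 8 layers")
print("layer 0 attends to positions", selections[0])
```

In this toy, the indexer runs twice instead of eight times. The saving on total per-token FLOPs depends on how much of the budget the indexer consumes, which the sketch does not model.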
SpaceX buys Cursor for $60B in stock, days after its IPO
SpaceX closed an all-stock $60B acquisition of Anysphere, maker of Cursor, to help its xAI division catch Anthropic and OpenAI in AI-assisted coding. Cursor employees had already been embedded at xAI training a joint model; the deal trades SpaceX's chip stockpile for Cursor's talent and revenue (Cursor hit ~$3B annualized by late April). Newly public SpaceX briefly touched a ~$2.6T valuation on the news before paring gains, despite posting a $4.9B loss on $18.7B revenue last year.
Why it matters: The most successful independent AI coding tool just became captive to one vendor's models and compute — worth watching if you depend on Cursor staying model-agnostic across OpenAI and Anthropic backends.
- SpaceX bets $60 billion on Cursor to catch OpenAI and Anthropic (The Decoder)
- SpaceX to acquire Cursor for $60B in stock, days after blockbuster IPO (TechCrunch AI)
- SpaceX valuation balloons to $2.6T, briefly passes Amazon (TechCrunch AI)
Trump admin forces Anthropic to pull Fable 5 — and sales climb anyway
The White House sent Anthropic a letter demanding it block non-Americans, including its own employees, from accessing its top models (the limited Mythos 5 and the public Fable 5), citing an obscure export-control directive after reports that hackers bypassed Fable 5's guardrails on its potent vulnerability-finding capabilities. Anthropic pulled both models. Yet Ramp data shows Anthropic passed OpenAI to reach 41% of business AI subscription spend in May, with Ramp's lead economist arguing the 'too dangerous to use' aura helps rather than hurts adoption.
Why it matters: If you built against Fable 5 or Mythos 5, they're gone for now; the episode is also why GLM-5.2 and other open weights suddenly look like strategic infrastructure rather than a budget option.
DOJ invokes 'national security' to defend xAI's unpermitted gas turbines
The Justice Department sided with xAI against a NAACP lawsuit seeking to shut down 57 unpermitted natural-gas turbines at its Memphis-area Colossus data centers, arguing a shutdown would threaten 'national, economic, and energy security.' A DoD official called Grok one of four AI models supporting 'mission-critical' classified operations, including recent strikes on Iran. The turbines stay trailer-mounted to claim a one-year exemption; the SELC says that still violates federal law, and emissions of NOx, PM2.5 and formaldehyde have spiked in an already-polluted region.
Why it matters: A concrete look at how AI compute buildout is now being framed as a defense asset — and the environmental and legal corners being cut to keep the GPUs powered.
Microsoft moves Copilot Cowork to usage billing, eyes self-hosted DeepSeek
Microsoft is shifting Copilot Cowork to usage-based pricing, with EVP Charles Lamanna telling Axios that flat-rate is unsustainable given 'users who do hundreds of tasks a week' — Cowork adapts Anthropic's Claude tech and burns tokens fast. The company is also weighing a self-hosted, fine-tuned DeepSeek V4 as a cheaper optional backend, fully on Azure with added bias safeguards, echoing Satya Nadella's pitch for a pick-and-tune ecosystem of models.
Why it matters: A blunt admission that agentic, token-hungry assistants break flat-rate economics — and a hint that 'optional Chinese open model on our cloud' is becoming a mainstream cost lever even for US incumbents.
SubQ 1.1 Small claims near-perfect retrieval at 12M tokens via sparse attention
SubQ released the model card for SubQ 1.1 Small, built on Subquadratic Sparse Attention (SSA) that replaces O(n²) dense attention with a learned linear-scaling formulation. It reports near-perfect needle-in-a-haystack retrieval at 1M–12M tokens (trained predominantly at 1M), 99.12% on RULER at 128K, and competitive GPQA Diamond (85.4%) and LiveCodeBench (89.7% pass@4). At 1M tokens it claims to use 64.5x less compute than dense attention and to run 56x faster than FlashAttention-2; results were third-party verified by Appen. It's deploying to design partners, with 2M–12M models promised later this year.
Why it matters: If the numbers hold, full-repo and full-document reasoning without RAG chunking moves from aspiration to product — though it's still partner-gated, not something you can call today.
- SubQ 1.1 Small Technical Report (Hacker News)
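For intuition on why a linear-scaling formulation matters more as context grows, here is a back-of-the-envelope sketch. It assumes a deliberately crude cost model (dense attention costs n² units, the linear variant costs n times a fixed per-token budget) and calibrates that budget so the ratio matches the claimed 64.5x at 1M tokens. The cost model, function names and the extrapolation are assumptions for illustration, not figures from the SubQ report.

```python
def dense_cost(n_tokens: int) -> float:
    """Toy cost of dense attention: every token attends to every token."""
    return float(n_tokens) * float(n_tokens)


def linear_cost(n_tokens: int, per_token_budget: float) -> float:
    """Toy cost of a linear-scaling scheme: fixed work per token."""
    return float(n_tokens) * per_token_budget


def implied_budget(n_tokens: int, reported_ratio: float) -> float:
    """Per-token budget that would yield the reported ratio under this toy model."""
    return n_tokens / reported_ratio


# Calibrate on the claimed 64.5x compute reduction at 1M tokens.
budget = implied_budget(1_000_000, 64.5)


def compute_ratio(n_tokens: int) -> float:
    """Dense-to-linear cost ratio under the toy model (grows linearly with n)."""
    return dense_cost(n_tokens) / linear_cost(n_tokens, budget)


for n in (128_000, 1_000_000, 12_000_000):
    print(f"{n:>12,} tokens: dense/linear ratio ~ {compute_ratio(n):.1f}x")
```

Under these assumptions the advantage shrinks at short contexts and grows in proportion to length, which is the general shape of the argument for subquadratic attention. Real kernels have constant factors and memory effects this model ignores.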
The frontier post-training recipe is converging on multi-teacher distillation
Nathan Lambert and Finbarr Timbers walk through how post-training has moved from the single InstructGPT 'SFT → reward model → RL' pipeline to Multi-teacher On-Policy Distillation (MOPD): train N domain specialists, then distill them into one student by minimizing reverse-KL on the student's own rollouts. The pattern shows up across MiMo Flash V2, DeepSeek V4 (10+ teachers), Nemotron 3 Ultra and GLM-5. The driver is that RL gets expensive, and capabilities start to conflict, when math, code and agentic tasks share one run. DPO has quietly disappeared from most frontier recipes.
Why it matters: Explains why 2026's open models keep leaping: the gains are organizational as much as algorithmic — specialists are parallelizable across teams, then merged. Useful mental model for reading any new model's technical report.
- Frontier post-training recipe review with Finbarr Timbers (Interconnects)
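The MOPD objective described above can be illustrated with a toy next-token example. This is a sketch under strong simplifications: a three-token vocabulary, one fixed student distribution, and two hypothetical teachers ("math" and "code") with invented probabilities. It shows the two pieces named in the summary: reverse KL, KL(student || teacher), and estimating it from tokens the student itself samples rather than from teacher data.

```python
import math
import random
from typing import Dict, List


def reverse_kl(student: List[float], teacher: List[float]) -> float:
    """Exact KL(student || teacher) over a small vocabulary."""
    return sum(q * math.log(q / p) for q, p in zip(student, teacher) if q > 0)


def sampled_reverse_kl(
    student: List[float], teacher: List[float], num_samples: int, rng: random.Random
) -> float:
    """Monte Carlo estimate using tokens sampled from the student (on-policy)."""
    tokens = rng.choices(range(len(student)), weights=student, k=num_samples)
    return sum(math.log(student[t] / teacher[t]) for t in tokens) / num_samples


def multi_teacher_loss(student: List[float], teachers: Dict[str, List[float]]) -> float:
    """Average reverse KL against each domain teacher (equal weights assumed)."""
    return sum(reverse_kl(student, t) for t in teachers.values()) / len(teachers)


# Hypothetical next-token distributions over a 3-token vocabulary.
TEACHERS = {"math": [0.7, 0.2, 0.1], "code": [0.1, 0.2, 0.7]}
student = [0.4, 0.3, 0.3]

rng = random.Random(0)
exact = reverse_kl(student, TEACHERS["math"])
estimate = sampled_reverse_kl(student, TEACHERS["math"], 20_000, rng)
print(f"exact reverse KL vs math teacher: {exact:.4f}")
print(f"on-policy sampled estimate:       {estimate:.4f}")
print(f"multi-teacher loss:               {multi_teacher_loss(student, TEACHERS):.4f}")
```

In a real recipe the student is conditioned on a prompt, each prompt is routed to the matching specialist, and the loss is minimized by gradient descent. The sketch only evaluates the quantity being minimized.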
Alibaba and AWS push embodied AI from Hub datasets to real hardware
Alibaba released the Qwen-Robot Suite — RobotNav (5 navigation tasks), RobotManip (unified state-action space, 38,100+ hours of open data) and RobotWorld, a world model spanning 20+ embodiments and an 8.6M video-text corpus. Separately, AWS shipped Strands Robots (Apache 2.0), an SDK that exposes the LeRobot stack as composable AgentTools: the same agent code records demonstrations in MuJoCo simulation, runs GR00T/MolmoAct2 policies, deploys to a physical SO-101 with one kwarg change, and coordinates fleets over a Zenoh mesh with human-in-the-loop gates on actuating commands.
Why it matters: Robotics tooling is starting to look like normal software — open datasets on the Hub, a single agent loop spanning sim and hardware — lowering the barrier for developers without a lab full of arms.
Also worth a look
- Parallelize speculative decoding with P-EAGLE on Amazon SageMaker AI (AWS Machine Learning)
- Wolfram Language and Mathematica version 15 (Hacker News)
- Safeguard your agentic AI applications with the Amazon Bedrock Guardrails InvokeGuardrailChecks API (AWS Machine Learning)
- Android 17 launches with new multitasking tools as Google expands Gemini features (TechCrunch AI)
- How easily can Russian propaganda fool AI models? A new benchmark finds out (The Decoder)
- Berlin court rules Google's AI Overviews are just a new search format, not original content (The Decoder)
- Unlocking UK house-building with AI-accelerated planning (Google DeepMind)
- Probably raises $9M to build a more reliable kind of AI (TechCrunch AI)
- What are git worktrees, and why should I use them? (GitHub Blog)
- Has AI already killed self-help nonfiction books? (Hacker News)