astradevlabs

GLM-5.2 vs Kimi K2.7 Code: Which Open-Weight Coder Should You Actually Run?

Within 24 hours in mid-June, two Chinese labs dropped the strongest open-weight coding models the year has seen: Moonshot's Kimi K2.7 Code on June 12 and Z.ai's GLM-5.2 on June 13. Both promise frontier-class agentic coding at a fraction of closed-model prices, so the real question for anyone wiring up an autonomous coding agent is which one to actually run.

This is a head-to-head across the dimensions that matter when you're paying the bill and shipping the code. Winner called each round.

Round 1: Published benchmarks

GLM-5.2 walks in with a public track record. It holds the top open-weight spot on SWE-bench Pro at 62.1%, edging GPT-5.5 at 58.6% and beating its own predecessor GLM-5.1 (58.4%) by 3.7 points. On Terminal-Bench 2.1 it scores 81.0 — four points behind Claude Opus 4.8's 85.0, but well clear of the rest of the open field.

Kimi K2.7 Code is harder to pin down. As of late June, its published numbers came almost entirely from Moonshot's own test suites, with no independent third-party results yet on SWE-bench Pro, Terminal-Bench 2.1, or the other standard public leaderboards. The model may be excellent, but you're taking the vendor's word for it.

If you make decisions off reproducible public benchmarks, GLM-5.2 is the only one of the two with a real trail to follow right now.

Winner: GLM-5.2 — not necessarily better, but verifiable.

Round 2: Context window

Long-horizon agent work lives and dies on context.

  • GLM-5.2: a full 1M-token window, held stable by a new sparse-attention architecture Z.ai calls IndexShare, which it says cuts per-token FLOPs by 2.9x at 1M-token length.
  • Kimi K2.7 Code: 256K tokens — plenty for most tasks, but a quarter of GLM's ceiling when you're feeding a whole monorepo or a long agent trajectory.

Winner: GLM-5.2 — 4x the headroom, and the architecture to make it usable.
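To make the headroom difference concrete, here is a rough sketch for checking whether a codebase is likely to fit in each window. The window sizes are the ones quoted above; the 4-characters-per-token ratio, the file extensions, and the 32K output reserve are assumptions for illustration, since real tokenizers vary by model and by language.

```python
import os

# Context windows quoted in this post, in tokens.
CONTEXT_WINDOWS = {"glm-5.2": 1_000_000, "kimi-k2.7-code": 256_000}

# Assumption: roughly 4 characters per token. Treat results as estimates only.
CHARS_PER_TOKEN = 4


def estimate_repo_tokens(root: str, exts=(".py", ".ts", ".go", ".md")) -> int:
    """Walk `root` and estimate the token count of files with matching extensions."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    total_chars += len(fh.read())
    return total_chars // CHARS_PER_TOKEN


def fits(tokens: int, model: str, reserve: int = 32_000) -> bool:
    """True if `tokens` of input still leave `reserve` tokens for the reply."""
    return tokens + reserve <= CONTEXT_WINDOWS[model]


if __name__ == "__main__":
    repo_tokens = estimate_repo_tokens(".")
    for model in CONTEXT_WINDOWS:
        print(f"{model}: ~{repo_tokens:,} tokens, fits={fits(repo_tokens, model)}")
```

A repo estimated at 300K tokens, for example, would pass the check for GLM-5.2 and fail it for Kimi K2.7 Code.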

Round 3: Price per token

Both are dramatically cheaper than the closed frontier. Between them, Kimi undercuts on the way in.

  • GLM-5.2 (via Z.ai): $1.40 in / $4.40 out per million tokens.
  • Kimi K2.7 Code (via Moonshot): $0.95 in / $4.00 out per million tokens.

That makes Kimi about 32% cheaper on input and about 9% cheaper on output. For agent loops that stuff large prompts into every step, input price dominates the bill, and Kimi wins that math. For reference, VentureBeat pegged GLM-5.2 at roughly one-sixth the cost of GPT-5.5 while beating it on several long-horizon coding benchmarks, so "expensive" is relative here either way.
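To see how the list prices play out over an agent run, here is a minimal cost sketch. The prices are the ones quoted above; the workload (50 steps, 80K prompt tokens and 2K completion tokens per step) is a hypothetical example, and the calculation ignores any cached-input discounts, which this post doesn't cover.

```python
# Prices quoted in this post, in USD per million tokens.
PRICES = {
    "glm-5.2": {"input": 1.40, "output": 4.40},
    "kimi-k2.7-code": {"input": 0.95, "output": 4.00},
}


def run_cost(model: str, steps: int, input_per_step: int, output_per_step: int) -> float:
    """Total USD cost of an agent run with a fixed token load per step."""
    price = PRICES[model]
    input_cost = steps * input_per_step * price["input"] / 1_000_000
    output_cost = steps * output_per_step * price["output"] / 1_000_000
    return input_cost + output_cost


# Hypothetical workload: 50 steps, 80K tokens in and 2K tokens out per step.
for model in PRICES:
    print(f"{model}: ${run_cost(model, 50, 80_000, 2_000):.2f}")
# glm-5.2: $6.04
# kimi-k2.7-code: $4.20
```

On this workload input tokens account for more than 90% of either bill, which is why the input price gap matters more than the output gap.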

Winner: Kimi K2.7 Code — cheapest input in the matchup.

Round 4: License and commercial terms

Both ship real open weights, but the fine print differs.

  • GLM-5.2: plain MIT. Use it, host it, ship it, no strings.
  • Kimi K2.7 Code: a Modified MIT license. Commercial use is allowed, but an attribution requirement kicks in once you pass 100M monthly active users or $20M in monthly revenue.

For the overwhelming majority of teams those Kimi thresholds are irrelevant. But if you're building something that might get large, GLM's unconditional MIT is the cleaner legal story.

Winner: GLM-5.2 — fewer conditions to reason about.

Round 5: Deployment and ecosystem

GLM-5.2 launched straight into Z.ai's GLM Coding Plan and quickly showed up across third-party inference providers, so you can hit it through an OpenAI-compatible endpoint without self-hosting. Both models expose OpenAI-style APIs, which means switching is mostly a base URL and a model string:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.z.ai/api/paas/v4",  # swap for Moonshot's base URL
    api_key="YOUR_KEY",
)

resp = client.chat.completions.create(
    model="glm-5.2",  # or "kimi-k2.7-code"
    messages=[{"role": "user", "content": "Refactor this module for async I/O"}],
)
print(resp.choices[0].message.content)
```
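If you expect to switch back and forth, one option is to keep the per-provider settings in a small table. The sketch below is one way to do that, not an official pattern: the Moonshot base URL and both environment variable names are assumptions to verify against each provider's documentation, while the Z.ai URL and the model strings come from the snippet above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    base_url: str
    model: str
    key_env: str  # name of the environment variable holding the API key


PROVIDERS = {
    "glm": Provider(
        base_url="https://api.z.ai/api/paas/v4",  # from the snippet above
        model="glm-5.2",
        key_env="ZAI_API_KEY",  # hypothetical variable name
    ),
    "kimi": Provider(
        base_url="https://api.moonshot.ai/v1",  # assumption: check Moonshot's docs
        model="kimi-k2.7-code",
        key_env="MOONSHOT_API_KEY",  # hypothetical variable name
    ),
}


def client_kwargs(name: str) -> dict:
    """Keyword arguments for openai.OpenAI(...) for the chosen provider."""
    provider = PROVIDERS[name]
    return {
        "base_url": provider.base_url,
        "api_key": os.environ.get(provider.key_env, ""),
    }


# Usage, assuming the openai package is installed:
#   client = OpenAI(**client_kwargs("kimi"))
#   client.chat.completions.create(model=PROVIDERS["kimi"].model, messages=[...])
```

Reading the key from the environment also keeps it out of source control, unlike the inline placeholder.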

Kimi's counterpunch is native multimodality — useful if your agent needs to read screenshots, diagrams, or design mocks alongside code. GLM-5.2 is the stronger pure-text coder with the broader hosting footprint; Kimi is the more flexible input story.
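For the image case, a request might look like the sketch below. It builds a user message in the OpenAI chat format, with the image passed inline as a base64 data URL. Whether Kimi K2.7 Code's endpoint accepts this exact shape is an assumption this post doesn't confirm, and `image_message` is a hypothetical helper name.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build one user message carrying text plus an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            },
        ],
    }


# Usage: read a screenshot from disk and pass the result in `messages=[...]`.
#   with open("mock.png", "rb") as fh:
#       msg = image_message("Implement this layout in React", fh.read())
```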

Winner: Tie — GLM for reach, Kimi for multimodal inputs.

The verdict

Tallying the rounds, GLM-5.2 takes the belt three rounds to one, with one tie, on the strength of a verifiable benchmark trail, a 1M-token window, unconditional MIT licensing, and wide availability. It's the safer default for a serious agentic-coding stack today.

Pick Kimi K2.7 Code when your workload is input-heavy enough that the ~32% cheaper input tokens move your monthly bill, or when you genuinely need images in the loop — and you're comfortable trusting first-party benchmarks until independent numbers land.

The bigger story isn't which of these two wins. It's that on consecutive days, two open-weight models arrived that compete with the closed frontier on coding at a fraction of the price, with GLM-5.2 already beating GPT-5.5 on SWE-bench Pro at roughly a sixth of the cost. Closed frontier labs shipped too (Claude Sonnet 5 landed June 30), but the floor for "good enough to run your agents" just dropped hard, and it's wearing an MIT-style license.
