Weekly Notes - 2026-W07

🗒️ Weekly Notes

Personal Reflection

This week made one trend unmistakable: AI is no longer just about better chat; it is about operational systems, distribution, and durable execution. The most interesting signals came where product capability, infrastructure spend, and enterprise adoption all moved in the same direction.


🧠 Main

🧪 Research

  • Nathan Lambert: Open Models Will Never Catch Up — Nathan Lambert argues open models may remain behind frontier closed systems, but still serve as the critical experimentation engine for research and policy visibility. The framing shifts from “winning benchmarks” to “expanding the innovation surface.”
  • GLM-5: From Vibe Coding to Agentic Engineering — GLM-5 scales parameters, training data, and asynchronous RL infrastructure to improve coding and long-horizon agent tasks. Its MIT-licensed release also adds pressure to the open-model ecosystem.
  • The Potential of RLMs — Recursive Language Models separate token context from programmatic context, reducing “context rot” in very long-context tasks. The larger idea is that RLM traces can become a discovery mechanism for better agent architectures.
  • The Isomorphic Labs Drug Design Engine unlocks a new frontier beyond AlphaFold — Isomorphic Labs reports significant gains in structure generalization, binding affinity prediction, and cryptic pocket identification. This points to a stronger bridge between structural biology outputs and practical drug design loops.
  • MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents — MemSkill proposes a framework where memory behavior is learned and iteratively improved instead of hard-coded. The reusable skill-bank approach is a useful direction for long-horizon agent reliability.
  • Text classification with Python 3.14’s zstd module — A lightweight classifier built on incremental Zstandard compression demonstrates strong accuracy-speed tradeoffs with minimal complexity. It is a practical reminder that not every NLP task needs heavy model stacks.
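
The compression-classifier idea is simple enough to sketch in a few lines. The sketch below is my own illustration, not the article's code: it uses `zlib` from the standard library (available on any Python version) in place of the new 3.14 `zstd` module, and the tiny corpus is invented. The principle is the same: a document belongs to the class whose training text it compresses best against.

```python
import zlib

def compressed_size(text: str) -> int:
    # Length of the DEFLATE-compressed bytes; a rough proxy for information content.
    return len(zlib.compress(text.encode("utf-8"), level=9))

def classify(doc: str, corpus: dict[str, str]) -> str:
    # Assign doc to the class whose training text it shares the most structure
    # with: the smaller the size increase when doc is appended to that class's
    # text, the more the doc "looks like" that class.
    def cost(label: str) -> int:
        base = compressed_size(corpus[label])
        joint = compressed_size(corpus[label] + " " + doc)
        return joint - base
    return min(corpus, key=cost)

# Invented toy corpus for illustration only.
corpus = {
    "python": "def function return import class list dict lambda yield",
    "cooking": "simmer the onions add garlic season with salt and pepper",
}

print(classify("import numpy as np", corpus))          # "python" on this toy corpus
print(classify("add salt and simmer gently", corpus))  # "cooking" on this toy corpus
```

With a real corpus and the incremental `zstd` compressor, the base compression of each class's training text can be done once and reused, which is where the speed advantage the post describes comes from.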

🛠️ Tools

  • Introducing GPT-5.3-Codex-Spark — Codex-Spark is a low-latency coding model tuned for real-time iteration and near-instant interaction. It complements longer-horizon agent workflows with a faster collaboration loop.
  • ChatGPT — Release Notes — Deep Research now supports tighter source scoping, editable plans, and a stronger workspace experience. These upgrades make research workflows more controllable and reproducible.
  • OpenAI works on ChatGPT Skills, upgrades Deep Research — Early signs point to a first-party skills layer for reusable workflows inside ChatGPT. If shipped broadly, this could standardize team playbooks without custom agent stacks.
  • Skills in OpenAI API — The Skills pattern formalizes reusable procedure bundles (SKILL.md + scripts + assets) for model execution environments. This is a strong building block for repeatable, versioned agent behavior.
  • Speed up responses with fast mode — Claude Code’s fast mode introduces a clear latency-cost tradeoff for Opus 4.6 sessions. It is useful when iteration speed matters more than token efficiency.
  • OpenClaw Partners with VirusTotal for Skill Security — OpenClaw now scans published skills with VirusTotal and Code Insight, adding a strong baseline for marketplace security hygiene. It is not complete protection, but it is a meaningful defense-in-depth layer.
  • ClawSec — ClawSec packages agent-focused security controls like integrity checks, advisory feeds, and automated audit workflows. It is a practical example of security tooling adapting to agent-native environments.
  • Harness engineering: leveraging Codex in an agent-first world — OpenAI’s harness writeup shows how strong repo scaffolding, constraints, and observability can enable high agent throughput. The key lesson is that engineering leverage now comes from environment design as much as code writing.
  • Towards self-driving codebases — Cursor’s multi-agent orchestration research details how role separation, anti-fragile loops, and throughput-first system design improve autonomous development. It offers concrete patterns for teams experimenting with long-running software agents.
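
To make the Skills bundle idea above concrete: the layout below is a guessed minimal structure inferred from the "SKILL.md + scripts + assets" description, not the official format, and the file contents are placeholders. The point is that a skill is a versionable directory of procedure, code, and supporting files.

```python
import tempfile
from pathlib import Path

def scaffold_skill(root: Path, name: str) -> Path:
    # Hypothetical bundle layout inferred from the SKILL.md + scripts + assets
    # description; consult the Skills documentation for the real format.
    skill = root / name
    (skill / "scripts").mkdir(parents=True)
    (skill / "assets").mkdir()
    (skill / "SKILL.md").write_text(
        f"# {name}\n\n"
        "When to use: describe the trigger conditions.\n"
        "Steps: the procedure the model should follow.\n"
    )
    (skill / "scripts" / "run.py").write_text("print('skill step executed')\n")
    return skill

with tempfile.TemporaryDirectory() as tmp:
    skill = scaffold_skill(Path(tmp), "weekly-report")
    files = sorted(p.relative_to(skill).as_posix()
                   for p in skill.rglob("*") if p.is_file())
    print(files)  # ['SKILL.md', 'scripts/run.py']
```

Because the bundle is just files, it can live in git next to the codebase, which is what makes the "repeatable, versioned agent behavior" claim plausible.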

🌅 Closing Reflection

This week’s signals all point in one direction: the edge is shifting from model novelty to system execution, where deployment architecture, workflow design, and governance determine who compounds fastest. Next week, I want to go deeper on how teams can make agentic workflows reliable without collapsing under operational complexity.

🙏 Thanks & Contact

Thanks for reading! If you have suggestions or feedback, I'd love to hear from you via my contact form. See you next week!