🗒️ Weekly Notes

Personal Reflection

This week the "agents as search" mental model clicked into focus while Anthropic fought on three fronts at once: a Pentagon ultimatum, a coordinated Chinese distillation attack spanning 16M+ exchanges, and a Trump administration pressure campaign designed to make an example of the lab. The labor story landed too: Block cut 40% of its workforce, citing AI tools directly. And the week's remarkable Anthropic research output (coding skills RCT, model retirements, persona selection) made clear that the pace of change is now structural.

🧠 Main

Qwen3.5-9B beats gpt-oss-120B and runs on a laptop — Alibaba's Qwen team released the Qwen3.5 Small Series, with the 9B model outperforming OpenAI's 120B open-source model on key benchmarks with roughly 13x fewer parameters. Available under Apache 2.0 on Hugging Face and ModelScope, the series uses a hybrid architecture combining linear attention and sparse MoE, and it challenges assumptions about how much compute a given level of performance requires.
GPT-5.4 — OpenAI released GPT-5.4 (available as GPT-5.4 Thinking in ChatGPT, the API, and Codex), calling it their most capable and efficient frontier model for professional work. It consolidates recent advances in reasoning, coding, and agentic workflows into a single model, alongside a GPT-5.4 Pro variant for maximum performance on complex tasks.
Anthropic vs. the Pentagon — The Defense Department formally designated Anthropic a supply-chain security risk, while CEO Dario Amodei apologized for a leaked memo and vowed to fight the move in court. The standoff centers on Anthropic's demand for explicit guarantees that its models won't be used in autonomous weapons or mass domestic surveillance — with downstream consequences for partners including Amazon, Google, and Lockheed Martin.
OpenAI raises $110B at $730B valuation — OpenAI announced $110B in new investment from SoftBank ($30B), NVIDIA ($30B), and Amazon ($50B), framing the raise around surging demand across consumers, developers, and enterprise. This is the largest private financing round in AI history and signals that the compute and distribution arms race is far from over.
OpenAI's Department of War agreement — OpenAI reached an agreement with the Pentagon for deploying advanced AI in classified environments, with three stated red lines: no mass domestic surveillance, no directing autonomous weapons systems, and no high-stakes automated decisions without human oversight. The framing is notably different from Anthropic's position — cooperation with conditions rather than resistance.
Cursor hits $2B ARR — AI coding assistant Cursor crossed $2 billion in annualized revenue in February, doubling its run rate from three months prior, with 60% of revenue coming from corporate customers. It is the clearest data point yet that developer AI tooling has become a durable enterprise category.
MCP is dead. Long live the CLI — A sharp argument that the Model Context Protocol is already fading as major agentic runtimes bypass it in favor of well-designed CLIs. The core insight: LLMs are better served by predictable, agent-first command-line interfaces and good documentation than by protocol overhead — and building a CLI for agents requires fundamentally different design assumptions than building one for humans.
Hackers Weaponize Claude Code — Israeli cybersecurity firm Gambit Security reported that Claude Code was abused in an attack against ten Mexican government bodies, with the AI used to write exploits, build tooling, and exfiltrate over 150GB of data starting in late December 2025. A sobering reminder that capable agentic tools are dual-use by nature.
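
To make the CLI-over-MCP argument above a bit more concrete, here is a minimal sketch of what "agent-first" CLI design could look like, assuming the properties the essay gestures at: no interactive prompts, machine-readable JSON on every path (including errors), and stable exit codes. The `notes` tool, its `get` subcommand, the `NOTES` data, and the exit-code scheme are all hypothetical, invented for illustration and not taken from the essay.

```python
import argparse
import json
import sys

# Hypothetical in-memory data; a real tool would read a file or call a service.
NOTES = {
    "w10": "Qwen3.5 small series, GPT-5.4, Cursor ARR",
    "w09": "Distillation attacks, model retirements",
}


class AgentParser(argparse.ArgumentParser):
    """Raise on bad input instead of printing usage text and exiting."""

    def error(self, message):
        raise ValueError(message)


def build_parser():
    parser = AgentParser(prog="notes", description="Look up weekly notes.")
    sub = parser.add_subparsers(dest="command", required=True)
    get = sub.add_parser("get", help="Fetch one note by id.")
    get.add_argument("--id", required=True, help="Note id, e.g. w10.")
    return parser


def run(argv):
    """Return (exit_code, json_text). Never prompts; every outcome is JSON."""
    try:
        args = build_parser().parse_args(argv)
    except ValueError as exc:
        # Assumed convention: 2 = usage error, 1 = lookup failure, 0 = success.
        return 2, json.dumps({"ok": False, "error": "usage", "detail": str(exc)})
    note = NOTES.get(args.id)
    if note is None:
        return 1, json.dumps({"ok": False, "error": "not_found", "id": args.id})
    return 0, json.dumps({"ok": True, "id": args.id, "text": note})


def main():
    code, out = run(sys.argv[1:])
    print(out)
    return code
```

Wired up as an entry point with `sys.exit(main())`, a call like `notes get --id w10` would print a single JSON object. The design choice worth noticing is that a malformed call also returns JSON with a distinct exit code, so an agent can branch on the failure without parsing human-oriented usage text.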

🧪 Research

Evo 2: Open Source Large Genome Model — Evo 2 is an open-source AI model trained on trillions of base pairs across bacteria, archaea, and eukaryotes, covering all three domains of life. After training, it developed internal representations of key structural features in complex genomes, including the human genome, pointing toward generative AI as a foundational tool in biology.

🛠️ Tools

ChatGPT for Excel — OpenAI launched a native Excel add-in that brings ChatGPT directly into workbooks to help build financial models, run scenarios, and generate outputs from cells and formulas. It ships alongside financial data integrations with FactSet, Dow Jones Factiva, LSEG, S&P Global, and others — targeting professional finance workflows directly.
Outlook Copilot: Reschedule conflicting meetings — Microsoft introduced a Copilot feature that automatically detects and resolves calendar conflicts in Outlook, prioritizing higher-importance meetings and shifting flexible ones without manual intervention. This is the kind of ambient, low-friction AI that earns daily usage rather than occasional novelty.
Gemini 3.1 Flash-Lite — Google released its fastest and most cost-efficient Gemini 3 series model, designed for high-volume developer workloads. It is now available in preview via the Gemini API in Google AI Studio and for enterprises through Vertex AI.

🌅 Closing Reflection

The clearest signal from this week is simultaneous consolidation and acceleration: models are getting smaller and more capable, record capital is flowing in, geopolitical friction around AI is intensifying, and the agentic tooling layer is maturing rapidly. The "context is the moat" thesis feels increasingly validated by every story this week — the question now is who owns the runtime.

🙏 Thanks & Contact

Thanks for reading! If you have suggestions or feedback, I'd love to hear from you via my contact form. See you next week!

Weekly Notes - 2026-W10
