🗒️ Weekly Notes

Personal Reflection

This week signals a transition from AI as a sidekick to AI as a primary operator, highlighted by the launch of agentic systems like Copilot Cowork and Perplexity’s Personal Computer. The industry is rapidly verticalizing, with Meta developing in-house chips and Nvidia investing in frontier labs such as Thinking Machines Lab to secure the next generation of training infrastructure. These technical leaps are clashing with geopolitical realities, as seen in Anthropic’s lawsuits against the Department of Defense and the resignation of OpenAI’s hardware lead over the company’s Pentagon deal.

🧠 Main

10x is the new floor — The author explores the widening gap between AI-integrated startups and legacy corporate leadership. He argues that AI has raised the productivity floor, enabling individuals to perform at 10x levels and forcing a fundamental shift in the talent market and organizational structures as human agency is amplified.
Cursor Goes To War For AI Coding Dominance — Cursor is pivoting from a collaborative code editor to autonomous agents and specialized models in order to compete with Anthropic and OpenAI. Despite reaching $2 billion in annualized revenue, the company faces intense competition as the industry shifts toward agent-led development that reduces the need for traditional editors.
Anthropic sues Defense Department over supply-chain risk designation — Anthropic filed lawsuits against the US Department of Defense following its designation as a "supply-chain risk." The dispute stems from Anthropic's refusal to allow its AI to be used for mass surveillance or autonomous weapons. The company claims the designation is unlawful retaliation for its stance on AI safety.
Copilot Cowork: A new way of getting work done — Microsoft introduced Copilot Cowork, an agentic system for Microsoft 365 that automates complex tasks like calendar management and research. Integrating Anthropic's Claude Cowork technology, it executes multi-step plans across applications with user oversight. The tool is currently in Research Preview for select enterprise customers.
Building for trillions of agents — Aaron Levie explores the shift toward agent-centric software, where AI agents become primary users. He advocates for API-first design, sandboxed infrastructure, and consumption-based business models to accommodate a future where agents handle the majority of tasks and significantly outnumber human workers.
I don't know if my job will still exist in ten years — A staff engineer reflects on the potential obsolescence of software engineering as AI agents evolve. The article explores whether the industry will overshoot or undershoot AI capabilities, the impact on junior vs. senior roles, and the likelihood of AI eventually handling code maintenance better than humans.
Introducing The Anthropic Institute — Anthropic has launched The Anthropic Institute, an interdisciplinary initiative led by co-founder Jack Clark to address the societal, economic, and legal challenges of powerful AI. The institute integrates research on frontier safety and societal impacts to inform global AI governance and public policy.
Meta acquires Moltbook, the AI agent social network — Meta has acquired Moltbook, a viral social network composed of AI agents. Founders Matt Schlicht and Ben Parr will join Meta Superintelligence Labs, bringing their expertise in connecting autonomous agents through an "always-on directory" to enhance Meta's agentic product experiences.
Nvidia Invests in Mira Murati’s Thinking Machines Lab — Nvidia has invested in Mira Murati’s startup, Thinking Machines Lab, as part of a multiyear partnership. The deal includes deploying at least one gigawatt of Nvidia chips to train frontier AI models and collaborating on the design of advanced AI training and serving infrastructure.
OpenAI hardware exec Caitlin Kalinowski quits in response to Pentagon deal — OpenAI hardware lead Caitlin Kalinowski has resigned following the company’s controversial partnership with the Department of Defense. Kalinowski cited concerns about rushed governance and the risks of autonomous weapons and domestic surveillance, highlighting internal tensions over OpenAI's strategic direction and ethical guardrails.
Nvidia reportedly building its own AI agent to compete with OpenClaw, report claims — ‘NemoClaw’ will supposedly be open source and designed for enterprise use — Nvidia is reportedly developing "NemoClaw," an open-source AI agent platform for enterprise use. Designed to compete with OpenClaw, it focuses on security and privacy while remaining hardware-agnostic. The move aims to capture the corporate market by offering customizable, automated orchestration tools for various enterprise applications.
Meta Preparing to Deploy Four New Homegrown Chips to Handle AI — Meta announced plans to deploy four new generations of in-house AI chips (MTIA series) by 2027. This strategic shift aims to optimize specialized AI workloads, lower costs, and reduce reliance on external suppliers like Nvidia, while continuing to purchase commercial hardware for general training needs.
Perplexity's Personal Computer lets AI agents access your Mac mini's files — Perplexity announced "Personal Computer," an extension of its agentic platform that allows AI agents to access and manage local files and applications on hardware like the Mac mini. It automates complex tasks by coordinating sub-agents while maintaining security through server-side processing and user approval logs.
Promptfoo is joining OpenAI — OpenAI has acquired Promptfoo, an open-source platform for testing, evaluating, and red-teaming AI applications. Promptfoo will remain open-source and continue supporting multiple model providers while integrating its core security and evaluation technology into OpenAI’s infrastructure to help developers ship more reliable and secure AI.
Introducing the Claude Marketplace — Anthropic launched the Claude Marketplace in limited preview, allowing enterprise customers to apply existing spend commitments toward third-party Claude-powered solutions from partners like GitLab and Snowflake, streamlining AI procurement and ecosystem distribution.
The 8 Levels of Agentic Engineering — Bassim Eledath outlines an 8-level maturity model for agentic engineering, moving from basic IDE autocompletion to autonomous multi-agent teams. The framework emphasizes context engineering, automated feedback loops, and background orchestration to bridge the gap between AI model capabilities and real-world engineering productivity.

🧪 Research

Designing AI agents to resist prompt injection — OpenAI analyzes how prompt injection attacks are evolving into social engineering. To defend AI agents, they propose a system-design approach that limits the impact of successful manipulation through techniques like 'Safe Url' monitoring and human-like capability constraints rather than simple input filtering.
Codex Security: now in research preview — OpenAI launched Codex Security, an AI-driven application security agent in research preview. It uses frontier models to build project-specific threat models, identify vulnerabilities, and provide high-confidence fixes. During testing, it significantly reduced false positives and helped discover critical vulnerabilities in prominent open-source repositories like OpenSSH and GnuTLS.
How Autoresearch will change Small Language Models adoption — Autoresearch is an autonomous framework that optimizes model training by iteratively editing code and running experiments. Key results include an 11% speedup in GPT-2 training and a 0.8B model outperforming a 1.6B version, demonstrating how automated technical investigations can significantly improve Small Language Model efficiency.
Introducing GPT‑5.4 — OpenAI launched GPT-5.4, featuring native computer-use capabilities, a 1M-token context window, and improved reasoning for professional tasks. The model sets new high scores on OSWorld-Verified (75%) and spreadsheet modeling, while introducing features like tool search and mid-response steerability to reduce hallucinations and improve agentic workflows.
Introducing Phi-4-Reasoning-Vision-15B to Microsoft Foundry — Microsoft launched Phi-4-Reasoning-Vision-15B, a multimodal small language model combining visual perception with structured reasoning. Available on Microsoft Foundry and Hugging Face, it features controllable reasoning capabilities and demonstrates strong performance in vision-based math, document understanding, and computer-use agent scenarios.
Partnering with Mozilla to improve Firefox’s security — Anthropic collaborated with Mozilla to test Claude Opus 4.6's vulnerability detection capabilities. The model identified 22 Firefox vulnerabilities, including 14 high-severity issues. Research also explored automated exploit generation, finding that while Claude excels at discovery, creating functional exploits remains significantly more complex and resource-intensive.
Google launches new multimodal Gemini Embedding 2 model — Google released Gemini Embedding 2, a multimodal model unifying text, images, video, and audio into one embedding space. It utilizes Matryoshka Representation Learning for flexible output dimensions and demonstrates state-of-the-art benchmark performance, supporting advanced applications like multimodal Retrieval-Augmented Generation and semantic search.
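
A quick aside on the "flexible output dimensions" point in the last item: embeddings trained with Matryoshka Representation Learning are designed so that a prefix of the vector is still a usable embedding. The sketch below shows the general idea only, using toy vectors and a hypothetical helper name (`truncate_embedding`). It is not Gemini Embedding 2's API, and real vectors would come from the embedding service.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit length.

    Hypothetical helper for illustration: Matryoshka-style embeddings are
    trained so that leading components carry the most information, which
    is what makes this prefix cut meaningful.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


# Toy 4-dimensional vectors standing in for real embeddings.
doc = [3.0, 4.0, 12.0, 0.5]
query = [3.0, 4.0, 11.0, -0.5]

# Compare at full size and at a cheaper 2-dimensional prefix.
full_score = cosine(truncate_embedding(doc, 4), truncate_embedding(query, 4))
short_score = cosine(truncate_embedding(doc, 2), truncate_embedding(query, 2))
```

The re-normalization step matters: after cutting the vector, its length changes, so similarity scores are only comparable once each prefix is rescaled to unit length.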

🛠️ Tools

Anthropic launches code review tool to check flood of AI-generated code — Anthropic has launched Code Review, a multi-agent AI tool integrated into Claude Code. Designed for enterprise developers, it automatically analyzes GitHub pull requests to identify logical errors and suggest fixes, helping teams manage the increasing volume of AI-generated code while maintaining software quality.
autoresearch — Andrej Karpathy's experimental project where an AI agent autonomously iterates on LLM training code. It modifies model architecture and hyperparameters in five-minute cycles to minimize validation bits-per-byte, allowing users to wake up to optimized models and detailed experiment logs.
Early look: Meta silently launches Vibes AI editor to challenge rivals — Meta has transitioned Vibes into a web-based AI creation studio featuring project workflows, timeline editing, and generative video tools. The platform includes features like lip-syncing and style libraries, aiming to compete with production tools like Sora by providing a dedicated workspace for creators and marketers.
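
A note on the autoresearch item above, which optimizes validation bits-per-byte: the metric is commonly derived from a per-token cross-entropy loss by converting nats to bits and dividing by the number of bytes in the evaluated text, which keeps scores comparable across tokenizers. The sketch below is a minimal illustration under that assumption. The function name and arguments are hypothetical and are not taken from the autoresearch code.

```python
import math


def bits_per_byte(mean_loss_nats, num_tokens, num_bytes):
    """Convert a mean per-token cross-entropy (in nats) to bits per byte.

    Assumes `mean_loss_nats` was averaged over `num_tokens` tokens and that
    those tokens encode exactly `num_bytes` bytes of raw text.
    """
    if num_tokens <= 0 or num_bytes <= 0:
        raise ValueError("num_tokens and num_bytes must be positive")
    total_bits = mean_loss_nats * num_tokens / math.log(2)
    return total_bits / num_bytes


# Example: a loss of 2.0 nats per token, with each token covering 4 bytes on average.
example = bits_per_byte(2.0, num_tokens=1_000, num_bytes=4_000)
```

Lower is better: a tokenizer that packs more bytes into each token can show a higher per-token loss while still achieving the same bits-per-byte.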

🌅 Closing Reflection

The arrival of GPT-5.4 with native computer use, alongside autonomous research frameworks like autoresearch, suggests that the next phase of productivity will be defined by self-optimizing systems. As agents begin to manage local files and automate model training, the focus will shift from prompt engineering to the governance of autonomous multi-agent ecosystems.

🙏 Thanks & Contact

Thanks for reading! If you have suggestions or feedback, I'd love to hear from you via my contact form. See you next week!

Weekly Notes - 2026-W11
