Weekly Notes - 2026-W12

About the Author

David Schaupp is a Senior Data Scientist at FiveSquare GmbH and an associate lecturer for Machine Learning at St. Pölten University of Applied Sciences. He focuses on computer vision, large language models, multi-agent systems, and practical AI engineering.

๐Ÿ—’๏ธ Weekly Notes

Personal Reflection

This week signals a major pivot toward agentic infrastructure, with OpenAI's acquisition of Astral and the launch of Frontier positioning AI as a cross-platform semantic layer. Nvidia's foray into orbital computing and the emergence of China's OpenClaw ecosystem highlight the accelerating race for both physical and software dominance. The industry focus has moved beyond simple chat to integrated environments that unify coding, enterprise productivity, and autonomous execution.


🧠 Main

  • OpenAI to Cut Back on Side Projects in Push to 'Nail' Core Business — OpenAI is shifting its strategy to prioritize coding and enterprise productivity tools over diverse side projects. Under pressure from rival Anthropic, executives are refocusing resources on core business applications and specialized models like Codex to maintain market leadership ahead of a potential IPO later this year.

  • 1M context is now generally available for Opus 4.6 and Sonnet 4.6 — Anthropic has made the 1M token context window generally available for Claude Opus 4.6 and Sonnet 4.6. The update features standard pricing without long-context premiums, expanded media limits to 600 images or PDF pages, and full rate limits across the entire window.

  • Fixing AI failure: Three changes enterprises should make now — This article explores why enterprise AI projects often fail, identifying cultural and organizational barriers rather than technical flaws. It suggests three key improvements: broadening AI literacy beyond engineering teams, establishing clear rules for AI autonomy, and creating cross-functional playbooks to ensure better collaboration and accountability.

  • Jeff Bezos in Talks to Raise $100 Billion for AI Manufacturing Fund — Jeff Bezos is seeking $100 billion for a new fund to acquire and automate manufacturing companies using AI. Linked to his startup Project Prometheus, the initiative targets sectors like chipmaking and defense, utilizing AI models that simulate the physical world to enhance industrial efficiency and profitability.

  • OpenAI Plans Launch of Desktop 'Superapp' to Refocus, Simplify User Experience — OpenAI will unify ChatGPT, Codex, and its browser into a desktop "superapp" to streamline operations and compete with Anthropic. The initiative focuses on "agentic" AI capabilities and simplifying the user experience for enterprise and engineering customers through a more focused product strategy.

  • Data Is the Only Moat — This article explores why human-generated data is becoming the only sustainable moat for software businesses as AI commoditizes coding and transformation. Using Podscan as an example, it argues that founders should focus on proprietary datasets, metadata, and API-first accessibility to remain competitive in an AI-driven future.

  • China Approves First Brain Implant for Commercial Use — China has approved Neuracle Technology's invasive brain-computer interface for commercial use, marking a major milestone in its bid to challenge companies like Neuralink. The system helps partially paralyzed patients regain hand movement, reflecting China's strategic investment in neurotechnology as a key industry of the future.

  • How China is getting everyone on OpenClaw, from gearheads to grandmas — China is seeing rapid adoption of OpenClaw, a viral personal AI assistant, through support from tech giants and government initiatives. While the nation aims to integrate AI across all industries by 2030, authorities are increasingly raising concerns regarding data privacy and security risks.

  • Nvidia announces Vera Rubin Space-1 chip system for orbital AI data centers — Nvidia announced the Vera Rubin Space-1 chip system at GTC 2026, designed for orbital AI data centers. Developed with partners like Axiom Space and Starcloud, these modules address Earth's energy constraints by enabling space-based intelligence in power-constrained environments while overcoming unique cooling challenges.

  • Nvidia's CEO Projects $1 Trillion in AI Chip Sales as New Computing Era Begins — At Nvidia's GTC conference, CEO Jensen Huang projected $1 trillion in AI chip sales by 2027, highlighting a strategic shift toward inference computing. He unveiled the Vera Rubin/Groq server architecture and announced major partnerships in autonomous driving and enterprise AI software.

  • OpenAI launches GPT-5.4 mini and GPT-5.4 nano on APIs — OpenAI launched GPT-5.4 mini and GPT-5.4 nano, emphasizing speed and cost-efficiency for coding, automation, and agents. Available via API and ChatGPT, these models offer significantly improved benchmark performance over GPT-5 mini, with pricing optimized for high-volume developer workflows and multi-agent systems.

  • OpenAI to acquire Astral — OpenAI announced its intent to acquire Astral, the creator of widely-used Python tools like uv and Ruff. The acquisition aims to integrate these tools into the Codex ecosystem, enabling AI agents to participate more deeply across the software development lifecycle from project management to code quality enforcement.

  • OpenAI's Frontier puts AI agents in a fight SaaS can't afford to lose — OpenAI has launched Frontier, an enterprise AI agent platform that acts as a semantic layer across existing software systems. By enabling agents to operate across silos like CRMs, OpenAI challenges traditional SaaS seat-license models and forces incumbents like Salesforce and ServiceNow to rethink their business strategies.

  • Report: Meta could lay off 20% of its staff and replace many of them with AI workers — Meta is reportedly planning to lay off up to 20% of its workforce to offset massive AI infrastructure investments. The move aims to replace human business processes with autonomous AI agents as the company shifts focus toward "superintelligence" and efficiency gains through automation.

  • The SaaS Extinction Test — This article explores the impact of AI coding agents on the SaaS industry. It argues that while pure workflow tools are vulnerable, incumbents with data gravity and mission-criticality remain protected by the high complexity of production infrastructure, security, and operational risk management.

  • Nvidia Software Aims to Bring OpenClaw to the Enterprise — Nvidia CEO Jensen Huang unveiled NemoClaw at GTC, a software toolkit designed to make the OpenClaw autonomous agent platform enterprise-ready. NemoClaw provides a secure virtual environment to prevent data tampering and security risks, facilitating the safe deployment of autonomous agents within corporate infrastructures.

🧪 Research

  • Can LLMs Be Computers? — Percepta demonstrates turning Transformers into computers by implementing a WebAssembly interpreter within model weights. Using a novel 2D attention mechanism, they achieve O(log t) inference scaling, allowing models to execute complex algorithms directly at 30,000 tokens/second without external tools, overcoming traditional computational limitations of autoregressive decoding.

  • GLM-5-Turbo — GLM-5-Turbo is a foundation model optimized for agent-based tasks in the OpenClaw ecosystem. It features enhanced tool invocation, complex instruction following, and persistent task execution. The model supports a 200K context length and demonstrates superior performance on the new ZClawBench benchmark for real-world agent workflows.

  • GPT 5.4 is a big step for Codex — Nathan Lambert evaluates GPT 5.4's performance in Codex, highlighting significant improvements in instruction following, reasoning efficiency, and context management. While Claude remains a favorite for its intuitive "intent," GPT 5.4 is framed as a precise, mechanical tool optimized for complex, multi-step agentic tasks.

  • Introducing Composer 2 — Cursor has launched Composer 2, a coding model featuring significant performance gains on benchmarks like Terminal-Bench 2.0 and SWE-bench Multilingual. Developed via continued pretraining and reinforcement learning for long-horizon tasks, it offers frontier-level intelligence at a competitive price point with a faster variant available.

  • How Do You Want to Remember? — An experiment where an AI agent autonomously improved its memory recall from 60% to 93%. By diagnosing its failure to capture decision rationale, the agent redesigned its own data structure and performed self-evaluations, highlighting technical gains through AI-led cognitive architecture optimization.
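The diagnosis in that experiment, that recall failed because the *reason* for a decision was never stored, amounts to a small schema change. A hypothetical sketch of what such a record might look like (field names are illustrative, not taken from the write-up):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class MemoryRecord:
    """One remembered decision. The `rationale` field is the kind of
    addition the experiment found critical for later recall."""
    topic: str
    decision: str
    rationale: str                      # why the decision was made
    alternatives: list[str] = field(default_factory=list)
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def recall(store: list[MemoryRecord], query: str) -> list[MemoryRecord]:
    # Naive keyword recall over topic, decision, and rationale.
    q = query.lower()
    return [r for r in store
            if q in r.topic.lower()
            or q in r.decision.lower()
            or q in r.rationale.lower()]

store = [MemoryRecord(
    topic="database choice",
    decision="use SQLite",
    rationale="single-user tool, no server ops budget",
    alternatives=["Postgres"],
)]
hits = recall(store, "server")   # matches via the rationale, not the decision
print(len(hits))
```

The point of the sketch: a query like "server" only resolves because the rationale was captured; a store holding bare decisions would miss it.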

  • LLM Architecture Gallery — A technical reference gallery comparing architectures of prominent Large Language Models. It provides detailed fact sheets on parameter scales, decoder types, and attention mechanisms like GQA and MLA for models including Llama 3, DeepSeek V3, and GPT-OSS, highlighting structural innovations and training configurations.
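Of the attention variants such a gallery catalogs, grouped-query attention (GQA) is the easiest to sketch: several query heads share each key/value head, shrinking the KV cache relative to full multi-head attention. A minimal NumPy illustration (head counts and dimensions are arbitrary, not taken from any listed model):

```python
import numpy as np

def gqa_attention(q, k, v):
    """Grouped-query attention for one sequence.
    q: (n_q, t, d) query heads; k, v: (n_kv, t, d) shared KV heads,
    with n_q a multiple of n_kv. Each group of n_q // n_kv query
    heads attends over the same K/V head."""
    n_q, t, d = q.shape
    n_kv = k.shape[0]
    assert n_q % n_kv == 0
    # Repeat each K/V head so it serves its whole group of query heads.
    k_rep = np.repeat(k, n_q // n_kv, axis=0)            # (n_q, t, d)
    v_rep = np.repeat(v, n_q // n_kv, axis=0)
    scores = q @ k_rep.transpose(0, 2, 1) / np.sqrt(d)   # (n_q, t, t)
    mask = np.triu(np.full((t, t), -np.inf), k=1)        # causal mask
    weights = np.exp(scores + mask)
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over keys
    return weights @ v_rep                               # (n_q, t, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))   # 8 query heads
k = rng.standard_normal((2, 4, 16))   # 2 shared KV heads
v = rng.standard_normal((2, 4, 16))
out = gqa_attention(q, k, v)
print(out.shape)   # (8, 4, 16)
```

MLA goes a step further by projecting K/V into a small latent space, but the repeat-and-attend pattern above is the core of GQA: the KV cache holds 2 heads here instead of 8.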

  • MiniMax M2.7: Early Echoes of Self-Evolution — MiniMax introduces M2.7, a model capable of autonomous self-evolution. It excels in software engineering, complex office tasks, and machine learning competitions. The model demonstrates advanced agentic capabilities, optimizing its own training workflows and achieving high scores on benchmarks like SWE-Pro and MLE Bench Lite.

  • MiniMax launches M2.7 model on MiniMax Agent and APIs — MiniMax has launched its M2.7 model, which uses reinforcement learning and agent harnesses for self-updating capabilities. Available via API and MiniMax Agent, it excels in software engineering and complex workflows, demonstrating high performance on benchmarks like SWE-Pro and VIBE-Pro.

  • How Karpathy's Autoresearch Works And What You Can Learn From It — An analysis of Andrej Karpathy's Autoresearch, a system for autonomous training script optimization. It details how tight constraints, stable metrics, and time-bounded experiments create reliable agentic workflows, emphasizing that architecture and operational discipline are more critical for successful autonomy than open-ended freedom.
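The ingredients called out here, a fixed metric, a monotone acceptance rule, and a hard time bound, can be sketched as a harness loop. In this sketch `propose_change` and `run_trial` are hypothetical stand-ins for the agent's edit step and the training run; nothing below is taken from the Autoresearch code itself:

```python
import time

def autotune(baseline_score, propose_change, run_trial,
             time_budget_s=5.0, max_trials=20):
    """Accept a proposed change only if it beats the current best on a
    fixed metric; stop at the time budget or trial cap, whichever hits
    first, so no single run can drift open-endedly."""
    best = baseline_score
    accepted = []
    deadline = time.monotonic() + time_budget_s
    for trial in range(max_trials):
        if time.monotonic() >= deadline:
            break                      # hard time bound
        change = propose_change(trial)
        score = run_trial(change)      # stable, pre-agreed metric
        if score > best:               # monotone acceptance rule
            best = score
            accepted.append(change)
    return best, accepted

# Toy stand-ins: "changes" are learning rates, the "metric" peaks at 0.3.
proposals = [0.1, 0.5, 0.3, 0.9, 0.2]
best, kept = autotune(
    baseline_score=0.0,
    propose_change=lambda i: proposals[i % len(proposals)],
    run_trial=lambda lr: 1.0 - abs(lr - 0.3),
    max_trials=5,
)
print(best, kept)
```

The constraints do the heavy lifting: even a bad proposer cannot regress the metric or exceed the budget, which is the article's point about discipline beating open-ended freedom.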

  • Training Composer for longer horizons — Cursor introduces a reinforcement learning technique called "self-summarization" to train its Composer agent for long-horizon coding tasks. By incorporating compaction into the training loop, the model learns to efficiently condense context, reducing errors by 50% on benchmarks while handling trajectories exceeding its context window.
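Independent of Cursor's training recipe, the compaction step itself can be sketched as ordinary context management: when the history exceeds a token budget, fold the oldest messages into a single summary entry. Here `summarize` is a stand-in for a model call, and the 4-characters-per-token estimate is a rough heuristic, not a real tokenizer:

```python
def approx_tokens(text: str) -> int:
    # Crude estimate: roughly 4 characters per token.
    return max(1, len(text) // 4)

def compact(history: list[str], budget: int, summarize) -> list[str]:
    """Fold the oldest half of the history into one summary message,
    repeatedly, until the whole history fits inside `budget` tokens."""
    while sum(approx_tokens(m) for m in history) > budget and len(history) > 1:
        half = max(1, len(history) // 2)
        summary = summarize(history[:half])
        history = [summary] + history[half:]
    return history

# Toy summarizer: keep only the first sentence of each folded message.
def toy_summarize(msgs):
    return "Summary: " + " ".join(m.split(".")[0] for m in msgs)

history = [f"Step {i}. Lots of intermediate tool output here." for i in range(8)]
compacted = compact(history, budget=40, summarize=toy_summarize)
print(len(compacted), "messages after compaction")
```

What the article describes is putting this operation *inside* the RL loop, so the model is rewarded for summaries that preserve exactly the context later steps turn out to need.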

  • Xiaomi stuns with new MiMo-V2-Pro LLM nearing GPT-5.2, Opus 4.6 performance at a fraction of the cost — Xiaomi released MiMo-V2-Pro, a 1-trillion parameter sparse model rivaling GPT-5.2 and Claude Opus 4.6 in reasoning and agentic tasks. Featuring a 1M-token context window and hybrid architecture, it offers frontier-level performance at significantly lower costs than US competitors, though currently limited to a proprietary API.

๐Ÿ› ๏ธ Tools

  • colab-mcp โ€” An MCP (Model Context Protocol) server designed for interacting with Google Colab. It allows developers to integrate Colab capabilities into MCP-compatible services and tools, providing setup instructions using the uv package manager.

  • NVIDIA OpenShell โ€” NVIDIA OpenShell is an open-source runtime providing secure, sandboxed execution environments for autonomous AI agents. It uses declarative YAML policies to govern network, filesystem, and process access, offering built-in GPU support and integration with various agent frameworks to ensure private and safe agent operations.

  • Lessons from Building Claude Code: How We Use Skills โ€” Anthropic details lessons from developing 'Skills' for Claude Code, explaining how these folder-based extensions enhance agent capabilities. It categorizes skills into types like API references and business automation, while offering technical best practices for context engineering, data persistence, and marketplace distribution to improve developer workflows.
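Skills are folder-based: each skill directory carries a `SKILL.md` whose YAML frontmatter gives the agent a name and description to match against tasks. A minimal loader sketch, assuming that publicly documented layout; the parsing here is deliberately naive (simple `key: value` lines only), and real frontmatter should go through a YAML library:

```python
import os
import tempfile

def load_skill_metadata(skill_dir: str) -> dict:
    """Read the YAML frontmatter of SKILL.md into a dict.
    Naive line-based parse; no nested YAML support."""
    meta = {}
    path = os.path.join(skill_dir, "SKILL.md")
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    if not lines or lines[0].strip() != "---":
        return meta                      # no frontmatter block
    for line in lines[1:]:
        if line.strip() == "---":
            break                        # end of frontmatter
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

# Demo: write a hypothetical skill folder and read it back.
root = tempfile.mkdtemp()
skill = os.path.join(root, "pdf-report")
os.makedirs(skill)
with open(os.path.join(skill, "SKILL.md"), "w", encoding="utf-8") as f:
    f.write("---\n"
            "name: pdf-report\n"
            "description: Fill and render PDF reports\n"
            "---\n\n"
            "# Instructions\n...\n")
meta = load_skill_metadata(skill)
print(meta["name"], "-", meta["description"])
```

The body below the frontmatter holds the actual instructions, which is what keeps skills cheap for context engineering: the agent sees only name and description until a skill is actually selected.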

  • Open SWE: An Open-Source Framework for Internal Coding Agents — LangChain has released Open SWE, an open-source framework for building internal coding agents. Built on Deep Agents and LangGraph, it features isolated sandboxes, curated toolsets, and multi-platform integration (Slack, Linear, GitHub), mirroring architectural patterns used by companies like Stripe and Coinbase.

  • Meet the new Stitch: AI-Native Design Partner Updates — Google's Stitch introduces major updates, including an AI-native infinite canvas, a smarter design agent with full canvas context, and voice-controlled design. The platform also adds instant prototyping capabilities and DESIGN.md for maintaining design system consistency across projects.

🌅 Closing Reflection

The transition from standalone models to integrated agentic layers is beginning to disrupt traditional SaaS business models by operating across previously isolated software silos. Watch for the performance of self-evolving models and the economic impact as major tech firms experiment with replacing human staff with autonomous AI workflows.

๐Ÿ™ Thanks & Contact

Thanks for reading! If you have suggestions or feedback, I'd love to hear from you via my contact form. See you next week!