🗒️ Weekly Notes

Personal Reflection

This week highlights a decisive pivot toward agentic autonomy, exemplified by Anthropic's computer-control capabilities and Meta's research into self-improving Hyperagents. The shift from static apps to agent-centric ecosystems is accelerating as Amazon and Apple reimagine hardware and operating systems to prioritize API-driven interactions over traditional GUIs. Simultaneously, advancements in distillation and extreme compression like TurboQuant suggest a future where sophisticated reasoning models move from the cloud to local, on-device environments.

Week 2026-W13

🧠 Main

App Store | Age of Agent — This article explores why the traditional App Store model is failing in the AI agent era. It argues that agents rely on APIs and open protocols like MCP rather than GUIs, shifting the economic focus from distribution and lock-in to discovery algorithms and payment orchestration.
Amazon is reportedly developing an AI-centric smartphone — The device is reportedly codenamed "Transformer." A decade after the Fire Phone's failure, it aims to integrate Amazon's AI services and shopping features, potentially bypassing traditional app stores in favor of AI-driven interactions and direct service integration.
Apple Can Create Smaller On-Device AI Models From Google's Gemini — Apple has reportedly gained full access to Google's Gemini models to perform distillation, creating smaller, efficient on-device AI models for Siri. This allows Apple to train specialized versions that mimic Gemini's reasoning while running locally on devices for upcoming iOS 27 features.
Anthropic's Claude Can Now Control Your Computer — Anthropic has launched a research preview allowing Claude to control macOS computers to perform tasks like file management and app navigation. Additionally, Google announced new Gemini AI integrations for Workspace, enabling automated data entry in Sheets and enhanced content creation in Docs and Slides.
OpenAI Scraps Sora Video Platform Months After Launch — OpenAI is discontinuing its Sora video platform, developer API, and video features within ChatGPT. This strategic shift redirects resources toward productivity tools, coding, and agentic systems as the company refocuses its roadmap ahead of a potential initial public offering.
Mark Zuckerberg Is Building an AI Agent to Help Him Be CEO — Mark Zuckerberg is developing a personal AI "CEO agent" to streamline information retrieval and management tasks. This project is part of Meta's broader initiative to integrate AI-native workflows across the company, aiming to flatten organizational structures and increase efficiency to compete with smaller AI startups.
Apple Plans AI Reboot With Siri App, New Look and 'Ask Siri' Button in iOS 27 — Apple is planning a significant AI overhaul for iOS 27, featuring a standalone Siri app and a conversational interface. The update includes a system-wide 'Ask Siri' button, deeper app integration, and 'Write with Siri' tools, powered by updated internal models and a partnership with Google Gemini.
Anthropic's Claude Code and Cowork can control your computer — Anthropic has introduced autonomous computer control for its Claude Code and Cowork AI tools. Currently a macOS research preview for Pro and Max subscribers, the feature enables Claude to navigate apps, browsers, and development tools directly, building on earlier capabilities to automate complex agentic workflows.
Apple Plans to Open Up Siri to Rival AI Assistants in iOS 27 Update — Apple plans to open Siri to third-party AI assistants like Google Gemini and Anthropic's Claude in iOS 27. This strategic shift introduces an "Extensions" system, allowing users to choose their preferred AI provider and enabling Apple to generate revenue from App Store AI subscriptions.
Gemini 3.1 Flash Live: Making audio AI more natural and reliable — Google has launched Gemini 3.1 Flash Live, a high-quality voice model featuring lower latency and improved tonal understanding. It excels in complex function calling and reasoning benchmarks, enabling more natural real-time interactions across Gemini Live, Search Live, and developer APIs.
everything claude has shipped in 2026 and how to actually use it — This guide details Anthropic's rapid 2026 release cycle, featuring the Claude 4.6 model family (Opus, Sonnet, Haiku) and a 1M token context window. It provides setup instructions for Claude Cowork agents and Claude Code developer tools, alongside benchmarks and ecosystem updates.
Google partners with Agile Robots, growing its AI robotics footprint — Google DeepMind has partnered with Agile Robots to integrate Gemini Robotics foundation models into industrial hardware. The collaboration focuses on enhancing AI performance through real-world data collection in manufacturing, marking another strategic move in Google's expanding robotics ecosystem alongside Boston Dynamics and Apptronik.
I analyzed all YC W26 companies: Here's what it tells us about the future. — An analysis of the YC W26 batch shows 85% of companies are AI-first, shifting from copilots to autonomous agents. Major trends include healthcare-focused AI, vertical coding agents, physical robotics, and specialized infrastructure for an agent-centric economy across fintech and defense sectors.
OpenAI rolls out ChatGPT Library to store your personal files — OpenAI introduced "Library" for ChatGPT Plus, Pro, and Business tiers, enabling users to store uploaded files and images in a dedicated cloud location. This feature allows persistent access to files for future chats, ensuring they remain available even if the original conversation is deleted.
OpenAI reportedly plans to double its workforce to 8,000 employees — OpenAI aims to double its workforce to 8,000 employees by late 2026, focusing on engineering, research, and sales. The expansion seeks to maintain a competitive edge against rivals like Anthropic and support large-scale enterprise and government contracts.
Tesla, SpaceX Plan to Build New Chip Factory in Texas — Tesla and SpaceX are partnering to build "Terafab," a massive chip manufacturing facility in Austin, Texas. The factory will produce specialized semiconductors for Tesla's electric vehicles and Optimus robots, as well as AI-capable chips for SpaceX satellites to ensure supply chain independence for Musk's various AI initiatives.
OpenAI is throwing everything into building a fully automated researcher — OpenAI has made building a fully automated AI researcher its "North Star" goal. Chief Scientist Jakub Pachocki details plans for an autonomous research intern by September 2026 and a multi-agent system by 2028, leveraging reasoning models to tackle complex scientific and mathematical problems.

🧪 Research

Hyperagents — Meta AI researchers introduced Hyperagents, self-referential, self-improving AI agents that integrate task-solving and self-modification into a single editable program. The framework enables metacognitive self-improvement across any computable task through meta-agent and task-agent interactions, allowing agents to improve both their problem-solving abilities and the mechanisms they use to generate further improvements.
Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x — Google Research introduced TurboQuant, a compression algorithm reducing LLM memory usage by 6x and boosting speed by 8x. By employing polar coordinates and 1-bit error correction for KV cache quantization, TurboQuant maintains model accuracy across benchmarks without requiring additional training.
Mistral AI just released a text-to-speech model it says beats ElevenLabs — and it's giving away the weights for free — Mistral AI released Voxtral TTS, a 3.4B-parameter open-weight text-to-speech model designed for enterprise use. It supports nine languages, offers zero-shot voice adaptation, and outperforms ElevenLabs in human preference tests. The model is highly efficient, requiring only 3GB of RAM and running locally on consumer hardware.
State of Context Engineering in 2026 — An analysis of five essential patterns for context engineering in AI agents: progressive disclosure, context compression, routing, advanced retrieval, and tool management. It evaluates tradeoffs in accuracy, latency, and token efficiency to optimize agent performance within finite LLM attention budgets.
Silicon Valley's two biggest dramas have intersected: LiteLLM and Delve — LiteLLM, a popular open-source AI project, recently suffered a malware attack via a malicious dependency. The breach has drawn attention because LiteLLM's security was certified by Delve, a startup under fire for allegedly falsifying compliance data and using rubber-stamp auditors.
ARC-AGI-3 Release and Kaggle Competition — François Chollet announced ARC-AGI-3, a benchmark evaluating agentic intelligence through interactive reasoning. While humans solve 100% of these environments without prior instruction, current frontier AI models score under 1%. The release includes new Kaggle competitions focused on fluid intelligence and novel reasoning.
The Death of model.fit(): What Data Scientists Actually Do in the Age of AI Agents — As AI agents replace traditional model training, the role of data scientists shifts to 'Evaluation-Driven Development.' This involves creating error taxonomies, golden datasets, and rigorous evaluation frameworks to manage non-deterministic behavior and ensure reliability through systematic measurement and context engineering.
TRIBE v2: An AI Model of the Human Brain — Meta released TRIBE v2, a foundation model predicting human brain activity from sensory stimuli. Using a three-stage transformer architecture, it scales to 70,000 voxels and enables zero-shot generalization, outperforming individual fMRI scans by predicting canonical neural responses with high fidelity across subjects.
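
To make the evaluation-driven development item above more concrete, here is a minimal Python sketch of a golden dataset scored against a small error taxonomy. It is an illustration only, not code from the linked article: the names `GoldenCase`, `classify_error`, `evaluate`, and `toy_agent`, along with the three error categories, are assumptions invented for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenCase:
    """One hand-verified prompt/answer pair (hypothetical structure)."""
    prompt: str
    expected: str


def classify_error(expected: str, actual: str) -> str:
    """Map a mismatch onto a tiny, made-up error taxonomy."""
    if actual.strip() == "":
        return "empty_output"
    if actual.strip().lower() == expected.strip().lower():
        return "formatting"  # right answer, wrong casing or whitespace
    return "wrong_answer"


def evaluate(agent: Callable[[str], str], cases: list[GoldenCase]) -> dict:
    """Run the agent over every golden case and tally failures by category."""
    if not cases:
        raise ValueError("golden dataset is empty")
    errors: Counter = Counter()
    passed = 0
    for case in cases:
        actual = agent(case.prompt)
        if actual == case.expected:
            passed += 1
        else:
            errors[classify_error(case.expected, actual)] += 1
    return {"pass_rate": passed / len(cases), "errors": dict(errors)}


def toy_agent(prompt: str) -> str:
    """Stand-in for a real agent call; returns canned answers."""
    canned = {"2+2": "4", "capital of France": "paris", "ping": ""}
    return canned.get(prompt, "unknown")


GOLDEN = [
    GoldenCase("2+2", "4"),
    GoldenCase("capital of France", "Paris"),
    GoldenCase("ping", "pong"),
    GoldenCase("3*3", "9"),
]

report = evaluate(toy_agent, GOLDEN)
print(report)
```

Tracking failures by category rather than as a single score is what lets a team see whether a change fixed formatting slips or actual reasoning errors. With a real, non-deterministic agent, each case would typically be run several times.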

🛠️ Tools

Claude Peers MCP — An MCP server enabling multiple Claude Code instances to discover and communicate with each other. It uses a local broker daemon and SQLite to facilitate instant message passing and project status sharing between separate terminal sessions for improved cross-project coordination.
MiniMax AI Skills — A collection of development skills for AI coding agents including Claude Code and Cursor. It offers structured guidance for frontend, fullstack, mobile, and shader development, plus automated document processing for PDF, Excel, and PowerPoint using MiniMax APIs.
Plugins – Codex OpenAI Developers — This guide explains how to package and distribute OpenAI Codex skills and integrations as plugins. It details plugin components like skills and MCP servers, installation processes via CLI or the app, and technical structures for creating manifests and local marketplaces to facilitate reusable AI workflows.
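
As a rough illustration of the SQLite-backed message passing described in the Claude Peers MCP item, the sketch below shows one way separate sessions could exchange messages through a shared table. It is not the project's actual schema or API: the table layout and the `open_broker`, `send`, and `receive` helpers are assumptions for this example, and it uses an in-memory database rather than a broker daemon.

```python
import sqlite3
import time


def open_broker(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the shared message store. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS messages (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            sender    TEXT NOT NULL,
            recipient TEXT NOT NULL,
            body      TEXT NOT NULL,
            sent_at   REAL NOT NULL,
            delivered INTEGER NOT NULL DEFAULT 0
        )
        """
    )
    return conn


def send(conn: sqlite3.Connection, sender: str, recipient: str, body: str) -> None:
    """Queue a message for another session."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (sender, recipient, body, sent_at) VALUES (?, ?, ?, ?)",
            (sender, recipient, body, time.time()),
        )


def receive(conn: sqlite3.Connection, recipient: str) -> list[tuple[str, str]]:
    """Fetch undelivered messages in order and mark them delivered."""
    with conn:
        rows = conn.execute(
            "SELECT id, sender, body FROM messages "
            "WHERE recipient = ? AND delivered = 0 ORDER BY id",
            (recipient,),
        ).fetchall()
        conn.executemany(
            "UPDATE messages SET delivered = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return [(sender, body) for _, sender, body in rows]


broker = open_broker()
send(broker, "session-a", "session-b", "tests pass on main")
send(broker, "session-a", "session-b", "starting refactor")
first = receive(broker, "session-b")
second = receive(broker, "session-b")
print(first, second)
```

Pointing `open_broker` at a file path instead of `:memory:` would let separate processes share the same store, which is the general idea behind coordinating terminal sessions through a local database.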

🌅 Closing Reflection

As the industry transitions from copilots to autonomous agents, the core challenge remains closing the reasoning gap identified in benchmarks like ARC-AGI-3. Success will likely depend on the rise of evaluation-driven development to manage the non-deterministic behavior of multi-agent systems. Watch for how the race for hardware and talent independence defines the next phase of competition among the largest AI labs.

🙏 Thanks & Contact

Thanks for reading! If you have suggestions or feedback, I'd love to hear from you via my contact form. See you next week!
