Cursor

Companies

Feb 2026 • 2 videos

Steady coverage of Cursor. 20VC with Harry Stebbings and TechCrunch contributed to 2 videos from 2 sources.

Apr 2026 • 1 video

Lighter month. AI Coding Daily covered Cursor in 1 video.

May 2026 • 3 videos

High activity month for Cursor. AI Coding Daily and TechCrunch among the most active voices, with 3 videos across 2 sources.

Jul 2026 • 1 video

Lighter month. 20VC with Harry Stebbings covered Cursor in 1 video.

// 20VC with Harry Stebbings
The Collapse of Closed-Model Hegemony and the Rise of Private Data

The narrative surrounding artificial intelligence has been dominated by a singular, loud obsession: Artificial General Intelligence (AGI). The prominent labs, most notably OpenAI and Anthropic, pitch a future ruled by one or two monolithic, ultra-intelligent models that solve every human problem. It is a neat, centralized vision. It is also completely wrong. Lin Qiao, the Co-Founder and CEO of Fireworks AI, brings a pragmatism forged during her years on the founding team of PyTorch at Meta. Her perspective is clear: the future belongs to specialized, private intelligence, not generalized monoliths.

The fundamental argument rests on the nature of data itself. The public internet, which fuels general-purpose frontier models, represents a tiny fraction of the world's information. The vast majority of valuable data is private, locked behind enterprise firewalls and deep inside proprietary applications. This data is the lifeblood of business. It is a company's core intellectual property. No sane executive will hand this data over to a centralized AGI provider to train a model that their competitors can then rent. To activate this private data, enterprises must customize and steer their own models.

This shift exposes the massive strategic disconnect at the heart of the closed-model ecosystem. A general-purpose API cannot be sufficiently customized. It cannot adapt to the unique design principles, brand voices, or operational demands of distinct businesses. As enterprises realize that they can achieve superior performance by tuning smaller, open-weights models on their proprietary datasets, the massive valuations of closed-model giants begin to look highly unstable.

The Product-Market Fit Paradox and the Threat of "Scaling to Bankruptcy"

In the software-as-a-service (SaaS) era, finding product-market fit (PMF) was the ultimate goal. Once you achieved it, scaling was a mathematical certainty.
Central processing units (CPUs) were cheap commodities, and the cost of serving an additional customer was negligible. In the AI era, this playbook is dead. PMF and a durable business model are now two completely separate concepts.

Startups and digital natives are encountering a brutal new phenomenon: scaling to bankruptcy. A company can build an application that users absolutely love, but if every user interaction queries an expensive closed-model API, the cost of scaling those features can easily outpace revenue. The problem is even more acute for established incumbents with millions of existing users. If a legacy giant rolls out an unoptimized AI feature to its entire user base, the resulting computing bill could decimate its margins. Chief Financial Officers are looking at the projected token costs of frontier APIs and flatly refusing to greenlight deployments.

This economic reality is driving the rapid shift toward open-weights models. When an enterprise controls the model weights, they control the hosting, the optimization, and the long-term cost structure. They can deploy a model that is tailored precisely to their workload, stripping out unnecessary parameters to maximize efficiency. In a world where a 5% reduction in inference costs can save millions of dollars at production scale, the ability to optimize a custom model is not just a technical preference; it is a matter of corporate survival.

Why Token Costs Will Plunge 10x as Free Markets End the Compute Shortage

There is a prevailing belief that the staggering cost of AI compute is a permanent tax on innovation. It is an illusion caused by a temporary, severe supply chain bottleneck. In any free economy, a shortage that drives prices sky-high acts as a beacon for capital and competition. The current physical constraints (the scarcity of high-bandwidth memory, specialized packaging, and power) will inevitably yield to market forces.
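To make the "scaling to bankruptcy" arithmetic above concrete, here is a minimal sketch that compares monthly revenue with the monthly token bill at different token prices. Every input (user count, requests per user, tokens per request, revenue per user, and the per-million-token prices) is a made-up assumption for illustration, not a figure from the interview, and the names `Workload`, `monthly_inference_cost`, and `monthly_margin` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical product workload; all values are illustrative assumptions."""
    monthly_users: int
    requests_per_user: int     # AI requests per user per month
    tokens_per_request: int    # prompt plus completion tokens
    revenue_per_user: float    # USD per user per month


def monthly_inference_cost(w: Workload, usd_per_million_tokens: float) -> float:
    """Total monthly token bill in USD at a given blended token price."""
    total_tokens = w.monthly_users * w.requests_per_user * w.tokens_per_request
    return total_tokens / 1_000_000 * usd_per_million_tokens


def monthly_margin(w: Workload, usd_per_million_tokens: float) -> float:
    """Revenue minus inference cost; a negative value means growth loses money."""
    revenue = w.monthly_users * w.revenue_per_user
    return revenue - monthly_inference_cost(w, usd_per_million_tokens)


if __name__ == "__main__":
    # Assumed workload: 1M users, 300 requests each, 4,000 tokens per request.
    w = Workload(monthly_users=1_000_000, requests_per_user=300,
                 tokens_per_request=4_000, revenue_per_user=10.0)
    # Assumed token prices in USD per million tokens.
    for label, price in [("baseline", 10.0), ("5% cheaper", 9.5), ("10x cheaper", 1.0)]:
        cost = monthly_inference_cost(w, price)
        margin = monthly_margin(w, price)
        print(f"{label:>12}: cost ${cost:,.0f}, margin ${margin:,.0f}")
```

With these assumed inputs, the baseline loses money every month even though users are paying, a 5% price cut saves $600,000 a month, and a 10x price cut turns the margin positive. Real workloads would also need to account for hosting, engineering, and non-inference costs, which this sketch ignores.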
Over the next three years, a confluence of optimization vectors will drive a projected 10x reduction in the cost of generating a token. First, model efficiency is rising rapidly. Engineers are learning to build models that solve complex tasks with far fewer tokens, moving away from verbose, unoptimized outputs. Second, hardware and software co-design is yielding massive efficiency gains. Specialized platforms like Fireworks AI can optimize inference deployments to make the unit economics of token generation highly competitive. Finally, as the global supply chain for GPUs and alternative silicon matures over the next two to three years, the raw cost of compute infrastructure will compress.

This 10x reduction in token costs will not result in lower overall spending on AI. Instead, it will unlock a 100x explosion in usage. When token costs fall past a certain threshold, intelligence shifts from a costly luxury to a cheap, ubiquitous utility. Enterprises will stop rationing their AI queries and start deploying agentic systems that run continuously in the background, autonomously executing complex workflows.

Inside the Cursor Playbook: Post-Training, Decoupled Reinforcement Learning, and Global GPU Scarcity

To understand what high-performance AI development looks like under capital constraints, look at the software development platform Cursor. While massive hyperscalers train frontier models on sprawling, homogeneous clusters interconnected by incredibly expensive networking, nimble startups must innovate. The partnership between Cursor and Fireworks AI reveals a highly efficient, distributed approach to reinforcement learning (RL) that points to the future of model training. Instead of running trainer and rollout phases together on a single, massive, and nearly unobtainable cluster, the system decouples these components. The trainer, which updates the model's weights, generates new model versions continuously.
These versions are immediately deployed to RL rollout environments scattered across five or six distinct data center regions globally. These rollouts interact with synthetic or real coding environments to gather rewards and evaluate model performance. This fully distributed architecture allows Cursor to tap into scattered, lower-cost GPU capacity around the world rather than waiting for a single monolithic cluster to become available.

The core technical hurdle in this design is model weight synchronization. Latency in sending updated weights across global regions threatens to make the gathered rewards stale, which can degrade training quality. To solve this, the platform uses highly optimized synchronization mechanisms to distribute fresh weights fast enough to maintain numerical soundness without requiring an ultra-expensive, low-latency network. This cooperative system engineering is what enabled Cursor to scale its capabilities rapidly while remaining highly capital-conscious.

The Illusion of Homogeneity: Why Hardware Depreciates in Months, Not Years

For decades, enterprise IT planning has relied on predictable capital expenditure cycles. Servers and data center hardware were depreciated over a comfortable five-to-six-year lifespan. This predictable cadence is completely incompatible with the blistering pace of AI innovation. Today, hardware and model depreciation cycles are compressed into months. A single chip vendor might release three new product variations within a single year. Simultaneously, the open-source community releases superior models on a weekly basis. Because newer, more complex models run exponentially better on the latest hardware architectures, old silicon becomes obsolete long before its physical lifespan is over. Running a cutting-edge model on a two-year-old chip is highly inefficient, yet depreciating expensive GPU clusters over twelve months wreaks havoc on traditional balance sheets.
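The balance-sheet tension described above can be illustrated with a small straight-line depreciation sketch. The $10 million cluster price is a hypothetical figure, and straight-line depreciation with zero salvage value is an assumption chosen for simplicity; the source does not name an accounting method. The function names are illustrative only.

```python
def straight_line_monthly_expense(purchase_cost: float,
                                  useful_life_months: int,
                                  salvage_value: float = 0.0) -> float:
    """Monthly depreciation expense under straight-line depreciation."""
    if useful_life_months <= 0:
        raise ValueError("useful_life_months must be positive")
    return (purchase_cost - salvage_value) / useful_life_months


def book_value(purchase_cost: float,
               useful_life_months: int,
               months_elapsed: int,
               salvage_value: float = 0.0) -> float:
    """Remaining book value after months_elapsed, floored at the salvage value."""
    monthly = straight_line_monthly_expense(
        purchase_cost, useful_life_months, salvage_value)
    # Depreciation stops once the asset reaches the end of its useful life.
    elapsed = min(max(months_elapsed, 0), useful_life_months)
    return purchase_cost - monthly * elapsed


if __name__ == "__main__":
    cluster_cost = 10_000_000.0  # hypothetical GPU cluster price in USD
    for label, life in [("5-year schedule", 60), ("12-month schedule", 12)]:
        expense = straight_line_monthly_expense(cluster_cost, life)
        remaining = book_value(cluster_cost, life, months_elapsed=12)
        print(f"{label}: ${expense:,.0f}/month, "
              f"book value after 12 months ${remaining:,.0f}")
```

Under these assumptions, the twelve-month schedule books five times the monthly expense of the five-year schedule and writes the cluster down to zero within a year, while the five-year schedule still carries 80% of the purchase price on the books at that point.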
This rapid depreciation forces a fundamental reassessment of the "build versus buy" decision for AI infrastructure. Building and operating proprietary data centers is a highly specialized, capital-intensive endeavor that requires deep expertise in liquid cooling, power distribution, and high-performance networking. For the vast majority of companies, attempting to manage this rapidly evolving hardware stack is a distraction. Only giant platforms with massive, stable, and predictable workloads can justify the immense capital expenditure of building custom silicon and proprietary physical data centers. For everyone else, leveraging a specialized software platform that runs agnostically across all hardware architectures is the only way to maintain agility.

Sovereign Power Lines and the Urgent Necessity of Organizational Independence

The geopolitical risk of centralized AI infrastructure has become impossible to ignore. When an administration can cut off access to a critical frontier model API with the stroke of a pen, relying on third-party AI providers is a major liability. If a nation's healthcare system, financial infrastructure, or legal services are built on top of a closed API hosted in another jurisdiction, that nation has compromised its sovereignty.

This vulnerability is driving the rise of sovereign AI models. Just as nations must secure their own physical power lines and water supplies, they must ensure they have independent access to intelligence infrastructure. The open-source ecosystem is the key enabler of this independence. By deploying and customizing open-weights models on national infrastructure, countries can build resilient systems that cannot be deactivated by a foreign corporation or government.

This same principle applies at the enterprise level. No forward-thinking CEO should allow their company's core operations to depend on an API controlled by a single third party.
If that provider changes their pricing, alters their model's behavior, or revokes access, the dependent business faces immediate disruption. Owning your own intelligence by tuning open-weights models and running them on independent infrastructure is not just a technical optimization—it is a mandatory risk-management strategy.
5 days ago
// AI Coding Daily
May 22, 2026
// AI Coding Daily
May 20, 2026
// TechCrunch
May 1, 2026
// AI Coding Daily
Apr 3, 2026
// 20VC with Harry Stebbings
The Great Compression of the Software Talent Stack

Software engineering is facing a structural collapse of traditional role boundaries. We are witnessing what Alexander Embiricos, the lead for Codex at OpenAI, calls the compression of the talent stack. In the previous era of development, teams relied on a rigid hierarchy: backend engineers handled logic, frontend engineers managed the interface, designers provided the vision, and product managers (PMs) acted as the connective tissue. That model is obsolete. As AI models become increasingly proficient at cross-disciplinary tasks, the need for hyper-specialized silos vanishes.

The future belongs to the full-stack builder who operates with a level of agency previously reserved for small team leads. Even the role of the PM is under fire; when engineers can use AI to look around corners and automate the administrative overhead of development, the need for a dedicated coordinator diminishes for all but the largest organizations. This isn't about the elimination of engineers; it is about their evolution into superhuman architects who manage fleets of digital agents rather than writing every line of syntax by hand.

From Pair Programming to Full Delegation

A critical shift occurred between GPT-4 and the latest iterations of Codex. We have moved past the era of "tab completion" where AI simply suggested the next few words. We are now in the age of delegation. In the old pair-programming model, you still had your hands on the keyboard, treating the AI like a junior assistant. Today, the workflow is fundamentally different: you provide a high-level spec, review a generated plan, and then let the AI "cook." At OpenAI, the vast majority of internal code is no longer written by humans. Engineers spend their time on architectural decisions and reviewing the AI’s output. This transition requires a new form factor.
Traditional Integrated Development Environments (IDEs) were built for typing; they are not optimized for managing multiple concurrent agents. This realization led to the development of the Codex App, a standalone interface designed specifically for high-level delegation rather than manual text editing. The IDE as we know it is becoming a legacy tool for those who still want to own every character, while the market winners will be those who master the art of the plan-and-review cycle.

Solving the AGI Bottleneck: Human Action and Validation

The real barrier to Artificial General Intelligence (AGI) isn't model compute or architectural limitations: it's us. Specifically, it is the speed at which humans can type and validate AI output. Currently, a power user might interact with AI 30 to 50 times a day. To reach the potential of AGI, that number needs to be in the tens of thousands. We are currently too lazy and too uncreative to prompt our way to the future. We shouldn't have to figure out how to use the tool; the tool should proactively chime in with context-aware solutions. The goal is to make AI usage effortless.

This is why top-down enterprise automation often fails. When a company tries to force-feed AI workflows from the C-suite down, they miss the nuance of the actual work. The most successful adoption happens when individuals feel empowered by open-ended tools that they can adapt to their specific, creative needs. Once users achieve fluency, the automation of workflows follows naturally.

The Three Phases of Agent Evolution

The path to ubiquitous AI agents follows a distinct three-step speedrun. First, we establish dominance in software engineering because code is a high-signal, deterministic domain where LLMs already excel. Second, we realize that every effective agent is, at its core, a coding agent. Coding is simply the best language for an agent to manipulate a computer.
During this phase, agents move beyond the IDE and start using browsers and local file systems to perform general tasks. Finally, we reach the productization phase. Once we observe which workflows builders are manually hacking together, we can bake those into specific, high-intent features.

The industry is currently in the messy middle of phase two. Companies like Anthropic with Claude Code and Cursor are racing to define the interface of this era. OpenAI is betting on open standards like "agents.md" to ensure that users aren't locked into a single ecosystem, believing that the distribution of intelligence matters more than creating a walled garden.

Market Dynamics: Survival in the Age of Commodity Code

For investors and founders, the ground is shifting. If building a product is now trivial, then the "moat" of having a good product is gone. The value has migrated back to domain expertise, customer relationships, and distribution. We are entering a terminal stage of the market where a few massive providers will capture the majority of the value because they own the center of gravity of the conversation.

In the same way Slack became the center of gravity for communication, a single, conversational agent will likely become the center of gravity for work. Users don't want to manage twelve different agents for twelve different tasks; they want one entity they can talk to about anything. SaaS companies that serve as mere "glue layers" are in grave danger. However, companies that own deep systems of record or gnarly physical infrastructure integrations will remain vital. The war for talent in this space is fierce, but the real winners won't just be the ones with the most GPUs; they will be the ones who build the most ergonomic systems of engagement that humans actually enjoy using.
Feb 21, 2026
// TechCrunch
The Rebirth of the Digital Foundation

The digital world is currently undergoing a structural overhaul unlike anything we have seen in decades. Andreessen Horowitz (a16z) recently locked in a massive $1.7 billion fund dedicated specifically to AI infrastructure. This isn't just about throwing capital at a trend; it is about rebuilding the entire stack for a new era of compute. Jennifer Li, General Partner at a16z, identifies this as a "super cycle" where every layer, from silicon chips to the model layer, requires a complete retooling. The infrastructure running AI today was never designed for these specific workflows, creating a massive opportunity for founders who can build the hard technical backbone of the future.

Beyond Large Language Models

While the market fixates on chatbots, the real action is happening in specialized foundational models and inference clouds. Companies like ElevenLabs and Ideogram are not just building applications; they are developing their own models from pre-training to post-training. We have rapidly crossed the "uncanny valley" in audio and image generation. What seemed like a glitchy experiment six months ago is now indistinguishable from reality. Jennifer Li notes that her own voice clone in Japanese was enough to startle her husband, a native speaker. This leap in quality is shifting the focus from "can we do this?" to "how do we scale the inference?" Firms like Fal are stepping in to serve as the inference cloud for these multimedia models, providing the high-speed, low-latency environment necessary for real-world deployment.

The Rise of Agentic Productivity

2026 marks the definitive shift from simple AI assistants to long-running, autonomous agents. We are moving past the "co-pilot" phase and entering a world where agents handle entire processes. While the concept has been discussed for years, we are finally seeing real ROI. These tools are being built to solve the "time and attention" crisis facing modern knowledge workers.

The Trust Gap in Autonomy

Despite the excitement, a significant hurdle remains: trust. Handing over a calendar is one thing; letting an agent manage a sensitive inbox is another. Julie Bort highlights that humans are still superior at connecting dots and identifying unspoken context, the outliers that LLMs often miss. For agents to truly replace mundane tasks like data entry or order processing, they must move beyond token prediction and into world models that understand real-world physics and interaction. The industry is reaching a consensus that LLMs alone will not achieve AGI; we need multimodality and the ability for AI to interact with physical reality.

The Velocity of Growth and the Talent Crunch

We are witnessing unprecedented growth trajectories. Companies like Cursor and ElevenLabs have scaled from zero to hundreds of millions in revenue in record time. However, this velocity introduces extreme pressure. Jennifer Li warns that not all ARR is created equal, and founders must focus on business quality and durability rather than just top-line hype.

The biggest bottleneck to this growth isn't capital; it's people. There is a profound shortage of AI-native talent capable of moving at this speed. Startups are scaling to massive valuations while keeping their headcounts under 100 people. This lean structure means every hire is a high-stakes decision. Founders are forced to solve complex legal, compliance, and deepfake challenges on the fly, often without the safety net of a CFO or traditional corporate guardrails.

The Search for Accuracy

As we look toward the next wave of investment, the focus is pivoting back to search. Not the old-school web search of the 2000s, but agentic search infrastructure. Agents need up-to-date, hyper-accurate information to function without hallucination. The demand for personalized, high-frequency search is skyrocketing.
The next billion-dollar infrastructure play will likely be a team that solves the accuracy and latency problems for the millions of agents about to be unleashed on the global market. The market is wide open for disruption, and the capital is ready to ignite the next great solution.
Feb 4, 2026