The agent hype is here, and I’ve bought in. While Matthew McConaughey Agentforce ads garner well-deserved chuckles, you don’t need to search far to find genuinely transformational agentic products. Replit Agent, v0, and Operator are increasingly splitting the world into people who have seen the future of software and people who haven’t.
As an infrastructure investor, what excites me most is that agents are a fundamentally new software workload. For the first time in more than a decade, tens of millions of developers have an urgent need for new infrastructure solutions to enable their progress.
Over the last two years, the primary inhibitor to agentic progress and adoption has been the quality of the LLMs. I no longer believe that’s true. The LLMs are pretty damn great, and the inhibitor is becoming the infrastructure required for developers to build and scale agents, and for enterprises to adopt them securely and effectively.
By the end of 2025, I expect the toolset available to agent developers to be far more robust. I’m especially excited about a few new ideas.
The combination of a powerful pre-trained LLM and a reasoning engine on top of it appears to be a winner for complex tasks. That architecture, however, is not easily scaled down and post-trained for specific domains. I’ve heard and started to believe the argument that coding is the first complex task LLMs got great at because it’s both verifiable and proximate to the labs themselves. OpenAI, Anthropic*, and Google were able to “crack” coding because they know the domain well enough to find advanced, high-quality training data for complex coding tasks. The open-source ecosystem also provides a tremendous amount of content (in this case, code) on which to train, further enhancing model accuracy.
If developers proximate to other complex domains can generate similar “golden datasets” of high-quality samples, how would they go about fine-tuning a reasoning model to improve performance? Today, even with high-quality training data and verifiable outputs, that’s nearly impossible. High-quality reasoning engines are either closed source or, as with DeepSeek, impractically hard to fine-tune. Companies like Mercor are improving access to expert training data. The next step is lowering the barrier to entry for companies with data access to run stable, scaled reinforcement learning.
OpenAI is already moving in this direction with their Reinforcement Fine-Tuning research program. A Llama version of o1 (let alone o3) would accelerate progress even more. Combined with a platform to curate data and define step-by-step reward mechanisms for models, this could dramatically accelerate agents’ capabilities and compound the advantage of companies with access to proprietary data. It would help vertical agents excel at high-value, verifiable tasks like automating data engineering work, penetration testing, or text-to-SQL. Over time, it should help agents conquer more subjective work like planning product sprints or designing websites.
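To make “verifiable” concrete, here is a minimal sketch of the kind of reward function such a platform might let a team define, using text-to-SQL as the example: a candidate query earns reward only if it returns the same rows as a reference query. The schema, data, and queries below are illustrative stand-ins, not drawn from any real benchmark or vendor API.

```python
import sqlite3

def sql_reward(candidate_sql: str, reference_sql: str) -> float:
    """Return 1.0 if the candidate query produces the same rows as the reference."""
    conn = sqlite3.connect(":memory:")
    try:
        # Tiny illustrative schema; a real setup would load the team's own data.
        conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
        conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(1, "emea", 40.0), (2, "amer", 75.0), (3, "emea", 10.0)],
        )
        try:
            got = sorted(conn.execute(candidate_sql).fetchall())
        except sqlite3.Error:
            return 0.0  # unparseable or invalid SQL earns no reward
        want = sorted(conn.execute(reference_sql).fetchall())
        return 1.0 if got == want else 0.0
    finally:
        conn.close()

# An RL fine-tuning loop would call this on every sampled completion.
print(sql_reward(
    "SELECT region, SUM(total) FROM orders GROUP BY region",
    "SELECT region, SUM(total) AS t FROM orders GROUP BY region ORDER BY region",
))  # 1.0 -- a differently written but equivalent query still scores
```

The point is that the reward is checked by execution, not by judging text, which is what makes tasks like text-to-SQL or data engineering tractable targets for reinforcement fine-tuning.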
Much like the move to the cloud brought about cloud-native infrastructure like containers, Terraform, and Kubernetes, agents will require new serverless infrastructure. Agentic infrastructure will need to have a few key properties:
We’re already seeing serverless databases like Neon find incredible product-market fit with companies building fleets of agents. Today, agents create 4x more databases on Neon than humans do. As developers build larger agentic systems with higher compute peaks, they’ll feel an increasingly acute need for serverless infrastructure.
Many agents today are running client-side, leveraging a user’s own browser to execute actions. That’s a convenient way to start—it allows the agent to use a user’s existing apps and compute—but may not be enough. We increasingly see agentic architectures that require either more compute or increased security, stretching the limits of what can be done client-side.
Even for consumer agent use cases, more compute can mean better outcomes. A travel agent that’s tasked with building the best weeklong itinerary, when running on a user’s browser, might take a few minutes to generate a satisfactory output. That same agent, when given access to unlimited browsers through a platform like Browserbase, could pull together 20 different itineraries, compare them against the user’s known preferences, and generate a fantastic output, all in less time.
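A rough sketch of that fan-out pattern is below. The browser work is a simulated placeholder rather than the Browserbase SDK; the structure is what matters: draft many itineraries concurrently, score each against the user’s known preferences, and keep the best.

```python
import asyncio
import random

async def draft_itinerary(browser_session: int) -> dict:
    # Placeholder for driving one remote headless browser through flight,
    # hotel, and activity research; here we just simulate the work.
    await asyncio.sleep(random.uniform(0.1, 0.3))
    return {"session": browser_session, "preference_score": random.random()}

async def best_of(n_browsers: int) -> dict:
    # Fan out: each draft gets its own remote browser and runs concurrently.
    drafts = await asyncio.gather(*(draft_itinerary(i) for i in range(n_browsers)))
    # Score every draft against the user's known preferences, keep the winner.
    return max(drafts, key=lambda d: d["preference_score"])

if __name__ == "__main__":
    winner = asyncio.run(best_of(20))
    print(f"Best itinerary came from browser session {winner['session']}")
```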
The opportunity in the enterprise is even greater. An agent tasked with optimizing node location in a Kubernetes cluster might want to ingest the user’s AWS logs, write code to test 10+ different location configurations, evaluate the results, and then apply the best run. Doing all of this in a user’s production AWS environment can be very risky, but running it in cloud sandboxes, like those offered by e2b and Modal, allows the agent to take greater risks in order to find the optimal path.
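Here is a simplified sketch of that “test in a sandbox, then apply” loop. The `Sandbox` interface is hypothetical, standing in for whatever a provider’s SDK actually exposes; the candidate configurations and cost math are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class Sandbox:
    """Hypothetical ephemeral environment; a real one would be provisioned
    via a provider SDK and torn down after each run."""

    def evaluate(self, config: dict) -> float:
        # Stand-in for replaying ingested logs against a candidate node layout
        # and returning a simulated cost; real logic would run inside the sandbox.
        return config["node_count"] * config["cost_per_node"]

def pick_best(candidates: list[dict]) -> dict:
    # Try every risky configuration in isolation -- never in production.
    scored = [(Sandbox().evaluate(c), c) for c in candidates]
    return min(scored, key=lambda pair: pair[0])[1]

candidates = [
    {"node_count": 4, "cost_per_node": 1.00},
    {"node_count": 3, "cost_per_node": 1.20},
    {"node_count": 5, "cost_per_node": 0.65},
]
best = pick_best(candidates)
print("Apply after human review:", best)
```

The sandbox absorbs the blast radius of the risky experiments; only the winning configuration ever touches production, and only after a human signs off.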
The internet’s client-server model has continually evolved, driven by changes both in how the web is used and in the hardware available to web apps. I expect the agentic client-server model to evolve in the same way, with headless browser clouds and sandboxed environments emerging as critical ways for developers to extend beyond the compute and latitude available on users’ machines.
Even the most capable agents face two critical obstacles within enterprises: secure integration and trust. When enterprises hear the word “non-deterministic,” they immediately imagine the worst: if I give this agent access to my software, what will it screw up?
If agents need human-like permissions to be effective but can’t be trusted like humans, we’ll need authentication and authorization infrastructure to enable the kind of transformation boards (and investors) are expecting. Part of the solution is for apps themselves to provide more fine-grained authorization for agents: something more expansive than an API or simple function-calling, but less permissive than what a human would get. Companies like Descope* are beginning to offer this, even allowing developers to build human-in-the-loop authorizations for agents.
But the enterprise integration platform of the future would dictate not only what apps an agent can access and to what extent, but also where the agent can actually run actions. In some cases an agent may be able to authenticate into a service, pull some but not all available data based on its role and history, and run an action but only in a dynamically-provisioned sandboxed environment, limiting the downside of that action until a human can approve it. In orchestrating this, an “Okta for agents” could simultaneously enable agent usage and limit downside.
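As an illustration of that scoping-plus-escalation model, here is a hypothetical sketch (not Descope’s or any vendor’s actual API): an agent gets a grant narrower than a human’s, and high-risk actions within that grant still require a human approval.

```python
from dataclasses import dataclass, field

@dataclass
class AgentGrant:
    app: str
    allowed_actions: set[str]
    needs_human_approval: set[str] = field(default_factory=set)

def authorize(grant: AgentGrant, action: str, human_approved: bool = False) -> bool:
    # Deny anything outside the grant's scope.
    if action not in grant.allowed_actions:
        return False
    # Gate high-risk actions on an explicit human-in-the-loop approval.
    if action in grant.needs_human_approval and not human_approved:
        return False
    return True

crm_grant = AgentGrant(
    app="crm",
    allowed_actions={"read_contacts", "draft_email", "send_email"},
    needs_human_approval={"send_email"},
)
print(authorize(crm_grant, "read_contacts"))                    # True
print(authorize(crm_grant, "send_email"))                       # False: pending approval
print(authorize(crm_grant, "send_email", human_approved=True))  # True
print(authorize(crm_grant, "delete_contacts"))                  # False: out of scope
```

An “Okta for agents” would layer the same idea across every app in the enterprise, plus decide where approved actions actually execute (for example, in a dynamically provisioned sandbox).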
Today’s agents can execute specialized tasks with relative ease and more complex tasks effectively but inconsistently. They’re impressive enough to have already impacted the way developers and salespeople (among others) approach their jobs. But to cross the chasm to general usefulness, I believe agents will need to improve across a few vectors:
Better infrastructure can help across all of these. Solving each of them is likely a multi-billion-dollar opportunity.
Our team at Notable Capital has been backing exceptional founders building the future of software infrastructure for over two decades. If you’re building something new in agentic infrastructure—no matter how early—we’d love to be in touch.
Thank you to Bob McGrew, Jeremy Berman, Nikita Shamgunov, and my colleagues Glenn Solomon and Laura Hamilton for their guidance and feedback on this post.
*Represents a Notable Capital portfolio company.