Good Bot, Bad Bot: The Internet’s Latest Digital Arms Race

The internet faces a bot identity crisis as AI makes automated deployment easier than ever.

Rares Crisan
September 10, 2025

In the midst of an AI boom, we face a harsh reality: the internet was never designed for AI agents. Well, sort of. While in many ways the internet is to computers what Earth is to humans, bots aren't exactly welcome guests. Bots (or web crawlers) are small programs designed to navigate the internet using browsers. They can drive your favorite desktop browser, but they're best suited to tools like Playwright or Puppeteer, which allow them to operate without an actual interface and be deployed at scale.
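To make this concrete, here is a minimal sketch of such a headless bot. It assumes the third-party Playwright package (and its browser binaries) are installed, and the function name and URL are illustrative, not from any particular product:

```python
def fetch_title(url: str) -> str:
    """Launch a headless Chromium browser, visit a page, and return its title."""
    # Import inside the function so the sketch can be loaded even where
    # Playwright isn't installed; a real bot would import at module level.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)  # no visible window
        page = browser.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
        return title

# Example (requires network access and installed browser binaries):
# fetch_title("https://example.com")
```

Run a few hundred of these in parallel on cloud servers and you have a crawler fleet; nothing about the traffic announces that it is automated.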

Here's where it gets complicated: some bots are genuinely helpful. Google's bots crawl the web to improve search results. Others automate routine tasks through RPA (Robotic Process Automation), filling out forms and gathering data efficiently. But then there are the bad actors: bots that buy up all the concert tickets, artificially inflate views and likes, or engage in endless arguments online. These latter examples are major contributors to what's known as the Dead Internet Theory, and in the case of large language models, some even infringe copyright.

This creates the fundamental challenge: some bots are good, and some bots are bad—and identifying which is which has become a digital arms race.

Why Bot Detection Is So Difficult

The challenge with identifying bots lies in how easy it is to make their digital fingerprint look exactly like a real person's. After all, both humans and bots are essentially doing the same thing—making HTTP requests and navigating websites.

Many people think of reCAPTCHAs and assume it must be simple for a program to click the button that says "I'm a human." But clicking the button isn't what those widgets are actually measuring. They're analyzing your browser history, cookies, previous browsing activity, and other behavioral signals. A bot launched from a server using Playwright typically has no browser history, no cookies, and may originate from a well-known AWS IP address on an EC2 instance.

(And those traffic lights and crosswalks you're asked to identify? You might not be shocked to learn you're actually just training AI models—nothing in those puzzles actually proves you're human.)

The AI Revolution: Making Everything Easier (and Harder)

This is where AI creates an entirely new challenge in the bot arms race. Traditionally, deploying bots required solving two complex problems: creating the infrastructure for large-scale deployment and writing sophisticated code to navigate websites and perform actions.

AI has dramatically simplified both challenges. Companies like Browserbase* have emerged to handle the infrastructure portion, while AI makes it much easier to generate the navigation code needed for complex web interactions. We've even seen OpenAI attempt to tackle the entire problem autonomously (with mixed results).

So we should expect a massive proliferation in the number of bots deployed online to perform routine tasks. One area seeing significant attention is "agentic commerce"—bots that book flights, order regular products, or wait in queues (legitimately this time) to purchase concert tickets. This introduces fascinating new problems beyond just detection: How do we let a bot pay with your credit card without it being flagged as potential fraud? (That challenge deserves its own discussion in a forthcoming part 2 to this piece.)

The Current State: An Escalating War

Developing an undetectable bot has become an art form of reverse engineering what a typical human's digital fingerprint looks like. Meanwhile, detection systems look for patterns atypical of human behavior—like browsing 1,000 LinkedIn profiles in 10 minutes.

This escalating war has led to tiered approaches to bot management. Some websites simply want to block traffic to avoid costs or poor user experiences. Others need to protect sensitive data. And some platforms don't mind bots—they just want equitable and fair use of their services.

The problem is becoming more urgent because we now know three critical things: there are good bots and bad bots, modern technology makes bot deployment much easier, and individuals and companies will increasingly deploy their own bots. These factors make it essential to solve the fundamental question of distinguishing desirable automation from unwanted interference.

Enforcing Self Identification: A Cryptographic Approach to Bot Identity

Bots identifying themselves is not something new; it has been common practice since well before the dot-com boom. The User-Agent header has been part of HTTP since the early 1990s and was formalized in RFC 1945, the HTTP/1.0 specification. The intention behind it was to have clients (browsers) reveal parts of their configuration to the server. This header is what Google's indexing bots use to identify themselves to websites. The practice is so common that companies publish their user agents so that developers can adjust configurations to make it easier for those bots to navigate their sites; see OpenAI's user agents, Anthropic's user agent, and Cloudflare's user agents.

The problem is that it's not hard to pretend to be one of these user agents. In fact, spoofing them is actively encouraged as a way to test that your site performs optimally for their needs. But even with its security gaps, this approach does pave the way for a meaningful solution to personalized identification.
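To see just how trivial impersonation is, here is a sketch using only Python's standard library. It builds (but never sends) a request claiming to be Googlebot, using the User-Agent string Google publishes for its desktop crawler:

```python
import urllib.request

# Google's published desktop Googlebot User-Agent string.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)

# Build (but don't send) a request that claims to be Googlebot.
req = urllib.request.Request(
    "https://example.com/",
    headers={"User-Agent": GOOGLEBOT_UA},
)

# The server would see exactly what a real Googlebot sends; nothing in the
# header itself proves the request actually came from Google.
print(req.get_header("User-agent"))
```

One line of configuration is all it takes, which is exactly why the User-Agent header alone can never serve as proof of identity.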

What this means is that we should be able to define something in an HTTP request header that can reliably associate a bot with a person or a company, in a way that cannot be impersonated. This isn't an unsolved problem; it is the fundamental idea behind Ronald Rivest, Adi Shamir, and Leonard Adleman's RSA paper, which formed the backbone of the public-key cryptography securing modern digital communication. Using private/public key cryptography, a bot's requests can be signed with a secret private key and then verified by websites using the openly available public key. The remaining issue is that we need somewhere to retrieve those public keys from.
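A toy sketch makes the sign-and-verify flow concrete. The numbers below are deliberately tiny textbook RSA values and utterly insecure (real deployments use 2048-bit keys with padding schemes like RSA-PSS, or Ed25519); the bot signs a digest of its message with the private exponent, and the website checks it with the public key:

```python
import hashlib

# Toy RSA key (textbook-sized, insecure; for illustration only).
p, q = 61, 53
n = p * q        # public modulus: 3233
e = 17           # public exponent
d = 2753         # private exponent: e * d ≡ 1 (mod (p-1)(q-1))

def digest(message: bytes) -> int:
    # Reduce a hash of the message into the toy modulus range.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    # Bot side: sign the digest with the secret private exponent.
    return pow(digest(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    # Website side: verify with the openly available public key (e, n).
    return pow(signature, e, n) == digest(message)

sig = sign(b"GET / HTTP/1.1")
print(verify(b"GET / HTTP/1.1", sig))   # True: the signature checks out
print(verify(b"GET /admin", sig))       # a tampered message fails (with
                                        # overwhelming probability)
```

Only the holder of `d` can produce signatures that `(e, n)` accepts, which is exactly the property that makes impersonation infeasible, unlike a copied User-Agent string.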

Cloudflare introduced Web Bot Auth in July 2025, a system built on open IETF drafts that enables verified bots to cryptographically authenticate their requests. With this approach, bot operators generate their own signing keys, publish their public keys via a hosted key directory, and register that directory with Cloudflare. Verified bots then attach signed headers to their HTTP requests, and Cloudflare uses the published keys to verify their authenticity. This represents a shift toward cryptographic verification of bot traffic, laying the groundwork for a more open and standardized solution.
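The signed headers build on the IETF HTTP Message Signatures standard (RFC 9421). As a hedged sketch of what a verified bot's request might carry, with every value below an illustrative placeholder rather than a real key, timestamp, or signature:

```
GET /products HTTP/1.1
Host: shop.example
Signature-Agent: "https://bot-directory.example.com"
Signature-Input: sig1=("@authority" "signature-agent");created=1735689600;expires=1735693200;keyid="poqkLGiy...";alg="ed25519";tag="web-bot-auth"
Signature: sig1=:c2lnbmF0dXJlLWJ5dGVz...:
```

The `Signature-Agent` header points the receiving site at the operator's key directory, so verification requires no prior relationship between the bot and the website.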

The real-world implementation of this vision is already beginning. Browserbase, which, as I mentioned above, provides web browsing infrastructure for some of the largest AI applications, recently partnered with Cloudflare to support Web Bot Auth adoption.

Browserbase describes Web Bot Auth as "a passport for your AI agent": it lets operators equip an agent with a secure cryptographic signature that acts as proof of identity. With billions of agents expected to come online over the next few years, an identity layer becomes crucial infrastructure for the AI-driven web.

The Path Forward: Accountability and Access

This cryptographic approach isn't a complete solution: companies still need to protect against the unwanted bots within the automated traffic that makes up roughly half of the internet (2024 Bad Bot Report). However, when bots can securely self-identify as belonging to a specific company or person, the exchange of information and commerce becomes more trusted and predictable.

The system creates accountability: your bot can navigate freely without being unnecessarily blocked, but the platform also has a clear target to hold responsible for any malicious or unwarranted actions that violate terms of service.

This solution paves the way for a future where your work bot can gather data in a fair-use manner, or your personal bot can shop around and successfully complete purchases without being flagged as suspicious. It's a foundational step toward the agentic web—where AI agents can act as trusted extensions of their human owners.

Now, once that bot presses the checkout button, an entirely new wave of challenges begins. As it turns out, when it comes to agentic commerce, the shopping part may be the easy part—the payment processing presents some of the biggest unsolved challenges in commerce technology today. So follow along for part 2…

Special thanks to Hans Tung and Cami Katz for their contributions to this piece.
