
The following is a transcript of a recent presentation given by Klaas Meinke, Head of AI at Hadrian, at Black Hat USA 2025:
Klaas: The reason I started working in machine learning and artificial intelligence was to build machines capable of tasks that once seemed exclusive to human intuition—whether that’s recognizing a picture of an animal or, in our case at Hadrian Security, hacking another machine. Today, I want to show you the technology we’re building to bridge the gap between simple automated external tests and human-level hacking.
The fundamental flaw in rule-based testing
Klaas: To understand this shift, we must look at how typical security tests work. These programs rely on rule-based decision trees programmed by engineers.
For example, an engineer might program a test to discover a password field by looking for the word "password." You can quickly see the limitation: What if the web application is in another language? You can program in more rules, but you soon encounter hundreds of edge cases that accumulate over time.
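To make the brittleness concrete, here is a toy version of such a rule (illustrative only, not any real scanner's code):

```python
# A toy rule-based check: flag a login page by searching for the
# literal word "password" (illustrative only, not a real scanner's rule).
def looks_like_login_page(page_text: str) -> bool:
    return "password" in page_text.lower()

print(looks_like_login_page("Enter your password"))      # True
print(looks_like_login_page("Introduce tu contraseña"))  # False: the Spanish page slips through
```

Every new language, framework, or layout quirk demands another hand-written rule, which is exactly how those edge cases pile up.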
These typical tests are very useful for standard, well-defined environments, where they scale easily to millions of scans. However, they quickly break down against custom web applications that behave in unexpected ways or have peculiar edge cases.
How hackers think
Klaas: If we look at how human hackers operate, we see a fundamentally different approach. The first thing a hacker does is look for everything a target organization has exposed on the internet—they are trying to find every external-facing asset.
Next, they run a suite of tools, inspecting the outputs and building context. They are building knowledge about the organization's unique digital footprint. Finally, they start trying to exploit the system. They might do this thousands of times before hitting a bug.
This process is remarkably similar to how AI agents work. An agent has sensors that observe the environment—a browser, network logs, or a terminal—the same tools a hacker would use. We then use the Reasoning and Acting (ReAct) framework, where the Large Language Model (LLM) reasons about the goal (finding a bug) and takes actions that influence the environment, such as running a tool and then observing the output. This forms a continuous loop that mimics the intuitive way a human finds vulnerabilities.
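A minimal sketch of such a loop (not Hadrian's implementation; `decide` is a stand-in for an LLM call, and the single tool is a stub):

```python
# A minimal ReAct-style loop: a sketch, not Hadrian's implementation.
# `decide` stands in for an LLM call, and the single tool is a stub.

def run_terminal(cmd: str) -> str:
    return f"(stub) output of: {cmd}"  # a real agent would execute the command

TOOLS = {"terminal": run_terminal}

def decide(history: list[str]) -> tuple[str, str, str]:
    # A real agent would prompt an LLM with the full history here.
    if len(history) < 3:
        return ("Enumerate services first.", "terminal", "nmap -sV target.example")
    return ("The banner suggests a known bug; report it.", "finish", "possible CVE in service banner")

def react_loop(goal: str, max_steps: int = 20):
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        thought, action, arg = decide(history)  # Reason
        history.append(f"Thought: {thought}")
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)        # Act
        history.append(f"Action: {action}({arg}) -> {observation}")  # Observe
    return None

print(react_loop("find a vulnerability on target.example"))
```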
Hadrian’s architecture: Sense, Plan, Attack
Klaas: We generalize this framework into three distinct phases within Hadrian’s architecture:
- Sensing: Finding all different assets an organization has on the internet.
- Planning: Building context, figuring out what those things are, and prioritizing.
- Attacking: Running many different exploits to find a bug based on the plan.
Sensing
Klaas: A key element of sensing is efficient subdomain discovery. We developed our own custom GPT model, SubWiz. It is 1,000 times smaller than models like ChatGPT, which is essential for running millions of scans every day. Because it is trained for this specific purpose, SubWiz discovers 12% more subdomains than a fine-tuned version of ChatGPT. And because it is so lightweight, we open-sourced it so that bug bounty hunters and ethical hackers can benefit from it.
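The underlying idea can be sketched as follows (this is not SubWiz's actual code; `predict_candidates` is a hypothetical stand-in for the model):

```python
# Sketch of model-assisted subdomain discovery (not SubWiz's actual code).
# A small generative model proposes candidate subdomains from ones already
# known, and only candidates that resolve in DNS are kept.
import socket

def predict_candidates(known: list[str], domain: str) -> list[str]:
    # Hypothetical stand-in for the model: in practice a small GPT,
    # conditioned on the known subdomains, proposes likely new labels.
    guesses = ["staging", "vpn", "grafana", "api-dev"]
    return [f"{g}.{domain}" for g in guesses]

def resolves(host: str) -> bool:
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

known = ["www.example.com", "api.example.com"]
print([h for h in predict_candidates(known, "example.com") if resolves(h)])
```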
The next step is asset attribution: How do we confirm a web page belongs to a specific organization? Our AI analyzes both technical elements (code) and non-technical elements like specific logos, keywords (like "pentesting" in our case), and the layout to determine ownership.
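A toy version of that scoring step follows; Hadrian's attribution is a learned model, so the hand-picked signals and weights here are assumptions made purely for illustration:

```python
# Toy attribution scorer; Hadrian's model is learned, so these hand-picked
# signals and weights are assumptions made purely for illustration.
def attribution_score(page: dict, org: dict) -> float:
    signals = {
        "logo_match":    page.get("logo_hash") in org["known_logo_hashes"],
        "keyword_match": any(k in page.get("text", "").lower() for k in org["keywords"]),
        "code_match":    page.get("analytics_id") in org["known_analytics_ids"],
    }
    weights = {"logo_match": 0.4, "keyword_match": 0.2, "code_match": 0.4}
    return sum(w for name, w in weights.items() if signals[name])

page = {"logo_hash": "abc123", "text": "We offer pentesting services", "analytics_id": "UA-1"}
org = {"known_logo_hashes": {"abc123"}, "keywords": ["pentesting"], "known_analytics_ids": {"UA-1"}}
print(round(attribution_score(page, org), 2))  # 1.0 -> strong evidence of ownership
```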
Planning
Klaas: In the planning phase, we build critical context around the discovered applications:
- Flagging Old Code: We have a model that can predict the age of a web page with high accuracy, flagging older technologies more prone to compromise.
- Page Classification: We classify what is happening on the page—is it a login portal, a registration form, or a shopping cart?
- Context Engineering: This is vital. We compress megabytes of raw data into the most relevant information for hacking (e.g., extracting the page title or text fields), feeding only the crucial context into the LLM.
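For instance, a minimal context extractor might keep only the page title and input fields. This is a sketch under that assumption; the real pipeline extracts far more:

```python
# Sketch of context engineering: compress a raw HTML page into only the
# fields an attacking LLM needs (the real pipeline extracts far more).
from html.parser import HTMLParser

class ContextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "input":
            self.inputs.append(dict(attrs))

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

raw_html = "<html><title>Acme Login</title><body><input type='text' name='user'><input type='password' name='pwd'></body></html>"
extractor = ContextExtractor()
extractor.feed(raw_html)
# Megabytes of markup collapse into a few lines of LLM-ready context.
print({"title": extractor.title, "inputs": extractor.inputs})
```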
Attacking
Klaas: For crafting attacks, we run our own fine-tuned LLMs trained on more than 100,000 hacking challenges. We distill this knowledge from common hacker resources and benchmarks, more than doubling our LLMs' performance on internal tests. We also work with our hackers to build specialized tools that the LLMs can use to mimic human actions—like clicking buttons in a browser, brute-forcing, or observing network logs.
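Tools like these are commonly exposed to a model through function-calling-style schemas. A sketch of that pattern (the tool names and fields are illustrative, not Hadrian's actual tool set):

```python
# Sketch: exposing hacker tools to an LLM through function-calling-style
# schemas. Tool names and fields are illustrative, not Hadrian's actual set.
# TOOL_SPECS would be sent to the LLM so it knows what it can call.
TOOL_SPECS = [
    {"name": "click_button",
     "description": "Click an element in the headless browser by CSS selector.",
     "parameters": {"selector": "string"}},
    {"name": "brute_force_login",
     "description": "Try a credential wordlist against a login form.",
     "parameters": {"url": "string", "wordlist": "string"}},
    {"name": "read_network_logs",
     "description": "Return recent requests captured by the proxy.",
     "parameters": {}},
]

def dispatch(tool_call: dict) -> str:
    # The model emits {"name": ..., "arguments": {...}}; route it to real code.
    handlers = {
        "click_button": lambda a: f"(stub) clicked {a['selector']}",
        "brute_force_login": lambda a: f"(stub) brute-forcing {a['url']}",
        "read_network_logs": lambda a: "(stub) 12 requests captured",
    }
    return handlers[tool_call["name"]](tool_call.get("arguments", {}))

print(dispatch({"name": "click_button", "arguments": {"selector": "#login"}}))
```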
Example flows of how AI agents operate
Klaas: Let's look at two examples that demonstrate this process:
Flow 1: Open directory vulnerability
Klaas: Our Subdomain Agent discovers an unknown subdomain that our model deems interesting ("more likely to contain a vulnerability"). It passes this to a Classification Agent, which recognizes an open directory. The Open Directory Agent then runs a crawling tool and, demonstrating intuition, selects only the important files to parse—a human hacker would not open 10,000 files.
The LLM then parses the files, finds two passwords, and, rather than just reporting a potential vulnerability, it validates the finding. It queries our data layer for other domains where it might be able to use those credentials and successfully logs into another domain.
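A sketch of that validation step, where `related_login_pages` is a hypothetical stand-in for a query against Hadrian's data layer and the credentials are placeholders:

```python
# Sketch of Flow 1's validation step. `related_login_pages` is a
# hypothetical stand-in for a data-layer query; credentials are placeholders.
import requests

def related_login_pages(org_id: str) -> list[str]:
    return ["https://portal.example.com/login", "https://admin.example.com/login"]

def try_credentials(url: str, user: str, password: str) -> bool:
    try:
        resp = requests.post(url, data={"username": user, "password": password}, timeout=10)
    except requests.RequestException:
        return False
    return resp.status_code == 200 and "logout" in resp.text.lower()

for url in related_login_pages("org-42"):
    if try_credentials(url, "svc_backup", "correct-horse-battery"):
        print(f"Validated credential reuse at {url}")  # a confirmed finding, not a guess
        break
```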
This shows the power of human-level intuition. We once found a passport in an open directory (a PII leak) that an engineer would never have programmed a rule for. But an LLM quickly recognized the sensitive nature of the document and reported the risk.
Flow 2: CVE exploitation
Klaas: In this flow, the LLM queries the data layer for domain context, including fingerprinted technologies and known vulnerabilities. It finds RomM, a self-hosted ROM manager with an existing CVE. The finding is handed off to a CVE Agent, which pulls in the relevant CVE details and the patch from GitHub.
Crucially, even though there is no public exploit available, the agent analyzes the patch and writes its own exploit to test the vulnerability. The next agent then executes the code in an isolated environment, successfully confirming the vulnerability.
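Isolated execution can be sketched like this, under the assumption that a production system would use a hardened, network-restricted sandbox or container rather than a bare subprocess:

```python
# Sketch of running an agent-written exploit in isolation. Assumption: a
# production system would use a hardened, network-restricted sandbox or
# container rather than a bare subprocess.
import os
import subprocess
import sys
import tempfile

def run_exploit(exploit_code: str, timeout_s: int = 60) -> tuple[int, str]:
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "exploit.py")
        with open(path, "w") as f:
            f.write(exploit_code)
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=timeout_s, cwd=workdir,
        )
        return proc.returncode, proc.stdout

# Placeholder for code the CVE Agent would generate from the patch diff.
poc = 'print("PoC ran: target responded with the pre-patch behavior")'
print(run_exploit(poc))
```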
This demonstrates that our agentic AI can dynamically write a bespoke script to confirm a vulnerability, even when no exploit code exists publicly. This is the future of security testing—a continuous, adaptive, and highly intelligent offensive capability.