Despite years of tooling and automation, security teams still miss critical subdomains that don’t show up in passive datasets. Enter Subwiz, a new tool developed by Hadrian’s Machine Learning team that brings LLM-driven intelligence to this problem.
We sat down with Klaas Meinke, Machine Learning Lead, and Melvin Lammerts, Hacking Team Manager, to dig into what makes Subwiz different, how it works under the hood, and why it matters for security researchers today.
Q: What inspired the creation of Subwiz?
Klaas: From the beginning at Hadrian, we saw limitations with traditional subdomain enumeration tools. Most rely heavily on passive data sources, like certificate transparency logs or public DNS datasets, which are just incomplete. In cases where wildcard certificates are issued, for example, those logs don’t list the actual subdomains.
So instead of relying purely on these incomplete sources or brute-forcing against generic wordlists, we saw a gap: What if we could predict which subdomains are likely to exist? That’s where the idea for Subwiz came from.
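To illustrate the wildcard gap Klaas describes, here is a minimal sketch of what a passive tool can and cannot learn from a certificate's name list. The function name `concrete_names` and the example hostnames are hypothetical, and the lists stand in for names that would come from certificate transparency entries.

```python
def concrete_names(san_entries, apex):
    """Return the concrete subdomains of `apex` that a certificate's name list reveals."""
    found = set()
    suffix = "." + apex
    for name in san_entries:
        name = name.lower().rstrip(".")
        if name.startswith("*."):
            # A wildcard entry covers many hosts but names none of them.
            continue
        if name.endswith(suffix):
            found.add(name)
    return found


# Hypothetical certificate name lists for the same organisation.
explicit_cert = ["example.com", "www.example.com", "vpn.example.com"]
wildcard_cert = ["example.com", "*.example.com"]

names_from_explicit = concrete_names(explicit_cert, "example.com")
names_from_wildcard = concrete_names(wildcard_cert, "example.com")
```

The explicit certificate reveals two subdomains, while the wildcard certificate reveals none, even though it may protect many more hosts.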
Q: So how exactly is Subwiz using AI?
Klaas: Subwiz is powered by a lightweight generative language model. It’s technically similar to GPT in architecture, but trained specifically on subdomain data. Instead of predicting words or sentences, it predicts likely subdomain names based on patterns in historical DNS records.
Because we trained the model only on this very narrow task, it’s about 1,000 times smaller than ChatGPT. That makes it cheap to train and fast enough to run on a laptop. In fact, training cost us very little when compared to the millions required for general-purpose LLMs.
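To make the idea of a generative model for subdomains concrete, here is a toy sketch. It is not Subwiz's model: the interview describes a GPT-like architecture, whereas this stand-in is a character-level bigram model, which is far simpler. All names (`train_bigram`, `predict`) and the training labels are assumptions made for illustration. The sketch learns which character tends to follow which in known subdomain labels, then searches for the most probable labels it has not seen yet.

```python
import heapq
import math
from collections import Counter, defaultdict

START, END = "^", "$"  # markers for the beginning and end of a label


def train_bigram(labels):
    """Count character-to-character transitions across known subdomain labels."""
    counts = defaultdict(Counter)
    for label in labels:
        chars = [START, *label, END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def predict(counts, known, k=5, max_len=12):
    """Best-first search for the k most probable labels that are not in `known`."""
    # Heap entries: (negative log-probability, label so far, last character).
    heap = [(0.0, "", START)]
    results = []
    while heap and len(results) < k:
        cost, prefix, last = heapq.heappop(heap)
        if last == END:
            if prefix not in known:
                results.append((prefix, math.exp(-cost)))
            continue
        if len(prefix) >= max_len:
            continue  # do not extend labels beyond max_len characters
        total = sum(counts[last].values())
        for ch, n in counts[last].items():
            step = -math.log(n / total)
            next_prefix = prefix if ch == END else prefix + ch
            heapq.heappush(heap, (cost + step, next_prefix, ch))
    return results


# Hypothetical subdomain labels already known for a target.
known = ["api", "app", "admin", "dev", "dev-api", "staging", "staging-api", "mail"]
model = train_bigram(known)
candidates = predict(model, set(known), k=5)

for label, prob in candidates:
    print(f"{label}.example.com  p={prob:.4f}")
```

Because every step has a non-negative cost, candidates come out in descending order of model probability. A real system would then resolve each candidate against DNS to check whether it exists.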
Q: How does Subwiz fit into the broader context of offensive security?
Melvin: It’s squarely in the reconnaissance category. Subwiz gives us an edge by discovering more infrastructure, especially forgotten or shadow infrastructure. The more surface area you can map, the more likely you are to find vulnerabilities that matter.
Q: What are other tools doing today for subdomain discovery, and why is it still such a hard problem?
Klaas: Most tools today fall into one of two camps: passive enumeration using open source datasets, or active enumeration using brute-force wordlists or permutations. Tools like Gobuster or SanicDNS are great at raw throughput, but they lack intelligence.
The challenge is that both approaches rely on known inputs or lists, so you’re always limited by what’s already been seen.
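The limitation Klaas points to can be shown with a minimal sketch of wordlist brute-forcing. The function `brute_force`, the wordlist, and the set of live hosts are all hypothetical. The `resolves` argument is a stand-in for a real DNS lookup, so the example runs without network access.

```python
def brute_force(apex, wordlist, resolves):
    """Return the hosts built from a fixed wordlist that resolve under `apex`."""
    hits = []
    for word in wordlist:
        host = f"{word}.{apex}"
        if resolves(host):
            hits.append(host)
    return hits


# Hypothetical zone contents standing in for live DNS.
live_hosts = {"www.example.com", "mail.example.com", "grafana-eu2.example.com"}
wordlist = ["www", "mail", "ftp", "vpn"]

found = brute_force("example.com", wordlist, lambda host: host in live_hosts)
missed = live_hosts - set(found)
```

However fast the resolver is, `grafana-eu2.example.com` stays in `missed` because no entry in the wordlist produces it. In practice the `resolves` callback could wrap a real lookup such as `socket.gethostbyname` and treat `socket.gaierror` as "does not resolve".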
Q: Why does this prediction-based approach matter for ethical hackers or businesses trying to keep themselves secure?
Klaas: The goal of any attack surface management is to find things others haven’t. Subwiz helps surface previously unknown or forgotten subdomains, including ones the company itself may not be tracking. These rogue assets might host legacy services, misconfigured endpoints, or exposed admin interfaces.
Melvin: And honestly, those are often the most interesting finds. If no scanner or auditor picked them up before, it’s likely they weren’t meant for public exposure. That gives us a real advantage in the reconnaissance phase.
Q: How much better is it, in practice?
Klaas: Based on our internal testing, we’ve seen up to a 10% increase in discovered subdomains on customer environments using Subwiz. That’s a significant boost when you're trying to build a complete picture of someone’s external attack surface.
Q: How does Subwiz compare to dictionary-based fuzzing tools?
Melvin: It’s smarter. Instead of cycling through fixed wordlists, it dynamically predicts what might exist based on real-world patterns. You get better signals, faster results, and much less wasted computing power.
Q: Does Subwiz improve any existing Hadrian functionality?
Klaas: It actually helps us build a stronger internal passive dataset. The subdomains it discovers become part of our broader asset intelligence. So while it doesn’t replace a specific feature, it improves our platform’s performance across the board.
Q: What’s next for Subwiz?
Klaas: We’re exploring ways to expand its inputs. Right now it’s trained solely on subdomain sequences, but we could eventually incorporate apex domain context or even linked entity data. The idea is to keep growing its prediction power without bloating the model.
Q: Where does this fit into Hadrian’s broader AI strategy?
Klaas: Subwiz is a perfect example of our philosophy: building hyper-specific models for narrow cybersecurity tasks. Instead of trying to replicate the entire hacker lifecycle with a giant LLM, we focus on one piece at a time and build for scale. That’s how we scan millions of targets daily—using models that are fast, efficient, and tuned for the job.
Want to learn more about Subwiz and how Hadrian uses AI to uncover the hidden layers of your attack surface? Learn more about Continuous Attack Surface Management and how it strengthens your cybersecurity posture here.