Herald — Helping AI Systems Find and Understand Your Site
There is a quiet shift happening in how people find information online.
Search engines are still important. But a growing number of people are now asking AI systems — ChatGPT, Perplexity, Claude — to find things for them. They ask questions, get summaries, and follow recommendations. And the sites those AI systems surface are the ones that get the traffic.
The question for site owners is: how does an AI system know what your site is about?
The answer, increasingly, is llms.txt.
What is llms.txt?
llms.txt is an emerging standard for AI crawler discovery. It is a plain text file that lives at the root of your site and tells AI systems exactly what your site is, what topics it covers, and where the most important content lives.
Think of it like robots.txt — but instead of telling crawlers what to ignore, it tells AI systems what to pay attention to.
The format is simple and human-readable: a site name, a description, a list of key pages, content categories, and any additional context you want AI systems to have. The public llms.txt proposal relies on light Markdown conventions (a title heading, a short summary, and link lists), so there is no HTML, no schema, and no complexity. Just clearly structured information that AI systems can read and use.
What Herald does
Herald is Blacklight’s llms.txt module. It generates and serves your /llms.txt and /llms-full.txt files automatically, keeping them current as your site grows.
When you install Herald, it starts working immediately. It reads your site name and description, queries your published content, pulls your content categories, and builds a structured file that gives AI crawlers a complete picture of what your site is about.
You configure it once. Herald handles everything else.
What goes into the file
Herald builds your llms.txt from real site data:
- Site name and description — pulled from your Herald settings, with fallback to your WordPress site settings
- Primary topics — keywords and subjects you define that tell AI systems what your site covers
- Key pages — your published posts and pages, ordered and limited to the most relevant content
- Content categories — your WordPress categories with descriptions where available
- Custom instructions — a free text block for anything else you want AI systems to know about your site
Herald also generates a /llms-full.txt variant that includes content excerpts alongside each page listing, giving AI systems more context about what each page actually contains.
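To make the structure concrete, here is a minimal Python sketch of how a file like this could be assembled from the ingredients listed above. It is not Herald's implementation. The class names, the `build_llms_txt` function, the `max_pages` limit, the section headings, and the `full` flag are all hypothetical, and the layout loosely follows the public llms.txt proposal (a title heading, a short summary, then link lists).

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    title: str
    url: str
    excerpt: str = ""  # only used by the "full" variant


@dataclass
class SiteData:
    name: str
    description: str
    topics: list[str] = field(default_factory=list)
    pages: list[Page] = field(default_factory=list)
    categories: dict[str, str] = field(default_factory=dict)  # name -> description
    custom_instructions: str = ""


def build_llms_txt(site: SiteData, max_pages: int = 50, full: bool = False) -> str:
    """Render site data as llms.txt-style text (hypothetical layout)."""
    lines = [f"# {site.name}", "", f"> {site.description}", ""]

    if site.topics:
        lines += ["## Primary topics", ""]
        lines += [f"- {topic}" for topic in site.topics]
        lines.append("")

    if site.pages:
        lines += ["## Key pages", ""]
        for page in site.pages[:max_pages]:  # cap the list at max_pages entries
            entry = f"- [{page.title}]({page.url})"
            if full and page.excerpt:
                # The "full" variant adds an excerpt next to each listing.
                entry += f": {page.excerpt}"
            lines.append(entry)
        lines.append("")

    if site.categories:
        lines += ["## Content categories", ""]
        for name, desc in site.categories.items():
            # Include the description only where one is available.
            lines.append(f"- {name}: {desc}" if desc else f"- {name}")
        lines.append("")

    if site.custom_instructions:
        lines += ["## Notes", "", site.custom_instructions, ""]

    return "\n".join(lines).rstrip() + "\n"
```

Calling `build_llms_txt(site)` would give the short listing, and `build_llms_txt(site, full=True)` the variant with excerpts. Keeping both behind one function means the two files cannot drift apart in structure.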
Both files regenerate automatically every day and whenever you publish or update content, so they always reflect the current state of your site.
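The two triggers described here (a daily refresh and a content change) can be expressed as a single staleness check. The sketch below is an illustration only. The function name, its parameters, and the one-day `MAX_AGE` constant are assumptions, not Herald's actual code.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_AGE = timedelta(days=1)  # assumed daily refresh interval


def needs_regeneration(
    last_built: Optional[datetime],
    last_content_change: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True when the generated files should be rebuilt."""
    if last_built is None:
        return True  # nothing has been generated yet
    if last_content_change is not None and last_content_change > last_built:
        return True  # content was published or updated after the last build
    return now - last_built >= MAX_AGE  # scheduled daily refresh
```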
NoIndex awareness
Herald respects your existing SEO decisions. If you have marked pages as NoIndex in MetaMaster, Herald excludes them from the llms.txt output by default. Pages you have told search engines to ignore are also kept out of your AI discovery file.
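In code terms, this kind of exclusion is a simple filter applied before the file is built. The Python sketch below assumes each page carries a hypothetical `noindex` flag. The `respect_noindex` parameter is also an assumption, standing in for whatever setting controls the default behavior.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    noindex: bool = False  # hypothetical flag mirroring the page's SEO setting


def pages_for_llms_txt(pages, respect_noindex=True):
    """Return the pages eligible for listing, dropping NoIndex pages by default."""
    if not respect_noindex:
        return list(pages)
    return [page for page in pages if not page.noindex]
```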
Why this matters now
AI-driven discovery is still early. The sites that establish clear, structured AI-readable signals now will have an advantage as these systems mature and their influence on traffic grows.
llms.txt is not a ranking guarantee. It is a signal: a clear, structured summary that makes it easier for AI systems to understand your site and describe it accurately. How much weight each system gives that signal is still evolving, but clarity is unlikely to hurt.
Herald makes it effortless. Your /llms.txt is live, current, and correctly structured, with no ongoing maintenance required.
For a more in-depth guide, check out this article on How to prepare your site for AI Search.