Add NLWeb Support
TL;DR
NLWeb is an emerging protocol that lets AI agents ask your website questions in natural language and receive structured answers. Implementing it makes your site directly queryable by AI systems, going beyond passive crawling to active conversation.
Last updated: 2026-03-09
What NLWeb Does
NLWeb is a protocol that transforms your website from a passive document collection into an active, queryable knowledge source. Instead of waiting for AI crawlers to visit and parse your pages, NLWeb lets AI agents send natural language questions directly to your site and receive structured, schema.org-annotated responses.
Think of it as adding a conversational API to your website. When an AI agent wants to know "What is your return policy?" or "Which product is best for small teams?", it can send that question to your NLWeb endpoint and receive a precise, structured answer drawn from your content.
This protocol was introduced by Microsoft in 2025 and is gaining traction as AI agents become more capable. While still early, implementing NLWeb now positions your site for the next wave of AI interaction — one where AI does not just read your content but actively queries it. Sites with NLWeb support score higher on the AI protocols factor in AgentReady™ scans.
Setting Up the NLWeb Endpoint
NLWeb works by exposing an endpoint on your site that accepts natural language queries and returns schema.org-formatted responses. The endpoint URL is typically https://yourdomain.com/.well-known/nlweb or a custom path you define.
The endpoint accepts GET or POST requests with a query parameter containing the natural language question. It processes the query against your content, finds the most relevant answer, and returns a JSON response annotated with schema.org types.
The implementation requires a server-side component that can match queries to your content. The simplest approach uses a pre-built search index of your content combined with a retrieval system. More advanced implementations use embeddings and vector search for semantic matching.
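As a concrete illustration, a minimal in-process version of that component might look like the sketch below. The index contents, function name, and matching logic are illustrative assumptions, not part of any NLWeb specification; a production implementation would sit behind an HTTP handler and use a real search index.

```python
# Hypothetical content index: each entry carries the answer text, its
# source URL, and a freshness date (all values here are made up).
INDEX = [
    {
        "text": "We offer a 30-day no-questions-asked return policy.",
        "url": "https://www.example.com/policies/returns",
        "dateModified": "2026-02-15",
    },
]

def handle_query(query: str, max_results: int = 3) -> dict:
    """Match a natural-language query against the index and return a
    schema.org SearchResultsPage response (naive keyword matching)."""
    terms = set(query.lower().split())
    hits = [
        item for item in INDEX
        if terms & set(item["text"].lower().split())
    ]
    return {
        "@context": "https://schema.org",
        "@type": "SearchResultsPage",
        "mainEntity": [
            {
                "@type": "Answer",
                "text": item["text"],
                "url": item["url"],
                "dateModified": item["dateModified"],
            }
            for item in hits[:max_results]
        ],
    }
```

Note that when nothing in the index overlaps the query, the sketch returns an empty mainEntity list rather than guessing, which matches the accuracy-over-coverage guidance later in this page.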
The configuration example below shows the basic endpoint structure. Your server receives the query, searches your content, and returns a structured response with the answer, source URL, and schema.org annotations.

NLWeb endpoint request and response structure:

```json
{
  "endpoint": "https://www.example.com/.well-known/nlweb",
  "method": "POST",
  "request": {
    "query": "What is your return policy?",
    "context": "shopping",
    "max_results": 3
  },
  "response": {
    "@context": "https://schema.org",
    "@type": "SearchResultsPage",
    "mainEntity": [
      {
        "@type": "Answer",
        "text": "We offer a 30-day no-questions-asked return policy for all products. Items must be in original packaging.",
        "url": "https://www.example.com/policies/returns",
        "dateModified": "2026-02-15",
        "author": {
          "@type": "Organization",
          "name": "Example Store"
        }
      }
    ]
  }
}
```
Handling Queries Effectively
The quality of your NLWeb implementation depends on how well you match queries to content. A basic keyword search is functional but limited. Semantic search using embeddings provides better results for natural language questions.
Start with your existing content as the knowledge base. Index your key pages — product descriptions, FAQs, documentation, policies, and about pages. Each indexed item should include the content text, its URL, the last modification date, and relevant schema.org type information.
When a query arrives, search your index and return the top results ranked by relevance. Each result should include a concise answer (ideally extracted from your content, not generated), the source URL where the full content lives, and schema.org annotations that describe the content type.
Handle edge cases gracefully. If no relevant content matches the query, return an empty result set rather than guessing. If the query is ambiguous, return multiple results and let the AI agent choose the most relevant one. Accuracy matters more than coverage — returning a wrong answer is worse than returning no answer.
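The ranking and edge-case behavior described above can be sketched as a small scoring function. This is a deliberately simple keyword-overlap scorer under hypothetical field names (`text`, `url`); a semantic implementation would swap the scoring step for embedding similarity while keeping the same contract of returning an empty list when nothing matches.

```python
def rank_results(query: str, index: list[dict], max_results: int = 3) -> list[dict]:
    """Score indexed items by keyword overlap with the query and return
    the top matches, best first. An empty list means 'no answer' --
    preferable to returning a wrong one."""
    terms = {t.strip(".,?!").lower() for t in query.split()}
    scored = []
    for item in index:
        words = {w.strip(".,?!").lower() for w in item["text"].split()}
        score = len(terms & words)
        if score > 0:  # drop non-matches instead of guessing
            scored.append((score, item))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:max_results]]
```

For ambiguous queries, several items can score similarly; returning all of them up to max_results leaves the final choice to the AI agent, as recommended above.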
Schema.org Integration
NLWeb responses are built on schema.org types. This means your responses are not just plain text — they are structured data that AI agents can parse, validate, and reason about.
Use the appropriate schema.org type for each response. Product queries should return Product objects with name, description, offers, and aggregateRating. Content queries should return Article or WebPage objects with headline, author, and dateModified. FAQ queries should return Answer objects.
This integration creates a powerful feedback loop with your existing schema markup. The same structured data you add to your pages for passive crawling also powers your active NLWeb responses. Keep both in sync so AI systems get consistent information regardless of whether they crawl your pages or query your NLWeb endpoint.
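One way to keep the typing consistent is a single mapping from your internal content kinds to schema.org types, used by both your page markup generator and your NLWeb responses. The mapping keys and field names below are illustrative assumptions about how your index is structured:

```python
# Hypothetical mapping from internal content kinds to schema.org types.
SCHEMA_TYPES = {
    "product": "Product",
    "article": "Article",
    "faq": "Answer",
    "page": "WebPage",
}

def to_schema_entity(item: dict) -> dict:
    """Wrap an indexed item in the schema.org type that fits its content,
    falling back to WebPage for unknown kinds."""
    return {
        "@type": SCHEMA_TYPES.get(item.get("kind", "page"), "WebPage"),
        "text": item["text"],
        "url": item["url"],
        "dateModified": item["dateModified"],
    }
```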
Include dateModified on every response so AI agents can assess freshness. Include author or publisher information so they can assess authority. These metadata fields directly support your authority and trust signals even in the NLWeb context.

Testing Your NLWeb Implementation
Test your NLWeb endpoint thoroughly before announcing it. Start with manual testing using curl or a tool like Postman. Send a variety of queries — simple factual questions, product comparisons, policy inquiries, and ambiguous questions — and verify the responses are accurate and well-structured.
Validate that every response is valid JSON and conforms to schema.org types. Use the schema.org validator (validator.schema.org) to check your response structure. Invalid JSON or malformed schema will cause AI agents to discard your responses.
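The schema.org validator is the authoritative check, but a lightweight pre-flight check in your own test suite can catch structural breakage before deployment. This sketch only verifies the minimal fields used in the example response above; it is not a full schema.org validation:

```python
def check_response(resp: dict) -> list[str]:
    """Return a list of structural problems; an empty list means the
    response passes these minimal checks (not full schema.org validation)."""
    problems = []
    if resp.get("@context") != "https://schema.org":
        problems.append("missing or wrong @context")
    if "@type" not in resp:
        problems.append("missing top-level @type")
    for i, entity in enumerate(resp.get("mainEntity", [])):
        for field in ("@type", "text", "url"):
            if field not in entity:
                problems.append(f"mainEntity[{i}] missing {field}")
    return problems
```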
Test error handling. Send empty queries, very long queries, and queries about topics your site does not cover. Your endpoint should handle all of these gracefully without crashing or returning incorrect information.
Once your endpoint is live, add a reference to it in your llms.txt file so AI systems know it exists. Run an AgentReady™ scan to verify that the scan detects your NLWeb endpoint and awards credit in the AI protocols factor.
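An llms.txt reference to the endpoint might look like the fragment below. The section heading and description wording are illustrative; llms.txt has no formal field for NLWeb, so a plain markdown link entry is a reasonable convention:

```
## AI Protocols

- [NLWeb endpoint](https://www.example.com/.well-known/nlweb): accepts
  natural language queries via POST and returns schema.org-annotated
  JSON answers
```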
Monitor your endpoint logs to see which queries AI agents are sending. This data is valuable for improving your content — if agents frequently ask questions your site does not answer well, that is a content gap worth filling.
Frequently Asked Questions
Do I need NLWeb if I already have schema markup and llms.txt?
Schema markup and llms.txt support passive crawling — AI reads your pages. NLWeb supports active querying — AI asks your site questions. They complement each other. Schema and llms.txt are higher priority because they are more widely supported today. NLWeb is forward-looking and positions you for the next wave of AI agent interaction.
Is NLWeb difficult to implement?
A basic implementation requires a server-side search endpoint and schema.org response formatting. If you already have a site search feature, you have most of the infrastructure. The added work is formatting responses as schema.org JSON and exposing the endpoint at a known URL. Advanced implementations with semantic search require more effort.
Which AI systems currently support NLWeb?
NLWeb is an emerging protocol introduced by Microsoft. Support is growing but not yet universal. Implementing it now puts you ahead of the curve. As AI agents become more sophisticated and autonomous, protocols like NLWeb will become increasingly important for direct AI-to-site communication.