# Token Optimization & Density
ContextMD is designed to maximize the information-to-token ratio. Large documentation sites often contain redundant navigation, boilerplate code, and conversational filler that consume valuable context-window space. By default, ContextMD applies several layers of optimization to ensure your `context.md` file is agent-ready.
## Automated AI Refinement
The core of ContextMD's optimization strategy is its Refinement Layer. Once a page is crawled and converted to Markdown, it is processed by `gpt-4o-mini` with a system prompt designed for technical density.
- Noise Stripping: Removes conversational phrases ("In this section, we will explore...") and focuses on logic and syntax.
- Logical Compression: Collapses verbose explanations into high-density summaries while preserving all API signatures and code blocks.
- Format Normalization: Ensures headers and lists are consistently structured to help LLMs leverage structural attention.
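ContextMD's exact refinement prompt is not published, so the sketch below is only an illustration of what such a request might look like. The model name comes from the text above; the system-prompt wording and the `build_refinement_request` helper are assumptions, and the returned payload would be sent via OpenAI's Chat Completions API:

```python
def build_refinement_request(page_markdown: str) -> dict:
    """Build a hypothetical Chat Completions payload for the Refinement Layer.

    The system prompt here is an illustrative guess, not ContextMD's
    actual prompt; it encodes the three goals listed above.
    """
    system = (
        "You are a technical-density editor. Remove conversational filler, "
        "compress verbose explanations into high-density summaries, and "
        "preserve every API signature, code block, and header verbatim. "
        "Normalize heading and list structure."
    )
    return {
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": page_markdown},
        ],
        "temperature": 0,  # deterministic output; no creative rewriting
    }


request = build_refinement_request("## API\n`get(url)` fetches a page.")
```

A temperature of 0 is the natural choice for this kind of pipeline, since the goal is faithful compression rather than paraphrase.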
## Controlling Context Volume
To manage your token budget and prevent context overflow in models like Claude 3.5 Sonnet or GPT-4o, use the following controls:
### 1. Page Limits
Use the `--limit` (or `-l`) flag to cap the number of pages crawled. This is the most effective way to keep the final output within a specific token range.
```shell
# Limit output to approximately the top 20 most relevant pages
npx contextmd https://docs.example.com --limit 20
```
### 2. Strategic Scoping
Instead of crawling a root domain, target specific sub-directories to generate "modular" context files. This allows you to feed your agent only the relevant module documentation rather than the entire library.
```shell
# High-density context for just the API reference
npx contextmd https://docs.example.com/api-reference/v1
```
## Content Filtering (Noise Reduction)
ContextMD automatically performs "surgical" HTML cleaning before the AI even sees the content. This reduces the initial token count and prevents the model from being distracted by UI elements. The following elements are stripped by default:
| Element Type | Description |
| :--- | :--- |
| Navigation | Top bars, sidebars, and breadcrumb lists. |
| Footers | Copyright notices, social links, and site maps. |
| Scripts/Styles | All `<script>`, `<style>`, and `<noscript>` tags. |
| Interactive UI | Iframes and elements with `role="navigation"`. |
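The tag-based filters in the table can be approximated in a few lines of Python. This is an illustrative sketch, not ContextMD's implementation: it strips the element types named above by tag name, and omits attribute-based filters such as `role="navigation"` for brevity.

```python
from html.parser import HTMLParser

# Boilerplate elements to drop, mirroring the table above.
STRIP_TAGS = {"nav", "footer", "script", "style", "noscript", "iframe"}


class NoiseStripper(HTMLParser):
    """Re-emit HTML with boilerplate elements (and their contents) removed."""

    def __init__(self):
        super().__init__()
        self.skipping = []  # stack of currently open stripped tags
        self.out = []       # surviving HTML fragments

    def handle_starttag(self, tag, attrs):
        if tag in STRIP_TAGS:
            self.skipping.append(tag)
        elif not self.skipping:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skipping and tag == self.skipping[-1]:
            self.skipping.pop()
        elif not self.skipping:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skipping:
            self.out.append(data)


html = '<main><nav><a href="/">Home</a></nav><p>Keep me</p><script>track();</script></main>'
parser = NoiseStripper()
parser.feed(html)
print("".join(parser.out))  # <main><p>Keep me</p></main>
```

Note how the `<nav>` contents and the script body disappear entirely, which is why this pass cuts the token count before the model ever sees the page.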
## Token Budgeting Best Practices
When preparing a context.md file for an LLM, consider the following target sizes:
- Small (10-30 pages): Ideal for RAG (Retrieval-Augmented Generation) or small context windows (8k-32k). Use `-l 20`.
- Medium (30-100 pages): Optimized for "Long Context" models like GPT-4o (128k window). Use `-l 75`.
- Large (100+ pages): Best suited for "Mega Context" models like Gemini 1.5 Pro (1M+ window) or Claude 3.5 (200k window). Use `-l 200`.
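When scripting ContextMD runs for several target models, the three tiers above can be collapsed into a small helper. The thresholds come directly from the list; the function name is hypothetical:

```python
def suggest_page_limit(context_window_tokens: int) -> int:
    """Map a model's context window to a ContextMD --limit value.

    Tiers follow the budgeting guidance above: small windows (<= 32k)
    get -l 20, long-context models (<= 128k) get -l 75, and
    mega-context models get -l 200.
    """
    if context_window_tokens <= 32_000:
        return 20
    if context_window_tokens <= 128_000:
        return 75
    return 200


print(suggest_page_limit(128_000))   # 75, e.g. GPT-4o
print(suggest_page_limit(1_000_000)) # 200, e.g. Gemini 1.5 Pro
```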
> [!TIP]
> **Output Monitoring:** After generation, check the file size of `context.md`. A rough estimate for refined technical Markdown is one token per 4 characters (about 0.75 tokens per word).
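That size check is easy to script. The sketch below assumes the common rule of thumb of roughly 4 characters per token for English technical text; treat the result as an order-of-magnitude estimate, not an exact count.

```python
import os


def estimate_tokens(path: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from file size.

    Assumes ~1 byte per character (mostly-ASCII Markdown) and the
    ~4-characters-per-token heuristic; tune chars_per_token if your
    content is heavy on code or non-ASCII text.
    """
    return int(os.path.getsize(path) / chars_per_token)
```

For example, a 400 KB `context.md` would come out to roughly 100k tokens, which just fits a 128k-window model with room left for the conversation itself.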