Local Development
Follow these steps to set up ContextMD on your local machine for development, testing, or customization.
Prerequisites
Before you begin, ensure you have the following installed:
- Node.js (v18.0.0 or higher recommended)
- npm or yarn
- An OpenAI API Key (required for the AI-powered refinement feature)
1. Clone the Repository
Start by cloning the repository and navigating into the project directory:
git clone https://github.com/UditAkhourii/contextmd.git
cd contextmd
2. Install Dependencies
Install the required packages using npm:
npm install
3. Configure Environment Variables
ContextMD requires an OpenAI API key to process and refine documentation content. Create a .env file in the root directory to store your credentials:
# Create the .env file
touch .env
Add your API key to the file:
OPENAI_API_KEY=your_api_key_here
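How the key is read at startup is not shown in this guide. The following is a minimal sketch, assuming the file contains plain KEY=VALUE lines. The names `parseEnvFile` and `requireApiKey` are hypothetical, and the project may instead rely on a library such as dotenv.

```typescript
// Hypothetical helpers for illustration; ContextMD's actual loading code may differ.

// Parse the text of a .env file into key/value pairs.
// Blank lines and lines starting with "#" are skipped.
function parseEnvFile(text: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=VALUE line
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    vars[key] = value;
  }
  return vars;
}

// Return the API key, or throw a readable error when it is missing
// or still set to the placeholder value from the example above.
function requireApiKey(vars: Record<string, string | undefined>): string {
  const key = vars["OPENAI_API_KEY"];
  if (!key || key === "your_api_key_here") {
    throw new Error("OPENAI_API_KEY is not set. Add it to the .env file in the project root.");
  }
  return key;
}

// Usage with an in-memory sample instead of a real file.
const sample = "# credentials\nOPENAI_API_KEY=sk-example\n";
const apiKey = requireApiKey(parseEnvFile(sample));
console.log(`Loaded a key of length ${apiKey.length}`);
```

Failing early with a clear message tends to be easier to debug than an authentication error surfacing in the middle of a crawl.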
4. Running in Development Mode
Since ContextMD is written in TypeScript, you can run the CLI directly from the source code using ts-node. This allows you to test changes without a full build step.
To run the utility against a documentation site:
npx ts-node src/index.ts https://docs.example.com
Common Development Commands
| Task | Command |
| :--- | :--- |
| Run with custom output | npx ts-node src/index.ts <URL> -o custom-context.md |
| Limit the number of pages crawled | npx ts-node src/index.ts <URL> -l 10 |
| Build the project | npm run build |
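To make the flags in the table concrete, here is a rough sketch of how such arguments could be parsed. It is not the project's actual parser, which may use an argument-parsing library. The `CliOptions` shape, the `parseArgs` name, and the default output name context.md are assumptions for illustration.

```typescript
// Hypothetical sketch; the real CLI may parse its arguments differently.
interface CliOptions {
  url: string;
  output: string;            // assumed default: "context.md"
  limit: number | undefined; // undefined when no limit flag was passed
}

function parseArgs(argv: string[]): CliOptions {
  let url: string | undefined;
  let output = "context.md";
  let limit: number | undefined;

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "-o") {
      const value = argv[++i];
      if (value === undefined) throw new Error("-o requires a file name");
      output = value;
    } else if (arg === "-l" || arg === "--limit") {
      const value = Number.parseInt(argv[++i] ?? "", 10);
      if (!Number.isInteger(value) || value <= 0) {
        throw new Error(`${arg} requires a positive integer`);
      }
      limit = value;
    } else if (url === undefined) {
      url = arg; // the first positional argument is the site to crawl
    } else {
      throw new Error(`Unexpected argument: ${arg}`);
    }
  }

  if (url === undefined) throw new Error("Missing documentation URL");
  return { url, output, limit };
}

console.log(parseArgs(["https://docs.example.com", "-l", "10", "-o", "custom-context.md"]));
```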
5. Project Structure
For developers looking to extend the functionality:
- src/index.ts: The entry point for the CLI. Handles command-line arguments and orchestrates the crawl/process workflow.
- src/crawler.ts: Contains the Crawler class. Handles recursive URL discovery, domain filtering, and HTML fetching using axios and cheerio.
- src/processor.ts: Contains the Processor class. Handles HTML-to-Markdown conversion, noise reduction (removing nav/footers), and the GPT-4o-mini refinement logic.
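As a rough illustration of URL discovery with domain filtering, the sketch below walks a site breadth-first and stops at a page limit. It is not the actual Crawler class. The names `sameHostLinks`, `crawl`, and `fetchLinks` are hypothetical, and `fetchLinks` stands in for the axios and cheerio step so the example runs without network access.

```typescript
// Hypothetical sketch; the real Crawler class may be structured differently.

// Resolve hrefs against the page URL and keep only links on the same host.
function sameHostLinks(pageUrl: string, hrefs: string[]): string[] {
  const host = new URL(pageUrl).hostname;
  const kept: string[] = [];
  for (const href of hrefs) {
    let next: URL;
    try {
      next = new URL(href, pageUrl); // resolves relative links
    } catch {
      continue; // skip malformed hrefs
    }
    next.hash = ""; // treat #fragment variants as the same page
    if (next.hostname === host) kept.push(next.href);
  }
  return kept;
}

// Assumed stand-in for fetching a page and extracting its hrefs.
type FetchLinks = (url: string) => Promise<string[]>;

// Breadth-first crawl that visits at most `limit` pages.
async function crawl(startUrl: string, limit: number, fetchLinks: FetchLinks): Promise<string[]> {
  const visited = new Set<string>();
  const queue: string[] = [new URL(startUrl).href];

  while (queue.length > 0 && visited.size < limit) {
    const current = queue.shift();
    if (current === undefined || visited.has(current)) continue;
    visited.add(current);

    const links = sameHostLinks(current, await fetchLinks(current));
    for (const link of links) {
      if (!visited.has(link)) queue.push(link);
    }
  }
  return Array.from(visited);
}

// In-memory site used in place of real HTTP requests.
const fakeSite: Record<string, string[]> = {
  "https://docs.example.com/": ["/guide", "https://other.example.org/", "/api#intro"],
  "https://docs.example.com/guide": ["/", "/api"],
  "https://docs.example.com/api": [],
};

crawl("https://docs.example.com", 5, async (url) => fakeSite[url] ?? [])
  .then((pages) => console.log(pages));
```

Injecting the fetch step as a function keeps the traversal logic testable on its own, which can be handy when changing the filtering rules.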
6. Testing Changes
When modifying the codebase, you can verify your changes by running the CLI with the --limit flag to keep execution times short:
npx ts-node src/index.ts https://react.dev --limit 5
This will crawl only 5 pages, allowing you to quickly inspect the context.md output and ensure your logic for cleaning or refining content is working as expected.
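Cleaning logic can also be checked against a small HTML fixture before running a crawl. The sketch below is a simplified stand-in for the Processor's noise reduction, not its actual code: `stripNoise` and `NOISE_TAGS` are hypothetical names, and a regular expression is used only to keep the example dependency-free. It does not handle nested elements of the same tag, so a DOM-based approach is likely the better fit in the real code.

```typescript
// Hypothetical stand-in for noise reduction; for illustration only.
const NOISE_TAGS = ["nav", "footer"];

// Remove whole <nav>...</nav> and <footer>...</footer> elements from an HTML string.
function stripNoise(html: string): string {
  let result = html;
  for (const tag of NOISE_TAGS) {
    const pattern = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?</${tag}>`, "gi");
    result = result.replace(pattern, "");
  }
  return result;
}

const page =
  '<nav><a href="/">Home</a></nav><main><h1>Hooks</h1></main><footer>Copyright Docs</footer>';
const cleaned = stripNoise(page);
console.log(cleaned); // prints "<main><h1>Hooks</h1></main>"
```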