Purify turns any web page into pure Markdown before it hits your LLM. Spend $29/mo, save $5,000+ in AI token costs.
Works with your AI stack
Purify strips navigation, ads, and boilerplate. Your LLM only sees what matters.
Measured using tiktoken (GPT-4 tokenizer). Raw = unmodified HTML. After Purify = extracted Markdown.
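The measurement described above can be reproduced with a few lines of Python. This is a minimal sketch, not Purify's own benchmark script: the function name `token_savings` and the whitespace stand-in tokenizer are assumptions made so the example runs without third-party packages.

```python
from typing import Callable, Sequence


def token_savings(raw_html: str, markdown: str,
                  encode: Callable[[str], Sequence]) -> float:
    """Return the percentage of tokens removed going from raw HTML to Markdown."""
    raw_tokens = len(encode(raw_html))
    clean_tokens = len(encode(markdown))
    if raw_tokens == 0:
        return 0.0  # nothing to save on an empty page
    return 100.0 * (raw_tokens - clean_tokens) / raw_tokens


def whitespace_encode(text: str) -> list:
    # Stand-in tokenizer (an assumption for this sketch); it only splits on whitespace.
    return text.split()


if __name__ == "__main__":
    raw = "<html><nav>Home About</nav><p>Hello world</p></html>"
    clean = "Hello world"
    print(f"{token_savings(raw, clean, whitespace_encode):.1f}% fewer tokens")
```

To match the GPT-4 tokenizer named above, install tiktoken and pass `tiktoken.encoding_for_model("gpt-4").encode` as the `encode` argument.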
Three steps from zero to clean Markdown.
Download a single binary. No Node.js, no Python, no Docker. Just one file.
    # macOS / Linux
    curl -sSL https://purify.verifly.pro/install.sh | sh

    # Or download directly from GitHub
    wget https://github.com/Easonliuliang/purify/releases/latest/download/purify

POST any URL to the API. Works with dynamic JavaScript-heavy sites too.
    curl -X POST https://purify.verifly.pro/api/v1/scrape \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"url": "https://github.com/Easonliuliang/purify"}'

Receive structured, token-efficient Markdown. Ready for your LLM or AI agent.
    {
      "success": true,
      "markdown": "# Purify\n\nTurn any web page into clean Markdown...\n",
      "tokens_saved": "98.7%",
      "processing_time_ms": 420
    }

Purify ships a built-in MCP server. Drop one config file and your AI assistant can scrape the web.
Add Purify as an MCP server in Claude Desktop or Cursor. Your AI assistant can scrape any web page on demand.
~/Library/Application Support/Claude/claude_desktop_config.json:

    {
      "mcpServers": {
        "purify": {
          "command": "/path/to/purify-mcp",
          "env": {
            "PURIFY_API_URL": "https://purify.verifly.pro",
            "PURIFY_API_KEY": "YOUR_API_KEY"
          }
        }
      }
    }

A single binary that does web scraping extremely well.
Strip HTML junk before it hits your LLM. Verified savings across real websites.
One file, zero dependencies. No Docker, Node.js, or Python required.
Native Model Context Protocol support. Connect Claude, GPT, and other AI agents directly.
Apache 2.0 licensed. Self-host with no usage limits, or use our managed cloud API.
Headless browser renders dynamic SPA pages before extracting clean content.
Optimized pipeline delivers clean Markdown in under 500ms for most pages.
No cloud lock-in. No bloat. A single binary that does one thing extremely well.
Purify is a web scraping API that converts any web page into clean, token-efficient Markdown — optimized for LLMs and AI agents. It strips out navigation, ads, scripts, and boilerplate, saving you up to 99% on AI token costs.
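The scrape request and JSON response from the quick-start steps can be wrapped in a small Python client. This is a sketch using only the standard library: the endpoint, headers, and response fields come from the examples on this page, while the function names, the `ScrapeResult` type, the timeout, and the handling of unsuccessful responses are assumptions.

```python
import json
import urllib.request
from dataclasses import dataclass

# Endpoint taken from the curl example in the quick-start steps.
API_URL = "https://purify.verifly.pro/api/v1/scrape"


@dataclass
class ScrapeResult:
    markdown: str
    tokens_saved_pct: float
    processing_time_ms: int


def build_scrape_request(page_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_scrape_response(raw: str) -> ScrapeResult:
    """Decode the JSON body shown in the response example."""
    payload = json.loads(raw)
    if not payload.get("success"):
        # The error shape is not documented on this page, so surface the payload.
        raise ValueError(f"scrape failed: {payload!r}")
    return ScrapeResult(
        markdown=payload["markdown"],
        # "tokens_saved" arrives as a string such as "98.7%".
        tokens_saved_pct=float(payload["tokens_saved"].rstrip("%")),
        processing_time_ms=int(payload["processing_time_ms"]),
    )


def scrape(page_url: str, api_key: str, timeout: float = 30.0) -> ScrapeResult:
    """Send the request and return the parsed result (timeout is an assumption)."""
    req = build_scrape_request(page_url, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_scrape_response(resp.read().decode("utf-8"))
```

Calling `scrape("https://github.com/Easonliuliang/purify", "YOUR_API_KEY").markdown` would then return the extracted Markdown, ready to pass to an LLM.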
Purify is a single binary with zero dependencies. It's fully open-source (Apache 2.0), self-hostable, and ships with a built-in MCP server. No cloud lock-in, no bloat.
Purify can be self-hosted. Download the binary and run it anywhere: no Docker, no Node.js, no Python required. Self-hosted instances have no usage limits.
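A client can switch between the managed API and a self-hosted instance by reading the base URL from the environment. The sketch below rests on two assumptions that this page does not confirm: that a self-hosted instance serves the same `/api/v1/scrape` path as the cloud API, and that the `PURIFY_API_URL` variable from the MCP config is a sensible name to reuse in a client.

```python
import os
from typing import Mapping, Optional

# Managed cloud API host, as used in the examples on this page.
DEFAULT_BASE_URL = "https://purify.verifly.pro"
# Assumed to be identical on self-hosted instances.
SCRAPE_PATH = "/api/v1/scrape"


def scrape_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the scrape URL, preferring PURIFY_API_URL when it is set."""
    env = os.environ if env is None else env
    base = env.get("PURIFY_API_URL", DEFAULT_BASE_URL).rstrip("/")
    return base + SCRAPE_PATH
```

With `PURIFY_API_URL` unset, the function falls back to the managed endpoint; setting it to the address of a self-hosted instance redirects requests there.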
Model Context Protocol (MCP) is an open standard that lets AI agents interact with external tools. Purify's built-in MCP server lets Claude, GPT, and other agents scrape URLs directly.
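Registering the MCP server comes down to adding a `purify` entry under `mcpServers` in the assistant's config file. The sketch below merges that entry into an existing config without touching other servers. The entry's shape and environment variable names follow the config example on this page; the helper names and the merge behavior are assumptions.

```python
import json
from pathlib import Path


def add_purify_server(config: dict, binary_path: str, api_key: str) -> dict:
    """Return a copy of config with a "purify" entry under "mcpServers"."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["purify"] = {
        "command": binary_path,
        "env": {
            "PURIFY_API_URL": "https://purify.verifly.pro",
            "PURIFY_API_KEY": api_key,
        },
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, binary_path: str, api_key: str) -> None:
    """Read the config file if it exists, add the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_purify_server(existing, binary_path, api_key)
    path.write_text(json.dumps(updated, indent=2))
```

For Claude Desktop on macOS, `path` would be `~/Library/Application Support/Claude/claude_desktop_config.json` (expanded with `Path.expanduser()`), and `binary_path` the location of the `purify-mcp` binary.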
If you process 50,000 pages/month, raw HTML would cost ~$15,000 in GPT-4 tokens. With Purify, you'd pay ~$129 total ($29 API + ~$100 tokens). That's about 116× lower monthly spend ($15,000 / $129), or roughly $14,870 saved per month.
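The arithmetic behind this estimate can be checked with a short script. The four input figures are the ones quoted above; the per-page costs and the token-reduction percentage are derived here and are not stated by the source. The function name `cost_summary` is hypothetical.

```python
def cost_summary(pages: int, raw_token_cost: float,
                 plan_cost: float, clean_token_cost: float) -> dict:
    """Derive totals and per-page figures from the quoted monthly costs."""
    total = plan_cost + clean_token_cost
    return {
        "total_with_purify": total,
        "monthly_savings": raw_token_cost - total,
        "raw_cost_per_page": raw_token_cost / pages,
        "clean_token_cost_per_page": clean_token_cost / pages,
        "token_cost_reduction_pct": 100.0 * (1 - clean_token_cost / raw_token_cost),
    }


if __name__ == "__main__":
    # 50,000 pages, ~$15,000 raw tokens, $29 plan, ~$100 tokens after Purify.
    summary = cost_summary(50_000, 15_000.0, 29.0, 100.0)
    for name, value in summary.items():
        print(f"{name}: {value:,.3f}")
```

Swapping in your own page volume and token prices shows how the savings scale for a different workload.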
Purify handles JavaScript-heavy sites: it uses a headless browser to fully render dynamic pages before extracting content.
Join developers building smarter AI agents — start free, no credit card needed.