RAG Web Browser

Fetch one public web page through a browser and return clean RAG-ready text, Markdown, headings, links, images, JSON-LD type hints, and query snippets.

AI · v0.1.2 · ~1 credit/run · Source on GitHub

Overview

RAG Web Browser gives agents a focused retrieval primitive: open one public web page in a real browser, remove noisy markup, and return clean content that is easy to cite, embed, summarize, or route into a retrieval pipeline. It preserves the source URL, canonical URL, title, headings, Markdown-style body text, link citations, and optional query snippets so downstream LLM steps can reason over the page without parsing HTML.

Last validated: Jul 3, 2026

Input

url: string (uri), required

Public page URL to render and extract

query: string

Optional query terms used to return simple match counts and snippets from the extracted text

wait_ms: integer, default 500

Additional browser wait time after DOMContentLoaded

max_links: integer, default 20

Maximum page links to return as citation candidates

max_images: integer, default 0

Maximum images to return

max_text_chars: integer, default 12000

Maximum characters to return in text and markdown
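As a minimal sketch of how a client might assemble this input, the helper below builds a payload from the fields and defaults listed above. The function name `build_input`, the http(s) scheme check, and the non-negative integer check are illustrative assumptions made for this example; the service's own validation rules are not documented here.

```python
from typing import Any, Optional
from urllib.parse import urlparse

# Defaults copied from the Input list above.
DEFAULTS = {"wait_ms": 500, "max_links": 20, "max_images": 0, "max_text_chars": 12000}


def build_input(url: str, query: Optional[str] = None, **overrides: int) -> dict[str, Any]:
    """Return an input payload, omitting values equal to the documented defaults.

    Hypothetical helper: the checks below are client-side assumptions,
    not rules confirmed by the tool itself.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute http(s) URL, got {url!r}")

    payload: dict[str, Any] = {"url": url}
    if query:
        payload["query"] = query

    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise ValueError(f"unknown input field: {name}")
        if not isinstance(value, int) or isinstance(value, bool) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
        # Only send values that differ from the documented default.
        if value != DEFAULTS[name]:
            payload[name] = value
    return payload


if __name__ == "__main__":
    print(build_input("https://example.com", query="illustrative examples",
                      max_links=5, max_text_chars=2000))
```

Omitting default values keeps payloads short and makes any non-default setting easy to spot in logs.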

Output

url: string, required

Requested URL

text: string, required

Cleaned plain text

links: object[]

Page links, capped by max_links

query: string

Query used for snippets

title: string, required

Resolved page title

images: object[]

Page images, capped by max_images

headings: object[]

language: string

HTML language code when present

markdown: string, required

Markdown-style content preserving headings, lists, and quotes

snippets: string[]

Short snippets around matched query terms

final_url: string, required

Final URL after redirects

word_count: integer, required

Extracted word count

description: string

Meta description when present

canonical_url: string

Canonical URL when present

json_ld_types: string[]

JSON-LD @type values found on the page

query_matches: object[]

reading_time_minutes: integer, required

Approximate reading time at 225 words per minute
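A caller may want to confirm that a result carries every field marked required before passing it downstream. The sketch below does that, and also illustrates the 225 words-per-minute estimate. Both helper names are hypothetical, and the rounding rule (round up, minimum one minute) is an assumption, since the exact rounding the tool uses is not stated above.

```python
import math
from typing import Any

# Fields marked "required" in the Output list above.
REQUIRED_FIELDS = (
    "url", "text", "title", "markdown",
    "final_url", "word_count", "reading_time_minutes",
)


def missing_required(result: dict[str, Any]) -> list[str]:
    """Return the required output fields that are absent from a result."""
    return [name for name in REQUIRED_FIELDS if name not in result]


def estimate_reading_minutes(word_count: int, wpm: int = 225) -> int:
    """Estimate reading time in minutes.

    ASSUMPTION: rounds up with a one-minute floor; the tool's actual
    rounding is not documented.
    """
    return max(1, math.ceil(word_count / wpm))


if __name__ == "__main__":
    print(missing_required({"url": "https://example.com", "text": "..."}))
    print(estimate_reading_minutes(900))
```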

Examples

example-domain-fetch

{
  "url": "https://example.com",
  "query": "illustrative examples",
  "max_links": 5,
  "max_text_chars": 2000
}

Use cases

RAG ingestion

Fetch a URL and convert it into normalized text and Markdown before saving it into a vector store or knowledge base.
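One possible ingestion step, sketched under assumptions: split the returned markdown into heading-aligned chunks and attach source metadata before embedding. The function name `chunk_markdown`, the chunk record shape, and the assumption that headings appear as lines starting with `#` are all illustrative; the vector store call itself is left out.

```python
from typing import Any


def chunk_markdown(result: dict[str, Any], max_chars: int = 1000) -> list[dict[str, Any]]:
    """Split result["markdown"] into chunks, starting a new chunk at each heading.

    ASSUMPTION: headings are lines that start with "#". A single line longer
    than max_chars is kept whole rather than split mid-line.
    """
    source = result.get("canonical_url") or result["final_url"]
    chunks: list[str] = []
    current = ""
    for line in result["markdown"].splitlines():
        starts_section = line.startswith("#")
        too_long = len(current) + len(line) + 1 > max_chars
        # Flush the running chunk at a heading or when the size budget is hit.
        if current.strip() and (starts_section or too_long):
            chunks.append(current.strip())
            current = ""
        current += line + "\n"
    if current.strip():
        chunks.append(current.strip())

    return [
        {"id": f"{source}#chunk-{i}", "text": chunk, "source": source, "title": result["title"]}
        for i, chunk in enumerate(chunks)
    ]


if __name__ == "__main__":
    demo = {
        "markdown": "# Intro\nFirst paragraph.\n\n## Details\nSecond paragraph.",
        "final_url": "https://example.com/",
        "title": "Demo",
    }
    for record in chunk_markdown(demo):
        print(record["id"], repr(record["text"]))
```

Preferring canonical_url over final_url as the source key helps deduplicate pages that are reachable under several URLs.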

Agent browsing

Let an agent read a page, inspect headings, follow citation links, and decide what to fetch next.
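The "decide what to fetch next" step could look roughly like the sketch below, which picks unseen same-host links from a result. The shape of each entry in links is not documented here, so the `url` key is an assumption, as are the function name `next_urls` and the same-host policy.

```python
from typing import Any
from urllib.parse import urldefrag, urljoin, urlparse


def next_urls(result: dict[str, Any], seen: set[str], limit: int = 3) -> list[str]:
    """Pick unseen same-host links from a result as candidates for the next fetch."""
    base = result["final_url"]
    host = urlparse(base).netloc
    picked: list[str] = []
    for link in result.get("links") or []:
        href = link.get("url")  # ASSUMPTION: the key name is not documented
        if not href:
            continue
        # Resolve relative links and drop #fragments so duplicates collapse.
        absolute, _fragment = urldefrag(urljoin(base, href))
        if urlparse(absolute).netloc != host or absolute in seen:
            continue
        seen.add(absolute)
        picked.append(absolute)
        if len(picked) == limit:
            break
    return picked


if __name__ == "__main__":
    demo = {
        "final_url": "https://example.com/a",
        "links": [{"url": "/b#top"}, {"url": "https://other.org/"}, {"url": "/b"}],
    }
    print(next_urls(demo, seen={"https://example.com/a"}))
```

Because the tool fetches one URL per run, keeping the seen set on the agent side is what prevents loops across runs.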

Source-grounded summaries

Return page text plus title, canonical URL, links, and query snippets so summaries can stay grounded in the source material.
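To illustrate one way of keeping a summary grounded, the sketch below assembles a prompt from the title, the canonical or final URL, and numbered query snippets. The prompt wording, the function name `build_grounded_prompt`, and the 500-character fallback are illustrative assumptions, not part of the tool.

```python
from typing import Any


def build_grounded_prompt(result: dict[str, Any], max_snippets: int = 5) -> str:
    """Build a summarization prompt that cites numbered evidence from one page."""
    source = result.get("canonical_url") or result["final_url"]
    snippets = result.get("snippets") or []
    # Fall back to the start of the cleaned text when no query snippets exist.
    evidence = snippets[:max_snippets] or [result["text"][:500]]

    lines = [
        "Summarize the page using only the evidence below.",
        f"Title: {result['title']}",
        f"Source: {source}",
        "Evidence:",
    ]
    lines += [f"[{i}] {snippet.strip()}" for i, snippet in enumerate(evidence, start=1)]
    lines.append("Cite evidence by its bracketed number.")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = {
        "title": "Example Domain",
        "final_url": "https://example.com/",
        "text": "This domain is for use in illustrative examples in documents.",
        "snippets": ["for use in illustrative examples in documents"],
    }
    print(build_grounded_prompt(demo))
```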

FAQ

Does RAG Web Browser search the web?

No. This v0.1 release fetches one URL at a time and does not run searches. Google search orchestration is intentionally left for a later version so the published tool stays predictable and easy to validate.

What is the difference between text and markdown?

text is the cleaned body content joined as plain text. markdown keeps simple heading, list, and quote structure so an LLM can preserve document shape.

Can it fetch dynamic pages?

Yes. It renders through the Better Fetch browser engine with a configurable wait time, then extracts the visible HTML returned by that browser session.

Use it anywhere

MCP (Claude, Cursor, any client)

# Add the Better Fetch MCP connector (or paste the URL into
# Claude → Settings → Connectors → Add custom connector):
claude mcp add --transport http better-fetch https://betterfetch.co/api/mcp \
  --header "Authorization: Bearer bf_your_key_here"

# Then ask for the tool by name: rag_web_browser

REST

curl -sS -X POST "https://betterfetch.co/api/tools/rag_web_browser/run" \
  -H "Authorization: Bearer bf_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"input": {"url":"https://example.com","query":"illustrative examples","max_links":5,"max_text_chars":2000}}'
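For callers working in Python, the same request can be sketched with the standard library alone. The endpoint, headers, and body mirror the curl command above; the helper names are hypothetical, and because the response envelope is not documented here, the decoded JSON is returned unmodified.

```python
import json
import urllib.request

API_URL = "https://betterfetch.co/api/tools/rag_web_browser/run"


def build_request(api_key: str, tool_input: dict) -> urllib.request.Request:
    """Build the POST request without sending it (useful for inspection and tests)."""
    body = json.dumps({"input": tool_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def run_tool(api_key: str, tool_input: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body.

    ASSUMPTION: the response is a JSON object; its exact envelope is not
    documented above, so it is returned as-is. HTTP errors raise
    urllib.error.HTTPError.
    """
    request = build_request(api_key, tool_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_request("bf_your_key_here", {"url": "https://example.com", "max_links": 5})
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```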

Run locally

git clone https://github.com/better-fetch/tools.git && cd tools/tools/rag-web-browser && npm i
BETTER_FETCH_API_KEY=bf_your_key_here npx bf-tool run --input '{"url":"https://example.com","query":"illustrative examples","max_links":5,"max_text_chars":2000}'