bioRxiv and medRxiv Preprints Scraper
Fetch bioRxiv or medRxiv preprint metadata by date window, category, cursor, or DOI through the public bioRxiv API, returning normalized DOI, title, author, date, version, license, category, abstract, JATS XML, PDF, and published-article links.
Overview
bioRxiv and medRxiv Preprints Scraper gives agents a structured route into the public bioRxiv API for biology and health-science preprint metadata. Pull a bounded date window from bioRxiv or medRxiv, apply a subject-category filter, continue from a cursor, or fetch one exact preprint DOI. The tool normalizes the API response into compact rows with preprint DOI, server, title, authors, posted date, version, manuscript type, license, subject category, abstract text, corresponding author details, funder labels, published DOI when linked, article URL, PDF URL, and JATS XML URL. It is designed for preprint surveillance, biomedical and life-sciences literature review, science-news monitoring, pharma and biotech competitive intelligence, RAG corpus triage, weekly reading lists, and reproducible research snapshots.
Last validated: Jul 3, 2026
Input
doi (string): Exact preprint DOI for DOI mode, e.g. 10.1101/2024.05.28.596311.
mode ("date_range" | "doi", default: "date_range"): Fetch preprints by date window or fetch one exact preprint DOI.
limit (integer, default: 10): Maximum records to return from this single API response.
cursor (integer, default: 0): bioRxiv API cursor/start offset for a date window.
server ("biorxiv" | "medrxiv", default: "biorxiv"): Preprint server to query.
date_to (string): End date in YYYY-MM-DD format for date_range mode.
category (string): Optional category filter such as neuroscience, cell_biology, or sports_medicine.
date_from (string): Start date in YYYY-MM-DD format for date_range mode.
include_abstract (boolean, default: true): Include abstract text when returned by the public API.
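The input fields above can be assembled and sanity-checked before a run. The following is a minimal sketch, not part of the tool: `build_input` is a hypothetical helper, and it assumes that `date_from` and `date_to` are both needed in date_range mode and that `doi` is needed in doi mode (the schema lists the fields but does not mark any input as required).

```python
from datetime import date

ALLOWED_MODES = {"date_range", "doi"}
ALLOWED_SERVERS = {"biorxiv", "medrxiv"}


def build_input(mode="date_range", server="biorxiv", limit=10, cursor=0,
                date_from=None, date_to=None, doi=None, category=None,
                include_abstract=True):
    """Hypothetical helper: return an input dict using the documented field names."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    if server not in ALLOWED_SERVERS:
        raise ValueError(f"server must be one of {sorted(ALLOWED_SERVERS)}")

    payload = {
        "mode": mode,
        "server": server,
        "limit": limit,
        "include_abstract": include_abstract,
    }

    if mode == "doi":
        # Assumption: doi mode needs an exact DOI and ignores the date fields.
        if not doi:
            raise ValueError("doi is required when mode is 'doi'")
        payload["doi"] = doi
        return payload

    # Assumption: date_range mode needs both ends of the window.
    if not date_from or not date_to:
        raise ValueError("date_from and date_to are required in date_range mode")
    start = date.fromisoformat(date_from)  # raises ValueError on a malformed date
    end = date.fromisoformat(date_to)
    if start > end:
        raise ValueError("date_from must not be later than date_to")

    payload.update({"date_from": date_from, "date_to": date_to, "cursor": cursor})
    if category:
        payload["category"] = category
    return payload
```

Checking the window locally keeps a malformed date from spending a request, which matters because each run is limited to one bounded request.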
Output
doi (string): Exact DOI used.
mode (string, required): Mode used for this run.
count (integer, required): Number of returned preprints.
cursor (integer): Cursor/start offset used.
server (string, required): Server queried.
status (string): API status message.
date_to (string): End date used.
date_from (string): Start date used.
preprints (object[], required): Normalized bioRxiv or medRxiv preprint records.
source_url (string, required): bioRxiv API URL fetched.
total_matches (integer): Total API matches reported for the interval.
new_papers_count (integer): New paper count reported for the interval.
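Because each run returns a single API response, walking a larger date window means feeding a new `cursor` into the next run. The sketch below shows one way to derive it from the output fields. `next_cursor` is a hypothetical helper, and it assumes that `cursor` is a zero-based start offset, that advancing it by `count` reaches the next unread record, and that `total_matches` counts every record in the interval. These assumptions may not hold when a category filter or a small `limit` makes `count` smaller than the number of records the upstream API actually returned, so treat the result as a starting point to verify against real output.

```python
def next_cursor(result):
    """Hypothetical helper: next start offset, or None if the window looks exhausted."""
    count = result.get("count", 0)
    cursor = result.get("cursor") or 0
    total = result.get("total_matches")

    # An empty response means there is nothing further to page through.
    if count == 0:
        return None

    following = cursor + count
    # Stop once the offset reaches the reported total, when a total is present.
    if total is not None and following >= total:
        return None
    return following
```

A caller would repeat the run with the returned value as `cursor` until the helper returns None.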
Examples
biorxiv-neuroscience-window
{
"mode": "date_range",
"limit": 2,
"server": "biorxiv",
"date_to": "2024-06-03",
"date_from": "2024-06-01",
"include_abstract": false
}
biorxiv-doi
{
"doi": "10.1101/2024.05.28.596311",
"mode": "doi",
"server": "biorxiv",
"include_abstract": false
}
Use cases
FAQ
Does bioRxiv and medRxiv Preprints Scraper require an API key?
No. Version 0.1 uses the public bioRxiv API and keeps each run to one bounded request. Upstream rate limits can still apply to aggressive use.
Does it download PDFs or parse full text?
No. It returns metadata plus preprint, PDF, and JATS XML URLs when the public API provides enough information. It does not download PDFs, parse JATS XML, or crawl article pages.
Is this tool affiliated with bioRxiv, medRxiv, or Cold Spring Harbor Laboratory?
No. This Better Fetch tool uses public API data and is not endorsed by or affiliated with bioRxiv, medRxiv, openRxiv, or Cold Spring Harbor Laboratory.
Use it anywhere
MCP (Claude, Cursor, any client)
# Add the Better Fetch MCP connector (or paste the URL into
# Claude → Settings → Connectors → Add custom connector):
claude mcp add --transport http better-fetch https://betterfetch.co/api/mcp \
--header "Authorization: Bearer bf_your_key_here"
# Then ask for the tool by name: biorxiv_medrxiv_preprints_scraper
REST
curl -sS -X POST "https://betterfetch.co/api/tools/biorxiv_medrxiv_preprints_scraper/run" \
-H "Authorization: Bearer bf_your_key_here" \
-H "Content-Type: application/json" \
-d '{"input": {"mode":"date_range","limit":2,"server":"biorxiv","date_to":"2024-06-03","date_from":"2024-06-01","include_abstract":false}}'
Run locally
git clone https://github.com/better-fetch/tools && cd tools/tools/biorxiv-medrxiv-preprints-scraper && npm i
BETTER_FETCH_API_KEY=bf_your_key_here npx bf-tool run --input '{"mode":"date_range","limit":2,"server":"biorxiv","date_to":"2024-06-03","date_from":"2024-06-01","include_abstract":false}'
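For callers who prefer Python over curl, the REST request shown earlier can be sketched with the standard library. The endpoint URL, headers, and the `{"input": ...}` body wrapper are taken from the curl example; `build_run_request` and `run_tool` are hypothetical names. The shape of the HTTP response body is an assumption: this page documents the tool's output fields but not whether the REST endpoint wraps them in an envelope.

```python
import json
import urllib.request

# Endpoint copied from the curl example in the REST section.
RUN_URL = "https://betterfetch.co/api/tools/biorxiv_medrxiv_preprints_scraper/run"


def build_run_request(tool_input, api_key):
    """Build (but do not send) the POST request used in the curl example."""
    body = json.dumps({"input": tool_input}).encode("utf-8")
    return urllib.request.Request(
        RUN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def run_tool(tool_input, api_key, timeout=30):
    """Send the request and decode the JSON response.

    Assumption: the endpoint answers with a JSON body; its exact shape
    is not documented on this page.
    """
    request = build_run_request(tool_input, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_tool` with the biorxiv-doi example input and the key from the `BETTER_FETCH_API_KEY` environment variable would issue the same request as the curl command. Splitting request construction from sending makes the payload easy to inspect without touching the network.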