CLI Reference
The executable command is nber-cli.
Global Options
| Option | Description |
|---|---|
| -v, --version | Print the installed NBER-CLI version. |
| -h, --help | Show command help. |
Running nber-cli without a subcommand prints the top-level help and exits successfully.
Commands
| Command | Purpose |
|---|---|
| download | Download one or more paper PDFs. |
| info | Show metadata and abstract for one paper. |
| search | Search NBER working papers. |
| db | Manage the local SQLite database. |
| feed | Manage the NBER new working papers RSS feed cache. |
| mcp-server | Start the MCP server for agents. |
download
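Download one paper:
nber-cli download w34567
Download to an explicit file:
nber-cli download w34567 --file ~/papers/nber/w34567.pdf
Download to a target directory:
nber-cli download w34567 --save-base ~/papers/nber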
Batch download:
nber-cli download --batch w34567 w25000 w32000 --save-base ~/papers/nber
nber-cli download -b w34567 w25000 w32000 -s ~/papers/nber
download Options
| Option | Description |
|---|---|
| paper_id | Optional positional paper ID for single downloads, for example w34567. |
| --file, -f | Explicit target PDF path for a single download. |
| --save-base, -s | Target directory for generated <paper_id>.pdf files. Defaults to the current working directory. |
| --batch, -b | One or more paper IDs to download concurrently. |
download Rules
- A single positional paper ID and --batch cannot be used together.
- --file is only supported for a single paper.
- Batch mode supports --save-base only.
- If neither --file nor --save-base is passed, PDFs are saved in the current working directory.
- If a paper is unavailable, NBER-CLI exits with code 1 and prints a readable error message.
download Filesystem Behavior
- Existing files are overwritten. When the target PDF path already exists, NBER-CLI writes the new bytes in place and overwrites the previous file. There is no "skip if newer" or "preserve on error" mode.
- No atomic rename. The download is read fully into memory and then written to the target path in a single write_bytes call. If the process is killed, the host loses power, or the disk fills up mid-write, the file at the target path can be left empty, truncated, or partially written. The previous file (when it existed) is not preserved on the failure path.
- Parent directories are auto-created. The parent of the resolved output path is created with mkdir(parents=True, exist_ok=True). Missing intermediate directories do not cause a failure, but the process needs write permission on the deepest existing ancestor.
- Path resolution is literal. The string passed to --file (or <paper_id>.pdf derived from --save-base) is used verbatim. Relative paths resolve against the current working directory. Tilde expansion (~) is not performed; if you want ~-relative paths, your shell needs to expand them.
- Single download is fully in-memory. The full PDF body is buffered before any disk write, so a single download holds the entire PDF in memory for the duration of the transfer. Very large PDFs may briefly use several hundred MB of RAM.
- Python API callers own their session. When you call download_paper / download_paper_to_file / download_multiple_papers with a custom session=..., you own the underlying ClientSession (or RetryClient), its timeouts, its connector limits, and any retry behavior. NBER-CLI does not wrap your session in a retry client. The default NBER_CLI_CONFIG timeout and retry settings only apply when the function creates its own session.
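Callers who cannot tolerate a truncated or clobbered file can add the atomic step themselves. The sketch below is not part of NBER-CLI; it is a hypothetical helper (atomic_write_bytes is an invented name) showing the usual write-to-temp-then-rename pattern, assuming you already hold the PDF bytes in memory:

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(target: Path, data: bytes) -> None:
    """Hypothetical helper: write data next to target, then rename over it."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the rename stays on
    # one filesystem, which is what makes os.replace atomic.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Until this call succeeds, any previous file at target is untouched.
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the partial temp file and re-raise the original error.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

With this pattern a failure mid-write leaves at worst a stray .part file that is cleaned up on the error path, while the previous PDF stays readable.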
For the Python API, see Python API — Download PDF.
info
Show paper metadata:
nber-cli info w34567
Show all available fields:
nber-cli info w34567 --all
Return JSON:
nber-cli info w34567 --format json
info Options
| Option | Description |
|---|---|
| paper_id | Required paper ID, with or without the w prefix. |
| --all, -a | Include related fields and published-version information when available. |
| --format, -f | Output format: list or json. Defaults to list. |
| --refresh | Bypass the local info_cache and re-fetch from NBER. The new data is written back to the cache when the cache is enabled. |
When the cache is enabled and the cached entry has not yet passed the configured TTL, repeated info calls are served from the local database. The first info call after a TTL expiry, or any call with --refresh, performs a live fetch.
The TTL is sliding: every cache hit updates last_fetched_at and increments fetch_count, so frequently consulted papers keep their cached copy until at least cache_ttl_days have passed since the most recent hit. "Last fetched" therefore means "last local hit", not "last network fetch from NBER". --refresh always bypasses the cache and writes a fresh row.
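The sliding behavior is easier to see as a small model. The sketch below is illustrative only and is not NBER-CLI's code: CacheRow and serve_info are hypothetical names, the exact boundary comparison is an assumption, and so is resetting fetch_count on a fresh row.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass
class CacheRow:
    """Hypothetical stand-in for one info_cache row."""
    last_fetched_at: datetime
    fetch_count: int = 1


def serve_info(row: Optional[CacheRow], now: datetime, cache_ttl_days: int,
               refresh: bool = False) -> Tuple[str, CacheRow]:
    """Return ("hit" or "fetch", updated row) for one info call."""
    # Assumption: an entry is still fresh while its age is strictly below the TTL.
    fresh = row is not None and now - row.last_fetched_at < timedelta(days=cache_ttl_days)
    if fresh and not refresh:
        # Sliding TTL: the hit itself moves last_fetched_at forward.
        return "hit", CacheRow(last_fetched_at=now, fetch_count=row.fetch_count + 1)
    # Missing, expired, or --refresh: live fetch and a fresh row.
    # Assumption: the fresh row starts its counter again at 1.
    return "fetch", CacheRow(last_fetched_at=now, fetch_count=1)
```

In this model, a paper consulted every six days with a seven-day TTL is never re-fetched, because each hit restarts the clock.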
The MCP get_paper_info tool follows the same cache behavior, but it does not accept a per-call --refresh argument. The tool always honors the current info_cache toggle and TTL; agents that need a forced refresh must toggle the cache off, call get_paper_info, and toggle the cache back on (or rely on the next call after a TTL-driven re-fetch).
info cache
Manage the info_cache lookup behavior and clear cached records.
Show the current cache state, TTL, and row count:
nber-cli info cache
nber-cli info cache status
info cache and info cache status are equivalent — both print the same status view (cache enabled/disabled, current TTL, and row count). The explicit status sub-action is provided for symmetry with clear/clean and for scripts that prefer an unambiguous form.
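Toggle the cache globally:
nber-cli info cache --turn-on
nber-cli info cache --turn-off
Set the cache refresh interval in days:
nber-cli info cache --set-refresh <days>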
--set-refresh requires a positive integer. The new value is written to ~/.nber-cli/config.json and used as the TTL for every subsequent info call.
Clean cached records not refreshed in the last 30 days:
nber-cli info cache clear --days 30
Clean all cached records:
nber-cli info cache clear --all
info cache clean is a convenience alias for info cache clear --all.
Clean cached records by last_fetched_at date:
nber-cli info cache clear --end-date 2026-06-01
nber-cli info cache clear --start-date 2026-05-01 --end-date 2026-06-01
--end-date without --start-date cleans from the earliest cached record through the end date. --start-date and --end-date are inclusive. Passing only --start-date is invalid.
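The date-range rules can be written down as a short validation routine. This is a sketch of the rules as stated here, not NBER-CLI's implementation; resolve_clean_range and in_range are hypothetical names, and the error messages are invented.

```python
from datetime import date
from typing import Optional, Tuple


def resolve_clean_range(start_date: Optional[str],
                        end_date: Optional[str]) -> Tuple[Optional[date], date]:
    """Validate the date flags and return (start or None, end)."""
    if end_date is None:
        # Covers the invalid "only --start-date" case; message is invented.
        raise ValueError("--end-date is required for a date-range clean")
    end = date.fromisoformat(end_date)  # expects YYYY-MM-DD
    start = date.fromisoformat(start_date) if start_date is not None else None
    return start, end


def in_range(day: date, start: Optional[date], end: date) -> bool:
    """Both bounds are inclusive; a missing start means 'from the earliest record'."""
    return (start is None or start <= day) and day <= end
```

The same rules are described for feed clean, so the model applies to both commands.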
Before deleting anything, info cache clear prints how many cached records match and asks for confirmation:
This operation is irreversible.
Deleted info cache records may be fetched again from NBER.
Continue? [y/N]:
Only y or Y continues. Any other response aborts without deleting records.
info cache Options
| Subcommand | Option | Description |
|---|---|---|
| (none) | --turn-on | Enable the info cache globally. |
| (none) | --turn-off | Disable the info cache globally. |
| (none) | --set-refresh | Set the info cache refresh interval in days. Must be a positive integer. |
| status | (none) | Print the current cache state, TTL, and row count. Equivalent to running info cache with no sub-action. |
| clear | --days | Clean cached records not refreshed for this many days. Defaults to 30. |
| clear | --all | Clean all cached records. |
| clear | --start-date | Clean cached records refreshed on or after this date, formatted YYYY-MM-DD. |
| clear | --end-date | Clean cached records refreshed on or before this date, formatted YYYY-MM-DD. |
| clean | (none) | Alias for clear --all. |
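search
Search by query:
nber-cli search "<query>"
Use date filters:
nber-cli search "<query>" --start-date 2026-05-01 --end-date 2026-06-01
Change pagination:
nber-cli search "<query>" --page 2 --per-page 50
Return JSON:
nber-cli search "<query>" --format json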
search Options
| Option | Description |
|---|---|
| query | Required search query. It may be a title, number, author, abstract phrase, or keyword. |
| --start-date, --start | Include papers on or after this date, formatted YYYY-MM-DD. |
| --end-date, --end | Include papers on or before this date, formatted YYYY-MM-DD. |
| --page | Result page to fetch. Defaults to 1. |
| --per-page | Results per page. Allowed values: 20, 50, 100. Defaults to 20. |
| --format, -f | Output format: list or json. Defaults to list. |
When only --start-date is provided, NBER-CLI automatically uses the current date as the end date.
feed
feed works with NBER's new working papers RSS feed and the local SQLite database. The database tracks which RSS items have already been seen, so feed fetch can show only newly discovered papers by default.
feed fetch
Fetch the RSS feed, store all fetched items in the cache, and display only new items by default:
nber-cli feed fetch
Display all fetched RSS items, including items already present in the cache:
nber-cli feed fetch --display-all true
--display-all accepts a boolean value. The parser recognises (case-insensitive, whitespace tolerated) true, false, 1, 0, yes, no, y, n, on, off. When the flag is passed with no value (--display-all on its own) it defaults to true. Any other value is rejected with exit code 2.
Limit displayed output:
nber-cli feed fetch --max-items 5
When --max-items is provided and --display-all is omitted, --display-all defaults to true. This makes nber-cli feed fetch --max-items 5 show the first five fetched RSS items instead of showing nothing when there are no new items.
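The --display-all value parsing and its interaction with --max-items can be reproduced with plain argparse. This is a minimal sketch of the behavior described here, not NBER-CLI's own parser; parse_bool, build_parser, and resolve_display_all are hypothetical names.

```python
import argparse

_TRUE = {"true", "1", "yes", "y", "on"}
_FALSE = {"false", "0", "no", "n", "off"}


def parse_bool(value: str) -> bool:
    """Case-insensitive, whitespace-tolerant boolean parser."""
    text = value.strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    # argparse turns this into a usage error and exits with code 2.
    raise argparse.ArgumentTypeError(f"invalid boolean value: {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="feed-fetch-sketch")
    # nargs="?" with const=True: a bare --display-all means true.
    # default=None distinguishes "omitted" from an explicit false.
    parser.add_argument("--display-all", nargs="?", const=True,
                        default=None, type=parse_bool)
    parser.add_argument("--max-items", type=int, default=None)
    return parser


def resolve_display_all(args: argparse.Namespace) -> bool:
    if args.display_all is not None:
        return args.display_all
    # Flag omitted: true only when --max-items was given.
    return args.max_items is not None
```

Keeping the default as None rather than False is what lets the --max-items rule apply only when the user did not state a preference.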
Return JSON:
nber-cli feed fetch --format json
NBER-CLI strictly parses the RSS XML. To tolerate a known upstream formatting issue, it repairs only unescaped < characters followed by whitespace or a digit inside RSS title and description text. Other malformed XML is rejected. Parse errors exit with code 1, print a concise error with the line and column when available, and do not print command usage.
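A narrow repair of this kind can be sketched with a regular expression followed by a strict parse. The code below is an illustration under assumptions, not NBER-CLI's implementation: repair_rss is a hypothetical name, the sketch only handles title and description tags without attributes, and it does not special-case CDATA sections.

```python
import re
import xml.etree.ElementTree as ET

# A "<" followed by whitespace or a digit cannot start a tag, so it is
# treated as literal text that should have been escaped.
_BARE_LT = re.compile(r"<(?=[\s\d])")
# Text content of <title>...</title> and <description>...</description>.
_TEXT_ELEMENT = re.compile(r"(<(title|description)>)(.*?)(</\2>)", re.DOTALL)


def repair_rss(xml_text: str) -> str:
    """Escape bare '<' characters inside title and description text only."""
    def fix(match):
        body = _BARE_LT.sub("&lt;", match.group(3))
        return match.group(1) + body + match.group(4)
    return _TEXT_ELEMENT.sub(fix, xml_text)


def parse_rss(xml_text: str) -> ET.Element:
    """Repair the known issue, then parse strictly; other errors propagate."""
    return ET.fromstring(repair_rss(xml_text))
```

If the standard-library parser is used as in this sketch, ET.ParseError carries a position attribute holding the line and column, which is one way to produce the concise error described above.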
feed clean
Clean cached feed database records. This deletes records from the local cache, not from NBER. Deleted cache records may be fetched again as new items if they still appear in the RSS feed.
Clean records not seen for 30 days:
nber-cli feed clean --days 30
Clean all cached records:
nber-cli feed clean --all
Clean records by last-seen date:
nber-cli feed clean --end-date 2026-05-31
nber-cli feed clean --start-date 2026-05-01 --end-date 2026-05-31
--end-date without --start-date cleans from the earliest cached record through the end date. --start-date and --end-date are inclusive. Passing only --start-date is invalid.
Before deleting anything, feed clean prints how many cached records match and asks for confirmation:
This operation is irreversible.
Deleted cache records may be fetched again as new items if they still appear in the RSS feed.
Continue? [y/N]:
Only y or Y continues. Any other response aborts without deleting records.
feed Options
| Subcommand | Option | Description |
|---|---|---|
| fetch | --display-all [true\|false] | Display all fetched RSS items instead of only new items. Accepts true/false/1/0/yes/no/y/n/on/off (case-insensitive). When passed without a value, defaults to true. |
| fetch | --format, -f | Output format: list or json. Defaults to list. |
| fetch | --max-items | Maximum number of feed items to display. |
| clean | --days | Clean cached records not seen for this many days. Defaults to 30. |
| clean | --all | Clean all cached feed records. |
| clean | --start-date | Clean cached records last seen on or after this date, formatted YYYY-MM-DD. |
| clean | --end-date | Clean cached records last seen on or before this date, formatted YYYY-MM-DD. |
db
db manages the local SQLite database used by info, search, download, and feed for cache and behavior logs.
db init
Initialize the database and write its path to the user config:
nber-cli db init
nber-cli db init --db-path ~/.nber-cli/nber.db
If --db-path is omitted, the default database path is ~/.nber-cli/nber.db.
If a legacy ~/.nber-cli/feed.db from 0.3.0 is present and no nber.db exists yet, NBER-CLI uses that legacy file in place. The schema is automatically upgraded from version 1 to version 2 on first use.
db migrate
Move the database to a new path and update the user config:
nber-cli db migrate <new_db_path>
Migration moves the SQLite database file and any SQLite sidecar files such as -wal, -shm, and -journal. The target path must not already exist.
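The file-moving part of a migration can be sketched as follows. This is an illustration of the rules stated above, not NBER-CLI's code: move_sqlite_db is a hypothetical name, the sketch assumes the target directory already exists, and it leaves out the config update.

```python
import shutil
from pathlib import Path
from typing import List

# Sidecar files SQLite may keep next to the main database file.
SIDECAR_SUFFIXES = ("-wal", "-shm", "-journal")


def move_sqlite_db(old_path: Path, new_path: Path) -> List[Path]:
    """Move the database and any existing sidecars; return the new paths."""
    if new_path.exists():
        # The target path must not already exist.
        raise FileExistsError(f"target already exists: {new_path}")
    moved = []
    for suffix in ("",) + SIDECAR_SUFFIXES:
        src = Path(str(old_path) + suffix)
        if src.exists():
            dst = Path(str(new_path) + suffix)
            shutil.move(str(src), str(dst))
            moved.append(dst)
    return moved
```

Moving the sidecars together with the main file matters because a -wal file can hold committed data that has not yet been checkpointed into the database.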
db Options
| Subcommand | Option | Description |
|---|---|---|
| init | --db-path | SQLite database path. Defaults to ~/.nber-cli/nber.db. |
| migrate | new_db_path | New SQLite database path. |
mcp-server
Start the default stdio MCP server:
nber-cli mcp-server
Start an HTTP transport:
nber-cli mcp-server --transport streamable_http --port 8000
mcp-server Options
| Option | Description |
|---|---|
| --transport | Transport mechanism: stdio or streamable_http. Defaults to stdio. |
| --port | Port for streamable_http. Defaults to 8000. |
For client configuration and tool details, see MCP Server.
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Command completed successfully, or help was printed. |
| 1 | Runtime failure such as a failed download, a network error, a parse error, or any other unhandled exception. |
| 2 | Invalid command-line arguments. Argparse raises SystemExit(2) and prints a usage message to stderr. |
A few extra rules that are easy to miss:
- A single download failure exits 1. The success line (Successfully downloaded <id> to <path>) goes to stdout; the failure line (Failed to download <id>: <reason>) goes to stderr. The download log row in download_log is written before the failure is printed.
- A batch download runs every requested paper and only exits 1 at the end if at least one paper failed. Success lines (Successfully downloaded ...) go to stdout; failures and the per-failure reasons go to stderr. The exit code is 0 only when every paper succeeded.
- A feed fetch RSS parse failure exits 1 and writes a concise error to stderr without printing command usage. The error includes the XML line and column when available.
- db init, db migrate, info cache clear, and feed clean print a confirmation prompt to stderr. The command aborts with exit code 0 if the user declines (Abort. is printed to stderr). The actual deletion (when confirmed) exits 0 on success.
- Database record-keeping failures (record_query, record_download, record_info, touch_info_cache, write_info_cache) print a one-line warning: failed to ... to stderr but do not raise. The main command's exit code is unaffected.
- The download module reads the entire PDF body into memory and then writes it in one call. A failure that occurs between the network read and the disk write (process kill, disk full, permission revoked) typically surfaces as a Python exception on the way out; the user sees the traceback on stderr and the process exits 1. There is no atomic-rename guarantee, so a target file may be left empty or partially written when this happens.
Output Formats
info, search, and feed fetch default to list, a readable text format. Use --format json when piping output into scripts or agent workflows.
The rule of thumb for scripting:
- stdout carries the human-readable output or the JSON payload (with --format json).
- stderr carries the cache-hit hint, every per-paper error message, every per-paper success line that is incidental to the main payload, the warning: ... line for failed background logging, and the confirmation prompts for destructive commands.
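This means a script that wants only the JSON payload can read stdout and discard stderr with 2>/dev/null. Closing stderr with 2>&- is shorter, but a write to a closed descriptor can fail, so 2>/dev/null is the safer form. A script that wants only the errors can redirect stderr into the pipe and drop stdout with 2>&1 >/dev/null.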
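From Python, the same separation can be kept by capturing the two streams independently. The helper below is a sketch, not part of NBER-CLI: run_json is a hypothetical name, and it assumes the command prints a single JSON document on stdout when it succeeds.

```python
import json
import subprocess
import sys
from typing import Any, List, Tuple


def run_json(cmd: List[str]) -> Tuple[int, Any, str]:
    """Run cmd and return (exit code, parsed stdout JSON or None, stderr text)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, text=True)
    payload = None
    # Only trust stdout as JSON when the command reported success.
    if proc.returncode == 0 and proc.stdout.strip():
        payload = json.loads(proc.stdout)
    return proc.returncode, payload, proc.stderr
```

A call such as run_json(["nber-cli", "info", "w34567", "--format", "json"]) would then give the payload and the stderr hints separately, and the returned exit code can be checked against the Exit Codes table.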