Configuration
Most NBER-CLI runtime behavior uses built-in defaults. The local database also uses a small user config file to remember the SQLite database path selected by nber-cli db init or nber-cli db migrate.
Runtime Defaults
| Setting | Default | Description |
|---|---|---|
| Request timeout | 60 seconds | Total timeout for network requests. |
| Retry count | 3 | Failed eligible requests are retried before surfacing the error. |
| Request attempts | 4 | Derived from retry count plus the first attempt. |
| Download connection limit | 100 | Maximum concurrent download connections. |
| Per-host connection limit | 10 | Maximum concurrent connections to one host. |
| Search page sizes | 20, 50, 100 | Accepted values for --per-page. |
These values live in NBERCLIConfig and NBER_CLI_CONFIG. They are hard-coded constants: they are not exposed in ~/.nber-cli/config.json, are not read from any environment variable, and cannot be set through CLI flags. To change them you must edit the source code and reinstall the package.
If you need different network behavior at runtime, the supported escape hatch is to call the Python API directly and pass a custom aiohttp.ClientSession (or aiohttp_retry.RetryClient) with the timeout, connector limits, and retry policy you want. The package's built-in retry client and connector are only used when the function creates its own session.
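A minimal sketch of that escape hatch, assuming aiohttp is installed. The field values mirror the defaults in the table above. NetworkSettings and build_session are hypothetical helper names, not part of NBER-CLI, and the NBER-CLI call that would receive the session is left as a comment because its signature is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkSettings:
    """Hypothetical container mirroring the documented defaults."""

    timeout_seconds: float = 60.0
    retry_count: int = 3
    connection_limit: int = 100
    per_host_limit: int = 10

    @property
    def request_attempts(self) -> int:
        # One initial attempt plus the configured retries.
        return self.retry_count + 1


async def build_session(settings: NetworkSettings):
    """Create an aiohttp session with a custom timeout and connector limits."""
    import aiohttp  # imported lazily so the settings class works without it

    timeout = aiohttp.ClientTimeout(total=settings.timeout_seconds)
    connector = aiohttp.TCPConnector(
        limit=settings.connection_limit,
        limit_per_host=settings.per_host_limit,
    )
    return aiohttp.ClientSession(timeout=timeout, connector=connector)


async def main() -> None:
    # Example: a shorter timeout and fewer connections than the defaults.
    settings = NetworkSettings(
        timeout_seconds=15.0, connection_limit=20, per_host_limit=4
    )
    session = await build_session(settings)
    try:
        # Pass `session` to the NBER-CLI Python API call here (not shown).
        pass
    finally:
        await session.close()
```

Run the sketch with asyncio.run(main()). The session is created inside a coroutine because aiohttp expects a running event loop when the connector is built.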
What is configurable today
The following table is exhaustive; values not listed here are constants.
| Surface | Configurable? | Where |
|---|---|---|
| info.cache_enabled | Yes | ~/.nber-cli/config.json; toggle via nber-cli info cache --turn-on/--off |
| info.cache_ttl_days | Yes | ~/.nber-cli/config.json; set via nber-cli info cache --set-refresh <N> |
| feed.db-path (SQLite path) | Yes | ~/.nber-cli/config.json; set via nber-cli db init --db-path ... or nber-cli db migrate ... |
| Request timeout | No | Code constant in NBERCLIConfig |
| Retry count / request attempts | No | Code constant in NBERCLIConfig |
| Download connection limits | No | Code constant in NBERCLIConfig |
| Per-host connection limit | No | Code constant in NBERCLIConfig |
| Search page sizes | No | Code constant in NBERCLIConfig |
| User agent string | No | Generated by fake_useragent.UserAgent per request |
| HTTP headers (other than UA) | No | Hard-coded in the download module |
User Config File
The user config file is ~/.nber-cli/config.json.
Current schema:

    {
      "schema_version": 2,
      "feed": {
        "db-path": "/Users/name/.nber-cli/nber.db"
      },
      "info": {
        "cache_enabled": true,
        "cache_ttl_days": 30
      }
    }

feed.db-path points to the SQLite database used by info, search, download, and feed. The historical feed key name is preserved for backward compatibility; the database itself is general-purpose.
schema_version records the current database schema version. NBER-CLI updates it after db init or schema upgrades.
info.cache_enabled controls the info_cache lookup globally. Set to false to force every info call (and the MCP get_paper_info tool) to go straight to NBER. Defaults to true.
info.cache_ttl_days sets the refresh interval in days. The TTL is sliding: every cache hit updates the row's last_fetched_at (via touch_info_cache) and increments fetch_count, so repeatedly consulted papers keep their cached copy for at least cache_ttl_days after the most recent hit. Cached entries whose last_fetched_at is older than the TTL threshold are treated as cache misses and re-fetched on the next info call. Must be a positive integer. Defaults to 30.
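The sliding TTL can be sketched as follows. This is an illustration only: the entry dictionary is a hypothetical in-memory stand-in for an info_cache row, lookup is not an NBER-CLI function, and treating an entry exactly at the threshold as fresh is an assumption.

```python
from datetime import datetime, timedelta


def is_cache_fresh(last_fetched_at: datetime, ttl_days: int, now: datetime) -> bool:
    """Return True while the entry is still inside the TTL window."""
    return now - last_fetched_at <= timedelta(days=ttl_days)


def lookup(entry: dict, ttl_days: int, now: datetime) -> bool:
    """Simulate one info call against a cached entry; True means cache hit."""
    if is_cache_fresh(entry["last_fetched_at"], ttl_days, now):
        # Sliding TTL: a hit restarts the window and bumps the counter.
        entry["last_fetched_at"] = now
        entry["fetch_count"] += 1
        return True
    # Miss: the caller would re-fetch from NBER and rewrite the row.
    return False
```

With a 30-day TTL, an entry fetched on day 0 and hit again on day 20 is still served on day 45, because only 25 days have passed since the last hit. A lookup on day 80 with no hits in between is a miss.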
Both info keys are managed by nber-cli info cache --turn-on/--off/--set-refresh <N>. Missing or malformed fields fall back to defaults and never cause NBER-CLI to fail.
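The fallback behavior can be sketched like this. It is an illustrative reader, not the package's own loader: load_info_settings is a hypothetical name, and the exact validation rules (for example rejecting booleans as TTL values) are assumptions.

```python
import json
from pathlib import Path

DEFAULTS = {"cache_enabled": True, "cache_ttl_days": 30}


def load_info_settings(config_path: Path) -> dict:
    """Read the info section, falling back to defaults on any problem."""
    settings = dict(DEFAULTS)
    try:
        raw = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file or invalid JSON: keep every default.
        return settings

    info = raw.get("info") if isinstance(raw, dict) else None
    if not isinstance(info, dict):
        return settings

    enabled = info.get("cache_enabled")
    if isinstance(enabled, bool):
        settings["cache_enabled"] = enabled

    ttl = info.get("cache_ttl_days")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(ttl, int) and not isinstance(ttl, bool) and ttl > 0:
        settings["cache_ttl_days"] = ttl
    return settings
```

Each field is validated on its own, so one malformed value does not discard a valid sibling.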
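Local Database
Default database path: ~/.nber-cli/nber.db
Initialize the default database: nber-cli db init
Initialize at a custom path: nber-cli db init --db-path <db_path>
Move an existing database and update the config: nber-cli db migrate <new_db_path>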
If you upgraded from 0.3.0 and still have ~/.nber-cli/feed.db, NBER-CLI will keep using that legacy file when no nber.db is present. The schema is upgraded automatically on first invocation.
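The legacy fallback can be pictured with the sketch below. resolve_db_path is a hypothetical helper, and the precedence shown (an explicit feed.db-path first, then nber.db, then the legacy feed.db) is an assumption made for illustration rather than a description of the package's code.

```python
from pathlib import Path
from typing import Optional


def resolve_db_path(config_dir: Path, configured: Optional[str] = None) -> Path:
    """Pick the database file to open under the assumed precedence."""
    if configured:
        # An explicit feed.db-path from the user config wins.
        return Path(configured).expanduser()
    current = config_dir / "nber.db"
    legacy = config_dir / "feed.db"
    if not current.exists() and legacy.exists():
        # Upgraded from 0.3.0: keep using the legacy file.
        return legacy
    return current
```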
The database holds:
- feed_items and feed_fetches: RSS cache used by feed fetch and feed clean.
- info_cache: paper metadata cache used by info and the MCP get_paper_info tool. Cache reads are gated by info.cache_enabled and respect the info.cache_ttl_days TTL.
- query_log, download_log, info_log: behavior logs for search keywords, download outcomes, and info lookups.
Database Operations
The database is created and upgraded automatically the first time any command that touches it runs. Running nber-cli db init is not required before using info, search, download, or feed; it exists for callers that want to pre-create the file or pin a non-default path. After db init (or after the first successful run), the schema version is recorded in ~/.nber-cli/config.json under schema_version.
Table Reference
| Table | Written by | Read by | Cleanup |
|---|---|---|---|
| feed_items | feed fetch | feed fetch (display_all=False selects new items) | feed clean (with confirmation) |
| feed_fetches | feed fetch | not read by any command | none |
| info_cache | info and get_paper_info (with the cache enabled) | info and get_paper_info | info cache clear (with confirmation) |
| query_log | search | not read by any command | none |
| download_log | download (single and batch) | not read by any command | none |
| info_log | info and get_paper_info | not read by any command | none |
CLI vs MCP Differences
- The CLI and the MCP get_paper_info tool both write to info_log and info_cache when the cache is enabled. The CLI also writes a one-line stderr hint when the result came from the cache; the MCP tool does not.
- feed fetch is available only through the CLI; the MCP layer does not currently expose it.
- The CLI is the only surface that writes query_log (via search) and download_log (via download). The MCP search_papers and download_paper tools do not write those tables in the current version.
Migrate and Reset
nber-cli db migrate <new_db_path> moves the database to a new path, including any SQLite -wal, -shm, and -journal sidecar files, and updates feed.db-path in the user config. The destination must not already exist; the command refuses to overwrite an existing file.
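The file-moving part of that behavior can be sketched as below. move_database is a hypothetical helper, not the command's actual implementation; creating missing parent directories is an assumption, and the config update is omitted.

```python
import shutil
from pathlib import Path

SIDECAR_SUFFIXES = ("-wal", "-shm", "-journal")


def move_database(src: Path, dest: Path) -> list:
    """Move a SQLite file and its sidecars; refuse to overwrite dest."""
    if dest.exists():
        raise FileExistsError(f"destination already exists: {dest}")
    # Assumption: missing parent directories are created on demand.
    dest.parent.mkdir(parents=True, exist_ok=True)

    moved = []
    # Move the main file first, then any sidecar that is present.
    for suffix in ("",) + SIDECAR_SUFFIXES:
        source_file = Path(str(src) + suffix)
        if source_file.exists():
            target = Path(str(dest) + suffix)
            shutil.move(str(source_file), str(target))
            moved.append(target)
    return moved
```

Sidecars are addressed by appending the suffix to the full file name (nber.db-wal), which is how SQLite names them.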
There is no built-in command to reset the database to an empty state. The supported ways to start over are:
- Move the existing file aside with nber-cli db migrate, or
- Stop the CLI, delete nber.db (and any sidecar files) directly, and run nber-cli db init against a new path.
Backups
The database is a single SQLite file plus its sidecar files. To back it up safely:
- Stop any running nber-cli or MCP server process that might hold a write transaction.
- Copy nber.db together with nber.db-wal and nber.db-shm (when present) into the backup location.
- Alternatively, use sqlite3 nber.db ".backup '<backup_path>'" for a crash-consistent snapshot without stopping the CLI; this is the recommended approach for live systems.
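The same online backup is available from Python through the standard library's sqlite3.Connection.backup method. The sketch below is a generic SQLite example; backup_database is a hypothetical helper name and is not part of NBER-CLI.

```python
import sqlite3
from pathlib import Path


def backup_database(db_path: Path, backup_path: Path) -> None:
    """Copy a live SQLite database using the online backup API."""
    source = sqlite3.connect(str(db_path))
    target = sqlite3.connect(str(backup_path))
    try:
        # Connection.backup produces a consistent copy even while other
        # connections keep writing to the source database.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

The result is a single self-contained file, so there are no sidecar files to copy alongside it.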
Cleanup Coverage Today
- feed clean removes rows from feed_items only. feed_fetches is a continuously growing audit log and is not cleared by feed clean --all. To prune it, run a manual DELETE FROM feed_fetches WHERE ... against the database, for example from the sqlite3 shell.
- info cache clear removes rows from info_cache only. info_log is not cleared.
- query_log, download_log, and info_log have no CLI cleanup command. The only ways to remove them today are nber-cli db migrate to a new database, manual sqlite3 operations, or deleting nber.db.
Output Paths
Single download default:
Creates:
Directory-based download:
Creates:
Explicit file download:
Creates exactly the requested path, including parent directories when possible.
Date Filtering
Search dates use YYYY-MM-DD.
If --start-date is provided without --end-date, NBER-CLI uses the current date as the end date.
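The end-date default can be sketched as follows. resolve_date_range is a hypothetical helper written for illustration; the today parameter exists only to make the behavior easy to test, and the sketch does not check that the start date precedes the end date.

```python
from datetime import date, datetime
from typing import Optional, Tuple


def parse_day(value: str) -> date:
    """Parse a YYYY-MM-DD string; raises ValueError on other input."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def resolve_date_range(
    start_date: Optional[str],
    end_date: Optional[str],
    today: Optional[date] = None,
) -> Tuple[Optional[date], Optional[date]]:
    """Default a missing end date to the current date when a start is given."""
    start = parse_day(start_date) if start_date else None
    end = parse_day(end_date) if end_date else None
    if start is not None and end is None:
        end = today or date.today()
    return start, end
```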
Network Behavior
NBER-CLI sends a browser-like user agent, uses retries for transient failures, and raises readable errors for common download failures:
- HTTP 403 can mean a newly released paper is still under NBER's first-week access restriction.
- HTTP 404 means the paper PDF was not found.
- Timeout and connection failures are reported as network errors.
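The mapping above can be sketched as a small translation function. describe_download_failure and its message strings are hypothetical; the sketch uses only built-in exception types and does not reproduce the package's own error classes.

```python
import asyncio
from typing import Optional


def describe_download_failure(
    status: Optional[int] = None,
    error: Optional[BaseException] = None,
) -> str:
    """Translate an HTTP status or exception into a readable message."""
    if status == 403:
        return (
            "HTTP 403: the paper may still be under NBER's "
            "first-week access restriction."
        )
    if status == 404:
        return "HTTP 404: the paper PDF was not found."
    if isinstance(error, (TimeoutError, asyncio.TimeoutError, ConnectionError)):
        # Timeouts and connection failures are reported as network errors.
        return f"Network error: {type(error).__name__}"
    if status is not None:
        return f"Unexpected HTTP status {status}."
    return "Unknown download failure."
```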
No Credentials Required
NBER-CLI does not require an API key. It works against public NBER web pages and NBER's public working paper search endpoint.