Context Hub: The Curated Doc Layer Every Coding Agent Needs

TL;DR

  • What it solves: LLM training cutoffs silently corrupt AI-generated API code; chub get stripe/api fetches current docs before your agent writes a single line
  • Why it matters: Without it, your agent codes from prior knowledge that may be 12-18 months stale, and the bug compiles, deploys, and fails on a Sunday
  • Best for: Developers who give AI agents autonomy to write third-party API integrations — OpenAI SDK, Stripe, Vercel AI SDK, anything that ships breaking changes
  • Best use case: Install the bundled SKILL.md into Claude Code once, and it runs chub get automatically every time you ask it to call an external API — no prompt engineering required
  • Main differentiator: The Docs vs Skills taxonomy — large per-task API references fetched on demand versus small behavioral instructions that persist across sessions in your agent’s directory

The agent wrote a webhook handler. Four tests passed. Six weeks later, Stripe’s v4 SDK changed the method signature, and payments stopped processing at midnight on a Sunday.

No error message said “this code was written using 14-month-old API documentation.” The code just stopped working.


Think of two surgeons walking into the same OR. One scrubs in, checks the current procedure card, scans the notes the previous team left behind, then begins. The other operates from memory. They performed this surgery eighteen months ago. It went fine.

You want the first surgeon’s hands. Not because the second is incompetent — because procedures change, and the notes exist for a reason.

Context Hub is that procedure card and those notes, for AI coding agents writing API-integration code.

Andrew Ng — co-founder of Google Brain, creator of deeplearning.ai, the researcher who put hands-on ML education into more minds than anyone else alive — released this under his AI Suite organization. The same org behind aisuite, the multi-provider LLM abstraction library. The tool is called chub. It is a Node.js CLI. It gives your agent on-demand access to curated, versioned documentation for 500+ libraries, fetched over HTTPS right before your agent starts writing.

The deeper design is the Docs vs Skills distinction. Docs are large API references fetched per task — expensive to carry but precise when you need them. Skills are small behavioral instructions you install into your agent’s skill directory; they persist across sessions and encode how to behave for recurring tasks. One chub get command handles both. The CLI auto-detects which type from registry metadata.

That auto-detection is less obvious than it sounds. We will get back to it.

Real-World Use Cases

The core problem is invisible by design. Your agent generates correct-looking code using a method deprecated eight months ago. Your CI tests pass because the test author is the same agent with the same model-of-that-library in its weights. The bug lives in production until a customer finds it.

These are the situations where chub changes the outcome:

  • Greenfield API integration: Before the agent opens an editor, run chub get openai/chat-api and hand it the current method signatures.
  • SDK version migrations: Fetch both the pinned version and the current release — chub get openai/chat-api --version 1.52.0 and today’s docs — and let the agent generate an upgrade diff.
  • Agentic CI pipelines: A pre-commit hook that loads relevant API docs into .context/ before an LLM reviewer runs. Fully scriptable via JSON output (see the sketch after this list).
  • Internal API knowledge base: Your team’s microservices, documented in the same format and served from a private CDN. Every developer’s agent knows the internal surface without onboarding docs.
  • Recurring agent patterns: Community Skills for Playwright login flows, Auth0 device flows, and common patterns. Install once, persistent across sessions.
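
For the CI-pipeline item above, a minimal pre-commit sketch, assuming a shell pipeline: the chub invocation and the .context/ convention come from the examples later in this article, and the reviewer script is hypothetical.

#!/bin/sh
# .git/hooks/pre-commit -- load current API docs before the LLM reviewer runs.
# Doc IDs are illustrative; fetch whichever libraries the diff actually touches.
mkdir -p .context
chub get stripe/api sendgrid/mail --lang js -o .context/ || exit 1
./run-llm-review.sh .context/   # hypothetical reviewer entry point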

The webhook scenario gives the clearest before/after.

Without Context Hub:

# Agent reaches into training data
# Training data: Stripe SDK v3, roughly 14 months old
# Agent writes: stripe.webhooks.constructEvent(payload, sig, secret)
# constructEvent is synchronous in v3
# constructEventAsync is required in v4
# Code works in development, fails in production

With Context Hub:

chub get stripe/api --lang js
# Output: Current Stripe SDK v4 docs, method signatures, webhook section
# Agent reads them, writes: await stripe.webhooks.constructEventAsync(...)
# Code works in development, works in production, Sundays included

One command changes the outcome. But the more interesting half of the design is the second surgical principle, the previous team's notes: the part about what your agent remembers after the session closes.

How to Use It

Those notes have to live somewhere. The CLI is the address book. Three seconds to install it.

npm install -g @aisuite/chub
# Installs @aisuite/chub globally; first run creates ~/.chub/ cache directory

First run pulls a ~100KB registry index to ~/.chub/. Subsequent searches hit the local cache.

Search the registry:

chub search openai
# openai/chat-api        OpenAI Chat Completions API (py, js)       official
# openai/embeddings      OpenAI Embeddings API (py, js)             official
# openai/assistants      OpenAI Assistants API (py, js)             official
# openai/responses-api   Responses API, new in SDK v5 (py, js)      official
# ... 12 results

Fetch docs for the current task:

chub get openai/chat-api --lang js
# Outputs: full current docs as markdown to stdout
# Footer: "Additional files available: references/streaming.md, references/tool-calling.md"

Entry points are around 500 lines. Reference files load on demand. This protects the agent’s context window — think of it as a card catalog, not a single reference shelf: you request the section you need, not the entire collection.

# Only need the streaming section?
chub get openai/chat-api --file references/streaming.md
# Outputs: streaming reference only (~80 lines)

# Want everything?
chub get openai/chat-api --full

# Multiple docs for a complex task -- Stripe + SendGrid receipt:
chub get stripe/api sendgrid/mail --lang js -o .context/
# Writes: .context/stripe-api.md, .context/sendgrid-mail.md

Persist a session discovery:

chub annotate stripe/api "Webhook verification requires raw body -- do not parse JSON before verifying"

# Next fetch appends it automatically:
chub get stripe/api
# --- end of docs ---
# [Agent note -- 2025-01-15T10:30:00Z]
# Webhook verification requires raw body -- do not parse JSON before verifying

The agent that found the gotcha on Tuesday helps the session on Thursday. Notes survive context resets. This is the previous-team’s-notes behavior.

💡 Tip: All commands support --json output for automated pipelines. chub search "stripe payments" --json | jq -r '.results[0].id' gives you the ID to pipe directly into chub get. Fully composable with standard shell scripting.
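
Spelled out, that composability becomes a two-line fetch pipeline. The jq path is the one from the tip; the flags reuse examples shown earlier in this article.

# Resolve a library ID from a free-text query, then fetch its docs into .context/
id=$(chub search "stripe payments" --json | jq -r '.results[0].id')
chub get "$id" --lang js -o .context/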

Install the agent skill for automatic doc fetching:

# Claude Code -- global (all projects)
mkdir -p ~/.claude/skills
cp $(npm root -g)/@aisuite/chub/skills/get-api-docs/SKILL.md ~/.claude/skills/get-api-docs.md

# Cursor
mkdir -p .cursor/rules
cp $(npm root -g)/@aisuite/chub/skills/get-api-docs/SKILL.md .cursor/rules/get-api-docs.md

After this, Claude Code reads the SKILL.md as a behavioral instruction. When you say “integrate the OpenAI API,” it runs chub get openai/chat-api automatically before generating code. No prompt engineering. No remembering to tell it to check the docs first.

That is the auto-detection explained. Registry metadata includes languages[] and versions[] arrays for docs. Skills are flat, no language array. chub get <id> reads the metadata and decides whether to write markdown to stdout (docs for agent consumption) or drop a file into a skills directory (behavioral instructions). The user types the same command for both.
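
You can watch that branch from the outside via the --json output. A sketch, with the caveat that the field names are inferred from the article's description of the registry rather than a published schema:

# Docs entries should carry languages/versions arrays; Skills entries should not.
chub search playwright --json | jq '.results[] | {id, languages, versions}'
# The presence or absence of those arrays is the signal `chub get <id>` uses to
# choose between markdown-to-stdout (docs) and a file in the skills directory.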

But which sources your agent is allowed to pull from — that is a decision the default config quietly makes for you.

Configuration & Customization

Config lives at ~/.chub/config.yaml. It is not auto-generated on install — create it when you need to override defaults.

sources:
  - name: community
    url: https://cdn.aichub.org/v1 # default public CDN
  - name: internal
    path: /path/to/local/docs # local private registry

source: "official,maintainer,community" # trust policy
refresh_interval: 86400 # cache TTL in seconds (24h default)
telemetry: true # opt-out: set false
feedback: true # allow chub feedback to send ratings

The source trust policy is the one setting worth making deliberately:

Value                              What agents see                          When to use
"official"                         Provider-maintained docs only            Maximum trust, narrowest coverage
"official,maintainer"              Provider + well-known library authors    Enterprise default
"official,maintainer,community"    Full 500+ library public registry        Default for solo developers

For private team docs, add a sources entry pointing to a local path, generate a registry from your markdown with chub build my-content/, and chub search returns results from both public and private sources in a single query. Multi-source disambiguation uses a colon prefix: chub get internal:mycompany/config-api versus chub get community:openai/chat-api. Slashes are ID separators; colons are source separators. The design is unambiguous by construction.
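
Put together, the private-registry loop is three commands, all named above; only the search query is illustrative.

chub build my-content/                    # generate a registry from your markdown
chub search config-api                    # merged results: public + private sources
chub get internal:mycompany/config-api    # colon prefix pins the source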

⚠️ Warning: Telemetry is opt-out, not opt-in. The CLI sends anonymous usage analytics by default. Add telemetry: false to ~/.chub/config.yaml or set CHUB_TELEMETRY=0 before first use if that matters to your organization.

Where It Fits (And Where It Doesn’t)

Context Hub covers one specific moment: the moment your agent reaches for outside knowledge about an API before writing code. Outside that moment, the tool is not for you.

Scenario                                                 Does chub help?
Agent writing code that calls OpenAI, Stripe, SendGrid   Yes
Agent upgrading SDK from v1 to v2                        Yes
Team with internal APIs not in the public registry       Yes, via chub build
Agent answering general programming questions            No
Agent debugging logic errors                             No
Agent doing architecture planning                        No

Scalability path:

  • Solo developer: chub get ad hoc, chub annotate after discovering a gotcha
  • Team: shared git repo of internal docs, each developer’s ~/.chub/config.yaml points to it. Everyone’s agent shares the same private knowledge base without public exposure.
  • Enterprise: built registry on S3 or Cloudflare R2, source: "official,maintainer" set org-wide, chub build --validate-only in CI to enforce internal doc quality
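
A minimal CI step for that enterprise tier, assuming a shell-based pipeline; only chub build --validate-only comes from the article, and the directory name is illustrative.

# ci/validate-docs.sh -- fail the build when internal docs do not validate
set -e
chub build --validate-only internal-docs/
echo "internal registry docs validated"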

The architecture scales without changing the CLI interface. The opportunity most teams skip: a company-wide knowledge hub. Every internal microservice documented in Context Hub format, served from an internal CDN. Onboarding becomes a directory copy, not a three-day documentation hunt. The part that stays unresolved until you actually build it is whether the team will keep those internal docs current after month two.

The Rough Edges

Community docs are reactive, not verified. The public registry covers 500+ libraries. Official and maintainer tiers are well-maintained. Community-tier docs come from GitHub PRs with no automated validation against live APIs. A well-intentioned contributor can ship subtly wrong examples. The chub feedback system — votes with labels like outdated or wrong-examples — is the quality mechanism, which means it fixes problems after someone encounters them.

Annotations are machine-local. Notes from chub annotate live in ~/.chub/annotations/. They do not sync to teammates or across your own devices. For solo agents, this is a genuine memory layer. For teams, it is invisible to everyone else on the project.

Single annotation per entry. Each chub annotate replaces the previous note entirely. Three separate gotchas for one library collapse into one note. The timestamp tells you when it was written. It does not tell you what it replaced.
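
The practical consequence, sketched below: keeping multiple gotchas means packing them into one note yourself. The replace-on-write behavior is the article's; the note wording is illustrative.

chub annotate stripe/api "Raw body required before verification"
chub annotate stripe/api "Use idempotency keys on retries"
# The second call silently replaced the first. To keep both, combine them:
chub annotate stripe/api "1) raw body before verify 2) idempotency keys on retries"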

No semantic chunking. Docs arrive as complete markdown files. There is no RAG-style chunk-and-embed. For large docs at 50K tokens, the incremental fetch pattern (--file references/webhooks.md) manages the context window, but it requires knowing which reference file you need before fetching.

No CLI path to contribute. chub feedback sends votes. Submitting new content requires a manual GitHub PR to the content repository. If a library is missing from the registry, you own the BYOD workflow from directory scaffolding to writing the markdown.

On security: The CLI makes HTTPS calls to cdn.aichub.org for registry and doc fetches, and to the feedback API for ratings. Community-tier content is unreviewed markdown. No elevated permissions required — everything writes to ~/.chub/ in your home directory. Standard global npm install; no custom postinstall scripts were present at review time. For production teams, set source: "official,maintainer" to exclude unreviewed community contributions from your agent’s context.

Getting Started

Four steps from zero to an agent that auto-fetches docs:

  1. Install the CLI:

    npm install -g @aisuite/chub
  2. Confirm the registry is available:

    chub search openai
  3. Test a fetch:

    chub get openai/chat-api --lang js
  4. Install the Claude Code skill globally:

    mkdir -p ~/.claude/skills
    cp $(npm root -g)/@aisuite/chub/skills/get-api-docs/SKILL.md ~/.claude/skills/get-api-docs.md

After step 4, open any Claude Code session and ask it to write code that calls an external API. It will run chub get automatically. You will probably not notice the first time. That silence is the signal — whether it pulled from the right source tier for your context is the one question worth verifying before you ship.

FAQ

Does Context Hub work with Cursor and other agents, or only Claude Code?

Any tool that reads a custom instructions or rules directory works. For Cursor, copy the SKILL.md to .cursor/rules/. For any other agent, find where it reads behavioral instructions and put the file there. The chub CLI runs anywhere Node.js 18 is available — the agent just needs shell access.

What happens when chub get is called for a library not in the public registry?

The CLI returns a not-found message and does not fabricate content. Your agent falls back to its training-data knowledge for that specific library. The BYOD workflow (chub build my-content/) lets you add it to a local registry, but you write the documentation yourself in markdown with YAML frontmatter. There is no auto-generation from a live API.
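
Since the article only says "markdown with YAML frontmatter," here is a hedged sketch of a BYOD doc stub; the frontmatter fields are assumptions shaped by the registry metadata described earlier.

# Hypothetical doc stub -- frontmatter field names inferred, not from a schema
mkdir -p my-content/mycompany
cat > my-content/mycompany/billing-api.md <<'EOF'
---
id: mycompany/billing-api
languages: [js]
versions: ["2.3.0"]
---
# Billing API
POST /v2/invoices creates an invoice.
EOF
chub build my-content/   # regenerate the local registry with the new entry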

How is this different from pasting API docs into the system prompt?

Scope and freshness. A system prompt carries docs for every session regardless of the task at hand. chub get fetches exactly the docs for the current library at the current version, with your annotations appended. The incremental fetch pattern means the agent’s context window only pays for the sections it will actually use — not the full reference for endpoints it will never touch.

Should I set source: "official,maintainer" or leave the default?

For personal projects, the full default covers more libraries and is fine. For production code, "official,maintainer" is the safer choice — community-tier docs have no automated validation against live APIs. The coverage difference is real, and so is the trust tradeoff. Make the decision once per team and put it in a shared config.

Do annotations replace official docs or append to them?

Annotations append. The official doc content is unchanged. Your note appears at the bottom of chub get output, marked with a timestamp and [Agent note] label. What the public registry knows and what your machine has learned are clearly separated — which is also why annotations do not automatically sync to teammates.

Final Thoughts

A webhook handler that fails at midnight. The code was correct for the API version the model was trained on. That version changed eight months earlier. The agent had no signal to check.

chub is a small CLI that changes where an agent looks before it writes. Run chub get before coding. Run chub annotate after discovering a gotcha. Install the SKILL.md once and the fetching becomes automatic. Each behavior is one command.

The two surgeons again: one reads the current notes before beginning, the other relies on a fourteen-month-old memory of the same anatomy. Both walk into the room equally confident. One of them is operating on a system that changed while they were away.

The outcome is not a metaphor. It is a webhook that works at midnight.

andrewyng/context-hub · MIT · 12,824 stars

Hoang Yell

A software developer and technical storyteller. I spend my time exploring the most interesting open-source repositories on GitHub and presenting them as accessible stories for everyone.