PostGrad
API Reference

Complete reference for all PostGrad Knowledge API endpoints.

Overview

The PostGrad Knowledge API provides programmatic access to structured business knowledge extracted from real meetings. All endpoints return responses in a consistent JSON envelope format.

Base URL: https://postgrad.io/api/v1

Authentication: Most endpoints require a Bearer token (API key) in the Authorization header. See Authentication for details.

Public endpoints (no auth required):

Authenticated endpoints:

Response Envelope

All responses follow a consistent envelope:

{
  "data": "...",
  "pagination": { "total": 100, "limit": 20, "offset": 0, "has_more": true },
  "meta": { "queries_used": 42, "queries_remaining": 958 },
  "error": null
}
  • data contains the response payload (array or object)
  • pagination is present on list endpoints, null otherwise
  • meta includes usage counters on authenticated endpoints, null on public endpoints
  • error is null on success; on failure, data is null and error contains code and message
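
The envelope rules above can be sketched as a small client-side helper. This is a minimal illustration, not an official SDK: the names PostGradAPIError, unwrap, and next_offset are hypothetical, and computing the next page as offset + limit is an assumption based on the pagination fields shown.

```python
from typing import Any, Optional


class PostGradAPIError(Exception):
    """Raised when the envelope's error field is not null (hypothetical name)."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(envelope: dict[str, Any]) -> Any:
    """Return envelope["data"], or raise if the envelope carries an error."""
    error = envelope.get("error")
    if error is not None:
        raise PostGradAPIError(error.get("code", "UNKNOWN"), error.get("message", ""))
    return envelope.get("data")


def next_offset(envelope: dict[str, Any]) -> Optional[int]:
    """Offset for the next page, or None when there is nothing more to fetch.

    Assumption: the next page starts at offset + limit.
    """
    page = envelope.get("pagination")
    if not page or not page.get("has_more"):
        return None
    return page["offset"] + page["limit"]
```

A caller would typically loop while next_offset returns a number, passing it back as the offset query parameter.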

This envelope is a stable public contract. Field additions are backwards-compatible (additive); field removals or renames go through a deprecation window, with the old name kept as an alias for at least one minor release.

Knowledge entry fields

Every knowledge entry returned by /knowledge, /knowledge/search, /knowledge/recent, and /knowledge/{id} follows the same canonical shape:

| Field | Type | When set | Notes |
| --- | --- | --- | --- |
| id | string (uuid) | always | Stable identifier; safe to cache and reference. |
| feed_id | string (uuid) | always | Always populated, including on ?feed=all cross-feed responses. |
| feed_name | string | always | Human-readable name of the originating feed. |
| feed_slug | string | always | URL-safe slug of the originating feed. Accepted in X-PostGrad-Feed. |
| title | string | always | Short headline (≤ 100 chars typical). |
| category | string | always | Normalized to lowercase + underscores (e.g. deal_evaluation). |
| tags | string[] | always | Free-form tags. |
| content | string | always | Main body. Can be long; paginate if needed. |
| confidence | number | always | 0–1. Static extraction confidence assigned at ingest. Not a relevance score. |
| score | number | search responses only | Public relevance score. Equals sort_score (post-freshness adjustment). Clamped to [0,1]. Omitted on list / get-by-id responses, where there is no query to score against. |
| permalink | string \| null | always | Direct dashboard URL for sharing / citation. |
| created_at | string (ISO 8601) | verbosity=full only | When the entry was first ingested. |
| updated_at | string (ISO 8601) | verbosity=full only | When the entry was last revised. |
| raw_score | number \| null | verbosity=full search responses only | Pre-adjustment relevance score on the mode's native scale (raw ts_rank for keyword, cosine distance for semantic, raw RRF for hybrid). Useful only when re-ranking without the freshness penalty. Omitted on list / get-by-id responses. |
| sort_score | number \| null | verbosity=full search responses only | Equals score. The number that actually drove the API's sort order. Omitted on list / get-by-id responses. |
| freshness_multiplier | number \| null | verbosity=full search responses only | Multiplier applied to derive sort_score (e.g. 0.75 for stale-tagged entries). null when no adjustment fired. Was freshness_penalty in earlier versions; both keys ship for one release cycle. Omitted on list / get-by-id responses. |

compact is the default verbosity for MCP and small REST defaults; pass ?verbosity=full to surface created_at, updated_at, raw_score, sort_score, and freshness_multiplier. List + get-by-id responses omit score, raw_score, sort_score, and freshness_multiplier entirely — there's no query to score against, so the fields would always be null (noise on the wire).

Which score should I sort on?

Always score. It's the post-adjustment number that drove the API's own sort order, clamped to [0,1], and on the same scale across all three modes. If you set a threshold like score >= 0.7 and later swap modes, the threshold still means roughly the same thing.

| Mode | score | raw_score | sort_score | When they diverge |
| --- | --- | --- | --- | --- |
| keyword | clamped ts_rank | same as score | same as score | Only when a freshness multiplier fires; then sort_score < raw_score and score === sort_score. |
| semantic | clamped cosine similarity | raw cosine distance (lower = closer) | similarity × freshness_multiplier | raw_score is on a different scale (distance, not similarity), so never re-sort using raw_score directly. |
| hybrid | normalized RRF | raw RRF | RRF × freshness_multiplier | Same shape as semantic; sort on score. |

If freshness_multiplier is null, no adjustment fired — score, raw_score, and sort_score are effectively identical (modulo the mode-specific scale of raw_score). The three-field surface is only useful when you want to re-rank without the freshness penalty — drop the multiplier, use raw_score, sort yourself.
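
Re-ranking without the freshness penalty might look like the sketch below. It assumes the entries came from a verbosity=full search response and that the caller knows which mode was served; rerank_without_freshness is a hypothetical helper name. Because semantic raw_score is a distance (lower = closer), the sort direction flips for that mode.

```python
from typing import Any


def rerank_without_freshness(entries: list[dict[str, Any]], mode: str) -> list[dict[str, Any]]:
    """Re-sort search results by raw_score, ignoring freshness_multiplier.

    Semantic raw_score is a cosine distance (lower = closer), so it sorts
    ascending; keyword (ts_rank) and hybrid (RRF) sort descending.
    Entries without a raw_score keep their relative order and go last.
    """
    with_raw = [e for e in entries if e.get("raw_score") is not None]
    without_raw = [e for e in entries if e.get("raw_score") is None]
    ascending = mode == "semantic"
    ranked = sorted(with_raw, key=lambda e: e["raw_score"], reverse=not ascending)
    return ranked + without_raw
```

Raw scores are only comparable within one mode, so this should be applied to a single response at a time.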

A note on within-mode distributions: the [0,1] scale is consistent across modes, but the typical distribution differs. Keyword scores cluster low (0.2–0.4 is a strong match). Semantic similarity tends to cluster in 0.55–0.75 on natural-language queries (the answer is in there if the topic matches). Hybrid mirrors semantic but with a top-result-as-1.0 normalization, so the headline scores look higher even when the underlying ranking is the same. A score >= 0.7 threshold will therefore keep a different number of results in each mode; the relative ranking within a single response is what you should trust.

Which mode should I pick?

All three modes are available on every tier. Mode is a per-query tuning choice:

  • keyword (default): exact-word ts_rank. Best when you know the term ("Supabase", "RAG"). Fastest, no embedding compute. Misses when the user phrases naturally.
  • semantic: vector similarity over the entry's content. Best for natural-language questions where the user wouldn't use the same words as the entry ("how do I tell if a client is about to churn"). The mode that makes the product feel agent-native.
  • hybrid: reciprocal-rank fusion of keyword + semantic. In practice we find it's near-identical to semantic on clean operator content; it can win when content has mixed sizes (long reference dumps + short operator notes) by re-balancing toward shorter precise matches. Worth A/B-ing for your specific content; not universally better.

Default to semantic for natural-language agent queries. Use keyword when you're searching for a specific term. Try hybrid if your content varies wildly in length.

confidence vs score — two different signals, both on every entry

These are the two most commonly confused fields. They answer different questions:

  • confidence: How sure are we that this entry is true? Set at ingest by the extraction pipeline (reinforced upward when multiple meetings agree). It's a static, per-entry rating: the same 0.85 whether the entry surfaces for the query "pricing" or for "team management." Think of it as the licensor's trust score for the insight itself. Use it to filter low-quality matches with confidence_min=0.7.

  • score: How well does this entry match this specific query? Set per-query by the search ranker (ts_rank for keyword mode, cosine similarity for semantic, RRF for hybrid), all clamped to [0,1] and adjusted by freshness_multiplier to derive sort_score. Different on every query. Omitted on list / get-by-id responses (there's no query to score against).

Sort, filter, and threshold against the field that matches your use case: confidence for "show me high-quality results," score for "show me the most relevant results to this query." They are independent — a high-confidence entry can be a poor match for a given query, and vice versa.
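
As a rough client-side illustration of treating the two signals independently, the sketch below filters on confidence first and then orders by score. The helper name select_entries and its default thresholds are assumptions for the example; the server-side confidence_min parameter mentioned above does the first step for you.

```python
from typing import Any, Optional


def select_entries(
    entries: list[dict[str, Any]],
    min_confidence: float = 0.7,
    min_score: Optional[float] = None,
) -> list[dict[str, Any]]:
    """Keep trustworthy entries, then order them by relevance to the query."""
    kept = [e for e in entries if e["confidence"] >= min_confidence]
    if min_score is not None:
        # score is only present on search responses, so treat a missing score as no match.
        kept = [e for e in kept if e.get("score") is not None and e["score"] >= min_score]
    # sorted() is stable, so entries without a score keep their original order.
    return sorted(kept, key=lambda e: e.get("score") or 0.0, reverse=True)
```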

Error envelope

On failure (error is not null), the response body shape is:

{
  "data": null,
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded. Try again shortly.",
    "details": { "limit": 10, "remaining": 0, "resetAt": 1781897700 }
  }
}

Common error codes:

| Code | HTTP | Meaning |
| --- | --- | --- |
| UNAUTHORIZED | 401 | Missing / malformed API key. |
| INVALID_API_KEY_FORMAT | 401 | Key doesn't match pg_live_* shape. |
| INVALID_FEED | 400 | X-PostGrad-Feed value is neither a UUID nor a valid slug. |
| FEED_NOT_SUBSCRIBED | 403 | Valid feed exists but the caller isn't subscribed. |
| FEED_NOT_FOUND | 404 | No feed exists with the given id (MCP only). |
| CATEGORY_RESTRICTED | 403 | Scoped key tried to access a category outside its allowed set. |
| TIER_INSUFFICIENT | 403 | Requested search mode (semantic / hybrid) above the key's tier. |
| VALIDATION_ERROR | 400 | Bad query param (e.g. limit=99999, malformed UUID). |
| RATE_LIMIT_EXCEEDED | 429 | Per-minute or monthly quota hit. Retry-After header indicates seconds to wait. |
| METHOD_NOT_ALLOWED | 405 | HTTP method not supported on this endpoint. Allow: header lists accepted methods. |
| INTERNAL_ERROR | 500 | Unhandled server failure. Includes a server-side log id where possible. |
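
One way a client might act on these codes is to separate errors worth retrying from errors that need a change on the caller's side. The split below is an assumption made for illustration (the reference does not prescribe a retry policy), and error_code and should_retry are hypothetical helper names.

```python
from typing import Any, Optional

# Assumption: only these two codes are worth retrying unchanged; the others
# point at a problem with the request, the key, or the subscription.
RETRYABLE_CODES = frozenset({"RATE_LIMIT_EXCEEDED", "INTERNAL_ERROR"})


def error_code(envelope: dict[str, Any]) -> Optional[str]:
    """Return the error code from an envelope, or None on success."""
    error = envelope.get("error")
    return None if error is None else error.get("code")


def should_retry(envelope: dict[str, Any]) -> bool:
    """True when the same request may succeed later without modification."""
    return error_code(envelope) in RETRYABLE_CODES
```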

Rate-limit headers

Every authenticated response (success or error) carries:

  • X-RateLimit-Limit — per-minute cap for the caller's tier.
  • X-RateLimit-Remaining — calls remaining in the current minute.
  • X-RateLimit-Reset — Unix epoch seconds when the window resets.
  • X-Monthly-Quota-Limit / -Used / -Remaining — monthly counters.

429 responses additionally carry Retry-After, the number of seconds to wait (seconds as defined by the HTTP specification, not milliseconds).
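
A client could turn these headers into a pause duration along the lines of the sketch below. The function name seconds_to_wait is hypothetical, and the fallback from Retry-After to X-RateLimit-Reset is an assumption about reasonable client behaviour rather than documented server guidance.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """How long to pause before the next call; 0.0 when no pause is needed.

    Header names are looked up exactly as documented; most HTTP libraries
    expose a case-insensitive mapping, a plain dict does not.
    """
    now = time.time() if now is None else now
    if status == 429:
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.strip().isdigit():
            return float(retry_after)
    remaining = headers.get("X-RateLimit-Remaining")
    reset = headers.get("X-RateLimit-Reset")
    if status == 429 or remaining == "0":
        # X-RateLimit-Reset is Unix epoch seconds, so subtract the current time.
        if reset is not None and reset.strip().isdigit():
            return max(0.0, float(reset) - now)
    return 0.0
```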

Public endpoints (/feeds/catalog, /stats, /categories) don't emit these headers — there's no per-caller quota to report. They are cached at the edge for 5 minutes (Cache-Control: public, max-age=300) and rate-limited only at the network level.

HTTP methods

Read endpoints accept GET (and HEAD / OPTIONS for preflight). The three search endpoints additionally accept POST with a JSON body whose keys mirror the query params:

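# These two are equivalent. $POSTGRAD_API_KEY is a placeholder for your own API key.
curl "https://postgrad.io/api/v1/knowledge/search?q=pricing&limit=5" \
  -H "Authorization: Bearer $POSTGRAD_API_KEY"

curl -X POST https://postgrad.io/api/v1/knowledge/search \
  -H "Authorization: Bearer $POSTGRAD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"q": "pricing", "limit": 5}'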

Other methods (PUT, PATCH, DELETE) return 405 Method Not Allowed with the canonical error envelope and an Allow: header listing the methods this endpoint actually accepts.
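
The GET/POST equivalence for search can be sketched in Python as follows. The helper search_request is hypothetical and only builds the request pieces; it does not send anything, and an Authorization header would still need to be added before calling the API.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode

BASE_URL = "https://postgrad.io/api/v1"


def search_request(
    params: dict[str, Any], method: str = "GET"
) -> tuple[str, str, dict[str, str], Optional[str]]:
    """Return (method, url, headers, body) for /knowledge/search."""
    url = f"{BASE_URL}/knowledge/search"
    if method == "GET":
        # Query params go in the URL; there is no body.
        return "GET", f"{url}?{urlencode(params)}", {}, None
    if method == "POST":
        # The same keys are sent as a JSON body instead.
        return "POST", url, {"Content-Type": "application/json"}, json.dumps(params)
    raise ValueError(f"search accepts GET or POST, not {method}")
```

Both calls carry the same keys, so switching between them should not change the results.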

Per-endpoint context block

Every response (success or error) includes a context field. Its contents depend on the endpoint + the variant of the call:

Feed-scoped responses (X-PostGrad-Feed: <uuid> or slug)

{
  "context": {
    "feed": {
      "id": "f1a2c3d4-...",
      "name": "Agency Growth Playbook",
      "slug": "agency-growth-playbook",
      "provider": "Big Steele"
    }
  }
}

Cross-feed responses (X-PostGrad-Feed: all or omitted)

{
  "context": {
    "all_feeds": true,
    "feeds_searched": [
      { "feed_id": "f1...", "name": "Agency Growth Playbook", "slug": "agency-growth-playbook" },
      { "feed_id": "f2...", "name": "Tech Stack Decisions",   "slug": "tech-stack-decisions" }
    ]
  }
}

The response additionally carries these headers (always emitted on feed=all):

  • X-PostGrad-Feeds-Searched — count, mirrors context.feeds_searched.length.
  • X-PostGrad-Feeds-Truncated — count of subscribed feeds beyond the per-request fan-out cap (currently 20). Present only when truncation actually occurred.
  • X-PostGrad-Feed-Source: one of scoped, auto-selected, all-feeds, header-uuid, header-slug. How the server resolved the feed scope.
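
A consumer that needs to know which feeds a response covered could read the two context variants above with something like this sketch. feeds_in_scope is a hypothetical helper, and it assumes the context block has exactly the shapes shown in the examples.

```python
from typing import Any


def feeds_in_scope(context: dict[str, Any]) -> list[str]:
    """Return the slugs of the feeds a response covered, based on its context block."""
    if context.get("all_feeds"):
        # Cross-feed variant: one object per searched feed.
        return [feed["slug"] for feed in context.get("feeds_searched", [])]
    # Feed-scoped variant: a single feed object.
    feed = context.get("feed")
    return [feed["slug"]] if feed else []
```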

Catalog / stats responses (public)

{
  "context": {
    "catalog": { "total": 9, "cached_seconds": 300 }
  }
}

cached_seconds is present in the context block of every cached public endpoint (context.catalog.cached_seconds in the example above) so consumers can decide whether to refresh.

Search response data shape — extra fields on feed=all

Cross-feed responses include these top-level fields alongside data (which is still an array of canonical entry objects):

| Field | Type | Notes |
| --- | --- | --- |
| mode_served | 'keyword' \| 'semantic' \| 'hybrid' | The mode the server actually ran. Equals mode from the request unless tier policy downgraded it. |
| fallback | boolean | true when mode_served < mode_requested (defensive downgrade for unknown/legacy tiers). On Starter/Pro/Scale this is always false: current tier policy uses hard 403 TIER_INSUFFICIENT instead of silent fallback. Kept on the envelope for forward compatibility. |
| feeds_searched_count | number | Same as context.feeds_searched.length. Redundant by design: easier to read on a flat envelope. |
| dupes_dropped | number | Same entry id surfaced by multiple feeds → keep one, count the rest. |
| all_feeds | true | Always true on the cross-feed branch (single-feed responses omit this). |
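
Although fallback is currently always false on the named tiers, a defensive client could still check whether the served mode matches what it asked for. The sketch below assumes the response is the parsed cross-feed JSON body; mode_mismatch is a hypothetical helper name.

```python
from typing import Any, Optional


def mode_mismatch(response: dict[str, Any], requested_mode: str) -> Optional[str]:
    """Describe a downgrade if the server ran a different mode, else return None."""
    served = response.get("mode_served")
    if response.get("fallback") or (served is not None and served != requested_mode):
        return f"requested {requested_mode!r} but server ran {served!r}"
    return None
```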

Catalog feed shape

The /feeds/catalog and /feeds/{slug-or-uuid} endpoints share most fields. Catalog returns a list; the detail endpoint adds recent_entries[] + categories_with_counts[] + provider_bio for the eval-before-subscribe surface.

| Field | Type | Notes |
| --- | --- | --- |
| id | uuid | Stable identifier. |
| slug | string | URL-safe slug. Accepted in X-PostGrad-Feed. |
| name | string | Human-readable feed name. |
| description | string | One-paragraph marketing description from the licensor. |
| categories | string[] | Up to 12 categories (catalog) or all categories (detail). See categories_total for the un-truncated count. |
| categories_total | number | Catalog only. Total distinct categories on this feed; categories[] is truncated to 12. |
| price_monthly | number | USD/month. 0 for free feeds. |
| is_curated | boolean | Legacy flag; see source_type instead. |
| source_type | enum | See below. |
| provider_name | string \| null | The licensor's name when source_type='expert'; null otherwise (no individual operator on compiled/demo feeds). |
| entry_count | number | Total published entries. |
| sample_titles | string[] | Catalog only. 10 recent titles, filtered for placeholder noise. |
| last_updated_at | timestamp \| null | Most recent entry's updated_at. Use this to tell "actively maintained" from "abandoned." |
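
The "actively maintained" check on last_updated_at could be approximated as below. This is a sketch under stated assumptions: the timestamp is taken to be an ISO 8601 string (as with the entry fields), the 90-day cutoff is an arbitrary illustrative default, and is_actively_maintained is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def is_actively_maintained(
    feed: dict[str, Any], max_age_days: int = 90, now: Optional[datetime] = None
) -> bool:
    """True when the feed's newest entry was updated within max_age_days."""
    stamp = feed.get("last_updated_at")
    if not stamp:
        return False  # null means there is nothing to date the feed by
    # Older Python versions reject a trailing "Z", so normalise it first.
    updated = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    if updated.tzinfo is None:
        updated = updated.replace(tzinfo=timezone.utc)  # assume UTC when no offset is given
    now = now or datetime.now(timezone.utc)
    return now - updated <= timedelta(days=max_age_days)
```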

source_type enum

The marketplace tier signal. Drives the visual badge on /marketplace and is the right field to filter on for RAG consumers selecting feeds.

| Value | Meaning |
| --- | --- |
| expert | Operator-authored from the licensor's own experience. provider_name is populated. Highest-trust tier. |
| compiled | Aggregated by the PostGrad pipeline from public sources (news, articles, transcripts). Summaries + links, not full content. provider_name is null. |
| demo | Seed/example content for trying the platform out. Not a real operator's knowledge. Avoid for production RAG. |
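
For a RAG consumer choosing feeds from a catalog response, the enum suggests a selection step like the following sketch. Ranking compiled directly below expert is an assumption for illustration (the table only names expert as the highest-trust tier), and production_feeds is a hypothetical helper name.

```python
from typing import Any

# demo is deliberately absent: the reference advises against it for production RAG.
TRUST_ORDER = {"expert": 0, "compiled": 1}


def production_feeds(feeds: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop demo feeds and list expert feeds ahead of compiled ones."""
    usable = [f for f in feeds if f.get("source_type") in TRUST_ORDER]
    return sorted(usable, key=lambda f: TRUST_ORDER[f["source_type"]])
```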

Filtering the catalog

GET /api/v1/feeds/catalog accepts three optional query params:

  • ?categories=sales_process,deal_evaluation — comma-separated. Feed must contain ALL listed categories in its categories[] (logical AND). Matches whole category names exactly (case-insensitive); use /api/v1/categories to discover valid names.
  • ?source_type=expert — restrict to one tier (expert | compiled | demo).
  • ?q=growth — case-insensitive substring match against name + description.

Unknown query params are silently ignored — tracking params like ?utm_source=... won't 400. Response context.catalog.filter_applied is true when any filter narrowed the result, and total_before_filter exposes the unfiltered count for "0 of 9 feeds match" UX.
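
Building a filtered catalog URL from the three parameters might look like this. catalog_url is a hypothetical helper; it only constructs the URL and validates source_type against the documented enum values.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

CATALOG_URL = "https://postgrad.io/api/v1/feeds/catalog"
SOURCE_TYPES = {"expert", "compiled", "demo"}


def catalog_url(
    categories: Optional[Sequence[str]] = None,
    source_type: Optional[str] = None,
    q: Optional[str] = None,
) -> str:
    """Build a /feeds/catalog URL with whichever filters were supplied."""
    params: dict[str, str] = {}
    if categories:
        # Comma-separated; the feed must contain ALL listed categories.
        params["categories"] = ",".join(categories)
    if source_type:
        if source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source_type: {source_type}")
        params["source_type"] = source_type
    if q:
        params["q"] = q
    # safe="," keeps the category separator readable instead of %2C.
    query = urlencode(params, safe=",")
    return f"{CATALOG_URL}?{query}" if query else CATALOG_URL
```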
