Data

The dataset

A neutral, entity-resolved graph of the wine trade. Each wine carries field-level detail with the source and confidence behind every fact. The public slice below is free; the full graph, realized-pricing time-series, and the resolution API sit behind a free, role-validated tier.

37,650 public entities · 20,292 restaurant · 10,753 sku · 3,456 retailer

Public coverage

Entities by type

Entity counts by type in the public graph

Entity type   Count
restaurant    20,292
sku           10,753
retailer       3,456
producer       2,872
term             107
region           107
grape             35
place             22
importer           3
award              3
total         37,650

Methodology

How it's built

Resolution

Sources are resolved through a deterministic cascade (normalize → block → score), with an LLM adjudicator for borderline decisions. Every field records its source tier, confidence, observation date, and provenance reference.
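
The cascade described above can be sketched roughly as follows. This is an illustration only, not the production resolver: the thresholds `MATCH` and `REVIEW`, the first-token blocking key, and the use of `difflib.SequenceMatcher` as the scorer are all assumptions, and the "adjudicate" verdict only marks where an LLM adjudicator would be consulted.

```python
import re
import unicodedata
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical thresholds; the source does not publish its cut-offs.
MATCH, REVIEW = 0.92, 0.75


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def block_key(norm: str) -> str:
    """Cheap blocking key (assumed): the first token of the normalized name."""
    return norm.split()[0] if norm else ""


def resolve(names):
    """Return (a, b, score, verdict) for every pair that shares a block."""
    blocks = defaultdict(list)
    for name in names:
        norm = normalize(name)
        blocks[block_key(norm)].append((name, norm))

    decisions = []
    for members in blocks.values():
        for (a, norm_a), (b, norm_b) in combinations(members, 2):
            score = SequenceMatcher(None, norm_a, norm_b).ratio()
            if score >= MATCH:
                verdict = "match"
            elif score >= REVIEW:
                verdict = "adjudicate"  # borderline: hand off to the adjudicator
            else:
                verdict = "distinct"
            decisions.append((a, b, round(score, 2), verdict))
    return decisions
```

Blocking keeps the pairwise scoring step small: names that land in different blocks are never compared, so only plausible candidates reach the scorer and only borderline scores reach the adjudicator.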

Privacy boundary

Public facts (producers, wines, grapes, menu listings) cross freely. Tenant-private commercial data never does. Cross-tenant pricing is published only as backward-looking aggregates, never below a k-anonymity floor — and never forward-looking.
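
As a rough sketch of what a k-anonymity floor means for cross-tenant aggregates: a price statistic for a SKU is published only when at least k distinct tenants contributed to it. The value of `K_FLOOR`, the tuple layout, and the choice of median are assumptions made for illustration; the source does not state them.

```python
from collections import defaultdict
from statistics import median

K_FLOOR = 5  # hypothetical; the source does not state its k


def aggregate_prices(observations, k=K_FLOOR):
    """Aggregate past price observations, suppressing thin groups.

    observations: iterable of (sku_id, tenant_id, price) tuples.
    Returns {sku_id: {"tenants": n, "median_price": p}} only for SKUs
    observed by at least k distinct tenants.
    """
    by_sku = defaultdict(dict)
    for sku_id, tenant_id, price in observations:
        by_sku[sku_id].setdefault(tenant_id, []).append(price)

    published = {}
    for sku_id, tenants in by_sku.items():
        if len(tenants) < k:
            continue  # below the floor: publish nothing for this SKU
        # One value per tenant, so a single tenant cannot dominate.
        per_tenant = [median(prices) for prices in tenants.values()]
        published[sku_id] = {
            "tenants": len(tenants),
            "median_price": median(per_tenant),
        }
    return published
```

Counting distinct tenants rather than raw observations matters here: many observations from one tenant would otherwise pass the floor while still exposing that tenant's pricing.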

Provenance at field level

Every Sourced<T> value carries a tier, confidence score, and observation timestamp. The API exposes the full provenance object on every field.
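
One way to picture a `Sourced<T>` value is as a small generic wrapper around the fact itself. The sketch below is an assumed shape in Python, not the actual type definition: the field names `observed_at` and `ref` are hypothetical, and the observation date in the usage example is a placeholder.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Sourced(Generic[T]):
    """A value plus the provenance behind it (field names are assumed)."""

    value: T
    tier: str                  # source tier, e.g. "producer_site"
    confidence: float          # expected to lie in [0, 1]
    observed_at: datetime      # when the fact was observed
    ref: Optional[str] = None  # provenance reference, if any

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


# Illustrative values only; the date is a placeholder.
grapes = Sourced(
    value=["Ploussard"],
    tier="producer_site",
    confidence=0.91,
    observed_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Keeping the provenance on the value itself, rather than in a side table, means a field cannot be passed around without its tier and confidence.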

Provenance system

Source tiers

Every field is tagged with one of seven source tiers. The tiers are shown on a perceptually even categorical color scale so that no single source visually dominates. Color is always paired with a label; it is never the sole signal.

Provenance tiers

  • Verified (first-party): Confirmed directly by the tenant. The highest-confidence, first-party record.
  • Tech sheet: Producer-published technical document (PDF or web). High confidence, directly from source.
  • Producer site: Scraped from the producer's public website. Reliable but not formally verified.
  • Importer site: Scraped from the importer's catalog or website. One step removed from the producer.
  • COLA / government record: TTB Certificate of Label Approval or other official registry. Authoritative but limited in scope.
  • Inferred (LLM research): Filled by language-model inference from multiple public sources. Useful, but verify before acting.
  • Manual / operator: Entered or curated by a WineGraph operator. Authoritative within its scope; check the ref.
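
For client code, the seven tiers listed above could be modeled as an enumeration that pairs each machine identifier with its display label. Only the identifiers `tenant` and `producer_site` appear in the API example on this page; the other five identifiers below are guesses and would need checking against real responses.

```python
from enum import Enum


class SourceTier(Enum):
    # (identifier, label). Only "tenant" and "producer_site" are attested
    # in the page's API example; the remaining identifiers are assumptions.
    VERIFIED = ("tenant", "Verified (first-party)")
    TECH_SHEET = ("tech_sheet", "Tech sheet")
    PRODUCER_SITE = ("producer_site", "Producer site")
    IMPORTER_SITE = ("importer_site", "Importer site")
    GOVERNMENT = ("cola", "COLA / government record")
    INFERRED = ("inferred", "Inferred (LLM research)")
    MANUAL = ("manual", "Manual / operator")

    def __init__(self, identifier, label):
        self.identifier = identifier
        self.label = label

    @classmethod
    def from_identifier(cls, identifier):
        """Look up a tier by the string used in API responses."""
        for tier in cls:
            if tier.identifier == identifier:
                return tier
        raise ValueError(f"unknown source tier: {identifier!r}")


def needs_verification(tier):
    """Inferred facts are flagged 'verify before acting' in the tier list."""
    return tier is SourceTier.INFERRED
```

Carrying the label alongside the identifier makes it easy to honor the rule that color is never the sole signal: any rendering can always print the label.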

Access

Provenance via API

Every entity response includes a provenance map giving the source tier and confidence for each field. Field-level detail is available to API keys on the free, role-validated tier.

json
GET /v1/entities/:id
Authorization: Bearer wg_live_…

{
  "id": "…",
  "entity_type": "sku",
  "display_name": "Overnoy Ploussard 2022",
  "provenance": {
    "grapes":  { "source": "producer_site", "confidence": 0.91 },
    "farming": { "source": "tenant",        "confidence": 1.00 }
  }
}
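
A client might use the provenance map to decide which fields to double-check before relying on them. The sketch below parses the sample response shown above with the standard library and lists fields whose confidence falls under a chosen threshold. The helper name `fields_below` and the threshold values are illustrative assumptions, and no network call is made.

```python
import json

# The sample response from this page, copied as-is (the id is elided there).
SAMPLE = """
{
  "id": "…",
  "entity_type": "sku",
  "display_name": "Overnoy Ploussard 2022",
  "provenance": {
    "grapes":  { "source": "producer_site", "confidence": 0.91 },
    "farming": { "source": "tenant",        "confidence": 1.00 }
  }
}
"""


def fields_below(entity, threshold):
    """Return the names of fields whose confidence is under the threshold."""
    provenance = entity.get("provenance", {})
    return sorted(
        name
        for name, detail in provenance.items()
        if detail["confidence"] < threshold
    )


entity = json.loads(SAMPLE)
to_review = fields_below(entity, 0.95)  # fields worth a second look
```

With a threshold of 0.95, only the `grapes` field (confidence 0.91, from the producer site) is flagged; the tenant-confirmed `farming` field passes.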