DataSchema

The DataSchema is the contract that drives three product surfaces at runtime: browse cards, search and filters, and the detail page. It lives in the project-schemas collection and validates the data blob on every row in records.

A schema is “good” when scanning a card answers “what is this and why should I care?” at a glance, search returns relevant matches without opaque-ID noise, and the detail page surfaces what the card couldn’t fit.

AnnotatedField

    type AnnotatedField = {
      key: string
      label: string
      type: 'string' | 'number' | 'boolean' | 'date' | 'url' | 'image' | 'enum' | 'array'
      searchable: boolean
      facetable: boolean
      role?: Role | null
      cardVisibility?: 'primary' | 'secondary' | 'detail-only' | null
      enumValues?: { value: string }[]
      required: boolean
      min?: number
      max?: number
    }

  • type: one of string | number | boolean | date | url | image | enum | array. There is no location / geo type; geo is handled by the auto-pair on lat/lng columns. There is no multi-value enum type either; compound values like "Mon, Tue" need preprocessing in the source CSV.
  • searchable: included in the Orama full-text index.
  • facetable: exposed as a sidebar facet on /explore.
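
Because there is no multi-value enum, a compound cell such as "Mon, Tue" has to be split before import so it can be stored as an array. The following is a minimal sketch of that preprocessing step. `splitCompound` is a hypothetical helper, not part of the platform, and the comma separator is an assumption about the source CSV.

```typescript
// Hypothetical preprocessing helper: turns "Mon, Tue" into ["Mon", "Tue"].
// The default comma separator is an assumption about the source CSV.
function splitCompound(cell: string, separator: string = ","): string[] {
  return cell
    .split(separator)
    .map((part) => part.trim())          // drop padding around each value
    .filter((part) => part.length > 0);  // drop empty fragments such as "Mon,,Tue"
}

const openDays = splitCompound("Mon, Tue"); // ["Mon", "Tue"]
```

The resulting values can then be written to a field of type array.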

Roles

role is the universal vocabulary that decides where a field renders. The role taxonomy is shared across every card style and detail template; the card style picks which zones it exposes.

Universal slots

| Role | Renders as |
| --- | --- |
| title | Card and detail header (required identifier) |
| subtitle | Inline under the title (e.g. business type, industry, neighbourhood) |
| summary | Card body snippet (2–3 lines) |
| image | Hero photo (also accepts video / Lottie) |
| logo | Square brand mark |
| rating | Numeric 0–5 trust signal |
| ratingCount | Review count next to the rating |
| verified | Boolean trust badge |
| category | Primary tag (multi-value via array) |
| factPrimary | Top-tier fact (salary, price, funding) |
| factSecondary | Second-tier fact (hours, distance) |
| actionPrimary | Main URL action (Apply, Visit) |
| actionSecondary | Phone, directions |
| timestamp | Drives freshness chips, feed sort, expiry |
| pricingTier | Structured pricing data (tier preview chip + matrix on detail) |
| awardBadge | Trophy / leader / certification |
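
The AnnotatedField type refers to a Role type that this page does not spell out. One way to read the universal slots is as a string union, sketched below. `UniversalRole`, `RoleField`, and `missingTitle` are hypothetical names for illustration. The check relies only on the statement that title is the required identifier.

```typescript
// Universal roles, copied from the table above. Kit-specific roles are omitted.
type UniversalRole =
  | "title" | "subtitle" | "summary" | "image" | "logo"
  | "rating" | "ratingCount" | "verified" | "category"
  | "factPrimary" | "factSecondary" | "actionPrimary" | "actionSecondary"
  | "timestamp" | "pricingTier" | "awardBadge";

// Trimmed-down field shape for this sketch (assumption, not the real type).
interface RoleField {
  key: string;
  role?: UniversalRole | null;
}

// Returns true when no field carries the required "title" role.
function missingTitle(fields: RoleField[]): boolean {
  return !fields.some((field) => field.role === "title");
}
```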

Kit-specific slots

Operators opt into a kit at setup based on the archetype (local biz / asset gallery / API marketplace / AI generation / …). Kits enable additional roles like license, format, contentRating, leadScore, signalPill, sloChip, rateLimit, recipe, parentRecord, version. Kits are bundles, not new component trees — they just toggle which optional slots/sections render.

cardVisibility

Solves the “we have 30 fields, what shows on the card?” problem.

| Value | Meaning |
| --- | --- |
| primary | Must show on card; ordered first |
| secondary | Show on card if budget allows |
| detail-only | Only on detail page |
| null | Auto-rank via heuristic (default) |

Per-card-style budgets (hardcoded):

| Style | Max badges | Max meta |
| --- | --- | --- |
| media | 3 | 2 |
| logo | 4 | 3 |
| minimal | 2 | 1 |
| Detail page | unlimited | unlimited |
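
To make the interaction between cardVisibility and the budgets concrete, here is a minimal sketch of picking badge fields for one card style. It is an illustration under stated assumptions, not the platform's algorithm. The real auto-rank heuristic for null is not described here, so unranked fields simply sort last. The sketch also does not cover how fields are divided between badges and meta, and it truncates at the budget even for primary fields. `pickBadges`, `rank`, and `BudgetField` are hypothetical names.

```typescript
type CardVisibility = "primary" | "secondary" | "detail-only" | null;
type CardStyle = "media" | "logo" | "minimal";

// Trimmed-down field shape for this sketch.
interface BudgetField {
  key: string;
  cardVisibility?: CardVisibility;
}

// Budgets copied from the table above.
const BUDGETS: Record<CardStyle, { maxBadges: number; maxMeta: number }> = {
  media: { maxBadges: 3, maxMeta: 2 },
  logo: { maxBadges: 4, maxMeta: 3 },
  minimal: { maxBadges: 2, maxMeta: 1 },
};

// Assumed ordering: primary, then secondary, then unranked (null / undefined).
function rank(visibility: CardVisibility | undefined): number {
  if (visibility === "primary") return 0;
  if (visibility === "secondary") return 1;
  return 2;
}

function pickBadges(fields: BudgetField[], style: CardStyle): string[] {
  return fields
    .filter((field) => field.cardVisibility !== "detail-only") // never on the card
    .map((field, index) => ({ field, index }))
    // Sort by visibility rank; ties keep their schema order.
    .sort((a, b) =>
      rank(a.field.cardVisibility) - rank(b.field.cardVisibility) || a.index - b.index)
    .slice(0, BUDGETS[style].maxBadges)
    .map(({ field }) => field.key);
}
```

With fields marked secondary, detail-only, primary, and unset, a minimal card keeps the primary field plus the first secondary one, and a logo card has room for all fields that are not detail-only.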

Auto-inference at import

When an operator runs shipmore schema infer, the platform proposes types, roles, searchable, and facetable from key names + value distributions:

| Heuristic | Inferred role |
| --- | --- |
| name / title / company field name | title |
| Number with rating / score in key, range 0–5 | rating |
| Number with count / reviews / views in key | ratingCount |
| URL field, first one | actionPrimary |
| URL field, additional | actionSecondary |
| Image field, square aspect | logo |
| Image field, landscape | image |
| Boolean with verified / claimed semantic | verified |
| Enum / array, low cardinality, facetable | category |
| Currency-like number, not 0–5 range | factPrimary |

Auto-inference is best-effort. The agent and operator iterate via shipmore schema field set/drop and re-run schema apply.
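
Several of the heuristics in the table above can be sketched as a single function. This is an illustration only, not the code behind shipmore schema infer. `ColumnProfile`, `urlIndex`, `aspect`, and `inferRole` are hypothetical names, and the title rule assumes a string column. The category and factPrimary rules are left out because "low cardinality" and "currency-like" are not defined precisely enough to sketch.

```typescript
type FieldType =
  | "string" | "number" | "boolean" | "date" | "url" | "image" | "enum" | "array";

// Hypothetical per-column summary gathered while scanning the source data.
interface ColumnProfile {
  key: string;
  type: FieldType;
  min?: number;                    // smallest observed numeric value
  max?: number;                    // largest observed numeric value
  urlIndex?: number;               // position among URL columns (0 = first)
  aspect?: "square" | "landscape"; // dominant image aspect
}

function inferRole(col: ColumnProfile): string | null {
  const key = col.key.toLowerCase();
  switch (col.type) {
    case "string":
      return /name|title|company/.test(key) ? "title" : null;
    case "number": {
      const inRatingRange =
        col.min !== undefined && col.max !== undefined && col.min >= 0 && col.max <= 5;
      if (/rating|score/.test(key) && inRatingRange) return "rating";
      if (/count|reviews|views/.test(key)) return "ratingCount";
      return null;
    }
    case "url":
      return col.urlIndex === 0 ? "actionPrimary" : "actionSecondary";
    case "image":
      if (col.aspect === "square") return "logo";
      if (col.aspect === "landscape") return "image";
      return null;
    case "boolean":
      return /verified|claimed/.test(key) ? "verified" : null;
    default:
      return null; // no proposal; the operator assigns a role by hand
  }
}
```

Returning null for anything the rules do not cover matches the best-effort nature of inference: the operator or agent is expected to fill the gaps afterwards.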

Separation of concerns

| Concern | Collection | Owns |
| --- | --- | --- |
| Validation contract | project-schemas | Shape of records.data. Pure JSON Schema. |
| Source projection | import-mappings | Source column renames, lat/lng extraction, ignores. |
| Rows | records | The data. data validated against the DataSchema at write time. |

This split lets you re-run an import after a column rename without touching the schema, and update the schema without re-importing.
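
A small sketch shows why the split helps. The mapping shape below is an assumption for illustration, since the actual import-mappings document format is not shown on this page. `ImportMapping` and `projectRow` are hypothetical names, and lat/lng extraction is left out.

```typescript
// Hypothetical mapping shape: only renames and ignores are modelled here.
interface ImportMapping {
  renames: Record<string, string>; // source column -> schema key
  ignores: string[];               // source columns to drop
}

// Projects one source row onto schema keys using the mapping.
function projectRow(
  row: Record<string, string>,
  mapping: ImportMapping,
): Record<string, string> {
  const data: Record<string, string> = {};
  for (const [column, value] of Object.entries(row)) {
    if (mapping.ignores.includes(column)) continue;
    data[mapping.renames[column] ?? column] = value;
  }
  return data;
}

// The source file renamed "biz_name" to "Business Name"; only the mapping changes.
const before = projectRow(
  { biz_name: "Acme", internal_id: "x1" },
  { renames: { biz_name: "name" }, ignores: ["internal_id"] },
);
const after = projectRow(
  { "Business Name": "Acme", internal_id: "x1" },
  { renames: { "Business Name": "name" }, ignores: ["internal_id"] },
);
// Both calls yield { name: "Acme" }, so the schema never sees the rename.
```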