Data Layer

The data layer is what turns a CSV into a live, browsable, monetizable product. Three collections cooperate:

Collection | Purpose
records | Universal data store, one row per item. The data blob is validated against the tenant’s DataSchema at write time.
project-schemas | The DataSchema for a tenant’s dataset (JSON Schema describing records.data).
import-mappings | Source row → record projection rules (column renames, lat/lng paths, ignores).

Ingest flow

The canonical flow is CLI-first, designed to be driven by an agent or by a human at the terminal:

shipmore schema infer → schema apply → import
  1. Infer — sample the file, propose a DataSchema + import mapping. Output is a fields[] JSON the operator can edit.
  2. Apply — persist the schema (and mapping) to project-schemas + import-mappings.
  3. Import — project raw rows into records.data, validate against the schema, write in batches via Payload jobs.
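
The Import step can be pictured with a minimal sketch like the one below. It is not ShipMore's implementation: the `ImportMapping`, `FieldRule`, and `ImportPlan` shapes and the `projectRow`, `validate`, `toBatches`, and `planImport` helpers are hypothetical names introduced for illustration, and collecting invalid rows instead of aborting is an assumption. The sketch only plans the batches; the actual write through Payload jobs is left out.

```typescript
// Hypothetical shapes: the real mapping and schema types are not shown on this page.
type RawRow = Record<string, string>;

interface ImportMapping {
  renames: Record<string, string>; // source column -> key in records.data
  ignores: string[];               // source columns to drop
}

interface FieldRule {
  key: string;
  type: 'string' | 'number' | 'boolean';
  required: boolean;
}

interface ImportPlan {
  batches: Record<string, string>[][];
  rejected: { row: RawRow; errors: string[] }[];
}

// Project one source row into the shape of records.data.
function projectRow(row: RawRow, mapping: ImportMapping): Record<string, string> {
  const data: Record<string, string> = {};
  for (const [column, value] of Object.entries(row)) {
    if (mapping.ignores.includes(column)) continue;
    data[mapping.renames[column] ?? column] = value;
  }
  return data;
}

// Return a list of validation errors; an empty list means the row is valid.
function validate(data: Record<string, string>, rules: FieldRule[]): string[] {
  const errors: string[] = [];
  for (const rule of rules) {
    const value = data[rule.key];
    if (value === undefined || value === '') {
      if (rule.required) errors.push(`${rule.key}: required`);
      continue;
    }
    if (rule.type === 'number' && Number.isNaN(Number(value))) {
      errors.push(`${rule.key}: expected number`);
    }
    if (rule.type === 'boolean' && value !== 'true' && value !== 'false') {
      errors.push(`${rule.key}: expected boolean`);
    }
  }
  return errors;
}

// Split items into fixed-size batches.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError('batch size must be at least 1');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

// Project, validate, and batch. Invalid rows are collected, not written (an assumption).
function planImport(
  rows: RawRow[],
  mapping: ImportMapping,
  rules: FieldRule[],
  batchSize: number,
): ImportPlan {
  const valid: Record<string, string>[] = [];
  const rejected: ImportPlan['rejected'] = [];
  for (const row of rows) {
    const data = projectRow(row, mapping);
    const errors = validate(data, rules);
    if (errors.length > 0) rejected.push({ row, errors });
    else valid.push(data);
  }
  return { batches: toBatches(valid, batchSize), rejected };
}
```

Keeping projection, validation, and batching as separate pure functions makes each phase easy to inspect before anything is written.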

Bulk imports use @payloadcms/plugin-import-export underneath. The job runner is enabled via jobs.autoRun in payload.config.ts (tune with PAYLOAD_JOB_AUTORUN_CRON).
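
As a rough illustration of the wiring, the sketch below builds a value for `jobs.autoRun`. It assumes `autoRun` takes an array of `{ cron, limit, queue }` entries, as in Payload's jobs queue configuration. The `buildAutoRun` helper, the fallback cron expression, and the `limit` and `queue` values are hypothetical; how ShipMore actually reads `PAYLOAD_JOB_AUTORUN_CRON` is not shown on this page.

```typescript
// Assumed shape of one jobs.autoRun entry.
interface AutoRunEntry {
  cron: string;  // schedule on which queued jobs are picked up
  limit: number; // maximum jobs processed per run
  queue: string; // queue name to drain
}

// Hypothetical helper: derive the autoRun entries from environment variables.
function buildAutoRun(env: Record<string, string | undefined>): AutoRunEntry[] {
  return [
    {
      // Falls back to "every minute" when the variable is unset (assumed default).
      cron: env['PAYLOAD_JOB_AUTORUN_CRON'] ?? '* * * * *',
      limit: 10,        // assumed value
      queue: 'default', // assumed value
    },
  ];
}
```

In `payload.config.ts` the result would sit under `jobs: { autoRun: buildAutoRun(process.env) }`.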

See Import for the full CLI flow and the agent’s 4-phase workflow.

Records

Every row goes into the same records collection regardless of shape. Root fields are stable (id, tenant, slug, status, location, displaySummary, sourceId, …) and data is the JSON blob shaped by the DataSchema.

location is a special root field — when source rows include lat / lng columns, they’re paired automatically into record.location at import time. No CLI flag needed.
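
The pairing can be sketched as follows. This is an illustration only: the `{ lat, lng }` shape of `Location` and the `pairLocation` helper are assumptions, since the stored format of record.location is not documented here, and the range checks are an added safeguard, not a confirmed behavior.

```typescript
type SourceRow = Record<string, string>;

// Assumed shape; the actual format of record.location may differ.
interface Location {
  lat: number;
  lng: number;
}

// Pair lat / lng columns into a location, or return null when either is missing or invalid.
function pairLocation(row: SourceRow): Location | null {
  const lat = Number.parseFloat(row['lat'] ?? '');
  const lng = Number.parseFloat(row['lng'] ?? '');
  if (!Number.isFinite(lat) || !Number.isFinite(lng)) return null;
  // Reject coordinates outside the valid geographic range.
  if (lat < -90 || lat > 90 || lng < -180 || lng > 180) return null;
  return { lat, lng };
}
```

Rows without usable coordinates yield `null`, so the record can still be imported without a location.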

DataSchema

The project-schemas collection holds a DataSchema — a JSON Schema describing the shape of records.data. The collection name is legacy; the concept is a data schema, not a project-wide schema. It only describes the data blob — root fields on records are not part of it.

Each field in the DataSchema is an AnnotatedField that drives both data validation and presentation:

    type AnnotatedField = {
      key: string
      label: string
      type: 'string' | 'number' | 'boolean' | 'date' | 'url' | 'image' | 'enum' | 'array'
      searchable: boolean
      facetable: boolean
      role?:
        | 'title' | 'subtitle' | 'summary' | 'image' | 'logo'
        | 'rating' | 'ratingCount' | 'verified' | 'category'
        | 'factPrimary' | 'factSecondary' | 'actionPrimary' | 'actionSecondary'
        | 'timestamp' | 'pricingTier' | 'awardBadge'
      cardVisibility?: 'primary' | 'secondary' | 'detail-only' | null
      enumValues?: { value: string }[]
      required: boolean
      min?: number
      max?: number
    }

  • searchable / facetable drive the Orama index and explore filters.
  • role drives where the field renders on cards and the detail page.
  • cardVisibility decides which fields show on the card vs. detail-only.
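
To make the three flags concrete, here is a small sketch that derives index keys and card fields from a list of fields. `FieldLike` is a trimmed, hypothetical stand-in for AnnotatedField, the sample fields are invented, and treating only 'primary' and 'secondary' as card-visible (primary first) is an assumption about how cardVisibility is interpreted.

```typescript
// Trimmed stand-in for AnnotatedField, limited to the properties used here.
interface FieldLike {
  key: string;
  searchable: boolean;
  facetable: boolean;
  role?: string;
  cardVisibility?: 'primary' | 'secondary' | 'detail-only' | null;
}

// Invented sample fields for illustration.
const fields: FieldLike[] = [
  { key: 'name', searchable: true, facetable: false, role: 'title', cardVisibility: 'primary' },
  { key: 'category', searchable: false, facetable: true, role: 'category', cardVisibility: 'secondary' },
  { key: 'description', searchable: true, facetable: false, role: 'summary', cardVisibility: 'detail-only' },
];

// Keys that would feed the full-text index.
const searchableKeys = fields.filter((f) => f.searchable).map((f) => f.key);

// Keys that would become explore filters.
const facetKeys = fields.filter((f) => f.facetable).map((f) => f.key);

// Keys shown on the card: primary fields first, then secondary (assumed ordering).
const cardKeys = [
  ...fields.filter((f) => f.cardVisibility === 'primary'),
  ...fields.filter((f) => f.cardVisibility === 'secondary'),
].map((f) => f.key);
```

With these sample fields, `description` is searchable but stays off the card, while `category` is filterable without being part of the text index.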

See DataSchema for the full field reference and role taxonomy.

Search & Explore

Once records are in, the ExploreBlock ships out of the box: per-tenant Orama index, faceted sidebar, search bar, sort, pagination. Five renderer variants (grid, table, feed, map, minimal) cover the common archetypes.

See Search & Explore.

Where the data layer is NOT

ShipMore is not a data sourcing or cleaning pipeline. You bring already-shaped data (or generate it via the generation pipeline). Sourcing, scraping, and messy ETL stay upstream; ShipMore maps the data to a schema, presents it, and monetizes it.