Added

Table Metadata Service

Automatic metadata collection enriches the AI's understanding of your data assets with column statistics, row counts, and sample values. Knowing data types, cardinality, and common values helps the AI generate more accurate SQL, which reduces query errors and improves natural language interpretation.

What's New

  • Background Metadata Sync: A scheduled job collects metadata from your connected data sources without impacting query performance
  • Column-Level Statistics: Each column includes data type, nullability, and sample values to help the AI understand the data shape
  • Table Row Counts: Both estimated (from statistics) and actual counts help the AI make informed decisions about query complexity
  • Freshness Tracking: Know exactly when metadata was last synchronized, with automatic refresh on schema changes
  • Semantic Model Enhancement: Metadata automatically enriches your semantic model context for better AI responses

How It Improves AI Responses

Without Metadata                    | With Metadata
AI guesses column types from names  | AI knows exact types and generates correct casts
AI doesn't know valid values        | AI suggests filters based on sample values
AI can't estimate result size       | AI warns about large result sets before execution
Ambiguous column references         | AI disambiguates using column statistics
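
As a rough illustration of how column metadata could be turned into context for the AI, the sketch below renders one column entry as a single descriptive line. The function name `column_context_line`, the `max_samples` parameter, and the output format are assumptions made for this example; they do not describe the service's internal prompt format.

```python
def column_context_line(column: dict, max_samples: int = 3) -> str:
    """Render one column's metadata as a short, human-readable context line.

    Hypothetical helper: the real context format used by the service is not
    documented in these release notes.
    """
    nullability = "NULL" if column.get("nullable") else "NOT NULL"
    line = f'{column["name"]} {column["type"]} {nullability}'
    # Sample values may be absent (for example, on columns flagged as PII).
    samples = column.get("sampleValues") or []
    if samples:
        shown = ", ".join(repr(value) for value in samples[:max_samples])
        line += f" (e.g. {shown})"
    return line


status_column = {
    "name": "status",
    "type": "VARCHAR(20)",
    "nullable": False,
    "sampleValues": ["pending", "shipped", "delivered", "cancelled"],
}
print(column_context_line(status_column))
```

With a line like this in context, a request such as "show shipped orders" can be mapped to a filter on a known value rather than a guessed one.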

API Changes

Data Assets Response

GET /api/v1/accounts/{accountId}/workspaces/{workspaceId}/data-assets

The response now includes the following metadata fields:

{
  "ok": true,
  "data": {
    "tables": [
      {
        "name": "orders",
        "schema": "public",
        "columnCount": 12,
        "rowCount": 1250000,
        "rowCountEstimated": false,
        "lastMetadataSync": "2026-01-16T08:30:00Z",
        "columns": [
          {
            "name": "order_id",
            "type": "VARCHAR(36)",
            "nullable": false,
            "sampleValues": ["ord_abc123", "ord_def456", "ord_ghi789"]
          },
          {
            "name": "status",
            "type": "VARCHAR(20)",
            "nullable": false,
            "sampleValues": ["pending", "shipped", "delivered", "cancelled"]
          },
          {
            "name": "total_amount",
            "type": "NUMBER(12,2)",
            "nullable": false,
            "sampleValues": [125.5, 89.99, 450.0]
          }
        ]
      }
    ]
  }
}

New Response Fields

Field                   | Type     | Description
columnCount             | number   | Total columns in the table
rowCount                | number   | Row count (exact or estimated)
rowCountEstimated       | boolean  | Whether row count is estimated from statistics
lastMetadataSync        | ISO 8601 | When metadata was last collected
columns[].sampleValues  | array    | Up to 10 representative values from the column
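
A client might consume the new fields along these lines. This is a minimal sketch that works on an already-parsed table entry shaped like the sample response; the helper name `summarize_table` and the "~" prefix for estimated counts are illustrative choices, not part of the API.

```python
from datetime import datetime, timezone


def summarize_table(table: dict) -> str:
    """Return a one-line summary of a table entry from the data-assets response."""
    # rowCountEstimated indicates that rowCount came from statistics,
    # so mark the number as approximate (the "~" is our own convention).
    prefix = "~" if table.get("rowCountEstimated") else ""
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    synced = datetime.fromisoformat(table["lastMetadataSync"].replace("Z", "+00:00"))
    synced_utc = synced.astimezone(timezone.utc)
    return (
        f'{table["schema"]}.{table["name"]}: {table["columnCount"]} columns, '
        f'{prefix}{table["rowCount"]:,} rows, synced {synced_utc:%Y-%m-%d %H:%M} UTC'
    )


orders = {
    "name": "orders",
    "schema": "public",
    "columnCount": 12,
    "rowCount": 1250000,
    "rowCountEstimated": False,
    "lastMetadataSync": "2026-01-16T08:30:00Z",
}
print(summarize_table(orders))
# prints: public.orders: 12 columns, 1,250,000 rows, synced 2026-01-16 08:30 UTC
```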

Metadata Sync Schedule

  • Initial sync: Runs automatically when a data source is connected
  • Scheduled refresh: Every 24 hours for active workspaces
  • On-demand: Triggered when schema changes are detected
  • Manual: Available via the workspace settings UI
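
A client that wants to know whether a table's metadata is older than the scheduled refresh interval could compare `lastMetadataSync` against the current time. The sketch below assumes the 24-hour interval listed above; the function name `is_metadata_stale` and the `max_age` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Scheduled refresh interval for active workspaces, per the schedule above.
REFRESH_INTERVAL = timedelta(hours=24)


def is_metadata_stale(
    last_sync_iso: str,
    now: datetime,
    max_age: timedelta = REFRESH_INTERVAL,
) -> bool:
    """Return True if the last sync is older than max_age relative to now."""
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    last_sync = datetime.fromisoformat(last_sync_iso.replace("Z", "+00:00"))
    return now - last_sync > max_age


now = datetime(2026, 1, 17, 9, 0, tzinfo=timezone.utc)
print(is_metadata_stale("2026-01-16T08:30:00Z", now))  # prints: True (24.5 hours old)
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check easy to test.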

Privacy & Security

  • Sample values are collected from non-sensitive columns only
  • PII columns (detected by name patterns) show type information but no samples
  • Metadata is stored encrypted and scoped to your account
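
Name-based PII detection can be pictured roughly as follows. The pattern list here is invented for illustration only; the actual patterns the service uses are not given in these notes, and `redact_samples` is a hypothetical helper.

```python
import re

# Hypothetical name patterns, for illustration only. The service's real
# PII pattern list is not documented in these release notes.
PII_NAME_PATTERN = re.compile(
    r"(^|_)(email|phone|ssn|password|dob|address)(_|$)",
    re.IGNORECASE,
)


def redact_samples(column: dict) -> dict:
    """Return a copy of the column, dropping sampleValues if its name looks like PII."""
    if PII_NAME_PATTERN.search(column["name"]):
        # Keep type information, but omit the samples.
        return {key: value for key, value in column.items() if key != "sampleValues"}
    return dict(column)


email_column = {
    "name": "customer_email",
    "type": "VARCHAR(255)",
    "nullable": False,
    "sampleValues": ["a@example.com"],
}
print(redact_samples(email_column))
```

Matching on names alone can miss sensitive columns with unusual names, so treat a sketch like this as a first filter rather than a guarantee.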