Added
Table Metadata Service
about 1 month ago
Automatic metadata collection enriches the AI's understanding of your data assets with column statistics, row counts, and sample values. With this context, the AI can account for data types, cardinality, and common values when it generates SQL, which reduces query errors and improves natural language interpretation.
What's New
- Background Metadata Sync: A scheduled job collects metadata from your connected data sources without impacting query performance
- Column-Level Statistics: Each column includes data type, nullability, and sample values to help the AI understand the data shape
- Table Row Counts: Both estimated (from statistics) and actual counts help the AI make informed decisions about query complexity
- Freshness Tracking: Know exactly when metadata was last synchronized, with automatic refresh on schema changes
- Semantic Model Enhancement: Metadata automatically enriches your semantic model context for better AI responses
How It Improves AI Responses
| Without Metadata | With Metadata |
|---|---|
| AI guesses column types from names | AI knows exact types and generates correct casts |
| AI doesn't know valid values | AI suggests filters based on sample values |
| AI can't estimate result size | AI warns about large result sets before execution |
| Ambiguous column references | AI disambiguates using column statistics |
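To make the comparison concrete, here is a minimal Python sketch of how a table's metadata could be flattened into a compact text summary for a model to read. How the service formats this context internally is not documented here; the function name `render_table_context` and the output layout are assumptions for illustration. Only the field names come from the data-assets response.

```python
def render_table_context(table: dict) -> str:
    """Render one table entry as a short plain-text schema summary.

    Hypothetical helper: the layout is illustrative, not the service's format.
    """
    count = f"{table['rowCount']:,}"
    if table.get("rowCountEstimated"):
        count = f"~{count}"  # mark counts that come from statistics
    lines = [f"{table['schema']}.{table['name']} ({count} rows)"]
    for col in table.get("columns", []):
        null = "NULL" if col.get("nullable") else "NOT NULL"
        line = f"  {col['name']} {col['type']} {null}"
        samples = col.get("sampleValues") or []
        if samples:
            # Show at most three samples to keep the summary short
            line += " e.g. " + ", ".join(repr(v) for v in samples[:3])
        lines.append(line)
    return "\n".join(lines)
```

With a summary like this in context, a model sees that `status` is a `VARCHAR(20)` with values such as `'pending'` and `'shipped'`, so it does not have to infer that from the column name.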
API Changes
Data Assets Response
GET /api/v1/accounts/{accountId}/workspaces/{workspaceId}/data-assets
Enhanced response now includes metadata fields:
{
  "ok": true,
  "data": {
    "tables": [
      {
        "name": "orders",
        "schema": "public",
        "columnCount": 12,
        "rowCount": 1250000,
        "rowCountEstimated": false,
        "lastMetadataSync": "2026-01-16T08:30:00Z",
        "columns": [
          {
            "name": "order_id",
            "type": "VARCHAR(36)",
            "nullable": false,
            "sampleValues": ["ord_abc123", "ord_def456", "ord_ghi789"]
          },
          {
            "name": "status",
            "type": "VARCHAR(20)",
            "nullable": false,
            "sampleValues": ["pending", "shipped", "delivered", "cancelled"]
          },
          {
            "name": "total_amount",
            "type": "NUMBER(12,2)",
            "nullable": false,
            "sampleValues": [125.5, 89.99, 450.0]
          }
        ]
      }
    ]
  }
}
New Response Fields
| Field | Type | Description |
|---|---|---|
| columnCount | number | Total columns in the table |
| rowCount | number | Row count (exact or estimated) |
| rowCountEstimated | boolean | Whether row count is estimated from statistics |
| lastMetadataSync | string (ISO 8601) | When metadata was last collected |
| columns[].sampleValues | array | Up to 10 representative values from the column |
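As a rough illustration of reading these fields, the Python sketch below builds the endpoint URL and summarizes each table from an already-decoded response body. The base URL, the helper names, and the treatment of a missing or false `ok` flag as an error are assumptions. Authentication and the error response shape are not described in this entry, so the request itself is left out.

```python
def data_assets_url(base_url: str, account_id: str, workspace_id: str) -> str:
    """Build the data-assets URL; base_url is a placeholder you supply."""
    return (
        f"{base_url.rstrip('/')}/api/v1/accounts/{account_id}"
        f"/workspaces/{workspace_id}/data-assets"
    )


def summarize_tables(payload: dict) -> list:
    """Return one summary line per table in a decoded response body."""
    # Assumption: a missing or false "ok" means the call failed.
    if not payload.get("ok"):
        raise ValueError("response did not report ok: true")
    summaries = []
    for table in payload["data"]["tables"]:
        kind = "estimated" if table.get("rowCountEstimated") else "exact"
        summaries.append(
            f"{table['schema']}.{table['name']}: "
            f"{table['columnCount']} columns, "
            f"{table['rowCount']} rows ({kind}), "
            f"synced {table['lastMetadataSync']}"
        )
    return summaries
```

Checking `rowCountEstimated` before trusting `rowCount` matters when the count feeds a decision such as pagination or a result-size warning.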
Metadata Sync Schedule
- Initial sync: Runs automatically when a data source is connected
- Scheduled refresh: Every 24 hours for active workspaces
- On-demand: Triggered when schema changes are detected
- Manual: Available via the workspace settings UI
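Given the 24-hour refresh interval above, a client can use `lastMetadataSync` to decide whether metadata looks overdue. The following is a minimal Python sketch under that assumption. The names `REFRESH_INTERVAL`, `parse_sync_time`, and `is_stale` are illustrative, not part of the API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Taken from the scheduled refresh interval for active workspaces
REFRESH_INTERVAL = timedelta(hours=24)


def parse_sync_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-01-16T08:30:00Z."""
    # Older Python versions reject a trailing "Z", so normalize it first
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_stale(last_sync: str, now: Optional[datetime] = None) -> bool:
    """Return True if the last sync is older than the refresh interval."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - parse_sync_time(last_sync) > REFRESH_INTERVAL
```

A stale timestamp does not necessarily indicate a fault, because the schedule applies to active workspaces. It could be a reasonable cue to trigger a manual sync from the workspace settings UI.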
Privacy & Security
- Sample values are collected from non-sensitive columns only
- PII columns (detected by name patterns) show type information but no samples
- Metadata is stored encrypted and scoped to your account
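Because some columns carry type information but no samples, client code should not assume `sampleValues` is always populated. The Python sketch below lists such columns. Whether the field is omitted, null, or an empty array in that case is not stated here, so the helper (a hypothetical name) treats all three the same way.

```python
def columns_without_samples(table: dict) -> list:
    """Return names of columns whose sampleValues is missing, null, or empty."""
    return [
        col["name"]
        for col in table.get("columns", [])
        if not col.get("sampleValues")
    ]


# Illustrative input; "customer_email" is a made-up column name
example_table = {
    "name": "orders",
    "columns": [
        {"name": "status", "type": "VARCHAR(20)", "sampleValues": ["pending"]},
        {"name": "customer_email", "type": "VARCHAR(255)"},
    ],
}
print(columns_without_samples(example_table))
```

An absent sample list is not proof that a column was classified as PII. An empty table, for instance, could plausibly yield the same result.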