Data Enrichment
Data enrichment automatically fetches related data and includes it in your embeddings, without changing your source database. This is especially useful when syncing to vector databases, where semantic search quality depends on rich, contextual data.
The Problem
When you sync data to a vector database, your records often contain foreign keys instead of meaningful data:
Name: "Morning Yoga Flow"
Description: "A gentle morning session focusing on breath and movement"
Category ID: 5
Organization ID: 12
If you generate an embedding from this data, your vector search can find “yoga” or “morning”, but it can’t find this class when someone searches for “fitness classes at Sunrise Wellness Studio” because the organization name isn’t in the record.
How Enrichment Solves This
Sync or Swim’s data enrichment automatically resolves foreign keys to their related data at sync time.
After enrichment, your embedding includes:
Name: "Morning Yoga Flow"
Description: "A gentle morning session focusing on breath and movement"
Category: "Yoga - Traditional and modern yoga practices"
Organization: "Sunrise Wellness Studio - A holistic wellness center"
Now searches for “wellness studio”, “holistic”, or “traditional yoga” will surface this class.
Configuring Enrichment
In the mapping editor, specify which related data to include for each relationship:
| Setting | Description | Example |
|---|---|---|
| Source Field | The foreign key in your record | category_id |
| Related Table | Where to look up the data | categories |
| Related Key | The key field in the related table | id |
| Fields to Include | What data to fetch | name, description |
| Field Prefix | How to name the enriched fields | category_ |
Single Relationship
Enrich a class with its category:
| Setting | Value |
|---|---|
| Source Field | category_id |
| Related Table | categories |
| Related Key | id |
| Fields to Include | name, description |
| Field Prefix | category_ |
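As a rough illustration of what these settings mean, the sketch below applies one enrichment configuration to an in-memory record. The `EnrichmentConfig` class, the `enrich` function, and the in-memory `tables` dictionary are hypothetical names invented for this example; they are not Sync or Swim's actual schema or API. Leaving the record unchanged when no related row is found is also an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class EnrichmentConfig:
    # Hypothetical field names that mirror the settings table above.
    source_field: str             # foreign key in the record, e.g. "category_id"
    related_table: str            # table to look up, e.g. "categories"
    related_key: str              # key field in the related table, e.g. "id"
    fields_to_include: list[str]  # fields to copy from the related row
    field_prefix: str             # prefix for the enriched field names


def enrich(record: dict, config: EnrichmentConfig, tables: dict) -> dict:
    """Return a copy of `record` with prefixed fields from the related row."""
    enriched = dict(record)  # the source record itself is never modified
    foreign_key = record.get(config.source_field)
    related = next(
        (row for row in tables[config.related_table]
         if row.get(config.related_key) == foreign_key),
        None,
    )
    if related is None:
        # Assumption: a missing related row leaves the record as it was.
        return enriched
    for field in config.fields_to_include:
        enriched[config.field_prefix + field] = related.get(field)
    return enriched


# Stand-in for the related table in the source database.
tables = {
    "categories": [
        {"id": 5, "name": "Yoga",
         "description": "Traditional and modern yoga practices"},
    ]
}
yoga_class = {"name": "Morning Yoga Flow", "category_id": 5}
config = EnrichmentConfig(
    "category_id", "categories", "id", ["name", "description"], "category_"
)
enriched_class = enrich(yoga_class, config, tables)
```

With this configuration, the enriched copy gains `category_name` and `category_description`, while the original record keeps only its foreign key.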
Multiple Relationships
Enrich a class with both category and organization data by adding multiple enrichment configurations:
Category Enrichment:
| Setting | Value |
|---|---|
| Source Field | category_id |
| Related Table | categories |
| Fields to Include | name |
| Field Prefix | category_ |
Organization Enrichment:
| Setting | Value |
|---|---|
| Source Field | organization_id |
| Related Table | organizations |
| Fields to Include | name, about |
| Field Prefix | org_ |
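Both related tables here expose a `name` field, which is where the field prefix earns its keep. The sketch below applies the two configurations in sequence and shows that the prefixed fields do not collide with each other or with the class's own `name`. The dictionary layout, the `apply_enrichments` function, and the assumption that related rows are looked up by their `id` are all illustrative, not the product's real configuration format.

```python
# Hypothetical in-memory stand-ins for the related tables, keyed by id.
TABLES = {
    "categories": {5: {"name": "Yoga"}},
    "organizations": {
        12: {"name": "Sunrise Wellness Studio",
             "about": "A holistic wellness center"},
    },
}

# One entry per enrichment configuration, mirroring the two tables above.
ENRICHMENTS = [
    {"source_field": "category_id", "related_table": "categories",
     "fields": ["name"], "prefix": "category_"},
    {"source_field": "organization_id", "related_table": "organizations",
     "fields": ["name", "about"], "prefix": "org_"},
]


def apply_enrichments(record, enrichments, tables):
    """Apply each configuration in turn and return an enriched copy."""
    enriched = dict(record)
    for cfg in enrichments:
        related = tables[cfg["related_table"]].get(record.get(cfg["source_field"]))
        if related is None:
            continue  # assumption: skip lookups that find no related row
        for field in cfg["fields"]:
            enriched[cfg["prefix"] + field] = related[field]
    return enriched


result = apply_enrichments(
    {"name": "Morning Yoga Flow", "category_id": 5, "organization_id": 12},
    ENRICHMENTS,
    TABLES,
)
```

The result carries three distinct name-like fields: `name`, `category_name`, and `org_name`.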
How It Works at Sync Time
When a record changes in your source database:
- Sync or Swim detects the change
- Fetches related records using your configuration
- Merges the data into the record
- Generates an embedding from the enriched data
- Syncs to your vector database
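The steps above can be sketched as a single change handler. Everything in this example is a stand-in chosen for illustration: `fake_embed` replaces a real embedding model with a hash-derived vector, `vector_store` is a plain dictionary rather than a vector database, and `on_record_changed` is a hypothetical name, not part of Sync or Swim.

```python
import hashlib

ORGANIZATIONS = {12: {"name": "Sunrise Wellness Studio"}}  # hypothetical related table
vector_store = {}  # stand-in for a vector database: record id -> (vector, payload)


def fake_embed(text: str) -> list[float]:
    # Placeholder for a real embedding model: four floats derived from a hash.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:4]]


def on_record_changed(record: dict) -> None:
    # Step 1: a change to `record` was detected; this function is the handler.
    # Step 2: fetch the related record using the foreign key.
    org = ORGANIZATIONS.get(record["organization_id"], {})
    # Step 3: merge the related fields into a copy of the record.
    enriched = {**record, **{f"org_{key}": value for key, value in org.items()}}
    # Step 4: generate an embedding from the enriched data.
    text = "\n".join(f"{key}: {value}" for key, value in enriched.items())
    vector = fake_embed(text)
    # Step 5: write the vector and enriched payload to the target.
    vector_store[record["id"]] = (vector, enriched)


on_record_changed({"id": 1, "name": "Morning Yoga Flow", "organization_id": 12})
```

The important ordering is that the merge happens before the embedding is generated, so the related text influences the vector rather than only riding along as metadata.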
What Happens When Related Records Change
When a related record is updated (e.g., an organization changes its description), all affected embeddings are automatically updated on the next sync. This ensures your vector search results stay current with your data.
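One way to picture "affected embeddings" is as a reverse lookup: every record whose foreign key points at the changed row needs a fresh embedding. The sketch below shows that idea only; the `affected_records` function and the sample classes are hypothetical, and how Sync or Swim actually tracks these dependencies is not described here.

```python
def affected_records(records, source_field, changed_key):
    """Return the records whose foreign key points at the changed related row."""
    return [record for record in records if record.get(source_field) == changed_key]


# Hypothetical sample data: two classes belong to organization 12, one does not.
classes = [
    {"id": 1, "name": "Morning Yoga Flow", "organization_id": 12},
    {"id": 2, "name": "Evening Pilates", "organization_id": 12},
    {"id": 3, "name": "Spin Class", "organization_id": 7},
]

# Organization 12 changed its description, so its classes need re-embedding.
to_reembed = affected_records(classes, "organization_id", 12)
ids = [record["id"] for record in to_reembed]
```

In this example only classes 1 and 2 are selected; class 3 references a different organization and keeps its existing embedding.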
Error Handling
Supported Sources
Data enrichment works with any source that supports relationship lookups:
| Source | Lookup Method |
|---|---|
| PostgreSQL | SQL joins |
| MySQL | SQL joins |
| Salesforce | SOQL relationship queries |
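For the SQL sources, a join-based lookup could resemble the query below. This is a sketch under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL or MySQL, the `classes` and `categories` table layouts are invented for the example, and the query Sync or Swim actually issues may differ.

```python
import sqlite3

# In-memory SQLite database standing in for a PostgreSQL or MySQL source.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be converted to dictionaries
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
    CREATE TABLE classes (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
    INSERT INTO categories VALUES (5, 'Yoga', 'Traditional and modern yoga practices');
    INSERT INTO classes VALUES (1, 'Morning Yoga Flow', 5);
""")

# LEFT JOIN keeps the class row even if its category is missing;
# the column aliases play the role of the field prefix.
row = conn.execute(
    """
    SELECT c.name AS name,
           cat.name AS category_name,
           cat.description AS category_description
    FROM classes AS c
    LEFT JOIN categories AS cat ON cat.id = c.category_id
    WHERE c.id = ?
    """,
    (1,),
).fetchone()
enriched = dict(row)
conn.close()
```

The join only reads from the source tables, which is consistent with enrichment not changing the source database.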
Visibility and Debugging
The enriched payload is visible in the sync history, so you can verify exactly what data is being sent to your vector database. This makes it easy to debug search quality issues and ensure enrichment is working correctly.
Current Limitations
- Single-level relationships: Enrichment currently supports one level of relationships (direct foreign keys). Nested relationships (e.g., category → parent category) are on the roadmap.
- Vector targets only: Enrichment is designed for embedding generation and applies to vector database targets.