iMessage Pipeline Usage Guide
Version: 1.0
Last Updated: 2025-10-19
Project: iMessage Timeline Refactor
Table of Contents
- Quick Start
- Installation
- Environment Setup
- Pipeline Stages
- End-to-End Workflow
- Configuration Reference
- CLI Commands
- Advanced Usage
Quick Start
Prerequisites:
- Node.js ≥22.20
- pnpm package manager
- Gemini API key (for enrichment)
- Firecrawl API key (optional, for link enrichment)
Minimal workflow:
# 1. Install dependencies
pnpm install
# 2. Set up environment
cp .env.example .env
# Edit .env and add your API keys
# 3. Run the pipeline
pnpm dev ingest-csv --input messages.csv --output ingested.json
pnpm dev normalize-link --input ingested.json --output normalized.json
pnpm dev enrich-ai --input normalized.json --output enriched.json
pnpm dev render-markdown --input enriched.json --output-dir ./timeline
Installation
From Source
# Clone repository
git clone https://github.com/yourusername/chatline.git
cd chatline
# Install dependencies
pnpm install
# Build TypeScript
pnpm build
# Run tests (optional)
pnpm test
Package Manager
# Via npm
npm install -g @nathanvale/chatline
# Via pnpm
pnpm add -g @nathanvale/chatline
Environment Setup
Required Environment Variables
Create a .env file in the project root:
# Gemini API (Required for enrichment)
GEMINI_API_KEY=your_gemini_api_key_here
# Firecrawl API (Optional, for link enrichment)
FIRECRAWL_API_KEY=your_firecrawl_api_key_here
Obtaining API Keys
Gemini API Key
- Visit Google AI Studio
- Click "Create API Key"
- Copy the key and add it to .env
Pricing: The free tier allowed 15 requests per minute (RPM) at the time of writing; check Google AI Studio for current limits.
Firecrawl API Key (Optional)
- Visit Firecrawl
- Sign up for an account
- Navigate to API Keys section
- Copy the key and add it to .env
Note: Firecrawl is optional. Link enrichment will fall back to provider-specific parsers (YouTube, Spotify, Twitter) without it.
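As a rough illustration of the required/optional split above, a startup check might look like the following. This is a minimal sketch, not chatline's actual code: `checkEnrichmentKeys` and `describeAvailability` are hypothetical helper names, and the environment is passed in as a plain object so the snippet stays self-contained.

```typescript
// Hypothetical helpers (not part of chatline): report which enrichment
// features can run given the variables found in the environment.
type Env = Record<string, string | undefined>;

interface EnrichmentAvailability {
  gemini: boolean; // required for AI enrichment
  firecrawl: boolean; // optional; link enrichment falls back to provider parsers
}

function checkEnrichmentKeys(env: Env): EnrichmentAvailability {
  // A key counts as present only if it is set and not blank.
  const has = (name: string): boolean => (env[name] ?? '').trim().length > 0;
  return { gemini: has('GEMINI_API_KEY'), firecrawl: has('FIRECRAWL_API_KEY') };
}

function describeAvailability(a: EnrichmentAvailability): string[] {
  const notes: string[] = [];
  if (!a.gemini) {
    notes.push('GEMINI_API_KEY is missing: AI enrichment cannot run.');
  }
  if (!a.firecrawl) {
    notes.push('FIRECRAWL_API_KEY is missing: link enrichment uses provider-specific parsers only.');
  }
  return notes;
}
```

In a real script you would pass `process.env` as the argument and print the returned notes before starting the enrich-ai stage.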
Pipeline Stages
The pipeline consists of 4 main stages:
┌─────────────┐    ┌────────────────┐    ┌───────────┐    ┌─────────────────┐
│   Ingest    │ -> │ Normalize-Link │ -> │ Enrich-AI │ -> │ Render-Markdown │
└─────────────┘    └────────────────┘    └───────────┘    └─────────────────┘
   CSV or DB         Dedupe + Link        AI Analysis       Obsidian Format
Stage 1: Ingest
Purpose: Import messages from CSV (iMazing) or macOS Messages.app database.
Inputs:
- CSV file exported from iMazing
- OR macOS Messages.app database (chat.db)
Outputs: JSON array of normalized message objects
See: Ingest Commands for detailed usage
Stage 2: Normalize-Link
Purpose: Deduplicate messages from multiple sources and link replies/tapbacks.
Inputs: JSON messages from Stage 1
Outputs: Deduplicated, linked messages with:
- Reply threads (parent-child relationships)
- Tapback associations
- Unified GUIDs
- Absolute file paths
See: Normalize-Link Commands for detailed usage
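The two operations in this stage can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline's real algorithm: it assumes duplicates share a GUID (the actual stage may also match on content or use heuristics), and `dedupeByGuid` and `linkReplies` are hypothetical names. The `guid`, `threadTargetGuid`, and `replies` fields mirror the normalize-link output shown in this guide.

```typescript
interface Message {
  guid: string;
  text?: string;
  threadTargetGuid?: string | null; // GUID of the message being replied to
}

type LinkedMessage = Message & { replies: string[] };

// Keep the first occurrence of each GUID and drop later repeats.
function dedupeByGuid(messages: Message[]): Message[] {
  const seen = new Set<string>();
  return messages.filter((m) => {
    if (seen.has(m.guid)) return false;
    seen.add(m.guid);
    return true;
  });
}

// Fill each parent's `replies` array from its children's `threadTargetGuid`.
function linkReplies(messages: Message[]): LinkedMessage[] {
  const linked: LinkedMessage[] = messages.map((m) => ({ ...m, replies: [] }));
  const byGuid = new Map<string, LinkedMessage>();
  linked.forEach((m) => {
    byGuid.set(m.guid, m);
  });
  linked.forEach((m) => {
    const parent = m.threadTargetGuid ? byGuid.get(m.threadTargetGuid) : undefined;
    if (parent) parent.replies.push(m.guid);
  });
  return linked;
}
```

Replies whose parent is missing from the input are left unlinked rather than treated as errors, which seems the safer default when an export covers only part of a conversation.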
Stage 3: Enrich-AI
Purpose: Add AI-powered analysis to media messages.
Features:
- Image Analysis: HEIC/TIFF → JPG conversion + Gemini Vision captions
- Audio Transcription: Speech-to-text with speaker labels
- PDF Summarization: Extract key points from documents
- Video Metadata: Basic metadata extraction (no transcription)
- Link Enrichment: Fetch context from URLs (Firecrawl + fallbacks)
Inputs: Normalized messages from Stage 2
Outputs: Enriched messages with media.enrichment arrays
See: Enrich-AI Commands for detailed usage
Stage 4: Render-Markdown
Purpose: Generate Obsidian-compatible markdown timeline files.
Features:
- Group by date + time-of-day (Morning/Afternoon/Evening)
- Nested reply threads as blockquotes
- Tapback emoji reactions
- Image embeds with ![[wikilink]] syntax
- Transcriptions and link contexts as blockquotes
Inputs: Enriched messages from Stage 3
Outputs: Markdown files in output directory
See: Render-Markdown Commands for detailed usage
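The date and time-of-day grouping could be sketched like this. The bucket boundaries are taken from the headings in this guide's sample output (Morning 00:00 - 11:59, Afternoon 12:00 - 17:59, Evening 18:00 - 23:59). Two things are assumptions: the function names are hypothetical, and the sketch buckets by UTC time, whereas the real renderer may use local time.

```typescript
type TimeOfDay = 'Morning' | 'Afternoon' | 'Evening';

// Map an hour (0-23) to the bucket used in the rendered headings.
function timeOfDay(hour: number): TimeOfDay {
  if (hour < 12) return 'Morning';
  if (hour < 18) return 'Afternoon';
  return 'Evening';
}

interface TimedMessage {
  date: string; // ISO 8601 timestamp, as in the ingest output
  text: string;
}

type DayBuckets = Record<TimeOfDay, TimedMessage[]>;

// Group messages by calendar day (YYYY-MM-DD), then by time-of-day bucket.
function groupByDateAndTime(messages: TimedMessage[]): Record<string, DayBuckets> {
  const groups: Record<string, DayBuckets> = {};
  for (const m of messages) {
    const d = new Date(m.date);
    const day = d.toISOString().slice(0, 10); // UTC day (assumption)
    let buckets = groups[day];
    if (!buckets) {
      buckets = { Morning: [], Afternoon: [], Evening: [] };
      groups[day] = buckets;
    }
    buckets[timeOfDay(d.getUTCHours())].push(m);
  }
  return groups;
}
```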
End-to-End Workflow
Example: Full Pipeline with Real Data
This example demonstrates processing 3 months of iMessage history.
1. Export Messages from iMazing
- Connect iPhone to iMazing
- Navigate to "Messages" → Select contact
- Export → CSV format
- Save as melanie-messages.csv
2. Prepare Environment
# Create project directory
mkdir imessage-analysis
cd imessage-analysis
# Install chatline
pnpm add @nathanvale/chatline
# Setup environment variables
cat > .env << EOF
GEMINI_API_KEY=AIzaSyD...your_key_here
# Optional
FIRECRAWL_API_KEY=fc-...your_key_here
EOF
3. Run Ingest Stage
pnpm chatline ingest-csv \
--input melanie-messages.csv \
--output ingested.json \
--attachment-roots ~/Library/Messages/Attachments
Expected output:
✓ Parsed 2,847 messages from CSV
✓ Split into 3,104 message objects (text + media)
✓ Resolved 1,423 attachment paths
⚠ Missing files: 12 (logged to ingested-errors.json)
✓ Wrote ingested.json (4.2 MB)
4. Run Normalize-Link Stage
pnpm chatline normalize-link \
--input ingested.json \
--output normalized.json \
--verbose
Expected output:
✓ Loaded 3,104 messages
✓ Linked 847 replies (783 via DB, 64 via heuristics)
✓ Linked 234 tapbacks
✓ Deduplicated: 3,104 → 3,098 (6 duplicates removed)
✓ Validated all messages against schema
✓ Wrote normalized.json (4.1 MB)
5. Run Enrich-AI Stage
pnpm chatline enrich-ai \
--input normalized.json \
--output enriched.json \
--checkpoint-interval 50 \
--rate-limit 1000 \
--resume
Expected output:
✓ Loaded 3,098 messages (823 media attachments)
→ Enriching images: 542 HEIC/TIFF files
→ Converting HEIC to JPG previews... [============= ] 75% (407/542)
→ Analyzing with Gemini Vision... [============ ] 70% (380/542)
✓ Checkpoint written: enrich-checkpoint-500.json
→ Enriching audio: 134 M4A files
→ Transcribing with Gemini Audio... [=========== ] 65% (87/134)
✓ Checkpoint written: enrich-checkpoint-600.json
→ Enriching links: 89 URLs
→ Fetching contexts... [==================] 100% (89/89)
✓ Completed in 42min 18s
✓ Wrote enriched.json (6.7 MB)
Resume capability: If interrupted (Ctrl+C or API error), run the same
command with --resume to continue from the last checkpoint.
6. Run Render-Markdown Stage
pnpm chatline render-markdown \
--input enriched.json \
--output-dir ./timeline \
--date-range 2024-10-01:2024-12-31
Expected output:
✓ Loaded 3,098 enriched messages
✓ Grouping by date: 92 days with messages
✓ Rendering markdown files...
→ 2024-10-01.md (34 messages)
→ 2024-10-02.md (28 messages)
...
→ 2024-12-31.md (41 messages)
✓ Wrote 92 markdown files to ./timeline/
✓ Total size: 8.4 MB
7. View in Obsidian
- Open Obsidian
- File → Open Vault
- Select the ./timeline directory
- Navigate to any date file (e.g., 2024-10-15.md)
Result: Fully formatted timeline with:
- ✅ Nested reply threads
- ✅ Tapback emoji reactions
- ✅ Embedded images with captions
- ✅ Audio transcriptions
- ✅ Link previews with context
- ✅ Clickable deep-link anchors
Configuration Reference
Global Configuration File
Create imessage-config.json in project root:
{
  "attachmentRoots": [
    "/Users/yourname/Library/Messages/Attachments",
    "/Volumes/Backup/old-attachments"
  ],
  "enrichment": {
    "checkpointInterval": 100,
    "enableAudioTranscription": true,
    "enableLinkEnrichment": true,
    "enableVisionAnalysis": true,
    "forceRefresh": false,
    "imageCacheDir": "./.cache/images"
  },
  "firecrawl": {
    "apiKey": "${FIRECRAWL_API_KEY}",
    "enabled": true
  },
  "gemini": {
    "apiKey": "${GEMINI_API_KEY}",
    "maxRetries": 3,
    "model": "gemini-1.5-pro",
    "rateLimitDelay": 1000
  },
  "render": {
    "groupByTimeOfDay": true,
    "maxNestingDepth": 10,
    "renderRepliesAsNested": true,
    "renderTapbacksAsEmoji": true
  }
}
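The `${GEMINI_API_KEY}` and `${FIRECRAWL_API_KEY}` values in the config file are placeholders for environment variables. How chatline expands them is not documented here; the following is only a sketch of one plausible approach, with `expandPlaceholders` as a hypothetical helper name. Unknown variables are deliberately left untouched so that a missing key is easy to spot.

```typescript
// Hypothetical helper: recursively replace ${NAME} placeholders in string
// values with entries from `env`. Non-string values pass through unchanged.
function expandPlaceholders(
  value: unknown,
  env: Record<string, string | undefined>,
): unknown {
  if (typeof value === 'string') {
    return value.replace(
      /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
      (match: string, name: string) => env[name] ?? match,
    );
  }
  if (Array.isArray(value)) {
    return value.map((item) => expandPlaceholders(item, env));
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = expandPlaceholders(source[key], env);
    }
    return out;
  }
  return value; // numbers, booleans, null
}
```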
Configuration Precedence
- CLI flags (highest priority)
- Environment variables (.env file)
- Config file (imessage-config.json)
- Built-in defaults (lowest priority)
Example:
# Uses GEMINI_API_KEY from .env
pnpm dev enrich-ai --input messages.json --output enriched.json
# Overrides with CLI flag
pnpm dev enrich-ai --input messages.json --output enriched.json --gemini-api-key YOUR_KEY
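The precedence order can be expressed as a small resolver that returns the first defined value. This is an illustrative sketch only; `resolveSetting` is a hypothetical name and the real CLI may merge its sources differently.

```typescript
// Hypothetical resolver: check sources from highest to lowest priority and
// return the first one that is defined, else the built-in default.
function resolveSetting<T>(
  cliFlag: T | undefined,
  envVar: T | undefined,
  configFile: T | undefined,
  builtInDefault: T,
): T {
  for (const candidate of [cliFlag, envVar, configFile]) {
    if (candidate !== undefined) return candidate;
  }
  return builtInDefault;
}
```

For example, `resolveSetting(undefined, 'key-from-env', 'key-from-file', '')` yields `'key-from-env'`, matching the first command above, while passing a CLI value as the first argument overrides both.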
CLI Commands
1. ingest-csv
Import messages from iMazing CSV export.
Syntax:
chatline ingest-csv [options]
Options:
| Flag | Type | Default | Description |
|---|---|---|---|
| --input <path> | string | (required) | Path to CSV file |
| --output <path> | string | ingested.json | Output JSON file |
| --attachment-roots <paths...> | string[] | ~/Library/Messages/Attachments | Attachment search directories |
| --verbose | boolean | false | Detailed logging |
Example:
chatline ingest-csv \
--input ~/Desktop/export.csv \
--output ./data/ingested.json \
--attachment-roots ~/Library/Messages/Attachments /Volumes/Backup/attachments \
--verbose
Output: JSON file with structure:
{
  "messages": [
    {
      "date": "2024-10-15T14:23:45.000Z",
      "guid": "csv:1:0",
      "isFromMe": true,
      "messageKind": "text",
      "metadata": {
        "csvLineNumber": 2,
        "source": "csv"
      },
      "service": "iMessage",
      "text": "Hello world"
    }
  ],
  "metadata": {
    "createdAt": "2025-10-19T22:00:00.000Z",
    "source": "csv",
    "totalMessages": 1234,
    "version": "1.0"
  }
}
2. ingest-db
Import messages from macOS Messages.app database.
Syntax:
chatline ingest-db [options]
Options:
| Flag | Type | Default | Description |
|---|---|---|---|
| --db-path <path> | string | ~/Library/Messages/chat.db | Path to chat.db |
| --output <path> | string | ingested-db.json | Output JSON file |
| --contact <identifier> | string | (optional) | Filter by phone/email |
| --date-range <start:end> | string | (optional) | Date range (YYYY-MM-DD:YYYY-MM-DD) |
| --attachment-roots <paths...> | string[] | ~/Library/Messages/Attachments | Attachment directories |
Example:
chatline ingest-db \
--db-path ~/Library/Messages/chat.db \
--output ./data/ingested-db.json \
--contact "+61412345678" \
--date-range 2024-01-01:2024-12-31
Note: Requires Full Disk Access permission on macOS (System Settings → Privacy & Security → Full Disk Access on macOS 13 and later; System Preferences → Security & Privacy → Privacy → Full Disk Access on earlier versions).
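To make the --date-range format concrete, here is a sketch of a parser for YYYY-MM-DD:YYYY-MM-DD values. It is not chatline's implementation: `parseDateRange` is a hypothetical name, and treating the end date as inclusive and both dates as UTC are assumptions.

```typescript
interface DateRange {
  start: Date; // start of the first day (UTC)
  end: Date; // end of the last day (UTC), assumed inclusive
}

// Parse "YYYY-MM-DD:YYYY-MM-DD"; throws on malformed or reversed ranges.
function parseDateRange(value: string): DateRange {
  const match = /^(\d{4}-\d{2}-\d{2}):(\d{4}-\d{2}-\d{2})$/.exec(value);
  if (!match) {
    throw new Error(`Expected YYYY-MM-DD:YYYY-MM-DD, got "${value}"`);
  }
  const start = new Date(`${match[1]}T00:00:00.000Z`);
  const end = new Date(`${match[2]}T23:59:59.999Z`);
  // Rejects values that Date cannot parse at all, for example month 13.
  if (isNaN(start.getTime()) || isNaN(end.getTime())) {
    throw new Error(`Unparseable date in "${value}"`);
  }
  if (start.getTime() > end.getTime()) {
    throw new Error(`Range start is after range end in "${value}"`);
  }
  return { start, end };
}
```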
3. normalize-link
Deduplicate messages and link replies/tapbacks.
Syntax:
chatline normalize-link [options]
Options:
| Flag | Type | Default | Description |
|---|---|---|---|
| --input <path> | string | (required) | Input JSON from ingest stage |
| --output <path> | string | normalized.json | Output JSON file |
| --merge-csv-db | boolean | false | Merge CSV and DB sources |
| --csv-input <path> | string | (optional) | CSV JSON (if merging) |
| --db-input <path> | string | (optional) | DB JSON (if merging) |
| --verbose | boolean | false | Detailed logging |
Example (single source):
chatline normalize-link \
--input ingested.json \
--output normalized.json
Example (merge CSV + DB):
chatline normalize-link \
--merge-csv-db \
--csv-input ingested-csv.json \
--db-input ingested-db.json \
--output normalized-merged.json \
--verbose
Output: Normalized JSON with linked messages:
{
  "messages": [
    {
      "guid": "abc-123",
      "messageKind": "text",
      "replies": ["def-456"],
      "text": "What do you think?",
      "threadTargetGuid": null
    },
    {
      "guid": "def-456",
      "messageKind": "text",
      "text": "Sounds great!",
      "threadTargetGuid": "abc-123"
    }
  ],
  "metadata": {
    "duplicatesRemoved": 6,
    "inputCount": 3104,
    "outputCount": 3098,
    "repliesLinked": 847,
    "source": "normalize-link",
    "tapbacksLinked": 234
  }
}
4. enrich-ai
Add AI-powered analysis to media messages.
Syntax:
chatline enrich-ai [options]
Options:
| Flag | Type | Default | Description |
|---|---|---|---|
| --input <path> | string | (required) | Normalized JSON |
| --output <path> | string | enriched.json | Output JSON |
| --checkpoint-interval <n> | number | 100 | Write checkpoint every N items |
| --resume | boolean | false | Resume from last checkpoint |
| --force-refresh | boolean | false | Re-enrich existing enrichments |
| --rate-limit <ms> | number | 1000 | Delay between API calls (ms) |
| --max-retries <n> | number | 3 | Retry attempts for 5xx errors |
| --gemini-api-key <key> | string | $GEMINI_API_KEY | Override API key |
| --enable-vision | boolean | true | Enable image analysis |
| --enable-audio | boolean | true | Enable transcription |
| --enable-links | boolean | true | Enable link enrichment |
| --image-cache-dir <path> | string | ./.cache/images | Preview cache directory |
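The interaction between --rate-limit and --max-retries can be pictured with a small wrapper like the one below. It is a sketch under assumptions, not the tool's actual retry logic: `callWithRetry` and `statusOf` are hypothetical names, errors are assumed to carry a numeric `status` property, and the same fixed delay is assumed between retries (the real implementation may use backoff). The `sleep` function is injected so the example can run without real waiting.

```typescript
interface RetryOptions {
  maxRetries: number; // mirrors --max-retries
  rateLimitMs: number; // mirrors --rate-limit
  sleep: (ms: number) => Promise<void>;
}

// Read a numeric `status` from an unknown error value, if it has one.
function statusOf(err: unknown): number | undefined {
  if (typeof err === 'object' && err !== null && 'status' in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === 'number' ? status : undefined;
  }
  return undefined;
}

// Retry `call` on 5xx errors only, waiting `rateLimitMs` between attempts.
// Other errors, and 5xx errors after the last retry, are rethrown.
async function callWithRetry<T>(
  call: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const status = statusOf(err);
      const retryable = status !== undefined && status >= 500 && status < 600;
      if (!retryable || attempt >= opts.maxRetries) throw err;
      await opts.sleep(opts.rateLimitMs);
    }
  }
}
```

In production code `sleep` would typically be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`.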
Example (standard run):
chatline enrich-ai \
--input normalized.json \
--output enriched.json \
--checkpoint-interval 50 \
--rate-limit 1500
Example (resume after interruption):
chatline enrich-ai \
--input normalized.json \
--output enriched.json \
--resume
Example (disable certain enrichments):
chatline enrich-ai \
--input normalized.json \
--output enriched.json \
--enable-audio false \
--enable-links false
Checkpoint files: Automatically created at ./checkpoints/enrich-checkpoint-<index>.json
Resume behavior: Continues from the last successfully enriched message. Because enrichment is idempotent, resuming does not create duplicate enrichments.
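The checkpoint and resume behaviour described above might be structured roughly as follows. This is a sketch, not the actual implementation: `processWithCheckpoints`, `Checkpoint`, and `CheckpointStore` are hypothetical names, and the store is an interface so the example does not depend on the real checkpoint file format.

```typescript
interface Checkpoint {
  lastIndex: number; // index of the last item covered by this checkpoint
}

interface CheckpointStore {
  load(): Checkpoint | undefined;
  save(checkpoint: Checkpoint): void;
}

// Process items in order, saving a checkpoint every `interval` items.
// With `resume` set, items up to the stored index are skipped.
// Returns the number of items handled in this run.
function processWithCheckpoints<T>(
  items: T[],
  handle: (item: T, index: number) => void,
  store: CheckpointStore,
  interval: number,
  resume: boolean,
): number {
  const previous = resume ? store.load() : undefined;
  const start = previous ? previous.lastIndex + 1 : 0;
  let processed = 0;
  items.slice(start).forEach((item, offset) => {
    const index = start + offset;
    handle(item, index);
    processed++;
    if ((index + 1) % interval === 0) store.save({ lastIndex: index });
  });
  return processed;
}
```

Note that items handled after the most recent checkpoint but before an interruption are handled again on resume. That is the case where idempotent enrichment matters.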
5. render-markdown
Generate Obsidian-compatible markdown timeline files.
Syntax:
chatline render-markdown [options]
Options:
| Flag | Type | Default | Description |
|---|---|---|---|
| --input <path> | string | (required) | Enriched JSON |
| --output-dir <path> | string | ./timeline | Output directory |
| --date-range <start:end> | string | (all dates) | Filter by date range |
| --group-by-time | boolean | true | Group by time-of-day |
| --nested-replies | boolean | true | Render replies as nested |
| --max-nesting-depth <n> | number | 10 | Max reply nesting levels |
Example:
chatline render-markdown \
--input enriched.json \
--output-dir ./timeline \
--date-range 2024-10-01:2024-10-31
Output Structure:
timeline/
├── 2024-10-01.md
├── 2024-10-02.md
├── ...
└── 2024-10-31.md
Sample Output (2024-10-15.md):
# 2024-10-15
## Morning (00:00 - 11:59)
### 08:23 - You
Good morning! ☀️
### 08:25 - Melanie
Morning! How'd you sleep?
> **You** (08:26): Really well, thanks! 😊
❤️ Melanie
## Afternoon (12:00 - 17:59)
### 14:30 - Melanie
Check out this photo!
![[IMG_1234.heic]]
> _Gemini Vision Analysis_: A beautiful sunset over the ocean with vibrant
> orange and pink hues reflecting on the water.
## Evening (18:00 - 23:59)
### 19:45 - You
Let's meet at this place: https://example.com/restaurant
> _Link Context_:
> **[Best Restaurant in Town - Example Dining](https://example.com/restaurant)**
> Award-winning restaurant featuring modern Australian cuisine with seasonal
> menus and waterfront views.
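The nested reply rendering in the sample above, together with --max-nesting-depth, could be sketched as a small recursive function. This is an illustration only: `renderReplies` and the `ThreadMessage` shape are hypothetical, reply links are assumed to form a tree (no cycles), and flattening replies deeper than the limit to the deepest allowed level is an assumption about what the limit does.

```typescript
interface ThreadMessage {
  guid: string;
  sender: string;
  time: string; // "HH:MM"
  text: string;
  replies: string[]; // GUIDs of direct replies
}

// Render the replies under `parent` as blockquote lines, adding one "> "
// per nesting level, capped at `maxDepth` levels.
function renderReplies(
  parent: ThreadMessage,
  byGuid: Map<string, ThreadMessage>,
  maxDepth: number,
  depth: number = 1,
): string[] {
  const lines: string[] = [];
  const prefix = '> '.repeat(Math.min(depth, maxDepth));
  for (const guid of parent.replies) {
    const reply = byGuid.get(guid);
    if (!reply) continue; // skip replies missing from the index
    lines.push(`${prefix}**${reply.sender}** (${reply.time}): ${reply.text}`);
    lines.push(...renderReplies(reply, byGuid, maxDepth, depth + 1));
  }
  return lines;
}
```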
Advanced Usage
Batch Processing Multiple Contacts
#!/bin/bash
# process-all-contacts.sh
# Stop on the first failing command, unset variable, or failed pipe
set -euo pipefail

# Each entry is "<phone or email>:<short name used in file names>"
CONTACTS=(
  "+61412345678:melanie"
  "+61498765432:john"
  "alice@example.com:alice"
)

mkdir -p data

for contact_pair in "${CONTACTS[@]}"; do
  IFS=':' read -r contact name <<< "$contact_pair"
  echo "Processing $name ($contact)..."

  # Ingest from DB
  chatline ingest-db \
    --contact "$contact" \
    --output "data/${name}-ingested.json"

  # Normalize
  chatline normalize-link \
    --input "data/${name}-ingested.json" \
    --output "data/${name}-normalized.json"

  # Enrich
  chatline enrich-ai \
    --input "data/${name}-normalized.json" \
    --output "data/${name}-enriched.json" \
    --checkpoint-interval 50

  # Render
  chatline render-markdown \
    --input "data/${name}-enriched.json" \
    --output-dir "timeline/${name}"

  echo "✓ Completed $name"
done

echo "✓ All contacts processed"
Incremental Updates (Daily Sync)
#!/bin/bash
# daily-sync.sh
DATE=$(date +%Y-%m-%d)
# macOS ships BSD date, which uses -v-1d rather than GNU's -d "yesterday"
YESTERDAY=$(date -v-1d +%Y-%m-%d)
# Export yesterday's messages only
chatline ingest-db \
--date-range "$YESTERDAY:$DATE" \
--output "data/daily-$DATE.json"
# Merge with existing normalized data
chatline normalize-link \
--merge-csv-db \
--csv-input "data/normalized-master.json" \
--db-input "data/daily-$DATE.json" \
--output "data/normalized-master.json"
# Enrich new messages only (idempotency handles duplicates)
chatline enrich-ai \
--input "data/normalized-master.json" \
--output "data/enriched-master.json"
# Render updated timeline
chatline render-markdown \
--input "data/enriched-master.json" \
--output-dir "./timeline" \
--date-range "$YESTERDAY:$DATE"
echo "✓ Daily sync complete for $DATE"
Custom Enrichment Providers
You can extend enrichment with custom providers:
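// custom-provider.ts
import type { MediaEnrichment } from '@nathanvale/chatline'

// Send the media path to an external analysis API and map the response
// onto the MediaEnrichment shape.
export async function enrichWithCustomAPI(
  mediaPath: string,
): Promise<MediaEnrichment> {
  const response = await fetch('https://your-api.com/analyze', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ path: mediaPath }),
  })
  // Fail loudly instead of storing an enrichment built from an error body
  if (!response.ok) {
    throw new Error(`Custom API request failed: HTTP ${response.status}`)
  }
  const data = (await response.json()) as { summary: string }
  return {
    kind: 'custom_analysis',
    summary: data.summary,
    provider: 'custom-api',
    model: 'custom-v1',
    version: '1.0',
    createdAt: new Date().toISOString(),
  }
}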
Then integrate in your workflow:
import { readFileSync } from 'node:fs'
import { renderMessages } from '@nathanvale/chatline'
import { enrichWithCustomAPI } from './custom-provider'

// Load normalized messages (the stage output wraps them in a top-level "messages" array)
const { messages } = JSON.parse(readFileSync('normalized.json', 'utf-8'))

// Apply custom enrichment to image attachments
for (const msg of messages) {
  if (msg.messageKind === 'media' && msg.media?.mediaKind === 'image') {
    const enrichment = await enrichWithCustomAPI(msg.media.path)
    msg.media.enrichment = msg.media.enrichment || []
    msg.media.enrichment.push(enrichment)
  }
}

// Continue with standard render
const markdown = renderMessages(messages)
Next Steps
- Troubleshooting: See Troubleshooting Guide
- API Reference: See Technical Specification
- Implementation Notes: See Refactor Report
Document Version: 1.0
Author: Generated from iMessage Pipeline implementation
Last Updated: 2025-10-19