
iMessage Pipeline Usage Guide

Version: 1.0 | Last Updated: 2025-10-19 | Project: iMessage Timeline Refactor


Table of Contents

  1. Quick Start
  2. Installation
  3. Environment Setup
  4. Pipeline Stages
  5. End-to-End Workflow
  6. Configuration Reference
  7. CLI Commands
  8. Advanced Usage

Quick Start

Prerequisites:

  • Node.js ≥22.20
  • pnpm package manager
  • Gemini API key (for enrichment)
  • Firecrawl API key (optional, for link enrichment)

Minimal workflow:

# 1. Install dependencies
pnpm install

# 2. Set up environment
cp .env.example .env
# Edit .env and add your API keys

# 3. Run the pipeline
pnpm dev ingest-csv --input messages.csv --output ingested.json
pnpm dev normalize-link --input ingested.json --output normalized.json
pnpm dev enrich-ai --input normalized.json --output enriched.json
pnpm dev render-markdown --input enriched.json --output-dir ./timeline

Installation

From Source

# Clone repository
git clone https://github.com/yourusername/chatline.git
cd chatline

# Install dependencies
pnpm install

# Build TypeScript
pnpm build

# Run tests (optional)
pnpm test

Package Manager

# Via npm
npm install -g @nathanvale/chatline

# Via pnpm
pnpm add -g @nathanvale/chatline

Environment Setup

Required Environment Variables

Create a .env file in the project root:

# Gemini API (Required for enrichment)
GEMINI_API_KEY=your_gemini_api_key_here

# Firecrawl API (Optional, for link enrichment)
FIRECRAWL_API_KEY=your_firecrawl_api_key_here

Obtaining API Keys

Gemini API Key

  1. Visit Google AI Studio
  2. Click "Create API Key"
  3. Copy the key and add to .env

Pricing: At the time of writing, the free tier allows 15 requests per minute (RPM). Quotas change, so check Google AI Studio for current limits.

Firecrawl API Key (Optional)

  1. Visit Firecrawl
  2. Sign up for an account
  3. Navigate to API Keys section
  4. Copy the key and add to .env

Note: Firecrawl is optional. Link enrichment will fall back to provider-specific parsers (YouTube, Spotify, Twitter) without it.
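
The fallback can be pictured as a lookup from a URL's hostname to a provider-specific parser. The sketch below is illustrative only: `detectProvider`, the `LinkProvider` type, and the host lists are assumptions made for this example, not chatline's actual parser registry.

```typescript
// Hypothetical sketch: choose a provider-specific parser from a URL's hostname.
// Provider names come from the note above; the host lists are assumptions.
type LinkProvider = 'youtube' | 'spotify' | 'twitter' | 'generic'

const PROVIDER_HOSTS: Array<[LinkProvider, string[]]> = [
  ['youtube', ['youtube.com', 'youtu.be']],
  ['spotify', ['open.spotify.com']],
  ['twitter', ['twitter.com', 'x.com']],
]

function detectProvider(rawUrl: string): LinkProvider {
  // Extract the hostname without relying on a URL global.
  const match = /^https?:\/\/([^\/?#:]+)/i.exec(rawUrl)
  if (!match) return 'generic' // not an http(s) URL

  const host = match[1].toLowerCase().replace(/^www\./, '')
  for (const [provider, hosts] of PROVIDER_HOSTS) {
    // Match the host itself or any subdomain of it.
    if (hosts.some((h) => host === h || host.endsWith('.' + h))) {
      return provider
    }
  }
  return 'generic'
}
```

A URL that resolves to `'generic'` would be the case where Firecrawl, if configured, adds the most value.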


Pipeline Stages

The pipeline consists of 4 main stages:

┌───────────┐    ┌────────────────┐    ┌─────────────┐    ┌─────────────────┐
│  Ingest   │ -> │ Normalize-Link │ -> │  Enrich-AI  │ -> │ Render-Markdown │
└───────────┘    └────────────────┘    └─────────────┘    └─────────────────┘
  CSV or DB        Dedupe + Link         AI Analysis        Obsidian Format

Stage 1: Ingest

Purpose: Import messages from CSV (iMazing) or macOS Messages.app database.

Inputs:

  • CSV file exported from iMazing
  • OR macOS Messages.app database (chat.db)

Outputs: JSON array of normalized message objects

See: Ingest Commands for detailed usage


Stage 2: Normalize-Link

Purpose: Deduplicate messages from multiple sources and link replies/tapbacks.

Inputs: JSON messages from Stage 1

Outputs: Deduplicated, linked messages with:

  • Reply threads (parent-child relationships)
  • Tapback associations
  • Unified GUIDs
  • Absolute file paths

See: Normalize-Link Commands for detailed usage
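
As a rough illustration of what deduplication and reply linking involve, here is a minimal sketch. It assumes duplicates share a GUID and that replies point at their parent through `threadTargetGuid` (both field names appear in the normalize-link output format). The real stage also uses heuristics and handles tapbacks, which this sketch omits. `dedupeAndLink` is a hypothetical name.

```typescript
// Minimal sketch, not the actual normalize-link implementation.
interface Message {
  guid: string
  text?: string
  threadTargetGuid?: string | null
  replies?: string[]
}

function dedupeAndLink(input: Message[]): Message[] {
  // 1. Deduplicate: keep the first message seen for each GUID.
  const byGuid = new Map<string, Message>()
  for (const msg of input) {
    if (!byGuid.has(msg.guid)) {
      byGuid.set(msg.guid, { ...msg, replies: [] })
    }
  }

  // 2. Link: record each reply's GUID on its parent message.
  const output = Array.from(byGuid.values())
  for (const msg of output) {
    const parent = msg.threadTargetGuid
      ? byGuid.get(msg.threadTargetGuid)
      : undefined
    if (parent && parent.replies) {
      parent.replies.push(msg.guid)
    }
  }
  return output
}
```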


Stage 3: Enrich-AI

Purpose: Add AI-powered analysis to media messages.

Features:

  • Image Analysis: HEIC/TIFF → JPG conversion + Gemini Vision captions
  • Audio Transcription: Speech-to-text with speaker labels
  • PDF Summarization: Extract key points from documents
  • Video Metadata: Basic metadata extraction (no transcription)
  • Link Enrichment: Fetch context from URLs (Firecrawl + fallbacks)

Inputs: Normalized messages from Stage 2

Outputs: Enriched messages with media.enrichment arrays

See: Enrich-AI Commands for detailed usage


Stage 4: Render-Markdown

Purpose: Generate Obsidian-compatible markdown timeline files.

Features:

  • Group by date + time-of-day (Morning/Afternoon/Evening)
  • Nested reply threads as blockquotes
  • Tapback emoji reactions
  • Image embeds with ![[wikilink]] syntax
  • Transcriptions and link contexts as blockquotes

Inputs: Enriched messages from Stage 3

Outputs: Markdown files in output directory

See: Render-Markdown Commands for detailed usage
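
The date and time-of-day grouping can be sketched as follows. The hour boundaries match the section headings in the render-markdown sample output (Morning 00:00 - 11:59, Afternoon 12:00 - 17:59, Evening 18:00 - 23:59). The function names are hypothetical, and UTC is used only to keep the example deterministic; the real renderer may well group by local time.

```typescript
// Illustrative sketch of grouping messages by day and time-of-day.
type TimeOfDay = 'Morning' | 'Afternoon' | 'Evening'

interface TimedMessage {
  date: string // ISO 8601 timestamp
  text: string
}

function timeOfDay(hour: number): TimeOfDay {
  if (hour < 12) return 'Morning'
  if (hour < 18) return 'Afternoon'
  return 'Evening'
}

function groupByDayAndTime(
  messages: TimedMessage[],
): Map<string, TimedMessage[]> {
  const groups = new Map<string, TimedMessage[]>()
  for (const msg of messages) {
    const when = new Date(msg.date)
    const day = when.toISOString().slice(0, 10) // YYYY-MM-DD in UTC
    const key = `${day} ${timeOfDay(when.getUTCHours())}`
    const bucket = groups.get(key) || []
    bucket.push(msg)
    groups.set(key, bucket)
  }
  return groups
}
```

Each map key, such as `2024-10-15 Morning`, would correspond to one section of one daily markdown file.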


End-to-End Workflow

Example: Full Pipeline with Real Data

This example demonstrates processing 3 months of iMessage history.

1. Export Messages from iMazing

  1. Connect iPhone to iMazing
  2. Navigate to "Messages" → Select contact
  3. Export → CSV format
  4. Save as melanie-messages.csv

2. Prepare Environment

# Create project directory
mkdir imessage-analysis
cd imessage-analysis

# Install chatline
pnpm add @nathanvale/chatline

# Setup environment variables
cat > .env << 'EOF'
GEMINI_API_KEY=AIzaSyD...your_key_here
# Optional
FIRECRAWL_API_KEY=fc-...your_key_here
EOF

3. Run Ingest Stage

pnpm chatline ingest-csv \
--input melanie-messages.csv \
--output ingested.json \
--attachment-roots ~/Library/Messages/Attachments

Expected output:

✓ Parsed 2,847 messages from CSV
✓ Split into 3,104 message objects (text + media)
✓ Resolved 1,423 attachment paths
⚠ Missing files: 12 (logged to ingested-errors.json)
✓ Wrote ingested.json (4.2 MB)

4. Run Normalize-Link Stage

pnpm chatline normalize-link \
--input ingested.json \
--output normalized.json \
--verbose

Expected output:

✓ Loaded 3,104 messages
✓ Linked 847 replies (783 via DB, 64 via heuristics)
✓ Linked 234 tapbacks
✓ Deduplicated: 3,104 → 3,098 (6 duplicates removed)
✓ Validated all messages against schema
✓ Wrote normalized.json (4.1 MB)

5. Run Enrich-AI Stage

pnpm chatline enrich-ai \
--input normalized.json \
--output enriched.json \
--checkpoint-interval 50 \
--rate-limit 1000 \
--resume

Expected output:

✓ Loaded 3,098 messages (823 media attachments)
→ Enriching images: 542 HEIC/TIFF files
→ Converting HEIC to JPG previews... [============= ] 75% (407/542)
→ Analyzing with Gemini Vision... [============ ] 70% (380/542)
✓ Checkpoint written: enrich-checkpoint-500.json
→ Enriching audio: 134 M4A files
→ Transcribing with Gemini Audio... [=========== ] 65% (87/134)
✓ Checkpoint written: enrich-checkpoint-600.json
→ Enriching links: 89 URLs
→ Fetching contexts... [==================] 100% (89/89)
✓ Completed in 42min 18s
✓ Wrote enriched.json (6.7 MB)

Resume capability: If interrupted (Ctrl+C or API error), run the same command with --resume to continue from the last checkpoint.
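
To make the checkpoint and resume behavior concrete, here is a simplified sketch of the general pattern. It is synchronous and keeps storage behind a small `CheckpointStore` interface; both are assumptions made for the example, as are all names in it. The actual enrich-ai stage is asynchronous and writes checkpoint files to disk.

```typescript
// Simplified sketch of checkpointed processing; not chatline's implementation.
interface Checkpoint {
  nextIndex: number // index of the first item that still needs processing
}

interface CheckpointStore {
  load(): Checkpoint | undefined
  save(checkpoint: Checkpoint): void
}

function processWithCheckpoints<T>(
  items: T[],
  handle: (item: T) => void,
  store: CheckpointStore,
  interval: number,
  resume: boolean,
): void {
  // When resuming, start from the last saved position instead of zero.
  const saved = resume ? store.load() : undefined
  const start = saved ? saved.nextIndex : 0

  for (let i = start; i < items.length; i++) {
    handle(items[i])
    const done = i + 1
    // Persist progress every `interval` items and once more at the end.
    if (done % interval === 0 || done === items.length) {
      store.save({ nextIndex: done })
    }
  }
}
```

Note that items handled after the most recent checkpoint but before an interruption are handled again on resume. That is why the enrichment step itself needs to be idempotent.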

6. Run Render-Markdown Stage

pnpm chatline render-markdown \
--input enriched.json \
--output-dir ./timeline \
--date-range 2024-10-01:2024-12-31

Expected output:

✓ Loaded 3,098 enriched messages
✓ Grouping by date: 92 days with messages
✓ Rendering markdown files...
→ 2024-10-01.md (34 messages)
→ 2024-10-02.md (28 messages)
...
→ 2024-12-31.md (41 messages)
✓ Wrote 92 markdown files to ./timeline/
✓ Total size: 8.4 MB

7. View in Obsidian

  1. Open Obsidian
  2. File → Open Vault
  3. Select ./timeline directory
  4. Navigate to any date file (e.g., 2024-10-15.md)

Result: Fully formatted timeline with:

  • ✅ Nested reply threads
  • ✅ Tapback emoji reactions
  • ✅ Embedded images with captions
  • ✅ Audio transcriptions
  • ✅ Link previews with context
  • ✅ Clickable deep-link anchors

Configuration Reference

Global Configuration File

Create imessage-config.json in project root:

{
  "attachmentRoots": [
    "/Users/yourname/Library/Messages/Attachments",
    "/Volumes/Backup/old-attachments"
  ],
  "enrichment": {
    "checkpointInterval": 100,
    "enableAudioTranscription": true,
    "enableLinkEnrichment": true,
    "enableVisionAnalysis": true,
    "forceRefresh": false,
    "imageCacheDir": "./.cache/images"
  },
  "firecrawl": {
    "apiKey": "${FIRECRAWL_API_KEY}",
    "enabled": true
  },
  "gemini": {
    "apiKey": "${GEMINI_API_KEY}",
    "maxRetries": 3,
    "model": "gemini-1.5-pro",
    "rateLimitDelay": 1000
  },
  "render": {
    "groupByTimeOfDay": true,
    "maxNestingDepth": 10,
    "renderRepliesAsNested": true,
    "renderTapbacksAsEmoji": true
  }
}

Configuration Precedence

  1. CLI flags (highest priority)
  2. Environment variables (.env file)
  3. Config file (imessage-config.json)
  4. Built-in defaults (lowest priority)

Example:

# Uses GEMINI_API_KEY from .env
pnpm dev enrich-ai --input messages.json --output enriched.json

# Overrides with CLI flag
pnpm dev enrich-ai --input messages.json --output enriched.json --gemini-api-key YOUR_KEY
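
One way to picture the precedence order is as a layered merge in which each higher-priority source overrides only the keys it actually sets. The sketch below assumes a flat settings object and a hypothetical `resolveSettings` helper; chatline's real configuration loader may be structured differently.

```typescript
// Illustrative sketch of configuration precedence via object spread.
type Settings = { [key: string]: string | number | boolean }

function resolveSettings(
  defaults: Settings,   // 4. built-in defaults (lowest priority)
  configFile: Settings, // 3. imessage-config.json
  env: Settings,        // 2. environment variables
  cliFlags: Settings,   // 1. CLI flags (highest priority)
): Settings {
  // Later spreads win, so sources are listed from lowest to highest priority.
  return { ...defaults, ...configFile, ...env, ...cliFlags }
}

const effective = resolveSettings(
  { model: 'gemini-1.5-pro', rateLimitDelay: 1000 },
  { rateLimitDelay: 1500 },
  { apiKey: 'key-from-env' },
  { apiKey: 'key-from-cli' },
)
console.log(effective)
```

Here the API key comes from the CLI flag, the rate-limit delay from the config file, and the model from the defaults.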

CLI Commands

1. ingest-csv

Import messages from iMazing CSV export.

Syntax:

chatline ingest-csv [options]

Options:

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --input <path> | string | (required) | Path to CSV file |
| --output <path> | string | ingested.json | Output JSON file |
| --attachment-roots <paths...> | string[] | ~/Library/Messages/Attachments | Attachment search directories |
| --verbose | boolean | false | Detailed logging |

Example:

chatline ingest-csv \
--input ~/Desktop/export.csv \
--output ./data/ingested.json \
--attachment-roots ~/Library/Messages/Attachments /Volumes/Backup/attachments \
--verbose

Output: JSON file with structure:

{
  "messages": [
    {
      "date": "2024-10-15T14:23:45.000Z",
      "guid": "csv:1:0",
      "isFromMe": true,
      "messageKind": "text",
      "metadata": {
        "csvLineNumber": 2,
        "source": "csv"
      },
      "service": "iMessage",
      "text": "Hello world"
    }
  ],
  "metadata": {
    "createdAt": "2025-10-19T22:00:00.000Z",
    "source": "csv",
    "totalMessages": 1234,
    "version": "1.0"
  }
}
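
For scripts that consume this file, the structure above can be described with a pair of TypeScript interfaces. These types are inferred from the sample only, so treat optional fields and the `parseIngested` helper as assumptions rather than the package's published types.

```typescript
// Types inferred from the sample output above; not the package's own typings.
interface IngestedMessage {
  guid: string
  date: string // ISO 8601 timestamp
  isFromMe: boolean
  messageKind: string
  service: string
  text?: string
  metadata: { source: string; csvLineNumber?: number }
}

interface IngestedFile {
  messages: IngestedMessage[]
  metadata: {
    createdAt: string
    source: string
    totalMessages: number
    version: string
  }
}

function parseIngested(json: string): IngestedFile {
  const data = JSON.parse(json) as IngestedFile
  // Minimal shape check: the messages live under a top-level "messages" key.
  if (!data || !Array.isArray(data.messages)) {
    throw new Error('Expected an object with a "messages" array')
  }
  return data
}
```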

2. ingest-db

Import messages from macOS Messages.app database.

Syntax:

chatline ingest-db [options]

Options:

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --db-path <path> | string | ~/Library/Messages/chat.db | Path to chat.db |
| --output <path> | string | ingested-db.json | Output JSON file |
| --contact <identifier> | string | (optional) | Filter by phone/email |
| --date-range <start:end> | string | (optional) | Date range (YYYY-MM-DD:YYYY-MM-DD) |
| --attachment-roots <paths...> | string[] | ~/Library/Messages/Attachments | Attachment directories |

Example:

chatline ingest-db \
--db-path ~/Library/Messages/chat.db \
--output ./data/ingested-db.json \
--contact "+61412345678" \
--date-range 2024-01-01:2024-12-31

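Note: Requires Full Disk Access for the terminal application that runs the command (System Settings → Privacy & Security → Full Disk Access on macOS 13 and later; System Preferences → Security & Privacy → Privacy → Full Disk Access on earlier versions).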
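
The `--date-range` value uses the form `YYYY-MM-DD:YYYY-MM-DD`. The sketch below shows one way such a value could be validated and applied. `parseDateRange` and `inRange` are hypothetical helpers, and treating the end date as inclusive and comparing on the UTC date are assumptions, not documented behavior.

```typescript
// Illustrative parsing of a YYYY-MM-DD:YYYY-MM-DD range; not chatline's code.
interface DateRange {
  start: string // YYYY-MM-DD
  end: string   // YYYY-MM-DD, assumed inclusive
}

function parseDateRange(value: string): DateRange {
  const match = /^(\d{4}-\d{2}-\d{2}):(\d{4}-\d{2}-\d{2})$/.exec(value)
  if (!match) {
    throw new Error(`Expected YYYY-MM-DD:YYYY-MM-DD, got "${value}"`)
  }
  const start = match[1]
  const end = match[2]
  // ISO dates sort correctly as plain strings.
  if (start > end) {
    throw new Error('Start date must not be after end date')
  }
  return { start, end }
}

function inRange(isoTimestamp: string, range: DateRange): boolean {
  const day = isoTimestamp.slice(0, 10) // date part of an ISO 8601 timestamp
  return day >= range.start && day <= range.end
}
```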


3. normalize-link

Deduplicate messages and link replies/tapbacks.

Syntax:

chatline normalize-link [options]

Options:

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --input <path> | string | (required) | Input JSON from ingest stage |
| --output <path> | string | normalized.json | Output JSON file |
| --merge-csv-db | boolean | false | Merge CSV and DB sources |
| --csv-input <path> | string | (optional) | CSV JSON (if merging) |
| --db-input <path> | string | (optional) | DB JSON (if merging) |
| --verbose | boolean | false | Detailed logging |

Example (single source):

chatline normalize-link \
--input ingested.json \
--output normalized.json

Example (merge CSV + DB):

chatline normalize-link \
--merge-csv-db \
--csv-input ingested-csv.json \
--db-input ingested-db.json \
--output normalized-merged.json \
--verbose

Output: Normalized JSON with linked messages:

{
  "messages": [
    {
      "guid": "abc-123",
      "messageKind": "text",
      "replies": ["def-456"],
      "text": "What do you think?",
      "threadTargetGuid": null
    },
    {
      "guid": "def-456",
      "messageKind": "text",
      "text": "Sounds great!",
      "threadTargetGuid": "abc-123"
    }
  ],
  "metadata": {
    "duplicatesRemoved": 6,
    "inputCount": 3104,
    "outputCount": 3098,
    "repliesLinked": 847,
    "source": "normalize-link",
    "tapbacksLinked": 234
  }
}

4. enrich-ai

Add AI-powered analysis to media messages.

Syntax:

chatline enrich-ai [options]

Options:

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --input <path> | string | (required) | Normalized JSON |
| --output <path> | string | enriched.json | Output JSON |
| --checkpoint-interval <n> | number | 100 | Write checkpoint every N items |
| --resume | boolean | false | Resume from last checkpoint |
| --force-refresh | boolean | false | Re-enrich existing enrichments |
| --rate-limit <ms> | number | 1000 | Delay between API calls (ms) |
| --max-retries <n> | number | 3 | Retry attempts for 5xx errors |
| --gemini-api-key <key> | string | $GEMINI_API_KEY | Override API key |
| --enable-vision | boolean | true | Enable image analysis |
| --enable-audio | boolean | true | Enable transcription |
| --enable-links | boolean | true | Enable link enrichment |
| --image-cache-dir <path> | string | ./.cache/images | Preview cache directory |

Example (standard run):

chatline enrich-ai \
--input normalized.json \
--output enriched.json \
--checkpoint-interval 50 \
--rate-limit 1500
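
The `--rate-limit` and `--max-retries` options suggest a pattern along the following lines: wait a fixed delay between attempts and retry only when the server responds with a 5xx status. This is a sketch of that general pattern under stated assumptions. The function names, the `ApiResponse` shape, and the fixed (non-exponential) delay are all assumptions, not chatline's actual retry logic.

```typescript
// Illustrative retry loop for 5xx responses; not chatline's implementation.
interface ApiResponse {
  status: number
}

function isServerError(status: number): boolean {
  return status >= 500 && status <= 599
}

async function callWithRetry(
  call: () => Promise<ApiResponse>,
  maxRetries: number,
  delayMs: number,
  sleep: (ms: number) => Promise<void>, // injected so callers choose the timer
): Promise<ApiResponse> {
  let response = await call()
  // Retry only on server errors, up to maxRetries additional attempts.
  for (
    let attempt = 1;
    attempt <= maxRetries && isServerError(response.status);
    attempt++
  ) {
    await sleep(delayMs)
    response = await call()
  }
  return response
}
```

In a Node.js script, `sleep` could be a thin wrapper around `setTimeout`. Client errors such as 4xx responses are returned immediately, since retrying them rarely helps.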

Example (resume after interruption):

chatline enrich-ai \
--input normalized.json \
--output enriched.json \
--resume

Example (disable certain enrichments):

chatline enrich-ai \
--input normalized.json \
--output enriched.json \
--enable-audio false \
--enable-links false

Checkpoint Files: Automatically created in ./checkpoints/enrich-checkpoint-<index>.json

Resume behavior: Continues from the last successfully enriched message. Idempotency ensures no duplicate enrichments.
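
Idempotent enrichment can be sketched as "skip if an enrichment of this kind already exists, unless a refresh is forced". The example below assumes enrichments are keyed by their `kind` field and uses a hypothetical `addEnrichment` helper; the real stage may use a different key.

```typescript
// Illustrative idempotency check; not chatline's implementation.
interface Enrichment {
  kind: string
  summary: string
}

interface Media {
  path: string
  enrichment?: Enrichment[]
}

// Returns true if the enrichment was written, false if it was skipped.
function addEnrichment(
  media: Media,
  next: Enrichment,
  forceRefresh: boolean = false,
): boolean {
  const existing = media.enrichment || []
  const index = existing.findIndex((e) => e.kind === next.kind)

  if (index >= 0 && !forceRefresh) {
    return false // already enriched: re-running is a no-op
  }
  if (index >= 0) {
    existing[index] = next // forced refresh replaces the old entry
  } else {
    existing.push(next)
  }
  media.enrichment = existing
  return true
}
```

With this shape, re-processing a message after a resume leaves its `enrichment` array unchanged, while `forceRefresh` mirrors the intent of `--force-refresh`.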


5. render-markdown

Generate Obsidian-compatible markdown timeline files.

Syntax:

chatline render-markdown [options]

Options:

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --input <path> | string | (required) | Enriched JSON |
| --output-dir <path> | string | ./timeline | Output directory |
| --date-range <start:end> | string | (all dates) | Filter by date range |
| --group-by-time | boolean | true | Group by time-of-day |
| --nested-replies | boolean | true | Render replies as nested |
| --max-nesting-depth <n> | number | 10 | Max reply nesting levels |

Example:

chatline render-markdown \
--input enriched.json \
--output-dir ./timeline \
--date-range 2024-10-01:2024-10-31

Output Structure:

timeline/
├── 2024-10-01.md
├── 2024-10-02.md
├── ...
└── 2024-10-31.md

Sample Output (2024-10-15.md):

# 2024-10-15

## Morning (00:00 - 11:59)

### 08:23 - You

Good morning! ☀️

### 08:25 - Melanie

Morning! How'd you sleep?

> **You** (08:26): Really well, thanks! 😊

❤️ Melanie

## Afternoon (12:00 - 17:59)

### 14:30 - Melanie

Check out this photo!

![[IMG_1234.heic]]

> _Gemini Vision Analysis_: A beautiful sunset over the ocean with vibrant
> orange and pink hues reflecting on the water.

## Evening (18:00 - 23:59)

### 19:45 - You

Let's meet at this place: https://example.com/restaurant

> _Link Context_:
> **[Best Restaurant in Town - Example Dining](https://example.com/restaurant)**
> Award-winning restaurant featuring modern Australian cuisine with seasonal
> menus and waterfront views.

Advanced Usage

Batch Processing Multiple Contacts

#!/bin/bash
# process-all-contacts.sh

CONTACTS=(
  "+61412345678:melanie"
  "+61498765432:john"
  "alice@example.com:alice"
)

for contact_pair in "${CONTACTS[@]}"; do
  IFS=':' read -r contact name <<< "$contact_pair"

  echo "Processing $name ($contact)..."

  # Ingest from DB
  chatline ingest-db \
    --contact "$contact" \
    --output "data/${name}-ingested.json"

  # Normalize
  chatline normalize-link \
    --input "data/${name}-ingested.json" \
    --output "data/${name}-normalized.json"

  # Enrich
  chatline enrich-ai \
    --input "data/${name}-normalized.json" \
    --output "data/${name}-enriched.json" \
    --checkpoint-interval 50

  # Render
  chatline render-markdown \
    --input "data/${name}-enriched.json" \
    --output-dir "timeline/${name}"

  echo "✓ Completed $name"
done

echo "✓ All contacts processed"

Incremental Updates (Daily Sync)

#!/bin/bash
# daily-sync.sh

DATE=$(date +%Y-%m-%d)
# BSD date syntax (macOS); with GNU date use: date -d "yesterday" +%Y-%m-%d
YESTERDAY=$(date -v-1d +%Y-%m-%d)

# Export yesterday's messages only
chatline ingest-db \
--date-range "$YESTERDAY:$DATE" \
--output "data/daily-$DATE.json"

# Merge with existing normalized data
chatline normalize-link \
--merge-csv-db \
--csv-input "data/normalized-master.json" \
--db-input "data/daily-$DATE.json" \
--output "data/normalized-master.json"

# Enrich new messages only (idempotency handles duplicates)
chatline enrich-ai \
--input "data/normalized-master.json" \
--output "data/enriched-master.json"

# Render updated timeline
chatline render-markdown \
--input "data/enriched-master.json" \
--output-dir "./timeline" \
--date-range "$YESTERDAY:$DATE"

echo "✓ Daily sync complete for $DATE"

Custom Enrichment Providers

You can extend enrichment with custom providers:

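// custom-provider.ts
import type { MediaEnrichment } from '@nathanvale/chatline'

// Send a media file path to a custom analysis API and map the response
// to a MediaEnrichment record.
export async function enrichWithCustomAPI(
  mediaPath: string,
): Promise<MediaEnrichment> {
  const response = await fetch('https://your-api.com/analyze', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ path: mediaPath }),
  })

  // Fail loudly on HTTP errors instead of parsing an error body
  if (!response.ok) {
    throw new Error(`Custom API request failed: ${response.status}`)
  }

  const data = await response.json()

  return {
    kind: 'custom_analysis',
    summary: data.summary,
    provider: 'custom-api',
    model: 'custom-v1',
    version: '1.0',
    createdAt: new Date().toISOString(),
  }
}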

Then integrate in your workflow:

import { readFileSync } from 'node:fs'
import { renderMessages } from '@nathanvale/chatline'
import { enrichWithCustomAPI } from './custom-provider'

// Load normalized messages (the file wraps them in a "messages" array)
const { messages } = JSON.parse(readFileSync('normalized.json', 'utf-8'))

// Apply custom enrichment to image attachments
// (top-level await requires an ES module context)
for (const msg of messages) {
  if (msg.messageKind === 'media' && msg.media?.mediaKind === 'image') {
    const enrichment = await enrichWithCustomAPI(msg.media.path)
    msg.media.enrichment = msg.media.enrichment || []
    msg.media.enrichment.push(enrichment)
  }
}

// Continue with standard render
const markdown = renderMessages(messages)


Document Version: 1.0 | Author: Generated from iMessage Pipeline implementation | Last Updated: 2025-10-19