Published dispatch Mar 30, 2026

Transcript to Vault

A failed recording triggered a full pipeline build. Calendar attendees correct transcription errors. Three models reviewed it. Six hardening items shipped same day.

Filed

March 30, 2026

Published on galexc.me/dispatches

Read time

9 min

Coffee-sipping read with a bit of texture.

Tags

#transcription

Editorial note

Co-authored by Gilman and GalexC.

Project stats

Some high level stats before the long-form writeup.

568 vault entities · 7 pipeline stages · 6 hardening items · 0* manual steps

*other than the CLI command :)

Meeting summaries are ubiquitous now, and I've found them increasingly useful. But whether you're using Google Meet, Zoom, Granola, or Fireflies (the list goes on), the process is still very limiting for me:

1. record the meeting
2. get an email summary (eventually)
3. copy text into Obsidian (manually)
4. enrich with [[People Links]] and [[Company Names]] (manually)
5. forget about it

This is a distraction. I end up perpetually feeling like I'm behind on my notes chores, and my Obsidian vault is never fully enriched: missing email addresses, company affiliations, links to people who show up in every single meeting.

The goal: create an end-to-end system that records on my devices, attempts speaker identification, fully enriches notes with [[person]] and [[company]] bi-directional links, and inserts everything into my Obsidian daily note automatically.

What the pipeline does

                    ┌─────────────────┐
  WAV (on R2)  ───► │   whisperkit    │ ← mic channel only
                    └────────┬────────┘
                             │ raw transcript + diarization

                    ┌─────────────────────────────────────────┐
                    │          transcribe-enrich              │
                    └──┬──────────┬──────────┬────────────────┘
                       │          │          │
            ┌──────────▼──┐  ┌────▼──────┐   └────────────────────┐
            │ calendar-   │  │ identify- │                        │
            │ lookup.py   │  │ entities  │                        │
            │ (365 lines) │  │ .py       │                        │
            │             │  │ (811 lines│                        │
            │ Google Cal  │  │ Haiku LLM │                        │
            │ API: match  │  │ backstop) │                        │
            │ recording   │  └────┬──────┘                        │
            │ to event    │       │                               │
            └──────┬──────┘       │ unresolved → ntfy alert       │
                   │ attendee     │                               │
                   │ names/emails │                               │
                   └──────┬───────┘                               │
                          ▼                                       │
               ┌──────────────────────┐                           │
               │ resolve-vault-       │                           │
               │ entities.py          │                           │
               │ rapidfuzz top-K=5    │                           │
               │ 568 → 9-30 candidates│                           │
               └──────────┬───────────┘                           │
                          │                                       │
               ┌──────────▼───────────┐                           │
               │  dedup-wikilinks.py  │                           │
               │  (179 lines)         │                           │
               │  first-reference rule│                           │
               └──────────┬───────────┘                           │
                          │                                       │
               ┌──────────▼────────────────────────────────────┐  │
               │  link-daily-note.py  (fcntl concurrency guard) ◄─┘
               └──────────┬────────────────────────────────────┘


                   Obsidian vault (.md)

Three of the seven stages were built fresh during this session; the rest were existing scripts extended to fit. The interesting parts are the new ones.

Building identify-entities.py

The rule-based entity resolver works well for entities already in the vault, but has no handling for names it hasn’t seen before.

The LLM backstop approach: extract bare proper nouns from the transcript, send them to Haiku with the vault’s entity list as context, ask it to identify matches and flag unresolvables.

The key design decision was candidates-first:

# v1 (too expensive): send full transcript to LLM
# LLM sees: "[name] mentioned that [company] deal closed..."
# Result: LLM makes things up; high token cost

# v2 (right approach): extract proper nouns first, send bare list
# Extract: ["[name]", "[company]", "[misheard name]"]
# LLM sees: candidate list + entity index only
# Result: LLM matches candidates only; much lower token cost

import spacy  # https://spacy.io
nlp = spacy.load("en_core_web_sm")

def extract_candidates(text: str) -> list[str]:
    doc = nlp(text)
    return list({
        ent.text for ent in doc.ents
        if ent.label_ in ("PERSON", "ORG", "GPE")
        and len(ent.text) > 2
    })

Then pass that candidate list to Haiku. The LLM sees 20-30 tokens of proper nouns, not a full transcript.
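To make the candidates-first shape concrete, here is a minimal sketch of the prompt construction. The helper name and prompt wording are hypothetical (the actual identify-entities.py prompt isn't shown here); the point is that the LLM only ever sees the bare candidate list plus the entity index, never the transcript.

```python
import json

def build_resolution_prompt(candidates: list[str], entity_names: list[str]) -> str:
    """Hypothetical prompt builder: bare proper nouns in, matches out.

    The prompt carries the candidate list and the vault entity index
    only -- tens of tokens instead of a full transcript.
    """
    return (
        "Match each candidate to a vault entity, or mark it unresolved.\n"
        f"Vault entities: {json.dumps(sorted(entity_names))}\n"
        f"Candidates: {json.dumps(sorted(candidates))}\n"
        'Respond as JSON: [{"candidate": ..., "match": ... or null, "confidence": 0-1}]'
    )

prompt = build_resolution_prompt(
    ["Pitro", "Acme"],
    ["Pit Rho", "Acme", "Jane Doe"],
)
```

Whatever the exact wording, the token budget stays flat regardless of transcript length, which is what makes the Haiku call cheap to run on every recording.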

The rule I enforced: no person stubs. If the LLM can’t resolve a person name with confidence, it fires an ntfy alert and moves on. It does NOT create a new [[Name]] entity card with placeholder content. A hallucinated stub with wrong information is worse than no link at all. A phonetically similar name that maps to the wrong entity is exactly the failure mode you’re trying to prevent.

# identify-entities.py -- unresolved handling
if resolution["confidence"] < 0.7:
    ntfy.send(
        f"Unresolved entity in transcript: '{candidate}'",
        title="Vault Review Needed",
        tags=["vault", "transcription"],
    )
    # Do NOT create a stub. Plain text is safer.
    continue

Building calendar-lookup.py

The calendar lookup matches the recording time window to a Google Calendar API event. The naive approach (find the event that starts nearest to the recording start time) fails for large calendar blocks.

The scoring function weights start-proximity 3x over duration overlap:

def score_event(
    recording_start: datetime,
    recording_duration_minutes: float,
    event: dict,
) -> float:
    event_start = parse_datetime(event["start"])
    event_end = parse_datetime(event["end"])

    # Start proximity: seconds between recording start and event start
    start_delta = abs((recording_start - event_start).total_seconds())
    proximity_score = max(0, 1 - start_delta / 3600)  # decay over 1 hour

    # Duration overlap: fraction of recording covered by event
    recording_end = recording_start + timedelta(minutes=recording_duration_minutes)
    overlap_start = max(recording_start, event_start)
    overlap_end = min(recording_end, event_end)
    overlap_seconds = max(0, (overlap_end - overlap_start).total_seconds())
    overlap_score = overlap_seconds / (recording_end - recording_start).total_seconds()

    # Weight start proximity 3x -- prevents large all-day blocks from matching
    return (proximity_score * 3 + overlap_score) / 4

Once the event is matched, extract the attendee list and inject names and email domains into the enrichment context before the entity resolver runs. A first name in the transcript becomes a fully-qualified person + company, which the resolver can actually match against vault entities. This was the fix for the vault entries that had been producing unresolved references for months.
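To sanity-check the 3x weighting, here is the scorer applied to two hypothetical events: a 30-minute meeting that starts exactly with the recording, and an all-day block that started two hours earlier but fully covers it. The scoring is restated with explicit arguments so the example is self-contained; the datetimes are made up.

```python
from datetime import datetime, timedelta

def score_event(recording_start, recording_duration_minutes, event_start, event_end):
    # Same scoring as above, restated with explicit datetime arguments.
    start_delta = abs((recording_start - event_start).total_seconds())
    proximity_score = max(0, 1 - start_delta / 3600)  # decay over 1 hour
    recording_end = recording_start + timedelta(minutes=recording_duration_minutes)
    overlap_start = max(recording_start, event_start)
    overlap_end = min(recording_end, event_end)
    overlap_seconds = max(0, (overlap_end - overlap_start).total_seconds())
    overlap_score = overlap_seconds / (recording_end - recording_start).total_seconds()
    return (proximity_score * 3 + overlap_score) / 4

rec = datetime(2026, 3, 30, 10, 0)
# 30-min meeting starting exactly with the recording: proximity 1.0, overlap 1.0
meeting = score_event(rec, 30, datetime(2026, 3, 30, 10, 0), datetime(2026, 3, 30, 10, 30))
# all-day block from 8:00: proximity decays to 0, overlap still 1.0
block = score_event(rec, 30, datetime(2026, 3, 30, 8, 0), datetime(2026, 3, 30, 18, 0))
```

The meeting scores 1.0 and the block 0.25: both cover the recording completely, but the start-proximity term dominates, which is exactly what keeps large calendar blocks from winning.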

The roundtable review

Before deploying ~1,350 lines of new code that writes directly into a knowledge graph, I ran a structured review across three models from different providers (Opus, Gemini, Codex). The theory is simple: models trained on different data surface different failure modes. Atomic writes and concurrency guards are well-documented patterns — any sufficiently trained model has absorbed enough production postmortems to flag them. Running the review before deployment means those failures never happen in production.

The cross-provider aspect matters too. When all three models independently flag the same item in the same round, it’s not a quirk of one model’s training — it’s a real gap. Three-way convergence on atomic writes in round 1 is the kind of signal that would take a solo code review much longer to produce, if it produced it at all.

| Round | Opus | Gemini | Codex | New items |
|---|---|---|---|---|
| 1 | atomic writes, idempotency | atomic writes, entity scaling | model config, concurrency | 5 items |
| 2 | calendar_matched flag | idempotency edge cases | entity scaling threshold | 1 new |
| 3 | LGTM | LGTM | LGTM | 0 new |

Total: 3 rounds, 3 providers, 6 items.

Six consensus items from three models across three rounds. Every item had at least two models independently flagging it:

| Item | Priority | What it solves |
|---|---|---|
| Atomic writes (tmpdir + mv) | P0 | Partial writes leave corrupt vault notes on crash |
| Idempotency guard | P0 | Re-running enrichment overwrites manual vault edits |
| Entity catalog pre-filtering | P1 | 568-entity vault → O(n) resolver per token → slow |
| TRANSCRIBE_MODEL env var | P1 | Hard-coded model string requires code change to swap |
| calendar_matched frontmatter | P1 | No way to audit which notes got calendar context |
| fcntl concurrency guard | P1 | Concurrent enrichment runs corrupt shared vault files |

All six shipped the same session.
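For reference, the atomic-write pattern the review converged on is small: write to a temp file in the same directory, then rename over the target, so a reader never observes a half-written note. A sketch of the idea, assuming POSIX rename semantics (the function name and paths are illustrative, not the pipeline's actual helper):

```python
import os
import tempfile

def atomic_write(path: str, content: str) -> None:
    """Write content to path atomically: temp file in the same directory,
    fsync, then os.replace (atomic when source and target share a filesystem)."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())
        # A crash before this line leaves the old note untouched.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

The temp file must live in the same directory as the target; `os.replace` across filesystems falls back to copy-and-delete on some platforms and loses atomicity.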

The entity catalog problem

The vault had 568 entities, and the original resolver checked every one against every token in the transcript. That’s O(transcript_tokens × vault_entities) — slow, and it only gets worse as the vault grows.

The fix: rapidfuzz pre-filtering with a score cutoff.

from rapidfuzz import process as rf_process

def get_candidates(
    token: str,
    entity_index: dict,
    k: int = 5,
    threshold: int = 60,
) -> list[dict]:
    """
    Pre-filter the entity catalog to the top-K fuzzy matches.
    Returns 9-30 candidates from 568 entities.
    
    entity_index: {canonical_name: entity_card_data}
    k: max candidates to return
    threshold: minimum fuzz score (0-100)
    """
    matches = rf_process.extract(
        token,
        entity_index.keys(),
        limit=k,
        score_cutoff=threshold,
    )
    return [entity_index[m[0]] for m in matches]

Instead of checking all 568 entities per token, the pre-filter narrows the field to 9-30 candidates before the resolver runs. Same recall for correctly-spelled names, much faster execution.
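The same pre-filter shape works with nothing but the standard library if you don't need rapidfuzz's speed or scoring granularity: `difflib.get_close_matches` is a rough stand-in. A toy sketch with made-up entity names, to show the narrowing step in isolation:

```python
import difflib

def get_candidates_stdlib(token: str, entity_index: dict, k: int = 5,
                          cutoff: float = 0.6) -> list[dict]:
    """Stdlib analogue of the rapidfuzz pre-filter: top-k close matches
    above a similarity cutoff. Cruder and slower than rapidfuzz, but the
    shape is identical -- narrow the catalog before the resolver runs."""
    names = difflib.get_close_matches(token, entity_index.keys(), n=k, cutoff=cutoff)
    return [entity_index[name] for name in names]

index = {
    "Pit Rho": {"name": "Pit Rho"},
    "Acme": {"name": "Acme"},
    "Jane Doe": {"name": "Jane Doe"},
}
hits = get_candidates_stdlib("Pitro", index)
```

Here "Pitro" survives the cutoff against "Pit Rho" while the unrelated entities are filtered out before any expensive resolution happens.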

The correction_aliases schema

The vault has two kinds of aliases:

# Entity card frontmatter example

# aliases: intentional display names
# The entity IS known by these names in normal usage
aliases:
  - "Acme"
  - "Acme Corp"

# correction_aliases: transcription errors that should be auto-corrected
# These are NOT how the entity is known -- they're what WhisperKit mishears
correction_aliases:
  - "Pitro"      # → Pit Rho (heard wrong)
  - "Pit Row"    # → Pit Rho (heard wrong)
  - "Acmee"      # → Acme (phonetic variant, company name)

The distinction matters for the dedup-wikilinks step. An [[Acme]] wikilink stays as [[Acme]] because that's a valid alias; the entity card will resolve it. A [[Pitro]] wikilink gets rewritten to [[Pit Rho]] because Pitro is a transcription artifact that should never appear in the final vault note.

The resolver handles both correctly. The dedup handles both correctly. But they have to be told which is which — that’s what the schema distinction does.
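A compact sketch of how the two alias kinds flow through the dedup step: correction aliases get rewritten to the canonical name, and only the first reference to each canonical entity keeps its link (the first-reference rule). This is an illustration under my own reading of the rule, not the actual 179-line dedup-wikilinks.py.

```python
import re

def dedup_wikilinks(text: str, corrections: dict[str, str]) -> str:
    """Rewrite correction_aliases to canonical names, then unlink every
    wikilink after the first reference to each canonical entity.
    corrections: {misheard_name: canonical_name}."""
    seen: set[str] = set()

    def fix(match: re.Match) -> str:
        name = match.group(1)
        canonical = corrections.get(name, name)  # display aliases pass through
        if canonical in seen:
            return canonical          # later references become plain text
        seen.add(canonical)
        return f"[[{canonical}]]"     # first reference keeps the link

    return re.sub(r"\[\[([^\]]+)\]\]", fix, text)

note = "Met [[Pitro]] team. [[Pit Rho]] and [[Acme]] discussed. [[Acme]] agreed."
out = dedup_wikilinks(note, {"Pitro": "Pit Rho"})
```

On this input, [[Pitro]] becomes the first (and only) [[Pit Rho]] link, the literal [[Pit Rho]] that follows is unlinked as a duplicate, and the second [[Acme]] drops to plain text.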

Other possible directions

If the pipeline keeps working as well as it has so far, a few things I may tackle next:

  • Enrichment versioning: when entity aliases change, old notes with outdated wikilinks should be retroactively corrected. There’s no batch re-enrich path yet, and the vault accumulates drift over time.
  • Controlled person stubs: unresolved people produce an ntfy alert but nothing else. A status: unresolved entity card — surfaced for review in hub-web — would close the loop without the risk of hallucinated stubs.
  • Bash to Python orchestrator: transcribe-enrich is 403 lines of bash. That’s probably too long for a bash script managing a multi-stage pipeline; the next feature added is the natural trigger to rewrite it.

Patterns I learned

Two patterns worth stealing.

Calendar injection as ground truth. If you’re transcribing meetings, you already have a structured attendee list for every call in Google Calendar. Surface names, emails, and company affiliations from the calendar API and inject them into your entity resolution context before the resolver runs. A bare first name in the transcript becomes a fully-qualified person + company, which is something the resolver can actually work with.

Run the roundtable before you deploy. The six hardening items from the three-model review aren’t things I would have caught myself in the same session. Atomic writes and fcntl concurrency guards are the kind of thing you only think about when you’ve been burned, or when you ask a model that has read a lot of production postmortems.