---
name: mbox
displayName: mbox Reader & Converter
description: Read and convert .mbox mailbox files to JSON, CSV, or plain text.
  Use when opening, inspecting, or exporting email archives in the mbox format.
tags:
  - email
  - mbox
  - mailbox
  - converter
  - extractor
capabilities:
  - ReadMbox
  - ConvertToJson
  - ConvertToCsv
  - ConvertToText
  - ExtractHeaders
  - SplitToEml
representativeQueries:
  - Read a .mbox file and show me the emails
  - Convert mbox to CSV or JSON
  - Extract subjects and senders from a mailbox archive
  - Export emails from an mbox file to plain text
  - Split an mbox archive into individual .eml files
version: 0.1.0
tier: curated
---

# mbox Reader & Converter

Reads `.mbox` mailbox archive files and converts them to structured JSON, flat CSV, or readable plain text. Works with any standard mbox file including Gmail Takeout exports, Thunderbird archives, and Apple Mail exports. Primitive kind: **extractor**.

## When to use

- You have a `.mbox` file and want to read its contents
- You need to export or convert an email archive to JSON, CSV, or text
- You want to count, search, or inspect messages in a mailbox
- You are extracting specific fields (subject, sender, date, body) from email archives

## Steps

1. **Locate the source.** Confirm the `.mbox` file path is accessible.
2. **Parse/extract.** Use `scripts/mbox_convert.py` with the file path and desired output format (`--format json|csv|text`, default `text`).
3. **Structure output.** Each message yields: `from`, `to`, `date`, `subject`, `body` (plain text), and optionally `message_id` and `labels`.
4. **Stream for large files.** The script iterates messages one at a time and never holds all message content in memory at once. This is not true line-by-line streaming, however: Python's `mailbox.mbox` scans the whole file sequentially to build a byte-offset index (`_generate_toc`) before the first message is yielded, so very large files incur an upfront scan cost.
5. **Review output.** Pipe to a pager, file, or further processing as needed.
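
The iteration described in the steps above can be sketched with the standard library alone. This is a minimal illustration, not the contents of `scripts/mbox_convert.py`: the function name `iter_messages` and its `limit` parameter are hypothetical, and body extraction is left out for brevity.

```python
import mailbox
from itertools import islice


def iter_messages(path, limit=None):
    """Yield one dict of header fields per message (hypothetical helper).

    Only the current message is held in memory, but mailbox.mbox still
    scans the whole file once to build its index before the first yield.
    """
    # create=False makes a missing file raise NoSuchMailboxError
    # instead of silently creating an empty mailbox.
    box = mailbox.mbox(path, create=False)
    try:
        # islice(box, None) iterates everything; an integer caps the count.
        for msg in islice(box, limit):
            yield {
                "from": str(msg.get("From", "")),
                "to": str(msg.get("To", "")),
                "date": str(msg.get("Date", "")),
                "subject": str(msg.get("Subject", "")),
                "message_id": str(msg.get("Message-ID", "")),
            }
    finally:
        box.close()
```

Writing it as a generator keeps the caller in control of how many messages are consumed, which is roughly what a `--limit N` style cap needs.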

## Operations

| Capability | CRUD | Resource | Notes |
|---|---|---|---|
| `ReadMbox` | READ | .mbox file | Parse and print messages |
| `ConvertToJson` | READ | .mbox file | Output newline-delimited JSON |
| `ConvertToCsv` | READ | .mbox file | Output RFC 4180 CSV with header row |
| `ConvertToText` | READ | .mbox file | Human-readable plain-text dump |
| `ExtractHeaders` | READ | .mbox file | Headers-only mode, no body (`--headers-only` flag) |
| `SplitToEml` | WRITE | .mbox file → .eml files | Write each message as a numbered .eml into a directory (`--split OUT_DIR`) |
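
As a rough illustration of the `SplitToEml` row, the sketch below writes each message to its own numbered file. The function name `split_to_eml` and the six-digit file naming are assumptions made for this example; the bundled script may number or name files differently.

```python
import mailbox
from pathlib import Path


def split_to_eml(mbox_path, out_dir):
    """Write each message as NNNNNN.eml under out_dir; return the count.

    Hypothetical helper; the naming scheme is an assumption.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        for count, key in enumerate(box.iterkeys(), start=1):
            # get_bytes() returns the raw message without the "From "
            # separator line. Python's mbox reader does not undo ">From "
            # quoting in bodies, so quoted lines are written as stored.
            (out / f"{count:06d}.eml").write_bytes(box.get_bytes(key))
    finally:
        box.close()
    return count
```

Copying the raw bytes rather than re-serialising a parsed message avoids altering headers or encodings along the way.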

## Output

- **text** (default): one block per message with labeled fields and body
- **json**: one JSON object per line (newline-delimited); fields: `from`, `to`, `date`, `subject`, `body`, `message_id`
- **csv**: header row + one row per message; same fields as JSON
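
The JSON and CSV shapes listed above could be produced along these lines. This is a sketch under assumptions: `FIELDS`, `write_ndjson`, and `write_csv` are hypothetical names, and the optional `labels` field is omitted to keep the example short.

```python
import csv
import json

# Field order taken from the list above; "labels" is left out here.
FIELDS = ["from", "to", "date", "subject", "body", "message_id"]


def write_ndjson(records, out):
    """Write one JSON object per line (newline-delimited JSON)."""
    for rec in records:
        row = {k: rec.get(k, "") for k in FIELDS}
        # json.dumps escapes embedded newlines, so each record stays
        # on a single physical line.
        out.write(json.dumps(row, ensure_ascii=False) + "\n")


def write_csv(records, out):
    """Write a header row plus one row per record.

    The csv module quotes fields containing commas, quotes, or newlines
    and terminates rows with CRLF by default, in line with RFC 4180.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for rec in records:
        writer.writerow({k: rec.get(k, "") for k in FIELDS})
```

When writing CSV to a real file, open it with `newline=""` so the `csv` module controls line endings itself.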

## Context

- The `From ` separator line (no colon) is an artefact of the mbox format and is not part of the message headers; Python's `mailbox.mbox` strips it automatically.
- Gmail Takeout produces the mboxrd variant with `X-Gmail-Labels` headers; these are preserved in JSON/CSV output as an optional `labels` field.
- Multipart messages: the script extracts the first `text/plain` part; if none exists it falls back to `text/html` with tags stripped.
- Very large archives: use `--limit N` to cap output at N messages during exploration.
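
The multipart fallback described above might look like the following. It is a simplified sketch, not the script's actual code: `extract_body` is a hypothetical name, attachments declared as `text/plain` are not filtered out, and the tag stripper keeps the text inside `<script>` and `<style>` elements.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes and drop tags (naive tag stripper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def _decode(part):
    """Decode a leaf MIME part to str, tolerating bad charsets."""
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "utf-8"
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:  # unknown charset name in the header
        return payload.decode("utf-8", errors="replace")


def extract_body(msg):
    """Return the first text/plain part, else tag-stripped text/html."""
    html = None
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        ctype = part.get_content_type()
        if ctype == "text/plain":
            return _decode(part)
        if ctype == "text/html" and html is None:
            html = _decode(part)  # remember the first HTML part
    if html is None:
        return ""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.chunks)
```

Because `walk()` visits parts in document order, a `text/plain` alternative wins even when the HTML part appears first in the message.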

## Notes

- Uses only the Python standard library; no pip installs are required.
- Run `python3 scripts/mbox_convert.py --help` for all options.
- `--headers-only` suppresses the entire message body (the body is set to an empty string). It is useful for surveying message metadata (subjects, senders, dates, labels) across the archive without extracting body text. It does not provide attachment-specific inspection: attachment metadata is part of the MIME body and is suppressed as well.

<!-- runner-fallback -->
## Remote runner (MCP)
Can't run this locally (no setup, missing dependency)? The StealthStack runner exposes the **same code** as server-side MCP tools — no local install needed: `mbox_to_text`, `mbox_to_json`, `mbox_to_csv`, `mbox_split`. Call the `application/mcp` catalog twin of this skill (its `runnerTwin`).
