---
name: odt
displayName: ODT Reader & Converter
description: Read and convert OpenDocument Text (.odt) files to plain text,
  JSON, or CSV. Use when extracting content, tables, or metadata from .odt
  documents.
tags:
  - document
  - odt
  - openDocument
  - convert
  - extract
  - text
capabilities:
  - ReadOdt
  - ConvertToText
  - ConvertToJson
  - ConvertToCsv
  - ExtractTables
representativeQueries:
  - read a .odt file and show me its contents
  - convert odt to plain text or JSON
  - extract tables from an OpenDocument Text file
  - what is in this .odt document
  - pull the text out of an odt file
version: 0.1.0
tier: curated
---

# ODT Reader & Converter

Reads OpenDocument Text (`.odt`) files and converts their content to plain text, JSON, or CSV using only the Python standard library. `.odt` is the default format for LibreOffice Writer and many open-source word processors; it is a ZIP archive containing XML following the ODF 1.x standard.

## When to use

Use this skill when a user supplies an `.odt` file path and wants to:
- Read or preview the document content
- Convert the document to a portable format (plain text, JSON, CSV)
- Extract structured tables from the document
- Inspect document metadata (title, author, dates) from `meta.xml`
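
As a rough illustration of the metadata case, the sketch below reads a few fields from `meta.xml` with the standard library. The `read_meta` helper and the output key names (`title`, `author`, `created`, `modified`) are hypothetical and are not taken from the bundled script; the element names are standard ODF and Dublin Core elements.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical output keys mapped to standard ODF metadata elements.
FIELDS = {
    "title": "dc:title",
    "author": "meta:initial-creator",
    "created": "meta:creation-date",
    "modified": "dc:date",
}


def read_meta(source):
    """Return the metadata fields present in meta.xml; missing fields are omitted."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("meta.xml"))
    meta = root.find("office:meta", NS)
    if meta is None:
        return {}
    result = {}
    for key, tag in FIELDS.items():
        node = meta.find(tag, NS)
        if node is not None and node.text:
            result[key] = node.text.strip()
    return result
```

`source` may be a file path or a binary file-like object, since `zipfile.ZipFile` accepts both.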

## Steps

1. **Detect format.** Check that the file has a `.odt` extension; other extensions trigger a warning, but processing continues. Then confirm the file is a valid ZIP archive by opening it with `zipfile`; catch `BadZipFile` and exit with an error.
2. **Route by task.**
   - *Plain text / preview:* extract `content.xml`, strip ODF namespaces, collect all `<text:p>` and `<text:h>` text runs in document order.
   - *JSON:* parse paragraphs, headings, and tables into a structured dict; serialize with `json.dumps`.
   - *CSV:* locate `<table:table>` elements; flatten each into rows of cells; write with `csv.writer`.
3. **Run the bundled script.** Call `python3 scripts/read_odt.py <path> [--format text|json|csv]`; output goes to stdout.
4. **Present results.** Paste or summarize the output; for large documents, offer to extract a specific section or table by index.
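
The detection and plain-text steps above can be sketched roughly as follows. This is a minimal sketch, not the contents of `scripts/read_odt.py`: the function names, the `heading`/`paragraph` type labels, and the rule of one `#` per outline level are assumptions. It also simplifies by ignoring `<text:s>`, `<text:tab>`, and `<text:line-break>` elements, and it would count a paragraph nested inside another paragraph (for example in a footnote) twice.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
PARAGRAPH = f"{{{TEXT_NS}}}p"
HEADING = f"{{{TEXT_NS}}}h"
OUTLINE_LEVEL = f"{{{TEXT_NS}}}outline-level"


def open_odt(path):
    """Open an .odt file as a ZIP archive, warning on an unexpected extension."""
    if not str(path).lower().endswith(".odt"):
        print(f"warning: {path} does not end in .odt", file=sys.stderr)
    try:
        return zipfile.ZipFile(path)
    except zipfile.BadZipFile:
        sys.exit(f"error: {path} is not a valid ZIP archive")


def extract_body(archive):
    """Collect headings and paragraphs from content.xml in document order."""
    root = ET.fromstring(archive.read("content.xml"))
    body = []
    for node in root.iter():
        if node.tag == HEADING:
            body.append({
                "type": "heading",
                "level": int(node.get(OUTLINE_LEVEL, "1")),
                "text": "".join(node.itertext()),
            })
        elif node.tag == PARAGRAPH:
            body.append({"type": "paragraph", "text": "".join(node.itertext())})
    return body


def to_text(body):
    """Render one item per line, prefixing headings with '#' markers."""
    lines = []
    for item in body:
        prefix = "#" * item["level"] + " " if item["type"] == "heading" else ""
        lines.append(prefix + item["text"])
    return "\n".join(lines)
```

Matching on fully qualified tag names avoids a separate namespace-stripping pass; the list returned by `extract_body` can also be passed to `json.dumps` for the JSON route.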

## Operations

| Capability | CRUD | Resource | Tool |
|---|---|---|---|
| `ReadOdt` | READ | document body | `scripts/read_odt.py` |
| `ConvertToText` | READ | plain text output | `scripts/read_odt.py --format text` |
| `ConvertToJson` | READ | JSON output | `scripts/read_odt.py --format json` |
| `ConvertToCsv` | READ | CSV (first table) | `scripts/read_odt.py --format csv` |
| `ExtractTables` | READ | all tables as JSON | `scripts/read_odt.py --format json` (tables key) |

## Output

- `--format text` (default): UTF-8 plain text, one paragraph per line, headings prefixed with `#`/`##` markers.
- `--format json`: JSON object with keys `meta` (dict), `body` (list of `{type, level?, text}` objects), `tables` (list of 2-D string arrays).
- `--format csv`: CSV rows from the first table found; exits with error if no table is present.
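
To make the `tables` and CSV shapes concrete, here is a hedged sketch of flattening `<table:table>` elements into 2-D string arrays and writing the first one as CSV. The helper names are hypothetical, and the handling of merged cells is one plausible way to repeat values, not necessarily what the bundled script does. The sketch ignores `table:number-columns-repeated` and folds the rows of nested tables into their parent.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET

TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
TABLE = f"{{{TABLE_NS}}}table"
ROW = f"{{{TABLE_NS}}}table-row"
CELL = f"{{{TABLE_NS}}}table-cell"
COVERED = f"{{{TABLE_NS}}}covered-table-cell"
COLSPAN = f"{{{TABLE_NS}}}number-columns-spanned"
PARAGRAPH = f"{{{TEXT_NS}}}p"


def table_to_rows(table):
    """Flatten one table into a list of rows, repeating merged-cell values."""
    rows = []
    for row in table.iter(ROW):
        values = []
        covered_left = 0  # covered cells still owed to a horizontal span
        for cell in row:
            if cell.tag == CELL:
                text = "\n".join("".join(p.itertext()) for p in cell.iter(PARAGRAPH))
                covered_left = int(cell.get(COLSPAN, "1")) - 1
            elif cell.tag == COVERED:
                if covered_left > 0:
                    text = values[-1]  # horizontal merge: copy the cell to the left
                    covered_left -= 1
                else:
                    # Vertical merge: copy the cell above, if there is one.
                    above = rows[-1] if rows else []
                    text = above[len(values)] if len(values) < len(above) else ""
            else:
                continue
            values.append(text)
        rows.append(values)
    return rows


def extract_tables(archive):
    """Return every table in content.xml as a 2-D list of strings."""
    root = ET.fromstring(archive.read("content.xml"))
    return [table_to_rows(table) for table in root.iter(TABLE)]


def first_table_csv(tables):
    """Serialize the first table as CSV text, or exit if there is none."""
    if not tables:
        raise SystemExit("error: no table found in document")
    out = io.StringIO()
    csv.writer(out).writerows(tables[0])
    return out.getvalue()
```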

## Notes

- Password-protected (encrypted) `.odt` files cannot be read; the script exits with a clear error message.
- Embedded images and OLE objects are silently skipped; only text content is extracted.
- Merged table cells are repeated in the CSV output to preserve column alignment.
- For high-fidelity Markdown output, install `pandoc` and run `pandoc -f odt -t markdown <file>` directly. The bundled script itself does not depend on `pandoc`.
- Documents with more than 10 MB of uncompressed XML may be slow to parse with ElementTree; performance is acceptable for typical office documents.
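
One possible way to detect a password-protected document before parsing is to look for `manifest:encryption-data` entries in `META-INF/manifest.xml`, which the ODF package format uses to mark encrypted files. The sketch below assumes that approach; whether the bundled script detects encryption this way is not stated here, and `is_encrypted` is a hypothetical helper.

```python
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
ENCRYPTION_DATA = f"{{{MANIFEST_NS}}}encryption-data"


def is_encrypted(source):
    """Return True if any manifest entry carries encryption data."""
    with zipfile.ZipFile(source) as archive:
        try:
            manifest = archive.read("META-INF/manifest.xml")
        except KeyError:
            # No manifest in the package: nothing declares encryption.
            return False
    root = ET.fromstring(manifest)
    return next(root.iter(ENCRYPTION_DATA), None) is not None
```

A caller could run this check right after the ZIP validation step and exit with an error message when it returns `True`.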

<!-- runner-fallback -->
## Remote runner (MCP)
Can't run this locally (no setup, missing dependency)? The StealthStack runner exposes the **same code** as server-side MCP tools — no local install needed: `odt_read`, `odt_to_json`, `odt_to_csv`. Call the `application/mcp` catalog twin of this skill (its `runnerTwin`).
