---
name: arrow
displayName: Arrow / Feather Reader & Converter
description: Read, inspect, and convert Apache Arrow columnar files (.arrow,
  .feather) to CSV or JSON. Use when opening or transforming Arrow/Feather data
  files.
tags:
  - data
  - arrow
  - feather
  - columnar
  - csv
  - json
  - converter
  - extractor
capabilities:
  - ReadArrow
  - InspectSchema
  - ConvertToCsv
  - ConvertToJson
  - ExtractSample
representativeQueries:
  - Read a .arrow or .feather file and show me the data
  - Convert an Arrow file to CSV
  - What is the schema of this .feather file?
  - Extract a sample of rows from an Arrow dataset
  - Convert Arrow columnar data to JSON
version: 0.1.0
tier: curated
---

# Arrow / Feather Reader & Converter

Read and inspect Apache Arrow columnar binary files (`.arrow`, `.feather`) and convert them to human-readable text, CSV, or JSON using `pyarrow`. Handles both Arrow IPC sub-formats (Stream and File) as well as Feather V1 and V2.

## When to use

Use when you have a `.arrow` or `.feather` file and need to inspect its schema, preview rows, or export the data to a portable format like CSV or JSON for downstream tools.

## Steps

1. **Locate the file.** Confirm the path and extension (`.arrow` or `.feather`).
2. **Detect format.** `.feather` files are read with `pyarrow.feather`; `.arrow` files are read with the Arrow IPC reader (`pyarrow.ipc`). The converter script picks the reader automatically from the file extension.
3. **Read and parse.** Load the file into a pyarrow Table; resolve dictionary encoding to concrete values.
4. **Produce output.** Print schema info, convert to pandas for CSV/JSON serialization, or stream record batches for large files.
5. **Handle errors.** If `pyarrow` is missing, print an install hint and exit cleanly. If the file is unreadable or not a valid Arrow format, report the specific error.

## Operations

| Capability | CRUD | Resource | Tool |
|---|---|---|---|
| `ReadArrow` | READ | Arrow / Feather file | `scripts/arrow_convert.py` |
| `InspectSchema` | READ | Table schema | `scripts/arrow_convert.py --schema` |
| `ConvertToCsv` | READ | Table as CSV | `scripts/arrow_convert.py --format csv` |
| `ConvertToJson` | READ | Table as JSON | `scripts/arrow_convert.py --format json` |
| `ExtractSample` | READ | First N rows | `scripts/arrow_convert.py --rows N` |
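
As a rough illustration of how the flags in the table could map onto capabilities, here is a small `argparse` sketch. It is not the script's actual parser: the positional `path` argument, the helper names `build_parser` and `capability_for`, and the precedence used when several flags are combined are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser exposing the flags listed in the Operations table."""
    parser = argparse.ArgumentParser(prog="arrow_convert.py")
    # Assumption: the input file is passed as a single positional argument.
    parser.add_argument("path", help="path to a .arrow or .feather file")
    parser.add_argument("--schema", action="store_true",
                        help="print the schema only")
    parser.add_argument("--format", choices=["csv", "json"], default=None,
                        help="export format")
    parser.add_argument("--rows", type=int, default=None, metavar="N",
                        help="limit output to the first N rows")
    return parser


def capability_for(args):
    """Map parsed flags to a capability name (precedence is an assumption)."""
    if args.schema:
        return "InspectSchema"
    if args.format == "csv":
        return "ConvertToCsv"
    if args.format == "json":
        return "ConvertToJson"
    if args.rows is not None:
        return "ExtractSample"
    return "ReadArrow"
```

For example, `capability_for(build_parser().parse_args(["data.arrow", "--schema"]))` selects the schema-only path.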

## Output

- **Default:** schema + first 20 rows as a formatted text table (stdout).
- **`--format csv`:** full table as RFC 4180 CSV with header row.
- **`--format json`:** full table as a JSON array of objects.
- **`--schema`:** column names, Arrow types, and nullable flags only.
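
The CSV and JSON outputs can be approximated with the standard library alone. The sketch below deliberately swaps the pandas-based serialization mentioned in the steps for the `csv` and `json` modules, and assumes the rows arrive as a list of dicts such as the one returned by pyarrow's `Table.to_pylist()`. The helper names are hypothetical, and writing nulls as empty fields is an assumption.

```python
import csv
import io
import json


def _cell(value):
    """Convert one value into a CSV-friendly cell."""
    if value is None:
        return ""  # assumption: nulls become empty fields
    if isinstance(value, (dict, list)):
        # Nested values (structs, lists) are embedded as JSON text.
        return json.dumps(value, default=str)
    return value


def rows_to_csv(rows, columns):
    """Serialize rows as CSV with a header row and CRLF line endings."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, lineterminator="\r\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([_cell(row.get(name)) for name in columns])
    return buf.getvalue()


def rows_to_json(rows):
    """Serialize rows as a JSON array of objects."""
    # default=str covers values json cannot encode natively, such as dates.
    return json.dumps(rows, default=str)
```

With a pyarrow Table `table`, the inputs would be `table.to_pylist()` and `table.column_names`.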

## Notes

- `pyarrow` covers both Arrow IPC and Feather; no separate library needed.
- For very large files, use `--rows N` to limit output and avoid running out of memory.
- In CSV mode, nested columns (struct and list types) are serialized as JSON strings.
- Dictionary-encoded columns are decoded to their underlying value type.

<!-- runner-fallback -->
## Remote runner (MCP)
Can't run this locally (no setup, missing dependency)? The StealthStack runner exposes the **same code** as server-side MCP tools — no local install needed: `arrow_schema`, `arrow_read`, `arrow_to_csv`, `arrow_to_json`. Call the `application/mcp` catalog twin of this skill (its `runnerTwin`).
