mcat

cat on steroids — Parquet, CSV, JSONL, Avro, Excel, SQL queries, and remote sources

Why mcat?

Drop-in cat

Every GNU cat flag works. -n, -b, -s, -A, -v — all of them. If you know cat, you know mcat.

7+ formats

Parquet, CSV, TSV, JSONL, JSON, Avro, Excel — auto-detected by extension or magic bytes.

SQL queries

Filter with --query using SQL WHERE clauses. Powered by DuckDB with predicate pushdown on Parquet.

Instant stats

Column profiling from Parquet metadata — min, max, nulls, uniques with zero full-file I/O.

Cloud native

Stream from S3, GCS, Azure, HTTP, MinIO, R2. Zero-config auth — uses your existing credentials.

Compression

Transparent decompression for gzip, zstd, bz2, lz4, xz. Works on local and remote files.

Install

# pip
pip install mcat

# uv
# Install uv first (if you don't have it)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool install mcat

# Homebrew
brew tap christyjacob4/tap
brew install mcat

# Ubuntu PPA
sudo add-apt-repository ppa:christyjacob4/mcat
sudo apt update
sudo apt install mcat

See it in action

parquet — table rendering
$ mcat sales_data.parquet
┌─────────────┬──────────┬─────────┬───────────┐
│ name        │ region   │ sales   │ quarter   │
├─────────────┼──────────┼─────────┼───────────┤
│ Alice Chen  │ APAC     │ 94,230  │ Q1 2024   │
│ Bob Muller  │ EMEA     │ 71,450  │ Q1 2024   │
│ Carol Smith │ Americas │ 88,920  │ Q1 2024   │
│ David Park  │ APAC     │ 102,100 │ Q2 2024   │
│ Eve Santos  │ Americas │ 67,800  │ Q2 2024   │
└─────────────┴──────────┴─────────┴───────────┘
5 rows · 4 columns · parquet
--query — SQL filtering
$ mcat sales.parquet --query "sales > 80000 AND region = 'APAC'"
┌─────────────┬──────────┬─────────┬───────────┐
│ name        │ region   │ sales   │ quarter   │
├─────────────┼──────────┼─────────┼───────────┤
│ Alice Chen  │ APAC     │ 94,230  │ Q1 2024   │
│ David Park  │ APAC     │ 102,100 │ Q2 2024   │
└─────────────┴──────────┴─────────┴───────────┘
2 rows · 4 columns · parquet
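
mcat hands the --query string to DuckDB as a WHERE clause, so the filter above behaves roughly like this standalone snippet (a sketch of the equivalence, not mcat's internal code):

import duckdb

# Scan the Parquet file with DuckDB; the WHERE predicate is pushed
# down to the Parquet row groups, so non-matching data is never decoded.
duckdb.sql("""
    SELECT * FROM 'sales.parquet'
    WHERE sales > 80000 AND region = 'APAC'
""").show()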
--diff — file comparison
$ mcat --diff q1_sales.csv q2_sales.csv
┌─────┬────────┬─────────────┬─────────────────────────┐
│ Row │ Status │ name        │ sales                   │
├─────┼────────┼─────────────┼─────────────────────────┤
│ 0   │ ~      │ Alice Chen  │ 94,230 → 98,100         │
│ 1   │        │ Bob Muller  │ 71,450                  │
│ 2   │ ~      │ Carol Smith │ 88,920 → 91,340         │
│ 3   │ +      │ Frank Lee   │ 55,200                  │
└─────┴────────┴─────────────┴─────────────────────────┘
1 unchanged · 2 modified · 1 added · 0 removed
--stats — column profiling
$ mcat --stats sales_data.parquet
Stats  sales_data.parquet  (1,234,567 rows · 4 columns)

 Column    Type      Non-Null       Null    Min       Max        Mean
 ─────────────────────────────────────────────────────────────────────
 name      STRING    1,234,567         0    Aaron     Zoe           —
 age       INT64     1,230,000     4,567    18        94         36.4
 salary    FLOAT64   1,200,000    34,567    22,000    450,000  87,432
 region    STRING    1,234,567         0    APAC      EMEA          —
4.2 MB · parquet · compression: SNAPPY
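
The zero-I/O stats come from Parquet itself: the file footer stores per-column min/max and null counts. A minimal pyarrow sketch of reading them (illustrative only; mcat's actual implementation may differ):

import pyarrow.parquet as pq

pf = pq.ParquetFile("sales_data.parquet")
print(pf.metadata.num_rows)               # row count straight from the footer

# Per-column statistics from the first row group; no data pages are read
rg = pf.metadata.row_group(0)
for i in range(rg.num_columns):
    col = rg.column(i)
    s = col.statistics
    if s is not None and s.has_min_max:
        print(col.path_in_schema, s.min, s.max, s.null_count)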

Usage

Drop-in cat

mcat file.txt
mcat -n file.txt
mcat -A file.txt
echo "hello" | mcat

Structured data

mcat data.parquet
mcat data.csv
mcat --format jsonl data.parquet
mcat --schema data.parquet

Filter & slice

mcat --head 10 data.parquet
mcat --tail 5 data.csv
mcat --columns name,age data.parquet
mcat --sample 20 data.parquet

SQL queries

mcat data.parquet \
  --query "age > 30 AND city = 'NYC'"
mcat data.csv --query "salary > 50000" \
  --format jsonl

Sort & grep

mcat data.parquet --sort age
mcat data.parquet --sort -age,name
mcat data.csv --grep "Smith"
mcat data.csv --grep "NYC" --head 5

Diff & stats

mcat --diff old.csv new.csv
mcat --stats data.parquet
mcat --count data.parquet
mcat --detect data.parquet

Remote sources

mcat s3://bucket/data.parquet
mcat gs://bucket/data.parquet
mcat https://example.com/data.csv
mcat --s3-endpoint https://play.min.io \
  s3://mybucket/data.parquet

Compression & output

mcat data.parquet.gz
mcat data.csv.zst --head 100
mcat data.parquet -o data.jsonl \
  --format jsonl
mcat data.parquet --pager
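
For reference, fsspec (which mcat already uses for remote sources) offers the same kind of extension-based decompression; a sketch, not necessarily mcat's code path:

import fsspec

# compression="infer" picks gzip/bz2/xz (and zstd/lz4 when their optional
# packages are installed) from the extension, so reads come back decompressed
with fsspec.open("data.csv.gz", "rt", compression="infer") as f:
    print(f.readline())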

Format Support

Formats are auto-detected by extension, then by magic bytes (PAR1, Obj\x01) as a fallback; a sketch of the detection order follows the table.

Format    Extensions        Features
Parquet   .parquet .pq      Row-group streaming, schema inspect, instant count/stats
Avro      .avro             Stream blocks, schema inspect
CSV       .csv              Table with headers, auto-detect delimiter
TSV       .tsv              Table with headers
JSONL     .jsonl .ndjson    Pretty-print records
JSON      .json             Array of objects or single object
Excel     .xlsx .xls        First sheet, both legacy and modern
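
A minimal sketch of that detection order (hypothetical helper, not mcat's actual code):

from pathlib import Path

EXTENSION_MAP = {
    ".parquet": "parquet", ".pq": "parquet", ".avro": "avro",
    ".csv": "csv", ".tsv": "tsv", ".jsonl": "jsonl", ".ndjson": "jsonl",
    ".json": "json", ".xlsx": "excel", ".xls": "excel",
}
MAGIC_MAP = {b"PAR1": "parquet", b"Obj\x01": "avro"}

def detect_format(path: str) -> str:
    ext = Path(path).suffix.lower()
    if ext in EXTENSION_MAP:              # 1. extension wins
        return EXTENSION_MAP[ext]
    with open(path, "rb") as f:           # 2. fall back to magic bytes
        head = f.read(4)
    for magic, fmt in MAGIC_MAP.items():
        if head.startswith(magic):
            return fmt
    return "text"                         # 3. plain cat behavior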

Remote Sources

Streaming via fsspec — no full downloads, with range requests where supported; a small sketch follows the table.

Protocol        Backend                Notes
s3://           s3fs + boto3           AWS S3, MinIO, R2, B2, DO Spaces
gs://           gcsfs                  Google Cloud Storage
az://           adlfs                  Azure Blob Storage
https://        fsspec built-in        Range requests where supported
S3-compatible   s3fs + --s3-endpoint   MinIO, Cloudflare R2, Backblaze B2
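
Concretely, "streaming" means ranged reads instead of downloads; a small fsspec sketch (bucket and key are hypothetical, and mcat's exact calls may differ):

import fsspec

# Requires s3fs for s3:// URLs. Reading 4 bytes issues a ranged GET,
# not a full-object download.
with fsspec.open("s3://my-bucket/data.parquet", "rb") as f:
    print(f.read(4))                      # b'PAR1' for a Parquet file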

Authentication

Zero-config auth — mcat uses credentials you've already set up for your cloud provider.

# AWS CLI (recommended)
aws configure
mcat s3://my-bucket/data.parquet

# Environment variables
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
mcat s3://my-bucket/data.parquet

# Named profile
AWS_PROFILE=prod mcat s3://my-bucket/data.parquet

# Per-command endpoint
mcat --s3-endpoint https://play.min.io s3://mybucket/data.parquet

# Environment variable
export AWS_ENDPOINT_URL=https://play.min.io
mcat s3://mybucket/data.parquet

# Cloudflare R2
mcat --s3-endpoint https://<account>.r2.cloudflarestorage.com \
  s3://bucket/file.parquet

# gcloud CLI (recommended)
gcloud auth application-default login
mcat gs://my-bucket/data.parquet

# Service account key
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
mcat gs://my-bucket/data.parquet
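
The zero-config behavior comes from the backends themselves: s3fs defers to botocore's standard credential chain (environment variables, ~/.aws/credentials, instance profiles), the same lookup the AWS CLI uses. A sketch with a hypothetical bucket:

import s3fs

# No keys passed in: s3fs resolves credentials exactly like the AWS CLI does.
fs = s3fs.S3FileSystem()
print(fs.ls("my-bucket"))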