Parquet Viewer for Mac

Parquet File Inspection Tool for macOS

Download on the App Store

App Features

View Parquet File Details

Quickly preview file contents, including metadata, row count, and file size.
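
For reference, the same details can be read programmatically. A minimal sketch with pyarrow (the library choice and the file name data.parquet are assumptions for illustration, not part of the app):

import os
import pyarrow.parquet as pq

meta = pq.ParquetFile('data.parquet').metadata
print(meta.num_rows)                    # row count
print(meta.num_row_groups)              # number of row groups
print(os.path.getsize('data.parquet'))  # file size in bytes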

View Parquet Schema

View detailed schema information, including column types, compression algorithms, and encoding schemes, presented with intuitive visualizations.
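
The same schema details are also accessible programmatically. A minimal sketch with pyarrow (library and file name are assumptions for illustration):

import pyarrow.parquet as pq

pf = pq.ParquetFile('data.parquet')
print(pf.schema_arrow)                    # column names and logical types
col = pf.metadata.row_group(0).column(0)  # first column chunk of first row group
print(col.compression, col.encodings)     # compression codec and encoding schemes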

What is Parquet?

Apache Parquet is an open-source, columnar storage format designed for efficient analytical processing. It stores data column by column with typed schemas, built-in compression, and rich metadata, which is what drives the comparisons below.

Parquet vs Other Formats

Parquet vs JSON

  • Storage Efficiency: Parquet's columnar storage typically offers 3-5x better compression than row-based JSON
  • Schema Evolution: Parquet supports explicit schema definition with backward/forward compatibility
  • Query Performance: Columnar format enables efficient column pruning and predicate pushdown (see the sketch after this list)
  • Best Use: JSON for APIs/web data, Parquet for analytical workloads
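
A minimal sketch of column pruning and predicate pushdown using pandas over the pyarrow engine (the file and column names are assumptions for illustration):

import pandas as pd

# Read only two columns; row groups whose statistics rule out the filter are skipped
df = pd.read_parquet(
    'data.parquet',
    columns=['user_id', 'amount'],
    filters=[('amount', '>', 100)],
)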

Parquet vs CSV

  • Type Safety: Parquet enforces data types while CSV relies on inference
  • Compression: Parquet achieves 75%+ compression rates vs CSV's typical 20-40%
  • Chunking: Parquet supports efficient data partitioning with row groups
  • Metadata: Built-in statistics in Parquet enable optimization without full scans (see the sketch below)
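
A minimal sketch of reading those built-in statistics with pyarrow, without scanning any data pages (the file name is an assumption; min/max may be absent for some columns):

import pyarrow.parquet as pq

meta = pq.ParquetFile('data.parquet').metadata
for i in range(meta.num_row_groups):
    stats = meta.row_group(i).column(0).statistics  # stats of the first column
    if stats is not None:
        print(i, stats.min, stats.max, stats.null_count)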

Working with Parquet

Java Implementation

Read Parquet:

import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.hadoop.ParquetReader;

// ParquetReader.builder() alone needs a ReadSupport; use AvroParquetReader for GenericRecord access
ParquetReader<GenericRecord> reader = AvroParquetReader
    .<GenericRecord>builder(new Path("data.parquet"))
    .withConf(new Configuration())
    .build();
GenericRecord record;
while ((record = reader.read()) != null) {
    // Process record
}
reader.close();

Write Parquet:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

// AvroParquetWriter expects an Avro Schema, not a Parquet MessageType
Schema schema = new Schema.Parser().parse(schemaJson); // schemaJson: your Avro schema as a JSON string
ParquetWriter<GenericRecord> writer = AvroParquetWriter
    .<GenericRecord>builder(new Path("output.parquet"))
    .withSchema(schema)
    .build();
writer.write(record);
writer.close();

JavaScript Implementation

Read Parquet:

import { ParquetReader } from 'parquetjs';

const reader = await ParquetReader.openFile('data.parquet');
const cursor = reader.getCursor();
let record;
while ((record = await cursor.next())) {
  // Process record
}
await reader.close();

Write Parquet:

import { ParquetSchema, ParquetWriter } from 'parquetjs';

// Illustrative two-column schema; the field names here are assumptions
const schema = new ParquetSchema({
  name: { type: 'UTF8' },
  qty: { type: 'INT64' },
});
const writer = await ParquetWriter.openFile(schema, 'output.parquet');
await writer.appendRow({ name: 'apple', qty: 10 });
await writer.close();

Python Implementation

Read Parquet:

import pandas as pd
df = pd.read_parquet('data.parquet', engine='pyarrow')
print(df.head())

Write Parquet:

df.to_parquet(
    'output.parquet',
    engine='pyarrow',
    compression='snappy',
)