Overview
Implement conversion from JSON Structure schemas to Apache Parquet schemas.
Requirements
This conversion should:
- Lean on the corresponding Avro conversion (avrotoparquet) as precedent for output structure, including use of Jinja templates where applicable
- Cover the full breadth of the JSON Structure Core spec as defined in draft-vasters-json-structure-core-00
- Follow the patterns established by structuretocsharp and structuretopython, including their continued support for Avro schemas
Implementation Guidance
- Review avrotize/avrotoparquet.py for output patterns and template usage
- Review avrotize/structuretocsharp.py and avrotize/structuretopython.py for the JSON Structure handling patterns
- Ensure all JSON Structure Core types are supported:
  - JSON Primitive Types: string, number, boolean, null
  - Extended Primitive Types: binary, int8-128, uint8-128, float8/float/double, decimal, date, datetime, time, duration, uuid, uri, jsonpointer
  - Compound Types: object, array, set, map, tuple, any, choice (both tagged and inline unions)
- Support JSON Structure-specific features:
  - Namespaces and definitions
  - Type references ($ref)
  - Extensions ($extends) and add-ins ($offers/$uses)
  - Abstract types
  - Required/optional properties
  - Type annotations (maxLength, precision, scale, contentEncoding, etc.)
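To make the guidance above concrete, here is a minimal sketch of how a few of the listed types, local $ref resolution, required/optional properties, and the precision/scale annotations might map to Parquet fields. Everything in it is an assumption for illustration: the names PRIMITIVES, resolve_ref and to_field are hypothetical, the output is a plain dict and not the structure avrotoparquet emits, the chosen Parquet physical and logical types are one plausible mapping, and the decimal defaults (38, 0) are placeholders. It also assumes "required" is a flat list of property names and covers only a small subset of the types.

```python
# Illustrative sketch only: names, output shape, and type choices are
# assumptions, not taken from avrotize or mandated by the JSON Structure spec.

# JSON Structure primitive -> (Parquet physical type, logical annotation)
PRIMITIVES = {
    "string": ("BYTE_ARRAY", "STRING"),
    "boolean": ("BOOLEAN", None),
    "int32": ("INT32", None),
    "int64": ("INT64", None),
    "float": ("FLOAT", None),
    "double": ("DOUBLE", None),
    "binary": ("BYTE_ARRAY", None),
    "date": ("INT32", "DATE"),
    "uuid": ("FIXED_LEN_BYTE_ARRAY(16)", "UUID"),
}


def resolve_ref(root, ref):
    """Follow a local '#/...' JSON Pointer into the root schema document."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled: {ref}")
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def to_field(name, schema, root, required=True):
    """Describe one schema node as a Parquet-like field dictionary."""
    repetition = "REQUIRED" if required else "OPTIONAL"
    type_ = schema.get("type")

    # A type given as {"$ref": ...} is replaced by the definition it points to.
    if isinstance(type_, dict) and "$ref" in type_:
        return to_field(name, resolve_ref(root, type_["$ref"]), root, required)

    # Objects become groups; properties missing from "required" are OPTIONAL.
    if type_ == "object":
        required_names = set(schema.get("required", []))
        children = [
            to_field(prop, prop_schema, root, prop in required_names)
            for prop, prop_schema in schema.get("properties", {}).items()
        ]
        return {"name": name, "repetition": repetition, "group": children}

    # precision/scale annotations feed the DECIMAL logical type.
    if type_ == "decimal":
        precision = schema.get("precision", 38)  # placeholder default
        scale = schema.get("scale", 0)           # placeholder default
        return {"name": name, "repetition": repetition,
                "physical": "BYTE_ARRAY",
                "logical": f"DECIMAL({precision}, {scale})"}

    if type_ in PRIMITIVES:
        physical, logical = PRIMITIVES[type_]
        return {"name": name, "repetition": repetition,
                "physical": physical, "logical": logical}

    raise NotImplementedError(f"type not covered by this sketch: {type_!r}")


example = {
    "type": "object",
    "name": "Person",
    "properties": {
        "id": {"type": "uuid"},
        "balance": {"type": "decimal", "precision": 10, "scale": 2},
        "home": {"type": {"$ref": "#/definitions/Address"}},
    },
    "required": ["id"],
    "definitions": {
        "Address": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        }
    },
}

person = to_field("Person", example, example)
```

A real converter would still need to handle the remaining compound types (array, set, map, tuple, any, choice), $extends, add-ins, and abstract types, none of which this sketch attempts.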
References
- avrotize/avrotoparquet.py
- avrotize/structuretocsharp.py, avrotize/structuretopython.py