Expose Paimon table observability APIs (snapshots, partitions, tags) for pypaimon_rust #285

@SML0127

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Beyond catalog metadata (issue #284), pypaimon_rust also lacks the table-level introspection APIs that pypaimon exposes today. Snapshot history, partition layout, and tags are not accessible from Python.

Solution

Add the following APIs to PyTable and introduce new wrapper classes:

# PyTable (additions): snapshots
table.latest_snapshot()  -> PySnapshot | None
table.list_snapshots()   -> list[PySnapshot]   # full history, newest first

# PySnapshot (new)
snapshot.id()                  -> int
snapshot.commit_time_ms()      -> int         # Snapshot::time_millis(), epoch ms
snapshot.total_record_count()  -> int | None  # None if not tracked
snapshot.delta_record_count()  -> int | None  # None if not tracked
snapshot.commit_kind()         -> str         # "APPEND" | "COMPACT" | "OVERWRITE" | "ANALYZE"

# PyTable (additions): partitions
table.list_partitions()  -> list[dict[str, str]]
# e.g. [{"dt": "2024-01-01", "hr": "10"}, ...]

table.partition_stats()  -> list[PyPartitionStat]

# PyPartitionStat (new)
stat.partition()         -> dict[str, str]
stat.record_count()      -> int
stat.file_count()        -> int
stat.total_size_bytes()  -> int

# PyTable (addition): tags
table.list_tags() -> list[PyTag]

# PyTag (new)
tag.name()        -> str
tag.snapshot_id() -> int
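
To make the proposed surface easier to review and mock, the PySnapshot signatures above can be restated as a `typing.Protocol`. This is only a sketch under assumptions: `SnapshotLike`, `FakeSnapshot`, and `describe` are hypothetical names that do not exist in pypaimon_rust; only the method names and return types are taken from the proposal.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable

# Commit kinds listed in the proposal for PySnapshot.commit_kind().
COMMIT_KINDS = ("APPEND", "COMPACT", "OVERWRITE", "ANALYZE")


@runtime_checkable
class SnapshotLike(Protocol):
    """Hypothetical protocol mirroring the proposed PySnapshot methods."""

    def id(self) -> int: ...
    def commit_time_ms(self) -> int: ...
    def total_record_count(self) -> Optional[int]: ...
    def delta_record_count(self) -> Optional[int]: ...
    def commit_kind(self) -> str: ...


@dataclass
class FakeSnapshot:
    """In-memory stand-in for tests; NOT the real binding."""

    snapshot_id: int
    time_ms: int
    total: Optional[int]
    delta: Optional[int]
    kind: str

    def id(self) -> int:
        return self.snapshot_id

    def commit_time_ms(self) -> int:
        return self.time_ms

    def total_record_count(self) -> Optional[int]:
        return self.total

    def delta_record_count(self) -> Optional[int]:
        return self.delta

    def commit_kind(self) -> str:
        return self.kind


def describe(snapshot: SnapshotLike) -> str:
    """Format a snapshot, treating an untracked record count as 'unknown'."""
    total = snapshot.total_record_count()
    total_text = "unknown" if total is None else str(total)
    return f"snapshot {snapshot.id()} [{snapshot.commit_kind()}] total={total_text}"
```

Code written against `SnapshotLike` would accept either the real wrapper (once it exists) or the fake, which keeps caller-side tests independent of the Rust extension.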

Intended usage:

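table = catalog.get_table("mydb.orders")

# latest_snapshot() returns None when the table has no snapshots yet.
snap = table.latest_snapshot()

if snap is not None:
    # total_record_count() may itself be None if the count is not tracked.
    print(snap.commit_time_ms(), snap.total_record_count(), snap.commit_kind())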

for stat in table.partition_stats():
    print(stat.partition(), stat.record_count(), stat.total_size_bytes())

for tag in table.list_tags():
    print(tag.name(), tag.snapshot_id())
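
As one illustration of what `partition_stats()` would enable, the sketch below ranks partitions by size and sums their bytes. It assumes the proposed PyPartitionStat method names; `StubPartitionStat`, `largest_partitions`, and `total_bytes` are hypothetical helpers written for this example, with the stub standing in for the not-yet-implemented binding.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class StubPartitionStat:
    """Stand-in exposing the same method names as the proposed PyPartitionStat."""

    spec: Dict[str, str]
    records: int
    files: int
    size_bytes: int

    def partition(self) -> Dict[str, str]:
        return dict(self.spec)

    def record_count(self) -> int:
        return self.records

    def file_count(self) -> int:
        return self.files

    def total_size_bytes(self) -> int:
        return self.size_bytes


def largest_partitions(stats: Iterable, top_n: int = 3) -> List[Tuple[Dict[str, str], int]]:
    """Return the top_n partitions by total size, largest first."""
    ranked = sorted(stats, key=lambda s: s.total_size_bytes(), reverse=True)
    return [(s.partition(), s.total_size_bytes()) for s in ranked[:top_n]]


def total_bytes(stats: Iterable) -> int:
    """Sum the on-disk size across all partitions."""
    return sum(s.total_size_bytes() for s in stats)


# Example data shaped like the partition dicts in the proposal.
example_stats = [
    StubPartitionStat({"dt": "2024-01-01", "hr": "10"}, 100, 2, 4096),
    StubPartitionStat({"dt": "2024-01-01", "hr": "11"}, 50, 1, 8192),
]
```

With the real API, `table.partition_stats()` would replace `example_stats` as the input to both helpers.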

Anything else?

No response

Willingness to contribute

  • I'm willing to submit a PR!
