feat: add DataFrame writeback to remote ClickHouse server#561

Draft
wudidapaopao wants to merge 4 commits into chdb-io:main from wudidapaopao:support_dataframe_writeback

wudidapaopao (Contributor) commented Apr 16, 2026

Closes #560

Add writeback APIs to DataStore for writing data back to a remote ClickHouse server.

Changes

  • save() — unified entry point
  • to_clickhouse() — write data to a remote table (fail / replace / append)
  • create_view() / create_materialized_view() — create views on remote server
  • DataFrame upload fallback for Pandas-only pipelines
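The three modes listed for to_clickhouse() can be sketched as a small planning function. This is only an illustration of the fail / replace / append semantics, not the PR's implementation: `plan_writeback`, `TableExistsError`, the `source` argument, and the generated SQL are all hypothetical names chosen for this example.

```python
from typing import List

VALID_MODES = ("fail", "replace", "append")


class TableExistsError(Exception):
    """Hypothetical error for a fail-mode collision (not the PR's exception type)."""


def plan_writeback(table: str, columns_ddl: str, source: str,
                   table_exists: bool, if_exists: str = "fail") -> List[str]:
    """Return the SQL statements a writeback would issue, in order.

    Assumed semantics, mirroring the mode names in the PR description:
      fail    -> error if the target table already exists
      replace -> drop and recreate the target, then insert
      append  -> insert into the existing target as-is
    """
    if if_exists not in VALID_MODES:
        raise ValueError(f"if_exists must be one of {VALID_MODES}, got {if_exists!r}")

    create = f"CREATE TABLE {table} ({columns_ddl}) ENGINE = MergeTree ORDER BY tuple()"
    insert = f"INSERT INTO {table} SELECT * FROM {source}"

    if not table_exists:
        # No collision: every mode creates the table and then inserts.
        return [create, insert]
    if if_exists == "fail":
        raise TableExistsError(f"table {table} already exists")
    if if_exists == "replace":
        return [f"DROP TABLE {table}", create, insert]
    return [insert]  # append
```

For example, `plan_writeback("db.t", "id UInt64", "staging", True, "append")` yields only the INSERT statement, while the same call with `"replace"` yields DROP, CREATE, INSERT.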

wudidapaopao force-pushed the support_dataframe_writeback branch from c57e6f5 to a54fc6e on April 16, 2026 13:24

CLAassistant commented Apr 16, 2026

CLA assistant check
All committers have signed the CLA.

ClickHouse doesn't yet support atomic CREATE OR REPLACE MATERIALIZED
VIEW (PR #100539 still open), and emulating it via DROP + CREATE leaves
a window where source INSERTs are silently dropped. Remove
if_mv_exists from create_materialized_view and reject if_exists='replace'
in save(); existing MV now always raises. Re-enable once upstream lands.
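The window described in this commit message can be illustrated with a toy model. A ClickHouse materialized view acts as an insert trigger on its source table, so rows inserted while no view is attached never reach the target. The `ToySource` class below is purely hypothetical and involves no ClickHouse at all; it only mimics the trigger behaviour in memory.

```python
class ToySource:
    """In-memory stand-in for a source table with at most one attached view."""

    def __init__(self):
        self.rows = []
        self.mv_target = None  # list the attached "view" writes to, or None

    def attach_mv(self, target):
        # Plays the role of CREATE MATERIALIZED VIEW ... TO target
        self.mv_target = target

    def detach_mv(self):
        # Plays the role of DROP VIEW
        self.mv_target = None

    def insert(self, row):
        self.rows.append(row)
        # The view only sees rows inserted while it is attached.
        if self.mv_target is not None:
            self.mv_target.append(row)


src = ToySource()
target = []

src.attach_mv(target)
src.insert("a")        # propagated to the target

src.detach_mv()        # emulated replace, step 1: DROP
src.insert("b")        # lands in the window: source has it, target never will
src.attach_mv(target)  # emulated replace, step 2: CREATE

src.insert("c")        # propagated again
```

After this sequence the source holds all three rows while the target holds only "a" and "c", and nothing raised an error along the way. That silent gap is why the commit rejects `if_exists='replace'` for materialized views rather than emulating it.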
…chema mismatch

- _check_remote_table_exists: query system.tables via remote() instead of
  EXISTS TABLE through remote(query=...), which only returns DDL status rows
- _build_engine_clause: emit SETTINGS allow_nullable_key = 1 for MergeTree
  so DESCRIBE-inferred Nullable columns can appear in ORDER BY
- normalize df index once via _materialize_for_writeback so CREATE TABLE,
  schema evolution and INSERT all see the same column shape
- tests: expect ExecutionError for fail-mode collisions and assert
  ClickHouse rejection for cross-server materialized views
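The first two items can be sketched as plain string builders. This is a minimal sketch under assumptions: the function names echo the commit message, but the signatures, the quoting helper, and the exact SQL text are made up for illustration and are not taken from the PR. It relies only on the `remote()` table function, the `system.tables` table, and the MergeTree setting `allow_nullable_key`.

```python
def quote(value: str) -> str:
    """Render a Python string as a single-quoted SQL string literal (naive escaping)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def remote_table_exists_query(addr: str, database: str, table: str,
                              user: str = "default", password: str = "") -> str:
    """Build a query that counts matching rows in the remote server's system.tables.

    A count of 0 means the table is absent, 1 means it exists. Hypothetical helper.
    """
    return (
        "SELECT count() FROM remote("
        f"{quote(addr)}, 'system', 'tables', {quote(user)}, {quote(password)}) "
        f"WHERE database = {quote(database)} AND name = {quote(table)}"
    )


def build_engine_clause(engine: str = "MergeTree", order_by=()) -> str:
    """Build the ENGINE clause of a CREATE TABLE statement. Hypothetical helper.

    For MergeTree-family engines, allow_nullable_key = 1 is appended so that
    Nullable columns are accepted in ORDER BY.
    """
    if not engine.endswith("MergeTree"):
        # Engines such as Memory take no ORDER BY or MergeTree settings.
        return f"ENGINE = {engine}"
    order_expr = "(" + ", ".join(order_by) + ")" if order_by else "tuple()"
    return f"ENGINE = {engine} ORDER BY {order_expr} SETTINGS allow_nullable_key = 1"
```

A caller would run the first query through whatever client it uses and compare the returned count with zero; the second helper's output is appended to a CREATE TABLE column list.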

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support DataFrame writeback to remote ClickHouse server

2 participants