What alternatives get wrong blog #1254

Draft
Naros wants to merge 1 commit into debezium:develop from Naros:debezium-clarify-alternative-claims

Conversation

Member

@Naros Naros commented May 1, 2026

No description provided.

github-actions Bot commented May 1, 2026

😭 Deploy PR Preview failed.

Signed-off-by: Chris Cranford <chris@hibernate.org>
@Naros force-pushed the debezium-clarify-alternative-claims branch from 21d9c14 to c5af8cb on May 1, 2026 17:18
**You need a fully managed service with no operational responsibility**.
Debezium is self-hosted.
You run it, you monitor it, you upgrade it.
If you want to hand all that to a vendor, Debezium may not be the right fit, not because it's complex, but because that is not what it is.
Contributor

Not exactly; there are vendors that offer CDC-as-a-Service. In fact, this is not something Debezium itself should provide. It would be like agreeing that a criticism is valid because the Quarkus project does not give you a PaaS environment.

Member

And also Debezium Platform could then simplify this point

Member

I mean, our own company offers this, via Confluent Cloud :) Ofc. this article isn't meant to be a sales ad, but pointing out the relationship between Debezium as an upstream OSS project (which can be self-run) and managed downstream services like Confluent's would be fair.

Member Author

How do you all feel about this wording?

You need a fully managed CDC-as-a-Service offering.
Debezium is an open-source project, not a hosted service.
If your requirement is that someone else operate the CDC infrastructure for you, handling provisioning, upgrades, support, and SLAs, that's a different product category.

To be transparent, several managed services use upstream Debezium connectors in their offerings, including Confluent Cloud.
The relationship between Debezium, an upstream open-source project, and the managed downstream services built with it is a feature of the ecosystem.
Choose the self-hosted project if you want to run it yourself, or the managed service if you want someone else to run it.

Both are valid, and the criticism that "Debezium isn't a managed service" is really just a question of which layer of the stack you're shopping for.

Comment thread _posts/2026-05-06-what-alternatives-get-wrong.adoc
---

If you've searched for Change Data Capture (CDC) solutions recently, you've almost certainly landed on an article with a title like "_Top Debezium Alternatives in 2026_" or "_Why We Moved Away from Debezium_."
These articles follow a familiar pattern: they list the same handful of criticisms, present their alternative solution, and move on.
Contributor

I wouldn't mention that they "present their alternative solution"; that sounds like settling scores.

Member Author

I have two alternatives. What are your opinions?

These articles follow a familiar pattern: they list the same handful of criticisms and recommend a different tool.

These articles follow a familiar pattern: the same handful of criticisms appear again and again, often without any examination of whether they still hold today.

Comment thread _posts/2026-05-06-what-alternatives-get-wrong.adoc
From there, Debezium participates in the standard Quarkus application lifecycle.
Configuration lives in `application.properties`, alongside the rest of your application's configuration.
Change events are consumed as CDI events.
The entire Quarkus developer experience, including dev services that spin up databases automatically for local development, is available.
Contributor

For the purpose of the message of this article it makes sense to point to https://docs.spring.io/spring-integration/reference/debezium.html

Also Debezium Engine without any framework support should be mentioned.

Member

Here I'd mention the possibility of using Debezium even in a Python stack via the PyDebezium engine (rather than in the part about operating a JVM stack; see my comment below).

Member

OMG, TIL about PyDebezium, whattt?!

But yes, I think the sequencing needs to be different:

  • available as a library for in-app usage (e.g. cache invalidation is a great one)
  • plain, Quarkus, Spring, etc.
  • PyDebezium
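
As a rough illustration of the in-app cache invalidation use case mentioned above, here is a minimal sketch in plain Python. It assumes each change event arrives as a dictionary shaped like Debezium's event envelope (`op`, `before`, `after`, `source`). The `cache` dictionary, the `handle_change_event` function, and the `id` primary-key column are hypothetical names chosen for this example and are not part of any Debezium or PyDebezium API.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical in-memory cache keyed by (table, primary key); a stand-in for
# whatever cache the application actually uses.
cache: Dict[Tuple[str, int], Dict[str, Any]] = {
    ("customers", 1001): {"id": 1001, "email": "old@example.com"},
    ("customers", 1002): {"id": 1002, "email": "keep@example.com"},
}


def handle_change_event(event: Dict[str, Any]) -> Optional[Tuple[str, int]]:
    """Evict the cache entry for the row touched by an update or delete.

    Returns the cache key that was targeted, or None if the event was ignored.
    """
    # Envelope op codes: "c" create, "u" update, "d" delete, "r" snapshot read.
    if event["op"] not in ("u", "d"):
        return None  # creates and snapshot reads cannot make a cached row stale

    # Deletes carry the row in "before"; updates carry the new row in "after".
    row = event.get("after") or event.get("before")
    if row is None:
        return None

    # Assumption for this sketch: every table has an integer "id" primary key.
    key = (event["source"]["table"], row["id"])
    cache.pop(key, None)
    return key
```

Calling `handle_change_event` from whatever callback delivers events to the application keeps the cache consistent with the database without polling. How events are delivered depends on the chosen runtime (plain engine, Quarkus, Spring, or PyDebezium) and is deliberately left out here.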

Comment thread _posts/2026-05-06-what-alternatives-get-wrong.adoc

The alternative articles that cite this as a criticism are, in effect, saying "Debezium doesn't try to replace a stream processing framework."
That is correct, and it's not meant to.
It's meant to complement those solutions.
Contributor

With in-process Debezium you can do anything, since you have the whole programming language available.

Member Author

Thoughts on this?

Any alternative article that cites this as a criticism is, in effect, saying "Debezium doesn't try to replace a stream processing framework."
That is correct, and it's not meant to.
It's meant to complement those solutions or be embedded within one.

The last point is worth emphasizing.
When you run Debezium in-process, either with the Quarkus extensions or the Embedded engine, transformation isn't a separate concern at all.
Change events arrive as objects in your application, and there you have the entire JVM ecosystem available: use any library, framework, or custom logic you need.

The "limited transformations" framing assumes a pipeline architecture in which Debezium emits events and something downstream processes them.
That's one valid deployment model, but it's not the only one.


**Your team has no existing JVM or distributed systems operational experience**.
Debezium runs on the Java Virtual Machine (JVM).
If your operations team has no familiarity with that ecosystem, there will be a learning curve.
Contributor

Can we refer to the PyDebezium engine as an example of not needing to work with Java directly?

Member

I'm afraid Chris' point about JVM knowledge still applies here, because you still need to run a JVM and eventually troubleshoot it (the Python integration is less frequently used, so I'd guess you would need to debug, or at least report bugs, more often than when you just run e.g. Debezium Server).
Maybe I'd consider mentioning here that Debezium can be compiled into a native executable using GraalVM, to avoid installing and using a JVM.

Member Author

I think we can work both of those in, as I think they both have merit.

Member Author

@Naros Naros May 4, 2026

Does this fit what you two had in mind?

Your team has no existing JVM or distributed systems operational experience.::
Debezium runs on the Java Virtual Machine (JVM).
https://github.com/memiiso/pydbzengine[PyDebezium] makes it possible to consume change events from Python applications, and https://debezium.io/documentation/reference/stable/integrations/quarkus-debezium-engine-extension.html[GraalVM native compilation] removes the need to install a JVM at runtime. +
+
Even so, the underlying engine is still a JVM-based system.
When something goes wrong in production, and in any distributed system something eventually will, troubleshooting and bug reporting will benefit from at least some familiarity with that ecosystem.
If your operations team has none, there will be a learning curve.


Let's look at what a basic Debezium Server setup actually involves.

A minimal `docker-compose.yml` for capturing changes from PostgreSQL and emitting them to a sink of your choice looks like this:
Contributor

With a docker-compose file containing only a single container, you could also just use the docker command directly. Could there be another example, for instance one using Google Pub/Sub?

Member

I'd still prefer the docker-compose way. It'll hopefully be closer to real deployment scenarios.

Member Author

Would there be any harm in showing both?

Many tutorials and articles were written during a period long before these new options were introduced, and are still indexed and widely shared.
However, it is not an accurate description of Debezium today.

If you elect to choose an alternative tool specifically to avoid a Kafka dependency, it's worth asking whether you have evaluated Debezium Server and Platform first.
Member

I can't say what it is, but I feel like this sentence should have another tone too. Maybe more passive wording? wdyt?

Member Author

If you're referring to line 44, I can certainly see if there is a lighter way to make the point.

Member

I think we need to be generally careful to not support any "Kafka is overhead" narrative, consciously or subconsciously. The Debezium Server angle should rather be that "Debezium also is available to you when using other streaming platforms".


The source is open.
The community is active.
And the architecture is significantly more flexible than what you may have realized.
Member

Maybe pointing to the picture that other articles claim about Debezium is better here than pointing to the realizations of the reader? Or just summarizing that the presented options for running Debezium make a more flexible solution?

Member Author

I believe the latter is the better approach. I'd rather focus on giving the reader information, "hey, in case you aren't aware" mentality. At the end of the day, it's their decision, but we'd want them to be as informed as possible before making it.

Member

@vjuranek vjuranek left a comment

Left a few comments, but really nice blog post!


What _does exist_ is comprehensive metrics exposure.
Debezium exposes a rich set of JMX metrics covering connector status, transaction log position, event counts, processing rates, and lag.
When paired with **Prometheus JMX Exporter** and **Grafana**, you get production-grade observability with dashboards the community has built and shared.
Member

Maybe link the monitoring example directly here?
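
Alongside the JMX metrics discussed above, lag can also be estimated from the events themselves. The sketch below assumes the event envelope carries `ts_ms` (when the connector processed the event) and `source.ts_ms` (when the change was made in the database). The function name `capture_lag_ms` and the sample values are made up for illustration.

```python
from typing import Any, Dict


def capture_lag_ms(event: Dict[str, Any]) -> int:
    """Return the delay, in milliseconds, between the database change and
    the moment the connector processed it.

    Clock skew between the database host and the connector host can distort
    this value, so treat it as an estimate rather than a precise measurement.
    """
    return event["ts_ms"] - event["source"]["ts_ms"]


# Hypothetical event: the change was committed 250 ms before it was processed.
sample_event = {
    "op": "u",
    "ts_ms": 1_700_000_000_250,
    "source": {"table": "customers", "ts_ms": 1_700_000_000_000},
}
lag = capture_lag_ms(sample_event)
```

A per-event estimate like this complements the dashboards built on the JMX metrics but does not replace them, because it says nothing while no events are flowing.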


* Filtering events by table or operation type
* Masking or replacing sensitive column values
* Converting data types
* Adding metadata fields to events
Member

Could be worth mentioning the AI SMTs?
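
To make the first two bullet points concrete, here is a minimal sketch of what filtering and masking do to a change event. It is plain Python operating on a dictionary shaped like the event envelope, not Debezium's SMT API or connector configuration. The names `INCLUDED_TABLES`, `SKIPPED_OPS`, `MASKED_COLUMNS`, and `transform` are hypothetical.

```python
from typing import Any, Dict, Optional

# Hypothetical settings for this sketch only.
INCLUDED_TABLES = {"customers"}   # keep events from these tables
SKIPPED_OPS = {"d"}               # drop delete events
MASKED_COLUMNS = {"email"}        # replace these column values
MASK = "****"


def transform(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a filtered and masked copy of the event, or None to drop it."""
    if event["source"]["table"] not in INCLUDED_TABLES:
        return None
    if event["op"] in SKIPPED_OPS:
        return None

    result = dict(event)  # shallow copy so the input event is left untouched
    for side in ("before", "after"):
        row = event.get(side)
        if row is not None:
            # Build a new row dict with sensitive columns masked.
            result[side] = {
                column: (MASK if column in MASKED_COLUMNS else value)
                for column, value in row.items()
            }
    return result
```

In a real deployment this kind of logic is expressed declaratively through the connector's filtering and masking options or through SMTs; the sketch only shows the effect on a single event.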

**This claim hasn't been true for years**.

Debezium is a Change Data Capture (CDC) platform.
Apache Kafka Connect is _one way to deploy Debezium_, and for teams that already run Kafka, it remains an excellent choice.
Member

Worth a note on the rich ecosystem of Kafka Connect (sink) connectors you can tap into that way, as well as HA, history storage, and so on. I.e. don't feed into the narrative "Kafka is a burden", but rather lay out the advantages.

Comment on lines +32 to +35
One process.
No Zookeeper.
No Kafka brokers.
No Kafka Connect cluster setup.
Member

Punchy framing, but I think it needs rework. Zookeeper hasn't been a thing for many Kafka users for a while now, so that's a bit of a strawman. As for Kafka and KC, you also lose something when not using them (see above). The way I've always thought about DBZ Server is that it gives you connectivity with other streaming solutions (which ofc. will have their own operational toil).

A minimal `docker-compose.yml` for capturing changes from PostgreSQL and emitting them to a sink of your choice looks like this:
[source,yaml]
----
services:
Member

To be fair, this is glossing over history storage (for certain connectors) and offset storage?

The complexity argument conflates the inherent requirements of CDC (which any tool must address) with the operational overhead of Debezium specifically (which has been substantially reduced).
Those are different things.

There is also a deployment path that alternative articles almost never mention at all.
Member

Is this meant as a lead-in into the next section? If so, needs some clarification like "let's take a look at this next". Right now, it feels very disconnected.


#### For Java developers: the Debezium Quarkus Extensions

If you are already building Java applications, there is a fourth option that sidesteps the infrastructure question entirely: the **Debezium Quarkus Extensions**.
Member

"Fourth"? So far we discussed Kafka and DBZ Server, what's the third one?

#### For Java developers: the Debezium Quarkus Extensions

If you are already building Java applications, there is a fourth option that sidesteps the infrastructure question entirely: the **Debezium Quarkus Extensions**.
These let you embed Debezium directly inside any Quarkus-based application, running CDC as part of your existing service rather than as a separate piece of infrastructure.
Member

Needs a link and sentence about what Quarkus is.

If you are already building Java applications, there is a fourth option that sidesteps the infrastructure question entirely: the **Debezium Quarkus Extensions**.
These let you embed Debezium directly inside any Quarkus-based application, running CDC as part of your existing service rather than as a separate piece of infrastructure.

For Java developers, this is arguably the most natural entry point into Debezium that exists.
Member

I don't know about this; it really depends on the use case, doesn't it? For example, if I want to stream events from Postgres to my data lake, going through a Java-based service probably isn't the natural choice.

The setup burden looks very different when Debezium is just another dependency in a project your team already knows how to build, test, and deploy.
There is no separate process to operate, no separate configuration format to learn, and no context switch between your application and your CDC pipeline.

For teams building microservices in Java, especially those implementing the outbox pattern for reliable event publishing, the Quarkus extensions deserve serious consideration before reaching for a separate CDC tool.
Member

Yes, that's a great point!

* Converting data types
* Adding metadata fields to events

For teams that need heavier in-flight processing, such as complex joins, aggregations, conditional routing across multiple topic streams, the right answer is integrating Debezium with a stream processor like **Apache Flink** or **Kafka Streams**.
Member

Suggested change
- For teams that need heavier in-flight processing, such as complex joins, aggregations, conditional routing across multiple topic streams, the right answer is integrating Debezium with a stream processor like **Apache Flink** or **Kafka Streams**.
+ For teams that need heavier in-flight processing, such as complex joins, aggregations, conditional routing across multiple change event streams, the right answer is integrating Debezium with a stream processor like **Apache Flink** or **Kafka Streams**.

Debezium Platform is gaining native metrics and monitoring support, built in, not bolted on, so that operators will have first-class observability with the ability for Debezium to provide the entire stack for you.
That work is in progress, and it reflects the project taking this feedback seriously.

"Flying blind" is not an accurate description of what users have available today.
Member

This sounds a tad too defensive IMO.

Member

@gunnarmorling gunnarmorling left a comment

Good stuff, @Naros! A few comments inline. Another common topic is "Debezium is slow, and we're much faster". Also worth mentioning, although I'm not quite sure about the right angle; Ideally, we'd have some throughput numbers to drive home that point.

Member Author

Naros commented May 4, 2026

> Another common topic is "Debezium is slow, and we're much faster". Also worth mentioning, although I'm not quite sure about the right angle; Ideally, we'd have some throughput numbers to drive home that point.

What about framing this around the new chunk-based initial snapshot feature?

As Jiri pointed out in chat, we need to clarify that Debezium is not designed to be, nor will it ever be, as efficient as a vendor-specific data-dumping tool for mass replication. However, if you want to use CDC tooling, Debezium has invested in the new parallel, chunk-based table snapshot feature.

The nice part about this angle is that we aren't talking about database performance, so we sidestep concerns about publishing numbers with any database vendor, because we're strictly focusing on how we re-engineered Debezium for higher throughput and comparing old and new throughput.

The only downside is that this isn't focused on the streaming side of the house, but I think, from a terms-of-use PoV, we're a bit limited in publishing benchmarks with many database vendors.

@gunnarmorling wdyt?

- DEBEZIUM_SINK_KINESIS_REGION=us-east-1
----

That is the entire infrastructure.
Member

That is the entire infrastructure.
One container.
Configure your source connection and your sink destination, and you have a complete CDC pipeline.

That is the entire infrastructure: configure your source connection and your sink destination, and you have a complete CDC pipeline with one unit of deployment

7 participants