Skip to content

changefeedccl: log on kafka sink v2 internal retry#169374

Open
aerfrei wants to merge 1 commit intocockroachdb:masterfrom
aerfrei:add-kafka-v2-log
Open

changefeedccl: log on kafka sink v2 internal retry#169374
aerfrei wants to merge 1 commit intocockroachdb:masterfrom
aerfrei:add-kafka-v2-log

Conversation

@aerfrei
Copy link
Copy Markdown
Contributor

@aerfrei aerfrei commented Apr 29, 2026

Previously we saw spikes in changefeed internal
retries without accompanying error logs. This change adds a log to the missing code path causing internal retries.

This change should make sure we can tell that this error path was the cause for the internal retries.

Epic: none

Release note: None

@trunk-io
Copy link
Copy Markdown
Contributor

trunk-io Bot commented Apr 29, 2026

Merging to master in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here

@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented Apr 29, 2026

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@aerfrei aerfrei marked this pull request as ready for review April 29, 2026 21:41
@aerfrei aerfrei requested a review from a team as a code owner April 29, 2026 21:41
@aerfrei aerfrei requested review from rharding6373 and removed request for a team April 29, 2026 21:41
Copy link
Copy Markdown
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have a bunch of nits about the log message, but don't want to do too much bike shedding so consider them optional. Take 'em or leave 'em.

if err := k.client.ProduceSync(ctx, msgs...).FirstErr(); err != nil {
if k.shouldTryResizing(err, msgs) {
a, b := msgs[0:len(msgs)/2], msgs[len(msgs)/2:]
log.Changefeed.Infof(ctx,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use Errorf?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My sense is that we weren't logging before because, in some sense, having to resize is expected. I think that's still accurate and this communicates that this isn't necessarily cause for concern.

Comment thread pkg/ccl/changefeedccl/sink_kafka_v2.go Outdated
if k.shouldTryResizing(err, msgs) {
a, b := msgs[0:len(msgs)/2], msgs[len(msgs)/2:]
log.Changefeed.Infof(ctx,
"kafka v2 sink reducing batch from %d to halves of %d and %d due to error: %v",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is including both len(a) and len(b) helpful? Seems verbose to me.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case I was looking at in cloud we saw a steady stream of these internal retries (they were happening in some kind of infinite loop presumably). I think this is a helpful proxy towards whether or not we're actually making progress. e.g. If the numbers are getting smaller, that would be a good sign.

Comment thread pkg/ccl/changefeedccl/sink_kafka_v2.go Outdated
Previously we saw spikes in changefeed internal
retries without accompanying error logs. This change
adds a log to the missing code path causing internal
retries.

This change should make sure we can tell that this
error path was the cause for the internal retries.

Epic: none

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants