stop rotating on partition change when rotate.interval.ms is set #715
+204
−92
Problem
Committing open files on partition change creates a lot of small files when records belonging to different partitions are interleaved. We have a use case where we aggregate raw events into sessions spanning 5-15 minutes, with the session time being the time of the first event in the session. We use hourly partitioning and observe up to a 10x increase in the number of files per hour as a result.
The issue has been reported multiple times previously:
There has even been an attempted fix:
Solution
Removing rotation on partition changes makes the semantics of `rotate.interval.ms` similar to those of `flush.size`. It now defines constraints not for a single file, but for a "segment" of a stream: records are accumulated in their appropriate partitions until the partition time advances at least `rotate.interval.ms` past the time of the first message in the "segment", at which point all files are flushed.
Testing
We have been running the patched version in our staging environment for more than a week now with constant consistency checks, and have not seen any issues with either the number of files per hour or the consistency of the results.
Finally, the documentation for 'rotate.interval.ms' might need to be adjusted. I would appreciate any advice on how to do that.
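To make the change in semantics described under Solution easier to discuss, here is a minimal Python simulation. It is not the connector's code. The function names, the hourly partitioner, and the simplified rotation rules are all assumptions made for illustration. It only counts how many files each strategy would produce for a stream of record timestamps.

```python
MINUTE_MS = 60 * 1000
HOUR_MS = 60 * MINUTE_MS


def hourly_partition(ts_ms):
    """Hypothetical hourly partitioner: bucket a timestamp by hour."""
    return ts_ms // HOUR_MS


def count_files_old(timestamps, rotate_interval_ms):
    """Simplified old behaviour: one open file at a time, committed when the
    partition changes or the rotate interval has elapsed (assumption)."""
    files = 0
    current_partition = None
    first_ts = None
    for ts in timestamps:
        partition = hourly_partition(ts)
        if current_partition is not None and (
            partition != current_partition
            or ts - first_ts >= rotate_interval_ms
        ):
            files += 1  # commit the open file
            current_partition = None
        if current_partition is None:
            current_partition = partition
            first_ts = ts
    if current_partition is not None:
        files += 1  # commit whatever is still open at the end
    return files


def count_files_new(timestamps, rotate_interval_ms):
    """Simplified new behaviour: keep one open file per partition and flush
    all of them once time advances rotate_interval_ms past the first record
    of the current segment (assumption)."""
    files = 0
    open_partitions = set()
    first_ts = None
    for ts in timestamps:
        if first_ts is not None and ts - first_ts >= rotate_interval_ms:
            files += len(open_partitions)  # flush every open file
            open_partitions.clear()
            first_ts = None
        if first_ts is None:
            first_ts = ts  # this record starts a new segment
        open_partitions.add(hourly_partition(ts))
    files += len(open_partitions)  # flush the final segment
    return files


# Ten records alternating between two adjacent hourly partitions.
interleaved = [59 * MINUTE_MS, 61 * MINUTE_MS] * 5
old_count = count_files_old(interleaved, 10 * MINUTE_MS)
new_count = count_files_new(interleaved, 10 * MINUTE_MS)
```

For this interleaved input the old strategy commits a file on every record (10 files), while the segment-based strategy produces one file per partition (2 files).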