source-mysql: Detect binlog offset wraparound #2151

In several places, MySQL represents binlog offsets within a file as a 32-bit unsigned integer. Most notable for our purposes are the binlog event header `log_pos` field and the offset argument to the `COM_BINLOG_DUMP` command. The upshot is that we can't necessarily trust the offset to be correct once a file grows past 4GB, and even if we tracked the "full offset" ourselves we wouldn't be able to resume from there after a connector restart.

This normally isn't an issue because binlog files are never supposed to grow that large. The system setting `max_binlog_size`, which governs the point after which the file is rotated, has a maximum possible value of just 1GB. The problem is that this is a soft limit, and it's possible to force MySQL to stuff arbitrarily large amounts of data into a single file. So we need to handle that situation as gracefully as possible.

This commit implements that handling. It detects binlog offset overflow whenever an event's `log_pos` header value is smaller than the prior cursor position. This is reliable because there's also a 1GB cap on the size of any single event, and unlike the binlog size setting this one is actually a hard maximum, so a wrapped `log_pos` always lands below the prior position. Once overflow is detected, an "offset overflow" state flag is set which prevents us from emitting any further checkpoints until after the next binlog rotation.

However, there is one other place where we use binlog offsets, and that's as part of the `/_meta/source/cursor` field. This field is used as the fallback collection key for keyless tables, so it matters that it be basically correct, though it's sufficient for it to be properly ordered and unique. We handle this by maintaining a u64 "estimated offset" which, after offset overflow occurs within the current file, is advanced based on event sizes instead of `log_pos` values.

It's not exactly feasible to reproduce the edge case this fixes on demand within the confines of a CI build, so there is no new test case accompanying these changes. We'll have to content ourselves with CI tests showing this doesn't break anything when overflow doesn't occur, and the real test will come when this happens again in production. We'll be able to tell when it does, because a warning message is logged when overflow is detected.
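
For illustration only, the detection and estimated-offset bookkeeping described above can be sketched roughly as follows. This is not the connector's actual code: the class `BinlogPositionTracker`, its methods, and the default start offset of 4 are hypothetical names and assumptions chosen for the example, and the sketch assumes every event passed in carries a meaningful `log_pos` and an accurate event size.

```python
from dataclasses import dataclass

U32_MASK = 0xFFFFFFFF  # log_pos is a 32-bit unsigned value, so it wraps at 4GB


@dataclass
class BinlogPositionTracker:
    """Hypothetical tracker for one binlog file (illustrative names only)."""

    # Position after the most recently processed event. Treated as a u64-style
    # "estimated offset" once overflow has been detected. Assumed default: 4.
    estimated_offset: int = 4
    # Set when log_pos has wrapped within the current file.
    overflowed: bool = False

    def on_event(self, log_pos: int, event_size: int) -> int:
        """Process one event header and return the offset to use in the cursor."""
        # A log_pos smaller than the prior position means the 32-bit value wrapped.
        if not self.overflowed and log_pos < self.estimated_offset:
            self.overflowed = True
        if self.overflowed:
            # log_pos can no longer be trusted, so advance by the event size.
            self.estimated_offset += event_size
        else:
            self.estimated_offset = log_pos
        return self.estimated_offset

    def can_checkpoint(self) -> bool:
        """Checkpoints are suppressed until the next rotation after overflow."""
        return not self.overflowed

    def on_rotate(self, start_offset: int = 4) -> None:
        """A new binlog file starts: offsets are trustworthy again."""
        self.estimated_offset = start_offset
        self.overflowed = False


if __name__ == "__main__":
    tracker = BinlogPositionTracker(estimated_offset=4_294_967_000)
    wrapped = (4_294_967_000 + 500) & U32_MASK  # what a 32-bit header would report
    print(tracker.on_event(wrapped, 500), tracker.can_checkpoint())
```

The estimated offset keeps increasing monotonically across the wrap, which is all the cursor needs in order to stay ordered and unique, while the flag keeps untrustworthy positions out of checkpoints until rotation.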

Commits on Nov 15, 2024