source-mysql: Detect binlog offset wraparound #2151

Merged
merged 1 commit into from
Nov 15, 2024

Commits on Nov 15, 2024

  1. source-mysql: Detect binlog offset wraparound

    In several places, MySQL represents binlog offsets within a file
    as a 32-bit unsigned integer. Most notable for our purposes are
    the binlog event header `log_pos` field and the offset argument
    to the `COM_BINLOG_DUMP` command.
    
    The upshot of this is that we can't necessarily trust the offset
    to be correct when a file grows past 4GB, and even if we tracked
    the "full offset" ourselves we wouldn't be able to resume from
    there after a connector restart.
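
To make the truncation concrete, here is a minimal Go sketch of what happens when a true 64-bit file position is carried in a 32-bit unsigned field. The `wireOffset` helper is hypothetical and exists only for illustration; it is not part of MySQL or of the connector.

```go
package main

import "fmt"

// wireOffset is a hypothetical helper that models how a true 64-bit file
// position appears when carried in a 32-bit unsigned field: only the low
// 32 bits survive, so positions at or past 4GiB wrap around toward zero.
func wireOffset(trueOffset uint64) uint32 {
	return uint32(trueOffset)
}

func main() {
	const gib = uint64(1) << 30
	for _, off := range []uint64{3 * gib, 4*gib - 1, 4 * gib, 5*gib + 123} {
		fmt.Printf("true=%d wire=%d\n", off, wireOffset(off))
	}
}
```

Because many true positions map to the same 32-bit value, a wrapped offset cannot be turned back into the original position without extra state, and a 32-bit resume argument cannot address anything past the 4GB mark.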
    
    This normally isn't an issue because binlog files are never
    supposed to grow that large. The system setting `max_binlog_size`,
    which governs the point after which the file is rotated, has a
    maximum possible value of just 1GB. The problem is that this is
    a soft limit, and it's possible to force MySQL to stuff
    arbitrarily large amounts of data into a single file. So we need
    to handle that situation as gracefully as possible.
    
    This commit implements that handling. It detects binlog offset
    overflow whenever an event's `log_pos` header value is smaller
    than the prior cursor position (this is reliable because there's
    also a 1GB cap on the size of any single event, and unlike the
    binlog size setting this one's actually a hard maximum), and
    once that occurs an "offset overflow" state flag is set which
    prevents us from emitting any further checkpoints until after
    the next binlog rotation.
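
The detection rule described above can be sketched roughly as follows. This is an illustration under assumptions, not the connector's actual code: the `cursorState` type, its fields, and the `observeEvent` and `rotate` methods are hypothetical names, and the sketch ignores any special-case events a real implementation might need to skip.

```go
package main

import "fmt"

// cursorState is a hypothetical stand-in for the connector's replication
// position; the type and field names are illustrative only.
type cursorState struct {
	file       string
	offset     uint32 // last log_pos seen in the current file
	overflowed bool   // set once log_pos has wrapped within this file
}

// observeEvent records an event's log_pos and reports whether it is still
// safe to emit a checkpoint at this position. A log_pos smaller than the
// prior position is treated as a 32-bit wraparound.
func (c *cursorState) observeEvent(logPos uint32) bool {
	if logPos < c.offset && !c.overflowed {
		c.overflowed = true
		fmt.Printf("warning: binlog offset overflow detected in %s\n", c.file)
	}
	c.offset = logPos
	return !c.overflowed
}

// rotate switches to a new binlog file and clears the overflow flag, since
// offsets in the new file are trustworthy again.
func (c *cursorState) rotate(file string, pos uint32) {
	c.file, c.offset, c.overflowed = file, pos, false
}

func main() {
	c := &cursorState{file: "binlog.000001", offset: 4}
	for _, pos := range []uint32{4000000000, 4200000000, 100000000, 200000000} {
		fmt.Println(pos, c.observeEvent(pos))
	}
	c.rotate("binlog.000002", 4)
	fmt.Println(500, c.observeEvent(500))
}
```

Note that the flag stays set even when later `log_pos` values increase again, so checkpoints remain suppressed for the rest of the file.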
    
    However, there is one other place where we use binlog offsets,
    and that's as part of the `/_meta/source/cursor` field. This
    field is used as the fallback collection key for keyless tables,
    so it's important that it be basically correct, though it's
    sufficient for it to be properly ordered and unique. We handle
    this by maintaining a u64 "estimated offset" which, once offset
    overflow occurs within the current file, is advanced based on
    event sizes instead of `log_pos` values.
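
A rough sketch of the "estimated offset" idea, assuming each event header provides both a `log_pos` and an event size. The `offsetTracker` type and `advance` method are hypothetical names chosen for this illustration and may not match the real implementation.

```go
package main

import "fmt"

// offsetTracker is a hypothetical sketch of a 64-bit "estimated offset"
// that stays ordered and unique even after log_pos wraps.
type offsetTracker struct {
	estimated  uint64 // 64-bit position used for ordering
	lastLogPos uint32 // last raw log_pos seen in the current file
	overflowed bool   // set once log_pos has wrapped within this file
}

// advance updates the estimate from one event's header fields and returns
// it. Before overflow the estimate simply mirrors log_pos; after overflow
// it is advanced by the event size instead.
func (t *offsetTracker) advance(logPos, eventSize uint32) uint64 {
	if !t.overflowed && logPos < t.lastLogPos {
		t.overflowed = true
	}
	if t.overflowed {
		t.estimated += uint64(eventSize)
	} else {
		t.estimated = uint64(logPos)
	}
	t.lastLogPos = logPos
	return t.estimated
}

func main() {
	t := &offsetTracker{}
	fmt.Println(t.advance(4294967000, 1000)) // last event before the wrap
	fmt.Println(t.advance(704, 1000))        // log_pos has wrapped
	fmt.Println(t.advance(1204, 500))        // still past the 4GiB mark
}
```

As long as event sizes are nonzero, the estimate strictly increases within a file, which is the ordering and uniqueness property needed for a fallback key.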
    
    It's not feasible to reproduce the edge case this fixes on
    demand within the confines of a CI build, so there is no new
    test case accompanying these changes. We'll have to content
    ourselves with CI tests showing this doesn't break anything
    when overflow doesn't occur, and the real test will come when
    this happens again in production. We'll be able to tell when it
    does, because a warning message is logged.
    willdonnelly committed Nov 15, 2024
    d09a67d