Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement zlib compression #1487

Draft
wants to merge 81 commits into
base: master
Choose a base branch
from
Draft

Conversation

joe-mann
Copy link

@joe-mann joe-mann commented Oct 5, 2023

Description

Implemented the SQL compression protocol. This new feature is enabled by:

  • Adding compress=true in DSN.
  • cfg.Apply(Compress(True))

A new PR to revive, rebase, and complete #649.

Checklist

  • Code compiles correctly
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary
  • Added myself / the copyright holder to the AUTHORS file

Summary by CodeRabbit

  • New Features
    • Introduced zlib compression support in the MySQL driver for Go, including a compress parameter for compression modes and a minCompressLength parameter to specify the minimum packet size for compression.
    • Added new authors B Lamarche and Joe Mann to the project.
  • Bug Fixes
    • Added error handling for scenarios where compression is requested but not supported by the server.
  • Documentation
    • Updated README.md to detail new compression features and parameters.
  • Tests
    • Enhanced benchmark and unit tests to cover new compression functionality.
  • Chores
    • Updated various internal components to support compression during MySQL communication, including packet reading/writing and connection handling.

Brigitte Lamarche and others added 30 commits August 11, 2017 15:38
* rows: implement driver.RowsColumnTypeScanType

Implementation for time.Time not yet complete!

* rows: implement driver.RowsColumnTypeNullable

* rows: move fields related code to fields.go

* fields: use NullTime for nullable datetime fields

* fields: make fieldType its own type

* rows: implement driver.RowsColumnTypeDatabaseTypeName

* fields: fix copyright year

* rows: compile time interface implementation checks

* rows: move tests to versioned driver test files

* rows: cache parseTime in resultSet instead of mysqlConn

* fields: fix string and time types

* rows: implement ColumnTypeLength

* rows: implement ColumnTypePrecisionScale

* rows: fix ColumnTypeNullable

* rows: ColumnTypes tests part1

* rows: use keyed composite literals in ColumnTypes tests

* rows: ColumnTypes tests part2

* rows: always use NullTime as ScanType for datetime

* rows: avoid errors through rounding of time values

* rows: remove parseTime cache

* fields: remove unused scanTypes

* rows: fix ColumnTypePrecisionScale implementation

* fields: sort types alphabetical

* rows: remove ColumnTypeLength implementation for now

* README: document ColumnType Support
AWS Aurora returns a 1290 after failing over requiring the connection to
be closed and opened again to be able to perform writes.
Most forks won't be in goveralls and so this command in travis.yml was,
previously, failing and causing the build to fail.

Now, it doesn't!
* Drop support for Go 1.6 and lower

* Remove cloneTLSConfig for legacy Go versions
…#623)

* Added support for custom string types.

* Add author name

* Added license header

* Added a newline to force a commit.

* Remove newline.
* Also add conversions for additional types in ConvertValue
  ref golang/go@d7c0de9
* Fixed broken import for appengine/cloudsql

appengine.go
import path of appengine/cloudsql has changed to google.golang.org/appengine/cloudsql - Fixed.

* Added my name to the AUTHORS
* Differentiate between BINARY and CHAR

When looking up the database type name, we now check the character set
for the following field types:
 * CHAR
 * VARCHAR
 * BLOB
 * TINYBLOB
 * MEDIUMBLOB
 * LONGBLOB

If the character set is 63 (which is the binary pseudo character set),
we return the binary names, which are (respectively):
 * BINARY
 * VARBINARY
 * BLOB
 * TINYBLOB
 * MEDIUMBLOB
 * LONGBLOB

If any other character set is in use, we return the text names, which
are (again, respectively):
 * CHAR
 * VARCHAR
 * TEXT
 * TINYTEXT
 * MEDIUMTEXT
 * LONGTEXT

To facilitate this, mysqlField has been extended to include a uint8
field for character set, which is read from the appropriate packet.

Column type tests have been updated to ensure coverage of binary and
text types.

* Increase test coverage for column types
* Fix prepared statement

When there are many args and maxAllowedPacket is not enough,
writeExecutePacket() attempted to use STMT_LONG_DATA even for
0byte string.
But writeCommandLongData() doesn't support 0byte data. So it
caused to send malfold packet.

This commit loosen threshold for using STMT_LONG_DATA.

* Change minimum size of LONG_DATA to 64byte

* Add test which reproduce issue 730

* TestPreparedManyCols test only numParams = 65535 case

* s/as possible//
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 876af07 and 1c05916.
Files selected for processing (4)
  • compress.go (1 hunks)
  • compress_test.go (1 hunks)
  • connector.go (2 hunks)
  • packets.go (14 hunks)
Files skipped from review as they are similar to previous changes (4)
  • compress.go
  • compress_test.go
  • connector.go
  • packets.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 1c05916 and 0e3ace3.
Files selected for processing (5)
  • compress.go (1 hunks)
  • compress_test.go (1 hunks)
  • connection.go (4 hunks)
  • infile.go (1 hunks)
  • packets.go (14 hunks)
Files skipped from review as they are similar to previous changes (4)
  • compress.go
  • compress_test.go
  • infile.go
  • packets.go
Additional comments: 4
connection.go (4)
  • 55-58: The resetSequenceNr method correctly resets the sequence numbers, aligning with the intended functionality for packet sequencing. This change is approved.
  • 60-69: The syncSequenceNr method correctly syncs the sequence numbers when compression is enabled, adhering to practices observed in MySQL and MariaDB. The inclusion of references to the external implementations is commendable, providing a solid basis for this approach.
  • 51-53: The addition of the packetReader interface with the readNext method is well-designed, providing a clear contract for reading packets. This change aligns with the objectives of implementing zlib compression support.
  • 48-73: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [28-70]

Overall, the changes made to support zlib compression in connection.go are well-integrated and consistent with the existing codebase. It's recommended to verify error handling and assess the potential impact on existing functionality due to the introduction of compression, ensuring a smooth integration.

connection.go Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 0e3ace3 and 750fe2a.
Files selected for processing (3)
  • compress.go (1 hunks)
  • compress_test.go (1 hunks)
  • connection.go (4 hunks)
Files skipped from review as they are similar to previous changes (3)
  • compress.go
  • compress_test.go
  • connection.go

methane
methane previously approved these changes Mar 15, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 750fe2a and 7800bd5.
Files selected for processing (1)
  • const.go (1 hunks)
Check Runs (27)
test (ubuntu-latest, 1.20, 8.0) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.21, 8.0) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (windows-latest, 1.22, mariadb-10.5) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (windows-latest, 1.22, mariadb-10.6) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (windows-latest, 1.22, mariadb-10.11) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (windows-latest, 1.22, mariadb-11.1) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (windows-latest, 1.22, mariadb-11.3) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (windows-latest, 1.22, 5.7) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (windows-latest, 1.22, 8.3) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (windows-latest, 1.22, 8.0) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (macos-latest, 1.22, mariadb-10.5) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (macos-latest, 1.22, mariadb-10.6) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (macos-latest, 1.22, mariadb-10.11) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (macos-latest, 1.22, mariadb-11.1) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (macos-latest, 1.22, mariadb-11.3) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (macos-latest, 1.22, 5.7) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (macos-latest, 1.22, 8.3) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (macos-latest, 1.22, 8.0) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.22, mariadb-10.5) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.22, mariadb-10.6) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.22, mariadb-10.11) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.22, mariadb-11.1) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.22, mariadb-11.3) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.22, 5.7) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.22, 8.3) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
test (ubuntu-latest, 1.22, 8.0) failure (3)
  • const.go: 14-14: debugTrace redeclared in this block
lint failure (6)
  • const.go: 14-14: debugTrace redeclared in this block
  • const.go: 14-14: debugTrace redeclared in this block
Additional comments: 1
const.go (1)
  • 21-21: Consider adding a comment above minCompressLength to explain the rationale behind the chosen threshold value of 150. This can improve code readability and maintainability by providing context to future contributors.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 750fe2a and 0512769.
Files selected for processing (2)
  • compress.go (1 hunks)
  • const.go (1 hunks)
Files skipped from review as they are similar to previous changes (2)
  • compress.go
  • const.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 0512769 and ee70acf.
Files selected for processing (7)
  • compress.go (1 hunks)
  • compress_test.go (1 hunks)
  • connection.go (4 hunks)
  • connection_test.go (5 hunks)
  • connector.go (2 hunks)
  • packets.go (13 hunks)
  • packets_test.go (6 hunks)
Files skipped from review as they are similar to previous changes (6)
  • compress.go
  • compress_test.go
  • connection_test.go
  • connector.go
  • packets.go
  • packets_test.go
Additional comments: 4
connection.go (4)
  • 34-42: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [28-39]

The introduction of packetReader interface, compressSequence, and compress fields in the mysqlConn struct is a significant change that enables the handling of compressed packets. This change is crucial for the implementation of zlib compression. However, ensure that the initialization of these new fields is handled appropriately in all relevant constructors or initialization functions to avoid any uninitialized state issues.

  • 50-52: The packetReader interface with the readNext method is a good abstraction for handling packet reading, allowing for easy extension to support different types of packet reading strategies, such as uncompressed or compressed packets. This design aligns well with the SOLID principles, particularly the Open/Closed Principle, by allowing the behavior of the mysqlConn to be extended without modifying its code directly.
  • 54-57: The resetSequenceNr method now resets both sequence and compressSequence to 0. This change is necessary to ensure that both sequence numbers are in sync when a new command starts. It's important to verify that this method is called at all appropriate points in the code to prevent any sequence number mismatches, which could lead to communication issues with the MySQL server.
  • 59-68: The syncSequenceNr method ensures that the sequence and compressSequence are synchronized after writing a packet and before reading the next one. This synchronization is crucial for maintaining the integrity of packet sequencing, especially when compression is enabled. The comments provide helpful context by referencing the behavior in MySQL and MariaDB's source code. It's a good practice to include such references for complex or non-obvious behaviors.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between ee70acf and 1e78561.
Files selected for processing (1)
  • driver_test.go (4 hunks)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 1e78561 and 77d86ec.
Files selected for processing (4)
  • .github/workflows/test.yml (1 hunks)
  • compress.go (1 hunks)
  • driver_test.go (5 hunks)
  • packets.go (14 hunks)
Files skipped from review as they are similar to previous changes (2)
  • compress.go
  • packets.go
Additional comments: 1
.github/workflows/test.yml (1)
  • 86-86: The increase of max_allowed_packet to 48MB is a significant change that aligns with the anticipated larger data transactions due to the introduction of zlib compression. Ensure this change has been tested for its impact on MySQL server performance and memory usage.

Comment on lines 146 to 152

db, err := sql.Open(driverNameTest, dsn)
if err != nil {
t.Fatalf("error connecting: %s", err.Error())
t.Fatalf("connecting %q: %s", dsn, err)
}
defer db.Close()

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 NOTE
This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [1508-1510]

The TLS configuration is missing the MinVersion field. It's a best practice to explicitly set the minimum version of the TLS protocol to TLS 1.3 to ensure the use of the most secure and up-to-date protocol. Consider adding MinVersion: tls.VersionTLS13 to the TLS configuration.

tls.Config{
  InsecureSkipVerify: true,
+ MinVersion: tls.VersionTLS13,
}

@methane methane linked an issue Mar 16, 2024 that may be closed by this pull request
@methane methane marked this pull request as draft March 16, 2024 15:26
Copy link

@omills omills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @joe-mann and @methane for bringing this back. My team loves this library and appreciate all the hard work behind it.

Sadly, we recently discovered network limitations that will require us to use compression. We would love for this library to support compression. We want to support this PR as much as possible, but I am not too familiar with mysql code/protocols. I did add a few small comments on what it took for the tests to pass locally. Hopefully that is helpful.

conn := new(mockConn)
mc.netConn = conn

n, err := mc.writeCompressed(uncompressedPacket)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since writeCompressed does not call mc.syncSequenceNr(), do you need to add it here? I tried it locally and that makes the TestRoundtrip pass.

return nil, ErrPktSyncMul
}
// TODO(methane): report error when the packet is not an error packet.
invalid = true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably covered by your TODO: I assume invalid should also cause a mc.Close() somewhere. The TestReadPacketWrongSequenceID wants an invalid collection but currently that check fails.

@methane
Copy link
Member

methane commented Apr 25, 2024

I realized we need massive rewrite. So nitpick comments are not helpful to me.

@SongZhen0704
Copy link

@joe-mann @methane
Hello, I want to use mysql compress in gorm, so may i known whether this pr can be merged?

Thank you

@methane
Copy link
Member

methane commented May 30, 2024

This branch works for simple cases. But when I checked edge cases, I found some difficult issues. I think we need some rewrite.

At the moment, please use this branch at your own risk.

@SongZhen0704
Copy link

This branch works for simple cases. But when I checked edge cases, I found some difficult issues. I think we need some rewrite.

At the moment, please use this branch at your own risk.

Thanks for your reply

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Compression Mode