refactor: geo related tcl tests #2753
Conversation
Walkthrough
The changes span several source files and configuration updates, focusing on enhancing error handling, refining geohash calculations, and adjusting configuration settings for performance and stability. Key modifications involve improved error checks.
Actionable comments posted: 2
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (5)
- src/pika_geo.cc (11 hunks)
- src/pika_geohash_helper.cc (6 hunks)
- src/storage/src/redis_zsets.cc (3 hunks)
- tests/assets/default.conf (4 hunks)
- tests/unit/type/geo.tcl (1 hunks)
Files not summarized due to errors (1)
- tests/unit/type/geo.tcl: Error: Message exceeds token limit
Additional comments not posted (45)
src/pika_geohash_helper.cc (3)
113-124: Improved bounding box calculation to consider hemisphere differences. Consider adding unit tests to ensure these calculations handle edge cases correctly.
Line range hint 151-182: Changes ensure the geohashing step is decreased appropriately to cover the entire requested area. Monitor for any performance implications due to the potential increase in the number of geohash calculations.
236-252: Simplified distance calculations with special handling for small longitude differences. Ensure comprehensive testing, especially near the edge cases of longitude differences, to validate these changes.
src/pika_geo.cc (5)
63-63: Handling the IsInvalidArgument error specifically improves error message consistency with Redis. Verify that all potential error conditions are handled appropriately in this context.
107-107: Updated error handling for IsInvalidArgument to align with Redis's behavior. Ensure that similar error handling consistency is maintained across all geo-related commands.
174-174: Consistent error handling for IsInvalidArgument across different commands enhances user experience. Review all geo-related commands to ensure error handling consistency is uniformly applied.
Line range hint 350-377: Updated handling of the store and storedist options to replace existing data, ensuring consistency with Redis. Monitor for any performance impacts due to the deletion and re-addition of data in the target key (a sketch of the expected semantics follows these comments).
576-578: Improved error handling for not found members, enhancing clarity and user feedback. Consider adding more detailed logging around this error case to aid in debugging and operational monitoring.
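Regarding the store/storedist replacement behavior noted above (line range 350-377), a minimal Tcl sketch of the expected semantics. This is illustrative, not an actual test from the PR, and assumes the Redis test-suite helpers (r, assert_equal) used by geo.tcl; the key names "points" and "dst" are hypothetical.
r del points dst
r geoadd points 13.361389 38.115556 "Palermo"
r zadd dst 100 "stale-member"   ;# pre-existing data in the destination key
r georadius points 13.361389 38.115556 200 km store dst
assert_equal {Palermo} [r zrange dst 0 -1]   ;# STORE replaces dst instead of appending to it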
tests/assets/default.conf (4)
37-47: The configuration for sync-binlog-thread-num is well-explained and the recommendation to match it with the number of databases is a good practice to ensure optimal performance.
320-324: The setting for max-total-wal-size is correctly configured to control the size of WAL files, which is crucial for managing the RocksDB performance and ensuring quick recovery times. Make sure this setting is tuned based on the actual workload and storage capabilities.
483-490: The rsync rate limiting and timeout settings are crucial for controlling the replication load and avoiding unnecessary retries. Ensure these settings are optimized based on network conditions and the specific requirements of the slave nodes.
111-112: Ensure the sync-binlog-thread-num matches the databases setting to maintain consistency and optimal performance, especially if the number of databases changes.
Verification successful
The sync-binlog-thread-num matches the databases setting in the tests/assets/default.conf file:
sync-binlog-thread-num : 1
databases : 1
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that the sync-binlog-thread-num matches the databases setting.
# Test: Search for the sync-binlog-thread-num and databases settings. Expect: They should match.
grep -E 'sync-binlog-thread-num|databases' tests/assets/default.conf

Length of output: 923
tests/unit/type/geo.tcl (32)
1-5: The introductory comments and helper function definitions are clear and concise.
17-25: The geo_random_point function correctly generates random longitude and latitude within specified ranges. The comment explaining the range limitation is helpful for understanding the context.
27-44: The compare_lists function, which returns elements not common to both lists, is implemented efficiently using TCL's built-in lsearch function. This utility might be used extensively in tests to verify the correctness of geo queries.
46-56: The pointInCircle function checks if a point is within a specified radius of a given point. The implementation is straightforward and uses the previously defined geo_distance function, ensuring consistency.
58-72: The pointInRectangle function correctly calculates if a point is within a specified rectangle. The use of the geo_distance function for both longitude and latitude distances ensures consistency in calculations.
74-86: The verify_geo_edge_response_bylonlat and verify_geo_edge_response_bymember procedures are designed to assert the correctness of responses from geo commands. These tests are critical for ensuring that the changes made in the PR behave as expected. The commented-out sections indicate unsupported commands, which is useful information for maintaining the tests.
103-112: The verify_geo_edge_response_generic procedure is a good abstraction for testing generic geo command responses. It uses the catch TCL command to handle command execution and matches responses against expected values, which is crucial for regression testing.
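For reference, a minimal Tcl sketch of helpers along these lines; the bodies are illustrative, not copied from geo.tcl, and assume a geo_distance helper that returns meters (see the 6-15 comment further below).
proc compare_lists {List1 List2} {
    # Collect the elements that appear in exactly one of the two lists.
    set DiffList {}
    foreach Item $List1 {
        if {[lsearch -exact $List2 $Item] == -1} { lappend DiffList $Item }
    }
    foreach Item $List2 {
        if {[lsearch -exact $List1 $Item] == -1} { lappend DiffList $Item }
    }
    return $DiffList
}
proc pointInCircle {radius_m lon lat search_lon search_lat} {
    # True when the point lies within radius_m meters of the search center.
    expr {[geo_distance $lon $lat $search_lon $search_lat] <= $radius_m}
}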
138-175: The tests defined in this section cover various scenarios for geo commands, including handling of wrong types and non-existing keys. The structure and assertions in these tests are well-defined, ensuring thorough coverage of edge cases.
225-231: The test for invalid coordinates in the GEOADD command is crucial for ensuring robust error handling. The use of the catch command to capture and check for errors is appropriate.
233-240: The "GEOADD multi add" test checks the addition of multiple geo points in a single command. This is an important test case for verifying bulk operations in geo functionalities.
241-244: The test for the GEORADIUS command with sorting ensures that the sorting behavior aligns with the changes made in the PR. This test is integral to verifying that the default sorting behavior now matches Redis.
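A hedged Tcl sketch of what such a check can look like (illustrative values, not the actual test; the key name "points" is hypothetical, and Catania is nearer to the query point than Palermo):
r del points
r geoadd points 13.361389 38.115556 "Palermo" 15.087269 37.502669 "Catania"
assert_equal {Catania Palermo} [r georadius points 15 37 200 km asc]
# Per the PR description, the same ascending order is now also the default:
assert_equal {Catania Palermo} [r georadius points 15 37 200 km]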
285-288: The "GEORADIUS withdist (sorted)" test is well-implemented to check the distance output of the GEORADIUS command when the withdist option is used. This ensures that the command returns distances correctly when requested.
294-302: The "GEORADIUS with multiple WITH* tokens" test is comprehensive and checks multiple options used together, which is crucial for ensuring that combined options are handled correctly by the command.
320-323: The test for "GEORADIUS with COUNT but missing integer argument" correctly checks for syntax errors when the COUNT argument is not properly specified, which is important for validating command syntax.
325-327: The "GEORADIUS with COUNT DESC" test verifies the descending sorting functionality combined with the COUNT option. This is essential for ensuring that sorting options are correctly implemented.
329-332: The "GEORADIUS HUGE, issue #2767" test addresses a specific bug by testing a large radius, which is important for ensuring the system's robustness in handling large values.
334-336: The "GEORADIUSBYMEMBER simple (sorted)" test checks the basic functionality of the GEORADIUSBYMEMBER command with sorting. This is crucial for ensuring that the command behaves as expected in simple cases.
343-359: The "GEORADIUSBYMEMBER search areas contain satisfied points in oblique direction" test is well-crafted to check the accuracy of the command in complex scenarios, ensuring the command's reliability in real-world applications.
361-366: The "GEORADIUSBYMEMBER crossing pole search" test is important for ensuring that the command handles geographical edge cases, such as searches that cross the poles.
374-380: The "GEOSEARCH vs GEORADIUS" test compares the results of the GEOSEARCH and GEORADIUS commands. Although GEOSEARCH is not supported, the test setup for GEORADIUS is correctly implemented.
409-411: The "GEORADIUSBYMEMBER withdist (sorted)" test checks the distance output in a sorted order, which is essential for verifying the correctness of distance calculations in sorted geo queries.
413-418: The "GEOHASH is able to return geohash strings" test verifies the GEOHASH command's ability to return correct geohash strings for given points, which is crucial for applications relying on geohash for spatial indexing.
427-436: The "GEOPOS simple" test checks the basic functionality of the GEOPOS command, ensuring that it returns accurate geographical positions for given points.
451-461: The "GEODIST simple & unit" test verifies that the GEODIST command returns distances that are within expected ranges, which is important for accuracy in distance calculations.
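A hedged Tcl sketch of such a range check (illustrative, not the actual test; GEODIST defaults to meters and accepts a unit such as km, and the key name "points" is hypothetical):
r del points
r geoadd points 13.361389 38.115556 "Palermo" 15.087269 37.502669 "Catania"
set d_m  [r geodist points Palermo Catania]
set d_km [r geodist points Palermo Catania km]
# The great-circle distance is roughly 166 km; allow a loose tolerance.
assert {$d_m  > 160000 && $d_m  < 170000}
assert {$d_km > 160 && $d_km < 170}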
475-481: The "GEORADIUS STORE option: syntax error" test ensures that the command returns a syntax error when the STORE option is used incorrectly, which is important for robust error handling.
489-498: The "GEORANGE STORE option: incompatible options" test checks for errors when incompatible options are used with the STORE option, which is crucial for preventing incorrect command usage.
501-507: The "GEORANGE STORE option: plain usage" test verifies the basic functionality of the STORE option, ensuring that it behaves as expected when used without additional options.
509-522: The "GEORADIUSBYMEMBER STORE/STOREDIST option: plain usage" test checks the functionality of the STORE and STOREDIST options in the GEORADIUSBYMEMBER command, which is important for verifying that these options work correctly in storing distances.
530-538: The "GEORANGE STOREDIST option: plain usage" test verifies the basic functionality of the STOREDIST option, ensuring that it correctly stores distances as expected.
550-562: The "GEORANGE STOREDIST option: COUNT ASC and DESC" test checks the functionality of the STOREDIST option with COUNT in both ascending and descending orders, which is crucial for verifying that sorting and counting work correctly together.
574-579: The "GEOSEARCH with small distance" test checks the accuracy of distance calculations in the GEORADIUS command when dealing with small distances, which is important for precision in applications that require fine-grained spatial queries.
6-15: The geo_distance function implementation appears correct and uses the Haversine formula for calculating the distance between two points given in degrees. It's essential to ensure that the constants and calculations align with the geo-related fixes mentioned in the PR for accuracy.
Verification successful
The constant 6372797.560856 used in the geo_distance function aligns closely with the standard value for Earth's radius (approximately 6371000 meters). This suggests that the implementation is accurate for the Haversine formula.
tests/unit/type/geo.tcl:
- Line 6: expr {2.0 * 6372797.560856 * \
- Other commented lines also reference this constant accurately.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the constants used in the geo_distance calculation align with standard geo formulas.
# Test: Search for the usage of constants in geo calculations. Expect: Constants to align with standard formulas.
rg --type tcl $'6372797.560856'

Length of output: 695
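For reference, a Haversine-style geo_distance sketch built around the constant verified above (Earth radius 6372797.560856 m). It mirrors the quoted line 6 fragment but is an illustration, not the exact helper body from geo.tcl:
proc geo_degrad {deg} { expr {$deg * atan(1) * 8 / 360} }
proc geo_distance {lon1d lat1d lon2d lat2d} {
    # Haversine: 2R * asin(sqrt(sin^2(dlat/2) + cos(lat1)*cos(lat2)*sin^2(dlon/2)))
    set lon1r [geo_degrad $lon1d]
    set lat1r [geo_degrad $lat1d]
    set lon2r [geo_degrad $lon2d]
    set lat2r [geo_degrad $lat2d]
    set u [expr {sin(($lat2r - $lat1r) / 2)}]
    set v [expr {sin(($lon2r - $lon1r) / 2)}]
    expr {2.0 * 6372797.560856 * asin(sqrt($u * $u + cos($lat1r) * cos($lat2r) * $v * $v))}
}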
src/storage/src/redis_zsets.cc (1)
476-476: Indentation correction improves code readability and consistency.
@CodiumAI-Agent review
PR Reviewer Guide 🔍
@CodiumAI-Agent improve
PR Code Suggestions ✨
Actionable comments posted: 5
Outside diff range and nitpick comments (1)
tests/assets/default.conf (1)
116-117: Update documentation for the databases and sync-binlog-thread-num interaction. The note on line 117 is crucial and should be highlighted or placed in a more prominent position to ensure it's not overlooked during configuration changes. This helps in maintaining the consistency and performance of the replication feature.
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- tests/assets/default.conf (7 hunks)
Additional comments not posted (1)
tests/assets/default.conf (1)
46-51: Ensure the sync-binlog-thread-num configuration adheres to best practices. The comment on line 49 suggests that sync-binlog-thread-num should equal the number of databases, but the actual setting on line 51 sets it to 1, which may not align if the number of databases is greater than 1. This could lead to suboptimal performance or other issues if not configured correctly.
Verification successful
The current configuration adheres to best practices:
- The databases setting is 1.
- The sync-binlog-thread-num setting is 1.
Since the number of databases is set to 1, the setting of sync-binlog-thread-num to 1 is appropriate and follows the recommended configuration.
Scripts executed
#!/bin/bash
# Verify if the number of databases is greater than 1 and if so, ensure sync-binlog-thread-num is set appropriately.
cat tests/assets/default.conf | grep -E 'databases|sync-binlog-thread-num'

Length of output: 929
tests/assets/default.conf
Outdated
# This parameter is used to control whether to separate fast and slow commands.
# When slow-cmd-pool is set to yes, fast and slow commands are separated.
# When set to no, they are not separated.
slow-cmd-pool : no
Clarify the behavior of the slow-cmd-pool setting.
The comments around the slow-cmd-pool setting could be more descriptive. It's mentioned what happens when it's set to 'yes' or 'no', but not what constitutes a 'slow' command. This could lead to confusion about how Pika determines command speed and categorization.
tests/assets/default.conf
Outdated
# Rocksdb max_subcompactions, increasing this value can accelerate the exec speed of a single compaction task
# it's recommended to increase it's value if large compaction is found in you instance
Clarify the max-subcompactions setting.
The comment on line 249 could be improved by specifying what is considered a "large compaction" and providing guidelines or examples of scenarios where increasing max-subcompactions would be beneficial.
tests/assets/default.conf
Outdated
# The total size of wal files, when reaches this limit, rocksdb will force the flush of column-families
# whose memtables are backed by the oldest live WAL file. Also used to control the rocksdb open time when
# process restart.
max-total-wal-size : 1073741824
Document the impact of max-total-wal-size on system performance.
The setting max-total-wal-size on line 329 is crucial for controlling the RocksDB behavior. It would be beneficial to include more detailed documentation on how this setting affects system performance, particularly in terms of memory management and data integrity during unexpected shutdowns.
tests/assets/default.conf
Outdated
# The number of rocksdb background threads(sum of max-background-compactions and max-background-flushes) | ||
# If max-background-jobs has a valid value AND both 'max-background-flushs' and 'max-background-compactions' is set to -1, | ||
# then max-background-flushs' and 'max-background-compactions will be auto config by rocksdb, specifically: | ||
# 1/4 of max-background-jobs will be given to max-background-flushs' and the rest(3/4) will be given to 'max-background-compactions'. | ||
# 'max-background-jobs' default value is 3 and the value range is [2, 12]. | ||
max-background-jobs : 3 | ||
|
||
# The number of background flushing threads. | ||
# max-background-flushes default value is 1 and the value range is [1, 4]. | ||
max-background-flushes : 1 | ||
# max-background-flushes default value is -1 and the value range is [1, 4] or -1. | ||
# if 'max-background-flushes' is set to -1, the 'max-background-compactions' should also be set to -1, | ||
# which means let rocksdb to auto config them based on the value of 'max-background-jobs' | ||
max-background-flushes : -1 | ||
|
||
# [NOTICE] you MUST NOT set one of the max-background-flushes or max-background-compactions to -1 while setting another one to other values(not -1). | ||
# They SHOULD both be -1 or both not(if you want to config them manually). | ||
|
||
# The number of background compacting threads. | ||
# max-background-compactions default value is 2 and the value range is [1, 8]. | ||
max-background-compactions : 2 | ||
# max-background-compactions default value is -1 and the value range is [1, 8] or -1. | ||
# if 'max-background-compactions' is set to -1, the 'max-background-flushes' should also be set to -1, | ||
# which means let rocksdb to auto config them based on the value of 'max-background-jobs' | ||
max-background-compactions : -1 | ||
|
Review auto-configuration logic for RocksDB background jobs.
The auto-configuration logic described from lines 361 to 382 is complex and could benefit from additional examples or a more detailed explanation. Ensuring that users fully understand how these settings interact could prevent misconfigurations that might lead to performance degradation.
tests/assets/default.conf
Outdated
# Rsync Rate limiting configuration [Default value is 200MB/s]
# [USED BY SLAVE] The transmitting speed(Rsync Rate) In full replication is controlled BY SLAVE NODE, You should modify the throttle-bytes-per-second in slave's pika.conf if you wanna change the rsync rate limit.
# [Dynamic Change Supported] send command 'config set throttle-bytes-per-second new_value' to SLAVE NODE can dynamically adjust rsync rate during full sync(use config rewrite can persist the changes).
throttle-bytes-per-second : 207200000

# Rsync timeout in full sync stage[Default value is 1000 ms], unnecessary retries will happen if this value is too small.
# [Dynamic Change Supported] similar to throttle-bytes-per-second, rsync-timeout-ms can be dynamically changed by configset command
# [USED BY SLAVE] Similar to throttle-bytes-per-second, you should change rsync-timeout-ms's value in slave's conf file if it is needed to adjust.
rsync-timeout-ms : 1000
Clarify dynamic configuration capabilities for replication settings.
The settings for throttle-bytes-per-second and rsync-timeout-ms include support for dynamic changes, which is a powerful feature. It might be helpful to provide examples of how to use these dynamic settings effectively, especially in scenarios involving large data transfers or high network latency.
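A hedged sketch of adjusting these on the SLAVE node at runtime, using the 'config set' and 'config rewrite' forms quoted in the config comments above; the concrete values are arbitrary examples (about 100 MB/s and a 2 s timeout), and r is assumed to be a client connected to the slave:
r config set throttle-bytes-per-second 104857600
r config set rsync-timeout-ms 2000
r config rewrite   ;# persist the dynamically applied values, as the comments suggest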
Actionable comments posted: 0
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- tests/integration/geo_test.go (1 hunks)
Files skipped from review due to trivial changes (1)
- tests/integration/geo_test.go
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- src/storage/src/redis_zsets.cc (3 hunks)
- tests/assets/default.conf (7 hunks)
Files skipped from review as they are similar to previous changes (2)
- src/storage/src/redis_zsets.cc
- tests/assets/default.conf
src/storage/src/redis_zsets.cc
Outdated
@@ -1187,6 +1186,8 @@ Status Redis::ZScore(const Slice& key, const Slice& member, double* score) {
  uint64_t tmp = DecodeFixed64(data_value.data());
  const void* ptr_tmp = reinterpret_cast<const void*>(&tmp);
  *score = *reinterpret_cast<const double*>(ptr_tmp);
} else if (s.IsNotFound()) {
  return Status::NotFound("Invaild member");
The word "Invalid" is misspelled here ("Invaild member").
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- src/storage/src/redis_zsets.cc (2 hunks)
- tests/assets/default.conf (8 hunks)
Files skipped from review as they are similar to previous changes (2)
- src/storage/src/redis_zsets.cc
- tests/assets/default.conf
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- tests/assets/default.conf (1 hunks)
Files skipped from review due to trivial changes (1)
- tests/assets/default.conf
src/pika_geo.cc
Outdated
int32_t card = db->storage()->Exists({range.storekey});
if (card) {
  db->storage()->Del({range.storekey});
  db->cache()->Del({range.storekey});
If the deletion from the DB fails, deleting from the cache could leave the two layers inconsistent, so shouldn't the result of the DB delete be checked here?
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/pika_geo.cc (11 hunks)
Additional comments not posted (6)
src/pika_geo.cc (6)
13-13: Header inclusion approved. The inclusion of "rocksdb/status.h" is appropriate as it is used throughout the file for handling status checks from RocksDB operations.
63-63: Updated error handling logic in GeoAddCmd::Do(). The handling of rocksdb::Status is now more granular, which improves error specificity and debugging capabilities.
107-107: Enhanced error handling in GeoPosCmd::Do(). The addition of handling for IsInvalidArgument increases robustness and aligns with other similar functions.
167-174: Refined error handling in GeoDistCmd::Do(). The addition of specific checks for IsInvalidArgument and other status-related errors enhances the function's ability to handle edge cases effectively. Also applies to: 174-174
246-246: Improved error handling in GeoHashCmd::Do(). The handling of IsInvalidArgument errors is a good addition, ensuring that the function can gracefully handle more error scenarios.
318-324: Comprehensive updates in GetAllNeighbors. The changes in error handling, sorting logic, and the handling of the store and storedist options enhance the function's flexibility and correctness. The error handling is now more robust, and the sorting logic is clearer and more configurable. Also applies to: 350-355, 367-378
* modify geo.tcl ci
* modify go_test
* modify default.conf
* modify code based on review
Fixed several geo bugs:
- Modified error messages to be consistent with Redis. For example, for the same error, Pika returns "ERR Invalid argument", while Redis returns "WRONGTYPE Operation against a key holding the wrong kind of value".
- When using the GEORADIUS command, Pika's default sort value is Unsort, whereas Redis's default sort value is Asc.
- When the store and storedist options are enabled in the GEO commands, Pika does not ensure data consistency between the storage layer and the cache layer.
- When the store and storedist options are enabled in the GEO commands, Pika appends the new results to the target key, while Redis replaces the existing data in the target key with the new data.
- There is a logical error in Pika when finding the search boundaries (using geohashBoundingBox).
- There is a logical error in Pika when calculating the distance between two points (using geohashGetDistance).
- There is a logical error in Pika when validating the step's validity.
Result display:
Pika can pass all geo TCL test cases except for the unsupported geo commands.
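As an illustration of the first fix (error-message consistency), a hedged Tcl sketch of the behavior expected after the change; it is not an actual test from the PR, and the key name "mystring" is hypothetical:
r del mystring
r set mystring "not a zset"
catch {r geoadd mystring 13.361389 38.115556 "Palermo"} err
assert_match "WRONGTYPE*" $err   ;# previously Pika answered "ERR Invalid argument"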
Summary by CodeRabbit
- Bug Fixes
- Documentation
- Refactor
- Configuration