Off-by-one memory error in a string fastsearch since 3.11 #105235
Comments
Also wanted to add that this bug happens not only with |
The simplest fix would be to add bounds checks, but that could eat into some of the performance gains of fastsearch. Perhaps someone more familiar with the intricacies of this search algorithm can come up with something better.
diff --git a/Objects/stringlib/fastsearch.h b/Objects/stringlib/fastsearch.h
index 7403d8a3f7..785407fbf6 100644
--- a/Objects/stringlib/fastsearch.h
+++ b/Objects/stringlib/fastsearch.h
@@ -577,7 +577,8 @@ STRINGLIB(default_find)(const STRINGLIB_CHAR* s, Py_ssize_t n,
continue;
}
/* miss: check if next character is part of pattern */
- if (!STRINGLIB_BLOOM(mask, ss[i+1])) {
+ if ((i != w) && !STRINGLIB_BLOOM(mask, ss[i+1])) {
i = i + m;
}
else {
@@ -586,7 +587,8 @@ STRINGLIB(default_find)(const STRINGLIB_CHAR* s, Py_ssize_t n,
}
else {
/* skip: check if next character is part of pattern */
- if (!STRINGLIB_BLOOM(mask, ss[i+1])) {
+ if ((i != w) && !STRINGLIB_BLOOM(mask, ss[i+1])) {
i = i + m;
}
}
@@ -650,7 +652,8 @@ STRINGLIB(adaptive_find)(const STRINGLIB_CHAR* s, Py_ssize_t n,
}
}
/* miss: check if next character is part of pattern */
- if (!STRINGLIB_BLOOM(mask, ss[i+1])) {
+ if ((i != w) && !STRINGLIB_BLOOM(mask, ss[i+1])) {
i = i + m;
}
else {
@@ -659,7 +662,8 @@ STRINGLIB(adaptive_find)(const STRINGLIB_CHAR* s, Py_ssize_t n,
}
else {
/* skip: check if next character is part of pattern */
- if (!STRINGLIB_BLOOM(mask, ss[i+1])) {
+ if ((i != w) && !STRINGLIB_BLOOM(mask, ss[i+1])) {
i = i + m;
}
}
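To illustrate why the `i != w` guard in the diff matters, here is a minimal Python model of the skip loop in `default_find`. It is a sketch, not the C implementation: a `set` stands in for the Bloom mask, the names `default_find_model` and `guard` are made up for illustration, and a read past the end is modelled as returning 0 (a NUL terminator) instead of faulting. The second return value is the highest haystack index touched by the "next character" check.

```python
def default_find_model(s: bytes, p: bytes, guard: bool = False):
    """Simplified model of the find-mode loop; requires 1 <= len(p) <= len(s).

    Returns (match_index_or_-1, highest_index_peeked_by_next_char_check).
    """
    n, m = len(s), len(p)
    w = n - m            # last valid alignment of the pattern
    mlast = m - 1
    gap = mlast
    last = p[mlast]

    chars = set()        # stand-in for the Bloom mask (assumption)
    for i in range(mlast):
        chars.add(p[i])
        if p[i] == last:
            gap = mlast - i - 1
    chars.add(last)

    max_peek = -1

    def peek(idx: int) -> int:
        # Models ss[i+1]; index n is one past the end of the haystack.
        nonlocal max_peek
        max_peek = max(max_peek, idx)
        return s[idx] if idx < n else 0

    i = 0
    while i <= w:
        if s[mlast + i] == last:
            if s[i:i + mlast] == p[:mlast]:
                return i, max_peek          # got a match
            # miss: check if next character is part of pattern
            if (not guard or i != w) and peek(mlast + i + 1) not in chars:
                i += m
            else:
                i += gap
        else:
            # skip: check if next character is part of pattern
            if (not guard or i != w) and peek(mlast + i + 1) not in chars:
                i += m
        i += 1
    return -1, max_peek


print(default_find_model(b"abc", b"xc"))              # unguarded
print(default_find_model(b"abc", b"xc", guard=True))  # guarded
```

For the haystack `b"abc"` and pattern `b"xc"`, the unguarded model peeks at index 3, which equals `len(s)`, while the guarded model never goes beyond index 2.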
Thanks for the report. We might be able to keep fastsearch.h the way it is (reading one past the end of the haystack, which is safe for null-terminated strings, including Python bytes objects), and instead alter the call in the mmap module to add a special case to check for |
* Add a special case for s[-m:] == p in _PyBytes_Find
* Add tests for _PyBytes_Find
* Make sure that start <= end in mmap.find

…ythonGH-105252)
* Add a special case for s[-m:] == p in _PyBytes_Find
* Add tests for _PyBytes_Find
* Make sure that start <= end in mmap.find
(cherry picked from commit ab86426)
Co-authored-by: Dennis Sweeney <[email protected]>

…H-105252) (#106708)
gh-105235: Prevent reading outside buffer during mmap.find() (GH-105252)
* Add a special case for s[-m:] == p in _PyBytes_Find
* Add tests for _PyBytes_Find
* Make sure that start <= end in mmap.find
(cherry picked from commit ab86426)
Co-authored-by: Dennis Sweeney <[email protected]>
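The "special case for s[-m:] == p" named in the commit message can be pictured with a short Python sketch. This is only an assumption about the general shape of such a fix, not the code from the linked commit: the helper name `find_without_overread` is hypothetical, and `bytes.find` stands in for a search routine that may peek one element past the input it is given.

```python
def find_without_overread(haystack: bytes, needle: bytes) -> int:
    """Find needle in haystack, handing the inner search a shortened haystack.

    Hypothetical helper: if the inner search may read one element past its
    input, giving it all but the final byte keeps that read inside the real
    buffer. A match ending at the last byte is then checked separately.
    """
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if n < m:
        return -1
    # Search everything except the final byte.
    res = haystack[: n - 1].find(needle)
    # Special case: the only alignment the shortened search cannot see
    # is the one where the needle is a suffix of the haystack.
    if res == -1 and haystack[n - m:] == needle:
        return n - m
    return res


print(find_without_overread(b"abc", b"bc"))
```

The result should agree with `bytes.find` on every input; the only difference is which bytes the inner search is allowed to see.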
Bug report
This bug happens in Objects/stringlib/fastsearch.h:589 while matching the last symbol. In some cases it causes a crash, but it is hard to reproduce: the last symbol must be the last one in its memory page, and the next page must either be unreadable or not be contiguous with the previous one.
The simplest script that reproduces the bug for me is:
But since the result of this script depends on the file system, the kernel, and perhaps even the phase of the moon 😄, here's a much more reliable way to reproduce it:
This triggers the bug in every Linux environment I've tried. It relies on an inaccessible memory region to make the bug more likely to occur, and it avoids files to run faster.
Here's some extra info from GDB:
Your environment
I've also tried a slightly modified version of the script on OS X, and it crashes there as well.
cc @sweeneyde (since you are the author of d01dceb and 6ddb09f).
Linked PRs