Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue in automata release #1506

Closed
subhajit-cdot opened this issue Mar 30, 2022 · 14 comments · Fixed by #1597
Closed

Issue in automata release #1506

subhajit-cdot opened this issue Mar 30, 2022 · 14 comments · Fixed by #1597
Assignees
Labels

Comments

@subhajit-cdot
Copy link

Program terminated with signal SIGABRT, Aborted. #0 0x00007fba2dc187bb in raise () from /lib/x86_64-linux-gnu/libc.so.6 [Current thread is 1 (Thread 0x7fba2d4be400 (LWP 2275))] (gdb) bt #0 0x00007fba2dc187bb in raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007fba2dc03535 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x00007fba2dc5a508 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #3 0x00007fba2dc60c1a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #4 0x00007fba2dc6242c in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #5 0x00007fba2e999e7d in node_release_pattern (thiz=0x563e360ab600) at third_party/src/ahocorasick.c:771 #6 node_release (thiz=0x563e360ab600, free_pattern=<optimized out>) at third_party/src/ahocorasick.c:786 #7 0x00007fba2e999eff in ac_automata_release_node (thiz=thiz@entry=0x563e360b0380, n=n@entry=0x563e360ab600, idx=idx@entry=0, data=data@entry=0x1) at third_party/src/ahocorasick.c:544 #8 0x00007fba2e99aa98 in ac_automata_walk (thiz=0x563e360b0380, node_cb=node_cb@entry=0x7fba2e999ed0 <ac_automata_release_node>, alpha_cb=alpha_cb@entry=0x0, data=data@entry=0x1) at third_party/src/ahocorasick.c:288 #9 0x00007fba2e99b3ab in ac_automata_release (thiz=0x563e360b0380, free_pattern=<optimized out>) at third_party/src/ahocorasick.c:556
(gdb) frame 5 #5 0x00007fba2e999e7d in node_release_pattern (thiz=0x563e360ab600) at third_party/src/ahocorasick.c:771 771 third_party/src/ahocorasick.c: No such file or directory.
(gdb) p *thiz $1 = {id = 10, one_alpha = 0 '\000', one = 0 '\000', range = 0 '\000', root = 0 '\000', final = -1 '\377', use = 0 '\000', ff = -1 '\377', depth = 9, matched_patterns = 0x563e360b1480, outgoing = 0x0, failure_node = 0x563e360a1340, a_ptr = 0x0} (gdb) p *thiz->matched_patterns $2 = {num = 1, max = 8, patterns = 0x563e360b1488}
(gdb) p *thiz->matched_patterns->patterns $3 = {astring = 0x7ffe7074c40e "apple.com", length = 9, is_existing = 0, rep = {number = 1, number64 = 0, breed = 0, category = 0, level = 0, from_start = 0, at_end = 0, dot = 0}}

@vel21ripn
Copy link
Contributor

Try to apply this patch vel21ripn@06e2967 . If it helps, I'll do a PR.

@subhajit-cdot
Copy link
Author

Thanks @vel21ripn , I will try. It seems the patch is from 2021, why is this not part of the latest 4.2 stable?

@vel21ripn
Copy link
Contributor

Yes, 2021.
The error was found in another project (ipset) and the fix was not included in ndpi due to my inattention.
Did the patch fix the problem?

@subhajit-cdot
Copy link
Author

subhajit-cdot commented Mar 30, 2022

No it didn't solve the crash. It's coming while calling ac_automata_release().

@subhajit-cdot
Copy link
Author

subhajit-cdot commented Mar 30, 2022

Got another segfault in the same code segment

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f926cd4e9ef in free () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7f926c5a7400 (LWP 5579))]
(gdb) bt
#0 0x00007f926cd4e9ef in free () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f926da82e7d in node_release_pattern (thiz=0x558ee71e9000) at third_party/src/ahocorasick.c:771
#2 node_release (thiz=0x558ee71e9000, free_pattern=) at third_party/src/ahocorasick.c:786
#3 0x00007f926da82eff in ac_automata_release_node (thiz=thiz@entry=0x558ee7357c40, n=n@entry=0x558ee71e9000, idx=idx@entry=0,
data=data@entry=0x1) at third_party/src/ahocorasick.c:544
#4 0x00007f926da83a98 in ac_automata_walk (thiz=0x558ee7357c40, node_cb=node_cb@entry=0x7f926da82ed0 <ac_automata_release_node>,
alpha_cb=alpha_cb@entry=0x0, data=data@entry=0x1) at third_party/src/ahocorasick.c:288
#5 0x00007f926da8437b in ac_automata_release (thiz=0x558ee7357c40, free_pattern=) at third_party/src/ahocorasick.c:556

771 third_party/src/ahocorasick.c: No such file or directory.
(gdb) p *thiz
$1 = {id = 10, one_alpha = 0 '\000', one = 0 '\000', range = 0 '\000', root = 0 '\000', final = -1 '\377', use = 0 '\000', ff = -1 '\377',
depth = 9, matched_patterns = 0x558ee41ac980, outgoing = 0x0, failure_node = 0x558ee73457c0, a_ptr = 0x0}
(gdb) p *thiz->matched_patterns
$2 = {num = 1, max = 8, patterns = 0x558ee41ac988}
(gdb) p *thiz->matched_patterns->patterns
$3 = {astring = 0x7ffd48b77eae "e_packets", length = 9, is_existing = 0, rep = {number = 1, number64 = 0, breed = 0, category = 0,
level = 0, from_start = 0, at_end = 0, dot = 0}}
(gdb)

@vel21ripn
Copy link
Contributor

There are two options: build with clang and options "-fsanitize=memory -fsanitize=fuzzer-no-link" or run via valgrind

It must be remembered that libahocorasick does not allocate memory for the string when adding a string.
ac_automata_release() has a free_pattern parameter that determines whether the memory occupied by the string should be freed.
If the string that was added was not distributed via ndpi_alloc(), then calling ac_automata_release(,1) will crash or corrupt memory.

@subhajitchatu
Copy link

subhajitchatu commented Mar 31, 2022

Yes @vel21ripn . I have used ndpi_alloc . I was using Ndpi 3.4 with ac_automata_release(,1) after applying this patch in Ndpi 3.4 stable (it was not initially in the release of Ndpi 3.4) . This issue never comes. Also I can see the unitTest funcs are commented in ndpiReader, it has an automata test case I think . I will have to dig further to check the difference between 3.4 and 4.2 in this part of the code/func. Please let me if you need any other info to debug this issue . Thanks

@subhajit-cdot
Copy link
Author

subhajit-cdot commented Apr 5, 2022

Hi @vel21ripn The issue has been resolved. I was using strdupa() instead of strdup(). Now I am confused why this issue was never reported with ndpi-3.4. ndpi-3.4 should also generate SIGABRT or SEGV in case of double free. ( added below patch in ndpi-3.4, [4aefbe0c7a665ee077b3ddbfc0c7738cd7333558] . However, can you please tell if there is any limit on the number of patterns/strings which can be added into a single automata? I can see the length is MAX 256 for a string.

@vel21ripn
Copy link
Contributor

There is no limit on the number of lines (there must be enough RAM).
The maximum length of a string is limited by the AC_PATTRN_MAX_LENGTH value, but it can be increased. This will increase the size of the AC_AUTOMATA_t structure.

@subhajit-cdot
Copy link
Author

Hi @vel21ripn ,

typedef struct {
unsigned short num; /* Number of matched patterns at this node /
unsigned short max; /
Max capacity of allocated memory for matched_patterns */
AC_PATTERN_t patterns[];
} AC_PATTERNS_t;

I was checking node_resize_mp() and mp_data_size() where the max is initialized with 8 at first and each time incremented with 8 when a new string added to automata. Can you please elaborate this logic? Is it optimized for performance?

@vel21ripn
Copy link
Contributor

Statistics can be obtained to evaluate the need to increase REALLOC_CHUNK_MATCHSTR.
In the tests/performance/top-1m.csv file, load a list of your strings.
In the file tests/performance/substringsearch.c add the line "ac_automata_dump(automa,stdout);" after "ndpi_finalize_automa(automa);".
After running, you will see how many elements are in AC_PATTERNS_t ( N:xxxx {} ).
It is recommended to increase REALLOC_CHUNK_OUTGOING by a multiple of 8 (for 64-bit systems).
The default is the minimum value.
The main concern was the optimal use of memory.

@IvanNardi
Copy link
Collaborator

Try to apply this patch vel21ripn@06e2967 . If it helps, I'll do a PR.

@vel21ripn, do you think this patch is still needed in current master (i.e. should we merge it)?

@IvanNardi
Copy link
Collaborator

Try to apply this patch vel21ripn@06e2967 . If it helps, I'll do a PR.

@vel21ripn, do you think this patch is still needed in current master (i.e. should we merge it)?

Kind reminder, @vel21ripn :)

@vel21ripn
Copy link
Contributor

Sorry.
This patch is highly recommended.
If, when searching for strings, 8-bit binary data is supplied as input, then without this patch, the application may crash.
If the patch applies to the current version without errors, then please make a PR.

@IvanNardi IvanNardi self-assigned this Jun 14, 2022
IvanNardi added a commit to IvanNardi/nDPI that referenced this issue Jun 14, 2022
IvanNardi added a commit that referenced this issue Jul 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants