OSPFd: Crashed on Hash table Access on Multiple Scenario's #4223

Closed
SathishVenkatesh7 opened this issue Apr 29, 2019 · 3 comments

SathishVenkatesh7 commented Apr 29, 2019

Version: FRR-4.0.0

We are seeing the ospfd daemon crash with the following backtraces:

(The crash occurs in the hash_clean and hash_get functions. It looks like a synchronization problem in which invalid memory is accessed. The crash is observed mainly after a node restart.)

Please let me know if anybody else is facing this issue or has any data that would help investigate further.

Scenario 1:

Catchpoint 1 (signal SIGSEGV), route_node_lookup (table=, pu=...) at lib/table.c:254
254 lib/table.c: No such file or directory.
#0 route_node_lookup (table=, pu=...) at lib/table.c:254
#1 0x10021db4 in ospf_lsdb_lookup (lsdb=, lsa=lsa@entry=0x1029be20) at ospfd/ospf_lsdb.c:207
#2 0x1001de44 in ospf_discard_from_db (ospf=ospf@entry=0x101e3000, lsdb=, lsa=lsa@entry=0x1029be20) at ospfd/ospf_lsa.c:2422
#3 0x10020d84 in ospf_maxage_lsa_remover (thread=0xbf8781a0) at ospfd/ospf_lsa.c:2704
#4 0x0fc0f4a0 in thread_call (thread=) at lib/thread.c:1385
#5 0x0fbe15b4 in frr_run (master=) at lib/libfrr.c:879
#6 0x1000d240 in main (argc=7, argv=) at ospfd/ospf_main.c:197
(gdb)

Scenario 2:

Catchpoint 1 (signal SIGSEGV), hash_clean (hash=, free_func=) at lib/hash.c:298
298 lib/hash.c: No such file or directory.
[Current thread is 1 (process 8347)]
#0 hash_clean (hash=, free_func=) at lib/hash.c:298
#1 0x0fc398b0 in route_table_free (rt=) at lib/table.c:102
#2 route_table_finish (rt=) at lib/table.c:59
#3 0x10021bf0 in ospf_lsdb_cleanup (lsdb=lsdb@entry=0x102928d0) at ospfd/ospf_lsdb.c:67
#4 0x100222b0 in ospf_nbr_free (nbr=0x102927a0) at ospfd/ospf_neighbor.c:122
#5 0x10022484 in ospf_nbr_delete (nbr=0x102927a0) at ospfd/ospf_neighbor.c:208
#6 0x10022904 in ospf_nbr_self_reset (oi=oi@entry=0x102845e0, router_id=) at ospfd/ospf_neighbor.c:229
#7 0x10014ad8 in ospf_if_cleanup (oi=0x102845e0) at ospfd/ospf_interface.c:318
#8 0x10017648 in ism_interface_down (oi=) at ospfd/ospf_ism.c:380
#9 0x100176cc in ospf_ism_event (thread=) at ospfd/ospf_ism.c:586
#10 0x0fc3f4a0 in thread_call (thread=) at lib/thread.c:1385
#11 0x0fc3f66c in funcname_thread_execute (m=, func=, arg=, val=, funcname=, schedfrom=, fromln=) at lib/thread.c:1439
#12 0x10015c60 in ospf_if_down (oi=0x102845e0) at ospfd/ospf_interface.c:815
#13 0x10060770 in ospf_interface_state_down (command=, zclient=, length=, vrf_id=) at ospfd/ospf_zebra.c:259
#14 0x0fc4948c in zclient_read (thread=) at lib/zclient.c:2192
#15 0x0fc3f4a0 in thread_call (thread=) at lib/thread.c:1385
#16 0x0fc115b4 in frr_run (master=) at lib/libfrr.c:879
#17 0x1000d240 in main (argc=7, argv=) at ospfd/ospf_main.c:197

SathishVenkatesh7 added the triage (Needs further investigation) label Apr 29, 2019
qlyoung added the libfrr and ospf labels and removed the triage (Needs further investigation) and libfrr labels Apr 29, 2019

qlyoung (Member) commented Apr 30, 2019

@SathishVenkatesh7 can you recreate this on a later release, preferably 7.0 or above? 4.0 is well out of our support lifetime at this point.

SathishVenkatesh7 (Author) commented

> @SathishVenkatesh7 can you recreate this on a later release, preferably 7.0 or above? 4.0 is well out of our support lifetime at this point.

@qlyoung I tried to reproduce this issue on FRR-7.0 and see the same ospfd crash there as well. The hash corruption appears specifically after a node restart.

(gdb) bt
#0 0x10292b88 in ?? ()
#1 0x0fbd6c08 in hash_get (hash=, data=, alloc_func=) at lib/hash.c:149
#2 0x0fc09c28 in route_node_lookup (table=, pu=...) at lib/table.c:254
#3 0x10021cc4 in ospf_lsdb_lookup (lsdb=, lsa=lsa@entry=0x10299b78) at ospfd/ospf_lsdb.c:207
#4 0x1001dd54 in ospf_discard_from_db (ospf=ospf@entry=0x101e3138, lsdb=, lsa=lsa@entry=0x10299b78) at ospfd/ospf_lsa.c:2398
#5 0x10020c94 in ospf_maxage_lsa_remover (thread=0xbfb24850) at ospfd/ospf_lsa.c:2680
#6 0x0fc0f4a0 in thread_call (thread=) at lib/thread.c:1385
#7 0x0fbe15b4 in frr_run (master=) at lib/libfrr.c:879
#8 0x1000d2b0 in main (argc=7, argv=) at ospfd/ospf_main.c:199
(gdb)

odd22 (Member) commented May 23, 2019

@SathishVenkatesh7 can you provide the configuration (zebra + ospfd) you use and the circumstances of the crash?
