coredump in knn_inner_product_blas::sgemm_ #1051

Closed
fenglonz opened this issue Dec 3, 2019 · 3 comments
fenglonz commented Dec 3, 2019

Summary

A core dump occurs when searching an index created with index_factory and METRIC_INNER_PRODUCT:

faiss::index_factory(ndim, "Flat", faiss::MetricType::METRIC_INNER_PRODUCT);
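
For anyone trying to narrow this down: one way to check whether the problem is specific to the BLAS call is to compare against a brute-force inner-product search that never goes through sgemm_. Below is a minimal sketch, assuming row-major float arrays. `naive_knn_inner_product` is a hypothetical helper written for illustration, not a Faiss function, and it is not how Faiss implements the search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical reference helper (NOT part of Faiss): brute-force top-k
// inner-product search with no BLAS call.
// x holds nx queries and y holds ny database vectors, both row-major
// with dimension d. For each query, returns up to k (score, index)
// pairs sorted by decreasing inner product.
typedef std::pair<float, std::size_t> ScoredId;

std::vector<std::vector<ScoredId>> naive_knn_inner_product(
        const std::vector<float>& x,
        const std::vector<float>& y,
        std::size_t d,
        std::size_t k) {
    assert(d > 0 && x.size() % d == 0 && y.size() % d == 0);
    const std::size_t nx = x.size() / d;
    const std::size_t ny = y.size() / d;
    const std::size_t kk = std::min(k, ny);

    std::vector<std::vector<ScoredId>> results(nx);
    for (std::size_t i = 0; i < nx; ++i) {
        std::vector<ScoredId> scores(ny);
        for (std::size_t j = 0; j < ny; ++j) {
            // Plain dot product between query i and database vector j.
            const float ip = std::inner_product(
                    x.begin() + i * d, x.begin() + (i + 1) * d,
                    y.begin() + j * d, 0.0f);
            scores[j] = ScoredId(ip, j);
        }
        // Keep the kk largest scores, best first.
        std::partial_sort(
                scores.begin(), scores.begin() + kk, scores.end(),
                [](const ScoredId& a, const ScoredId& b) {
                    return a.first > b.first;
                });
        scores.resize(kk);
        results[i] = scores;
    }
    return results;
}
```

If this runs cleanly on the same machine and the same data while the Faiss search crashes, that would point toward the linked BLAS library rather than the input vectors.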

OS: Linux version 3.10.0-514.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Nov 22 16:42:41 UTC 2016

Faiss version: downloaded from GitHub on Oct 17, 2019

Running on:
CPU

Interface:
C++

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./bin/index_server'.
Program terminated with signal 4, Illegal instruction.
#0 0x00000000009c32e7 in sgemm_ ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7_3.1.x86_64 libgcc-4.8.5-11.el7.x86_64 libgfortran-4.8.5-11.el7.x86_64 libgomp-4.8.5-11.el7.x86_64 libquadmath-4.8.5-11.el7.x86_64 libstdc++-4.8.5-11.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00000000009c32e7 in sgemm_ ()
#1 0x0000000000680f6e in knn_inner_product_blas (res=0x7ff9a29a39e0, ny=7656, nx=20, d=, y=0x7ff959993740, x=) at utils/distances.cpp:257
#2 faiss::knn_inner_product (x=, y=0x7ff959993740, d=, nx=20, ny=7656, res=res@entry=0x7ff9a29a39e0) at utils/distances.cpp:365
#3 0x00000000006888d4 in faiss::IndexFlat::search (this=, n=, x=, k=, distances=, labels=0x7ff8bc44d950)
at IndexFlat.cpp:50
#4 0x00000000005fb0a8 in broalgo::index::FaissManager::Search (this=0x12e72e0 broalgo::index::FaissManager::manager_, src_doc=std::vector of length 150, capacity 150 = {...},
count=count@entry=40, metric_type=metric_type@entry=0, incDocs=..., resDocs=std::vector of length 0, capacity 0) at searcher/faiss_manager.cpp:333
#5 0x00000000005b3aa2 in broalgo::index::SearchQueryTaskElement::ExecuteQuery (this=this@entry=0x7ff9a29a4710, db_id=@0x7ff9a29a45a0: 4, p_scoring=std::shared_ptr (empty) 0x0,
trigger_docs=std::vector of length 35984, capacity 35984 = {...}) at searcher/search_query_task.cpp:103
#6 0x00000000005b47c4 in broalgo::index::SearchQueryTaskElement::Execute (this=this@entry=0x7ff9a29a4710) at searcher/search_query_task.cpp:214
#7 0x00000000005b4f6f in broalgo::index::SearchQueryTask::run (this=, element=...) at searcher/search_query_task.cpp:246
#8 0x00000000005b51fe in broalgo::index::ConsumerTaskbroalgo::index::SearchQueryTaskElement::Run (this=0x7ff9a29a4790, running=...) at ./common/consumer_task.h:25
#9 0x00000000005f19c3 in operator() (task=..., pool=, __closure=) at ./common/thread_pool.h:35
#10 _M_invoke<0ul, 1ul> (this=) at /usr/include/c++/4.8.2/functional:1732
#11 operator() (this=) at /usr/include/c++/4.8.2/functional:1720
#12 std::thread::_Impl<std::_Bind_simple<broalgo::index::ThreadPool<broalgo::index::SearchQueryTaskElement, broalgo::index::SearchQueryTask>::Start(broalgo::index::SearchQueryTask)::{lambda(broalgo::index::ThreadPool<broalgo::index::SearchQueryTaskElement, broalgo::index::SearchQueryTask>, broalgo::index::SearchQueryTask)#1} (broalgo::index::ThreadPool<broalgo::index::SearchQueryTaskElement, broalgo::index::SearchQueryTask>, broalgo::index::SearchQueryTask)> >::_M_run() (this=) at /usr/include/c++/4.8.2/thread:115
#13 0x00007ff9caa69230 in ?? () from /lib64/libstdc++.so.6
#14 0x00007ff9cb0d0dc5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ff9c9fac73d in clone () from /lib64/libc.so.6
(gdb) quit

Author

fenglonz commented Dec 4, 2019

The same core dump occurred again.
-rw------- 1 4956876800 Dec 4 09:54 core.11573
-rw------- 1 4548632576 Dec 3 19:13 core.20276
-rw------- 1 5494566912 Dec 3 19:40 core.30698

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./bin/index_server'.
Program terminated with signal 4, Illegal instruction.
#0 0x00000000009c32e7 in sgemm_ ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7_3.1.x86_64 libgcc-4.8.5-11.el7.x86_64 libgfortran-4.8.5-11.el7.x86_64 libgomp-4.8.5-11.el7.x86_64 libquadmath-4.8.5-11.el7.x86_64 libstdc++-4.8.5-11.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00000000009c32e7 in sgemm_ ()
#1 0x0000000000680f6e in knn_inner_product_blas (res=0x7fb49afb09e0, ny=6784, nx=32, d=, y=0x7fb46c95f890, x=) at utils/distances.cpp:257
#2 faiss::knn_inner_product (x=, y=0x7fb46c95f890, d=, nx=32, ny=6784, res=res@entry=0x7fb49afb09e0) at utils/distances.cpp:365
#3 0x00000000006888d4 in faiss::IndexFlat::search (this=, n=, x=, k=, distances=, labels=0x7fb3b008eca0)
at IndexFlat.cpp:50
#4 0x00000000005fb0a8 in broalgo::index::FaissManager::Search (this=0x12e72e0 broalgo::index::FaissManager::manager_, src_doc=std::vector of length 148, capacity 148 = {...},
count=count@entry=40, metric_type=metric_type@entry=0, incDocs=..., resDocs=std::vector of length 0, capacity 0) at searcher/faiss_manager.cpp:333
#5 0x00000000005b3aa2 in broalgo::index::SearchQueryTaskElement::ExecuteQuery (this=this@entry=0x7fb49afb1710, db_id=@0x7fb49afb15a0: 4, p_scoring=std::shared_ptr (empty) 0x0,
trigger_docs=std::vector of length 29218, capacity 29218 = {...}) at searcher/search_query_task.cpp:103
#6 0x00000000005b47c4 in broalgo::index::SearchQueryTaskElement::Execute (this=this@entry=0x7fb49afb1710) at searcher/search_query_task.cpp:214
#7 0x00000000005b4f6f in broalgo::index::SearchQueryTask::run (this=, element=...) at searcher/search_query_task.cpp:246
#8 0x00000000005b51fe in broalgo::index::ConsumerTaskbroalgo::index::SearchQueryTaskElement::Run (this=0x7fb49afb1790, running=...) at ./common/consumer_task.h:25
#9 0x00000000005f19c3 in operator() (task=..., pool=, __closure=) at ./common/thread_pool.h:35
#10 _M_invoke<0ul, 1ul> (this=) at /usr/include/c++/4.8.2/functional:1732
#11 operator() (this=) at /usr/include/c++/4.8.2/functional:1720
#12 std::thread::_Impl<std::_Bind_simple<broalgo::index::ThreadPool<broalgo::index::SearchQueryTaskElement, broalgo::index::SearchQueryTask>::Start(broalgo::index::SearchQueryTask)::{lambda(broalgo::index::ThreadPool<broalgo::index::SearchQueryTaskElement, broalgo::index::SearchQueryTask>, broalgo::index::SearchQueryTask)#1} (broalgo::index::ThreadPool<broalgo::index::SearchQueryTaskElement, broalgo::index::SearchQueryTask>, broalgo::index::SearchQueryTask)> >::_M_run() (this=) at /usr/include/c++/4.8.2/thread:115
#13 0x00007fb4e8bdb230 in ?? () from /lib64/libstdc++.so.6
#14 0x00007fb4e9242dc5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007fb4e811e73d in clone () from /lib64/libc.so.6

Contributor

mdouze commented Feb 28, 2020

Still unable to repro. Closing issue.

mdouze closed this as completed Feb 28, 2020
@sandrew11
> same coredump again. [quote of fenglonz's Dec 4, 2019 comment, with the same core file listing and gdb backtrace as above]

Has this problem finally been solved? What was the cause?
