Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hnsw_stats.ndis seems wrong, it contains hnsw visited, which actual not call dis() #3819

Closed
ssk01 opened this issue Sep 2, 2024 · 3 comments

Comments

@ssk01
Copy link

ssk01 commented Sep 2, 2024

Summary

hnsw_stats.ndis seems wrong, it contains hnsw visited, which actual not call dis()

Reproduction instructions

  1. in old version of faiss

before (HNSW speedup + Distance 4 points ([#2841])
ndis didn't contain visited neighbors
image

  1. current main branch, it contains visited neighbors
image
  1. I use bench_hnsw.py as a demo. current hnsw_stats.ndis "ndis + visited"(94231860) is 42% larger than actual"ndis"(65960703).I guess that this ratio varies across vectors generated by different models.
// main branch
(base) 1984MacBook-Air benchs % python3 bench_hnsw.py 1 hnsw_sq
load data
10000 128
Testing HNSW with a scalar quantizer
training
add
hnsw_add_vertices: adding 1000000 elements on top of 0 (preset_levels=0)
  max_level = 5
Adding 1 elements at level 5
Adding 15 elements at level 4
Adding 194 elements at level 3
Adding 3693 elements at level 2
Adding 58500 elements at level 1
Adding 937597 elements at level 0
Done in 25994.155 ms
search
efSearch 16        0.012 ms per query, R@1 0.7797, missing rate 0.0000, ndis 4782698
efSearch 32        0.014 ms per query, R@1 0.8731, missing rate 0.0000, ndis 12252905
efSearch 64        0.023 ms per query, R@1 0.9285, missing rate 0.0000, ndis 25107377
efSearch 128       0.043 ms per query, R@1 0.9583, missing rate 0.0000, ndis 48825286
efSearch 256       0.082 ms per query, R@1 0.9714, missing rate 0.0000, ndis 94231860
// I try fix it.
(base) 1984MacBook-Air benchs % python3 bench_hnsw.py 1 hnsw_sq
load data
10000 128
Testing HNSW with a scalar quantizer
training
add
hnsw_add_vertices: adding 1000000 elements on top of 0 (preset_levels=0)
  max_level = 5
Adding 1 elements at level 5
Adding 15 elements at level 4
Adding 194 elements at level 3
Adding 3693 elements at level 2
Adding 58500 elements at level 1
Adding 937597 elements at level 0
Done in 21349.235 ms
search
efSearch 16        0.012 ms per query, R@1 0.7775, missing rate 0.0000, ndis 4190685
efSearch 32        0.014 ms per query, R@1 0.8706, missing rate 0.0000, ndis 10376491
efSearch 64        0.025 ms per query, R@1 0.9284, missing rate 0.0000, ndis 20265123
efSearch 128       0.045 ms per query, R@1 0.9590, missing rate 0.0000, ndis 36973128
efSearch 256       0.084 ms per query, R@1 0.9707, missing rate 0.0000, ndis 65960703

@alexanderguzhva, should I make a pull request to fix it ?

@alexanderguzhva
Copy link
Contributor

@ssk01 thanks, let me see

@alexanderguzhva
Copy link
Contributor

@ssk01 a definite bug, thanks for spotting it out!

facebook-github-bot pushed a commit that referenced this issue Sep 9, 2024
…3840)

Summary:
#3819

A definite bug in my code from the past.
The number of reported distances is higher that the number of distances actually computed.

Pull Request resolved: #3840

Reviewed By: junjieqi

Differential Revision: D62392577

Pulled By: mengdilin

fbshipit-source-id: c4d595849cc95e77eb6cd8d499b3128bbe9a6e13
@mnorris11
Copy link

I will go ahead and close this one, @ssk01 let me know if there's still an issue

aalekhpatel07 pushed a commit to aalekhpatel07/faiss that referenced this issue Oct 17, 2024
…acebookresearch#3840)

Summary:
facebookresearch#3819

A definite bug in my code from the past.
The number of reported distances is higher that the number of distances actually computed.

Pull Request resolved: facebookresearch#3840

Reviewed By: junjieqi

Differential Revision: D62392577

Pulled By: mengdilin

fbshipit-source-id: c4d595849cc95e77eb6cd8d499b3128bbe9a6e13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants