Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for checking the size of the table #10

Open
nikita-volkov opened this issue Nov 20, 2013 · 10 comments
Open

Add support for checking the size of the table #10

nikita-volkov opened this issue Nov 20, 2013 · 10 comments

Comments

@nikita-volkov
Copy link

No description provided.

@nominolo
Copy link

This ticket's description is... lacking. I assume it means the following problem:

If you call newSized 187787582217851179 (e.g. a QuickCheck test or programming error) a useful exception should be thrown. Currently you get *** Failed! Exception: 'Data.Vector.Mutable: uninitialised element'. In other words, newSized should check its input parameter.

@gregorycollins
Copy link
Owner

@nominolo I think the underlying vector code should be responsible for throwing an exception or otherwise indicating OOM in this case.

@nikita-volkov
Copy link
Author

@nominolo I meant counting the amount of elements in table.

@gregorycollins
Copy link
Owner

In general the reason there's no size function for the typeclass is that it's O(n) for all but one of them. That said, I should probably add a size function to BasicHashTable, since O(1) size is possible there.

@nikita-volkov
Copy link
Author

What about adding an extra STRef Int field to the other implementations, which will keep the current size? To this day I had to make wrapper APIs just because of that.

@gregorycollins
Copy link
Owner

Because hashtables are about speed, and if you don't need to do this bookkeeping (and clearly it's not necessary at least in the case of cuckoo hash), the program can go faster. As you've noted, the wrapper API is trivial.

One thing we could do is include the wrapper into the package -- but doing it right is a matter of design. We do have this information for the "basic" (linear probing) table, just not for the others -- so we wouldn't want to do extra work there. Associated type synonyms might work there.

@nikita-volkov
Copy link
Author

I've implemented a dome library over "hashtables", which approaches this issue amongst others. In the coming days I'll be test-driving it in another project. If no major issues arise, I'll release it right after. I'll post back here with updates.

@nikita-volkov
Copy link
Author

I've just released the library I was talking about: http://hackage.haskell.org/package/hashtables-plus

@infinity0
Copy link

infinity0 commented Dec 14, 2018

As you've noted, the wrapper API is trivial.

The "trivial" API would be inefficient since it would involving doing a lookup for each insert to check whether we should increment the counter by 0 or 1. It does need to be added specifically to the implementation. Alternatively we add a single class method maybeMutate (Maybe k -> Maybe v) where the second function takes the result of lookup and then re-implement insert/delete/lookup in terms of maybeMutate. Then and only then could we efficienlty and generically implement a wrapper.

Actually nvmd, it seems the existing mutateST :: (Eq k, Hashable k) => h s k v -> k -> (Maybe v -> ST s (Maybe v, a)) -> ST s a should do what I need. But then a generic wrapper needs to also overwrite mutate/insert/delete/lookup to pass it through mutateST.

@sanketr
Copy link

sanketr commented Jun 11, 2021

I second this. It will be useful to have O(1) size function, just like what vector, ByteString etc. provide. For now, I can do O(n) using toList but that also likely creates temporaries to serialize the contents. Keeping track of size is very important in production applications. Too bad that hashtables-plus was deprecated by @nikita-volkov.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants