-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ability to search by type signature on the docs #12866
Comments
I really like Hoogle too. (This would be most useful as a global search, currently rustdoc's search is local to a single crate.) |
/cc me |
I'm starting to research on what is needed to get this working. Any tips are appreciated :). I'll post my questions here as soon as I have them. |
Here's what I have after some code exploration: There is already a search index that is being built by rustdoc at generation time. This contains a list of items and paths (an item can be a function, a struct, a field etc.). There is quite some search logic written to query this index. Since we have access to the AST while building the index, I'm thinking that we should also build a mapping from Then, when the search query is performed, if the query contains What do you guys think? |
I think that that’s generally a good way of doing it, although I’m not sure that a string-based signature mapping would be a good idea; I think there are a few subtle cases that need to be handled here that might not interact so well with a text-based system. Certainly, not all of them need to be done straight away for a first implementation, but they’d still be nice in the long term:
Obviously, all of this shouldn’t have to be done for a first implementation; in fact, none of it really has to be done. But I think it’d be best to design a scheme with this in mind such that adding these features in the future will not require a complete restructuring of the system. This is quite a tricky task, though, because half of the searching functionality is done on the client, in Javascript. I think that verbose, fully-expanded-with-lots-of-details signatures should be output to the index, and then the Javascript code would have to parse the query as a function signature, and if successful, search the index for things that roughly match the inputted function’s signature. I think doing this textually might not be such a good idea, but I’m not sure. Later, the function signature from the search query could be varied in a number of ways to take into account some of the things I listed above (and maybe others). I think performance is a bit of a worry, though—loading the existing index already takes a long time. On my system, clearing the cache and reloading the standard library’s docs takes three seconds to download the index alone, which is ~1 MB in size. |
(To add to the list, argument order shouldn't matter, so |
Thanks for all the input! Great points indeed. I'd rather get the format right (or an approximation of it) and basic search (say for simple types, no generics / lifetimes) such that other contributors may have a starting point. How's this for the index representation? Given: JSON: {
"annotations": [
{
"type": "lifetime",
"name": "a"
},
{
"type": "type",
"name": "T",
"traits": [
"Show"
]
}
],
"arguments": [
{
"name": "x",
"lifetime": "a",
"type": "T"
},
{
"name": "x",
"lifetime": "a",
"type": "T"
}
],
"return": {
"type": "T",
"lifetime": "a"
}
} |
Also, I'm considering gzipping everything and extracting it in JS since the computation cost should be small but it would help with size. |
Neal Mitchell has written some things about the various iterations of Hoogle (and lessons learned), which would probably be useful as reference. In writing the (now outdated to the point of uselessness) simple version of this at https://github.com/dbp/rustle , one thing that turned out to be important to get useful ordering of results was to progressively generalize the query and order the results based on that. So Also, instead of searching for |
FWIW, I took a look yesterday at dumping a stylized type signature from RustDoc that Hoogle can then read in, which allows us to reuse all the fancy type search algorithms Hoogle uses (retrieval, selective generalization, canonicalization, scoring, etc). It seems to work OK as a demo with even just a simple implementation (~200 lines in https://github.com/ajtulloch/roogle has some more documentation on the approach, usage instructions, and examples (along with the Hoogle changes), and https://github.com/ajtulloch/rust has the The main missing features (relative to Hoogle for Haskell) are:
|
I'm pulling a massive triage effort to get us ready for 1.0. As part of this, I'm moving stuff that's wishlist-like to the RFCs repo, as that's where major new things should get discussed/prioritized. This issue has been moved to the RFCs repo: rust-lang/rfcs#658 |
This adds search by type (for functions/methods) support to Rustdoc. Target issue is at rust-lang/rfcs#658. I've described my approach here: rust-lang/rfcs#658 (comment). I'll copy the text in here as well: --- Hi, it took me longer than I wished, but I have implemented this in a not-too-complex way that I think can be extended to support more complex features (like the ones mentioned [here](#12866 (comment))). The idea is to generate a JSON representation of the types of methods/functions in the existing index, and then make the JS understand when it should look by type (and not by name). I tried to come up with a JSON representation that can be extended to support generics, bounds, ref/mut annotations and so on. Here are a few samples: Function: ```rust fn to_uppercase(c: char) -> char ``` ```json { "inputs": [ {"name": "char"} ], "output": { "name": "char", } } ``` Method (implemented or defined in trait): ```rust // in struct Vec // self is considered an argument as well fn capacity(&self) -> usize ``` ```json { "inputs": [ {"name": "vec"} ], "output": { "name": "usize" } } ``` This simple format can be extended by adding more fields, like `generic: bool`, a `bounds` mapping and so on. I have a working implementation in master...mihneadb:rustdoc-search-by-type. You can check out a live demo [here](http://data.mihneadb.net/doc/std/index.html?search=charext%20-%3E%20char). ![screenshot from 2015-02-28 00 54 00](https://cloud.githubusercontent.com/assets/643127/6422722/7e5374ee-bee4-11e4-99a6-9aac3c9d5068.png) The feature list is not that long: - search by types (you *can* use generics as well, as long as you use the exact name - e.g. [`vec,t -> `](http://data.mihneadb.net/doc/std/index.html?search=vec%2C%20t%20-%3E)) - order of arguments does not matter - `self` is took into account as well (e.g. search for `vec -> usize`) - does not use "complex" annotations (e.g. you don't search for `&char -> char` but for `char -> char`) My goal is to get a working, minimal "base" merged so that others can build upon it. How should I proceed? Do I open a PR (badly in need of code review since this is my first non "hello world"-ish rust code)? ---
One of hoogle's most useful features is the ability to search by type signatures. It makes finding functions whose behavior you know but whose names you don't know much easier.
We should add this ability to the search feature in the rust docs.
The text was updated successfully, but these errors were encountered: