Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Swap from exact substring matches to each individual word matching #145

Merged
merged 8 commits into from
Jul 21, 2022

Conversation

joellabes
Copy link
Contributor

@joellabes joellabes commented Oct 10, 2020

resolves #143

Description

This is a first attempt at considering each word entered in a search individually, as opposed to requiring a perfect match (model my or my model should now find my_model, where earlier only an exact search, including underscores, would have worked). my_model still works of course.

I couldn't get the Docs site running locally (from looking at the stack traces, seemed to be a bunch of issues with webpack, node versions, python versions...) but I have tested that each individual piece of work behaves correctly outside of the wider Docs context I tried it in the Netlify CI preview and it works correctly.

Checklist

  • I have signed the CLA
  • I have generated docs locally, and this change appears to resolve the stated issue
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Oct 10, 2020
@@ -271,23 +271,24 @@ angular
'tags': 'array',
'arguments': 'array',
};
var search = new RegExp(val, "i")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was never used

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really cool @joellabes !!

I'll wait until @drewbanin can give the JS a once-over, likely next week

@drewbanin
Copy link
Contributor

ha - oops - thanks for your patience @joellabes - giving this a look now :D

Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love what you're going for here, but I think it's about time that we giving searching some more love! What do you think about leveraging something like fusejs here and deleting a lot of this searching logic? I kicked around fuse.js, and it looks to me like it will match the query from the original issue (#143) out of the box!

Joel, if this is something you're interested in pursuing, I'd be really happy to offer some guidance! If not, I imagine it's something that we could get to in the reasonably near future. Let me know either way!

@joellabes
Copy link
Contributor Author

joellabes commented Nov 25, 2020

I'd be so excited for search to get some love - I had an urge to rework the weighting process last week as well. Happy to have a blast with Fuse. I had dramas getting dbt running locally (Windows 🙄) but will give it another whirl - if it doesn't magically work I might have to get a hand on the initial setup!

(Might be a couple of weeks though, so don't feel obliged to wait for me if the mood strikes you)

@joellabes joellabes mentioned this pull request Jan 3, 2021
3 tasks
@joellabes
Copy link
Contributor Author

I've started looking into this on #162 - still a long way from it being ready, but I'm going to close this one for now.

@joellabes joellabes closed this Jan 3, 2021
@joellabes
Copy link
Contributor Author

OK so 162 wound up being non-performant and not very good at fuzzy search. Not a good combo.

Since this has been sitting around waiting for a year and a bit, and works just fine, I'm going to reopen it and hopefully we can get it out into a future docs release. If someone who knows more JS wants to have a swing at fuse or some other proper search lib, I'm into it.

But in general, dbt docs searches are only a couple of words and it's not unreasonable to just require accurate typing 😅

@joellabes joellabes reopened this Apr 5, 2022
@joellabes joellabes changed the base branch from master to main April 5, 2022 04:26
@cla-bot
Copy link

cla-bot bot commented Apr 6, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Joel Labes.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot cla-bot bot removed the cla:yes label Apr 6, 2022
@emmyoop emmyoop requested a review from stu-k April 25, 2022 14:54
Copy link
Contributor

@stu-k stu-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple comments about using lodash functions, nothing else stands out 👍

return string.replace(/[.*+\-?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}

function splitQuery(query){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: a better name for this might be getQueryTokens, since it is unclear what the query is being split on without reading this function. Also, if you use lodash's words method, I think this function becomes unnecessary _.words(text).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that's a better name! Have kept the function for lowercasing purposes but have swapped out all the icky split stuff for _.words

src/app/services/project_service.js Outdated Show resolved Hide resolved
@emmyoop
Copy link
Member

emmyoop commented May 16, 2022

Tested this out and it seems like a big quality of life improvement! 🙌 I'll take it for another spin once you implement @stu-k's suggestions.

Your commit from October 13, 2020 is causing the cla bot to fail because it's associated with your [email protected] email. You should be able to squash down your commits and get the cla bot to pass.

@cla-bot
Copy link

cla-bot bot commented Jul 18, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Joel Labes.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot
Copy link

cla-bot bot commented Jul 18, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Joel Labes.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@joellabes
Copy link
Contributor Author

@cla-bot check

@cla-bot cla-bot bot added the cla:yes label Jul 18, 2022
@cla-bot
Copy link

cla-bot bot commented Jul 18, 2022

The cla-bot has been summoned, and re-checked this pull request!

@joellabes
Copy link
Contributor Author

@emmyoop I've applied @stu-k's suggestions now 🛠️

I couldn't get the squash to work, so just re-added the educationperfect domain to my GH account and the CLA bot is happy!

Copy link
Member

@emmyoop emmyoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thanks for staying on it and pushing it through. The search improvements are so so satisfying.

@emmyoop emmyoop merged commit 07a617f into dbt-labs:main Jul 21, 2022
@joellabes joellabes deleted the feat-143-fuzzier-search branch October 17, 2022 07:16
@rightx2
Copy link

rightx2 commented Feb 16, 2024

This feature really inconvinient. In my case, I need to find out the model named s_user but search result get me all models containing s, or user or both... I can't easily find out s_user among the 108 search results.... makes me scroll down infinitely

Screenshot 2024-02-16 at 2 42 17 PM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement fuzzier search for model names in docs
6 participants