Make quote context searchable #210

Open
mrphlip opened this issue Jun 2, 2016 · 9 comments

@mrphlip
Owner

mrphlip commented Jun 2, 2016

As suggested in-chat by dialMforMara, the context for quotes should be searchable.

It probably should just be treated the same as the actual quote body. Fulltext-index it, and make the normal !findquote command (and the "Search quotes" function of the website) search where either quote or context contains the search term.

(I think making it so !findquote foo bar can find a quote where quote contains "foo" and context contains "bar" would be a bit overcomplicated, so I wouldn't be worried about that. Just (pseudocode) quote contains 'foo bar' or context contains 'foo bar'.)

((Unless it's possible to make a single full-text index over both quote and context? I don't know enough postgresql to know if that's a thing...))

@andreasots
Collaborator

Index over quote || context should work.
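To make the difference between the two matching rules concrete, here is a rough in-memory sketch in Python. It is not PostgreSQL full-text search (which also stems words and drops stop words); the function names are made up for illustration. One thing to watch in a real index: in PostgreSQL a bare `quote || context` yields NULL whenever `context` is NULL and glues the last word of the quote onto the first word of the context, so the indexed expression would probably need a separator and a `coalesce`.

```python
import re

def words(text):
    """Lowercased word set; a crude stand-in for a full-text tokenizer."""
    return set(re.findall(r"[a-z0-9']+", (text or "").lower()))

def match_either(query, quote, context):
    """'quote contains query OR context contains query' (all terms in one field)."""
    terms = words(query)
    return terms <= words(quote) or terms <= words(context)

def match_combined(query, quote, context):
    """Single index over both fields: terms may be split between them."""
    terms = words(query)
    return terms <= (words(quote) | words(context))
```

With the combined rule, `foo bar` also finds a quote where "foo" is in the quote and "bar" is in the context, which the either/or rule does not.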

@danieljcrabtree
Contributor

danieljcrabtree commented Nov 20, 2016

If you want to make context searchable, does it make sense to include attrib_name as well? It should be possible to index over quote || attrib_name || context.

I realise that !quote and the website already have ways to search on attribution but this way a query like 'paul fine' could return something like 'Everything Is Fine! - Paul'.
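A small Python sketch of the idea, assuming the three columns are simply joined into one searchable document. The helper names are hypothetical and the tokenizer is far cruder than a real full-text index.

```python
import re

def tokens(text):
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def search_document(quote, attrib_name=None, context=None):
    """Join the non-empty fields with spaces, like quote || attrib_name || context."""
    return " ".join(part for part in (quote, attrib_name, context) if part)

def matches(query, **fields):
    """True if every query word appears somewhere in the joined document."""
    return tokens(query) <= tokens(search_document(**fields))
```

Here `matches("paul fine", quote="Everything Is Fine!", attrib_name="Paul")` is true, while the same query against the quote text alone is not.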

@mrphlip
Owner Author

mrphlip commented Nov 20, 2016

Hmm... I know it's certainly been the case that people tend to get !quote and !findquote confused, and use one when they mean the other... we could fold the two together and it'd save a fair amount of confusion.

It'd mean we'd lose the ability to find quotes that are by eg Paul rather than about Paul or vice versa, though, and... maybe that's worth it?

Also relevant: the search on attrib_name is a simple "contains" query, not a word search, which matters because it lets people search for eg "Cam" and find quotes attributed to "Cameron". But maybe we can still handle that... it looks like postgres supports custom synonym dictionaries. I don't really understand the details, but it looks like that would let us put in alternate forms of people's names so the search can still find them.
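As a sketch of what the synonym idea amounts to (done in Python here rather than in a postgres dictionary, and with a made-up alias table), each query word is mapped to a canonical form before searching:

```python
# Hypothetical alias table; a real one would list the actual names in use.
NAME_SYNONYMS = {
    "cam": "cameron",
    "ben": "benjamin",
}

def normalize_terms(query):
    """Map each query word to its canonical form, leaving unknown words alone."""
    return [NAME_SYNONYMS.get(word, word) for word in query.lower().split()]
```

Unlike the current "contains" query, this only matches forms that have been listed explicitly, so a prefix like "Camer" would no longer find "Cameron".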

@RebelliousUno

Would it not make sense to have some switches on !findquote?
!findquote by (author)
!findquote about (quote body)
!findquote why (context)

The last one feels like it's not the right switch for a quote's context, though.

@andreasots
Collaborator

Oh boy, syntax bikeshedding. But first, a request from the chat:

11:45 Briars_the_fox: is there a way to implement key word AND person as a search option for the quotes?
11:46 Briars_the_fox: so if i wanted to look for alex AND butts
11:46 qrpth: Yes. There is a way to implement quote search so that you can look at Alex's butt.

I wrote this thing a long time ago so it needs some work before it can be added to LRRbot. The syntax:

query ::= disjunction
disjunction ::= conjunction '|' disjunction
conjunction ::= expr conjunction
expr ::= '(' disjunction ')' | atom
atom ::= quoted-string | token | token op (quoted-string | token)
quoted-string ::= '"' (<any character not '"'>)* '"'
token ::= (<any character not a whitespace, '(', ')', '|', '=', '>', '<' or ':'>)+
op ::= ':' | '=' | '>=' | '>' | '<=' | '<' 

This gist seems to be the parser code. Context, game and show tags need to be added and it should generate a SQLAlchemy query and not a SQL string.

Examples:

An alternative to this would be to reuse the !addquote syntax ((NAME) [DATE] QUOTE | CONTEXT). The advantage is that it's somewhat more familiar and simpler to describe. The disadvantages are that queries would be very limited and simple (but maybe that's fine?), that the ordering of components is very strict, and that there's no way to filter by game or show.

@danieljcrabtree
Contributor

@RebelliousUno, did you imagine being able to chain switches together?

e.g. !findquote by (author) about (quote body)

Or did you see them working in the same way as the game and show switches on the !quote command?
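For the chained version, a minimal Python sketch of how the arguments might be split up. The switch-to-column mapping is an assumption based on the column names mentioned earlier in the thread, and `parse_switches` is a hypothetical helper, not existing LRRbot code.

```python
# Assumed mapping from switch word to database column.
SWITCHES = {"by": "attrib_name", "about": "quote", "why": "context"}

def parse_switches(args):
    """Split '!findquote' arguments into {column: search text}.

    Words before the first switch are treated as a plain quote search.
    """
    filters = {}
    column = "quote"
    for word in args.split():
        if word.lower() in SWITCHES:
            # A switch word changes which column the following words go to.
            column = SWITCHES[word.lower()]
        else:
            filters.setdefault(column, []).append(word)
    return {col: " ".join(found) for col, found in filters.items()}
```

One drawback of this shape: the words "by", "about" and "why" can no longer be searched for themselves without some kind of escaping.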

@andreasots, I don't have a lot of experience with context-free grammars so pardon me if I'm just confused. There doesn't seem to be a way to terminate these rules:

disjunction ::= conjunction '|' disjunction
conjunction ::= expr conjunction

@andreasots
Collaborator

> I don't have a lot of experience with context-free grammars so pardon me if I'm just confused. There doesn't seem to be a way to terminate these rules.

You are correct. It should be

disjunction ::= conjunction '|' disjunction | conjunction
conjunction ::= expr conjunction | expr
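With those two fixes the grammar can be parsed by straightforward recursive descent. The following is an illustrative Python sketch written against the grammar as posted here, not the code from the gist; the tuple-based tree shape and the `name` field in the usage note are assumptions made for the example.

```python
import re

# One alternative per terminal in the grammar; quoted strings are tried first.
TOKEN_RE = re.compile(
    r'\s*(?:(?P<string>"[^"]*")|(?P<op>>=|<=|[:=<>])'
    r'|(?P<punct>[()|])|(?P<word>[^\s()|=<>:]+))'
)

def tokenize(text):
    text, pos, out = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        kind, value = m.lastgroup, m.group(m.lastgroup)
        out.append((kind, value[1:-1] if kind == "string" else value))
        pos = m.end()
    return out

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def disjunction(self):
        # disjunction ::= conjunction '|' disjunction | conjunction
        left = self.conjunction()
        if self.peek() == ("punct", "|"):
            self.next()
            return ("or", left, self.disjunction())
        return left

    def conjunction(self):
        # conjunction ::= expr conjunction | expr
        left = self.expr()
        # Another expr can only start with '(', a word or a quoted string.
        if self.peek()[0] in ("word", "string") or self.peek() == ("punct", "("):
            return ("and", left, self.conjunction())
        return left

    def expr(self):
        # expr ::= '(' disjunction ')' | atom
        if self.peek() == ("punct", "("):
            self.next()
            inner = self.disjunction()
            if self.next() != ("punct", ")"):
                raise ValueError("expected ')'")
            return inner
        return self.atom()

    def atom(self):
        # atom ::= quoted-string | token | token op (quoted-string | token)
        kind, value = self.next()
        if kind == "string":
            return ("text", value)
        if kind != "word":
            raise ValueError(f"unexpected {value!r}")
        if self.peek()[0] == "op":
            op = self.next()[1]
            rkind, rvalue = self.next()
            if rkind not in ("word", "string"):
                raise ValueError("expected a value after the operator")
            return ("field", value, op, rvalue)
        return ("text", value)

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.disjunction()
    if parser.peek() != (None, None):
        raise ValueError("trailing input")
    return tree
```

For instance, `parse('name:alex butts')` gives an `and` node joining a field filter and a plain text term, which a later step could turn into a SQLAlchemy query.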

@danieljcrabtree
Contributor

To go back for just a moment: was the original request to search on just the context column or to search on context and other columns? If it’s as simple as searching on just context, then !quote context <query> or !findquote context <query> might suffice.

Combining !quote and !findquote would seem to make sense, especially as there are 4 quote related commands with subtly different syntaxes. If searching on multiple columns is required then Andreas’ syntax and parser look good. But, as Andreas suggests, it’s harder to explain. The need for double-quoted strings strikes me as something that could easily catch people out.

But is there much demand for searching on more than one column? If there isn’t, I’d be inclined not to risk complicating things. If the aim is just to combine !quote and !findquote into a single command and search on context, then how about something like this?

| New command | Current command |
| --- | --- |
| !quote | !quote |
| !quote id <int> | !quote <int> |
| !quote <int> (alias for !quote id <int>) | !quote <int> |
| !quote name <query> | !quote <query> |
| !quote <query> (alias for !quote name <query>) | !quote <query> |
| !quote quote <query> | !findquote <query> |
| !quote context <query> | (none) |
| !quote game <query> | !quote game <query> |
| !quote show <query> | !quote show <query> |

!findquote would be deprecated.

This is similar to what Uno seems to be suggesting and could be updated to use Andreas’ syntax at a later stage.
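A rough Python sketch of how the proposed command could be dispatched, including the two aliases. `parse_quote_command` and its return shape are hypothetical, not existing LRRbot code.

```python
FIELDS = {"id", "name", "quote", "context", "game", "show"}

def parse_quote_command(args):
    """Return (field, query) for the text after '!quote'.

    (None, None) means no arguments, i.e. pick a random quote.
    """
    words = args.split()
    if not words:
        return (None, None)
    head = words[0].lower()
    if head in FIELDS and len(words) > 1:
        # Explicit form: !quote <field> <query>
        return (head, " ".join(words[1:]))
    if len(words) == 1 and head.isdigit():
        # Alias: !quote <int> is treated as !quote id <int>
        return ("id", head)
    # Alias: anything else is treated as !quote name <query>
    return ("name", " ".join(words))
```

Note that with both aliases in place, a bare number is always read as an id, and a name that happens to equal a field word only works in the explicit `!quote name <query>` form.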

@RebelliousUno

The two aliases for !quote id and !quote name could add extra logic that might be a bit of a hassle, in case a name ended up being confused with an id. Personally I'd drop one of the aliases (likely the !quote name alias).


4 participants