Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

General feedback #3

Open
MaxLap opened this issue Apr 10, 2018 · 15 comments
Open

General feedback #3

MaxLap opened this issue Apr 10, 2018 · 15 comments

Comments

@MaxLap
Copy link
Owner

MaxLap commented Apr 10, 2018

If you have any feedback (good or bad) about the gem that you don't feel is worthy of an issue, please post a comment here.

I'd be really interested in why you decided not to use this gem.

@yoelblum
Copy link

Looks like a good approach! Did you try showing this to anyone from the Rails team? I don't think there's a huge chance they will adopt this but it's worth hearing their feedback.

@MaxLap
Copy link
Owner Author

MaxLap commented Apr 27, 2018

@yoelblum No i didn't show it to the Rails team.

In 2015, someone did a PR for similar feature (not as fleshed out in the details) and it got refused just based on somone being against the feature. I don't know if their mood changed related to this.

rails/rails#21438

The guy ended up making a gem with it, and we didn't hear much about it after that sadly.

https://github.com/EugZol/where_exists

@fschwahn
Copy link

I'm doing a lot of filtering logic in my apps, and I'd often run into bugs when combining several scopes which do joins. I've just found this gem yesterday, and it's really helpful.

Is this likely to survive rails upgrades? I'm a bit wary with gems that extend activerecord, because they tend to have a hard time going through major rails upgrades.

@MaxLap
Copy link
Owner Author

MaxLap commented Sep 18, 2018

Glad you find it helpful!

This is essentially the same codebase that deals with ActiveRecord 4.1 all the way to ActiveRecord 5.2. I'm not doing some deep hacks of changing how ActiveRecord behaves. All that the gem does is generate a single where. Everything else is about finding the data in the associations, so I would say very likely to go through major Rails upgrades.

The CI build on Travis even has a job to run against ActiveRecord's master branch, so I can prepare in advance if something starts breaking from the ongoing development.

Thanks for the feedback!

@fschwahn
Copy link

That sounds good!

Do you have any data on how this performs compared to joins? Especially with > 1 million rows? Is it faster? Is it slower? Is it comparable? If you have any data on this, it might make a good addition to the README.

@MaxLap
Copy link
Owner Author

MaxLap commented Sep 18, 2018

SQL-wise, it should normally be either comparable or faster to use EXISTS. I don't personally have data on this, but you can find comparisons online of SQL JOIN vs EXISTS. Here is an example saying around 9% faster for EXISTS: https://danmartensen.svbtle.com/sql-performance-of-join-and-where-exists

However, when dealing with has_one, it is probable that this gem will be slower (can't really say by how much). This is because this gem will correctly filter based on only the "first" record of the association, but figuring out which record is that "first" is more work (Need to apply ordering). The JOIN way just doesn't care and will lead to unexpected results (Unless you expect it to treat as has_one like a has_many).

@MaxLap MaxLap closed this as completed May 10, 2019
@MaxLap MaxLap reopened this May 16, 2019
@pjungwir
Copy link

pjungwir commented Jun 4, 2020

Thanks for this gem! I've been writing things like this for years:

scope :with_foo, -> (foo) {
  where(<<~EOQ)
    EXISTS (SELECT 1
            FROM ...)
  EOQ
}

Having a gem to do it for me would be so much better!

Since you asked for general feedback, here are two things I notice:

Imagine you have 3 tables: users, user_roles, and roles. On User there is has_many :roles, through: :user_roles. (I think thousands of apps have this structure.) Now you want to get all the users with the foo role. This gem lets me say User.where_assoc_exists(:roles, name: 'foo'). That is awesome! It generates SQL like this (whitespace added for clarity):

SELECT "users".*
FROM   "users"
WHERE  (EXISTS (
  SELECT 1
  FROM   "user_roles"
  WHERE  "user_roles"."user_id" = "users"."id"
  AND    (EXISTS (
    SELECT 1
    FROM   "roles"
    WHERE  "roles"."id" = "user_roles"."role_id"
    AND    "roles"."name" = 'foo'))))

The 2 improvements are:

  • 'foo' should be a parameter like $1 instead, just as if I'd said User.where(first_name: 'bar').
  • user_roles and roles should be connected with a join (like when following the association), not with a nested EXISTS. These are equivalent in meaning (when used within the outer EXISTS), and a quick test on a (small, unfortunately) database shows that Postgres even produces the same query plan. But still it seems more compatible with Active Record to follow its approach. For example it might matter if you define custom methods on your association.

If these sound like improvements to you, I'd be willing to write up separate issues for them and look at sending you PRs. What do you think?

@MaxLap
Copy link
Owner Author

MaxLap commented Jun 4, 2020

@pjungwir Hey, thank you for the feedback! Glad you find the gem useful.

Having a gem to do it for me would be so much better!

"Would be" implies that you don't use the gem? Is the missing handling of ? a blocker to using it at all?

For your issue with the placeholders ?. I worry about the complexity of doing such a restructuring, but if you want to give it a try, by all means go for it.

Using JOIN creates more edge cases, since things remain in the same context. Off the top of my head, here are problems this bring:

  • If you have a recursive association, the joined table would then need a different name. And then this makes everything more complex when people pass in string conditions or use scopes, which name to use, etc.
  • You can nest where_assoc_* calls using blocks, those require their own exists. So going with joins would require still doing something similar to what is currently happening. It seems more mentally complex to have 2 ways of doing things, both for developers and users of the gem.

And I don't see any benefit to using JOIN. As you say, the query plan is the same.

@pjungwir
Copy link

pjungwir commented Jun 4, 2020

I'm not using it yet. I just discovered it. I don't think either of the issues are enough to stop me from using it.

Thanks for giving your thoughts about both those suggestions! To me, using a join feels like the more important of the two, although I don't have any good reasons to back that up.

I think at least for Postgres you don't need to add table aliases. For example you can do this (not that it means anything):

SELECT users.*
FROM   users
JOIN   user_roles
ON     user_roles.user_id = users.id
WHERE  user_roles.role_id = 5
AND   EXISTS (
  SELECT 1
  FROM   user_roles
  INNER JOIN roles
  ON     roles.id = user_roles.role_id
  WHERE  user_roles.user_id = users.id
  AND    roles.name = 'foo'))

Personally I think it is more mentally complex to have two ways a through association can behave: usually with a JOIN but here with an EXISTS. But I guess that's subjective.

Anyway, maybe I'll look into using parameters at least.

@MaxLap
Copy link
Owner Author

MaxLap commented Jun 4, 2020

I see what you mean about not needing alias. But that only works if the only "reused" name is the last one. Imagine a weird relation that goes users -> foo -> users -> bar. Your chain of JOIN will include users which was already there, and then you need an alias, or to nest in an EXISTS. And if you now must sometimes do it one way or another way, it doesn't feel like much would be gained.

Also, doing a JOIN could mean returning duplicated results if it is not inside of the EXISTS. Your query won't, because there is a WHERE user_roles.role_id = 5, but as you said, that query doesn't really mean anything.

What do you refer to when you say "usually with a join". When does a :through get used with a JOIN? The only times I can think of is when when you call joins, when you eager_load, and when you do includes.references. Those are all about getting data from the database. (Or, when you want to do conditions on associations in what I consider to be a less desirable way)

When data from another table must be returned, then JOIN is the tool for the job.
When filtering based on another table, the cleanest and safest way is using EXISTS. It's true, however, that in many cases, JOIN will be able to do the job, but its more error prone.

These are different operations, it makes sense that they use different operators.

Here are 3 relations which do the same thing. I like that they generate the same SQL query.

User.where_assoc_exists(:roles, name: 'foo')
User.where_assoc_exists(:user_roles) {|ur| ur.where_assoc_exists(:role, name: 'foo') }
User.where_assoc_exists([:user_roles, :role], name: 'foo')

I like this discussion, if you'd like to continue it, I would prefer if you could make a separate Gitub issue for it, it is out of the scope of general feedback. You can quote this post and reply to it there.

Thank you for your interest!

@ridiculous
Copy link

Is there a way to check if the association was already added? To avoid double-ups. For instance:

user = User.where_assoc_exists(:roles)
# passed to another method that also adds the assoc
user = user.where_assoc_exists(:roles)

Would produce a query that joins on roles twice. I would like to check if it's already added, so I can avoid adding it again.

@MaxLap
Copy link
Owner Author

MaxLap commented Aug 4, 2022

Good question.

While not as clean as normal usage. I guess you could use where_assoc_exists_sql, which returns the SQL for the condition. (where_assoc_exists is just a wrapper for that to add a where(sql))

Having that SQL string, you could then check in the relation's where_values (sorry I don't remember the exact name of that ActiveRecord-internal method) if that SQL is already in there. If not, then you do a where(sql).

Good luck

@jonny5
Copy link

jonny5 commented Nov 22, 2022

Thanks for the gem, I am curious if there is a way to use a block with a nested where_assoc_count and add conditions to both levels of nesting.

Using the example here Post.where_assoc_count(10, :<=, [:comments, :replies])(from:

# Post.where_assoc_count(10, :<=, [:comments, :replies])
)

I seem to be able to get it to work with something like

Post.where_assoc_count(10, :<=, [:comments, :replies], ['comments.created_at > ? AND replies.created_at < ?'])

Can't figure out if there is a way to access both levels of nesting within a block. If I do

Post.where_assoc_count(10, :<=, [:comments, :replies]) do |x|

x always seems to be the deepest scope (so replies). Is there a way to access the comments relation from within that block?

@MaxLap
Copy link
Owner Author

MaxLap commented Nov 22, 2022

Hello @jonny5, sadly, there is no way to do this at the moment. I didn't think of that use-case when developping this gem. For the where_assoc_exists, you can call it again from inside the first block, but for counting, that wouldn't work.

As you found out, SQL's nesting can allow you to it when using strings, so that's a pretty good work-around.

Since where_assoc_count is a very niche use-case and since there is a workaround, I don't plan on adding such a capability for now.

@ridiculous
Copy link

btw, some followup on my question above: I wasn't able to use where_values or the like to check if the query already had the assoc, and I didn't want to parse the generated SQL. So instead I changed how the code was calling it to reduce double-ups.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants