view_sum cache for courses is miscalculated in CourseCacheManager #5911
Comments
Wow, nice find.
Working on this.
After going through this issue and #33082, I found that the only way forward, as mentioned in one of the comments, is:

Most suggested solutions are similar to that comment. Here are a few examples I ran.

For the courses with articles:
WDG - AF 2018 Florence (305 Articles)
Psychology Capstone (5 Articles)

Time taken:
For Psychology Capstone (5 Articles): a) b) c) d)
For WDG - AF 2018 Florence (305 Articles): a) b) c)

After testing this out, I considered using the version with group, as shown below. However, I still have some doubts about its usefulness. @ragesoss @gabina, do you have any thoughts on this?
Hmm... the group workaround seems to be the main one that came out of the discussion in rails#33082, and it seems to be similar performance-wise. What are your doubts about that? One other thing to investigate might be whether those scopes could be rewritten to have the same intended behavior (returning only one result per unique Article record, rather than one per unique ArticlesCourses record) without using
What is happening?
There is a pretty counter-intuitive behavior related to ActiveRecord distinct and sum, which makes the view_sum cache incorrect. See issue #33082 in the rails repo for more details on how this actually works.
The main method of the CourseCacheManager class is update_cache, which updates the different caches.
The view_sum cache is updated through the following private method:
This method is associated with the following SQL query:
SELECT SUM(DISTINCT `articles_courses`.`view_count`) FROM `articles_courses` INNER JOIN `articles` ON `articles`.`id` = `articles_courses`.`article_id` WHERE `articles_courses`.`course_id` = 16 AND `articles_courses`.`tracked` = TRUE AND `articles`.`deleted` = FALSE
This means that if the value of view_count is the same for different articles courses, the repeated values are ignored, generating an incorrect calculation of the total sum of views.
This is caused by a not very intuitive behavior when using distinct and sum in ActiveRecord (note that the tracked and live articles courses scopes use distinct).
To Reproduce
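A minimal plain-Ruby sketch of the effect, not the original reproduction steps. It assumes three hypothetical articles courses rows with made-up view counts, and the helper names are illustrative only. It mimics what SUM(DISTINCT view_count) does compared with a plain SUM(view_count) over the same rows:

```ruby
# Hypothetical rows standing in for tracked, live articles_courses records.
# The ids and view counts are made up for illustration.
ROWS = [
  { id: 1, article_id: 10, view_count: 100 },
  { id: 2, article_id: 11, view_count: 100 }, # same view_count as id 1
  { id: 3, article_id: 12, view_count: 250 }
].freeze

# Mimics SUM(DISTINCT view_count): repeated values collapse before summing.
def sum_distinct_values(rows)
  rows.map { |row| row[:view_count] }.uniq.sum
end

# Mimics SUM(view_count): every row contributes its own value.
def sum_all_rows(rows)
  rows.sum { |row| row[:view_count] }
end

puts sum_distinct_values(ROWS) # 350, one of the two 100s is dropped
puts sum_all_rows(ROWS)        # 450
```

With these numbers the distinct-values sum undercounts by 100 views, which is the kind of discrepancy described above.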
Expected behavior
The view_sum field for Course should hold the sum of view_count for all tracked live articles courses, even when view_count values are not unique.
The SQL query should be:
SELECT DISTINCT `articles_courses`.* FROM `articles_courses` INNER JOIN `articles` ON `articles`.`id` = `articles_courses`.`article_id` WHERE `articles_courses`.`course_id` = 16 AND `articles_courses`.`tracked` = TRUE AND `articles`.`deleted` = FALSE
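As a rough illustration of the arithmetic this query implies, here is a plain-Ruby sketch. It is not the project's code: the rows, the duplicated row, and the helper name are assumptions made for the example. Whole rows are deduplicated first (as SELECT DISTINCT on all columns would do), and only then are the view counts summed, so equal view counts on different records are all counted:

```ruby
# Hypothetical query result; ids and view counts are made up.
# The row with id 3 appears twice to stand in for a duplicated result row.
RESULT_ROWS = [
  { id: 1, view_count: 100 },
  { id: 2, view_count: 100 }, # same view_count as id 1, but a different record
  { id: 3, view_count: 250 },
  { id: 3, view_count: 250 }  # exact duplicate of the previous row
].freeze

# Deduplicate whole rows first, then sum view_count once per remaining row.
def sum_over_distinct_rows(rows)
  rows.uniq.sum { |row| row[:view_count] }
end

puts sum_over_distinct_rows(RESULT_ROWS) # 450
```

Only the exact duplicate row is dropped; records 1 and 2 both contribute their 100 views.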
One option is to use the following definition, but I'm not sure if this could be less performant.
Additional context
It is possible that this same behavior is causing problems in other parts of the code. We should review all the code when we fix this.