Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicitly calculate collection size for Active Record Relations and Arrays #698

Merged
merged 2 commits into from
Sep 21, 2022

Conversation

adrianna-chang-shopify
Copy link
Contributor

Closes: #644

For v2.0, we've decided to start implicitly calculating a task's tick_total using the enumerator's size, rather than requiring users to define a #count method. tick_total is automatically calculated for several types of tasks -- CSV tasks, batch tasks, and batch CSV tasks, notably. CSV tasks and batch CSV tasks have their #count method overridden via a collection builder pattern, and batch tasks relies on the size of the enumerator constructed in TaskJob#build_enumerator to set the tick total.

For Arrays and Active Record Relations, we can also use the enumerator size to set the #tick_total. I modified the code in TaskJobConcern to always fall back to the size of the @collection_enum when setting the tick_total, if calling @task.count returns :no_count (our null collection default, which indicates that count hasn't been overridden).

I've updated the docs to drop references to #count where it's not necessary, and to indicate that this is only really necessary to no-op the operation. Note that this will be a breaking change for some apps, since leaving out the #count method previously meant not attempting to calculate the tick total. Users will need to explicitly add def count; nil; end to tasks that should not attempt to calculate the collection size.

…provided

This is new behaviour for Active Record Relation tasks and Array-based tasks:
count will now be calculated implicitly based on the enumerator if users
don't supply `#count`. However, `#count` can still be manually overriden in a task
if desired.
@adrianna-chang-shopify adrianna-chang-shopify requested a review from a team September 16, 2022 13:45
Copy link
Contributor

@nvasilevski nvasilevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a tiny non-blocking nitpick about the test expectation but the code change itself looks good 👍


TaskJob.perform_now(@run)

assert_equal 2, @run.reload.tick_total
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure yet what my preference is but I wanted to start a quick discussion to see whether we should consider using Post.count as an expectation instead of a hard-coded number.
Generally I prefer explicit and hard-coded expectations as it makes easier to read and make assumptions about the test. But in this case it would be unfortunate for this test to fail just because someone added a new fixture record for an unrelated reason. It's not a big problem to bump this number every time we add a new fixture as our codebase is small and need for a new fixture record is rare but generally for larger codebase such unrelated failures cause unnecessary friction and slow down feature development.

Perhaps the best option for us would be to keep the explicit expectation of 2 ticks, but also make sure that the task always iterates over collection of 2 by either stubbing the collection method or maybe adding an argument to the task that will contain a comma-separated list of ids so we can pass something like post_ids: [post(:one), post(:two)].map(&:id).join(",") from the test to make sure the context in this test never changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Nikita! This is an interesting point of discussion. I also usually prefer hardcoded expectations, although in this case I'm not sure changing 2 => Post.count really violates that 🤔 In general, I feel like when there are implementation-specific values in tests, it becomes a bit of a "smell" because the test implementation is too coupled to the code implementation (so if the code changes, the test will be forced to as well). But in this case, the fact that Post.count is the expected tick_total value is not tied to implementation, but rather an explicit public API -- unless something changes around how we calculate ticks for Active Record Relations, this won't change.

Consequently, I'd be fine to change the assertions to check Post.count as the expected value. I'd prefer not to have to use stubs here when we have the setup already with the fixtures -- IMHO stubs just make the test less clear. I'd also be fine just leaving as-is -- as you mentioned, we have very few tests that assert on this count, so I think hardcoding the count value is a fair tradeoff. For a larger codebase we'd maybe want to reconsider, but I think the tradeoffs should be weighed within the scope of the codebase we're working in 😄

All that to say, I'll leave as-is for this PR, but if you feel strongly and would like to change in a follow-up, feel free to do so!

@adrianna-chang-shopify adrianna-chang-shopify merged commit dc85ada into main Sep 21, 2022
@adrianna-chang-shopify adrianna-chang-shopify deleted the ac-implicit-count branch September 21, 2022 15:00
@shopify-shipit shopify-shipit bot temporarily deployed to rubygems September 26, 2022 13:57 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implicit count implementation
2 participants