Implicitly calculate collection size for Active Record Relations and Arrays #698

adrianna-chang-shopify · 2022-09-16T13:36:03Z

Closes: #644

For v2.0, we've decided to start implicitly calculating a task's tick_total using the enumerator's size, rather than requiring users to define a #count method. tick_total is automatically calculated for several types of tasks -- CSV tasks, batch tasks, and batch CSV tasks, notably. CSV tasks and batch CSV tasks have their #count method overridden via a collection builder pattern, and batch tasks relies on the size of the enumerator constructed in TaskJob#build_enumerator to set the tick total.

For Arrays and Active Record Relations, we can also use the enumerator size to set the #tick_total. I modified the code in TaskJobConcern to always fall back to the size of the @collection_enum when setting the tick_total, if calling @task.count returns :no_count (our null collection default, which indicates that count hasn't been overridden).

I've updated the docs to drop references to #count where it's not necessary, and to indicate that this is only really necessary to no-op the operation. Note that this will be a breaking change for some apps, since leaving out the #count method previously meant not attempting to calculate the tick total. Users will need to explicitly add def count; nil; end to tasks that should not attempt to calculate the collection size.

…provided This is new behaviour for Active Record Relation tasks and Array-based tasks: count will now be calculated implicitly based on the enumerator if users don't supply `#count`. However, `#count` can still be manually overriden in a task if desired.

nvasilevski

I have a tiny non-blocking nitpick about the test expectation but the code change itself looks good 👍

nvasilevski · 2022-09-19T17:01:48Z

test/jobs/maintenance_tasks/task_job_test.rb

+
+      TaskJob.perform_now(@run)
+
+      assert_equal 2, @run.reload.tick_total


I'm not sure yet what my preference is but I wanted to start a quick discussion to see whether we should consider using Post.count as an expectation instead of a hard-coded number.
Generally I prefer explicit and hard-coded expectations as it makes easier to read and make assumptions about the test. But in this case it would be unfortunate for this test to fail just because someone added a new fixture record for an unrelated reason. It's not a big problem to bump this number every time we add a new fixture as our codebase is small and need for a new fixture record is rare but generally for larger codebase such unrelated failures cause unnecessary friction and slow down feature development.

Perhaps the best option for us would be to keep the explicit expectation of 2 ticks, but also make sure that the task always iterates over collection of 2 by either stubbing the collection method or maybe adding an argument to the task that will contain a comma-separated list of ids so we can pass something like post_ids: [post(:one), post(:two)].map(&:id).join(",") from the test to make sure the context in this test never changes

Hey Nikita! This is an interesting point of discussion. I also usually prefer hardcoded expectations, although in this case I'm not sure changing 2 => Post.count really violates that 🤔 In general, I feel like when there are implementation-specific values in tests, it becomes a bit of a "smell" because the test implementation is too coupled to the code implementation (so if the code changes, the test will be forced to as well). But in this case, the fact that Post.count is the expected tick_total value is not tied to implementation, but rather an explicit public API -- unless something changes around how we calculate ticks for Active Record Relations, this won't change.

Consequently, I'd be fine to change the assertions to check Post.count as the expected value. I'd prefer not to have to use stubs here when we have the setup already with the fixtures -- IMHO stubs just make the test less clear. I'd also be fine just leaving as-is -- as you mentioned, we have very few tests that assert on this count, so I think hardcoding the count value is a fair tradeoff. For a larger codebase we'd maybe want to reconsider, but I think the tradeoffs should be weighed within the scope of the codebase we're working in 😄

All that to say, I'll leave as-is for this PR, but if you feel strongly and would like to change in a follow-up, feel free to do so!

adrianna-chang-shopify added 2 commits September 15, 2022 17:13

Clarify implicit count in README and remove #count from test tasks

fed6c55

adrianna-chang-shopify requested a review from a team September 16, 2022 13:45

nvasilevski approved these changes Sep 19, 2022

View reviewed changes

adrianna-chang-shopify merged commit dc85ada into main Sep 21, 2022

adrianna-chang-shopify deleted the ac-implicit-count branch September 21, 2022 15:00

shopify-shipit bot temporarily deployed to rubygems September 26, 2022 13:57 Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implicitly calculate collection size for Active Record Relations and Arrays #698

Implicitly calculate collection size for Active Record Relations and Arrays #698

adrianna-chang-shopify commented Sep 16, 2022

nvasilevski left a comment

nvasilevski Sep 19, 2022

adrianna-chang-shopify Sep 21, 2022


		TaskJob.perform_now(@run)

		assert_equal 2, @run.reload.tick_total

Implicitly calculate collection size for Active Record Relations and Arrays #698

Implicitly calculate collection size for Active Record Relations and Arrays #698

Conversation

adrianna-chang-shopify commented Sep 16, 2022

nvasilevski left a comment

Choose a reason for hiding this comment

nvasilevski Sep 19, 2022

Choose a reason for hiding this comment

adrianna-chang-shopify Sep 21, 2022

Choose a reason for hiding this comment