-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keep archived out of VimPerformanceState #22585
Conversation
state = VimPerformanceState.capture(host, capture_time) | ||
|
||
expect(state.timestamp).to be_within(1.minute).of(Time.now.beginning_of_hour) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very weird condition (I fully understand that this is the way it works currently) and a future reader will not understand this. I think it would be helpful to have a NOTE
comment or something similar describing why now is used for Infra and thus capture_time is ignored. Not sure if that note should live here or in the code or both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
luckily we no longer need to deal with the nuance of infra vs containers
app/models/vim_performance_state.rb
Outdated
|
||
# Soft deletes can capture in the past (keep the timestamp), otherwise capture current state (timestamp = now) | ||
ts = Time.now if ts.nil? || !objs.first.class.respond_to?(:active_for) | ||
ts = ts.utc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, existing rollup_spec.rb
fails with
NoMethodError:
undefined method `utc' for "2010-04-14T21:00:00Z":String
# ./app/models/vim_performance_state.rb:64:in `capture'
# ./app/models/metric/ci_mixin/state_finders.rb:34:in `vim_performance_state_for_ts'
# ./app/models/metric/ci_mixin/state_finders.rb:82:in `containers_from_vim_performance_state_for_ts'
# ./app/models/metric/rollup.rb:271:in `rollup_child_metrics'
# ./app/models/metric/rollup.rb:211:in `block in rollup_hourly'
# ./app/models/metric/rollup.rb:211:in `each'
# ./app/models/metric/rollup.rb:211:in `rollup_hourly'
# ./app/models/metric/ci_mixin/rollup.rb:96:in `block (2 levels) in perf_rollup'
# /Users/joerafaniello/.gem/ruby/3.0.6/gems/more_core_extensions-4.4.0/lib/more_core_extensions/core_ext/benchmark/realtime_store.rb:20:in `realtime_store'
# /Users/joerafaniello/.gem/ruby/3.0.6/gems/more_core_extensions-4.4.0/lib/more_core_extensions/core_ext/benchmark/realtime_store.rb:56:in `realtime_block'
# ./app/models/metric/ci_mixin/rollup.rb:95:in `block in perf_rollup'
# /Users/joerafaniello/.gem/ruby/3.0.6/gems/more_core_extensions-4.4.0/lib/more_core_extensions/core_ext/benchmark/realtime_store.rb:20:in `realtime_store'
# /Users/joerafaniello/.gem/ruby/3.0.6/gems/more_core_extensions-4.4.0/lib/more_core_extensions/core_ext/benchmark/realtime_store.rb:62:in `realtime_block'
# ./app/models/metric/ci_mixin/rollup.rb:82:in `perf_rollup'
# ./spec/models/metric/ci_mixin/rollup_spec.rb:439:in `block (4 levels) in <main>'
# /Users/joerafaniello/.gem/ruby/3.0.6/gems/webmock-3.18.1/lib/webmock/rspec.rb:37:in `block (2 levels) in <main>'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's doing container_image.perf_rollup("2010-04-14T21:00:00Z", 'hourly')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea. this is a pisser
think it is a legit issue and not just a spec issue.
looking into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and fixed a caller and added code to protect this.
This code has been moved into 22589 (and this is built upon that)
FactoryBot.create(:container, :container_image => container_image, :deleted_on => (capture_time - 2.hours)) | ||
|
||
state = VimPerformanceState.capture(container_image, capture_time) | ||
expect(state.get_assoc(:containers).map(&:to_i).sort).to eq(containers.map(&:id).sort) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expect(state.get_assoc(:containers).map(&:to_i).sort).to eq(containers.map(&:id).sort) | |
expect(state.get_assoc(:containers)).to match_array(containers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is comparing to_i
to id
, so I did need to keep part of that in there.
But I did use match_array
and removed the sort
calls.
# not deleted is captured | ||
containers = FactoryBot.create_list(:container, 1, :container_image => container_image) | ||
|
||
state_data = {:assoc_ids => {:containers => {:on => (containers.map(&:id) + [22, 23, 24, 25, 26, 27])}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the random numbers for? If it's to get "other" ids, prefer building the id list from existing records, because a) this doesn't handle multi-region properly where the ids aren't in the 0 ragion and b) there's a chance the existing records could fall into the hardcoded range and fail the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put in a comment.
these represent ids that are no longer valid.
If container.id is within that range, then one of those ids will work, but the others will be thrown away. so there will be no false positives in here.
I could do containers.last.id
and build from there? - but that seemed like a bunch of code that will create invalid ids just as easily as this does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
containers.last.id
sounds reasonable - we do this in other specs as well. We've hit problems in other places where we fake ids for records that either don't exist or shouldn't exist and hardcoded ids have always eventually caused problems.
c0d7ee2
to
d1fba45
Compare
end | ||
end | ||
|
||
describe ".active_for" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a few more cases (looking at my drawing that I shared the other day).
(created/deleted wording is relative to the time range in question (i.e. during means during the time range); blank in deleted means not deleted (i.e. still active))
case | created | deleted | in range |
---|---|---|---|
a | before | before | ❌ |
b | before | during | ✅ |
c | before | after | ✅ |
d | before | ✅ | |
e | during | during | ✅ |
f | during | after | ✅ |
g | during | ✅ | |
h | after | after | ❌ |
i | after | ❌ |
I think we might also need to handle cases where created is nil, which is cases where we're not sure when it was created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem: it is very tricky to define created
across all container models.
@Fryguy Do you want to define different active_on
scopes for each record?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need created for the first pass as discussed - I think h and i were hard for that reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d1fba45
to
a858f23
Compare
update:
|
6d17fb1
to
5688e61
Compare
update:
update:
|
5688e61
to
51e5b8d
Compare
update:
|
9f373b5
to
4e24700
Compare
4e24700
to
8d727e6
Compare
VimPerformanceState records only capture active objects and those deleted after this hour's time period This cuts down on the number of records processed. (by a lot) Numbers ======= ``` VimPerformanceState.first ; ExtManagementSystem.first ; ContainerImage.first ; Container.first ; nil bookend(gc:true) { Metric::Rollup.rollup_child_metrics(ContainerImage.find(627), Time.now.utc.iso8601, "historical", :containers) } ``` Numbers run for rollup_child_metrics for a parent object that hasn't been created yet. If the parent object has been run, then we don't need a capture, and we'll see no difference | ms | bytes | objects |query | qry ms | rows |` comments` | ---:| ---:| ---:| ---:| ---:| ---:| --- | 2,131.6 | 77,558,748* | 1,070,108 | 13 | 1,760.5 | 13,680 |`rollup-create-vps-before` | 1,632.5 | 5,139,458* | 393,268 | 13 | 1,517.7 | 174 |`rollup-create-vps-after` * Memory usage does not reflect 175,949 freed objects. * Memory usage does not reflect 59,056 freed objects. We've already merged code to reduce the number of records returned from VimPerformanceState#containers So removing archived from the VimPerformanceState will reduce the size of the query sent to fetch containers. This will reduce the query ms (simpler/smaller query), but this does not reduce the number of records returned And VPS#state_data went from 167,234 => 1,752 characters The savings there is due to the fact that capture does not download all the archived records. So this commit saves all the id downloads, and the commit from the fetch pr (just merged a few commits back), reduces all archived records fetch as part of the capture process
8d727e6
to
3ddd7d6
Compare
update:
UN-WIP: this has reduced quite a bit in complexity |
Checked commits kbrock/manageiq@65d0a84~...3ddd7d6 with ruby 2.6.10, rubocop 1.28.2, haml-lint 0.35.0, and yamllint |
return false unless resource.respond_to?(method) | ||
|
||
records = resource.send(method) | ||
records = records.not_archived_before(timestamp) if records.try(:klass).respond_to?(:not_archived_before) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only actual changed line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have the
try
here?
-- @Fryguy
If the association is a virtual attribute, like it is for MiqRegion
, then it will return an array that does not have a klass
.
else | ||
assoc | ||
end | ||
return false unless resource.respond_to?(method) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was a minor change to return something so the parent knows we need to next
(codeclimate this code to be its own method)
Backported to
|
Keep archived out of VimPerformanceState (cherry picked from commit 900fb1b)
reference:
requires:
extracted:
Overview
We were putting too many archived container ids in the
VimPerformanceState
records.This was filling up the database, causing us to process too many records, slowing down metrics capture, and slowing down metrics rollups.
Before
After
Numbers
comments
rollup-create-vps-before
rollup-create-vps-after
Outstanding