Report physical disks through the inventory system #5145
Conversation
(Branch force-pushed from 1ffd4d1 to b962606.)
LGTM; just two tiny nits picking at memory usage that probably don't matter. Feel free to ignore either or both.
schema/crdb/38.0.0/up01.sql (outdated)
@@ -0,0 +1,13 @@
CREATE TABLE IF NOT EXISTS omicron.public.inv_physical_disk (
Can/should we include the slot number here? (if so, can/should we use that for pagination/uniqueness?)
Sure, we'll probably want that out of inventory anyway. Also, I agree, this seems like a better PK ("sled id" + "slot") than the reported values of serial/model/vendor.
Updated to plumb the slot through from the sled agent, and it's now used as the pagination key in this PR.
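For illustration, here's a tiny sketch of why ("sled id", "slot") works as a keyset pagination key; the in-memory types and the `next_page` helper are hypothetical stand-ins, not the real datastore query:

```rust
// Hypothetical stand-in types; the real pagination goes through the datastore.
#[derive(Clone, Debug)]
struct DiskRow {
    sled_id: u32,
    slot: i64,
    serial: String,
}

// Return up to `limit` rows whose (sled_id, slot) key sorts strictly after
// `after`, ordered by that key. Each page resumes where the last one ended.
fn next_page(
    rows: &[DiskRow],
    after: Option<(u32, i64)>,
    limit: usize,
) -> Vec<DiskRow> {
    let mut sorted: Vec<DiskRow> = rows.to_vec();
    sorted.sort_by_key(|r| (r.sled_id, r.slot));
    sorted
        .into_iter()
        .filter(|r| after.map_or(true, |key| (r.sled_id, r.slot) > key))
        .take(limit)
        .collect()
}

fn main() {
    let rows = vec![
        DiskRow { sled_id: 1, slot: 0, serial: "s0".into() },
        DiskRow { sled_id: 1, slot: 1, serial: "s1".into() },
        DiskRow { sled_id: 2, slot: 0, serial: "s2".into() },
    ];
    // Fetch the first page, then resume after the last key seen.
    let page1 = next_page(&rows, None, 2);
    let cursor = page1.last().map(|r| (r.sled_id, r.slot));
    println!("{:?}", next_page(&rows, cursor, 2));
}
```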
@@ -750,6 +752,17 @@ impl SledAgent {
    self.config.hardware.reservoir_ram,
)
.context("reservoir_size")?,
disks: self
    .storage
    .lock()
Are there any situations where this lock might be held for a long time (thinking tens of seconds or minutes)?
I don't believe so, especially not since this is the simulated sled agent. At a cursory glance, I think it's mostly just getting locked to populate / grab contents from a group of hashmaps.
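As a rough sketch of that pattern, with stand-in types rather than the actual simulated sled agent code, the lock only needs to be held long enough to clone the records out of the guarded map:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Stand-in for the disk metadata tracked by the simulated storage.
#[derive(Clone, Debug)]
struct DiskInfo {
    slot: i64,
    serial: String,
}

struct Storage {
    physical_disks: Mutex<HashMap<String, DiskInfo>>,
}

impl Storage {
    // The guard is dropped as soon as the clone/collect finishes, so the
    // lock is never held while the inventory response is assembled.
    fn disk_snapshot(&self) -> Vec<DiskInfo> {
        self.physical_disks.lock().unwrap().values().cloned().collect()
    }
}

fn main() {
    let storage = Storage { physical_disks: Mutex::new(HashMap::new()) };
    storage.physical_disks.lock().unwrap().insert(
        "vdev-0".to_string(),
        DiskInfo { slot: 0, serial: "fake-serial".to_string() },
    );
    println!("{:?}", storage.disk_snapshot());
}
```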
use db::schema::inv_physical_disk::dsl;

let mut iter =
    disks.chunks(SQL_BATCH_SIZE.get().try_into().unwrap());
In general I like the idea of inserting in batches. But this is all going into one transaction so I'm not sure it's any better in that case.
The batch size is still fairly large (1,000), so it's unlikely we'll be looping much regardless on a single rack or two. But I could see this blowing past the size of a "single SQL statement" if we consider it for larger multi-rack cases?
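To spell out the batching idea with stand-in types (and a placeholder for the real per-batch insert), chunking bounds the size of any single INSERT statement even though every chunk still runs inside the same transaction:

```rust
// Placeholder for "insert one batch inside the already-open transaction".
fn insert_batch(batch: &[(u32, i64)]) {
    println!("inserting {} (sled_id, slot) rows", batch.len());
}

fn main() {
    // Pretend these are the physical-disk rows gathered from all sled agents.
    let rows: Vec<(u32, i64)> =
        (0..2500u32).map(|i| (i / 10, (i % 10) as i64)).collect();

    // Mirrors the SQL_BATCH_SIZE constant discussed above.
    const SQL_BATCH_SIZE: usize = 1000;

    // Each chunk becomes one statement: three statements here, one transaction.
    for batch in rows.chunks(SQL_BATCH_SIZE) {
        insert_batch(batch);
    }
}
```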
Thank you @jgallagher and @davepacheco for the feedback! Apologies for the delay in responding; I've been building off this PR in #5172 and trying to make sure it'll work -- seems like yes. I believe I've responded to all feedback. I know y'all gave me the +1 to merge, but I'll wait another ~half day in case there are any other comments.
        InvPhysicalDisk::new(collection_id, *sled_id, disk.clone())
    })
})
.collect();
Tiny nit: I think you can drop this `collect()` (and the corresponding `: Vec<_>` type hint), make this `mut`, and then remove the `disks.into_iter()` call on line 659.
This causes the following compilation error:
error: implementation of `FnOnce` is not general enough
--> nexus/src/app/background/inventory_collection.rs:58:9
|
58 | / async {
59 | | match inventory_activate(
60 | | opctx,
61 | | &self.datastore,
... |
88 | | }
89 | | .boxed()
| |________________^ implementation of `FnOnce` is not general enough
|
= note: closure with signature `fn(&'0 nexus_types::inventory::PhysicalDisk) -> InvPhysicalDisk` must implement `FnOnce<(&'1 nexus_types::inventory::PhysicalDisk,)>`, for any two lifetimes `'0` and `'1`...
= note: ...but it actually implements `FnOnce<(&nexus_types::inventory::PhysicalDisk,)>`
I believe `disks.into_iter()` is being used within a transaction closure, which is where the `FnOnce` comes from?

I think I'm going to keep this pattern of "collect it into a Vec, pass that Vec to the `FnOnce`" for the moment, because the lifetime complexity doesn't seem worth trying to optimize right now.
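In other words, something like this minimal sketch, where `run_in_transaction` is a stand-in for the real transaction helper rather than the actual datastore API: the rows are materialized into an owned Vec first and that Vec is moved into the FnOnce, so no borrowing iterator has to cross the closure boundary:

```rust
// Stand-in for the real "run this closure inside a transaction" helper.
fn run_in_transaction<T>(f: impl FnOnce() -> T) -> T {
    f()
}

fn main() {
    // (sled id, disk names) pairs standing in for the collected inventory.
    let sleds = vec![(1u32, vec!["disk-a", "disk-b"]), (2, vec!["disk-c"])];

    // Eagerly materialize the rows outside the transaction closure...
    let rows: Vec<(u32, String)> = sleds
        .iter()
        .flat_map(|(sled_id, disks)| {
            disks.iter().map(move |d| (*sled_id, d.to_string()))
        })
        .collect();

    // ...then move the owned Vec into the FnOnce, avoiding the higher-ranked
    // lifetime trouble that a borrowing iterator can trigger.
    let inserted = run_in_transaction(move || rows.len());
    println!("would insert {inserted} rows");
}
```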
Grrrr, this is very annoying. I checked that diff against `nexus-db-queries` and it was fine, but didn't check whether it caused compilation errors down the line. That feels like a bug: if my function signature hasn't changed, why can implementation changes cause client breakage?
Thanks to @cbiffle, who found we're hitting rust-lang/rust#99492 here; this ought to work. I found a couple of weird-looking workarounds using hints from that issue:
@@ -127,15 +127,10 @@ impl DataStore {
})
.collect::<Result<Vec<_>, Error>>()?;
// Pull disks out of all sled agents
- let disks: Vec<_> = collection
- .sled_agents
- .iter()
- .flat_map(|(sled_id, sled_agent)| {
- sled_agent.disks.iter().map(|disk| {
- InvPhysicalDisk::new(collection_id, *sled_id, disk.clone())
- })
- })
- .collect();
+ let mut disks =
+ collection.sled_agents.iter().flat_map(|(sled_id, sled_agent)| {
+ std::iter::repeat(sled_id).zip(sled_agent.disks.iter())
+ });
// Partition the sled agents into those with an associated baseboard id
// and those without one. We handle these pretty differently.
@@ -656,10 +651,18 @@ impl DataStore {
use db::schema::inv_physical_disk::dsl;
let batch_size = SQL_BATCH_SIZE.get().try_into().unwrap();
- let mut disks = disks.into_iter();
loop {
- let some_disks =
- disks.by_ref().take(batch_size).collect::<Vec<_>>();
+ let some_disks = disks
+ .by_ref()
+ .take(batch_size)
+ .map(|(&sled_id, disk)| {
+ InvPhysicalDisk::new(
+ collection_id,
+ sled_id,
+ disk.clone(),
+ )
+ })
+ .collect::<Vec<_>>();
if some_disks.is_empty() {
break;
}
Or, if we don't like `collection_id` floating around, it can go in the `repeat` too:
@@ -127,15 +127,11 @@ impl DataStore {
})
.collect::<Result<Vec<_>, Error>>()?;
// Pull disks out of all sled agents
- let disks: Vec<_> = collection
- .sled_agents
- .iter()
- .flat_map(|(sled_id, sled_agent)| {
- sled_agent.disks.iter().map(|disk| {
- InvPhysicalDisk::new(collection_id, *sled_id, disk.clone())
- })
- })
- .collect();
+ let mut disks =
+ collection.sled_agents.iter().flat_map(|(sled_id, sled_agent)| {
+ std::iter::repeat((collection_id, *sled_id))
+ .zip(sled_agent.disks.iter())
+ });
// Partition the sled agents into those with an associated baseboard id
// and those without one. We handle these pretty differently.
@@ -656,10 +652,18 @@ impl DataStore {
use db::schema::inv_physical_disk::dsl;
let batch_size = SQL_BATCH_SIZE.get().try_into().unwrap();
- let mut disks = disks.into_iter();
loop {
- let some_disks =
- disks.by_ref().take(batch_size).collect::<Vec<_>>();
+ let some_disks = disks
+ .by_ref()
+ .take(batch_size)
+ .map(|((collection_id, sled_id), disk)| {
+ InvPhysicalDisk::new(
+ collection_id,
+ sled_id,
+ disk.clone(),
+ )
+ })
+ .collect::<Vec<_>>();
if some_disks.is_empty() {
break;
}
I don't know that either of these is worth it just to save the intermediate `.collect()`, but at least we (kind of) understand it.
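For anyone following along, here's a self-contained toy version of that lazy variant, using plain types instead of the real inventory structures: `flat_map` with `repeat(..).zip(..)` builds the (sled_id, disk) pairs lazily, and `by_ref().take(n)` drains them in batches without an intermediate Vec:

```rust
use std::collections::BTreeMap;

fn main() {
    // Stand-in for collection.sled_agents: sled id -> disks on that sled.
    let mut sled_agents: BTreeMap<u32, Vec<&str>> = BTreeMap::new();
    sled_agents.insert(1, vec!["disk-a", "disk-b", "disk-c"]);
    sled_agents.insert(2, vec!["disk-d"]);

    // Lazily yield (sled_id, disk) pairs; nothing is collected yet.
    let mut disks = sled_agents.iter().flat_map(|(sled_id, sled_disks)| {
        std::iter::repeat(sled_id).zip(sled_disks.iter())
    });

    // Drain the iterator a batch at a time, as the loop in the diff does.
    let batch_size = 2;
    loop {
        let batch: Vec<_> = disks
            .by_ref()
            .take(batch_size)
            .map(|(&sled_id, &disk)| (sled_id, disk.to_string()))
            .collect();
        if batch.is_empty() {
            break;
        }
        println!("batch: {batch:?}");
    }
}
```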
Yeah I ran into this yesterday. It's a pretty bad bug in rustc.
schema/rss-service-plan-v3.json (outdated)
@@ -0,0 +1,828 @@
{
Is this file used? I recently ran into something where I considered updating the RSS on-disk plan format, and found a way to not do that to avoid the hassle.
It is used, and is stored, but it's not used particularly frequently.
- We always try to "load a plan if one already exists" before creating a new one (omicron/sled-agent/src/rack_setup/service.rs, lines 959 to 971 in 145f25c):

      let service_plan = if let Some(plan) =
          ServicePlan::load(&self.log, storage_manager).await?
      {
          plan
      } else {
          ServicePlan::create(
              &self.log,
              &config,
              &storage_manager,
              &plan.sleds,
          )
          .await?
      };

- This tries to load the plan with the latest version (omicron/sled-agent/src/rack_setup/plan/service.rs, lines 137 to 172 in 145f25c):

      if let Some(ledger) = ledger {
          info!(log, "RSS plan already created, loading from file");
          Ok(Some(ledger.data().clone()))
      } else if Self::has_v1(storage_manager).await.map_err(|err| {
          PlanError::Io {
              message: String::from("looking for v1 RSS plan"),
              err,
          }
      })? {
          // If we found no current-version service plan, but we _do_ find
          // a v1 plan present, bail out. We do not expect to ever see this
          // in practice because that would indicate that:
          //
          // - We ran RSS previously on this same system using an older
          //   version of the software that generates v1 service plans and it
          //   got far enough through RSS to have written the v1 service plan.
          // - That means it must have finished initializing all sled agents,
          //   including itself, causing it to record a
          //   `StartSledAgentRequest`s in its ledger -- while still running
          //   the older RSS.
          // - But we're currently running software that knows about v2
          //   service plans. Thus, this process started some time after that
          //   ledger was written.
          // - But the bootstrap agent refuses to execute RSS if it has a
          //   local `StartSledAgentRequest` ledgered. So we shouldn't get
          //   here if all of the above happened.
          //
          // This sounds like a complicated set of assumptions. If we got
          // this wrong, we'll fail spuriously here and we'll have to figure
          // out what happened. But the alternative is doing extra work to
          // support a condition that we do not believe can ever happen in any
          // system.
          Err(PlanError::FoundV1)
      } else {
          Ok(None)
      }
In the limit: I think we may get more mileage out of "wiping everything and forcing RSS to restart" in the case where we restart, which would make these intermediate files removable.
But those bits of code are still referencing the "rss-service-plan-v2.json" path, right? Should we bump that to v3? (I don't see anywhere where the schema json is checked, either? Sorry if I'm just missing it.)
Ugh, thank you, good catch. I'm using this in #5172, but not here yet. I'll pull this into the appropriate PR.
The purpose of that file is to exercise the CLI with data saved from a real system. That seemed useful because part of the goal was to be able to import data from real systems. But it could be more trouble than it's worth while we're still evolving the blueprint format. Also, it doesn't have to be any particular system. The test will dynamically pick an inventory collection and blueprints that it finds.
I logged onto a machine set up by a4x2, configured my ssh keys to be able to copy files into and out of that machine, used zlogin to access the switch zone, and ran

I can spin up this environment and take another lap. To clarify, do I need to do
Okay, I ran:
inv_physical_disk table

Fixes #5150