Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Avoid some tracebacks during populate #785

Closed
wants to merge 5 commits into from

Conversation

dwlehman
Copy link
Contributor

@dwlehman dwlehman commented Jul 8, 2019

Having fixed big issues with active LVM on partial VGs and with unusable dmraid arrays, the most-reported failure during blivet populate involves partitioned disks that have been cloned by users without adjusting the disklabel UUIDs to make them unique. In the past I have generally favored a "you must fix this problem before proceeding" approach, but now I'm thinking more about being robust and not crashing unless it's really necessary.

I'm mostly curious what people think of "Allow duplicate UUIDs...", "Unset partition UUIDs...", and "General protection against...". The first two of these are tested, both via unit tests and a "real" test using examples/list_devices.py in a vm. The third ("General protection...") probably needs more work.

Copy link
Member

@vojtechtrefny vojtechtrefny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks like a good improvement to me. Definitely needs some testing probably also from Anaconda.

result = six.next((d for d in devices if d.uuid == uuid or d.format.uuid == uuid), None)
matches = iter(d for d in devices if d.uuid == uuid or d.format.uuid == uuid)
result = six.next(matches, None)
extra = six.next(matches, None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a simple list comprehension and checking if len(matches) == 1 would be better here. It would also allow printing more than two duplicate devices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Done.

@@ -160,6 +160,12 @@ def _add_device(self, newdev, new=True):
dev = self.get_device_by_uuid(newdev.uuid, incomplete=True, hidden=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Git comment in the commit message.

# pretend it has no UUID. The only negative effect is on event handling,
# which is disabled by default.
newdev.uuid = None
dev.uuid = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of "removing" the UUIDs. We already have two ways how to make the reset "stable" in this situation -- special exception that can be caught and used to ignore one of the disks and the flag. And now we are adding a special case for partitions that just "silently" removes the UUID?. I think the flag should be enough and Anaconda should set it by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that seems like a sketchy thing to do. I dropped that commit.

@dwlehman dwlehman force-pushed the robust-populate branch 3 times, most recently from 473229f to 6583721 Compare July 10, 2019 16:39
There is a new flag to control this behavior: allow_inconsistent_config.
When False, DuplicateUUIDError will be raised when trying to add any
subsequent device whose UUID matches one already in the tree. When True,
the duplicate UUIDs will be allowed to coexist until/unless a call to
DeviceTree.get_device_by_uuid is made with the duplicate UUID as its
argument. In this case, DuplicateUUIDError will be raised.
The goal here is to absolutely minimize tracebacks that occur when
populating the devicetree. There are mechanisms in place that will limit
the supported functionality for managing such devices, but there will be
no tracebacks until/unless the user tries to manage them in ways that
are precluded by the detection issues previously encountered.
Copy link
Contributor

@vpodzime vpodzime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

device = helper_class(self, info).run()
try:
device = helper_class(self, info).run()
except DeviceTreeError as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What could be really important here is that every run() method should make sure to gather as much information as possible before throwing the exception. Otherwise the information about the problematic device may be very incomplete.

@vojtechtrefny
Copy link
Member

Obsoleted by #1306

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants