ohpc node provisioning fails due to repo GPG key errors #77
Comments
I verified that the repo keys are all correct.
According to the OpenHPC guides, the first step is to install the ohpc-release RPM, which sets up the repo and keys.
The EPEL key from the Fedora project is also accurate.
I tried reverting to OpenHPC 1.3.6 for this task by replacing the update repo entry with the content of the 1.3.6 release repo file (http://build.openhpc.community/OpenHPC:/1.3:/Update6/CentOS_7/OpenHPC:1.3:Update6.repo). This still failed with the same error.
The task that is failing is found here. Converting that task into a yum command and executing it on the command line produces no errors.
This doesn't provide an easy workaround, because an earlier task removes the existing compute node image, which means the same error repeats each time this role is executed via ansible.
I turned on logging for ansible and noticed a difference between the output when the above command is run by hand and when it is run within ansible. Here's the transaction summary from the command-line run:
Here's the transaction summary from the run of the task within ansible:
In particular, notice the difference in the resolution of the delta packages. When run from the command line, there is a warning about a failed delta, but the install succeeds. The total download size is also 5.1MB, supposedly down from 402MB due to deltas. The ansible-based execution doesn't see the same download footprint reduction, nor does it appear to recover from the delta download failure.
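For anyone repeating this comparison, a line-by-line diff of the two captured outputs makes the differing lines easier to spot than reading both by eye. This is only a minimal sketch using Python's standard `difflib`; the helper name `diff_summaries` and the placeholder strings are assumptions for illustration, not the actual yum output from either run.

```python
import difflib


def diff_summaries(cli_text, ansible_text):
    """Return unified-diff lines between two captured yum outputs."""
    return list(difflib.unified_diff(
        cli_text.splitlines(),
        ansible_text.splitlines(),
        fromfile="command-line",
        tofile="ansible",
        lineterm="",
    ))


if __name__ == "__main__":
    # Placeholder text only; paste the real captured output here.
    cli = "line common to both\nline only in the command-line run"
    ans = "line common to both\nline only in the ansible run"
    for line in diff_summaries(cli, ans):
        print(line)
```

Lines prefixed with `-` appear only in the command-line capture and lines prefixed with `+` only in the ansible capture.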
After much debugging of the ansible yum command, the difference between successful installs and failures was narrowed down to the centos/7 vagrant box versions. The systems that were running prior to 1901.1 were failing. My existing vagrant box versions were:
I was able to resolve this error by doing a
To provide some background on the debugging of this failed task, the perplexing thing is that the task fails but when a similar yum command is run from the ohpc command line after experiencing the failed task, the yum command installs all packages without error. This suggests there really is no problem with repos and that the ansible command should succeed. The equivalent yum command that succeeds without error is:
The ansible yum module builds a yum command line and calls that command to execute the task. Ansible's yum module selects from yum3, yum4, or dnf, but defaults to automatically picking the one on the system. This suggests there is some subtle difference between the two commands or the environments they are executed in. Both ansible and yum rely on the system python 2.7 install, and it doesn't appear that yum4 (aka dnf) is called. Further investigation is warranted, but the workaround of using the latest centos/7 box avoids requiring any edits to the role.
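One way to chase the "difference in environment" theory would be to capture the output of `env` once from the interactive shell and once from within an ansible task, then compare the two. The sketch below assumes both captures are plain `KEY=VALUE` text; the helper names `parse_env` and `diff_env` are hypothetical and nothing here comes from the role itself.

```python
def parse_env(text):
    """Parse `env`-style KEY=VALUE lines into a dict, skipping other lines."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            env[key] = value
    return env


def diff_env(shell_env, ansible_env):
    """Map each differing variable to its (shell, ansible) pair of values.

    A value of None means the variable is absent on that side.
    """
    keys = set(shell_env) | set(ansible_env)
    return {
        key: (shell_env.get(key), ansible_env.get(key))
        for key in sorted(keys)
        if shell_env.get(key) != ansible_env.get(key)
    }


if __name__ == "__main__":
    # Placeholder captures; replace with the real `env` output from each side.
    shell = parse_env("LANG=en_US.UTF-8\nHOME=/root")
    ansible = parse_env("LANG=C\nHOME=/root")
    for name, (in_shell, in_ansible) in diff_env(shell, ansible).items():
        print(name, in_shell, in_ansible)
```

This only compares environment variables. It would not surface differences in how the module invokes yum internally.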
For future debugging reference, logging the failure via the -vvv args shows this as the log for this task in the output log. Note the error right at the end about the file not found. I'm not sure whether this is specifically the failure, but it's the last part of the msg field. Not included is the result output, which just contains the dump of the yum output, of which the portion above is a part.
After the upstream release of OpenHPC 1.3.7, the task
TASK [compute_build_vnfs : yum install into the image chroot]
is failing. There is a lot of output, but the final error message is:
This seems related to not trusting the GPG keys for these repos. Given they are file-based keys, it may simply mean they are out of date with the upstream repos.
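Since the keys are file-based, a quick sanity check is to list which `gpgkey` entries in a `.repo` file point at local files and whether those files exist inside the image. The following is a rough sketch that relies on `.repo` files being INI-style and parseable with Python's `configparser`; the function names and the sample repo text are made up for illustration. It checks existence only and says nothing about whether a key is expired or matches the upstream repo.

```python
import configparser
import os


def file_based_keys(repo_text):
    """Return (repo_id, path) pairs for every file:// gpgkey entry."""
    # Interpolation is disabled so values are returned exactly as written.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    found = []
    for section in parser.sections():
        # A gpgkey value may list several URLs separated by whitespace.
        for entry in parser.get(section, "gpgkey", fallback="").split():
            if entry.startswith("file://"):
                found.append((section, entry[len("file://"):]))
    return found


def missing_keys(repo_text):
    """Return the file-based key entries whose path does not exist."""
    return [(repo, path) for repo, path in file_based_keys(repo_text)
            if not os.path.exists(path)]


if __name__ == "__main__":
    # Hypothetical repo definition; substitute the real .repo file contents.
    sample = (
        "[example-repo]\n"
        "name=Example\n"
        "baseurl=http://example.invalid/repo\n"
        "gpgcheck=1\n"
        "gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-example\n"
    )
    print(file_based_keys(sample))
    print(missing_keys(sample))
```

When checking a chroot image, the paths would need to be resolved relative to the chroot root rather than the host filesystem, which this sketch does not do.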