Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for uploading an ostree commit to Pulp #3636

Merged
merged 19 commits into from
Oct 18, 2023

Conversation

achilleas-k
Copy link
Member

@achilleas-k achilleas-k commented Aug 17, 2023

A new target and uploader is added pulp.ostree. The pulp namespace will most likely be used for more targets in the future.

To make an ostree commit available in Pulp, we perform the following actions:

  • Upload the commit (as an artifact).
  • If the user-defined repository does not exist, create it.
  • If the repository was created, distribute it (make it available for download).
  • Import the commit into the repository.

The uploader workflow was based on Pulp's guide and demonstration video.
It uses a client API library that I autogenerated and made available here: https://github.com/osbuild/pulp-client

Relevant tickets
https://issues.redhat.com/browse/HMS-1163
https://issues.redhat.com/browse/HMS-2340


About testing
We can add a test in CI that spins up a Pulp container to work with but in my local testing using the One Container the import and distribution tasks take a few minutes and would be adding extra delays to our ostree integration tests. Maybe we should add a separate test for this that builds and pushes a single commit and does a remote repository check (with ostree remote add ... and ostree remote summary ...). (cc @henrywang)

Also, while testing, I discovered that there might be a bug when multiple commits with different refs are imported to the same repository: pulp/pulp_ostree#277


This pull request includes:

  • adequate testing for the new functionality or fixed issue
  • adequate documentation informing people about the change such as

Guides PR: osbuild/guides#144

@achilleas-k
Copy link
Member Author

Something I wasn't sure about is the decision to return from the upload code before async tasks are complete.
The import and distribute can take a while but I made it so that the URL is returned before they're finished. Would it be better if we blocked while these tasks run? I don't know how long it can take for them to finish in realistic scenarios.

@achilleas-k achilleas-k marked this pull request as draft August 21, 2023 10:51
@achilleas-k
Copy link
Member Author

Converted to draft until we add tests.

Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few comments, but the PR looks good. I have additional questions:

  1. Is there a plan to add the support for Cloud API for this upload target?
  2. Is there any dependency between publishing of the repo and importing the commit, which would mandate a requirement to make them synchronous?
  3. Can importing of the commit or publishing of the repo fail after the request for the action is created? I'm asking, because we may run into a situation where we return success for the target from the worker, but the actual import or publishing failed eventually and the content would not be available.

internal/upload/pulp/pulp.go Outdated Show resolved Hide resolved
internal/upload/pulp/pulp.go Outdated Show resolved Hide resolved
internal/weldr/upload.go Show resolved Hide resolved
cmd/osbuild-worker/jobimpl-osbuild.go Outdated Show resolved Hide resolved
@achilleas-k
Copy link
Member Author

I added a few comments, but the PR looks good. I have additional questions:

  1. Is there a plan to add the support for Cloud API for this upload target?

Yes, in fact that's the main goal. However, since in the cloud API we associate upload targets with image types directly, we need to wait until the internal Pulp service is available before adding the target to the Cloud API. For now, I'm adding it to weldr only so we can test everything.

  1. Is there any dependency between publishing of the repo and importing the commit, which would mandate a requirement to make them synchronous?

The two actions can be done in any order and are basically independent of each other. Publishing/distributing a repository before a commit is imported creates a distribution (a content path) that has nothing to serve (it just 404s). As soon as the commit import is complete, the path becomes available.

  1. Can importing of the commit or publishing of the repo fail after the request for the action is created? I'm asking, because we may run into a situation where we return success for the target from the worker, but the actual import or publishing failed eventually and the content would not be available.

That's definitely possible. It's basically the thing I was wondering in my second comment. It would make sense to wait for the import to finish, but it can take several minutes. So I'm wondering if it's a good idea to block for several minutes, holding up the worker that's not really doing anything, while a remote server is processing. I don't know what's better here.

@teg
Copy link
Member

teg commented Aug 21, 2023

It would make sense to wait for the import to finish, but it can take several minutes. So I'm wondering if it's a good idea to block for several minutes, holding up the worker that's not really doing anything, while a remote server is processing.

This is what happens with AMIs today, AFAIK. A potential solution for both AMIs and ostree commits is to split the job in two pieces: 1) build+upload; and 2) import. Then importing could be done in parallel with future builds, so there is no blocking.

@teg
Copy link
Member

teg commented Aug 21, 2023

Questions:

  1. Did you consider leaving the repository creation/publishing to the caller? Similarly to how S3 buckets are not set up in composer, but by the caller. Could simplify things potentially?
  2. How will we deal with authentication between workers and pulp in crc? Will the pulp instance in crc publish a public API the worker can talk to?

@achilleas-k
Copy link
Member Author

Questions:

  1. Did you consider leaving the repository creation/publishing to the caller? Similarly to how S3 buckets are not set up in composer, but by the caller. Could simplify things potentially?

Is the caller in this case the customer or the IB service? We could go this way. Assume it's all created beforehand.

  1. How will we deal with authentication between workers and pulp in crc? Will the pulp instance in crc publish a public API the worker can talk to?

I think that's the plan, yes. @croissanne ?

@achilleas-k achilleas-k force-pushed the pulp-ostree branch 2 times, most recently from 2815103 to 40d2754 Compare August 21, 2023 18:11
@thozza
Copy link
Member

thozza commented Aug 22, 2023

It would make sense to wait for the import to finish, but it can take several minutes. So I'm wondering if it's a good idea to block for several minutes, holding up the worker that's not really doing anything, while a remote server is processing.

This is what happens with AMIs today, AFAIK. A potential solution for both AMIs and ostree commits is to split the job in two pieces: 1) build+upload; and 2) import. Then importing could be done in parallel with future builds, so there is no blocking.

I had a very similar idea, I would just make the import async and add a new job type which would wait for import to finish in parallel with any osbuild job. Basically in the same way as we do for depsolving.

The tricky part is that such a change would be very disruptive, since composer assumes that the UUID of the osbuild job is the actual compose UUID and we use that to determine status, logs, etc in API endpoints. We already have a special case for Koji where the last job is the KojiFinalize and its UUID is used as the compose UUID. We would have to do a similar thing for new non-Koji composes and only for some of them depending on the used target. I'm sure someone can figure out an elegant and backward incompatible way to make it work, I just wanted to point out that this is not a trivial task.

@yih-redhat
Copy link
Contributor

yih-redhat commented Sep 5, 2023

Hi @achilleas-k I have tested uploading commit in pulp and it seems works well.
My steps:

  1. configure pulp using one container
  2. build edge-commit image and upload it to pulp.
  3. create local repo, pull content from pulp to local repo.
  4. provision vm with local repo
  5. build an upgrade edge-commit image, and upload to pulp. (same repo, same distribution, same path). Now pulp repository changed to version 2 with new commit.
  6. pull new commit content from pulp to local repo.
  7. run 'rpm-ostree upgrade' to upgrade current vm, success.

I will add test case for pulp but have some questions:

  1. Could you please give an example of 'composer-cli' command to build and upload to pulp?
  2. Customer need to configure pulp properly before using composer to build image? like creating repo, creating distribution, etc.
  3. The upload will take a while, if customer check the composer job status, what's the expected result? 'running', 'uploading', 'finished', etc?
  4. If build success but upload failed, can user still be able to download the image?

@kingsleyzissou
Copy link
Contributor

@yih-redhat, Achilleas is out until tomorrow so I can try answer some of your questions. He has a small test though that he wrote here: achilleas-k/pulp-ostree-test

The test does depend on pulp/pulp_ostree#277, but there is an open pull request for this... I built the all-in-one container locally with the changes from that PR and I was able to test it successfully using the above tests.

  1. Could you please give an example of 'composer-cli' command to build and upload to pulp?

For the configuration, here is a link to the guides PR: (osbuild/guides#144). Or you could use the small test Achilleas wrote for reference.

  1. Customer need to configure pulp properly before using composer to build image? like creating repo, creating distribution, etc.

As it stands the user doesn't need to do any of those steps. A repo will be created if it doesn't exist, the commit will get imported and the repository will get distributed.

  1. The upload will take a while, if customer check the composer job status, what's the expected result? 'running', 'uploading', 'finished', etc?

As far as I know the job status will be 'running' even for the 'uploading' part... for on-prem we just have 'running', 'waiting', 'finished' & 'failed'. But maybe one of the others could correct me if I'm wrong.

  1. If build success but upload failed, can user still be able to download the image?

I don't think this is possible at the moment. Again, someone could correct me here.

@yih-redhat
Copy link
Contributor

Thank you @kingsleyzissou

@yih-redhat
Copy link
Contributor

One more question @kingsleyzissou , what OS does this PR support? rhel9.3 and later? how about fedora and centos-stream?

@kingsleyzissou
Copy link
Contributor

AFAIK it will support rhel9.4 and later and then fedora and centos-stream. But maybe Achilleas can confirm that 100% when he is back.

@yih-redhat
Copy link
Contributor

yih-redhat commented Sep 6, 2023

What's the final decision of async import job? I found the compose job is at finished status, but importing job is not finished yet, usually need to wait several minutes until I can see content in pulp (At first I thought something was wrong in compose job but the log shows no error, then after a while I checked pulp again and found the content.).
Filed a github bug #3673 to track this.

@yih-redhat
Copy link
Contributor

Composer will build the image first even if the pulp configuration in pulp.toml is wrong and job will fail at final upload step.
Composer should check and deny the build request if pulp configuration is wrong, just like what it does for --url parameter, if --url is wrong, the compose job will not start.
Filed a bug #3674 to track this.

@yih-redhat
Copy link
Contributor

pulp/pulp_ostree#279 also blocks this PR.

I just found the child commit cannot be imported into pulp repo where parent commit was imported, and then I found @achilleas-k already filed this bug.

@yih-redhat
Copy link
Contributor

Added test case for pulp, and due to bug pulp/pulp_ostree#279 , test case cannot success.

@achilleas-k
Copy link
Member Author

One more question @kingsleyzissou , what OS does this PR support? rhel9.3 and later? how about fedora and centos-stream?

Any releases made after this PR is merged will support this. We wont be gating the feature on any OS or OS version. Also, we wont be backporting this to any existing RHEL versions and I don't think we want to go through the exception process to get this into 9.3 or 8.9.

FTR, this feature is primarily intended for the hosted service.

thozza
thozza previously approved these changes Oct 16, 2023
Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 :fishsticks:

@achilleas-k achilleas-k enabled auto-merge (rebase) October 16, 2023 16:04
@achilleas-k
Copy link
Member Author

@yih-redhat edge-minimal-raw-cs8 is failing on the persistent journal check. Any idea why that might have changed?

@yih-redhat
Copy link
Contributor

yih-redhat commented Oct 17, 2023

@yih-redhat edge-minimal-raw-cs8 is failing on the persistent journal check. Any idea why that might have changed?

Didn't see it before, I rerun it and now it passed.
Btw, once we enabled ostree-pulp.sh test in rhel-edge-ci, we will need to run all the test again for this pr. (we can see the pulp test result then)

achilleas-k and others added 19 commits October 17, 2023 12:20
Define a basic client struct to pull in the pulp-client library.
Function for importing a commit artifact (that's already been uploaded)
into a given repository.  Note that the "repo" argument to the
NewOstreeImportAll() function refers to the name of the repository
directory inside the archive, which for the commits we produce in
osbuild is always "repo".
Function for distributing an ostree repository, making it available for
consumption.
Define the task state enum based on the available values defined in the
API.

Add a helper function that returns true if a task is running or waiting
and ignores errors.
Helper function that performs the whole upload and distribute procedure.
Two more helper functions are added for retrieving the href for a
repository based on its name, and for retrieving the base URL for a
repository's distribution.
Simple command line wrapper around the UploadAndDistributeCommit()
function.
Add support for pulp client configuration in the worker config.
Add test values to worker config test.
When uploading and distributing a commit, wait for any async tasks to
finish before returning.  There are two tasks that can block this
function:
- Creating a distribution: this only happens when a new repository is
  created.
- Import commit: this will always happen in this function.
Since the UploadAndDistributeCommit() function now waits for all tasks
to finish, update the wording of the final message to indicate that the
commit is available.
The pulp client is very large and defines a lot of symbols in on
package, which causes very large memory usage on el8 aarch64 (presumably
because of 64k page sizes).
Adding a 1 GiB swapfile fixes the issue in our CI runners.
@achilleas-k
Copy link
Member Author

Rebased and fixed go.mod conflicts.

Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@achilleas-k achilleas-k merged commit 3d7b01b into osbuild:main Oct 18, 2023
70 checks passed
@achilleas-k achilleas-k deleted the pulp-ostree branch October 18, 2023 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants