Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add pytorch API and controller #1294

Merged
merged 1 commit into from
Jul 7, 2021

Conversation

zw0610
Copy link
Member

@zw0610 zw0610 commented Jul 6, 2021

Add PyTorch API and controller into all-in-one branch.

Tested with create and delete pytorch job.

/cc @Jeffwan

Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@johnugeorge
Copy link
Member

@zw0610 Are there any changes from the source?

Scheme: mgr.GetScheme(),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "XGBoostJob")
if err = pytorchcontroller.NewReconciler(mgr).SetupWithManager(mgr); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might have some conflicts with #1293 after the merge. I can handle the conflicts.

// See the License for the specific language governing permissions and
// limitations under the License.

package v1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is not from Kubebuilder. right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, this is copied from pytorch-operator

@google-oss-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jeffwan, terrytangyuan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-robot google-oss-robot merged commit c172880 into kubeflow:all-in-one-operator Jul 7, 2021
@Jeffwan
Copy link
Member

Jeffwan commented Jul 7, 2021

... Seems bot now detects the Github approval and converts it to a /approve label. I use Github approve and this PR was merged immediately which is not intended. but the overall change looks good to me, I just like to keep it open to get approval from other reviewers.. I will hold future PRs just in case

@zw0610
Copy link
Member Author

zw0610 commented Jul 7, 2021

@zw0610 Are there any changes from the source?

For the API part, little. For the controller part, a lot. As we are adopting the reconciler mode, the pytorch-operator controller changes the code from setting up the controller to reconciling a pytorch job. Although we are using the reconcile method from kubeflow/common, I double checked the existing pytorch operator code and found the reconciliation part looks almost the same, despite from the service generation and Pod env setting.

@Jeffwan
Copy link
Member

Jeffwan commented Jul 11, 2021

part of kubeflow/common#138 #1298

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants