feat: update jupyter tensorflow cuda rock for 1.8 #48

Open
wants to merge 4 commits into
base: main

NohaIhab
Contributor

Update the jupyter-tensorflow-cuda-full rock for 1.8, based on the upstream image

Summary of changes

  • change the branch to v1.8-branch
  • bump the rock version to 1.8
  • pin pyjuju (the `juju` Python package) to <4.0 in the unit test requirements, due to the migration to Juju 3 in 1.8
  • update kubectl to 1.25

Testing

tox -e unit
unit installed: anyio==4.0.0,asttokens==2.4.0,attrs==23.1.0,backcall==0.2.0,bcrypt==4.0.1,cachetools==5.3.1,certifi==2023.7.22,cffi==1.15.1,charmed-kubeflow-chisme==0.2.0,charset-normalizer==3.2.0,cryptography==41.0.4,decorator==5.1.1,deepdiff==6.2.1,exceptiongroup==1.1.3,executing==1.2.0,google-auth==2.23.0,h11==0.14.0,httpcore==0.18.0,httpx==0.25.0,hvac==1.2.1,idna==3.4,importlib-resources==6.0.1,iniconfig==2.0.0,ipdb==0.13.13,ipython==8.12.2,jedi==0.19.0,Jinja2==3.1.2,jsonschema==4.17.3,juju==3.2.2,kubernetes==28.1.0,lightkube==0.14.0,lightkube-models==1.28.1.4,macaroonbakery==1.3.1,MarkupSafe==2.1.3,matplotlib-inline==0.1.6,mypy-extensions==1.0.0,oauthlib==3.2.2,ops==2.6.0,ordered-set==4.1.0,packaging==23.1,paramiko==2.12.0,parso==0.8.3,pexpect==4.8.0,pickleshare==0.7.5,pkgutil-resolve-name==1.3.10,pluggy==1.3.0,prompt-toolkit==3.0.39,protobuf==3.20.3,ptyprocess==0.7.0,pure-eval==0.2.2,pyasn1==0.5.0,pyasn1-modules==0.3.0,pycparser==2.21,Pygments==2.16.1,pyhcl==0.4.5,pymacaroons==0.13.0,PyNaCl==1.5.0,pyRFC3339==1.1,pyrsistent==0.19.3,pytest==7.4.2,pytest-asyncio==0.21.1,pytest-operator==0.29.0,python-dateutil==2.8.2,pytz==2023.3.post1,PyYAML==6.0.1,requests==2.31.0,requests-oauthlib==1.3.1,rsa==4.9,ruamel.yaml==0.17.32,ruamel.yaml.clib==0.2.7,serialized-data-interface==0.7.0,six==1.16.0,sniffio==1.3.0,stack-data==0.6.2,tenacity==8.2.3,tomli==2.0.1,toposort==1.10,traitlets==5.10.0,typing-extensions==4.8.0,typing-inspect==0.9.0,urllib3==1.26.16,wcwidth==0.2.6,websocket-client==1.6.3,websockets==8.1,zipp==3.17.0
unit run-test-pre: PYTHONHASHSEED='1657534265'
unit run-test: commands[0] | rockcraft pack
WARNING: test command found but not installed in testenv
  cmd: /snap/bin/rockcraft
  env: /home/ubuntu/kubeflow-rocks/jupyter-tensorflow-cuda-full/.tox/unit
Maybe you forgot to specify a dependency? See also the whitelist_externals envconfig setting.

DEPRECATION WARNING: this will be an error in tox 4 and above!
Retrieved base ubuntu:20.04 for amd64                                                                                                                                                     
Extracted ubuntu:20.04                                                                                                                                                                    
Executed: pull conda-jupyter                                                                                                                                                              
Executed: pull generate-locale                                                                                                                                                            
Executed: pull kubectl                                                                                                                                                                    
Executed: pull nodejs                                                                                                                                                                     
Executed: pull non-root-user                                                                                                                                                              
Executed: pull overlay-pkgs                                                                                                                                                               
Executed: pull pebble                                                                                                                                                                     
Executed: pull security-team-requirement                                                                                                                                                  
Executed: overlay conda-jupyter                                                                                                                                                           
Executed: overlay generate-locale                                                                                                                                                         
Executed: overlay kubectl                                                                                                                                                                 
Executed: overlay nodejs                                                                                                                                                                  
Executed: overlay non-root-user                                                                                                                                                           
Executed: overlay overlay-pkgs                                                                                                                                                            
Executed: overlay pebble                                                                                                                                                                  
Executed: overlay security-team-requirement                                                                                                                                               
Executed: build conda-jupyter                                                                                                                                                             
Executed: build generate-locale                                                                                                                                                           
Executed: build kubectl                                                                                                                                                                   
Executed: build nodejs                                                                                                                                                                    
Executed: skip pull conda-jupyter (already ran)                                                                                                                                           
Executed: skip overlay conda-jupyter (already ran)                                                                                                                                        
Executed: skip build conda-jupyter (already ran)                                                                                                                                          
Executed: stage conda-jupyter (required to build 'non-root-user')                                                                                                                         
Executed: build non-root-user                                                                                                                                                             
Executed: build overlay-pkgs                                                                                                                                                              
Executed: build pebble                                                                                                                                                                    
Executed: build security-team-requirement                                                                                                                                                 
Executed: skip stage conda-jupyter (already ran)                                                                                                                                          
Executed: stage generate-locale                                                                                                                                                           
Executed: stage kubectl                                                                                                                                                                   
Executed: stage nodejs                                                                                                                                                                    
Executed: stage non-root-user                                                                                                                                                             
Executed: stage overlay-pkgs                                                                                                                                                              
Executed: stage pebble                                                                                                                                                                    
Executed: stage security-team-requirement                                                                                                                                                 
Executed: prime conda-jupyter                                                                                                                                                             
Executed: prime generate-locale                                                                                                                                                           
Executed: prime kubectl                                                                                                                                                                   
Executed: prime nodejs                                                                                                                                                                    
Executed: prime non-root-user                                                                                                                                                             
Executed: prime overlay-pkgs                                                                                                                                                              
Executed: prime pebble                                                                                                                                                                    
Executed: prime security-team-requirement                                                                                                                                                 
Executed parts lifecycle                                                                                                                                                                  
Exported to OCI archive 'jupyter-tensorflow-cude-full_v1.8.0_20.04_1_amd64.rock'                                                                                                          
unit run-test: commands[1] | bash -c 'NAME=$(yq eval .name rockcraft.yaml) &&  VERSION=$(yq eval .version rockcraft.yaml) &&  ARCH=$(yq eval -r ".platforms | keys" rockcraft.yaml | cut -d" " -f2) &&  ROCK="${NAME}_${VERSION}_${ARCH}" &&  sudo /snap/rockcraft/current/bin/skopeo --insecure-policy copy oci-archive:$ROCK.rock docker-daemon:$ROCK:$VERSION && docker save $ROCK > $ROCK.tar'
WARNING: test command found but not installed in testenv
  cmd: /usr/bin/bash
  env: /home/ubuntu/kubeflow-rocks/jupyter-tensorflow-cuda-full/.tox/unit
Maybe you forgot to specify a dependency? See also the whitelist_externals envconfig setting.

DEPRECATION WARNING: this will be an error in tox 4 and above!
Getting image source signatures
Copying blob edaedc954fb5 done  
Copying blob adfbb81f1512 done  
Copying blob d9c306b19569 done  
Copying blob 81a655aa3207 done  
Copying blob d07cc0564d97 done  
Copying config b7b83a0eac done  
Writing manifest to image destination
Storing signatures
unit run-test: commands[2] | pytest -v --tb native --show-capture=all --log-cli-level=INFO /home/ubuntu/kubeflow-rocks/jupyter-tensorflow-cuda-full/tests
=================================================================================== test session starts ===================================================================================
platform linux -- Python 3.8.10, pytest-7.4.2, pluggy-1.3.0 -- /home/ubuntu/kubeflow-rocks/jupyter-tensorflow-cuda-full/.tox/unit/bin/python
cachedir: .tox/unit/.pytest_cache
rootdir: /home/ubuntu/kubeflow-rocks/jupyter-tensorflow-cuda-full
plugins: anyio-4.0.0, asyncio-0.21.1, operator-0.29.0
asyncio: mode=strict
collected 1 item                                                                                                                                                                          

tests/test_rock.py::test_rock 
------------------------------------------------------------------------------------- live log setup --------------------------------------------------------------------------------------
INFO     pytest_operator.plugin:plugin.py:647 Adding model microk8s-localhost:test-rock-ek3t on cloud microk8s
PASSED                                                                                                                                                                              [100%]
------------------------------------------------------------------------------------ live log teardown ------------------------------------------------------------------------------------
INFO     pytest_operator.plugin:plugin.py:783 Model status:

Model           Controller          Cloud/Region        Version  SLA          Timestamp
test-rock-ek3t  microk8s-localhost  microk8s/localhost  3.1.5    unsupported  15:17:43Z

INFO     pytest_operator.plugin:plugin.py:789 Juju error logs:


INFO     pytest_operator.plugin:plugin.py:877 Resetting model test-rock-ek3t...
INFO     pytest_operator.plugin:plugin.py:882 Not waiting on reset to complete.
INFO     pytest_operator.plugin:plugin.py:855 Forgetting main...


=================================================================================== 1 passed in 28.55s ====================================================================================
unit run-test: commands[3] | python /home/ubuntu/kubeflow-rocks/jupyter-tensorflow-cuda-full/tests/test_imports.py
Running command in jupyter-tensorflow-cude-full_v1.8.0_20.04_1_amd64:v1.8.0_20.04_1
2023-09-20T15:17:54.625Z [pebble] Started daemon.
2023-09-20T15:17:54.641Z [pebble] POST /v1/exec 15.699896ms 202
2023-09-20T15:17:54.653Z [pebble] GET /v1/tasks/1/websocket/control 10.505927ms 200
2023-09-20T15:17:54.653Z [pebble] GET /v1/tasks/1/websocket/stdio 88.106µs 200
2023-09-20T15:17:54.654Z [pebble] GET /v1/tasks/1/websocket/stderr 72.641µs 200
2023-09-20T15:17:54.672Z [pebble] POST /v1/exec 7.388258ms 202
2023-09-20T15:17:54.679Z [pebble] GET /v1/tasks/2/websocket/control 6.359588ms 200
2023-09-20T15:17:54.680Z [pebble] GET /v1/tasks/2/websocket/stdio 154.941µs 200
2023-09-20T15:17:54.681Z [pebble] GET /v1/tasks/2/websocket/stderr 109.585µs 200
2023-09-20 15:17:56.528488: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-09-20 15:17:56.528533: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
[I 230920 15:18:12 font_manager:1443] generated new fontManager
2023-09-20T15:18:15.469Z [pebble] GET /v1/changes/2/wait 20.787890084s 200
2023-09-20T15:18:15.485Z [pebble] GET /v1/changes/1/wait 20.831186238s 200
unit run-test: commands[4] | python /home/ubuntu/kubeflow-rocks/jupyter-tensorflow-cuda-full/tests/test_access.py
Running jupyter-tensorflow-cude-full_v1.8.0_20.04_1_amd64:v1.8.0_20.04_1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3914  100  3914    0     0   136k      0 --:--:-- --:--:-- --:--:--  136k
87a6e6e767ec
87a6e6e767ec
_________________________________________________________________________________________ summary _________________________________________________________________________________________
  unit: commands succeeded
  congratulations :)
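
For readers following the log, the second tox command builds the image reference from three values in rockcraft.yaml. The sketch below mirrors that string assembly in Python. The function names are hypothetical, the values are copied from the log above (including the rock name exactly as it is spelled there), and plain `skopeo` stands in for the snap-provided binary that the real command runs with sudo.

```python
def rock_image_ref(name: str, version: str, arch: str) -> str:
    """Mirror ROCK="${NAME}_${VERSION}_${ARCH}" from the tox command."""
    return f"{name}_{version}_{arch}"


def skopeo_copy_args(name: str, version: str, arch: str) -> list:
    """Build the argument list that copies the OCI archive into the Docker daemon."""
    rock = rock_image_ref(name, version, arch)
    return [
        "skopeo",
        "--insecure-policy",
        "copy",
        f"oci-archive:{rock}.rock",
        f"docker-daemon:{rock}:{version}",
    ]


# Values as they appear in the log above (name spelled as in the log).
args = skopeo_copy_args("jupyter-tensorflow-cude-full", "v1.8.0_20.04_1", "amd64")
print(" ".join(args))
```

The last argument is the same `name:tag` that the later test commands report running.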

@NohaIhab NohaIhab requested a review from a team as a code owner September 21, 2023 07:46
@@ -25,10 +25,10 @@ def main():
     container_id = container_id[0:12]

     # to ensure container is started
-    time.sleep(5)
+    time.sleep(10)

We would need to use tenacity here as in other ROCKs.
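
A minimal sketch of the idea behind this suggestion: poll until the container answers instead of sleeping for a fixed time. It uses only the standard library so it runs without extra dependencies. With tenacity, the same policy is usually written with its `retry` decorator together with `stop_after_delay` and `wait_fixed`. The helper names, the URL, and the timeouts are assumptions for illustration and are not taken from this repository.

```python
import time
import urllib.request


def wait_until_ready(probe, timeout=30.0, interval=1.0):
    """Call probe() until it stops raising; re-raise the last error after timeout.

    Returns the number of attempts that were needed.
    """
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            probe()
            return attempts
        except Exception:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)


def probe_http(url="http://localhost:8888/"):
    """Hypothetical readiness probe: raise unless the URL answers with HTTP 200."""
    with urllib.request.urlopen(url, timeout=5) as response:
        if response.status != 200:
            raise RuntimeError(f"unexpected status {response.status}")


# Roughly the same policy with tenacity (not executed here):
#   @retry(stop=stop_after_delay(30), wait=wait_fixed(1))
#   def probe_http(): ...
```

In the test this would replace the fixed sleep with something like `wait_until_ready(probe_http)`, so the run continues as soon as the container responds and fails with the underlying error if it never does.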

@i-chvets
Contributor

@NohaIhab We will need to comment out kfp in the imports test, as we did here.
I created an issue to track the fix for it (upgrade?).
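
One way to keep such an exclusion visible, rather than commenting the import out, is an explicit skip list. This is only a sketch: the contents of tests/test_imports.py are not shown in this conversation, so the function name and the module list here are hypothetical.

```python
import importlib


def check_imports(modules, skipped=frozenset()):
    """Try to import each module and return the names that failed.

    Names listed in `skipped` are not attempted at all.
    """
    failures = []
    for name in modules:
        if name in skipped:
            continue
        try:
            importlib.import_module(name)
        except ImportError:
            failures.append(name)
    return failures


# kfp is skipped until the tracked issue is resolved, so it is never imported here.
print(check_imports(["json", "math", "kfp"], skipped={"kfp"}))
```

Removing the name from `skipped` re-enables the check once the issue is fixed.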

Ivan Chvets added 3 commits October 10, 2023 15:28
Summary of changes:
- Updated handling of env variables and service command to use env vars.