feat: add support for kubernetes trial log entries #925

Merged

@hkang1 hkang1 commented Jul 22, 2020

Description

Add support for parsing Kubernetes trial log entries such as:

 045b68a5 [RUNNING] || 2020-07-20T15:35:38.548199278Z INFO: Running workload <RUN_STEP (100): (141,137,9)>
 045b68a5 [RUNNING] || 2020-07-20T15:35:41.197537494Z INFO: Workload completed: <RUN_STEP (100): (141,137,9)> (duration 0:00:02.649047)
 045b68a5 [RUNNING] || 2020-07-20T15:35:41.208223876Z INFO: Running workload <COMPUTE_VALIDATION_METRICS: (141,137,9)>
 045b68a5 [RUNNING] || 2020-07-20T15:35:43.607370553Z INFO: Workload completed: <COMPUTE_VALIDATION_METRICS: (141,137,9)> (duration 0:00:02.398744)
 045b68a5 [RUNNING] || 2020-07-20T15:35:43.620408964Z INFO: Running workload <CHECKPOINT_MODEL: (141,137,9)>
 045b68a5 [RUNNING] || 2020-07-20T15:35:43.64517623Z INFO: Saved trial to checkpoint b5258099-0138-4ed7-9ff7-d01ee4b24ef0
 045b68a5 [RUNNING] || 2020-07-20T15:35:43.646657674Z INFO: Uploading checkpoint b5258099-0138-4ed7-9ff7-d01ee4b24ef0 to GCS
 045b68a5 [RUNNING] || 2020-07-20T15:35:46.372110929Z INFO: Workload completed: <CHECKPOINT_MODEL: (141,137,9)> (duration 0:00:00.025378)
 045b68a5 [RUNNING] || 2020-07-20T15:35:46.377021261Z INFO: Running workload <TERMINATE: (141,137,9)>
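
As an illustration of what parsing these entries involves, here is a minimal Python sketch. It is not the code from this PR. The field names (`entry_id`, `state`, `level`, `message`) and the regular expression are assumptions inferred from the sample lines above. Note that the timestamps carry up to nine fractional digits, so the sketch truncates them to microseconds before building a `datetime`.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from the samples:
#   <8 hex chars> [<STATE>] || <UTC timestamp> <LEVEL>: <message>
K8S_ENTRY = re.compile(
    r"^\s*(?P<entry_id>[0-9a-f]{8}) \[(?P<state>[^\]]+)\] \|\| "
    r"(?P<seconds>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(?P<fraction>\d+))?Z "
    r"(?P<level>[A-Z]+): (?P<message>.*)$"
)


def parse_k8s_entry(line):
    """Return a dict of fields, or None if the line does not match."""
    match = K8S_ENTRY.match(line)
    if match is None:
        return None
    # datetime stores microseconds only, so pad or truncate to 6 digits.
    micros = (match.group("fraction") or "").ljust(6, "0")[:6]
    time = datetime.strptime(match.group("seconds"), "%Y-%m-%dT%H:%M:%S")
    time = time.replace(microsecond=int(micros), tzinfo=timezone.utc)
    return {
        "entry_id": match.group("entry_id"),
        "state": match.group("state"),
        "time": time,
        "level": match.group("level"),
        "message": match.group("message"),
    }


sample = (
    " 045b68a5 [RUNNING] || 2020-07-20T15:35:43.64517623Z INFO: "
    "Saved trial to checkpoint b5258099-0138-4ed7-9ff7-d01ee4b24ef0"
)
print(parse_k8s_entry(sample))
```

Returning `None` for non-matching lines lets a caller fall back to another format instead of failing on the first unexpected entry.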

The current default format looks like:

[2020-07-22T17:21:31Z] 89d54fe1 [PULLING] || image already found, skipping pull phase: docker.io/determinedai/environments:py-3.6.9-pytorch-1.4-tf-1.15-cpu-f2038fa
[2020-07-22T17:21:31Z] 89d54fe1 [STARTING] || copying files to container: 
[2020-07-22T17:21:31Z] 89d54fe1 [STARTING] || copying files to container: 
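
For comparison, the default format puts a bracketed, second-resolution timestamp at the start of the line and has no log level. A hedged Python sketch of matching it follows. The pattern and the field names are assumptions drawn from the three sample lines, not the project's actual parser.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from the samples:
#   [<UTC timestamp>] <8 hex chars> [<STATE>] || <message>
DEFAULT_ENTRY = re.compile(
    r"^\[(?P<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z\] "
    r"(?P<entry_id>[0-9a-f]{8}) \[(?P<state>[^\]]+)\] \|\| (?P<message>.*)$"
)


def parse_default_entry(line):
    """Return a dict of fields, or None if the line does not match."""
    match = DEFAULT_ENTRY.match(line)
    if match is None:
        return None
    time = datetime.strptime(match.group("time"), "%Y-%m-%dT%H:%M:%S")
    return {
        "time": time.replace(tzinfo=timezone.utc),
        "entry_id": match.group("entry_id"),
        "state": match.group("state"),
        # Some sample messages end with a trailing space; drop it.
        "message": match.group("message").rstrip(),
    }


sample = "[2020-07-22T17:21:31Z] 89d54fe1 [STARTING] || copying files to container: "
print(parse_default_entry(sample))
```

Because this pattern requires a leading `[`, a Kubernetes-style line (which starts with the hex identifier) does not match and yields `None`.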

Test Plan

Commentary (optional)

@hkang1 hkang1 merged commit aedd2ed into determined-ai:master Jul 23, 2020
@hkang1 hkang1 deleted the 3670-parse-different-trial-log-entries branch July 23, 2020 06:05
stoksc pushed a commit that referenced this pull request Jul 20, 2023
…make slurmcluster" unconfigured user failures [FE-101] (#925)

* chore: Show the Determined users at startup to help us troubleshoot "make slurmcluster" unconfigured user failures [FE-101]

* Display the Determined users before running the tests

* Show the Determined users in config.yml, just prior to running the test

* specify the host and port when calling 'det user list'

* Show the master host being pointed to when showing the 'det user list'

* Show the master host being pointed to when showing the 'det user list'
eecsliu pushed a commit that referenced this pull request Jul 24, 2023
stoksc pushed a commit that referenced this pull request Oct 17, 2023
azhou-determined pushed a commit that referenced this pull request Dec 7, 2023
wes-turner pushed a commit that referenced this pull request Feb 2, 2024
@dannysauer dannysauer added this to the 0.12.13 milestone Feb 6, 2024
rb-determined-ai pushed a commit that referenced this pull request Feb 29, 2024
amandavialva01 pushed a commit that referenced this pull request Mar 18, 2024
eecsliu pushed a commit that referenced this pull request Apr 18, 2024
eecsliu pushed a commit to determined-ai/determined-release-testing that referenced this pull request Apr 22, 2024