
depending on hp algorithms, experiment parameters' feasible space require step size #1472

Closed
MLXQ opened this issue Mar 15, 2021 · 8 comments


@MLXQ

MLXQ commented Mar 15, 2021

/kind bug

What steps did you take and what happened:
[A clear and concise description of what the bug is.]

I'm trying to run hyperparameter tuning experiments with two different algorithms, grid search and random search.

For example, I've been tuning batch size and learning rate,
and it worked seamlessly without specifying a step size when I used random search.

However, grid search requires me to specify a step size for learning rate in the experiment parameter's feasible space,
while batch size apparently doesn't need a step size for grid search.

  1. Is there any document that describes which hyperparameters need a step size and which don't?

  2. Random search works fine without a step size, but I'm curious whether adding a step size makes any difference for random search.

What did you expect to happen:

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

---- grid search ----

apiVersion: "kubeflow.org/v1beta1"
kind: Experiment
metadata:
  namespace: user-namespace-nohardspec
  name: katib-tfjob-stdout-profile-nohardlimit
spec:
  metricsCollectorSpec:
    collector:
      kind: StdOut
    source:
      filter:
        metricsFormat:
          - "([\w|-]+)\s*:\s*((-?\d+)(\.\d+)?)"
  parallelTrialCount: 1
  maxTrialCount: 20
  maxFailedTrialCount: 10
  objective:
    type: maximize
    goal: 1000000000000
    objectiveMetricName: loss
  algorithm:
    algorithmName: grid
  earlyStopping:
    algorithmName: medianstop
    algorithmSettings:
      - name: min_trials_required
        value: "2"
      - name: start_step
        value: "2"
  parameters:
    - name: batch_size
      parameterType: int
      feasibleSpace:
        min: "1"
        max: "100"
    - name: learning_rate
      parameterType: double
      feasibleSpace:
        min: "0.0001"
        max: "0.01"
        step: "0.001"
  trialTemplate:
    primaryContainerName: tensorflow
    trialParameters:
      - name: batchSize
        description: Batch Size
        reference: batch_size
      - name: learningRate
        description: Learning Rate
        reference: learning_rate
    trialSpec:
      apiVersion: "kubeflow.org/v1"
      kind: TFJob
      spec:
        tfReplicaSpecs:
          Worker:
            replicas: 1
            restartPolicy: OnFailure
            template:
              metadata:
                annotations:
                  sidecar.istio.io/inject: "false"
              spec:
                containers:
                  - command:

---- random search ----

apiVersion: "kubeflow.org/v1beta1"
kind: Experiment
metadata:
  namespace: user-namespace-nohardspec
  name: katib-tfjob-stdout-profile-nohardlimit
spec:
  metricsCollectorSpec:
    collector:
      kind: StdOut
    source:
      filter:
        metricsFormat:
          - "([\w|-]+)\s*:\s*((-?\d+)(\.\d+)?)"
  parallelTrialCount: 1
  maxTrialCount: 20
  maxFailedTrialCount: 10
  objective:
    type: maximize
    goal: 1000000000000
    objectiveMetricName: loss
  algorithm:
    algorithmName: random
  earlyStopping:
    algorithmName: medianstop
    algorithmSettings:
      - name: min_trials_required
        value: "2"
      - name: start_step
        value: "2"
  parameters:
    - name: batch_size
      parameterType: int
      feasibleSpace:
        min: "1"
        max: "100"
        step: "1"
    - name: learning_rate
      parameterType: double
      feasibleSpace:
        min: "0.0001"
        max: "0.01"
        step: "0.001"
  trialTemplate:
    primaryContainerName: tensorflow
    trialParameters:
      - name: batchSize
        description: Batch Size
        reference: batch_size
      - name: learningRate
        description: Learning Rate
        reference: learning_rate
    trialSpec:
      apiVersion: "kubeflow.org/v1"
      kind: TFJob
      spec:
        tfReplicaSpecs:
          Worker:
            replicas: 1
            restartPolicy: OnFailure
            template:
              metadata:
                annotations:
                  sidecar.istio.io/inject: "false"
              spec:
                containers:
                  - command:

Environment:

  • Kubeflow version (kfctl version): 1.2
  • Minikube version (minikube version):
  • Kubernetes version: (use kubectl version): 1.15
  • OS (e.g. from /etc/os-release): Ubuntu 18.04
@MLXQ MLXQ changed the title depending on the hp algorithm, experiment parameters' feasible space require step size depending on hp algorithms, experiment parameters' feasible space require step size Mar 15, 2021
@andreyvelich
Member

Thank you for creating this @MLXQ!

Is there any document that describes which hyperparameters need a step size and which don't?

For grid search, the default step for an int parameter is "1": https://github.com/kubeflow/katib/blob/master/pkg/suggestion/v1beta1/internal/search_space.py#L35.
For double parameters there is no default step.
Currently we don't have such documentation, but it would be great if we could add that information here: https://www.kubeflow.org/docs/components/katib/experiment/#grid-search.

Random search works fine without a step size, but I'm curious whether adding a step size makes any difference for random search.

Grid search requires a discrete search space, while random search can work with a continuous (infinite) search space.
That is why you have to specify the step parameter for your HPs when using grid search.
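
To make the difference concrete, here is a minimal Python sketch (an illustration, not Katib code; the ranges are taken from the specs above) of why grid search needs every parameter discretized while random search does not:

import itertools
import random

# batch_size is an int parameter, so grid search can default its step to 1.
batch_sizes = list(range(1, 101))

# learning_rate is a double parameter; grid search can only enumerate it
# after a step discretizes the range: 0.0001, 0.0011, ..., 0.0091.
learning_rates = [round(0.0001 + i * 0.001, 4) for i in range(10)]

# Grid search walks the finite Cartesian product of the discretized spaces.
grid_trials = list(itertools.product(batch_sizes, learning_rates))

# Random search simply draws from the ranges, so no step is required.
random_trial = (random.randint(1, 100), random.uniform(0.0001, 0.01))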

@stale

stale bot commented Jun 16, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@RDarrylR

Is there any reason why step isn't used for double values in random search? When I'm searching for values for many types of params, I usually don't want to end up with a value that has many decimal places. Maybe the underlying hyperopt doesn't support step for floats?

@stale stale bot removed the lifecycle/stale label Jun 29, 2021
@andreyvelich
Member

andreyvelich commented Jun 29, 2021

Is there any reason why step isn't used for double values in random search? When I'm searching for values for many types of params, I usually don't want to end up with a value that has many decimal places. Maybe the underlying hyperopt doesn't support step for floats?

Yes, currently we use uniform for double HPs in hyperopt, which doesn't take a step.
@gaocegege can we use quniform for double parameters to support step?

We are also discussing support for various search space distributions in the APIs: #1207.
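
For context, here is a minimal hyperopt sketch (an illustration, not Katib's actual suggestion code) of the uniform vs. quniform behaviour discussed above:

from hyperopt import fmin, hp, rand

# uniform samples any value in [0.0001, 0.01]; there is no step, so the
# suggested values can have many decimal places.
space_uniform = hp.uniform("learning_rate", 0.0001, 0.01)

# quniform rounds each sample to a multiple of q (here 0.001), which is the
# behaviour a "step" field would map to for double parameters.
space_quniform = hp.quniform("learning_rate", 0.0001, 0.01, 0.001)

# Toy objective with hyperopt's random-search algorithm, 10 trials.
best = fmin(fn=lambda lr: (lr - 0.005) ** 2,
            space=space_quniform,
            algo=rand.suggest,
            max_evals=10)
print(best)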

@RDarrylR

RDarrylR commented Jun 29, 2021

Thanks @andreyvelich. What I do now is just make up categorical lists of values for double params like ['0.05', '0.01', '0.005', '0.001'], but this isn't a great way to deal with it.
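
For reference, that workaround corresponds to a categorical parameter in the v1beta1 Experiment spec; a hypothetical snippet using the values from the list above:

parameters:
  - name: learning_rate
    parameterType: categorical
    feasibleSpace:
      list:
        - "0.05"
        - "0.01"
        - "0.005"
        - "0.001"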

@johnugeorge
Member

This can be supported without any API changes.

@stale

stale bot commented Jan 3, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Jan 3, 2022
@stale

stale bot commented Mar 2, 2022

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@stale stale bot closed this as completed Mar 2, 2022