
depending on hp algorithms, experiment parameters' feasible space require step size #1472

Closed
MLXQ opened this issue Mar 15, 2021 · 8 comments


@MLXQ

MLXQ commented Mar 15, 2021

/kind bug

What steps did you take and what happened:
[A clear and concise description of what the bug is.]

I'm trying to run hyperparameter tuning experiments with two different algorithms, grid search and random search.

For example, I've been tuning batch size and learning rate,
and it worked seamlessly without specifying a step size when I used random search.

However, grid search requires me to specify a step size for learning rate in the experiment parameter's feasible space,
while batch size apparently doesn't need a step size for grid search.

  1. Is there any document that describes which hyperparameters need a step size and which don't?

  2. Random search works fine without a step size, but I'm curious whether adding a step size makes any difference for random search.

What did you expect to happen:

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

---- grid search ----

apiVersion: "kubeflow.org/v1beta1"
kind: Experiment
metadata:
  namespace: user-namespace-nohardspec
  name: katib-tfjob-stdout-profile-nohardlimit
spec:
  metricsCollectorSpec:
    collector:
      kind: StdOut
    source:
      filter:
        metricsFormat:
          - "([\w|-]+)\s*:\s*((-?\d+)(\.\d+)?)"
  parallelTrialCount: 1
  maxTrialCount: 20
  maxFailedTrialCount: 10
  objective:
    type: maximize
    goal: 1000000000000
    objectiveMetricName: loss
  algorithm:
    algorithmName: grid
  earlyStopping:
    algorithmName: medianstop
    algorithmSettings:
      - name: min_trials_required
        value: "2"
      - name: start_step
        value: "2"
  parameters:
    - name: batch_size
      parameterType: int
      feasibleSpace:
        min: "1"
        max: "100"
    - name: learning_rate
      parameterType: double
      feasibleSpace:
        min: "0.0001"
        max: "0.01"
        step: "0.001"
  trialTemplate:
    primaryContainerName: tensorflow
    trialParameters:
      - name: batchSize
        description: Batch Size
        reference: batch_size
      - name: learningRate
        description: Learning Rate
        reference: learning_rate
    trialSpec:
      apiVersion: "kubeflow.org/v1"
      kind: TFJob
      spec:
        tfReplicaSpecs:
          Worker:
            replicas: 1
            restartPolicy: OnFailure
            template:
              metadata:
                annotations:
                  sidecar.istio.io/inject: "false"
              spec:
                containers:
                  - command:

---- random search ----

apiVersion: "kubeflow.org/v1beta1"
kind: Experiment
metadata:
  namespace: user-namespace-nohardspec
  name: katib-tfjob-stdout-profile-nohardlimit
spec:
  metricsCollectorSpec:
    collector:
      kind: StdOut
    source:
      filter:
        metricsFormat:
          - "([\w|-]+)\s*:\s*((-?\d+)(\.\d+)?)"
  parallelTrialCount: 1
  maxTrialCount: 20
  maxFailedTrialCount: 10
  objective:
    type: maximize
    goal: 1000000000000
    objectiveMetricName: loss
  algorithm:
    algorithmName: random
  earlyStopping:
    algorithmName: medianstop
    algorithmSettings:
      - name: min_trials_required
        value: "2"
      - name: start_step
        value: "2"
  parameters:
    - name: batch_size
      parameterType: int
      feasibleSpace:
        min: "1"
        max: "100"
        step: "1"
    - name: learning_rate
      parameterType: double
      feasibleSpace:
        min: "0.0001"
        max: "0.01"
        step: "0.001"
  trialTemplate:
    primaryContainerName: tensorflow
    trialParameters:
      - name: batchSize
        description: Batch Size
        reference: batch_size
      - name: learningRate
        description: Learning Rate
        reference: learning_rate
    trialSpec:
      apiVersion: "kubeflow.org/v1"
      kind: TFJob
      spec:
        tfReplicaSpecs:
          Worker:
            replicas: 1
            restartPolicy: OnFailure
            template:
              metadata:
                annotations:
                  sidecar.istio.io/inject: "false"
              spec:
                containers:
                  - command:

Environment:

  • Kubeflow version (kfctl version): 1.2
  • Minikube version (minikube version):
  • Kubernetes version: (use kubectl version): 1.15
  • OS (e.g. from /etc/os-release): Ubuntu 18.04
@MLXQ MLXQ changed the title depending on the hp algorithm, experiment parameters' feasible space require step size depending on hp algorithms, experiment parameters' feasible space require step size Mar 15, 2021
@andreyvelich
Member

Thank you for creating this @MLXQ!

Is there any document that describes which hyperparameters need a step size and which don't?

For grid search, the default step for an int parameter is "1": https://github.com/kubeflow/katib/blob/master/pkg/suggestion/v1beta1/internal/search_space.py#L35.
For double parameters there is no default step.
Currently we don't have such documentation, but it would be great if we could add that information here: https://www.kubeflow.org/docs/components/katib/experiment/#grid-search.

Random search works fine without a step size, but I'm curious whether adding a step size makes any difference for random search.

Grid search requires a discrete search space, while random search can work with a continuous (infinite) search space.
That is why you have to specify the step parameter for your HPs when using grid search.
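
To make the difference concrete, here is a minimal Python sketch (an illustration, not Katib code; the ranges are taken from the specs above) of why grid search needs every parameter discretized while random search does not:

import itertools
import random

# batch_size is an int parameter, so grid search can default its step to 1.
batch_sizes = list(range(1, 101))

# learning_rate is a double parameter; grid search can only enumerate it
# after a step discretizes the range: 0.0001, 0.0011, ..., 0.0091.
learning_rates = [round(0.0001 + i * 0.001, 4) for i in range(10)]

# Grid search walks the finite Cartesian product of the discretized spaces.
grid_trials = list(itertools.product(batch_sizes, learning_rates))

# Random search simply draws from the ranges, so no step is required.
random_trial = (random.randint(1, 100), random.uniform(0.0001, 0.01))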

@stale

stale bot commented Jun 16, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@RDarrylR

Is there any reason why step isn't used for double values in random search? When I'm searching for values for many types of params, I usually don't want to end up with a value that has many decimal places. Maybe the underlying hyperopt doesn't support step for floats?

@stale stale bot removed the lifecycle/stale label Jun 29, 2021
@andreyvelich
Member

andreyvelich commented Jun 29, 2021

Is there any reason why step isn't used for double values in random search? When I'm searching for values for many types of params, I usually don't want to end up with a value that has many decimal places. Maybe the underlying hyperopt doesn't support step for floats?

Yes, currently we use uniform for double HPs in hyperopt, which doesn't take a step.
@gaocegege can we use quniform for double parameters to support step?

We are also discussing support for various search space distributions in the APIs: #1207.
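
For context, here is a minimal hyperopt sketch (an illustration, not Katib's actual suggestion code) of the uniform vs. quniform behaviour discussed above:

from hyperopt import fmin, hp, rand

# uniform samples any value in [0.0001, 0.01]; there is no step, so the
# suggested values can have many decimal places.
space_uniform = hp.uniform("learning_rate", 0.0001, 0.01)

# quniform rounds each sample to a multiple of q (here 0.001), which is the
# behaviour a "step" field would map to for double parameters.
space_quniform = hp.quniform("learning_rate", 0.0001, 0.01, 0.001)

# Toy objective with hyperopt's random-search algorithm, 10 trials.
best = fmin(fn=lambda lr: (lr - 0.005) ** 2,
            space=space_quniform,
            algo=rand.suggest,
            max_evals=10)
print(best)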

@RDarrylR

RDarrylR commented Jun 29, 2021

Thanks @andreyvelich. What I do now is just make up categorical lists of values for double params like ['0.05', '0.01', '0.005', '0.001'], but this isn't a great way to deal with it.
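
For reference, that workaround corresponds to a categorical parameter in the v1beta1 Experiment spec; a hypothetical snippet using the values from the list above:

parameters:
  - name: learning_rate
    parameterType: categorical
    feasibleSpace:
      list:
        - "0.05"
        - "0.01"
        - "0.005"
        - "0.001"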

@johnugeorge
Member

This can be supported without any API changes.

@stale

stale bot commented Jan 3, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Jan 3, 2022
@stale

stale bot commented Mar 2, 2022

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@stale stale bot closed this as completed Mar 2, 2022