int
default: 2
Verbosity level
- 0 = off
- 1 = show progress
- 2 = show statistics
str
default: ./
Directory or filename of the output.
- If a directory, it must end with /. In that case, the filename will be the name of the yaml config file.
- If a filename, it must end with .png. Note that a number is attached to the filename or is automatically increased, if the file already exists.
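The two cases can be sketched as follows. This is only an illustration: the helper `output_path` is hypothetical, it assumes the config file name is reused with a `.png` extension, and it leaves out the automatic numbering because the exact numbering scheme is not described here.

```python
from pathlib import PurePosixPath


def output_path(output: str, config_file: str) -> str:
    """Resolve the output setting to an image filename (hypothetical helper)."""
    if output.endswith("/"):
        # Directory: reuse the name of the yaml config file.
        # Assumption: the extension is swapped for .png.
        return output + PurePosixPath(config_file).stem + ".png"
    if output.endswith(".png"):
        # Explicit filename: used as given.
        return output
    raise ValueError("output must end with '/' or '.png'")


print(output_path("./", "experiments/cat.yaml"))
print(output_path("images/cat.png", "experiments/cat.yaml"))
```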
int, float
default: 20.0
Interval after which a snapshot of the currently trained image is saved.
A float number specifies the interval in seconds. An integer number specifies the interval in number-of-epochs.
A zero (either int or float) disables storage of snapshots.
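The distinction between int and float intervals can be sketched like this. The function `snapshot_due` and its arguments are hypothetical names chosen for illustration, not part of the tool itself.

```python
def snapshot_due(interval, epochs_since_last: int, seconds_since_last: float) -> bool:
    """Decide whether a snapshot should be saved (hypothetical helper)."""
    if interval == 0:
        # Zero, as int or float, disables snapshots.
        return False
    if isinstance(interval, int):
        # Integer: interval counted in epochs.
        return epochs_since_last >= interval
    # Float: interval counted in seconds.
    return seconds_since_last >= interval


print(snapshot_due(20.0, epochs_since_last=5, seconds_since_last=25.0))
print(snapshot_due(20, epochs_since_last=5, seconds_since_last=25.0))
```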
int
default: 300
The number of training steps before stopping the training, not including batch sizes.
For example, if the number of epochs is 100 and a target has a batch_size of 10, then 1000 training steps will be performed.
int
default: 0
The number of epochs to skip in the beginning.
This is used by the GUI application to continue training after config changes.
list of length 2 of int
default: [224, 224]
expression variables: time
Resolution of the image to create. A single number for square images or two numbers for width and height.
It supports expression variables, so you can change the resolution during training, e.g.:
resolution:
- 224 if t < .2 else 448
would change the resolution from 224x224 to 448x448 at 20% of training time.
The interpolation method defaults to 'cubic' and can be changed with interpolation.
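To see how the resolution expression above behaves over training time, the following sketch evaluates it with plain Python `eval`. It assumes `t` runs from 0.0 to 1.0 over the course of training, as the 20% figure suggests. The helper `resolution_at` is hypothetical and the real expression evaluator may work differently.

```python
def resolution_at(expression: str, t: float) -> list:
    """Evaluate a resolution expression at training time t (hypothetical helper)."""
    # Only the variable `t` is exposed; no builtins are available.
    size = int(eval(expression, {"__builtins__": {}}, {"t": t}))
    # A single number stands for a square image.
    return [size, size]


expr = "224 if t < .2 else 448"
for t in (0.0, 0.1, 0.2, 0.5, 1.0):
    print(t, resolution_at(expr, t))
```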
str
default: cubic
expression variables: time
Filter used when resizing the training image.
Can be one of:
- nearest: no interpolation
- linear: bilinear interpolation
- cubic: bicubic interpolation
str
default: ViT-B/32
The pre-trained CLIP model to use. Options are RN50, RN101, RN50x4, ViT-B/32.
The models are downloaded from openaipublic.azureedge.net and stored in the user's ~/.cache/ directory.
str
default: auto
The device to run the training on. Can be cpu, cuda, cuda:1 etc.
float
default: 1.0
expression variables: resolution, time
The learning rate of the optimizer.
Different optimizers have different learning rates that work well.
However, this value is scaled by hand so that 1.0 translates to about the same learning rate for each optimizer.
The learnrate value is available to other expressions as lr or learnrate.
float
default: 1.0
expression variables: resolution, time
A scaling parameter for the actual learning rate.
It's for convenience in the case when learnrate_scale is an expression like 1. - t.
The actual learnrate can be overridden with fixed values like 2 or 3 in different experiments.
The learnrate_scale value is available to other expressions as lrs or learnrate_scale.
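As a rough illustration of how the two values combine, the sketch below multiplies a fixed learnrate with a learnrate_scale expression. It assumes the effective rate is simply the product of both and that `t` runs from 0.0 to 1.0. The function `effective_learnrate` is a hypothetical name.

```python
def effective_learnrate(learnrate: float, scale_expression: str, t: float) -> float:
    """Multiply learnrate by an evaluated learnrate_scale expression (hypothetical)."""
    # Only `t` is exposed to the expression; no builtins are available.
    lrs = eval(scale_expression, {"__builtins__": {}}, {"t": t})
    return learnrate * lrs


# Same decay expression, different fixed learnrate per experiment.
for learnrate in (2.0, 3.0):
    print(learnrate, [effective_learnrate(learnrate, "1. - t", t) for t in (0.0, 0.5, 1.0)])
```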
str
default: adam
The torch optimizer to perform the gradient descent.
Defines the way the pixels are initialized. The default is random pixels.
list of length 2 of int
no default
This can alter the resolution of the noise or loaded image before it is converted to the resolution of the training image.
list of length 3 of float
default: [0.5, 0.5, 0.5]
The mean (brightness) of the initial pixel noise.
Can be a single number for gray or three numbers for RGB.
list of length 3 of float
default: [0.1, 0.1, 0.1]
The standard deviation (randomness) of the initial pixel noise.
A single number will be copied to the RGB values.
str
no default
A filename of an image to use as starting point.
The image will be scaled to the desired resolution if necessary.
list
no default
A 3-dimensional matrix of pixel values in the range [0, 1].
The layout is the same as used in torchvision, namely [C, H, W], where C is the number of colors (3), H is height and W is width.
This is used by the GUI application to continue training after config changes.
This is a list of targets that define the desired image.
Most important are the features where texts or images are defined which get converted into CLIP features and then drive the image creation process.
It's possible to add additional constraints which alter image creation without using CLIP, e.g. the image mean, saturation or gaussian blur.
bool
default: True
A boolean to turn off the target during development.
This is just a convenience parameter. To turn off a target during testing without deleting all the parameters, simply put active: false inside.
str
default: target
The name of the target.
Currently this is just displayed in the statistics dump and has no functionality.
int, float
default: 0.0
Start frame of the target. The whole target is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch.
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs.
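The three ways of writing a time value can be sketched as a small conversion function. The name `to_epoch` is hypothetical, and rounding down to a whole epoch is an assumption made for this example.

```python
def to_epoch(value, epochs: int) -> int:
    """Convert a start/end value to an epoch number (hypothetical helper)."""
    if isinstance(value, str) and value.strip().endswith("%"):
        # "23.5%": percentage of the number of epochs.
        return int(float(value.strip()[:-1]) / 100.0 * epochs)
    if isinstance(value, int):
        # int: already an epoch frame.
        return value
    # float: ratio between 0.0 and 1.0, where 1.0 is the final epoch.
    # Assumption: fractional epochs are rounded down.
    return int(float(value) * epochs)


for value in (50, 0.5, "23.5%"):
    print(repr(value), "->", to_epoch(value, epochs=300))
```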
int, float
default: 1.0
End frame of the target. The whole target is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch.
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
Weight factor that is multiplied with all the weights of features and constraints.
int
default: 1
The number of image frames to process during one epoch.
In machine learning, the batch size is one of the important and magic hyper-parameters. It controls how many different training samples are included in one weight update.
With CLIPig we are not training a neural network or anything complicated, we just adjust pixel colors, so different batch sizes probably do not make as much difference to the outcome.
However, increasing the batch size certainly reduces the overall computation time. E.g. you can run an experiment for 1000 epochs with batch size 1, or for 100 epochs with a batch size of 10. The latter is much faster. Basically, you can increase the batch size until memory is exhausted.
str
default: all
Selects how multiple features are handled.
- all: All feature losses (multiplied with their individual weights) are added together.
- best: The similarity between the features of the current image pixels and each desired feature is calculated and the feature with the highest similarity is chosen to adjust the pixels in its direction.
- worst: Similar to the best selection mode, the current similarity is calculated and then the worst matching feature is selected. While best mode will generally increase the influence of one or a few features, the worst mode will try to increase the influence of all features equally.
- mix: All individual features are averaged together (respecting their individual weights) and the resulting feature is compared with the features of the current image. This actually works quite well!
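A minimal sketch of the all, best and worst modes, working on already computed similarity values. It assumes a per-feature loss of `weight * (1 - similarity)`. The function `select_loss` is hypothetical. The mix mode is left out because it averages the feature vectors themselves, which this sketch does not model.

```python
def select_loss(similarities, weights, mode="all"):
    """Combine per-feature similarities into one loss (hypothetical helper)."""
    # Assumed per-feature loss: weight * (1 - similarity).
    losses = [w * (1.0 - s) for s, w in zip(similarities, weights)]
    indices = range(len(similarities))
    if mode == "all":
        # Sum of all weighted feature losses.
        return sum(losses)
    if mode == "best":
        # Only the feature with the highest current similarity.
        return losses[max(indices, key=lambda i: similarities[i])]
    if mode == "worst":
        # Only the feature with the lowest current similarity.
        return losses[min(indices, key=lambda i: similarities[i])]
    raise ValueError(f"unsupported mode: {mode}")


sims = [0.3, 0.25, 0.1]
weights = [1.0, 1.0, 1.0]
for mode in ("all", "best", "worst"):
    print(mode, select_loss(sims, weights, mode))
```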
str
default: equal
Adjusts the initial scaling of the similarity between training image and this feature.
- equal: All factors are 1.
- fair: Factors are set such that the initial frame of the training image has the same similarity with each feature.
A list of features to drive the image creation.
The CLIP network is used to convert texts or images into a 512-dimensional vector of latent variables.
In the image creation process each target takes a section of the current image, shows it to CLIP and compares the resulting feature vector with the vector of each defined feature.
Through backpropagation each pixel is then slightly adjusted in a way that would make the CLIP feature more similar to the defined features.
str
no default
A word, sentence or paragraph that describes the desired image contents.
CLIP understands English fairly well, and also some phrases in other languages.
str
no default
Path or URL to an image file (supported formats).
As an alternative to text, an image can be converted into the target feature.
Currently the image is resized to 224x224, ignoring the aspect ratio, to fit into the CLIP input window.
If the path starts with http:// or https://, it's treated as a URL and the image is downloaded and cached in ~/.cache/img/<md5-hash-of-url>.
int, float
default: 0.0
Start frame of the specific feature.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch.
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs.
int, float
default: 1.0
End frame of the specific feature.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch.
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs.
float
default: 1.0
expression variables: learnrate, resolution, target feature, time
A weight parameter to control the influence of a specific feature of a target.
Note that you can use negative weights as well, which translates roughly to: generate an image that is the least similar to that feature.
float
default: 1.0
expression variables: learnrate, resolution, target feature, time
A scaling parameter that is multiplied with the similarity value to yield the actual similarity used, e.g., for best_match select.
str
default: cosine
The loss function used to calculate the difference (or error) between current and desired feature.
- cosine: The loss function is 1 - cosine_similarity(current, target). The CLIP network was trained using cosine similarity so that is the default setting.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
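The three loss functions can be written out in plain Python for small vectors. This is a sketch of the formulas only. The function names are chosen for this example, and a real implementation would operate on tensors.

```python
import math


def cosine_loss(current, target):
    """1 - cosine_similarity(current, target)."""
    dot = sum(c * t for c, t in zip(current, target))
    norm = math.sqrt(sum(c * c for c in current)) * math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / norm


def l1_loss(current, target):
    """Mean absolute error."""
    return sum(abs(c - t) for c, t in zip(current, target)) / len(current)


def l2_loss(current, target):
    """Mean squared error."""
    return sum((c - t) ** 2 for c, t in zip(current, target)) / len(current)


current, target = [0.5, 0.0], [0.0, 0.0]
# For differences below 1.0 the squared error is smaller than the absolute error.
print(l1_loss(current, target), l2_loss(current, target))
```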
Transforms shape the area of the trained image before showing it to CLIP for evaluation.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Adds a fixed value to all pixels.
Three numbers specify red, green and blue while a single number specifies a gray-scale color.
A gaussian blur is applied to the pixels.
See torchvision gaussian_blur.
list of length 2 of int
default: [3, 3]
expression variables: learnrate, resolution, time
The size of the pixel window. Must be an odd, positive integer.
Two numbers define width and height separately.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Gaussian kernel standard deviation. The larger, the more blurry.
If not specified it will default to 0.3 * ((kernel_size - 1) * 0.5 - 1) + 0.8.
Two numbers define sigma for x and y separately.
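The default sigma formula above is easy to evaluate for a few kernel sizes. The function name `default_sigma` is chosen for this example.

```python
def default_sigma(kernel_size: int) -> float:
    """Default standard deviation for a given (odd, positive) kernel size."""
    return 0.3 * ((kernel_size - 1) * 0.5 - 1) + 0.8


for kernel_size in (3, 5, 7, 9):
    print(kernel_size, default_sigma(kernel_size))
```

Each step of 2 in kernel size increases the default sigma by 0.3, so larger windows blur more strongly unless sigma is set explicitly.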
Draws a border on the edge of the image. The resolution is not changed.
list of length 2 of int
default: [1, 1]
expression variables: learnrate, resolution, time
One integer to specify width and height at the same time, or two integers to specify them separately.
list of length 3 of float
default: [0.0, 0.0, 0.0]
expression variables: learnrate, resolution, time
The color of the border as float numbers in the range [0, 1].
Three numbers for red, green and blue or a single number to specify a gray-scale.
float
no default
expression variables: learnrate, resolution, time
Adds gray-scale noise to the image.
The noise has a scalable normal distribution around zero.
Specifies the standard deviation of the noise distribution.
list of length 2 of int
no default
expression variables: learnrate, resolution, time
Crops an image of the given resolution from the center.
One integer for square images, two numbers to specify width and height.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Clamps the pixels into a fixed range.
First number is the minimum allowed value for all color channels, second is the maximum allowed value.
An image displayed on screen or converted to a file only includes values in the range of [0, 1].
float
default: 1.0
expression variables: learnrate, resolution, time
Adjust the image contrast.
In the target.transforms stage, decreasing the contrast before evaluation by CLIP will increase the dark and bright portions of the training image.
How much to adjust the contrast. Can be any non-negative number. 0 gives a solid gray image, 1 gives the original image while 2 increases the contrast by a factor of 2.
list of length 4 of float
no default
expression variables: learnrate, resolution, time
Crops a specified section from the image.
4 numbers: x and y of top-left corner followed by width and height.
A number between 0 and 1 is considered a fraction of the full resolution. A number greater than or equal to 1 is considered a pixel coordinate.
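How fractions and pixel coordinates could be resolved is sketched below. The helper `crop_box` is hypothetical, and rounding fractional results down to whole pixels is an assumption.

```python
def crop_box(xywh, resolution):
    """Resolve x, y, width, height to pixel values (hypothetical helper)."""
    width, height = resolution
    # x and width refer to the image width, y and height to the image height.
    full_sizes = (width, height, width, height)
    box = []
    for value, size in zip(xywh, full_sizes):
        if value < 1:
            # Fraction of the full resolution (assumption: rounded down).
            box.append(int(value * size))
        else:
            # Pixel coordinate, used as given.
            box.append(int(value))
    return box


print(crop_box([0.5, 0.5, 0.25, 100], resolution=(224, 224)))
```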
This removes everything except edges and generally has a bad effect on image quality. It might be useful, however.
A gaussian blur is used to detect the edges:
edge = amount * abs(image - blur(image))
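The formula can be tried on a single row of pixels. This sketch swaps the 2-D gaussian blur for a simplified 3-tap kernel (0.25, 0.5, 0.25) on a 1-D list, with the edge pixels repeated at the borders. Only the `amount * abs(image - blur(image))` part follows the formula above.

```python
def blur(row):
    """Simplified 1-D blur with kernel (0.25, 0.5, 0.25), a stand-in for a gaussian blur."""
    # Repeat the edge pixels so the output keeps the same length.
    padded = [row[0]] + list(row) + [row[-1]]
    return [
        0.25 * padded[i] + 0.5 * padded[i + 1] + 0.25 * padded[i + 2]
        for i in range(len(row))
    ]


def edge(row, amount=1.0):
    """edge = amount * abs(image - blur(image)), applied per pixel."""
    return [amount * abs(p - b) for p, b in zip(row, blur(row))]


# A hard step from dark to bright: only the pixels next to the step respond.
print(edge([0.0, 0.0, 0.0, 1.0, 1.0, 1.0]))
```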
list of length 2 of int
default: [3, 3]
expression variables: learnrate, resolution, time
The size of the pixel window used for gaussian blur. Must be an odd, positive integer.
Two numbers define width and height separately.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Gaussian kernel standard deviation. The larger, the more blurry.
If not specified it will default to 0.3 * ((kernel_size - 1) * 0.5 - 1) + 0.8.
Two numbers define sigma for x and y separately.
list of length 3 of float
default: [1.0, 1.0, 1.0]
expression variables: learnrate, resolution, time
A multiplier for the edge value. Three numbers to specify red, green and blue separately.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Adds noise to the image's Fourier space.
It's just a bit different from the normal noise.
The noise has a scalable normal distribution around zero.
Specifies the standard deviation of the noise distribution.
The actual value is multiplied by 15.0 to give a visually similar distribution as the normal noise.
One value or three values to specify red, green and blue separately.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Adjust the mean color value.
Three numbers specify red, green and blue while a single number specifies a gray-scale color.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Multiplies all pixels by a fixed value.
Three numbers specify red, green and blue while a single number specifies a gray-scale color.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Adds noise to the image.
The noise has a scalable normal distribution around zero.
Specifies the standard deviation of the noise distribution.
One value or three values to specify red, green and blue separately.
Pads the image with additional pixels at the borders.
list of length 2 of int
no default
expression variables: learnrate, resolution, time
The number of columns/rows to add.
One integer to specify x and y at the same time, or two integers to specify them separately.
E.g. 1, 2 would add one column left and one column right of the image and two rows on top and bottom respectively.
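Since the padding is added on both sides, the resulting resolution grows by twice the given size per axis. A short sketch, with `padded_resolution` as a hypothetical helper name:

```python
def padded_resolution(resolution, size):
    """Resolution after padding `size` columns/rows on each side (hypothetical helper)."""
    if isinstance(size, int):
        # One integer specifies x and y at the same time.
        size = (size, size)
    width, height = resolution
    pad_x, pad_y = size
    # Columns are added left and right, rows on top and bottom.
    return [width + 2 * pad_x, height + 2 * pad_y]


print(padded_resolution((224, 224), (1, 2)))
```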
list of length 3 of float
default: [0.0, 0.0, 0.0]
expression variables: learnrate, resolution, time
The color of the pixels that are padded around the image.
str
default: fill
expression variables: learnrate, resolution, time
The way the padded area is filled.
- fill: fills everything with the color value
- edge: repeats the edge pixels
- wrap: repeats the image from the opposite edge
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Quantize the color values.
This defines a fixed step-size for each color value.
Generally, do not use in target.transforms because it will remove the small gradient steps of the training. It might be useful in the post-processing stage.
The step-size. Three numbers specify red, green and blue while a single number specifies a gray-scale color.
list of length 2 of int
no default
expression variables: learnrate, resolution, time
Crops a section of the specified resolution from a random position in the image.
One integer for square images, two numbers to specify width and height.
Randomly rotates the image.
Degree and center of rotation are chosen randomly in the range of the specified values.
The resolution is not changed and areas outside of the image are filled with black (zero).
list of length 2 of float
no default
expression variables: learnrate, resolution, time
The minimum and maximum counter-clockwise angle of rotation in degrees.
list of length 2 of float
default: [0.5, 0.5]
expression variables: learnrate, resolution, time
The minimum and maximum center of rotation (for x and y) in the range [0, 1].
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Randomly scales an image in the range specified.
The resolution does not change, only contents are scaled. Areas outside of the image are filled with black (zero).
Minimum and maximum scale, where 0.5 means half and 2.0 means double.
list of length 2 or 4 of float
no default
expression variables: learnrate, resolution, time
This randomly translates the pixels of the image.
Pixels that are moved outside get attached on the other side.
Specifies the random range of translation.
A number larger than 1 or smaller than -1 translates by the actual pixels.
A number between -1 and 1 translates by the fraction of the image resolution.
E.g., shift: 0 1 would randomly translate the image to every possible position given its resolution.
Two numbers specify minimum and maximum shift for both axes, four numbers specify minimum and maximum shift for axis x and y respectively.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Randomly translates an image in the specified range.
The resolution does not change. Areas outside of the image are filled with black (zero).
Maximum absolute fraction for horizontal and vertical translations.
For example, with random_translate: a, b the horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and the vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b.
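The sampling ranges for dx and dy can be illustrated with Python's `random` module. The function `sample_translation` is a hypothetical name, and uniform sampling is an assumption, since only the ranges are stated above.

```python
import random


def sample_translation(resolution, fractions, rng):
    """Sample a (dx, dy) shift in pixels from the ranges given by a and b (hypothetical)."""
    img_width, img_height = resolution
    a, b = fractions
    # Assumption: both shifts are drawn uniformly from their range.
    dx = rng.uniform(-img_width * a, img_width * a)
    dy = rng.uniform(-img_height * b, img_height * b)
    return dx, dy


rng = random.Random(0)
for _ in range(3):
    print(sample_translation((224, 224), (0.1, 0.25), rng))
```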
list of length 2 of int
no default
expression variables: learnrate, resolution, time
Repeats the image a number of times in the right and bottom direction.
One integer to specify x and y at the same time, or two integers to specify them separately.
list of length 2 of int
no default
expression variables: learnrate, resolution, time
The resolution of the image is changed.
One integer for square images, two numbers to specify width and height.
Adds noise with a different resolution to the image.
The noise has a scalable normal distribution around zero.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Specifies the standard deviation of the noise distribution.
One value or three values to specify red, green and blue separately.
list of length 2 of int
no default
expression variables: learnrate, resolution, time
The resolution of the noise image. It will be resized to the processed image.
Rotates the image.
The resolution is not changed and areas outside of the image are filled with black (zero).
float
no default
expression variables: learnrate, resolution, time
The counter-clockwise angle of rotation in degrees ([0, 360]).
list of length 2 of float
default: [0.5, 0.5]
expression variables: learnrate, resolution, time
The center of rotation in the range [0, 1].
Two numbers to specify x and y separately.
float
default: 1.0
expression variables: learnrate, resolution, time
Adjust the image saturation.
In the target.transforms stage, decreasing the saturation before evaluation by CLIP will actually increase the contrast of the training image.
How much to adjust the saturation. 0 will give a black and white image, 1 will give the original image while 2 will enhance the saturation by a factor of 2.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
This translates the image while wrapping the edges around.
Pixels that are moved outside get attached on the other side.
A number larger than 1 or smaller than -1 translates by the actual pixels.
A number between -1 and 1 translates by the fraction of the image resolution.
E.g., shift: .5 would move the center of the image to the previous bottom-right corner.
A single number specifies translation on both x and y axes while two numbers specify them separately.
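The wrap-around behaviour can be sketched on a single row of pixels. The helper `shift_row` is hypothetical. It treats values from -1 to 1 inclusive as fractions, which is an assumption for the boundary values, and rounds fractional offsets towards zero.

```python
def shift_row(row, shift):
    """Translate a row of pixels with wrap-around (hypothetical helper)."""
    n = len(row)
    if -1 <= shift <= 1:
        # Fraction of the resolution (assumption: -1 and 1 count as fractions).
        offset = int(shift * n)
    else:
        # Actual pixels.
        offset = int(shift)
    offset %= n
    if offset == 0:
        return list(row)
    # Pixels pushed out on the right re-enter on the left.
    return list(row[-offset:]) + list(row[:-offset])


row = [0, 1, 2, 3, 4, 5, 6, 7]
print(shift_row(row, 0.5))
print(shift_row(row, 2))
print(shift_row(row, -0.25))
```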
A list of post-processing effects that are applied every epoch and change the image pixels directly without interfering with the backpropagation stage.
All transforms that do not change the resolution are available as post processing effects.
bool
default: True
A boolean to turn off the post-processing stage during development.
This is just a convenience parameter. To turn off a stage during testing without deleting all the parameters, simply put active: false inside.
int, float
default: 0.0
Start frame for the post-processing stage. The stage is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch.
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs.
int, float
default: 1.0
End frame for the post-processing stage. The stage is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch.
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Adds a fixed value to all pixels.
Three numbers specify red, green and blue while a single number specifies a gray-scale color.
A gaussian blur is applied to the pixels.
See torchvision gaussian_blur.
list of length 2 of int
default: [3, 3]
expression variables: learnrate, resolution, time
The size of the pixel window. Must be an odd, positive integer.
Two numbers define width and height separately.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Gaussian kernel standard deviation. The larger, the more blurry.
If not specified it will default to 0.3 * ((kernel_size - 1) * 0.5 - 1) + 0.8.
Two numbers define sigma for x and y separately.
Draws a border on the edge of the image. The resolution is not changed.
list of length 2 of int
default: [1, 1]
expression variables: learnrate, resolution, time
One integer to specify width and height at the same time, or two integers to specify them separately.
list of length 3 of float
default: [0.0, 0.0, 0.0]
expression variables: learnrate, resolution, time
The color of the border as float numbers in the range [0, 1].
Three numbers for red, green and blue or a single number to specify a gray-scale.
float
no default
expression variables: learnrate, resolution, time
Adds gray-scale noise to the image.
The noise has a scalable normal distribution around zero.
Specifies the standard deviation of the noise distribution.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Clamps the pixels into a fixed range.
First number is the minimum allowed value for all color channels, second is the maximum allowed value.
An image displayed on screen or converted to a file only includes values in the range of [0, 1].
float
default: 1.0
expression variables: learnrate, resolution, time
Adjust the image contrast.
In the target.transforms stage, decreasing the contrast before evaluation by CLIP will increase the dark and bright portions of the training image.
How much to adjust the contrast. Can be any non-negative number. 0 gives a solid gray image, 1 gives the original image while 2 increases the contrast by a factor of 2.
This removes everything except edges and generally has a bad effect on image quality. It might be useful, however.
A gaussian blur is used to detect the edges:
edge = amount * abs(image - blur(image))
list of length 2 of int
default: [3, 3]
expression variables: learnrate, resolution, time
The size of the pixel window used for gaussian blur. Must be an odd, positive integer.
Two numbers define width and height separately.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Gaussian kernel standard deviation. The larger, the more blurry.
If not specified it will default to 0.3 * ((kernel_size - 1) * 0.5 - 1) + 0.8.
Two numbers define sigma for x and y separately.
list of length 3 of float
default: [1.0, 1.0, 1.0]
expression variables: learnrate, resolution, time
A multiplier for the edge value. Three numbers to specify red, green and blue separately.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Adds noise to the image's Fourier space.
It's just a bit different from the normal noise.
The noise has a scalable normal distribution around zero.
Specifies the standard deviation of the noise distribution.
The actual value is multiplied by 15.0 to give a visually similar distribution as the normal noise.
One value or three values to specify red, green and blue separately.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Adjust the mean color value.
Three numbers specify red, green and blue while a single number specifies a gray-scale color.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Multiplies all pixels by a fixed value.
Three numbers specify red, green and blue while a single number specifies a gray-scale color.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Adds noise to the image.
The noise has a scalable normal distribution around zero.
Specifies the standard deviation of the noise distribution.
One value or three values to specify red, green and blue separately.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Quantize the color values.
This defines a fixed step-size for each color value.
Generally, do not use in target.transforms because it will remove the small gradient steps of the training. It might be useful in the post-processing stage.
The step-size. Three numbers specify red, green and blue while a single number specifies a gray-scale color.
Randomly rotates the image.
Degree and center of rotation are chosen randomly in the range of the specified values.
The resolution is not changed and areas outside of the image are filled with black (zero).
list of length 2 of float
no default
expression variables: learnrate, resolution, time
The minimum and maximum counter-clockwise angle of rotation in degrees.
list of length 2 of float
default: [0.5, 0.5]
expression variables: learnrate, resolution, time
The minimum and maximum center of rotation (for x and y) in the range [0, 1].
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Randomly scales an image in the range specified.
The resolution does not change, only contents are scaled. Areas outside of the image are filled with black (zero).
Minimum and maximum scale, where 0.5 means half and 2.0 means double.
list of length 2 or 4 of float
no default
expression variables: learnrate, resolution, time
This randomly translates the pixels of the image.
Pixels that are moved outside get attached on the other side.
Specifies the random range of translation.
A number larger than 1 or smaller than -1 translates by the actual pixels.
A number between -1 and 1 translates by the fraction of the image resolution.
E.g., shift: 0 1 would randomly translate the image to every possible position given its resolution.
Two numbers specify minimum and maximum shift for both axes, four numbers specify minimum and maximum shift for axis x and y respectively.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
Randomly translates an image in the specified range.
The resolution does not change. Areas outside of the image are filled with black (zero).
Maximum absolute fraction for horizontal and vertical translations.
For example, with random_translate: a, b the horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and the vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b.
Adds noise with a different resolution to the image.
The noise has a scalable normal distribution around zero.
list of length 3 of float
no default
expression variables: learnrate, resolution, time
Specifies the standard deviation of the noise distribution.
One value or three values to specify red, green and blue separately.
list of length 2 of int
no default
expression variables: learnrate, resolution, time
The resolution of the noise image. It will be resized to the processed image.
Rotates the image.
The resolution is not changed and areas outside of the image are filled with black (zero).
float
no default
expression variables: learnrate, resolution, time
The counter-clockwise angle of rotation in degrees ([0, 360]).
list of length 2 of float
default: [0.5, 0.5]
expression variables: learnrate, resolution, time
The center of rotation in the range [0, 1].
Two numbers to specify x and y separately.
float
default: 1.0
expression variables: learnrate, resolution, time
Adjust the image saturation.
In the target.transforms stage, decreasing the saturation before evaluation by CLIP will actually increase the contrast of the training image.
How much to adjust the saturation. 0 will give a black and white image, 1 will give the original image while 2 will enhance the saturation by a factor of 2.
list of length 2 of float
no default
expression variables: learnrate, resolution, time
This translates the image while wrapping the edges around.
Pixels that are moved outside get attached on the other side.
A number larger than 1 or smaller than -1 translates by the actual pixels.
A number between -1 and 1 translates by the fraction of the image resolution.
E.g., shift: .5 would move the center of the image to the previous bottom-right corner.
A single number specifies translation on both x and y axes while two numbers specify them separately.
Constraints do influence the trained image without using CLIP.
They only affect the pixels that are processed by the transforms of the target.
Adds the difference between the image and a blurred version to the training loss.
This is much more helpful than using the gaussian blur as a post-processing step. When added to the training loss, the blurring keeps in balance with the actual image creation.
Areas that CLIP is excited about will be constantly updated and will stand out of the blur, while unexciting areas get blurred a lot.
list of length 2 of int
default: [3, 3]
expression variables: learnrate, resolution, target constraint, time
The size of the pixel window. Must be an odd, positive integer.
Two numbers define width and height separately.
list of length 2 of float
no default
expression variables: learnrate, resolution, target constraint, time
Gaussian kernel standard deviation. The larger, the more blurry.
If not specified it will default to 0.3 * ((kernel_size - 1) * 0.5 - 1) + 0.8.
Two numbers define sigma for x and y separately.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch.
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs.
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
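The three ways of writing a start or end time can be reduced to a single epoch number. This is a sketch under stated assumptions: to_epoch and is_active are hypothetical names, the percent form is assumed to arrive as a string such as "23.5%", and both boundaries are assumed to be inclusive, which the text does not specify.

```python
def to_epoch(value, num_epochs: int) -> float:
    """Convert a start/end value to an epoch number (hypothetical helper).

    int   -> taken as the epoch frame itself
    float -> ratio of the total number of epochs (1.0 is the final epoch)
    str   -> percentage such as "23.5%" of the number of epochs
    """
    if isinstance(value, str):
        text = value.strip()
        if not text.endswith("%"):
            raise ValueError("string times must end with '%'")
        return float(text[:-1]) / 100.0 * num_epochs
    if isinstance(value, bool):
        raise TypeError("bool is not a valid time value")
    if isinstance(value, int):
        return float(value)
    if isinstance(value, float):
        return value * num_epochs
    raise TypeError("unsupported time value: %r" % (value,))


def is_active(epoch: int, start, end, num_epochs: int) -> bool:
    """True if the constraint applies at this epoch (inclusive bounds assumed)."""
    return to_epoch(start, num_epochs) <= epoch <= to_epoch(end, num_epochs)


print(to_epoch(50, 300), to_epoch(0.5, 300), to_epoch("23.5%", 300))
print(is_active(10, 50, 1.0, 300), is_active(100, 50, 1.0, 300))
```

Note that 1 and 1.0 mean very different things under this scheme: the int is epoch one, the float is the final epoch.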
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
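The difference between the two loss functions is easy to check numerically. The sketch below uses plain Python lists instead of image tensors, and the function names mae and mse are chosen for illustration only.

```python
from typing import Sequence


def mae(current: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error (l1): mean of the absolute differences."""
    return sum(abs(c - t) for c, t in zip(current, target)) / len(current)


def mse(current: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error (l2): mean of the squared differences."""
    return sum((c - t) ** 2 for c, t in zip(current, target)) / len(current)


target = [0.0, 0.0]
small = [0.1, 0.1]   # differences below 1.0
large = [2.0, 2.0]   # differences above 1.0

print("small:", mae(small, target), mse(small, target))
print("large:", mae(large, target), mse(large, target))
```

For the small differences the squared error is the smaller of the two, for the large differences it is the larger one, which is the behaviour described above.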
Adds a border with a specific size and color to the training loss.
list of length 2 of int
default: [1, 1]
expression variables: learnrate, resolution, target constraint, time
One integer to specify width and height at the same time, or two integers to specify them separately.
list of length 3 of float
default: [0.0, 0.0, 0.0]
expression variables: learnrate, resolution, target constraint, time
The color of the border as float numbers in the range [0, 1].
Three numbers for red, green and blue or a single number to specify a gray-scale.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
Pushes the contrast above or below a threshold value.
The contrast is currently calculated in the following way:
The image pixels are divided into the ones that are above and below the pixel mean values. The contrast value is then the difference between the mean of the lower and the mean of the higher pixels.
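The contrast calculation described above can be sketched for a single color channel. This is an illustration under assumptions, not the tool's implementation: the function name contrast is hypothetical, pixels exactly equal to the mean are assigned to the higher group, and a completely flat image is given a contrast of 0.0.

```python
from typing import Sequence


def contrast(pixels: Sequence[float]) -> float:
    """Mean of the pixels above the mean minus mean of the pixels below it."""
    mean = sum(pixels) / len(pixels)
    low = [p for p in pixels if p < mean]
    high = [p for p in pixels if p >= mean]  # assumption: ties go to the higher group
    if not low or not high:
        return 0.0  # assumption: a flat channel has no contrast
    return sum(high) / len(high) - sum(low) / len(low)


print(contrast([0.0, 0.0, 1.0, 1.0]))  # pure black and white
print(contrast([0.5, 0.5, 0.5, 0.5]))  # flat gray
print(contrast([0.2, 0.4, 0.6, 0.8]))  # soft gradient
```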
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is below the above value.
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is above the below value.
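The two thresholds can be read as a one-sided penalty on each side. The following sketch assumes that the loss is the squared distance to whichever threshold is violated, in line with the default l2 loss; the exact formula is not stated in the text, and threshold_loss is a hypothetical name.

```python
from typing import Optional


def threshold_loss(value: float,
                   above: Optional[float] = None,
                   below: Optional[float] = None) -> float:
    """Penalty for a value that violates the `above` or `below` threshold.

    Assumption: the squared distance to the violated threshold is used,
    and a value that satisfies both thresholds contributes no loss.
    """
    loss = 0.0
    if above is not None and value < above:
        loss += (above - value) ** 2
    if below is not None and value > below:
        loss += (value - below) ** 2
    return loss


print(threshold_loss(0.3, above=0.5))             # too low, penalized
print(threshold_loss(0.7, above=0.5))             # high enough, no loss
print(threshold_loss(0.9, above=0.5, below=0.6))  # too high, penalized
```

Setting both values keeps the measured quantity inside a band, while setting only one pushes it in a single direction.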
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
Adds the difference between the current image and an edge-detected version to the training loss.
A gaussian blur is used to detect the edges:
edge = amount * abs(image - blur(image))
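The edge formula above can be tried on a one-dimensional signal. This is a simplified sketch: the helper names blur_1d and edge_1d are hypothetical, the 3-tap kernel [0.25, 0.5, 0.25] stands in for a real Gaussian kernel, and samples outside the signal are assumed to repeat the border value.

```python
from typing import List, Sequence


def blur_1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Blur a 1D signal with an odd-sized kernel, clamping at the borders."""
    half = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - half, 0), last)  # repeat the border sample
            acc += weight * signal[j]
        out.append(acc)
    return out


def edge_1d(signal: Sequence[float], kernel: Sequence[float],
            amount: float = 1.0) -> List[float]:
    """edge = amount * abs(image - blur(image)), applied to a 1D signal."""
    blurred = blur_1d(signal, kernel)
    return [amount * abs(s - b) for s, b in zip(signal, blurred)]


kernel = [0.25, 0.5, 0.25]
print(edge_1d([0.0, 0.0, 0.0, 1.0, 1.0, 1.0], kernel))  # responds at the step
print(edge_1d([0.5, 0.5, 0.5, 0.5], kernel))            # flat: no edges
```

The response is non-zero only where the signal changes, since blurring leaves flat regions untouched.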
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is below the above value.
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is above the below value.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
list of length 2 of int
default: [3, 3]
expression variables: learnrate, resolution, target constraint, time
The size of the pixel window of the gaussian blur. Must be an odd, positive integer.
Two numbers define width and height separately.
list of length 2 of float
no default
expression variables: learnrate, resolution, target constraint, time
Gaussian kernel standard deviation. The larger, the more blurry.
If not specified it will default to 0.3 * ((kernel_size - 1) * 0.5 - 1) + 0.8.
Two numbers define sigma for x and y separately.
Pushes the image color mean above or below a threshold value.
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is below the above value.
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is above the below value.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
Adds the difference between the current image and a noisy image to the training loss.
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
Specifies the standard deviation of the noise distribution.
One value or three values to specify red, green and blue separately.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
Adds image normalization to the training loss.
The normalized version is found by moving the image colors into the range of min and max.
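Moving the image colors into the range of min and max can be sketched as a linear rescale of one channel. The names normalize and normalization_loss are hypothetical, the handling of a flat channel is an assumption, and the loss is assumed to be the mean squared error between the image and its normalized version, matching the default l2 loss.

```python
from typing import List, Sequence


def normalize(pixels: Sequence[float], lo: float = 0.0, hi: float = 1.0) -> List[float]:
    """Linearly rescale the values so the smallest maps to lo and the largest to hi."""
    cur_min, cur_max = min(pixels), max(pixels)
    if cur_max == cur_min:
        return [lo for _ in pixels]  # assumption: a flat channel collapses to lo
    scale = (hi - lo) / (cur_max - cur_min)
    return [lo + (p - cur_min) * scale for p in pixels]


def normalization_loss(pixels: Sequence[float], lo: float = 0.0, hi: float = 1.0) -> float:
    """Mean squared error between the channel and its normalized version."""
    target = normalize(pixels, lo, hi)
    return sum((p - t) ** 2 for p, t in zip(pixels, target)) / len(pixels)


washed_out = [0.2, 0.4, 0.6]
print(normalize(washed_out))
print(normalization_loss(washed_out))
print(normalization_loss([0.0, 0.5, 1.0]))  # already spans the full range
```

An image that already spans the requested range produces no loss, so the constraint only acts on images that are too flat or too compressed.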
list of length 3 of float
default: [0.0, 0.0, 0.0]
expression variables: learnrate, resolution, target constraint, time
The desired lowest value in the image.
One color for gray-scale, three colors for red, green and blue.
list of length 3 of float
default: [1.0, 1.0, 1.0]
expression variables: learnrate, resolution, target constraint, time
The desired highest value in the image.
One color for gray-scale, three colors for red, green and blue.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
Pushes the saturation above or below a threshold value.
The saturation is currently calculated as the difference of each color channel to the mean of all channels.
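The saturation measure described above can be sketched for RGB pixels. This is an illustration under assumptions: the names pixel_saturation and image_saturation are hypothetical, the absolute difference is used, and the per-pixel values are averaged over the image to get one value per channel.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def pixel_saturation(rgb: Pixel) -> List[float]:
    """Absolute difference of each channel to the mean of all three channels."""
    mean = sum(rgb) / len(rgb)
    return [abs(channel - mean) for channel in rgb]


def image_saturation(pixels: Sequence[Pixel]) -> List[float]:
    """Per-channel saturation averaged over all pixels (assumed aggregation)."""
    per_pixel = [pixel_saturation(p) for p in pixels]
    count = len(per_pixel)
    return [sum(s[ch] for s in per_pixel) / count for ch in range(3)]


print(image_saturation([(0.5, 0.5, 0.5)]))                   # gray: no saturation
print(image_saturation([(1.0, 0.0, 0.0)]))                   # pure red
print(image_saturation([(1.0, 0.0, 0.0), (0.5, 0.5, 0.5)]))  # mixed image
```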
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is below the above value.
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is above the below value.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.
Pushes the standard deviation above or below a threshold value.
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is below the above value.
list of length 3 of float
no default
expression variables: learnrate, resolution, target constraint, time
If specified, the training loss increases if the current value is above the below value.
float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
A multiplier for the resulting loss value of the constraint.
int, float
default: 0.0
expression variables: learnrate, resolution, target constraint, time
Start frame of the constraints. The constraint is inactive before this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
int, float
default: 1.0
expression variables: learnrate, resolution, target constraint, time
End frame of the constraints. The constraint is inactive after this time.
- an int number defines the time as epoch frame
- a float number defines the time as ratio between 0.0 and 1.0, where 1.0 is the final epoch
- percent (e.g. 23.5%) defines the time as percentage of the number of epochs
str
default: l2
expression variables: learnrate, resolution, target constraint, time
The loss function used to calculate the difference (or error) between current and desired image.
- l1 or mae: Mean absolute error is the mean of the absolute difference of each vector variable.
- l2 or mse: Mean squared error is the mean of the squared difference of each vector variable. Compared to mean absolute error, it produces a smaller loss for small differences (below 1.0) and a larger loss for large differences.