Specifying an Architecture

Clemens-Alexander Brust edited this page Sep 18, 2015 · 7 revisions

For maximum flexibility without creating further dependencies, CN24 uses its own language to specify neural network architectures.

To design a new network architecture, create an empty file (preferred extension: .net).

Adding Layers

Layers are specified one at a time, in the order in which they are evaluated during a forward pass. CN24 supports several types of layers, each with its own set of parameters. You can add your own layer types to the language by modifying the CN24 source code.

In general, a layer is specified by a single line beginning with a question mark, followed by the type of layer and an optional list of parameters:

?layer_type param1=a param2=b
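
To make the line format concrete, here is a minimal Python sketch that splits such a line into a layer type and a parameter dictionary. It is only an illustration of the syntax described above, not CN24's actual parser; the function name parse_layer_line is hypothetical, and parameter values are kept as plain strings.

```python
def parse_layer_line(line):
    """Split a layer line such as '?convolutional size=5x5 kernels=8'
    into (layer_type, params). Returns None for lines that do not
    start with a question mark (hypothetical helper, not part of CN24)."""
    line = line.strip()
    if not line.startswith("?"):
        return None
    tokens = line[1:].split()
    if not tokens:
        raise ValueError("layer line has no layer type")
    layer_type, params = tokens[0], {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {token!r}")
        params[key] = value  # values stay strings, e.g. '5x5'
    return layer_type, params


print(parse_layer_line("?convolutional size=5x5 kernels=8"))
# ('convolutional', {'size': '5x5', 'kernels': '8'})
```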

Convolutional Layers

In a fully convolutional network, these are the only layers that have weights. The weights form a set of three-dimensional convolution kernels, which are applied as a "valid" convolution (no padding, so the output is smaller than the input). Convolutional layers are specified using the following command:

?convolutional size=5x5 kernels=8

The size parameter describes the width and height of the individual convolution kernels. The third dimension is set up automatically, because it has to match the number of feature maps in the previous layer. The kernels parameter specifies the number of three-dimensional convolution kernels in the layer, which is also the number of feature maps the layer outputs.
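
As a rough illustration of how these two parameters determine the output, the following Python sketch computes the output shape of a "valid" convolution with stride 1. The function name and the 32x32 input are made up for this example; the formula is the standard one for valid convolutions, not something read out of CN24's code.

```python
def valid_conv_output_shape(width, height, kernel_w, kernel_h, kernels):
    """Output (width, height, maps) of a stride-1 'valid' convolution.
    Hypothetical helper for illustration only."""
    if kernel_w > width or kernel_h > height:
        raise ValueError("kernel is larger than the input")
    # A valid convolution only places the kernel where it fits entirely,
    # so each spatial dimension shrinks by (kernel size - 1).
    return (width - kernel_w + 1, height - kernel_h + 1, kernels)


# ?convolutional size=5x5 kernels=8 applied to a 32x32 input
print(valid_conv_output_shape(32, 32, 5, 5, 8))  # (28, 28, 8)
```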

Note: Odd numbers are preferred for the size parameter because they lead to receptive fields with even dimensions, which are easier to process. CN24 does not enforce this, however.

There are advanced options similar to the ones provided by Caffe:

?convolutional size=5x5 kernels=8 group=2 stride=2x2
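
The sketch below shows what these options would mean if CN24 follows Caffe's semantics, which is an assumption here: stride skips kernel positions, and group splits input and output feature maps into independent groups. Both function names are hypothetical.

```python
def strided_conv_output_size(input_size, kernel_size, stride):
    """Output size along one dimension for an unpadded convolution,
    using the formula Caffe uses (assumed to apply to CN24 as well)."""
    if kernel_size > input_size:
        raise ValueError("kernel is larger than the input")
    return (input_size - kernel_size) // stride + 1


def grouped_conv_weight_count(input_maps, kernels, kernel_w, kernel_h, group):
    """Number of kernel weights when each kernel only sees the
    input maps of its own group (Caffe-style grouping, assumed)."""
    if input_maps % group or kernels % group:
        raise ValueError("group must divide input maps and kernels")
    return kernels * (input_maps // group) * kernel_w * kernel_h


# size=5x5 stride=2x2 on a 32-pixel-wide input
print(strided_conv_output_size(32, 5, 2))          # 14
# kernels=8 group=2 on 4 input feature maps, versus group=1
print(grouped_conv_weight_count(4, 8, 5, 5, 2))    # 400
print(grouped_conv_weight_count(4, 8, 5, 5, 1))    # 800
```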

There is a special command for convolutional layers with 1x1 kernels:

?fullyconnected neurons=100

In a network that processes individual patches, a 1x1 convolutional layer is equivalent to a fully connected layer. The neurons parameter corresponds to the kernels parameter of the convolutional layer.
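
The equivalence can be seen in a small pure-Python sketch: a 1x1 convolution applies the same fully connected layer to the channel vector at every pixel, so on a 1x1 input the two coincide. The function names and the nested-list data layout are assumptions made for this example, and biases are left out for brevity.

```python
def fully_connected(inputs, weights):
    """weights[k][c] is the weight from input c to neuron k."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def conv1x1(feature_maps, weights):
    """feature_maps[c][y][x] -> output[k][y][x] using 1x1 kernels.
    Each pixel's channel vector goes through the same weights."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    output = [[[0.0] * width for _ in range(height)] for _ in weights]
    for y in range(height):
        for x in range(width):
            pixel = [fm[y][x] for fm in feature_maps]
            for k, value in enumerate(fully_connected(pixel, weights)):
                output[k][y][x] = value
    return output


weights = [[0.5, -1.0], [2.0, 0.0]]   # 2 neurons, 2 input maps
patch = [[[1.0]], [[2.0]]]            # 2 feature maps of size 1x1
print(fully_connected([1.0, 2.0], weights))  # [-1.5, 2.0]
print(conv1x1(patch, weights))               # [[[-1.5]], [[2.0]]]
```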

Maximum Pooling Layers

Spatial pooling layers divide their input into equally sized regions. Each output pixel summarizes one region with a single value; maximum pooling uses the largest value in the region. Pooling is applied to each feature map separately. Maximum pooling layers are specified using the following command:

?maxpooling size=2x2
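
A minimal sketch of the operation on a single feature map, written in plain Python. It assumes non-overlapping regions and an input that divides evenly into them; how CN24 treats inputs that do not divide evenly is not covered here. The function name is hypothetical.

```python
def max_pool(feature_map, region_w, region_h):
    """Maximum pooling over non-overlapping region_w x region_h regions
    of one feature map given as a list of rows."""
    height, width = len(feature_map), len(feature_map[0])
    if height % region_h or width % region_w:
        raise ValueError("input must divide evenly into regions")
    return [
        [
            max(
                feature_map[y + dy][x + dx]
                for dy in range(region_h)
                for dx in range(region_w)
            )
            for x in range(0, width, region_w)
        ]
        for y in range(0, height, region_h)
    ]


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(max_pool(feature_map, 2, 2))  # [[6, 8], [14, 16]]
```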

Nonlinearities

CN24 supports the most common nonlinear activation functions. Use one of the following commands to add a nonlinearity layer:

?relu
?sigm
?tanh
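
For reference, these are the usual definitions of the three functions as a Python sketch. It assumes that sigm denotes the logistic sigmoid, which the command name suggests but this page does not state.

```python
import math


def relu(x):
    """Rectified linear unit: negative inputs are clamped to zero."""
    return max(0.0, x)


def sigm(x):
    """Logistic sigmoid (assumed meaning of ?sigm), written so that
    large negative inputs do not overflow math.exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent, mapping inputs to the range (-1, 1)."""
    return math.tanh(x)


print(relu(-2.0), sigm(0.0), tanh(0.0))  # 0.0 0.5 0.0
```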

Special layers

Spatial prior

During the research for our paper "Convolutional Patch Networks with Spatial Prior for Road Detection and Urban Scene Understanding", we found that adding spatial context in the form of coordinates improves segmentation performance for certain tasks. The spatial prior layer adds two feature maps, one for the horizontal and one for the vertical pixel coordinates. Use the following command to add a spatial prior layer:

?spatialprior
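
The two extra feature maps can be pictured with the following Python sketch. Scaling the coordinates to the range 0 to 1 is an assumption made for this example; the page does not say how CN24 normalizes them. The function name is hypothetical.

```python
def spatial_prior_maps(width, height):
    """Return (horizontal, vertical) coordinate maps, each a list of
    `height` rows with `width` values scaled to [0, 1] (assumed scaling)."""
    def normalize(index, size):
        return index / (size - 1) if size > 1 else 0.0

    horizontal = [[normalize(x, width) for x in range(width)]
                  for _ in range(height)]
    vertical = [[normalize(y, height) for _ in range(width)]
                for y in range(height)]
    return horizontal, vertical


horizontal, vertical = spatial_prior_maps(3, 2)
print(horizontal)  # [[0.0, 0.5, 1.0], [0.0, 0.5, 1.0]]
print(vertical)    # [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
```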

Hyperparameters

The default hyperparameters used during training can be overridden in the configuration file. This is recommended, because optimal hyperparameters are highly dependent on the architecture and training data. The following is an example section containing the default values:

l1=0.001
l2=0.0005
lr=0.0001
gamma=0.003
momentum=0.9
exponent=0.75
iterations=500
sbatchsize=4
pbatchsize=4

Please read the reference article on hyperparameters for more information.
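
To give a feeling for how lr, gamma and exponent might interact, the sketch below assumes a decay schedule of the same form as Caffe's "inv" learning rate policy, lr * (1 + gamma * iteration)^(-exponent). Whether CN24 uses exactly this formula is an assumption; the reference article is the authoritative source. The function name is hypothetical.

```python
def learning_rate(iteration, lr=0.0001, gamma=0.003, exponent=0.75):
    """Learning rate after `iteration` updates under an assumed
    inverse-decay schedule; defaults are the values listed above."""
    return lr * (1.0 + gamma * iteration) ** (-exponent)


for iteration in (0, 500, 5000):
    print(iteration, learning_rate(iteration))
```

Under this assumption, the rate starts at lr and decreases monotonically; larger gamma or exponent values make it fall faster.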

Example

This is one of the configurations used for road detection on the KITTI dataset. You can use this example as a starting point for your own configurations:

# Sample CNN for KITTI Dataset

# Network configuration
?convolutional kernels=12 size=7x7
?maxpooling size=2x2
?relu

?convolutional size=5x5 kernels=6
?relu

?convolutional size=5x5 kernels=48
?relu

?fullyconnected neurons=192
?relu

?fullyconnected neurons=(o)
?output

# Learning settings
l1=0.001
l2=0.0005
lr=0.0001
gamma=0.003
momentum=0.9
exponent=0.75
iterations=500
sbatchsize=4
pbatchsize=4
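
When adapting this example, it can help to know how large an input region a single output pixel depends on. The following Python sketch computes that receptive field for the layer stack above. It assumes stride 1 for the convolutions and non-overlapping pooling (stride equal to the region size), and treats ?fullyconnected as a 1x1 convolution as described earlier. The function name is hypothetical.

```python
def receptive_field(layers):
    """Receptive field width of one output pixel.
    `layers` is a list of (kernel_size, stride) pairs in forward order."""
    field, jump = 1, 1
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump  # growth in input pixels
        jump *= stride                     # input pixels per output step
    return field


kitti_layers = [
    (7, 1),  # ?convolutional kernels=12 size=7x7
    (2, 2),  # ?maxpooling size=2x2
    (5, 1),  # ?convolutional size=5x5 kernels=6
    (5, 1),  # ?convolutional size=5x5 kernels=48
    (1, 1),  # ?fullyconnected neurons=192
    (1, 1),  # ?fullyconnected neurons=(o)
]
print(receptive_field(kitti_layers))  # 24
```

Under these assumptions, each output pixel of the example network depends on a 24x24 input region.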