
Tutorial matrix #280

Open · ghost opened this issue Jan 23, 2021 · 5 comments

ghost commented Jan 23, 2021

This issue is mainly to get a discussion started about what the end state (content-wise) of the model zoo should be after the refresh. Not to skip steps in the list from FluxML/FluxML-Community-Call-Minutes#9, but we should have this at least somewhat settled before starting to convert, add, or trim any of the tutorials.

Below I started a matrix to see what the coverage of problems, datasets and highlighted features will be. Ideally we fill it mostly with the models already in the zoo and come up with a few new tutorial ideas to fill any gaps. The rows added so far are merely suggestions, feel free to edit.

| Problem | Model | Dataset | Highlighted features / ecosystem | Priority |
| --- | --- | --- | --- | --- |
| **First steps** | | | | |
| Basics (60 min blitz) | CNN | MLDatasets.CIFAR10 | High level basics | 1 |
| Linear regression | - | MLDatasets.BostonHousing | Low level API | 2 |
| Classification | - | MLDatasets.Iris | Dataset processing with MLDataPattern.jl | 3 |
| **Images** | | | | |
| Image classification | LeNet5 | MLDatasets.MNIST | Convolutional networks, model saving/loading | 1 |
| Image classification | VGG (Metalhead) | MLDatasets.CIFAR10 | Using a pretrained model | 3 |
| Transfer learning | ResNet (Metalhead) | ? | Transfer learning | 1 |
| Latent feature learning | VAE | ? | Custom loss function | 3 |
| Image generation | DCGAN | MLDatasets.MNIST | Complex training function | 2 |
| 3D vision | - | - | Flux3D.jl | external |
| **Text** | | | | |
| Language detection | LSTM | Custom (Europarl?) | Custom dataset | 2 |
| Text generation | Char RNN | ? | ? | 3 |
| ? | BERT | ? | Transformers.jl | 3 |
| **Differentiable programming** | | | | |
| Ray tracing | - | - | Integrating with external packages | external |
| **Other** | | | | |
| Adversarial training | FGSM | MLDatasets.? | Optimization of input | 3 |
| Hierarchical multiple-instance learning | ? | ? | Mill.jl | 3 |

Features to highlight:

  • MLDataPattern.jl
  • TensorBoardLogger.jl
  • GeometricFlux.jl

Priorities range from 1 (need to have) to 3 (nice to have).
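
To calibrate what "High level basics" means for the blitz row in the table above, here is a minimal sketch of the level of API such a tutorial would cover. The layer sizes, hyperparameters and the exact MLDatasets calls are placeholders of my own, not taken from an existing zoo script:

```julia
using Flux, MLDatasets
using Flux.Data: DataLoader
using Flux.Losses: logitcrossentropy

# CIFAR10 training set; assumes the classic MLDatasets API (32×32×3×N Float32 array, labels 0:9).
xtrain, ytrain = MLDatasets.CIFAR10.traindata(Float32)
loader = DataLoader((xtrain, Flux.onehotbatch(ytrain, 0:9)), batchsize = 128, shuffle = true)

# A small CNN, just to show the high-level Chain/Conv/Dense style the blitz would teach.
model = Chain(
    Conv((3, 3), 3 => 16, relu, pad = 1),
    MaxPool((2, 2)),
    Conv((3, 3), 16 => 32, relu, pad = 1),
    MaxPool((2, 2)),
    Flux.flatten,
    Dense(8 * 8 * 32, 10),
)

loss(x, y) = logitcrossentropy(model(x), y)
Flux.train!(loss, Flux.params(model), loader, ADAM(3e-4))
```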

@DhairyaLGandhi (Member)

Before we get into this, could we get an idea of which models remain to be updated?

@ghost (Author) commented Jan 23, 2021

The list of open issues is in #266; in short, they either run or need major work to get them running.

But I strongly prefer to have the direction settled before solving any of those remaining issues. The problems are often in parts that are not strictly necessary and could thus be resolved simply by removing those parts from the model.

Take the language detection model as an example. Its dataset handling should be improved, but there are two options: switch to a different, more "standard" dataset, or keep the custom dataset creation. If in the end we don't want to focus on building a dataset, that part of the code should simply be removed, and it would be a waste of effort to improve it now.

@DhairyaLGandhi (Member)

Sure. I would add the following:

  • VAE - custom loss
  • CPPN - training params directly
  • Conv - custom training loop
  • CIFAR10 - standard datasets, pretrained models
  • VGG - sequential chain
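
Since a couple of those bullets ("training params directly", "custom training loop") describe the same handful of Flux idioms, here is a minimal sketch of what such a tutorial would demonstrate. The model and data are toys of my own, nothing taken from the zoo scripts:

```julia
using Flux

# Toy model and synthetic data, just to show the pattern.
model = Chain(Dense(10, 32, relu), Dense(32, 1))
data  = [(randn(Float32, 10, 64), randn(Float32, 1, 64)) for _ in 1:100]

loss(x, y) = Flux.Losses.mse(model(x), y)
opt = ADAM()
ps  = Flux.params(model)              # the trainable params, handled directly

for (x, y) in data                    # a custom loop instead of Flux.train!
    gs = gradient(() -> loss(x, y), ps)
    Flux.Optimise.update!(opt, ps, gs)
end
```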

We should add examples from the raytracer and Flux3D for some of the more recent differentiable programming problems, even if only as stubs pointing to the right places. cc @avik-pal @nirmal-suthar

@avik-pal (Member)

For reference:

  1. Examples for RayTracer.jl --> https://github.com/avik-pal/RayTracer.jl#tutorials
  2. Examples for Flux3D.jl --> https://github.com/FluxML/Flux3D.jl#examples

I think stubs pointing to the right examples are the way to go; otherwise it becomes very tedious to keep the same example up to date in multiple places after any breaking change.

@pevnak commented Jan 23, 2021

@darsnack asked me to put https://github.com/pevnak/Mill.jl and https://github.com/pevnak/JsonGrinder.jl here.

Using the matrix from above:

  • Problem: hierarchical multiple-instance learning, under which you can imagine JSONs, ProtoBuffers, etc. To our know
  • Model: all classes of models as defined in Tomáš Pevný, Vojtěch Kovařík (2019), "Approximation capability of neural networks on spaces of probability measures and tree-structured domains", arXiv:1906.00764
  • Dataset: Recipes, which are very simple, or https://github.com/endgameinc/ember, which is gigantic
  • Features: a step towards AutoML on general data
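
To give a feel for what such a tutorial would demonstrate, here is a rough sketch of the Mill.jl pattern written from memory with made-up toy data; the exact reflectinmodel defaults and return types may differ between Mill.jl versions:

```julia
using Flux, Mill

# Toy multiple-instance data: 10 instances with 4 features each, grouped into 3 bags.
ds = BagNode(ArrayNode(randn(Float32, 4, 10)), [1:3, 4:6, 7:10])

# reflectinmodel builds a hierarchical model mirroring the structure of the data.
m = reflectinmodel(ds, d -> Dense(d, 8, relu))

m(ds)   # per-bag representations (wrapped in an ArrayNode on older Mill versions)
```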
