This repository has been archived by the owner on Aug 25, 2024. It is now read-only.

plugin: model: Add some new models! #29

Open
pdxjohnny opened this issue Mar 20, 2019 · 17 comments
Labels
enhancement New feature or request good first issue Good for newcomers kind/ml Issues partaining to machine learning p2 Medium Priority
Comments

@pdxjohnny
Member

pdxjohnny commented Mar 20, 2019

Add a model!

This issue is for discussion and help-needed comments while adding new Models to DFFML.

First, get familiar with how models can be used via the DFFML command line: https://intel.github.io/dffml/master/plugins/dffml_model.html

Make sure you follow: https://intel.github.io/dffml/master/contributing/dev_env.html

Look at what libraries are already being wrapped or models have already been implemented. If you want to use a library that has not yet been integrated, reference the new model tutorial: https://intel.github.io/dffml/master/tutorials/models/

If you want to create a new model using a library we already have a wrapper for, just start working in the package that already exists under model/. Create a new file under dffml_model_library_name. Each library wrapper does things differently, so check how that wrapper interacts with the underlying library by looking at how the existing models are implemented.

@pdxjohnny pdxjohnny added enhancement New feature or request good first issue Good for newcomers gsoc Google Summer of Code related labels Mar 20, 2019
@yashlamba
Contributor

Hey! I'd like to try adding a new model, but I'm not sure where to start. Can you suggest a model that you are or were thinking of adding? That would give me a clearer starting point. Thanks.

@pdxjohnny pdxjohnny modified the milestone: 0.5.0 Beta Release Jun 27, 2019
@pdxjohnny pdxjohnny added the kind/ml Issues partaining to machine learning label Sep 12, 2019
@aghinsa
Contributor

aghinsa commented Oct 7, 2019

Is this still open? If yes, what models are you looking to add? Also, the tutorial link given is not working.

@pdxjohnny
Member Author

Hi @aghinsa yes it's still open. I've updated the issue. We don't have any neural networks that aren't classifiers right now. So that would be the top priority. The quickest way to fix that is probably by copying the tensorflow based classifier and modifying it to use: https://www.tensorflow.org/versions/r1.14/api_docs/python/tf/estimator/DNNEstimator

or you could create a new package and use another machine learning framework other than tensorflow.

@aghinsa
Contributor

aghinsa commented Oct 8, 2019

Thanks. I'll go through the links. I'll go with your suggestion, as I'm more comfortable with tensorflow than other frameworks.
PS: I'm pretty sure I'll have lots of questions, but I'm ready to invest the time, so please help me with this. Also, do we discuss related things here or on gitter (I just found out about it)?

@pdxjohnny
Member Author

Awesome! Yes I'm around to answer any questions. Thanks for the help!

@pdxjohnny pdxjohnny added the hacktoberfest hacktoberfest 2019 label Oct 8, 2019
@aghinsa
Contributor

aghinsa commented Oct 9, 2019

So, I went through model/tensorflow/dffml_model_tensorflow/dnnc.py. To clarify things:

  1. We want to add a regression model which trains on all features.
  2. We aren't passing any separate model_fn to the estimator (are we?), but rather using the hidden_units from config to specify the model.
  3. Should the model be such that warm_start is enabled? If yes, can I use the model_dir arg for that?

@pdxjohnny
Member Author

pdxjohnny commented Oct 9, 2019

  1. You'll probably want to make a class which subclasses from TensorflowModelContext (or maybe even DNNClassifierModelContext depending on how many methods you think don't need to be changed).
  2. The features we want to train on are passed to the __init__ method of the TensorflowModelContext class. (I noticed "model: tensorflow: dnnc: DNNClassifierModelContext first init arg should be features not config" #216, just so you know that the other method arguments aren't named correctly.) We want to train on all features that we know how to make a feature column for using the tensorflow API. The feature columns are created in TensorflowModelContext.__init__. The names of the features we care about training on will be in self.features, and the feature columns are in self.feature_columns.
    A. You'll notice in the *_input_fn methods self.features is passed to repo.features in order to get a dict where the keys are feature names, and the values are the values of that feature.
  3. Yes, we're just using hidden units (for now, we could change this later, after you get the first version working, if you want).
  4. Yes, warm_start should be enabled. And yes, using self.model_dir_path would be the right way to do that.

def model(self):
    """
    Generates or loads a model
    """
    if self._model is not None:
        return self._model
    self.logger.debug(
        "Loading model with classifications(%d): %r",
        len(self.classifications),
        self.classifications,
    )
    self._model = tensorflow.estimator.DNNClassifier(
        feature_columns=list(self.feature_columns.values()),
        hidden_units=self.parent.config.hidden,
        n_classes=len(self.parent.config.classifications),
        model_dir=self.model_dir_path,
    )
    return self._model
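
To make the points above concrete, here is a rough sketch of how a regression variant might be laid out. It deliberately does not import DFFML or tensorflow: FakeEstimator, SketchModelContext, SketchRegressorContext, build_model and input_dict are all hypothetical stand-ins invented for illustration, and the real classes will differ. The ideas it tries to show are the lazily built, cached model from model() above, the absence of n_classes for a regressor, and reusing the same model directory so that a later run could pick up from an earlier checkpoint.

```python
# Sketch only: stand-ins replace DFFML and tensorflow so this runs on its own.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeEstimator:
    """Hypothetical stand-in for a tensorflow.estimator.* object."""

    feature_columns: List[str]
    hidden_units: List[int]
    model_dir: str
    n_classes: Optional[int] = None  # only meaningful for a classifier


class SketchModelContext:
    """Hypothetical base class, loosely mirroring the caching in model() above."""

    def __init__(self, features, hidden, model_dir_path):
        self.features = features
        # One placeholder "column" per feature name; the real code builds
        # tensorflow feature columns here.
        self.feature_columns = {name: f"column:{name}" for name in features}
        self.hidden = hidden
        self.model_dir_path = model_dir_path
        self._model = None

    def build_model(self):
        raise NotImplementedError

    @property
    def model(self):
        # Build the estimator once, then return the cached instance.
        if self._model is None:
            self._model = self.build_model()
        return self._model


class SketchRegressorContext(SketchModelContext):
    def build_model(self):
        # No n_classes: a regressor predicts a number, not a class label.
        return FakeEstimator(
            feature_columns=list(self.feature_columns.values()),
            hidden_units=self.hidden,
            model_dir=self.model_dir_path,
        )


def input_dict(records, features):
    """Collect {feature name: [value per record]} for the chosen features (point 2A)."""
    return {name: [record[name] for record in records] for name in features}


ctx = SketchRegressorContext(["years", "expertise"], [12, 40, 15], "/tmp/dnnr-sketch")
records = [
    {"years": 1, "expertise": 3, "salary": 10},
    {"years": 2, "expertise": 5, "salary": 20},
]
columns = input_dict(records, ctx.features)
print(ctx.model)
print(columns)
```

In the real package the subclass would presumably construct a tensorflow regressor estimator in place of FakeEstimator; the feature names and layer sizes used here are made up.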

I'm sorry there aren't a ton of comments in there. Another good way to start would be to copy the test file for the existing model to create a new test, and then run just that test.

$ cp tests/test_dnnc.py tests/test_dnnr.py
$ python3.7 setup.py test -s tests.test_dnnr
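
When adapting the copied test for a regressor, one thing that will probably need to change is the assertion: a classifier test can compare predicted labels exactly, while a regression test usually checks that a prediction falls within some tolerance. A minimal sketch using only unittest follows; FakeRegressor and its train and predict methods are hypothetical stand-ins, not the DFFML model interface, and the data is made up.

```python
import unittest


class FakeRegressor:
    """Hypothetical stand-in: fits y = slope * x by least squares through the origin."""

    def __init__(self):
        self.slope = 0.0

    def train(self, xs, ys):
        self.slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

    def predict(self, x):
        return self.slope * x


class TestFakeRegressor(unittest.TestCase):
    def test_prediction_within_tolerance(self):
        model = FakeRegressor()
        model.train([1.0, 2.0, 3.0], [2.1, 3.9, 6.0])
        # Regression output is a float, so compare with a tolerance
        # rather than with assertEqual.
        self.assertAlmostEqual(model.predict(4.0), 8.0, delta=0.2)


# Run just this test case, similar in spirit to selecting one test module.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFakeRegressor)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```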

@aditisingh2362

Is someone working on the issue? If not I would love to work on it.

@rohit901

rohit901 commented Feb 3, 2020

Hi, I'm interested in working with this sub-org under GSoC 2020. If you are planning to apply for GSoC 2020, could you please let me know where to get started so that I can begin contributing? Thanks for your time.

@sparkingdark

sparkingdark commented Mar 16, 2020

I want to implement some neural-network-based models.
I am a GSoC 2020 aspirant and want to work on this topic.

@pdxjohnny
Member Author

pdxjohnny commented Mar 16, 2020

@rohit901 @darkdebo @aditisingh2362 Sorry for the late reply to those of you who commented on this a while ago; I'm sorry no one saw your comments. I've updated the issue to point to the new tutorial. Let me know if you have any questions.

@purnimapatel
Contributor

Hello sir, I want to contribute to this project under GSoC 2020. Please point me to any tutorials or videos that would help me get familiar with the concept of adding ML models.

@pdxjohnny
Member Author

> Hello sir, I want to contribute to this project under GSoC 2020. Please point me to any tutorials or videos that would help me get familiar with the concept of adding ML models.

@purnimapatel Please see the New Model Tutorial

@purnimapatel
Contributor

@pdxjohnny thanks sir

@pdxjohnny pdxjohnny pinned this issue Apr 5, 2020
@pdxjohnny pdxjohnny unpinned this issue Apr 16, 2020
@pdxjohnny pdxjohnny added this to the 0.3.8 Alpha Release milestone Apr 16, 2020
@spur19
Contributor

spur19 commented Oct 20, 2020

Hey. I'd love to contribute to this issue, if it's still open. Are there any specific models you're looking to add? I had a few ideas, and would love to discuss them with you.
@pdxjohnny

@pdxjohnny pdxjohnny removed the gsoc Google Summer of Code related label Jan 26, 2021
This was referenced Jan 28, 2021
@Soumyajain29

Hi,
I am a GSoC 2021 participant.
I'd love to contribute to this issue if it's still open. Are there any specific models you're looking to add? I am comfortable with Python and PyTorch.
I am contributing for the very first time. I really appreciate any help/suggestion you can provide to get me started.

@pdxjohnny
Member Author

Something that might be useful:

Mark Tenenholtz (@marktenenholtz) Tweeted:
TL;DR:

Tabular: XGBoost/LightGBM/RF
Time series: XGBoost/LightGBM/RF
Image: ResNet/EffNet
Text: RoBERTa
Audio: ResNet/EffNet

Your best bet is usually to start with these and then experiment from there.

Nothing in ML is an end-all-be-all!

^ https://mobile.twitter.com/marktenenholtz/status/1501905757842731014

9 participants