WIP: gradient support #25

Open
1 task
migueldeicaza opened this issue Feb 4, 2017 · 14 comments

Comments

@migueldeicaza
Owner

migueldeicaza commented Feb 4, 2017

This requires the C API to get support for it.

There is a bug here: tensorflow/tensorflow#6268

  • Add port of the test suite from tensorflow/908d5b6ede6ae829dff138a873eec397ef434cd6

@JimSEOW

JimSEOW commented Apr 5, 2017

It seems gradient support was recently addressed :-)
tensorflow/tensorflow#6268

@migueldeicaza
Owner Author

Not quite :-)

Still waiting on it.

@migueldeicaza
Owner Author

Added the basic binding, but I have not written the tests yet; I updated the bug description to track that.

Additionally, as of the TensorFlow commit [0], not all capabilities from the C++ API have been surfaced yet.

[0] tensorflow/tensorflow@908d5b6

migueldeicaza changed the title from "PENDING: gradient support" to "WIP: gradient support" on May 18, 2017

@mfagerlund

Is it now possible to retrieve the gradients and do some form of gradient descent?

@cesarsouza
Contributor

I would like to ask the same question: using the new pre-release package that has just been uploaded to NuGet (1.3.0-pre1), is it already possible to retrieve gradients for tensors and do any form of gradient descent? I am not fully up to date on the status of the C API support for this feature, so I figured it would be easier to ask :-)

@Dorokhov
Contributor

Dorokhov commented Sep 4, 2017

@migueldeicaza Hi, I started verifying the AddGradients() API with code like this:

    // x = 3; repeated squaring gives y = x^2, y1 = x^4, y2 = x^8
    var x = graph.Const (3.0);
    var y = graph.Square (x);
    var y1 = graph.Square (y);
    var y2 = graph.Square (y1);

    // Request gradients of { y, y2 } with respect to { x }
    var g = graph.AddGradients (new TFOutput [] { y, y2 }, new [] { x });

    // Nothing to feed; fetch only the gradient outputs
    var r = session.Run (new TFOutput [] { }, new TFTensor [] { }, g);
    double dy = (double)r [0].GetValue ();
    double dy2 = (double)r [1].GetValue ();
    Assert.Equal (17502.0, dy + dy2);

and ran into a couple of problems:
1. 'cstatus.Handle' should be passed to TF_AddGradients(), not the 'status.Handle' variable.
2. I expect two results from that run, 6 (the y derivative) and 17496 (the y2 derivative), according to the documentation:
d(y[0] + y[1]+ ...)/dx[0], d(y[0] + y[1] + ...)/dx[1]

but the API returns only one result, 6.
3. If the number of inputs is greater than 1, the API fails with Access to unprotected memory, for instance:
var g = graph.AddGradients (new TFOutput [] { y, y2 }, new [] { x, x2});

Problem 1 is quite simple to fix.
What do you think about 2 and 3? Is it a binding issue or an issue in the underlying native API?
Thanks.
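
For reference, the two values quoted above can be checked by hand. This is only a worked derivation, assuming graph.Square computes the elementwise square, so that y = x^2 and y2 = x^8 at x = 3:

```latex
\begin{aligned}
y   &= x^2, & \frac{dy}{dx}   &= 2x = 6 \quad (x = 3),\\
y_2 &= x^8, & \frac{dy_2}{dx} &= 8x^7 = 8 \cdot 2187 = 17496 \quad (x = 3),\\
    &       & \frac{d(y + y_2)}{dx} &= 6 + 17496 = 17502.
\end{aligned}
```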

@cesarsouza
Contributor

cesarsouza commented Sep 9, 2017

Hi @Dorokhov,

If I understood TensorFlow's documentation for AddGradients correctly, the returned vector should have the same length as the inputs vector, so the answer would indeed have just one value (cf. the doc: "The partial derivatives are returned in dy. dy should be allocated to size nx.", where I understand that in this case nx would be the length of new [] { x }, which is 1).

But maybe I am wrong, I am just starting with the gradients API. Let me see if I can help find where the problem is.

Regards,
Cesar
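
The reading above (one result per x, each being the derivative of the sum of all the ys) can be illustrated with a small library-free Python sketch. It is not TensorFlow or TensorFlowSharp code: it applies the chain rule by hand on (value, derivative) pairs, and the names `square`, `ys_fn`, and `summed_gradients` are hypothetical helpers made up for this illustration.

```python
def square(pair):
    """Square a (value, derivative) pair, propagating the chain rule."""
    value, deriv = pair
    return (value * value, 2.0 * value * deriv)


def ys_fn(inputs):
    """Mirror the graph from the earlier comment: y = x^2, y1 = x^4, y2 = x^8."""
    x = inputs[0]
    y = square(x)
    y1 = square(y)
    y2 = square(y1)
    return [y, y2]  # the outputs whose gradients are requested


def summed_gradients(fn, xs):
    """Return one entry per x[j]: d(y[0] + y[1] + ...)/dx[j]."""
    grads = []
    for j in range(len(xs)):
        # Seed derivative 1.0 for x[j] and 0.0 for every other input.
        seeded = [(x, 1.0 if i == j else 0.0) for i, x in enumerate(xs)]
        grads.append(sum(deriv for _, deriv in fn(seeded)))
    return grads


dy = summed_gradients(ys_fn, [3.0])
print(len(dy), dy[0])  # prints "1 17502.0"
```

Under this interpretation a single x yields a single value, and that value is the sum 6 + 17496 rather than either term on its own.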

@Dorokhov
Contributor

Dorokhov commented Sep 11, 2017

Hi @cesarsouza,

I think you are right, and the API should produce one value: the partial derivative of the sum of the 'y' outputs, which is 17502 (6 + 17496) in my case. However, it returns 6. I will try to test the same case against the native API.

Thanks.

@sqlBender

sqlBender commented Sep 15, 2017

I'm struggling to get simple models built in Python "productionized" in C#.
I see this as a first-class use case for this library, but I am not seeing a good example of it.

Take a model built, trained, and saved in Python.
Save off the *.pb file (from what I can gather, that is all I need).
In C#:
Create a Graph, a Session, Input Variables, Input Values, and Outputs to run the model.
Run the model, then interrogate the output.

The most challenging part for me is constructing the InputVariables, InputValues, and Output. The construction patterns and instance usage for those objects are befuddling me, to say the least.

I assume I am not seeing a pattern in the examples, or that I am fighting the naming conventions, or... I don't know what I am missing... And I can't be the only person struggling with this.

The Python code is attached. It's a simple linear regression model I ripped off from a Stanford class and made more portable.

Here are the samples I am trying to create. Happy to donate them to the cause when they are complete.
FireTheftLinearRegression.py.txt

FireTheftLinearRegression.cs.txt

@cesarsouza
Contributor

Hi @sqlBender,

I have to say I share your pain, but I am not sure the (actually very relevant) issue you have raised is connected to the original topic of this issue, that is, the ability to obtain automatic gradient calculations through the AddGradients method.

If you want to take a look, the Keras Sharp project aims to provide an API that is very similar to its Python equivalent, but unfortunately that project is also still somewhat blocked until this very issue is addressed.

Regards,
Cesar

@sqlBender

@cesarsouza this is just where I found a similar issue (gradients, etc.). I'll spin up a new issue, since mine is not really related to this thread.

@Dorokhov
Contributor

I started discussing the gradient API issue in the tensorflow repository, hoping we will soon find out how the API actually works.

migueldeicaza pushed a commit that referenced this issue Nov 5, 2017
These changes and the fix in the native API resolve issue #25.
AddGradients now works correctly.

The test will fail until a new version of TensorFlowSharp with updated native libraries is released.

@cesarsouza
Contributor

Seems like TF doesn't have all gradient operations defined yet. I am referencing here an issue I've just created in TF's issue tracker regarding a missing gradient for tf.select.

@Deep-Blue-2013

Deep-Blue-2013 commented Feb 22, 2019

Hi @migueldeicaza, @cesarsouza, the Linear Regression example can now run on another .NET binding library; check the code here.

Code excerpt:

            // tf Graph Input
            var X = tf.placeholder(tf.float32);
            var Y = tf.placeholder(tf.float32);

            // Set model weights 
            // We can set a fixed init value in order to debug
            // var rnd1 = rng.randn<float>();
            // var rnd2 = rng.randn<float>();
            var W = tf.Variable(-0.06f, name: "weight");
            var b = tf.Variable(-0.73f, name: "bias");

            // Construct a linear model
            var pred = tf.add(tf.multiply(X, W), b);

            // Mean squared error
            var cost = tf.reduce_sum(tf.pow(pred - Y, 2.0f)) / (2.0f * n_samples);

            // gradient descent
            // Note, minimize() knows to modify W and b because Variable objects are trainable=True by default
            var optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost);

            // Initialize the variables (i.e. assign their default value)
            var init = tf.global_variables_initializer();
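
As a rough illustration of what the minimize(cost) step amounts to for this particular model, here is a plain Python sketch of gradient descent on the same cost, with the gradients written out by hand. It is not the binding library's implementation. The toy data, the learning rate, and the iteration count are assumptions made up for the example; only the initial W and b and the cost formula come from the snippet above.

```python
# Hypothetical toy data lying exactly on y = 2x + 1 (not from the thread).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
n_samples = len(xs)

W, b = -0.06, -0.73    # same fixed initial values as the snippet above
learning_rate = 0.05   # assumed value; the snippet does not show it


def cost(W, b):
    """sum((X * W + b - Y)^2) / (2 * n_samples), as in the snippet."""
    return sum((x * W + b - y) ** 2 for x, y in zip(xs, ys)) / (2.0 * n_samples)


initial_cost = cost(W, b)
for _ in range(2000):
    # Residuals pred - Y for the current parameters.
    errors = [x * W + b - y for x, y in zip(xs, ys)]
    # Hand-derived gradients of the cost with respect to W and b.
    grad_W = sum(e * x for e, x in zip(errors, xs)) / n_samples
    grad_b = sum(errors) / n_samples
    # Plain gradient descent update.
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

final_cost = cost(W, b)
print(round(W, 4), round(b, 4), final_cost < initial_cost)
```

With this toy data the parameters should approach W = 2 and b = 1. A graph-based optimizer obtains grad_W and grad_b from automatic differentiation instead of the hand-written formulas, which is the capability this issue tracks.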
