Loss augmented inference using SparseNetworks #445

Merged: 38 commits merged into CogComp:master on Dec 5, 2016

Conversation

kordjamshidi (Member):

Also added the badge example with constraints as a test example.

kordjamshidi (Member Author) commented Nov 4, 2016:

@bhargav, the tests don't fail on my machine; I am not sure why Semaphore is failing.

kordjamshidi (Member Author):

Great, it passed! Please feel free to review at your earliest convenience.

override lazy val classifier = new SparseNetworkLearner()
override def feature = using(BadgeFeature1)
}
}
Member:

could you add a little comment to each of these classifiers?

br.close();
}catch (Exception e) {}
}
}
Member:

Could you apply the autoformatter on this file?

{
val tokens = x.split(" ")
tokens(1).charAt(1).toString
}
Member:

Drop these parentheses?

else
"true"
}
}
danyaljj (Member), Nov 5, 2016:

What is the purpose of doing this?

Why not re-use BadgeLabel here and say:
if(BadgeOppositLabel(x) == "true") "false" else "true"?

kordjamshidi (Member Author):

no specific reason, I guess the overhead is the same.

Member:

Yeah, but then you don't repeat the code; you reuse it instead.
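
For illustration, here is a hedged sketch of the reuse being suggested, assuming the Badge data model defines a badge node and a BadgeLabel property in Saul's property DSL (the names follow the snippets above, but the exact definitions in the PR may differ):

// Hedged sketch, not the exact PR code: derive the opposite label from the existing
// BadgeLabel property instead of re-parsing the raw input line.
val BadgeOppositLabel = property(badge) { x: String =>
  if (BadgeLabel(x) == "true") "false" else "true"
}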

import edu.illinois.cs.cogcomp.lbjava.learn.{ SparseNetworkLearner, SparsePerceptron }

/** Created by Parisa on 9/13/16.
*/
Member:

drop the comment?


object BadgeClassifiers {
import BadgeDataModel._
import edu.illinois.cs.cogcomp.saul.classifier.Learnable
Member:

move this import to the top?


public class BadgeReader {
public List<String> badges;
// int currentBadge;
Member:

drop this?

@@ -33,7 +33,7 @@ object InitSparseNetwork {
if (label >= N || iLearner.getNetwork.get(label) == null) {
val isConjunctiveLabels = iLearner.isUsingConjunctiveLabels | iLearner.getLabelLexicon.lookupKey(label).isConjunctive
iLearner.setConjunctiveLabels(isConjunctiveLabels)
val ltu: LinearThresholdUnit = iLearner.getBaseLTU
val ltu: LinearThresholdUnit = iLearner.getBaseLTU.clone().asInstanceOf[LinearThresholdUnit]
Member:

what is the necessity for clone()?

kordjamshidi (Member Author):

This bug was caught by @bhargav: we need to create a new LinearThresholdUnit instance here each time a new label is encountered. This was the main bug in the SparseNetwork initialization.
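
To illustrate the point (a hedged sketch of the idea, not the exact Saul code): without clone(), every label would be backed by the single shared baseLTU object, so updates for one label would overwrite the weights of all the others.

// Buggy: every label shares the same LinearThresholdUnit instance.
// val ltu: LinearThresholdUnit = iLearner.getBaseLTU
// Fixed: each newly seen label gets its own copy of the base unit's parameters.
val ltu: LinearThresholdUnit =
  iLearner.getBaseLTU.clone().asInstanceOf[LinearThresholdUnit]
iLearner.getNetwork.set(label, ltu) // register the per-label unit in the network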

@@ -18,16 +18,16 @@ object JointTrainSparseNetwork {

val logger: Logger = LoggerFactory.getLogger(this.getClass)
var difference = 0
def apply[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], init: Boolean)(implicit headTag: ClassTag[HEAD]) = {
train[HEAD](node, cls, 1, init)
def apply[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], init: Boolean, lossAugmented: Boolean)(implicit headTag: ClassTag[HEAD]) = {
Member:

could you add a doc to this function and explain what it does as well as the parameters?

danyaljj (Member), Nov 14, 2016:

Actually, what I meant here was documentation for the function, like:

/** 
* This function does blah blah ... 
* @param node .... 
* @param cls ... 
* .... 
* @param lossAugmented ....
*/
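
For example, a hedged sketch of what such documentation for JointTrainSparseNetwork.apply could look like, based only on the parameter names visible in this diff (the wording is illustrative, not the text that ended up in the PR):

/** Jointly trains the given constrained classifiers over the objects of the head node.
  *
  * @param node          the data-model node holding the head objects used for training
  * @param cls           the constrained classifiers to be trained jointly
  * @param init          whether to (re-)initialize the underlying sparse networks before training
  * @param lossAugmented whether to use loss-augmented inference when computing the predicted
  *                      assignment during training
  */
def apply[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], init: Boolean, lossAugmented: Boolean)(implicit headTag: ClassTag[HEAD]) =
  train[HEAD](node, cls, 1, init, lossAugmented) // presumably forwards the new flag to train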

ilearner.getNetwork.set(label, ltu)
val ltu: LinearThresholdUnit = baseClassifier.getBaseLTU.clone().asInstanceOf[LinearThresholdUnit]
ltu.initialize(baseClassifier.getNumExamples, baseClassifier.getNumFeatures)
baseClassifier.getNetwork.set(label, ltu)
N = label + 1
bhargav (Contributor):

This line is not required. Also N can be made a val.

val ltu_actual = ilearner.getLTU(LTU_actual).asInstanceOf[LinearThresholdUnit]
val ltu_predicted = ilearner.getLTU(LTU_predicted).asInstanceOf[LinearThresholdUnit]
val ltu_actual = baseClassifier.getLTU(LTU_actual).asInstanceOf[LinearThresholdUnit]
val ltu_predicted = baseClassifier.getLTU(LTU_predicted).asInstanceOf[LinearThresholdUnit]

if (ltu_actual != null)
ltu_actual.promote(a0, a1, 0.1)
bhargav (Contributor):

We are promoting/demoting by a fixed update of 0.1; shouldn't we take the learning rate parameter into account? The update rule inside LinearThresholdUnit's learn function uses the learning rate and the margin thickness.

kordjamshidi (Member Author), Nov 6, 2016:

Yes, this has remained here from my very first trial version. How should I pass the parameters; do you think I should just add them to the list of input parameters? Since we have two apply versions, it cannot have a default value for both cases either, I guess. Isn't having a consistent way to set parameters in Saul a separate issue?

bhargav (Contributor):

The baseLTU already has all parameters to use. We can directly call the learn function to use those parameters.

val labelValues = a(3).asInstanceOf[Array[Double]]

if (ltu_actual != null) {
  // Learn as a positive example
  ltu_actual.learn(a0, a1, Array(1), labelValues)
}

if (ltu_predicted != null) {
  // Learn as a negative example
  ltu_predicted.learn(a0, a1, Array(0), labelValues)
}

Also, it might be better to rename all the variables (a, a0, a1, etc.) for better readability.
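
A hedged sketch of that renaming, assuming the usual LBJava example-array layout (feature indices, feature values, label indices, label values); the variable names here are illustrative only and not the ones used in the PR:

// Sketch only: the same logic as above with descriptive names instead of a, a0, a1.
val featureIndices = exampleArray(0).asInstanceOf[Array[Int]]
val featureValues = exampleArray(1).asInstanceOf[Array[Double]]
val labelValues = exampleArray(3).asInstanceOf[Array[Double]]

if (ltuForGoldLabel != null)
  ltuForGoldLabel.learn(featureIndices, featureValues, Array(1), labelValues) // promote: treat as a positive example

if (ltuForPredictedLabel != null)
  ltuForPredictedLabel.learn(featureIndices, featureValues, Array(0), labelValues) // demote: treat as a negative example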

kordjamshidi (Member Author):

Call learn?! Then what are we doing here?

kordjamshidi (Member Author):

Doesn't learn use the internal prediction result?

bhargav (Contributor):

https://github.com/IllinoisCogComp/lbjava/blob/master/lbjava/src/main/java/edu/illinois/cs/cogcomp/lbjava/learn/LinearThresholdUnit.java#L462

learn promotes or demotes the LTU's weight vector; the third argument controls whether promote or demote is called.

kordjamshidi (Member Author):

what about the score, s?

bhargav (Contributor):

Looks fine if we cannot use learn. My only concern was that using a fixed learning rate might affect performance. We can fix that separately.

}

@scala.annotation.tailrec
def train[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], it: Int, init: Boolean)(implicit headTag: ClassTag[HEAD]): Unit = {
def train[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], it: Int, init: Boolean, lossAugmented: Boolean = false)(implicit headTag: ClassTag[HEAD]): Unit = {
// forall members in collection of the head (dm.t) do
logger.info("Training iteration: " + it)
bhargav (Contributor):

We should add an assertion here to check that the base classifiers are of type SparseNetworkLearner. You could also mention that in the function documentation.
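
A minimal sketch of such a check; the onClassifier.classifier path to the underlying LBJava learner is an assumption and may differ from how Saul actually exposes it:

// Hedged sketch: fail fast if any base classifier does not wrap a SparseNetworkLearner.
require(
  cls.forall(_.onClassifier.classifier.isInstanceOf[SparseNetworkLearner]),
  "JointTrainSparseNetwork expects every base classifier to use a SparseNetworkLearner"
)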

kordjamshidi (Member Author):

I guess I already had a line about it in the new documentation.

-remove redundant assignment

kordjamshidi (Member Author):

Yes, parameter setting is a different issue that certainly should be handled in a more principled way for the joint setting as well.

-smaller iterations
danyaljj (Member) commented Nov 17, 2016:

This PR looks good to me, except for a few things:

  1. Semaphore looks unhappy.
  2. Could you keep the logger changes local (not checked into the PR) until we fix the logger issue? (Very soon; I have it in my plans after "Saul inference: moving beyond LBJava's inference" #401.)
  3. Regarding adding names here and there, my suggestion is to remove them all. Git will keep track of all of your contributions in much more detail. If we want to keep them, we probably have to make them consistent (adding them to all the files).
  4. Is there any way to verify that the loss-augmented inference is working? I would prefer not to check in something whose functional correctness we are not sure about.

kordjamshidi (Member Author):

see my inline comments.

kordjamshidi (Member Author):

This PR looks good to me, except for a few things:

  1. Semaphore looks unhappy.

Yes, I am experimenting with this and changed the path of SRL. It is strange that the same path is used for both the SRL tests and SRLApps, so changing the path there makes the tests fail! Something to work on. In my view, the whole SRLConfigurator should be removed.

  2. Could you keep the logger changes local (not checked into the PR) until we fix the logger issue? (Very soon; I have it in my plans after "Saul inference: moving beyond LBJava's inference" #401.)

I did not expect you to decide to merge this right after my experimental changes today, since it was ready a few days ago :-). OK.

  3. Regarding adding names here and there, my suggestion is to remove them all. Git will keep track of all of your contributions in much more detail. If we want to keep them, we probably have to make them consistent (adding them to all the files).

Actually, it has always been your suggestion to remove the names, and I always mention that I would like to keep them. I am for keeping all of them.

  4. Is there any way to verify that the loss-augmented inference is working?

The only thing I could do was describe it conceptually and then test it with the Badge example; you can see the Badge example, it is one of the run options. Also, feel free to check it algorithmically; it is a very small piece of code.

  I would prefer not to check in something whose functional correctness we are not sure about.

Me too :-).

- add loss option to SRL runnables
- added scala configurator for SRL
- removed the redundant configurations from the SRL app
- set the defaults to train aTR
- added results of joint training with loss-augmented inference
- removed redundant property symbols
- SRL configuration back to its original
- some more documentation
- returned the symbol names back because of existing trained models!
- changed back all paths and config to SRL toy
kordjamshidi (Member Author):

I reverted the temporary experimental changes and the logger messages. Please merge if this looks fine.

kordjamshidi (Member Author):

@bhargav @danyaljj

kordjamshidi (Member Author):

@christos-c: nobody is responding here; it would be great if you could review this and merge.

bhargav (Contributor) commented Dec 5, 2016:

Changes look good to me. My only concern is that using loss-augmented inference does not seem to improve performance (82.644 vs. 83.673 without), but I saw that you mentioned in the documentation that we perform better on some class labels.

Btw I don't seem to have permission to merge PRs. 😕

kordjamshidi (Member Author):

Why? Probably because you approved it?

kordjamshidi (Member Author) commented Dec 5, 2016:

We are short of people. I hope @christos-c can review and merge. (@bhargav, regarding the loss: it has also been tested on the Badge example in this PR and seems to work fine there. To improve the results when using the loss, a simple thing to try would be reweighting the losses; I can do that in a different PR and make it an option to tweak.)

kordjamshidi assigned christos-c and unassigned danyaljj on Dec 5, 2016
christos-c (Member):

I'm on it now, will check and merge!

christos-c (Member) left a review comment:

The SRL part looks good, only minor typos; will merge as soon as they are fixed.


```

See [here](saul-examples/src/main/scala/edu/illinois/cs/cogcomp/saulexamples/Badge/BadgeClassifiers.scala#L43), for a working example.
Member:

This needs to be out of the code block.

@@ -35,8 +35,8 @@ OrgClassifier.test()

### Availale algorithms
Here is a list of available algorithms in Saul:
- [LBJava learning algorithms](https://github.com/IllinoisCogComp/lbjava/blob/master/lbjava/doc/ALGORITHMS.md)
- [Weka learning algorithms](https://github.com/IllinoisCogComp/saul/blob/master/saul-core/src/main/java/edu/illinois/cs/cogcomp/saul/learn/SaulWekaWrapper.md)
- [LBJava learning algorithms](https://githu/IllinoisCogComp/lbjava/blob/master/lbjava/doc/ALGORITHMS.md)
Member:

The URL is wrong. Needs to be changed back to github.com


```

See [here](saul-examples/src/main/scala/edu/illinois/cs/cogcomp/saulexamples/Badge/BadgeClassifiers.scala#L43), for a working example.
Member:

The last sentence needs to be taken out of the code block.

kordjamshidi (Member Author) commented Dec 5, 2016:

@christos-c: see if this is ok now.

christos-c merged commit 765f480 into CogComp:master on Dec 5, 2016