Loss augmented inference using SparseNetworks #445

Merged: 38 commits, Dec 5, 2016

Commits
be2f7b4
-added the Badge Example
kordjamshidi Nov 4, 2016
b6acc34
-added the Badge Example Reader
kordjamshidi Nov 4, 2016
4d4d979
-added Badge example with loss augmented inference
kordjamshidi Nov 4, 2016
75a071e
-format
kordjamshidi Nov 4, 2016
e1a106e
-fixed the test
kordjamshidi Nov 4, 2016
9f40b24
-fixed the tests due to the fix in initialization
kordjamshidi Nov 4, 2016
4a578c4
-test size of weights
kordjamshidi Nov 4, 2016
cd57748
learning configuration
kordjamshidi Nov 5, 2016
55a808c
Merge remote-tracking branch 'upstream/master' into loss-augmented
kordjamshidi Nov 5, 2016
09f4aaa
-added documentation
kordjamshidi Nov 5, 2016
bca4262
-modified and documented the badge example
kordjamshidi Nov 5, 2016
ea0198b
-assert the type
kordjamshidi Nov 6, 2016
ed706a3
-minor
kordjamshidi Nov 6, 2016
da9b768
-added pipeline example with Badge
kordjamshidi Nov 10, 2016
b15189e
-improved documentation
kordjamshidi Nov 11, 2016
0776234
-improved documentation
kordjamshidi Nov 11, 2016
053f965
-relative path works for java folder?!
kordjamshidi Nov 14, 2016
77b0a13
-relative path works for java folder?!
kordjamshidi Nov 14, 2016
73c67a0
-improved documentation
kordjamshidi Nov 14, 2016
4ad5840
format
kordjamshidi Nov 14, 2016
c104fff
SRL join-training
kordjamshidi Nov 16, 2016
e668bdd
Fixed the RunningApps name in the documentation
kordjamshidi Nov 16, 2016
dce32e9
temporarily removed joinnodes populate for SRL experiments
kordjamshidi Nov 16, 2016
8059746
-train mode
kordjamshidi Nov 16, 2016
f91e18d
-use Gurobi
kordjamshidi Nov 16, 2016
e006bb9
-remove logger messages
kordjamshidi Nov 16, 2016
ff5c9f3
-jointTrain setting
kordjamshidi Nov 18, 2016
754d694
format
kordjamshidi Nov 18, 2016
92de796
-fixed the test units path for SRL
kordjamshidi Nov 18, 2016
9cf1f25
-format
kordjamshidi Nov 18, 2016
5186251
-replaced configuration parameters
kordjamshidi Nov 18, 2016
e1ea9a4
-replaced configuration parameters
kordjamshidi Nov 18, 2016
aba475b
-added results of join training (IBT) with SRL ArgTypeClassifier
kordjamshidi Dec 3, 2016
a65cd4e
-brought the logger messages back
kordjamshidi Dec 3, 2016
bcdba7e
-changed back the solver for tests
kordjamshidi Dec 3, 2016
dab55f1
-changed back the commented out join node population
kordjamshidi Dec 3, 2016
4d8f361
-fixed typos in blocking
kordjamshidi Dec 5, 2016
b0baf94
-fixed typos in blocking
kordjamshidi Dec 5, 2016
1 change: 0 additions & 1 deletion README.md
@@ -31,7 +31,6 @@ Visit each link for its content
5. [Data modeling and feature extraction](saul-core/doc/DATAMODELING.md)
6. [Learners and constraints](saul-core/doc/SAULLANGUAGE.md)
7. [Model configurations](saul-core/doc/MODELS.md)
8. [Saul library](saul-core/doc/LBJLIBRARY.md)

The api docs are included [here](http://cogcomp.cs.illinois.edu/software/doc/saul/).

121 changes: 116 additions & 5 deletions saul-core/doc/MODELS.md
@@ -1,6 +1,117 @@
* Designing flexible learning models including various configurations such as:

Member:

Add a page title here?
For example

# Learning Paradigms

* Local models i.e. single classifiers. (Learning only models (LO)).
* Constrained conditional models (CCM)[1] for training independent classifiers and using them jointly for global decision making in prediction time. (Learning+Inference (L+I)).
* Global models for joint training and joint inference (Inference-Based-Training (IBT)).
* Pipeline models for complex problems where the output of each layer is used as the input of the next layer.

# Learning Paradigms

/*Documented by Parisa Kordjamshidi*/

Saul facilitates the flexible design of complex learning models with various configurations.
By complex models we mean models that aim at predicting more than one output variable, where these outputs may be related to each other.
Such models can be designed using the following paradigms:

* [Local models](#local) train single classifiers (learning-only models (LO)), each of which learns and predicts a single output variable independently.
* [Constrained conditional models (CCM)](#L+I) train independent classifiers and use them jointly for global decision making at prediction time (Learning+Inference (L+I)).
* [Global models](#IBT) perform joint training and joint inference (Inference-Based Training (IBT)).
* [Pipeline models](#pipeline) address complex problems where the output of each model is used as the input of the next model (these models form the layers of a pipeline).

The above-mentioned paradigms can be tried out with this simple badge classifier example, [here](saul-examples/src/main/scala/edu/illinois/cs/cogcomp/saulexamples/Badge/BagesApp.scala).
Member:

drop this?

Member Author:

drop what? the link to the example? why? isn't it useful?


<a name="local">
## Local models
These models are a set of single classifiers. Each classifier is defined with the `Learnable` construct and is trained and makes predictions independently of the other classifiers.
The `Learnable` construct requires specifying a single output variable, that is, a label which is itself a property in the data model, and the features, given as a
comma-separated list of properties.

```scala
object ClassifierName extends Learnable(node) {
  def label = property1
  def feature = using(property2, property3, ...)
  // a comma-separated list of properties
}
```

For the details about the `Learnable` construct see [here](SAULLANGUAGE.md).
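For instance, a minimal local classifier for the badge data might look like the following sketch (the node and property names `badge`, `badgeLabel`, `firstNameProperty` and `lastNameProperty` are hypothetical, not taken from the example code):

```scala
// A single, independently trained classifier over a hypothetical `badge` node.
object BadgeClassifier extends Learnable(badge) {
  def label = badgeLabel                                    // the output property to predict
  def feature = using(firstNameProperty, lastNameProperty)  // the input properties
}
```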

<a name="L+I">
## Learning+Inference models
These models are useful when we need to take into account the global relations between the outputs of several classifiers at
prediction time. Each classifier is defined with the same `Learnable` construct as a local model. In addition to the `Learnable` definitions, the programmer
Member:

Change to Learnable?

can define a number of logical constraints between the outputs of the `Learnable`s (classifiers).
Having the constraint definitions in place (see [here](SAULLANGUAGE.md) for syntax details), the programmer is able to define
new constrained classifiers that use the `Learnable`s and the constraints.

```scala
object ConstrainedClassifierName extends ConstrainedClassifier[local_node_type, global_node_type](LocalClassifier) {
  def subjectTo = constraintExpression
  // The logical constraint expression goes here; it defines the relations between the
  // LocalClassifier and other Learnables defined before.
}
```
When we use the above `ConstrainedClassifierName` to run tests or make predictions, the `LocalClassifier` is used,
but the predictions are made in such a way that `constraintExpression` holds. There is no limitation on the type of the local classifiers.
They can be SVMs, decision trees or any other learning models available in Saul, [here](https://github.com/IllinoisCogComp/lbjava/blob/master/lbjava/doc/ALGORITHMS.md)
Member:

Relative url?

Member Author:

which part of the address should be removed?

and [here](https://github.com/IllinoisCogComp/saul/blob/master/saul-core/src/main/java/edu/illinois/cs/cogcomp/saul/learn/SaulWekaWrapper.md).
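As a hypothetical sketch (the constraint `badgeConstraint` and the classifier names are illustrative only, not taken from the example code), a constrained version of a local badge classifier could be declared as:

```scala
// Reuses the local BadgeClassifier for scoring, but restricts its predictions so that
// the (hypothetical) logical constraint `badgeConstraint` holds at inference time.
object ConstrainedBadgeClassifier extends ConstrainedClassifier[Badge, Badge](BadgeClassifier) {
  def subjectTo = badgeConstraint
}
```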

Member:

Relative url?

Member Author:

which part of the address should be removed?

Member:

Make these links relative (i.e. drop https://github.com/...)

<a name="IBT">
## Inference-based Learning
For inference-based models, the basic definitions are exactly the same as for L+I models. In other words, the programmer
just needs to define the `Learnable`s and `ConstrainedClassifier`s. However, to train the `ConstrainedClassifier`s jointly, instead of
Member:

Change to ConstrainedClassifiers?

Member:

Change to ConstrainedClassifiers?

training the local classifiers independently, there are a couple of joint training functions that can be called.
These functions receive the list of constrained classifiers as input and train their parameters jointly. In contrast to
L+I models, here the local classifiers can only be defined as `SparsePerceptron`s or `SparseNetworkLearner`s. This is because the
joint training must have its own strategy for the weight updates of the involved variables (which here come down to the outputs of the local classifiers).
For the two cases the programmer can use:

```scala
JointTrainSparseNetwork.train(param1, param2, ...)    /* a list of parameters goes here */
JointTrainSparsePerceptron.train(param1, param2, ...) /* a list of parameters goes here */
```

For example,

```scala
JointTrainSparseNetwork.train(badge, cls, 5, init = true, lossAugmented = true)
```
Member:

style doesn't render correctly. You probably need to move the code into new lines.


The parameters are the following (the example call is annotated after this list):

- param1: The global node in the data model that is used, either directly or through its connected nodes, by the involved `Learnable`s.

- param2: The collection of `ConstrainedClassifier`s.

- param3: The number of iterations over the training data.

- param4: Set to true if the local classifiers should be cleared of possibly pre-trained values and initialized with zero weights.

- param5: Set to true if the approach should use the loss-augmented objective for inference; see below for a description.
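For clarity, the example call shown above can be read against this parameter list as follows (this is a restatement of the same call, not a new API):

```scala
JointTrainSparseNetwork.train(
  badge,                // param1: the global node in the data model
  cls,                  // param2: the collection of ConstrainedClassifiers
  5,                    // param3: the number of iterations over the training data
  init = true,          // param4: re-initialize the local classifiers with zero weights
  lossAugmented = true  // param5: use the loss-augmented objective during inference
)
```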

### Basic approach

The basic approach for training the models jointly is to make a global prediction at each training step and, if the
predictions are wrong, to update the weights of the related variables.

### Loss augmented

The loss-augmented approach adds the loss of the prediction explicitly to the training objective and finds the most violated output for each training example;
it then updates the weights of the model according to the errors made in predicting that most violated output.
This approach minimizes a convex upper bound of the loss function and has been used in structured SVMs and structured Perceptrons.
However, allowing an arbitrary loss in the objective complicates the optimization; therefore, in the version implemented here, we assume the loss decomposes in the same way as the
feature function. That is, the loss is a Hamming loss defined per classifier, and the loss of the whole structured output is computed as the weighted sum of
the losses of its components.
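As a hypothetical illustration of this weighted sum (the labels, components and weights below are made up and not taken from the Saul code), consider a structured output with three components, each predicted by its own classifier:

```scala
object WeightedHammingLossExample extends App {
  val gold      = Seq("ARG0", "ARG1", "NONE") // true labels of the three components
  val predicted = Seq("ARG0", "ARG2", "NONE") // labels in the most violated output
  val weights   = Seq(1.0, 1.0, 0.5)          // assumed per-classifier weights

  // Per-component Hamming loss: 1 if the component's label is wrong, 0 otherwise.
  val componentLosses = gold.zip(predicted).map { case (g, p) => if (g == p) 0.0 else 1.0 }

  // Loss of the whole structured output: weighted sum of the component losses.
  val structuredLoss = componentLosses.zip(weights).map { case (l, w) => l * w }.sum

  println(s"component losses = $componentLosses, structured loss = $structuredLoss") // 1.0
}
```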
Member:

Could you add an example for "The loss of the whole structured output is computed by the weighted sum of the loss of its components."? Like an example "structure" and its prediction and how the weighted loss is calculated (and later used during training).

In Saul, the programmer can indicate whether this global Hamming loss should be considered in the objective. This is done by passing
Member:

This whole paragraph is very ambiguous. Could you add more details, examples, or anything to make this more clear?
If we have an example usage for loss-augmented inference we can point to that too.

the above-mentioned `param5` as true to the `JointTrainSparseNetwork` algorithm.
An example of this usage can be seen [here](saul-examples/src/main/scala/edu/illinois/cs/cogcomp/saulexamples/Badge/BagesApp.scala) at line #64.
Member:

Add "#L64" at the end of the link and drop "at line #64"?


<a name="pipeline">
## Pipelines
Building pipelines comes naturally in Saul. The programmer can simply define properties that are the predictions of
the classifiers and use those outputs as inputs of other classifiers by listing them among the properties in the construct below when defining the
pipeline classifiers:

```scala
def feature = using(/* list of properties, including the predictions of other classifiers */)
```

See [here](saul-examples/src/main/scala/edu/illinois/cs/cogcomp/saulexamples/Badge/BadgeClassifiers.scala), at line #43, for an example.
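As a minimal hypothetical sketch of such a pipeline (all names here are illustrative), the prediction of a first-layer classifier is wrapped in a property and then used as a feature of the second-layer classifier:

```scala
object ClassifierLayer1 extends Learnable(node) {
  def label = labelProperty1
  def feature = using(property2, property3)
}

// Defined in the data model: a property exposing the first classifier's prediction.
val classifier1Labels = new Property(node) { x: SomeType => ClassifierLayer1(x) }

object ClassifierLayer2 extends Learnable(node) {
  def label = labelProperty2
  def feature = using(classifier1Labels) // uses the prediction of the previous layer
}
```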
Member:

Add the line number to the link by adding #L43 at the end.



Member:

Why don't we include an imaginary example (like the imaginary scenario above for constrained classifier) explaining how things work. The one I mentioned is not good?

object ClassifierLayer1 extends Learnable (node) {
   def label = labelProperty1
   def feature = using(property2, property3,...) 
 }

object ClassifierLayer2 extends Learnable (node) {
   def label = labelProperty2
   def feature = using(classifier1Labels, ...) // using the prediction of the classifier in the previous layer
 }

// defined in data-model object 
val classifier1Labels = new Property(node){  x: Type => ClassifierLayer1(x)  }

Member Author:

I thought a link to the actual working example, which is similar to this one would be enough. I can add this too to the documentation if you really like it.


Member:

Would be great if we add a very simple example. For example sth like this:

object ClassifierLayer1 extends Learnable (node) {
   def label = labelProperty1
   def feature = using(property2, property3,...) 
 }

object ClassifierLayer2 extends Learnable (node) {
   def label = labelProperty2
   def feature = using(classifier1Labels, ...) // using the prediction of the classifier in the previous layer
 }

// defined in data-model 
val classifier1Labels = new Property(node){  x: Type => ClassifierLayer1(x)  }


@@ -9,7 +9,7 @@ package edu.illinois.cs.cogcomp.saul.classifier
import edu.illinois.cs.cogcomp.lbjava.learn.{ LinearThresholdUnit, SparseNetworkLearner }
import edu.illinois.cs.cogcomp.saul.datamodel.node.Node
import org.slf4j.{ Logger, LoggerFactory }

import Predef._
import scala.reflect.ClassTag

/** Created by Parisa on 5/22/15.
@@ -18,16 +18,16 @@ object JointTrainSparseNetwork {

val logger: Logger = LoggerFactory.getLogger(this.getClass)
var difference = 0
def apply[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], init: Boolean)(implicit headTag: ClassTag[HEAD]) = {
train[HEAD](node, cls, 1, init)
def apply[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], init: Boolean, lossAugmented: Boolean)(implicit headTag: ClassTag[HEAD]) = {
Member:

could you add a doc to this function and explain what it does as well as the parameters?

Member (@danyaljj, Nov 14, 2016):

Actually here what I meant was documentation for the function.
Like:

/** 
* This function does blah blah ... 
* @param node .... 
* @param cls ... 
* .... 
* @param lossAugmented ....
*/

train[HEAD](node, cls, 1, init, lossAugmented)
}

def apply[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], it: Int, init: Boolean)(implicit headTag: ClassTag[HEAD]) = {
train[HEAD](node, cls, it, init)
def apply[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], it: Int, init: Boolean, lossAugmented: Boolean = false)(implicit headTag: ClassTag[HEAD]) = {
train[HEAD](node, cls, it, init, lossAugmented)
}

@scala.annotation.tailrec
def train[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], it: Int, init: Boolean)(implicit headTag: ClassTag[HEAD]): Unit = {
def train[HEAD <: AnyRef](node: Node[HEAD], cls: List[ConstrainedClassifier[_, HEAD]], it: Int, init: Boolean, lossAugmented: Boolean = false)(implicit headTag: ClassTag[HEAD]): Unit = {
// forall members in collection of the head (dm.t) do
logger.info("Training iteration: " + it)
Contributor:

We should add an assertion here to check that the base classifiers are of the type SparseNetworkLearner. Also you can add that to the function documentation.

Member Author:

I guess I already had a line about it in the new documentation.

if (init) ClassifierUtils.InitializeClassifiers(node, cls: _*)
@@ -43,19 +43,25 @@ object JointTrainSparseNetwork {
if (idx % 5000 == 0)
logger.info(s"Training: $idx examples inferred.")

cls.foreach {
case classifier: ConstrainedClassifier[_, HEAD] =>
val typedClassifier = classifier.asInstanceOf[ConstrainedClassifier[_, HEAD]]
val oracle = typedClassifier.onClassifier.getLabeler
if (lossAugmented)
cls.foreach { cls_i =>
cls_i.onClassifier.classifier.setLossFlag()
cls_i.onClassifier.classifier.setCandidates(cls_i.getCandidates(h).size * cls.size)
}

typedClassifier.getCandidates(h) foreach {
cls.foreach {
currentClassifier: ConstrainedClassifier[_, HEAD] =>
assert(currentClassifier.onClassifier.classifier.getClass.getName.contains("SparseNetworkLearner"), "The classifier should be of type SparseNetworkLearner!")
val oracle = currentClassifier.onClassifier.getLabeler
val baseClassifier = currentClassifier.onClassifier.classifier.asInstanceOf[SparseNetworkLearner]
currentClassifier.getCandidates(h) foreach {
candidate =>
{
def trainOnce() = {
val result = typedClassifier.classifier.discreteValue(candidate)

val result = currentClassifier.classifier.discreteValue(candidate)
val trueLabel = oracle.discreteValue(candidate)
val ilearner = typedClassifier.onClassifier.classifier.asInstanceOf[SparseNetworkLearner]
val lLexicon = typedClassifier.onClassifier.getLabelLexicon
val lLexicon = currentClassifier.onClassifier.getLabelLexicon
var LTU_actual: Int = 0
var LTU_predicted: Int = 0
for (i <- 0 until lLexicon.size()) {
@@ -69,26 +75,25 @@ object JointTrainSparseNetwork {
// and the LTU of the predicted class should be demoted.
if (!result.equals(trueLabel)) //equals("true") && trueLabel.equals("false") )
{
val a = typedClassifier.onClassifier.getExampleArray(candidate)
val a = currentClassifier.onClassifier.getExampleArray(candidate)
val a0 = a(0).asInstanceOf[Array[Int]] //exampleFeatures
val a1 = a(1).asInstanceOf[Array[Double]] // exampleValues
val exampleLabels = a(2).asInstanceOf[Array[Int]]
val label = exampleLabels(0)
var N = ilearner.getNetwork.size
val N = baseClassifier.getNetwork.size

if (label >= N || ilearner.getNetwork.get(label) == null) {
val conjugateLabels = ilearner.isUsingConjunctiveLabels | ilearner.getLabelLexicon.lookupKey(label).isConjunctive
ilearner.setConjunctiveLabels(conjugateLabels)
if (label >= N || baseClassifier.getNetwork.get(label) == null) {
val conjugateLabels = baseClassifier.isUsingConjunctiveLabels | baseClassifier.getLabelLexicon.lookupKey(label).isConjunctive
baseClassifier.setConjunctiveLabels(conjugateLabels)

val ltu: LinearThresholdUnit = ilearner.getBaseLTU
ltu.initialize(ilearner.getNumExamples, ilearner.getNumFeatures)
ilearner.getNetwork.set(label, ltu)
N = label + 1
val ltu: LinearThresholdUnit = baseClassifier.getBaseLTU.clone().asInstanceOf[LinearThresholdUnit]
ltu.initialize(baseClassifier.getNumExamples, baseClassifier.getNumFeatures)
baseClassifier.getNetwork.set(label, ltu)
}

// test push
val ltu_actual = ilearner.getLTU(LTU_actual).asInstanceOf[LinearThresholdUnit]
val ltu_predicted = ilearner.getLTU(LTU_predicted).asInstanceOf[LinearThresholdUnit]
val ltu_actual = baseClassifier.getLTU(LTU_actual).asInstanceOf[LinearThresholdUnit]
val ltu_predicted = baseClassifier.getLTU(LTU_predicted).asInstanceOf[LinearThresholdUnit]

if (ltu_actual != null)
ltu_actual.promote(a0, a1, 0.1)
Contributor:

We are promoting/demoting by a fixed update of 0.1, shouldn't we take into account the learning rate parameter. The update rule inside LinearThresholdUnit's learn function is according to the learning rate and margin thickness.

Member Author (@kordjamshidi, Nov 6, 2016):

yes, this has remained here from my very first trial version. How should I pass the parameters, do you think that I just add it to the list of input parameters? Since we have two apply versions it can not have the default value for both cases as well, I guess. Isn't it a separate issue to have a consistent way for parameter setting in Saul?

Contributor:

The baseLTU already has all parameters to use. We can directly call the learn function to use those parameters.

val labelValues = a(3).asInstanceOf[Array[Double]]

if (ltu_actual != null) {
    // Learn as a positive example
    ltu_actual.learn(a0, a1, Array(1), labelValues)
}

if (ltu_predicted != null) {
    // Learn as a negative example
    ltu_predicted.learn(a0, a1, Array(0), labelValues)
}

Also it might be better to rename all the variables a, a0, a1 etc for better readability.

Member Author:

call learn?! and what we are doing here then?

Member Author:

learn does not use internal prediction result?

Contributor:

https://github.com/IllinoisCogComp/lbjava/blob/master/lbjava/src/main/java/edu/illinois/cs/cogcomp/lbjava/learn/LinearThresholdUnit.java#L462

Learn promotes or demotes the LTU's weight vector. The third argument controls if promote should be called or demote should be called.

Member Author:

what about the score, s?

Contributor:

Looks fine if we cannot use learn. My only concern was that using a fixed learning rate might affect performance. We can fix that separately.

@@ -100,8 +105,13 @@ object JointTrainSparseNetwork {
trainOnce()
}
}

}
}
if (lossAugmented)
cls.foreach { cls_i =>
cls_i.onClassifier.classifier.unsetLossFlag()
}
}
train(node, cls, it - 1, false)
}
@@ -14,7 +14,7 @@ import scala.reflect.ClassTag

/** Created by parisakordjamshidi on 29/01/15.
*/
object JointTrain {
object JointTrainSparsePerceptron {
def testClassifiers(cls: Classifier, oracle: Classifier, ds: List[AnyRef]): Unit = {

val results = ds.map({
@@ -18,7 +18,10 @@ object InitSparseNetwork {
//this means we are not reading any model into the SparseNetworks
// but we forget all the models and go over the data to build the right
// size for the lexicon and the right number of the ltu s

cClassifier.onClassifier.classifier.forget()
assert(cClassifier.onClassifier.classifier.getClass.getName.contains("SparseNetworkLearner"), "The classifier should be of type SparseNetworkLearner!")

val iLearner = cClassifier.onClassifier.classifier.asInstanceOf[SparseNetworkLearner]
allHeads.foreach {
head =>
@@ -33,7 +36,7 @@ object InitSparseNetwork {
if (label >= N || iLearner.getNetwork.get(label) == null) {
val isConjunctiveLabels = iLearner.isUsingConjunctiveLabels | iLearner.getLabelLexicon.lookupKey(label).isConjunctive
iLearner.setConjunctiveLabels(isConjunctiveLabels)
val ltu: LinearThresholdUnit = iLearner.getBaseLTU
val ltu: LinearThresholdUnit = iLearner.getBaseLTU.clone().asInstanceOf[LinearThresholdUnit]
Member:

what is the necessity for clone()?

Member Author:

this bug was caught by @bhargav, it needs to create a new instance of linear threshold here each time a new label is met. This was the main bug for the SparseNetwork initialization.

ltu.initialize(iLearner.getNumExamples, iLearner.getNumFeatures)
iLearner.getNetwork.set(label, ltu)
}
@@ -1,3 +1,9 @@
/** This software is released under the University of Illinois/Research and Academic Use License. See
* the LICENSE file in the root folder for details. Copyright (c) 2016
*
* Developed by: The Cognitive Computations Group, University of Illinois at Urbana-Champaign
* http://cogcomp.cs.illinois.edu/
*/
package edu.illinois.cs.cogcomp.saul.classifier.JoinTrainingTests

import edu.illinois.cs.cogcomp.infer.ilp.OJalgoHook
@@ -71,7 +77,7 @@ class InitializeSparseNetwork extends FlatSpec with Matchers {
val wv1After = clNet1.getNetwork.get(0).asInstanceOf[LinearThresholdUnit].getWeightVector
val wv2After = clNet2.getNetwork.get(0).asInstanceOf[LinearThresholdUnit].getWeightVector

wv1After.size() should be(5)
wv1After.size() should be(6)
wv2After.size() should be(12)
}

@@ -0,0 +1,32 @@
/** This software is released under the University of Illinois/Research and Academic Use License. See
* the LICENSE file in the root folder for details. Copyright (c) 2016
*
* Developed by: The Cognitive Computations Group, University of Illinois at Urbana-Champaign
* http://cogcomp.cs.illinois.edu/
*/
package edu.illinois.cs.cogcomp.saulexamples.Badge;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class BadgeReader {
public List<String> badges;

public BadgeReader(String dataFile) {
badges = new ArrayList<String>();

try {
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(dataFile)));

String str;
while ((str = br.readLine()) != null) {
badges.add(str);
}

br.close();
}catch (Exception e) {}
}
}
Member:

Could you apply the autoformatter on this file?
