ac-learn

Where the All Contributors machine can learn about your contributions.

Install

yarn add ac-learn
# or
npm i -D ac-learn

Usage

const Learner = require('ac-learn')

//If you want to load a learner from a JSON export:
const learner = Learner.fromJSON(require('./your-learner.json'))
//If you want to use the default one instead:
// const learner = new Learner()
//If you want to use your own dataset or customise the learner, check https://github.com/all-contributors/ac-learn#learner

//Training
learner.train() //Or
learner.train(someTrainingSet)

//Testing and getting stats
const fullStats = learner.eval()

//Cross-validation
const {microAvg, macroAvg} = learner.crossValidate()

//Confusion matrix (as string or console table)
const textualTable = learner.confusionMatrix.toString()
const cmTable = learner.confusionMatrix.toTable()

//Classifying an input
const output = learner.classify(someInput)
//Getting an input from an output
const input = learner.backClassify(someOutput)

//Saving the model to a JSON file
const savedModel = learner.toJSON()
const {writeFileSync} = require('fs')
writeFileSync('your-learner.json', JSON.stringify(savedModel))

Documentation

Learner

NodeJS Classification-based learner.

Parameters

  • opts Object Options.
    • opts.dataset Array<Object> Dataset (for training and testing) (optional, default require('./conv')('io'))
    • opts.splits Array<number> Dataset split fractions for the training and validation sets; the remainder is used for testing (default: 70%/15%/15%) (optional, default [.7,.15])
    • opts.classifier function (): Object Classifier builder function (optional, default classifierBuilder)
    • opts.pastTrainingSamples Array<Object> Past training samples for the classifier (optional, default [])
    • opts.classes Array<string> List of classes (categories) (optional, default require('./categories'))

Examples

Using pre-defined data

const learner = new Learner()

Using a custom dataset

const learner = new Learner({
  dataset: [
    {input: 'something bad', output: 'bad'},
    {input: 'a good thing', output: 'good'},
  ],
})

Using a specified classifier function

const learner = new Learner({
  classifier: myClassifierBuilderFn, //see the ./classifier module for an example (or check out `limdu`'s examples)
})

Changing the train/test split percentage

const learner = new Learner({
  splits: [0.6, 0.2],
})
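
To make the meaning of splits concrete, here is a minimal sketch of how a two-element array such as [0.6, 0.2] could divide a dataset into training, validation and test parts. splitDataset is a hypothetical helper written for illustration, not part of ac-learn, and the library's own partitioning (for example whether it shuffles first) may differ.

```javascript
// Hypothetical helper (not ac-learn's API): slices a dataset into
// train/validation/test parts; the test part is whatever remains.
function splitDataset(dataset, splits = [0.7, 0.15]) {
  const [trainFrac, validationFrac] = splits
  const trainEnd = Math.round(dataset.length * trainFrac)
  const validationEnd = trainEnd + Math.round(dataset.length * validationFrac)
  return {
    train: dataset.slice(0, trainEnd),
    validation: dataset.slice(trainEnd, validationEnd),
    test: dataset.slice(validationEnd),
  }
}

// 20 made-up samples in the {input, output} shape used above
const samples = Array.from({length: 20}, (_, i) => ({
  input: `sample ${i}`,
  output: i % 2 ? 'bad' : 'good',
}))

const parts = splitDataset(samples, [0.6, 0.2])
console.log(parts.train.length, parts.validation.length, parts.test.length) // 12 4 4
```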

(Re-)Using past-training samples

const learner = new Learner({
  pastTrainingSamples: [
    {input: 'something bad', output: 'bad'},
    {input: 'a good thing', output: 'good'},
  ],
})

train

Parameters
  • trainSet Array<Object> Training set (optional, default this.trainSet)

eval

Parameters
  • log boolean Log events (optional, default false)

Returns Object Statistics from a confusion matrix

serializeClassifier

Returns string Serialized classifier

serializeAndSaveClassifier

Parameters
  • file string Filename (optional, default 'classifier.json')

Returns Promise<(string | Error)> Serialized classifier

deserializeClassifier

Parameters
  • serializedClassifier string Serialized classifier

Returns Object Deserialized classifier

loadAndDeserializeClassifier

Parameters
  • file string Filename (optional, default 'classifier.json')

Returns Promise<(string | Error)> Deserialized classifier

classify

Parameters
  • data {input: any, output: any} Data to classify

Returns Array<string> Classes

crossValidate

Parameters
  • numOfFolds number Cross-validation folds (optional, default 5)
  • verboseLevel number Verbosity level on limdu's explanations (optional, default 0)
  • log boolean Cross-validation logging (optional, default false)

Returns {microAvg: Object, macroAvg: Object} Averages
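
As a rough illustration of what numOfFolds controls, the sketch below partitions a list of items into k train/test pairs, where each item lands in the test part exactly once. kFolds is a hypothetical helper for illustration only; it is not the code that ac-learn or limdu runs.

```javascript
// Hypothetical helper: item i goes into the test part of fold (i % numOfFolds)
// and into the training part of every other fold.
function kFolds(items, numOfFolds = 5) {
  const folds = []
  for (let fold = 0; fold < numOfFolds; fold++) {
    const test = items.filter((_, i) => i % numOfFolds === fold)
    const train = items.filter((_, i) => i % numOfFolds !== fold)
    folds.push({train, test})
  }
  return folds
}

const folds = kFolds([...Array(10).keys()], 5)
console.log(folds[0].test) // [ 0, 5 ]
console.log(folds[0].train.length) // 8
```

Each fold is then trained and evaluated separately, and the per-fold statistics are averaged.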

backClassify

Parameters
  • category string Category name.

Returns Array<string> Labels associated with category

toJSON

JSON representation of the learner with the serialized classification model.

Returns Object JSON representation

fromJSON

Parameters

Returns Learner Generated learner from json

getCategoryPartition

Get the observational overall/train/validation/test count for each class in the associated dataset.

Parameters
  • log boolean Log events (optional, default false)
  • outputFile string Filename for the output (to be used by chart.html) (optional, default '')

Returns Object<string, {overall: number, test: number, validation: number, train: number}> Partitions

getStats

Parameters
  • log boolean Log events (optional, default false)
  • categoryPartitionOutput string Filename for the output of the category partitions. (optional, default '')

Returns Object Statistics

ConfusionMatrix

Multi-class focused confusion matrix.

addEntry

Parameters
  • actual string Actual class
  • predicted string Predicted class

Returns number Updated entry

setEntry

Parameters

getEntry

Parameters
  • actual string Actual class
  • predicted string Predicted class

Returns number Entry

getTotal

Get the total count of all entries.

Returns number Total count

getTP

Number of elements in the category class correctly predicted.

Parameters
  • category string Class/category considered as positive

Returns number True Positives

getFP

Number of elements that aren't in the category class but predicted as such.

Parameters
  • category string Class/category considered as positive

Returns number False Positives

getFN

Number of elements in the category class but predicted as not being in it.

Parameters
  • category string Class/category considered as positive

Returns number False Negatives

getTN

Number of elements that aren't in the category class correctly predicted.

Parameters
  • category string Class/category considered as positive

Returns number True Negatives
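
The four counts described by getTP, getFP, getFN and getTN can be sketched on a small three-class matrix as follows. The nested-object layout and the counts helper are assumptions made for this example and say nothing about how ConfusionMatrix stores its entries internally.

```javascript
// Rows are actual classes, columns are predicted classes.
const matrix = {
  bug: {bug: 5, code: 0, other: 1},
  code: {bug: 1, code: 2, other: 0},
  other: {bug: 0, code: 3, other: 8},
}
const classes = Object.keys(matrix)

// Hypothetical helper: one-vs-rest counts with `category` as the positive class.
function counts(category) {
  let tp = 0
  let fp = 0
  let fn = 0
  let tn = 0
  for (const actual of classes) {
    for (const predicted of classes) {
      const n = matrix[actual][predicted]
      if (actual === category && predicted === category) tp += n
      else if (predicted === category) fp += n // predicted as category, but isn't
      else if (actual === category) fn += n // is category, but predicted otherwise
      else tn += n
    }
  }
  return {tp, fp, fn, tn}
}

console.log(counts('bug')) // { tp: 5, fp: 1, fn: 1, tn: 13 }
```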

getDiagonal

Diagonal of truth (top-left → bottom-right)

Returns Array<number> Numbers in the diagonal

getTrue

Number of correct (truthful) predictions.

Returns number TP

getFalse

Number of incorrect predictions.

Returns number FP + FN

getPositive

Number of real (actual) "positive" elements (i.e. elements that belong to the category class).

Parameters
  • category string Class/category considered as positive

Returns number TP + FN

getNegative

Number of real (actual) "negative" elements (i.e. elements that don't belong to the category class).

Parameters
  • category string Class/category considered as positive

Returns number TN + FP

getPredPositive

Number of predicted "positive" elements (i.e. elements guessed as belonging to the category class).

Parameters
  • category string Class/category considered as positive

Returns number TP + FP

getPredNegative

Number of predicted "negative" elements (i.e. elements guessed as not belonging to the category class).

Parameters
  • category string Class/category considered as positive

Returns number TN + FN

getSupport

Support value (count/occurrences) of category in the matrix

Parameters
  • category string Class/category to look at

Returns number Support value

getAccuracy

Prediction accuracy for category.

Parameters
  • category string Class/category considered as positive

Returns number (TP + TN) / (TP + TN + FP + FN)

getMicroAccuracy

Micro-average of accuracy.

Returns number (TP0 + ... + TPn + TN0 + ... + TNn) / (TP0 + ... + TPn + TN0 + ... + TNn + FP0 + ... + FPn + FN0 + ... + FNn)

getMacroAccuracy

Macro-average of accuracy.

Returns number (A0 + ... + An-1) / n

getWeightedAccuracy

Weighted accuracy.

Returns number (A0 * s0 + ... + An * sn) / Total
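
The difference between the micro, macro and weighted averages can be sketched with per-category counts. The numbers below are made up for illustration, and the sketch assumes that the support s of a category is its number of actual members (TP + FN) and that Total is the sum of all supports.

```javascript
// Made-up one-vs-rest counts for three categories (20 samples overall)
const stats = {
  bug: {tp: 5, fp: 1, fn: 1, tn: 13},
  code: {tp: 2, fp: 3, fn: 1, tn: 14},
  other: {tp: 8, fp: 1, fn: 3, tn: 8},
}
const categories = Object.values(stats)
const sum = (values) => values.reduce((a, b) => a + b, 0)

const accuracy = ({tp, fp, fn, tn}) => (tp + tn) / (tp + tn + fp + fn)
const support = ({tp, fn}) => tp + fn // assumption: support = actual members
const total = sum(categories.map(support))

// Macro: plain mean of the per-category accuracies
const macro = sum(categories.map(accuracy)) / categories.length
// Micro: pool the counts of every category, then compute a single ratio
const micro =
  sum(categories.map((c) => c.tp + c.tn)) /
  sum(categories.map((c) => c.tp + c.tn + c.fp + c.fn))
// Weighted: per-category accuracies weighted by their support
const weighted = sum(categories.map((c) => accuracy(c) * support(c))) / total

console.log(macro.toFixed(4), micro.toFixed(4), weighted.toFixed(4)) // 0.8333 0.8333 0.8300
```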

getTotalPositiveRate

Prediction recall.

Parameters
  • category string Class/category considered as positive

Returns number TP / (TP + FN)

getMicroRecall

Micro-average of recall.

Returns number (TP0 + ... + TPn) / (TP0 + ... + TPn + FN0 + ... + FNn)

getMacroRecall

Macro-average of recall.

Returns number (R0 + R1 + ... + Rn-1) / n

getWeightedRecall

Weighted recall.

Returns number (R0 * s0 + ... + Rn * sn) / Total

getPositivePredictiveValue

Prediction precision for category.

Parameters
  • category string Class/category considered as positive

Returns number TP / (TP + FP)

getF1

Prediction F1 score for category.

Parameters
  • category string Class/category considered as positive

Returns number 2 * (Pr * R) / (Pr + R)
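
As a worked example of the per-category precision, recall and F1 formulas, the sketch below evaluates them on a small matrix. The function names and the nested-object layout are illustrative assumptions, and zero denominators are not handled.

```javascript
// Rows are actual classes, columns are predicted classes.
const matrix = {
  bug: {bug: 5, code: 0, other: 1},
  code: {bug: 1, code: 2, other: 0},
  other: {bug: 0, code: 3, other: 8},
}
const classes = Object.keys(matrix)
const sum = (values) => values.reduce((a, b) => a + b, 0)

// Precision = TP / (TP + FP): diagonal cell over its column total
function precision(category) {
  const predictedAs = sum(classes.map((actual) => matrix[actual][category]))
  return matrix[category][category] / predictedAs
}

// Recall = TP / (TP + FN): diagonal cell over its row total
function recall(category) {
  return matrix[category][category] / sum(Object.values(matrix[category]))
}

// F1 = 2 * (Pr * R) / (Pr + R)
function f1(category) {
  const pr = precision(category)
  const r = recall(category)
  return (2 * (pr * r)) / (pr + r)
}

console.log(
  precision('code').toFixed(3),
  recall('code').toFixed(3),
  f1('code').toFixed(3),
) // 0.400 0.667 0.500
```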

getMicroPrecision

Micro-average of the precision.

Returns number (TP0 + ... + TPn) / (TP0 + ... + TPn + FP0 + ... + FPn)

getMacroPrecision

Macro-average of the precision.

Returns number (Pr0 + Pr1 + ... + Pr_n-1) / n

getWeightedPrecision

Weighted precision.

Returns number (Pr0 * s0 + ... + Prn * sn) / Total

getMicroF1

Micro-average of the F1 score.

Returns number 2 * (TP0 + ... + TPn) / (2 * (TP0 + ... + TPn) + (FN0 + ... + FNn) + (FP0 + ... + FPn))

getMacroF1

Macro-average of the F1 score.

Returns number (F0_1 + F1_1 + ... + F_n-1_1) / n

getWeightedF1

Weighted F1.

Returns number (F01 * s0 + ... + Fn1 * sn) / Total

getFalseNegativeRate

Miss rates on predictions for category.

Parameters
  • category string Class/category considered as positive

Returns number FN / (TP + FN)

getMicroMissRate

Micro-average of the miss rate.

Returns number (FN0 + ... + FNn) / (TP0 + ... + TPn + FN0 + ... + FNn)

getMacroMissRate

Macro-average of the miss rate.

Returns number (M0 + M1 + ... + Mn) / n

getWeightedMissRate

Weighted miss rate.

Returns number (M0 * s0 + ... + Mn * sn) / Total

getFalsePositiveRate

Fall out (false alarm) on predictions for category.

Parameters
  • category string Class/category considered as positive

Returns number FP / (FP + TN)

getMicroFallOut

Micro-average of the fall out.

Returns number (FP0 + ... + FPn) / (FP0 + ... + FPn + TN0 + ... + TNn)

getMacroFallOut

Macro-average of the fall out.

Returns number (Fo0 + Fo1 + ... + Fo_n) / n

getWeightedFallOut

Weighted fall out.

Returns number (Fo0 * s0 + ... + Fon * sn) / Total

getTrueNegativeRate

Specificity on predictions for category.

Parameters
  • category string Class/category considered as positive

Returns number TN / (FP + TN)
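
The four rates (recall, miss rate, fall out and specificity) pair up as complements: recall and miss rate share the denominator TP + FN, while fall out and specificity share FP + TN. A short sketch with made-up counts for one category shows the relationship; the variable names are illustrative only.

```javascript
// Made-up one-vs-rest counts for a single category
const {tp, fp, fn, tn} = {tp: 5, fp: 1, fn: 1, tn: 13}

const recall = tp / (tp + fn) // true positive rate
const missRate = fn / (tp + fn) // false negative rate
const fallOut = fp / (fp + tn) // false positive rate
const specificity = tn / (fp + tn) // true negative rate

console.log(
  recall.toFixed(4),
  missRate.toFixed(4),
  fallOut.toFixed(4),
  specificity.toFixed(4),
) // 0.8333 0.1667 0.0714 0.9286
```

In this example recall + missRate and fallOut + specificity both come to 1 (up to floating-point rounding), so one rate of each pair is enough to recover the other.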

getMicroSpecificity

Micro-average of the specificity.

Returns number (TN0 + ... + TNn) / (FP0 + ... + FPn + TN0 + ... + TNn)

getMacroSpecificity

Macro-average of the specificity.

Returns number (S0 + S1 + ... + Sn) / n

getWeightedSpecificity

Weighted specificity.

Returns number (S0 * s0 + ... + Sn * sn) / Total

getPrevalence

Prevalence on predictions for category.

Parameters
  • category string Class/category considered as positive

Returns number (TP + FN) / (TP + TN + FP + FN)

getMicroPrevalence

Micro-average of the prevalence.

Returns number (TP0 + ... + TPn + FN0 + ... + FNn) / (TP0 + ... + TPn + TN0 + ... + TNn + FP0 + ... + FPn + FN0 + ... + FNn)

getMacroPrevalence

Macro-average of the prevalence.

Returns number (Pe0 + Pe1 + ... + Pen) / n

getWeightedPrevalence

Weighted prevalence.

Returns number (Pe0 * s0 + ... + Pen * sn) / Total

toString

Textual tabular representation of the confusion matrix.

Parameters
  • opt Object Options (optional, default {})
    • opt.split boolean Split the classes in half (→ 2 matrices) (optional, default false)
    • opt.clean boolean Remove empty column/row pairs (optional, default false)
    • opt.colours boolean Colourize cells (optional, default true)
    • opt.maxValue (optional, default 100)
Examples

Example output (cf. /src/tests/confusionMatrix.js)

```
Actual \ Predicted  bug   code  other
------------------  ----  ----  -----
bug                 5.00  0.00  1.00
code                1.00  2.00  0.00
other               0.00  3.00  8.00
```

Returns string String representation

toTable

console.table version of confusionMatrix.toString().

Parameters
  • opt Object Options (optional, default {})
    • opt.split boolean Split the classes in half (→ 2 matrices) (optional, default false)
    • opt.clean boolean Remove empty column/row pairs (optional, default false)
    • opt.colours boolean Colourize cells (optional, default true)
    • opt.maxValue (optional, default 100)

getShortStats

Parameters
  • type string Type of stats (micro/macro/weighted average) (optional, default 'micro')

Returns string Short statistics (total, true, false, accuracy, precision, recall and f1)

getStats

Returns {total: number, correctPredictions: number, incorrectPredictions: number, classes: Array<string>, microAvg: Object, macroAvg: Object, results: Object} (Long) statistics

fromData

Creates a confusion matrix from the actual and predictions classes.

Parameters

Returns ConfusionMatrix Filled confusion matrix
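
The idea of filling a matrix from paired actual and predicted labels can be sketched as below. buildMatrix, its argument order and the nested-object result are assumptions made for illustration; the actual parameters of fromData are not listed in this README.

```javascript
// Hypothetical stand-in for the idea behind fromData: count how often each
// (actual, predicted) pair occurs. Rows are actual classes, columns predicted.
function buildMatrix(actual, predictions, classes) {
  const matrix = {}
  for (const a of classes) {
    matrix[a] = {}
    for (const p of classes) matrix[a][p] = 0
  }
  actual.forEach((a, i) => {
    matrix[a][predictions[i]] += 1
  })
  return matrix
}

const actual = ['bug', 'bug', 'code', 'other']
const predictions = ['bug', 'other', 'code', 'code']
const m = buildMatrix(actual, predictions, ['bug', 'code', 'other'])
console.log(m.bug) // { bug: 1, code: 0, other: 1 }
```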

Contributors

Maximilian Berkmann
💻 📖 🤔 🚧 📦 🚇 ⚠️ 🛡️

Angel Aviel Domaoan
💻 🚧 👀

Dependabot
🛡️

Gregor Martynus
👀

This project follows the all-contributors specification. Contributions of any kind welcome!