# ML-stats

A Python package with some statistical tools for evaluating Machine Learning models.

## Table of contents

- [ML-stats](#ml-stats)
- [Table of contents](#table-of-contents)
- [Installation](#installation)
- [Use the package](#use-the-package)
- [Current contents of the package](#current-contents-of-the-package)

## Installation

To install the package, download the source files for now, add the source directory to your Python path, and import the required functionality.
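A minimal sketch of that manual setup (the download location below is hypothetical):

```python
import sys

# Hypothetical path to the downloaded repository.
sys.path.append("/path/to/ML-stats")

# The modules can then be imported as in the examples below.
from src.classifier_comparisons import BlockDesign
```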

## Use the package

To do a statistical test, you need to:

1. Construct a matrix of the experimental results. The rows are the datasets/blocks, the columns are the methods/groups, and the values of the matrix are the recorded performance metric (i.e., what the experiment measures). For now, let's assume random results:

   ```python
   import numpy as np
   import pandas as pd

   matrix = pd.DataFrame(
       np.random.randn(2, 2),
       columns=['method1', 'method2'],
       index=['dataset1', 'dataset2'],
   )
   ```

2. Create an instance of the `BlockDesign` class. The `BlockDesign` class stores the results and preprocesses them for later use. You can specify precision, threshold...

   ```python
   from src.classifier_comparisons import BlockDesign

   block_design = BlockDesign(matrix, threshold=0.01, precision=4, higher_is_better=True)
   ```

3. Give this instance to the appropriate statistical test (a full sketch follows this list):

   ```python
   from src.multiple_classifiers import friedman_test

   test_results = friedman_test(block_design, alpha=0.05)
   ```
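Putting the three steps together, a minimal end-to-end sketch (the 10x3 shape and the seed are arbitrary; the exact structure of `test_results` depends on the implementation):

```python
import numpy as np
import pandas as pd

from src.classifier_comparisons import BlockDesign
from src.multiple_classifiers import friedman_test

# Step 1: a results matrix with datasets as rows and methods as columns.
rng = np.random.default_rng(42)
matrix = pd.DataFrame(
    rng.standard_normal((10, 3)),
    columns=['method1', 'method2', 'method3'],
    index=[f'dataset{i}' for i in range(1, 11)],
)

# Step 2: wrap the results in a BlockDesign.
block_design = BlockDesign(matrix, threshold=0.01, precision=4, higher_is_better=True)

# Step 3: run the omnibus test.
test_results = friedman_test(block_design, alpha=0.05)
print(test_results)
```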

## Current contents of the package

Assuming that the results are stored in a matrix (let's create a random one for now):

```python
import numpy as np
import pandas as pd

matrix = pd.DataFrame(
    np.random.randn(2, 2),
    columns=['method1', 'method2'],
    index=['dataset1', 'dataset2'],
)
```

Comparing classifiers:

1. Compute the average ranks (Friedman); a manual sketch follows this list:

   ```python
   from src.classifier_comparisons import BlockDesign

   average_ranks = BlockDesign(matrix).to_ranks()
   ```

2. Compute the wins/ties/losses between different methods:

   ```python
   from src.classifier_comparisons import BlockDesign

   wins_ties_losses = BlockDesign(matrix).to_wins_ties_losses()
   ```
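For intuition, here is a manual sketch of the average ranks: rank the methods within each dataset (row), then average per method. This assumes `higher_is_better=True`, so the best score gets rank 1; `to_ranks()` presumably works along these lines, plus the precision/threshold handling of `BlockDesign` (the variable names are illustrative):

```python
# Rank methods within each dataset; ties receive the average rank.
ranks_per_dataset = matrix.rank(axis=1, ascending=False, method='average')

# Average rank of each method across datasets (lower is better).
average_ranks_manual = ranks_per_dataset.mean(axis=0)
```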

Non-parametric tests:

1. Friedman test

   ```python
   from src.classifier_comparisons import BlockDesign
   from src.multiple_classifiers import friedman_test

   block_design = BlockDesign(matrix)
   test_results = friedman_test(block_design, alpha=0.05)
   ```
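As a sanity check, the same omnibus test is available in SciPy as `scipy.stats.friedmanchisquare`. It takes one array of per-dataset scores per method and requires at least three methods, so it needs a wider matrix than the 2x2 toy example above:

```python
from scipy.stats import friedmanchisquare

# One sample per method (column); rows are the blocks (datasets).
stat, p_value = friedmanchisquare(*(matrix[col] for col in matrix.columns))
```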

Non-parametric post-hoc tests:

1. Nemenyi Friedman test

   ```python
   from src.classifier_comparisons import BlockDesign
   from src.multiple_classifiers import nemenyi_friedman_test

   block_design = BlockDesign(matrix)
   p_values, sign_diffs = nemenyi_friedman_test(block_design, alpha=0.05)
   ```

2. Bonferroni-Dunn test

   ```python
   from src.classifier_comparisons import BlockDesign
   from src.multiple_classifiers import bonferroni_dunn_test

   block_design = BlockDesign(matrix)
   p_values, sign_diffs = bonferroni_dunn_test(block_design, alpha=0.05)
   ```
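For reference, the Nemenyi test declares two methods significantly different when their average ranks differ by at least the critical difference CD = q_alpha * sqrt(k(k+1) / (6N)), where k is the number of methods and N the number of datasets (Demšar, 2006). A sketch of that constant, approximating the infinite-df Studentized range critical value with a large df (the function name is illustrative, not part of the package):

```python
import numpy as np
from scipy.stats import studentized_range

def nemenyi_critical_difference(num_methods, num_datasets, alpha=0.05):
    """Critical difference for average ranks: CD = q_alpha * sqrt(k(k+1) / (6N))."""
    # q_alpha is the Studentized range critical value divided by sqrt(2);
    # a large df stands in for the infinite-df table values.
    q_alpha = studentized_range.ppf(1 - alpha, num_methods, 10_000) / np.sqrt(2)
    return q_alpha * np.sqrt(num_methods * (num_methods + 1) / (6 * num_datasets))
```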
