asbisen/PreProcessing.jl: PreProcessing for ML modeled after sklearn.preprocessing

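A library modeled after the preprocessing module of scikit-learn. It intends to implement scikit-learn-style transformers; the sessions below demonstrate StandardScaler, MinMaxScaler, Binarizer, and MaxAbsScaler, each on the same 8×4 integer matrix.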

TODO:

Generalize to handle 1D arrays

julia> using PreProcessing

julia> x = rand(-10:10, 8,4)
8×4 Array{Int64,2}:
   3   8  -2  -10
   3  10  -4    3
  -9   4   9   -5
   0   3   9  -10
   6  -8   4   -4
  -2   5  -5    7
 -10   2   9    3
  -6   6  -8    1

julia> clf = fit(StandardScaler, x)
PreProcessing.StandardScaler{Float64}([-1.875, 3.75, 1.5, -1.875], [5.55512, 5.06828, 6.61438, 5.92532], 4, 1)

julia> xnew = transform(clf, x)
8×4 Array{Float64,2}:
  0.877569    0.838548   -0.52915   -1.37123
  0.877569    1.23316    -0.831522   0.822741
 -1.2826      0.0493264   1.13389   -0.527398
  0.337526   -0.147979    1.13389   -1.37123
  1.41761    -2.31834     0.377964  -0.358631
 -0.0225018   0.246632   -0.982708   1.49781
 -1.46261    -0.345285    1.13389    0.822741
 -0.742558    0.443937   -1.43626    0.485206

julia> inverse_transform(clf, xnew)
8×4 Array{Float64,2}:
   3.0   8.0  -2.0  -10.0
   3.0  10.0  -4.0    3.0
  -9.0   4.0   9.0   -5.0
   0.0   3.0   9.0  -10.0
   6.0  -8.0   4.0   -4.0
  -2.0   5.0  -5.0    7.0
 -10.0   2.0   9.0    3.0
  -6.0   6.0  -8.0    1.0
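
The fitted values printed in the session above can be reproduced with plain Julia. The following is a minimal sketch of the arithmetic only, not the package's implementation; the names `mu`, `sigma`, `z`, and `xback` are illustrative. It assumes the population standard deviation (divide by n), since that is what matches the printed `5.55512` for the first column.

```julia
using Statistics  # standard library: mean, std

# The 8×4 matrix from the session above
x = [  3   8  -2  -10
       3  10  -4    3
      -9   4   9   -5
       0   3   9  -10
       6  -8   4   -4
      -2   5  -5    7
     -10   2   9    3
      -6   6  -8    1]

# Per-column statistics (dims=1 reduces over rows).
# corrected=false divides by n rather than n-1 (assumption, see lead-in).
mu    = mean(x, dims=1)                   # 1×4 row of column means
sigma = std(x, dims=1, corrected=false)   # 1×4 row of column standard deviations

# Standardize each column, then undo the scaling
z     = (x .- mu) ./ sigma
xback = z .* sigma .+ mu
```

Broadcasting the 1×4 rows against the 8×4 matrix applies each column's statistics to every row, which is why no explicit loop is needed.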

julia> x = rand(-10:10, 8,4)
8×4 Array{Int64,2}:
   3   8  -2  -10
   3  10  -4    3
  -9   4   9   -5
   0   3   9  -10
   6  -8   4   -4
  -2   5  -5    7
 -10   2   9    3
  -6   6  -8    1

julia> clf = fit(MinMaxScaler, x, range_min=-4, range_max=4)
PreProcessing.MinMaxScaler{Float64,Int64}([-10.0, -8.0, -8.0, -10.0], [6.0, 10.0, 9.0, 7.0], -4, 4, 4, 1)

julia> xnew = transform(clf, x)
8×4 Array{Float64,2}:
 3.25  3.55556  1.41176   0.0    
 3.25  4.0      0.941176  3.05882
 0.25  2.66667  4.0       1.17647
 2.5   2.44444  4.0       0.0    
 4.0   0.0      2.82353   1.41176
 2.0   2.88889  0.705882  4.0    
 0.0   2.22222  4.0       3.05882
 1.0   3.11111  0.0       2.58824

julia> inverse_transform(clf, xnew)
8×4 Array{Float64,2}:
   3.0   8.0  -2.0  -10.0
   3.0  10.0  -4.0    3.0
  -9.0   4.0   9.0   -5.0
   0.0   3.0   9.0  -10.0
   6.0  -8.0   4.0   -4.0
  -2.0   5.0  -5.0    7.0
 -10.0   2.0   9.0    3.0
  -6.0   6.0  -8.0    1.0
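
For reference, scikit-learn's min-max rescaling can be sketched in plain Julia as below. This is an illustration of that formula under the assumption that `range_min` and `range_max` play the role of scikit-learn's `feature_range`; it is not the package's implementation, and the names `xmin`, `xmax`, `unit`, and `scaled` are illustrative. The column minima and maxima agree with the fitted values printed above. Under this formula each column minimum maps to `range_min` (here -4), whereas the transformed output recorded above bottoms out at 0, so the package may rescale differently.

```julia
# The 8×4 matrix from the session above
x = [  3   8  -2  -10
       3  10  -4    3
      -9   4   9   -5
       0   3   9  -10
       6  -8   4   -4
      -2   5  -5    7
     -10   2   9    3
      -6   6  -8    1]

range_min, range_max = -4, 4

# Per-column extremes (dims=1 reduces over rows)
xmin = minimum(x, dims=1)
xmax = maximum(x, dims=1)

# Map each column to [0, 1], then stretch and shift to [range_min, range_max]
unit   = (x .- xmin) ./ (xmax .- xmin)
scaled = unit .* (range_max - range_min) .+ range_min

# Invert the two steps in reverse order
xback = (scaled .- range_min) ./ (range_max - range_min) .* (xmax .- xmin) .+ xmin
```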

julia> x = rand(-10:10, 8,4)
8×4 Array{Int64,2}:
   3   8  -2  -10
   3  10  -4    3
  -9   4   9   -5
   0   3   9  -10
   6  -8   4   -4
  -2   5  -5    7
 -10   2   9    3
  -6   6  -8    1

julia> clf = fit(Binarizer, x)
PreProcessing.Binarizer{Int64}(0, 4, 1)

julia> xnew = transform(clf, x)
8×4 Array{Int64,2}:
 1  1  0  0
 1  1  0  1
 0  1  1  0
 0  1  1  0
 1  0  1  0
 0  1  0  1
 0  1  1  1
 0  1  0  1
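
The output above is consistent with a strict comparison against a threshold of 0: the entry equal to 0 (row 4, column 1) maps to 0, not 1. A minimal sketch of that rule in plain Julia, assuming the first field of the fitted `Binarizer` is the threshold; the names `threshold` and `xbin` are illustrative and this is not the package's implementation:

```julia
# The 8×4 matrix from the session above
x = [  3   8  -2  -10
       3  10  -4    3
      -9   4   9   -5
       0   3   9  -10
       6  -8   4   -4
      -2   5  -5    7
     -10   2   9    3
      -6   6  -8    1]

threshold = 0

# Elementwise strict comparison gives a BitMatrix; Int.() converts it to 0/1 integers
xbin = Int.(x .> threshold)
```

Binarization discards magnitude, which is why the session above has no `inverse_transform` step.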

julia> x = rand(-10:10, 8,4)
8×4 Array{Int64,2}:
   3   8  -2  -10
   3  10  -4    3
  -9   4   9   -5
   0   3   9  -10
   6  -8   4   -4
  -2   5  -5    7
 -10   2   9    3
  -6   6  -8    1
  
julia> clf = fit(MaxAbsScaler, x)
MaxAbsScaler transformer with 4 features

julia> xnew = transform(clf, x)
8×4 Array{Float64,2}:
  0.3   0.8  -0.222222  -1.0
  0.3   1.0  -0.444444   0.3
 -0.9   0.4   1.0       -0.5
  0.0   0.3   1.0       -1.0
  0.6  -0.8   0.444444  -0.4
 -0.2   0.5  -0.555556   0.7
 -1.0   0.2   1.0        0.3
 -0.6   0.6  -0.888889   0.1

julia> inverse_transform(clf, xnew)
8×4 Array{Float64,2}:
   3.0   8.0  -2.0  -10.0
   3.0  10.0  -4.0    3.0
  -9.0   4.0   9.0   -5.0
   0.0   3.0   9.0  -10.0
   6.0  -8.0   4.0   -4.0
  -2.0   5.0  -5.0    7.0
 -10.0   2.0   9.0    3.0
  -6.0   6.0  -8.0    1.0
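
The scaled values above equal each entry divided by the largest absolute value in its column, so every column ends up within [-1, 1]. A minimal sketch of that arithmetic in plain Julia, with illustrative names (`maxabs`, `scaled`, `xback`); it is not the package's implementation:

```julia
# The 8×4 matrix from the session above
x = [  3   8  -2  -10
       3  10  -4    3
      -9   4   9   -5
       0   3   9  -10
       6  -8   4   -4
      -2   5  -5    7
     -10   2   9    3
      -6   6  -8    1]

# Largest absolute value in each column (dims=1 reduces over rows)
maxabs = maximum(abs.(x), dims=1)

# Scale each column by its own maximum absolute value, then undo it
scaled = x ./ maxabs
xback  = scaled .* maxabs
```

Unlike standardization, this scaling does not shift the data, so zeros stay zero.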
