This repository introduces methods of dimensionality reduction.

PNightOwlY/Advanced_Probability_Machine_Learning


Introduction

In this project, we implement three dimensionality-reduction methods: PCA, PPCA, and VAE. The Variational AutoEncoder gives the best performance, both in reconstructing images and in generating new ones.

In the experiments, we encode the high-dimensional images into 2 dimensions and then decode them back, measuring the difference between the original and the reconstruction; this difference serves as the loss function.
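As a concrete illustration, here is a minimal sketch of this encode-decode-compare loop, assuming flattened 28x28 images and using sklearn's PCA as a stand-in for any of the three models (not the repository's exact code):

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(1000, 784)        # placeholder for flattened 28x28 images

pca = PCA(n_components=2)
Z = pca.fit_transform(X)             # encode: 784 dims -> 2 dims
X_hat = pca.inverse_transform(Z)     # decode: 2 dims -> 784 dims

# Mean squared difference between original and reconstruction --
# the reconstruction loss discussed above.
print("reconstruction MSE:", np.mean((X - X_hat) ** 2))
```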

We then read a paper (attached in the paper folder) on a method called the Robust Deep AutoEncoder (RDAE), which can be used for both dimensionality reduction and anomaly detection.

Experiments

We compare the dimensionality-reduction results of the three models; the results are as follows:

PCA (2 dims)

PPCA (2 dims)

VAE (2 dims)

Conclusion

The results of PCA and PPCA are similar, and in both cases it is hard to separate the data by label in the resulting plots. The VAE performs much better at projecting the data into two dimensions, since the boundaries between the labels are very clear.
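A typical way to produce these plots is to encode a labeled test set into the 2-dimensional latent space and color the points by label. The sketch below assumes a trained `encode` function (hypothetical name) mapping images of shape (N, 784) to codes of shape (N, 2):

```python
import matplotlib.pyplot as plt

def plot_latent_space(encode, X_test, y_test):
    # Project the test images into the 2-D latent space and color by label.
    Z = encode(X_test)
    plt.scatter(Z[:, 0], Z[:, 1], c=y_test, cmap="tab10", s=4)
    plt.colorbar(label="label")
    plt.xlabel("latent dimension 1")
    plt.ylabel("latent dimension 2")
    plt.show()
```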

Generator

Here we generate some images with a temperature value of 0.2.
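One common way to apply a temperature when sampling from a VAE is to scale the latent prior by the temperature before decoding. The sketch below assumes a trained `decode` function (hypothetical name) mapping 2-D codes to images, and is not necessarily the repository's implementation:

```python
import numpy as np

def generate(decode, n=16, temperature=0.2, latent_dim=2):
    # A lower temperature shrinks the samples toward the prior mean,
    # trading diversity for more typical-looking images.
    z = temperature * np.random.randn(n, latent_dim)
    return decode(z)
```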

More advanced techniques

RDAE [1]

RDAE can be viewed as replacing the nuclear norm in Robust PCA with a non-linear autoencoder:

  1. The lowest-rank matrix L (i.e., L contains no noise) minimizes the nuclear norm.
  2. The lowest-rank matrix L (i.e., L contains no noise) can be reconstructed almost perfectly by an autoencoder.
  3. Replace the nuclear norm with a non-linear autoencoder (see the objective below).
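Concretely, this gives the RDAE objective as we read it in [1], where E and D are the encoder and decoder with parameters θ, L_D is the part of the input the autoencoder can represent well, and S collects the sparse anomalies:

```latex
\min_{\theta,\, S}\ \lVert L_D - D_\theta(E_\theta(L_D)) \rVert_2
  \;+\; \lambda \lVert S \rVert_1
\quad \text{s.t.} \quad X - L_D - S = 0
```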

The Robust Deep AutoEncoder is also more scalable, since the model can be trained on noisy data and still performs well. As mentioned, we use ADMM to minimize each part of this objective separately while the other parts are held fixed. The algorithm splits the data X into two parts, L_D and S. First, with S fixed, we train the deep autoencoder on L_D to minimize the l2 norm of L_D - D(E(L_D)), assign the reconstruction D(E(L_D)) back to L_D, and subtract L_D from X to obtain the sparse matrix S. Then, with L_D fixed, we minimize over S by applying the shrinkage operator, which solves that subproblem in closed form. Two convergence criteria determine when the model has converged.
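A minimal sketch of this alternating loop follows, with assumed helper names (`autoencoder.fit`, `autoencoder.reconstruct`, and the `shrink_l1` operator sketched in the next section); it is not the repository's exact code:

```python
import numpy as np

def train_rdae(X, autoencoder, lam, n_outer=20, n_inner=50):
    """Alternate between fitting the autoencoder to L_D and shrinking S."""
    S = np.zeros_like(X)
    for _ in range(n_outer):
        # With S fixed: fit the autoencoder to L_D = X - S, minimizing
        # ||L_D - D(E(L_D))||_2, then assign the reconstruction back to L_D.
        L_D = X - S
        autoencoder.fit(L_D, epochs=n_inner)    # assumed interface
        L_D = autoencoder.reconstruct(L_D)      # assumed interface
        # With L_D fixed: set S = X - L_D and minimize lam * ||S||_1 in
        # closed form with the shrinkage (soft-thresholding) operator.
        S = shrink_l1(X - L_D, lam)
    return L_D, S
```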

We define shrink methods for optimizing the l1 and l2,1 norms of S (the two functions sketched below), along with a training method that fits the model given the input data and an optimizer. We also provide a function that adds noise to the training data X. To evaluate the anomaly detection, we convert the sparse matrix S into labels and compute the corresponding metrics to judge the performance of the model.
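The closed-form shrinkage (proximal) operators for the two norms look roughly like this; these are the standard forms and are assumed to match the repository's shrink functions:

```python
import numpy as np

def shrink_l1(X, eps):
    """Elementwise soft-thresholding: the proximal operator of eps * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - eps, 0.0)

def shrink_l21(X, eps):
    """Columnwise shrinkage: the proximal operator of eps * ||X||_{2,1},
    where the l2,1 norm sums the l2 norms of the columns (grouping may be
    over rows instead, depending on convention). Columns whose norm is
    at most eps are zeroed out entirely."""
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    scale = np.maximum(1.0 - eps / np.maximum(norms, 1e-12), 0.0)
    return X * scale
```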

Denoising

Anomaly detection

Here, the circled numbers are the detected outliers.

The number of detected outliers decreases as we increase the value of λ, since a larger λ penalizes the l1 norm of S more heavily and therefore makes S sparser.

References

[1] Zhou, Chong, and Randy C. Paffenroth (2017). Anomaly detection with robust deep autoencoders. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 665–674.
