Unsupervised Learning

t-SNE

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a dimensionality reduction technique that is particularly well suited for visualizing high-dimensional datasets. It can be implemented via Barnes-Hut approximations, allowing it to scale to large real-world datasets; it has been applied to datasets with up to 30 million examples. A Python implementation is available at https://github.com/ztlevi/t-sne-python/blob/master/tsne.py.
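
As a quick illustration, here is a minimal sketch of running t-SNE with scikit-learn to visualize a high-dimensional dataset in 2D (the dataset and hyperparameter choices are illustrative assumptions, not prescribed by this page):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# A small high-dimensional dataset: 8x8 digit images flattened to 64-D vectors.
X, y = load_digits(return_X_y=True)

# Project to 2-D. perplexity roughly controls how many neighbors each point
# considers; 30 is a common default.
X_2d = TSNE(n_components=2, perplexity=30, init="pca",
            random_state=0).fit_transform(X)

# Scatter-plot the embedding, colored by digit label.
plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y, s=5, cmap="tab10")
plt.title("t-SNE embedding of the digits dataset")
plt.show()
```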

Principal Component Analysis (PCA)

PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.

  • Principal Component Analysis (PCA) is a dimension reduction technique that projects the data into a lower-dimensional space

  • PCA uses Singular Value Decomposition (SVD), a matrix factorization method that decomposes a matrix into three smaller matrices

  • PCA finds the top N principal components, which are the dimensions along which the data varies (spreads out) the most. Intuitively, the more spread out the data is along a specific dimension, the more information it contains, and thus the more important that dimension is for recognizing patterns in the dataset

  • PCA can be used as a pre-step for data visualization, reducing high-dimensional data into 2D or 3D. An alternative dimensionality reduction technique is t-SNE

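To make the SVD connection concrete, here is a minimal sketch (on an arbitrary random dataset, an illustrative assumption) that computes the top principal components directly from the SVD and checks them against scikit-learn's PCA:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))           # 200 samples, 5 features

# PCA via SVD: center the data, then decompose X_centered = U @ diag(S) @ Vt.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

k = 2                                   # keep the top-2 principal components
X_proj_svd = X_centered @ Vt[:k].T      # rows of Vt are the principal axes

# The same projection with scikit-learn.
X_proj_sk = PCA(n_components=k).fit_transform(X)

# The two agree up to a per-component sign flip.
print(np.allclose(np.abs(X_proj_svd), np.abs(X_proj_sk)))  # True
```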

kmeans

k-means partitions the data into k clusters by alternating two steps until the assignments stop changing: assign each point to its nearest cluster centroid, then recompute each centroid as the mean of the points assigned to it (Lloyd's algorithm).
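
Below is a minimal NumPy sketch of this loop (the toy data and the choice of k are illustrative assumptions):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k randomly chosen data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: label each point with its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        # (keeping the old centroid if a cluster happens to be empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels

# Toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
centroids, labels = kmeans(X, k=2)
print(centroids)  # one centroid near (0, 0), the other near (5, 5)
```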