LSTM Ext.


Standard RNN
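
For comparison with the stacked LSTM below, here is a minimal sketch of a single-layer standard (Simple) RNN classifier in Keras, using the same hyperparameters as the examples on this page:

from keras.models import Sequential
from keras.layers import SimpleRNN, Embedding, Dense

vocabulary = 10000
embedding_dim = 32
word_num = 500
state_dim = 32

model = Sequential()
model.add(Embedding(vocabulary, embedding_dim, input_length=word_num))
# a single SimpleRNN layer; only the final hidden state is returned
model.add(SimpleRNN(state_dim, return_sequences=False))
model.add(Dense(1, activation='sigmoid'))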

Stacked RNN & LSTM

Sample code for stacked LSTM

from keras.models import Sequential
from keras.layers import LSTM, Embedding, Dense

vocabulary = 10000
embedding_dim = 32
word_num = 500
state_dim = 32

model = Sequential()
model.add(Embedding(vocabulary, embedding_dim, input_length=word_num))
# intermediate LSTM layers return the full hidden-state sequence
# (return_sequences=True) so the next LSTM layer can consume it
model.add(LSTM(state_dim, return_sequences=True, dropout=0.2))
model.add(LSTM(state_dim, return_sequences=True, dropout=0.2))
# the last LSTM layer returns only the final hidden state
model.add(LSTM(state_dim, return_sequences=False, dropout=0.2))
model.add(Dense(1, activation='sigmoid'))

Layer (type)                     Output Shape          Param #
embedding_1 (Embedding)          (None, 500, 32)       320000
lstm_1 (LSTM)                    (None, 500, 32)       8320
lstm_2 (LSTM)                    (None, 500, 32)       8320
lstm_3 (LSTM)                    (None, 32)            8320
dense_1 (Dense)                  (None, 1)             33

Total params: 344,993
Trainable params: 344,993
Non-trainable params: 0
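
As a quick check on these counts: the Embedding layer has vocabulary × embedding_dim = 10000 × 32 = 320000 weights, and each LSTM layer has 4 × (state_dim × (state_dim + input_dim) + state_dim) = 4 × (32 × (32 + 32) + 32) = 8320 parameters, one weight matrix and bias per gate (input, forget, output, and candidate cell).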

Bidirectional RNN & LSTM

Sample code for Bi-LSTM

from keras.models import Sequential
from keras.layers import LSTM, Embedding, Dense, Bidirectional

vocabulary = 10000
embedding_dim = 32
word_num = 500
state_dim = 32

model = Sequential()
model.add(Embedding(vocabulary, embedding_dim, input_length=word_num))
# wrap the LSTM in a Bidirectional layer
model.add(Bidirectional(LSTM(state_dim, return_sequences=False, dropout=0.2)))
model.add(Dense(1, activation='sigmoid'))

Layer (type)                     Output Shape          Param #
embedding_1 (Embedding)          (None, 500, 32)       320000
bidirectional_1 (Bidirectional)  (None, 64)            16640
dense_1 (Dense)                  (None, 1)             65

Total params: 336,705
Trainable params: 336,705
Non-trainable params: 0
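
The bidirectional layer contains a forward and a backward LSTM, so it has 2 × 8320 = 16640 parameters, and its output is the concatenation of the two final states (2 × state_dim = 64), which is why the Dense layer has 64 + 1 = 65 parameters.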

Summary

  • SimpleRNN and LSTM are two kinds of RNNs; always prefer LSTM over SimpleRNN.

  • Use a bidirectional RNN (Bi-RNN / Bi-LSTM) instead of a unidirectional one whenever possible.

  • A stacked RNN may work better than a single RNN layer if the training set size n is large.

  • Pretrain the embedding layer if the training set size n is small (see the sketch below).
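
A minimal sketch of the last point: the Embedding layer is initialized from pretrained word vectors (pretrained_embeddings is a hypothetical placeholder here) and frozen with trainable=False, so only the LSTM and Dense layers are trained on the small dataset:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Embedding, Dense

vocabulary = 10000
embedding_dim = 32
word_num = 500
state_dim = 32

# placeholder for pretrained word vectors of shape (vocabulary, embedding_dim),
# e.g. taken from an embedding trained on a much larger corpus
pretrained_embeddings = np.random.rand(vocabulary, embedding_dim)

model = Sequential()
# initialize the embedding with the pretrained vectors and freeze it
model.add(Embedding(vocabulary, embedding_dim, input_length=word_num,
                    weights=[pretrained_embeddings], trainable=False))
model.add(LSTM(state_dim, return_sequences=False, dropout=0.2))
model.add(Dense(1, activation='sigmoid'))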