LSTM Ext.

Standard RNN

Stacked RNN & LSTM

Sample code for stacked LSTM

from keras.models import Sequential
from keras.layers import LSTM, Embedding, Dense

vocabulary = 10000
embedding_dim = 32
word_num = 500
state_dim = 32

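model = Sequential()
# Map each of the 500 word indices to a 32-dimensional vector.
model.add(Embedding(vocabulary, embedding_dim, input_length=word_num))
# Lower LSTM layers return the full state sequence so the next LSTM has a sequence to read.
model.add(LSTM(state_dim, return_sequences=True, dropout=0.2))
model.add(LSTM(state_dim, return_sequences=True, dropout=0.2))
# The top LSTM returns only its final state, which feeds the classifier.
model.add(LSTM(state_dim, return_sequences=False, dropout=0.2))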
model.add(Dense(1, activation='sigmoid'))

Layer (type)               Output Shape       Param #
embedding_1 (Embedding)    (None, 500, 32)    320000
lstm_1 (LSTM)              (None, 500, 32)    8320
lstm_2 (LSTM)              (None, 500, 32)    8320
lstm_3 (LSTM)              (None, 32)         8320
dense_1 (Dense)            (None, 1)          33

Total params: 344,993
Trainable params: 344,993
Non-trainable params: 0
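The parameter counts in the summary above can be checked by hand. The following is a minimal sketch, assuming the standard LSTM layout of four gate blocks, each with an input kernel, a recurrent kernel, and a bias. The helper names `embedding_params`, `lstm_params`, and `dense_params` are hypothetical and not part of Keras.

```python
def embedding_params(vocabulary, embedding_dim):
    # One embedding vector per vocabulary entry.
    return vocabulary * embedding_dim


def lstm_params(input_dim, state_dim):
    # Four gate blocks, each with weights over [input, previous state] plus a bias.
    return 4 * (state_dim * (input_dim + state_dim) + state_dim)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units


vocabulary, embedding_dim, state_dim = 10000, 32, 32

counts = [
    embedding_params(vocabulary, embedding_dim),  # embedding_1
    lstm_params(embedding_dim, state_dim),        # lstm_1 reads the embeddings
    lstm_params(state_dim, state_dim),            # lstm_2 reads lstm_1's states
    lstm_params(state_dim, state_dim),            # lstm_3 reads lstm_2's states
    dense_params(state_dim, 1),                   # dense_1
]
total = sum(counts)

print(counts)  # [320000, 8320, 8320, 8320, 33]
print(total)   # 344993
```

All three LSTM layers have the same count here only because the embedding dimension and the state dimension are both 32.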

Bidirectional RNN & LSTM

Sample code for Bi-LSTM

from keras.models import Sequential
from keras.layers import LSTM, Embedding, Dense, Bidirectional

vocabulary = 10000
embedding_dim = 32
word_num = 500
state_dim = 32

model = Sequential()
model.add(Embedding(vocabulary, embedding_dim, input_length=word_num))
model.add(Bidirectional(LSTM(state_dim, return_sequences=False, dropout=0.2)))
model.add(Dense(1, activation='sigmoid'))

Layer (type)                    Output Shape       Param #
embedding_1 (Embedding)         (None, 500, 32)    320000
bidirectional_1 (Bidirection)   (None, 64)         16640
dense_1 (Dense)                 (None, 1)          65

Total params: 336,705
Trainable params: 336,705
Non-trainable params: 0
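To illustrate why the bidirectional layer's output is twice as wide as a single LSTM's, here is a toy sketch of the mechanism. It deliberately swaps the LSTM for a scalar tanh RNN to stay short, and the names `run_rnn` and `bidirectional` and all weight values are hypothetical. One recurrence reads the sequence left to right, a second one with its own weights reads it right to left, and their final states are concatenated.

```python
import math


def run_rnn(sequence, w_x, w_h):
    """Toy scalar RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1}); returns the final state."""
    h = 0.0
    for x in sequence:
        h = math.tanh(w_x * x + w_h * h)
    return h


def bidirectional(sequence, fwd_weights, bwd_weights):
    """Run one RNN forward and a separate RNN backward, then concatenate the final states."""
    h_fwd = run_rnn(sequence, *fwd_weights)
    h_bwd = run_rnn(sequence[::-1], *bwd_weights)  # same inputs, reversed order
    return [h_fwd, h_bwd]


# Each direction has its own (w_x, w_h) pair; the values are arbitrary.
states = bidirectional([1.0, 2.0, 3.0], (0.5, 0.1), (-0.3, 0.2))
print(len(states))  # 2: one state per direction
```

In the Keras model above, each direction has a 32-dimensional state, so the concatenated output has 64 dimensions. The two directions do not share weights, so the parameter count doubles to 2 × 8320 = 16640.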

Summary

  • SimpleRNN and LSTM are two kinds of RNNs; always use LSTM instead of SimpleRNN.

  • Use Bi-RNN instead of RNN whenever possible.

  • Stacked RNN may be better than a single RNN layer (if n, the number of training samples, is big).

  • Pretrain the embedding layer (if n is small).
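One way to read the last bullet: in the Bi-LSTM model above, the embedding layer holds most of the parameters, so it is the part most exposed to overfitting when training data is scarce. The sketch below works through that arithmetic using the counts from the summary table. It assumes the pretrained embedding would be frozen during training, which is an assumption and not something these notes state.

```python
# Parameter counts copied from the Bi-LSTM summary table.
layer_params = {"embedding": 320000, "bidirectional": 16640, "dense": 65}

total = sum(layer_params.values())
embedding_share = layer_params["embedding"] / total

# Assumption: the pretrained embedding is frozen, so only the other layers are trained.
trainable_if_frozen = total - layer_params["embedding"]

print(total)                     # 336705
print(f"{embedding_share:.1%}")  # 95.0%
print(trainable_if_frozen)       # 16705
```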
