CVDL - Recurrent Neural Networks
10. Recurrent Neural Networks
10.1 RNN
Introduction
data:image/s3,"s3://crabby-images/2d8e4/2d8e4700ff7dc71560f15a4c47fb0e33f630e101" alt="RNN-1-1"
Variants
- Basic form
data:image/s3,"s3://crabby-images/59b4a/59b4a43623af19a7b994bd5216a2b00b04c1e546" alt="RNN-1-2"
- Deeper (stacked recurrent layers)
data:image/s3,"s3://crabby-images/8f2dc/8f2dc64027fcd186845a456fc88d1f391df6efff" alt="RNN-1-3"
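- Two network designs
  - Elman Network: the previous hidden state is fed back into the hidden layer
  - Jordan Network: the previous output is fed back into the hidden layer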
data:image/s3,"s3://crabby-images/72f8b/72f8b04dd9e1dc1ac84695c381420c4f247e123b" alt="RNN-1-4"
- Bidirectional design: Bidirectional RNN
data:image/s3,"s3://crabby-images/14e4b/14e4b446f36011344d0273a3084015319852cdc7" alt="RNN-1-5"
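The variants above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code behind the figures: the function names, the `tanh` activation, and the weight names `W_xh`, `W_hh`, `b_h` are assumptions made for the example. It runs an Elman-style recurrence and then builds a bidirectional layer by running a second RNN over the reversed sequence and concatenating the two hidden states at each time step.

```python
import numpy as np

def elman_forward(xs, W_xh, W_hh, b_h):
    """Run an Elman-style RNN over xs of shape [T, input_dim].

    Returns the hidden state at every time step, shape [T, hidden_dim].
    """
    h = np.zeros(W_hh.shape[0])
    states = []
    for x in xs:
        # The previous hidden state h is fed back into the hidden layer.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

def bidirectional_forward(xs, fwd_params, bwd_params):
    """Concatenate a left-to-right pass and a right-to-left pass."""
    h_fwd = elman_forward(xs, *fwd_params)
    # Run the backward RNN on the reversed input, then flip its states
    # back so that row t of both arrays refers to the same time step.
    h_bwd = elman_forward(xs[::-1], *bwd_params)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)

rng = np.random.default_rng(0)
T, input_dim, hidden_dim = 5, 3, 4  # toy sizes chosen for the example

def init_params():
    return (rng.normal(scale=0.1, size=(hidden_dim, input_dim)),
            rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
            np.zeros(hidden_dim))

xs = rng.normal(size=(T, input_dim))
fwd_params, bwd_params = init_params(), init_params()
h_bi = bidirectional_forward(xs, fwd_params, bwd_params)
print(h_bi.shape)  # (5, 8)
```

Each row of `h_bi` sees the whole sequence: the first half summarizes everything up to step t, the second half everything from step t onward.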
10.2 LSTM
Definition
data:image/s3,"s3://crabby-images/f8206/f82067857efe04f8525b65a384eb4253c505f05f" alt="RNN-2-1"
- Function diagram:
data:image/s3,"s3://crabby-images/43ee8/43ee8700937e2fc20a1ae986b0bba6933876b072" alt="RNN-2-2"
Workflow
Workflow of a single neuron:
![]()
RNN workflow:
![]()
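As a rough companion to the workflow, here is one step of an LSTM cell in NumPy. It uses the commonly taught formulation with input, forget, and output gates; the exact activations and wiring in the figures may differ, and the names `lstm_step`, `W`, and `b`, as well as the packing of all four weight blocks into one matrix, are assumptions made for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step.

    x: input [D]; h_prev, c_prev: previous hidden and cell state [H].
    W: stacked weights [4*H, D+H]; b: stacked biases [4*H].
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])          # input gate: how much new content to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much old memory to keep
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much memory to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate content
    c = f * c_prev + i * g       # additive memory update
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
D, H, T = 3, 4, 6  # toy sizes chosen for the example
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(T, D)):
    h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)
```

Running the whole network amounts to calling `lstm_step` once per time step and carrying `h` and `c` forward, as the loop at the end does.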
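Training Problems
- Unfortunately, the gradients backpropagated through an RNN during training are often far from smooth, so vanishing or exploding gradients occur easily.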
data:image/s3,"s3://crabby-images/59461/5946176366789e8147f974b61321df8758c82508" alt="RNN-2-5"
data:image/s3,"s3://crabby-images/f1cf4/f1cf4257498fdd4c71ce082b7018fc8520026344" alt="RNN-2-6"
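- Why does this happen? Too many repeated multiplications! Backpropagating through time multiplies by the recurrent weight once per time step.
- Mitigation: control the learning rate (when it is too large, cap it at xx; when it is too small, xxx)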
data:image/s3,"s3://crabby-images/1b44d/1b44dc5adea1a9ac7c804145ddef152c3df85e72" alt="RNN-2-7"
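The effect of repeated multiplication shows up even in a toy case. The sketch below assumes a one-dimensional linear RNN with `h_t = w * h_{t-1}` and `h_0 = 1`, so the final state is `w**T` and its derivative with respect to `w` is `T * w**(T-1)`. It also includes norm-based gradient clipping, which is a common remedy for exploding gradients but is swapped in here for illustration and is not necessarily the method shown in the figure. All function names are made up for the example.

```python
import numpy as np

def final_state(w, T=1000):
    """h_T for a 1-D linear RNN with h_t = w * h_{t-1} and h_0 = 1."""
    h = 1.0
    for _ in range(T):
        h = w * h
    return h

def grad_wrt_w(w, T=1000):
    """Analytic derivative of w**T with respect to w."""
    return T * w ** (T - 1)

def clip_by_norm(grad, max_norm):
    """Rescale grad so that its L2 norm is at most max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        return grad * (max_norm / norm)
    return grad

# A weight slightly below or above 1 changes the outcome by many
# orders of magnitude after 1000 steps.
for w in (0.99, 1.0, 1.01):
    print(w, final_state(w), grad_wrt_w(w))

print(clip_by_norm(np.array([3.0, 4.0]), max_norm=1.0))
```

A tiny change in `w` moves the gradient from negligible to enormous, which is why a single fixed learning rate struggles: it is either too large for the steep regions or too small for the flat ones.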
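Advantages of LSTM
- LSTM by design mitigates the training problems above, chiefly vanishing gradients: the memory cell is updated additively, so its contents are not overwritten at every step unless the forget gate closes.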
data:image/s3,"s3://crabby-images/ccabf/ccabf87ba9e30034a8b36e10c25baf91e9036eab" alt="RNN-2-8"
10.3 More Applications
Many to One
data:image/s3,"s3://crabby-images/0a66a/0a66a0c3c31a2d64c2566a4d01cf870dba916c4d" alt="RNN-2-9"
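A many-to-one model reads the whole sequence and emits a single output, for example a class label for a sentence. A minimal sketch, assuming a plain `tanh` RNN whose last hidden state feeds a softmax classifier; the parameter names and toy sizes are illustrative only:

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify_sequence(xs, params):
    """Map a sequence xs of shape [T, D] to one probability vector."""
    W_xh, W_hh, b_h, W_hy, b_y = params
    h = np.zeros(W_hh.shape[0])
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    # Only the final hidden state is used for the prediction.
    return softmax(W_hy @ h + b_y)

rng = np.random.default_rng(0)
D, H, num_classes = 3, 4, 2  # toy sizes chosen for the example
params = (rng.normal(scale=0.1, size=(H, D)),
          rng.normal(scale=0.1, size=(H, H)),
          np.zeros(H),
          rng.normal(scale=0.1, size=(num_classes, H)),
          np.zeros(num_classes))

xs = rng.normal(size=(7, D))
probs = classify_sequence(xs, params)
print(probs)
```

The same parameters handle sequences of any length, since the loop simply runs for as many steps as there are inputs.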
Many to Many (Output is shorter)
data:image/s3,"s3://crabby-images/402cd/402cd053cc00acc7144a0c42039450a2c6b05110" alt="RNN-2-10"
Many to Many (No Limitation)
data:image/s3,"s3://crabby-images/71527/71527f44da8e067b2144dd8b3abcd2dd405fcd17" alt="RNN-2-11"
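When the output length is not tied to the input length, a common arrangement is an encoder that compresses the input into a state and a decoder that keeps emitting tokens until it produces a stop symbol. The sketch below assumes that setup with greedy decoding; the token id `EOS`, the reuse of `EOS` as the start symbol, the embedding table `E`, and all weight names are hypothetical choices for the example, and the weights are random, so the emitted tokens carry no meaning.

```python
import numpy as np

EOS = 0  # hypothetical end-of-sequence token id, also used as the start symbol

def encode(xs, W_xh, W_hh):
    """Compress an input sequence [T, D] into a single hidden state [H]."""
    h = np.zeros(W_hh.shape[0])
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h)
    return h

def greedy_decode(h, E, W_hh, W_hy, max_len=20):
    """Emit token ids until EOS is produced or max_len is reached."""
    tokens = []
    prev = EOS
    for _ in range(max_len):
        # The previous token's embedding and the state drive the next step.
        h = np.tanh(E[prev] + W_hh @ h)
        prev = int(np.argmax(W_hy @ h))
        if prev == EOS:
            break
        tokens.append(prev)
    return tokens

rng = np.random.default_rng(0)
D, H, vocab = 3, 4, 6  # toy sizes chosen for the example
W_xh = rng.normal(size=(H, D))
W_hh_enc = rng.normal(size=(H, H))
W_hh_dec = rng.normal(size=(H, H))
E = rng.normal(size=(vocab, H))
W_hy = rng.normal(size=(vocab, H))

xs = rng.normal(size=(5, D))
tokens = greedy_decode(encode(xs, W_xh, W_hh_enc), E, W_hh_dec, W_hy)
print(tokens)
```

The `max_len` cap is a safety net: an untrained or poorly trained decoder may never emit the stop symbol on its own.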
10.4 Attention Mechanism
Definition
data:image/s3,"s3://crabby-images/85e76/85e76ebab11655beaaa5d6c3a62db588bc8e747d" alt="RNN-4-1"
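In its simplest form, attention scores every stored vector against a query, normalizes the scores with a softmax, and returns the weighted sum. The sketch below assumes a dot-product match function; the figure may use a different scoring function, and the names `attend`, `keys`, and `values` are chosen for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def attend(query, keys, values):
    """Dot-product attention for a single query.

    query: [d]; keys: [T, d]; values: [T, d_v].
    Returns the context vector [d_v] and the attention weights [T].
    """
    scores = keys @ query       # one match score per position
    weights = softmax(scores)   # normalize so the weights sum to 1
    context = weights @ values  # weighted sum of the values
    return context, weights

# Toy example: the query matches the first key far better than the others.
keys = 10.0 * np.eye(3)
values = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([1.0, 0.0, 0.0])

context, weights = attend(query, keys, values)
print(weights)
print(context)
```

In translation or captioning, the query would be the decoder state at the current output step, and the keys and values would be the encoder states or image region features.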
Example 1: Machine Translation
data:image/s3,"s3://crabby-images/af35e/af35e11f9abf3636ba6e46dac265f29873b0e52f" alt="RNN-4-2"
data:image/s3,"s3://crabby-images/4e366/4e366b4fa1be3314eaa3944504500dd2a46aa7f6" alt="RNN-4-3"
Example 2: Image Captioning
data:image/s3,"s3://crabby-images/a39a3/a39a3516606bbd489b20b8f1fff4a6f794dc26d6" alt="RNN-4-4"
data:image/s3,"s3://crabby-images/97b41/97b41414530ab5f04c071f54e4e819401f0d3adb" alt="RNN-4-5"
data:image/s3,"s3://crabby-images/9641b/9641b859f1474ea00b9716cb5cfa5eb41ef00849" alt="RNN-4-6"
Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit isSeymour when reposting!