@ -144,7 +144,7 @@ return Linear(context_layer) -> [6, 1, 4096]
```
## GLMBlock
input
| \
| RMSNorm
@ -158,5 +158,5 @@ return Linear(context_layer) -> [6, 1, 4096]
| dropout
| /
Add
Every output has shape [6, 1, 4096], where 6 is sequence_length, 1 is batch_num, and 4096 is hidden_size.
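
The shape claim can be checked with a minimal NumPy sketch of one residual branch from the diagram (RMSNorm, then a sublayer, then dropout, then Add). This is only an illustration under assumptions: `rms_norm`, `residual_branch`, and the random linear map `W` are hypothetical stand-ins, not the model's actual code, and the steps the diagram elides between RMSNorm and dropout are replaced by a single placeholder matrix multiply.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-5):
    # Scale each hidden vector by the inverse of its root mean square.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def residual_branch(x, weight, sublayer, dropout_p=0.0, rng=None):
    # input -> RMSNorm -> sublayer -> dropout -> Add(input)
    h = sublayer(rms_norm(x, weight))
    if dropout_p > 0.0 and rng is not None:
        # Inverted dropout: zero random elements, rescale the survivors.
        keep = rng.random(h.shape) >= dropout_p
        h = h * keep / (1.0 - dropout_p)
    return x + h

rng = np.random.default_rng(0)
seq_len, batch, hidden = 6, 1, 4096  # shapes quoted in the text

x = rng.standard_normal((seq_len, batch, hidden), dtype=np.float32)
weight = np.ones(hidden, dtype=np.float32)

# Hypothetical placeholder for the elided sublayer: one linear map.
W = rng.standard_normal((hidden, hidden), dtype=np.float32) / np.float32(np.sqrt(hidden))

out = residual_branch(x, weight, lambda h: h @ W)
print(out.shape)  # (6, 1, 4096)
```

Because the branch ends in an element-wise Add with the input, the sublayer must map `hidden_size` back to `hidden_size`, which is why the shape is unchanged from input to output.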