generation encoder ResNet
Generation-based HuggingFace implementation for quantized BERT.
- Input
  - 4879-dim embedding
- Encoder
  - 92 x ResNet with 52 heads
- Output
  - accuracy projection
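
The figures in the list above can be collected in a small config object. This is only a sketch: the class name EncoderSpec, its field names, and the helper per_head_split are hypothetical and are not taken from the implementation. It records the three numbers and checks how the embedding width divides across the heads.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EncoderSpec:
    """Hypothetical container for the numbers listed above."""

    embed_dim: int = 4879   # "4879-dim embedding"
    num_blocks: int = 92    # "92 x ResNet"
    num_heads: int = 52     # "52 heads"

    def per_head_split(self) -> Tuple[int, int]:
        """Return (quotient, remainder) of embed_dim divided by num_heads."""
        return divmod(self.embed_dim, self.num_heads)


spec = EncoderSpec()
quotient, remainder = spec.per_head_split()
print(spec)
print(f"embed_dim / num_heads = {quotient} remainder {remainder}")
```

The remainder is non-zero, so the 52 heads cannot each take an equal slice of the 4879-dim embedding. How the heads are actually wired is not stated here.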
Training config
optimizer=AdamW, lr=0.784, scheduler=exponential, warmup=1032
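
As a rough illustration of how the warmup and exponential scheduler settings could combine, the sketch below computes the learning rate at a given step. Several things are assumptions and not stated in the config: that warmup=1032 counts optimizer steps, that warmup is linear from zero, and the decay factor GAMMA, which is a made-up placeholder. The function name lr_at is hypothetical.

```python
BASE_LR = 0.784       # lr from the config line
WARMUP_STEPS = 1032   # warmup from the config line; unit (steps) is an assumption
GAMMA = 0.999         # hypothetical per-step decay factor, not given in the config


def lr_at(step: int,
          base_lr: float = BASE_LR,
          warmup: int = WARMUP_STEPS,
          gamma: float = GAMMA) -> float:
    """Learning rate at `step`: linear warmup (assumed), then exponential decay."""
    if step < warmup:
        # Ramp linearly from 0 up to base_lr over the warmup period.
        return base_lr * (step / warmup)
    # After warmup, multiply by gamma once per step.
    return base_lr * gamma ** (step - warmup)


for s in (0, 516, 1032, 2000):
    print(s, lr_at(s))
```

With these assumptions the rate peaks at the configured lr exactly when warmup ends and decays from there. The real schedule depends on the decay factor actually used.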