Posted on 2024-12-01, 00:00, authored by Alperen Gormez
This work aims to reduce the inference and training costs of deep learning models by utilizing
early exit networks (a minimal sketch of the general early-exit idea appears after the list below).
In particular, we introduce four algorithms:
1. E2CM, a simple and lightweight early exit algorithm that reduces inference cost. In a
separate line of work, we also show how early exit networks can be combined with model
pruning.
2. CBT, an algorithm that further decreases the inference cost of early exit semantic
segmentation networks.
3. EEPrune, a novel dataset pruning algorithm that uses early exit networks to reduce training
cost.
4. Class-aware EE LLM, a novel weight initialization algorithm for early exit large language
models to accelerate pre-training.
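
The four contributions above are specific algorithms, but they share the same backbone: an early exit network, i.e., a model with lightweight classifier heads attached to intermediate layers so that easy inputs can stop computing early. The sketch below illustrates only this generic idea, not E2CM, CBT, EEPrune, or the class-aware initialization; the layer sizes, exit-head design, and confidence threshold are illustrative assumptions.

```python
# Minimal sketch of confidence-thresholded early exiting (generic idea only;
# not any specific algorithm from this thesis). All dimensions and the
# threshold value are illustrative assumptions.
import torch
import torch.nn as nn


class EarlyExitNet(nn.Module):
    def __init__(self, num_classes: int = 10, threshold: float = 0.9):
        super().__init__()
        self.threshold = threshold
        # Backbone split into stages so the forward pass can stop between them.
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Linear(32, 64), nn.ReLU()),
            nn.Sequential(nn.Linear(64, 64), nn.ReLU()),
            nn.Sequential(nn.Linear(64, 64), nn.ReLU()),
        ])
        # One small exit head per stage; the last one acts as the final classifier.
        self.exits = nn.ModuleList([nn.Linear(64, num_classes) for _ in self.stages])

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, int]:
        """Run stages in order and return (logits, exit_index) from the first
        sufficiently confident exit head. Shown for a single input (batch size 1)."""
        logits, exit_index = None, -1
        for i, (stage, head) in enumerate(zip(self.stages, self.exits)):
            x = stage(x)
            logits = head(x)
            exit_index = i
            confidence = logits.softmax(dim=-1).max().item()
            if confidence >= self.threshold:
                break  # confident enough: skip the remaining, more expensive stages
        return logits, exit_index


if __name__ == "__main__":
    model = EarlyExitNet()
    sample = torch.randn(1, 32)
    logits, exit_idx = model(sample)
    print(f"exited at stage {exit_idx}, prediction = {logits.argmax(dim=-1).item()}")
```

The same structure is what makes the contributions possible: an exit rule decides when to stop at inference time, and the per-sample exit behavior can in turn inform training-time decisions such as which samples or weights to keep.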