TL;DR: The Stitchable Neural Networks paper introduces some interesting concepts. Its approach makes it possible to build multiple neural networks with varying complexity and performance trade-offs from a family of pretrained models.

Key Principles

- How to choose anchors from well-performing pretrained models in a model family
- The design of stitching layers
- The stitching direction and strategy
- A simple but effective training strategy

A key question about combining sub-networks from different pretrained models is ho...
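To make the idea of a stitching layer more concrete, here is a minimal sketch in NumPy. It assumes a stitching layer is a plain linear map that projects activations from one anchor's feature space into another's, fitted here by least squares on a batch of paired activations. The function names (`fit_stitching_layer`, `stitch`), the feature dimensions, and the random toy data are all hypothetical and chosen for illustration; the exact design and initialization should be checked against the paper.

```python
import numpy as np


def fit_stitching_layer(feats_src, feats_dst):
    """Fit W, b by least squares so that feats_src @ W + b approximates feats_dst.

    feats_src: (n_samples, d_src) activations from the source anchor.
    feats_dst: (n_samples, d_dst) activations from the target anchor.
    """
    # Append a column of ones so the bias is solved jointly with the weights.
    ones = np.ones((feats_src.shape[0], 1))
    design = np.hstack([feats_src, ones])
    solution, *_ = np.linalg.lstsq(design, feats_dst, rcond=None)
    return solution[:-1], solution[-1]


def stitch(feats_src, W, b):
    """Project source-anchor activations into the target anchor's feature space."""
    return feats_src @ W + b


# Toy data (assumption): the "small" anchor has 8-dim features, the "large" one 12-dim,
# and the two are related by an exact affine map so the fit can be verified.
rng = np.random.default_rng(0)
small_feats = rng.normal(size=(256, 8))
true_map = rng.normal(size=(8, 12))
large_feats = small_feats @ true_map + 0.5

W, b = fit_stitching_layer(small_feats, large_feats)
mapped = stitch(small_feats, W, b)
```

In a real setting the paired activations would come from running the same inputs through both pretrained anchors up to the chosen stitching points, and the fitted map would typically serve only as a starting point before further training.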