Knowledge Amalgamation from Heterogeneous Pre-Trained Models

In recent years, many pre-trained models have been released to the deep learning community for reuse. These models, however, typically contain heterogeneous knowledge derived from their distinct tasks and datasets, which restricts their reuse on a downstream task requiring a knowledge base beyond what any single pre-trained model provides. This dissertation therefore studies knowledge amalgamation: the problem of how best to combine complementary knowledge from multiple pre-trained models, referred to as teachers, into a lightweight student model. We propose four challenging knowledge amalgamation tasks:

1. We combine pre-trained word embedding models (teachers), each of which represents a different set of words and encodes unique word relationships, into a student that learns meta-embeddings for all words handled by all teachers.
2. We integrate the discriminative knowledge captured differently by pre-trained multi-class classifiers (teachers) into a student that becomes an expert on the union of the teachers' classes (a setup sketched below).
3. We unify pre-trained multi-label teacher models, each of which typically encodes unique label dependencies informed by its particular label set, into a multi-label student able to handle all labels across teachers.
4. We fuse knowledge from pre-trained multi-task teacher models, each of which captures unique knowledge generalized toward a different task set, into a high-quality common feature representation, learned by a student, that is useful for all tasks across all teachers.
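To make the teacher-student setup concrete, the following is a minimal, hypothetical PyTorch sketch of the situation in task 2, not the dissertation's actual method: each frozen teacher is a multi-class classifier over its own subset of classes, and the student is trained over the union of all classes by matching each teacher's softened predictions on the corresponding slice of the student's logits. The function name amalgamation_loss, the class_maps index lists, and the temperature T are illustrative assumptions, not names from the dissertation.

import torch
import torch.nn.functional as F

def amalgamation_loss(student_logits, teachers, x, class_maps, T=4.0):
    # student_logits: (batch, |union of all teachers' classes|) on input batch x.
    # teachers: frozen pre-trained multi-class classifiers.
    # class_maps[i]: student-logit indices corresponding to teacher i's classes.
    # T: softmax temperature for distillation (illustrative default).
    loss = 0.0
    for teacher, idx in zip(teachers, class_maps):
        with torch.no_grad():  # teachers only provide soft targets; they stay frozen
            teacher_probs = F.softmax(teacher(x) / T, dim=1)
        # Restrict the student's logits to this teacher's classes and
        # match the teacher's softened distribution via KL divergence.
        student_log_probs = F.log_softmax(student_logits[:, idx] / T, dim=1)
        loss = loss + F.kl_div(student_log_probs, teacher_probs,
                               reduction="batchmean") * (T * T)
    return loss / len(teachers)

Minimizing such a loss on unlabeled transfer data is a standard multi-teacher distillation baseline; the dissertation's contributions address the harder heterogeneous settings listed above, where the teachers' vocabularies, label sets, or task sets differ.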

Identifier
  • etd-103821
Year
  • 2023
Date created
  • 2023-04-18
Source
  • etd-103821
Last modified
  • 2023-06-06

Permanent link to this page: https://digital.wpi.edu/show/dv13zx73k