Large pretrained language models have shown a surprising in-context learning (ICL) ability: given a few demonstration input-label pairs, they can predict the label for an unseen input without any additional parameter updates. Despite its strong empirical performance, the working mechanism of ICL remains an open problem. To better understand how ICL works, this paper explains language models as meta-optimizers and interprets ICL as a kind of implicit finetuning.
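The few-shot setup described in the abstract can be sketched as follows; the demonstration texts and the `build_icl_prompt` helper are illustrative assumptions, not from the paper:

```python
# Minimal sketch of in-context learning prompt construction (illustrative only):
# a few demonstration input-label pairs are concatenated before the query,
# and the model is asked to complete the label for the unseen input,
# with no parameter updates involved.

def build_icl_prompt(demonstrations, query):
    """Join demonstration pairs and the unseen query into a single prompt."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in demonstrations]
    # The model completes the text after the final "Label:".
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

# Hypothetical sentiment-classification demonstrations.
demos = [("great movie!", "positive"), ("boring plot.", "negative")]
prompt = build_icl_prompt(demos, "a delightful surprise")
print(prompt)
```

Feeding such a prompt to a pretrained language model and reading off its completion is the ICL procedure whose mechanism the paper analyzes.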
2022: Damai Dai, Yutao Sun, Li Dong, Y. Hao, Zhifang Sui, Furu Wei
https://arxiv.org/pdf/2212.10559v2.pdf