The largest memory cost for ICL is storing the model's parameters at inference time, while for SFT and SIT it is the memory requirements during fine-tuning.
We focus on the comparison between SFT, SIT and ICL in few-shot multilingual and cross-lingual setups, aiming to make the comparison as fair as possible across languages, learning paradigms and models, and targeting the following setups:

In-Language Generalisation.
Table 3 suggests that the storage cost for models used for ICL is at least 4× higher than for the models used in SIT and SFT.

Inference Cost.
While Flan-T5 was pretrained mostly in English and several high-resource languages, mT0-XL offers a more comprehensive and balanced multilingual pretraining set.

The inputs for ICL were designed in a cross-lingual manner, where the task descriptions and context were in English while the few-shot examples and the sentence to be analysed were provided in the target language.
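To make the cross-lingual input design concrete, the following is a minimal sketch of how such a prompt could be assembled; the template wording, field names, and the German sentiment examples are illustrative assumptions, not the exact prompts used in this work.

```python
# Sketch of a cross-lingual ICL prompt: English task description,
# few-shot examples and the input sentence in the target language.
# Template and labels are hypothetical, for illustration only.

def build_crosslingual_prompt(task_description, few_shot_examples, target_sentence):
    """Assemble an ICL prompt with an English instruction followed by
    target-language demonstrations and the sentence to be analysed."""
    parts = [task_description, ""]
    for sentence, label in few_shot_examples:
        parts.append(f"Sentence: {sentence}")
        parts.append(f"Label: {label}")
        parts.append("")
    parts.append(f"Sentence: {target_sentence}")
    parts.append("Label:")  # the model completes the label
    return "\n".join(parts)

# Example: sentiment classification with German few-shot examples.
prompt = build_crosslingual_prompt(
    "Classify the sentiment of the sentence as positive or negative.",
    [("Das Essen war hervorragend.", "positive"),
     ("Der Service war enttäuschend.", "negative")],
    "Ich komme gerne wieder.",
)
print(prompt)
```

The English instruction anchors the task while the demonstrations supply the target-language surface form, matching the cross-lingual setup described above.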
In this work, we explore whether such language-specific PEFT-style adaptation can improve ICL and generation capabilities of LLMs in languages other than English.