Leshem Choshen
@lchoshen.bsky.social
148 followers · 107 following · 317 posts
Why evaluate on huge datasets when a fast check would get you most of the way? arxiv.org/abs/2402.14992 arxiv.org/abs/2308.11696 e.g. (recent) evaluating on multiple prompts
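A minimal sketch of why a fast check can get you most of the way, assuming only standard sampling statistics (this is not the method of either linked paper, and all names here are illustrative): accuracy estimated on a few hundred random examples already lands within a couple of points of the full-benchmark score.

```python
import random
import math

def quick_eval(model_correct, sample_size=200, seed=0):
    """Estimate full-benchmark accuracy from a small random subset.

    model_correct: list of 0/1 outcomes, one per benchmark example.
    Returns (estimate, 95% margin of error) under a simple binomial model.
    """
    rng = random.Random(seed)
    sample = rng.sample(model_correct, sample_size)
    acc = sum(sample) / sample_size
    margin = 1.96 * math.sqrt(acc * (1 - acc) / sample_size)
    return acc, margin

# Toy benchmark: 10,000 examples, true accuracy ~0.7
outcomes = [1 if random.random() < 0.7 else 0 for _ in range(10_000)]
est, moe = quick_eval(outcomes)
print(f"{est:.3f} ± {moe:.3f}")  # within a few points of 0.7, at 2% of the cost
```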
Can we better understand LoRAs? Apparently you don't need to train A (but you do need B) arxiv.org/abs/2402.16842 We also compress Lots of LoRAs (lol 😅) and show you can serve 1,000 of them at a fraction of the cost, thanks to their weight similarities (sketches of both ideas below the link card).
Asymmetry in Low-Rank Adapters of Foundation Models
Parameter-efficient fine-tuning optimizes large, pre-trained foundation models by updating a subset of parameters; in this class, Low-Rank Adaptation (LoRA) is particularly effective. Inspired by...
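To make the asymmetry concrete, here is a minimal PyTorch sketch (not the paper's code; class and parameter names are mine) of a LoRA layer where A stays a frozen random projection and only B receives gradients:

```python
import torch
import torch.nn as nn

class AsymmetricLoRALinear(nn.Module):
    """Frozen base linear layer plus a rank-r update: W + scale * B @ A,
    where A is a fixed random projection and only B is trained."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights

        in_f, out_f = base.in_features, base.out_features
        # A: frozen random down-projection (never trained, per the asymmetry finding)
        self.A = nn.Parameter(torch.randn(rank, in_f) / rank**0.5, requires_grad=False)
        # B: trainable up-projection, zero-initialized so training starts at W
        self.B = nn.Parameter(torch.zeros(out_f, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T


layer = AsymmetricLoRALinear(nn.Linear(64, 64), rank=8)
print([n for n, p in layer.named_parameters() if p.requires_grad])  # ['B']
print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```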
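And a hedged sketch of the serving idea: if many adapters are similar, they can share one pair of bases, leaving only a tiny core per adapter. This is a generic shared-SVD illustration of why weight similarity enables compression, not the paper's actual algorithm:

```python
import torch

def joint_compress(deltas, k):
    """Compress many LoRA updates into shared bases plus small per-adapter cores.

    deltas: list of (out, in) matrices, each delta_i = B_i @ A_i.
    Returns U (out, k), V (in, k), and cores so that delta_i ≈ U @ core_i @ V.T.
    """
    # Shared output-side basis from all adapters stacked side by side
    U, _, _ = torch.linalg.svd(torch.cat(deltas, dim=1), full_matrices=False)
    U = U[:, :k]
    # Shared input-side basis from all adapters stacked top to bottom
    V, _, _ = torch.linalg.svd(torch.cat(deltas, dim=0).T, full_matrices=False)
    V = V[:, :k]
    # Per-adapter storage drops from (out + in) * r floats to a k x k core
    cores = [U.T @ d @ V for d in deltas]
    return U, V, cores

# Toy check: 100 adapters drawn from a shared low-dimensional subspace
out_f, in_f = 64, 64
shared_B, shared_A = torch.randn(out_f, 16), torch.randn(16, in_f)
deltas = [(shared_B @ torch.randn(16, 16) @ shared_A) / 16 for _ in range(100)]
U, V, cores = joint_compress(deltas, k=16)
err = torch.stack([(U @ c @ V.T - d).norm() / d.norm() for c, d in zip(cores, deltas)])
print(err.max())  # near zero: shared structure compresses almost losslessly
```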