Can we better understand LoRAs? Apparently you don't need to train A (but you do need B) arxiv.org/abs/2402.16842 We compress Lots of LoRAs (lol 😅) and show you can serve 1,000 of them at a fraction of the cost, thanks to their weight similarities
Parameter-efficient fine-tuning optimizes large, pre-trained foundation models by updating a subset of parameters; in this class, Low-Rank Adaptation (LoRA) is particularly effective. Inspired by...
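A toy numpy sketch of the "frozen A" observation (my own illustration, not the paper's experiment): a LoRA update has the form W + B @ A, and if A is a fixed random matrix, any update whose rows lie in A's row space is still exactly reachable by training B alone. Matrix sizes and the closed-form solve below are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 16, 32, 4

# A is frozen random and never trained; only B would receive gradients.
A = rng.standard_normal((r, d_in))

# A target update that is reachable with this A (its rows lie in A's row space).
target_delta = rng.standard_normal((d_out, r)) @ A

# Solve for B in closed form (least squares): minimize ||B @ A - target_delta||.
B = np.linalg.lstsq(A.T, target_delta.T, rcond=None)[0].T

# The frozen-A adapter reconstructs the update exactly.
print(np.allclose(B @ A, target_delta))
```

The point is that the trainable capacity sits in B; freezing A at random initialization loses nothing for updates inside A's row space.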
Following up, we show that LoRAs are not parameter-efficient. Take a LoRA ➡️ throw away 80% of its parameters ➡️ make it binary ➡️ improve the result 🤯 arxiv.org/abs/2311.13171 github.com/zipnn/zipnn
Parameter-efficient fine-tuning (PEFT) techniques make it possible to efficiently adapt a language model to create "expert" models that specialize to new tasks or domains. Recent techniques in...
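A minimal numpy sketch of the "prune then binarize" recipe from the thread (my own reading, not the paper's exact algorithm): drop 80% of a LoRA delta's entries by magnitude, then store the survivors as sign times a single shared scale. The toy matrix, 20% keep ratio, and mean-magnitude scale are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
delta = rng.standard_normal((64, 64))             # toy LoRA weight update

k = int(0.2 * delta.size)                         # keep only the top 20% by magnitude
thresh = np.partition(np.abs(delta).ravel(), -k)[-k]
mask = np.abs(delta) >= thresh                    # 80% of entries are zeroed out

alpha = np.abs(delta[mask]).mean()                # one shared magnitude for survivors
compressed = mask * np.sign(delta) * alpha        # each kept entry is just +/- alpha

rel_err = np.linalg.norm(compressed - delta) / np.linalg.norm(delta)
print(f"kept {mask.mean():.0%} of entries, relative error {rel_err:.2f}")
```

Storage drops to a sparsity mask plus one sign bit per kept entry and a single float, which is why many such adapters compress and serve cheaply.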