Stella Biderman
@stellaathena.bsky.social
I study large language models
193 followers · 11 following · 10 posts
- Training jointly on code and NL
- GPT-NeoX (math- and code-aware) tokenizer
- 4-bit quant and QLoRA
- RWKV
- Instruction-tuning
- Pre-layer norm
- VQGAN
- CLIP-Guidance
- Latent diffusion
- Decision transformers
and more!