Stella Biderman
@stellaathena.bsky.social
I study large language models
193 followers · 11 following · 10 posts
stellaathena.bsky.social

- Training jointly on code and NL
- GPT-NeoX (math- and code-aware) tokenizer
- 4-bit quant and qLoRA
- RWKV
- Instruction-tuning
- Pre-layer norm
- VQGAN
- CLIP-Guidance
- Latent diffusion
- Decision transformers

and more!

