P
figured it out. the backward process is properly defined as: prediction = (latent - sigma * noise) / sqrt(alpha_bar), where sigma is sqrt(1 - alpha_bar). so the diffusers library is folding the denominator into sigma and calling the whole result "sigma" (i.e. sqrt(1 - alpha_bar) / sqrt(alpha_bar)), and scaling the latents by that denominator elsewhere
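with the way the hf diffusers code defines "sigmas", the prediction is: prediction = latent * sqrt(sigma^2 + 1) - sigma * noise (for their sigma, sqrt(sigma^2 + 1) is just 1 / sqrt(alpha_bar), so this is the same formula with the division distributed over both terms)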
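quick sanity check that the two forms agree. this is only a sketch in plain python, not the diffusers code itself: the function names are made up for illustration, and it assumes the "hf-style" sigma is sqrt((1 - alpha_bar) / alpha_bar) and that `latent` is the usual noised sample sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise.

```python
import math
import random


def predict_x0_ddpm(latent, noise, alpha_bar):
    """x0 estimate with sigma = sqrt(1 - alpha_bar)."""
    sigma = math.sqrt(1.0 - alpha_bar)
    return (latent - sigma * noise) / math.sqrt(alpha_bar)


def predict_x0_hf_style(latent, noise, alpha_bar):
    """same estimate, with the denominator folded into sigma (assumed convention)."""
    sigma_hf = math.sqrt((1.0 - alpha_bar) / alpha_bar)
    # sqrt(sigma_hf^2 + 1) equals 1 / sqrt(alpha_bar), so this rescales the latent
    scaled_latent = latent * math.sqrt(sigma_hf ** 2 + 1.0)
    return scaled_latent - sigma_hf * noise


def max_abs_difference(trials=1000, seed=0):
    """largest gap between the two forms, and between each form and the true x0."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        alpha_bar = rng.uniform(0.01, 0.999)
        x0 = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # forward process: noise the clean sample
        latent = math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * noise
        a = predict_x0_ddpm(latent, noise, alpha_bar)
        b = predict_x0_hf_style(latent, noise, alpha_bar)
        worst = max(worst, abs(a - b), abs(a - x0))
    return worst


if __name__ == "__main__":
    print(max_abs_difference())
```

with the true noise plugged in, both forms recover x0 up to floating point error, so the difference is only in where the 1 / sqrt(alpha_bar) factor lives.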
P