(Enn) Nafnlaus 🇮🇸 🇺🇦 (@nafnlaus.bsky.social):

I don't have time to dig today, but what's the error for downsampling to fp8_e4m3 and doing multiplications in it? That's the real comparison. There are already huge errors vs. staying at high precision. And I just don't see the errors mattering much in inference (training, definitely).

@plpekka.bsky.social:

As for real comparisons, I think the closest ones are those earlier algorithms that do essentially the same thing. The L-Mul computation is 1 + x + y + (0.125 or 0.0625), and ApproxLM at the minimal level-1 precision does 1 + x + y + (y), where y has to be the smaller value. So in its simplest form it's almost the same.
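
For concreteness, here's a quick numeric sanity check of the two forms as quoted above, compared against the exact mantissa product (1+x)(1+y). The 0.0625 offset and the "add the smaller operand" rule are taken from this thread rather than from the papers' full definitions, so treat it as a rough sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 100_000)  # mantissa fractions in [0, 1)
y = rng.uniform(0.0, 1.0, 100_000)

exact = (1.0 + x) * (1.0 + y)               # true mantissa product = 1 + x + y + x*y
lmul = 1.0 + x + y + 0.0625                 # L-Mul form quoted above (offset 2^-4)
approxlm = 1.0 + x + y + np.minimum(x, y)   # "1 + x + y + (y)", y = the smaller value

for name, approx in (("L-Mul", lmul), ("ApproxLM level 1", approxlm)):
    rel = np.abs(approx - exact) / exact
    print(f"{name}: mean rel. error {rel.mean():.4f}, max {rel.max():.4f}")
```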

@plpekka.bsky.social:

I guess the worst-case scenario would be something like truncating (instead of rounding) 1.99999... to the 1.875 that 3-bit precision allows. That would be a 6.25% error, and about double that if two such values are multiplied.
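
A minimal check of that arithmetic, assuming truncation to a 3-bit mantissa (1.111 in binary = 1.875):

```python
worst = 1.875 / 2            # 1.99999... truncated to 1.875 -> factor 0.9375
print(1 - worst)             # 0.0625  -> 6.25% error for a single value
print(1 - worst * worst)     # ~0.1211 -> roughly double when two such values multiply
```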

@nafnlaus.bsky.social:

I had Claude write a test program for me ;)

---
Analysis of FP8_e4m3 multiplication errors across 1000000 tests:
Mean error ratio: 1.0371
Median error ratio: 1.0316
Max error ratio: 1.1492
---

This program only tested numbers that wouldn't overflow FP8_e4m3, of course.
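
For reference, a minimal sketch of that kind of test, assuming round-to-nearest E4M3 quantization of both operands and the product, inputs restricted to the normal non-overflowing range, and "error ratio" meaning the larger of exact/approx and approx/exact. The original program isn't shown, so the numbers will only be roughly comparable:

```python
import numpy as np

def quantize_e4m3(x):
    """Round to the nearest FP8 E4M3 value (normal range only: 4 exponent bits,
    3 mantissa bits, max 448). Subnormal/NaN handling is omitted for brevity."""
    x = np.asarray(x, dtype=np.float64)
    sign = np.sign(x)
    mag = np.abs(x)
    e = np.clip(np.floor(np.log2(mag)), -6, 8)  # exponent of the leading bit
    step = 2.0 ** (e - 3)                       # 3 mantissa bits -> 8 steps per binade
    return sign * np.minimum(np.round(mag / step) * step, 448.0)

rng = np.random.default_rng(0)
n = 1_000_000
# keep magnitudes small enough that neither operands nor products overflow E4M3
a = rng.uniform(0.25, 8.0, n) * rng.choice([-1.0, 1.0], n)
b = rng.uniform(0.25, 8.0, n) * rng.choice([-1.0, 1.0], n)

exact = a * b
fp8 = quantize_e4m3(quantize_e4m3(a) * quantize_e4m3(b))

ratio = np.maximum(np.abs(exact), np.abs(fp8)) / np.minimum(np.abs(exact), np.abs(fp8))
print(f"Mean error ratio:   {ratio.mean():.4f}")
print(f"Median error ratio: {np.median(ratio):.4f}")
print(f"Max error ratio:    {ratio.max():.4f}")
```

Each of the three roundings can contribute up to about 2^-4 ≈ 6.25% relative error, which is consistent with the few-percent mean and roughly 15% max quoted above.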
