As for real comparisons, I think the closest ones are those earlier algorithms that do essentially the same thing. The L-Mul computation is 1+x+y+(0.125 or 0.0625), and ApproxLM at its minimal level 1 precision does 1+x+y+(y), where y needs to be the smaller value. So in its simplest form it's almost the same.
L-Mul basically just uses a constant offset that doesn't take the values into account at all, whereas the more thought-out ApproxLM has various alternatives for how that offset is estimated, based on some simple comparisons of the input values. L-Mul can hardly be considered a new invention.
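To make the comparison concrete, here is a rough Python sketch of the three mantissa computations as described above. It is only an illustration: the function names are made up, the formulas follow the descriptions in this thread rather than either paper, and x and y are assumed to be the fractional mantissa parts in [0, 1).

```python
def exact_mantissa(x, y):
    # True product of the two significands: (1+x)(1+y) = 1 + x + y + x*y.
    return (1 + x) * (1 + y)


def constant_offset(x, y, offset=0.0625):
    # L-Mul style as described above: the x*y term is replaced by a fixed
    # constant (0.125 or 0.0625) that ignores the inputs.
    return 1 + x + y + offset


def smaller_value_offset(x, y):
    # ApproxLM level 1 style as described above: the x*y term is replaced
    # by the smaller of the two fractions.
    return 1 + x + y + min(x, y)


if __name__ == "__main__":
    # Print the error of each approximation for a few sample inputs.
    for x, y in [(0.25, 0.75), (0.5, 0.5), (0.0625, 0.125)]:
        ref = exact_mantissa(x, y)
        print(x, y,
              constant_offset(x, y) - ref,
              smaller_value_offset(x, y) - ref)
```

Both variants share the 1+x+y part and differ only in what stands in for the x*y term, which is the point being made above.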
Thinking about this some more: 1+x+y+min(abs(x),abs(y)) <- right? At the hardware level, you really don't have to do a full bitwise comparison. You can ignore the full width, just check the 1 or 2 most significant bits, and call that good enough. That's a small fraction of a fraction of a clock cycle.
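A software sketch of that idea, purely for illustration: the function name, the 8-bit mantissa width, and the tie-breaking rule are all assumptions, and real hardware would do this with a tiny comparator rather than shifts.

```python
def pick_smaller_approx(mx, my, mantissa_bits=8, k=2):
    # mx and my are mantissa fractions stored as unsigned integers of
    # `mantissa_bits` bits. Only the top k bits are compared.
    shift = mantissa_bits - k
    # On a tie in the top k bits the first operand is returned, even if
    # the second one is actually smaller in the low bits.
    return mx if (mx >> shift) <= (my >> shift) else my
```

When the top k bits tie, the two mantissas differ by less than 2^-k of the mantissa range, so picking the "wrong" one changes the offset by less than that amount.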