Google's bfloat for neural networks

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

smatovic
Posts: 609
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Google's bfloat for neural networks

Post by smatovic » Tue Apr 16, 2019 9:30 am

Hehe, funny: Google has started using its own bfloat16 datatype in TPU gen 2 and gen 3 for neural networks,

https://en.wikipedia.org/wiki/Bfloat16_ ... int_format
https://www.nextplatform.com/2018/05/10 ... processor/

and now Intel is starting to implement it in their hardware. That's when you know you are a big player :)

https://venturebeat.com/2018/05/23/inte ... -training/

I wonder if Nvidia or AMD will join.

--
Srdja

mar
Posts: 1929
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Google's bfloat for neural networks

Post by mar » Tue Apr 16, 2019 9:57 am

I misread it as "Google's bloat...", and thought that Google had open-sourced yet another masterpiece :D

So this bfloat16 is basically a 32-bit float where you throw away the low 16 bits of the mantissa, keeping the sign and the full 8-bit exponent.
Packing/unpacking to and from 32-bit float should be trivial, so probably clever, but hey, only 7 bits of mantissa, is that really enough?
Martin Sedlak

smatovic
Posts: 609
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: Google's bfloat for neural networks

Post by smatovic » Tue Apr 16, 2019 10:07 am

mar wrote:
Tue Apr 16, 2019 9:57 am
I misread it as "Google's bloat...", and thought that Google had open-sourced yet another masterpiece :D

So this bfloat16 is basically a 32-bit float where you throw away the low 16 bits of the mantissa, keeping the sign and the full 8-bit exponent.
Packing/unpacking to and from 32-bit float should be trivial, so probably clever, but hey, only 7 bits of mantissa, is that really enough?
Dunno :)

https://www.hpcwire.com/2019/04/15/bsc- ... -training/

"As training progresses and it hones the value of the weights, then greater precision becomes important in order to optimize the solution."

“We believe dynamic numerical precision approaches offer the best benefit to training and inferencing,”

--
Srdja
