Benjamin Carr, Ph.D. 👨🏻‍💻🧬<p>Sizing up <a href="https://hachyderm.io/tags/MI300A" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MI300A</span></a>’s <a href="https://hachyderm.io/tags/GPU" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>GPU</span></a><br>It’s well ahead of <a href="https://hachyderm.io/tags/Nvidia" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Nvidia</span></a>’s <a href="https://hachyderm.io/tags/H100" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>H100</span></a> PCIe in just about every major category of 32- or 64-bit operations. MI300A can achieve 113.2 TFLOPS of <a href="https://hachyderm.io/tags/FP32" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>FP32</span></a> throughput, with each FMA counting as two floating-point operations. For comparison, H100 PCIe achieved 49.3 TFLOPS in the same test.<br><a href="https://hachyderm.io/tags/AMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AMD</span></a> cut down <a href="https://hachyderm.io/tags/MI300X" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MI300X</span></a>’s GPU to create MI300A. The 24 <a href="https://hachyderm.io/tags/Zen4" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Zen4</span></a> cores are a lot of <a href="https://hachyderm.io/tags/CPU" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>CPU</span></a> power and occupy one quadrant of the MI300 chip. But MI300’s main attraction is still the GPU.<br><a href="https://chipsandcheese.com/p/sizing-up-mi300as-gpu" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">chipsandcheese.com/p/sizing-up</span><span class="invisible">-mi300as-gpu</span></a></p>