Business · NVIDIA Blog
Double-Precision Tensor Cores Speed High-Performance Computing
What you can see, you can understand.
Key facts
- An A100, with a total of 432 Tensor Cores, delivers up to 19.5 TFLOPS of FP64 performance, more than double that of a Volta V100
- Simulations help researchers understand the mysteries of black holes and see how a protein spike on the coronavirus causes COVID-19
- As the next big step in NVIDIA's efforts to accelerate high performance computing, the NVIDIA Ampere architecture defines third-generation Tensor Cores that accelerate FP64 math by 2.5x compared to last-generation GPUs
- As a result, the simulations researchers need to run will complete up to 2.5x faster on an A100 GPU
Summary
Simulations help researchers understand the mysteries of black holes and see how a protein spike on the coronavirus causes COVID-19. But simulations are also among the most demanding computer applications on the planet, because they require enormous amounts of the most advanced math. Simulations make numeric models visual with calculations that use a double-precision floating-point format called FP64. As the next big step in its efforts to accelerate high performance computing, the NVIDIA Ampere architecture defines third-generation Tensor Cores that accelerate FP64 math by 2.5x compared to last-generation GPUs.
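As a back-of-the-envelope check, the quoted 19.5 TFLOPS peak follows from the A100's published specifications. The per-clock FMA rate and the ~1.41 GHz boost clock below come from NVIDIA's A100 documentation, not from this article, so treat them as assumed inputs:

```python
# Rough FP64 peak-throughput estimate for the A100.
# Clock and per-core FMA rate are assumptions taken from NVIDIA's
# A100 specs, not stated in the article above.
TENSOR_CORES = 432        # 108 SMs x 4 Tensor Cores per SM
FMA_PER_CLOCK = 16        # FP64 fused multiply-adds per Tensor Core per clock
OPS_PER_FMA = 2           # each FMA counts as a multiply plus an add
BOOST_CLOCK_HZ = 1.41e9   # ~1.41 GHz boost clock

peak_tflops = TENSOR_CORES * FMA_PER_CLOCK * OPS_PER_FMA * BOOST_CLOCK_HZ / 1e12
print(f"{peak_tflops:.1f} FP64 TFLOPS")  # ~19.5, matching the figure above
```

The same arithmetic against a V100's roughly 7.8 FP64 TFLOPS peak gives the 2.5x ratio the article cites.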