NVIDIA showcased hardware and software announcements at the 2022 NVIDIA Global Technology Conference (GTC) to enable the next generation of AI applications, with a focus on digital twins, realistic physics-driven models for virtual simulation (NVIDIA Omniverse), and real-time AI applications such as advanced driver assistance. There were several hardware and software announcements at the event, and during his keynote presentation Jensen Huang went over some of the drivers behind these announcements. The slide below, shown near the beginning of the keynote, gives a view of the topics discussed there and covered in more detail in the GTC sessions.

He discussed the need to increase computing capability by a factor of one million to build large-scale models that can model the global climate. NVIDIA Modulus for scientific digital twin modeling can create physics-ML accelerated digital twins using a transformer-based model that can be trained on low-resolution data and run inference at high resolution. He said this approach could be up to 45,000X faster than some other modeling approaches.

A recent concept for some AI applications is the transformer (introduced in 2017). Transformers structure their training so that each element in the input data is connected to every other element (a minimal sketch of this attention mechanism appears at the end of this article). This allows a transformer-trained model to see the influence of the entire data set as soon as training begins. In a transformer-trained model, this density of connections between data elements results in increased computational requirements for training. The slide below from Jensen's talk shows the growing demand for transformer training compared to earlier AI training models.

Growing Computation Requirements for AI Transformer Training

Transformers are particularly useful for problems such as natural language processing and computer vision, and can be useful for many other AI applications. This will increase the demand for AI training processing power.

In addition to training, AI inference is also making significant progress. NVIDIA's Triton Inference Server software allows model deployment and execution for AI applications. The NVIDIA AI Software Development Kit (SDK) includes Riva 2.0 for speech AI and Maxine for AI video conferencing. AI frameworks include Merlin 1.0 for hyperscale recommendation systems and NeMo Megatron for training large language models.

NVIDIA introduced its H100 Tensor Core GPU, shown below. The H100 has 80B transistors, provides 4.9TB/s of external bandwidth, and is built on TSMC's 4N process. It increases transformer performance by 6X, secures data and AI models while supporting up to 7X more secure tenants, and includes 4th-generation NVLink, which delivers 7X the performance of PCIe Gen5. The H100 is available in the HGX H100 configuration or as the DGX H100 shown below.

A DGX H100 has eight H100 GPUs capable of 32 PFLOPS, with 640GB of HBM3 memory and 24TB/s of memory bandwidth. A DGX POD (a multi-rack unit connected through the NVLink Switch) supports 20.5TB of HBM3 memory and 768TB/s of memory bandwidth. Eighteen DGX PODs deliver significant overall performance, with FP64 operating at 275 PFLOPS and a bisection bandwidth of 230TB/s.
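The system-level memory numbers above follow directly from the per-GPU specs. As a rough sanity check, here is a minimal sketch that multiplies out the announced round figures of 80GB of HBM3 and about 3TB/s of memory bandwidth per H100, assuming 8 GPUs per DGX H100 and 256 GPUs (32 systems) per DGX POD; the announced numbers are rounded, so the results are approximate.

```python
# Rough aggregation of announced per-H100 specs into system-level numbers.
# Assumes the round figures from the GTC keynote: approximate, not exact.
HBM3_PER_GPU_GB = 80     # HBM3 capacity per H100
BW_PER_GPU_TBPS = 3.0    # ~3TB/s HBM3 bandwidth per H100 (rounded)

gpus_per_dgx = 8         # H100 GPUs in one DGX H100
gpus_per_pod = 256       # assuming 32 DGX H100 systems per DGX POD

dgx_memory_gb = gpus_per_dgx * HBM3_PER_GPU_GB         # 640GB per DGX H100
dgx_bw_tbps   = gpus_per_dgx * BW_PER_GPU_TBPS         # 24TB/s per DGX H100

pod_memory_tb = gpus_per_pod * HBM3_PER_GPU_GB / 1000  # ~20.5TB per DGX POD
pod_bw_tbps   = gpus_per_pod * BW_PER_GPU_TBPS         # 768TB/s per DGX POD

print(dgx_memory_gb, dgx_bw_tbps, pod_memory_tb, pod_bw_tbps)
```

Note that 256 GPUs at 80GB each is 20.48TB, which rounds to the 20.5TB quoted for a DGX POD.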
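Finally, here is the attention sketch referenced in the transformer discussion above. It is an illustrative NumPy implementation of scaled dot-product self-attention, the core mechanism of the 2017 transformer paper, not NVIDIA code; the function name, weight matrices, and sizes are arbitrary choices for the example.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Minimal scaled dot-product self-attention.

    x: (seq_len, d_model) input sequence. Every element attends to
    every other element; this all-to-all connectivity is what drives
    transformer training cost up with input size.
    """
    q = x @ w_q   # queries
    k = x @ w_k   # keys
    v = x @ w_v   # values
    d_k = k.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # (seq_len, seq_len): one score per element pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ v  # each output is a weighted mix of all input elements

# Example with arbitrary sizes: 8 tokens, 16-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
w_q, w_k, w_v = (rng.normal(size=(16, 16)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (8, 16)
```

The (seq_len x seq_len) score matrix is where every input element is compared against every other element, and it is why transformer training compute grows rapidly with input size, which is the point the keynote slide makes.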