In December 2023, #AMD showed a 4x performance advantage on the #Instinct #MI300 #APU vs. traditional discrete #GPUs. Many asked: how was this accomplished, and how can other teams achieve similar acceleration? We are pleased to announce a publication appearing at #ISC 2024, "Porting HPC Applications to AMD Instinct MI300A Using Unified Memory and OpenMP" (https://lnkd.in/gy94ea-9), intended as a guide for application developers on how to use portable directives such as OpenMP to leverage the tight integration of CPU and GPU cores on the same package, in the same memory space (i.e., the APU). The paper covers the programming model, the memory model, and performance profiling of OpenFOAM's HPC_motorbike benchmark. The 4x performance benefit comes from eliminating page migrations, fast access from both the CPU and GPU cores, and the increased memory bandwidth delivered to the CPUs. Be sure to stop by the technical talk by Suyash Tandon, Ph.D. at ISC on Tuesday, May 14th! More details at: https://t.co/dkNIIQm0LF Brent Gorda Daniele Piccarozzi, MBA Hisaki Ohara Nicholas Malaya
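The recipe the paper describes, offloading hot loops with portable OpenMP directives and relying on hardware unified memory instead of explicit data mapping, can be sketched in a few lines of C. This is a generic illustration, not code from the paper: the `unified_shared_memory` requirement is standard OpenMP 5.0, guarded here so the file still compiles with pre-5.0 toolchains, where the target region simply falls back to host execution.

```c
#include <stddef.h>

/* On an APU such as the MI300A, this directive tells the compiler that
 * ordinary host allocations are directly usable on the device: no map()
 * clauses and no page migration. Guarded for pre-OpenMP-5.0 compilers. */
#if defined(_OPENMP) && _OPENMP >= 201811
#pragma omp requires unified_shared_memory
#endif

/* y = a*x + y over plain host pointers. With unified memory the GPU reads
 * and writes them in place; without an offload-capable compiler the pragma
 * is ignored and the loop runs on the CPU. */
void saxpy(size_t n, float a, const float *x, float *y) {
    #pragma omp target teams distribute parallel for
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Built with an offload-capable compiler (exact flags vary by toolchain and GPU architecture), the same source runs the loop on the GPU with no data-transfer code at all; built without one, it is still correct serial/threaded CPU code.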
Chris Clarkson’s Post
In December 2023, #AMD showed a 4x performance advantage on the #Instinct #MI300 #APU vs. traditional discrete #GPUs. Many asked: how was this accomplished, and how can other teams achieve similar acceleration? I'm pleased to announce a publication appearing at #ISC 2024, "Porting HPC Applications to AMD Instinct MI300A Using Unified Memory and OpenMP" (https://lnkd.in/gy94ea-9), intended as a guide for application developers on how to use portable directives such as OpenMP to leverage the tight integration of CPU and GPU cores on the same package, in the same memory space (i.e., the APU). The paper covers the programming model, the memory model, and performance profiling of OpenFOAM's HPC_motorbike benchmark. The 4x performance benefit comes from eliminating page migrations, fast access from both the CPU and GPU cores, and the increased memory bandwidth delivered to the CPUs. Be sure to stop by the technical talk by Suyash Tandon, Ph.D. at ISC on Tuesday, May 14th! More details at: https://t.co/dkNIIQm0LF Many thanks to co-authors: Carlo Bertolli, Leopold Grinberg, Gheorghe-Teodor Bercea, Mark Olesen, and Simone Bnà
Apacer will unveil a wide selection of new DRAM modules at our VIP Reveals event during COMPUTEX. There are quite a few new acronyms out there – are you sure what all of them stand for? Here’s a little “cheat sheet” to help keep you up to date:

MRDIMM 🔊 No, it’s not “Mister DIMM!” (Although I’ve heard some engineers do call it that.) MRDIMM stands for Multi-Ranked buffered Dual In-line Memory Module. No explanation for why the “B” in “buffered” isn’t represented in the acronym.

LPDDR5X CAMM2 🔊 OK, a two-parter! DDR5 will be immediately recognizable to most of our followers as one of the big DRAM success stories of 2023. LPDDR5X stands for “Low Power Double Data Rate 5X,” where “5X” indicates the generation of LPDDR – currently the highest available. And as for CAMM2: JEDEC (the group that creates these standards) has published that it stands for Compression Attached Memory Module, with the 2 marking the second generation of the standard.

DDR5 CXL 🔊 DDR5 we already know, and CXL stands for Compute Express Link, an interconnect technology that attaches CPUs directly to devices or to memory for high-speed, high-capacity operations.

That’s all the new acronyms we think you need to know! Come check out the actual products at Apacer’s VIP Reveals event during COMPUTEX! #DRAM #DDR5 #LPDDR5CAMM2 #MRDIMM #DDR5CXL
Here’s an episode of a technical series Nonbox produced for FormFactor.
#AdvancedPackaging is enabling complex technologies that demand advanced testing techniques. In this video, we delve into the significance of testing thousands of chiplets used in CPUs, GPUs, HBM, I/O, and interposers on a single wafer. FormFactor's CEO, Mike Slessor, explains how the company is addressing these challenges with innovative testing solutions. https://lnkd.in/gy3nk6Vi
Mission Central - Exploring Today's Advanced Packaging
With AGI-scale models becoming prohibitively compute-hungry as parameter counts keep growing, and therefore best suited to GPUs, especially for training, how much longer before narrow AI becomes practical for CPU-based inference first and then expands to training? An IEEE article from June 2023 argued this case was not entirely dead; a year later, where are we?
The story of xBiDa with GPUs and CPUs 🦾 CPUs are great at handling tasks in sequence, perfect for general-purpose computing, while GPUs excel at parallel processing, ideal for workloads like AI and complex data processing.

Our achievement at xBiDa: we faced a challenge. Our work demanded powerful processing, and while GPUs are often the go-to for speed and complexity, we set out to optimize our algorithms for the CPU. Through careful tuning, we got the results we needed even within the constraints of a CPU. It was a powerful reminder that, with the right approach, big results are sometimes achievable with limited resources. #TechInnovation #CPUVsGPU #Xbida #AlgorithmOptimization
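The post doesn't show xBiDa's actual code, so as a purely generic illustration of the kind of CPU tuning it alludes to (none of these function names come from xBiDa), here is the classic loop-reordering fix for matrix multiplication in C. Both versions do identical arithmetic, but the second walks memory contiguously, which is typically several times faster on a CPU cache hierarchy:

```c
/* Naive i-j-k multiply over row-major n x n matrices: the inner loop
 * strides through b column-wise, which is cache-unfriendly. */
void matmul_naive(int n, const double *a, const double *b, double *c) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double s = 0.0;
            for (int k = 0; k < n; k++)
                s += a[i*n + k] * b[k*n + j];
            c[i*n + j] = s;
        }
}

/* Reordered i-k-j multiply: both inner-loop accesses (c and b) are now
 * contiguous, so the same arithmetic makes far better use of the cache
 * and of compiler auto-vectorization. */
void matmul_ikj(int n, const double *a, const double *b, double *c) {
    for (int i = 0; i < n * n; i++)
        c[i] = 0.0;
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++) {
            double aik = a[i*n + k];
            for (int j = 0; j < n; j++)
                c[i*n + j] += aik * b[k*n + j];
        }
}
```

Access-pattern fixes like this, plus threading and vectorization, are usually the first levers when a workload must stay on the CPU.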
🔍 CPUs vs. GPUs: What's the Difference? CPUs handle tasks one at a time, making them ideal for the general computing we do daily. GPUs, however, are built to work on many tasks at once, making them perfect for demanding workloads like artificial intelligence and data processing. 🎯 Why does it matter? Understanding these differences helps us choose the right hardware for a specific job, making our work faster and smarter. 💬 How are you using CPUs or GPUs in your work? #TechTips #AI #DataProcessing #Innovation #TechEssentials
Powerful state-of-the-art CPUs and GPUs on HPC systems alone cannot make simulations significantly faster. Even on GPU-equipped HPC clusters, complex simulations still consume significant time. The bottleneck? Outdated algorithms. Traditional algorithms, developed three to four decades ago, have become obsolete given today's problem complexity and hardware advances. Not only do simulations take time; some complex simulations cannot be performed at all. At BosonQ Psi (BQP), we are redefining the standards of simulation with cutting-edge quantum-powered algorithms that harness the power of modern GPUs efficiently. Our quantum-inspired optimization algorithms are significantly reducing the time of design-optimization simulations.
Run CUDA on AMD GPUs?? Maybe.... The team at Spectral Compute has developed a new GPGPU programming toolkit called SCALE that allows CUDA applications to be natively compiled for and run on AMD GPUs. Under the hood, it accepts nvcc-dialect CUDA and maps CUDA API calls to the corresponding ROCm libraries for AMD GPUs. What's interesting is that it has been tested with some projects I use regularly:
1. llama.cpp
2. FAISS
3. XGBoost (haven't been using it so much lately 🙈)
There has been so much effort lately to find alternatives to NVIDIA. Do you think we will get any sensible alternatives that will work in production? Let me know your thoughts... Link to the article: https://lnkd.in/dCVApmfi #CUDA #AMD #GPUs #GPUComputing #TechInnovation