In December 2023, #AMD showed a 4x performance advantage on the #Instinct #MI300 #APU vs. traditional discrete #GPUs. Many asked: how was this accomplished, and how can other teams achieve similar acceleration? We are pleased to announce a publication appearing at #ISC 2024, "Porting HPC Applications to AMD Instinct MI300A Using Unified Memory and OpenMP" (https://lnkd.in/gy94ea-9), intended as a guide for application developers on how to use portable directives such as OpenMP to leverage the tight integration of CPU and GPU cores on the same package, in the same memory space (i.e., the APU). The paper covers the programming model, the memory model, and performance profiling of OpenFOAM's HPC_motorbike benchmark. The 4x performance benefit comes from eliminating page migrations, fast access from both the CPU and GPU cores, and the increased memory bandwidth delivered to the CPUs. Be sure to stop by the technical talk by Suyash Tandon, Ph.D. at ISC on Tuesday, May 14th! More details at: https://t.co/dkNIIQm0LF Brent Gorda Daniele Piccarozzi, MBA Hisaki Ohara Nicholas Malaya
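The recipe the paper describes, offloading hot loops with portable OpenMP directives and relying on hardware unified memory instead of explicit data mapping, can be sketched in a few lines of C. This is a generic illustration, not code from the paper: the `unified_shared_memory` requirement is standard OpenMP 5.0, guarded here so the file still compiles with pre-5.0 toolchains, where the target region simply falls back to host execution.

```c
#include <stddef.h>

/* On an APU such as the MI300A, this directive tells the compiler that
 * ordinary host allocations are directly usable on the device: no map()
 * clauses and no page migration. Guarded for pre-OpenMP-5.0 compilers. */
#if defined(_OPENMP) && _OPENMP >= 201811
#pragma omp requires unified_shared_memory
#endif

/* y = a*x + y over plain host pointers. With unified memory the GPU reads
 * and writes them in place; without an offload-capable compiler the pragma
 * is ignored and the loop runs on the CPU. */
void saxpy(size_t n, float a, const float *x, float *y) {
    #pragma omp target teams distribute parallel for
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Built with an offload-capable compiler (exact flags vary by toolchain and GPU architecture), the same source runs the loop on the GPU with no data-transfer code at all; built without one, it is still correct serial/threaded CPU code.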
Chris Clarkson’s Post
In December 2023, #AMD showed a 4x performance advantage on the #Instinct #MI300 #APU vs. traditional discrete #GPUs. Many asked: how was this accomplished, and how can other teams achieve similar acceleration? I'm pleased to announce a publication appearing at #ISC 2024, "Porting HPC Applications to AMD Instinct MI300A Using Unified Memory and OpenMP" (https://lnkd.in/gy94ea-9), intended as a guide for application developers on how to use portable directives such as OpenMP to leverage the tight integration of CPU and GPU cores on the same package, in the same memory space (i.e., the APU). The paper covers the programming model, the memory model, and performance profiling of OpenFOAM's HPC_motorbike benchmark. The 4x performance benefit comes from eliminating page migrations, fast access from both the CPU and GPU cores, and the increased memory bandwidth delivered to the CPUs. Be sure to stop by the technical talk by Suyash Tandon, Ph.D. at ISC on Tuesday, May 14th! More details at: https://t.co/dkNIIQm0LF Many thanks to co-authors: Carlo Bertolli, Leopold Grinberg, Gheorghe-Teodor Bercea, Mark Olesen, and Simone Bnà
Apacer will unveil a wide selection of new DRAM modules at our VIP Reveals event during COMPUTEX. There are quite a few new acronyms out there – are you sure what all of them stand for? Here’s a little “cheat sheet” to help keep you up to date:

MRDIMM 🔊 No, it’s not “Mister DIMM!” (Although I’ve heard some engineers do call it that.) MRDIMM stands for Multi-Ranked buffered Dual In-line Memory Module. No explanation for why the “B” in “buffered” isn’t represented in the acronym.

LPDDR5X CAMM2 🔊 OK, a two-parter! DDR5 will be immediately recognizable to most of our followers as one of the big DRAM success stories of 2023. LPDDR5X stands for “Low Power Double Data Rate 5X,” where “5X” indicates the generation of LPDDR – currently the highest available. And as for CAMM2: JEDEC (the group that creates these standards) has published that it stands for Compression Attached Memory Module, with the 2 marking the second generation of the standard.

DDR5 CXL 🔊 DDR5 we already know, and CXL stands for Compute Express Link, an interconnect technology that attaches CPUs directly to devices or to memory for high-speed, high-capacity operations.

That’s all the new acronyms we think you need to know! Come check out the actual products at Apacer’s VIP Reveals event during COMPUTEX! #DRAM #DDR5 #LPDDR5CAMM2 #MRDIMM #DDR5CXL
Here’s an episode of a technical series Nonbox produced for FormFactor.
#AdvancedPackaging is enabling complex technologies that demand advanced testing techniques. In this video, we delve into the significance of testing thousands of chiplets used in CPUs, GPUs, HBM, I/O, and interposers on a single wafer. FormFactor's CEO, Mike Slessor, explains how the company is addressing these challenges with innovative testing solutions. https://lnkd.in/gy3nk6Vi
Mission Central - Exploring Today's Advanced Packaging
With AGI-scale models becoming prohibitively compute-hungry as parameter counts keep growing, and therefore best suited to GPUs, especially for training, how much longer before narrow AI becomes practical for CPU-based inference first and then expands to training? An IEEE article from June 2023 argued this case was not entirely dead; a year later, where are we?
The story of xBiDa with GPUs and CPUs 🦾 CPUs are great at handling tasks in sequence, perfect for general-purpose computing, while GPUs excel at parallel processing, ideal for workloads like AI and complex data processing.

Our achievement at xBiDa: we faced a challenge. Our work demanded powerful processing, and while GPUs are often the go-to for speed and complexity, we set out to optimize our algorithms for the CPU. Through careful tuning, we got the results we needed even within the constraints of a CPU. It was a powerful reminder that, with the right approach, big results are sometimes achievable with limited resources. #TechInnovation #CPUVsGPU #Xbida #AlgorithmOptimization
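The post doesn't show xBiDa's actual code, so as a purely generic illustration of the kind of CPU tuning it alludes to (none of these function names come from xBiDa), here is the classic loop-reordering fix for matrix multiplication in C. Both versions do identical arithmetic, but the second walks memory contiguously, which is typically several times faster on a CPU cache hierarchy:

```c
/* Naive i-j-k multiply over row-major n x n matrices: the inner loop
 * strides through b column-wise, which is cache-unfriendly. */
void matmul_naive(int n, const double *a, const double *b, double *c) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double s = 0.0;
            for (int k = 0; k < n; k++)
                s += a[i*n + k] * b[k*n + j];
            c[i*n + j] = s;
        }
}

/* Reordered i-k-j multiply: both inner-loop accesses (c and b) are now
 * contiguous, so the same arithmetic makes far better use of the cache
 * and of compiler auto-vectorization. */
void matmul_ikj(int n, const double *a, const double *b, double *c) {
    for (int i = 0; i < n * n; i++)
        c[i] = 0.0;
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++) {
            double aik = a[i*n + k];
            for (int j = 0; j < n; j++)
                c[i*n + j] += aik * b[k*n + j];
        }
}
```

Access-pattern fixes like this, plus threading and vectorization, are usually the first levers when a workload must stay on the CPU.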
🔍 CPUs vs. GPUs: What's the Difference? CPUs handle tasks one at a time, making them ideal for the general computing we do daily. GPUs, however, are built to work on many tasks at once, making them perfect for demanding workloads like artificial intelligence and data processing. 🎯 Why does it matter? Understanding these differences helps us choose the right hardware for a specific job, making our work faster and smarter. 💬 How are you using CPUs or GPUs in your work? #TechTips #AI #DataProcessing #Innovation #TechEssentials
Powerful state-of-the-art CPUs and GPUs on HPC systems alone cannot make simulations significantly faster. Even on GPU-equipped HPC clusters, complex simulations still consume significant time. The bottleneck? Outdated algorithms. Traditional algorithms, developed three to four decades ago, have become obsolete given today's problem complexity and hardware advances. Not only do simulations take time; some complex simulations cannot be performed at all. At BosonQ Psi (BQP), we are redefining the standards of simulation with cutting-edge quantum-powered algorithms that harness the power of modern GPUs efficiently. Our quantum-inspired optimization algorithms are significantly reducing the time of design-optimization simulations.
Run CUDA on AMD GPUs?? Maybe.... The team at Spectral Compute has developed a new GPGPU programming toolkit called SCALE that allows CUDA applications to be natively compiled for and run on AMD GPUs. Under the hood, it accepts nvcc-dialect CUDA and maps CUDA API calls to the corresponding ROCm libraries for AMD GPUs. What's interesting is that it has been tested with some projects I use regularly:
1. llama.cpp
2. FAISS
3. XGBoost (haven't been using it so much lately 🙈)
There has been so much effort lately to find alternatives to NVIDIA. Do you think we will get any sensible alternatives that will work in production? Let me know your thoughts... Link to the article: https://lnkd.in/dCVApmfi #CUDA #AMD #GPUs #GPUComputing #TechInnovation