AMD To Refresh Instinct MI300 Series With MI350 AI Accelerator Using 4nm Node This Year

Hassan Mujtaba

AMD appears to be preparing a 4nm refresh of its MI300 AI accelerators, known as the MI350, which is planned for launch later this year.

AMD MI350 AI Accelerator To Feature Refreshed 4nm Architecture, Aiming For Launch Later This Year

AMD's current MI300 lineup consists of the AI-optimized MI300X and the compute-optimized MI300A accelerators, but it looks like the company is planning to expand its portfolio. We recently saw the emergence of the MI388X, which might be an export-compliant variant for China, although AMD stated that it was prevented from shipping the part. The MI388X was likely going to be another CDNA 3 offering built on 5nm and 6nm process technologies, but it looks like AMD has a proper refresh planned for its Instinct family later this year.


According to a report from TrendForce, AMD might be launching a new part known as the Instinct MI350, which will use a refreshed CDNA 3 architecture built on TSMC's 4nm process node. While details on the Instinct MI350 are slim, AMD itself recently teased that it will offer higher HBM3E capacities in future refreshes of the Instinct MI300 series. Higher HBM capacities coupled with a fine-tuned architecture on the 4nm node could lead to some decent gains.

Furthermore, TrendForce notes that the extension of export controls now includes not only the previously restricted AI chips from NVIDIA and AMD, such as the NVIDIA A100/H100, AMD MI250/300 series, NVIDIA A800, H800, L40, L40S, and RTX4090, but also their next-generation successors like NVIDIA's H200, B100, B200, GB200, and AMD's MI350 series. In response, HPC manufacturers have quickly developed products that comply with the new TPP and PD standards, such as NVIDIA's adjusted H20/L20/L2, which remain eligible for export.

TrendForce

Videocardz was also able to spot a listing from AMD Singapore which confirms the Instinct MI350 accelerator lineup. The product has already been submitted for silicon readiness & optimizations.


It should be remembered that AMD will be competing against both NVIDIA and Intel in the AI space. The Blackwell B100 GPUs are in production, and the B100/B200 will be rolling out to customers soon. Meanwhile, Intel has announced its Gaudi 3 accelerators, which offer up to 50% faster AI compute versus the NVIDIA H100 GPUs, so the space is heating up. In the recent MLPerf benchmarks, NVIDIA and Intel were the only ones to submit AI performance results, while AMD missed the spotlight as it didn't submit any numbers.

TrendForce has also shared the full list of products that are affected by the latest version of the US export controls against China. These include several current and upcoming GPUs including AMD's Instinct MI388X & MI350 series.

US Export Controlled Products (Restricted For China / As of 29th March):

| Vendor | Product | Process Technology | Release Date |
|---|---|---|---|
| NVIDIA | GB200 | 4nm (TSMC) | 2H 2024 |
| NVIDIA | B200 | 4nm (TSMC) | 2H 2024 |
| NVIDIA | B100 | 4nm (TSMC) | 2H 2024 |
| NVIDIA | H200 | 4nm (TSMC) | 11/2023 |
| NVIDIA | H100 | 4nm (TSMC) | 03/2022 |
| NVIDIA | H800 | 4nm (TSMC) | 03/2022 |
| NVIDIA | L40/L40S | 5nm (TSMC) | 10/2022 |
| NVIDIA | RTX 4090 | 5nm (TSMC) | 10/2022 |
| NVIDIA | A100 | 7nm (TSMC) | 05/2020 |
| NVIDIA | A800 | 7nm (TSMC) | 05/2020 |
| AMD | MI250 | 6nm (TSMC) | 11/2021 |
| AMD | MI250X | 6nm (TSMC) | 11/2021 |
| AMD | MI300/MI309 | 5nm (TSMC) | 11/2021 |
| AMD | MI300X/MI388X | 5nm/6nm (TSMC) | 12/2023 |
| AMD | MI350 | 4nm (TSMC) | 2H 2024 |

AMD has also confirmed its next-gen MI400 AI accelerator, which should be released in 2025 and feature a more capable architecture tuned for the AI era. AMD is also working on its ROCm software suite and has made certain blocks open source to fine-tune its performance for AI workloads.

AMD Radeon Instinct Accelerators

| Accelerator Name | AMD Instinct MI400 | AMD Instinct MI350X | AMD Instinct MI300X | AMD Instinct MI300A | AMD Instinct MI250X | AMD Instinct MI250 | AMD Instinct MI210 | AMD Instinct MI100 | AMD Radeon Instinct MI60 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CPU Architecture | Zen 5 (Exascale APU) | N/A | N/A | Zen 4 (Exascale APU) | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| GPU Architecture | CDNA 4 | CDNA 3+? | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Arcturus (CDNA 1) | Vega 20 | Vega 20 | Vega 10 | Fiji XT | Polaris 10 |
| GPU Process Node | 4nm | 4nm | 5nm+6nm | 5nm+6nm | 6nm | 6nm | 6nm | 7nm FinFET | 7nm FinFET | 7nm FinFET | 14nm FinFET | 28nm | 14nm FinFET |
| GPU Chiplets | TBD | TBD | 8 (MCM) | 8 (MCM) | 2 (MCM), 1 (Per Die) | 2 (MCM), 1 (Per Die) | 2 (MCM), 1 (Per Die) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) |
| GPU Cores | TBD | TBD | 19,456 | 14,592 | 14,080 | 13,312 | 6656 | 7680 | 4096 | 3840 | 4096 | 4096 | 2304 |
| GPU Clock Speed | TBD | TBD | 2100 MHz | 2100 MHz | 1700 MHz | 1700 MHz | 1700 MHz | 1500 MHz | 1800 MHz | 1725 MHz | 1500 MHz | 1000 MHz | 1237 MHz |
| INT8 Compute | TBD | TBD | 2614 TOPS | 1961 TOPS | 383 TOPS | 362 TOPS | 181 TOPS | 92.3 TOPS | N/A | N/A | N/A | N/A | N/A |
| FP16 Compute | TBD | TBD | 1.3 PFLOPs | 980.6 TFLOPs | 383 TFLOPs | 362 TFLOPs | 181 TFLOPs | 185 TFLOPs | 29.5 TFLOPs | 26.5 TFLOPs | 24.6 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP32 Compute | TBD | TBD | 163.4 TFLOPs | 122.6 TFLOPs | 95.7 TFLOPs | 90.5 TFLOPs | 45.3 TFLOPs | 23.1 TFLOPs | 14.7 TFLOPs | 13.3 TFLOPs | 12.3 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP64 Compute | TBD | TBD | 81.7 TFLOPs | 61.3 TFLOPs | 47.9 TFLOPs | 45.3 TFLOPs | 22.6 TFLOPs | 11.5 TFLOPs | 7.4 TFLOPs | 6.6 TFLOPs | 768 GFLOPs | 512 GFLOPs | 384 GFLOPs |
| VRAM | TBD | HBM3e | 192 GB HBM3 | 128 GB HBM3 | 128 GB HBM2e | 128 GB HBM2e | 64 GB HBM2e | 32 GB HBM2 | 32 GB HBM2 | 16 GB HBM2 | 16 GB HBM2 | 4 GB HBM1 | 16 GB GDDR5 |
| Infinity Cache | TBD | TBD | 256 MB | 256 MB | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Memory Clock | TBD | TBD | 5.2 Gbps | 5.2 Gbps | 3.2 Gbps | 3.2 Gbps | 3.2 Gbps | 1200 MHz | 1000 MHz | 1000 MHz | 945 MHz | 500 MHz | 1750 MHz |
| Memory Bus | TBD | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit | 4096-bit | 4096-bit | 4096-bit | 4096-bit | 2048-bit | 4096-bit | 256-bit |
| Memory Bandwidth | TBD | TBD | 5.3 TB/s | 5.3 TB/s | 3.2 TB/s | 3.2 TB/s | 1.6 TB/s | 1.23 TB/s | 1 TB/s | 1 TB/s | 484 GB/s | 512 GB/s | 224 GB/s |
| Form Factor | TBD | TBD | OAM | APU SH5 Socket | OAM | OAM | Dual Slot Card | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Half Length | Single Slot, Full Length |
| Cooling | TBD | TBD | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling |
| TDP (Max) | TBD | TBD | 750W | 760W | 560W | 500W | 300W | 300W | 300W | 300W | 300W | 175W | 150W |
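
For readers who want to sanity-check the memory bandwidth figures in the table above, here is a minimal sketch in Python. It assumes the usual rule of thumb that peak bandwidth equals bus width (in bits) times the per-pin data rate (in Gbps), divided by 8 bits per byte. The function name `peak_bandwidth_gbs` is our own and hypothetical, not something from AMD, and the calculation only applies directly to the rows whose memory clock is listed in Gbps.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Assumption: every pin of the bus transfers `data_rate_gbps` gigabits
    per second, and 8 bits make one byte.
    """
    return bus_width_bits * data_rate_gbps / 8


# Bus width and data rate values are taken from the spec table above.
examples = {
    "Instinct MI300X": (8192, 5.2),
    "Instinct MI250X": (8192, 3.2),
    "Instinct MI210": (4096, 3.2),
}

for name, (bus_bits, rate_gbps) in examples.items():
    gbs = peak_bandwidth_gbs(bus_bits, rate_gbps)
    # Convert GB/s to TB/s (decimal) for comparison with the table.
    print(f"{name}: {gbs:.1f} GB/s (~{gbs / 1000:.1f} TB/s)")
```

Rounded to one decimal place, these work out to roughly 5.3 TB/s, 3.2 TB/s and 1.6 TB/s, which lines up with the listed figures for those three parts.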