Learn how to compress models with the Hugging Face Transformers and Quanto libraries in this interesting short course from DeepLearning.AI! Check it out here: https://lnkd.in/daGR6re6 #LLM #huggingface #QLora
Abdul Samad's Post
-
Quantization is one particular process powering NVIDIA's #AI Triton Server. To learn about #quantization, check out this new course by Andrew Ng's DeepLearning.AI. https://lnkd.in/gPMGM2yu #mlops #ml #AI #huggingface #llms #genai #tritonserver #inference #machinelearning #science #learning
Quantization Fundamentals with Hugging Face
deeplearning.ai
-
Do you want to reduce the size of an LLM for speed and efficiency while maintaining performance? It will help your enterprise LLM system. Upgrade yourself to upgrade your system via this FREE course 😊
New free beginner-friendly Hugging Face course on DeepLearning.AI! 👨🎓🚀 As models get bigger, we need to find new, efficient ways to make them accessible and practical. 👀 Quantization Fundamentals with Hugging Face is a new 1-hour free course where you will learn how to: 📚 Quantize nearly any open-source model 👨🔬 Use int8 and bfloat16 (Brain float 16) data types to load and run LLMs using PyTorch and the Hugging Face Transformers library 🧠 Dive into the technical details of linear quantization to map 32-bit floats to 8-bit integers Start now and for free: https://lnkd.in/eWjn2vT7 Join Younes Belkada and Marc Sun and learn how to efficiently run open LLMs with Hugging Face. Thank you, Andrew Ng and DeepLearning.AI, for this collaboration. 🤗
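The linear quantization the course covers (mapping 32-bit floats to 8-bit integers) can be sketched in plain Python; this is a minimal, illustrative symmetric int8 scheme, not the course's exact code:

```python
def linear_quantize(values, num_bits=8):
    """Symmetric linear quantization: map floats onto signed integers."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for int8
    scale = max(abs(v) for v in values) / qmax  # one scale for the whole list
    quantized = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return quantized, scale

def linear_dequantize(quantized, scale):
    """Recover approximate floats from the integers and the scale."""
    return [q * scale for q in quantized]

weights = [0.5, -1.2, 3.4, -3.4]
q, scale = linear_quantize(weights)   # q = [19, -45, 127, -127]
approx = linear_dequantize(q, scale)  # close to the original weights
```

Each float now costs one byte instead of four, at the price of a rounding error of at most half a quantization step.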
Quantization Fundamentals with Hugging Face
deeplearning.ai
-
In the ever-evolving landscape of AI, especially with the rapid growth of large language models (LLMs), managing resource constraints is becoming increasingly crucial. LLMs often require gigabytes of memory, making them challenging to run on consumer-grade hardware. This is where the technique of quantization comes into play. Quantization can significantly reduce the size of these models—often by 4x or more—while maintaining reasonable performance levels. This makes a wider selection of models accessible to developers, enabling them to run efficiently on various devices. However, it's important to note that while quantization can reduce model size, its practical usability may be limited depending on the scenario and the goals of the project. If you're interested in learning more about this transformative technique, I highly recommend the short course on Quantization Fundamentals, taught by experts from Hugging Face, Younes Belkada and Marc Sun. In just around two hours, this course covers: - How to quantize nearly any open-source model using the Quanto library. - Implementing int8 and bfloat16 (Brain float 16) data types to load and run LLMs with PyTorch and the Hugging Face Transformers library. - The technical details of linear quantization to map 32-bit floats to 8-bit integers. - Applying "downcasting" to compress models further using the BFloat16 data type. By the end of this course, you will have a solid foundation in quantization techniques, enabling you to compress and optimize your own generative AI models, making them more efficient and accessible. This is an excellent opportunity to enhance your skills in model quantization for generative AI. Investing just a couple of hours in this course can open up new possibilities for your projects. Check out the course here: https://lnkd.in/g66yNW8W Check my accomplishment on: https://lnkd.in/dRF75utf Hope to see you in the course! #AI #MachineLearning #Quantization #LLMs #PyTorch #HuggingFace #GenerativeAI
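As a rough illustration of the float-to-int8 mapping described above: real tensor values rarely center on zero, so quantizers often use an asymmetric scheme with a zero point. This is a generic plain-Python sketch, not the Quanto library's actual implementation:

```python
def quantize_asymmetric(values, num_bits=8):
    """Asymmetric linear quantization: map [min, max] onto [0, 2^bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)  # integer that represents 0.0
    q = [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map the integers back to approximate floats."""
    return [(qi - zero_point) * scale for qi in q]

activations = [-1.0, 0.0, 2.0]
q, scale, zp = quantize_asymmetric(activations)  # q = [0, 85, 255], zp = 85
restored = dequantize(q, scale, zp)
```

The zero point shifts the integer range so that 0.0 is represented exactly, which matters because neural network tensors are full of zeros (padding, ReLU outputs).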
Quantization Fundamentals with Hugging Face
deeplearning.ai
-
Unleashing the power of AI with the Rabbit R1: a sleek, retro-designed device with enhanced AI performance, bridging the gap between technology and user experience. Book yours now for only $199. Join the new AI revolution. #RabbitR1 #AI #performance
The Future is here... Unveiling the Rabbit R1.
itsmealdri.wixsite.com
-
🚀 Just completed the course "Quantization Fundamentals with Hugging Face" on DeepLearning.AI! Here are some key takeaways I'm excited to share: - Scale and efficiency: Quantization can optimize compute performance 📈 and reduce memory demands 📉 - Data types: Which data types are commonly used in AI models to store parameters, and how to use downcasting 🗜 with PyTorch - Practical tools: How to leverage the quanto and transformers libraries from Hugging Face to quantize AI models A bonus for me was the practical examples 🔬 alongside the video lectures, which allowed me to apply the concepts in real time. https://lnkd.in/eaRQktPx
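On the downcasting takeaway: bfloat16 keeps float32's sign bit and all 8 exponent bits but only the top 7 of the 23 mantissa bits. The course does this with PyTorch (`torch.bfloat16`); purely as an illustration of the bit-level effect, the truncation can be simulated with the standard library:

```python
import struct

def downcast_to_bfloat16(x):
    """Simulate float32 -> bfloat16 by zeroing the low 16 bits (truncation)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # float32 bit pattern
    bits &= 0xFFFF0000  # keep sign, exponent, and top 7 mantissa bits
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(downcast_to_bfloat16(1.0))      # exactly representable, unchanged
print(downcast_to_bfloat16(3.14159))  # small precision loss
```

Because the exponent width is unchanged, bfloat16 covers the same dynamic range as float32 in half the memory, which is why it works so well as a drop-in downcast for model weights.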
Quantization Fundamentals with Hugging Face
deeplearning.ai
-
Boost long-context reasoning with DuoAttention! Reduce memory usage and enhance decoding speed while maintaining accuracy for tasks involving millions of tokens. #AI #LLM #DuoAttention #opensource #agent #rag #memory #tokens
DuoAttention: Single GPU Achieves 3.3 Million Token Context Inference
aidisruptionpub.com
-
Check out my latest article on Deep Learning and Face Detection! We dive into MTCNN, FaceNet, and KNN classification. Would love to hear your comments. Link in bio.
Building a Modern Face Recognition System with Deep Learning and Web Interface
link.medium.com
-
🌟 Excited to Share My Latest Article! 🌟 I’ve just published a new article on integrating the Depth-Pro (apple/ml-depth-pro) model with a YOLO V5 custom object detector for real-time depth and distance estimation. This powerful combination not only enhances object detection capabilities but also provides critical spatial information, paving the way for advancements in real-world computer vision problems. 🖼️🔍 A special shoutout to Nicolai Nielsen for his insights on various real-world computer vision concepts that inspired this work! 🙌 If you're interested in computer vision and want to learn more about how to leverage these technologies, check it out! 📚✨ Article Link: https://lnkd.in/gdSdeR-3 Thank you for your support, and I look forward to your thoughts! 💬 #ai #imageprocessing #computervision #deeplearning #depthestimation
-
🚀 Just finished the "Introduction to on-device AI" short course from DeepLearning.AI and Qualcomm, and I had a blast! • I learnt how to effectively deploy AI models from the cloud to smartphones and edge devices, using their local compute power for faster and more secure inference. • I explored model conversion by converting PyTorch/TensorFlow models for device compatibility, and quantizing them to achieve performance gains while reducing model size. • I learnt the main concepts of model quantization and the importance of model conversion. • I learnt about device integration, including runtime dependencies, and how GPU, NPU, and CPU compute-unit utilization affects performance. #deeplearningai #qualcomm #ondeviceai #ai
Hussein Eid, congratulations on completing Introduction to on-device AI!
learn.deeplearning.ai