Understanding the generative AI development process
https://lnkd.in/gfTxqiYr
Back in the ancient days of machine learning, before you could use large language models (LLMs) as foundations for tuned models, you essentially had to train every possible machine learning model on all of your data to find the best (or least bad) fit. By ancient, I mean prior to the seminal paper on the transformer neural network architecture, “Attention Is All You Need,” in 2017.
Ziaul Kamal’s Post
More Relevant Posts
-
I've successfully completed Generative AI @Cisco Blue Belt. This learning pathway assesses a variety of technical skills associated with AI, including the fundamentals of machine learning, ANNs (Artificial Neural Networks), GenAI and transformer architectures, Hugging Face, LLM inference and fine-tuning, and the LangChain ecosystem.
-
🚀 𝐔𝐧𝐥𝐨𝐜𝐤𝐢𝐧𝐠 𝐭𝐡𝐞 𝐏𝐨𝐰𝐞𝐫 𝐨𝐟 𝐀𝐥𝐞𝐱𝐍𝐞𝐭: 𝐀 𝐃𝐞𝐞𝐩 𝐃𝐢𝐯𝐞 𝐢𝐧𝐭𝐨 𝐃𝐞𝐞𝐩 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 🌟
Excited to share my latest article, where we explore the revolutionary AlexNet architecture that transformed the world of computer vision! Dive into how this pioneering model achieved groundbreaking results in ILSVRC 2012 and discover how it’s built and implemented from scratch using TensorFlow.
🔍 𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
● The game-changing innovations of AlexNet
● Step-by-step implementation with TensorFlow
● Insights into training and performance on the Caltech-256 dataset
🔗 Read more to see how AlexNet continues to shape the future of AI and machine learning. Don’t forget to share, and let’s spark some conversations about the future of deep learning! 💬👇
#AlexNet #DeepLearning #ComputerVision #TensorFlow #MachineLearning #AI #DataScience
AlexNet: The Architecture That Changed AI
link.medium.com
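For readers who want a feel for the network before opening the article, here is a rough tf.keras sketch of an AlexNet-style model. The layer sizes follow the original 2012 paper, but the class count (assumed here to match Caltech-256 plus a clutter class) and training settings are illustrative assumptions, not the article's exact code.

```python
# Sketch of an AlexNet-style network in tf.keras; layer sizes follow the 2012
# paper, but the class count and compile settings are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_alexnet(input_shape=(227, 227, 3), num_classes=257):  # 257 assumes Caltech-256 + clutter
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(96, 11, strides=4, activation="relu"),   # large-stride first conv
        layers.MaxPooling2D(pool_size=3, strides=2),
        layers.Conv2D(256, 5, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=2),
        layers.Conv2D(384, 3, padding="same", activation="relu"),
        layers.Conv2D(384, 3, padding="same", activation="relu"),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=2),
        layers.Flatten(),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(0.5),                                    # dropout was a key AlexNet trick
        layers.Dense(4096, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="sgd",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_alexnet()
model.summary()
```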
-
Incredible to me that one academic paper from Google, "Attention Is All You Need," back in 2017 spawned the Transformer architecture that is still driving the seismic changes in AI. But, it wasn't some new discovery like penicillin or the Polio vaccine or the wheel. It involved cleverly tweaking ML techniques that had been in use for years. And it's just neural networks!!!!!! For all the hype, some academic paper could land next week that blows this paradigm out of the water! Maybe even a technique that can actually add 2 + 2 (try it, simple math is beyond most LLMs😹)
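For context on what that "clever tweak" boils down to, here is a minimal single-head sketch of scaled dot-product attention in plain numpy; it omits the multi-head projections, masking, and positional encodings of the full Transformer.

```python
# Minimal sketch of scaled dot-product attention, the core operation of the
# Transformer ("Attention Is All You Need"); single head, no learned projections.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) arrays of query, key, and value vectors
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                # weighted mix of value vectors

# Toy example: 4 tokens with 8-dimensional embeddings, attending to themselves
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```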
-
Day 3 of Learning: Today, I explored CNN architecture and CNN layers. Check out the full story on my blog! 👇 https://lnkd.in/gKa7iz2s #CNN #MachineLearning #AI #DeepLearning #TechBlog #NeuralNetworks #NewLearning
CNN Simple Architecture
medium.com
-
🚀 Successfully built and trained a Convolutional Neural Network (CNN) on the Fashion MNIST dataset! 👗👚
✅ Achieved 91% accuracy on the test set, classifying various fashion items with a robust model architecture. 🎯
The model included:
● Two Conv2D layers for feature extraction 🧠
● MaxPooling layers for dimensionality reduction 📏
● Dense layers for classification 🎉
📊 Visualized performance using confusion matrices and loss/accuracy plots to fine-tune the model.
GitHub link: https://lnkd.in/gh9rTkr8
#DeepLearning #MachineLearning #AI #CNN #NeuralNetworks
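The full code is in the linked repo; as a hedged sketch, the architecture described above could look roughly like this in tf.keras, with filter counts, optimizer, and epochs as illustrative assumptions rather than the author's exact settings.

```python
# Sketch of a small CNN for Fashion MNIST in tf.keras; filter counts, optimizer,
# and epochs are illustrative assumptions, not the author's exact setup.
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train = x_train[..., None] / 255.0   # add channel dimension, scale to [0, 1]
x_test = x_test[..., None] / 255.0

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),   # first feature-extraction block
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),   # second feature-extraction block
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),    # 10 fashion classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_split=0.1)
print(model.evaluate(x_test, y_test))
```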
-
A fascinating new paper called "Were RNNs all we need?" (arxiv.org/abs/2410.01201), which revisits traditional RNNs, was just published a few days ago. The authors introduce minimal versions of LSTMs and GRUs, called minLSTMs and minGRUs, which demonstrate strong empirical performance on various tasks, including the long-range Selective Copying task, MuJoCo locomotion tasks from the D4RL benchmark, and character-level language modeling on the Shakespeare dataset.
This research challenges the notion that traditional RNNs are inherently inferior to more complex architectures like Transformers. By addressing the key limitation of slow training, minLSTMs and minGRUs offer an alternative for sequence modeling, particularly in scenarios where computational efficiency is paramount.
Why does this matter? For me personally, the important point is that in the large-data regime, the architecture itself—whether it's transformers, RNNs, or even architectures like Mamba—might not be the most critical factor. What's crucial is the ability to express the dataset effectively, a concept sometimes referred to as curve-fitting in machine learning. In deep learning, the "curve" represents the model's architecture and parameters (weights and biases). However, I think that, at a high level, the specific architecture (like RNNs, transformers, etc.) may be less important than the quality and size of the data itself. As long as the model is expressive enough, it will eventually "fit/embed" the dataset well, especially in the large-data regime. The idea is that with enough data, the "best" model is less about the specific bells and whistles of the architecture and more about the ability to capture the data's structure.
#sequencingmodeling #RNN #LSTM #GRU #deeplearning #AI
Were RNNs All We Needed?
arxiv.org
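To make the core idea concrete, here is a rough numpy sketch of the minGRU recurrence as I understand it from the paper: the update gate and candidate state depend only on the current input, not on the previous hidden state, which is what makes training parallelizable. The sequential loop below is for readability (the paper uses a parallel scan), and the weight shapes and initialization are illustrative assumptions.

```python
# Rough sketch of the minGRU recurrence from "Were RNNs All We Needed?":
# gate and candidate are functions of x_t only. Shapes and init are illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def min_gru(x, W_z, W_h, h0=None):
    # x: (seq_len, d_in); W_z, W_h: (d_in, d_hidden)
    seq_len, _ = x.shape
    d_hidden = W_z.shape[1]
    h = np.zeros(d_hidden) if h0 is None else h0
    outputs = []
    for t in range(seq_len):
        z_t = sigmoid(x[t] @ W_z)        # update gate from the input only
        h_tilde = x[t] @ W_h             # candidate state from the input only
        h = (1.0 - z_t) * h + z_t * h_tilde
        outputs.append(h)
    return np.stack(outputs)             # (seq_len, d_hidden)

# Toy usage: 16 timesteps, 8 input features, 32 hidden units
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
W_z = rng.normal(size=(8, 32)) * 0.1
W_h = rng.normal(size=(8, 32)) * 0.1
print(min_gru(x, W_z, W_h).shape)         # (16, 32)
```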
-
Exploring Deep Learning with VGG-16 on CIFAR-10!
I'm excited to share my recent project where I implemented the VGG-16 Convolutional Neural Network (CNN) architecture on the CIFAR-10 dataset. This was a great opportunity to dive into the world of image classification, leveraging the power of transfer learning with VGG-16's deep layers to recognize 10 different classes of images.
Key Takeaways:
● Dataset: CIFAR-10 (60,000 32x32 color images in 10 classes)
● Architecture: VGG-16 – a classic deep learning model known for its simplicity and effectiveness
● Techniques: Preprocessing, data augmentation, and fine-tuning
● Results: Achieved promising accuracy, and further improvements are possible with more tuning!
Check out the code: https://lnkd.in/gKtweqJn
I’m constantly learning and growing in my AI/ML journey and would love to connect with others passionate about deep learning and computer vision.
#DeepLearning #AI #VGG16 #CIFAR10 #MachineLearning #ImageClassification #ArtificialIntelligence #NeuralNetworks
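The linked notebook has the real implementation; as a rough sketch under assumptions (frozen pretrained base, small dense head, default hyperparameters), VGG-16 transfer learning on CIFAR-10 in tf.keras might look like this:

```python
# Sketch of VGG-16 transfer learning on CIFAR-10 with tf.keras; the frozen-base
# plus small-head setup and hyperparameters are assumptions, not the notebook's code.
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = tf.keras.applications.vgg16.preprocess_input(x_train.astype("float32"))
x_test = tf.keras.applications.vgg16.preprocess_input(x_test.astype("float32"))

base = tf.keras.applications.VGG16(include_top=False,
                                   weights="imagenet",
                                   input_shape=(32, 32, 3))
base.trainable = False                    # freeze the pretrained convolutional base

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),   # CIFAR-10 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_split=0.1, batch_size=64)
print(model.evaluate(x_test, y_test))
```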
-
🎓 Efficient ML (MIT)
One of the most important topics in AI today is efficiency, given the large amounts of computational resources that modern ML systems require. This course provides a solid overview of techniques that enable efficient ML systems.
Includes lectures on:
- Compression
- Pruning
- Quantization
- Neural Architecture Search
- Distributed Training
- Data/Model Parallelism
- On-Device Fine-tuning
... and a whole lot more.
https://lnkd.in/eBqF5Nyd
↓ For more, follow my weekly summary of the top AI and LLM papers. Read by 65K+ AI researchers and developers: https://lnkd.in/e6ajg945
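As a tiny taste of one of the listed topics, here is an illustrative numpy sketch of global magnitude pruning (zeroing out the smallest-magnitude weights); it is a toy example, not material from the course.

```python
# Toy sketch of magnitude pruning: zero the smallest-magnitude weights.
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold             # keep only weights above it
    return weights * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned = magnitude_prune(W, sparsity=0.5)
print("kept weights:", int((W_pruned != 0).sum()), "of", W.size)
```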
-
I finished writing an exhaustive piece on a novel neural network architecture that beats MLPs, KANs, conventional Transformers, and Mamba in different real-world tasks. It's a deep dive where we learn to build this architecture from the ground up, both mathematically and then in code, all in easy-to-follow language. This is a ton of value! Publishing soon. Follow along: https://intoai.pub #artificialintelligence #ai #datascience #tech
-
Just wrapped up the Data Science Cisco Generative AI Blue Belt 2024 🚀 This program has given me a solid foundation in the world of Generative AI, covering essential topics like Large Language Models (LLMs), AI, Machine Learning (ML), Deep Learning (DL), and Artificial Neural Networks (ANN). In addition to this, throughout the course, I've gained insights into prompt engineering and the transformative power of transformer architecture. Understanding where to find data and how to handle it responsibly has been crucial, along with diving deep into the attention mechanism 🧑🏼💻 I'm eager to apply these skills and knowledge in practical scenarios and explore the endless possibilities that Generative AI offers 🙌🏼 #GenerativeAI #DataScience #AI #MachineLearning #DeepLearning
Data Science @Cisco Generative AI Blue Belt 2024 was issued by Cisco to Carlos Javier Martin Escabia.
credly.com