Understanding the generative AI development process
https://lnkd.in/gfTxqiYr
Back in the ancient days of machine learning, before you could use large language models (LLMs) as foundations for tuned models, you essentially had to train every possible machine learning model on all of your data to find the best (or least bad) fit. By ancient, I mean prior to the seminal paper on the transformer neural network architecture, “Attention Is All You Need,” in 2017.
Ziaul Kamal’s Post
More Relevant Posts
-
I've successfully completed Generative AI @Cisco Blue Belt. This learning pathway assesses a variety of technical skills associated with AI, including the fundamentals of machine learning, ANNs (Artificial Neural Networks), GenAI and transformer architectures, Hugging Face, LLM inference and fine-tuning, and the LangChain ecosystem.
-
🚀 𝐔𝐧𝐥𝐨𝐜𝐤𝐢𝐧𝐠 𝐭𝐡𝐞 𝐏𝐨𝐰𝐞𝐫 𝐨𝐟 𝐀𝐥𝐞𝐱𝐍𝐞𝐭: 𝐀 𝐃𝐞𝐞𝐩 𝐃𝐢𝐯𝐞 𝐢𝐧𝐭𝐨 𝐃𝐞𝐞𝐩 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 🌟
Excited to share my latest article, where we explore the revolutionary AlexNet architecture that transformed the world of computer vision! Dive into how this pioneering model achieved groundbreaking results in ILSVRC 2012 and discover how it’s built and implemented from scratch using TensorFlow.
🔍 𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
● The game-changing innovations of AlexNet
● Step-by-step implementation with TensorFlow
● Insights into training and performance on the Caltech-256 dataset
🔗 Read more to see how AlexNet continues to shape the future of AI and machine learning. Don’t forget to share, and let’s spark some conversations about the future of deep learning! 💬👇
#AlexNet #DeepLearning #ComputerVision #TensorFlow #MachineLearning #AI #DataScience
AlexNet: The Architecture That Changed AI
link.medium.com
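For readers who want a feel for the network before opening the article, here is a rough tf.keras sketch of an AlexNet-style model. The layer sizes follow the original 2012 paper, but the class count (assumed here to match Caltech-256 plus a clutter class) and training settings are illustrative assumptions, not the article's exact code.

```python
# Sketch of an AlexNet-style network in tf.keras; layer sizes follow the 2012
# paper, but the class count and compile settings are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_alexnet(input_shape=(227, 227, 3), num_classes=257):  # 257 assumes Caltech-256 + clutter
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(96, 11, strides=4, activation="relu"),   # large-stride first conv
        layers.MaxPooling2D(pool_size=3, strides=2),
        layers.Conv2D(256, 5, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=2),
        layers.Conv2D(384, 3, padding="same", activation="relu"),
        layers.Conv2D(384, 3, padding="same", activation="relu"),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=2),
        layers.Flatten(),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(0.5),                                    # dropout was a key AlexNet trick
        layers.Dense(4096, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="sgd",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_alexnet()
model.summary()
```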
-
Incredible to me that one academic paper from Google, "Attention Is All You Need," back in 2017 spawned the Transformer architecture that is still driving the seismic changes in AI. But, it wasn't some new discovery like penicillin or the Polio vaccine or the wheel. It involved cleverly tweaking ML techniques that had been in use for years. And it's just neural networks!!!!!! For all the hype, some academic paper could land next week that blows this paradigm out of the water! Maybe even a technique that can actually add 2 + 2 (try it, simple math is beyond most LLMs😹)
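For context on what that "clever tweak" boils down to, here is a minimal single-head sketch of scaled dot-product attention in plain numpy; it omits the multi-head projections, masking, and positional encodings of the full Transformer.

```python
# Minimal sketch of scaled dot-product attention, the core operation of the
# Transformer ("Attention Is All You Need"); single head, no learned projections.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) arrays of query, key, and value vectors
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                # weighted mix of value vectors

# Toy example: 4 tokens with 8-dimensional embeddings, attending to themselves
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```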
-
Day 3 of Learning: Today, I explored CNN architecture and CNN layers. Check out the full story on my blog! 👇 https://lnkd.in/gKa7iz2s #CNN #MachineLearning #AI #DeepLearning #TechBlog #NeuralNetworks #NewLearning
CNN Simple Architecture
medium.com
-
🚀 Successfully built and trained a Convolutional Neural Network (CNN) on the Fashion MNIST dataset! 👗👚
✅ Achieved 91% accuracy on the test set, classifying various fashion items with a robust model architecture. 🎯
The model included:
● Two Conv2D layers for feature extraction 🧠
● MaxPooling layers for dimensionality reduction 📏
● Dense layers for classification 🎉
📊 Visualized performance using confusion matrices and loss/accuracy plots to fine-tune the model.
GitHub link: https://lnkd.in/gh9rTkr8
#DeepLearning #MachineLearning #AI #CNN #NeuralNetworks
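The full code is in the linked repo; as a hedged sketch, the architecture described above could look roughly like this in tf.keras, with filter counts, optimizer, and epochs as illustrative assumptions rather than the author's exact settings.

```python
# Sketch of a small CNN for Fashion MNIST in tf.keras; filter counts, optimizer,
# and epochs are illustrative assumptions, not the author's exact setup.
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train = x_train[..., None] / 255.0   # add channel dimension, scale to [0, 1]
x_test = x_test[..., None] / 255.0

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),   # first feature-extraction block
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),   # second feature-extraction block
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),    # 10 fashion classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_split=0.1)
print(model.evaluate(x_test, y_test))
```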
-
A fascinating new paper called "Were RNNs all we need?" (arxiv.org/abs/2410.01201), which revisits traditional RNNs, was just published a few days ago. The authors introduce minimal versions of LSTMs and GRUs, called minLSTMs and minGRUs, which demonstrate strong empirical performance on various tasks, including the long-range Selective Copying task, MuJoCo locomotion tasks from the D4RL benchmark, and character-level language modeling on the Shakespeare dataset.
This research challenges the notion that traditional RNNs are inherently inferior to more complex architectures like Transformers. By addressing the key limitation of slow training, minLSTMs and minGRUs offer an alternative for sequence modeling, particularly in scenarios where computational efficiency is paramount.
Why does this matter? For me personally, the important point is that in the large-data regime, the architecture itself—whether it's transformers, RNNs, or even architectures like Mamba—might not be the most critical factor. What's crucial is the ability to express the dataset effectively, a concept sometimes referred to as curve-fitting in machine learning. In deep learning, the "curve" represents the model's architecture and parameters (weights and biases). However, I think that, at a high level, the specific architecture (like RNNs, transformers, etc.) may be less important than the quality and size of the data itself. As long as the model is expressive enough, it will eventually "fit/embed" the dataset well, especially in the large-data regime. The idea is that with enough data, the "best" model is less about the specific bells and whistles of the architecture and more about the ability to capture the data's structure.
#sequencingmodeling #RNN #LSTM #GRU #deeplearning #AI
Were RNNs All We Needed?
arxiv.org
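To make the core idea concrete, here is a rough numpy sketch of the minGRU recurrence as I understand it from the paper: the update gate and candidate state depend only on the current input, not on the previous hidden state, which is what makes training parallelizable. The sequential loop below is for readability (the paper uses a parallel scan), and the weight shapes and initialization are illustrative assumptions.

```python
# Rough sketch of the minGRU recurrence from "Were RNNs All We Needed?":
# gate and candidate are functions of x_t only. Shapes and init are illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def min_gru(x, W_z, W_h, h0=None):
    # x: (seq_len, d_in); W_z, W_h: (d_in, d_hidden)
    seq_len, _ = x.shape
    d_hidden = W_z.shape[1]
    h = np.zeros(d_hidden) if h0 is None else h0
    outputs = []
    for t in range(seq_len):
        z_t = sigmoid(x[t] @ W_z)        # update gate from the input only
        h_tilde = x[t] @ W_h             # candidate state from the input only
        h = (1.0 - z_t) * h + z_t * h_tilde
        outputs.append(h)
    return np.stack(outputs)             # (seq_len, d_hidden)

# Toy usage: 16 timesteps, 8 input features, 32 hidden units
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
W_z = rng.normal(size=(8, 32)) * 0.1
W_h = rng.normal(size=(8, 32)) * 0.1
print(min_gru(x, W_z, W_h).shape)         # (16, 32)
```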
-
Exploring Deep Learning with VGG-16 on CIFAR-10!
I'm excited to share my recent project where I implemented the VGG-16 Convolutional Neural Network (CNN) architecture on the CIFAR-10 dataset. This was a great opportunity to dive into the world of image classification, leveraging the power of transfer learning with VGG-16's deep layers to recognize 10 different classes of images.
Key Takeaways:
● Dataset: CIFAR-10 (60,000 32x32 color images in 10 classes)
● Architecture: VGG-16 – a classic deep learning model known for its simplicity and effectiveness
● Techniques: Preprocessing, data augmentation, and fine-tuning
● Results: Achieved promising accuracy, and further improvements are possible with more tuning!
Check out the code: https://lnkd.in/gKtweqJn
I’m constantly learning and growing in my AI/ML journey and would love to connect with others passionate about deep learning and computer vision.
#DeepLearning #AI #VGG16 #CIFAR10 #MachineLearning #ImageClassification #ArtificialIntelligence #NeuralNetworks
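The linked notebook has the real implementation; as a rough sketch under assumptions (frozen pretrained base, small dense head, default hyperparameters), VGG-16 transfer learning on CIFAR-10 in tf.keras might look like this:

```python
# Sketch of VGG-16 transfer learning on CIFAR-10 with tf.keras; the frozen-base
# plus small-head setup and hyperparameters are assumptions, not the notebook's code.
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = tf.keras.applications.vgg16.preprocess_input(x_train.astype("float32"))
x_test = tf.keras.applications.vgg16.preprocess_input(x_test.astype("float32"))

base = tf.keras.applications.VGG16(include_top=False,
                                   weights="imagenet",
                                   input_shape=(32, 32, 3))
base.trainable = False                    # freeze the pretrained convolutional base

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),   # CIFAR-10 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_split=0.1, batch_size=64)
print(model.evaluate(x_test, y_test))
```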
-
🎓 Efficient ML (MIT)
One of the most important topics in AI today is efficiency, given the large amounts of computational resources that modern ML systems require. This course provides a solid overview of techniques that enable efficient ML systems.
Includes lectures on:
- Compression
- Pruning
- Quantization
- Neural Architecture Search
- Distributed Training
- Data/Model Parallelism
- On-Device Fine-tuning
... and a whole lot more.
https://lnkd.in/eBqF5Nyd
↓ For more, follow my weekly summary of the top AI and LLM papers. Read by 65K+ AI researchers and developers: https://lnkd.in/e6ajg945
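As a tiny taste of one of the listed topics, here is an illustrative numpy sketch of global magnitude pruning (zeroing out the smallest-magnitude weights); it is a toy example, not material from the course.

```python
# Toy sketch of magnitude pruning: zero the smallest-magnitude weights.
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold             # keep only weights above it
    return weights * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned = magnitude_prune(W, sparsity=0.5)
print("kept weights:", int((W_pruned != 0).sum()), "of", W.size)
```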
-
I finished writing an exhaustive piece on a novel neural network architecture that beats MLPs, KANs, conventional Transformers, and Mamba in different real-world tasks. It's a deep dive where we learn to build this architecture from the ground up, both mathematically and then in code, all in easy-to-follow language. This is a ton of value! Publishing soon. Follow along: https://intoai.pub #artificialintelligence #ai #datascience #tech
-
Just wrapped up the Data Science Cisco Generative AI Blue Belt 2024 🚀 This program has given me a solid foundation in the world of Generative AI, covering essential topics like Large Language Models (LLMs), AI, Machine Learning (ML), Deep Learning (DL), and Artificial Neural Networks (ANN). In addition to this, throughout the course, I've gained insights into prompt engineering and the transformative power of transformer architecture. Understanding where to find data and how to handle it responsibly has been crucial, along with diving deep into the attention mechanism 🧑🏼💻 I'm eager to apply these skills and knowledge in practical scenarios and explore the endless possibilities that Generative AI offers 🙌🏼 #GenerativeAI #DataScience #AI #MachineLearning #DeepLearning
Data Science @Cisco Generative AI Blue Belt 2024 was issued by Cisco to Carlos Javier Martin Escabia.
credly.com