Welcome to SymphonyAI’s weekly generative AI newsletter, summarizing the important AI industry developments and technology advancements you need to know.
While the spotlight often shines on massive AI models, the true revolution is unfolding with Small Language Models (SLMs). These efficient models thrive by using smarter, leaner reasoning. With advancements in model compression, knowledge distillation, and retrieval-augmented generation (RAG), SLMs have evolved beyond their earlier limitations. These innovations allow developers to shrink large models without compromising intelligence. RAG in particular allows SLMs to tap into external knowledge bases on demand, letting them remain nimble without needing to store vast amounts of data internally. The result? AI that’s not just smaller—it’s specialized, efficient, and lightning-fast.
This week, we cover how leading companies are rapidly adopting SLMs and the key developments driving this transformation.
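To make the RAG idea above concrete, here is a minimal sketch of the retrieve-then-prompt pattern. It is an illustration under stated assumptions, not any vendor's implementation: the knowledge base, the bag-of-words scoring, and the helper names (`retrieve`, `build_prompt`) are all hypothetical, and the final call to a small language model is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical external knowledge base (assumption for illustration).
KNOWLEDGE_BASE = [
    "Zamba2-2.7B is an open-source small language model from Zyphra.",
    "xLAM-1B is a 1B-parameter model from Salesforce aimed at function calling.",
    "Gemini Live is a voice mode for Android users.",
]

question = "Which company released xLAM-1B?"
passages = retrieve(question, KNOWLEDGE_BASE, k=1)
prompt = build_prompt(question, passages)
print(prompt)  # this prompt would then be sent to the small model
```

The point of the design is that the facts live in the knowledge base rather than in the model's weights, so the model itself can stay small. Production systems typically replace the word-count scoring with learned embeddings and a vector index.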
SymphonyAI news
Next-gen AI: Small models and multimodality
- Precision Meets Efficiency in AI: NVIDIA introduced Llama-3.1-Nemotron-51B, a language model derived from Meta’s Llama-3.1-70B. It achieves 2.2x faster inference than the 70B model and can run 4x larger workloads on a single GPU, reducing costs and supporting scalable AI deployments.
- Phi-3: Redefining what’s possible with SLMs: Microsoft recently announced the availability of Phi-3.5 MoE (Mixture of Experts) in Azure AI Studio, offering dynamic model scaling so enterprises can use powerful AI while optimizing compute efficiency.
- Salesforce’s xLAM-1B: AI Efficiency Leader: Salesforce released xLAM-1B, a 1B-parameter SLM that outperforms larger models on function-calling tasks. Designed for deploying autonomous AI agents, xLAM-1B handles complex task execution while balancing capability against operational constraints.
- Zamba2 Launch: Open-Source SLM: Zyphra introduced Zamba2-2.7B, a high-performance open-source SLM whose reduced computational demands make it suitable for a wide range of industries.
- AI Voice Interaction for All: Google's Gemini Live voice mode is now available for all Android users. It enhances AI accessibility with hands-free, multi-turn conversations and supports personalized interactions.
- AI-Driven Engagement in Social Media: Meta introduced new multimodal AI features, including voice interaction, photo editing, and translation. The updates aim to increase user engagement and enhance ad performance.
AI in financial services
- Navigating AI in Stock Picks: Israeli startup Bridgewise will soon launch Stocktalk, a stock-picking chatbot approved for use by Israel Discount Bank customers. Stocktalk offers financial disclosure summaries, company background information, and market-based stock recommendations. However, concerns about AI-driven market instability persist, as SEC Chair Gary Gensler has highlighted.
- Enhanced Analytics to Combat Financial Crime: Nasdaq Verafin's Targeted Typology Analytics advances AI-based detection of terrorist financing (an estimated $11 billion in illicit funds) and drug trafficking (an estimated $800 billion), using data from 2,500+ financial institutions.
Generative AI impact, adoption, and projections
- Rapid Rise of Generative AI Usage: A recent study shows 39.4% of Americans adopted generative AI within two years, surpassing early PC and internet adoption rates. AI is broadly used across sectors and saves time in tasks such as writing (57%) and information searches (49%); 28% of employees use AI at work, and 25% use it weekly. Potential productivity gains are estimated at 0.125% to 0.875%.
- Navigating the New AI Infrastructure Paradigm: There is a shift towards hybrid infrastructures driven by generative AI demands, with 85% of cloud buyers deploying or planning hybrid solutions. Key use cases include latency-sensitive applications.
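The generative AI usage item above quotes a productivity range of 0.125% to 0.875% without saying how it was derived. One back-of-the-envelope reading that reproduces it is to multiply the share of work hours assisted by generative AI by an assumed productivity gain on those hours. Both inputs below (0.5% to 3.5% of hours assisted, a 25% gain per assisted hour) are assumptions chosen because they reproduce the quoted range; the newsletter does not state them.

```python
def productivity_gain_pct(assisted_hours_pct, gain_per_assisted_hour):
    """Aggregate gain (in %) = share of hours assisted (in %) x gain on those hours."""
    return assisted_hours_pct * gain_per_assisted_hour


# Assumed inputs, not stated in the newsletter item.
ASSUMED_GAIN = 0.25        # 25% productivity gain on each AI-assisted hour
ASSUMED_LOW_HOURS = 0.5    # 0.5% of all work hours assisted
ASSUMED_HIGH_HOURS = 3.5   # 3.5% of all work hours assisted

low = productivity_gain_pct(ASSUMED_LOW_HOURS, ASSUMED_GAIN)
high = productivity_gain_pct(ASSUMED_HIGH_HOURS, ASSUMED_GAIN)
print(f"{low}% to {high}%")  # prints "0.125% to 0.875%"
```

Under this reading the headline figure is small because only a small share of total work hours is currently AI-assisted, even when the gain on each assisted hour is large.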
Big tech
- Fine-Tuning and New Model Support in Azure AI: Microsoft Azure AI introduces fine-tuning for GPT-4o and GPT-4o mini. New models like Phi-3.5-MoE and Llama 3.2 enhance Azure's capabilities in multilingual processing and image reasoning.
- Microsoft Unveils AI Hallucination Correction Tool: Microsoft's new Azure AI Content Safety feature enhances AI reliability by detecting inconsistencies and triggering a smaller AI model to correct unsupported text. The Groundedness Detection tool reduces inconsistencies to 0.1-1%, aiming to mitigate hallucinations for broader AI applications.
- Microsoft Launches Azure AI Inference SDK for .NET: Microsoft's Azure AI Inference SDK for .NET simplifies integration of generative AI models from Azure AI Studio's catalog. It supports advanced AI functionalities with minimal setup, enhancing tasks like chat integration.
Responsible AI and public policy
- Empowering Futures: Quantum AI Challenge: Flapmax and Intel's Quantum AI Challenge invites HBCU students to tackle real-world problems using quantum computing and AI. Participants will access advanced tools, collaborate with experts, and address sustainability challenges.
- US Launches AI Partnership with Meta, OpenAI, and NVIDIA: The U.S. government launched the Partnership for Global Inclusivity on AI, committing over $100 million. Key initiatives include increasing AI model access, building technical capacity, expanding datasets, and ensuring responsible governance.
- California Enacts 18 New AI Laws Addressing Key Issues: California Governor Gavin Newsom signed 18 AI-related bills. Key measures include requiring AI providers to disclose data sources, extending privacy laws to generative AI, criminalizing nonconsensual AI-generated sexual imagery, and mandating AI literacy in schools.
Other generative AI models
- Molmo: Accessible AI for All: AI2's Molmo, a multimodal AI model family, rivals models from major tech firms while remaining efficient and openly accessible. Trained on a smaller, curated dataset of ~1 million images, it excels in visual interpretation with fewer errors and faster training.
- Tailored AI for Insurance Efficiency: EXL has launched EXL Insurance LLM, an industry-specific language model. The model, leveraging NVIDIA's AI Enterprise platform, improves accuracy by 30%. It offers structured data ingestion, contextual classification, and real-time insights.
- AI-Powered Insights for Climate Science: IBM and NASA have launched an open-source AI model for weather and climate applications. Pre-trained on 40 years of NASA data, it can produce localized forecasts at 12x the resolution and supports applications ranging from high-resolution forecasting to improving global models, with the aim of enhancing predictive accuracy and environmental data analysis.
Notable research
- Small Language Models Survey: This comprehensive survey covers SLM architectures, training datasets, and training algorithms. It analyzes 59 open-source SLMs on capabilities such as reasoning, in-context learning, math, and coding, and also examines on-device runtime costs, including latency and memory footprint.
- Logic-of-Thought: Enhancing LLM Reasoning: Logic-of-Thought (LoT) is introduced as a prompting technique that improves logical reasoning by incorporating logical propositions into model inputs. LoT enhances model reasoning capabilities, outperforming other prompting techniques across multiple reasoning benchmarks.
- LLMs Still Can’t Plan: This study finds that a domain-independent planner can solve all instances of Mystery Blocksworld but LLMs struggle, even on small instances. OpenAI’s o1-preview shows progress on more challenging planning problems, but degrades in performance as the plan length increases, showing that the accuracy gains cannot be considered general or robust.
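The Logic-of-Thought item above describes adding logical propositions to the model input. The sketch below is a loose illustration of that general idea, not the paper's implementation: it assumes the if-then statements have already been extracted by hand, derives new ones using transitivity only, and appends them to the prompt. The function names and the example statements are hypothetical.

```python
def extend(implications):
    """Close a set of (premise, conclusion) pairs under transitivity."""
    derived = set(implications)
    changed = True
    while changed:
        changed = False
        for a, b in list(derived):
            for c, d in list(derived):
                # From a -> b and b -> d, derive a -> d.
                if b == c and a != d and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived


def to_sentences(implications):
    """Translate (premise, conclusion) pairs back into plain sentences."""
    return sorted(f"If {p}, then {q}." for p, q in implications)


def augment_prompt(context, question, implications):
    """Append only the newly derived statements to the original context."""
    new_facts = extend(implications) - set(implications)
    lines = [
        context,
        "Derived logical statements:",
        *to_sentences(new_facts),
        f"Question: {question}",
    ]
    return "\n".join(lines)


# Hand-written propositions (assumption: a real pipeline would extract these).
implications = [
    ("it rains", "the ground is wet"),
    ("the ground is wet", "the path is slippery"),
]
context = (
    "If it rains, the ground is wet. "
    "If the ground is wet, the path is slippery."
)
prompt = augment_prompt(context, "If it rains, is the path slippery?", implications)
print(prompt)
```

Making the derived statement explicit in the prompt spares the model from having to chain the two premises itself, which is the kind of step where prompting-only approaches tend to slip.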