🆕 OpenAI yesterday introduced GPT-4o mini, a new, affordable small model that is both significantly smarter and much cheaper than GPT-3.5 Turbo. It is now available in the API. 🚀
🔍 A quick overview:
🧠 Intelligence: GPT-4o mini outperforms GPT-3.5 Turbo in textual intelligence, scoring 82% on MMLU compared to ~70%, and excels in multimodal reasoning.
💲 Price: GPT-4o mini is over 60% cheaper than GPT-3.5 Turbo, priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens (roughly the equivalent of 2,500 pages in a standard book).
🔄 Modalities: GPT-4o mini currently supports text and vision, with support for audio and video inputs and outputs planned for the future.
🌐 Languages: GPT-4o mini has improved multilingual understanding over GPT-3.5 Turbo across a wide range of non-English languages.
⚡ Performance: GPT-4o mini is ideal for high-volume tasks, cost-sensitive tasks, and tasks requiring fast responses. It has a knowledge cut-off date of October 2023. 🌟
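The pricing quoted above is easy to sanity-check with a few lines of arithmetic. A minimal sketch (the per-token prices are the ones from the post; the example request sizes are made up):

```python
# Back-of-the-envelope cost check for the GPT-4o mini prices quoted above
# ($0.15 per 1M input tokens, $0.60 per 1M output tokens).

INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single API request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 3,000-token prompt with a 500-token answer.
cost = request_cost(3_000, 500)
print(f"${cost:.6f}")  # $0.000750 — well under a tenth of a cent
```

At these rates, even a million such requests would cost on the order of $750, which is the point of the "high-volume tasks" positioning.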
Waiss H.’s Post
-
Llama 3.3 is here - and it finally puts the nail in GPT-4o's coffin. OK, that might be too dramatic, but it beats GPT-4o in every regard. A great model, free, and 'only' 70B parameters, so I expect it to run on consumer-grade hardware with some quantization. https://lnkd.in/dxk8Sfru
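A rough feel for why "70B plus quantization" lands near consumer hardware: at 4 bits per weight the parameters alone take about 35 GB, versus roughly 140 GB at FP16. A back-of-the-envelope sketch (the 20% overhead factor for KV cache and activations is a crude assumption, not a measured number):

```python
def model_memory_gb(params_b: float, bits_per_weight: float,
                    overhead: float = 1.2) -> float:
    """Rough VRAM estimate: parameter count x bits per weight, plus ~20%
    overhead for KV cache and activations (a crude assumption)."""
    bytes_total = params_b * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9

for bits in (16, 8, 4):
    print(f"70B @ {bits}-bit: ~{model_memory_gb(70, bits):.0f} GB")
```

So 4-bit lands around ~42 GB with overhead: still two large consumer GPUs or a big unified-memory machine, but no longer datacenter-only.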
-
Great example of a growing trend! We’re seeing more specialized models like NuExtract emerge, designed to tackle very specific tasks with impressive efficiency. The shift from general-purpose models to smaller, focused ones is only gaining momentum—and it’s easy to see why. Tailoring models to specific needs can lead to better performance and greater flexibility. This is the direction we’ll continue to see in the future!
What is the best model to extract data from documents? Not GPT. 🛑
NuMind (YC S22) has launched a Small Language Model based on Phi-3.5B-instruct. NuExtract v1.5 is specifically fine-tuned to extract data as JSON from unstructured documents. To this end, the original Phi-3.5B-instruct by Microsoft was trained on a proprietary collection of documents: 50% in English and 50% in other languages, mainly French, German, Spanish, Italian, and Portuguese.
Key benefits of NuExtract v1.5:
✅ It’s 400 times smaller than GPT-4o.
✅ It outperforms GPT-4o on English documents by 1%.
✅ It’s only 3% less accurate on non-English texts.
✅ It can process documents of any size.
Best of all, NuExtract is released under the MIT license, allowing you to use it for free and even self-host it!
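Template-constrained extraction of the kind NuExtract performs can be sketched generically: build a prompt around a JSON template, then validate the model's output against that template. This is an illustrative shape only; the actual prompt format NuExtract v1.5 expects may differ, and the model call is stubbed out with a made-up response:

```python
import json

def build_prompt(template: dict, text: str) -> str:
    """Compose a template-guided extraction prompt. The exact prompt format
    NuExtract expects may differ; this shows the general shape of
    template-constrained extraction."""
    return ("### Template:\n" + json.dumps(template, indent=2)
            + "\n### Text:\n" + text + "\n### Output:\n")

def parse_output(raw: str, template: dict) -> dict:
    """Parse the model's JSON and keep only keys declared in the template,
    so a chatty model can't smuggle in extra fields."""
    data = json.loads(raw)
    return {k: data.get(k, "") for k in template}

# Stub standing in for a call to the model (hypothetical output).
template = {"name": "", "amount": "", "currency": ""}
raw = '{"name": "ACME Invoice", "amount": "1200", "currency": "EUR", "note": "x"}'
print(parse_output(raw, template))
# {'name': 'ACME Invoice', 'amount': '1200', 'currency': 'EUR'}
```

The template doubles as both the instruction to the model and the validation schema for its output, which is what makes this approach robust for pipelines.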
-
RouteLLM: An Open-Source Framework for Cost-Effective Language Model Routing 🚀
RouteLLM is an open-source framework that intelligently routes queries to the most suitable language model (LLM) based on the query's characteristics and each model's capabilities. The goal? To minimize costs while maintaining high response quality! 💰✨
✅ RouteLLM uses preference data for training routers, comparing response quality between models on each prompt.
✅ Four routers were trained using public data from Chatbot Arena and data augmentation techniques.
✅ Evaluated on the MT Bench, MMLU, and GSM8K benchmarks, RouteLLM achieved significant cost savings:
- 75% cheaper than random routing on MT Bench while maintaining 95% of GPT-4 performance 📈
- 54% GPT-4 usage on MMLU to reach 95% of GPT-4 performance with data augmentation 🌿
✅ RouteLLM performed comparably to commercial offerings while being over 40% cheaper! 💸
✅ The routers generalized well to new model pairs without retraining. 🌟
🎉 The researchers have open-sourced the framework, routers, and datasets for public use. This is a game-changer for anyone looking to deploy LLMs cost-effectively in real-world applications! 🌐
🚀 Excited to see what the community builds with RouteLLM!
#RouteLLM #LanguageModels #CostOptimization #OpenSource #AI
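The core routing idea can be sketched in a few lines: a learned scorer estimates how much a prompt benefits from the stronger model, and a threshold trades cost against quality. This is a toy illustration, not the actual RouteLLM API; the keyword scorer below is a stand-in for the routers RouteLLM trains on preference data, and the model names are just examples:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Route:
    strong_model: str
    weak_model: str
    threshold: float  # route to the strong model at or above this score

def make_router(score: Callable[[str], float],
                route: Route) -> Callable[[str], str]:
    """Return a function mapping a prompt to a model name. Raising the
    threshold sends more traffic to the cheap model (saving cost);
    lowering it preserves more quality."""
    def pick(prompt: str) -> str:
        return (route.strong_model if score(prompt) >= route.threshold
                else route.weak_model)
    return pick

# Toy scorer: proof-style prompts are assumed to need the strong model.
toy_score = lambda p: 0.9 if any(t in p.lower() for t in ("prove", "derive")) else 0.2
router = make_router(toy_score, Route("gpt-4", "mixtral-8x7b", 0.5))
print(router("Prove that sqrt(2) is irrational"))  # gpt-4
print(router("What's the capital of France?"))     # mixtral-8x7b
```

The interesting engineering in RouteLLM is entirely inside the scorer; the cost/quality trade-off then reduces to sweeping a single threshold.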
-
Microsoft has introduced phi-3-mini, a 3.8 billion parameter language model that achieves remarkable performance on both academic benchmarks and internal testing. Despite being small enough to be deployed on a phone, phi-3-mini rivals the performance of larger models such as Mixtral 8x7B and GPT-3.5, achieving 69% on MMLU and 8.38 on MT-bench. What sets phi-3-mini apart is its dataset, a scaled-up version of the one used for phi-2, composed of heavily filtered web data and synthetic data. Check out the details of this innovative language model in the link below. https://lnkd.in/g9JFJSAi
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
arxiv.org
-
GPT-4o mini supports text and vision in the API, with support for text, image, video and audio inputs and outputs coming in the future. The model has a context window of 128K tokens, supports up to 16K output tokens per request, and has knowledge up to October 2023. Thanks to the improved tokenizer shared with GPT-4o, handling non-English text is now even more cost effective. GPT-4o mini scores 82% on MMLU and currently outperforms GPT-4 on chat preferences on the LMSYS leaderboard. It is priced at 15 cents per million input tokens and 60 cents per million output tokens, an order of magnitude more affordable than previous frontier models and more than 60% cheaper than GPT-3.5 Turbo.
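The limits quoted above can be checked mechanically before sending a request. A minimal sketch, assuming (as is typical for OpenAI models) that input tokens and the reserved output budget share the 128K context window:

```python
CONTEXT_WINDOW = 128_000    # tokens, as described for GPT-4o mini
MAX_OUTPUT_TOKENS = 16_000  # per-request output cap

def fits(input_tokens: int, max_output: int) -> bool:
    """Check a request against the published limits: the output cap is a
    separate constraint, and input plus reserved output must fit in the
    context window."""
    if max_output > MAX_OUTPUT_TOKENS:
        return False
    return input_tokens + max_output <= CONTEXT_WINDOW

print(fits(100_000, 16_000))  # True:  116K fits in the 128K window
print(fits(120_000, 16_000))  # False: 136K exceeds the 128K window
```

In practice the input token count would come from a tokenizer rather than being passed in by hand; the check itself stays the same.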
OpenAI's new GPT-4o mini offers higher performance at 60% lower price
the-decoder.com
-
This looks very promising. It does well at data extraction, and I like the idea of a self-hosted SLM that's about 400 times smaller than ChatGPT.
What is the best model to extract data from documents? Not GPT. 🛑
NuMind has launched a Small Language Model based on Phi-3.5B-instruct. NuExtract v1.5 is specifically fine-tuned to extract data as JSON from unstructured documents. To this end, the original Phi-3.5B-instruct by Microsoft was trained on a proprietary collection of documents: 50% in English and 50% in other languages, mainly French, German, Spanish, Italian, and Portuguese.
Key benefits of NuExtract v1.5:
✅ It’s 400 times smaller than GPT-4o.
✅ It outperforms GPT-4o on English documents by 1%.
✅ It’s only 3% less accurate on non-English texts.
✅ It can process documents of any size.
Best of all, NuExtract is released under the MIT license, allowing you to use it for free and even self-host it!
-
OpenAI launched GPT-4o mini, and it outperforms GPT-3.5. It would be great if I could download it and host it locally 😊:
Intelligence: GPT-4o mini outperforms GPT-3.5 Turbo in textual intelligence (scoring 82% on MMLU compared to 69.8%) and multimodal reasoning.
Price: GPT-4o mini is more than 60% cheaper than GPT-3.5 Turbo, priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens (roughly the equivalent of 2,500 pages in a standard book).
Modalities: GPT-4o mini currently supports text and vision capabilities, and we plan to add support for audio and video inputs and outputs in the future.
Languages: GPT-4o mini has improved multilingual understanding over GPT-3.5 Turbo across a wide range of non-English languages.
With its low cost and latency, GPT-4o mini works well for high-volume tasks (e.g., passing a full code base or conversation history to the model), cost-sensitive tasks (e.g., summarizing large documents), and tasks that require fast responses (e.g., customer support chatbots). Like GPT-4o, GPT-4o mini has a 128k context window, supports up to 16k output tokens per request, and has a knowledge cut-off date of October 2023. We plan to launch fine-tuning for GPT-4o mini in the coming days.
-
Llama 3.1 405B matches or beats OpenAI's GPT-4o across many text benchmarks!
What's new and improved in 3.1:
- 8B, 70B & 405B versions as Instruct and Base with 128k context
- Multilingual: supports 8 languages, including English, German, French, and more
- Trained on >15T tokens & fine-tuned on 25M human and synthetic samples
- Commercially friendly license that allows using model outputs to improve other LLMs
- Quantized versions in FP8, AWQ, and GPTQ for efficient inference
- Llama 3.1 405B matches and beats GPT-4o on many benchmarks
- 8B & 70B improved coding and instruction following by up to 12%
- Supports tool use and function calling
Blog: https://lnkd.in/g9yTBFnv
Model Collection: https://lnkd.in/g_bVRpmp
Llama 3.1 - 405B, 70B & 8B with multilinguality and long context
huggingface.co
-
#LargeLanguageModels🔥
Large Language Models (LLMs) like GPT are powerful tools for understanding and generating language, but their true potential emerges when combined with other capabilities. Here's how the journey unfolds:
1. LLMs: At their core, LLMs generate and understand language. They're great for answering questions, generating text, and performing language-based tasks. But they operate in isolation, responding only to the input given.
2. LLMs + Planning: Adding planning enables these models to think a few steps ahead. Instead of providing isolated answers, they can outline a multi-step solution or devise strategies to achieve a specific goal.
3. LLMs + Planning + Memory: Memory takes things further. Now the model can retain context across sessions or long conversations. For example, imagine troubleshooting a technical issue with multiple steps: the model won't forget what's already been discussed.
4. AI Agents (LLM + Planning + Memory + Tools): This is where things get truly powerful. AI agents combine the above capabilities with access to tools like APIs, databases, or the web. They don't just answer questions, they execute tasks. For instance, they can research, make decisions, write reports, or even automate workflows.
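The progression above culminates in a simple loop: the LLM decides each step, tools execute, and the growing transcript serves as memory. A minimal sketch with a stubbed LLM and a hypothetical `lookup` tool; real agents use structured function calls rather than this toy string protocol:

```python
from typing import Callable

def run_agent(llm: Callable[[str], str],
              tools: dict[str, Callable[[str], str]],
              task: str, max_steps: int = 5) -> str:
    """The llm returns either 'TOOL <name> <arg>' or 'FINAL <answer>'.
    Memory is simply the transcript fed back on every step."""
    memory = [f"Task: {task}"]
    for _ in range(max_steps):
        decision = llm("\n".join(memory))
        if decision.startswith("FINAL "):
            return decision[len("FINAL "):]
        _, name, arg = decision.split(" ", 2)
        result = tools[name](arg)           # tool use
        memory.append(f"{name}({arg}) -> {result}")  # memory update
    return "gave up"

# Stub LLM: plans one lookup, then answers from the tool result in memory.
def stub_llm(transcript: str) -> str:
    if "lookup(" in transcript:
        return "FINAL Paris"
    return "TOOL lookup capital of France"

print(run_agent(stub_llm, {"lookup": lambda q: "Paris"}, "capital of France?"))
# Paris
```

Everything distinguishing agents from plain chat lives in that loop: the decide/act cycle (planning), the transcript (memory), and the tool table (actions).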
-
🔍 Exploring the Role of GPT-4 via LangChain and DSPy in Enhancing Offline Relevance Testing
When it comes to searching, we all crave the instant gratification of finding what we need in a blink. But have you ever wondered how search engines ensure they’re delivering the most relevant results?
Search engines use two main methods for evaluation: online and offline. While online methods like A/B testing provide direct user feedback, they can be resource-intensive and risky, especially for early-stage models. This is where offline evaluation shines. It uses pre-defined test collections and human or AI judges to assess search quality without relying on user behaviour.
🔍 Why Offline Evaluation?
Here’s why offline evaluation is crucial for search engine optimization:
Repeatability: Standardized and repeatable for consistent comparisons.
Core Functionality Focus: Isolates core capabilities without external influences.
Targeted Analysis: Allows in-depth analysis of specific aspects.
Cost-Effective: More resource-efficient than online A/B testing.
🔍 How Offline Search Evaluation Works
Selection of Test Collections: Consisting of a document corpus, a query set, and relevance judgments.
Running the Search: The algorithm processes each query and ranks the documents.
Comparing Rankings: Predicted rankings are compared to relevance judgments using metrics like precision, recall, DCG, and NDCG.
Analyzing Results: Evaluators delve into ranking inconsistencies, query performance, and bias detection.
🔍 Using GPT-4 via LangChain and DSPy for Relevance Testing
Traditionally reliant on human evaluators, offline search evaluation now benefits from advancements in large language models (LLMs) like GPT-4. Tools like LangChain and DSPy automate relevance testing, providing efficient and scalable AI-driven assessments. Offline search evaluation remains an indispensable tool for refining search engines, ensuring they deliver the most relevant results.
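Of the ranking metrics mentioned, NDCG is the least obvious: it discounts each document's relevance by how far down the ranking it appears, then normalizes against the ideal ordering so scores are comparable across queries. A minimal sketch with made-up relevance judgments:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: relevance at rank i (0-based) is
    divided by log2(i + 2), so lower positions contribute less."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_rels, all_rels):
    """Normalize by the DCG of the ideal (relevance-sorted) ranking,
    so a perfect ordering scores 1.0."""
    ideal = dcg(sorted(all_rels, reverse=True))
    return dcg(ranked_rels) / ideal if ideal > 0 else 0.0

# Graded judgments for five docs; the engine returned them in this order:
print(round(ndcg([3, 2, 0, 1, 2], [3, 2, 2, 1, 0]), 3))  # 0.96
```

An AI judge (such as GPT-4 prompted via LangChain or DSPy) would supply the relevance grades; the metric computation on top of them is this simple.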
The integration of advanced tools like GPT-4, LangChain, and DSPy revolutionizes this process, making it more accurate and efficient. 📚 Read the full blog to explore more insights: https://lnkd.in/d2mqpqZB 🤝 Let’s Connect! 💭🗣 Share your comments and questions below. Follow me for more AI/ML insights. #SearchEngine #OfflineEvaluation #AI #GPT4 #LangChain #DSPy #MachineLearning #RelevanceTesting #TechInnovation
Shedding Light in the Dark: The Importance of Offline Search Evaluation
medium.com