The AI machines that we've become accustomed to via large language models from places like OpenAI and Anthropic require an incredible amount of data to train them. For the past decade or so, this data has been easily accessible - for free - on the open web. Not so much anymore, as content creators and publishers have become wise to companies whose raison d'être is to eventually replace the humans making (or, at the very least, squeeze the human creativity out of) the very content they're using to train their models. This has led to the rise of synthetic data, but an overreliance on this method could lead to model collapse - just ask the Spanish Habsburgs. We round out the week asking what the MTA actually did at the Grand Central 4/5/6 entrance all summer and find out where all of George Strait's exes live. This week's newsletter is live! If you like the (100% human-created) writing I put out, think about subscribing to get this in your inbox every Tuesday!
Matthew Kane’s Post
More Relevant Posts
-
Bites from last week's AI news

1/ Microsoft announced that its Edge browser will offer live translation and subtitles for YouTube, Coursera, LinkedIn and other popular platforms 🔥 Whether or not you use Edge, this is a feature that will probably be replicated across browsers and products very soon, making a lot of content more accessible. https://lnkd.in/gitCT7iU

2/ Google introduced AI Overviews to search a few weeks ago, marking the biggest change in search in years. The feature does not come without its flaws, as users quickly pointed out. This demonstrates the difficult balance an incumbent has to strike compared to disruptors such as OpenAI and Perplexity. https://lnkd.in/gE4J42hi

3/ Scale, the data platform behind most LLM providers, released a leaderboard that promises to be impartial and uncontaminated ✨ They ensure the latter by keeping the evaluation data private. Unsurprisingly, GPT-4 comes first in every category except math, where it's a close second. Gemini 1.5 Pro and Claude Opus are close behind. Interestingly, Llama 3 70B and Mistral Large perform very similarly on instruction following, which is the most common use case for everyday users. https://lnkd.in/gUsUqSyX

4/ A really insightful blog series from AI experts on the ground shares many similarities with our own experience and previous posts. Some highlights: start with prompting and use it well (few-shot examples, proper chain of thought, and breaking big prompts into smaller deterministic agentic workflows), reach for hybrid RAG before fine-tuning, and evaluate with binary or pairwise tasks, which are easier and more cost-effective for annotators. https://lnkd.in/gn4GQJ3i

5/ Sony announced that it is looking into AI to cut costs in movie production 😮 This is difficult terrain to navigate given the many ethical questions about how the technology can be used responsibly in that area. Used rightly, AI can boost the productivity of many creatives. Used wrongly, it may destroy the very thing it tries to improve. https://lnkd.in/djTz5PdA
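The few-shot prompting technique recommended in item 4 can be sketched in a few lines. This is a minimal illustration, not the blog series' actual code; the classification task, example reviews, and labels are all invented for the sketch.

```python
# Minimal sketch of few-shot prompt assembly: prepend a handful of
# labeled examples so the model can infer the task and output format.
# The task and examples below are hypothetical.

FEW_SHOT_EXAMPLES = [
    ("The package arrived broken.", "negative"),
    ("Setup took two minutes, flawless.", "positive"),
]

def build_few_shot_prompt(query: str) -> str:
    """Build a prompt that shows the model worked examples before the query."""
    lines = ["Classify the sentiment of each review."]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # End with the unanswered query so the model completes the pattern.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt("Battery died after a week.")
print(prompt)
```

The same idea extends to the "break big prompts into smaller steps" advice: each small, deterministic step gets its own narrowly-scoped prompt like this one.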
-
GPT-4 - The King is dead. Or is it? Days ago, Claude-3-Opus officially took the number 1 spot on the Chatbot Arena leaderboard. If you don't know, there are many dozens of LLMs out there, each claiming to beat the others on scientific benchmarks. But what about usefulness to users? Well, the Large Model Systems Organization (LMSYS ORG) set up a voting system for ordinary humans to rate each chatbot's response to the same prompts. And after 500,000 ratings, Claude-3-Opus came out on top, by 3 Elo points. That is super close. But based on recent user testing, it's clear that Claude-3 is better than GPT-4 at:
↳ Following instructions more closely
↳ Using fewer generic AI keywords like "dive in" or "unleash"
↳ Offering a larger context length (up to 1 million tokens ~ 750,000 words)
↳ More up-to-date knowledge (cut-off of 08/23 compared to GPT-4's 04/23)
What do you think? Will GPT-5 bring the glory back to OpenAI? I'm betting that it will :D P/s: You can check out the full ranking below
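To see just how close a 3-point Elo gap is, you can plug it into the standard Elo expected-score formula. The ratings below are illustrative round numbers, not the actual Arena scores.

```python
# Expected win probability under the Elo model. A 3-point lead (as in
# the Arena result above) is contrasted with a 100-point lead to show
# how close 3 points really is. Ratings here are hypothetical.

def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

close = elo_expected_score(1253, 1250)  # 3-point gap
wide = elo_expected_score(1350, 1250)   # 100-point gap

print(f"3-point gap:   A wins {close:.1%} of the time")
print(f"100-point gap: A wins {wide:.1%} of the time")
```

A 3-point gap works out to roughly a 50.4% expected win rate - barely better than a coin flip, which is why the post calls it "super close".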
-
Google Gemini 1.5 Pro Leaps Ahead In AI Race, Challenging GPT-4o: An anonymous reader quotes a report from VentureBeat: Google launched its latest artificial intelligence powerhouse, Gemini 1.5 Pro, today, making the experimental "version 0801" available for early testing and feedback through Google AI Studio and the Gemini API. This release marks a major leap forward in the company's AI capabilities and has already sent shockwaves through the tech community. The new model has quickly claimed the top spot on the prestigious LMSYS Chatbot Arena leaderboard (built with Gradio), boasting an impressive ELO score of 1300. This achievement puts Gemini 1.5 Pro ahead of formidable competitors like OpenAI's GPT-4o (ELO: 1286) and Anthropic's Claude-3.5 Sonnet (ELO: 1271), potentially signaling a shift in the AI landscape. Simon Tokumine, a key figure in the Gemini team, celebrated the release in a post on X.com, describing it as "the strongest, most intelligent Gemini we've ever made." Early user feedback supports this claim, with one Redditor calling the model "insanely good" and expressing hope that its capabilities won't be scaled back. "A standout feature of the 1.5 series is its expansive context window of up to two million tokens, far surpassing many competing models," adds VentureBeat. "This allows Gemini 1.5 Pro to process and reason about vast amounts of information, including lengthy documents, extensive code bases, and extended audio or video content." Read more of this story at Slashdot.
-
We're excited to announce that Epsilla (YC S23) now supports GPT-4o-mini, the cheaper, faster, and better LLM from OpenAI that's set to make GPT-3.5-turbo obsolete! 🚀🤖 But how does it perform against GPT-4o? Here, we use GPT-4o-mini and GPT-4o to create two financial analysts and let them compete side by side in analyzing Meta's 10-K report. The report contains many tables and charts, and we leveraged a secret sauce technology to extract the information (to be announced tomorrow, stay tuned!). The question involves a deep understanding of the numbers and math calculations. Do you think GPT-4o-mini does as good a job as GPT-4o? Watch the video and join the conversation! For more in-depth insights, check out the detailed comparisons here:
- GPT-4o-mini: https://lnkd.in/eBf7Z6_U
- GPT-4o: https://lnkd.in/eZ8uFwY7
PS: A less-mentioned advancement is that GPT-4o-mini supports 16k output tokens, meaning it can generate 4 times more content with each completion request than previous GPT-4 models (and almost all other SOTA LLMs). In my honest opinion, this is much bigger than the so-called million-token long-context-window advances, which focus on increasing input token length. Think about it: now you can let the LLM do more things with fewer completion requests, without needing to repeatedly provide the same context. That means 4 times fewer tokens passed, on top of GPT-4o-mini's per-token cost reduction. I'm really excited to see the huge potential of RAG combined with this more balanced input-output token limit. #Epsilla #RAG #GPT4omini #GPT4o #AI #ML #LLM
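The "4 times fewer tokens passed" claim in the PS is easy to verify with back-of-the-envelope arithmetic: when a long generation must be split across requests, the shared context is re-sent on every request. The context and output sizes below are hypothetical.

```python
# Back-of-the-envelope sketch: with a 4x larger output cap, a long
# generation needs 4x fewer requests, so the shared context is re-sent
# 4x less often. All numbers here are illustrative.
import math

def total_tokens(context_tokens: int, output_needed: int, output_cap: int) -> int:
    """Total tokens passed when a generation is split across requests."""
    requests = math.ceil(output_needed / output_cap)
    # The full context is re-sent with every request.
    return requests * context_tokens + output_needed

context = 50_000        # a large RAG context (hypothetical)
output_needed = 16_000  # total output tokens we want

old = total_tokens(context, output_needed, 4_000)   # 4k output cap
new = total_tokens(context, output_needed, 16_000)  # 16k output cap

print(f"4k cap:  {old:,} tokens passed (4 requests)")
print(f"16k cap: {new:,} tokens passed (1 request)")
```

Under these assumed numbers, the larger output cap cuts total tokens passed from 216,000 to 66,000 before any per-token price difference is counted.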
-
AI is Eating the Web My friend Tom Wheeler, former chair of the FCC, and an incredibly smart human, wrote this fascinating article on how #GenAI is about to change the very web that it was built upon. A great read and worth your time! The large language models that power generative AI tools were built using data scraped from countless websites, but they now seek to eliminate the need for users to go to those same sites. Already, a quarter of all web pages developed between 2013 and 2023 no longer exist, and traffic from search engines to the web is predicted to fall by 25% in the next two years. While some publishers are suing AI developers for using their data to train AI tools, many are entering into partnerships with companies like OpenAI, who promise monetary compensation and the promotion of the publisher’s websites within AI-generated content.
Connecting the dots: AI is eating the web that enabled it | Brookings
https://www.brookings.edu
-
AI can see the future. At least, it can see it a lot better than humans can. Studies using 'crowd' LLMs have shown significantly greater forecasting ability than the typical human 'crowd' used in forecasting. Utilising human crowds has some serious limitations:
➡️ biases
➡️ scalability
➡️ cost and time
Researchers created their own LLM 'crowd' using models from companies like OpenAI, Google, Anthropic, and Meta to replicate a human crowd. With standardised prompting, their accuracy was indistinguishable from human predictions, and in some tests they even outperformed them. However... I'm a bit confused as to why these LLMs are performing better than humans. But perhaps it shouldn't be that surprising. I mean, LLMs are the ultimate "crowd source" by definition. They're the aggregate of millions of written artefacts, and the absolute average of thought in some sense. And aggregating across different language models only pushes that further. Food for thought, anyway. It's really exciting to see such novel applications of LLMs coming out. The next time someone's doing the "count the jelly beans in the jar" crowd exercise - let's maybe use LLMs? Super keen to see what other uses these AI crowds have in a more practical sense.
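The aggregation step behind the "LLM crowd" idea is simple: collect each model's probability forecast for a question and pool them, for instance with a median, the same trick used with human crowds. The model names and numbers below are made up for illustration; the cited study's exact aggregation method may differ.

```python
# Sketch of crowd aggregation over per-model probability forecasts.
# The median is robust to a single wildly off model, which is part of
# why crowds beat individuals. All forecasts below are hypothetical.
from statistics import median

def aggregate_forecasts(forecasts: dict[str, float]) -> float:
    """Median of per-model probability forecasts for one question."""
    return median(forecasts.values())

forecasts = {
    "model_a": 0.62,
    "model_b": 0.55,
    "model_c": 0.71,
    "model_d": 0.58,
}

crowd = aggregate_forecasts(forecasts)
print(f"Crowd forecast: {crowd:.2f}")
```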
-
In AI, I think the growth area will be Big/General vs Little/Specific. As I've said before, for SMEs, small process-specific LMs combined with a low-code app and API integration can create powerful tools very quickly - and you can slot in new LLMs or SLMs as things grow. This is a good article to get an overview of the AI tech. Or DM me and we can go through an example (insurance industry)
2023 was about LLMs. 2024 will be about SLMs. Here's what you need to know about Small Language Models:
SLMs are what they sound like: smaller versions of LLMs. Nagesh Mashette 🇮🇳 wrote a great article on this (link here: https://lnkd.in/e-rZYXNt). Here are my notes from his article.
Problem: Only large enterprises can afford to work with LLMs. The only way forward is down-market.
Solution: SLMs
- They're efficient: less computational cost than LLMs
- They're accessible: lower cost means broader application
- They're customizable: fine-tune them for specific use cases and industries
How SLMs work:
- They distill the complexity of pre-trained LLMs through pruning and quantization
- They're becoming increasingly efficient and effective
Major benefits:
- Far fewer parameters
- Run on mobile device processors
- Process data locally - important for privacy
- A good fit for simple tasks
- Fewer resources needed
- Train in a week
Examples of SLMs:
- DistilBERT: https://lnkd.in/edVu44rf
- Microsoft's Orca 2: https://lnkd.in/eKmVi2sr
- Microsoft's Phi 2: https://lnkd.in/egqH2eZ4
- Google's BERT Mini, Small, Medium, and Tiny: https://lnkd.in/eesvC8rm
- EleutherAI's GPT-Neo and GPT-J: https://lnkd.in/eJVBrDdm
- MobileBERT: https://lnkd.in/eEHgqeDR
- Google's T5-Small: https://lnkd.in/eDCzg7ws
♻️ If you found this valuable, please like and share
👉 Follow me at Albert Chun as I learn in public
#artificialintelligence #aiadoption #ai #ml #llm #slm #breakintoai
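Quantization, one of the shrinking techniques mentioned above, can be illustrated with a toy example: map float weights onto 8-bit integer levels with a shared scale factor. Real SLM quantization works on full weight tensors with per-channel scales and more careful rounding; this sketch just shows the core idea on a tiny list of made-up weights.

```python
# Toy sketch of 8-bit quantization: store weights as small integers
# plus one shared float scale, then reconstruct approximations.
# The weights below are invented for illustration.

def quantize(weights: list[float]) -> tuple[list[int], float]:
    """Map float weights onto int8 levels [-127, 127] with a shared scale."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the integer levels."""
    return [v * scale for v in q]

weights = [0.12, -0.50, 0.33, 0.07]
q, scale = quantize(weights)
restored = dequantize(q, scale)

# Each restored weight is close to the original, at a quarter of the
# storage of 32-bit floats.
print(q)
print([round(w, 3) for w in restored])
```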
-
𝐆𝐏𝐓-5 𝐢𝐬 𝐋𝐚𝐮𝐧𝐜𝐡𝐢𝐧𝐠 𝐒𝐨𝐨𝐧! 🍓 Big news for AI enthusiasts: GPT-5, the next-gen model from OpenAI, is set to be released in the coming month. Sam Altman dropped a hint by tweeting about "Project Strawberry" alongside an image of five strawberries, indicating the upcoming launch of GPT-5. 𝐊𝐞𝐲 𝐅𝐞𝐚𝐭𝐮𝐫𝐞𝐬 𝐨𝐟 𝐆𝐏𝐓-5: 🧠 𝐀𝐝𝐯𝐚𝐧𝐜𝐞𝐝 𝐏𝐫𝐨𝐛𝐥𝐞𝐦 𝐒𝐨𝐥𝐯𝐢𝐧𝐠: GPT-5 is engineered to tackle problems in a human-like manner, enhancing its ability to understand complex situations and assist with strategic planning and intricate problem-solving. ✍️ 𝐌𝐮𝐥𝐭𝐢𝐦𝐨𝐝𝐚𝐥 𝐀𝐛𝐢𝐥𝐢𝐭𝐢𝐞𝐬: Capable of processing and generating text, images, and audio, GPT-5 can help with writing, creating visual content, and even composing music. ⚡ 𝐈𝐦𝐩𝐫𝐨𝐯𝐞𝐝 𝐒𝐩𝐞𝐞𝐝 𝐚𝐧𝐝 𝐏𝐞𝐫𝐟𝐨𝐫𝐦𝐚𝐧𝐜𝐞: Offering faster responses and the ability to handle more complex tasks with ease, GPT-5 boosts efficiency and productivity. 🤖 𝐀𝐝𝐚𝐩𝐭𝐢𝐯𝐞 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠: GPT-5 learns from user interactions, adapting to individual preferences and functioning like a personal assistant that knows your needs intimately. A standout aspect of this release is Project Strawberry, aimed at empowering large language models to conduct "deep research," a sophisticated capability that was previously out of reach. What are you most excited about with GPT-5?
-
While OpenAI is making headlines with its new o1 models, Google has taken the first step toward addressing one of GenAI's most significant challenges. One of the biggest problems with Large Language Models (LLMs) today is their tendency to confidently provide inaccurate information, known as "hallucination." Google has just launched DataGemma, a breakthrough solution that addresses this issue by grounding LLMs in real-world data. Built on Google's Data Commons, a massive repository of over 240 billion trustworthy data points from sources like the UN, WHO, and CDC, DataGemma models are designed to improve the factual accuracy of AI responses. Here's how:
1. RIG (Retrieval-Interleaved Generation) enables AI models to query trusted data in real time.
2. RAG (Retrieval-Augmented Generation) allows models to access relevant context before generating outputs, ensuring deeper reasoning and fewer errors.
Preliminary results show significant improvements in handling numerical facts. By anchoring responses in reliable statistics, these models offer more accurate and trustworthy insights across sectors like research, policymaking, and more. #AI #DataScience #MachineLearning #LLMs #Innovation #GoogleAI #DataDriven #TechNews #ArtificialIntelligence
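The retrieval-augmented pattern described above can be sketched in miniature: look up trusted statistics first, then hand them to the model as context. The tiny in-memory "facts table" and keyword-overlap retrieval below are stand-ins for Data Commons and a real retriever; the statistics shown are illustrative placeholders, not quoted from DataGemma.

```python
# Minimal sketch of retrieval-augmented generation (RAG): retrieve
# stored facts relevant to the query, then build a prompt that asks
# the model to answer from those facts rather than from memory.
# The facts table and retrieval heuristic are stand-ins.

STATS_STORE = {
    "world population 2023": "8.0 billion (UN estimate)",
    "global life expectancy 2021": "71 years (WHO estimate)",
    "us adult obesity rate 2020": "41.9% (CDC estimate)",
}

def retrieve(query: str) -> list[str]:
    """Return stored facts whose keys share a word with the query."""
    q_words = set(query.lower().split())
    return [
        f"{key}: {value}"
        for key, value in STATS_STORE.items()
        if q_words & set(key.split())
    ]

def build_grounded_prompt(query: str) -> str:
    """Prepend retrieved facts so the model answers from data, not memory."""
    facts = "\n".join(retrieve(query)) or "(no matching facts)"
    return f"Facts:\n{facts}\n\nQuestion: {query}\nAnswer using only the facts."

print(build_grounded_prompt("What was the world population in 2023?"))
```

RIG goes one step further than this sketch: instead of retrieving once up front, the model interleaves data queries into its generation as it writes.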