Big Week in AI
No time this week to stay across all the huge AI developments?
I got you covered 😉
Runway Act-One
Act-One from Runway is flipping the script on animation and filmmaking. Imagine becoming a "puppet master" for your characters, where you can shoot a scene and let AI do the heavy lifting to fine-tune the final look and movements.
No fancy motion capture suits needed—just your camera.
You shoot the footage, upload it, and Act-One applies an AI-generated style that keeps eye-lines, micro-expressions, and even pacing in check. It bridges the gap between human performances and AI’s touch, producing results with real emotional depth.
The coolest part?
This isn’t just about single shots—it’s about creating entire complex scenes that blend your performance with AI’s capabilities.
It’s rolling out slowly, but when it hits, expect more creators to dive into this new frontier.
This could redefine the whole approach to character animation and film creation.
Claude 3.5 Sonnet (New) and 3.5 Haiku
Anthropic’s Claude 3.5 Sonnet and Haiku are raising the bar for AI-driven coding and computer interaction.
The Sonnet model boasts major upgrades, hitting 49% on SWE-bench Verified (up from 33.4% in the previous version), putting it ahead of other models like OpenAI's GPT-4o.
Worth noting: the fine print shows o1-preview was not included in the comparison!
P.S.: Come on AI gurus, can we please learn how to name things? The naming is getting silly 😆
It’s also acing agentic tasks, improving results in retail and airline scenarios, which makes it ideal for complex software projects, from design to optimisation.
On the flip side, Haiku is gearing up for release later this month. This model focuses on speed, delivering top-notch performance similar to Claude 3 Opus without sacrificing efficiency or cost.
Scoring 40.6% on SWE-bench Verified, Haiku is perfect for quick-response coding tasks and excels at processing large data volumes, making it a solid fit for industries like finance and healthcare.
What does it mean for us humble miners?
Models are getting better and better at handling more complex tasks. I wonder how far away we are from agents and models with capabilities that could handle mining workflows.
Time will tell!
Perplexity AI’s new Reasoning Mode
This update is for users tackling complex problems and making informed decisions.
Designed to enhance the AI experience, this mode is especially handy for academic research, nuanced decision-making, and other in-depth tasks.
The standout feature here is its advanced reasoning capability, allowing the AI to dig into intricate questions and offer deep, insightful analysis. It’s like having a powerful assistant that can break down abstract problems and guide users to make smarter decisions.
Plus, you can narrow down your focus—whether it’s academic research, social media trends, or video content—to get results that align with your specific needs, boosting both relevance and productivity.
But it doesn’t stop there. The Reasoning Mode plays nicely with other tools, integrating seamlessly with web search and Academic mode to pull peer-reviewed papers and real-time info from across the web.
All of this is wrapped up in a user-friendly interface that lets you switch modes and settings with ease, giving you the flexibility to fine-tune AI performance based on your task.
Computer Use Anthropic API
This one really shocked me in terms of potential impact.
Anthropic's Computer Use API is next level, freeing agents from the confines of Python and giving them direct access to a workstation.
You can ask it to do things like "Save this picture of a cat to my desktop," and it just handles it for you.
We are still a while away from "Complete the 16-week mine plan", but it is impressive nonetheless.
Powered by their AI model, Claude, the API uses screenshots and text prompts to figure out what you're asking and then executes tasks like clicking buttons, typing, or navigating web pages.
You keep interacting with Claude, sending results back and forth until the task is done. It's a seamless, ongoing conversation where Claude helps you automate things you’d normally do manually.
What’s cool is that it’s loaded with tools for basic computer operations, text editing, and even running bash commands.
The latest version can even control mouse movements with pinpoint accuracy by identifying coordinates from screenshots.
Plus, it all happens in a safe, sandboxed environment, so there’s less risk of anything going wrong on your machine.
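The screenshot-and-respond loop described above can be sketched in a few lines. To keep this runnable without an API key, `model_step` below is a hypothetical stand-in for the real call to Claude, and the action names are made up for illustration; it is a minimal sketch of the loop shape, not Anthropic's actual interface.

```python
def model_step(conversation):
    """Hypothetical stand-in for a call to the Claude API.

    Returns either a tool action to perform on the workstation,
    or a final answer once the model decides the task is done.
    """
    last = conversation[-1]
    if last == "screenshot of desktop":
        # Model "sees" the screenshot and requests an action (illustrative only).
        return {"type": "tool_use", "action": "save_file", "path": "cat.png"}
    # After seeing the action's result, the model wraps up.
    return {"type": "final", "text": "Saved the picture to your desktop."}


def run_agent(task):
    # The conversation starts with the user's request plus a screenshot.
    conversation = [task, "screenshot of desktop"]
    while True:
        step = model_step(conversation)
        if step["type"] == "final":
            return step["text"]
        # Execute the requested action locally, then feed the result back,
        # mirroring the send-results-back-and-forth loop described above.
        result = f"executed {step['action']} -> {step['path']}"
        conversation.append(result)


print(run_agent("Save this picture of a cat to my desktop"))
```

In the real API, the stand-in call would return Claude's requested tool actions (clicks, keystrokes, screenshots), and your code would perform each one and send the result back until Claude returns a plain text answer.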
New open-source AI video model Mochi 1
Genmo’s Mochi 1 is shaking up the AI video generation space.
Announced as a “research preview,” it’s an open-source model under the Apache 2.0 license, which means anyone can access, modify, and use it freely.
That’s a huge deal in a field where most advanced models are proprietary!
Mochi 1 isn’t just open; it’s technically ambitious, with a 10 billion parameter diffusion architecture called the Asymmetric Diffusion Transformer (AsymmDiT). This design helps it handle video creation more efficiently than traditional models that juggle video, image, and text data all at once.
Right now, Mochi 1 outputs videos at 480p with a target of 24 frames per second.
Running Mochi 1 isn’t for the faint of heart, though.
You’ll need serious hardware—four Nvidia H100 GPUs—for the best results.
This might limit its accessibility for now, but it sets the stage for future improvements that could bring this tech to more users.
Ideogram Canvas
Finally, Ideogram Canvas is a new feature on the Ideogram platform that gives users an expansive creative workspace to generate, edit, and organise images.
It’s designed for both professionals and hobbyists, letting users seamlessly upload their own images or create new ones within the platform.
This addition enhances Ideogram’s previous focus on AI-powered image generation and typography by introducing more hands-on tools and flexibility.
One standout feature is Magic Fill, which allows users to edit specific areas of an image through inpainting.
You can replace objects by selecting a section and providing a new prompt, add elements or text, and even zoom in to fix small details.
This level of control makes it especially useful for graphic designers looking to fine-tune their work.
Another highlight is Extend, which expands images beyond their original borders.
This outpainting tool is perfect for adjusting compositions or adapting images for different formats, like social media or larger prints, all while keeping the original style intact.
Wow, what a week, right?
Can't wait to see what's next on the AI Horizon.
⛏ My name is Harry Finn
🔷Elevating miners with next-level mining technology
Liked this post? Want to see more?
Ring the 🔔 on my Profile and🔗 Connect with me.
Subscribe to my (this) Newsletter