Technological Convergence...
...and some regulation.
🚀 Intro
There continue to be lots of big announcements and, whilst it seems some degree of AI fatigue is creeping in, my sense is that the hype will continue for some time yet. But it was a surprise to many that Mira Murati, OpenAI's CTO, and two other execs left the company on the same day, amid a fundraising round that will see the company valued at around $150Bn. With that said, I had a conversation with an AI startup founder at the end of last week and her pragmatic view was that OpenAI is a startup with startup problems, whereas its competitors are established companies with a different set of problems… and of course she's right.
Today I want to talk a little bit about convergence and how AI is changing the UI, both of which are themes that I’ve talked about several times before according to my Notion AI assistant.
I use Notion to collect information from across the web, and as the place where I keep my journal and author content. If you’re interested in this idea of an AI-powered “Second Brain”, I wrote a post on my workflow about 18 months ago and, whilst it could do with a refresh, it’s still mostly correct.
🚀 The Intelligence Age
But before we get fully into it, and the opening paragraph notwithstanding, Sam Altman seems very confident about achieving Artificial Super Intelligence (ASI) in his latest post, titled “The Intelligence Age”. As he points out, “Deep learning works, and we will solve the remaining problems”. And in so doing, “Our children will have virtual tutors who can provide personalized instruction in any subject, in any language, and at whatever pace they need.”
I liked Dan Fitzpatrick's take in his Forbes article, where he says:
“This could be an opportunity for education to shift in focus. Instead of an obsession with memorization and learning for an exam, education could pivot toward fostering problem-solving skills, critical thinking and innovation. Students could collaborate with AI systems to tackle real-world challenges. Maybe they can even contribute to scientific discoveries and technological advancements as part of their learning process.”
This is somewhat the essence of an interesting Google-sponsored article titled “The Metacognition Revolution”, which argues that the integration of AI in education is shifting the focus from content mastery to metacognition, enabling students to reflect on their learning processes. AI tools can help personalise learning, assess student progress through process-oriented evaluations, and highlight gaps in understanding. They say that this shift encourages educators to teach students not only what to think but how to think critically about their own thinking, preparing them for a rapidly changing world and enhancing their lifelong learning skills.
🧭 Situational Awareness
At the end of the video demo last week, I made a comment about NotebookLM not supporting YouTube… well, they added it!
During my conversation with the startup founder I mentioned at the start of this newsletter, she reminded me of Leopold Aschenbrenner's "Situational Awareness", a 2024 report warning of an imminent "intelligence explosion". So I thought I would use this section not to labour the point about my new favourite AI app, but simply to use it to create a podcast to get us into the future-gazing mood.
I listened to this whilst walking the dogs and I have to say the fact that I knew it was AI wasn't at all distracting. If anything, it felt strangely similar to my usual podcast listening habits! However, I did chuckle at a few moments when it was almost as if the AIs were having an existential crisis about the rise of AI… the script is generated so that the hosts appear to be two humans having a conversation.
Olivia Moore on X showcased this more starkly when she posted, “The NotebookLM hosts realizing they are AI and spiraling out is a twist I did not see coming”. The two-minute clip is worth a listen. What a time to be alive.
🧠 First Neuralink Patient Using It to Learn New Languages
This Futurism article was one that I included in the AI podcast last week, so for those of you that listened to it, this is a bit of a repeat I’m afraid. They report that Noland Arbaugh, the first human patient of Neuralink, is using a brain chip to learn French and Japanese while also relearning math. As the subtitle suggests, he aims to return to school to finish his degree and hopes for future FDA approval to control physical machines with the device. I mean, if that’s not a new user interface, then I don’t know what is!
But since Noland is learning languages…
🗣️ Advanced Voice Mode
ChatGPT's Advanced Voice Mode is being rolled out to all Plus subscribers. Rather than writing anything, clearly the appropriate thing to do here is a quick demo for those who haven't already experimented with it:
Now, this has created a bit of a dilemma in my household. My eldest daughter asked ChatGPT to create an opening paragraph of a story (you can see the transcript of the voice conversation below), to which ChatGPT dutifully replied in a very expressive way.
She then turned to my wife and me and commented that ChatGPT had just creatively written in a few seconds what had effectively taken her several lessons in her English classes that week. 🫤
There aren’t any vision capabilities (yet) as they had originally demoed a few months ago, but that just gives me the opportunity to ask, did you know that the latest Pearson+ Channels feature does have vision capabilities?
Last Minute Update - Canvas
Overnight, OpenAI announced Canvas, which redesigns their standard user interface to give a whole new way to work with GPT, particularly for writing and coding. I've only spent a few minutes playing with it, so nothing more to report... this will likely be a focus for next week.
🕶️ Augmented Reality
Meta Connect happened last week, and whilst they announced Llama 3.2, the first open-source multi-modal model (try saying that 5 times quickly), I wanted to pick up on arguably the star of the show, Orion, their first true augmented reality glasses.
Now OK, this is an AI newsletter, but I hope you can allow me a little latitude here given the success (particularly vs other AI wearables) of the Meta x Ray-Ban partnership, which saw Meta AI added in their last release. Now that Llama is multimodal, in all probability the goal here is to bring all of this together, as they more or less say in the article. Is there a possible future where the smartphone is a thing of the past?
By the way, these aren't the only AR glasses on the market. A far less well-known product, Spectacles by Snap, has been around for a few years and was recently updated.
🤖 An Agentic Future
There continues to be a lot of talk about, and development of, agents that perform specific functions or roles (somewhat) autonomously. “Agent” is one of those terms with a slightly loose definition; for our purposes, let's say it's an autonomous entity that can perceive its environment, make decisions, and take actions to achieve specific goals. So if you squint, it could include Cursor's AI capabilities for coding. This is certainly an area seeing a lot of activity. But let's take a look at a few more signals.
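To make that working definition concrete, here is a minimal sketch of the perceive → decide → act loop it describes. Everything here is illustrative (a toy thermostat "agent" with a simulated room, no LLM involved), not taken from any particular framework:

```python
from dataclasses import dataclass, field

@dataclass
class Thermostat:
    """A toy agent whose goal is to keep a room at a target temperature."""
    target: float = 21.0
    room: float = 17.0                      # simulated environment state
    log: list = field(default_factory=list)

    def perceive(self) -> float:
        # Read the environment (here, a pretend temperature sensor).
        return self.room

    def decide(self, temp: float) -> str:
        # Pick an action that moves towards the goal.
        if temp < self.target - 0.5:
            return "heat"
        if temp > self.target + 0.5:
            return "cool"
        return "idle"

    def act(self, action: str) -> None:
        # Acting changes the environment, which changes future perceptions.
        delta = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
        self.room += delta
        self.log.append(action)

    def run(self, steps: int = 10) -> None:
        for _ in range(steps):
            self.act(self.decide(self.perceive()))

agent = Thermostat()
agent.run()
print(agent.log)  # heats until near the target, then idles
```

The interesting property, and the reason "agent" is hard to pin down, is the closed loop: each action alters the environment that the next perception reads, so behaviour emerges from the loop rather than from any single call. Swap the if/else in `decide` for an LLM call and the toy sensor for a browser or desktop, and you have the shape of the systems below.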
Salesforce’s annual Dreamforce conference happened a few weeks ago, at which much of the focus was their Agentforce platform that enables you to “Build and customize autonomous AI agents to support your employees and customers 24/7”.
Microsoft recently published WindowsAgentArena on GitHub, which is a “scalable Windows AI agent platform for testing and benchmarking multi-modal, desktop AI agents”. As they say on their webpage, “Large language models (LLMs) show remarkable potential to act as computer agents, enhancing human productivity and software accessibility in multi-modal tasks that require planning and reasoning.”
Clearly OpenAI agrees, because they are hiring a multi-agent team, since they “view multi-agent as a path to even better AI reasoning.”
And finally, what more evidence could you need than AI being 100% successful at solving CAPTCHAs… never mind agentic, technically they are provably more human than me! I suppose there is a follow-up joke about quantitative bias, but I'm 94% sure I'd mess it up. (Disclosure: ChatGPT helped me write this pun.)
I hope we can agree, from a UX/UI standpoint, agents could be a paradigm shift.
♾️ Convergence
If you've read this far (thank you!), you've probably already drawn a similar conclusion, but let's spend a few sentences converging these ideas. Hopefully it's not too much of a leap to say that combining Neuralink with advanced voice might change what learning languages means, and, zooming out one layer further, you can see how AI is transforming the ways we interact with technology, across modalities, as it converges with them. Broader still, this applies at an organisational level, as the Salesforce example points to, and even at an industry or system level if we think about autonomous vehicles and robotics.
🚨 Regulation
OK, this has been a long one, but I just wanted to finish off with a few thoughts on this hot topic. The big news is that California's Gov. Newsom vetoed the controversial AI safety bill, citing the need to maintain leadership in the Tech/AI space as one of the key reasons. This was despite 125 Hollywood actors signing an open letter urging him to sign.
By contrast, there continues to be push back and criticisms of the EU’s approach and this is translating into real world implications, even for some of the latest announcements in this newsletter…
As you probably know, Llama is an open-source model and, as such, arguably puts pressure on others to maintain an ethical approach, be transparent, and keep prices for the consumer low. However, Llama's community licence is not granted to EU-domiciled entities. Arguably that's a significant hole in the EU tech ecosystem, making it less effective overall, and given the points about convergence it might lead to further unintended consequences downstream.
OpenAI's Advanced Voice Mode is unavailable in the EU at the moment, and the consensus as to why seems to centre on a section of the EU AI Act about emotion recognition. Technically, this could make Advanced Voice Mode illegal in the workplace or at school.
The final thought is that last year's Presidential Executive Order tried to quantify “the set of technical conditions for models and computing clusters that would be subject to the reporting requirements”, and this focused on floating-point operations, or essentially the compute layer. But there are three elements to the development of models that drive the scaling laws, the other two being data and algorithms/methodology. It would appear that a significant amount of the improvement in the recent OpenAI o1 was thanks to algorithms/methodology.
As always, thanks for reading.
Jody
N.B. This newsletter is a lightly edited version of one that I create and regularly send internally to my Pearson colleagues.
News I didn’t use