GPT-4's Visual Intelligence: possibilities, dark side and future
🚨 This is a signal from the future. You can use the signal to reflect on yourself, society or your business.
It appears that I'm not connected to every one of you on LinkedIn. So let's connect, shall we? Please use this link.
Thank you for taking the time to read my "Signals from the future" newsletter. If you'd like to support me and the time and effort I put into creating this newsletter, you can help by liking this article on LinkedIn, tagging a person in the comments, or sending it to a friend or colleague via email. Every little bit helps, and I appreciate your support. Thanks again for reading!
Kind regards, Jarno Duursma
ps: You can enhance your event with a presentation from me on the cutting-edge topics of ChatGPT, AI, and synthetic media. Simply complete this form to book me and bring your event to the next level.
Are you not quite up to date on what has happened recently? Here's what you need to know:
GPT-4 update: visual intelligence
GPT-4's Visual Intelligence
In OpenAI's recent technical report on GPT-4, researchers uploaded this picture to the system and asked a simple question: "What is funny about this image?"
GPT-4 explained: "The humor in this image comes from the absurdity of plugging a large, outdated VGA connector into a small, modern smartphone charging port."
As most of you know, I have written extensively on artificial intelligence, including a book in 2017 titled 'The Digital Butler: Opportunities and Threats of Artificial Intelligence,' and a 2019 report called 'Machines with Imagination: An Artificial Reality.' However, I never anticipated that by 2023, an AI system would be capable of explaining a joke based on visual input. The development is truly astonishing.
This kind of image-to-text recognition can be seen as an 'image whisperer' that reflects on visual content such as images and photos. It goes far beyond object recognition without context or reflection. Imagine what the possibilities could be.
This technology can read handwritten text from a photo and instantly translate text that appears within an image. In the future, students might receive real-time feedback on their handwritten assignments, as well as explanations of visual educational materials (books, diagrams, presentations), reducing the burden on educators.
In the future, this kind of software might also be used to better detect offensive images, such as memes, on social media platforms. Memes often carry a context-sensitive message that 'normal' image recognition cannot pick up.
These are just a few examples I can think of right now.
Help for the visually impaired
With the integration of GPT-4 into the 'Be My Eyes' app, blind individuals can receive detailed descriptions of their physical environment by simply taking a photo with their phone. The app serves as a virtual companion with perfect vision, ready to lend a helping hand.
Thanks to this technology, visually impaired users can receive auditory descriptions of items in their refrigerator, suggestions for meal preparation, expiration dates of products, and assistance while traveling on public transportation. In essence, GPT-4 provides an additional set of eyes for those who cannot see.
Dark Side
However, there is a darker side to this technology: the potential use for surveillance. Every police force or government worldwide can (and will) use this kind of technology for spying and surveillance.
The development of image-to-text object recognition might also lead to unemployment in sectors that rely on visual analysis or interpretation.
The future: critical questions
The rapid development of AI software raises critical questions about its implications. If AI systems can already extract information from images and engage in discussions with us, is it only a matter of time before they can learn from every YouTube video in the world? Could they reflect on the content and answer questions about correlations between multiple videos or contradictions within them? Can AI systems ultimately develop a comprehensive model of the physical world based on our images and videos, thereby dramatically increasing their intelligence?
As I ponder the future of this technology, I wonder how intelligent AI software will become. What will it be capable of in three, five, or ten years? Although it may be too early to make definitive predictions, one thing is clear: the advancements in AI technology, particularly in the realm of image-to-text object recognition, are transforming the world around us. The potential benefits for media, education, accessibility, and research are immense, but it is crucial to remain vigilant about the potential threats and ethical implications associated with such powerful technology.
What do you think? Leave your comments!
ps: Do you like this kind of content? I have another newsletter named "Trending in Tech" (in Dutch, 8k subs). It covers various topics, including the future of "text-to-video". Please subscribe and make my day!