Let me try to explain a key development in the GenAI journey through two simple words -
Seq and Vec
A direct descendant of Warren Weaver’s memorandum on translation that we spoke about in my previous post, the seq2vec family of models applies the encoder-decoder architecture that Transformers so elegantly adapted. Later enhancements also use the attention mechanism, designed to capture “context”. Let me explain -
Seq = sequence
Vec = vectors
Let’s say -
Seq is a sequence, let’s say of words, i.e., a sentence.
Vec is a mathematical representation of an image or video, i.e., a collection of pixels
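To make the two terms concrete, here is a toy sketch: a sentence as a sequence of word tokens, and an image as a flat vector of pixel values. The whitespace tokenizer and the 2x2 grayscale image are my own illustrative assumptions; real models use learned tokenizers and high-dimensional embeddings.

```python
def to_seq(sentence: str) -> list[str]:
    """A sentence as a sequence of word tokens (toy whitespace split)."""
    return sentence.lower().split()

def to_vec(image_rows: list[list[int]]) -> list[int]:
    """An image as a flat vector of pixel intensities."""
    return [pixel for row in image_rows for pixel in row]

seq = to_seq("A cat on a mat")
vec = to_vec([[0, 255], [128, 64]])
print(seq)  # ['a', 'cat', 'on', 'a', 'mat']
print(vec)  # [0, 255, 128, 64]
```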
Seq to vec?
Take a sequence (a phrase or a sentence) and convert it to a video or image.
Seq2vec tools you know: DALL·E, Midjourney, Runway, Sora
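The encoder half of that architecture can be sketched in a few lines: turn a word sequence into one fixed-size vector by averaging deterministic per-word vectors. The hash-based word vectors and the dimension of 4 are illustrative assumptions; real encoders learn these representations from data.

```python
import hashlib

DIM = 4  # toy embedding dimension (assumption; real models use hundreds+)

def word_vector(word: str) -> list[float]:
    """A deterministic stand-in for a learned word embedding."""
    digest = hashlib.md5(word.encode()).digest()
    return [b / 255 for b in digest[:DIM]]

def encode(sentence: str) -> list[float]:
    """Average the per-word vectors into one fixed-size sentence vector."""
    vecs = [word_vector(w) for w in sentence.lower().split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

v = encode("a cat on a mat")
print(len(v))  # 4 - any sentence length maps to the same-size vector
```

The point of the sketch: however long the input sequence, the encoder emits a vector of fixed size, which a decoder can then turn into pixels (seq2vec) or more words (seq2seq).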
Vec to seq?
Take a vec (an image or video) and conduct a search over it (as in Google Image Search). A model can also “look at” and summarize the image or video.
Vec2seq tools you know: Gemini, AWS Nova, and GPT-4o/o1 all have these capabilities now
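A minimal vec-to-seq sketch: “captioning” an image vector by nearest neighbour against a hand-made store of known vectors. The 4-pixel vectors and captions here are my own toy assumptions; real systems use learned visual encoders and generative text decoders.

```python
def distance(a: list[int], b: list[int]) -> int:
    """Squared Euclidean distance between two pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def caption(image_vec: list[int], store: dict[tuple, str]) -> str:
    """Return the caption of the closest known image vector."""
    best = min(store, key=lambda v: distance(image_vec, list(v)))
    return store[best]

store = {
    (0, 0, 255, 255): "a bright bottom half",
    (255, 255, 0, 0): "a bright top half",
}
print(caption([10, 5, 250, 240], store))  # a bright bottom half
```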
Seq to seq?
What do you think? Tell me in the comments below.
There's likely still a lot of untapped value in training across multiple languages, given these models are essentially creating semantic maps and that different cultures/languages have unique linguistic features that are translated imperfectly by human translators. Unlocking those connections across different languages is fascinating in its own right and a boon to linguistic research, but could also help scale and uncover contemporary human knowledge that is currently obscured by language barriers.