Language Transformer Models: A hand for an eye and an eye for a mouth

Language transformer models (as my wife, the chatbot lead, corrected me) are large data models. They kind of understand the relational, contextual nature of the nodes. When given an inquiry, they can reshape those nodes according to what you’ve asked. But they just know how to play the game of language; they don’t “get it,” nor do they care.

This computing experience is both a continuation of where computing has always been going and a huge departure. It isn’t too much of a jump from Google finishing your half-written, typo-laden search query, but now the program itself is malleable to the conversational whim of its user.

The intellectual depth and performative beauty are pretty raw, like a college improv show. And like a brilliant but stoned college student, it’s listening, capable, and holding on to whatever you say is valuable. But what does it actually know other than what it’s been told? What is it actually capable of other than what we request? Is this actually a deep conversation, or is this just saying words in the right shape for the situation?

Language itself is the content of this new media tech. Not the content of language, such as culture or speech, nor even the users’ inputs, but rather language as this huge abstract ocean of communicable possibility. The meaningless-until-used abstraction is the raw material. Then, with a suggestion from the audience through the chatbox: abracadabra, improv that makes you laugh or think.

I spend some time every day in Midjourney AI generation and ChatGPT because I want to see how I can use them in my businesses and, frankly, it’s fun. But I get this strange sense of pleasure deep in my brain that makes me think of clockless casinos in Las Vegas. Then, as I scroll back up through my interactions with the “AI,” I see that I am not really going anywhere, just rolling the dice, regeneratively playing with ideas.

With these apps, these transformer models that use language as abstract math at their core, the experience has been flipped. The user of the medium is now the producer of the content. They are cause and effect. The medium is abstract and indifferent, embracing all languages and their varied canons and works without position… A Genesis chapter 1 kind of place: a calm-faced ocean of content and context possibilities until the word arrives and memes violently give form to the context of the consumer’s input/output content.

In Understanding Media and The Medium is the Massage, Marshall McLuhan repeatedly emphasizes how the ratios of the human senses change with the introduction of new media technology. Cumulatively, this changes how society emerges because it affects how individuals perceive and create. However, because so much of this emerges simultaneously, the causes and effects of more specific things in such a society are hard to track; sometimes they seem to come into awareness only with the effects.

Returning to this framework throughout, McLuhan uses the myth of Cadmus and the dragon’s teeth as a metaphor for the alphabet as a technology. He explores the inevitable domination of a visual-heavy society following this technological shift. And as a poignant punster, McLuhan titles the chapter about the alphabet in Understanding Media “An Eye for an Ear,” a wink to the flip of senses.

Just as the letters of the alphabet reach a new utility because they are abstract and meaningless by themselves, so too is the language in transformer models a meaningless abstraction until the user takes it up anew to create something.

In the new age of ChatGPT, the human senses are skewing strongly again. This time, however, these visual mediums are becoming interactive in their visual output through the mode of language. It is as if our eyes have become hands and our mouths have become eyes. Where we would see, we manipulate through text input. Where we would have consumed deeply, we now consume with a bit of reserved distance, so we can iterate and consume again.

As of yet, the experience of this tech is much like a nickelodeon, a personal novelty. We don’t think others get it until they’ve pressed their face against the screen and given the handle a crank. It is a separate, individual experience of the tech itself, at least until the image or text is shared outside the medium as a finished product: say, as a cover for a comic book, or the content of a LinkedIn post. Then those completed media have a different impact on the senses because they become an artifact within another medium; they become the image or text shared in social media. While they are computer-derived/assisted/whatever, at that point the media form is transferred. Not sure what to conclude from this, but it’s a point I felt worth making.