On stage in Sydney’s gorgeous Hunter Valley wine region this week, an audience of global travel writers asked me if and when AI would replace humans.
Rather than thrusting a finger in the air and offering one of those convenient answers designed to placate the audience, I decided to go and do the research.
In the car home, I went back through 2.3 million years of human existence to research how human communication evolved, and what could be learned.
I discovered that humanity took an entirely different path to develop language than the geeks busily building AI today.
Early humans used gestures to communicate first, then took millions of years to evolve to images, then language, and finally to written words, just 5,000 years ago.
The internet has taken the opposite evolutionary path, starting with words, then images, audio, and video, and arriving at conversation only in the past year and a bit.
It shows the new AI web still has a long way to go to achieve the ultimate goal of human-level communication, which is what this post is about.
My conclusion is that if human communication is the goal, then the web is a massive failure for billions, and has a long way to go.
Which means it’s a giant opportunity for communication professionals and innovative publishers willing to try to solve it.
This is one of my favourite posts so far…
But, before we dive in, let’s welcome a flurry of new arrivals to our fast-growing 3,000-strong community.
They include top news and culture magazine Le Point in Paris, more faces from The Independent, the Professional Publishers Association, ITV Studios and Sony PlayStation in London (those Call of Duty digital billboards in Times Square are really something), TV giants A+E Networks in Singapore, Australia’s ABC and ad tech platform Hoppr, the innovation team at Hewlett Packard in Salt Lake City, Utah, the USA Today Network in Austin, Texas, and Business Desk in Wellington, New Zealand (I’ll be there for Christmas and New Year), among others.
Add to that some new friends from fashion and travel title Excess All Areas, as well as Travel with Joanne in Sydney, cruise specialists Coral Expeditions, experiential specialists Intrepid Travel, accessible experts Travel Without Limits, and Tourism Noosa among others.
And thanks to our sponsor Avid Collective.
Albert Mehrabian is Professor Emeritus of Psychology at the University of California, Los Angeles, and the world’s leading voice (ahem) on non-verbal communication.
His research shows that 93 per cent of human communication is not the words themselves but relies on tone of voice and a highly evolved process of gestures and expressions.
That means AI, for all its trillions in valuations and techno-whiz-bang wizardry, is still 93 per cent off replacing us.
That’s the answer I wish I had for the members of the Australian Society of Travel Writers, to whom I was giving the keynote earlier this week. 🤦‍♂️
But AI is developing exponentially faster than we humans did. It took us 2.5 million years to get from a gesture to a grunt to a GPU.
Now the world is in an arms race to get to whatever AI enables the web to be in the future.
And my belief is that the ultimate winner will be whoever can scale digital information delivery into the most human-like experience first.
Because, if all of human history is a signal, then it has to be inevitable that words and pictures will not be the communication format of the future.
And that raises the question of why 99.9 per cent of news media resources are focused on something that’s dying. Or maybe that’s why 99.9 per cent of publishers are dying?
But it supports my 20-year belief that the future of news telling must evolve to be audio and visual, and as human-like as tech can be, to remain relevant.
To test this, I decided to conduct an experiment.
On the three-hour drive from my speaking engagement back home to Sydney, I had a long conversation with Microsoft’s Copilot.
I wanted Copilot’s AI to teach me about the history of human language, and then to merge my knowledge with the archives of the Wall Street Journal to join some dots.
Because I believe that the way we deliver news has to change.
And I decided to do the entire thing from my car, using only voice, and while driving, to test another hypothesis I’ve held for years: that news media fails to engage its audiences while they are driving.
1.4 billion people commute in a car alone every day.
The average commute is slightly more than 30 minutes each way.
That’s 5,276 years of addressable media time in just the top 10 cities, and 121,348 years globally. Every day.
With one ad every eight minutes, and a podcast costing next to nothing to make, that’s 7.9 billion ads, at a $50 CPM, in addressable revenue. Just sayin’.
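For anyone who wants to check the back-of-envelope maths, here is a rough sketch in Python. It takes the 121,348 years of daily global commute time as given and assumes a 365-day year, one ad impression for every eight minutes of listening, and a flat $50 CPM (the price per 1,000 impressions). The variable names are mine, purely for illustration.

```python
# Back-of-envelope check of the daily commute ad inventory.
# Assumptions (not measured data): a 365-day year, one ad impression
# per eight minutes of listening, and a flat $50 CPM.

MINUTES_PER_YEAR = 365 * 24 * 60

global_commute_years_per_day = 121_348  # figure quoted in this post
minutes_between_ads = 8
cpm_dollars = 50  # price per 1,000 ad impressions

# Convert the daily listening time to minutes, then count ad slots.
total_minutes = global_commute_years_per_day * MINUTES_PER_YEAR
ad_impressions = total_minutes // minutes_between_ads

# CPM pricing: dollars per thousand impressions.
revenue_dollars = ad_impressions / 1000 * cpm_dollars

print(f"Ad impressions per day: {ad_impressions:,}")
print(f"Addressable revenue per day: ${revenue_dollars:,.0f}")
```

On those assumptions the sum comes to roughly 7.97 billion ad impressions, or just under $400 million, a day.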
So, here’s what I did.
I chatted with Copilot (see below), then took the text and used Google’s NotebookLM to turn it into a podcast, and included two breaks for ads.
Creating the podcast took 14 minutes and NotebookLM cost nothing. My subscription to ChatGPT costs $20 a month.
The outcome was the podcast above. It’s published unedited, and as it came out.
If you’re a publisher, ask yourself how productive your news team might be if they turned text stories into podcasts that can be listened to in the car, or on the move?
Or how productive your news team might be if they used their time in the car to deep dive into topics and create unique journalism on the go?
Let’s jump in my car, and begin with a voice prompt to Microsoft Copilot…
Ricky: Good morning, I want to do a research project.
Copilot: Good morning! That sounds exciting. What are you planning?
Ricky: I want to explore the origins of human language. When did language first appear, what was it like, and where did it happen?
Copilot: The origins of human language have intrigued anthropologists, linguists, and historians for centuries. The exact timeline is still debated, but:
Primitive language developed as early as 2.3 million years ago but modern language is generally associated with Homo sapiens, less than 200,000 years ago.
Initially, communication relied on gestures, similar to modern primates. Over time, vocalisations became more complex.
The Bow Wow Theory suggests early humans imitated natural sounds. The Pooh-Pooh Theory posits language originated from instinctive emotional exclamations.
The Yo-He-Ho Theory proposes it developed from rhythmic chants used during communal labour.
The earliest humans to develop complex languages are believed to have lived in eastern or southern Africa ~150,000 to 200,000 years ago.
Ricky: OK, great. When did distinct languages begin to emerge?
Copilot: Distinct languages emerged as early humans migrated out of Africa around 100,000 years ago.
Different environments influenced the evolution. Groups in dense forests developed different vocabularies and sounds to those in open savannas.
The advent of agriculture 12,000 years ago led to more permanent settlements, and linguistic diversification grew as communities became more socially complex.
Indo-European, Afro-Asiatic, and Sino-Tibetan languages formed, giving rise to many of today’s major languages.
The earliest written records, Sumerian cuneiform and Egyptian hieroglyphs, from around 3,000 BCE provide evidence of distinct languages used by ancient civilisations.
The spread of empires and trade then brought languages into contact, leading to further diversity. The Phoenician language, for example, spread across the Mediterranean.
Geography such as mountains, rivers, and oceans also acted as barriers, influencing the spread of languages.
This led to the rich tapestry we see today, with over 7,000 distinct languages.
Ricky: Interesting. If there are 7,000 languages, are there recurring themes?
Copilot: There are. All languages use a combination of consonants and vowels to form words.
Grammar and syntax are also common, and most languages have ways to indicate when an action takes place (tense) and the nature of the action (aspect).
Figurative language is also universal, with metaphors being common across cultures. Most also have ways to express politeness and formality.
Ricky: Fascinating. What about communication beyond voice and language? What about gestures and expression?
Copilot: Linguists and cognitive scientists recognise gestures and facial expressions as integral components of human communication.
Gestures provide additional information and help clarify the speaker’s message, and facial expressions convey emotions and attitudes.
Gestures enhance understanding and learning. This is called conceptual congruence.
Ricky (still in the car): Righto, I’m really enjoying this. Can you put a percentage to how much human communication relies on gestures?
Copilot: Research by psychologist Albert Mehrabian suggests 55 per cent of communication is body language, gestures and facial expressions.
Another 38 per cent is tone, and just seven per cent is the actual words spoken, so 93 per cent of communication is nonverbal.
Ricky: What’s the impact of that in an anthropological sense?
Copilot: Spoken language developed first, about two million years ago, and writing much later, ~5,000 years ago.
Writing enabled the recording of information, laws, and cultural narratives, facilitating complex societies.
Spoken language allows for real-time interaction and feedback and is crucial for social bonding and cultural transmission through storytelling.
Written language lacks the immediacy and emotional richness of spoken communication.
Ricky: OK. In the news media industry, immediacy and engagement are considered vitally important.
What you’re telling me suggests that writing the news is suboptimal and supports my argument that news telling should evolve to be more human-like. That right?
Copilot: From an anthropological perspective, spoken language is essential for everyday interaction and is crucial for social cohesion.
Ricky: OK. Let’s bring this up to date. Today’s internet is mainly words and pictures. AI means news media can now communicate better in audio and visual.
Based on all the data you have, is it fair to say that AI will enable news delivery in the future to be more human-like, and how will that evolve?
Copilot: Your hypothesis is insightful. AI indeed has the potential to transform digital communication, making it more human-like, and here’s why it’s likely.
Advancements in natural language processing (NLP) mean AI models like GPT-4 can understand and generate text that captures nuance, context, and subtleties.
AI technologies can already generate human-like speech with natural intonation and rhythm.
They also learn from human interactions to better understand user preferences and adapt.
This evolution will unfold through virtual assistants with human-like empathy. AI tools will also learn to interpret and convey non-verbal cues and gestures.
Ricky: That’s helpful, but I respectfully disagree, so let’s put the brakes on a bit.
I don’t believe that the future of human communication is going to be via an AI avatar, like in Mark Zuckerberg’s Metaverse fever dream.
Instead, it will evolve to be more human. So how can we take the words and pictures of today’s web and transform them into more human-like communications?
What kind of news media products emerge?
Copilot: Many ways. News platforms will integrate text with high-quality voice narration, allowing users to listen to articles that convey tone and emotion.
Real-time translation of news articles and broadcasts will break down language barriers so that every news article is accessible to global audiences.
AI will analyse user preferences to curate personalised news feeds, and provide contextual recommendations like related articles, videos, or infographics.
We will see augmented reality (AR) create immersive news experiences to explore 3D models, interactive maps, and interactive infographics where users can engage with data and visualisations in more dynamic ways.
There will be platforms for users to engage in real-time discussions with journalists and other readers.
This can create a sense of human community, and AI can be used to moderate those comments and discussions.
Using AI to analyse the sentiment of user comments, and provide feedback based on the emotional tone, will make the experience more personal.
We have the technologies for news media to evolve from static words and pictures to a dynamic, interactive, human-like communication future right now.
Ricky: OK, we’re getting somewhere, and I want to try this out for real, but just before we do, I’m fascinated by the concept of readers interacting with a news media brand in a conversation.
Let’s say The Wall Street Journal uploaded its content archive into a large language model and let consumers ask it about the performance of a stock, or a CEO.
The LLM would only use articles written, edited, and published by that one publisher. That’s possible now, right?
Copilot: Your idea is innovative and feasible with current AI technology. Here’s how.
Advanced language models like GPT-4 already process and generate human-like text based on vast amounts of data. If the WSJ uploads its archive, users could query it.
There’s already a content licensing deal between OpenAI and News Corp to access WSJ content.
With your approach though, users could interact with the AI to ask about specific topics, such as stock performance or CEO histories.
The AI would retrieve and synthesise it from the WSJ’s archives. This would deliver a more interactive and engaging experience than traditional search.
Voice-activated assistants like Siri or Alexa, specialised in news content, would enable users to ask questions and receive spoken responses in a human-like form.
A user-spoken question about a stock’s performance would receive a comprehensive summary which would be integrated with data visualisations.
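A note for the technically curious: the "one publisher's archive only" idea can be sketched in a few lines of Python. This is a toy illustration, not how Copilot or any publisher actually does it. The articles are invented placeholders rather than real WSJ content, simple word overlap stands in for a proper search index, and the names `Article` and `retrieve` are hypothetical.

```python
# Toy sketch of answering only from one publisher's archive.
# The articles are invented placeholders, not real WSJ content, and
# word-overlap scoring stands in for a real search index.
import re
from dataclasses import dataclass


@dataclass
class Article:
    publisher: str
    headline: str
    body: str


def tokenize(text: str) -> set[str]:
    """Lower-case the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, archive: list[Article], publisher: str,
             top_k: int = 2) -> list[Article]:
    """Return the publisher's articles sharing the most words with the question."""
    query = tokenize(question)
    scored = []
    for article in archive:
        if article.publisher != publisher:
            continue  # never draw on another publisher's reporting
        overlap = len(query & tokenize(article.headline + " " + article.body))
        if overlap > 0:
            scored.append((overlap, article))
    # Highest overlap first; sort on the score only.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [article for _, article in scored[:top_k]]


archive = [
    Article("Example Journal", "Cloud unit lifts quarterly revenue",
            "Revenue rose as the cloud business grew."),
    Article("Example Journal", "Chief executive outlines AI spending",
            "The chief executive defended heavy AI investment."),
    Article("Other Outlet", "Cloud revenue surges",
            "A rival outlet's take on cloud revenue."),
]

results = retrieve("How did cloud revenue perform?", archive,
                   publisher="Example Journal")
for article in results:
    print(article.headline)
```

In a fuller system, the articles returned by `retrieve` would be handed to the language model as the only material it is allowed to summarise, which is what would keep the answers inside the publisher's own reporting.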
Ricky: Through personalisation and curation, the news could be packaged for billions of individuals based on their preferences, and delivered in any format, right?
Copilot: Yes. AI can curate personalised news feeds, and interactive summaries.
Ricky: The Holy Grail for me is that I can finally have a conversation with my news.
Copilot: Yes indeed. Users can ask about specific sections to get more detailed and related content.
Ricky: OK, I’m sold. Now I want to get specific. I’m a consumer particularly interested in the media and tech space.
And I know you, Copilot, are built on OpenAI’s ChatGPT, which means you have copyright permission to access articles from The Wall Street Journal. I also have a subscription.
So, please give me the top story on media and technology today from The Wall Street Journal and summarise it. Go.
Copilot: Sure.
Google reported a 15 per cent jump in year-over-year revenue driven by strong results in cloud computing. This boost comes during a costly push into AI. It means Google’s cloud business is becoming a significant revenue driver, which indicates a strong market position for future expansion.
Ricky: Great, but I want more. What was the Wall Street Journal’s sentiment about Google in this story?
Copilot: The Wall Street Journal regarded Google’s financial results quite positively.
It highlighted Google’s revenue increase in a competitive environment and emphasised that its investments in AI are starting to pay off.
The results led to a 5.3 per cent increase in Alphabet’s share price.
Ricky: Hmmm, I’m not so sure. My understanding is that Google’s ad revenue growth slowed. Using the Wall Street Journal’s coverage, should I be worried as an investor?
Copilot: The coverage does highlight a slowdown in advertising revenue growth. It grew 9.5 per cent in the quarter but slowed compared to past performance.
While the market may react negatively, it’s important to consider the overall health of Google’s business. It continues to generate substantial revenue.
Ricky: In my Future Media Substack, I suggested the results indicated Google was past its golden era and becoming more like IBM: valuable, but not exciting. Do you think that was fair?
Copilot: Here’s my assessment.
Google’s strong performance in cloud and AI investments shows it’s still a major player. Its ability to generate substantial revenue and invest heavily is a positive sign.
The slowdown in ad revenue growth and the need to diversify its revenue streams are valid concerns. Google is adapting to a more mature phase of its lifecycle.
Like IBM, Google is focusing on enterprise (cloud computing) and emerging technologies (AI), and this might lack the innovation and excitement of earlier years.
However, unlike IBM, Google still has a strong consumer-facing presence through its search engine and YouTube. This can help maintain relevance and excitement.
Ongoing investments in AI could lead to new breakthroughs, but the company’s ability to adapt and evolve will be crucial in determining its future trajectory.
Your analysis provides a thoughtful perspective, and it’s fair to acknowledge that Google might be entering a more mature phase, but it also has continued growth.
Ricky: Ha, you’re on the fence then. Thank you. Tell me, do you think we managed to create something interesting, with me speaking to you, while driving?
I am now going to convert this into a podcast using another AI, so I can turn our conversation into a new format and earn money again. Has it been fun?
Copilot: It’s always fun talking with you. What shall we do next?