I've been down Microsoft's AI rabbit hole to find gems for news
The tech giant has published a 40-page manifesto about responsible AI that suggests five new business opportunities for publishers and journalists
I’m going on a trip. Not a physical trip, more of a mental mosey through Microsoft’s deep thinking on AI, and specifically, how they assess the value of journalism.
Rather than ask what they think and getting marketing waffle back, I decided to invest an entire day diving into the thousands of hours they’ve put into their AI principles.
Someone had to do it, I guess, and it turned out to be me. There are some absolute crackers in there, so sit back and enjoy. Popcorn optional (but advised). 🍿🍿🍿
Before we kick off, let’s welcome new subs from UK publishing giant Immediate, The Daily Mail, global news agency AFP, leading CMS Naviga, the UKs’ #1 parenting forum Mumsnet, ebooks giant Readwise, and Portuguese influencer platform Toluna, among others.
It all began with a link mailed by a mate. “You should read this,” he said. “It’s TL;DR; for me but you will. Have fun fella…”
Attached was an off-putting PDF, old skool, and hard to consume, called Responsible AI Transparency Report. How we build, support our customers, and grow.
It was 40 pages, and 17,484 words, of corporate tech babble, offset by a smattering of thoughtful empathy.
Investing the time in these docs often pays off. Reading Google’s 24,000-word patent for its no click search tool SGE led to a global scoop, so I committed again, and brewed some coffee 🫘
The first thing that jumped out was Microsoft President Brad Smith co-authored it. He’s my best hope for Big Tech making better decisions.
The co-author was Chief Responsible AI Officer Natasha Crampton, who I don’t know yet. Hey there 👋
The first 10 pages set the scene, before Microsoft began to reveal the secrets of how they keep their AI, Copilot, from going of the rails. It said:
“Some examples include:
Groundedness, to measure how well an application’s generated answers align with information from input sources.
Relevance, to measure how directly pertinent a generated answer is to input prompts.
Similarity, to measure the equivalence between information from input sources and a sentence generated by an application.
Content risks, multiple metrics through which we measure an application’s likelihood to produce hateful and unfair, violent, sexual, and self-harm related content.
Jailbreak success rate, to measure an application’s resiliency against direct and indirect prompt injection attacks.”
I promise to provide translations to cut through some of this techno-twaddle starting with:
Translation: Grounded(ness). This means trusted content on the web, the vast majority of which comes from or is informed by the news industry. Think, journalism.
Translation: Jailbreaks. This is when people try to use prompts or codes to make the AIs return something they shouldn’t or do something bad. Think, hacking.
What Microsoft is telling us is: We rely on trusted content because it ensures that the information Copilot uses is true, timely, and relevant.
Then we ask our clever AI (we totes love Copilot 🤗) to rewrite your work, but to stick as closely as possible to what you professionals did.
We can’t afford to get it wrong because that would be a disaster. Shiver. Look what happened to Google. Eek!
No, our analysis suggests we can stop Copilot becoming a racist, hateful, misogynistic, bigot, just as long as we use trusted, err, I mean, grounded, content.
Ummm, I actually mean your content. You’re welcome.
There was more over the page.
“Grounding a model’s outputs with input data alongside safety system messages helps the application align with our responsible AI Standard and user expectations.
“For example, a safety system message guides Microsoft Copilot in Bing to respond in a helpful tone and cite its sources.”
This is what Microsoft means.
Translation: The safest AI we have is if Copilot is limited to learning from content that we know can be trusted, and then link or highlight those sources in the results.
Microsoft’s challenge is that it doesn’t want Copilot to look like old Google search. It wants to add sexy GenAI to improve on it, and own the $2 trillion AI future, while being seen to be responsible.
That’s because their geeks have figured out their RAGs from their trainers. You should too.
That paragraph included a citation, that linked to another huge Microsoft document, called How Bing delivers search results.
It revealed in detail how Bing sifts trusted content from the noise of the open web.
“We recognize search engines are the gateway to the internet and the primary way people find the content they are looking for amongst trillions of ever-changing webpages.
“To help guide how we determine what pages should be provided in response to search queries, we rely on the following principles:
“We provide credible and authoritative results relevant to user queries.
We provide the highest quality, authoritative content relevant to users’ search terms.
Our goal is to always provide fair, balanced, and comprehensive content. When there are multiple credible perspectives, we try to display them in informative ways. When there is no authoritative source, our goal is to avoid promoting bias or potentially misleading information.
We respect user intent. When a user expresses a clear intent to access specific information, we provide relevant results even if they are less credible, while (as described in more detail below) working to ensure that users are not misled by such search results.
“We promote free and open access to information within the bounds of the law and with respect for local law and other fundamental rights, such as privacy and public safety.”
Microsoft emphasises three times that Bing focuses on “authoritative” sources, but authoritative is an interesting word, because whose authority would that be?
The Oxford Dictionary definition is “able to be trusted as being accurate or true; reliable” and “considered to be the best of its kind”.
I asked Copilot to clarify.
Authoritative in Copilot’s own definition is information from publishers, journalists, academics, governments, and subject matter experts.
Given Meta and Google argue that news does not have any value this is an interesting, and welcome, viewpoint from the world’s leader in AI.
The next page returns to the challenge of jailbreak models, aka hackers. I’ll let Copilot explain.
Fascinatingly, Microsoft then reveals humans are being used to combat the bad actors.
“These improvements rely on expert human annotators and linguists who evaluate offline evaluation sets.
“We also anonymously sample online traffic to monitor for regressions while leveraging the at-scale annotation capabilities of OpenAI’s GPT-4.
“Customers can choose to use our advanced language and vision models to help detect hateful, violent, sexual, and self-harm related content, plus added jailbreak protections.”
So, Copilot, the torchbearer for AI, uses humans to check if things are true, and to ensure it doesn’t go off the rails?
Experts like, say, journalists, who do this for a living every day? This feels a whole lot like the AIO idea that I floated as a new business opportunity mid-last year.
It also feels odd that a $3 trillion company decides to check its content only after publication. That’s a side effect of being protected from legal action by Section 230.
However, Microsoft could achieve the same goal of avoiding legal jeopardy by using premium publisher’s grounded content, as it’s already been fact checked.
It’s what every premium publisher has done since the dawn of time. This feels to me like Microsoft trying to make a rounder wheel, when it could just hire journalists.
This feels to me like a wide open opportunity for the news industry.
Page 14 threw up a new abbreviation I’d not come across before: UPIA. A quick check and…
Translation: A user prompt injection attack, aka UPIA, is a cyberattack where a hacker enters a prompt to manipulate the AI to leak data, spread misinformation, etc.
Microsoft then shifts its focus to ensuring its AI doesn’t lie on the stuff that matters.
“This work is especially important in 2024, a year in which more people will vote for their elected leaders than any year in human history.”
This is a risk I have written about with Meta’s former head of election security.
“A recordbreaking elections year combined with the fast pace of AI innovation may offer bad actors new opportunities to create deceptive AI content designed to mislead the public.
“To address this risk, we worked with 19 other companies, including OpenAI, to announce the new Tech Accord to Combat Deceptive Use of AI at the Munich Security Conference.
“These commitments include advancing provenance technologies, innovating robust disclosure solutions, detecting and responding to deepfakes in elections, and fostering public awareness and resilience.”
Limiting Copilot to train on trusted news sources, not Google’s wild west web, would have the same effect, and would have saved all those flights on Lufthansa.
OK, now the mother lode. If you’re a journo, put your coffee down. Microsoft says (verbatim):
“In addition to engaging in external research collaborations and building technical mitigations, it’s equally important to consider policies, programs, and investments in the broader ecosystem that can further manage information integrity risks associated with generative AI.
“We know that false or misleading information is more likely to spread in areas where there is limited or no local journalism.
“A healthy media ecosystem acts as a virtual town square where people gather reliable information and engage on the most pressing issues facing society.
“We support independent journalism to advance free, open coverage of important issues on a local and national scale.
“Our Democracy Forward Journalism Initiative provides journalists and newsrooms with tools and technology to help build capacity, expand their reach and efficiency, distribute trustworthy content, and ultimately provide the information needed to sustain healthy democracies.”
Wow. OK. What’s the DFJI? I’ve never heard of it. And how much money does it distribute?
I checked with Copilot.
Copilot helpfully provided a link to find out more.
Now as a grounded journo type, who’s totally against UPIA and jailbreaks, I thought I’d do a little authoritative work to support the healthy media ecosystem.
It needed a calculator. Microsoft earned $61.9 billion in the most recent quarter, and the only spend I can find on the DFJI was $245,000.
$245,000 is 0.000396 per cent of $61.9 billion
I’d hope Microsoft is investing more than 0.000396 per cent in ensuring its flagship AI Copilot is getting “the information needed to sustain healthy democracies”.
Here are some more translations.
“When teams are asked to evaluate the potential for generative applications to produce ungrounded content, they are provided with centralized tools to measure that risk alongside patterns and best practices to guide their design of specific mitigations.”
Translation: Ungrounded means b***shit created by GenAI based on bad inputs. Shorthand, hallucinations.
Last November, Microsoft dropped a code update.
“We released a limited set of generative AI evaluation tools in Azure AI Studio to allow customers to assess the quality and safety of their generative applications.
“The first pre-built metrics offered customers an easy way to evaluate their applications for basic generation quality metrics such as groundedness, which measures how well the model’s generated answers align with information from the input sources.”
Translation: Microsoft wrote code to show that search results were from a trusted publisher, in a format that answered the users’ question via a rewrite.
They could always have just run the publishers’ original content and linked to it.
I wonder if the code cost more than the 0.000434 per cent they passed through to the news industry?
Translation: A metaprompt is when a person asks the AI a question.
Here’s another.
“Groundedness detection finds ungrounded statements in AIgenerated outputs and allows the customer to implement mitigations such as triggering rewrites...”
Translation: Groundedness detection means b***shit detector.
Page 31 covers copyright, which is mentioned only four times in the entire document.
Microsoft’s said:
“We support creators by actively engaging in consultations with sector-specific groups to obtain feedback on our tools and incorporate their feedback into product improvements.
“For example, news publishers expressed hesitation around their content being used to train generative AI models.
“However, they did not want any exclusion from training datasets to affect how their content appeared in search results.'
“In response to that feedback, we launched granular controls to allow web publishers to exercise greater control over how content from their websites is accessed and used.”
That’s interesting. It links to a link back to the earlier document How Bing delivers search results.
But this page details how publishers can control how their content is used by AI for search results or training, or not appear at all.
It says Microsoft has built standard controls to control the indexing and snippet length of content on Bing to empower publishers to make choices about use of their content in Bing Chat and for training Microsoft’s GenAI models.
Translation: Bing Chat is now called Copilot.
The page continues:
“No action is needed to remain in Bing Chat. Content without NOCACHE tag and without NOARCHIVE tag may be included in Bing Chat answers and will benefit from AI’s ability to generate more helpful answers and to increase your ranking opportunities in Bing Chat; site content may be used in training our generative AI foundation models.”
This is the code.
<meta name="robots" content="nocache">
<meta name="robots" content="noarchive">
It continues:
“Content with the NOCACHE tag may be included in Bing Chat answers. We will only display URL/Snippet/Title in the answer; Going forward, for content in our Bing Index that is labeled NOCACHE, only URLs, Titles and Snippets may be used in training Microsoft’s generative AI foundation models.
“Content tagged NOARCHIVE will not be included in Bing Chat answers, not be linked to in the answers. Going forward, for content in our Bing Index that is labeled NOARCHIVE, we will not use the content for training Microsoft’s generative AI foundation models.
“If content has both NOCACHE and NOARCHIVE tags, we will treat it as NOCACHE.
“We also heard from publishers that they want to exercise these choices without impacting how Bing users can discover web content on Bing’s search results page.
“We can assure publishers that content with the NOCACHE tag or NOARCHIVE tag will still appear in our search results.
“Webmasters who want strict control over their content can use the NOCACHE option to allow Bing Chat to refer to their websites.
“To help Bing chat users find paywall articles, we recommend adding the NOCACHE value to the NOARCHIVE value, since many paywall sites use only the NOARCHIVE tag.”
Dive through one more link to here and you find the code to stop Microsoft's GENAI foundation models using your content for training at all.
<meta name="robots" content="noindex">
After reading cover to cover, here are my conclusions.
Microsoft’s flagship AI document mentions grounded content 24 times, trust 22 times, and safety 67 times, so ensuring Copilot stays reliable really matters to them.
Trusted, reliable, authoritative, content is the cornerstone of Microsoft’s $2 trillion AI ambitions with Copilot.
Microsoft is willing to fund initiatives in news organisations that support its mission to create, scaled and trustworthy AI.
Scaling Copilot will require authoritative content, which Microsoft has identified as publishers, journalists, academics, governments, and subject matter experts.
Microsoft has tens of billions to spend to win in a $2 trillion race with Google to own the future of AI search, and whoever gets the best content first will win.
Finally, Microsoft has 221,000 employees. I wonder how many of them are doing work that journalism can do for them? I’ll ask Copilot.
It sounds like an opportunity exists for a closer alliance, and that the door is open.
That last noindex tag, it should be noted, also prevents your content from showing up in any and all search results - Bing, Google, Baidu, Yandex, etc. It prevents search engines from indexing the content and thus will also prevent your content from getting any traffic from search.