The Floodgates Have Opened

But is it worth the price?

Time for a catch-up!

Well, we’re back to normal this week, and by that I mean another slew of new model releases, including, as per usual, a number from OpenAI. This time it’s GPT-4.5, which is described as OpenAI’s most powerful chat model, offering faster, more accurate conversational capabilities. OpenAI has also integrated its new GPT-4.1 and GPT-4.1 mini into ChatGPT and will be removing GPT-4o mini for all users, though what you get depends on the level of your subscription.

We’ve also had releases from DeepSeek AI with its new R2 model, which advances domain-specific performance on complex tasks. Personally, though, the one that interested me most was the Allen Institute’s (Ai2) 1-billion-parameter model OLMo 2 1B, which outperforms peers of similar size on multiple benchmarks.

We’ll delve more into it below, but the headline-grabbing news has been the deals being struck in the Gulf States with the big tech companies, and the lifting of the embargo / restrictions on US computer chips, which was due to come into force this week. The loosening of legal and regulatory controls on the main industry players may have wider ramifications, some of which we will also touch on below. So, let’s get started …

Big Sharks vs. Small Fish

So, the big news this week was the various investment agreements signed during Trump’s Middle East tour, which was attended by representatives of all the key US players, such as Nvidia, Google, Microsoft, Meta and xAI.

A week ago, there were concerns regarding some of the planned billion-dollar infrastructure projects being talked about, including OpenAI’s $500B Stargate data centre project, which aims to build advanced AI infrastructure in the US and overseas. Issues regarding the wider economic situation and the impact of Trump’s tariffs were raising doubts about the financial viability of such projects, but that sentiment seems to have reversed. Why? Well, in part due to the Saudi government’s planned investments in AI, coupled with the U.S. Department of Commerce cancelling the Artificial Intelligence Diffusion Rule, as already mentioned above.

Google’s AI research lab, DeepMind, has developed a new AI agent, AlphaEvolve, that can solve practical problems, tackle complex coding and maths challenges better than any previous model, and also (reportedly) reduce the risk of hallucinations. This isn’t an entirely new breakthrough: DeepMind has applied similar techniques in the past, but the use of Google’s advanced Gemini models makes this version much more capable, matching competitors 75% of the time and surpassing them 20% of the time. The drawback is that it is currently limited to problems that can be expressed numerically, such as coding and maths, which makes it great for computer science and system optimisation but not much else for now.
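At its core, AlphaEvolve reportedly pairs an LLM that proposes candidate solutions with an automated evaluator that scores them, keeping the best candidates and mutating them further. Here is a deliberately toy sketch of that evolve-score-select loop, with random numeric mutations standing in for the LLM-generated code changes and a made-up target function standing in for the real evaluator:

```python
import random

# Toy illustration of an evolve-score-select loop (hugely simplified;
# the real system mutates whole programs using Gemini, not numbers).
random.seed(0)
TARGET = 42.0

def fitness(x: float) -> float:
    """Score a candidate: higher is better (closest to the target wins)."""
    return -abs(x - TARGET)

def evolve(generations: int = 200) -> float:
    best = 0.0
    for _ in range(generations):
        candidate = best + random.uniform(-5, 5)  # "mutate" the best so far
        if fitness(candidate) > fitness(best):    # "select" only improvements
            best = candidate
    return best

print(f"Best candidate found: {evolve():.2f}")
```

The interesting part is that nothing in the loop knows how to reach the target directly; improvement emerges purely from scoring and selection, which is why the approach only works on problems with an automatic, numeric evaluator.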

Staying with AlphaEvolve, reports say it has cracked a range of thorny code problems. As an example, it identified ways to reduce Google data centre power usage and chip inefficiencies by 0.7%. That may sound small, but for a company as large as Google, the cost savings and computing-efficiency gains are massive.
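To see why 0.7% matters at that scale, here is a back-of-envelope calculation. The consumption and price figures below are illustrative assumptions of mine, not reported Google numbers:

```python
# Back-of-envelope: what a 0.7% efficiency gain is worth at hyperscale.
# Both inputs are illustrative assumptions, not reported Google figures.
ANNUAL_CONSUMPTION_TWH = 24   # assumed fleet-wide electricity use per year
PRICE_PER_KWH_USD = 0.05      # assumed industrial electricity price
SAVING_FRACTION = 0.007       # the 0.7% reduction reported

annual_bill = ANNUAL_CONSUMPTION_TWH * 1e9 * PRICE_PER_KWH_USD  # TWh -> kWh
annual_saving = annual_bill * SAVING_FRACTION

print(f"Assumed annual bill:  ${annual_bill / 1e9:.1f}B")
print(f"0.7% saving per year: ${annual_saving / 1e6:.1f}M")
```

Even with these rough inputs, the saving runs to millions of dollars per year, every year, before counting the extra compute capacity freed up.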

Beyond that, OpenAI again dominated the news as it announced plans for a shared AI operating system built around ChatGPT. OpenAI wants this new platform to become a central part of people’s digital lives, going far beyond a chatbot: the plan includes smart interfaces for future devices and platforms, allowing AI to work like a real operating system and become a key part of everyday life rather than just a tool. This was seen as significant enough to dent the share price of Alphabet (Google’s parent company).

It also looks like OpenAI has walked back its privatisation plans and has now announced its intention to become a public benefit corporation (PBC). This is being resisted by Microsoft as the two renegotiate their partnership, with OpenAI aiming to cut Microsoft’s revenue share while Microsoft wants long-term access to OpenAI’s technology.

OpenAI has also launched a new “safety evaluations hub” where it plans to regularly share the results of its internal AI safety tests, to increase transparency and rebuild public trust. This comes as OpenAI has recently faced criticism from AI ethicists and industry experts for prioritising new model releases and features over safety testing, after several key OpenAI employees quit over safety concerns. CEO Sam Altman was also accused of misleading OpenAI executives about model safety testing, shortly before he was briefly ousted in 2023.

There was news involving other companies this week too, with legal tech startup Harvey raising $250 million at a targeted $5 billion valuation. Harvey’s rapid growth reflects strong demand for AI tools in legal work, attracting major investors and intensifying competition in the AI legal tech space.

However, the biggest news about the “small fish” was in relation to the coding tool Lovable, which has exploded in popularity with a 17,600% increase in users over 12 weeks. This underlines how heavily software engineering now relies on AI, with these tools becoming fundamental infrastructure.
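For context, a 17,600% increase means the user base ended at 177 times its starting size. The implied compound weekly growth rate is easy to work out (the headline figures come from the report above; the arithmetic is mine):

```python
# A 17,600% increase over 12 weeks means the final user count is
# (1 + 176) = 177x the starting base.
growth_multiple = 1 + 17600 / 100   # 177x
weeks = 12

# Compound weekly rate r satisfies (1 + r) ** weeks == growth_multiple.
weekly_rate = growth_multiple ** (1 / weeks) - 1

print(f"Implied compound weekly growth: {weekly_rate:.0%}")
```

That works out to roughly 50% growth week on week, sustained for three months, which is the kind of curve that turns a niche tool into infrastructure.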

Legislation, policy and other news

It seems that Silicon Valley’s influence over the White House is resulting in a loosening of controls, and we have seen the likes of Elon Musk pushing for the removal of intellectual property rights in order to give LLMs more data to learn from. This was underscored by the sacking of the head of the US Copyright Office, just days after she reportedly “refused to rubber-stamp Elon Musk’s efforts to mine troves of copyrighted works to train AI models” in a report seen as critical of those efforts.

The issue of copyright also came into focus this week with the behaviour of US students, who have been using AI to write essays and coursework. To hide this, some have even been found to insert mistakes into and otherwise alter the AI-generated work, so it doesn’t look “too perfect”.

It’s not all doom and gloom though, with SoundCloud reversing a recent policy change. The company had quietly slipped new permissions into its terms, allowing uploaded content to be used to train AI, but after a public backlash those changes were rolled back in a victory for creators’ rights.

Research also highlighted two interesting findings this week. First, asking chatbots (like ChatGPT) for short answers to queries significantly increases the likelihood that they will hallucinate and deliver inaccurate information. Many tech companies train their apps to prioritise short, concise responses to reduce data usage and minimise costs, but it seems detailed explanations are often needed for models to counter misinformation or navigate complex, vague or misleading questions. Second, it’s estimated that 80% of AI models still require manual human labelling to function optimally, although I wonder how long it will be before that position reverses.

Finally, there were two bits of news about translation. The first was the release of new “instant” translation headphones using Spatial Speech Translation, able not only to translate multiple languages in real time but also to clone the emotional tone and voice of speakers in room-like acoustics. If that wasn’t impressive enough, the Chinese company Baidu has filed a patent for an AI system designed to translate animal sounds. We haven’t reached the Dr. Dolittle stage just yet, but Baidu joins a number of companies working on AI tools to understand animals, including bird songs and whale sounds.

Stay informed. Stay critical. And wherever possible—stay ahead.

Regards

Tom Carter