DW Weekly #118 - 3 July 2023 | Digital Watch Observatory

Dear all,

Generative AI is in the news again, with two lawsuits against OpenAI over alleged data theft and privacy violations. In other news: Companies are finding the idea of withdrawing from a country to be an increasingly enticing strategy to wield against governments and regulators, especially when it comes to AI regulation and content policy.

Let’s get started.
Stephanie and the Digital Watch team

// HIGHLIGHT //

OpenAI is sued for data theft and privacy violations: Here’s what we can expect

OpenAI and Microsoft have been sued in California in a major class-action lawsuit. The long-anticipated legal battle will address crucial privacy and copyright concerns surrounding the data used, and still being used, to train generative AI models. Hopefully, some clarity on how to apply existing laws to this latest technology is finally in sight.

OpenAI’s alleged violations. The lawsuit, launched by a California-based law firm, alleges that, through the practices it used to train its AI models, OpenAI:

Secretly scraped people’s data (the legal term is misappropriation)
Violated intellectual property rights of users
Violated the privacy of millions of users
Normalised the illegal scraping of data which is forever embedded in AI models
Gathered more data than users consented to
Posed (and poses) special privacy and security risks for children.

The last one is a particularly serious accusation, considering that the ‘defendants have been unjustly enriched by their theft of personal information as its billion-dollar AI business’, the lawsuit states.

Ring a bell? The case reminds us of two concluded cases:

Italy’s ChatGPT ban in March 2023, which the country lifted a few weeks later after OpenAI added information on how user data is collected and used and started allowing users to opt out of data processing that trains the ChatGPT model.
The ACLU vs Clearview AI case, which ended in a settlement last year, after the company agreed to stop selling access to its face database to businesses and people in the USA, and to any entity in Illinois, including state and local police, for five years.

There are also two similar ongoing lawsuits:

The copyright sections of the new case are similar to another case initiated last week in San Francisco against OpenAI by two US authors who say the company mined data copied from thousands of books, without permission.
It’s also similar to a class action suit initiated in January 2023 against Stability AI, DeviantArt, and Midjourney for their use of Stable Diffusion, a tool that was trained on copyrighted works of artists.

What is the lawsuit asking for, as a remedy? Obviously, financial compensation to users affected by the company’s violations, and more transparency on how personal data is collected and used. But also:

Digital dividends as compensation for people whose data was used to develop and train OpenAI’s models
Establishment of an independent body to approve products before they are launched publicly. Until then, a freeze on the commercial use of some of the company’s models.

The legal arguments. The main argument used by companies challenged by generative AI lawsuits is that the outputs are different from their source material, and are therefore unequivocally transformative. But the Andy Warhol Foundation for the Visual Arts v. Goldsmith ruling, in May 2023, clarified how transformative use is defined: If it’s meant to be used commercially, the fair use argument won’t stand.

The courts will also undoubtedly be looking at the data-scraping practices of OpenAI. Beyond this, there’s ChatGPT itself: If the software can’t function without underlying data, is it continuously infringing copyright laws? And where does that leave people who use ChatGPT as part of their work?

The ethical issues. One of the things that irks people is the permanence of models trained with personal data. If your data was used to train the model, that data has now become part of it and is, in turn, merged with additional data to train the model further. It’s a never-ending loop.

There’s also the unprecedented scale of it all. The entire internet has become one unending source of data for AI companies. In OpenAI’s case, one wonders whether any data at all was off-limits to the company.

If people seek solace in any of this, we don’t think any can be found. The fact that generative AI models are not infallible provides more worry than consolation. It’s also becoming increasingly difficult to tell whether a piece of content was created by humans or generated by an AI model – not to mention the increasing difficulties of discerning truth from fiction (and lies).

And yet, despite all the bad press for OpenAI (and other AI companies, for that matter), this doesn’t seem to be stopping anyone from exploring or using AI tools. Reversing data misuse is next to impossible; the closest thing is to forcibly improve company practices, as similar cases have already shown.

Digital policy roundup (26 June–3 July)

// AI GOVERNANCE //

Draft AI rules could lead us to pull out of Europe, say industry executives

More than 160 executives from companies, including Meta, Siemens, and Renault, jointly signed an open letter to EU lawmakers expressing their concerns regarding the proposed EU AI Act.

They think the new rules, as they currently stand, will have a negative impact on Europe’s competitiveness and technological independence due to the substantial compliance costs and significantly increased liability risks for companies. The executives also warn that the rules may lead innovative companies to relocate their operations and investors to withdraw their support for European AI development.

Why is it relevant? First, Member of the European Parliament and co-drafter Dragos Tudorache pushed back quite forcefully: ‘I am convinced that they have not carefully read the text but have rather reacted on the stimulus of a few who have a vested interest in this topic’. Second, the proposed rules are still under negotiation, so they can still be changed (and watered down). Third, it reminds us of OpenAI CEO Sam Altman’s recent comment (which he later retracted) on pulling out of the EU.

A meme with the words 'Is that a threat?' superimposed over a photo of a face.

// DATA //

EU policymakers reach agreement on Data Act

EU countries and lawmakers have reached a provisional agreement on the Data Act, proposed by the European Commission in 2022. The next step is formal endorsement by the Council and the European Parliament.

The act will give users access to data generated by connected devices and will address concerns about unauthorised data access and the protection of trade secrets.

Why is it relevant? One of the most important concepts is that the owners of connected devices will be able to monetise the generated data, which so far has been predominantly harvested by manufacturers and service providers.

// CONTENT POLICY //

Twitter implements limits on reading tweets

In what seems to be a reaction to last week’s lawsuit against OpenAI (and the violations the lawsuit is alleging), Twitter’s Elon Musk announced the company will limit the number of tweets a verified account can read to 6,000 posts per day. Unverified accounts will have a limit of 600 posts per day, while new accounts will be limited to 300 posts per day.

A few hours later, he increased the numbers to 10,000, 1,000, and 500 – a moderate increase, but an increase nonetheless. Companies (like Twitter) are also bothered by extensive web scraping.

Tweet from Elon Musk explains the limits on the number of tweets that can be read each day by users — *https://twitter.com/elonmusk/status/1675260424109928449*

US regulators to crack down on fake reviews

The US Federal Trade Commission (FTC) is proposing new rules that prohibit businesses from paying for reviews, manipulating honest reviews, and posting fake social media engagement. The announcement follows a period of public consultation, which ended in January.

Why is this relevant? The rules will be accompanied by civil penalties for violators. As the FTC confirmed, fines are a stronger deterrent.

Cambodian PM backtracks on country-wide Facebook ban

Cambodian Prime Minister Hun Sen briefly considered a country-wide ban on Facebook over the many abusive messages he was receiving from political opponents on the platform. He also announced a switch to the messaging app Telegram, citing its effectiveness and its usability in countries where Facebook is banned.

The announcement came right before Meta’s independent oversight board ordered the removal of a video where the Prime Minister threatened his political rivals, overturning the company’s original decision to keep the video online in line with its newsworthiness allowance policy. The board also recommended a six-month suspension of the premier’s Facebook and Instagram accounts.

Why is it relevant? It’s not so much the decisions by Meta or its oversight board that are so important, but rather the remarks made (on Telegram) by the prime minister in reaction to the board’s decision: ‘I have no intention to ban Facebook in Cambodia… I am not so stupid as to block the breath of all the people.’

Google follows Meta’s lead: Canadian news to be blocked

Google announced it will remove links to Canadian news content from its platform in response to new rules requiring companies to compensate local news publishers for linking to their content. This decision follows a similar move by Facebook owner Meta.

Canada’s Parliament passed the new law, known as the Online News Act or Bill C-18, last week.

Why is it relevant? We’ve already compared this development with what happened in Australia two years ago, when Google temporarily blocked news outlets from its search engine in reaction to the Australian government’s plans to enact the news media bargaining code. The difference, however, is that by the time the law was enacted in Australia, Google had already entered into private agreements with news agencies. So far, it looks like the situation in Canada will have a more pronounced impact on both consumers and the company’s operations in the country.

Bar graph shows the relative positions of the top companies in the world by market cap. Apple leads with USD 3 trillion, followed by Microsoft, Saudi Arabian Oil Co, Alphabet, Amazon (below USD 1.5 trillion), NVIDIA, Berkshire Hathaway (below USD 1 trillion), Meta, Tesla, and Taiwan Semiconductor Manufacturing (below USD 0.5 trillion). *The top 10 companies in the world by market cap, in millions of USD. Source: Adapted from a Reuters graph*

// MARKETS //

Apple has become the world’s first company to close a trading day with a market cap of USD3 trillion (EUR2.7 trillion); it had briefly crossed that mark during intraday trading in January 2022. No other firm, tech or otherwise, has reached this level. While this milestone may elate company investors, it also raises concerns about the immense power wielded by Big Tech, leaving some feeling uneasy.

The week ahead (3–10 July)

2–8 July: The IEEE International Conference on Quantum Software, taking place in Chicago, Illinois, USA, and online, will bring together researchers and practitioners from different areas of quantum (and classical) computing, software, and service engineering to discuss architectural styles, languages, and best practices.

3–4 July: The 18th World Telecommunication/ICT Indicators Symposium, in Geneva, Switzerland, and online, will highlight the need to improve how we measure data to achieve universal connectivity.

6–7 July: The annual AI for Good Global Summit returns to Geneva, Switzerland, and online this week. Over 100 speakers from governments, international organisations, academia, and the private sector are expected to discuss the opportunities and challenges of using AI responsibly.

10–19 July: The annual High-Level Political Forum (HLPF), taking place in New York, USA, will focus this year on accelerating the recovery from COVID-19 and fully implementing the 2030 Agenda for Sustainable Development.

10–12 July: The second IGF Open Consultations and Multistakeholder Advisory Group (MAG) meeting, in Geneva, will continue shaping the Internet Governance Forum meeting, to be held in Japan later this year.

#ReadingCorner

Microsoft offers Europe suggestions on AI regulation

We couldn’t help but notice the constructive tone in Microsoft President Brad Smith’s message to European lawmakers on AI rules:

‘From early on, we’ve been supportive of a regulatory regime in Europe that effectively addresses safety and upholds fundamental rights while continuing to enable innovations that will ensure that Europe remains globally competitive. Our intention is to offer constructive contributions to help inform the work ahead. … In this spirit, here we want to expand upon our five-point blueprint, highlight how it aligns with EU AI Act discussions, and provide some thoughts on the opportunities to build on this regulatory foundation.’ Read the full text.

A five-point blueprint for governing AI

1) Implement and build upon new government-led AI safety frameworks

2) Require effective safety brakes for AI systems that control critical infrastructure

3) Develop a broader legal and regulatory framework based on the technology architecture for AI

4) Promote transparency and ensure academic and public access to AI

5) Pursue new public-private partnerships to use AI as an effective tool to address the inevitable societal challenges that come with new technology

Source: Microsoft

Stephanie Borg Psaila – Author

Director of Digital Policy, DiploFoundation

Virginia Paque – Editor

Senior editor – Digital Policy, DiploFoundation