Data governance

Data governance in 2023: a recap

Data, AI’s main resource, came into clearer focus in 2023. Protecting data privacy and preserving the intellectual property rights of data that well-known AI platforms use to build their core models are the two main issues with using data for AI.

Microsoft, OpenAI, and Meta have all been the target of copyright lawsuits alleging that they trained their large language models (LLMs) on content that was protected by copyright.

The New York Times filed a lawsuit against Microsoft and OpenAI in late December 2023, claiming that the companies had improperly used millions of the newspaper’s articles to train their chatbots.

Courts and patent offices in several jurisdictions have ruled on the landmark DABUS case. The UK Supreme Court agreed with earlier decisions in the USA and at the European Patent Office that an AI system cannot be named as the inventor on a patent. South Africa, by contrast, granted a patent naming DABUS as the inventor in 2021.

These lawsuits point to the need for effective laws that protect copyrighted works. In June 2023, the UK adopted its code of practice on copyright and AI, while Japan committed in May 2023 to strengthening AI development while also ensuring copyright protection. South Korea stated in May 2023 that it aimed to create new guidelines and standards for copyright in AI-generated content by September 2023. However, South Korea’s Ministry of Culture stated in December 2023 that it would not grant copyright registration to AI-generated content.

One 2023 prediction that came true is that data was prominent on the development and trade agenda. For instance, in August 2023, India adopted the Digital Personal Data Protection Act (DPDPA), which, together with the EU General Data Protection Regulation (GDPR), ‘stand as paramount pillars in the global mission to ensure the security of personal data.’ China established the National Data Administration to coordinate and promote the construction of data infrastructure systems. China also released its draft guidelines for Regulations on Regulating and Promoting Cross-border Data Flows, aiming to ensure the free flow of data. The guidelines would apply only to data from activities such as international trade and marketing that does not contain personal information or important data.

The EU again played a major role in data governance policy. The European Commission adopted the adequacy decision for the EU-US Data Privacy Framework, which concluded that the US ensures a level of protection comparable to that of the EU for data transferred from the EU to the US. Later, in November 2023, the EU Council adopted the Data Act, which gives users and businesses access to data generated by their use of products or services through a reinforced portability right that covers copying or transferring data across different internet of things (IoT) products. The EU Data Act also aims to protect trade secrets and intellectual property rights.

AI and data governance

How does AI improve data management?

AI brings numerous advantages to data management, including improved data quality through automated processes that clean, standardise, and validate data. It can enhance data privacy and security by using techniques such as natural language processing (NLP) and machine learning (ML) to identify sensitive information and support compliance with data protection regulations. AI also assists in data classification and categorisation, making it easier to organise and retrieve data for better governance. Furthermore, AI can automate data management tasks such as data lineage tracking, data cataloguing, and data stewardship. This streamlines processes, increases efficiency, and promotes consistent data management practices across systems. AI also aids in data management risk assessment by analysing large datasets to detect patterns, anomalies, and potential risks. This allows organisations to proactively identify and mitigate risks related to data quality, unauthorised access, or breaches.
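As a rough illustration of the standardisation and sensitive-information steps described above, the following Python sketch normalises field values and flags fields that appear to contain personal data. It is a simplified stand-in, not a real NLP or ML pipeline: the regular expressions, the category labels, and the function names (`standardise`, `find_sensitive`, `classify_record`) are all assumptions made for this example, and a production system would likely rely on trained models and far more robust patterns.

```python
import re

# Hypothetical, deliberately simple patterns (assumptions for this sketch);
# real systems would typically use trained NLP/ML models instead.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def standardise(value: str) -> str:
    """Trim and collapse whitespace so equivalent values compare equal."""
    return " ".join(value.split())


def find_sensitive(text: str) -> dict[str, list[str]]:
    """Return the matches for each sensitive-data category found in text."""
    found = {}
    for label, pattern in SENSITIVE_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


def classify_record(record: dict[str, str]) -> dict[str, list[str]]:
    """Map each field name to the sensitive categories detected in it."""
    report = {}
    for field, value in record.items():
        categories = sorted(find_sensitive(standardise(value)))
        if categories:
            report[field] = categories
    return report


if __name__ == "__main__":
    record = {
        "note": "Contact  jane.doe@example.org or +44 20 7946 0958",
        "city": "  Geneva ",
    }
    print(classify_record(record))
```

A report like this could feed a data catalogue, so that fields flagged as sensitive receive stricter access and retention rules than the rest.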

What are the challenges that AI brings to data governance?

Data management improvements are not without dangers for data governance. While AI algorithms can identify and mitigate biases in data, they can also inadvertently introduce or amplify biases. Data security and privacy risks emerge as AI relies on large volumes of sensitive information, making organisations vulnerable to malicious attacks or unauthorised access. Regulatory compliance becomes more complex as AI processes huge amounts of data, requiring organisations to navigate and meet legal obligations related to data protection and privacy. Ethical implications and responsibilities in data-driven organisations are also increasingly important due to regulations like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). These regulations set minimum standards for data protection, and non-compliance can lead to severe penalties. However, removing incorrect or undesirable information from AI systems to comply with GDPR or similar regulatory frameworks is challenging: the complexity of machine learning models makes it difficult to identify and remove specific kinds of data distortions and errors. The rapid advancement of AI technology also makes it hard to keep data governance policies and regulations current. As AI evolves, new capabilities and risks emerge, requiring continuous monitoring and adaptation of data governance frameworks.

Data governance refers to the governance of data between states and the management of international data flows. It covers the whole life cycle of data, from collection, processing, and storage to use, security, and management.

Data governance: Moving away from a one-size-fits-all approach

As discussions on data governance mature, 2023 will see a departure from the one-size-fits-all approach towards conversations on how to regulate the different types of data, such as personal, corporate, public, and health data. In parallel, this will require a holistic approach that takes into account the standardisation, security, human rights, and legal perspectives. For governments worldwide, 2023 could be a landmark year in their search for ways to reconcile two aspects:

The need to ascertain sovereignty over critical and sensitive data that needs to be stored physically on national territories (registries, health data, etc.)

The fact that free flow of data across national and corporate borders facilitates economic development and contributes to the public good (e.g. environmental data)

Win-win solutions are of course ideal, but realistically, governments will have to make optimal trade-offs between the two.

Data governance has grown from a mainly privacy-related issue to a multifaceted one, with implications reaching the economy, law enforcement, cybersecurity, and even geopolitics. Data-driven business models are growing fast and are becoming critical in all sectors of the economy, from manufacturing to services. The processing of personal data (information relating to an identified or identifiable person) also enables scientific advancements in fields that range from healthcare to autonomous driving. Large datasets help lawmakers enact effective public policies and power digital government. In such a context, the international flows of data have become increasingly important for states and companies, and are playing a key role in various treaties, conventions, and trade agreements around the world. The regulation of these flows, therefore, has become an important matter for stakeholders on a global scale.

There are several reasons why a country might want to regulate its data flows: i) to safeguard the privacy of its citizens, as is the case for most data protection legislation; ii) to meet other regulatory objectives, such as access to information for auditing purposes; iii) for national and cybersecurity reasons; and iv) with the aim of developing domestic capacity in data-intensive sectors, as a form of digital industrial policy.

Data governance has four main facets: technology, economy, security, and law and human rights.

Technology

The technological aspect of data governance refers to the development of standards, apps, and services for data management. Not to be confused with the ‘data governance’ term used for an area of corporate and technical governance, the technological aspect of data governance is concerned with ensuring interoperability between systems and actors, widespread adoption of standards, and proper development of technologies that deal with data and related activities.

Standardisation bodies, telecommunication companies, and governments engage in multiple fora to discuss how to better face challenges such as the fragmentation of the data space, the lack of interoperability and portability, and the low quality of data.

National security

The national security side of data governance is concerned with how the data of a country’s citizens might be used both to protect national security and to threaten it.

In the first case, governments can use large sets of data as a means of fighting crime, terrorism, and money laundering. The use of data to fight pandemics has recently surfaced as a national security concern as well.

In the second case, openly available data (such as data from social networks) might be used by foreign entities in ways that pose a threat and/or undermine the national sovereignty and interests of a country. The use of personal data to influence elections, for instance, represents a significant threat many governments have expressed worries about.

Law and human rights

Increasingly often, information considered vital for criminal or civil investigations finds its way outside the jurisdiction of the investigative authorities, straining existing international co-operation mechanisms (commonly known as Mutual Legal Assistance Treaties or MLATs) that are not considered efficient and timely enough for the current pace and volume of transnational investigations. This slow pace and uncertainty are undesirable and often encourage policymakers to adopt data localisation measures as the only means of solving the issue. However, other efforts are being made to create new solutions to this problem. Some national instruments like the US CLOUD Act have been established to facilitate international co-operation. Such regulations raise concerns amongst civil rights groups, who argue that these mechanisms might undermine privacy and rights against unreasonable searches of private data and information, since the government could enter into data sharing agreements with foreign countries and affected users would not be notified of the issuing of these warrants.

Economy

Data is the core economic resource of the new economy. Governments have been increasingly aware of the impact of data on national economies and have sought to implement policies to foster data-based industries within their borders. Key strategic technologies, like artificial intelligence (AI) and autonomous driving, are largely dependent on large inputs of data to be further developed and successfully implemented.

Some governments have stimulated specific high-tech sectors of their economies, attempting to create local ‘Silicon Valleys’ that can compete globally on the open market.

Other states have attempted more direct and protectionist approaches. Some have embedded data localisation measures in legislation so that data is stored within national borders. This is the case of Russia’s data localisation law, under which foreign companies have been fined hundreds of thousands of dollars for failing to store their data within Russian borders. Some states have implemented other related policies that force companies to establish facilities and subsidiaries in the country. This is often the case in the EU, where the General Data Protection Regulation imposes such restrictions on the destination of transborder flows that companies have no choice but to establish a physical and legal presence in the EU. Others, like China, have actively banned foreign tech firms from operating within the country, on claims of protecting ‘Internet sovereignty’, and have prohibited the outward flow of their data as a means of protecting and ensuring a monopoly over their large volume of this key resource. The impact of choosing whether or not to restrict the free flow of data is still uncertain. Some scholars argue that data localisation and data protectionism are bad for a country’s economy as they increase the costs associated with this important resource, while others argue that, given long-term strategic goals, restricting the free flow of data might be worth the short-term economic disadvantages.

Such situations raise the question of whether legislation, policies, and institutional incentives for the localisation of data, restriction of outward flows, or the deployment of national data-intensive industries might be a new form of economic protectionism, while the calls for the free flow of data between jurisdictions might be a new form of trade liberalism, each motivated by state and corporate economic and/or political agendas. Digital protectionism, as it has been labelled, has become a significant concern for governments, international organisations, and companies alike. Whether by restricting outward flows or by promoting the free flow of data, actors such as the US and China have been using data governance as an important instrument to achieve geopolitical and geoeconomic goals.

Data governance in 2023

In 2023, data will be prominent on the development and trade agenda. India has put data and development high on the agenda of its G20 presidency this year. Data will also be a central theme in e-commerce plurilateral negotiations at the WTO. Most likely, Japan, as a promoter of the free flow of data, will also try to put data high on the agenda of the IGF, which will be hosted in Kyoto, Japan, in October 2023.

So what can we expect these discussions to focus on? There are at least four main policy questions that these forums, as well as other national and regional spaces, are expected to tackle.

(a) How to develop and apply adequate and appropriate regulation to specific types of data

We tend to group all kinds of data into one basket, but in reality, there are different kinds of data – from personal data to sector-specific and open data – all of which need a dedicated data governance approach.

In 2023, data governance will mature with the realisation that we need as many governance approaches as there are types of data. Thus, the way we govern personal data needs to be different from the way we tackle scientific, business, or communal data.

This realisation will be gradual and will be initiated in organisations and systems that manage specific data (e.g. World Health Organization for health data, World Meteorological Organization for weather and climate data, etc.).

(b) How to address data governance issues in multidisciplinary ways

In 2023, countries, companies, and international organisations will have to address the multidisciplinary nature of data governance in holistic ways. It will require organisational, procedural, and practical changes in order to address the following levels of data governance:

At the technical level, data needs standards in order to be interoperable. Here, the work of standardisation and technical bodies becomes essential.
At the security level, data is subject to many breaches. Data security is at the centre of activities of the tech industry and governments worldwide.
At the economic level, the internet business model is based on data. The role of tech companies which process user data, and the role of authorities to ensure users and their data are protected, come into play.
At the legal and human rights level, the main issue pertains to the protection of users’ rights, including the right to privacy and data protection as well as protection from mass surveillance. Since rules are anchored in geographical spaces, jurisdiction is often the main issue that arises in disputes. Courts have increasingly become de facto rule-makers. Civil society plays an important role in advocating for users’ rights.

(c) How to incorporate data governance into traditional policy fields and practices

As data becomes critical for all fields of global cooperation, international organisations will need to intensify the use of data in their activities in 2023, be it health, migration, or trade. Increasing their reliance on data will contribute to more evidence-based policymaking.

At the same time, the growing relevance of data will also make data governance more political. As has already been happening in the health sector, countries will need to negotiate what type of data they are willing to share with international organisations and how this data will be used.

(d) How data regulation will likely affect the use and development of AI

AI is built on data, which means that data governance and AI governance go hand in hand. Emerging AI applications such as ChatGPT will trigger more discussions on the deeply-rooted relationship between the two.