Microsoft has warned about a new jailbreaking technique called Skeleton Key, which can prompt AI models to disclose harmful information by bypassing their behavioural guidelines. Detailed in a report published on 26 June, Microsoft explained that Skeleton Key forces AI models to respond to illicit requests by modifying their behavioural guidelines to provide a warning rather than refusing the request outright. A technique like this, called Explicit: forced instruction-following, can lead models to produce harmful content.
The report highlighted an example where a model was manipulated to provide instructions for making a Molotov cocktail under the guise of an educational context. The prompt allowed the model to deliver the information with only a prefixed warning by instructing the model to update its behaviour. Microsoft tested the Skeleton Key technique between April and May 2024 on various AI models, including Meta LLama3-70b, Google Gemini Pro, and GPT 3.5 and 4.0, finding it effective but noting that attackers need legitimate access to the models.
Microsoft has addressed the issue in its Azure AI-managed models using prompt shields and has shared its findings with other AI providers. The company has also updated its AI offerings, including its Copilot AI assistants, to prevent guardrail bypassing. Furthermore, the latest disclosure underscores the growing problem of generative AI models being exploited for malicious purposes, following similar warnings from other researchers about vulnerabilities in AI models.
Why does it matter?
In April 2024, Anthropic researchers discovered a technique that could force AI models to provide instructions for constructing explosives. Earlier this year, researchers at Brown University found that translating malicious queries into low-resource languages could induce prohibited behaviour in OpenAI’s GPT-4. These findings highlight the ongoing challenges in ensuring the safe and responsible use of advanced AI models.
New Zealand has made a significant shift in its approach to combating terrorist and violent extremist content (TVEC) online, transitioning the Christchurch Call to Action into a non-governmental organisation. Launched in response to the 2019 Christchurch mosque attacks, where the perpetrator live-streamed the violence on social media, the Call initially united governments, tech companies, and civil society to pledge 25 commitments aimed at curbing such content. In a strategic move, New Zealand has relinquished direct funding, now relying on contributions from tech giants like Meta and Microsoft to sustain its operations.
The decision reflects a broader strategy to preserve the Call’s multistakeholder model, which is essential for navigating complex global internet challenges without governmental dominance. That model mirrors successful precedents like the Internet Engineering Task Force and ICANN, which are pivotal to today’s internet infrastructure. By fostering consensus among diverse stakeholders, the Call aims to uphold free expression while effectively addressing the spread of TVEC online.
Former New Zealand Prime Minister Jacinda Ardern, now leading the Call as its Patron, faces the challenge of enhancing its legitimacy and impact. With new funding avenues secured, efforts will focus on expanding stakeholder participation, raising awareness, and holding parties accountable to their commitments. The initiative must also adapt to emerging threats, such as extremists’ misuse of generative AI tools, ensuring its relevance and effectiveness in combating evolving forms of online extremism.
Russian disinformation campaigns are targeting social media to destabilise France’s political scene during its legislative campaign, according to a study by the French National Centre for Scientific Research (CNRS). The study highlights Kremlin strategies such as normalising far-right ideologies and weakening the ‘Republican front’ that opposes the far-right Rassemblement National (RN).
Researchers noted that Russia’s influence tactics, including astroturfing and meme wars, have been used previously during the 2016 US presidential elections and the 2022 French presidential elections to support RN figurehead Marine Le Pen. The Kremlin’s current efforts aim to exploit ongoing global conflicts, such as the Israeli-Palestinian conflict, to influence French political dynamics.
Despite these findings, the actual impact of these disinformation campaigns remains uncertain. Some experts argue that while such interference may sway voter behaviour or amplify tensions, the overall effect is limited. The CNRS study focused on activity on X (formerly Twitter) and acknowledged that further research is needed to understand the broader implications of these digital disruptions.
The first half of 2024 saw a significant surge in cryptocurrency thefts, with over $1.38 billion stolen by 24 June, compared to $657 million during the same period in 2023, according to blockchain researchers TRM Labs. The increase in stolen crypto, driven by a few large-scale attacks and rising crypto prices, highlights the growing motivation among cybercriminals. Ari Redbord, global head of policy at TRM Labs, noted that while the security of the crypto ecosystem hasn’t fundamentally changed, the higher value of various tokens has made crypto services more attractive targets.
One of the year’s largest thefts involved $308 million worth of bitcoin stolen from Japanese exchange DMM Bitcoin. Large-scale losses remain relatively rare, although cryptocurrency companies face hacks and cyberattacks frequently. The theft increase comes as crypto prices rebound from the lows following the 2022 collapse of FTX, with bitcoin reaching an all-time high of $73,803.25 in March.
An international coalition of law enforcement agencies has dismantled hundreds of illegal installations of Cobalt Strike, a penetration testing tool frequently abused by state-sponsored and criminal hackers in ransomware attacks. The operation, coordinated by Britain’s National Crime Agency (NCA), targeted 690 IP addresses hosting illegal versions of the software across 27 countries.
Cobalt Strike, now owned by Fortra, was developed in 2012 to simulate hacker attacks on networks. However, its effectiveness has led to widespread abuse by malicious actors using pirated versions. The crackdown is part of broader efforts to combat ransomware gangs by disrupting critical points in their operations, similar to the recent seizure of bulletproof hosting provider LolekHosted.
In addition to legitimate uses, Cobalt Strike has been exploited by hackers linked to Russia, China, and North Korea. The NCA highlighted that pirated versions of the software, available on illegal marketplaces and the dark web since the mid-2010s, have become a preferred tool for network intrusions and rapid ransomware deployment.
Typically, unlicensed versions of Cobalt Strike are used in spear phishing campaigns to install beacons on target devices, allowing attackers to profile and remotely access networks. Its multifunctional nature, including command and control management, makes it a ‘Swiss army knife’ for cybercriminals and nation-state actors, according to Don Smith, VP of threat research at Secureworks Counter Threats Unit.
Europol confirmed Fortra’s significant efforts to prevent software abuse and its partnership throughout the investigation. Nevertheless, older versions of Cobalt Strike have been cracked and used by criminals, linking the tool to numerous malware and ransomware cases, including those involving RYUK, Trickbot, and Conti.
The European Commission has opened the application process to fund cybersecurity and digital skills initiatives, exceeding a €210m ($227.3m) investment under the Digital Europe Programme (DEP). Established in 2021, the DEP aims to contribute to the digital transformation of the EU’s society and economy, with a planned total budget of €7.5bn over seven years. It funds critical strategic areas such as supercomputing, AI, cybersecurity, and advanced digital skills to advance this vision.
In the latest funding cycle, the European Commission will allocate €35m ($37.8m) towards projects safeguarding large industrial installations and critical infrastructures. An additional €35m will be designated for implementing cutting-edge cybersecurity technologies and tools.
Furthermore, €12.8m ($13.8m) will be invested in establishing, reinforcing, and expanding national and cross-border security operation centres (SOCs). The initiative aligns with the proposed EU Cyber Solidarity Act, which aims to establish a European Cybersecurity Alert System to enhance the detection, analysis, and response to cyber threats. The envisioned system will consist of cross-border SOCs using advanced technologies like AI to share threat intelligence with authorities across the EU swiftly.
Moreover, the DEP will allocate €20m to assist member states in complying with the EU cybersecurity laws and national cybersecurity strategies. That includes the updated NIS2 Directive, which mandates strengthening cybersecurity measures in critical sectors and requires it to be transposed into national legislation by October 2024.
Finally, the latest DEP funding round will also allocate €55m ($59.5m) towards advanced digital skills, supporting the design and delivery of higher education programs in key digital technology domains. Additionally, €8m ($8.6m) will be directed towards European Digital Media Observatories (EDMOs) to finance independent regional hubs focused on analysing and combating disinformation in digital media.
Researchers at cybersecurity firm EVA Information Security have uncovered three major vulnerabilities in CocoaPods, a widely used tool that simplifies the process of updating apps on iOS and macOS devices. These vulnerabilities, which went unnoticed for nearly a decade, posed significant risks as they could have allowed attackers to inject malware into apps utilizing CocoaPods. Given that CocoaPods is commonly used to integrate pre-written code into iOS and macOS apps, the vulnerabilities could have enabled attackers to modify app architectures with malicious code.
The vulnerabilities stem from a migration process in May 2014, which left thousands of CocoaPods packages ‘orphaned’ and potentially vulnerable. According to EVA researchers, CocoaPods is extensively used by iOS developers, including major companies like Google, GitHub, Amazon, Dropbox, and others, making the impact widespread across various projects and dependencies.
One of the most critical vulnerabilities, identified as CVE-2024-38368, could have been exploited by malicious actors to inject malware into apps using compromised packages, effectively bypassing security measures and compromising user data.
EVA responsibly disclosed these vulnerabilities to CocoaPods, which promptly patched them in October 2023 before publicly disclosing the findings. As of now, there are no known instances of these vulnerabilities being exploited by malicious actors. The proactive response from CocoaPods mitigated potential risks to app developers and users relying on the platform for their software development needs.
The largest compilation of nearly ten billion unique passwords, titled RockYou2024, was leaked on a popular hacking forum, posing significant risks for users prone to reusing passwords. Discovered by Cybernews researchers, the file contains 9,948,575,739 plaintext passwords and was posted by a user named ObamaCare. The leak is believed to combine data from various old and new breaches, dramatically increasing the threat of credential-stuffing attacks.
Credential stuffing attacks exploit leaked passwords to gain unauthorised access to accounts, affecting users and businesses. The RockYou2024 leak significantly heightens this risk, as previous attacks on companies like Santander and Ticketmaster demonstrated. Cybernews highlighted the need for robust security measures, such as resetting compromised passwords, using strong, unique passwords, and enabling multi-factor authentication (MFA).
The RockYou2024 leak follows the 2021 release of a similar but smaller compilation, RockYou2021, which contained 8.4 billion passwords. The new dataset has grown by 15 percent, incorporating an additional 1.5 billion passwords. The compilation is believed to include information from over 4,000 databases collected over more than two decades, making it a potent tool for cybercriminals.
To protect against potential breaches, Cybernews advises users to reset exposed passwords, use MFA, and utilise password managers. The company will also integrate RockYou2024 data into its Leaked Password Checker, allowing individuals to verify if their credentials have been compromised. The leak follows another significant breach, the Mother of All Breaches (MOAB), which involved 12 terabytes of data and 26 billion records earlier this year.
A hacker infiltrated OpenAI’s internal messaging systems last year, stealing details about the design of its AI technologies, according to Reuters’ sources familiar with the matter. The breach involved discussions on an online forum where employees exchanged information about the latest AI developments. Crucially, the hacker needed access to the systems where OpenAI builds and houses its AI.
OpenAI, backed by Microsoft, did not publicly disclose the breach, as no customer or partner information was compromised. Executives briefed employees and the board but did not involve federal law enforcement, believing the hacker had no ties to foreign governments.
In a separate incident, OpenAI reported disrupting five covert operations that aimed to misuse its AI models for deceptive activities online. The issue raised safety concerns and prompted discussions about safeguarding advanced AI technology. The Biden administration plans to implement measures to protect US AI advancements from foreign adversaries. At the same time, 16 AI companies have pledged to develop the technology responsibly amid rapid innovation and emerging risks.
OpenAI’s ChatGPT macOS app was found to be storing user chats in plain text until recently, raising security concerns. The Verge reported that the AI firm has now released an update to encrypt conversations on macOS. The discovery was made by software developer Pedro Vieito, who noted that OpenAI was distributing the app exclusively through their website and bypassing Apple’s sandbox protections.
Sandboxing, which isolates an app and its data from the rest of the system, is optional on macOS, but is commonly used by chat applications to protect sensitive information. By not adhering to this security measure, the ChatGPT app exposed user chats to potential threats. Vieito highlighted the vulnerability on social media, showing how easily another app could access the unprotected data.
OpenAI acknowledged the issue and emphasised that users could opt out of having their chats used to train the AI models. The ChatGPT app, which was made available to macOS users on June 25, now includes encryption to enhance user privacy and security.