
Security experts are sounding the alarm that data momentarily exposed to the internet can persist in online AI chatbots, such as Microsoft Copilot, even after it has been made private.

According to recent discoveries by Lasso, an Israeli cybersecurity firm specializing in emerging generative AI threats, thousands of formerly public GitHub repositories from major corporations, including Microsoft, are impacted.

Lasso co-founder Ophir Dror revealed to TechCrunch that the company’s own GitHub repository, which had been briefly made public by mistake, was found in Copilot due to being indexed and cached by Microsoft’s Bing search engine. Despite the repository being set back to private, with a “page not found” error displayed on GitHub, its content was still accessible through Copilot.

Dror expressed surprise at finding one of Lasso’s private repositories on Copilot, stating, “If I were to browse the web, I wouldn’t see this data. However, anyone in the world could ask Copilot the right question and obtain this data.”

Following this discovery, Lasso investigated how much other data that had been exposed on GitHub, even briefly, could still be retrieved through tools like Copilot.

The company extracted a list of repositories that were public at any point in 2024 and identified those that had since been deleted or set to private. Using Bing’s caching mechanism, Lasso found that more than 20,000 formerly public GitHub repositories still had data accessible through Copilot, affecting more than 16,000 organizations.
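The filtering step described above can be illustrated with a minimal sketch. This is not Lasso's actual tooling, which the report does not describe; the function name, the repository names, and the hard-coded status table are all hypothetical. The sketch assumes only that a repository which has been deleted or set to private answers an unauthenticated request with a "page not found" (HTTP 404) response, as the article notes for Lasso's own repository.

```python
from typing import Callable, Iterable

NOT_FOUND = 404  # what an anonymous visitor sees for a deleted or private repository


def no_longer_public(repos: Iterable[str], fetch_status: Callable[[str], int]) -> list[str]:
    """Return the repositories from a formerly public list that no longer resolve publicly.

    `fetch_status` is a hypothetical callable that returns the HTTP status an
    unauthenticated request for the repository page would receive.
    """
    return sorted(repo for repo in repos if fetch_status(repo) == NOT_FOUND)


# Hypothetical sample data standing in for real HTTP lookups.
sample_status = {
    "example-org/public-tool": 200,
    "example-org/leaked-config": 404,
    "example-org/old-demo": 404,
}

# Iterating the dict yields the repository names; the lookup supplies the status.
flagged = no_longer_public(sample_status, sample_status.__getitem__)
print(flagged)
```

Any repository flagged this way would then be a candidate for checking whether a cached copy is still retrievable elsewhere; that second step depends on the cache in question and is not sketched here.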

Among the affected organizations are Amazon Web Services, Google, IBM, PayPal, Tencent, and Microsoft itself, according to Lasso. For some affected companies, Copilot could be prompted to return confidential GitHub archives containing intellectual property, sensitive corporate data, access keys, and tokens.

Lasso also noted that it used Copilot to retrieve the contents of a since-deleted GitHub repository that hosted a tool for creating “offensive and harmful” AI images using Microsoft’s cloud AI service.

Dror stated that Lasso notified all companies that were “severely affected” by the data exposure and advised them to rotate or revoke any compromised keys.

None of the affected companies named by Lasso responded to TechCrunch’s inquiries, and Microsoft also declined to comment on the matter.

Lasso informed Microsoft of its findings in November 2024, and Microsoft classified the issue as “low severity,” stating that the caching behavior was “acceptable.” Microsoft subsequently removed links to Bing’s cache from its search results starting in December 2024.

However, Lasso argues that although the caching feature was disabled, Copilot could still access the data even though it was no longer visible through traditional web searches, which suggests the fix was only temporary.

