We have been sharing with you the information revealed in the Microsoft documents leaked in the early hours of Tuesday, April 19, covering the future of Xbox consoles, upcoming games and much more. But apparently there was even more to the story.
The data leaked from Microsoft amounts to roughly 306 Starfields: 38 TB!
It turns out that Microsoft researchers accidentally exposed 38 TB of confidential information on the company’s GitHub page, where anyone could see it. Among the trove of data was a backup of two former employees’ workstations containing keys, passwords, secrets and more than 30,000 private Teams messages.
According to cloud security firm Wiz, the leak surfaced in Microsoft’s AI GitHub repository, where it was mistakenly included alongside a bundle of open-source training data. Because visitors are actively encouraged to download that data, the exposed files could fall into the wrong hands again and again.
Data breaches can happen anywhere, but it’s especially embarrassing for Microsoft when one comes from its own AI researchers. Microsoft shared the data using Shared Access Signature (SAS) tokens, an Azure feature that allows users to share data from their Azure Storage accounts, Wiz reported.
Visitors to the repository were instructed to download training data from a provided URL. But the web address provided access to much more than the intended training data, allowing users to browse files and folders that were not intended to be publicly available.
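An Azure SAS URL carries the token’s permissions and expiry date right in its query string, so anyone holding the link can see exactly what it grants. The sketch below parses a hypothetical URL of that shape; the account, container, expiry and signature are invented for illustration, not the actual leaked token:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical SAS URL of the kind shared in a README; all values are made up.
sas_url = (
    "https://exampleaccount.blob.core.windows.net/robust-models"
    "?sv=2021-06-08&ss=b&srt=sco&sp=rwdlac&se=2050-01-01T00:00:00Z&sig=REDACTED"
)

query = parse_qs(urlparse(sas_url).query)

# 'sp' lists the signed permissions: r=read, w=write, d=delete, l=list, a=add, c=create.
print("permissions:", query["sp"][0])   # -> rwdlac (far more than read-only)
# 'se' is the expiry time baked into the token.
print("expires:", query["se"][0])       # -> a date decades away
# 'srt' (account SAS only) shows the resource scopes covered: s=service, c=container, o=object.
print("scope:", query["srt"][0])
```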

To make matters worse, the access token that allowed all of this was misconfigured to provide full control permissions instead of the more restrictive read-only permissions, Wiz reported. In practice, this meant that anyone visiting the URL could not only view but also delete and overwrite the files they found.
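For comparison, a narrowly scoped token would be read-only and short-lived. Below is a rough sketch of how such a token could be generated with the azure-storage-blob Python SDK; the account name, key and container are placeholders, and this illustrates the principle rather than Microsoft’s actual setup:

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_container_sas, ContainerSasPermissions

# Placeholder credentials; in real use these come from your Azure Storage account.
ACCOUNT_NAME = "exampleaccount"
ACCOUNT_KEY = "<storage-account-key>"
CONTAINER = "training-data"

# Read + list only, valid for seven days -- instead of full control for decades.
sas_token = generate_container_sas(
    account_name=ACCOUNT_NAME,
    container_name=CONTAINER,
    account_key=ACCOUNT_KEY,
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(days=7),
)

download_url = f"https://{ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER}?{sas_token}"
print(download_url)
```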
Wiz warns that this could have had dire consequences. Since the repository was full of AI training data, the intention was for users to download it and feed it into a script to train their own AI models.
But because it was open to manipulation thanks to misconfigured permissions, “an attacker could inject malicious code into any AI models in this storage account, and any user who trusts Microsoft’s GitHub repository would be affected,” Wiz explains.
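One practical defence for anyone consuming such repositories: many model checkpoints are pickled Python objects that run code the moment they are loaded, so it is worth verifying a downloaded file against a checksum published through a separate, trusted channel before feeding it to a training script. A minimal sketch; the file name and expected hash below are placeholders:

```python
import hashlib
from pathlib import Path

def sha256sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder values: a downloaded checkpoint and the hash its publisher lists.
model_file = Path("downloaded_model.ckpt")
expected = "0000000000000000000000000000000000000000000000000000000000000000"

if sha256sum(model_file) != expected:
    raise SystemExit("Checksum mismatch: do not load this file.")
print("Checksum OK; file matches the published hash.")
```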
What do you think? Don’t forget to share your thoughts with us in the comments.