The breach could be one of the largest in history, cybersecurity experts say, highlighting the risks of collecting and storing vast amounts of sensitive personal data online, especially in a country where authorities have broad and unchecked access to such data.
The vast trove of Chinese personal data has been publicly available through what appeared to be an unsecured backdoor — a direct web address that offers unrestricted access to anyone who knows about it — since at least April 2021, according to LeakIX, a site that discovers and indexes open databases online.
Access to the database, which did not require a password, was suspended after an anonymous user advertised more than 23 terabytes (TB) of data for sale for 10 bitcoins — roughly $200,000 — in a hacker forum post last Thursday.
The user claims the database was collected by the Shanghai police and contains sensitive information on one billion Chinese citizens, including their names, addresses, mobile numbers, national identification numbers, ages and places of birth, as well as billions of records of phone calls made to the police to report civil disputes and crimes.
The seller's post included a sample of 750,000 data records drawn from the database's three main indexes. CNN verified the authenticity of more than two dozen records from the sample provided by the seller, but was unable to access the original database.
The Shanghai government and police did not respond to CNN’s repeated written requests for comment.
The seller also claimed that the unsecured database was hosted by Alibaba Cloud, a subsidiary of Chinese e-commerce giant Alibaba. In a statement to CNN, Alibaba said it was aware of the incident and was investigating.
But experts CNN spoke to said the owner of the data is to blame, not the company that hosts it.
“As it stands, I believe this will be the largest public leak to date, certainly in terms of the scope of the impact in China; we’re talking about the majority of the population here,” said Troy Hunt, a Microsoft regional director based in Australia.
China is home to 1.4 billion people, meaning the breach could potentially affect more than 70% of the population.
“This is a bit of a case where the genie will not be able to go back into the bottle. Once the data is out there in the form it appears to be in now, there’s no going back,” Hunt said.
It is not clear how many people have accessed or downloaded the database in the 14 or more months it has been publicly available online. Two Western cybersecurity experts who spoke to CNN were aware of the database’s existence before it came into the spotlight last week, suggesting it could be easily discovered by people who know where to look.
Vinny Troya, a cybersecurity researcher and founder of dark web intelligence firm Shadowbyte, said he first discovered the database “around January” while searching for open databases online.
“The site I found it on is public, anyone (can) access it, all you have to do is sign up for an account,” Troya said. “Since it opened in April 2021, any number of people could have downloaded the data,” he added.
Troya said he downloaded one of the main indexes of the database, which appears to contain information on nearly 970 million Chinese citizens.
Troya said it’s hard to say for sure whether open access was an oversight on the part of the database’s owners or a deliberate shortcut meant to be shared among a small number of people.
“Either they forgot about it or they deliberately left it open because it’s easier to access,” he said, referring to the authorities responsible for the database. “I don’t know why they would do that. It sounds very careless.”
Unprotected personal data, exposed through leaks, breaches or some form of incompetence, is an increasingly common problem facing companies and governments around the world, and cybersecurity experts say it’s not unusual to find databases that are left open for public access.
But the latest data leak is particularly worrisome, cybersecurity researchers say, not only because of its potentially unprecedented volume, but also because of the sensitive nature of the information it contains.
A CNN analysis of the database sample found police records of cases spanning nearly two decades, from 2001 to 2019. While the majority of the records are civil disputes, there are also records of criminal cases ranging from fraud to rape.
In one case, a Shanghai resident was cited by police in 2018 for using a virtual private network (VPN) to bypass China’s firewall and access Twitter, allegedly retweeting “reactionary remarks involving the (Communist) Party, politics and leaders.”
In another record, a mother called police in 2010, accusing her father-in-law of raping her 3-year-old daughter.
“There could be domestic violence, child abuse, all kinds of things in there, that’s much more concerning to me,” said Hunt, a regional director for Microsoft.
“Could this lead to extortion? We often see people being blackmailed after a data breach, examples where hackers may even try to ransom individuals.”
Bob Diachenko, a security researcher based in Ukraine, first came across the database in April. In mid-June, his company discovered that the database had been attacked by an unknown malicious actor who copied and wiped the data and left a ransom note demanding 10 bitcoins for its recovery, Diachenko said.
It is not clear if this is the work of the same person who announced the sale of the database information last week.
By July 1, the ransom note was gone, according to Diachenko, but only 7 gigabytes (GB) of data were available, instead of the 23 TB originally advertised.
Diachenko said he assumed the ransom demand had been resolved, but the database owners continued to use the exposed database for storage until it was shut down over the weekend.
“Maybe there was some junior developer who noticed it and tried to remove the note before senior management noticed,” he said.
Shanghai police did not respond to CNN’s request for comment on the ransom note.