On Jan. 29, U.S.-based Wiz Research announced it responsibly disclosed a DeepSeek database previously open to the public, exposing chat logs and other sensitive information. DeepSeek locked down the database, but the discovery highlights possible risks with generative AI models, particularly international projects.
DeepSeek shook up the tech industry over the last week as the Chinese company’s AI models rivaled American generative AI leaders. In particular, DeepSeek’s R1 competes with OpenAI o1 on some benchmarks.
How did Wiz Research discover DeepSeek’s public database?
In a blog post disclosing Wiz Research’s work, cloud security researcher Gal Nagli detailed how the team found a publicly accessible ClickHouse database belonging to DeepSeek. The exposure opened potential paths for control of the database and privilege escalation attacks. Inside the database, Wiz Research could read chat history, backend data, log streams, API secrets, and operational details.
The team found the ClickHouse database “within minutes” as they assessed DeepSeek’s potential vulnerabilities.
“We were shocked, and also felt a great sense of urgency to act fast, given the magnitude of the discovery,” Nagli said in an email to TechRepublic.
They first assessed DeepSeek’s internet-facing subdomains, and two open ports struck them as unusual; those ports led to DeepSeek’s database, hosted on ClickHouse, the open-source database management system. By browsing the tables in ClickHouse, Wiz Research found chat history, API keys, operational metadata, and more.
The Wiz Research team noted they did not “execute intrusive queries” during the exploration process, per ethical research practices.
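The workflow the researchers describe (checking a host for unexpected open ports, then issuing only non-intrusive, read-only queries against ClickHouse’s HTTP interface) can be sketched roughly as follows. This is an illustrative assumption, not Wiz’s actual tooling: the host, ports, and query are placeholders, though port 8123 is ClickHouse’s documented default HTTP port.

```python
# Hypothetical sketch of the kind of assessment described above.
# Host and ports are placeholders, not DeepSeek's real endpoints.
import socket
from urllib.parse import urlencode


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def clickhouse_query_url(host: str, port: int, query: str) -> str:
    """Build a URL for ClickHouse's HTTP interface (default port 8123)."""
    return f"http://{host}:{port}/?{urlencode({'query': query})}"


# A metadata-only query such as SHOW TABLES reads nothing from the data
# itself -- consistent with the non-intrusive approach the team describes.
url = clickhouse_query_url("example.invalid", 8123, "SHOW TABLES")
print(url)
```

The point of the sketch is how little tooling such a discovery requires: a TCP connect check and a single HTTP GET are enough to reveal whether a ClickHouse instance is answering unauthenticated on the public internet.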
What does the publicly available database mean for DeepSeek’s AI?
Wiz Research informed DeepSeek of the exposure, and the AI company locked down the database; therefore, DeepSeek AI products should not be affected.
However, the possibility that the database could have remained open to attackers highlights the complexity of securing generative AI products.
“While much of the attention around AI security is focused on futuristic threats, the real dangers often come from basic risks—like accidental external exposure of databases,” Nagli wrote in a blog post.
IT professionals should be aware of the dangers of adopting new and untested products too quickly, especially generative AI; give researchers time to find bugs and flaws in the systems. If possible, build cautious adoption timelines into company generative AI use policies.
“As organizations rush to adopt AI tools and services from a growing number of startups and providers, it’s essential to remember that by doing so, we’re entrusting these companies with sensitive data,” Nagli said.
Depending on your location, IT team members may need to consider regulations or security concerns that apply to generative AI models originating in China.
“For example, certain facts in China’s history or past are not presented by the models transparently or fully,” noted Unmesh Kulkarni, head of gen AI at data science firm Tredence, in an email to TechRepublic. “The data privacy implications of calling the hosted model are also unclear and most global companies would not be willing to do that. However, one should remember that DeepSeek models are open-source and can be deployed locally within a company’s private cloud or network environment. This would address the data privacy issues or leakage concerns.”
Nagli also recommended self-hosted models when TechRepublic reached him by email.
“Implementing strict access controls, data encryption, and network segmentation can further mitigate risks,” he wrote. “Organizations should ensure they have visibility and governance of the entire AI stack so they can analyze all risks, including usage of malicious models, exposure of training data, sensitive data in training, vulnerabilities in AI SDKs, exposure of AI services, and other toxic risk combinations that may be exploited by attackers.”
Megan Crouse