On Jan. 29, U.S.-based Wiz Research announced it responsibly disclosed a DeepSeek database previously open to the public, exposing chat logs and other sensitive information. DeepSeek locked down the database, but the discovery highlights possible risks with generative AI models, particularly international projects.
DeepSeek shook up the tech industry over the last week as the Chinese company’s AI models rivaled American generative AI leaders. In particular, DeepSeek’s R1 competes with OpenAI o1 on some benchmarks.
How did Wiz Research discover DeepSeek’s public database?
In a blog post disclosing Wiz Research’s work, cloud security researcher Gal Nagli detailed how the team found a publicly accessible ClickHouse database belonging to DeepSeek. The exposure opened potential paths for control of the database and privilege escalation attacks. Inside the database, Wiz Research could read chat history, backend data, log streams, API secrets, and operational details.
The team found the ClickHouse database “within minutes” as they assessed DeepSeek’s potential vulnerabilities.
“We were shocked, and also felt a great sense of urgency to act fast, given the magnitude of the discovery,” Nagli said in an email to TechRepublic.
They first assessed DeepSeek’s internet-facing subdomains, and two open ports struck them as unusual; those ports led to DeepSeek’s database, hosted on ClickHouse, the open-source database management system. By browsing the tables in ClickHouse, Wiz Research found chat history, API keys, operational metadata, and more.
The Wiz Research team noted they did not “execute intrusive queries” during the exploration process, per ethical research practices.
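The workflow the researchers describe (checking a host for unexpected open ports, then issuing only non-intrusive, read-only queries against ClickHouse’s HTTP interface) can be sketched roughly as follows. This is an illustrative assumption, not Wiz’s actual tooling: the host, ports, and query are placeholders, though port 8123 is ClickHouse’s documented default HTTP port.

```python
# Hypothetical sketch of the kind of assessment described above.
# Host and ports are placeholders, not DeepSeek's real endpoints.
import socket
from urllib.parse import urlencode


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def clickhouse_query_url(host: str, port: int, query: str) -> str:
    """Build a URL for ClickHouse's HTTP interface (default port 8123)."""
    return f"http://{host}:{port}/?{urlencode({'query': query})}"


# A metadata-only query such as SHOW TABLES reads nothing from the data
# itself -- consistent with the non-intrusive approach the team describes.
url = clickhouse_query_url("example.invalid", 8123, "SHOW TABLES")
print(url)
```

The point of the sketch is how little tooling such a discovery requires: a TCP connect check and a single HTTP GET are enough to reveal whether a ClickHouse instance is answering unauthenticated on the public internet.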
What does the publicly available database mean for DeepSeek’s AI?
Wiz Research informed DeepSeek of the exposure, and the AI company locked down the database; therefore, DeepSeek AI products should not be affected.
However, the possibility that the database could have remained open to attackers highlights the complexity of securing generative AI products.
“While much of the attention around AI security is focused on futuristic threats, the real dangers often come from basic risks—like accidental external exposure of databases,” Nagli wrote in a blog post.
IT professionals should be aware of the dangers of adopting new and untested products too quickly, especially generative AI; give researchers time to find bugs and flaws in the systems. If possible, build cautious adoption timelines into company generative AI use policies.
“As organizations rush to adopt AI tools and services from a growing number of startups and providers, it’s essential to remember that by doing so, we’re entrusting these companies with sensitive data,” Nagli said.
Depending on your location, IT team members may need to consider regulations or security concerns that apply to generative AI models originating in China.
“For example, certain facts in China’s history or past are not presented by the models transparently or fully,” noted Unmesh Kulkarni, head of gen AI at data science firm Tredence, in an email to TechRepublic. “The data privacy implications of calling the hosted model are also unclear and most global companies would not be willing to do that. However, one should remember that DeepSeek models are open-source and can be deployed locally within a company’s private cloud or network environment. This would address the data privacy issues or leakage concerns.”
Nagli also recommended self-hosted models when TechRepublic reached him by email.
“Implementing strict access controls, data encryption, and network segmentation can further mitigate risks,” he wrote. “Organizations should ensure they have visibility and governance of the entire AI stack so they can analyze all risks, including usage of malicious models, exposure of training data, sensitive data in training, vulnerabilities in AI SDKs, exposure of AI services, and other toxic risk combinations that may be exploited by attackers.”
Megan Crouse