Data protection experts strongly advise against entering patient data into ChatGPT and similar large language model (LLM) chatbots. So how can artificial intelligence be used safely in healthcare? Which generative AI tools are the most secure, and which should be avoided?
According to a study published in BMJ Health and Care Informatics, one in five general practitioners (GPs) uses AI to draft clinical letters. Meanwhile, a Fierce Healthcare survey of 107 GPs found that 76% use LLMs in clinical decision-making: 60% use them to check drug interactions, 40% for treatment planning, and 70% for patient education. Actual use of ChatGPT in daily medical practice may be even higher, as some doctors may be reluctant to disclose it.
ChatGPT is gaining traction
AI tools like ChatGPT, Perplexity, Copilot, and Gemini have become increasingly popular, especially among younger healthcare professionals who are more open to innovation. Many physicians, however, remain skeptical because of concerns about privacy and the risk of misinformation, and many healthcare workers are simply unsure how to begin using AI.
But one doesn't need to use LLMs regularly to violate data security rules in healthcare. All it takes is one prompt. Few people realize that every conversation with a chatbot is stored in the cloud – often outside the European Union. This raises serious concerns about potential GDPR violations and the risk of sensitive data leaks. So, which AI tools prioritize privacy?
DeepSeek on the blacklist
DeepSeek has made headlines following the launch of its latest model, claimed to be the most cost-effective among the major players in generative AI. While the Chinese LLM had only around 7,000 users in August 2024, by January 2025 that number had surged to over 22 million.
However, there is a major concern. According to its data use policy, DeepSeek stores and processes all user prompts to train its models. This raises questions about data security, particularly given the oversight of the Chinese government and the lack of transparency around how that data is handled.
Several governments and data protection authorities around the world have issued warnings or implemented restrictions regarding the use of DeepSeek due to significant data privacy and security concerns. For example, South Korea's National Intelligence Service (NIS) accused DeepSeek of "excessively" collecting personal data and using all input data for training purposes. The Australian government banned DeepSeek on all federal government devices, expressing concerns over national security and data privacy. Also, multiple U.S. federal and state entities have restricted DeepSeek's use.
In the European Union, regulators, including Italy's data protection authority, are scrutinizing DeepSeek's data collection practices. Concerns have been raised about the app's compliance with the General Data Protection Regulation (GDPR), especially regarding the storage of user data on servers located in China.
Copilot, ChatGPT, Gemini, and Perplexity also raise concerns
DeepSeek is not the only AI model that saves conversations with the chatbot for training purposes. US-based ChatGPT and Perplexity, as well as Le Chat from the French company Mistral AI, do the same. In some cases, however, this can be restricted in the settings. In ChatGPT, for example, users can toggle off the Chat History & Training option. When history is disabled, new conversations are not used to train or improve the models and do not appear in the history sidebar. They are, however, still processed in the cloud.
Microsoft 365 Copilot stores user interaction data, including commands and responses, as part of the Copilot activity history. According to Microsoft, this data remains within the Microsoft 365 cloud service, is encrypted, and is not used for further AI training. Users can also delete their Copilot activity history, and the company claims Copilot complies with GDPR.
Even if an AI chatbot does not retain user prompts – which may contain sensitive data – for training, another issue remains: the most popular large language models run in cloud environments with servers located in the U.S. ChatGPT data, for example, is stored in a Microsoft Azure data center in Texas, which may conflict with the EU's GDPR. This concern was central to Italy's temporary ban on ChatGPT in March 2023, lifted after OpenAI introduced an option to limit the use of data for training.
Additionally, the companies developing LLMs are themselves targets for hacking and data breaches. While no large-scale leak of chat data has been reported so far, ChatGPT has already experienced security incidents.
Are there any safe LLM models?
The answer is yes – open-source models offer the highest level of privacy. They can be run locally, so no data is transmitted to external servers, which makes them the safest option for healthcare use cases. However, they are not as easy to implement and use as ChatGPT and require installation on local servers. Only qualified IT specialists or data engineers can effectively operate and train them, and deploying open-source models demands significant investment in data infrastructure.
The costs can be substantial, often exceeding what most healthcare facilities can afford, not to mention the ongoing expenses for maintenance and servicing.
Open-source models have a key advantage: they can be fine-tuned on specific datasets, enabling the development of AI tools tailored to specialized tasks, such as summarizing electronic medical records or integrating data from various sources, even those that are not interoperable. Some of the most popular open-source models include Llama 3 (Meta), Bloom, Mistral Small 3, and MPT-7B.
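To give a sense of what running a model locally looks like in practice, the sketch below uses the open-source transformers library: the weights sit on the facility's own hardware and prompts are processed there rather than in an external cloud. The model name is only an example, and a real deployment would of course involve far more work on infrastructure, validation, and clinical safety.

```python
# Minimal sketch of local inference with an open-weight model. Assumes the
# `transformers` library is installed and the model weights are already
# downloaded to the machine; the model name below is purely illustrative.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.3",  # any locally cached open-weight model
    device_map="auto",  # use a local GPU if available, otherwise the CPU
)

# The prompt is processed entirely on local hardware; nothing is sent to a cloud API.
prompt = "Summarize the clinically relevant interactions between warfarin and amiodarone."
output = generator(prompt, max_new_tokens=200, do_sample=False)
print(output[0]["generated_text"])
```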
A good alternative is LLM features available directly in health IT systems. These models are securely hosted within local networks, eliminating the need to switch between applications while ensuring compliance with data privacy regulations. Over time, an increasing number of such solutions will likely be integrated into electronic health records by leading health IT developers.
Prompts without personal identifiers are OK
No generative AI model operating in the cloud should be used to process sensitive patient data. The number one rule is to avoid entering personal details or any information that could identify an individual (e.g., based on age, location, or unique characteristics). Doctors can use AI to assist with diagnosis, review scientific research, or check medical guidelines, but they should disable the option that allows queries to be used for AI training whenever possible.
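As a rough illustration of that rule, the sketch below shows a simple pre-prompt check that flags obvious identifiers before a query is pasted into a cloud chatbot. The patterns are hypothetical examples and nowhere near a complete de-identification tool; a real workflow would rely on dedicated software and, above all, on clinical judgment.

```python
import re

# Illustrative pre-prompt check: flag obvious personal identifiers before a
# query is sent to a cloud chatbot. The patterns below are hypothetical
# examples, not a substitute for proper de-identification tooling.
PATTERNS = {
    "date (possible date of birth)": r"\b\d{1,2}[./-]\d{1,2}[./-]\d{2,4}\b",
    "email address": r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",
    "phone number": r"\b(?:\+?\d[\s-]?){9,}\b",
    "long numeric ID": r"\b\d{9,11}\b",
}

def flag_identifiers(prompt: str) -> list[str]:
    """Return the types of identifiers detected in the prompt."""
    return [label for label, pattern in PATTERNS.items() if re.search(pattern, prompt)]

prompt = "Female patient, born 12.03.1968, phone 555 123 9876, presents with dyspnea."
findings = flag_identifiers(prompt)
if findings:
    print("Do not send this prompt to a cloud chatbot. Detected:", ", ".join(findings))
else:
    print("No obvious identifiers detected - still review manually before sending.")
```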
Open-source models provide the highest level of data security, though they require substantial IT infrastructure. The most straightforward approach is to test AI features integrated into healthcare IT systems, as these solutions offer both convenience and compliance with data protection laws. ChatGPT and similar AI tools are here to stay. Doctors should familiarize themselves with these technologies to keep pace with advances in AI in medicine, and understand both what these tools can do and where their limitations lie.