OpenAI says ChatGPT health use is growing as it improves medical response quality

OpenAI points to rising health use in ChatGPT

OpenAI is highlighting how often people turn to ChatGPT for medical help, saying more than 230 million users each week ask the product health and wellness questions. The company says its latest model, GPT-5.5 Instant, marks a significant step forward in how the chatbot handles those requests.

The company described the model as better at recognizing when a situation may require urgent care, asking for missing context, explaining uncertainty and simplifying complex information. OpenAI said the model now performs at a level comparable to its frontier reasoning models on its most difficult health benchmarks, while remaining available to free users in ChatGPT.

The update comes as conversational AI tools continue to be used for a broad range of health tasks, from interpreting lab results and preparing for appointments to navigating insurance, understanding medical information and building healthier habits. OpenAI is positioning these improvements as part of a wider effort to make ChatGPT more useful in everyday health situations.

Physician review drives the evaluation process

A central part of OpenAI’s health work is physician involvement. The company said a global network of more than 260 doctors across 60 countries helps define what good answers should look like in real-world health scenarios. Those physicians review example responses, identify failure modes and help shape evaluation rubrics.

OpenAI said doctors have reviewed more than 700,000 example responses so far. Their assessments focus on qualities such as accuracy, clarity, completeness, caution and whether the response is useful enough to guide next steps. The company said that process helps researchers measure progress over time and spot weaknesses in model behavior.

To evaluate performance, OpenAI relies on health-specific benchmarks including HealthBench and HealthBench Professional. These tests use realistic conversations and physician-written scoring criteria to assess safety, context awareness, communication and appropriate escalation.

The company also said it compared model responses with answers written by physicians, using panels of doctors to judge which was better across several criteria. In that evaluation, OpenAI said GPT-5.5 Instant was rated above both older models and physician-written responses. According to the company, doctors also found fewer failure modes in the newer model, including fewer cases where answers failed to account for local healthcare context, missed warning signs or did not ask for more information when needed.

OpenAI said another measure of improvement comes from production traffic. Using privacy-preserving monitoring on billions of health-related messages each week, the company said the share of responses with at least one flagged factuality issue has fallen by 71% in the last two months.

Broader health ambitions

The announcement also ties into OpenAI’s work on tools aimed at clinicians and healthcare organizations. The company pointed to products such as ChatGPT for Clinicians and OpenAI for Healthcare, which are intended to support documentation, research and care delivery.

To illustrate how model answers have changed, OpenAI contrasted older systems with GPT-5.5 Instant on a medical question about why a doctor might want an MRI before a steroid injection for sciatica. The newer model gave a detailed explanation covering diagnosis, procedure planning, safety concerns and when an injection might not be appropriate.

OpenAI said the broader goal is to make ChatGPT more accurate and helpful in high-stakes moments, especially as health remains one of the most common ways people use the product. The company framed the work as part of its longer-term view that improvements in human health could be among the most meaningful impacts of AGI.