AI News

Latest AI tool releases, research breakthroughs, and industry news.

Older

Why language models hallucinate

OpenAI’s new research explains why language models hallucinate. The findings show how improved evaluations can enhance AI reliability, honesty, and safety.

OpenAI Blog · Sep 5 · research

Collective alignment: public input on our Model Spec

OpenAI surveyed over 1,000 people worldwide on how AI should behave and compared their views to our Model Spec. Learn how collective alignment is shaping AI defaults to better reflect diverse human values and perspectives.

OpenAI Blog · Aug 27 · research

OpenAI and Anthropic share findings from a joint safety evaluation

OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more—highlighting progress, challenges, and the value of cross-lab collaboration.

OpenAI Blog · Aug 27 · research

Helping people when they need it most

How we think about safety for users experiencing mental or emotional distress, the limits of today’s systems, and the work underway to refine them.

OpenAI Blog · Aug 25 · research

Accelerating life sciences research

Discover how a specialized AI model, GPT-4b micro, helped OpenAI and Retro Bio engineer more effective proteins for stem cell therapy and longevity research.

OpenAI Blog · Aug 22 · research

Medical research with GPT-5

Learn how GPT-5 is used for medical research.

OpenAI Blog · Aug 6 · research

How Amgen uses GPT-5

Learn how Amgen uses GPT-5.

OpenAI Blog · Aug 6 · research

From hard refusals to safe-completions: toward output-centric safety training

Discover how OpenAI's new safe-completions approach in GPT-5 improves both safety and helpfulness in AI responses—moving beyond hard refusals to nuanced, output-centric safety training for handling dual-use prompts.

OpenAI Blog · Aug 6 · research

Estimating worst case frontier risks of open weight LLMs

In this paper, we study the worst-case frontier risks of releasing gpt-oss. We introduce malicious fine-tuning (MFT), where we attempt to elicit maximum capabilities by fine-tuning gpt-oss to be as capable as possible in two domains: biology and cybersecurity.

OpenAI Blog · Aug 4 · research

OpenAI’s new economic analysis

Analysis provides insights into ChatGPT’s impact on the economy. OpenAI also launches new research collaboration to study AI’s broader effects on the labor market and productivity.

OpenAI Blog · Jul 21 · research

Toward understanding and preventing misalignment generalization

We study how training on incorrect responses can cause broader misalignment in language models and identify an internal feature driving this behavior—one that can be reversed with minimal fine-tuning.

OpenAI Blog · Jun 18 · research

Preparing for future AI risks in biology

Advanced AI can transform biology and medicine—but also raises biosecurity risks. We’re proactively assessing capabilities and implementing safeguards to prevent misuse.

OpenAI Blog · Jun 18 · research

Disrupting malicious uses of AI: June 2025

Our latest report featuring case studies of how we’re detecting and preventing malicious uses of AI.

OpenAI Blog · Jun 4 · research

Introducing HealthBench

HealthBench is a new evaluation benchmark for AI in healthcare which evaluates models in realistic scenarios. Built with input from 250+ physicians, it aims to provide a shared standard for model performance and safety in health.

OpenAI Blog · May 12 · research

Expanding on what we missed with sycophancy

A deeper dive on our findings, what went wrong, and future changes we’re making.

OpenAI Blog · May 2 · research

Our updated Preparedness Framework

Sharing our updated framework for measuring and protecting against severe harm from frontier AI capabilities.

OpenAI Blog · Apr 14 · research

BrowseComp: a benchmark for browsing agents

BrowseComp: a benchmark for browsing agents.

OpenAI Blog · Apr 10 · research

New commission to provide insight as OpenAI builds the world’s best-equipped nonprofit

Already a nonprofit, and already using AI to help people solve hard problems, OpenAI aims to build the best-equipped nonprofit the world has ever seen—combining potentially historic financial resources with something even more powerful: technology that can scale human ingenuity itself.

OpenAI Blog · Apr 2 · research

PaperBench: Evaluating AI’s Ability to Replicate AI Research

We introduce PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research.

OpenAI Blog · Apr 2 · research

Moving from intent-based bots to proactive AI agents

Moving from intent-based bots to proactive AI agents.

OpenAI Blog · Mar 27 · research
