Fresh daily
AI News
Latest AI tool releases, research breakthroughs, and industry news.
Earlier this week

Google DeepMind and A24 announce first-of-its-kind research partnership

Newly discovered PamStealer isn't your typical macOS malware
The discovery underscores the increased effort being poured into Mac infostealers.
Using DSPy to evaluate and improve Datasette Agent's SQL system prompts
One of this morning's AIE keynotes covered dspy, which reminded me I've been meaning to see if it could help me improve the system prompt used by Datasette Agent, so I fired off an asynchronous research task in Claude Code for web using Claude Fable 5:
"Pip install the latest Datasette alpha and datasette-agent and dspy - then figure out how to use dspy to evaluate and improve the main system prompts used by Datasette Agent for the feature where it can execute read only SQL queries to answer user questions about data."
Fable chose to test using GPT 4.1 mini and nano, and identified several promising-looking directions for improvement. I particularly like this one:
"The schema listing gives only table names; the 'don't call describe_table if you already have the information' advice caused column-name guessing (page_count, o.order_id, first_name) and error-retry loops in baseline traces. Either include column names in the prompt's schema listing or soften that advice."
Tags: ai, datasette, generative-ai, llms, evals, dspy, datasette-agent, claude-mythos
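As a rough illustration of the first option in that suggestion (putting column names into the schema listing), here is a minimal sketch using only Python's standard sqlite3 module. It is not Datasette Agent's actual code: the function name `schema_listing`, the output format, and the sample tables are all hypothetical, chosen only to show what a listing with columns might look like.

```python
import sqlite3


def schema_listing(conn: sqlite3.Connection) -> str:
    """Return one line per table, including its column names.

    Hypothetical helper: the name and the "table(col, col)" format
    are assumptions for illustration, not Datasette Agent's format.
    """
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    lines = []
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
        columns = [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
        lines.append(f"{table}({', '.join(columns)})")
    return "\n".join(lines)


# Sample in-memory database (made-up tables, for demonstration only)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, pages INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, book_id INTEGER)")

listing = schema_listing(conn)
print(listing)
```

With column names visible in the prompt, a model has less reason to guess at names such as page_count, which is the failure mode the quoted finding describes.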

Teaching AI to run with the turbines
Artificial intelligence may have captured the public imagination through chatbots and image generators, but some of its most consequential use cases are unfolding far from consumer-facing tools. In industries where physical infrastructure, operational continuity, and safety are paramount, AI is becoming a core operating layer. With its sprawling industrial systems and constant stream of operational…
More details on Fable 5’s cyber safeguards and our jailbreak framework

Autoresearch: The feedback loop behind self-improving agents
Introspection co-founder Roland Gavrilescu explains autoresearch, agent “recipes,” self-improving loops, and why humans remain central to the software factory.

SpaceX has an AI device prototype, and it sure sounds phone-ish
SpaceX reportedly showed investors a "handset-like" AI device before going public. It could be another signal SpaceX wants to expand into wireless.

New York City educators and industry leaders gathered at Google’s offices to shape the future of AI in classrooms.
Google, the New York Jobs CEO Council and Urban Assembly hosted an AI summit for 150 education and industry leaders.

LLMs are stuck in a groupthink groove. This startup is trying to get them out.
Let’s start with a game. Open up your chatbot of choice—Claude, ChatGPT, Gemini—and type “Give me a random number between 1 and 10.” You’re going to get 7. Almost always. Now type “Another” and you’ll get 3 or 4. Type “Another” again and you’ll get 8 or 9. That won’t work every time—but if it…

ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration
How ChatGPT adoption has expanded
New OpenAI Signals data shows how ChatGPT adoption is growing globally, with users increasing usage, exploring more capabilities, and driving growth across regions and languages.

Unlocking Britain’s next era of productivity: Building a nation of AI trailblazers
Google UK shares its latest Economic Impact Report and how to enable more people to unlock the benefits of AI-powered technologies.

The AI jobs debate just got messier
A new report finds "high-intensity AI adopters" saw headcount increase 10.2%. Among those companies, entry-level headcount rose by 12%, countering the rhetoric that AI kills junior jobs.
Introducing GeneBench-Pro
Introducing GeneBench-Pro, a new benchmark testing AI performance in genomics, biology, and scientific research using complex, real-world datasets.
Core dump epidemiology: fixing an 18-year-old bug
OpenAI engineers used large-scale core dump analysis to debug rare infrastructure crashes, uncovering both a hardware fault and a long-standing software bug.
Inside GeneBench-Pro

DiScoFormer: One transformer for density and score, across distributions

Inside the Advisory Database and what happens when vulnerability volume breaks records
The GitHub Advisory Database is processing more vulnerability reports than ever before. Here's what's driving the surge, how we're responding, and how the community can help.
Mapping Europe’s AI Workforce Opportunity
A new OpenAI report maps how AI could reshape jobs across the EU, highlighting which occupations may face automation, growth, or workflow changes.