The computational reproducibility crisis (and a $200B market opportunity)
This week we explore the pain and opportunity of orchestration in cutting-edge computing. Along with the usual roundup of Deep Tech news catching our eye.
I’m sitting here writing about computers and algorithms while a large portion of the quantum computing industry is at IEEE Quantum Week. If you’re there this week, hit that reply button, and let me know what’s been most interesting for you so far. I’ll be covering some of the interesting announcements next week and would love to include your perspective.
This week’s topic is one that will be critical for the wider Deep Tech community in the next few years. Coming from the software engineering side, I think of this as “the orchestration nightmare”, but others know it as the “replication crisis in science”. You might know it by another name, but whatever we call it, it’s fast becoming a critical (and critically expensive) problem. Which in turn makes it a very valuable problem to solve.
But first, some quick things that caught my eye recently…
Is Zapata back from the dead?
A press release dropped yesterday with the news that Zapata Quantum is emerging from the collapse of Zapata Computing. This is interesting, as their collapse (analysed here by long-time employee Max Radin) might have been avoided had they been able to hang on for a few more months and ride the influx of speculation that pumped into quantum stocks. Without that influx, we might also have lost D-Wave and Rigetti, both of which have received delisting warnings at various times. I’m not sure if this is a private equity play on the original IP or a full reboot, but given the range of pioneering case studies that Zapata produced over the years (such as this and this and this), any kind of comeback story will be impressive to see. Good luck, team!
Unitary Foundation’s QOSS survey opens for 2025
The Unitary Foundation has just opened up the Quantum Open Source Software survey for 2025. The annual survey is “a chance for anyone in quantum technology to share their voice and help create an informative and representative snapshot of the community and field”. Check out the beautifully presented results from 2024 if you haven’t already. The sample is still relatively small (~800 respondents in 2024) and skews towards the younger enthusiast demographic, but the source data is available in the GitHub repo and contains useful insights all the same. Well worth supporting, as are the Unitary Foundation’s own open source projects, a couple of which I’ve made some commits to recently; I’ve been impressed by how fast the team responds to PRs. A great crew all round.
MIT Releases Full Quantum Index Report 2025
In case you missed it, the Quantum Index Report from MIT covers the current state of the quantum computing industry. While it came out a few months ago, I think it’s worth bumping again, as it’s both very accessible and very tactful. It’s also useful to study how MIT has presented this information in multiple formats, including the Arxiv version, the PDF version, a hosted flip-book version, and an interactive content version. As someone who has a report to do later in the year, this sets a high bar! Also, respect to Jonathan Ruane from MIT for chatting with me and Laurent Prost from Alice & Bob about the methodology. There are a few of us working on somewhat overlapping projects relating to benchmarking (and orchestration in my case), so the industry collaboration is greatly appreciated.
Did 95% of AI already fail???!!!
The short answer is “no”. We can ignore the clickbait headlines, given the “State of AI in business” report was ostensibly looking at business sentiment around the various generative AI pilots being undertaken. The underlying research was great, particularly the user perspectives on incumbent advantages and how enterprises actually operate. Give this a read if you work on the sales or go-to-market side of Deep Tech. But the rest is a mess of overly contrived charts and shifting definitions. A pilot project is a discovery mechanism, yet the authors want to warn that 95% of “firms with over $100 million in annual revenue” are not seeing “a marked and sustained P&L impact” just months after running a Generative AI pilot. Hmmm.
The computational reproducibility crisis (is a $200B market opportunity)
Here’s a rabbit hole. I recently caught up with a friend who works in bioinformatics, a field that’s struggling with the scientific replicability crisis. It’s a problem that also shows up in frontier technologies like quantum-classical hybrid computing.
The crisis stems not (just) from poor methodology and the temptation to cut corners, but from technical complexity that has grown beyond human management capacity. Whatever the frontier of computation, there’s a level of complexity that risks undermining efforts toward consistent results and commercial utility.
The studies exploring this make for tough reading. While computational reproducibility rates vary dramatically across domains, from 5.9% for Jupyter notebooks in data science to 26% for computational physics papers and near 0% for complex bioinformatics workflows, the overall estimates of the economic impact all amount to the same thing: a lot. One estimate suggests an annual $200 billion global drain on scientific computing resources.
But it’s not all bad news. Underneath the headlines and hand-wringing around AI as being an existential threat (and somehow also an over-hyped bubble), there are efforts to apply AI-assisted orchestration to scientific computing. This approach is exploring automated systems that can manage dependencies, orchestrate workflows that span quantum-classical boundaries, and enable reproducible results across heterogeneous computing environments. And not a moment too soon.
The crisis differs from traditional research failures
The computational reproducibility crisis is a distinct challenge from the reproducibility issues found in the more experimental sciences. While wet-lab experiments fail due to biological variability or protocol differences, computational research is theoretically deterministic. Numbers go in and numbers come out. And yet, this area of research faces a surprisingly broad range of issues due to the systemic technical barriers that compound across the computing stack.
Recent quantitative assessments document the severity. The replicability rate for the Jupyter notebooks I mentioned up top comes courtesy of a study that found only 245 of the 4,169 notebooks it analysed produced similar results when re-executed. The failures were attributed to missing dependencies, broken libraries, and environment differences. The study goes on to cite other cases, including those where none of the targeted findings could be replicated due to missing data, software version issues, and inadequate documentation. If entropy eats up every single research project, and nobody can replicate it, did it ever really happen?
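To make the environment problem concrete, here’s a minimal sketch (in Python, with an illustrative file name, not anything from the study itself) of the kind of snapshot that would at least make “missing dependency” and “wrong version” failures diagnosable when someone tries to re-run a notebook later:

```python
# Minimal sketch: capture the runtime environment alongside a notebook or script,
# so a later re-execution attempt can diagnose missing or mismatched dependencies.
# The output file name is illustrative only.
import json
import platform
import sys
from importlib.metadata import distributions

snapshot = {
    "python": sys.version,
    "platform": platform.platform(),
    # Record every installed distribution pinned to its exact version.
    "packages": sorted(
        f"{dist.metadata['Name']}=={dist.version}" for dist in distributions()
    ),
}

with open("environment_snapshot.json", "w") as fh:
    json.dump(snapshot, fh, indent=2)
```

It’s not a cure (containers and lock files go further), but even this much metadata would turn a lot of silent failures into fixable ones.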
The most common problem in computational research
The same question can be asked over in the computational chemistry sector, which is worth noting given it’s often cited as one of the most world-changing applications for quantum and hybrid computing. One landmark study explored the problem by showing how 15 different software packages, all widely used in pharmaceutical and materials development, gave different answers when calculating the properties of the same simple crystals. The paper covers the effort and collaboration that was required to build the benchmarks and approaches to resolve this. It’s ultimately a good news story that this appears to be possible to correct for, but these are tools representing millions of dollars in development and decades of research, and they were initially unable to agree on the basic properties of simple elemental crystals. So what’s occurring in other computational corners?
A great paper from Oak Ridge National Laboratory discussed how GPU atomic operations can produce variations of several percent in Monte Carlo simulations depending on the specific GPU model and driver version. Another reveals how high-performance computing as a whole faces nondeterministic interactions where parallel execution order variations, floating-point arithmetic differences across architectures, and compiler optimization choices produce divergent results.
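If that sounds abstract, here’s a tiny self-contained illustration (mine, not from either paper) of one of those mechanisms: floating-point addition isn’t associative, so simply changing the order in which partial results are combined, which is exactly what parallel scheduling does, changes the answer.

```python
# Illustrative only: floating-point addition is not associative, so the order in
# which values are combined (as parallel execution reorders them) shifts the result.
import random

random.seed(0)
# Values spanning many orders of magnitude, as in a real simulation.
values = [random.uniform(-1.0, 1.0) * 10 ** random.randint(-8, 8) for _ in range(100_000)]

forward = sum(values)
backward = sum(reversed(values))

print(f"forward sum : {forward!r}")
print(f"backward sum: {backward!r}")
print(f"difference  : {forward - backward!r}")  # typically non-zero
```

Scale that up across thousands of cores, different compilers, and different GPUs, and “deterministic” computation quietly stops being deterministic.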
I probably don’t even need to mention the sheer variability inherent in quantum computing in the current NISQ era, where we face everything from significant hardware-dependent noise models through to (yes really) the impact of cosmic rays inducing faults. We’re also looking at gate error rates of roughly 10⁻³ to 10⁻⁴, which means that each time a quantum computer performs a basic operation (a "gate"), it fails roughly once in every 1,000 to 10,000 attempts. These errors currently accumulate so fast that even moderate-length quantum algorithms will contain multiple errors. By contrast, classical computers can perform billions of operations without a single error under normal conditions.
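For a rough sense of how quickly those errors compound, here’s a back-of-envelope sketch. The assumption of independent gate errors is mine, and real hardware has correlated noise, so treat it as intuition rather than a model:

```python
# Back-of-envelope sketch: assuming independent gate errors, the chance a circuit
# runs error-free is roughly (1 - p) ** gate_count.
def error_free_probability(gate_error_rate: float, gate_count: int) -> float:
    return (1.0 - gate_error_rate) ** gate_count

for p in (1e-3, 1e-4):  # the per-gate error rates mentioned above
    for gates in (100, 1_000, 10_000):
        prob = error_free_probability(p, gates)
        print(f"error rate {p:.0e}, {gates:>6,} gates -> {prob:6.1%} chance of no error")
```

At a 10⁻³ error rate, a 1,000-gate circuit finishes without a single error only about a third of the time.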
Understanding the $200 billion a year market size
Okay, that’s great science-ing, now let’s get to the money. The financial impact of computational irreproducibility is a substantial cost that organisations rarely fully quantify. The pharmaceutical industry is estimated to waste $40 billion annually on irreproducible computational research, with individual study replications requiring 3 to 24 months and $500,000 to $2 million in additional investment. Extrapolating this globally across all computational sciences, we’re looking at a total economic drain approaching $200 billion annually.
Let’s dig a little into the computational costs. Failed HPC simulations waste thousands of dollars in compute cost per project: a single 1,000-core simulation running for 24 hours comes to $3,600 at commercial rates of $0.15 per core-hour. Quantum computing has comparable costs. A QAOA optimization requiring 10,000 shots per iteration, at about 10 seconds of quantum runtime per iteration (and these algorithms typically need dozens of iterations), adds up to minutes to hours of QPU time. At current IBM Quantum rates of $1.60 per second of quantum runtime, even a modest run of 84 ten-second iterations (14 minutes of QPU time) comes to $1,344, before you even count the HPC and other resources around it.
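If you want to sanity-check those numbers, here’s the back-of-envelope arithmetic in code form. The rates are the ones quoted above; the 84-iteration count is my assumption, chosen to match the quoted QPU figure:

```python
# Back-of-envelope reproduction of the cost figures quoted above.
CORE_HOUR_RATE = 0.15    # USD per core-hour (commercial HPC rate quoted above)
QPU_SECOND_RATE = 1.60   # USD per second of quantum runtime (IBM Quantum rate quoted above)

# A failed 1,000-core simulation running for 24 hours.
hpc_cost = 1_000 * 24 * CORE_HOUR_RATE

# A failed QAOA run: ~10 seconds of QPU time per iteration, assuming ~84 iterations.
qaoa_iterations = 84
qaoa_cost = qaoa_iterations * 10 * QPU_SECOND_RATE

print(f"Wasted HPC simulation : ${hpc_cost:,.0f}")   # $3,600
print(f"Wasted QAOA run (QPU) : ${qaoa_cost:,.0f}")  # $1,344
```

Multiply that by every run that can’t be reproduced and has to be repeated, and the headline estimates start to look less hyperbolic.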
The energy and environmental costs add another dimension. Cloud computing is estimated to generate 1.8-2.8% of global greenhouse gas emissions, with data centres consuming 200-250 TWh globally, equivalent to around 1% of global electricity demand. With this framing, we can see the compounding scale and tangible impact of these irreproducible computations: not just lost time and money, but actual carbon waste. Facilities like the Pawsey Supercomputing Centre in Australia might be leading the charge on green energy use (ranked as the fourth greenest supercomputer in the world), but more needs to be done to tackle the problem at its source.
Next week I’ll dig into some of the attempts being made to tackle this, but I’d be keen to hear your thoughts in the meantime. How do you handle the complexity of orchestrating all the moving parts in your organisation? Have you encountered this kind of scientific replicability problem in your own experience?
Hit that reply button and let me know.