a16z Just Bet $9M That AI Hallucinations Can Be Fixed — Here's Why That Validates Structured No-Code

Written by: Peter Juten

Andreessen Horowitz just put $9M into Probably, a startup that wraps LLMs in deterministic validators targeting 99.99% accuracy. The bet isn't really about Probably. It's a16z signalling that AI reliability is the next billion-dollar market category. Structured no-code platforms have been deterministic by design for years. You're not behind. You're early to the market that venture capital just anointed as the next big thing.

In this article:
What Probably actually does
Why a16z's cheque matters more than the technology
Structured no-code solves the same problem from the other end
The trust problem that won't go away
What this means for no-code builders
The takeaway

Andreessen Horowitz just put $9 million into a startup whose entire pitch is that AI can't be trusted. Not "AI needs better prompts." Not "models will improve." The pitch is simpler and more damning: the output of a large language model should never reach a user without passing through a deterministic validator first.

The startup is Probably, and it's the most important AI infrastructure bet of the month, partly because of what it does and partly because of what it says about the market. Probably wraps LLMs in a validation harness that checks every output against source data before it surfaces to a user. The target is 99.99% accuracy, which is the kind of number you quote when you're building for accounting firms, not chatbots.

CEO Peter Elias has been refreshingly direct about why this needs to exist. Major AI labs, he told TechCrunch, have no incentive to build this layer themselves because "they make money the more times you have to correct the model." That's not a technical observation. That's a structural critique of the entire foundation model business.

And a16z just funded it.

TL;DR

Probably raised $9M from a16z to wrap LLMs in deterministic validators targeting 99.99% accuracy
The architecture uses weaker models with a validation harness, running locally and cutting token costs
This is venture capital signalling that AI reliability is the next big market
Structured no-code platforms already solve this problem by design, not by patching LLM output after the fact

What Probably actually does

The architecture is worth understanding because it's the opposite of everything the AI industry has spent two years telling you to do.

Instead of running the biggest, most expensive frontier model and hoping it doesn't hallucinate, Probably runs a model four classes weaker than frontier and wraps it in a deterministic validator. The validator doesn't guess. It checks every factual claim, every output, against a defined dataset. If something doesn't match, it bounces back. The model tries again. The user never sees the failure.
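The generate, check, retry loop described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Probably's implementation: the names `validate`, `generate_validated`, `flaky_model` and the `SOURCE_DATA` figures are all hypothetical, and the "model" is a stand-in function rather than a real LLM call.

```python
from typing import Callable, Optional

# Hypothetical reference dataset the validator checks against.
SOURCE_DATA = {"q3_revenue": 1250000, "q3_customers": 312}


def validate(claims: dict, source: dict) -> list:
    """Return the keys whose claimed value is absent from, or differs from, the source."""
    return [k for k, v in claims.items() if k not in source or source[k] != v]


def generate_validated(
    generate: Callable[[int], dict],
    source: dict,
    max_attempts: int = 3,
) -> Optional[dict]:
    """Call the model until every claim matches the source, or give up."""
    for attempt in range(max_attempts):
        claims = generate(attempt)
        if not validate(claims, source):
            return claims  # every claim matched, safe to surface
    return None  # failed attempts are never shown to the user


def flaky_model(attempt: int) -> dict:
    """Stand-in for an LLM: wrong on the first attempt, right on the second."""
    if attempt == 0:
        return {"q3_revenue": 1300000, "q3_customers": 312}
    return {"q3_revenue": 1250000, "q3_customers": 312}


result = generate_validated(flaky_model, SOURCE_DATA)
print(result)  # the corrected second attempt
```

The point of the design is that the check is an exact comparison, not another model's opinion. The same input always produces the same pass or fail.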

Two things follow immediately: the system can run on local hardware, and token costs drop because you're not paying for frontier inference. The bigger point, the one that matters for anyone building software that actual businesses depend on, is the audit trail: citations, and a record of what was checked against what.
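To make "a record of what was checked against what" concrete, here is a minimal sketch of an audit trail in Python. Everything in it is an assumption for illustration: the `CheckRecord` fields, the `audit` function, the `ledger.csv` name and the invoice figures are invented, and nothing here reflects how Probably actually stores its citations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckRecord:
    claim: str        # field the model asserted
    claimed: object   # value the model produced
    expected: object  # value found in the source dataset (None if absent)
    source_ref: str   # citation: where the expected value lives
    passed: bool      # True only if the claim exists in the source and matches


def audit(claims: dict, source: dict, source_name: str) -> list:
    """Compare each claim with the source and record the comparison."""
    records = []
    for key, claimed in claims.items():
        expected = source.get(key)
        records.append(
            CheckRecord(
                claim=key,
                claimed=claimed,
                expected=expected,
                source_ref=f"{source_name}#{key}",
                passed=key in source and claimed == expected,
            )
        )
    return records


ledger = {"invoice_total": 4200, "vat": 840}  # hypothetical source data
trail = audit({"invoice_total": 4200, "vat": 800}, ledger, "ledger.csv")
for record in trail:
    print(record)
```

A reviewer reading this trail can see that the VAT figure failed, what the model claimed, and which source entry it was checked against.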

The first product is dataset summaries for data science teams. The roadmap targets accounting and medical verticals. A wrong number in a financial statement. A fabricated drug interaction. These aren't awkward chatbot responses. They are errors with real financial and clinical consequences.

Why a16z's cheque matters more than the technology

Seed rounds are bets on founders and market timing as much as products. When a16z backs a company whose thesis is that untrusted AI output is a structural market failure, they're not just placing a bet on Probably. They're placing a bet on the idea that AI reliability is a standalone category worth billions.

Think about what that means for the no-code space.

For the past eighteen months, the dominant narrative has been that AI coding tools would make structured no-code platforms irrelevant. Why learn Bubble when Cursor can write you a React app? Why use Webflow when Bolt can generate a landing page from a screenshot?

Probably's raise doesn't directly answer those questions, but it validates the premise behind the counterargument we have been making all year: raw AI output is not trustworthy enough for production software. If the fix requires a deterministic layer between the model and the user, platforms that are already deterministic by design have a head start measured in years, not months.

Structured no-code solves the same problem from the other end

Probably's approach is to let the AI generate and then validate. Structured no-code platforms take a different route. They don't generate things that need validating.

When you build an app on Bubble or Stacker, you are not prompting a model to guess at SQL queries or authentication logic. You're configuring visual components that render deterministically every time. A button does what you told it to do. A database query follows the logic you defined. There's no probabilistic layer between your intent and the output.

This matters because the alternative is getting expensive, quickly. We covered the MIT and Wharton study that tracked 100,000 developers through the full delivery pipeline. AI coding agents produced 17.3 times more lines of code but only 1.3 times more shipped software. The gap was consumed by testing, security review, and debugging. The binding constraint was never writing code. It was verifying it.
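The size of that gap is easier to see as a ratio. The short calculation below uses only the two multipliers quoted above and assumes both are measured against the same pre-AI baseline, normalised to 1.0. The variable names are ours, not the study's.

```python
# Multipliers quoted above, relative to a pre-AI baseline of 1.0.
lines_multiplier = 17.3    # lines of code written
shipped_multiplier = 1.3   # software actually shipped

# Lines written per unit of shipped software, relative to baseline.
lines_per_shipped_unit = lines_multiplier / shipped_multiplier

# Shipped output per line written, relative to baseline.
yield_vs_baseline = shipped_multiplier / lines_multiplier

print(f"{lines_per_shipped_unit:.1f}x more code per unit shipped")  # 13.3x
print(f"{yield_vs_baseline:.1%} of baseline yield per line")        # 7.5%
```

On those figures, each unit of shipped software carried roughly 13 times as much code to test, review and debug as before.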

Probably is building a verification layer you bolt onto an LLM. Structured no-code platforms eliminate the verification problem by never introducing it in the first place.

The trust problem that won't go away

The data keeps stacking up in the same direction. Stack Overflow's 2025 survey found that 84% of developers use or plan to use AI tools, but only 29% trust the output. That gap is widening: in 2023, trust was around 40%. Familiarity is breeding scepticism, not confidence.

Veracode found that 45% of AI-generated code contains OWASP Top 10 vulnerabilities. We reported on Red Access discovering over 5,000 vibe-coded apps leaking API keys, database credentials, and personal data because nobody had configured authentication. The apps worked. They just weren't secure. And the people who built them had no way to know.

This is the backdrop against which Probably raised $9 million. The market is screaming for reliability, and a16z just wrote a very large cheque that says "we agree."

What this means for no-code builders

If you've been holding back on deploying AI features because you don't trust the output, your instinct is correct. And it just got institutional backing from one of the most influential venture firms in the world.

The practical implication is straightforward. The AI industry is splitting into two tracks. There's the raw capability track: bigger models, longer context windows, faster inference. That's the race among OpenAI, Anthropic, Google, and Mistral. And then there's the reliability track: deterministic validation, guardrails, output guarantees. That's where Probably sits, and it's where structured no-code platforms have been sitting for years.

For no-code builders, this means the reliability argument stops being a defensive talking point and starts being a competitive position. When a client asks why they should build on Bubble instead of having an AI generate a custom app, the answer isn't "Bubble is easier." The answer is "Bubble is deterministic, and even a16z is now betting that determinism is worth billions."

The takeaway

Probably's $9 million seed round is not the biggest AI funding story of the month. But it might be the most strategically legible one for anyone building software that businesses depend on.

The investment says three things clearly. First, the AI industry's reliability problem is real enough to fund as a standalone category. Second, the fix isn't better models. It's deterministic systems that sit between models and users. Third, that fix looks a lot like what structured no-code platforms already do.

If you are building on a no-code platform and wondering whether the AI wave is leaving you behind, look at where the smart money is going. It's not toward bigger models. It's toward making AI output trustworthy. You already build on systems that are trustworthy by design. You're not behind. You're early to the market that venture capital just anointed as the next big thing.
