Methodology — how we count “92% of broken Lovable apps” and other claims
By Hyder Shah, Founder · Afterbuild Labs. Last updated 2026-04-18.
Methodological transparency matters here because the AI-built app rescue space is awash in fabricated-sounding statistics. Every advisor quotes a percentage, most of those percentages have no source, and the AI engines that increasingly answer founder questions will happily repeat any number that looks numeric. We refuse to be part of that noise. Every quantified claim on afterbuildlabs.com has its counting rules documented here: what sample it came from, how we define the thing being counted, what the limitations are, and when the claim was last updated.
If you are reading this because you want to cite one of our numbers, good — start with the specific claim below, copy the sample size and definition, and link back to both the research page and this methodology note. If you spot a counting error, email hello@afterbuildlabs.com. Corrections get a dated entry in the version history at the bottom of this page.
Scope
This page covers the quantified claims Afterbuild Labs generates from its own rescue engagements. It does not cover third-party studies we cite (Veracode, The Register, Snyk, Stripe); those are the responsibility of the publishing organisation and we link to their sources directly on the research page. If you want to know how Veracode calculated 48%, read the Veracode methodology appendix. If you want to know how we calculated 92%, read on.
“92% of broken Lovable apps fail on one of five things”
Sample
N = roughly 50 rescue engagements between January 2025 and April 2026 where the repo was built primarily on Lovable. Engagements earlier than January 2025 are excluded because Lovable’s default table-creation behaviour changed materially in mid-December 2024, and numbers drawn from the prior era describe a different product.
Definition of “broken”
An app qualifies as broken when the founder explicitly described it as non-functional in production at the time of engagement — typically phrased as “users can’t sign up,” “Stripe isn’t charging,” “I’m leaking data,” or “nothing works on the live URL.” Apps that are “working but fragile” are excluded from the broken sample and handled under separate hardening metrics.
The five failure modes
When more than one failure mode was present (common), we classified by the primary failure — the one the founder reported first or the one we fixed first. Approximate breakdown across the sample:
- RLS disabled or misapplied — ~38% of primary failures (see the first sketch below).
- OAuth redirect misconfiguration — ~20% (localhost URIs, Vercel preview URLs, missing consent screens).
- Stripe webhook failure — ~16% (missing signature verification, non-idempotent handlers, unhandled event types; see the second sketch below).
- Env var problems — ~10% (missing on production, public-prefixed secrets, test-mode keys in prod builds).
- CORS misconfiguration — ~8%.
- Other — ~8% (schema issues, rate limits, hydration bugs, vendor quirks).
Sum of the first five: 38 + 20 + 16 + 10 + 8 = 92, so approximately 92% once each bucket's own rounding is taken into account. That is the number quoted on the homepage and in rescue guides.
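To make the two most common failure modes concrete, here are minimal sketches of each. Both are illustrative rather than code from any engagement: the table name, environment variable names, and event handling are hypothetical stand-ins, built on the standard supabase-js and stripe-node SDKs.

```ts
import { createClient } from "@supabase/supabase-js";

// RLS probe, assuming a supabase-js v2 client. `profiles` and the env var
// names are hypothetical. The anon key is the same key every browser already
// holds, so this is exactly what a stranger can run against the app.
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!
);

async function probe(): Promise<void> {
  // With RLS disabled on `profiles`, this anonymous query returns every row.
  // With RLS enabled and a sane policy, it returns only permitted rows.
  const { data, error } = await supabase.from("profiles").select("*");
  console.log(error ?? `rows visible anonymously: ${data?.length ?? 0}`);
}

probe();
```

And the webhook side, showing the two properties whose absence we listed above: signature verification against the raw body, and an idempotency guard. The in-memory `seen` set is a stand-in for a durable store.

```ts
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
const seen = new Set<string>(); // stand-in for a durable idempotency store

// `rawBody` must be the unparsed request bytes; JSON-parsing the body before
// verification is the most common way these handlers break.
export function handleStripeEvent(rawBody: Buffer, signature: string): void {
  const event = stripe.webhooks.constructEvent(
    rawBody,
    signature, // the `stripe-signature` request header
    process.env.STRIPE_WEBHOOK_SECRET!
  );
  if (seen.has(event.id)) return; // Stripe retries, so duplicates will arrive
  seen.add(event.id);

  switch (event.type) {
    case "checkout.session.completed":
      // fulfil the purchase here
      break;
    default:
      // acknowledge unhandled event types instead of throwing
      break;
  }
}
```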
Limitations
- Self-selection bias. Our sample is apps that reached a paid rescue. Apps whose founders fixed the problem alone, walked away, or never launched are not in the sample. The 92% describes apps that got broken enough to hire us — not Lovable apps in general.
- Platform bias. Heavily skewed toward Lovable because Lovable generates the most inbound rescue requests we see. Numbers derived from this sample should not be read as describing all AI-built apps — Bolt, Cursor, and Base44 apps show different failure distributions.
- Small N. 50 is a case series, not a controlled study. Patterns at this sample size are suggestive, not statistically definitive.
- No control group. Every app in the sample is one that broke. We have no comparison sample of Lovable apps that launched and did not break. Only Lovable could construct that sample.
- Classification by judgement. Primary-failure classification is made by the engineer running the rescue, not blindly. We mitigate by keeping a written decision log per engagement but cannot claim inter-rater reliability at this sample size.
How we’d improve
When the rescue sample crosses N = 200, we intend to publish an anonymised aggregate dataset — failure mode, platform, fix duration, approximate revenue stage — with columns chosen to prevent re-identification. That would let other researchers cross-reference our pattern against their own. It would also let us update the 92% number with a tighter interval than we can publish today.
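As a sense of shape, here is a provisional sketch of what one row of that dataset could look like. Every column name and bucketing below is a placeholder we are using to illustrate the de-identification intent, not a committed schema.

```ts
// Provisional row shape for the planned N >= 200 release. All names are
// placeholders; coarse buckets are deliberate, to prevent re-identification.
type RescueRow = {
  platform: "lovable" | "bolt" | "cursor" | "base44" | "other";
  primaryFailure:
    | "rls"
    | "oauth_redirect"
    | "stripe_webhook"
    | "env_vars"
    | "cors"
    | "other";
  fixDurationDays: number; // start/end events as defined in the next section
  revenueStage: "pre_revenue" | "early_revenue" | "scaling"; // coarse buckets only
};
```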
“19 days average time to production”
Calculated across completed engagements in the same January 2025 – April 2026 window. The quoted 19-day number is the median elapsed calendar days; the mean is approximately 24, pulled higher by a small tail of rescues that uncovered deeper schema rewrites during the audit. Mode is 14 days.
Start event: the scope is signed and the engineer has read access to the repository. Clock starts that business day.
End event: the rescue is handed off — client holds admin access to repo and deployment, RLS is enforced and tested, webhook endpoints are verified, a runbook is delivered, and the client has signed off on handover. Clock stops that business day.
Engagements that stalled for reasons outside the engineering scope (client paused for fundraising, client went cold, client changed direction mid-rescue) are excluded from the average. This filter flatters the metric somewhat; a stricter calendar-days-to-complete measure that included stalls would come out longer. We chose the engineering-time metric because it reflects the work the founder is buying.
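For readers unsure how a 19-day median, a roughly 24-day mean, and a 14-day mode coexist in one sample, here is a toy computation. The durations are invented, chosen only so the summary statistics land near the published ones; no client data appears here.

```ts
// Invented business-day spans, not client data.
const daysToProduction = [12, 14, 14, 16, 19, 21, 24, 41, 52];

const sorted = [...daysToProduction].sort((a, b) => a - b);
const mid = Math.floor(sorted.length / 2);
const median =
  sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
const mean = sorted.reduce((sum, d) => sum + d, 0) / sorted.length;

// median = 19; mean = 213 / 9 ≈ 23.7; 14 appears twice, so it is the mode.
// The two long-tail rescues (41, 52) pull the mean above the median, the
// same shape the real sample shows.
console.log({ median, mean: Number(mean.toFixed(1)) });
```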
“100% handoff rate”
Definition. A handoff is the transfer of operational ownership from Afterbuild Labs to the client. It requires, specifically:
- Client is admin on the repository; any Afterbuild Labs collaborator access is reduced to read-only or removed.
- Client owns the deployment platform account (Vercel, Railway, etc.) or is the admin on a shared account.
- Vendor credentials (Supabase, Stripe, domain registrar, email provider) are under client control and Afterbuild Labs does not retain production keys.
- A written runbook is delivered covering deploy, rollback, and the two or three most-likely incident patterns for that app.
- Client has signed off, in writing or on a recorded call, that the handoff checklist is complete.
Rate. Handoffs delivered divided by engagements completed. As of April 2026 this is 100%. Engagements that were paused by the client before completion are not in the denominator — they are counted separately and noted transparently in our own records. The 100% number is not a marketing figure; it is the count against a fixed definition, and we will correct it publicly the first time a completed engagement ships without a clean handoff.
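The checklist above maps naturally onto a per-engagement record, which is how a reader can audit the 100% claim against the definition rather than against marketing copy. A sketch, with field names that are ours for illustration, not a published schema:

```ts
// One record per completed engagement. All five items must hold before the
// engagement counts toward the handoff numerator.
type HandoffChecklist = {
  clientIsRepoAdmin: boolean;
  clientOwnsDeployPlatform: boolean;
  vendorCredentialsTransferred: boolean;
  runbookDelivered: boolean;
  clientSignedOff: boolean;
};

const isHandedOff = (c: HandoffChecklist): boolean =>
  Object.values(c).every(Boolean);

// Rate as defined above: handoffs delivered / engagements completed.
// Paused engagements never reach `completed`, so they sit outside the denominator.
function handoffRate(completed: HandoffChecklist[]): number {
  return completed.filter(isHandedOff).length / completed.length;
}
```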
“47 apps rescued”
Count. 47, as of April 2026. Updated in the version history when it moves.
What counts as a rescue. A paid engagement with a signed scope, shipped fixes, and a completed handoff (see the filter sketch below). Specifically excluded:
- Free rescue diagnostics that did not convert to an engagement.
- Advisory calls and paid consultations without production code changes.
- Referrals we declined or passed on.
- Engagements still in progress (counted separately in our own tracking, not in the public number).
- Hardening or feature-build work on apps we previously rescued — those are counted as follow-on engagements, not new rescues.
Date range: first rescue January 2025; cutoff for the 47 figure is 17 April 2026. The number includes both Lovable rescues and rescues on other platforms; the 92% figure above is specifically a Lovable-sample subset.
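The same counting rules can be read as a filter. This sketch is hypothetical: `Engagement` and its fields are illustrative names, not our internal tracking schema.

```ts
type Engagement = {
  paid: boolean;
  scopeSigned: boolean;
  fixesShipped: boolean;
  handoffComplete: boolean;
  kind: "rescue" | "diagnostic" | "advisory" | "follow_on";
  status: "completed" | "in_progress" | "declined";
};

// Mirrors the exclusions above: diagnostics, advisory calls, declined
// referrals, in-progress work, and follow-on engagements never enter the count.
const countsAsRescue = (e: Engagement): boolean =>
  e.kind === "rescue" &&
  e.status === "completed" &&
  e.paid &&
  e.scopeSigned &&
  e.fixesShipped &&
  e.handoffComplete;

// The public figure is then engagements.filter(countsAsRescue).length: 47 at the cutoff.
```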
Case study composites
Some case studies on this site are labelled composite. That means they combine details drawn from multiple real engagements, anonymised, time-shifted, and with identifying specifics changed, to illustrate a rescue pattern without identifying any single client. Composites are used because most of our clients are pre-launch founders who have not agreed to be named publicly.
What is real in a composite case study: the failure modes, the technical fix path, the approximate timelines, the approximate platform mix. What is changed: client names, industries (directionally preserved), specific dollar figures, exact dates. Composites are always labelled as composites in the page header. We do not present composites as single named clients.
As named clients publish their own stories and consent to being cited, we retire the corresponding composite and replace it with the real one. This transition happens quietly on individual case study pages; the version history at the bottom of this methodology page notes each transition.
Version history
Dated entries for every meaningful methodology change. Most recent first.
- 2026-04-18
Initial publication. 92% figure based on ~50 Lovable engagements. 19-day median time-to-production. 100% handoff rate against the definition above. 47 total apps rescued. Composite case study convention documented.
Claims derived from this methodology are collected on the research page. For the vocabulary used here, see the glossary. Author: Hyder Shah.