Base44 is now adding $1M ARR every 2.5 days, and the AI coding debate is still on
A 2-Person Team Hit $1M ARR Building a Financial OS for GPT Wrappers
A few months ago, Wix quietly snapped up Base44, an AI coding startup that had only been around for six months, had raised zero funding, and was built by a single founder.
The deal price? $80 million. No surprise it dominated tech headlines.
At the time of acquisition, Base44 reported $3.5M in ARR, 250K users, 9 employees (including the founder), and $189K in profit.
Other AI coding products were already scaling fast — for example, Lovable claims it’s adding $8M in new ARR every month.
But according to Base44’s founder Maor Shlomo, the numbers look very different now. Since joining Wix, Base44 is adding roughly $400K in new ARR per day — or about $1M every 2.5 days — and that growth is accelerating week over week.
Most of these new users, he notes, are not coming from Wix’s existing base. On the product side, Base44 has been busy shipping features:
Reasoning for every message, making it smarter at handling complex edits.
An alpha infrastructure for building autonomous apps, enabling AI agents to be embedded directly inside Base44 applications.
Improved security scanning, capable of catching both XSS vulnerabilities and exposed API keys. Previously, many apps shipped with misconfigurations despite warnings. Now, security scans are built into every iteration, making such mistakes nearly impossible.
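To make the security-scanning point concrete, here is a minimal sketch of the kind of exposed-key check such a scan might run. The patterns, names, and thresholds are illustrative assumptions, not Base44's actual implementation.

```typescript
// Hypothetical sketch only: patterns, names, and thresholds are assumptions,
// not Base44's actual scanner.
const SECRET_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "OpenAI-style key", pattern: /sk-[A-Za-z0-9]{20,}/ },
  { name: "AWS access key", pattern: /AKIA[0-9A-Z]{16}/ },
  { name: "hard-coded api_key assignment", pattern: /api[_-]?key\s*[:=]\s*["'][^"']{16,}["']/i },
];

// Scan a generated source file and report anything that looks like an exposed secret.
function scanForExposedSecrets(source: string): string[] {
  return SECRET_PATTERNS
    .filter(({ pattern }) => pattern.test(source))
    .map(({ name }) => `possible ${name} found in client-side code`);
}

// Example: this snippet would be flagged before the app ships.
const findings = scanForExposedSecrets('const key = "sk-abc123def456ghi789jkl012";');
if (findings.length > 0) console.warn(findings.join("\n"));
```

Running a check like this on every iteration, rather than only at deploy time, is what makes shipping an exposed key hard to do by accident.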
Growth has been so fast that customer support fell behind, forcing Base44 to quadruple its team size.
Still, Maor hasn’t disclosed costs, profitability, or ad spend. Users speculate Base44 has been pouring money into ads — its promotions are showing up on “almost every YouTube video.”
That raises a bigger industry question: are AI coding products actually profitable, or just subsidized?
Critics like Chris Paik of Pace Capital argue that many have PMF (product-market fit) but not yet BMF (business-model fit).
a16z Fires Back
Last week, a16z GPs Martin Casado and Sarah Wang published a rebuttal — “Questioning Margins is a Boring Cliché…” — dismissing margin skepticism as shortsighted.
Their argument: history shows low margins in the early days don’t mean unsustainability. Amazon, Netflix, and Uber were all criticized for thin margins but built massive, durable businesses.
AI apps, they argue, are not like past DTC subscription plays — they deliver stronger user value, higher retention, and faster enterprise expansion.
They laid out five key points:
Low margins aren’t forever. Pricing tiers, usage throttling, and routing to cheaper models can lift gross margins over time (see the routing sketch after this list).
High-cost users are manageable. A small set of power users drives most costs, and throttling doesn’t necessarily cut revenue. Enterprise customers are far more profitable — and harder to see in public data (something Replit’s founder has also noted).
The model market isn’t monopolized. Open source and multiple cloud vendors keep inference costs dropping — down 10–100x in the past 18 months, with more efficiency gains to come.
Subsidies don’t hide real value. PMF should be judged on conversion, retention, and enterprise expansion, not short-term gross margins.
AI apps aren’t “thin wrappers.” Strong applications differentiate via multi-model orchestration, proprietary data, and product-layer innovation — all of which can sustain pricing power.
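On the "routing to cheaper models" lever, here is a minimal sketch of cost-based routing. The model names, prices, and complexity heuristic are invented for illustration; they are not any vendor's actual routing policy.

```typescript
// Hypothetical cost-based router: model names, prices, and thresholds are
// illustrative assumptions, not any specific product's policy.
interface ModelOption {
  name: string;
  costPerMTok: number; // USD per million tokens (made-up figures)
}

const MODELS: ModelOption[] = [
  { name: "small-fast-model", costPerMTok: 0.3 },
  { name: "mid-tier-model", costPerMTok: 3.0 },
  { name: "frontier-model", costPerMTok: 15.0 },
];

// Send cheap requests to cheap models and reserve the expensive model for
// requests that actually need deep reasoning, lifting gross margin without
// throttling paying users.
function pickModel(promptTokens: number, needsReasoning: boolean): ModelOption {
  if (needsReasoning) return MODELS[2];
  if (promptTokens > 4_000) return MODELS[1];
  return MODELS[0];
}

console.log(pickModel(800, false).name);   // "small-fast-model"
console.log(pickModel(12_000, true).name); // "frontier-model"
```

The a16z argument is that margin improves as more traffic lands in the cheap branch, without the user-visible price ever changing.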
Cline Fires Back at a16z
Not everyone was impressed. Pash, head of AI at Cline, called the piece “garbage” and was stunned it took two a16z GPs to publish it:
“Sarah calls margin criticism a ‘boring cliché’… and then trots out Uber/DTC analogies we’ve been hearing for a decade. That debate ended 15 years ago. Totally missing the point.”
He argued the real issue is how closed apps force users through proprietary inference funnels and then call the resulting throughput "ARR."
Meanwhile, open platforms — even with equal or higher usage — don’t get credit because they let developers bring their own API keys.
Nick, Cline’s perception czar, laid out the fuller critique in his essay “We Have No Idea How to Value the AI App Layer.” His key points:
Throughput isn’t ARR. AI apps often report ARR numbers that scale linearly with inference (token usage), making them closer to commodity flow than software revenue.
Open source has structural advantages. BYO-inference shifts costs off the platform, clarifies revenue, and better aligns incentives.
Transparency and control matter. Users should be able to choose inference strategies (speed vs. cost) and see model choices, prices, and downgrades clearly.
Accounting must be clean. Don’t call total token volume “ARR.” Instead, break out GTV (Gross Token Volume), net revenue, contribution margin, and so on (a worked example follows this list).
Incentives must align. Platforms should profit from improving software, not from locking users into hidden routing strategies.
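To make the accounting point concrete, here is a worked sketch with invented numbers showing how GTV, net revenue, and contribution margin diverge for a platform that marks up pass-through inference. None of these figures are any company's reported numbers.

```typescript
// All figures are invented to illustrate the accounting split Nick proposes.
const grossTokenVolume = 1_000_000;                    // GTV: USD users paid for tokens
const inferenceCost = 800_000;                         // passed through to model providers
const netRevenue = grossTokenVolume - inferenceCost;   // 200,000 the platform keeps
const variableCosts = 50_000;                          // support, infra, payment fees (assumed)
const contributionMargin = netRevenue - variableCosts; // 150,000

// Reporting the full $1M as "ARR" would overstate the business roughly 5x
// relative to the revenue the platform actually retains.
console.log({ grossTokenVolume, netRevenue, contributionMargin });
```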
Nick also warned of a creeping problem: silent downgrades. Some tools quietly swap users from high-performing models to cheaper ones to protect margins, lowering quality without disclosure. That’s not optimization; it’s deception.
His analogy: “Payments companies don't call total payment volume ARR. Marketplaces don't call GMV ARR. Electricity retailers don't call kilowatt‑hours ARR. AI apps built on paid models should not call pass‑through token spend ARR.”
👉 This clash — Base44’s explosive growth, Paik’s skepticism, a16z’s bullish defense, and Cline’s sharp rebuttal — cuts to the heart of today’s AI application layer: are these businesses software companies, utilities, or something in between? It’s a debate worth watching.
A 2-Person Team Building a Financial OS for GPT Wrappers Hit $1M ARR
On the financial side, a two-person team built a Financial OS for GPT Wrappers and hit $1M ARR in ten months, without a sales team. More than a business milestone, it’s a bold statement about the future of work itself. I really like their idea and what they do.





