What AI billing can learn from telecoms and fintechs

Mar 3, 2026

Arnon Shimoni

Lots of AI companies are rebuilding billing infrastructure from scratch, thinking it's a simple engineering problem that can be vibe-coded away.

Then come custom metering pipelines, homegrown credit systems, wallet logic held together with application code and spreadsheets. And prayers.

These aren't new problems.

Telecoms solved prepaid credit drawdown, pooled minutes, rollover, and fair-use throttling in the 1990s. Payments processors solved transaction-level margin tracking, interchange pass-through, and multi-corridor pricing before most AI founders were born.

The patterns are identical. The vocabulary is different. And that vocabulary gap is costing AI companies millions in engineering time they didn't need to spend.


| The problem | Telecoms (1990s) | Fintechs (2000s) | AI companies (2020s) |
| --- | --- | --- | --- |
| Variable cost per unit | Spectrum & tower load per call | Interchange per transaction | Inference cost per request |
| Prepaid balance drawdown | Prepaid minutes & SMS | | Credits & token wallets |
| Multi-rate consumption | Local vs. roaming vs. international | Domestic vs. cross-border, debit vs. credit | GPT-5 vs. mini vs. image vs. embedding |
| Pooled usage | Family plans | | Org-level credit pools |
| Heavy-user economics | 5% of users = 50% of network load | Fraud & chargeback concentration | 5% of users = 75% of compute cost |
| Bill shock / trust erosion | Overage charges on data plans | Unexpected FX fees | Usage spikes on AI credits |
| Hybrid bundle | Base plan + included minutes + overage | Platform fee + take-rate + per-txn | Subscription + included credits + overage |
| Margin visibility | Per-call | Per transaction | Per-request, per-model |

The credit problem is a prepaid minutes problem

When OpenAI sells API credits, or ElevenLabs sells credits for voice generation, they're running the same economic model as a prepaid mobile carrier.

When a customer loads value into an account, that value gets consumed at different rates depending on what they use. Some actions cost more than others. Balances expire. Customers want rollover.

Later, finance departments need to recognize revenue as credits get consumed, not necessarily when they're purchased.

Again, telecoms figured this out decades ago. Running the prepaid model at scale required:

  1. Multi-rate drawdown. One minute of a local call costs X; one minute of roaming costs 5X. The system draws from the same balance at different rates depending on the event. AI credits work the same way: one API call to Gemini 3 Pro costs a different number of credits than one call to Nano Banana. The billing system needs to resolve this per event, not per plan.

  2. Balance pooling. Family plans let multiple users draw from one shared pool, with per-user caps to prevent one teenager from burning the entire allocation on roaming data. AI companies are hitting the same problem now. Per-user credit allotments create stranded assets: credits locked to individuals who don't use them. Org-level pools with per-user guardrails solve it. Again, telecoms learned this back in the early 2000s.

  3. Expiry and rollover rules. T-Mobile's "rollover data" became a competitive advantage because customers hated losing unused balances. AI companies are learning the same thing: credits that expire without clear rules erode trust. Credits that roll over without limits create revenue recognition nightmares. Lovable just started introducing these…
    The design space between these two extremes is well-mapped by telecoms. AI companies are re-mapping it from zero.

  4. Fair-use policies. Unlimited plans were never really unlimited. Carriers introduced throttling thresholds and deprioritization because a tiny fraction of users consumed half of the network capacity. AI companies face the exact same dynamic. One AI vendor's top 5% of users consumed 75% of compute costs while paying the same flat fee. The solution isn't "charge more." It's architectural: metering, tiered consumption rates, and transparent fair-use guardrails. All invented by telecoms!
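To make the first two items concrete, here is a minimal sketch of multi-rate drawdown from a shared org pool with a per-user cap. The event names, credit rates, and the `CreditPool` class are all hypothetical, chosen only for illustration; a real system would also need periods, persistence, and concurrency control.

```python
from dataclasses import dataclass, field

# Hypothetical rate card: credits charged per event type (illustrative numbers).
CREDITS_PER_EVENT = {
    "small_model_call": 1,
    "large_model_call": 5,
    "image_generation": 20,
}


class InsufficientCredits(Exception):
    """Raised when the pool or the user's cap cannot cover an event."""


@dataclass
class CreditPool:
    balance: int        # credits shared by the whole org
    per_user_cap: int   # guardrail: max credits one user may draw
    used_by_user: dict = field(default_factory=dict)

    def draw(self, user_id: str, event_type: str, quantity: int = 1) -> int:
        """Resolve the rate per event, then draw from the shared balance."""
        cost = CREDITS_PER_EVENT[event_type] * quantity
        used = self.used_by_user.get(user_id, 0)
        if used + cost > self.per_user_cap:
            raise InsufficientCredits(f"{user_id} would exceed the per-user cap")
        if cost > self.balance:
            raise InsufficientCredits("org pool is exhausted")
        self.balance -= cost
        self.used_by_user[user_id] = used + cost
        return cost


pool = CreditPool(balance=100, per_user_cap=30)
pool.draw("alice", "large_model_call", quantity=2)  # 2 events at 5 credits each
pool.draw("bob", "image_generation")                # 1 event at 20 credits
print(pool.balance)  # 70
```

The rate is looked up per event rather than per plan, and the cap is checked before the shared balance is touched, so one heavy user cannot drain the pool.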

What I take from this is that AI pricing is a methodology. It's a structure. The data model for prepaid credit drawdown (wallets, events, multi-rate consumption, pooling, expiry) is a solved problem. AI companies that build this from scratch are spending 6–12 months and 2–3 engineers on something that should be infrastructure.

The margin problem is an interchange problem

Payments processors don't have uniform costs, because a domestic debit transaction costs them less than a cross-border credit card transaction. Interchange rates vary by card type, region, merchant category, authentication method.

A single 2.9% + $0.30 blended rate hides massive margin variation underneath, and AI inference costs work the same way.

An Opus 4.6 call costs more than a Sonnet 4.6 call, just like a 4,000-token completion costs more than a 200-token completion. A batch job during off-peak hours might get cheaper compute than a synchronous request during peak.

Payments solved this with interchange-plus pricing: they pass through the variable cost transparently, then add a fixed markup. The billing system tracks cost-per-transaction alongside revenue-per-transaction, so finance can see margin at the corridor level, not just the blended level.
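A small worked example shows how much a blended rate can hide. The interchange rates and the $0.40 markup below are made up for illustration (they are not real scheme fees); only the 2.9% + $0.30 blended rate comes from the text above.

```python
from decimal import Decimal

CENT = Decimal("0.01")

# Hypothetical interchange cost rates per corridor (illustrative only).
INTERCHANGE_RATE = {
    "domestic_debit": Decimal("0.005"),
    "cross_border_credit": Decimal("0.025"),
}


def blended_fee(amount: Decimal) -> Decimal:
    """Customer-facing blended rate: 2.9% + $0.30."""
    return (amount * Decimal("0.029") + Decimal("0.30")).quantize(CENT)


def corridor_cost(amount: Decimal, corridor: str) -> Decimal:
    """Variable cost the processor pays for this corridor."""
    return (amount * INTERCHANGE_RATE[corridor]).quantize(CENT)


def interchange_plus_fee(amount: Decimal, corridor: str,
                         markup: Decimal = Decimal("0.40")) -> Decimal:
    """Pass the variable cost through, then add a fixed markup."""
    return corridor_cost(amount, corridor) + markup


amount = Decimal("100.00")
for corridor in INTERCHANGE_RATE:
    cost = corridor_cost(amount, corridor)
    print(
        corridor,
        "| blended margin:", blended_fee(amount) - cost,
        "| interchange-plus margin:", interchange_plus_fee(amount, corridor) - cost,
    )
```

Under these assumed rates, the same $3.20 blended fee leaves $2.70 of margin on a domestic debit transaction and $0.70 on a cross-border credit one, while interchange-plus keeps the margin at the fixed markup in both corridors.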

Most AI companies today operate on blended pricing. They charge $20/month or $X per 1,000 credits, but they don't track cost-per-request alongside revenue-per-request at the billing layer. Margin gets calculated quarterly: engineering pulls logs, finance pulls invoices, and someone reconciles the two in a spreadsheet.

This is how payments companies operated in the early 2000s.

What payments infrastructure taught us:

  1. Cost metadata belongs in the billing event. Every billable event should carry both the price charged and the cost incurred. This is how you detect margin compression before it kills you.

  2. Rate cards, not hard-coded prices. Payments processors don't hard-code interchange rates. They maintain rate tables that map event attributes (card type, region, channel) to costs. When Visa changes interchange, you update a table, not application code. AI companies should map model, token count, and request type to cost rates the same way. When you switch from one foundation model to another, or when OpenAI changes its API pricing, your billing system absorbs it through configuration.
    Snowflake does rate cards really, really well. A little too well, if you ask me.

  3. Blended pricing is a sales tool, not a finance tool. You can and should charge customers a simple blended rate. Internally, your billing infrastructure needs to resolve margin per-event.
    In short: blended pricing for the customer, interchange-plus visibility for you as the operator.
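The three lessons above can be sketched together: a cost rate card keyed by event attributes, a billing event that carries both price and cost, and a per-model margin rollup. The model names, rates, and the `BillingEvent` shape are hypothetical; treat this as one possible layout, not a reference design.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical cost rate card: (model, request_type) -> USD per 1,000 tokens.
# In practice this would live in configuration, not in application code.
COST_RATE_CARD = {
    ("large-model", "sync"): Decimal("0.015"),
    ("large-model", "batch"): Decimal("0.0075"),
    ("small-model", "sync"): Decimal("0.003"),
}

# Illustrative customer-facing blended price per 1,000 tokens.
BLENDED_PRICE_PER_1K = Decimal("0.012")


@dataclass(frozen=True)
class BillingEvent:
    model: str
    request_type: str
    tokens: int
    price_charged: Decimal  # what the customer pays
    cost_incurred: Decimal  # what the request cost to serve


def rate_event(model: str, request_type: str, tokens: int) -> BillingEvent:
    """Attach both price and cost to the event at rating time."""
    units = Decimal(tokens) / 1000
    return BillingEvent(
        model=model,
        request_type=request_type,
        tokens=tokens,
        price_charged=units * BLENDED_PRICE_PER_1K,
        cost_incurred=units * COST_RATE_CARD[(model, request_type)],
    )


def margin_by_model(events) -> dict:
    """Resolve margin per model instead of only at the blended level."""
    totals = defaultdict(Decimal)
    for event in events:
        totals[event.model] += event.price_charged - event.cost_incurred
    return dict(totals)


events = [
    rate_event("large-model", "sync", 4000),
    rate_event("small-model", "sync", 200),
]
print(margin_by_model(events))
```

With these assumed numbers, the large-model request loses money under the blended price while the small-model request is profitable, which is exactly the kind of variation a single blended figure conceals.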

The hybrid model is a telecom bundle

The mobile industry spent 20 years figuring out how to bundle subscriptions with variable usage. The result: a base plan (your monthly subscription), included usage (X minutes, Y GB), overage charges (per-unit beyond the included amount), and add-ons (international calling packs, roaming bundles).

AI SaaS pricing in 2026 looks identical: a platform fee (your monthly subscription), included credits or usage (X API calls, Y image generations), overage charges (per-unit beyond the included amount), and add-ons (premium models, priority compute, dedicated capacity).

And the data backs this up. Hybrid pricing isn't experimental anymore. It's the default.

Notice the tension in the ICONIQ data: 46% of customers want consumption-based pricing, while 40% want predictable pricing. Those sound contradictory, but they're not. It's the same tension telecoms resolved with the hybrid bundle: a predictable base that covers typical usage, plus consumption charges for anything above it. Customers get a floor they can budget against and a ceiling that scales with actual value.
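The hybrid bundle reduces to simple arithmetic: a base fee, an included allowance, and a per-unit charge above it. A minimal sketch, with a hypothetical `hybrid_invoice` helper and illustrative prices:

```python
from decimal import Decimal


def hybrid_invoice(base_fee: Decimal, included_credits: int,
                   credits_used: int, overage_price: Decimal) -> dict:
    """Base fee plus per-credit overage beyond the included allowance."""
    overage_units = max(0, credits_used - included_credits)
    overage = overage_units * overage_price
    return {
        "base": base_fee,
        "overage_units": overage_units,
        "overage": overage,
        "total": base_fee + overage,
    }


# $20 base, 1,000 credits included, $0.02 per extra credit (all illustrative).
invoice = hybrid_invoice(Decimal("20.00"), 1000, 1250, Decimal("0.02"))
print(invoice["total"])  # 25.00
```

A customer who stays within the allowance pays exactly the base fee, which is the predictable floor; the overage line is the part that scales with usage.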

Telecoms also learned the hard way that hybrid billing creates specific operational problems:

  • The proration problem. Customer upgrades mid-cycle from a 5GB plan to a 20GB plan. What happens to the unused data from the old plan? Do credits from the old tier carry over? Is the new tier prorated from today, or does it start next billing cycle? Every AI company running hybrid pricing will hit this. Telecoms standardized proration logic over decades of customer complaints.

  • The bill shock problem. Overage charges destroy customer trust if they arrive without warning. Telecoms learned to send usage alerts at 50%, 80%, and 100% of included usage. AI companies that skip this step get the same result: customer churn driven not by product dissatisfaction, but by billing surprise. The core anxiety of consumption billing is the unknown.

  • The "cancel and replace" problem. When a customer moves from a self-serve plan to an enterprise contract, telecoms learned you never cancel and recreate. You migrate. The customer's history, usage, and balance continuity matter.
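Two of these problems are easy to sketch. The first function fires an alert only when usage newly crosses 50%, 80%, or 100% of the included amount; the second applies one possible proration policy (charge the price difference for the days remaining in the cycle). Both function names and the proration policy are assumptions for illustration; real systems choose among several proration rules.

```python
from decimal import Decimal

ALERT_THRESHOLDS = (50, 80, 100)  # percent of included usage, as in the text


def newly_crossed(previous_used: int, current_used: int, included: int) -> list:
    """Thresholds crossed by this update, so each alert is sent only once."""
    return [
        t for t in ALERT_THRESHOLDS
        if previous_used * 100 < t * included <= current_used * 100
    ]


def prorated_upgrade_charge(old_price: Decimal, new_price: Decimal,
                            days_remaining: int, days_in_cycle: int) -> Decimal:
    """Assumed policy: charge the price difference for the remaining days."""
    charge = (new_price - old_price) * days_remaining / days_in_cycle
    return charge.quantize(Decimal("0.01"))


print(newly_crossed(450, 820, included=1000))   # [50, 80]
print(newly_crossed(820, 900, included=1000))   # []
print(prorated_upgrade_charge(Decimal("20"), Decimal("50"), 10, 30))  # 10.00
```

Comparing the previous and current usage, rather than only the current value, is what keeps a customer sitting at 85% from being alerted on every new event.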

AI companies doing PLG-to-SLG conversions are hitting this now. A self-serve user spending $500/month on credits wants to upgrade to enterprise. If your system can't preserve their existing usage and layer on new contract terms in a single operation, you've created a 6-week deal cycle for what should be a same-day conversion.
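In data-model terms, "migrate, don't cancel and recreate" means the account identity, balance, and usage history survive the plan change. A minimal sketch, assuming a hypothetical `Account` record and contract terms:

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Account:
    account_id: str
    plan: str
    credit_balance: int
    usage_history: tuple                    # (period, credits_used) pairs
    contract_terms: Optional[dict] = None   # only set for enterprise deals


def migrate_to_enterprise(account: Account, terms: dict) -> Account:
    """Layer contract terms onto the same account in a single operation."""
    return replace(account, plan="enterprise", contract_terms=terms)


self_serve = Account(
    account_id="acct_123",
    plan="self-serve",
    credit_balance=4200,
    usage_history=(("2026-01", 9100), ("2026-02", 10400)),
)
enterprise = migrate_to_enterprise(self_serve, {"annual_commit_usd": 12000})
print(enterprise.plan, enterprise.credit_balance)  # enterprise 4200
```

Nothing is cancelled or re-created here: only the plan and contract terms change, so balance and usage continuity come for free.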

Billing architecture from 2026 onwards

None of these lessons require you to actually study telecom billing systems or payments processing manuals, but they do require you to recognize that the problems you're solving aren't new, and the architectural patterns that solve them are known.

Three principles borrowed from industries that already figured this out:

1. Events first, pricing second. Telecoms and payments both separate the event stream (calls made, transactions processed) from the pricing logic (rate plans, interchange tables). Your usage events should flow into your billing system as raw data. The mapping from usage to money happens in configuration, not in code. This is what lets finance change pricing without waiting for engineering.
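One way to picture the separation: the event stream is raw quantities, the pricing is plain data, and the same events can be re-rated under a new configuration without touching the rating code. The meter names and prices here are hypothetical.

```python
from decimal import Decimal

# Raw usage events: quantities only, no money attached.
RAW_EVENTS = [
    {"customer": "acme", "meter": "api_call", "quantity": 1200},
    {"customer": "acme", "meter": "image_generation", "quantity": 30},
]

# Pricing as configuration (illustrative unit prices in USD).
PRICING_V1 = {"api_call": Decimal("0.002"), "image_generation": Decimal("0.04")}
PRICING_V2 = {"api_call": Decimal("0.0015"), "image_generation": Decimal("0.05")}


def rate(events, pricing) -> Decimal:
    """Map usage to money using whatever pricing config is passed in."""
    return sum(
        (pricing[event["meter"]] * event["quantity"] for event in events),
        Decimal("0"),
    )


print(rate(RAW_EVENTS, PRICING_V1))  # 3.600
print(rate(RAW_EVENTS, PRICING_V2))  # 3.3000
```

Because `rate` never mentions a price, a pricing change is a new dictionary (or table row), not a deploy.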

2. Cost and revenue travel together. In payments, every settled transaction carries both the merchant fee collected and the interchange cost incurred. AI billing should do the same. Every metered event should carry the inference cost alongside the price charged. Without this, margin visibility is always a lagging, manual exercise.

3. Wallets are primitives, not features. Telecoms didn't bolt prepaid balance management onto a postpaid billing system. They built wallets as a foundational data structure: deposit, drawdown, transfer, expire, rollover. AI credit systems need the same treatment. Credits are ledger entries with rules, not a "balance" field on a customer record.
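As a rough illustration of "ledger entries with rules", the sketch below keeps an append-only list of entries, derives the balance from it, and draws down the soonest-expiring credits first. The `Wallet` and `LedgerEntry` names and the expiry-first rule are assumptions; transfer and rollover are left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LedgerEntry:
    kind: str                          # "deposit" or "drawdown"
    amount: int                        # positive for deposits, negative for drawdowns
    lot_id: int                        # the deposit this entry belongs to
    expires_on: Optional[date] = None  # only meaningful on deposits


@dataclass
class Wallet:
    entries: list = field(default_factory=list)  # append-only ledger

    def deposit(self, amount: int, expires_on: Optional[date] = None) -> int:
        lot_id = len(self.entries)  # unique because entries are never removed
        self.entries.append(LedgerEntry("deposit", amount, lot_id, expires_on))
        return lot_id

    def _lot_remaining(self, lot_id: int) -> int:
        return sum(e.amount for e in self.entries if e.lot_id == lot_id)

    def _open_lots(self, today: date) -> list:
        """Unexpired deposits with credits left, soonest expiry first."""
        lots = [
            e for e in self.entries
            if e.kind == "deposit"
            and (e.expires_on is None or e.expires_on >= today)
            and self._lot_remaining(e.lot_id) > 0
        ]
        return sorted(lots, key=lambda e: (e.expires_on is None, e.expires_on or date.max))

    def balance(self, today: date) -> int:
        """The balance is derived from the ledger, never stored as a field."""
        return sum(self._lot_remaining(e.lot_id) for e in self._open_lots(today))

    def drawdown(self, amount: int, today: date) -> None:
        if amount > self.balance(today):
            raise ValueError("insufficient credits")
        for lot in self._open_lots(today):
            take = min(amount, self._lot_remaining(lot.lot_id))
            self.entries.append(LedgerEntry("drawdown", -take, lot.lot_id))
            amount -= take
            if amount == 0:
                break


wallet = Wallet()
wallet.deposit(100, expires_on=date(2026, 3, 31))
wallet.deposit(50, expires_on=date(2026, 6, 30))
wallet.drawdown(120, today=date(2026, 3, 15))
print(wallet.balance(today=date(2026, 3, 15)))  # 30
```

Because every change is an entry tied to a deposit lot, the same ledger can answer finance questions (which credits were consumed, which expired unused) that a single balance field cannot.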

| Billing layer | Solved by | Era | AI equivalent |
| --- | --- | --- | --- |
| Prepaid wallets + drawdown | Telecoms | 1990s | Credit systems |
| Multi-rate consumption | Telecoms | 2000s | Per-model pricing |
| Pooling + fair-use caps | Telecoms | 2000s | Org pools + guardrails |
| Transaction-level cost tracking | Payments | 2000s | Inference cost metadata |
| Rate table configuration (not code) | Payments | 2010s | Pricing-as-config |
| Hybrid bundles (base + variable) | Both | 2010s | Platform fee + credits + overage |
| Plan migration without cancel-and-replace | Telecoms | 2010s | PLG-to-SLG conversion |

The irony for me is that most AI companies would never build their own payments processing or telecom switching infrastructure. They'd use Adyen, Stripe, even PayPal or Twilio. Yet they may well spend 12 months building billing logic that those industries solved and productized years ago.

Will you build this yourself, or recognize that it already exists? Solvimon can help. Reach out.