For AI-native consumer apps to be truly ubiquitous we need: ARPU > Average Inference Cost Per User. How far away are we from that?
Ideally AI-native apps could hit APIs on every screen and pay for it all just by throwing some ads on top. But I think we're a while away:
— We're probably >10x off right now. Monthly ARPU is maybe $2-5, while token costs for an AI-heavy app might run $20-50
— Much of consumer is global. Even if the economics work in the US/EU, it'll be a while before we can serve the broad base
— More importantly, every time AI improves, consumers demand more. No one wants to talk to a last-gen AI character. If video gets good, they'll want video everywhere
— It may be that we need major innovations in small models or new mobile hardware to get free local inference. But that still won't be as good as SOTA cloud LLMs
No wonder so many products focus on productivity, and on prosumers who will pay $100s or $1000s for work-related tasks. That's where you get huge ARPUs and benefit from being SOTA. And there seems to be no limit on token demand… so why do the low end?
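The ">10x off" claim can be sanity-checked with a quick back-of-the-envelope calc; a minimal sketch, using only the rough ARPU and cost ranges stated above (both are assumptions, not measured data):

```python
# Gap between consumer ARPU and inference cost per user,
# using the rough ranges from the thread (all figures assumed).
monthly_arpu = (2.0, 5.0)        # $/user/month, ad-supported consumer app
monthly_inference = (20.0, 50.0) # $/user/month, token costs for an AI-heavy app

best_case = monthly_inference[0] / monthly_arpu[1]   # cheap usage, high ARPU: 4x
worst_case = monthly_inference[1] / monthly_arpu[0]  # heavy usage, low ARPU: 25x
midpoint = 35.0 / 3.5                                # range midpoints: 10x

print(f"gap: {best_case:.0f}x to {worst_case:.0f}x (midpoint ~{midpoint:.0f}x)")
# → gap: 4x to 25x (midpoint ~10x)
```

So ">10x" holds at the midpoints of both ranges, and the worst case is closer to 25x.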
The replies are doing math: @mlchild calculates that $2 ARPU = 200 ad impressions at a $30 CPM with 30% fill — "we might need 50x cost compression." Chen responds: "yeah and $30 CPM is US brand video, not remnant/DR which is $2-3 CPM." Jevons paradox keeps getting invoked — cheaper inference means users demand more of it. @ArelAvellino: "This is why the play is bolting AI onto businesses that already make money."