UNICORNE | WHITEPAPER
as actionable resource-level insights, rather than service-level aggregates, and optimisation recommendations prioritised by quick wins versus long-term architectural changes.
The Gen AI layer of complexity
Traditional infrastructure costs scale linearly: provision more capacity, pay proportionally more. AI workloads behave differently, and the disconnect can catch even sophisticated teams off guard.
In a traditional setup, businesses can predict the monthly cost of an EC2 instance. With Gen AI, usage can scale unpredictably into billions of tokens, and expenses accumulate much faster than expected.
KEY TAKEAWAY
Quick wins requiring no downtime deliver immediate savings visible in the next billing cycle. One client achieved 60% of its total savings in the first week through reserved instances, unused resource cleanup and storage optimisation. That momentum builds trust for larger changes.
Most implementations resend the entire conversation history with every exchange to maintain context. Éric calls this “conversation creep”, where a 10-turn dialogue doesn’t cost 10 times a single exchange but something closer to 55 times (1 + 2 + 3 + 4 + ... + 10).
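The arithmetic behind conversation creep can be sketched in a few lines of Python. This is a simplified illustration, not a billing model: it assumes every turn adds the same amount of text and that the full history is resent each time. The function names are hypothetical.

```python
def resent_units(turns: int) -> int:
    """Units processed over a whole conversation when every request
    resends all earlier turns (turn 1 sends 1 unit, turn 2 sends 2, ...)."""
    return sum(range(1, turns + 1))


def creep_multiplier(turns: int) -> float:
    """Ratio between the resend-everything total and a design that
    sends only the newest turn each time (one unit per turn)."""
    return resent_units(turns) / turns


if __name__ == "__main__":
    # Compare short and long conversations under the same assumption.
    for turns in (1, 5, 10, 20):
        print(turns, resent_units(turns), creep_multiplier(turns))
```

Under these assumptions a 10-turn dialogue processes 55 single-exchange units, against 10 if history were not resent, and the gap grows quadratically as conversations lengthen.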
“Outbound tokens are several times more expensive than inbound ones, images cost more than text and audio can be even pricier,” Éric notes. “Teams miss the hidden costs of integration – vector databases, caching systems and security guardrails – that all scale directly with usage.”
Teams can also overlook the huge variations in pricing between models, where two models with apparently similar performance can vary in cost by a factor of 10. Anthropic Claude costs roughly 10 times more than Amazon Nova, for example, yet for many use cases the less expensive model performs adequately. Using less expensive models for routine queries while reserving premium models for complex reasoning allows organisations to balance cost and quality effectively.
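A minimal sketch of this tiered approach is shown here, assuming a hypothetical two-tier setup. The prices are placeholder units chosen only to mirror the roughly tenfold gap described above, not real vendor rates. The `needs_complex_reasoning` flag stands in for whatever classification logic a team actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    price_per_million_tokens: float  # hypothetical units, not real pricing


# Placeholder prices reflecting an assumed 10x gap between tiers.
ECONOMY = ModelTier("economy", 1.0)
PREMIUM = ModelTier("premium", 10.0)


def choose_tier(needs_complex_reasoning: bool) -> ModelTier:
    """Send routine queries to the cheaper tier, complex ones to premium."""
    return PREMIUM if needs_complex_reasoning else ECONOMY


def total_cost(requests: list[tuple[int, bool]]) -> float:
    """Sum the cost of (token_count, needs_complex_reasoning) requests."""
    return sum(
        tokens / 1_000_000 * choose_tier(is_complex).price_per_million_tokens
        for tokens, is_complex in requests
    )


if __name__ == "__main__":
    # Assumed mix: 8 routine and 2 complex requests of one million tokens each.
    mixed = [(1_000_000, False)] * 8 + [(1_000_000, True)] * 2
    all_premium = [(tokens, True) for tokens, _ in mixed]
    print(total_cost(mixed), total_cost(all_premium))
```

With this assumed mix, routing costs 28 units against 100 for sending everything to the premium tier. The actual saving depends on how much traffic genuinely needs the more capable model.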
Building FinOps maturity
When cloud costs spiral unexpectedly, the financial impact is only part of the problem. Teams become cautious about experimentation, leadership hesitates on new initiatives and development velocity decreases.
“The real damage isn’t just the money lost, it’s the loss of confidence,” Éric explains. “When a CTO hesitates to invest further in AI or cloud, the company risks
116 November 2025