Building global finance ops in public.
Field notes from running a multi-country finance platform - what we shipped, what broke, what we're still arguing about.
12 min read
Production LLM Cost Optimization: Model Choice, Prompt Caching & Output Engineering
Every LLM call in a hot path is a recurring bill — and cutting it usually means quietly trading away accuracy, speed, or insight. This is a practical guide to the four-way trade behind LLM cost optimization, and the three techniques — model choice, prompt caching, output engineering — that move it on purpose. With a worked, measured example.
llm, cost-optimization, production, ai-engineering, prompt-caching, observability, inference
72 min read
Software Architecture Has No Rigor — and AI Feeds on It
Every serious engineering field can verify a design before it's built. Software architecture can't — and an AI generating against an unverifiable spec will hallucinate, duplicate, and reorder with impunity. Here's the method I use at Guliel: model software as what it actually is — data, transformations, and a place to run — into a category you can check.
architecture, software-design, category-theory, ai, engineering