> We switched to the "triager" pattern: a Haiku agent with a very specific and narrow job. Is this issue already tracked or not? If it is, stop right there. If not, escalate to Opus.
> 4 out of 5 failures never reach Opus. A triager match costs around 25x less than a full investigation.
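The routing logic the quote describes can be sketched in a few lines. This is a minimal illustration, not the article's actual code: the `llm(model, prompt)` helper, the model names, and the yes/no contract for the triage answer are all assumptions.

```python
# Sketch of the "triager" pattern quoted above. The llm() helper and the
# yes/no triage contract are hypothetical stand-ins for whatever the
# article's pipeline actually uses.

def handle_failure(failure_report, known_issues, llm):
    # Cheap pass: ask the small model one narrow question.
    verdict = llm(
        model="haiku",
        prompt=(
            "Is this failure already tracked? Answer yes or no.\n"
            f"Failure: {failure_report}\nKnown issues: {known_issues}"
        ),
    )
    if verdict.strip().lower().startswith("yes"):
        return "deduplicated"  # stop right there; Opus never runs
    # Only novel failures escalate to the expensive model.
    return llm(model="opus", prompt=f"Investigate this failure:\n{failure_report}")
```

The quoted numbers are easy to sanity-check: if triage costs 1/25 of an investigation and stops 4 out of 5 failures, the expected cost per failure is roughly 1/25 + 0.2 ≈ 0.24 of always running the full investigation.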
The title ("Let a cheap agent decide if the expensive one is needed") feels misleading. Why clickbait when you can just be genuine about the architecture?
I do something similar with a "planner agent" that uses the cheapest model available (I think it's openai-gpt-5.2-mini or something, at around 20 cents per 1M tokens). It emits a plan name and a task list, and each task carries a recommended model. It's not perfect, but many of our tasks are accomplished with lighter-weight models: we upgrade to a more expensive model for code generation or fixing, while planning and decisions are done more cheaply. Keep in mind our tasks are relatively constrained, so planning with a cheap agent makes sense here; an open-ended agent would likely need a more expensive call for planning.
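The plan-with-per-task-model-hints idea above might look something like this. The schema, tier names, and routing table are my guesses at the shape, not the commenter's actual format:

```python
# Hypothetical sketch of a planner output where a cheap model emits a
# plan and each task carries a recommended model tier. Tier names and
# the routing table are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    recommended_model: str  # e.g. "cheap" for decisions, "expensive" for codegen


@dataclass
class Plan:
    name: str
    tasks: list[Task]


def route(task: Task) -> str:
    # Dispatch each task to the tier the (cheap) planner recommended.
    return {"cheap": "mini-tier", "expensive": "frontier-tier"}[task.recommended_model]
```

The interesting property is that the expensive model only ever sees the tasks the cheap planner flagged for it, so the per-task routing decision itself never costs frontier-model money.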
Looking at the diagram, is this seriously a case of handing basic functional concepts like "write to ClickHouse" or "have we seen this before?" to a model? Couldn't those be actual function calls in some language?
It just seems wasteful all around. Having an agent in the critical path when a regular expression (or similar) would do seems odd. Yeah, Haiku is cheap, but re.match() is cheaper.
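The deterministic check the comment above has in mind could be as simple as a signature match against known failures, with a model only as fallback for whatever doesn't match. The signature regex here is an illustrative assumption, not anything from the article:

```python
# Minimal sketch of a deterministic "have we seen this before?" check.
# The signature pattern is a made-up example; a real pipeline would
# derive it from its own failure format.
import re

SIGNATURE = re.compile(r"(TimeoutError|AssertionError): [\w.]+")


def seen_before(log_line: str, known_signatures: set[str]) -> bool:
    m = SIGNATURE.search(log_line)
    # A regex check costs microseconds and no tokens; only lines with no
    # known signature would need to fall through to an agent at all.
    return bool(m) and m.group(0) in known_signatures
```

Even if the regex only catches a fraction of duplicates, every hit is a triage call that never happens, so it composes with (rather than replaces) the cheap-model tier.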