Judging what to carry in your AI toolkit
Every new agentic AI improvement promises a win — but does yours pay off? I ran 18 experiments across three frontier models, using two popular tools, and only one beat its control.
2 posts
Can a design system persisted in markdown stay aligned while you iterate on changes with an AI? I shipped four design changes and a drift detector, with the markdown spec as the contract.