While building an AI-driven side project, I ran into a design tradeoff that kept resurfacing with each LLM-powered feature:
Should a single prompt handle an entire workflow, or should each step be its own call?
The Single-Prompt Approach
In the initial version of one feature, the system did everything in one shot. A single LLM prompt would:
- Categorize an item from user context
- Decide whether there was enough information to score it
- Generate, all at once, every follow-up question it thought was necessary when context was missing
That approach worked initially, but it had a clear limitation: the follow-up questions were often repetitive or poorly targeted, because the model couldn’t incorporate context from previous answers. Once questions were generated, there was no opportunity to adapt.
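For illustration, here is a minimal sketch of that single-prompt shape, assuming an OpenAI-compatible client in JSON mode; the prompt wording, field names, and model are placeholders rather than the project's actual prompt.

```python
import json
from openai import OpenAI  # any OpenAI-compatible client

client = OpenAI()

SINGLE_SHOT_PROMPT = """Given the user's context about an item, return JSON with:
- "category": your best classification of the item
- "has_enough_context": whether the item can be scored from this context
- "follow_up_questions": every question you would need answered, generated all at once
- "score": a 1-10 score, or null if context is missing
"""

def evaluate_item_single_shot(user_context: str) -> dict:
    """One call does categorization, the context check, question
    generation, and scoring in a single pass."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": SINGLE_SHOT_PROMPT},
            {"role": "user", "content": user_context},
        ],
    )
    return json.loads(response.choices[0].message.content)
```

Everything happens inside one response, which is exactly why the questions can't react to each other's answers.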
The Step-Based Approach
My latest version breaks the workflow into discrete, single-purpose steps:
- Categorizing the item
- Explicitly deciding whether there’s enough context
- Generating follow-up questions through a feedback loop when context is missing
- Scoring the item with a structured rationale
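Sketched roughly, under the same assumptions as before (OpenAI-compatible client, placeholder prompts, and a hypothetical `ask_user` callback for collecting answers), the orchestration looks something like this:

```python
import json
from openai import OpenAI

client = OpenAI()

def call_step(instructions: str, context: str, model: str = "gpt-4o-mini") -> dict:
    """Each step is a small, single-purpose LLM call that returns JSON."""
    response = client.chat.completions.create(
        model=model,
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": context},
        ],
    )
    return json.loads(response.choices[0].message.content)

def run_workflow(item_context: str, ask_user, max_questions: int = 5) -> dict:
    category = call_step('Categorize the item. Reply as {"category": "..."}', item_context)["category"]

    context = item_context
    for _ in range(max_questions):
        check = call_step('Is there enough context to score this item? Reply as {"enough": true/false}', context)
        if check["enough"]:
            break
        # Each question is generated with every earlier answer in view,
        # so it can adapt instead of being part of a fixed up-front batch.
        question = call_step('Ask the single most useful follow-up question. Reply as {"question": "..."}', context)["question"]
        context += f"\nQ: {question}\nA: {ask_user(question)}"

    return call_step(
        f'Score this {category} item from 1-10. Reply as {{"score": ..., "rationale": "..."}}',
        context,
    )
```

The feedback loop is the key difference: both the context check and the next question see every answer collected so far.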
Splitting the workflow into smaller prompts added orchestration overhead, but changed the system in important ways:
- Each step became independently observable and testable
- Failures could be detected and recovered from in isolation
- Schemas were easier to enforce (both points are sketched after this list)
- Each capability could be versioned and rolled out independently
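To make the schema and isolation points concrete, here is one way a per-step contract could look, using Pydantic for validation; the models, field names, and retry policy are illustrative, not the project's actual code.

```python
from pydantic import BaseModel, ValidationError

# One schema per step keeps each contract small and easy to validate.
class CategoryResult(BaseModel):
    category: str

class ScoreResult(BaseModel):
    score: int
    rationale: str

def run_step_validated(step_fn, schema: type[BaseModel], step_input: str, retries: int = 2):
    """Run one step and validate its raw output against that step's schema.
    A bad response only re-runs this step, never the whole workflow, and the
    attempts and errors are visible per step rather than buried in one big call."""
    last_error = None
    for _ in range(retries + 1):
        raw_output = step_fn(step_input)  # step_fn returns the model's raw JSON text
        try:
            return schema.model_validate_json(raw_output)
        except ValidationError as err:
            last_error = err
    raise last_error
```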
Runtime Model Selection with OpenRouter
To support that kind of per-step independence, I integrated OpenRouter so each step can choose its model at runtime. That also makes model choice observable: outputs can be judged, analyzed, and scored per task.
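Concretely, OpenRouter exposes an OpenAI-compatible API, so routing can be as simple as a per-step model table; the table below and the model slugs in it are illustrative, not the project's actual routing.

```python
import os
from openai import OpenAI

# OpenRouter is OpenAI-compatible, so the standard client works once it is
# pointed at the OpenRouter base URL.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Hypothetical routing table: each step picks its own model at runtime.
STEP_MODELS = {
    "categorize": "openai/gpt-4o-mini",
    "context_check": "openai/gpt-4o-mini",
    "follow_up": "anthropic/claude-3.5-sonnet",
    "score": "anthropic/claude-3.5-sonnet",
}

def run_step(step: str, instructions: str, context: str) -> str:
    response = client.chat.completions.create(
        model=STEP_MODELS[step],  # model chosen per task, at call time
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": context},
        ],
    )
    return response.choices[0].message.content
```

Because the model name is just data attached to each step, swapping a model for one task doesn't touch any other part of the workflow, and every output can be attributed to the exact model that produced it.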
What stood out wasn’t that one approach was universally better, but that they optimize for different goals. Single prompts optimize for speed and simplicity. Step-based prompts optimize for control, reliability, and evolvability.
What’s Next
The next step is adding an evaluation layer (via Langfuse) to track quality, consistency, and failure modes over time. With per-step routing in place, the system can treat model selection itself as a feedback loop—continuously scoring LLM performance and adjusting which model is used for each task based on what actually performs best.
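As a sketch of where that could land (nothing here is built yet): evaluation scores, for example exported from Langfuse, could feed a small router that keeps a rolling average per step and model and mostly picks the current best performer, with a little exploration left in.

```python
import random
from collections import defaultdict

class ModelRouter:
    """Hypothetical feedback-loop router: record evaluation scores per
    (step, model) pair and usually pick the best rolling performer."""

    def __init__(self, candidates: dict[str, list[str]], explore_rate: float = 0.1):
        self.candidates = candidates     # step -> candidate model slugs
        self.scores = defaultdict(list)  # (step, model) -> recent scores
        self.explore_rate = explore_rate

    def record(self, step: str, model: str, score: float) -> None:
        """Called after an evaluation run scores one step's output (0.0 to 1.0)."""
        self.scores[(step, model)].append(score)

    def pick(self, step: str) -> str:
        """Mostly exploit the best rolling average; occasionally explore."""
        models = self.candidates[step]
        if random.random() < self.explore_rate:
            return random.choice(models)

        def rolling_avg(model: str) -> float:
            history = self.scores[(step, model)][-50:]  # recent window only
            return sum(history) / len(history) if history else 0.5

        return max(models, key=rolling_avg)
```

In that setup, the static `STEP_MODELS` lookup from earlier would be replaced by `router.pick(step)`, and each evaluation run would call `router.record(...)` to close the loop.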