
Kurt Fischman, Founder, Marshal
Kurt is the CEO of Marshal, a Managed AI Ops service built for small businesses. That means AI agents doing the work, leads coming from answer engines, and a team that keeps your business running at full speed.

Choosing the right AI agent use case means passing each candidate workflow through three gates in order: value (is the job worth owning), risk (can the business afford the worst day), and feasibility (can an agent actually run it). The gates are vetoes in series, not scores to average. A workflow that fails any gate is out, whatever the other two say.
Search for "how to choose an AI agent use case" and the engines hand you inventory: ten use cases, twenty-two examples, eight that free up real hours. We ran the probe ourselves, on the exact query, on June 10, 2026. Perplexity returned zero results for the selection-intent phrasing, Google's AI Overview slot appeared but stayed empty, and the organic bench was the same count-titled listicles that rank for every agent query this year. Inventory ranks. Method does not. The pages that tell you what exists outrank the pages that tell you how to decide, and the asymmetry is not an accident of SEO: every page on the bench is published by a company with an agent to sell, and nobody selling agents profits from your no.
The selection advice that does exist lives as a paragraph inside the inventory. Workday tucks a prioritization matrix into its examples roundup; Dataiku attaches a "Should I Build an Agent?" quick test to its five-use-case pitch. Both are better than nothing and both are load-bearing in the wrong places, which is worth examining closely, because the standard artifact they represent, the two-by-two matrix, is precisely the instrument that lets bad deployments through.
Choosing an AI agent use case is not a scoring exercise; it is three vetoes in a row, and the order is the method. The strongest version of the standard advice deserves stating first. Workday's selection guidance says good use cases share clear goals, clean data, and repeatable logic, and recommends a matrix mapping candidates by strategic value and automation readiness: high-value high-readiness workflows are prime candidates, and the rest wait or die. The same roundup carries the projection that makes the question urgent: 33% of enterprise software is expected to include agentic AI by 2028, up from 1% in 2024, with at least 15% of business decisions made autonomously. That is the steelman, and it is genuinely useful as far as it goes.
Here is where it stops going: the matrix has no risk axis. Value and readiness are both reasons to say yes; nothing in the two-by-two prices the worst day. A workflow that writes to customer billing can be high-value, high-readiness, and capable of an unaffordable failure, and the matrix will file it under prime candidate, because the matrix structurally cannot say no for the one reason that bankrupts agent deployments. The deeper flaw is arithmetic: scoring systems average, and this decision needs vetoes. Value below the floor kills a candidate regardless of feasibility. Risk above budget kills it regardless of value. Infeasibility kills it regardless of both. Value decides if the job is worth owning. Risk decides if you can afford to be wrong. Feasibility decides if it can ship this quarter. Average them and you deploy a demo; gate them and you deploy a system. The gates run in that order because each is cheaper to evaluate than the next is to fix: value falls out of a spreadsheet, risk out of a meeting, feasibility out of systems work.
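The gap between averaging and vetoing can be made concrete with a short sketch. Everything in it is illustrative: the `Candidate` fields, the 0-to-1 scales, and the three thresholds are assumptions invented for the example, not figures from Workday, Dataiku, or Marshal. The averaging function is also more generous than the standard matrix, since it at least counts risk.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds on a 0-to-1 scale, chosen only for illustration.
VALUE_FLOOR = 0.4        # below this, the job is not worth owning
RISK_BUDGET = 0.5        # above this, the worst day is unaffordable
FEASIBILITY_FLOOR = 0.5  # below this, the agent cannot run the job yet


@dataclass
class Candidate:
    name: str
    value: float        # manual-coordination cost burned today
    risk: float         # cost of the worst day, net of oversight
    feasibility: float  # can an agent read, write, and prove done


def average_score(c: Candidate) -> float:
    """Scoring approach: invert risk and average it with the other two."""
    return (c.value + (1.0 - c.risk) + c.feasibility) / 3.0


def first_failed_gate(c: Candidate) -> Optional[str]:
    """Vetoes in series: return the first gate that fails, or None if all pass."""
    if c.value < VALUE_FLOOR:
        return "value"
    if c.risk > RISK_BUDGET:
        return "risk"
    if c.feasibility < FEASIBILITY_FLOOR:
        return "feasibility"
    return None


billing = Candidate("subscription billing changes",
                    value=0.9, risk=0.9, feasibility=0.9)
print(round(average_score(billing), 2))  # a respectable-looking average
print(first_failed_gate(billing))        # the gate that vetoes it anyway
```

Under these made-up numbers the billing workflow averages well above the midpoint and is still vetoed at the risk gate, which is the behavior a score cannot reproduce.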
Value, in agent selection, is the manual-coordination cost a workflow burns today, not the impressiveness of what an agent could do tomorrow. The measurable floor is real and large: Forrester research, cited in Sprinklr's use-case roundup, puts knowledge workers at 30% of their time spent just searching for information, with 60-70% of routine work automatable by current tools. Those are ceiling numbers from people selling the ceiling, but the direction is right: the value pool is the re-keying, the chasing, the relaying, the searching, summed in salary-weighted hours per month against a specific workflow.
The gate works as a floor, not a ranking. Name the workflow's hours and whose hours they are; if the number is small, the candidate dies here, no matter how good the demo would look. Two refinements keep the measurement honest. Count the interruption cost, not just the task cost: a ten-minute relay that breaks a closer's focus four times a day burns more than forty minutes. And count the work that silently does not happen (the follow-up never sent, the mismatch never chased), because absent work is coordination cost the time sheet cannot see. The full economics, including the oversight and integration costs that net against the gross hours, are laid out in the business case for AI agents, whose posture this gate inherits: the default answer is no, and the work is finding the workflows that survive. A useful tell at this gate is volume sensitivity. Coordination cost that grows with the business (every new client adds onboarding hours, every inbound spike adds response lag) marks a workflow whose value gate opens wider over time. Static cost marks one that may never justify its oversight.
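The value arithmetic is simple enough to sketch. The example below assumes roughly 21 working days a month and a hypothetical 15 minutes of lost focus per interruption; neither figure comes from the article, and the function names are invented for illustration.

```python
WORKDAYS_PER_MONTH = 21  # assumption: roughly 21 working days in a month


def monthly_coordination_hours(task_minutes: float,
                               times_per_day: int,
                               refocus_minutes: float = 0.0) -> float:
    """Hours per month a recurring task burns, including refocus time per interruption."""
    daily_minutes = times_per_day * (task_minutes + refocus_minutes)
    return daily_minutes * WORKDAYS_PER_MONTH / 60.0


def salary_weighted_cost(hours: float, hourly_rate: float) -> float:
    """Convert hours to money so workflows owned by different roles compare fairly."""
    return hours * hourly_rate


# The ten-minute relay, four times a day, counting task time only.
task_only = monthly_coordination_hours(10, 4)
# The same relay with a hypothetical 15 minutes of lost focus each time.
with_interruptions = monthly_coordination_hours(10, 4, refocus_minutes=15)

print(task_only, with_interruptions)
```

Counting only task time, the relay costs 14 hours a month; with the assumed refocus penalty it costs 35. The exact penalty is a guess, but the direction is the point: the time sheet understates the floor.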
Risk, in agent selection, is the cost of the workflow's worst day divided by the oversight available to catch it. Three questions price it. Reversibility: can the worst plausible action be rolled back, a draft deleted, a record corrected, or is it a sent email, a charged card, a cancelled subscription? Blast radius: does a failure stay inside one record, or propagate across systems, customers, and the brand? Oversight surface: do approval gates and exception queues exist where this workflow would run, so the wrong action gets held before it lands rather than discovered after?
The gate's output is not "risky or safe"; it is whether the failure budget of this function covers the worst day of this workflow. An internal helpdesk agent that misroutes a ticket spends almost nothing against the budget. A finance agent that writes to the ledger spends real money and must run behind approvals, which raises its oversight cost and sends a signal back to gate one: net value shrinks as risk controls grow. This is also the gate the matrix conceals, and the reason gate order matters. Evaluating risk before feasibility means a business never spends engineering effort proving an agent can technically do a job it could never afford to let the agent do wrong. The matrix exists because consultants need every workflow to land somewhere actionable. A gate exists because operators need most workflows to land in the bin.
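One way to turn the three questions into a pass-or-fail check is sketched below. The cost model is an assumption made for illustration: a reversible action is treated as costing a tenth of an irreversible one, an approval gate is assumed to catch 95% of wrong actions, and the money amounts are invented. A real failure budget would come from the function's owner, not from a formula.

```python
from dataclasses import dataclass

# Illustrative assumptions, not measured values.
REVERSIBLE_DISCOUNT = 0.1    # a rollback is assumed to recover 90% of the cost
APPROVAL_CATCH_RATE = 0.95   # share of wrong actions an approval gate holds


@dataclass
class RiskProfile:
    worst_day_cost: float    # cost of the worst plausible action, per record
    reversible: bool         # can the action be rolled back?
    records_affected: int    # blast radius: records a single failure touches
    has_approval_gate: bool  # is the wrong action held before it lands?


def exposure(p: RiskProfile) -> float:
    """Estimated worst-day exposure after reversibility and oversight."""
    cost = p.worst_day_cost * p.records_affected
    if p.reversible:
        cost *= REVERSIBLE_DISCOUNT
    if p.has_approval_gate:
        cost *= (1.0 - APPROVAL_CATCH_RATE)
    return cost


def passes_risk_gate(p: RiskProfile, failure_budget: float) -> bool:
    """The gate passes only when the budget covers the worst day."""
    return exposure(p) <= failure_budget


FAILURE_BUDGET = 1000.0  # hypothetical budget for the function

helpdesk = RiskProfile(50.0, reversible=True, records_affected=1, has_approval_gate=False)
ledger_unsupervised = RiskProfile(5000.0, reversible=False, records_affected=1, has_approval_gate=False)
ledger_with_approvals = RiskProfile(5000.0, reversible=False, records_affected=1, has_approval_gate=True)

for profile in (helpdesk, ledger_unsupervised, ledger_with_approvals):
    print(passes_risk_gate(profile, FAILURE_BUDGET))
```

The same ledger workflow fails unsupervised and passes behind approvals, which matches the pattern described above: the control that opens the risk gate is also an oversight cost that feeds back into net value.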
Feasibility, in agent selection, is whether the agent can read the systems the job lives in, write where the job completes, and prove the done-state in a log. Workday's three characteristics, clear goals, clean structured data, repeatable logic, live inside this gate, as does Dataiku's quick test; the addition that matters in production is testability. A workflow whose completion cannot be verified mechanically cannot be supervised mechanically, and an agent nobody can audit is an agent nobody should run.
The feasibility question Marshal asks first is unglamorous: can the agent read the systems this job lives in, and can a log prove the job finished? If the data lives in someone's inbox and done lives in someone's head, the workflow is not a candidate. It is a cleanup project. That distinction does real work, because failing this gate is not a permanent no; it is a to-do list. Move the tribal knowledge into the CRM, define the done-state as a field a system can hold, and the same workflow re-enters the pipeline next quarter as a legitimate candidate. Feasibility runs last for exactly this reason: it is the only gate whose failures convert into roadmap. Value failures are facts about the business, risk failures are facts about the stakes, but feasibility failures are usually facts about housekeeping, and housekeeping can be scheduled.
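Because a feasibility failure is a to-do list rather than a verdict, the gate is naturally expressed as a function that returns the missing work. The sketch below is a hypothetical illustration; the three flags and the wording of each cleanup item are assumptions drawn from the description above, not a Marshal tool.

```python
from typing import List


def feasibility_gaps(can_read_systems: bool,
                     can_write_completion: bool,
                     done_state_in_log: bool) -> List[str]:
    """Return the cleanup items blocking feasibility; an empty list means pass."""
    gaps: List[str] = []
    if not can_read_systems:
        gaps.append("Move the job's data into a system the agent can read")
    if not can_write_completion:
        gaps.append("Give the agent write access where the job completes")
    if not done_state_in_log:
        gaps.append("Define the done-state as a field a system can hold and log")
    return gaps


# Data in someone's inbox, done in someone's head: two cleanup items.
inbox_workflow = feasibility_gaps(can_read_systems=False,
                                  can_write_completion=True,
                                  done_state_in_log=False)
for item in inbox_workflow:
    print(item)
```

An empty list is a pass. Anything else is next quarter's housekeeping, and the workflow re-enters the pipeline once the list is cleared.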
A worked pass through the gates shows how fast the method runs once the questions are concrete, so take a real candidate: invoice reconciliation at a services business with two hundred invoices a month. Gate one, value. The bookkeeper spends ninety minutes a day matching invoices to records, chasing the mismatches by email, and re-keying corrections: call it thirty salary-weighted hours a month, growing with client count. The number is real, named, and volume-sensitive. Value gate: pass, in about the time it took to read the time sheet.
Gate two, risk. The worst plausible action is a wrong write to the ledger or a wrong payment release. Wrong writes are reversible with an audit trail; payment release is not, so the design answer is a split: the agent matches, flags, and writes back clean reconciliations, while anything that moves money queues for human approval. With that gate in the architecture, the worst unsupervised day is a mismatched record a month-end review catches. Risk gate: pass, conditional on approvals, and the condition is now a written requirement instead of a discovery.
Gate three, feasibility. The invoices live in the accounting system, the records live in the CRM, both have APIs, and done is mechanically checkable: every invoice matched, flagged, or queued, with a log line per decision. One gap surfaces: a third of mismatches are resolved today through an email thread with the client, which is judgment work that stays human and routes to the exception queue. Feasibility gate: pass, with the exception path named. Total elapsed method time is one afternoon and three meetings, the output is a deployable specification, and notice what never happened: nobody scored anything out of ten, and nobody argued about weights. The gates either passed or they did not, and every pass left behind a requirement the build now inherits.
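The "mechanically checkable" done-state in this example can be sketched as a check over the decision log. The log shape (pairs of invoice id and decision) and the invoice ids are hypothetical; the three allowed decisions come from the worked example above.

```python
from typing import Iterable, List, Tuple

# The three outcomes named in the worked example.
ALLOWED_DECISIONS = {"matched", "flagged", "queued"}


def unresolved_invoices(invoice_ids: Iterable[str],
                        log: Iterable[Tuple[str, str]]) -> List[str]:
    """Return invoice ids with no valid decision in the log, sorted."""
    decided = {inv for inv, decision in log if decision in ALLOWED_DECISIONS}
    return sorted(set(invoice_ids) - decided)


# Hypothetical month: three invoices, one with no valid decision logged.
invoices = ["INV-1", "INV-2", "INV-3"]
decision_log = [
    ("INV-1", "matched"),
    ("INV-2", "queued"),    # moves money, so it waits for human approval
    ("INV-3", "pending"),   # not an allowed decision, so not done
]

remaining = unresolved_invoices(invoices, decision_log)
print(remaining)
```

The job is done when the returned list is empty. Anything left in it is either an agent failure or an exception that belongs in the human queue, and in both cases the log says which invoice to look at.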
Run the same candidate workflows through the value/readiness matrix and the three gates and the disagreements are exactly where deployments go wrong. The table takes five common candidates and shows both verdicts side by side; the rows where they split are the rows that produce expensive lessons.
Five common candidate workflows run through the three gates and through the standard value/readiness matrix. The rows where the verdicts split are where deployments go wrong.
| Candidate workflow | Three gates (value, risk, feasibility) | Value/readiness matrix | Where they split, and why |
|---|---|---|---|
| Inbound lead response | Pass; gate sends early, reversible drafts | Prime candidate | Agreement; volume and friendly failure budget |
| Invoice reconciliation | Pass with approvals on money movement | Prime candidate | Gates add the approval condition the matrix omits |
| Subscription billing changes | Fail at risk; worst day unaffordable unsupervised | Prime candidate, high value and readiness | Matrix has no axis that prices a wrong cancellation |
| Executive reporting | Pass; read-only blast radius near zero | Low value, often deprioritized | Matrix undervalues safe practice deployments |
| Contract negotiation | Fail at value and risk; judgment work | High value, scored as invest-later | Gates kill it now; matrix keeps it alive on a roadmap |
The matrix and the gates agree on the easy rows. The expensive disagreements are the high-value workflows whose worst day the matrix never priced.
Sequencing the workflows that survive the gates is a portfolio decision: run the friendliest failure budget first, bank the logs, and let each deployment de-risk the next. The first agent's job is only half its workflow; the other half is teaching the business to supervise agents at all, which is why the opening deployment should be the survivor whose worst day is an internal apology rather than a customer incident. Each gated workflow that runs clean produces the evidence the next gate evaluation needs: real oversight costs instead of estimates, real exception rates instead of fears, and a team that has stopped treating the approval queue as a novelty.
The gates also compose with the map. Selection assumes candidates, and the candidate pool for any function (which workflow tends to go first, what the agent owns, where the failure budget bites) is laid out in AI agent use cases by business function. Run the map's rows through the gates and the output is not a quadrant chart for a steering committee; it is a short, ordered list with named workflows, named owners, and a reason attached to every no. Most candidates should die at the gates. That is the method working, not failing, because the businesses that get burned are rarely the ones that deployed too few agents. They are the ones that deployed the matrix's prime candidates and discovered the missing axis in production.
One last discipline keeps the portfolio honest: the gates are re-run, not run once. A risk verdict ages as oversight matures; a feasibility verdict ages as systems get cleaned; a value verdict ages as the business grows into volume it did not have last year. The candidate that failed in January is not the candidate it will be in June, and the bin is a queue, not a graveyard. Re-gating quarterly costs an afternoon, and it is the difference between a selection method and a selection event, which is what most agent strategies quietly are: one burst of enthusiasm, one matrix, one workshop, and no standing mechanism for saying yes slowly and no quickly as the facts move.
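Treating the bin as a queue only requires remembering when each candidate was last gated and why it failed. The sketch below assumes a 90-day interval as a stand-in for "quarterly"; the record shape and the example dates are hypothetical.

```python
from datetime import date, timedelta
from typing import List, NamedTuple

REGATE_INTERVAL = timedelta(days=90)  # assumption: roughly one quarter


class BinnedCandidate(NamedTuple):
    name: str
    failed_gate: str   # "value", "risk", or "feasibility"
    last_gated: date


def due_for_regate(binned: List[BinnedCandidate], today: date) -> List[str]:
    """Return the names of binned candidates whose last verdict is a quarter old."""
    return [c.name for c in binned if today - c.last_gated >= REGATE_INTERVAL]


rejected = [
    BinnedCandidate("contract negotiation", "value", date(2026, 1, 15)),
    BinnedCandidate("subscription billing changes", "risk", date(2026, 5, 1)),
]

print(due_for_regate(rejected, today=date(2026, 6, 10)))
```

Keeping the failed gate on the record matters as much as the date: it tells the next review which fact needs to have changed before the verdict can.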
Choosing the right AI agent use case means passing each candidate workflow through three gates in order: value, measured as the manual-coordination cost the workflow burns today; risk, the cost of the worst day against the oversight available; and feasibility, whether an agent can read the systems, write the completion, and prove the done-state in a log. Each gate holds veto power, and a candidate that fails any gate is out.
An AI agent use case is valuable when a workflow burns significant manual-coordination cost: re-keying, chasing, relaying, and searching, summed in salary-weighted hours per month. Forrester research puts knowledge workers at 30% of their time searching for information, which is the kind of recurring cost the value gate measures. Impressiveness of the demo does not count; hours do.
Assessing AI agent use case risk means pricing the workflow's worst day with three questions: is the worst plausible action reversible, how far does a failure propagate beyond one record, and do approval gates and exception queues exist to hold wrong actions before they land. The risk gate passes when the function's failure budget covers the workflow's worst day.
An AI agent use case is feasible when the agent can read the systems the job lives in, write to the systems where the job completes, and a log can mechanically prove the done-state. Clear goals, clean structured data, and repeatable logic support feasibility, and testability completes it. Feasibility failures usually convert into a cleanup roadmap rather than a permanent no.
A scoring matrix fails for AI agent use case selection because it averages where the decision needs vetoes, and the standard value/readiness matrix carries no risk axis at all. A high-value, high-readiness workflow with an unaffordable worst day files as a prime candidate under the matrix, which is exactly how demo-ware reaches production.
The first AI agent use case a business runs should be the gate survivor with the friendliest failure budget, often inbound lead response or internal reporting, where the worst day is recoverable. The opening deployment teaches the business to supervise agents at all, and the logs it banks de-risk every gate evaluation that follows.