THE UNIVERSITY OF ALABAMA®
Part 2 · Design toolkit / Session 04 · Week 4

Mechanics I: challenge, feedback, and productive failure.

Three mechanics do more heavy lifting than all others combined. Get challenge calibration, feedback timing, and failure design right and almost any crosswalk will teach. Get one of them wrong and no amount of polish on the others will save you.

Contact time: 180 min (60 lecture · 90 analysis · 30 apply)
Deliverable: Revise D2 (annotate failure modes)
Outcomes: 4 (peer discussed)
Materials: Video bank (three pre-class clips)
01 · Learning outcomes

By the end of this session, you can…

  1. LO 4.1 Locate your target difficulty as a zone, not a single point, and describe the adjustment knobs that keep learners inside it.
  2. LO 4.2 Distinguish four kinds of feedback by timing and grain, and choose the right kind for each objective type.
  3. LO 4.3 Design a failure loop that teaches (short, costly-enough, recoverable, and instructive) and say when not to design one.
  4. LO 4.4 Annotate your D2 with the specific challenge/feedback/failure risks per row; identify the highest-risk row first.
02 · Challenge

Difficulty is a zone, not a point

"Flow" is popular and mostly right but too coarse to design against. Think instead about two ranges inside one envelope: the tolerance band (where most learners can continue without help) and the productive-struggle band (where they stall but can recover within one or two attempts). Your game should spend 70% of time in the first and 25% in the second. The remaining 5% is genuine failure — see §04.
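To make the 70 / 25 / 5 budget checkable against playtest logs, here is a minimal sketch. The `Episode` record, its field names, and the classification rule are assumptions made for illustration (the text defines the bands in words, not as a data format); only the three target shares and the "one or two attempts" recovery window come from the paragraph above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Target shares of play time, taken from the text above.
TARGET_SHARES = {"tolerance": 0.70, "struggle": 0.25, "failure": 0.05}


@dataclass
class Episode:
    """One logged stretch of play (hypothetical log format)."""
    seconds: float
    stalled: bool                               # did the learner get stuck?
    retries_to_recover: Optional[int] = None    # None = never recovered


def classify(ep: Episode) -> str:
    """Assumed rule: no stall = tolerance; recovery in 1-2 retries = struggle."""
    if not ep.stalled:
        return "tolerance"
    if ep.retries_to_recover is not None and ep.retries_to_recover <= 2:
        return "struggle"
    return "failure"


def band_shares(episodes) -> dict:
    """Share of total logged play time spent in each band."""
    seconds_by_band = Counter()
    for ep in episodes:
        seconds_by_band[classify(ep)] += ep.seconds
    total = sum(seconds_by_band.values())
    if total == 0:
        raise ValueError("no play time logged")
    return {band: seconds_by_band[band] / total for band in TARGET_SHARES}


def drift_from_target(episodes) -> dict:
    """Positive = more time in that band than the target share."""
    shares = band_shares(episodes)
    return {band: shares[band] - TARGET_SHARES[band] for band in TARGET_SHARES}
```

A large positive drift in "failure" suggests the knobs are set too hard for that playtester; a "tolerance" share near 100% suggests the game never leaves the comfortable band.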

Knob A

Time pressure

Shorten the clock and every other difficulty goes up with it. Most powerful for retrieval, discrimination, and procedural fluency; dangerous for judgment under uncertainty (it hides reasoning).

Knob B

Information completeness

Hide features, delay readouts, obscure second-order effects. Most powerful for conceptual reasoning and judgment; can frustrate retrieval/discrimination if overused.

Knob C

Distractor similarity

Make the wrong options look more like the right one. The fulcrum of discrimination learning. Too subtle and learners guess; too crude and they never have to look at the discriminating feature.

Knob D

Stake

What the player loses on error. Progress, resources, relationship, identity. Low stake = low engagement and low attention. High stake = avoidance. Calibrate against the learner's risk tolerance.

Rule of the adjustable knob

Every mechanic in your game should have at least one tunable knob that you — or an adaptive system — can turn during play. If you have no knobs, your game teaches one learner and fails the rest.
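As one possible reading of "a knob an adaptive system can turn during play," the sketch below models a single tunable knob with hard limits and a very simple adjustment policy. The `Knob` class, the `adapt` function, the band names, and the policy itself (tighten after an easy pass, loosen after a genuine failure, hold during productive struggle) are all illustrative assumptions, not a method prescribed by this session.

```python
from dataclasses import dataclass


@dataclass
class Knob:
    """A tunable difficulty setting with hard limits (hypothetical structure)."""
    name: str
    value: float
    lo: float
    hi: float
    harder_delta: float   # signed change that makes the game harder

    def turn(self, direction: int) -> float:
        """direction +1 = harder, -1 = easier, 0 = hold. Result is clamped."""
        proposed = self.value + direction * self.harder_delta
        self.value = min(self.hi, max(self.lo, proposed))
        return self.value


def adapt(knob: Knob, last_band: str) -> float:
    """Assumed policy: push toward struggle, back off after failure."""
    if last_band == "tolerance":
        return knob.turn(+1)    # learner coasted: make it harder
    if last_band == "failure":
        return knob.turn(-1)    # learner could not recover: ease off
    return knob.turn(0)         # productive struggle: leave it alone


# Knob A (time pressure): a shorter clock is harder, so the delta is negative.
clock = Knob(name="clock_seconds", value=30, lo=10, hi=60, harder_delta=-5)
```

The same structure covers the other knobs by changing the unit and the sign of `harder_delta`, for example a distractor-similarity score where a higher value is harder.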

03 · Feedback

Four kinds, four purposes

Most educational games default to immediate, verification-grain feedback ("correct!"). That choice is only defensible for retrieval. Every other objective type wants something else.

Kind | Timing / grain | Best used for | Characteristic mistake
Verification | Immediate; correct/incorrect. | Retrieval, basic discrimination. | Used for judgment tasks; short-circuits the reasoning you wanted.
Elaborative | Immediate; why the answer was right or wrong. | Discrimination, procedural fluency. | Too long; players skim past it and lose the point.
Consequence | Delayed; change in world state. | Judgment under uncertainty, conceptual reasoning. | Consequence is ambiguous or lucky; learner cannot trace cause.
Reflection | Post-loop; learner-generated. | Conceptual reasoning, transfer. | Written prompts treated as paperwork; no integration with play.
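The table can also be kept as a small lookup so that each crosswalk row is checked against it. This is a sketch only: the dictionary layout and the `kinds_for` helper are hypothetical, and "basic discrimination" is folded into "discrimination" here as a simplifying assumption.

```python
# Feedback kinds and the objective types each is best used for,
# transcribed from the table above.
FEEDBACK_KINDS = {
    "verification": {
        "timing": "immediate",
        "grain": "correct/incorrect",
        "best_for": ["retrieval", "discrimination"],  # table says *basic* discrimination
    },
    "elaborative": {
        "timing": "immediate",
        "grain": "why the answer was right or wrong",
        "best_for": ["discrimination", "procedural fluency"],
    },
    "consequence": {
        "timing": "delayed",
        "grain": "change in world state",
        "best_for": ["judgment under uncertainty", "conceptual reasoning"],
    },
    "reflection": {
        "timing": "post-loop",
        "grain": "learner-generated",
        "best_for": ["conceptual reasoning", "transfer"],
    },
}


def kinds_for(objective_type: str) -> list:
    """Feedback kinds the table lists for an objective type, in table order."""
    wanted = objective_type.strip().lower()
    return [kind for kind, spec in FEEDBACK_KINDS.items()
            if wanted in spec["best_for"]]
```

For example, `kinds_for("judgment under uncertainty")` returns only `["consequence"]`, which makes it easy to spot a row that pairs a judgment objective with verification feedback.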
The immediacy trap

Immediate feedback maximizes engagement but can destroy transfer. Learners practicing judgment and reasoning need time to sit in uncertainty, so give feedback after a round, a shift, or a day of game-time. Your players will hate it in week 1 and thank you in week 4.

04 · Productive failure

Designing a failure loop that teaches

Kapur showed it in classrooms and every designer who has shipped a roguelite knows it: failure before instruction beats instruction-then-practice for conceptual transfer. But only if the failure is productive — short, costly-enough, recoverable, and instructive.

  1. Short: One failure-to-retry cycle takes less than 5 minutes in first-hour play. If recovery is slow, learners quit before they reach the lesson.
  2. Costly: Losing has to feel like losing, not a reset-with-everything. Resource or progress loss is fine; identity loss almost never is.
  3. Recoverable: The retry must be available immediately and changeable; the player must be able to try a different approach. Same-exact retry teaches nothing.
  4. Instructive: The failure state reveals a cue the player can use on the next attempt. If the player cannot tell why they lost, you have frustration, not failure-as-teacher.
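The four criteria can be turned into a quick self-check for a loop you are designing. The sketch below is an assumption-laden reading, not a scoring rubric from this session: the `FailureLoop` fields and the `failing_criteria` helper are hypothetical names, and only the 5-minute threshold comes directly from the list above.

```python
from dataclasses import dataclass


@dataclass
class FailureLoop:
    """Designer's description of one failure loop (hypothetical fields)."""
    retry_cycle_minutes: float          # failure to next retry
    loses_progress_or_resources: bool   # losing costs something
    loses_identity: bool                # losing costs character/role identity
    retry_is_immediate: bool
    can_change_approach: bool           # retry is not a same-exact replay
    reveals_usable_cue: bool            # player can tell why they lost


def failing_criteria(loop: FailureLoop) -> list:
    """Return the criteria this loop fails, in the order the text lists them."""
    failing = []
    if loop.retry_cycle_minutes >= 5:                       # threshold from the text
        failing.append("short")
    if not loop.loses_progress_or_resources or loop.loses_identity:
        failing.append("costly")
    if not (loop.retry_is_immediate and loop.can_change_approach):
        failing.append("recoverable")
    if not loop.reveals_usable_cue:
        failing.append("instructive")
    return failing
```

An empty list means the loop passes this coarse check; it does not mean the loop teaches. That still takes a playtest.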
When not to design a failure loop

Retrieval-only games. High-stakes professional domains where modeling failure is itself the harm (e.g., giving clinical trainees explicit "how to miss a diagnosis" patterns). In those cases, use consequence feedback without framing it as a failure loop.

05 · Minigame — 4 min

Dial your failure loop

Productive failure has four criteria: short, costly, recoverable, instructive. The criteria trade off. Turn the dials, watch the verdict change, and see which combinations collapse into "homework," "punishment," or a real teach.

Productive-failure sandbox

Imagine a single iteration of your failure loop — the time between a player getting a call wrong and their next meaningful decision. Adjust the dials to fit your game. The verdict updates live.

Loop length (seconds between failure and next decision): 30s
Cost of failure (how much in-game state the player loses): 40%
Recoverability (how quickly the player can climb back): 60%
Instruction density (signal in feedback that names the discriminator): 50%
06 · Analysis — 60 min

Three clips, three verdicts

Watch each clip in triads. Log challenge, feedback, and failure design choices; judge whether each one fits the game's apparent objective. Deliverables: one-paragraph verdict per clip.

Clip A (6 min) · Timed discrimination · commercial release
Clip B (8 min) · Branching scenario · high-stakes domain
Clip C (5 min) · Roguelite failure loop · consumer hit
07 · Tools — Google AI Studio

Writing feedback copy that lands

Feedback is copy before it is UX. "Correct!" is a copywriting decision. Elaborative feedback is 1-2 lines of writing that must do three jobs in 30 words: confirm or contradict, explain the discriminating feature, and leave the player attentive for the next round. AI Studio is excellent at generating that kind of short-form copy — if you brief it properly.

AI Studio

Use case · Draft feedback copy for every event in your loop

Gemini 2.5 · temperature 0.5

Give the model your event → feedback map (even a draft) and ask for copy by feedback kind. You will get 30–50 candidate lines in one pass. Most will have the wrong tone. Three will be right, and three is more than you had five minutes ago.

System prompt
You write in-game feedback copy for educational games. You follow four
feedback kinds with distinct rules:

- VERIFICATION: <=5 words, no elaboration. "Correct." "Miss."
- ELABORATIVE: <=30 words, names the discriminating feature or
  corrects the specific error. Second person. No praise words.
- CONSEQUENCE: 0 words on-screen; describe instead what changes in
  world state (score, NPC reaction, resource).
- REFLECTION: a question the player answers, not a statement. One
  sentence. Not leading.

For each event I give you, produce three candidates in EACH of the four
kinds (12 lines total). Label them. Never invent content I did not
give you; if a field is missing, ask.

Do not use: "Great job," "Awesome," "Oops," exclamation points,
emoji, "Let's," or the word "learn."
Your message
Event: Resident picks a non-discriminating test (e.g., CBC when the
discriminator was lactate).

Context:
- Learner: 1st-year IM resident, night shift, overnight on call.
- Objective type: discrimination.
- Role: the intern.
- Tone: plain clinical; no cheerleading; no clinical shorthand a
  layperson would need decoded.

Give me 12 lines across the four feedback kinds.
The "Great job!" test

If AI Studio gives you a line that could appear in a toothpaste ad, delete it. Educational game copy has a register; match your learner's professional culture, not the model's default cheerful LMS voice.
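Part of the "Great job!" test can be automated before you read the candidates yourself. The sketch below checks model output against the mechanical rules in the system prompt above: word limits, the banned phrases, exclamation points, and the question form for reflection. The `lint_line` function and its message strings are hypothetical. It deliberately does not check emoji, praise words in general, second person, or whether a question is leading; those still need a human reader.

```python
import re

# Mechanical rules copied from the system prompt above.
BANNED_PHRASES = ("great job", "awesome", "oops", "let's")
WORD_LIMITS = {"verification": 5, "elaborative": 30}


def lint_line(kind: str, text: str) -> list:
    """Return a list of rule violations for one candidate line (empty = clean)."""
    problems = []
    # Normalize case and curly apostrophes so "Let’s" is caught too.
    lowered = text.lower().replace("\u2019", "'")

    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            problems.append(f"banned phrase: {phrase}")
    if re.search(r"\blearn\b", lowered):        # whole word only, not "learning"
        problems.append("banned word: learn")
    if "!" in text:
        problems.append("exclamation point")

    limit = WORD_LIMITS.get(kind)
    if limit is not None and len(text.split()) > limit:
        problems.append(f"over {limit} words")
    if kind == "reflection" and not text.rstrip().endswith("?"):
        problems.append("reflection must be a question")
    return problems


candidates = [
    ("verification", "Miss."),
    ("verification", "Great job!"),
    ("reflection", "What did the lactate tell you that the CBC could not?"),
]
for kind, text in candidates:
    print(kind, repr(text), lint_line(kind, text))
```

Lines that pass the lint are only candidates; choosing the three that match your learner's professional register is still your job.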

Use it when

You have an event→feedback map (even a rough one from S8 drafts) and need to populate each cell with copy candidates. The model is faster than you at the first 20 lines; you are better at choosing.

Don't use it when

You have not decided which feedback kind each event uses. The choice is a design decision that carries from S4's taxonomy — not a copywriting detail.

AI Studio

Use case · Diagnose a failure loop that feels "stuck"

Adversarial

If your paper prototype's failure loop is frustrating playtesters and you cannot tell why, walk the model through the loop and ask it to locate which of the four failure-design criteria (short / costly / recoverable / instructive) is failing.

Prompt
Here is a failure loop from my game, described step by step:

1. Player makes a triage call.
2. Scene advances; they see more patients.
3. At shift end, the critical patient they under-triaged is revealed.
4. Score summary screen.
5. "Retry shift" button.

Playtesters say it "feels like homework." Using the four criteria —
short, costly, recoverable, instructive — identify which is failing and
why. Do not propose a fix yet. Diagnose first. Be specific about which
step in my loop carries the failure.
08 · Apply to your D2

Annotate your crosswalk

Return to your D2. For each row, write one sentence on challenge (which knob dominates), one on feedback (which kind, why), one on failure (loop / no loop, why). Star the row with the most unresolved risk; that row drives your prototype priorities in Session 07.

09 · Preparation for Session 05

Before next week

10 · Exit ticket

Your riskiest row

The D2 row with the highest unresolved challenge/feedback/failure risk, and what will make me drop it: