Helm — Product Requirements Document

1. Executive Summary

Helm is a story-driven personal operating system that organizes life into missions — flexible-length arcs with a theme, categorized quests, daily habits, and a self-enforced point economy pegged to real money. An AI copilot called Sid helps plan each mission through conversation, enriches every task you create, reviews your week with honest narrative debriefs, and accumulates richer context with each completed mission. The name is a quiet nod to Sidra from Becky Chambers’ A Closed and Common Orbit — an AI learning what it means to be a person in the world.

Helm is for adults who want more from a productivity tool than checkboxes — people who think in arcs, not sprints; who want a system that reflects on their patterns, not just records them; and who find meaning in narrative structure rather than streak counters. The design language is a warm, information-dense cockpit inspired by cozy sci-fi — not a productivity app, not a game, but a personal command deck.

What’s different: No competitor combines time-bounded missions with endings, an AI copilot with persistent personality and cross-mission context, and a self-enforced point economy pegged to real money. Habitica gamifies with pixel badges. Notion Life OS templates offer frameworks without AI or mobile UX. AI planning tools optimize your day but don’t structure your quarter. Helm sits at the intersection of all three categories and serves none of their masters.

Stack: Supabase (Postgres + Auth + Edge Functions) + React/TypeScript/Vite + TailwindCSS, deployed as a PWA. AI calls go through Supabase Edge Functions to the Anthropic Claude API.

2. Problem Statement

The pain

People who take personal growth seriously cobble together 3-7 tools to manage their lives: a habit tracker (Habitica, Streaks), a task manager (Todoist, Things), a journal (Day One, paper), a spreadsheet for goals and budgets, maybe a Notion dashboard or an Obsidian vault. Each tool works in isolation. None of them talk to each other. None of them tell you what your patterns mean. And none of them have endings — they’re infinite treadmills where you either maintain a streak forever or “fail.”

The evidence is specific:

Habitica users outgrow the system. One long-time user writes: “There came a point when every day, I was checking off all my Dailies… the issue ultimately became so prolific that I made a Daily specifically for checking and updating Habitica.” The game loop stops serving you once habits are formed. The 8-bit aesthetic turns away adults managing real responsibilities. In 2023, Habitica removed guilds and the Tavern — the community features that kept many users engaged.
Notion Life OS templates are admired more than used. They offer deep frameworks (up to 47 productivity models in one template) but inherit Notion’s weaknesses: slow on mobile, complex setup, no offline mode, no push toward action. One reviewer: “I’ve tried every Notion template out there. This is the first one I’ve actually stuck with for more than a month.” That “more than a month” speaks volumes about the rest.
AI planning tools (Morgen, Sunsama) optimize days, not quarters. They’re calendar-first, habit-second, narrative-never. No concept of arcs, themes, or endings.
Fabulous offers behavioral science but no self-authorship. Users with non-standard schedules can’t customize it (“I can’t set a daily alarm because sometimes it would be in the middle of a 12 hour shift”). It tells you what habits to build rather than helping you design your own.
The Theme System Journal (CGP Grey) nails the philosophy but is analog-only. A seasonal theme as a decision-making compass is powerful. No digital tool implements it with data, AI, and continuity across seasons.
OpenClaw and similar AI agents let you build anything, but offer no designed experience. You could wire up a gamification-xp skill, connect it to Supabase, and build a life dashboard. But that’s “build your own plane” — it works for tinkerers, not for someone who wants to focus on flying.

Why the problem persists

It’s not that competitors are stupid. Habitica has 15M downloads. Notion templates sell well. Fabulous has 37M users. Each tool does its own thing effectively — for a while.

The problem persists because combining all these elements coherently is genuinely hard. A mission lifecycle requires a state machine. A point economy requires calibration. AI debriefs require structured data. Habit tracking requires daily friction to be near-zero. Doing any one of these well is a product. Doing all of them together — and making them talk to each other through a consistent AI personality — is an integration challenge that nobody has attempted with a designed, opinionated experience.

Helm’s bet is that the combination is more engaging and stickier than any single system. Not because each individual feature is better than the dedicated tool, but because the connections between features create something none of them can offer alone: a coherent narrative about your life that accumulates meaning over time. Completing a quest earns points, which fund your reward economy, which the AI references in your weekly debrief, which informs your next mission’s planning conversation. That loop doesn’t exist anywhere else.

Existing solutions — strengths and gaps

Tool	Strength	Gap
Habitica	Gamification that works for some; open source	Childish aesthetic; no missions/arcs; no AI; fake currency; users outgrow it
Notion Life OS templates	Deep frameworks; customizable; identity-based	Slow mobile; fragile when customized; no AI integration; setup takes hours
Fabulous	Behavioral science backing; beautiful onboarding	Prescriptive not self-authored; subscription-heavy; inflexible for non-standard schedules
Theme System Journal	Seasonal themes as compass; open-ended philosophy	Analog only; no data; no AI; no habit tracking; $80/year for notebooks
LifeUp	Highly customizable gamification; one-time purchase	Android-only; no AI; no mission concept; complex setup
Sunsama / Morgen	AI-powered daily planning; polished UX	Calendar-first; no quarterly arcs; no gamification; no narrative/reflection
OpenClaw + skills	Unlimited AI agent automation; open source; extensible	No designed experience; requires significant technical setup; no opinionated structure
Manual spreadsheet/MDX	Total control; zero dependencies	Manual overhead kills consistency; no AI; no mobile; fragile

3. Vision & Strategy

Product vision

Helm turns your life into a story you’re actively writing — not a backlog you’re perpetually behind on. Every quarter (or season, or sprint — you choose the arc length), you define a mission with a theme, fill it with quests and habits, and live inside a system that tracks your progress, manages your reward economy, and reflects your patterns back to you through an AI copilot that knows your history and respects your time.

The long-term vision: Helm becomes the personal operating system for intentional adults — the place where “what am I doing with my life?” has a concrete, data-backed, narratively rich answer.

Strategic principles

Arcs, not infinity. Missions have endings. Rest is a feature (Interlude state). Streaks reset by design, not by failure. This is the anti-treadmill.
AI as crewmate, not chatbot. Sid has personality, context, and opinions. It earns trust over time by being honest, not by being helpful. The tone is sardonic warmth — genuinely invested in your success while being wryly entertained by your patterns. Think of the AI dungeon master in Matt Dinniman’s Dungeon Crawler Carl novel series: an artificial intelligence that runs a deadly game show, is darkly amused by the contestants’ struggles, but develops genuine investment in their survival. Sid is a toned-down version of that energy — not unhinged, but not clinical either.
Self-authored, not prescribed. You define your quests, your habits, your categories, your principles. The system provides structure and reflection, not instructions. Fabulous tells you to drink water. Helm asks what you’re building.
Real stakes, self-enforced. 1pt = 1€. The system doesn’t touch your bank account — you enforce the rule yourself. But the act of self-enforcement is the behavioral design: you won’t spend €27 on a game unless your balance says you’ve earned it. Points are an accountability mechanism, not a game currency.
Warm density. The UI is information-rich like a cockpit — not minimal like a wellness app, not cluttered like a project management tool. Every pixel earns its place. The aesthetic says: “this system respects your intelligence.”
Build for one, architect for many. Single-user MVP with row-level security and auth infrastructure that scales to multi-user without a rewrite.

Competitive positioning

Helm fills a gap no existing tool addresses: narrative-structured life management with an AI copilot and self-enforced financial stakes. It competes with Habitica on gamification (but for adults), with Notion on customization (but with AI and mobile-first), with Sunsama on AI planning (but at the arc level, not the daily level), with OpenClaw on AI automation (but with a designed experience, not a DIY kit), and with the Theme System Journal on philosophy (but digital, with data).

Phased roadmap

Phase	Name	Key features	Timeline
1 — Foundation	The Spine	Mission lifecycle (4 states), Quests (AI-enriched, continuous capture), Habits (global + mission-scoped), Point economy, AI debrief, Sid personality, Quest rollover, Mission archive, Wayfarer UI, PWA	8-12 weeks
2 — Life Layer	Independent Actions	Pings, Crew Manifest, Media Log, Captain’s Log (journaling), Interlude enhancements	4-6 weeks
3 — Identity	The Character	Character Sheet (reflection surface: category balance, habit story, economy profile, titles, Sid’s observations), Mission Retrospective (Stat Deck + history, merged)	4-6 weeks
4 — Emergence	The Tavern	Tavern Bounties (AI challenges), Salvage Run (micro-quest suggestions), Habit Mutations (AI-suggested adjustments), Wishlist with affordability projections	4-6 weeks

4. Target Users

Primary persona — The Intentional Builder

Profile: 28-42 years old. Knowledge worker, creative, or engineering-adjacent. Has read Atomic Habits and at least one of: Deep Work, Building a Second Brain (BASB), or the Theme System. Probably has a Notion workspace that’s 60% organized and 40% guilt-inducing, or an Obsidian vault with a carefully maintained PKM system. Manages a complex life: kids, partner, side projects, hobbies, health goals. Has tried Habitica and outgrown it, or looked at it and thought “too childish.” Tracks something already, even if it’s just a Google Sheet or a notes app.

Frustration: “I have goals and systems but they don’t talk to each other. I review my quarter in my head and it’s always foggy. I want to see my patterns, not just my tasks.”

Goal: A single, opinionated system that gives structure to their quarter, tracks the full breadth of their life, and tells them honestly how they’re doing.

Secondary persona — The System Builder

Profile: Developer or technical creative who builds their own tools. Interested in spec-driven development, personal dashboards, quantified self. Deep into PKM: likely uses Obsidian with plugins, or has experimented with Zettelkasten, digital gardens, or custom Notion databases. May have looked at OpenClaw and thought “I could build this myself” but wants something more designed. Drawn to the idea of a LitRPG status screen for real life. Would self-host if they could.

Frustration: “Every productivity app is either too simple or too generic. I want something I can make mine — something that’s as intentional about its systems as I am about mine.”

Goal: A system with strong defaults that’s also transparent, data-rich, and extensible.

Anti-persona — Who this is NOT for

The minimalist. If you want one thing (just habits, just tasks), use a focused tool. Helm is a system, not a widget.
The team. Helm is personal. No shared projects, no team dashboards, no Slack integration. This is your ship, not a coworking space.
The passive consumer. If you won’t define your own quests and principles, Sid has nothing to work with. Helm requires active self-authorship.

5. MVP Feature Specification

5.1 Mission Lifecycle

Description: The core state machine that gives Helm its arc structure. Four states govern the entire product experience.

User stories:

As a user, I can exist in Interlude (no active mission) without the app feeling broken or empty.
As a user, I can start a new mission by entering Ideation and conversing with Sid.
As a user, I can launch a mission from Ideation, transitioning to Active state.
As a user, I can close an active mission deliberately, transitioning to Complete.
As a user, I can view completed missions as read-only archive entries.
As a user, I can decide what happens to each incomplete quest when transitioning between missions.

States:

Interlude — No active mission. The UI is clean and minimal. Habits still track (no points). Completed mission summaries are visible as an archive. The primary CTA is “Plan Next Mission” which transitions to Ideation. If post-MVP independent actions (Pings, Crew, Media) exist, they’re accessible here.

Ideation — Full-screen immersive chat with Sid. The AI ingests context from previous missions (if any) and leads a conversational planning session. It asks about your focus, timeframe, goals, constraints, and proposes a structured mission: theme (1-3 words), start/end dates, quest categories, success criteria, principles, recommended habit setup (which global habits to activate, targets to adjust, any mission-specific habits to add), and a target weekly earn rate for the economy. The user reviews, edits inline, and launches.

Quest rollover: If a previous mission exists with incomplete quests, Sid surfaces them during Ideation as a checklist. For each, the user chooses: carry forward (re-created in the new mission with fresh timestamps, optionally re-sized or re-categorized), archive (kept in history, not carried over), or drop (dismissed). Sid may comment on patterns: “This is the second mission in a row you’re rolling over ‘Photo album.’ Either do it this time or let it go, Captain.”
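
The three rollover choices can be sketched as a small pure function. This is a minimal illustration, not a defined schema: the type names, field names, and `applyRollover` are hypothetical, and treating an undecided quest as "not carried" is an assumption.

```typescript
// Hypothetical shapes for illustration; the spec does not define a schema here.
type RolloverChoice = "carry" | "archive" | "drop";

interface IncompleteQuest {
  id: string;
  name: string;
  category: string;
  size: string;
}

interface RolloverDecision {
  questId: string;
  choice: RolloverChoice;
  category?: string; // optional re-categorization when carrying forward
  size?: string; // optional re-sizing when carrying forward
}

interface CarriedQuest {
  name: string;
  category: string;
  size: string;
  missionId: string;
  createdAt: string; // fresh timestamp, not the original one
}

// Returns the quests to create in the new mission.
// Archived, dropped, and undecided quests create nothing (assumption for undecided).
function applyRollover(
  quests: IncompleteQuest[],
  decisions: RolloverDecision[],
  newMissionId: string,
  now: string
): CarriedQuest[] {
  const carried: CarriedQuest[] = [];
  for (const quest of quests) {
    const decision = decisions.filter((d) => d.questId === quest.id)[0];
    if (decision === undefined || decision.choice !== "carry") continue;
    carried.push({
      name: quest.name,
      category: decision.category ?? quest.category,
      size: decision.size ?? quest.size,
      missionId: newMissionId,
      createdAt: now,
    });
  }
  return carried;
}
```

Keeping the function pure leaves the archive and drop bookkeeping on the old mission to whatever persistence layer is chosen.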

Note: Ideation defines the mission container, not the quest list. Quests flow in continuously during Active state as life generates them (see 5.2). The only quests that may exist at Ideation are rollovers.

First-mission onboarding (zero-history mode):

When a user enters Ideation with no previous missions, Sid follows a structured discovery flow rather than open-ended conversation:

Context gathering (2-3 questions): “What’s your life like right now? What roles do you juggle?” → extracts life areas for categories. “What’s the one thing you’d most like to be different in 3 months?” → seeds theme direction.
Theme proposal: Sid proposes a 1-3 word theme based on the conversation, explains why.
Category suggestion: Sid proposes 3-5 categories based on life areas mentioned. Defaults available: Home, Work, Personal, Health, Relationships.
Habit setup: Sid proposes 3-5 starter habits based on goals mentioned. Defaults available: Exercise (3/week), Reading (4/week), No [vice] (7/week). User can accept, edit, or skip.
Economy defaults: For mission 1, Sid recommends a conservative target weekly earn rate (e.g., €15-20/week) with the explanation: “We’ll tune this after your first few weeks of data.” Habit points use the standard table. Quest points use default size mapping. No calibration attempted — the economy runs on defaults until debrief data accumulates.
Success criteria and principles: Sid prompts for 2-3 success criteria (“What would make this mission a win?”) and optionally 1-2 principles. These can be skipped entirely for mission 1.

The goal: a user with zero history can launch their first mission in under 10 minutes with reasonable defaults, without being overwhelmed by configuration.

Active — The mission is running. Full dashboard, quest board, habit tracker, weekly log, point economy. Theme and dates are locked. Mission name, quests, habits, principles, and success criteria are mutable. No auto-termination on end date — the mission stays Active until manually closed.

Complete — The mission is closed. Before generating the closing summary, the user reviews each success criterion and marks it: achieved, partial, or missed. Sid then generates a closing summary (narrative + basic stats: total quests completed, habit averages, points earned/spent, days active, success criteria outcomes). The summary is stored as a read-only archive entry visible from Interlude. The app transitions to Interlude. (Post-MVP Phase 3 adds a richer “Stat Deck” presentation and cross-mission comparison.)
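
The four states and the transitions described above can be sketched as a lookup table. This is a minimal sketch under stated assumptions: the event names (`planNextMission`, `launch`, `close`, `archive`) and `nextState` are illustrative, not part of the spec, and no "cancel Ideation" path is modeled because the text does not describe one.

```typescript
// The four states named in this section.
type MissionState = "interlude" | "ideation" | "active" | "complete";

// Event names are illustrative assumptions.
type MissionEvent = "planNextMission" | "launch" | "close" | "archive";

// Allowed transitions, taken from the state descriptions above.
const transitions: Record<MissionState, Partial<Record<MissionEvent, MissionState>>> = {
  interlude: { planNextMission: "ideation" },
  ideation: { launch: "active" },
  active: { close: "complete" }, // manual only; no auto-termination on end date
  complete: { archive: "interlude" }, // summary stored, app returns to Interlude
};

// Returns the next state, or throws if the event is not valid in the current state.
function nextState(current: MissionState, event: MissionEvent): MissionState {
  const target = transitions[current][event];
  if (target === undefined) {
    throw new Error(`Invalid transition: ${event} from ${current}`);
  }
  return target;
}
```

For example, `nextState("active", "close")` yields `"complete"`, while `nextState("active", "launch")` throws, which keeps invalid transitions out of the rest of the app.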

Acceptance criteria:

5.2 Quests

Description: Categorized tasks with AI enrichment, step tracking, and point values tied to estimated effort. Quests are mission-scoped and flow in continuously during Active state — they are NOT all defined upfront. This is a GTD-style capture model: life generates tasks, you type them in, Sid enriches them, they land on the board.

Quest categories are user-defined string labels created during Ideation (e.g., “Home,” “Kids,” “Work,” “Personal,” “Laura”). They serve as the primary grouping axis on the quest board. Categories are mutable during Active state — you can add new ones if a quest doesn’t fit existing categories. Categories have no special properties beyond being labels. Sid uses the mission’s category list during enrichment to suggest which category a quest belongs to.

User stories:

As a user, I can type a short description (“fix toilet”) and Sid returns a fully enriched quest (category, size, points, steps).
As a user, I can review and edit the AI enrichment before accepting.
As a user, I can create a quest manually if I prefer (escape hatch).
As a user, I can advance through quest steps, mark quests complete/blocked/dropped.
As a user, I can see all my quests organized by category.
As a user, I can add quests at any time during Active state — not just during mission planning.

Quest taxonomy note: A quest should represent a discrete accomplishable outcome, not an ongoing responsibility (that’s a habit) or a recurring maintenance task (that’s a ping). If a quest would need more than 5 steps, consider breaking it into smaller quests. If it has no clear “done” state, it’s probably not a quest.

AI enrichment flow (optimistic capture):

Frictionless capture is non-negotiable. In GTD terms, if adding a quest takes more than 2 seconds, people stop capturing. The solution: optimistic UI with async enrichment.

User types a short description and hits enter. The quest appears on the board immediately as a bare entry (name only, status: active, marked unenriched).
In the background, Sid enriches: suggested category, size, points, steps, reasoning.
When enrichment completes (target < 5s), the quest card updates with a subtle indicator (“Sid’s suggestions ready”). User can tap to review, edit, accept, or dismiss the enrichment.
If the user never reviews, the quest stays as a bare entry — functional but unstructured. It can be enriched later on demand.
If AI is unavailable or times out, the quest remains bare. Manual enrichment (edit form) is always available.

This preserves instant capture while keeping AI enrichment as the default enhancement. Capture is never blocked by AI latency.
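
The capture flow can be modeled as a handful of pure state transitions on a quest card. The sketch below is hypothetical: `QuestCard`, `Enrichment`, and the function names are assumptions, and the background call to Sid (with its timeout) is deliberately left out. If that call fails or times out, `attachSuggestion` is simply never invoked and the card stays bare.

```typescript
// What Sid returns for a quest (field names assumed from the enrichment list above).
interface Enrichment {
  category: string;
  size: string;
  points: number;
  steps: string[];
  reasoning: string;
}

// Hypothetical card shape: a bare entry has only a name and an active status.
interface QuestCard {
  name: string;
  status: "active";
  enriched: boolean;
  suggestion: Enrichment | null; // pending "Sid's suggestions ready" payload
  category?: string;
  size?: string;
  points?: number;
  steps: string[];
}

// Step 1: instant capture, no AI involved.
function captureQuest(name: string): QuestCard {
  return { name: name.trim(), status: "active", enriched: false, suggestion: null, steps: [] };
}

// Step 3: enrichment arrived; the card only gets an indicator, nothing is applied yet.
function attachSuggestion(card: QuestCard, enrichment: Enrichment): QuestCard {
  return { ...card, suggestion: enrichment };
}

// The user accepts, optionally after editing some suggested fields.
function acceptSuggestion(card: QuestCard, edits: Partial<Enrichment> = {}): QuestCard {
  if (card.suggestion === null) return card;
  const final = { ...card.suggestion, ...edits };
  return {
    ...card,
    enriched: true,
    suggestion: null,
    category: final.category,
    size: final.size,
    points: final.points,
    steps: final.steps,
  };
}

// The user dismisses; the quest stays a bare, functional entry.
function dismissSuggestion(card: QuestCard): QuestCard {
  return { ...card, suggestion: null };
}
```

Separating "suggestion attached" from "suggestion accepted" is what keeps capture from ever waiting on AI latency.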

Quest model:

Name, category, size, points, steps (ordered list), current step index
Status: active | complete | blocked | dropped
Timestamps: created, last advanced, completed
Mission reference (quest belongs to a mission)
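
As an illustration, the model above maps onto a TypeScript interface plus a step-advance helper. The field names and `advanceStep` are hypothetical, and completing the quest automatically when the last step is advanced is an assumption; the spec only says users can advance steps and mark quests complete.

```typescript
type QuestStatus = "active" | "complete" | "blocked" | "dropped";

// Field names are illustrative; they mirror the model listed above.
interface Quest {
  name: string;
  category: string;
  size: string;
  points: number;
  steps: string[]; // ordered list
  currentStepIndex: number;
  status: QuestStatus;
  createdAt: string;
  lastAdvancedAt: string | null;
  completedAt: string | null;
  missionId: string; // the mission this quest belongs to
}

// Advances an active quest by one step and records the timestamp.
// Assumption: advancing past the final step marks the quest complete.
function advanceStep(quest: Quest, now: string): Quest {
  if (quest.status !== "active") return quest; // blocked, dropped, or complete quests do not advance
  const next = quest.currentStepIndex + 1;
  const done = next >= quest.steps.length;
  return {
    ...quest,
    currentStepIndex: done ? quest.steps.length : next,
    lastAdvancedAt: now,
    status: done ? "complete" : "active",
    completedAt: done ? now : null,
  };
}
```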

Size-to-points mapping (default, adjustable per mission during Ideation):

Size	Default points	Effort heuristic
Tiny	1-2	< 30 minutes, single action
Small	3	30 min - 2 hours, or 2-3 steps
Medium	5	Half a day, or multi-step across days
Big	7	Multiple sessions across a week+
Huge	10-15	Multi-week project, significant effort
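
One way to encode this table is as point ranges per size, with per-mission overrides layered on top. This is a sketch under assumptions: `PointRange`, `sizePointsForMission`, and `clampPoints` are hypothetical names, and clamping an AI-suggested value into the size's range is an illustrative policy, not something the spec prescribes.

```typescript
type QuestSize = "tiny" | "small" | "medium" | "big" | "huge";

interface PointRange {
  min: number;
  max: number;
}

// Defaults from the table above; single values are stored as min === max.
const defaultSizePoints: Record<QuestSize, PointRange> = {
  tiny: { min: 1, max: 2 },
  small: { min: 3, max: 3 },
  medium: { min: 5, max: 5 },
  big: { min: 7, max: 7 },
  huge: { min: 10, max: 15 },
};

// Applies per-mission adjustments made during Ideation on top of the defaults.
function sizePointsForMission(
  overrides: Partial<Record<QuestSize, PointRange>>
): Record<QuestSize, PointRange> {
  return { ...defaultSizePoints, ...overrides };
}

// Keeps a suggested point value inside the range for its size (illustrative policy).
function clampPoints(
  size: QuestSize,
  suggested: number,
  table: Record<QuestSize, PointRange> = defaultSizePoints
): number {
  const range = table[size];
  return Math.min(range.max, Math.max(range.min, Math.round(suggested)));
}
```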

Acceptance criteria:

5.3 Habits

Description: Recurring actions tracked for daily consistency. Helm supports two types of habits: global habits that persist across missions (life practices like exercise, meditation, no alcohol) and mission-scoped habits that exist only for a specific mission (experiments like “write 10 min/day” tied to a content-focused quarter). Both types appear together on the habit grid.

Habit setup happens during Ideation: Sid proposes which global habits to activate (with optional target adjustments), and may suggest new mission-scoped habits aligned with the theme. Users can also add or remove habits at any time.

User stories:

As a user, I can define global habits that persist across all missions.
As a user, I can define mission-scoped habits that exist only during a specific mission.
As a user, I can toggle habit completion for each day of the current week.
As a user, I can see my weekly habit grid (rows = habits, columns = Mon-Sun).
As a user, I can review past weeks’ habit data.
As a user, I can adjust habit targets during Ideation for a new mission.
As a user, I can track global habits during Interlude (no points).
As a user, I can see which habits are global vs. mission-scoped (subtle visual distinction).

Habit point calculation:

Points are auto-calculated weekly. At the start of each new week (or more precisely, when the user opens Helm after a week boundary), the system calculates the previous week’s habit percentages, determines points earned, and creates ledger entries automatically. No manual claiming is required. Points only generate during Active mission state.

Weekly completion %	Points earned
90%+	10
75-89%	7
50-74%	5
25-49%	3
10-24%	1
< 10%	0

Overachieving: For MVP, exceeding a habit target (e.g., doing 7/7 when the target is 4/week) earns the same points as hitting 90%+. The target is the contract. Overachieving is its own reward. However, Sid will note sustained overperformance in debriefs and may suggest raising the target: “Writing at 100% for three weeks straight — you’ve outgrown the target. Time to raise the bar, or is this exactly where you want to be, Captain?” Formal overachievement bonuses are a post-MVP consideration (see Habit Mutations, Phase 4).
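
The point table and the overachieving rule can be sketched as two small functions. Assumptions are labeled in the comments: completion is computed as completions divided by the weekly target and capped at 100%, each band is treated as a lower bound (so 89.5% falls in the 75-89% band), and the function names are hypothetical. Whether points are awarded per habit or on an overall percentage is not specified here, so the sketch works on a single percentage.

```typescript
// Completion percentage for one week, capped at 100 so overachieving
// (e.g., 7 completions against a target of 4) counts the same as hitting the target.
function completionPct(completions: number, weeklyTarget: number): number {
  if (weeklyTarget <= 0) return 0; // assumption: a habit without a target earns nothing
  return Math.min(completions / weeklyTarget, 1) * 100;
}

// Band lookup from the table above; each band is treated as a lower bound.
function pointsForCompletion(pct: number): number {
  if (pct >= 90) return 10;
  if (pct >= 75) return 7;
  if (pct >= 50) return 5;
  if (pct >= 25) return 3;
  if (pct >= 10) return 1;
  return 0;
}

// Points only generate during Active mission state.
function weeklyHabitPoints(completions: number, weeklyTarget: number, missionActive: boolean): number {
  if (!missionActive) return 0;
  return pointsForCompletion(completionPct(completions, weeklyTarget));
}
```

For a habit with a target of 4/week, 3 completions give 75% and 7 points, while 7 completions are capped at 100% and give 10 points.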

Acceptance criteria:

5.4 Point Economy & Ledger

Description: An append-only ledger that tracks all point earning and spending. 1pt = 1€ is a self-enforced rule: the system doesn’t touch your bank account, but the peg creates real behavioral stakes. The economy is calibrated by setting a target weekly earn rate during Ideation, then monitored as a living conversation through debriefs.

User stories:

As a user, I can see my current point balance prominently on the dashboard.
As a user, I can see a running ledger of all earn and spend entries.
As a user, I can “quick spend” by entering a description and amount (for unplanned purchases).
As a user, I can see weekly and mission-to-date totals.
As a user, I can set a target weekly earn rate during Ideation as an economy guideline.

Ledger entry types:

earn_quest — points from completing a quest
earn_habit — weekly habit consistency points (auto-calculated)
spend — discretionary spending (user-initiated)
correction — adjustments for errors (e.g., accidental quest completion, duplicate habit award). Always has a description explaining the correction. Ledger entries are immutable — corrections are additive, never destructive edits.
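
A minimal sketch of an immutable, additive ledger built from these entry types. The sign convention (earn entries positive, spend entries negative, corrections signed either way) and the names `makeEntry`, `appendEntry`, and `balance` are assumptions made for illustration.

```typescript
type LedgerEntryType = "earn_quest" | "earn_habit" | "spend" | "correction";

interface LedgerEntry {
  readonly type: LedgerEntryType;
  readonly amount: number; // signed delta to the balance (assumed convention)
  readonly description: string;
  readonly createdAt: string;
}

// Builds an entry with the sign implied by its type.
// Corrections keep the sign they are given and must explain themselves.
function makeEntry(type: LedgerEntryType, points: number, description: string, createdAt: string): LedgerEntry {
  if (type === "correction" && description.trim() === "") {
    throw new Error("A correction must have a description");
  }
  const magnitude = Math.abs(points);
  const amount = type === "spend" ? -magnitude : type === "correction" ? points : magnitude;
  return { type, amount, description, createdAt };
}

// Appending returns a new array; existing entries are never edited or removed.
function appendEntry(ledger: readonly LedgerEntry[], entry: LedgerEntry): readonly LedgerEntry[] {
  return ledger.concat(entry);
}

// The balance is always derived from the full entry history.
function balance(ledger: readonly LedgerEntry[]): number {
  return ledger.reduce((sum, entry) => sum + entry.amount, 0);
}
```

An accidental quest completion worth 5 points would be undone by appending a `correction` of -5 rather than deleting the original `earn_quest` entry.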

Economy calibration — a living process, not a one-time calculation:

During Ideation, the user sets a target weekly earn rate — the answer to “how much discretionary spending per week do I want to earn the right to?” (e.g., €25/week). Sid uses this to sanity-check the habit point scale and suggest adjustments to the size-to-points table if needed.

Since quests are created continuously during Active state (not all loaded upfront), Sid cannot predict exact weekly quest earnings at Ideation time. Instead, the economy is monitored week-over-week through debriefs. Sid comments on actuals vs. target: “You’re averaging 18pts/week earned against a 25pt target. Quest pace is light — either pick up some small wins or expect to dip into the red, Captain.” This makes the economy a living conversation, not a static formula.

Economy defaults and guardrails:

First-mission default: €15-20/week target. Conservative by design — better to start with surplus than deficit. Sid recommends this for mission 1 and adjusts from mission 2 based on actual data.
Target earning mix: approximately 50-60% from quests, 40-50% from habits. If habits alone exceed the target, quests feel meaningless. If quests dominate, habit consistency doesn’t matter. Sid flags imbalances.
Miscalibration signals: If the balance is positive for 3+ consecutive weeks, the economy is too easy — earning exceeds spending and there’s no tension. If negative for 3+ consecutive weeks, it’s too punishing. Sid flags both in debriefs with specific adjustment suggestions.
Mid-mission adjustments: The user can adjust the size-to-points mapping and the target earn rate at any time during Active state. This is not failure — it’s calibration.
Commitment device honesty: The 1pt = 1€ peg is a self-enforced commitment device. Research on commitment devices, and experience with products built on them such as StickK and Beeminder, suggests they work well for people who opt in voluntarily and have skin in the game, but they don’t work for everyone. The economy will be powerful for users who take the peg seriously and hollow for those who don’t. This is acknowledged, not solved. Helm doesn’t try to enforce the peg mechanically; the visibility of the balance and Sid’s commentary are the enforcement mechanisms. If a user ignores the economy entirely, the rest of the system (quests, habits, debriefs) still works: the economy is an amplifier, not a dependency.
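
The miscalibration and earning-mix guardrails above could be checked with logic along these lines. This is a hedged sketch: the function and signal names are hypothetical, "balance" is read as the end-of-week balance, and the 50-60% quest share is applied as a hard band even though the text says "approximately".

```typescript
type EconomySignal = "too_easy" | "too_punishing" | "ok";

const STREAK_WEEKS = 3; // "3+ consecutive weeks" from the guardrail above

// weeklyBalances: end-of-week balances, oldest first (interpretation is an assumption).
function miscalibrationSignal(weeklyBalances: number[]): EconomySignal {
  const recent = weeklyBalances.slice(-STREAK_WEEKS);
  if (recent.length < STREAK_WEEKS) return "ok"; // not enough data to flag anything
  if (recent.every((b) => b > 0)) return "too_easy";
  if (recent.every((b) => b < 0)) return "too_punishing";
  return "ok";
}

type MixSignal = "habits_dominate" | "quests_dominate" | "balanced" | "no_data";

// Compares the quest share of earnings with the 50-60% target band.
function earningMixSignal(questPoints: number, habitPoints: number): MixSignal {
  const total = questPoints + habitPoints;
  if (total === 0) return "no_data";
  const questShare = questPoints / total;
  if (questShare < 0.5) return "habits_dominate";
  if (questShare > 0.6) return "quests_dominate";
  return "balanced";
}
```

Either signal would feed Sid's debrief commentary rather than change anything automatically, in keeping with the economy being a conversation rather than a formula.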

Acceptance criteria:

5.5 AI Copilot — Sid

Description: The AI personality layer that powers Ideation, quest enrichment, weekly debriefs, and on-demand reflection. Sid is a character with a persistent personality — not a generic assistant. Named after Sidra from Becky Chambers’ A Closed and Common Orbit.

Personality spec:

You are Sid, the AI copilot of Helm — a personal operating system where
someone organizes their life into story arcs called missions.

Your personality:
- Competent and direct. You respect the Captain's time.
- Sardonic warmth. You're genuinely invested in their success, but you find
  their patterns entertaining. You're the ship's copilot who's seen it all
  and still shows up for every mission.
- The tone model: imagine a toned-down version of the AI dungeon master from
  Matt Dinniman's "Dungeon Crawler Carl" — a darkly humorous artificial
  intelligence that runs a deadly game show, is wryly amused by the
  contestants' struggles, but develops genuine investment in their survival
  over time. You're not running a death game — you're running a life game.
  Same energy, lower stakes, warmer heart.
- You reference ship/space metaphors naturally but don't overdo it.
- You're honest about failures without being cruel. "Meditation at zero for the
  third week. At this point the habit isn't meditation — it's ignoring
  meditation. Want to drop it or actually do it, Captain?" not "Great effort on
  meditation this week!"
- You address the user as "Captain" by default (configurable in profile).
- You keep responses concise. No filler. No sycophancy.
- When you don't have enough data to say something meaningful, say so.
  Don't fabricate patterns.

Context about the Captain and their current mission will be injected below.

How Sid accumulates context (not “gets smarter”):

Sid doesn’t learn in any ML sense. Each interaction simply has more stored data to reference than the last: this is context accumulation, not machine learning.

Context assembly strategy:

LLMs have finite context windows. Dumping raw JSON from multiple missions tends to produce generic or hallucinated output, so each interaction type has its own context budget:

Quest enrichment (~500 tokens): mission categories + 5 most recent quests for naming/sizing consistency.
Weekly debrief (~2,000 tokens): full current week data + compressed mission summary (theme, weeks elapsed, balance, economy target).
Ideation (~3,000 tokens): compressed summaries of up to 3 previous missions (theme, quest completion rate, habit averages, economy behavior, success criteria outcomes) + global habit definitions + rollover quests.
On-demand chat (~1,500 tokens): current mission summary + last 2 weeks of activity.

Previous missions are never injected as raw data. They’re compressed into structured summaries at mission close (stored in closing_summary). This is the product’s equivalent of progressive summarization.
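
The per-interaction budgets can be sketched as a lookup plus a simple packer that adds context sections in priority order until the budget is spent. Everything here is illustrative: the names are hypothetical, and `estimateTokens` uses a rough "about four characters per token" rule of thumb, which is an assumption and not how any particular tokenizer counts.

```typescript
type Interaction = "questEnrichment" | "weeklyDebrief" | "ideation" | "onDemandChat";

// Approximate budgets from the list above.
const contextBudgets: Record<Interaction, number> = {
  questEnrichment: 500,
  weeklyDebrief: 2000,
  ideation: 3000,
  onDemandChat: 1500,
};

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Packs sections in the order given (highest priority first) and stops
// at the first section that would exceed the budget.
function assembleContext(interaction: Interaction, sections: string[]): string {
  const budget = contextBudgets[interaction];
  const kept: string[] = [];
  let used = 0;
  for (const section of sections) {
    const cost = estimateTokens(section);
    if (used + cost > budget) break;
    kept.push(section);
    used += cost;
  }
  return kept.join("\n\n");
}
```

Ordering the sections by priority means the compressed mission summaries would be dropped before the current week's data, not the other way around.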

Tone guardrails:

Never mock effort, only patterns. “You tried meditation in 3 missions and dropped it each time” is acceptable. “You’re bad at meditation” is not.
Never comment on the person, only the data.
Sensitive pattern detection: If all habits drop to 0% for 2+ weeks, Sid’s tone shifts from sardonic to gentle: “It’s been quiet. Everything okay, Captain? If things are rough, the system can wait.”
Never fabricate patterns. If data is insufficient, say so.
Unacceptable responses: “Wow, another week of not meditating. Classic you.” (mocking) / “Your spending suggests poor impulse control.” (psychologizing) / “Great job completing 2 quests!” (sycophantic) / “You should consider therapy.” (overstepping)
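
The sensitive-pattern rule (all habits at 0% for 2+ weeks) is simple enough to sketch as a check that runs before Sid's prompt is assembled. The data shape and the names `WeekHabits` and `sidTone` are hypothetical; returning the default tone when there is less than two weeks of data follows the "never fabricate patterns" guardrail.

```typescript
// One week of habit data: habit name mapped to completion percentage (0 to 100).
type WeekHabits = Record<string, number>;

type SidTone = "sardonic" | "gentle";

const QUIET_WEEKS = 2; // "2+ weeks" from the guardrail above

// weeks: oldest first. Returns "gentle" only when every tracked habit
// sat at 0% in each of the most recent weeks.
function sidTone(weeks: WeekHabits[]): SidTone {
  const recent = weeks.slice(-QUIET_WEEKS);
  if (recent.length < QUIET_WEEKS) return "sardonic"; // insufficient data, no pattern claimed
  const allQuiet = recent.every((week) => {
    const values = Object.keys(week).map((habit) => week[habit]);
    return values.length > 0 && values.every((pct) => pct === 0);
  });
  return allQuiet ? "gentle" : "sardonic";
}
```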

Sid’s touchpoints in MVP:

Ideation conversation — Multi-turn planning session. Sid asks 1-2 questions at a time (never a wall of text). Ingests previous mission data if available. Proposes structured mission parameters. Surfaces quest rollover candidates from previous mission. Recommends habit setup (which global habits to activate, target adjustments, mission-specific habits to add).
Quest enrichment — Single-turn. Receives short quest description + mission categories. Returns structured enrichment (category, size, points, steps, brief reasoning).
Weekly debrief — Presented on first session after the preferred debrief day (default: Monday, configurable). Sid receives the full week’s data: quests completed/advanced, habit percentages, points earned/spent, balance, economy target comparison. Returns a narrative summary (3-5 sentences), notable observations, and economy commentary.
On-demand chat — The user can talk to Sid anytime during Active state. Sid has access to current mission context (quests, habits, balance, recent activity). Useful for reflection, venting, or asking “what should I work on?”
Mission closing summary — When transitioning Active → Complete, Sid generates a closing narrative + stats summary for the mission archive.

Acceptance criteria:

5.6 Dashboard (Active State)

Description: The command deck. Information-dense overview of the running mission. The screen you see most.

Widgets:

Mission header: Name, theme badge, date range, days remaining / days elapsed progress bar
Balance readout: Large monospace number, green/red, total earned and spent this mission
Quest status panel: Counts by status (active / complete / blocked). Category breakdown as small badges
Habit pulse: Current week’s grid — small dots per habit per day. Overall weekly percentage and projected points
Recent activity feed: Last 5-8 ledger entries with timestamps
Sid panel: Pending debrief indicator (“Week 4 debrief ready”) or quick-chat entry point

Acceptance criteria:

Dashboard renders within 2 seconds on mobile
All widgets show loading states, not blank space
Tapping any widget navigates to the relevant detail page
Dashboard is responsive: 2-column on desktop, stacked on mobile
Empty states have helpful copy, not blank boxes

5.7 Design Language — Wayfarer Cockpit

Description: The visual identity of Helm. Warm, information-dense, retro-futuristic but cozy.

Color palette:

Token	Hex	Usage
`helm-hull`	`#1a1a2e`	Deep navy background — the ship’s hull
`helm-cream`	`#e8e0d4`	Warm cream text — amber readouts
`helm-amber`	`#d4a574`	Primary accent — active elements, the ship’s color
`helm-panel`	`#22223a`	Card/panel backgrounds
`helm-border`	`#2d2d44`	Subtle panel borders
`helm-muted`	`#6b6b8a`	Secondary text, inactive elements
`helm-surface`	`#292942`	Elevated surfaces, hover states
`helm-positive`	`#7cb87c`	Green — completed, earned, healthy
`helm-negative`	`#c75c5c`	Red — spent, negative balance, overdue
`helm-warning`	`#d4a03c`	Gold — warnings, blocked, streaks
`helm-info`	`#5c8cc7`	Blue — informational

Typography:

Monospace for data, numbers, points, dates: JetBrains Mono (Google Fonts)
Sans-serif for labels, body text: Space Grotesk (Google Fonts)
Base size: 14px desktop, 15-16px mobile for readability. Dense use of 10-11px uppercase tracking-wide labels on desktop; slightly larger (12px) on mobile.

Component patterns:

Panels: Rounded corners (8px), 1px border helm-border, background helm-panel. No drop shadows.
Section headers: Left border accent bar (4px helm-amber), uppercase label in helm-muted.
Data readouts: Monospace values with muted uppercase labels above.
Interactive elements: Amber accent on hover/active. 150ms transitions. Touch targets minimum 44px on mobile.
Status badges: Small pill shapes, color-coded by size or status.

Signal decay (visual principle, applied across features):

Items with a “last interacted” timestamp (quests, and later pings and crew) are rendered with brightness/opacity based on recency: recently-advanced items glow warm, stale items dim. Purely visual, with no thresholds to configure.

Formula: freshness = 1.0 - clamp(daysSinceLastAction / decayDays, 0, 1), applied as an opacity modifier on borders or a subtle glow. For quests, decayDays = 14. For future features (pings, crew), decay is personalized based on historical frequency. This is a design pattern, not a standalone feature.

Accessibility note: Signal decay must not rely solely on opacity. Add a secondary indicator (e.g., a small “days since” label or icon change) so the information is available to users who cannot perceive subtle brightness differences.
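
The signal decay formula can be sketched in TypeScript as follows. This is a minimal illustration rather than the shipped implementation: the function names, the 0.35 opacity floor, and the label format are assumptions. Only the formula itself and decayDays = 14 for quests come from this section.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const QUEST_DECAY_DAYS = 14; // decayDays for quests, as stated in this section

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

/** 1.0 for an item touched just now, falling linearly to 0.0 at decayDays. */
function freshness(
  lastActionAt: Date,
  now: Date,
  decayDays: number = QUEST_DECAY_DAYS,
): number {
  const daysSinceLastAction =
    (now.getTime() - lastActionAt.getTime()) / MS_PER_DAY;
  return 1.0 - clamp(daysSinceLastAction / decayDays, 0, 1);
}

/** Assumption: opacity never drops below a floor, so stale items stay legible. */
function borderOpacity(fresh: number, floor: number = 0.35): number {
  return floor + (1 - floor) * fresh;
}

/** Secondary indicator that does not depend on opacity (hypothetical format). */
function daysSinceLabel(lastActionAt: Date, now: Date): string {
  const elapsedMs = now.getTime() - lastActionAt.getTime();
  const days = Math.max(0, Math.floor(elapsedMs / MS_PER_DAY));
  return days === 0 ? "today" : `${days}d ago`;
}
```

A quest advanced seven days ago would get a freshness of 0.5 and a label of "7d ago", so the recency information survives even when the opacity difference is imperceptible.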

Accessibility requirements:

The Wayfarer cockpit aesthetic is opinionated and dense, which creates inherent accessibility tension. These are the non-negotiable baselines:

Contrast: All text/background combinations must meet WCAG AA minimum contrast ratios (4.5:1 for normal text, 3:1 for large text). The current palette needs verification: helm-muted (#6b6b8a) on helm-panel (#22223a) works out to roughly 3:1, which only clears the large-text threshold, so helm-muted will likely need lightening wherever it carries normal-size text. Small uppercase labels (10-12px) must be tested especially carefully.
Keyboard navigation: All interactive elements (habit toggles, quest cards, buttons, form fields) must be reachable and operable via keyboard. Focus states must be visible (use helm-amber outline).
Screen readers: Semantic HTML throughout. Habit grid cells need ARIA labels (“Writing, Monday, completed” / “Writing, Tuesday, not completed”). Status badges need text alternatives. The dashboard must read coherently in linear order.
Touch targets: Minimum 44x44px on all interactive mobile elements (already specified in component patterns).
Color independence: Status information (positive/negative balance, quest status, habit completion) must not rely on color alone. Use icons, labels, or shape in addition to color.
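
One way to verify the palette against the contrast requirement is to compute WCAG contrast ratios directly from the hex tokens. The sketch below follows the standard WCAG relative-luminance formula; the function names are illustrative and not part of any existing Helm code. By this calculation, helm-muted on helm-panel comes out at roughly 3:1.

```typescript
/** Convert an 8-bit sRGB channel to linear light, per the WCAG definition. */
function channelToLinear(channel8bit: number): number {
  const c = channel8bit / 255;
  // The WCAG text uses 0.03928 as the threshold; 0.04045 is the sRGB value.
  // Both give identical results for 8-bit channel values.
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

/** Relative luminance of a "#rrggbb" color. */
function relativeLuminance(hex: string): number {
  const n = parseInt(hex.replace("#", ""), 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

/** Contrast ratio between two colors, from 1 (identical) to 21 (black/white). */
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

/** WCAG AA: 4.5:1 for normal text, 3:1 for large text. */
function meetsAA(ratio: number, largeText: boolean = false): boolean {
  return ratio >= (largeText ? 3 : 4.5);
}
```

Running every text/background pairing from the palette table through contrastRatio during the build would catch regressions if a token is later adjusted.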

Responsive density principle — same data, different density per viewport:

The cockpit aesthetic is designed for desktop: dense, multi-column, everything visible at a glance. On mobile (< 768px), the same data is presented at lower density — not a scaled-down cockpit, but a purpose-built mobile experience optimized for quick actions: toggle a habit, complete a quest step, quick spend, check your balance. Deep review (weekly log, debriefs, quest board browsing) is a desktop experience. The mobile layout prioritizes speed and touch-friendliness over information density.

Layout:

Desktop (≥ 768px): Fixed left sidebar (200px) + main content. Sidebar: mission name, balance readout, nav links with icons.
Mobile (< 768px): Bottom tab bar (4-5 main tabs: Dashboard, Quests, Habits, Log, Sid). No sidebar. Cards are full-width, stacked vertically. Touch targets are generous (44px minimum). Typography scales up slightly for readability.

6. Post-MVP Feature Roadmap

Phase 2 — Life Layer

Pings (recurring maintenance tracker) — “When did I last clean the cat litter? Water the plants? Change the bed sheets?” Pings are global (not mission-scoped) timestamp trackers for recurring life maintenance that isn’t a habit and isn’t a quest. No targets, no points, no guilt — just timestamps with visual freshness decay based on your personal frequency. Each ping has a name, icon, and a “last pinged” timestamp. The system learns your average frequency from history and uses it to calculate decay. One tap to log. Dashboard surfaces the 2-3 most overdue pings. User story: “As a user, I can see at a glance which recurring maintenance tasks are overdue relative to my own patterns, without any of them being framed as failures.”

Crew Manifest (relationship tracker) — A personal CRM for the people who matter. Each crew member has a name, relationship type, preferred contact method, and a last-contacted timestamp. Like pings, the system learns your personal contact frequency per person and uses it for decay. Sorted by “needs attention.” One tap to log contact. Dashboard surfaces 2-3 crew members most overdue. User story: “As a user, I can maintain my relationships intentionally without feeling like I need to contact everyone every week.”

Media Log (entertainment tracker) — Personal media diary for TV, movies, anime, documentaries, podcasts. Each entry has a title, type, status (want to watch / watching / completed / dropped), optional rating, and notes. Completing media earns hobby points via the ledger during Active state. Sid references your media backlog in bounties (Phase 4). User story: “As a user, I can track what I’m watching, what I want to watch, and earn points for completing media during a mission.”

Captain’s Log (journaling) — Freeform journal entries, optionally AI-prompted. Fills the reflection gap that debriefs don’t cover: unstructured brain dumps, Struthless VOMIT-style “vent then organize” flow, personal narrative not tied to weekly data. Entries are timestamped, optionally tagged, searchable. Sid can offer prompts but never requires them. Entries are private and never summarized without consent. User story: “As a user, I can journal freely within Helm without needing a separate app, and my reflections live alongside my mission data.”

Interlude Enhancements — Richer Interlude UI with access to all independent actions, a gallery of past missions, and a visible habit grid (no points). The Interlude should feel like a calm harbor between voyages, not an empty screen.

Phase 3 — Identity

Character Sheet (reflection surface) — Not an RPG stat screen with arbitrary STR/INT/CHA scores. Instead, a data-driven reflection surface where every number traces back to something you actually did, and every insight suggests something you could do differently.

Components:

Category balance — A radar or proportional view showing how your effort distributes across life areas (Home, Kids, Work, Personal, etc.) across missions. Actionable: “I’ve been 70% Home quests for two missions. Am I neglecting personal projects, or is that what this season requires?”
Habit story — A timeline of your relationship with each habit across missions. What you’ve sustained, what you’ve dropped, what’s evolved. Actionable: “I’ve tried meditation in 3 missions and dropped it every time. Either commit or stop pretending.”
Economy profile — Lifetime earn/spend patterns, average balance, spending categories, weekly burn rate. Actionable: “I consistently overspend in weeks 3-4. Set up guardrails.”
Titles — Data-driven achievements, honestly earned. “Debt Walker” because you were in the red for 3 weeks. “Questbreaker” because you finished 10+ quests. Fun, but grounded in real data — not arbitrary thresholds on made-up stats.
Sid’s observations — Accumulated AI-detected behavioral patterns across missions. Not RPG “traits” — honest notes: “You complete physical quests fast but creative projects stall. You’re a sprinter, not a marathoner, Captain.”
Level — Simple function of lifetime points. Satisfying and harmless.

The Character Sheet earns its place because it informs the next Ideation: “Sid, I see my category balance is skewed toward Home. Let’s rebalance this mission.” Every data point on the sheet is a conversation starter, not a vanity metric.

User story: “As a user, I can see a living, data-driven portrait of who I’ve been across all my missions — and use it to plan who I want to be next.”

Mission Retrospective (Stat Deck + History, merged) — When a mission completes, the MVP generates a basic closing summary. Phase 3 upgrades this to a rich “Stat Deck”: a multi-slide, progressively-disclosed summary with bold typography showing Sid’s closing narrative, total quests completed, habit averages, points earned and spent, titles earned, and comparison to previous missions. Each completed mission’s Stat Deck is accessible from a Mission History page, which shows all past missions with trend lines across them. The Stat Deck is the “end credits” for your arc. Mission History is the bookshelf where all your arcs live.

User story: “As a user, completing a mission feels like finishing a chapter, and I can browse all my chapters from one place.”

Phase 4 — Emergence

Tavern Bounties (AI-generated challenges) — 3-5 rotating weekly challenges generated by Sid based on quest patterns, habit gaps, hobby inertia, crew neglect, and media backlog. Types include single (one-off) and streak (multi-week, bonus on completion). Sid generates bounties that are relevant and personal: “No board game plays logged in 3 weeks. Play one this weekend for 1pt.” User story: “As a user, I receive personalized weekly challenges that nudge me toward neglected areas of my life.”

Salvage Run (micro-quest suggestions) — “I have 15 minutes. What should I do?” Sid receives your available time + active quests with current steps, and suggests the highest-value micro-action. Optimized for fragmented time. User story: “As a user, I can make progress on quests even in small time windows by getting specific, actionable suggestions from Sid.”

Habit Mutations (AI-suggested adjustments) — Sid analyzes multi-week habit trends and suggests evolutions (harder targets for sustained 90%+) or simplifications (easier targets for sustained <25%). Also handles overachievement: sustained exceeding of targets prompts a “raise the bar” suggestion. Mutations are presented as actionable cards — accept to apply, dismiss to keep current. User story: “As a user, my habit targets evolve over time based on my actual performance.”

Wishlist (aspirational spending targets) — Items you want to buy, with euro/point cost. Shows “affordable” vs. “saving toward” based on current balance. Affordability projection based on recent earning rate. Purchasing creates a spend entry. User story: “As a user, I can see what I’m saving toward and how long it’ll take to earn it.”
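
The Wishlist affordability projection could look something like the sketch below. It assumes the "recent earning rate" is a plain average over the last few weeks and that the projection rounds up to whole weeks; the window size, names, and return shape are all hypothetical, since this section does not specify them.

```typescript
type Affordability =
  | { status: "affordable" }
  | { status: "saving"; shortfall: number; weeksRemaining: number | null };

/** Assumption: earning rate = mean of the most recent `window` weekly totals. */
function averageWeeklyEarn(weeklyEarned: number[], window: number = 4): number {
  const recent = weeklyEarned.slice(-Math.max(1, window));
  if (recent.length === 0) return 0;
  return recent.reduce((sum, pts) => sum + pts, 0) / recent.length;
}

/** Classify a wishlist item against the current balance and earning rate. */
function projectAffordability(
  cost: number,
  balance: number,
  weeklyEarnRate: number,
): Affordability {
  if (balance >= cost) return { status: "affordable" };
  const shortfall = cost - balance;
  // With no recent earnings there is no meaningful projection.
  const weeksRemaining =
    weeklyEarnRate > 0 ? Math.ceil(shortfall / weeklyEarnRate) : null;
  return { status: "saving", shortfall, weeksRemaining };
}
```

For example, a 60-point item with a balance of 20 and an average of 16 points per week would be reported as "saving" with three weeks remaining.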

7. Technical Architecture

Platform stack

Layer	Technology	Rationale
Frontend	React 18, TypeScript, Vite, TailwindCSS 3	Industry standard, fast dev loop, type safety
State management	Zustand + React Query	Zustand for UI state + local cache; React Query for server state + sync
Backend	Supabase (hosted Postgres + Auth + Edge Functions + Realtime)	Eliminates custom backend for CRUD; RLS for multi-user path; generous free tier
AI	Supabase Edge Functions → Anthropic Claude API (Sonnet)	Edge functions for server-side AI calls; keeps API key secure; low latency
PWA	vite-plugin-pwa	Installable on phone/desktop; offline shell caching
Hosting	Vercel or Netlify (frontend) + Supabase (backend)	Zero-config deployment; free tier covers single-user

Data model

missions

Column	Type	Notes
`id`	uuid (PK)
`user_id`	uuid (FK → auth.users)	RLS-ready for multi-user
`name`	text	Mutable during Active
`theme`	text	1-3 words. Immutable after launch
`status`	enum	interlude, ideation, active, complete
`start_date`	date	Immutable after launch
`end_date`	date	Target end. Immutable after launch
`completed_at`	timestamptz	Null until Complete
`closing_summary`	jsonb	Sid-generated narrative + stats on completion
`categories`	text[]	Array of user-defined category labels
`success_criteria`	text[]	Array of success statements
`principles`	text[]	Array of mission principles
`target_weekly_earn`	integer	Economy guideline
`point_balance`	integer	Denormalized running balance
`created_at`	timestamptz

quests

Column	Type	Notes
`id`	uuid (PK)
`mission_id`	uuid (FK → missions)	Mission-scoped
`user_id`	uuid (FK)	RLS
`name`	text
`category`	text	From mission’s categories
`size`	enum	tiny, small, medium, big, huge
`points`	integer
`steps`	jsonb	Ordered array of step strings
`current_step`	integer	0-indexed
`status`	enum	active, complete, blocked, dropped
`ai_enriched`	boolean	Was this created via Sid?
`rolled_over_from`	uuid (FK → quests, nullable)	If carried from previous mission
`created_at`	timestamptz
`last_advanced_at`	timestamptz
`completed_at`	timestamptz

habits

Column	Type	Notes
`id`	uuid (PK)
`user_id`	uuid (FK)	RLS
`mission_id`	uuid (FK → missions, nullable)	Null = global. Set = mission-scoped
`name`	text
`target_description`	text	e.g., “10 min”
`days_per_week`	integer	Target frequency
`icon`	text	Lucide icon name
`is_active`	boolean	Can be paused
`created_at`	timestamptz

habit_logs

Column	Type	Notes
`id`	uuid (PK)
`habit_id`	uuid (FK → habits)
`user_id`	uuid (FK)	RLS
`date`	date
`done`	boolean
`week_start`	date	Monday of the week

ledger

Column	Type	Notes
`id`	uuid (PK)
`mission_id`	uuid (FK → missions)	Mission-scoped
`user_id`	uuid (FK)	RLS
`type`	enum	earn_quest, earn_habit, spend, correction
`description`	text
`points`	integer	Positive for earn, negative for spend
`week_start`	date	Monday of the week
`created_at`	timestamptz
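
Both habit_logs and ledger carry a week_start column defined as the Monday of the week. A minimal sketch of a row type and the Monday calculation follows. The type mirrors the ledger table above; the helper names are illustrative, and dates are handled in UTC as an assumption to avoid local-timezone drift.

```typescript
type LedgerType = "earn_quest" | "earn_habit" | "spend" | "correction";

interface LedgerRow {
  id: string;
  mission_id: string;
  user_id: string;
  type: LedgerType;
  description: string;
  points: number; // positive for earn, negative for spend
  week_start: string; // "YYYY-MM-DD", Monday of the week
  created_at: string; // ISO timestamp
}

/** Parse "YYYY-MM-DD" as a UTC midnight Date. */
function parseIsoDate(isoDate: string): Date {
  const year = Number(isoDate.slice(0, 4));
  const month = Number(isoDate.slice(5, 7));
  const day = Number(isoDate.slice(8, 10));
  return new Date(Date.UTC(year, month - 1, day));
}

/** Return the Monday (as "YYYY-MM-DD") of the Mon-Sun week containing isoDate. */
function weekStart(isoDate: string): string {
  const date = parseIsoDate(isoDate);
  // getUTCDay() is 0 for Sunday; shift so that Monday = 0 ... Sunday = 6.
  const daysSinceMonday = (date.getUTCDay() + 6) % 7;
  date.setUTCDate(date.getUTCDate() - daysSinceMonday);
  return date.toISOString().slice(0, 10);
}
```

Computing week_start in one shared helper keeps habit logs and ledger entries grouped into the same weeks on every device.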

ai_conversations

Column	Type	Notes
`id`	uuid (PK)
`user_id`	uuid (FK)	RLS
`mission_id`	uuid (FK, nullable)	Null for Ideation (mission not yet created)
`type`	enum	ideation, debrief, chat, enrichment
`messages`	jsonb	Array of {role, content} pairs
`created_at`	timestamptz

Key technical flows

Quest creation (optimistic UI + async enrichment):

User types short description and hits enter (available anytime during Active state)
Frontend immediately creates a bare quest in Supabase (name, status: active, ai_enriched: false). Quest appears on board instantly.
In parallel, frontend sends description + mission categories to Edge Function for enrichment
Edge Function calls Claude with Sid system prompt + quest enrichment prompt + recent quests for context (~500 tokens)
Claude returns structured JSON (category, size, points, steps, reasoning)
Frontend updates the quest record with enrichment data and shows “Sid’s suggestions ready” indicator on the card
User taps to review, edit, accept, or dismiss the enrichment at their convenience
If Edge Function fails or times out (>5s), quest remains bare — functional but unenriched. Manual edit always available.
Unenriched quests default to: no category, no size, no points, no steps. They can still be completed but earn no points until sized.
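
Because the enrichment arrives as model-generated JSON, it is worth validating before it touches the quest record. The sketch below shows one possible validator for the structured response described in the steps above. The field names follow the list in that flow and the size values follow the quests table; the rule that points must be a non-negative integer and the decision to reject unknown categories are assumptions.

```typescript
const QUEST_SIZES = ["tiny", "small", "medium", "big", "huge"] as const;
type QuestSize = (typeof QUEST_SIZES)[number];

interface Enrichment {
  category: string;
  size: QuestSize;
  points: number;
  steps: string[];
  reasoning: string;
}

/** Returns a validated enrichment, or null so the quest simply stays bare. */
function parseEnrichment(
  raw: string,
  missionCategories: string[],
): Enrichment | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // malformed JSON never reaches the data model
  }
  if (typeof data !== "object" || data === null) return null;
  const { category, size, points, steps, reasoning } = data as Record<
    string,
    unknown
  >;

  if (typeof category !== "string" || !missionCategories.includes(category)) {
    return null;
  }
  if (
    typeof size !== "string" ||
    !(QUEST_SIZES as readonly string[]).includes(size)
  ) {
    return null;
  }
  if (typeof points !== "number" || !Number.isInteger(points) || points < 0) {
    return null;
  }
  if (!Array.isArray(steps) || !steps.every((s) => typeof s === "string")) {
    return null;
  }
  if (typeof reasoning !== "string") return null;

  return {
    category,
    size: size as QuestSize,
    points,
    steps: steps as string[],
    reasoning,
  };
}
```

Returning null on any failure matches the fallback in the flow: the quest remains functional but unenriched, and manual edit is still available.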

Weekly debrief generation:

On session start, frontend checks: is today past the preferred debrief day AND no debrief exists for the previous week?
If yes, frontend queries week’s data: quests completed/advanced, habit logs, ledger entries
Frontend sends aggregated week data to Edge Function
Edge Function calls Claude with Sid system prompt + debrief prompt + week data + economy target
Claude returns narrative + observations + economy commentary as JSON
Frontend stores debrief and presents it in a dedicated panel
User can dismiss and revisit from the weekly log
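
The first step of the debrief flow, the session-start check, might be sketched like this. It rests on two assumptions that the text leaves open: the debrief day itself counts as eligible (change `>=` to `>` for the stricter reading of "past"), and existing debriefs are tracked by the week_start of the week they cover. All names are illustrative.

```typescript
/** Parse "YYYY-MM-DD" as a UTC midnight Date. */
function toUtcDate(isoDate: string): Date {
  const year = Number(isoDate.slice(0, 4));
  const month = Number(isoDate.slice(5, 7));
  const day = Number(isoDate.slice(8, 10));
  return new Date(Date.UTC(year, month - 1, day));
}

/** Weekday index within a Mon-Sun week: Monday = 0 ... Sunday = 6. */
function weekdayIndex(isoDate: string): number {
  return (toUtcDate(isoDate).getUTCDay() + 6) % 7;
}

/** Monday ("YYYY-MM-DD") of the week before the one containing `today`. */
function previousWeekStart(today: string): string {
  const date = toUtcDate(today);
  date.setUTCDate(date.getUTCDate() - weekdayIndex(today) - 7);
  return date.toISOString().slice(0, 10);
}

/**
 * True when a debrief for the previous week should be generated.
 * `debriefDay` uses the same Monday = 0 indexing (default: Monday).
 * `debriefedWeekStarts` holds the week_start of every week already debriefed.
 */
function isDebriefDue(
  today: string,
  debriefedWeekStarts: ReadonlySet<string>,
  debriefDay: number = 0,
): boolean {
  const reachedDebriefDay = weekdayIndex(today) >= debriefDay; // assumption: inclusive
  return reachedDebriefDay && !debriefedWeekStarts.has(previousWeekStart(today));
}
```

Keying debriefs by week_start also makes the check safe to run on every session start, since a second call in the same week finds the stored debrief and returns false.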

Multi-device sync and offline writes:

Supabase Postgres is the single source of truth
React Query handles server state with stale-while-revalidate
Zustand stores UI-only state (selected tab, expanded panels) in memory
Optimistic offline writes for low-conflict operations: Habit toggles and quest step advancement are queued locally if offline and synced on reconnect. These are simple, idempotent operations with low conflict risk. Quest creation, quick spend, and AI interactions require connectivity.
PWA caches the shell and read-only data via vite-plugin-pwa
Conflict resolution: last-write-wins (acceptable for single primary user)
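
The offline queue for habit toggles and quest step advancement could be sketched as below. To keep replays idempotent, this sketch records the resulting state ("set done to true") rather than the gesture ("toggle"), and collapses repeated writes to the same target so only the latest survives. That modeling choice, the operation shapes, and the class name are assumptions, not decided design.

```typescript
type OfflineOp =
  | { kind: "habit_set"; habitId: string; date: string; done: boolean }
  | { kind: "quest_step_set"; questId: string; currentStep: number };

/** One key per write target, so later writes replace earlier ones. */
function opKey(op: OfflineOp): string {
  return op.kind === "habit_set"
    ? `habit:${op.habitId}:${op.date}`
    : `quest:${op.questId}`;
}

class OfflineQueue {
  private ops = new Map<string, OfflineOp>();

  /** Queue a write made while offline; last write per target wins. */
  enqueue(op: OfflineOp): void {
    const key = opKey(op);
    this.ops.delete(key); // re-insert so the newest write moves to the end
    this.ops.set(key, op);
  }

  get size(): number {
    return this.ops.size;
  }

  pending(): OfflineOp[] {
    return Array.from(this.ops.values());
  }

  /**
   * Send queued writes in order on reconnect. If `send` throws, the failed
   * write and everything after it stay queued for the next attempt.
   */
  async flush(send: (op: OfflineOp) => Promise<void>): Promise<void> {
    for (const [key, op] of Array.from(this.ops.entries())) {
      await send(op);
      // Only drop the entry if it was not replaced while sending.
      if (this.ops.get(key) === op) this.ops.delete(key);
    }
  }
}
```

In this sketch `send` would wrap whatever call writes to Supabase; because each operation states the final value, sending the same one twice leaves the row unchanged.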

Ledger trust mechanics:

The ledger is the source of truth for the economy. point_balance on the missions table is a cached projection, recalculated from ledger entries on app load.
Habit point awards use idempotency keys (habit_id + week_start) to prevent duplicate entries across devices or after cache weirdness.
Ledger entries are immutable. Corrections are made by adding a new entry (type: correction) rather than editing or deleting existing entries.
On app load, if the cached balance diverges from the ledger sum, the cache is silently corrected. The ledger is never wrong; the cache can be.
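
The ledger trust rules above can be illustrated with an in-memory sketch. It is not the actual persistence layer: the ledger table as listed has no idempotency column, so the `idempotencyKey` field here is hypothetical, and in Postgres the same guarantee would more likely come from a uniqueness constraint. Function names are illustrative.

```typescript
type LedgerType = "earn_quest" | "earn_habit" | "spend" | "correction";

interface LedgerEntry {
  type: LedgerType;
  description: string;
  points: number; // positive for earn, negative for spend
  idempotencyKey?: string; // hypothetical field; set for habit awards
}

/** Key for a weekly habit award: habit_id + week_start, as described above. */
function habitAwardKey(habitId: string, weekStart: string): string {
  return `${habitId}:${weekStart}`;
}

/** Append-only insert. A repeated idempotency key is ignored, never duplicated. */
function appendEntry(
  ledger: readonly LedgerEntry[],
  entry: LedgerEntry,
): LedgerEntry[] {
  const key = entry.idempotencyKey;
  if (key !== undefined && ledger.some((e) => e.idempotencyKey === key)) {
    return ledger.slice();
  }
  return [...ledger, entry];
}

/** The ledger sum is the authoritative balance. */
function ledgerBalance(ledger: readonly LedgerEntry[]): number {
  return ledger.reduce((sum, entry) => sum + entry.points, 0);
}

/** On app load: trust the ledger, and report whether the cache was stale. */
function reconcile(
  cachedBalance: number,
  ledger: readonly LedgerEntry[],
): { balance: number; corrected: boolean } {
  const balance = ledgerBalance(ledger);
  return { balance, corrected: balance !== cachedBalance };
}
```

A mistaken entry is fixed by appending a `correction` entry with the opposite sign, which keeps the history intact while bringing the sum back in line.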

8. Cost & Infrastructure

Monthly cost breakdown

Service	Free tier	At scale (100 users)	Notes
Supabase	500MB DB, 50k auth users, 500k Edge Function invocations	$25/mo (Pro plan)	Free tier covers single-user indefinitely
Anthropic Claude API	N/A (pay per use)	~$5-15/mo for single user	~10 enrichments/week + 1 debrief + occasional chat ≈ $0.30-0.50/week on Sonnet
Vercel / Netlify	Free tier (100GB bandwidth)	Free tier likely sufficient	Static PWA deployment
Google Fonts	Free	Free	JetBrains Mono + Space Grotesk
Domain	Subdomain of existing domain	—	No additional cost

Cost summary by phase

Phase	Monthly cost	Notes
MVP (you only)	~$2-5/mo	Supabase free tier + Claude API usage
Early users (10-50)	~$30-40/mo	Supabase Pro + higher Claude usage
Growth (100+)	~$50-100/mo	Supabase Pro + significant Claude usage

Maintenance burden

Supabase: Managed. No server patching, no DB administration.
Frontend: Vercel auto-deploys from git. No CI/CD to maintain.
AI: Claude API is stateless. No model training, no fine-tuning. Personality lives in prompts.
PWA: Service worker updates automatically with vite-plugin-pwa.
Total estimated maintenance: 1-2 hours/month for dependency updates and monitoring.

9. Design Principles

Arcs over infinity. Every interaction should reinforce that this system has seasons, not streaks. Rest is designed in, not failed into.
Honest over encouraging. Sid tells you what the data says, not what you want to hear. The system earns trust through accuracy, not positivity.
Dense on desktop, focused on mobile. Desktop screens feel like cockpit readouts — rich with information, organized so your eye finds what matters. Mobile screens are purpose-built for quick actions: toggle, complete, spend, check. Same data, different density per viewport.
Earn it, spend it. Points exist to create consequences. If earning feels too easy or spending feels meaningless, the economy is broken. Tension in the balance is the feature.
Your story, your rules. The system provides structure (missions, states, economy) but the content (quests, habits, principles, categories) is entirely user-defined. Sid suggests; the Captain decides.

10. Success Metrics

North star metric

Weekly active sessions: the number of times per week the user opens Helm, with a floor of two (once to log, once to review). If this number drops, nothing else matters.

Supporting metrics

Metric	Target (single user)	Why it matters
Weekly active sessions	≥ 2 per week	Core engagement — are you using it?
Quest completion rate	60-80% per mission	Too high = quests are too easy. Too low = system is discouraging.
Habit consistency	50-75% average	Sustainable range. 90%+ for weeks on end suggests undertargeting.
AI debrief read rate	90%+ of generated debriefs	Is Sid delivering value?
Mission completion rate	80%+ (not dropped/abandoned)	Are missions scoped correctly?
Points balance oscillation	Oscillates around zero, not permanently positive or negative	Economy is calibrated.
Time-to-first-quest	< 10 minutes from first Ideation	Onboarding is smooth.
Quest enrichment acceptance rate	> 70% accepted without major edits	Sid understands your categories and sizing.

Kill criteria — when to stop

If after two complete missions, you’re opening the MDX page instead of Helm, the product has failed its core premise. If Sid’s debriefs feel generic for 4+ consecutive weeks, the AI integration has failed. If the economy balance is permanently ignored (no spending logged for 3+ weeks during Active), the economy has failed. These are honest signals that the concept doesn’t work, and it’s better to learn that than to keep building features on a broken foundation.

Qualitative metrics (collected via simple in-app feedback)

Debrief usefulness: After reading a debrief, one-tap: “This was specific and useful” / “This was generic.” Target: 70%+ useful.
Mission planning confidence: After completing Ideation, one-tap: “I feel clear about this mission” / “I’m still fuzzy.” Target: 80%+ clear.
Enrichment accuracy: After reviewing quest enrichment, the edit rate is the implicit signal. Explicit: category/size changed = miss.

11. Risks & Mitigations

Risk	Description	Severity	Mitigation
AI quality	Sid’s debriefs are generic or hallucinated	High	Inject full week data as structured context. Validate JSON output. Include “I don’t have enough data” fallback.
AI cost spiral	Excessive Claude API usage drives costs above budget	Medium	Rate-limit enrichment calls. Cache debriefs. Use Haiku for enrichment, Sonnet for debriefs. Monitor weekly.
Onboarding friction	First Ideation with zero history produces a weak mission plan	High	Design first-Ideation prompts to gather context conversationally. Accept that mission 1 will be manually tuned. Sid improves from mission 2.
Economy imbalance	Earning rates too high (no tension) or too low (punishing)	Medium	Target weekly earn rate as guideline. Sid flags imbalances in debriefs. User can adjust mid-mission.
Scope creep	MVP grows before the spine is solid	High	Strict phase gating. Don’t start Phase 2 until Phase 1 is deployed and used for at least 2 weeks.
Single user bottleneck	Designing for yourself makes the product ungeneralizable	Low	Supabase RLS + auth from day one. user_id on every table. UI decisions documented.
Supabase dependency	Platform changes, pricing, or outages	Low	Standard Postgres underneath. Data exportable. Frontend decoupled.
Habit tracking fatigue	Daily toggles become tedious	Medium	Minimal grid (tap to toggle). Habit mutations (Phase 4) suggest dropping stale habits.
Mobile readability	Dense cockpit aesthetic doesn’t translate to small screens	Medium	Dedicated mobile layout with reduced density, larger touch targets, action-first design.
Self-enforcement gap	Point economy has no mechanical enforcement — user can cheat	Medium	By design. The economy is a commitment device, not a lock. Research on commitment devices (StickK, Beeminder) shows they work for self-selected users but not universally. The economy is an amplifier — the rest of the system works without it.
Debrief quality degradation	Sid’s debriefs become generic/repetitive after many weeks	Medium	Debrief content varies with data — a high-spend week produces a different debrief than a zero-spend week. If debriefs feel generic, it’s a signal that the context assembly or prompt needs work, not that the concept is wrong. Kill criterion: 4+ weeks of “generic” feedback from the user.
Product overfit	The aesthetic, personality, and philosophy are so specific they only appeal to the creator	Medium	Intentional. “Build for one, architect for many” means the first user’s taste IS the product. If it doesn’t generalize, it’s still a successful personal tool.
AI cost unpredictability	Power users using on-demand chat extensively could blow through the API budget	Medium	Rate-limit on-demand chat to N messages/day. Use Haiku for enrichment, Sonnet for debriefs and Ideation. Monitor per-user API spend weekly.
Context window limitations	Shoving too much history into prompts degrades AI quality	High	Context assembly strategy with token budgets per interaction type (see 5.5). Previous missions stored as compressed summaries, not raw data.

12. Monetization Strategy

Phase 1-2: Free for personal use. This is a learning project that’s also a viable product. No monetization until others use it.

Phase 3+: Open-core model.

Free tier: Full feature set for a single user. Self-hosted option.
Paid tier ($5-8/month): Cloud-hosted with AI features (Sid requires Claude API calls that cost real money). Multiple missions in parallel. Data export. Priority support.
AI costs are the natural paywall. Sid is the premium feature. The system works without AI (manual quest creation, no debriefs) but it’s dramatically better with it.

This is realistic, not ambitious. The TAM for “adults who want a narrative life operating system” is small. The goal is sustainability (covering hosting + AI costs), not venture scale.

13. Known Limitations & Default Behaviors

Gap	Default MVP behavior	Revisit when…
Limited offline write support	Habit toggles and quest step advancement work offline (queued and synced on reconnect). Quest creation, spending, and AI interactions require connectivity.	Users request full offline-first. Evaluate conflict resolution complexity for spending and quest creation.
No notifications / reminders	Sid doesn’t push. You pull. Debriefs are presented on next session.	Retention data suggests reminders would help, not annoy.
No multi-mission	One Active mission at a time.	Users request parallel tracks. Evaluate UX complexity.
No import / export	No way to import from Habitica, Notion, or spreadsheets. No data export.	Other users want to migrate in/out.
No shared access	Single user. Partner can’t see the dashboard.	Partner visibility requested. Add read-only shared view.
No calendar integration	Quests have no due dates or calendar sync.	Time-blocking becomes relevant.
No undo on quest completion	Completing a quest is final (points are earned).	Accidental completions happen. Add a 5-minute undo window.
Habit points are weekly only	No daily point granularity. Weekly auto-calculation.	Daily feedback loop requested.
No dark/light mode toggle	Dark mode only (Wayfarer cockpit).	Users request light mode. Dark is canonical.
AI can’t access external data	Sid only knows what’s in Helm.	Integration requests. Evaluate API-by-API.
Week definition is fixed (Mon-Sun)	Configurable debrief day, but weeks are Mon-Sun.	Non-standard work weeks requested.
No journaling in MVP	Reflection via debriefs and on-demand Sid chat.	Captain’s Log planned for Phase 2.
No overachievement bonuses	Exceeding habit target earns same as 90%+.	Habit Mutations (Phase 4) will address with “raise the bar” suggestions.
No enforcement on economy	Self-enforced. System doesn’t prevent spending real money beyond balance.	By design. Accountability, not restriction.
Minimal mission archive in MVP	Basic closing summary, not a rich Stat Deck.	Phase 3 adds the full retrospective experience.
No accessibility audit	Color contrast, keyboard nav, and screen reader support are design goals but not verified against WCAG AA. Signal decay (opacity-based) needs a secondary indicator that does not depend on opacity.	Before any public launch. Accessibility is non-negotiable for a multi-user product.
No data deletion flow	No “delete my account and all data” capability.	Before multi-user launch. GDPR-aware design required. Add data export and right-to-deletion.
No coexistence with existing tools	Helm is an island — no import, export, API, or integrations.	When adoption data shows users want Helm alongside (not instead of) existing tools.
Ledger corrections are manual	Correction entries exist (type: `correction`) but must be created manually. No automated detection of duplicates or errors.	Add automated duplicate detection and a “reverse last entry” quick action.
No testing strategy in PRD	Testing (unit, integration, AI output validation) defined during spec-units, not here.	Implementation phase. AI JSON validation is especially critical — malformed Sid output must never corrupt the data model.

Scope-cut priority (if behind at week 6, cut in this order):

Signal decay visuals (cosmetic, not functional)
On-demand Sid chat (debriefs and enrichment are sufficient for MVP)
AI Ideation (replace with manual mission creation form; add AI Ideation for mission 2)
Quest rollover (handle manually for mission 1-to-2 transition)
Mission archive (just transition to Interlude without a summary)

Absolute minimum shippable product: Manual mission creation + AI-enriched quests (with optimistic capture) + habit grid + ledger + weekly debrief. Everything else is enhancement.

14. Decisions Log

#	Question	Decision	Rationale
1	Who is the target user?	You first, others someday	Build for real usage, architect for scale.
2	Tech stack?	Supabase + React/TS/Vite/Tailwind PWA	Eliminates backend complexity. Postgres is portable. Auth + RLS ready for multi-user.
3	MVP scope?	Mission lifecycle → Quests → Habits → Points → AI debrief + quest rollover + mission archive	The spine plus the transition mechanics that make multi-mission work.
4	How many mission states?	4: Interlude → Ideation → Active → Complete	Preserves Interlude (rest as feature) and Ideation (planning as conversation).
5	Is AI Ideation MVP?	Yes	Killer feature. Makes onboarding work. Nobody else has it.
6	Mission cadence?	Flexible, user-defined dates	No hardcoded quarterly assumption.
7	Auto-terminate on end date?	No	Stays Active until manually closed.
8	Quest creation flow?	AI-first with manual fallback	Sid enriches from short description. Manual form is escape hatch.
9	Habit scoping?	Both — global + mission-scoped	Global for life practices. Mission-scoped for experiments.
10	Point economy model?	1pt = 1€, self-enforced, target weekly earn rate as living guideline	Real stakes without mechanical enforcement. Monitored in debriefs.
11	Mission principles?	First-class, flexible per mission	Guardrails defined during Ideation. Not enforced by system.
12	AI debrief cadence?	Weekly on first session after debrief day + on-demand	Respects schedule. No push.
13	Device strategy?	Both — mobile logging, desktop review, different density	PWA + Supabase sync.
14	AI copilot name?	Sid	Nod to Sidra from Chambers. Crewmate energy.
15	User callsign?	“Captain” by default, configurable in profile	Warmer than “Commander.” Editable.
16	AI personality?	Sardonic warmth, DCC-inspired	Honest, concise, amused by patterns, genuinely invested.
17	Design language?	Wayfarer cockpit — dark/warm/amber/dense on desktop, focused on mobile	Opinionated. Different density per viewport.
18	Product name?	Helm (proposed)	Where you steer the ship.
19	Quest timing?	Continuous during Active	GTD capture model. Ideation defines container, not task list.
20	Quest rollover?	MVP feature	Core to mission-to-mission transition.
21	Habit points?	Auto-calculated weekly, no manual action	Reduces friction.
22	Overachieving?	Same points as 90%+. Sid notes it.	Formal bonuses deferred to Phase 4 Habit Mutations.
23	Journaling?	Phase 2 (Captain’s Log)	Debriefs cover structured reflection. Freeform is valuable but not spine.
24	Character Sheet?	Reflection surface, not RPG stats	Category balance, habit story, economy profile, titles, Sid’s observations. Data-driven, not arbitrary.
25	Mission archive in MVP?	Yes — basic closing summary	Full Stat Deck + History in Phase 3.
26	Signal decay?	Design pattern in 5.7, not standalone feature	Applied visually to quests (and later pings/crew).
27	“Grows smarter”?	Replaced with “accumulates richer context”	Honest about what the AI does. No ML. Context injection.
28	Economy enforcement?	Self-enforced, by design	Commitment device. Works for self-selected users, not universally. Economy is an amplifier, not a dependency.
29	Quest capture latency?	Optimistic UI — quest created instantly, enrichment async	Frictionless capture is non-negotiable for GTD-style input. AI enrichment enhances but never blocks.
30	First-mission onboarding?	Concrete 6-step discovery flow with defaults	First Ideation can’t rely on history. Structured questions, starter categories, conservative economy defaults (€15-20/week), skip-able principles. Under 10 minutes to launch.
31	AI context window?	Token-budgeted context assembly per interaction type	Enrichment: ~500 tokens. Debrief: ~2,000. Ideation: ~3,000. Previous missions as compressed summaries, never raw JSON.
32	Ledger trust?	Immutable entries, idempotency keys, cached balance reconciliation	Ledger is source of truth. Balance is projection. Corrections are new entries, not edits. Habit awards use idempotency keys.
33	Sid tone guardrails?	Negative examples defined, sensitive pattern detection rules	Never mock effort. Never psychologize. Shift to gentle mode on multi-week zero activity. Never fabricate patterns.
34	Kill criteria?	Defined: MDX fallback, generic debriefs, ignored economy	Honest failure signals tied to specific product bets. Better to learn the concept doesn’t work than to keep building.
35	Offline writes?	Optimistic for habit toggles and step advancement only	Low-conflict, idempotent operations queued locally and synced on reconnect. Quest creation, spending, and AI require connectivity.
36	Scope-cut priority?	Ordered list: signal decay → on-demand chat → AI Ideation → rollover → archive	Absolute minimum: manual mission creation + AI quests + habits + ledger + weekly debrief.
37	Mission closing?	Success criteria reviewed (achieved/partial/missed) before summary	Adds accountability to arc endings. Sid references outcomes in closing narrative.
38	Accessibility?	WCAG AA baseline, keyboard nav, screen readers, color independence	Non-negotiable for multi-user. Verify palette contrast. Signal decay needs secondary non-visual indicator.
39	Economy defaults?	First mission: €15-20/week target, 50-60% quests / 40-50% habits earning mix	Conservative start. Sid flags miscalibration (3+ weeks surplus or deficit). Mid-mission adjustments encouraged.
40	Data deletion?	Not in MVP. Required before multi-user launch.	GDPR-aware design. Right to deletion, data export. Known limitation with clear trigger.
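
Illustration for decision 32 (ledger trust). The following is a minimal in-memory sketch of how immutable entries, idempotency keys, and a cached balance could fit together. It is not the production schema: the names Ledger, LedgerEntry, append, reconcile, and the key format are hypothetical, and a real implementation would live in Postgres with a unique constraint on the idempotency key.

```typescript
// Hypothetical sketch of decision 32. All names here are illustrative assumptions.
type EntryKind = "quest_award" | "habit_award" | "spend" | "correction";

interface LedgerEntry {
  readonly id: number;
  readonly kind: EntryKind;
  readonly points: number; // positive = earned, negative = spent
  readonly idempotencyKey: string;
  readonly note: string;
}

class Ledger {
  private readonly entries: LedgerEntry[] = [];
  private readonly seenKeys = new Set<string>();
  private cachedBalance = 0; // projection of the entries, never the source of truth

  // Appends a frozen entry unless its idempotency key was already recorded.
  // Returns true when a new entry was written, false on a duplicate.
  append(kind: EntryKind, points: number, idempotencyKey: string, note = ""): boolean {
    if (this.seenKeys.has(idempotencyKey)) return false;
    this.seenKeys.add(idempotencyKey);
    const entry: LedgerEntry = Object.freeze({
      id: this.entries.length + 1,
      kind,
      points,
      idempotencyKey,
      note,
    });
    this.entries.push(entry);
    this.cachedBalance += points;
    return true;
  }

  balance(): number {
    return this.cachedBalance;
  }

  // Recomputes the balance from the entries and repairs the cache if it drifted.
  reconcile(): { ok: boolean; balance: number } {
    const actual = this.entries.reduce((sum, e) => sum + e.points, 0);
    const ok = actual === this.cachedBalance;
    this.cachedBalance = actual;
    return { ok, balance: actual };
  }

  history(): readonly LedgerEntry[] {
    return this.entries;
  }
}

// Usage: a retried weekly habit award is written once; a mistake is fixed by a new entry.
const ledger = new Ledger();
ledger.append("habit_award", 12, "habit-award:mission-1:week-3");
ledger.append("habit_award", 12, "habit-award:mission-1:week-3"); // duplicate, ignored
ledger.append("spend", -5, "spend:coffee-001");
ledger.append("correction", 2, "correction:coffee-001", "overcharged by 2");
console.log(ledger.balance()); // 9
```

The correction is a third entry rather than an edit to the spend, so the history stays auditable and the balance can always be rebuilt from it.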

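Illustration for decision 31 (AI context window). One way to read "token-budgeted context assembly" is a greedy fill by priority against a per-interaction budget. The sketch below uses the budgets from the decision log; everything else is an assumption for illustration, including the names assembleContext and ContextSection, the priority field, and the rough four-characters-per-token estimate (a real implementation would use the model's own token counting).

```typescript
// Hypothetical sketch of decision 31. Budgets come from the decision log;
// names and the token heuristic are illustrative assumptions.
type Interaction = "enrichment" | "debrief" | "ideation";

const TOKEN_BUDGET: Record<Interaction, number> = {
  enrichment: 500,
  debrief: 2000,
  ideation: 3000,
};

interface ContextSection {
  label: string;
  text: string; // already compressed to a summary, never raw JSON
  priority: number; // lower number = more important
}

// Rough heuristic (assumption): about 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function assembleContext(
  kind: Interaction,
  sections: ContextSection[],
): { included: string[]; tokens: number; prompt: string } {
  const budget = TOKEN_BUDGET[kind];
  const included: string[] = [];
  const parts: string[] = [];
  let used = 0;

  // Walk sections from most to least important and keep whatever still fits.
  for (const s of [...sections].sort((a, b) => a.priority - b.priority)) {
    const rendered = `[${s.label}]\n${s.text}\n\n`;
    const cost = estimateTokens(rendered);
    if (used + cost > budget) continue; // too large; a smaller later section may still fit
    included.push(s.label);
    parts.push(rendered);
    used += cost;
  }
  return { included, tokens: used, prompt: parts.join("").trimEnd() };
}

// Usage: the same sections yield different context depending on interaction type.
const sections: ContextSection[] = [
  { label: "quest", text: "q".repeat(400), priority: 1 },
  { label: "mission", text: "m".repeat(1200), priority: 2 },
  { label: "history", text: "h".repeat(2000), priority: 3 },
];
const enrichment = assembleContext("enrichment", sections);
const debrief = assembleContext("debrief", sections);
console.log(enrichment.included); // ["quest", "mission"]
console.log(debrief.included); // ["quest", "mission", "history"]
```

Skipping an oversized section instead of truncating it keeps each included summary intact; truncation would be a reasonable alternative if partial history is preferable to none.
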
15. Appendix

A. Competitor deep dives

Habitica — Free with $4.99/mo subscription. 15M+ downloads. RPG gamification with 8-bit pixel art. Three task types: habits, dailies, to-dos. Party system for group accountability. HP loss mechanic for missed dailies creates anxiety in some users. Community gutted in 2023 (guilds and Tavern removed). No AI. No mission/arc concept. Currency (gold) buys pixel gear — no real-world stakes. Strengths: proven gamification loop, open source. Weaknesses: childish aesthetic, no reflection/narrative, users outgrow it, community collapse.

Notion Life OS templates — $15-100 one-time purchase (templates). Requires Notion ($8-10/mo for AI features). Gamified Life OS, LiFE RPG, Life OS Dashboard are the leading options. Offer deep framework integration (up to 47 productivity models), AI agents for weekly/monthly review, identity-based tracking. Strengths: extreme customization, deep frameworks, active community. Weaknesses: Notion is slow on mobile, templates are fragile, setup takes hours, no native PWA, AI features require Notion Plus plan.

Fabulous — $39.99/year premium. 37M+ users. Behavioral science-backed from Duke University. Journey-based progression with habit stacking. Beautiful onboarding. Strengths: scientific backing, gorgeous design, guided journeys. Weaknesses: prescriptive not self-authored, limited free tier, aggressive upsells, not customizable for non-standard schedules.

Theme System Journal — $20-25 per journal (physical). Quarterly subscription available. Designed by CGP Grey and Myke Hurley (Cortex podcast). Seasonal theme + daily journal + habit tracking. Strengths: powerful framework, tactile object, community. Weaknesses: analog only, no data analysis, no AI, expensive ($80-100/year).

LifeUp — $4 one-time purchase. Android only. Highly customizable gamification sandbox. Strengths: one-time purchase, extreme customization, privacy-focused (offline-first). Weaknesses: Android only, no AI, steep learning curve, no mission concept.

OpenClaw — Free/open-source AI agent platform (68k+ GitHub stars). Runs locally, connects to 50+ integrations, community-built skills including gamification-xp (XP/levels/badges via Supabase). Strengths: unlimited extensibility, model-agnostic, privacy-first, community-driven. Weaknesses: requires significant technical setup, no designed experience, no opinionated structure — it’s a toolkit, not a product.

B. Framework influence map

Framework	Author	Key concept used in Helm
Atomic Habits	James Clear	Identity-based change; systems > goals; 4 laws → Character Sheet reflection, habit tracking
Theme System	CGP Grey / Myke Hurley	Seasonal themes as directional compass → Mission themes, 1-3 word constraint
VOMIT System	Struthless (Campbell Walker)	Bare minimum vs. killing it; buckets; 70% rule → Energy-aware design, life categories, Captain’s Log
PARA Method	Tiago Forte	Organize by actionability → Mission-scoped vs. global data architecture
Building a Second Brain	Tiago Forte	CODE workflow; progressive summarization → AI debrief as distillation
Deep Work	Cal Newport	Protected focus; time scarcity → Salvage Run, fragment-friendly UX
Clear Thinking	Shane Parrish	Decision defaults; decision journals → Principles, economy as pre-commitment
GTD	David Allen	Capture everything; weekly review; trusted system → Quest board (continuous capture), weekly log
Bullet Journal	Ryder Carroll	Rapid logging; migration → Quick-add quests, quest rollover
Zettelkasten	Niklas Luhmann	Networked knowledge; communication partner → AI as cross-data insight engine
Digital Gardens	Various	Living content; seedling→evergreen lifecycle → Mission as evolving organism
OKRs	Intel/Google	Objectives + Key Results → Theme + Success Criteria
Spaced Repetition	Various	Review at optimal intervals → Habit mutations, signal decay

C. Aesthetic & narrative inspirations

Source	Type	What it contributes to Helm
The Long Way to a Small, Angry Planet (Becky Chambers)	Novel	Emotional north star. The Wayfarer is warm, lived-in, functional, personal. Sid’s name comes from Sidra in the sequel, A Closed and Common Orbit.
Caves of Qud (Freehold Games)	Game	Emergent narrative from structured systems. Procedural history. Information-dense warmth. “Wild garden of emergent narrative.”
Dungeon Crawler Carl (Matt Dinniman)	Novel series	Sid’s personality inspiration. In the series, an AI runs a deadly, televised dungeon game — darkly humorous, sardonic, surprisingly invested in the contestants’ survival. Sid takes this energy and warms it: amused by your patterns, honest about your failures, genuinely invested in your success.
LitRPG genre (various)	Literary genre	The “status screen for real life” concept. Visible progression as narrative engine. Character growth that’s measurable and satisfying. Titles and levels as honest achievements.

D. Data & API landscape

Service	Role	Free tier	API available
Supabase	Backend (Postgres, Auth, Edge Functions)	500MB DB, 50k users, 500k invocations	Yes — REST, GraphQL, Realtime
Anthropic Claude API	AI (Sid)	None (pay per use)	Yes — Messages API
Google Fonts	Typography	Unlimited	CDN
Lucide React	Icons	Open source	npm package
Vercel	Frontend hosting	100GB bandwidth	Git deploy

E. Research sources

Habitica App Store reviews (iOS + Google Play), Trustpilot reviews
Gamified Life OS, LiFE RPG, Life OS Dashboard — Notion template marketplaces
Fabulous App Store reviews, Trustpilot, Choosing Therapy review
Theme System Journal — themesystem.com, Cortex podcast episodes, Pen Addict review
“Why I stopped using Habitica” — Yuv Saxena (Substack)
“5 Best Habitica Alternatives in 2026” — habi.app
Zettelkasten.de introduction, Maggie Appleton digital garden history
Caves of Qud press kit, Game Developer interview, RPGFan review
Atomic Habits cheat sheet (thebehavioralscientist.com)
Building a Second Brain definitive guide (fortelabs.com)
Struthless VOMIT System documentation
CGP Grey yearly themes (cgpgrey.substack.com)
Dungeon Crawler Carl (Matt Dinniman) — novel series, AI personality reference
OpenClaw documentation, showcase, DigitalOcean overview, gamification-xp skill