Case Study
AI-assisted journaling
built on why people quit.
Most journaling apps assume you already want to write. This one is designed for the days you don't. The behavioral science says those are the days that matter most.
40+
Components
25+
Research Papers
Neomorphism
Design System
Solo Build
Timeline
The Problem
I kept failing at journaling. Not because I didn't want to reflect. I'd start a journal, write for a few days, and then hit a day where I felt like I had nothing to say. No big insights. No interesting events. Just a blank page and the vague sense that I should write something.
So I wouldn't write. And then the next day felt harder because I'd already broken the streak. Within a week or two, the journal was abandoned.
This pattern repeated for years. The problem wasn't discipline. The problem was that on low-insight days, journaling felt pointless, and pointlessness kills habits.
Existing tools fell into two camps:
Dumb storage
A text box. You write, it saves. No structure, no help noticing patterns.
Intrusive AI
The tool wants to write for you. It summarizes, suggests, rewrites. You're just the prompter.
Neither approach solved the blank-page problem. Neither helped me sustain the habit. The broader landscape wasn't much better:
The Problem
Most habit and goal-tracking apps rely on streaks, gamification, and social pressure — mechanisms that research shows can undermine intrinsic motivation. They treat accountability as punishment rather than self-understanding.
- Streak anxiety leads to burnout
- Public commitment can reduce commitment-making
- External rewards crowd out internal motivation
The Vision
An app that acts like a thoughtful accountability partner. It uses AI to surface patterns, supports self-determination, and builds habits through understanding.
- Private by default, optional sharing
- Evidence-based habit formation (Lally et al.)
- AI that asks questions, not gives orders
Reflection isn't the hard part. Sustaining it is.
The Approach
Writrospect treats AI as a reflective companion. You write freely, even if it's short, even if it feels insignificant. The AI reads what you've written and notices patterns you might miss: recurring themes, commitments you've mentioned, connections to previous entries. It suggests additions to your journal based on your conversation with it.
But you decide what to keep. The AI never writes your entry for you.
The AI gives you something to respond to, which is easier than generating from scratch. It also helps you notice that even “boring” days have threads worth following.
The app's navigation mirrors how reflection actually turns into action:
You write, surface what matters, break it down, figure out how to approach it, and see where you stand. Each screen builds on the last.
Strategic Positioning
Most accountability apps lean punitive: streaks, shame, gamification. Writrospect occupies the underserved quadrant: AI-powered and genuinely supportive.
[Positioning chart. Axis labels: AI-Powered, Manual]
Why behavioral science matters
Any app can add AI chat. The harder part is grounding every feature decision in peer-reviewed research and deliberately excluding popular patterns (streaks, leaderboards, public shaming) when the evidence says they backfire. That discipline shaped every decision in Writrospect.
Research to Product
The features aren't invented. Each one traces to a specific paper. A few examples:
Lally et al. (2010)
“Habit formation takes an average of 66 days, with a range of 18–254 days depending on complexity.”
Flexible formation tracking
No rigid 21-day or 30-day timelines. Progress adapts to the individual habit’s complexity and the user’s context.
Lally et al. (2010)
“Missing a single day did not meaningfully derail the habit formation process.”
No streak-shaming
A missed day is acknowledged without penalty. The UI avoids broken-streak visuals that induce guilt.
Ryan & Deci (2000)
“Threats, deadlines, and imposed goals diminish intrinsic motivation and undermine self-regulation.”
No punitive features
No fines, no public failure announcements, no countdown timers. Motivation comes from within, not from fear.
Gollwitzer (1999)
“Implementation intentions (if-then plans) produce strong effects on goal attainment (d = .61–.77).”
Strategies feature
Users create implementation intentions as “Strategies” — concrete if-then plans attached to their goals.
Munson et al. (2015)
“Public accountability increases commitment-keeping but suppresses commitment-making.”
Private by default
All data is private. Sharing is opt-in per item. This preserves the willingness to set ambitious goals.
The discipline of what I don't build is the real differentiator.
What I Deliberately Don’t Build
These are popular features in the accountability space. I excluded every one, not from oversight, but from evidence.
Streaks
Streak counters punish missed days and create anxiety around maintaining arbitrary chains. A single lapse becomes a catastrophic reset rather than a minor blip.
Lally et al. (2010): missing one day doesn’t reset habit formation. Ryan & Deci (2000): deadlines and threats diminish intrinsic motivation.
Leaderboards
Comparative rankings turn self-improvement into competition. Users optimize for rank rather than personal growth, and those at the bottom feel demotivated.
Ryan & Deci (2000): pressured evaluations and competitive contexts diminish intrinsic motivation and autonomous self-regulation.
Public Shaming
Making failures visible to others increases follow-through on existing commitments but discourages people from setting ambitious goals in the first place.
Munson et al. (2015): public accountability suppresses commitment-making. People set easier goals when they know failure is visible.
Gamification Points
Points, badges, and XP systems create extrinsic reward loops that crowd out the intrinsic satisfaction of genuine self-improvement.
Ryan & Deci (2000): external rewards and contingent reinforcements undermine internal motivation, especially for interesting tasks.
The Neuroscience of Follow-Through
When Writrospect detects that a user is setting a goal, making a commitment, or tracking a habit, the system doesn't just store it. It activates a set of design patterns rooted in how the brain actually processes identity and future planning.
Most productivity tools treat follow-through as a reminder problem. Set an alarm, get a notification, do the thing. But the failure isn't at the reminder stage. It's upstream, in how people relate to their own commitments.
Your future self is a stranger
The anterior cingulate cortex (ACC) activates strongly when you think about your present self, but weakly when you think about your future self. The activation pattern for “me in five years” looks more like “a stranger” than “me right now.” This isn’t a metaphor. It’s measurable neural activity.
Design implication
People don’t procrastinate because they’re lazy. They procrastinate because their brain literally doesn’t register future consequences as happening to them. The tracking system needs to collapse that distance.
ACC activation predicts behavior
The strength of ACC activation for your future self correlates with real-world outcomes: savings rates, exercise frequency, follow-through on commitments. People with stronger future-self continuity make better long-term decisions consistently.
Design implication
This isn’t about willpower or discipline. It’s about how vividly you can imagine yourself later. Design can intervene here by making the future self concrete and present.
Identity follows behavior, not the reverse
The “pencil-in-teeth” effect: in the original facial-feedback experiments, participants who held a pen between their teeth, which forces a smile shape, rated cartoons as funnier, although later replications have been mixed. Identity may work in a similar way. You don’t need to feel like a disciplined person to act like one. Repeated behavior, reflected back, builds the identity.
Design implication
The system should reflect patterns back with evidence: “You’ve done this three times now. That’s what someone who cares about this does.” Let self-concept catch up to behavior.
Your brain treats “you tomorrow” like a stranger.
The anterior cingulate cortex fires strongly for present-self recognition, weakly for future-self. This measurable gap in neural activation explains why people neglect their own commitments. Design can close it.
Designing the Tracking System
These neuroscience findings translated directly into design principles for how Writrospect responds when it detects a goal, habit, or commitment in a journal entry. Each principle has a concrete “instead of this, do this” pattern baked into the AI's system prompt.
Earned identity reinforcement
Instead of
“Great job! You’re so disciplined!”
The system says
“You followed through on [X]. That’s the third time this week. That’s what someone who [identity] does.”
Every identity claim is anchored to evidence the user actually produced. No hollow affirmations.
Future-self bridging
Instead of
“Remember to do this later!”
The system says
“Imagine yourself tonight at 11pm. What do you need from yourself right now?”
Second person (“you at 11pm”) instead of third person (“future you”). Collapse the temporal distance.
Dependency chain surfacing
Instead of
“Set a reminder for 11pm bedtime”
The system says
“For an 11pm sleep goal, dinner needs to be done by 10. When do you need to start cooking?”
People see the goal but not the upstream prerequisites. The system backward-chains from the target.
Failure as data
Instead of
“You missed your goal. Try harder tomorrow!”
The system says
“What got in the way? One missed day doesn’t change who you are. You’re still the person who [past evidence].”
Capture the blocker as a learned dependency for future planning. Preserve accumulated identity evidence.
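The dependency chain pattern above amounts to backward chaining from a target time. The following is a minimal sketch of that idea, not the app's implementation: the `Step` type, the `backwardChain` function, and the step durations are all assumptions made for illustration.

```typescript
// Hypothetical types and names; the durations below are illustrative assumptions.
interface Step {
  label: string;
  minutes: number; // how long the step takes
}

interface ScheduledStep extends Step {
  startBy: string; // latest start time, "HH:MM" on a 24-hour clock
}

// Format minutes-since-midnight as "HH:MM".
function formatClock(totalMinutes: number): string {
  const wrapped = ((totalMinutes % 1440) + 1440) % 1440;
  const h = Math.floor(wrapped / 60);
  const m = wrapped % 60;
  return (h < 10 ? "0" : "") + h + ":" + (m < 10 ? "0" : "") + m;
}

// Walk backward from the target time, subtracting each prerequisite's duration.
// `steps` is listed in the order the user would actually perform them.
function backwardChain(targetMinutes: number, steps: Step[]): ScheduledStep[] {
  let cursor = targetMinutes;
  const scheduled: ScheduledStep[] = [];
  for (const step of steps.slice().reverse()) {
    cursor -= step.minutes;
    scheduled.unshift({ ...step, startBy: formatClock(cursor) });
  }
  return scheduled;
}

// An 11pm sleep goal with three assumed prerequisites.
const plan = backwardChain(23 * 60, [
  { label: "cook dinner", minutes: 45 },
  { label: "eat dinner", minutes: 30 },
  { label: "wind down", minutes: 60 },
]);
```

With these assumed durations, dinner has to be finished by 22:00, which is the kind of upstream prerequisite the system is meant to surface as a question rather than a reminder.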
The system also maintains a running record of stated values, commitments made, follow-through events, recurring themes, identity aspirations, past blockers, and learned dependencies. This history is what makes identity reinforcement feel earned rather than manufactured.
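One way to keep identity statements tied to evidence is to generate them only from the running record. This sketch assumes a simplified record shape (`FollowThrough`) and a hypothetical `earnedReinforcement` helper; the thresholds are illustrative, not values from the app.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
interface FollowThrough {
  commitment: string; // e.g. "evening walk"
  dayOffset: number;  // how many days ago it happened (0 = today)
}

// Build a reinforcement line only when the user's own history supports it.
// Returns null when there is not enough evidence, so nothing hollow is said.
function earnedReinforcement(
  history: FollowThrough[],
  commitment: string,
  identity: string,
  minCount = 2,
  windowDays = 7,
): string | null {
  const count = history.filter(
    (e) => e.commitment === commitment && e.dayOffset < windowDays,
  ).length;
  if (count < minCount) return null;
  return (
    `You followed through on ${commitment} ${count} times this week. ` +
    `That's what someone who ${identity} does.`
  );
}
```

Returning `null` instead of a generic compliment is the design choice that matters here: with no evidence, the system says nothing about identity at all.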
Behavior, then reflection, then identity, then more behavior.
Design Decisions
Every visual choice maps back to the product's core value: calm, supportive self-reflection.
Visual Language
Neomorphism
Soft, tactile surfaces that feel calm and approachable. The raised/pressed states create physical affordances that guide interaction without harsh borders or flat minimalism.
Dual-shadow system (light + dark) for depth
Pressed state for active/selected elements
16px border radius for softness
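The dual-shadow idea can be expressed as a small style helper. This is a sketch under assumptions: the `neoStyle` function, the `NeoTokens` shape, and the shadow colors and offsets are hypothetical, chosen only to show how raised and pressed states differ.

```typescript
// Hypothetical helper; shadow colors and offsets are assumed values, not the app's tokens.
type Surface = "raised" | "pressed";

interface NeoTokens {
  lightShadow: string; // highlight cast toward the top-left
  darkShadow: string;  // shade cast toward the bottom-right
  offsetPx: number;
  blurPx: number;
}

function neoStyle(
  surface: Surface,
  t: NeoTokens,
): { borderRadius: string; boxShadow: string } {
  // A pressed surface uses the same two shadows, drawn inside the element.
  const inset = surface === "pressed" ? "inset " : "";
  const o = t.offsetPx;
  const b = t.blurPx;
  return {
    borderRadius: "16px",
    boxShadow:
      `${inset}${o}px ${o}px ${b}px ${t.darkShadow}, ` +
      `${inset}-${o}px -${o}px ${b}px ${t.lightShadow}`,
  };
}
```

Keeping both states in one function means raised and pressed surfaces always share the same light direction, which is what makes the depth read as physical.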
Color Theory
Pink-Lavender Palette
Warm mauve tones chosen to reduce visual stress and create a journal-like intimacy. The muted palette avoids the clinical feel of blue-heavy productivity apps. Text color pairings meet WCAG AA contrast ratios.
Background: #e8dde8 — warm, not sterile
Foreground: #5c4a5c — soft contrast, easy on eyes
Accent: #a890a8 — subtle call-to-action warmth
All text meets WCAG AA (4.5:1 normal, 3:1 large text)
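The AA thresholds above can be checked mechanically with the WCAG 2.x relative luminance and contrast ratio formulas. The function names below are hypothetical; the formulas themselves follow the WCAG definition, and the hex values are the ones listed in this section.

```typescript
// Convert one "#rrggbb" channel (starting at index `start`) to linear-light sRGB.
function linearize(hex: string, start: number): number {
  const c = parseInt(hex.slice(start, start + 2), 16) / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// WCAG relative luminance of a "#rrggbb" color.
function relativeLuminance(hex: string): number {
  return (
    0.2126 * linearize(hex, 1) +
    0.7152 * linearize(hex, 3) +
    0.0722 * linearize(hex, 5)
  );
}

// WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// AA requires 4.5:1 for normal text and 3:1 for large text.
function meetsAA(fg: string, bg: string, largeText = false): boolean {
  return contrastRatio(fg, bg) >= (largeText ? 3 : 4.5);
}

const bodyTextPasses = meetsAA("#5c4a5c", "#e8dde8");
```

Running the same check on the accent color against the background is a quick way to confirm whether it is safe for text or should stay decorative.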
Typography
Comfortaa + Nunito
Rounded typefaces that reinforce the soft design language. Comfortaa for headings provides personality; Nunito for body text ensures readability at all sizes.
Comfortaa: geometric, rounded, modern headings
Nunito: humanist, friendly body text
Consistent weight scale: 400–700
Commitments, Tasks, and Strategies
Writrospect distinguishes between three levels of intention. Most apps collapse these into a single “goals” feature. Separating them matters because each serves a different cognitive function.
Commitments
Open-ended intentions surfaced from journaling. “I want to be more present with my family” is a commitment. It has no deadline and no metric. Its job is to capture what matters before you know how to act on it.
Tasks
Specific, completable actions tied to a commitment. “Put my phone in another room during dinner” is a task. It's concrete enough to do or not do. Tasks turn vague commitments into something trackable.
Strategies
Implementation intentions: if-then plans that specify when, where, and how a task happens. “If it's 6pm and I'm home, I'll leave my phone in the bedroom before sitting down to eat.” Strategies reduce the cognitive load of deciding in the moment.
The AI suggests all three during conversations. A journal entry about feeling distracted might surface a commitment about focus, which generates tasks, which get paired with strategies. The progression is always journal-first: you reflect, then act.
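The three levels can be modeled as separate records linked by ID. This is a sketch of one possible data model; the interface and field names are assumptions, not the app's schema.

```typescript
// Hypothetical data model; field names are assumptions for illustration.
interface Commitment {
  id: string;
  statement: string; // open-ended, no deadline and no metric
}

interface Task {
  id: string;
  commitmentId: string; // every task belongs to a commitment
  action: string;
  done: boolean;
}

interface Strategy {
  id: string;
  taskId: string;     // every strategy belongs to a task
  ifCue: string;      // the situation that triggers the plan
  thenAction: string; // what the user will do when it occurs
}

// Render a strategy in its if-then form.
function describeStrategy(s: Strategy): string {
  return `If ${s.ifCue}, I'll ${s.thenAction}.`;
}

// Trace a strategy back to the commitment it ultimately serves.
function commitmentFor(
  strategy: Strategy,
  tasks: Task[],
  commitments: Commitment[],
): Commitment | undefined {
  const task = tasks.find((t) => t.id === strategy.taskId);
  if (task === undefined) return undefined;
  return commitments.find((c) => c.id === task.commitmentId);
}
```

Keeping the cue and the action as separate fields is what lets the UI present a strategy as a plan to confirm rather than a sentence to parse.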
The Prompt Architecture
A naive approach would send every journal entry to Claude Sonnet with a massive system prompt covering every possible scenario: crisis support, habit tracking, identity reinforcement, relationship context, time-of-day awareness. But most entries don't need most of that context, and loading it all wastes tokens and dilutes the model's attention.
Instead, Writrospect uses a hybrid routing architecture. A base system prompt always loads with the core behavioral rules (anti-sycophancy, tool usage, the neuroscience-informed tracking principles). On top of that, a modular prompt system assembles additional context conditionally, using a pipeline that avoids unnecessary API calls for simple messages.
The routing works in layers. First, a quick complexity check: short messages (“hello,” one-liners) get minimal context and skip everything else. No point running classification on a greeting.
For longer entries, the system scans for signal words (crisis language, goal-setting, identity references) and cross-checks user state (open habits, tracked relationships, time of day). If patterns match, the right modules load directly. No API call needed. Most messages resolve here.
The hard cases are entries that are clearly substantive but don't contain obvious signals: implicit emotional processing, indirect references to commitments, someone working through something they haven't named yet. These go to Claude Haiku, a smaller model, for classification before Sonnet generates the response. It's cheaper than sending everything to Sonnet and more accurate than pattern matching alone.
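The three routing layers can be sketched as a single function that returns a decision instead of calling any model. Everything specific here is an assumption for illustration: the `route` function, the word-count threshold, the signal patterns, and the module names.

```typescript
// Hypothetical routing result: skip, load modules directly, or ask a classifier.
type Route =
  | { kind: "minimal" }
  | { kind: "modules"; modules: string[] }
  | { kind: "classify" };

interface UserState {
  hasOpenHabits: boolean;
}

// Illustrative signal patterns only; a real list would be far more careful.
const SIGNALS: { module: string; pattern: RegExp }[] = [
  { module: "crisis", pattern: /\b(hopeless|can't go on)\b/i },
  { module: "goals", pattern: /\b(goal|want to|plan to|commit)\b/i },
  { module: "identity", pattern: /\b(kind of person|who i am)\b/i },
];

function route(message: string, state: UserState, shortWordLimit = 8): Route {
  // Layer 1: short messages get minimal context and skip everything else.
  const words = message.trim().split(/\s+/).filter((w) => w.length > 0);
  if (words.length <= shortWordLimit) return { kind: "minimal" };

  // Layer 2: signal words cross-checked against user state, no API call.
  const modules = SIGNALS.filter((s) => s.pattern.test(message)).map((s) => s.module);
  if (state.hasOpenHabits && /\b(habit|skipped|missed)\b/i.test(message)) {
    modules.push("habits");
  }
  if (modules.length > 0) return { kind: "modules", modules };

  // Layer 3: substantive but ambiguous, hand off to the cheaper classifier.
  return { kind: "classify" };
}
```

Because the function only returns a decision, the first two layers can be unit-tested without any network access, and only the `classify` branch costs an extra model call.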
The result: Sonnet only sees the modules it needs. About 50% fewer tokens on average. Two users sending the same message get different prompt assemblies based on their history and context.
The user always has the last word. A checkbox table surfaces the active prompt modules before the response generates, letting users add or remove context. The same pattern appears when the system detects a new task or commitment: instead of guessing at dependencies, it presents likely prerequisites and lets the user confirm which actually apply.
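Module assembly and the user override can be sketched together. The module names and prompt text below are placeholders, and `applyOverrides` and `assemblePrompt` are hypothetical helpers, not the app's functions.

```typescript
// Placeholder prompt text; the real base prompt and modules are not shown in this case study.
const BASE_PROMPT = "Core behavioral rules.";

const MODULE_TEXT: Record<string, string> = {
  goals: "Goal-tracking guidance.",
  identity: "Identity reinforcement guidance.",
  habits: "Habit context for this user.",
};

// Apply the user's checkbox choices on top of what routing suggested.
function applyOverrides(
  suggested: string[],
  added: string[],
  removed: string[],
): string[] {
  const kept = suggested.filter((m) => removed.indexOf(m) === -1);
  for (const m of added) {
    if (kept.indexOf(m) === -1) kept.push(m);
  }
  return kept;
}

// The base prompt always loads; modules are appended only when active.
function assemblePrompt(active: string[]): string {
  const parts = [BASE_PROMPT];
  for (const name of active) {
    const text = MODULE_TEXT[name];
    if (text !== undefined) parts.push(text); // unknown module names are ignored
  }
  return parts.join("\n\n");
}

const finalPrompt = assemblePrompt(
  applyOverrides(["goals", "identity"], ["habits"], ["identity"]),
);
```

Applying overrides before assembly keeps the user's choice authoritative: a module the user unchecks never reaches the model, whatever the router suggested.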
Key Features
Where design, engineering, and research come together.
AI Chat with Tool Use
The chat isn’t just a text box — it calls structured tools that propose habits, goals, tasks, and strategies. The AI returns typed suggestions rendered as interactive cards the user can approve, edit, or dismiss.
- Streaming responses via Anthropic SDK
- Tool definitions for propose_items, suggest_dependencies, journal prompts
- Typed suggestion system with QuickSuggestions component
- Hidden context messages for seamless AI awareness
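A tool such as `propose_items` might be declared and consumed roughly as follows. The outer shape (`name`, `description`, `input_schema` as JSON Schema) is the format the Anthropic Messages API accepts for tools; the fields inside the schema, the `Suggestion` type, and `parseSuggestions` are assumptions made for this sketch.

```typescript
// Tool declaration. The schema contents are illustrative assumptions.
const proposeItemsTool = {
  name: "propose_items",
  description: "Propose habits, goals, tasks, or strategies for the user to review.",
  input_schema: {
    type: "object",
    properties: {
      items: {
        type: "array",
        items: {
          type: "object",
          properties: {
            kind: { type: "string", enum: ["habit", "goal", "task", "strategy"] },
            title: { type: "string" },
          },
          required: ["kind", "title"],
        },
      },
    },
    required: ["items"],
  },
};

type SuggestionKind = "habit" | "goal" | "task" | "strategy";

interface Suggestion {
  kind: SuggestionKind;
  title: string;
}

const KINDS: SuggestionKind[] = ["habit", "goal", "task", "strategy"];

// Validate the tool call's input before rendering it as cards.
// Model output is treated as untrusted: malformed items are dropped.
function parseSuggestions(input: unknown): Suggestion[] {
  if (typeof input !== "object" || input === null) return [];
  const items = (input as { items?: unknown }).items;
  if (!Array.isArray(items)) return [];
  const out: Suggestion[] = [];
  for (const raw of items) {
    if (typeof raw !== "object" || raw === null) continue;
    const { kind, title } = raw as { kind?: unknown; title?: unknown };
    if (
      typeof kind === "string" &&
      typeof title === "string" &&
      (KINDS as string[]).indexOf(kind) !== -1
    ) {
      out.push({ kind: kind as SuggestionKind, title });
    }
  }
  return out;
}
```

Each validated `Suggestion` can then map to one interactive card that the user approves, edits, or dismisses.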
Evidence-Based Habit Tracking
Built on Lally et al.’s finding that habit formation averages 66 days. No streak-shaming — missing one day doesn’t reset progress. The app tracks automaticity, not perfection.
- 66-day formation model (not arbitrary streaks)
- Self-Determination Theory: autonomy, competence, relatedness
- Implementation intentions: if-then planning for each habit
- 25+ cited research papers in source bibliography
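The difference between tracking formation and tracking streaks is easy to see in code. This is a deliberately simple stand-in: `formationProgress` uses a linear ratio toward a per-habit target, which is an assumption for illustration rather than the app's automaticity model, and all names are hypothetical.

```typescript
// Hypothetical log shape: one boolean per day since the habit started.
interface HabitLog {
  targetDays: number;   // e.g. 66, adjustable per habit
  completed: boolean[]; // true = done that day, false = missed
}

// Progress accumulates across all completed days and never resets on a miss.
function formationProgress(log: HabitLog): number {
  const done = log.completed.filter((d) => d).length;
  return Math.min(1, done / log.targetDays);
}

// For contrast: a streak counter, where a single miss sends the count back to zero.
function currentStreak(completed: boolean[]): number {
  let streak = 0;
  for (const d of completed) {
    streak = d ? streak + 1 : 0;
  }
  return streak;
}
```

For a user who completes ten days and then misses one, the streak counter reports zero while the formation measure still reflects all ten days of work.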
Neomorphism Design System
A complete component library with Card, Button, Input, Select, Modal, Toast, Badge, Tooltip, Skeleton, and more — all following the soft neomorphic visual language with CSS variable theming.
- Raised/pressed/accent Card variants
- Framer Motion entrance & interaction animations
- CSS variable theming for easy dark mode extension
- Responsive, accessible, keyboard-navigable
Stripe Billing & Token Economy
A freemium model with three tiers (Starter, Growth, Team), annual/monthly billing cycles, and purchasable token packs. Webhooks keep subscription state in sync, and idempotency checks prevent duplicate processing.
- Stripe Checkout for subscriptions & one-time token pack purchases
- Webhook handler with signature verification & idempotent event processing
- Token usage tracking per billing period with overage billing support
- Lazy-initialized Stripe client via Proxy for serverless cold starts
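Two of the patterns in this list, lazy client initialization via `Proxy` and idempotent event processing, can be sketched without the Stripe SDK. `FakeBillingClient` is a stand-in for a real client, the in-memory `Set` stands in for a persistent store, and `lazy` and `handleEvent` are hypothetical names; signature verification is omitted.

```typescript
// Stand-in for a real SDK client; counts how many times it is constructed.
class FakeBillingClient {
  static constructed = 0;
  apiKey: string;
  constructor(apiKey: string) {
    this.apiKey = apiKey;
    FakeBillingClient.constructed++;
  }
  ping(): string {
    return "ok";
  }
}

// Defer construction until the first property access, then reuse the instance.
function lazy<T extends object>(factory: () => T): T {
  let instance: T | undefined;
  return new Proxy({} as T, {
    get(_target, prop) {
      if (instance === undefined) instance = factory();
      const value = Reflect.get(instance, prop);
      // Bind methods so `this` refers to the real instance, not the proxy target.
      return typeof value === "function" ? value.bind(instance) : value;
    },
  });
}

const billing = lazy(() => new FakeBillingClient("placeholder-key"));

interface WebhookEvent {
  id: string;
  type: string;
}

// In-memory stand-in for a persistent table of processed event IDs.
const processedEvents = new Set<string>();

// Returns true if the event was applied, false if it was a duplicate delivery.
function handleEvent(
  event: WebhookEvent,
  apply: (e: WebhookEvent) => void,
): boolean {
  if (processedEvents.has(event.id)) return false;
  apply(event);
  processedEvents.add(event.id);
  return true;
}
```

In a serverless deployment the processed-ID set would need to live in a database, since in-memory state does not survive between invocations.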
What I Learned
I learned to design what an AI should be before writing a single prompt.
The first version pattern-matched too eagerly and surfaced connections that felt forced. The problem was that I jumped straight to prompt engineering without deciding the AI's role. Once I defined it as a reflective mirror (shows you what you wrote, asks about what it noticed), the prompts followed naturally and the interactions stopped feeling pushy.
I learned that structured tool calls change what a UI can do with AI output.
Early on, the AI returned free-form text that I had to parse client-side. Switching to typed tool calls meant proposals came back as structured data. The UI could render them as interactive cards, pre-fill forms, and link to tracked commitments. The output format determined the interaction quality.
I learned that reading the research first gives you the confidence to say no.
Streaks, leaderboards, and public accountability are table-stakes features in every competitor. The behavioral science literature says they backfire for intrinsic habits. Having specific papers to point to made it easy to cut those features early and defend the decision. Without the research, I would have built them by default.
I learned to think about tracking systems differently after reading the ACC research.
Before reading the future-self continuity research, commitments were just items in a database. After, they became opportunities to strengthen the neural pathway between present and future self. That reframing changed the tracking UI from a checklist into a reflection tool. The neuroscience gave me a reason to design it differently than every other habit tracker.
I use it every day, and that changes how I design it.
Being the primary user means I notice friction the moment it appears. Features that sounded good in planning reveal themselves as annoying in daily use. The product is better because I can't hide from my own design decisions.
Built with