
Why we built an AI simulator for crisis professionals


The first time we sat down with the team at Feminoteka, they described a problem that had no clean solution. Their crisis counsellors — people who staff hotlines, answer emails from survivors, and work directly with victims of domestic violence — needed to practise difficult conversations before having them for real.

This is a category of problem that most software never has to grapple with. You cannot run a mock crisis call with a real person pretending to be in distress without that interaction carrying weight. You cannot outsource the emotional labour of a simulated conversation to a volunteer without asking something significant of them. You cannot record a library of scripted scenarios and expect counsellors to learn from them — because crisis conversations don’t follow scripts, and the skill being developed is precisely the ability to respond to the unexpected.

So we asked the obvious question: what do people do now?

The answer was: not much. New counsellors observe, then shadow, then are thrown in. The gap between watching someone else handle a crisis call and handling one yourself is filled with anxiety and on-the-job learning. That’s not a critique of the organisation — it’s a description of an unsolved problem that predates AI entirely.

Why not something simpler

Before we wrote a single line of code, we spent time genuinely trying to argue ourselves out of using AI. A chatbot felt like overkill. A role-play guide and a training partner might be enough.

But a training partner requires another person who can absorb the weight of the scenario and debrief it. That’s not always available, especially outside business hours, or for counsellors who are geographically isolated, or in organisations without full-time supervision capacity. A static guide teaches concepts; it doesn’t build the reflexes that crisis response actually requires.

The more we mapped the problem, the more AI simulation looked less like a technical choice and more like a necessity. Not “AI would be cool here” — more like “everything else we’ve considered is inadequate in ways that matter.”

That realisation shaped how we built it.

What responsible looks like in this context

Responsible AI development gets invoked a lot. In our case it had a concrete shape: every scenario prompt in the simulator was reviewed by a psychologist who works in trauma-informed care. Not just reviewed for accuracy — reviewed for calibration. Is this scenario realistic enough to be useful? Is it distressing in ways that build skills, or distressing in ways that are just harmful?

We also made a deliberate architectural choice: the simulator is scenario-based, not open-ended. A counsellor is placed in a specific situation — an initial contact call from someone who isn’t ready to name what’s happening, for example — and the AI plays that role within a defined frame. It doesn’t improvise freely. The scenario has a shape, and the AI works within it.
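To make the "defined frame" concrete, here is a minimal sketch of what a scenario definition might look like. All names (`Scenario`, `build_system_prompt`, the field names, and the sample scenario itself) are hypothetical illustrations, not the project's actual code:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    """A bounded training scenario: the AI role-plays only within this frame."""
    scenario_id: str
    role_instructions: str       # who the AI plays and how
    opening_line: str            # fixed starting point for the session
    boundaries: tuple[str, ...]  # content the AI must never introduce
    max_turns: int = 20          # every session has a defined end

def build_system_prompt(s: Scenario) -> str:
    """Compose a constrained role-play prompt from the scenario definition."""
    rules = "\n".join(f"- Never introduce: {b}" for b in s.boundaries)
    return (
        f"{s.role_instructions}\n"
        f"Stay strictly within this scenario. Rules:\n{rules}\n"
        f"The session ends after {s.max_turns} turns."
    )

# Hypothetical example matching the initial-contact case described above.
first_contact = Scenario(
    scenario_id="initial-contact-01",
    role_instructions="Play a caller who is not ready to name what is happening.",
    opening_line="Hi... I'm not sure I should even be calling.",
    boundaries=("graphic detail", "self-harm instructions"),
)
```

The point of the sketch is the shape, not the specifics: because every session starts from a reviewed definition like this, a psychologist can audit the frame before any counsellor ever sees it.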

That constraint was a feature, not a limitation. It meant we could make safety guarantees. It meant supervisors could review specific scenarios with their teams. It meant the training had a beginning, middle, and end — which is how skills actually get internalised.

Every session is recorded for supervision. That was a requirement from the Feminoteka team, and it’s the right call: a session log isn’t surveillance, it’s the mechanism by which a supervisor can debrief with a counsellor after a difficult simulation.

What surprised us

The hardest calibration problem wasn’t technical. It was tonal.

An AI that plays a distressed person too realistically risks causing secondary trauma in the counsellor. An AI that pulls its punches isn’t useful — counsellors can tell when they’re being handled, and it doesn’t build the skill. We spent more time than expected on what we internally called the “realism gradient” — tuning how much emotional weight a scenario should carry at each stage of a counsellor’s training progression.
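The “realism gradient” idea can be sketched as a simple clamp: a scenario has a nominal intensity, and the counsellor’s training stage sets a ceiling on how much of it is delivered. The stage names, the 1–5 scale, and both function and table names here are hypothetical, not taken from the platform:

```python
# Hypothetical realism gradient: cap how much emotional intensity a
# scenario may reach, based on the counsellor's training stage.
STAGE_INTENSITY_CAP = {
    "observer": 2,     # low-stakes, clearly bounded situations
    "shadowing": 3,
    "supervised": 4,
    "independent": 5,  # full realism, still within scenario bounds
}

def effective_intensity(scenario_intensity: int, stage: str) -> int:
    """Clamp a scenario's nominal intensity (1-5) to the stage's ceiling."""
    cap = STAGE_INTENSITY_CAP.get(stage, 1)  # unknown stage -> most conservative
    return min(scenario_intensity, cap)
```

The design point the sketch illustrates: the gradient is a property of the trainee’s progression, not of the scenario alone, so the same scenario can be safely reused across stages.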

There was also a bilingual complexity we underestimated. The platform operates in Polish, and crisis language in Polish carries idioms and registers that don’t map cleanly onto translated prompts. Prompts that worked fluently in English became wooden when directly translated. We ended up rebuilding the scenario library in Polish from the ground up rather than translating, which added time but produced something that actually felt like a real conversation.

Where it ends

The platform is a training tool. We want to be clear about that, because it’s easy to oversell what AI can do in a sensitive context.

AI simulation is useful for building fluency — for developing the automatic, practised responses that let a counsellor be fully present with someone in crisis rather than consciously searching for the right words. That’s valuable. But it doesn’t replace human supervision. It doesn’t replace peer review. It doesn’t replace the judgment that comes from years of actual experience.

We built something that makes training more accessible and more consistent. That’s a real contribution. The organisations that will use it well are the ones who understand that a better training tool is an input to good practice, not a substitute for it.

That framing came from Feminoteka. It’s one of the things that made the project worth doing.