An in-depth analysis of mental health software architecture, highlighting the critical trade-offs between data privacy, security, and therapeutic value, with insights from large-scale identity infrastructure and real-world threat modeling.

Okta Director of Security Engineering Arun Kumar Elengovan on Why Mental Health Software Needs a Different Conversation About Data Minimization

A security engineering leader who builds identity infrastructure for over a billion daily authentication events spent two weeks reviewing 72-hour mental health prototypes — and found that the deepest tension in the field is not between privacy and personalization, but between what the user needs the software to remember and what the security model can responsibly let it store.

There is a particular kind of architectural decision that almost no consumer software ever has to make, because the cost of getting it wrong is paid in abstractions like inconvenience, friction, or churn. The decision is what to remember about the user. For a shopping cart, the answer is obvious — remember everything. For a search engine, remember enough to personalize. For a fitness tracker, remember what the user opted into. None of these decisions are existential, because the worst-case outcome of remembering wrong is still recoverable: someone gets bad recommendations, someone deletes their account, someone files a complaint.

Mental health software does not get to make those decisions in the same register. The data that matters most therapeutically — fluctuations in mood, suicidal ideation, trauma narratives, the precise text of a 3 a.m. journal entry — is also the data whose breach would cause the most catastrophic harm to the user. The therapeutic value of remembering rises in direct proportion to the security cost of remembering. There is no neutral architectural choice here. Every line of database schema is a position on how much risk the user should bear in exchange for how much help.

Arun Kumar Elengovan, Director of Security Engineering at Okta, spent two weeks reviewing eight projects from MINDCODE 2026 — an international 72-hour hackathon organized by Hackathon Raptors that challenged participants to build software for mental wellness, accessible health tools, and AI-driven solutions for human wellbeing. Elengovan oversees identity infrastructure that protects more than a billion daily authentication events across over 19,000 customer organizations. He spends his working hours thinking about data architecture under adversarial conditions. What he saw at MINDCODE was a category of design problem his usual frame of reference cannot fully describe.

"Enterprise security teaches you to minimize data because the risk model is breach," Elengovan explains. "Mental health software inverts that intuition in a way enterprise rarely confronts. The user is asking the system to remember them. The therapeutic value of the product depends on memory. So 'data minimization' stops being a default and becomes a negotiation — what can you afford to remember, given what you can afford to lose?"

When the User Wants to Be Remembered

The cleanest articulation of this tension appeared in the way several MINDCODE submissions handled longitudinal mood tracking. One pattern Elengovan saw repeatedly was a Chrome extension or mobile app that captured passive signals — typing cadence, app switching, calendar density, sleep patterns inferred from device activity — and combined them with active inputs like check-in surveys or journal entries. The therapeutic logic was sound: burnout and depression are slow-moving phenomena that the user often cannot self-detect, and a passive signal layer fills a gap that explicit self-report cannot.

The architectural cost of that approach is that the system now holds a continuous, time-indexed, behaviorally rich dossier of the user's mental state. If that dossier is compromised, the user does not just lose a password. They lose a chronological record of when their depression worsened, when their relationship failed, when they could not sleep, and what they wrote in the dark hours of an unrecoverable week.

"The data that lets you build a useful intervention," Elengovan observes, "is the same data that lets a hostile actor build a profile that the user would never have consented to share with anyone. And the user did consent — they consented to therapeutic memory. The breach turns that memory into something else. That is a category of harm enterprise threat models do not capture, because enterprise data does not have the same emotional weight."

His recommendation in this domain is structural: separate the working memory of the application from the persistent memory, and apply different retention rules to each. Working memory — the data the system needs to render today's intervention — can be aggressive, detailed, behaviorally rich. Persistent memory — the data the system retains across sessions to inform tomorrow's intervention — should be transformed, summarized, and rotated. The ideal architecture lets the user benefit from continuity without forcing them to bear the breach risk of a full chronological record.

This is a concrete pattern, not a slogan. But it requires the engineer to make a deliberate decision about which signals are summary-worthy and which are not. And it requires the team, in the first 72 hours, to draw a line they will be tempted to erase later when product pressures arrive.
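The working/persistent split can be sketched in a few lines. Everything below is illustrative, not taken from any MINDCODE submission: the 24-hour retention window, the field names, and the choice of a daily mean-mood summary are all assumptions a real team would decide deliberately.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

RETENTION = timedelta(hours=24)  # assumed policy: working memory lives one day


@dataclass
class CheckIn:
    at: datetime
    mood: int     # 1-5 self-report
    journal: str  # raw text: working memory only, never persisted


@dataclass
class DailySummary:
    day: str         # "YYYY-MM-DD"
    mean_mood: float  # coarse aggregate; no raw text, no intraday timestamps


class MemoryStore:
    """Separates detailed working memory from summarized persistent memory."""

    def __init__(self) -> None:
        self.working: list[CheckIn] = []          # rich, short-lived
        self.persistent: list[DailySummary] = []  # transformed, long-lived

    def record(self, entry: CheckIn) -> None:
        self.working.append(entry)

    def rotate(self, now: datetime) -> None:
        """Summarize expired entries into persistent memory, then drop them."""
        expired = [e for e in self.working if now - e.at > RETENTION]
        by_day: dict[str, list[int]] = {}
        for e in expired:
            by_day.setdefault(e.at.strftime("%Y-%m-%d"), []).append(e.mood)
        for day, moods in sorted(by_day.items()):
            self.persistent.append(DailySummary(day, sum(moods) / len(moods)))
        self.working = [e for e in self.working if now - e.at <= RETENTION]
```

The point of the sketch is the asymmetry: `CheckIn` carries the journal text, `DailySummary` cannot, so a breach of the long-lived store yields coarse aggregates rather than the chronological record described above.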

The Principal–Agent Problem of AI-Mediated Therapy

The second pattern that drew Elengovan's attention was the rise of AI-mediated wellness companions — chatbots, voice agents, conversational journals, and coaching layers built on large language models. Several MINDCODE submissions used this architecture, ranging from polished AI mental health companions with safety guardrails to more experimental conversational agents that paired emotional check-ins with personalized interventions.

For an identity engineer, these systems pose a question that is rarely asked explicitly: who is the relying party? In a traditional Okta deployment, the relying party is clear — it is the application that asks Okta to authenticate the user. The trust relationship is bilateral, contractually documented, and audit-traceable. In an AI-mediated wellness system, the trust relationship has at least three parties: the user, the application, and the model provider. Conversations the user thinks are private with the application may be passed through a third-party inference endpoint operated by an organization the user has never heard of. Tokens, retention policies, and fine-tuning rights are all governed by contracts the user did not sign.

"The assumption that the user is having a private conversation with the product is almost always wrong in AI-mediated systems," Elengovan notes. "There is at least one inference provider in the loop, and depending on the architecture, there may be embedding stores, vector databases, RAG retrieval layers, and logging pipelines all retaining fragments of that conversation. Each of those is a separate trust boundary. Each of them deserves its own threat model."

His specific recommendation for hackathon teams in this space was uncomfortably basic: write down, in a paragraph, exactly which third parties touch the user's conversation in each of the application's modes, and what each one is contractually allowed to do with it. Many teams could not answer the question. A small number could, and the difference between those two groups was visible immediately in his scoring. The teams that had considered the principal–agent problem before writing the integration code produced architectures that could be defended. The teams that had not considered it produced architectures that worked beautifully in the demo but would not survive a thoughtful adversary, a regulator, or an angry user.
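That paragraph can even be kept machine-readable, so a test fails whenever code introduces a data flow the team never wrote down. A minimal sketch, assuming a simple flow-inventory shape; every party name, data category, and retention figure below is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    receives: frozenset[str]  # data categories that reach this third party
    retention_days: int       # contractual retention; 0 means no retention
    may_train: bool           # may the provider fine-tune on this data?


# Hypothetical inventory for one application mode; names are illustrative.
CHAT_MODE = [
    Processor("inference-provider", frozenset({"journal_text"}), 30, False),
    Processor("vector-db", frozenset({"journal_embeddings"}), 365, False),
    Processor("logging-pipeline", frozenset({"session_metadata"}), 90, False),
]


def undocumented(flows: list[tuple[str, str]],
                 inventory: list[Processor]) -> set[tuple[str, str]]:
    """Return (party, category) pairs observed in the running system but
    absent from the written inventory. Each one is an unexamined trust
    boundary of exactly the kind described above."""
    documented = {(p.name, cat) for p in inventory for cat in p.receives}
    return set(flows) - documented
```

Wired into CI against the flows the integration code actually performs, the check turns "we forgot to mention the analytics SDK" from a post-breach discovery into a failing build.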

"In my world," he explains, "the worst breaches are almost never the result of someone breaking the cryptography. They are the result of an architectural assumption that proved wrong. The user assumed the data stayed inside the application. The team assumed the model provider deletes after thirty days. The model provider assumed the team would not log embeddings. Everybody is operating in good faith and the user gets harmed anyway. That is the failure mode I look for, because it is the one nobody plans for."

Encryption Is a Necessary but Insufficient Answer

A pattern Elengovan flagged across multiple submissions was the citation of encryption as a sufficient privacy control. Several teams led with technical claims like AES-256-GCM, end-to-end encryption, or HIPAA-aware design as if these were complete answers to the question of whether the system was safe. For a security engineering leader, these claims read as starting points, not conclusions.

"AES-256 is not a privacy decision," he observes. "It answers one question: what happens to the bits if someone gets the database file. That is one threat in a much larger space. The harder questions are: who has access to the key? When is the data decrypted, and where? Who can see it after decryption? What happens to the in-memory representation? Does it get logged? Cached? Sent to an analytics pipeline? Encryption answers the simplest question. Mental health software needs to answer the harder ones."

His point is not that encryption is wasted effort. It is that encryption is a control on a single threat — data-at-rest exfiltration — and the threats that actually harm mental health software users tend to live elsewhere. They live in third-party SDKs that exfiltrate session data to advertising networks. They live in support consoles that make customer service representatives effectively administrators of intimate user records. They live in research partnerships where "anonymized" mental health data is shared with academic institutions, then re-identified by joining against publicly available records. They live in the gap between the privacy policy and the actual data flow.
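One of those harder questions ("does it get logged?") is cheap to enforce mechanically. A sketch of a redacting filter using Python's standard `logging` module, assuming the application attaches structured fields to its log records; the sensitive field names are illustrative.

```python
import logging

# Assumed application convention: handlers receive a `fields` dict on each
# record. The names below are illustrative, not from any real schema.
SENSITIVE = {"journal", "mood", "phone"}


class RedactingFilter(logging.Filter):
    """Strip sensitive values from structured log records before any handler
    writes them. Encryption at rest does nothing for plaintext log lines."""

    def filter(self, record: logging.LogRecord) -> bool:
        extra = getattr(record, "fields", None)
        if isinstance(extra, dict):
            record.fields = {
                key: ("[REDACTED]" if key in SENSITIVE else value)
                for key, value in extra.items()
            }
        return True  # keep the record, minus the intimate content
```

Attached to the root logger, the filter runs before every handler, so a debug statement added at 2 a.m. during the hackathon cannot quietly route journal text into an analytics pipeline.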

The hackathon teams that scored highest in his evaluation were not necessarily the ones with the strongest cryptographic claims. They were the ones whose architectural diagrams could be read as a coherent answer to the question: where does the user's data go, who can see it at each stop, and what happens to it when the user closes the app?

The Smallest Useful Threat Model

A recurring observation across Elengovan's reviews was that the teams who applied even the smallest amount of structured threat modeling produced markedly better submissions than those who did not. His framing was deliberately modest — not STRIDE, not LINDDUN, not a full security review, but a single paragraph answering four questions: who is the user, what is the worst thing that can happen to them through this software, who would benefit from making that happen, and what is the system doing to make it harder?

"Mental health software users have specific adversaries that consumer software users do not," he explains. "An employer who would penalize an employee for mental health disclosures. A custody battle where one parent's mental health record becomes evidence. A health insurer who might reprice based on data they should not have. A controlling partner who has access to the user's device. A government that might criminalize the condition the user is being treated for. Each of those is a real threat model that determines what the software should and should not retain. None of them are speculative. All of them are documented in the public record."

This is the discipline he wants to see: not paranoia, not perfectionism, just the willingness to write down who the user fears and to design accordingly. The teams that did this — even informally, even briefly — produced architectures with clear data boundaries, conservative retention defaults, and explicit consent flows. The teams that did not produced architectures where the path of least resistance led toward over-collection.
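The four questions fit naturally into a template a team can commit alongside the code in the first hour. A sketch; the example content is purely illustrative and invented for this article, not drawn from any submission.

```python
from dataclasses import dataclass


@dataclass
class ThreatModel:
    """The one-paragraph threat model: four questions, written down."""

    user: str                # who is the user?
    worst_outcome: str       # worst thing that can happen through this software
    adversaries: list[str]   # who would benefit from making that happen?
    mitigations: list[str]   # what the system does to make it harder


# Illustrative example only; details are assumptions, not MINDCODE data.
EXAMPLE = ThreatModel(
    user="employee journaling about burnout",
    worst_outcome="journal contents disclosed to their employer",
    adversaries=["employer", "controlling partner with device access"],
    mitigations=[
        "no raw journal text in persistent storage",
        "local-only journal protected by an app-level passcode",
    ],
)
```

The value is not the data structure; it is that an empty `mitigations` list next to a named adversary is visible to everyone on the team before the schema is written.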

"The most important security control in mental health software is not a technology," he concludes. "It is a habit. The habit of asking, before you write the code, what the worst person in the user's life would do with this data if they had it. If you build the system imagining that adversary, you will make different decisions. You will collect less, summarize more, encrypt at rest and in transit, separate working memory from persistent memory, and document your third-party data flows. None of those things are exotic. All of them are doable in 72 hours. The reason teams do not do them is that nobody asked them to."

What Stood Out in the Aggregate

Across the eight projects in Elengovan's batch, the strongest submissions shared a common trait: they treated the data architecture as a first-class design concern rather than a deployment afterthought. They drew clean boundaries between public, authenticated, and sensitive data. They thought about who could see what, under which conditions, and they wrote down the answers. They acknowledged the third parties they relied on rather than hiding them. They were honest about what their AI features actually did versus what they hoped to demo.

The weaker submissions, by contrast, were not weaker because the engineers were less skilled. They were weaker because the data architecture had not been designed at all — it had emerged, organically, from the path of least resistance, and the resulting system could not be defended against any thoughtful adversary because no thoughtful adversary had been imagined during construction.

"The thing I want hackathon participants in this space to take away," Elengovan reflects, "is that mental health software is not consumer software with higher emotional stakes. It is a different category of system. The threat model, the user's expectations of privacy, the legal exposure, and the cost of getting any of those wrong are different. The discipline you bring to the first 72 hours is the discipline that determines whether the product is safe to put in front of someone in their worst week. There is no second chance to add that discipline later — by the time the user is using the product, the architectural decisions have already been made."

The comment lands with the weight of someone who has spent his career watching architectural decisions calcify into security postures that cannot be undone without rebuilding the system. In the mental health domain, the cost of that calcification is paid in breach impact, regulatory exposure, and — most consequentially — the erosion of trust from the exact users the software was built to help.

MINDCODE 2026 — Software for Human Health was an international 72-hour hackathon organized by Hackathon Raptors from February 27 to March 2, 2026, with the official evaluation period running March 3–14. The competition attracted over 200 registrants and resulted in 21 valid submissions across the mental health and wellness domain. Submissions were independently reviewed by a panel of judges across three evaluation batches. Projects were assessed against five weighted criteria: Impact & Vision (35%), Execution (25%), Innovation (20%), User Experience (15%), and Presentation (5%). Hackathon Raptors is a United Kingdom Community Interest Company (CIC No. 15557917) that curates technically rigorous international hackathons and engineering initiatives focused on meaningful innovation in software systems.
