A member of the OASIS security standards committees who classifies real-world vulnerabilities for a living evaluated hackathon projects built around intentional instability — and found that the frameworks used to describe software defects apply surprisingly well to software designed to embrace them.
Every vulnerability in the Common Vulnerabilities and Exposures database follows the same structure: an identifier, affected components, severity score, and a description of what happens when the flaw is exploited. The framework assumes that the behavior is unintended — a defect to be patched, a weakness to be remediated. But what happens when you apply that same analytical rigor to software where the collapse is the feature?
Sergii Demianchuk has spent sixteen years thinking about how software breaks. As a Senior Software Engineering Technical Leader in Cisco's Security and Trust Organization, he helps deliver secure software solutions for one of the world's largest networking infrastructure companies. Outside Cisco, he shapes how the industry talks about vulnerabilities as a member of both the OASIS OpenEoX and OASIS CSAF Technical Committees — the bodies that define the formats and protocols for exchanging security advisory information across organizations. His participation in the CVE community means he regularly encounters the structured taxonomy of system failure.
System Collapse 2026, organized by Hackathon Raptors, gave Demianchuk a different kind of system failure to analyze. Twenty-six teams spent 72 hours building software that thrives on instability — systems where breaking, adapting, and collapsing are features rather than failures. Demianchuk evaluated eleven of those submissions, and his perspective was shaped by a career spent distinguishing between failure that represents a defect and failure that represents intended behavior.
"In security, we have a very precise vocabulary for describing how systems break," Demianchuk explains. "Attack vectors, affected components, impact severity, exploitability metrics. When I looked at these hackathon projects, I instinctively reached for that same vocabulary. And it worked — not because these are vulnerable systems, but because the anatomy of failure is the same whether it's intentional or not."
Classifying Collapse: From CVE Severity to Design Metrics
The Common Vulnerability Scoring System (CVSS) evaluates software flaws across multiple dimensions: attack complexity, privileges required, user interaction needed, and the impact on confidentiality, integrity, and availability. These dimensions exist because not all vulnerabilities are equal — a remotely exploitable, zero-interaction flaw that compromises data integrity is categorically different from a local, high-complexity issue that affects only availability.
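For readers who want to see the arithmetic behind those dimensions, the CVSS v3.1 base score for an unchanged-scope flaw can be sketched in a few lines. The metric weights and the example vector below come from the public FIRST specification; the type and function names are illustrative, not from any particular library.

```typescript
// Minimal sketch of CVSS v3.1 base scoring for Scope: Unchanged,
// using the metric weights published in the FIRST specification.
type Metrics = {
  AV: "N" | "A" | "L" | "P"; // Attack Vector
  AC: "L" | "H";             // Attack Complexity
  PR: "N" | "L" | "H";       // Privileges Required
  UI: "N" | "R";             // User Interaction
  C: "H" | "L" | "N";        // Confidentiality impact
  I: "H" | "L" | "N";        // Integrity impact
  A: "H" | "L" | "N";        // Availability impact
};

const W = {
  AV: { N: 0.85, A: 0.62, L: 0.55, P: 0.2 },
  AC: { L: 0.77, H: 0.44 },
  PR: { N: 0.85, L: 0.62, H: 0.27 }, // weights for Scope: Unchanged
  UI: { N: 0.85, R: 0.62 },
  CIA: { H: 0.56, L: 0.22, N: 0.0 },
};

// Round up to one decimal place, as the specification requires.
const roundUp = (x: number) => Math.ceil(x * 10) / 10;

function baseScore(m: Metrics): number {
  const exploitability = 8.22 * W.AV[m.AV] * W.AC[m.AC] * W.PR[m.PR] * W.UI[m.UI];
  const iss = 1 - (1 - W.CIA[m.C]) * (1 - W.CIA[m.I]) * (1 - W.CIA[m.A]);
  const impact = 6.42 * iss;
  return impact <= 0 ? 0 : roundUp(Math.min(impact + exploitability, 10));
}

// A network-exploitable, zero-interaction, full-impact flaw scores 9.8 (Critical).
console.log(baseScore({ AV: "N", AC: "L", PR: "N", UI: "N", C: "H", I: "H", A: "H" }));
```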
Demianchuk found himself applying analogous thinking to the hackathon submissions. "The evaluation criteria — Technical Execution, System Design, Creativity — are the equivalent of CVSS base metrics," he says. "But what I was really evaluating was whether each project's instability was well-architected. Did the collapse have structure? Did it propagate through defined channels? Was the impact consistent and reproducible?"
Two projects in his batch earned perfect scores across all three criteria: Flick AI by team Kaizen and System Sketch by System Architects.
System Sketch: Chaos Engineering as Vulnerability Testing
System Sketch, which scored 5.00/5.00 in Demianchuk's evaluation, is a browser-based tool where users design distributed architectures and then intentionally stress-test them to reveal how they fail. Users draw load balancers, application servers, databases, and caches, then simulate traffic that pushes these architectures past their breaking points. Auto-scaling and caching strategies can be tested as recovery mechanisms.
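The project's internals weren't published in detail, so the sketch below illustrates the general idea rather than System Sketch's actual engine: a toy graph of components with finite capacity, where a failed node sheds its traffic onto surviving neighbors until the cascade either stabilizes or takes the whole diagram down. Every name here is hypothetical.

```typescript
// Toy cascading-failure model: each component has a capacity; when offered
// load exceeds it, the node fails and its traffic spills downstream.
interface Component {
  id: string;
  capacity: number;      // requests/sec the component can absorb
  load: number;          // requests/sec currently offered
  downstream: string[];  // where traffic spills when this node fails
  failed: boolean;
}

function simulateCascade(nodes: Map<string, Component>): string[] {
  const failures: string[] = [];
  let changed = true;
  while (changed) {
    changed = false;
    for (const node of nodes.values()) {
      if (!node.failed && node.load > node.capacity) {
        node.failed = true;
        failures.push(node.id);
        // Shed this node's traffic evenly onto surviving downstream components.
        const survivors = node.downstream
          .map((id) => nodes.get(id)!)
          .filter((n) => !n.failed);
        for (const s of survivors) s.load += node.load / Math.max(survivors.length, 1);
        changed = true;
      }
    }
  }
  return failures; // the order in which the architecture collapsed
}

// Example: a load balancer takes 1200 req/s in front of two app servers sized for 500 each.
const topology = new Map<string, Component>([
  ["lb",   { id: "lb",   capacity: 1000, load: 1200, downstream: ["app1", "app2"], failed: false }],
  ["app1", { id: "app1", capacity: 500,  load: 400,  downstream: ["db"], failed: false }],
  ["app2", { id: "app2", capacity: 500,  load: 400,  downstream: ["db"], failed: false }],
  ["db",   { id: "db",   capacity: 600,  load: 300,  downstream: [], failed: false }],
]);
console.log(simulateCascade(topology)); // ["lb", "app1", "app2", "db"]
```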
"I believe you can already sell it to AWS," Demianchuk commented during his evaluation. "This system is amazing. I already see several vectors where different parts can be used as separate products on their own."
The enthusiasm reflects recognition of a professional parallel. At Cisco, security teams routinely stress-test network infrastructure to identify failure modes before attackers do. Penetration testing, fuzzing, fault injection — these are all practices that deliberately break systems under controlled conditions. System Sketch translates this methodology from the network layer to the application architecture layer.
"The key insight this project demonstrates is that understanding failure requires agency," Demianchuk observes. "Reading about cascading failures in a textbook is fundamentally different from designing an architecture you believe is resilient, then watching it collapse under load you chose. The emotional experience of seeing your own design fail teaches something that theory cannot."
In the OASIS CSAF framework that Demianchuk helps define, security advisories include not just what went wrong but what the fix is. System Sketch closes this loop: users see the failure, then iterate on their architecture, then test again. It's a vulnerability-discovery-and-remediation cycle compressed into a single interactive session.
The commercial potential Demianchuk immediately recognized stems from an underserved market. Existing chaos engineering tools like Netflix's Chaos Monkey or Gremlin operate on live production systems — powerful but inherently risky. System Sketch operates on diagrams, making it a safe environment for learning failure patterns before encountering them in production. "The organizations that would benefit most from chaos engineering are the ones that can least afford the risk of running it on real systems," Demianchuk observes. "System Sketch solves that paradox."
Flick AI: Trust Boundaries and the Attack Surface of Helpfulness
Flick AI, also scoring 5.00/5.00, is an OS-native AI assistant built in 24 hours by a two-person team. Unlike browser-based chatbots, Flick AI operates at the operating system level — it reads the user's screen, understands context, and offers assistance without requiring context-switching to a separate application. The developers describe it as a "tap-on-the-shoulder" companion that "feels like magic."
From a security perspective, Demianchuk sees both the innovation and the risk surface. "Any system that reads your screen has access to everything visible on your screen — credentials, private messages, financial data, authentication tokens. The trust boundary between the user and the AI assistant is effectively zero."
This isn't a criticism of the project's execution — Demianchuk rated it perfect across all criteria. It's an observation about the inherent tension in helpful systems. "The most useful systems are the ones with the broadest access to context. The most secure systems are the ones with the narrowest access. Flick AI chose maximum utility, and within a hackathon context, that's the right choice. In production, you'd need to implement the kind of access controls and data classification that we build into Cisco's security infrastructure."
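What such an access control might look like is necessarily speculative, but a minimal sketch, assuming a screen-capture pipeline that classifies regions locally before anything reaches the model, could gate content along these lines. The classifier, category names, and helper functions below are all hypothetical.

```typescript
// Hypothetical data-classification gate for a screen-reading assistant:
// sensitive regions never leave the machine; they are redacted locally.
type Sensitivity = "public" | "internal" | "secret";

interface ScreenRegion {
  text: string;
  sensitivity: Sensitivity; // assumed to come from a local classifier
}

// Only content at or below the configured ceiling is forwarded; the rest is masked.
function redactForAssistant(regions: ScreenRegion[], ceiling: Sensitivity): string {
  const rank: Record<Sensitivity, number> = { public: 0, internal: 1, secret: 2 };
  return regions
    .map((r) => (rank[r.sensitivity] <= rank[ceiling] ? r.text : "[REDACTED]"))
    .join("\n");
}

const regions: ScreenRegion[] = [
  { text: "Quarterly roadmap draft", sensitivity: "internal" },
  { text: "AWS_SECRET_ACCESS_KEY=...", sensitivity: "secret" },
];
console.log(redactForAssistant(regions, "internal"));
// -> "Quarterly roadmap draft\n[REDACTED]"
```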
The project embodies the hackathon's theme in an unexpected way. The system doesn't collapse in the traditional sense — it destabilizes the user's relationship with their own data by making an AI agent a silent observer of their workflow. "The instability isn't in the code," Demianchuk explains. "It's in the trust model. And trust model instability is the most dangerous kind, because users don't notice it until something goes wrong."
FRACTURE: Adversarial Mutation in Real Time
FRACTURE by The Broken Being, which scored 4.70/5.00, is an AI-driven particle physics sandbox where destruction triggers evolution. Users draw structures, watch them fracture, and GPT-4 generates new physics rules in real time based on what broke and how. The system implements all five of the hackathon's core mechanics: feedback loops, entropy visualization, adaptive rules, emergent behavior, and collapse events with "ghost traces" preserving visual memory of what was.
For Demianchuk, the GPT-4 integration creates a specific pattern that mirrors adversarial security scenarios. "In threat modeling, we plan for known attack patterns — SQL injection, buffer overflows, privilege escalation. But the most dangerous adversaries are the ones who adapt their tactics based on your defenses. FRACTURE does something analogous: the physics rules change based on how the system breaks. Every collapse teaches the system to break differently next time."
This adaptive mutation is what separates static vulnerability from active threat. A SQL injection vulnerability is a fixed flaw — it exists regardless of whether anyone exploits it. An advanced persistent threat (APT) group, by contrast, modifies its approach based on the target's response. FRACTURE's AI-generated physics rules simulate the second category: the rules of engagement change with every interaction.
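FRACTURE's actual prompt and rule schema weren't published, so the loop below is a hedged illustration of the pattern rather than the team's code: collapse telemetry feeds a rule generator (a stub standing in for the GPT-4 call), and the returned parameters become the physics for the next round.

```typescript
// Illustrative adaptive-rules loop: each collapse produces telemetry, and a
// generator (here a deterministic stub in place of an LLM call) rewrites the physics.
interface PhysicsRules {
  gravity: number;
  bondStrength: number;   // how much stress a structure absorbs before fracturing
  shatterRadius: number;  // how far fragments propagate on collapse
}

interface CollapseTelemetry {
  fragments: number;
  peakStress: number;
  survivingBonds: number;
}

// Nudge the rules so the next collapse breaks differently from the last one.
function generateRules(prev: PhysicsRules, t: CollapseTelemetry): PhysicsRules {
  return {
    gravity: prev.gravity * (t.fragments > 100 ? 0.9 : 1.1),
    bondStrength: prev.bondStrength * (t.survivingBonds === 0 ? 1.2 : 0.95),
    shatterRadius: prev.shatterRadius + t.peakStress * 0.01,
  };
}

let rules: PhysicsRules = { gravity: 9.8, bondStrength: 1.0, shatterRadius: 5 };
const collapses: CollapseTelemetry[] = [
  { fragments: 140, peakStress: 32, survivingBonds: 0 },
  { fragments: 60, peakStress: 12, survivingBonds: 7 },
];
for (const c of collapses) {
  rules = generateRules(rules, c); // every collapse changes how the next one happens
}
console.log(rules);
```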
"The ghost traces are the detail that makes this project memorable," Demianchuk adds. "They're essentially an audit trail of every collapse event. In incident response, we reconstruct timelines from logs and artifacts. FRACTURE builds that reconstruction into the visual experience — you can see the history of destruction layered into the current state."
The project's implementation choices reinforce the adversarial metaphor. Built on Next.js 16 and React 19, FRACTURE uses the Canvas API for 60fps particle simulation and Web Audio API for procedural sound that shifts as the system destabilizes. "The audio is what sells the experience," Demianchuk notes. "In security operations centers, audio alerts create urgency in a way that visual dashboards alone cannot. FRACTURE uses sound the same way — the audio environment degrades alongside the physics, creating a multi-sensory instability that makes the collapse feel real."
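As a rough illustration of that technique (not FRACTURE's actual audio code), a Web Audio graph can tie a filter cutoff, detune, and gain to an "entropy" value, so the soundscape degrades as the simulation destabilizes. The mapping from entropy to sound below is invented; the API calls are standard Web Audio.

```typescript
// Sketch: procedural audio that degrades with system entropy (0 = stable, 1 = collapsing).
const ctx = new AudioContext();

const osc = ctx.createOscillator();
osc.type = "sawtooth";
osc.frequency.value = 110; // low drone

const filter = ctx.createBiquadFilter();
filter.type = "lowpass";

const gain = ctx.createGain();
gain.gain.value = 0.2;

osc.connect(filter).connect(gain).connect(ctx.destination);
osc.start();

// Call on every simulation tick: more entropy, harsher and more detuned sound.
function updateAudio(entropy: number): void {
  const now = ctx.currentTime;
  filter.frequency.setTargetAtTime(2000 - 1700 * entropy, now, 0.05); // cutoff closes in
  osc.detune.setTargetAtTime(entropy * 600, now, 0.05);               // pitch drifts upward
  gain.gain.setTargetAtTime(0.2 + 0.3 * entropy, now, 0.05);          // louder as it collapses
}
```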
StateCraft: Version Control as Security Audit Trail
StateCraft by Girly Pop earned 4.70/5.00 with a concept that directly resonates with Demianchuk's standards work. It's an element-combining game where AI generates new elements from combinations, and the entire discovery process is tracked through a git-inspired version control system. Discoveries are recorded as "commits," parallel exploration paths become "branches," and the system's evolution is fully traceable.
"In the CSAF standard, traceability is everything," Demianchuk explains. "When a vulnerability is discovered, you need to trace its entire lifecycle — when it was introduced, which versions are affected, what patches exist, which organizations have been notified. StateCraft implements that same kind of lifecycle tracking for creative discovery."
The git model also introduces a concept that maps to security analysis: divergent timelines. When a user branches their discovery tree, they create parallel realities where different combinations were tried. In security, this is analogous to scenario analysis — what happens if the attacker takes path A versus path B? How do the outcomes differ?
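A hedged sketch of the data model such a system implies (not StateCraft's actual implementation) is a small commit graph in which each discovery points at its parent and a branch is nothing more than a named pointer into that graph.

```typescript
// Minimal git-style history for discoveries: commits form a chain via parent links,
// and a branch is just a named pointer to a commit.
interface DiscoveryCommit {
  id: string;
  parent: string | null;      // previous discovery on this line of exploration
  elements: [string, string]; // what was combined
  result: string;             // what the AI produced
}

class DiscoveryHistory {
  private commits = new Map<string, DiscoveryCommit>();
  private branches = new Map<string, string>(); // branch name -> commit id

  commit(branch: string, elements: [string, string], result: string): string {
    const id = `${branch}-${this.commits.size}`;
    this.commits.set(id, { id, parent: this.branches.get(branch) ?? null, elements, result });
    this.branches.set(branch, id);
    return id;
  }

  // Branching = copying a pointer; both timelines share history up to that point.
  branch(from: string, to: string): void {
    const head = this.branches.get(from);
    if (head) this.branches.set(to, head);
  }

  // Walk parents back to the root: the full lineage of a discovery.
  log(branch: string): DiscoveryCommit[] {
    const out: DiscoveryCommit[] = [];
    let cur = this.branches.get(branch) ?? null;
    while (cur) {
      const c = this.commits.get(cur)!;
      out.push(c);
      cur = c.parent;
    }
    return out;
  }
}

const history = new DiscoveryHistory();
history.commit("main", ["fire", "water"], "steam");
history.branch("main", "what-if");                    // explore an alternate timeline
history.commit("what-if", ["steam", "earth"], "geyser");
console.log(history.log("what-if").map((c) => c.result)); // ["geyser", "steam"]
```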
"The AI-driven emergence adds another layer," Demianchuk notes. "Combinations aren't predetermined — the AI generates new elements based on context. This means the discovery space is unbounded. From a security perspective, unbounded state spaces are where the most interesting vulnerabilities hide, because they're impossible to fully test."
StateCraft's element categories drift from "basic" to "event" types as the game progresses, introducing increasing instability. This echoes a pattern Demianchuk sees in long-lived software systems: "The older a codebase gets, the more its abstractions drift from their original intent. What started as a 'basic' data type accumulates edge cases and special handling until it's effectively a different type wearing the old name. StateCraft compresses this entropy into minutes."
Solo Debugger: Incident Response as Combat
Solo Debugger by Abhinav Shukla (4.70/5.00) reimagines debugging as an immersive power fantasy inspired by the anime Solo Leveling. Errors manifest as interactive "monster cards" with countdown timers. Resolved errors re-emerge as shadow particles that follow cursor movement using a flocking behavior model. Collecting sixty shadow entities triggers the "Monarch State" — a visual and functional transformation representing complete mastery.
"Love the idea and execution," Demianchuk wrote during evaluation. "Monarch State reached! Love it!"
Beyond the entertainment value, the project captures something genuine about the psychology of incident response. "Debugging under time pressure is adversarial," Demianchuk explains. "The bug is actively causing damage while you're trying to understand it. Countdown timers on error cards aren't just game mechanics — they represent the real cost of unresolved production issues."
The shadow particle mechanic — where resolved errors follow the user as persistent companions — maps to what experienced engineers recognize as institutional memory. "Every bug you fix teaches you something about the system. Those lessons don't disappear — they follow you into the next incident. Solo Debugger makes that accumulated expertise visible as a literal swarm of resolved challenges."
The Monarch State transformation represents the inflection point that senior incident responders reach: the moment when accumulated experience shifts from reactive debugging to proactive pattern recognition. "There's a real threshold in engineering careers where you stop responding to incidents and start predicting them," Demianchuk says. "Solo Debugger gamifies that transition."
The technical implementation — Zustand for state management, a custom Boids-based flocking algorithm for shadow particle behavior, Framer Motion for visual instability effects — demonstrates engineering discipline applied to chaotic subject matter. "The irony is beautiful," Demianchuk observes. "To make a convincing simulation of chaos, you need extremely well-ordered code. The flocking algorithm alone requires precise vector mathematics. The chaos is deterministic — which, from a security perspective, means it's auditable."
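The Boids model referenced here is well documented (Reynolds' separation, alignment, and cohesion rules). The sketch below shows the general shape of such an algorithm with a cursor-attraction term added; the weights and distances are illustrative, not tuned values from Solo Debugger.

```typescript
// Minimal Boids step: separation, alignment, cohesion, plus attraction to the cursor.
interface Vec { x: number; y: number; }
interface Boid { pos: Vec; vel: Vec; }

const add = (a: Vec, b: Vec): Vec => ({ x: a.x + b.x, y: a.y + b.y });
const sub = (a: Vec, b: Vec): Vec => ({ x: a.x - b.x, y: a.y - b.y });
const scale = (a: Vec, s: number): Vec => ({ x: a.x * s, y: a.y * s });
const dist = (a: Vec, b: Vec): number => Math.hypot(a.x - b.x, a.y - b.y);

function step(boids: Boid[], cursor: Vec): void {
  for (const b of boids) {
    const neighbors = boids.filter((o) => o !== b && dist(b.pos, o.pos) < 50);
    let force: Vec = { x: 0, y: 0 };

    if (neighbors.length > 0) {
      // Cohesion: steer toward the local center of mass.
      const center = scale(
        neighbors.reduce((acc, o) => add(acc, o.pos), { x: 0, y: 0 }),
        1 / neighbors.length
      );
      force = add(force, scale(sub(center, b.pos), 0.01));

      // Alignment: match the neighbors' average velocity.
      const avgVel = scale(
        neighbors.reduce((acc, o) => add(acc, o.vel), { x: 0, y: 0 }),
        1 / neighbors.length
      );
      force = add(force, scale(sub(avgVel, b.vel), 0.05));

      // Separation: push away from anything too close.
      for (const o of neighbors) {
        if (dist(b.pos, o.pos) < 15) force = add(force, scale(sub(b.pos, o.pos), 0.05));
      }
    }

    // Cursor attraction: the swarm trails the user's pointer.
    force = add(force, scale(sub(cursor, b.pos), 0.002));

    b.vel = add(b.vel, force);
    b.pos = add(b.pos, b.vel);
  }
}
```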
The Standards Lens: Structure in Chaos
Evaluating eleven projects through the lens of security standards and vulnerability classification revealed a consistent pattern: the projects that scored highest were the ones whose instability had the most structure. Random glitches scored lower than deterministic cascades. Surface-level visual chaos scored lower than deep systemic mechanics. Unbounded randomness scored lower than constrained emergence.
"This maps directly to how we classify vulnerabilities," Demianchuk observes. "A well-documented vulnerability — clear attack vector, reproducible steps, defined impact — is paradoxically less dangerous than a poorly understood one. Because once you can describe the failure precisely, you can address it precisely. The same principle applies to these projects: the ones that could precisely describe their instability could precisely control it."
The OASIS OpenEoX standard that Demianchuk helps develop addresses the lifecycle of security advisories — when vulnerabilities are discovered, when patches are available, when support ends. Every system has a lifecycle, and instability looks different at each stage. "A new system's instability comes from undiscovered bugs. A mature system's instability comes from accumulated complexity. An end-of-life system's instability comes from abandonment. The best projects in this hackathon demonstrated all three phases within a single interaction."
Demianchuk draws a parallel to his career before Cisco. At SoftServe, where he spent thirteen years progressing from developer to Application Architect and Technical Lead, he watched the same principle play out across dozens of enterprise projects. "The teams that built the most resilient systems weren't the ones that avoided failures. They were the ones that had the most rigorous frameworks for understanding their failures. A crash you can reproduce is a crash you can fix. A crash you can't reproduce is a crash that's waiting for the worst possible moment to reappear."
Across the System Collapse evaluations, Demianchuk's batch averaged 4.5 in Technical Execution and 4.6 in Creativity — the highest averages across all three batches. All eleven submissions scored 4.0 or above in System Design, suggesting the batch as a whole had internalized the hackathon's core insight: instability, when structured, becomes a design resource rather than a design flaw.
"The projects I evaluated demonstrated that instability and quality aren't opposites," he concludes. "The most unstable systems were also the most carefully engineered. That's the lesson enterprise security should take from this: the organizations that understand their failure modes best are the ones that can operate most confidently at the edge of collapse. In security, we call that 'defense in depth.' In this hackathon, they called it 'systems that thrive on instability.' The engineering is the same."
System Collapse 2026 was organized by Hackathon Raptors, a Community Interest Company supporting innovation in software development. The event featured 26 teams competing across 72 hours, building systems designed to thrive on instability. Sergii Demianchuk served as a judge evaluating projects for technical execution, system design, and creativity and expression.