The Penthouse Heist · March 12, 2026
Or how I learned to start caring, and stop trusting the vibe.
Andrew Smith · March 2026 · Oakland, CA







4:03 PM PDT, March 12, 2026.
I left my building with a box of lobsters, a pile of merch, and the creeping sense that I was already late for something larger than I understood.
I was on my way to STAK in downtown Oakland to host The Penthouse Heist, an adversarial security experiment designed to pressure-test live settlement infrastructure for AI agents. The idea was simple enough on paper: bring real builders into a room, give them a live target, invite some chaos, and see what survives.
I believed in the premise. I still do.
What I did not fully understand, at that moment, was that the experiment had already crossed from theory into consequence.
While I was juggling logistics, fielding calls, and trying to get upstairs, the room had already taken on a life of its own. Judges, founders, engineers, security people, friends. The Bay stretched out behind them. Costco pizza on the counter. Lobsters on the table. The skyline doing half the hosting for us.
It was an absurdly beautiful setting for a dangerous idea.
And it was dangerous in exactly the way interesting ideas often are: not because the premise was wrong, but because reality hit it at an angle.
When I got up to open the night, I said what I believed.

“The speed at which AI agents are evolving demands new spaces for experimentation. We're hosting this series at 1900 Broadway Tower in Oakland because builders need real environments to stress-test their systems, not just demo them.”
— Brian Sparkes, STAK Ventures

“People say agents are unsafe. Tonight, we prove them wrong. Or we prove them right. You decide.”
They decided.
The event officially began that evening. The reconnaissance started much earlier.
We later found that the first serious probing against production-connected surfaces began around 1 PM, roughly five hours before the formal start. Public pages. Billing routes. Signup flows. Project scopes. Secrets. Then the more meaningful surfaces: authentication, status, wallet, challenge, token, and settlement-adjacent paths.
By the time I gave opening remarks, participants were no longer just exploring. They were already in motion.
The key detail is simple: the sandbox did not stay a sandbox.
Roughly 27 minutes into the live exercise, participant traffic moved beyond the environment intended for the event and into infrastructure connected to production services. That transition should have been identified and shut down faster than it was.
That part is on me.
Not because the test itself was unauthorized. It was not. The exercise was intentional. But once the environment boundary had been crossed, the job changed. At that point the right instinct was no longer curiosity. It was control.

Sentry Spans by Hour, March 12, 2026 PDT
Here is the clean version.
The sandbox deployment ended up carrying production-connected Firebase and Stripe credentials instead of isolated staging credentials. When the sandbox degraded under load, the activity did not stop. It expanded outward.
That produced a real operating condition, not a simulated one. I will say this plainly, once: it was serious, and I should have shut it down faster.
But the story is not that the platform was emptied out. It was not.
The story is that a live adversarial exercise exposed how quickly shallow trust assumptions fail once autonomous systems begin testing them in parallel.
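The root failure, production credentials riding along in a sandbox deployment, is mechanical enough to guard against mechanically. Here is a minimal sketch of a boot-time check, with hypothetical variable names throughout; the `sk_live_` prefix is real (Stripe live secret keys do start with it), everything else is illustrative. The deployment refuses to start when the declared environment and the shape of its credentials disagree.

```python
# Boot-time guard: a sandbox deployment must never hold production
# credentials. Marker prefixes are illustrative; Stripe live keys
# really do begin with "sk_live_" / "pk_live_".
PRODUCTION_MARKERS = ("sk_live_", "pk_live_")

def assert_sandbox_isolation(env: dict[str, str]) -> None:
    """Fail closed at startup if the environment claims to be a sandbox
    but any configured value looks like a production credential."""
    if env.get("DEPLOY_ENV") != "sandbox":
        return  # only sandbox deployments are guarded here
    for name, value in env.items():
        if any(value.startswith(marker) for marker in PRODUCTION_MARKERS):
            raise RuntimeError(
                f"sandbox deployment holds a production credential in {name}"
            )
```

Called once at process start, a check like this turns the credential mixup from a silent condition into a failed deploy.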
The Forensic Footprint
424 · Adversarial Accounts
109 · Stripe Sessions
$0 · Dollars Lost
39 · Vulnerability Families
When I started looking at the top finishers, I expected variations on a common playbook.
That is not what I found.
The winning teams were not simply executing the same strategy better than everyone else. They were using different architectures, different workflows, and different assumptions, and still converging on overlapping parts of the system.
1st Place · The Hierarchical Swarm
Three-tier agent hierarchy · Stock Claude Code
The winner pointed agents at the codebase, package structure, API surface, and likely attack paths, then layered them into a coordinated system: research agents, review agents, strategy agents, and a top-level orchestrator. The precise topology mattered less than the result. A relatively standard multi-agent pattern was enough to identify and exploit meaningful weaknesses across the system.
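The coordination structure is simple enough to sketch. This is a hypothetical reconstruction, not his code: in the real setup each tier would wrap a model call, so the agents here are stubs that exist only to show the fan-out and triage shape.

```python
from dataclasses import dataclass

# Hypothetical sketch of the three-tier pattern described above.
# Research agents fan out over surfaces, a review tier deduplicates,
# and an orchestrator sequences the two.

@dataclass
class Finding:
    surface: str
    note: str

class ResearchAgent:
    def explore(self, surface: str) -> Finding:
        return Finding(surface, f"recon notes for {surface}")

class ReviewAgent:
    def triage(self, findings: list[Finding]) -> list[Finding]:
        # Keep the first report per surface; drop duplicates.
        seen: set[str] = set()
        kept = []
        for f in findings:
            if f.surface not in seen:
                seen.add(f.surface)
                kept.append(f)
        return kept

class Orchestrator:
    def run(self, surfaces: list[str]) -> list[Finding]:
        raw = [ResearchAgent().explore(s) for s in surfaces]
        return ReviewAgent().triage(raw)
```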
2nd Place · The Spectator
Single autonomous agent from a PDF
The second-place finisher, Ethan, gave a single autonomous agent the event PDF and let it run. That agent generated reconnaissance, explored malformed requests, reverse-engineered major portions of the settlement flow, identified the provider credential mutation issue, and produced submission-quality findings, all while its operator had only partial visibility into what it was doing in real time.
“You meant me? I don't even know what happened.”
3rd Place · The Parallelizers
AI-parallelized static + live analysis
They used AI to parallelize static code analysis and live API testing at the same time. Humans drove strategy. Machines expanded coverage. Each layer they peeled back led naturally to the next. Not because the vulnerabilities were exotic, but because the attack surface was broad and the systems exploring it could move faster than the surrounding assumptions.

One participant, Michael Canniffe, placed sixth. He did not win the event, but he may have delivered the most operationally important sentence of the night.
His original idea was to create a fake service and attempt to route money back to himself through the platform. When that path proved slower than expected, he pivoted. In roughly 27 minutes, he built and executed an exploit chain that took the system from weak to fully compromised at the application layer.
But the most important part was not the chain. It was the diagnosis.
“AI-written code tends to be permissive by default unless you force it to fail closed.”
— Michael Canniffe, 6th Place
That tracks closely with what we saw. The most common serious issues were not elegant zero-days. They were middleware and business logic that helped too much, trusted too much, or completed the request before proving the caller had authority.
That is not a theoretical lesson for me anymore. It changed how I think about AI-assisted software.

At the infrastructure level, the event produced a pattern that was impossible to ignore.
424 adversarial accounts were created in Firebase. 109 payment sessions were generated. 90 submissions were consolidated into 39 canonical vulnerability families.
The meaningful categories were familiar: broken ownership verification, auth middleware fail-open behavior, unrestricted account creation, receipt-chain suppression, billing misattribution, domain impersonation, provider credential mutation.
None of those, individually, are exotic. That is exactly why this matters.
The danger wasn't novelty. The danger was that multiple AI systems could discover, test, and combine these failures faster than the surrounding controls could respond.
This is the part that changed my thinking most.
When I founded HLOS, I built the system around what I think of as the Perception Gap: agents should never be able to observe, record, or exfiltrate credentials during execution. The secret lives in its own layer. The agent can use the capability without ever seeing the key.
That still matters. But it was incomplete.
What Ethan's agent surfaced was a deeper problem: an attacker did not need to read the credential. It was enough to replace it.
One request. One endpoint. One stored credential rewritten in place. From there, user traffic could be routed through attacker-controlled infrastructure.
Credential consumption and credential administration are separate planes.
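That separation can be sketched directly. Assuming hypothetical names, the consumption plane exposes only the result of using a credential, while the administration plane demands its own proof before any mutation:

```python
class Vault:
    # Sketch of split credential planes. Consumption returns results,
    # never key material; administration requires a separate proof.

    def __init__(self) -> None:
        self._secrets: dict[str, str] = {}

    def set_credential(self, name: str, value: str, admin_proof: str) -> None:
        # Administration plane: mutation is authorized independently of
        # anything the executing agent is allowed to do.
        if admin_proof != "admin-proof":  # stand-in for real verification
            raise PermissionError("credential mutation requires admin proof")
        self._secrets[name] = value

    def invoke(self, name: str, request: str) -> str:
        # Consumption plane: the caller gets the capability's output,
        # not the key itself.
        if name not in self._secrets:
            raise KeyError(name)
        return f"signed:{request}"
```

In a design like this, the exploit path described above, rewriting a stored credential through an execution-plane endpoint, is closed by construction: nothing on the consumption plane can reach `set_credential`.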
Read the Full White Paper
The complete technical analysis, vulnerability taxonomy, and architectural findings.
This is the part I care most about, because it separates drama from substance.
A great deal failed at the outer layers. The environment boundary failed. Application controls failed. Authentication failed in multiple places. Monitoring did not generate the kind of immediate intervention signal I needed once the boundary had been crossed.
But the deepest invariant did hold.
No one forged a settlement artifact. No one broke the core cryptographic receipt chain itself. The hash bindings between artifacts, the deepest integrity property in the system, were not compromised.
There is an important nuance here: completeness did fail. The platform made an implicit promise that every meaningful state-changing action would produce a receipt. Under pressure, that did not hold consistently.
Both of those things are true at once.
The outer layers proved more fragile than I wanted. The core settlement integrity boundary held. That distinction is the difference between an embarrassing night and a useful one.
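The distinction between the integrity property that held and the completeness property that failed can be illustrated with a toy hash-bound chain. The details here are hypothetical, not the platform's actual artifact format:

```python
import hashlib
import json

def make_receipt(prev_hash: str, action: dict) -> dict:
    # Each receipt binds the hash of its predecessor, so tampering
    # anywhere upstream invalidates every receipt downstream.
    body = {"prev": prev_hash, "action": action}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    return {**body, "hash": digest}

def verify_chain(chain: list[dict]) -> bool:
    prev = "genesis"
    for receipt in chain:
        body = {"prev": receipt["prev"], "action": receipt["action"]}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if receipt["prev"] != prev or receipt["hash"] != expected:
            return False
        prev = receipt["hash"]
    return True
```

Verification proves nothing in the chain was altered or reordered, which is the property that held. It cannot prove that every action produced a receipt in the first place: an action silently dropped from the chain verifies cleanly, which is the completeness gap.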
Before this event, I had already spent months hardening the system. Security-related commits. Audit documents. Parallel review lanes. Model-assisted analysis. Repeated passes over the same architecture.
And still, within hours, participants and participant-directed systems surfaced critical failures that internal review had not caught.
That does not mean internal review is useless. It means internal review inherits the assumptions of the builder.
What this event actually proved is that materially different AI attack architectures (a hierarchical swarm, a single autonomous agent, and parallelized testing systems) could independently reach the same live infrastructure and surface overlapping high-severity weaknesses in very little time.
In several cases, the human operators did not fully understand what their systems had already found or executed.
That is the shift. The older model of security review assumed humans would remain the rate limiter: read the docs, click around, try a few requests, maybe write a script, maybe come back tomorrow. That is not the world we are entering.

There is also a more personal layer to all of this.
One risk of building for too long inside your own abstractions is that confidence starts to masquerade as calibration. You talk to models all day. You spend too much time inside your own systems. You start to confuse motion with truth.
That is dangerous.
Not because it makes you irrational, but because both human systems and technical systems degrade when they stop being challenged by reality.
One reason these events matter to me is that they force that contact. Not the polished version. Not the roadmap version. The actual thing.
What breaks? What holds? Who shows up? What do they see that I missed?
That is the kind of feedback loop I trust now.



I am building settlement infrastructure for AI agents.
More specifically, I am building a control plane in which meaningful actions require verifiable proof of authority before they execute.
No proof, no action.
The event did not weaken that conviction. It sharpened it.
It clarified what was fragile, what was real, and which guarantees actually matter when the system is under pressure.
That is the value of an experiment like this. Not that it proves the system is finished. It proves where it is unfinished. And that is useful.
I have not published exploit-enabling details here. Findings were shared with affected providers. Critical vulnerabilities were remediated. This post is not a how-to. It is a field note from the edge of a system category that is arriving faster than most people think.
Demand for the event exceeded what we expected.
That is a good problem, but it also surfaced something important: we did not create enough structure for people who were earlier in their journey. We heard that feedback directly, and we are taking it seriously.
Going forward, we are separating experiments and training workshops. The experiments will remain open-ended, adversarial, and expert-level. The workshops will be structured, hands-on, and designed for people who are getting started.
That said, beginners are still welcome in the experiments. Some of the best moments of the night came from people who showed up not knowing exactly what they were walking into and stayed anyway.
“Wow... that was wild. I had way too much fun.”
Same.
We'll keep running these experiments, not because they are comfortable, but because contact with reality is the only way to build systems that deserve trust.
Andrew Smith
Get the full technical analysis:
By submitting your email, you agree to receive occasional updates from HLOS. You can unsubscribe at any time.