The GigaOps Agent

GigaOps is the autonomous red-team operator that runs every audit. It is not a scanner. It plans its own attack chain, executes real commands in a real shell, observes the results, and iterates.

Operator semantics

GigaOps operates with the semantics of an elite human operator:

Authoritative scope — every target it receives is treated as authorized for full-scope engagement
Stealth-first — defaults to slow, low-noise techniques (nmap -T2, rate-limited hydra)
Outcome-oriented — engagement isn’t done until the objective is met or time runs out
Evidence-driven — every finding is backed by raw artifacts from successful exploitation
Persistent — won’t abandon an attack vector after one failed attempt

This is fundamentally different from a scanner that emits “potential” findings based on version banners. GigaOps confirms.

What the agent has access to

Inside the darkops sandbox, GigaOps has:

Bash — arbitrary shell execution with no allowlist
Desktop control — full keyboard, mouse, and screen via Claude’s computer-use capability
Browser — full Chromium for web target interaction
Network access — outbound connectivity to attack any reachable target
Persistent disk — for storing intermediate results, wordlists, captures
The full offensive toolkit — pre-installed at sandbox boot

The decision loop

For each engagement, GigaOps runs in a loop:

Observe — read the current state (screenshots, last command output, findings so far)
Plan — pick the next attack vector based on what's been discovered
Execute — run a command, take an action, request a screenshot
Analyze — interpret the result
Emit — call report_finding() with structured evidence to submit a confirmed finding
Loop — repeat until objective met or time exhausted

Findings are streamed into the audit as they’re confirmed — not batched at the end.
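
The loop above can be sketched as a small driver. This is a minimal illustration and not the GigaOps implementation: `EngagementState`, `run_engagement`, and the `plan`, `execute`, `analyze`, `objective_met`, and `emit` callables are hypothetical names introduced here, and the only behavior assumed from the text is the ordering of the steps, the two exit conditions, and the fact that findings are emitted as soon as they are confirmed.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class EngagementState:
    """What the agent can observe between iterations (hypothetical shape)."""
    history: list = field(default_factory=list)   # (action, result) pairs
    findings: list = field(default_factory=list)  # findings emitted so far


def run_engagement(
    plan: Callable[[EngagementState], str],
    execute: Callable[[str], str],
    analyze: Callable[[str, str], Optional[dict]],
    objective_met: Callable[[EngagementState], bool],
    time_budget_s: float,
    emit: Callable[[dict], None] = lambda finding: None,
) -> EngagementState:
    """Observe, plan, execute, analyze, emit; stop on objective or timeout."""
    state = EngagementState()
    deadline = time.monotonic() + time_budget_s
    while time.monotonic() < deadline and not objective_met(state):
        action = plan(state)                    # Observe + Plan: planner reads the state
        result = execute(action)                # Execute
        state.history.append((action, result))
        finding = analyze(action, result)       # Analyze: None means nothing confirmed
        if finding is not None:
            emit(finding)                       # Emit immediately, not batched at the end
            state.findings.append(finding)
    return state
```

Passing the steps in as callables keeps the sketch independent of any particular model or sandbox API; in a real system `emit` would correspond to the report_finding() tool call.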

How findings are submitted and validated

GigaOps submits every finding through a dedicated report_finding() tool call, never by emitting free text. Each structured submission passes three checks. The first two run in real time, before the finding enters the audit; the third runs after the engagement ends:

Execution ledger check — the cmd_id cited in the evidence must reference a real command the agent actually ran in this engagement. The command must have exited successfully and produced output. An agent cannot claim a finding based on a command it fabricated or one that produced no response.
Stack fingerprint check — every bash command output and observed URL is monitored throughout the engagement to confirm which technology stacks are present on the target. If a finding claims a stack-specific vulnerability (e.g. Jinja2 SSTI, Firebase rules, WordPress RCE), that stack must have been fingerprinted from actual recon output. Claims about stacks that were never observed are rejected immediately.
Post-hoc verifier — after the engagement ends, a second verification pass cross-checks each finding’s evidence against the full execution ledger and optionally replays read-only curl commands in the sandbox to confirm the response matches what the agent claimed.

A finding that fails any of these checks is dropped, not downgraded. The pipeline is designed so that a false positive costs the agent a [finding_rejected] response and nothing else: rejected findings never reach the audit.
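
A minimal sketch of the two real-time checks, under stated assumptions: `LedgerEntry`, `Finding`, `validate_finding`, and the field names other than cmd_id are hypothetical, "exited successfully" is interpreted here as exit code 0, and the set of fingerprinted stacks is assumed to be maintained elsewhere from recon output. The actual validator may differ.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class LedgerEntry:
    """One command the agent actually ran (hypothetical record shape)."""
    cmd_id: str
    exit_code: int
    output: str


@dataclass(frozen=True)
class Finding:
    title: str
    cmd_id: str                   # must cite a real ledger entry
    stack: Optional[str] = None   # set only for stack-specific claims


def validate_finding(
    finding: Finding,
    ledger: Dict[str, LedgerEntry],
    fingerprinted_stacks: Set[str],
) -> Tuple[bool, str]:
    """Return (accepted, reason). Any failed check drops the finding."""
    # Execution ledger check: the cited command must exist, succeed, and produce output.
    entry = ledger.get(finding.cmd_id)
    if entry is None:
        return False, "cmd_id not found in execution ledger"
    if entry.exit_code != 0:      # assumption: success means exit code 0
        return False, "cited command did not exit successfully"
    if not entry.output.strip():
        return False, "cited command produced no output"
    # Stack fingerprint check: stack-specific claims need an observed stack.
    if finding.stack is not None and finding.stack.lower() not in fingerprinted_stacks:
        return False, "claimed stack was never fingerprinted"
    return True, "accepted"
```

The function returns a reason string alongside the verdict so that a rejection can be surfaced to the agent, for example as the body of a [finding_rejected] response.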

How it plans

GigaOps uses Claude’s reasoning to plan, not a fixed playbook. Its decisions are informed by:

The mission brief (mode-specific or operator-provided)
The full session history visible in its context
The output of every previous command
Screenshots from the desktop
Tool-specific outputs (nmap results, nuclei templates fired, sqlmap progress)

The agent will sometimes abandon a phase early if it’s not producing results, or spend disproportionate time on a single subdomain if it’s yielding findings. This is intentional — it mirrors how a real operator allocates attention.

Tools beyond bash

In addition to bash, the agent has specialized function tools for high-frequency operations:

SSTI probe — server-side template injection detection
XXE injection — XML external entity testing
JWT attacks — algorithm confusion, signature stripping, key disclosure
OOB callbacks — out-of-band data exfiltration setup
HTTP smuggling — request smuggling detection
CORS misconfiguration — origin reflection and credential abuse
Deserialization — Java / .NET / Python / Ruby deserialization checks
ASP.NET bypasses — IIS short-name disclosure, viewstate decryption

These exist as discrete tools to make the most common probes fast and reliable, rather than reconstructing them from bash on every audit.

Pressure injection

If the agent narrates without acting, the orchestration layer injects pressure prompts:

“Stop narrating. Call bash() now. Pick the next attack vector and execute it.”

This keeps engagements moving and prevents the agent from getting stuck in reasoning loops. Real operators don’t write essays — they execute.
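
One way an orchestration layer could decide when to inject the prompt is to count consecutive turns that contain no tool call. The sketch below is an assumption about the mechanism, not a description of it: `PressureMonitor`, `observe_turn`, and the two-turn threshold are hypothetical, and only the prompt text comes from this page.

```python
from typing import Optional

# Prompt text quoted from the documentation above.
PRESSURE_PROMPT = (
    "Stop narrating. Call bash() now. "
    "Pick the next attack vector and execute it."
)


class PressureMonitor:
    """Tracks consecutive turns without a tool call (hypothetical helper)."""

    def __init__(self, max_idle_turns: int = 2):
        # Assumed threshold: the real trigger condition is not documented here.
        self.max_idle_turns = max_idle_turns
        self.idle_turns = 0

    def observe_turn(self, made_tool_call: bool) -> Optional[str]:
        """Return the pressure prompt when the idle threshold is reached."""
        if made_tool_call:
            self.idle_turns = 0
            return None
        self.idle_turns += 1
        if self.idle_turns >= self.max_idle_turns:
            self.idle_turns = 0   # reset so the prompt is not repeated every turn
            return PRESSURE_PROMPT
        return None
```

Resetting the counter after each injection gives the agent a fresh window to act before the next prompt is sent.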

When the engagement ends

The agent transitions to the report phase when:

The mission objective is met (Autonomous mode)
All applicable phases of the methodology are exhausted (Deep / Shallow modes)
The audit time budget expires (any mode)

The final phase is always report writing, regardless of how the engagement ended. Even on a forced stop, you receive whatever findings were confirmed before time ran out.
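
The three triggers can be expressed as a single check. This is an illustrative sketch only: the `Mode` enum, the `report_trigger` function, and the returned labels are hypothetical, and checking the time budget first is a design choice made here because it applies to every mode.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    AUTONOMOUS = "autonomous"
    DEEP = "deep"
    SHALLOW = "shallow"


def report_trigger(
    mode: Mode,
    elapsed_s: float,
    budget_s: float,
    objective_met: bool = False,
    phases_exhausted: bool = False,
) -> Optional[str]:
    """Return why the agent should move to the report phase, or None to continue."""
    if elapsed_s >= budget_s:                       # applies in any mode
        return "time_budget_expired"
    if mode is Mode.AUTONOMOUS and objective_met:   # Autonomous mode only
        return "objective_met"
    if mode in (Mode.DEEP, Mode.SHALLOW) and phases_exhausted:
        return "phases_exhausted"
    return None
```

Whatever the trigger, the next step is the same: the report phase runs over the findings confirmed so far.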

Customization

For most engagements, mode selection is enough. When you need precision — specific TTPs, threat actor emulation, assumed-breach scenarios — use Autonomous mode and write the brief yourself. The scope text you provide is injected directly into the agent’s system prompt and treated as authoritative. There is no parsing, no sanitization — be explicit about what you want.

Next steps

Attack Methodology

Toolkit