# The GigaOps Agent

> How the autonomous red-team agent plans, executes, and reports

**GigaOps** is the autonomous red-team operator that runs every audit. It is not a scanner. It plans its own attack chain, executes real commands in a real shell, observes the results, and iterates.

## Operator semantics

GigaOps operates with the semantics of an elite human operator:

* **Authoritative scope** — every target it receives is treated as authorized for full-scope engagement
* **Stealth-first** — defaults to slow, low-noise techniques (`nmap -T2`, rate-limited `hydra`)
* **Outcome-oriented** — engagement isn't done until the objective is met or time runs out
* **Evidence-driven** — every finding is backed by raw artifacts from successful exploitation
* **Persistent** — won't abandon an attack vector after one failed attempt

This is fundamentally different from a scanner that emits "potential" findings based on version banners. GigaOps confirms.

## What the agent has access to

Inside the [darkops sandbox](/how-it-works/toolkit), GigaOps has:

* **Bash** — arbitrary shell execution with no allowlist
* **Desktop control** — full keyboard, mouse, and screen via Claude's computer-use capability
* **Browser** — full Chromium for web target interaction
* **Network access** — outbound connectivity to attack any reachable target
* **Persistent disk** — for storing intermediate results, wordlists, captures
* **The full offensive toolkit** — pre-installed at sandbox boot

## The decision loop

For each engagement, GigaOps runs in a loop:

```
1. Observe — read the current state (screenshots, last command output, findings so far)
2. Plan — pick the next attack vector based on what's been discovered
3. Execute — run a command, take an action, request a screenshot
4. Analyze — interpret the result
5. Emit — call report_finding() with structured evidence to submit a confirmed finding
6. Loop — repeat until objective met or time exhausted
```

Findings are streamed into the audit as they're confirmed — not batched at the end.
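
The loop above can be read as a small control structure. The following Python skeleton is only an illustration of that shape, not the agent's actual implementation: `EngagementState`, `run_engagement`, and the `plan`, `execute`, `analyze`, and `emit` callables are hypothetical names, and in the real system planning is done by the model rather than by a function.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class EngagementState:
    """What the agent can observe: prior command output and findings so far."""
    history: list = field(default_factory=list)
    findings: list = field(default_factory=list)


def run_engagement(
    state: EngagementState,
    plan: Callable[[EngagementState], Optional[str]],
    execute: Callable[[str], str],
    analyze: Callable[[str, str], Optional[dict]],
    emit: Callable[[dict], bool],
    time_budget_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> EngagementState:
    """Hypothetical loop: plan() returns None once the objective is met."""
    deadline = clock() + time_budget_s
    while clock() < deadline:                # 6. Loop until time is exhausted
        action = plan(state)                 # 1-2. Observe state, plan next step
        if action is None:                   # objective met, stop early
            break
        output = execute(action)             # 3. Execute
        state.history.append((action, output))
        finding = analyze(action, output)    # 4. Analyze
        if finding is not None and emit(finding):
            # 5. Emit: recorded as soon as it is accepted, not batched
            state.findings.append(finding)
    return state
```

Passing `clock` as a parameter is a design choice made for this sketch so the time budget can be exercised deterministically; it is not something the source describes.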

## How findings are submitted and validated

GigaOps submits every finding through a dedicated `report_finding()` tool call, never as free text. Each submission passes two checks in real time before the finding enters the audit, followed by a third verification pass once the engagement ends:

1. **Execution ledger check** — the `cmd_id` cited in the evidence must reference a real command the agent actually ran in this engagement. The command must have exited successfully and produced output. An agent cannot claim a finding based on a command it fabricated or one that produced no response.

2. **Stack fingerprint check** — every bash command output and observed URL is monitored throughout the engagement to confirm which technology stacks are present on the target. If a finding claims a stack-specific vulnerability (e.g. Jinja2 SSTI, Firebase rules, WordPress RCE), that stack must have been fingerprinted from actual recon output. Claims about stacks that were never observed are rejected immediately.

3. **Post-hoc verifier** — after the engagement ends, a second verification pass cross-checks each finding's evidence against the full execution ledger and optionally replays read-only curl commands in the sandbox to confirm the response matches what the agent claimed.

A finding that fails any of these checks is dropped, not downgraded. The pipeline is designed so that a false positive costs the model a `[finding_rejected]` response and nothing else — rejected findings never reach the audit.
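
The first two checks in the list above can be sketched as a single gate. This is a minimal illustration under stated assumptions: the ledger is modeled as a dict keyed by `cmd_id`, fingerprinted stacks as a set of lowercase names, and the `LedgerEntry` and `Finding` field names and the `validate_finding` function are hypothetical rather than the product's real schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class LedgerEntry:
    """One command the agent actually ran (field names are assumptions)."""
    cmd_id: str
    exit_code: int
    output: str


@dataclass(frozen=True)
class Finding:
    title: str
    cmd_id: str                   # evidence must cite a ledger entry
    stack: Optional[str] = None   # set only for stack-specific claims


def validate_finding(
    finding: Finding,
    ledger: Dict[str, LedgerEntry],
    fingerprinted_stacks: Set[str],
) -> bool:
    """Return True if the finding may enter the audit, False if it is dropped."""
    # Execution ledger check: the command exists, succeeded, and produced output.
    entry = ledger.get(finding.cmd_id)
    if entry is None or entry.exit_code != 0 or not entry.output.strip():
        return False
    # Stack fingerprint check: stack-specific claims need an observed stack.
    if finding.stack is not None:
        if finding.stack.lower() not in fingerprinted_stacks:
            return False
    return True
```

In this sketch a `False` result corresponds to the `[finding_rejected]` response: the finding is discarded outright rather than kept at a lower severity.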

## How it plans

GigaOps uses Claude's reasoning to plan, not a fixed playbook. Its decisions are informed by:

* The mission brief (mode-specific or operator-provided)
* The full session history visible in its context
* The output of every previous command
* Screenshots from the desktop
* Tool-specific outputs (nmap results, nuclei templates fired, sqlmap progress)

The agent will sometimes abandon a phase early if it's not producing results, or spend disproportionate time on a single subdomain if it's yielding findings. This is intentional — it mirrors how a real operator allocates attention.

## Tools beyond bash

In addition to bash, the agent has specialized function tools for high-frequency operations:

* **SSTI probe** — server-side template injection detection
* **XXE injection** — XML external entity testing
* **JWT attacks** — algorithm confusion, signature stripping, key disclosure
* **OOB callbacks** — out-of-band data exfiltration setup
* **HTTP smuggling** — request smuggling detection
* **CORS misconfiguration** — origin reflection and credential abuse
* **Deserialization** — Java / .NET / Python / Ruby deserialization checks
* **ASP.NET bypasses** — IIS short-name disclosure, viewstate decryption

These exist as discrete tools to make the most common probes fast and reliable, rather than reconstructing them from bash on every audit.

## Pressure injection

If the agent narrates without acting, the orchestration layer injects pressure prompts:

> *"Stop narrating. Call bash() now. Pick the next attack vector and execute it."*

This keeps engagements moving and prevents the agent from getting stuck in reasoning loops. Real operators don't write essays — they execute.
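
One way an orchestration layer could detect a narration-only turn is to look for recent model turns that contain no tool calls. The sketch below assumes each turn is a mapping with an optional `tool_calls` list; that shape, the `next_injection` function, and the `max_idle_turns` threshold are all hypothetical, since the source does not say how many idle turns trigger an injection. Only the prompt text is taken from the quote above.

```python
from typing import Any, Mapping, Optional, Sequence

# Prompt text quoted from the documentation above.
PRESSURE_PROMPT = (
    "Stop narrating. Call bash() now. "
    "Pick the next attack vector and execute it."
)


def next_injection(
    turns: Sequence[Mapping[str, Any]],
    max_idle_turns: int = 1,
) -> Optional[str]:
    """Return the pressure prompt if the last `max_idle_turns` turns
    contain no tool calls; otherwise return None."""
    if max_idle_turns < 1 or len(turns) < max_idle_turns:
        return None
    recent = turns[-max_idle_turns:]
    # A turn counts as idle when `tool_calls` is missing or empty.
    if all(not turn.get("tool_calls") for turn in recent):
        return PRESSURE_PROMPT
    return None
```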

## When the engagement ends

The agent transitions to the report phase when:

* The mission objective is met (Autonomous mode)
* All applicable phases of the methodology are exhausted (Deep / Shallow modes)
* The audit time budget expires (any mode)

The final phase is always report writing, regardless of how the engagement ended. Even on a forced stop, you receive whatever findings were confirmed before time ran out.
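
The three stop conditions can be expressed as one predicate. The mode names come from the list above; the `Mode` enum, the `should_enter_report_phase` function, and its parameter names are hypothetical and serve only to make the precedence explicit (time expiry applies in every mode).

```python
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"
    DEEP = "deep"
    SHALLOW = "shallow"


def should_enter_report_phase(
    mode: Mode,
    objective_met: bool,
    phases_remaining: int,
    seconds_left: float,
) -> bool:
    """Decide whether the agent should stop attacking and write the report."""
    if seconds_left <= 0:
        return True                  # time budget expired (any mode)
    if mode is Mode.AUTONOMOUS:
        return objective_met         # objective-driven stop
    return phases_remaining == 0     # Deep / Shallow: methodology exhausted
```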

## Customization

For most engagements, mode selection is enough. When you need precision — specific TTPs, threat actor emulation, assumed-breach scenarios — use [Autonomous mode](/audits/autonomous) and write the brief yourself.

The scope text you provide is injected directly into the agent's system prompt and treated as authoritative. There is no parsing, no sanitization — be explicit about what you want.

## Next steps

<Columns cols={2}>
  <Card title="Attack Methodology" icon="crosshairs" href="/how-it-works/methodology">
    The 10-phase chain GigaOps follows.
  </Card>

  <Card title="Toolkit" icon="screwdriver-wrench" href="/how-it-works/toolkit">
    Everything pre-installed in the darkops sandbox.
  </Card>
</Columns>
