Project Cotton

2024-11-02

This is just a sketch of an idea that I’d like to work on in the very near future. I’m going to document my progress here.

Essentially, fully automated luxury red teaming and pentesting on demand. Key features (by no means exhaustive):

Takes in a domain name, a company name, or IP addresses
Generates a kanban board with all tasks planned out (first-order reconnaissance)
AutoGen-like LLM agent to orchestrate the tasks on the kanban board – spins up microVMs / containerized agents as needed for each task.
OODA Loop (Observe, Orient, Decide, Act) that keeps enriching the task queue by digging into rabbit holes that are worth it, and stepping away from those that aren’t:
- LLM agent to read the situation, plan the next step
- LLM agent to execute the next step
- LLM agent to review the results, and plan the next step
Once the final task has been completed, begin writing a report:
- Collection of .md files, with a unique tracking ID for each finding
- The report for a client is going to be these .md files stitched together.
- Every finding is an atomic .md file, and groups of atomics with some narrative structure will be a report.
I haven’t thought about the human guardrails (“big red button stop” scenarios) yet, but I need to.
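To make the loop and the report idea above a bit more concrete, here is a minimal Python sketch of how they might fit together. Everything in it is an assumption for illustration: the `Task` and `Finding` types, the `plan` / `execute` / `review` functions (plain stubs standing in for the three LLM agents), the `COTTON-0001` style tracking IDs, and the `max_depth` / `max_steps` limits, which are only a crude stand-in for real guardrails. Python is used purely for the example; it is not a decision on the glue language.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count


@dataclass
class Task:
    name: str
    depth: int = 0  # how far down a rabbit hole this task sits


@dataclass
class Finding:
    tracking_id: str  # hypothetical ID format, e.g. COTTON-0001
    title: str
    body: str

    def to_markdown(self) -> str:
        # One atomic Markdown document per finding.
        return f"## {self.tracking_id}: {self.title}\n\n{self.body}\n"


def plan(task: Task) -> str:
    """Stub for the agent that reads the situation and plans the next step."""
    return f"run {task.name}"


def execute(step: str) -> str:
    """Stub for the executing agent; a real one might run in a microVM or container."""
    return f"output of '{step}'"


def review(task: Task, result: str, max_depth: int) -> list[Task]:
    """Stub for the reviewing agent: return follow-up tasks worth digging into."""
    if task.depth >= max_depth:
        return []  # step away from this rabbit hole
    return [Task(f"{task.name}/follow-up", task.depth + 1)]


def run_loop(seed: list[Task], max_depth: int = 2, max_steps: int = 50) -> list[Finding]:
    """Drain the task queue, letting the reviewer enrich it along the way."""
    queue = deque(seed)
    ids = count(1)
    findings: list[Finding] = []
    steps = 0
    # max_steps is a hard budget so the loop always terminates.
    while queue and steps < max_steps:
        task = queue.popleft()
        result = execute(plan(task))
        # Simplification: every task yields exactly one finding.
        findings.append(Finding(f"COTTON-{next(ids):04d}", task.name, result))
        queue.extend(review(task, result, max_depth))
        steps += 1
    return findings


def stitch_report(findings: list[Finding], title: str) -> str:
    """Join the atomic findings into a single Markdown report."""
    parts = [f"# {title}\n"] + [f.to_markdown() for f in findings]
    return "\n".join(parts)


findings = run_loop([Task("dns-enum"), Task("port-scan")])
report = stitch_report(findings, "Example report")
```

In this sketch the kanban board is reduced to a plain FIFO queue, and "worth it" is reduced to a depth cutoff. A real reviewer would presumably score each follow-up from the result it just read, and each finding would be written to its own .md file rather than kept in memory.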
Next steps:
- Figure out a cloud lab that I can start experimenting with
- Figure out the glue language for all of this (Golang or Python or Rust)

Aarav's Blog