Project Cotton
This is just a sketch of an idea that I’d like to work on in the very near future. I’m going to document my progress here.
Essentially: fully automated luxury red teaming and pentesting on demand. Key features (not exhaustive by any means):
- Takes in a domain name, a company name, or IP addresses
- Generates a kanban board with all tasks planned out (first-order reconnaissance)
- An AutoGen-like LLM agent orchestrates the tasks on the kanban board, spinning up microVMs / containerized agents as needed for each task.
- An OODA loop (Observe, Orient, Decide, Act) keeps enriching the task queue by digging into the rabbit holes that are worth it and stepping away from those that aren’t:
  - LLM agent to read the situation and plan the next step
  - LLM agent to execute the next step
  - LLM agent to review the results and plan the next step
- Once the final task has been completed, begin writing a report:
  - A collection of .md files, with a unique tracking ID for each finding
  - Every finding is an atomic .md file, and a group of atomics with some narrative structure makes up a report.
  - The report for a client is these .md files stitched together.
- I haven’t thought about the human guardrails (“big red button” stop scenarios) yet, but I need to.
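
A rough sketch of how the OODA loop and task queue described above could fit together, in Python purely for illustration (the glue language is still undecided). Everything here is hypothetical: `Task`, `plan`, `execute`, `review` and `ooda_loop` are made-up names, the three "agents" are plain functions returning canned data instead of LLM calls, nothing spins up microVMs or containers, and the depth limit and task budget are just one crude way to step away from rabbit holes.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    depth: int = 0        # how far down a rabbit hole this task sits
    status: str = "todo"  # kanban column: "todo" or "done"


def plan(task: Task) -> str:
    """Stand-in for the agent that reads the situation and plans the next step."""
    return f"run {task.name}"


def execute(action: str) -> list[str]:
    """Stand-in for the agent that executes the step; returns canned, made-up leads."""
    canned = {
        "run dns-enum": ["port-scan", "subdomain-takeover-check"],
        "run port-scan": ["http-fingerprint"],
    }
    return canned.get(action, [])


def review(task: Task, leads: list[str], max_depth: int) -> list[Task]:
    """Stand-in for the reviewing agent: drop leads that sit too deep to be worth it."""
    if task.depth >= max_depth:
        return []
    return [Task(lead, task.depth + 1) for lead in leads]


def ooda_loop(seed: list[Task], max_depth: int = 2, max_tasks: int = 50) -> list[Task]:
    """Work through the queue, enriching it with follow-up tasks as results come in."""
    queue = deque(seed)
    seen = {task.name for task in seed}
    board: list[Task] = []
    while queue and len(board) < max_tasks:  # max_tasks is a hard budget
        task = queue.popleft()
        action = plan(task)                  # observe + orient
        leads = execute(action)              # act
        for new_task in review(task, leads, max_depth):  # decide
            if new_task.name not in seen:    # never queue the same task twice
                seen.add(new_task.name)
                queue.append(new_task)
        task.status = "done"
        board.append(task)
    return board


board = ooda_loop([Task("dns-enum")])
for task in board:
    print(task.depth, task.name, task.status)
```

Starting from a single `dns-enum` task, the loop finishes four tasks in breadth-first order: the seed, its two follow-ups, and one follow-up of `port-scan`. Setting `max_depth=0` leaves only the seed task on the board.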
Next steps:
- Figure out a cloud lab that I can start experimenting with
- Figure out the glue language for all of this (Golang or Python or Rust)
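
Coming back to the report idea from the feature list, here is a minimal sketch of atomic findings being written as separate .md files and then stitched into a client report. It is in Python only for illustration, and the details are assumptions: the `F-` plus eight hex characters ID format, the file layout, the helper names `write_finding` and `stitch_report`, and the placeholder finding titles are all invented for the example.

```python
import tempfile
import uuid
from pathlib import Path


def write_finding(directory: Path, title: str, body: str) -> str:
    """Write one atomic finding to its own .md file and return its tracking ID."""
    finding_id = f"F-{uuid.uuid4().hex[:8]}"  # ID format is an assumption
    text = f"## {title}\n\nTracking ID: {finding_id}\n\n{body}\n"
    (directory / f"{finding_id}.md").write_text(text, encoding="utf-8")
    return finding_id


def stitch_report(directory: Path, intro: str, finding_ids: list[str]) -> str:
    """Join the chosen findings, in the given order, under a narrative intro."""
    parts = [intro.strip()]
    for finding_id in finding_ids:
        path = directory / f"{finding_id}.md"
        parts.append(path.read_text(encoding="utf-8").strip())
    return "\n\n".join(parts) + "\n"


with tempfile.TemporaryDirectory() as tmp:
    findings_dir = Path(tmp)
    # Placeholder findings, not real results.
    first = write_finding(findings_dir, "Exposed admin panel", "Placeholder details.")
    second = write_finding(findings_dir, "Outdated TLS configuration", "Placeholder details.")
    # The narrative order is chosen per report, independent of discovery order.
    report = stitch_report(findings_dir, "# Report for Example Client", [second, first])

print(report)
```

Because each finding lives in its own file under its own ID, the same atomic file can be reused in more than one report, and the ordering passed to `stitch_report` is where the narrative structure comes in.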