Background
While working on LLM-for-software-security research, I hit a core pain point in PentestGPT: in multi-step penetration tests, the model tends to “forget” earlier context, producing incoherent commands and redundantly re-exploring paths it has already covered.
Approach
Two modules were introduced to improve context retention:
1. RAG (Retrieval-Augmented Generation)
Store historical pentest steps and discovered vulnerability info in a vector database. Before generating each new command, retrieve relevant history and inject it into the prompt context.
```python
# Retrieve the k most relevant past steps from the vector store
# (LangChain-style API) and inject them into the prompt context.
results = vector_store.similarity_search(current_state, k=5)
context = "\n".join([r.page_content for r in results])
prompt = f"History context:\n{context}\n\nCurrent task: {task}"
```
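To make the retrieve-and-inject pattern concrete without depending on a specific vector database, here is a minimal, self-contained sketch. The bag-of-words embedding and the sample findings are purely illustrative assumptions; a real deployment would use a proper embedding model and vector store:

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding" (illustrative only); a real setup
    # would use a sentence-embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    """In-memory stand-in for a vector database of past pentest steps."""
    def __init__(self):
        self.docs = []  # list of (embedding, text) pairs

    def add(self, text):
        self.docs.append((embed(text), text))

    def similarity_search(self, query, k=5):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

# Hypothetical findings from earlier steps
store = ToyVectorStore()
store.add("nmap scan found port 22 open (OpenSSH 7.2) on 10.0.0.5")
store.add("gobuster found /admin directory on the web server")
store.add("port 80 runs Apache 2.4.18, possibly vulnerable")

# Before generating the next command, pull the most relevant history
history = store.similarity_search("brute force ssh on port 22 of 10.0.0.5", k=2)
prompt = "History context:\n" + "\n".join(history) + "\n\nCurrent task: gain SSH access"
```

The key design point is the same as in the real pipeline: retrieval happens on every step, so the prompt always carries the few most relevant findings instead of the full (and ever-growing) history.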
2. Knowledge Graph Reasoning
Model the network topology, discovered services, and CVE relationships as a graph, allowing the model to perform multi-hop reasoning when planning attack paths — rather than starting from scratch each step.
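The multi-hop idea can be sketched with a plain adjacency-list graph and a BFS path search. The hosts, services, and CVE IDs below are illustrative assumptions, and a real system would likely use a graph database or a library like networkx:

```python
from collections import deque

# Hypothetical attack-knowledge graph: nodes are hosts, services, and CVEs;
# edges encode "reaches", "runs", and "vulnerable_to" relations.
graph = {
    "attacker":      ["web-01"],
    "web-01":        ["apache-2.4.18", "db-01"],  # service it runs + host it reaches
    "apache-2.4.18": ["CVE-2017-15715"],
    "db-01":         ["mysql-5.5"],
    "mysql-5.5":     ["CVE-2016-6662"],
}

def attack_path(graph, start, target):
    """BFS: shortest chain of hops from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Multi-hop reasoning: how can the attacker reach the MySQL CVE?
path = attack_path(graph, "attacker", "CVE-2016-6662")
# path == ["attacker", "web-01", "db-01", "mysql-5.5", "CVE-2016-6662"]
```

Because the graph persists across steps, each new plan extends known paths rather than rediscovering the topology from scratch.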
Takeaways
- RAG noticeably reduces redundant probing
- Knowledge graphs show greater advantage in complex network scenarios
- Local deployment via Ollama enables fully offline operation — useful for sensitive testing environments
Lots of room for improvement, more experiments to come 🔧