One of the least visible bottlenecks in cybersecurity appears after an attack has already been documented. When a new threat is identified, defenders are typically given a written report describing what happened. Turning that report into a working reproduction of the attack, one that can be tested against real systems, is slow, expensive, and difficult to scale.

Researchers at the U.S. Department of Energy’s Pacific Northwest National Laboratory (PNNL) are examining how generative AI can significantly shorten this process. Their work focuses on reconstructing complex cyberattacks from text descriptions so defenders can rapidly test whether existing protections are effective and update them if they are not. Early results suggest that work which often takes weeks can, in some cases, be completed in hours.

“To really protect against an attack, you need to replicate it,” said Loc Truong, a data scientist at PNNL who leads the project. “When an attack happens, usually a defender simply receives a text document explaining the attack, but someone needs to re-implement the entire attack. That can be a lengthy process and cost a lot of money.”

Why Reconstructing Attacks Takes So Long

Recreating a real-world cyberattack is rarely straightforward. Sophisticated incidents often involve long attack chains made up of many distinct tactics and dozens or even hundreds of individual steps. Each step must be recreated, configured for the target environment, and executed in the correct order before defences can be meaningfully tested.

Tools such as MITRE’s open-source Caldera platform help automate parts of this work, but many parameters still require manual input. This is especially true when attacks are customised for specific operating systems, applications, or industrial environments. As reconstructed attacks are run repeatedly, errors frequently require human attention and manual fixes, extending the timeline further.

The cost and effort involved mean that once an organisation successfully reconstructs a complex attack, there is often little incentive to share that work, even when other organisations could face similar threats.

The Growing Role of AI in Cyber Offence and Defence

The PNNL research comes at a time when attackers are already making extensive use of generative AI. According to PNNL cybersecurity researcher Kristopher Willis, AI is now part of how some of the most capable attackers operate across industry, academia, and government.

“At the most recent DEF CON, every team competing in the Capture the Flag finals was using AI to assist with their attacks,” Willis said.

As attackers adopt these tools, defenders are expanding their own use of automation and autonomous systems. The goal is not to remove human oversight, but to reduce the manual effort required to translate knowledge about an attack into practical defensive testing.

The Introduction of ALOHA – AI Agent for Cyber Attack Reconstruction

To address this challenge, the PNNL team developed an adaptive generative AI agent called ALOHA (Agentic LLMs for Offensive Heuristic Automation). ALOHA is built using Claude, a large language model developed by Anthropic, and works alongside MITRE Caldera.

When an attack occurs, a human defender enters a text description of the incident into ALOHA and instructs it to recreate the steps needed to emulate that attack. A complex attack chain may involve around 20 different tactics and more than 100 individual steps, all of which must be reconstructed before defences can be tested.

“You describe what you want, in plain English, and generative AI runs the attack automatically,” Truong said. “The technology speeds up the defender’s response. It’s click and go.”

In one test, ALOHA generated roughly one million tokens of output to rebuild a multi-step attack chain from simple written guidance.

Testing and Improving Defences

Once reconstructed, the attack is launched against the original target system in a siloed, offline environment. The purpose is to determine whether newly installed protections can stop it.

If defences fail, controls are adjusted and the attack is run again. This creates a rapid feedback loop: the system attacks, defences respond, protections are strengthened, and the attack is repeated to expose remaining gaps. Unlike traditional approaches, ALOHA can automatically generate fixes when it encounters errors during execution, reducing the need for manual troubleshooting.
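The shape of that feedback loop can be illustrated with a toy simulation. The sketch below is not ALOHA's code and does not use the Caldera API. The `Defences` class, the `run_chain` and `harden_until_covered` functions, and the step names are all hypothetical, and nothing is executed against a real system. It assumes an attack is a list of abstract step labels, a control either catches a step or does not, and each round closes one gap before the chain is replayed.

```python
from dataclasses import dataclass, field


@dataclass
class Defences:
    """Toy stand-in for a set of controls (hypothetical, not a real product)."""
    covered: set = field(default_factory=set)  # step labels a control catches

    def detects(self, step):
        return step in self.covered


def run_chain(chain, defences):
    """Replay every step of the chain and return the ones no control caught."""
    return [step for step in chain if not defences.detects(step)]


def harden_until_covered(chain, defences, max_rounds=10):
    """Attack, inspect the gaps, add one control, repeat.

    Returns the number of uncaught steps observed in each round.
    """
    history = []
    for _ in range(max_rounds):
        gaps = run_chain(chain, defences)
        history.append(len(gaps))
        if not gaps:
            break  # every step was caught; nothing left to fix
        # Placeholder for the real work: a reviewed change to a control.
        defences.covered.add(gaps[0])
    return history


# Abstract labels only; a real chain could have 100+ steps.
chain = ["step-a", "step-b", "step-c"]
defences = Defences(covered={"step-b"})
print(harden_until_covered(chain, defences))  # prints [2, 1, 0]
```

The shrinking gap count per round is the point of the loop: each replay shows what the latest adjustment fixed and what it left open. The `max_rounds` cap is an assumption added so the loop always terminates.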

“There are many programs out there to detect attacks,” Willis said. “ALOHA goes much further, adapting attacks to particular hardware, software, and environments, and then attacking again and again to improve response.”

Defence in Hours, Not Weeks

PNNL manages the Control Environment Laboratory Resource (CELR) for the Cybersecurity and Infrastructure Security Agency. In a simulated test at CELR, researchers used ALOHA to strengthen the defences of a water treatment plant. Reconstructing and defending against a complex attack involving more than 100 steps took approximately three hours. Traditionally, the same effort would have taken weeks.

“One small action in a long chain might take a millisecond for ALOHA but several minutes for a person,” Truong said. “The difference becomes very pronounced for complex attacks.”

Accelerating Defence Without Replacing Humans

The researchers emphasise that broader testing across different systems is still needed. The work does not suggest replacing human defenders, but rather accelerating one of the most time-consuming phases of cyber defence: turning written reports into executable attack scenarios. In an environment where attackers already move quickly and reuse techniques, that reduction in time may prove critical.
