Universidad Politécnica de Madrid

A 'Digital Twin' of the Network to Train AI in Cyberthreat Detection

Researchers from UPM and Telefónica have joined forces to create a new AI-based tool that will help distinguish legitimate traffic from malicious traffic in real time and could mark a paradigm shift for network operators.

24.11.2025

In the intricate fabric of telecommunications, networks operate as vast digital highways where billions of data points stream through countless connections. A significant share of this traffic stands out for its high bandwidth consumption: these are known as heavy hitter flows. Behind them are everyday legitimate activities such as binge-watching shows on Netflix, watching high-definition videos on YouTube, or syncing files with cloud services like Google Drive.

However, this ability to move massive volumes of data also reveals a dangerous side: malicious volumetric attacks often appear as large traffic spikes, which can be mistakenly classified as heavy hitters. These cyberattacks intentionally aim to take down an internet service and overwhelm a network segment, with the goal of leaving thousands—or even millions—of users offline and crippling essential services. To counter these attacks, it is crucial to distinguish data traffic that is genuinely legitimate from traffic that poses a threat, a task that demands a high level of analytical capability.
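To see why volume alone is not enough, consider a minimal sketch of heavy hitter detection by byte share. The flow names, packet sizes and the 10% threshold below are illustrative assumptions, not values from the researchers' work.

```python
from collections import defaultdict


def heavy_hitters(packets, share=0.10):
    """Return the flows whose bytes make up at least `share` of all traffic.

    `packets` is an iterable of (flow_id, size_in_bytes) pairs. The 10%
    default threshold is an arbitrary, illustrative choice.
    """
    totals = defaultdict(int)
    for flow_id, size in packets:
        totals[flow_id] += size
    total = sum(totals.values())
    if total == 0:
        return {}
    return {flow: b for flow, b in totals.items() if b / total >= share}


# Hypothetical capture: one legitimate video stream, one flood, two small flows.
packets = [
    ("video-stream", 1500), ("video-stream", 1500), ("video-stream", 1500),
    ("dns-lookup", 80),
    ("udp-flood", 1400), ("udp-flood", 1400), ("udp-flood", 1400),
    ("web-page", 600),
]

hh = heavy_hitters(packets)
print(sorted(hh))
```

Both the video stream and the flood cross the threshold, so a purely volumetric rule flags them alike. Telling them apart requires additional information about how each flow behaves.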

To support this task, a team of researchers from the Mathematical Modeling and Biocomputing Group at UPM and Telefónica Innovación Digital has built a “digital twin” that helps train artificial intelligence to carry out this function: identifying heavy flows and distinguishing malicious traffic from legitimate traffic.

“A Network Digital Twin is a virtual and dynamic replica of the physical network. In advanced configurations, network digital twins are fed in real time with measurements from the actual network and, in turn, allow testing configuration adjustments in a continuous loop without affecting the active service,” explains Alberto Mozo, UPM full professor, director of the MMB research group and principal investigator of the Horizon Europe ACROSS project, within which this work has been developed.

Within this controlled environment, the researchers generate synthetic traffic that reproduces both everyday user behavior and the patterns of various attacks. This traffic is automatically labeled to indicate its nature (benign or malicious). Using this entire set of labeled data, they train a supervised learning algorithm, a type of artificial intelligence that learns by identifying patterns in previously categorized examples. “The ultimate goal is to teach the system to accurately infer the intention behind each heavy flow, turning it into an effective guardian of digital highways,” adds Amit Karamchandani Batra, another UPM researcher and first author of this work.

Training the AI

The work developed by UPM and Telefónica focused on four major goals designed to transform the way internet traffic is protected and managed. The first is to use the Network Digital Twin to create realistic network scenarios from which a large and diverse set of synthetic data can be collected. This allows researchers to avoid using data from real users, thereby protecting their privacy.

Network Digital Twin topology for emulating normal and heavy-hitter clients and servers.

The second goal involves using this synthetic data to train the AI to recognize each type of flow. To achieve this, thousands of examples of network traffic are presented, telling the AI in each case: “this is normal traffic,” “this is a user downloading a large file—this is legitimate,” or “careful, this is an attack!”.

“By analyzing all these previously classified (or ‘labeled’) examples, the AI learns to detect the clues and characteristic patterns of each type of flow without inspecting the content of the data, which is essential for protecting user privacy,” explains Luis de la Cal, also a UPM researcher and co-author of this work. “The goal is that, once trained, it can independently decide, in just milliseconds, whether a new data flow reaching the network is normal, a legitimate heavy hitter (such as downloading a large file or streaming high-quality video), or, on the contrary, part of a cyberattack,” he adds.
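The training idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the three flow features (mean packet size, packets per second, share of server-to-client packets), the per-class profiles, and the nearest-centroid classifier are all hypothetical stand-ins chosen for brevity, not the features or the model used by the researchers. Note that the features are flow-level statistics only; no packet content is examined.

```python
import random
import statistics

LABELS = ("normal", "legitimate_heavy_hitter", "attack")

# Hypothetical class profiles: (mean packet size in bytes, packets per second,
# fraction of packets travelling server -> client). Invented for illustration.
PROFILES = {
    "normal": (400.0, 20.0, 0.5),
    "legitimate_heavy_hitter": (1400.0, 800.0, 0.9),
    "attack": (120.0, 5000.0, 0.02),
}


def synth_flow(label, rng):
    """Draw one synthetic flow: Gaussian noise around the class profile."""
    size, pps, down = PROFILES[label]
    return (
        rng.gauss(size, size * 0.1),
        rng.gauss(pps, pps * 0.2),
        min(1.0, max(0.0, rng.gauss(down, 0.05))),
    )


def make_dataset(n_per_class, rng):
    """Generate flows that are labeled automatically at creation time."""
    return [(synth_flow(lbl, rng), lbl) for lbl in LABELS for _ in range(n_per_class)]


def fit(dataset):
    """Standardize each feature, then store one centroid per label."""
    columns = list(zip(*(x for x, _ in dataset)))
    mean = [statistics.fmean(c) for c in columns]
    std = [statistics.pstdev(c) or 1.0 for c in columns]

    def scale(x):
        return [(v - m) / s for v, m, s in zip(x, mean, std)]

    centroids = {}
    for lbl in LABELS:
        rows = [scale(x) for x, y in dataset if y == lbl]
        centroids[lbl] = [statistics.fmean(c) for c in zip(*rows)]
    return scale, centroids


def predict(model, x):
    """Assign the label whose centroid is closest in standardized space."""
    scale, centroids = model
    z = scale(x)
    return min(
        centroids,
        key=lambda lbl: sum((a - b) ** 2 for a, b in zip(z, centroids[lbl])),
    )


rng = random.Random(0)                 # fixed seed for repeatability
model = fit(make_dataset(200, rng))    # train on labeled synthetic flows
test_set = make_dataset(50, rng)       # fresh flows the model has not seen
accuracy = sum(predict(model, x) == y for x, y in test_set) / len(test_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the invented profiles are deliberately well separated, this toy classifier scores highly; real traffic overlaps far more, which is why a realistic, diverse training set matters.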

The third goal aims to close the loop between the virtual and real worlds, enabling live data exchange between the digital twin and the physical network. This bidirectional communication allows operators to safely and fully test new policies to improve service quality or strategies to mitigate attacks, without putting customer connections or information at risk. Moreover, by updating the digital twin with real-time data from its physical counterpart, the AI model can be continuously refined so that it remains calibrated and effective against emerging threats and cybercriminals’ evasion techniques.

Architecture for real-time network monitoring and proactive traffic optimization.

The fourth goal delivers a benefit to the entire scientific community, as it involves releasing both the system’s source code and the generated dataset so that anyone can access them.

Test Passed

Beyond laboratory tests, the work carried out by the researchers delivered outstanding results. “The artificial intelligence, trained with this data, learned to distinguish with remarkable precision and at impressive speed between normal traffic, legitimate heavy-hitter flows, and those generated by DDoS attacks,” explains Mozo. “The AI model didn’t just excel in controlled laboratory conditions: when we challenged it with completely new and unseen traffic data (including public datasets commonly used to evaluate intrusion detection systems), it continued to deliver excellent performance,” he concludes.

Reference: A. Karamchandani, J. Nunez, L. de-la-Cal, Y. Moreno, A. Mozo, and A. Pastor, “On the Applicability of Network Digital Twins in Generating Synthetic Data for Heavy Hitter Discrimination,” IEEE Communications Magazine, pp. 2–8, 2025, DOI: 10.1109/MCOM.003.2400648.

This work has received funding from the European Commission through the HORIZON-JU-SNS-2022 ACROSS project with Grant Agreement number 101097122.