Conference Proceedings by idio guarino

IEEE 6th International Forum on Research and Technology for Society and Industry (RTSI), 2021
In this work, we address the characterization and modeling of the network traffic generated by communication and collaboration apps, which have seen a recent traffic surge due to the spread of the COVID-19 pandemic. In detail, focusing on five of the most popular mobile apps used for working and studying during the pandemic time frame (with traffic collected via the MIRAGE architecture), we provide characterization at trace and flow level, and modeling by means of Multimodal Markov Chains for both apps and related activities. The results highlight interesting peculiarities related to both the running applications and the specific activities performed. The outcome of this analysis is a stepping stone toward a number of tasks related to network management and traffic analysis, such as identification/classification and prediction, and modern IT management in general.
Papers by idio guarino

Fine-Grained Traffic Prediction of Communication-and-Collaboration Apps Via Deep-Learning: A First Look at Explainability
The lifestyle change brought about by the COVID-19 pandemic has had a measurable impact on Internet traffic in terms of volume and application mix, with a sudden increase in usage of communication-and-collaboration apps. In this work, we focus on four of these apps (Skype, Teams, Webex, and Zoom), whose traffic we collect, reliably label at fine (i.e. per-activity) granularity, and analyze from the viewpoint of traffic prediction. The outcome of this analysis is informative for a number of network management tasks, including monitoring, planning, resource provisioning, and (security) policy enforcement. To this aim, we employ state-of-the-art multitask deep learning approaches to assess to what degree the traffic generated by these apps and their different use cases (i.e. activities: audio-call, video-call, and chat) can be forecast at packet level. The experimental analysis investigates the performance of the considered deep learning architectures, in terms of both traffic-prediction accuracy and complexity, and the related trade-off. Equally important, our work is a first attempt at interpreting the results obtained by these predictors via eXplainable Artificial Intelligence (XAI).

IEEE open journal of the Communications Society, 2024
Significant transformations in lifestyle have reshaped the Internet landscape, resulting in notable shifts in both the magnitude of Internet traffic and the diversity of apps utilized. The increased adoption of communication-and-collaboration apps, also fueled by lockdowns in the COVID pandemic years, has heavily impacted the management of network infrastructures and their traffic. A notable characteristic of these apps is their multi-activity nature, e.g., they can be used for chat and (interactive) audio/video in the same usage session: predicting and managing the traffic they generate is an important but especially challenging task. In this study, we focus on real data from four popular apps belonging to the aforementioned category: Skype, Teams, Webex, and Zoom. First, we collect traffic data from these apps, reliably label it with both the app and the specific user activity, and analyze it from the perspective of traffic prediction. Second, we design data-driven models to predict this traffic at the finest granularity (i.e. at packet level), employing four advanced multitask deep learning architectures and investigating three different training strategies. The trade-off between performance and complexity is explored as well. We publish the dataset and release our code as open source to foster the replicability of our analysis. Third, we leverage the packet-level prediction approach to perform aggregate prediction at different timescales. Fourth, our study pioneers the trustworthiness analysis of these predictors via the application of eXplainable Artificial Intelligence to (a) interpret their forecasting results and (b) evaluate their reliability, highlighting the relative importance of different parts of observed traffic and thus offering insights for future analyses and applications.
The insights gained from the analysis provided with this work have implications for various network management tasks, including monitoring, planning, resource allocation, and enforcing security policies.

arXiv (Cornell University), May 21, 2023
The popularity of Deep Learning (DL), coupled with network traffic visibility reduction due to the increased adoption of HTTPS, QUIC, and DNS-SEC, re-ignited interest in Traffic Classification (TC). However, to tame the dependency on large task-specific labeled datasets, we need to find better ways to learn representations that are valid across tasks. In this work we investigate this problem by comparing transfer learning, meta-learning, and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models (16 methods total). Using two publicly available datasets, namely MIRAGE19 (40 classes) and AppClassNet (500 classes), we show that (i) by using DL methods on large datasets we can obtain more general representations, with (ii) contrastive learning methods yielding the best performance and (iii) meta-learning the worst. While (iv) tree-based models can be impractical for large tasks but fit small tasks well, (v) DL methods that reuse better learned representations are closing their performance gap against trees for small tasks as well.
Many or Few Samples?: Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification
2023 7th Network Traffic Measurement and Analysis Conference (TMA)

Contextual counters and multimodal Deep Learning for activity-level traffic classification of mobile communication apps during COVID-19 pandemic
Computer Networks, Nov 1, 2022
The COVID-19 pandemic has reshaped Internet traffic due to the huge modifications imposed on the lifestyle of people, who resort more and more to collaboration and communication apps to accomplish daily tasks. Accordingly, these dramatic changes call for novel traffic management solutions to adequately counter such unexpected and massive changes in traffic characteristics. In this paper, we focus on communication and collaboration apps whose traffic experienced a sudden growth during the last two years. Specifically, we consider nine apps whose traffic we collect, reliably label, and publicly release as a new dataset (MIRAGE-COVID-CCMA-2022) to the scientific community. First, we investigate the capability of state-of-the-art single-modal and multimodal Deep Learning-based classifiers in telling the specific app, the activity performed by the user, or both. While we highlight that state-of-the-art solutions report a more-than-satisfactory performance in addressing app classification (96%-98% F-measure), evident shortcomings emerge when tackling activity classification (56%-65% F-measure) with approaches that leverage the transport-layer payload and/or per-packet information attainable from the initial part of the biflows. In line with these limitations, we design a novel set of inputs (namely Context Inputs) providing clues about the nature of a biflow by observing the biflows coexisting simultaneously. Based on these considerations, we propose Mimetic-All, a novel early traffic classification multimodal solution that leverages Context Inputs as an additional modality, achieving ≥ 82% F-measure in activity classification. Also, capitalizing on the multimodal nature of Mimetic-All, we evaluate different combinations of the inputs.
Interestingly, experimental results show that Mimetic-ConSeq, a variant that uses the Context Inputs but does not rely on payload information (thus gaining greater robustness to more opaque encryption sub-layers that may be adopted in the future), experiences only an ≈ 1% F-measure drop in performance w.r.t. Mimetic-All and results in a shorter training time.

On the use of Machine Learning Approaches for the Early Classification in Network Intrusion Detection
2022 IEEE International Symposium on Measurements & Networking (M&N)
Current intrusion detection techniques cannot keep up with the increasing amount and complexity of cyber attacks. In fact, most of the traffic is encrypted, which prevents the application of deep packet inspection approaches. In recent years, Machine Learning techniques have been proposed for postmortem detection of network attacks, and many datasets have been shared by research groups and organizations for training and validation. Unlike the vast related literature, in this paper we propose an early classification approach conducted on the CSE-CIC-IDS2018 dataset, which contains both benign and malicious traffic, for the detection of malicious attacks before they can damage an organization. To this aim, we investigated a different set of features, and the sensitivity of the performance of five classification algorithms to the number of observed packets. Results show that ML approaches relying on ten packets provide satisfactory results.
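
To make the notion of early classification on a limited number of observed packets concrete, here is a minimal sketch of summarizing only the first n packets of a flow. It is an illustration under stated assumptions, not the paper's feature set: the function name `early_features`, the packet tuple layout, and the chosen statistics are all hypothetical.

```python
from statistics import mean


def early_features(packets, n=10):
    """Summarize only the first n packets of a flow.

    `packets` is a list of (timestamp_seconds, size_bytes, direction)
    tuples, with direction +1 (client to server) or -1 (server to client).
    This layout is hypothetical, not the dataset's own schema.
    """
    head = packets[:n]
    if not head:
        raise ValueError("flow has no packets")
    sizes = [size for _, size, _ in head]
    times = [ts for ts, _, _ in head]
    # Inter-arrival times between consecutive observed packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "observed_packets": len(head),
        "mean_size": mean(sizes),
        "max_size": max(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "upstream_fraction": sum(1 for _, _, d in head if d > 0) / len(head),
    }


# Toy flow with four packets; only the first three are observed.
flow = [(0.00, 60, +1), (0.05, 60, -1), (0.10, 1500, +1), (0.40, 1500, +1)]
features = early_features(flow, n=3)
```

Varying `n` and retraining a classifier on the resulting feature vectors is one way such a sensitivity analysis to the number of observed packets could be set up.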

Classification of Communication and Collaboration Apps via Advanced Deep-Learning Approaches
The lockdowns and lifestyle changes during the COVID-19 pandemic have had a measurable impact on Internet traffic in terms of volume and application mix, with a sudden increase in usage of communication and collaboration apps. In this work, we focus on five such apps, whose traffic we collect, reliably label at fine granularity (per-activity), and analyze from the viewpoint of traffic classification. To this aim, we employ state-of-the-art deep learning approaches to assess to what degree the apps, their different use cases (activities), and the app-activity pairs can be told apart from each other. We investigate the early behavior of the biflows composing the traffic and the effect of tuning the dimension of the input, via a sensitivity analysis. The experimental analysis reports figures for the different architectures, in terms of both traffic-classification performance and complexity w.r.t. different classification tasks, and the related trade-off. The outcome of this analysis is informative for a number of network management tasks, including monitoring, planning, resource provisioning, and (security) policy enforcement.

Characterizing and Modeling Traffic of Communication and Collaboration Apps Bloomed With COVID-19 Outbreak
2021 IEEE 6th International Forum on Research and Technology for Society and Industry (RTSI)
Mobile Network Traffic Prediction Using High Order Markov Chains Trained At Multiple Granularity
2021 IEEE 6th International Forum on Research and Technology for Society and Industry (RTSI)
Journal papers by idio guarino

IEEE Transactions on Network and Service Management, 2025
Generative Artificial Intelligence (GenAI) models such as LLMs, GPTs, and Diffusion Models have recently gained widespread attention from both the research and the industrial communities. This survey explores their application in network monitoring and management, focusing on prominent use cases, as well as challenges and opportunities. We discuss how network traffic generation and classification, network intrusion detection, networked system log analysis, and network digital assistance can benefit from the use of GenAI models. Additionally, we provide an overview of the available GenAI models, datasets for large-scale training phases, and platforms for the development of such models. Finally, we discuss research directions that potentially mitigate the roadblocks to the adoption of GenAI for network monitoring and management. Our investigation aims to map the current landscape and pave the way for future research in leveraging GenAI for network monitoring and management.

Elsevier Computer Networks, 2022