Papers by Emanuele Lattanzi

Research Square, Feb 2, 2024
Graphics Processing Units and Tensor Processing Units coupled with tiny machine learning models deployed on edge devices are revolutionizing computer vision and real-time tracking systems. However, edge devices often pose constraints regarding computational resources and power consumption. This paper proposes a visual-based virtual sensor paradigm that provides power-aware multi-object tracking at the edge while preserving tracking accuracy and enhancing privacy. The virtual sensor implements a new Dynamic Inference Power Manager (DIPM) based on an adaptive frame rate. We implement and deploy the virtual sensor and the DIPM on the NVIDIA Jetson Nano edge platform to prove the effectiveness and efficiency of our solutions. Extensive experimental results show that the proposed virtual sensor can achieve a 40% reduction in energy consumption, with a marginal decrease of less than 6% in tracking accuracy.
With the diffusion of Java in advanced multimedia mobile devices, there is a growing need for speeding up the execution of Java Bytecode beyond the limits of traditional interpreters and just-in-time compilers. In this area, Java coprocessors are viewed as a promising technology, which marries the flexibility of a general purpose microprocessor to run legacy code and lightweight Java methods, with the high performance of a specialized execution engine on speed-critical bytecode. This work proposes and analyzes a microprocessor with FPGA coprocessor architecture with efficient shared-memory communication support. Furthermore, we describe a complete run-time environment that supports dynamic migration of Java methods to the coprocessor, and we quantitatively analyze speedups achievable under a number of system configurations using an accurate complete-system simulator.

The management, protection and sharing of sensitive data such as those associated with the health domain are crucial in enabling personal care and contributing to worldwide medical advancements. Distributed Ledger Technologies (DLTs) allow for data-protection-compliant solutions in untrusted contexts that guarantee data immutability, protection and transparency when needed. This paper proposes an architecture based on DLTs, Smart Contracts and Distributed File Storage (DFS), enabling user data sovereignty, confidentiality and secure access control. A use case on health data is presented, where we apply a combination of DLT, DFS and an access control mechanism to allow users to distribute their data. Finally, we show an experimental evaluation of the overall architecture to demonstrate the feasibility of implementing practical DLT-based healthcare solutions. The results are collected through independent tests, available as open source, that verify the system's response time in each of its functions and as the load increases. The results are promising and show that the system is feasible and can scale as the load increases.
Decentralising the Internet of Medical Things with Distributed Ledger Technologies and Off-Chain Storages: A Proof of Concept
Springer eBooks, 2021
Energy Efficiency of Deep Learning Compression Techniques in Wearable Human Activity Recognition
IFIP advances in information and communication technology, 2023

Kluwer Academic Publishers eBooks, Dec 21, 2005
Virtual memory is considered to be an unlimited resource in desktop or notebook computers with high storage memory capabilities. However, in wireless mobile devices like palmtops and personal digital assistants (PDAs), storage memory is limited or absent due to weight, size and power constraints. As a consequence, swapping over remote memory devices can be considered a viable alternative. Nevertheless, power-hungry wireless network interface cards (WNICs) may limit the battery lifetime and application performance if not efficiently exploited. In this work we explore the performance and energy of network swapping in comparison with swapping on local micro-drives and flash memories. Our study points out that remote swapping over power-manageable WNICs can be more efficient than local swapping and that both energy and performance can be optimized through power-aware reshaping of data requests. Experimental results show that our optimization technique can save up to 60% of communication energy while improving performance.

A Prim–Dijkstra Algorithm for Multihop Calibration of Networked Embedded Systems
IEEE Internet of Things Journal, Jul 15, 2021
The development of large-scale systems of networked embedded devices with sensing capabilities relies on the availability of low-cost and resource-constrained components. However, the reduced precision and accuracy of low-cost sensors on board low-power platforms risks impairing the overall reliability of these systems, thus preventing their potential diffusion, especially in deployments with several (e.g., hundreds or more) nodes. Hence, ensuring the required quality of measurements along the lifetime of a sensor network represents a key challenge, which is often tackled by means of calibration techniques. In this article, we propose a novel approach to multihop calibration, targeting the derivation of a spanning tree that encompasses the optimization of a biobjective problem. Indeed, since minimum spanning trees can be related to the energy budget of a network and shortest path trees can be used as a model for the minimization of the cumulative calibration errors, the search for a spanning tree that simultaneously optimizes the two metrics represents a useful direction toward the design of effective and efficient calibration strategies. To this aim, we introduce a method based on the Prim–Dijkstra algorithm, an effective heuristic for finding solutions that trade off the accuracy of multihop calibration against the energy expenditure needed to calibrate sensors. The proposed approach allows fast derivation of different solutions by means of a single parameter, thus enabling the efficient exploration of the design space even in large-scale scenarios, as confirmed by numerical results obtained for validation.
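The single-parameter tradeoff mentioned above can be illustrated with the classic Prim–Dijkstra blend known from the general literature. The following is a minimal sketch, not the authors' calibration method: the function name `prim_dijkstra_tree`, the graph encoding, and the use of one edge weight for both the energy cost and the calibration error are all assumptions made for illustration. With `c = 0` the growth rule reduces to Prim (minimum spanning tree), and with `c = 1` it reduces to Dijkstra (shortest path tree).

```python
import heapq
import itertools


def prim_dijkstra_tree(graph, root, c):
    """Grow a spanning tree from `root` (hypothetical helper).

    graph: dict mapping node -> dict of neighbor -> edge weight (undirected).
    c:     tradeoff parameter in [0, 1]; 0 behaves like Prim, 1 like Dijkstra.
    Returns (parent, dist): tree parent of each node and its path length
    from the root measured inside the tree.
    """
    parent, dist = {}, {}
    tie = itertools.count()  # tie-breaker so heap entries never compare nodes
    heap = [(0.0, next(tie), 0.0, root, None)]
    while heap:
        _key, _, path_len, node, par = heapq.heappop(heap)
        if node in dist:
            continue  # node was already attached through a better entry
        dist[node] = path_len
        parent[node] = par
        for nbr, w in graph[node].items():
            if nbr not in dist:
                # Blended key: c * (path length to node) + edge weight.
                key = c * path_len + w
                heapq.heappush(heap, (key, next(tie), path_len + w, nbr, node))
    return parent, dist
```

Sweeping `c` between 0 and 1 yields a family of trees between the two extremes, which is the kind of design-space exploration the abstract describes.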
Energy-aware Tiny Machine Learning for Sensor-based Hand-washing Recognition
On the Decentralization of Health Systems for Data Availability: a DLT-based Architecture
Do we need early exit networks in human activity recognition?
Engineering Applications of Artificial Intelligence, May 1, 2023

IEEE Embedded Systems Letters, Mar 1, 2019
Energy harvesting is increasingly considered a key technology for the design of autonomous embedded systems. However, the design, deployment and validation of systems that exploit energy scavenged from the environment to sustain their operation pose considerable research challenges, especially in a networked context. Emulation is regarded as an achievable option to allow reproducible and accurate experimental conditions. However, when the emulation is carried out by means of an embedded low-power device, the tradeoff between accuracy and tight time requirements has to be carefully taken into account in order to avoid performance degradation. We introduce a novel approach that aims at improving the reactiveness of a hardware-software embedded emulator thanks to the introduction of a hardware compensation circuit. The proposed system allows an efficient run-time correction of the emulated voltage to be supplied to the load, thus improving the response time of the emulator with respect to the software-based compensation.

Experimental evaluation of the impact of packet length on wireless sensor networks subject to interference
Computer Networks, Feb 1, 2020
Wireless sensor networks are nowadays considered an enabling technology for a wide spectrum of cyber-physical systems applications. However, in order to cope with stringent dependability and energy efficiency requirements, several research challenges have to be solved. Electromagnetic interference, for instance, adversely affects wireless communication, resulting in increased packet collisions and network congestion, and also increasing the energy consumption of devices. Highlighting the complex interplay between communication under interference and parameters of sensor networks is therefore mandatory for driving design choices and improving system performance. In this work we propose an experimental study of the reliability and energy efficiency of IEEE 802.15.4 compliant sensor networks under controlled interference, as a function of the packet length. The results of an extensive set of experiments on an ample range of low-power, asynchronous medium access protocols point out the trade-off between energy consumption and robustness to interference and also provide a comparative view of the protocols, thus indicating useful guidelines in the choice and in the design of several critical components.

International Journal of Embedded Systems, 2005
With the diffusion of Java in advanced multimedia mobile devices, there is a growing need for speeding up the execution of Java bytecode beyond the limits of traditional interpreters and just-in-time compilers. In this area, Java coprocessors are viewed as a promising technology, which marries the flexibility of a general-purpose microprocessor to run legacy code and lightweight Java methods, with the high performance of a specialised execution engine on speed-critical bytecode. This work proposes and analyses a microprocessor with FPGA coprocessor architecture with efficient shared-memory communication support. Furthermore, we describe a complete run-time environment that supports dynamic migration of Java methods to the coprocessor, and we quantitatively analyse speedups achievable under a number of system configurations using an accurate complete-system simulator.
SmartRoadSense - A collaborative project for monitoring road surface conditions

Fast Distributed Consensus Through Path Averaging on Random Walks
Wireless Personal Communications, May 26, 2017
Distributed computation of average consensus is an important function in numerous wireless sensor network and ad-hoc network applications. In light of the severe resource constraints characterizing these embedded networked systems, the design of effective algorithms with low computational, communication, and energy requirements is of paramount importance. Randomized gossip algorithms are a category of network algorithms that entail communication and information exchange among nodes selected with probabilistic mechanisms. They are considered attractive solutions for solving consensus problems because of their decentralized nature, conceptual simplicity, and capability of adapting to structural network modifications. However, the number of iterations needed by randomized gossip to converge towards consensus is an issue to be addressed to reduce latency and to limit the impact of the algorithm on the lifetime of battery-operated devices. Solutions to the problem have been proposed that make use of localization and greedy geographic routing to build overlay graphs on top of a randomized gossip protocol, thus enabling information averaging among non-neighbor nodes and accelerating convergence. The main drawback of this type of approach is that each node is required to know its position, which is not always affordable in terms of costs, or not even possible because of the lack of GPS signal. In this article we propose a novel approach to the consensus averaging problem based on a random walk mechanism to identify the nodes involved in a computation round, and on a path averaging technique to update more than two nodes in the same round. The resulting algorithm doesn’t rely on any positioning mechanism. Experimental results provide evidence of the improved performance of the proposed method with respect to standard randomized gossip, with considerable gain in terms of convergence speed.
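The idea of one computation round, as described in the abstract, can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' protocol: the function name `path_averaging_round`, the fixed walk length, the uniform choice of the starting node, and the centralized in-memory update are all hypothetical simplifications. A random walk selects the nodes of the round, and every distinct node on the path is set to the path average, which preserves the global sum and therefore the network-wide mean.

```python
import random


def path_averaging_round(values, neighbors, walk_len, rng):
    """Run one round of path averaging (hypothetical helper).

    values:    dict node -> current local estimate (updated in place).
    neighbors: dict node -> list of adjacent nodes.
    walk_len:  number of nodes visited by the random walk (assumed fixed).
    rng:       a random.Random instance, passed in for reproducibility.
    """
    node = rng.choice(list(values))  # assumed: uniform random start node
    path = [node]
    for _ in range(walk_len - 1):
        node = rng.choice(neighbors[node])  # step to a random neighbor
        path.append(node)
    visited = set(path)  # a walk may revisit nodes; count each one once
    avg = sum(values[n] for n in visited) / len(visited)
    for n in visited:
        values[n] = avg  # all nodes on the path adopt the path average
    return visited
```

Because each round averages over distinct nodes only, no positioning information is needed, and a walk longer than two nodes updates more than a pair of estimates at once.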
A Study on the Energy Sustainability of Early Exit Networks for Human Activity Recognition
IEEE Transactions on Sustainable Computing

Machine Learning Techniques to Identify Unsafe Driving Behavior by Means of In-Vehicle Sensor Data
Expert Systems with Applications, 2021
Traffic crashes are one of the biggest causes of accidental death in the world, where more than 1.35 million people die every year. In most of them, the main cause is related to the driver’s behavior. The driver performs a set of actions on the vehicle commands, such as steering, braking, accelerating or changing gear, which generate a direct response of the vehicle, or other tasks, such as visual, auditory, or haptic tasks (e.g. looking for items, listening to the radio, and using a smartphone), which can still affect driving safety. In this work we propose a methodology based on machine learning techniques aimed at recognizing safe and unsafe driving behaviors by means of in-vehicle sensor data. Starting from these signals we compute a set of descriptive features capable of accurately describing the behavior of the driver. Two different classification tools, namely Support Vector Machines and feed-forward neural networks, have been trained and tested on a publicly available dataset containing more than 26 hours of total driving time. The classification results report an average accuracy above 90% for both classifiers, and the McNemar test shows no performance difference between the models at the 0.05 significance level, demonstrating a concrete possibility of identifying unsafe driving using in-vehicle sensor data.
Lightweight Accurate Trigger to Reduce Power Consumption in Sensor-Based Continuous Human Activity Recognition

IEEE Access
In the human activity recognition (HAR) application domain, the use of deep learning (DL) algorithms for feature extraction and training purposes delivers significant performance improvements with respect to the use of traditional machine learning (ML) algorithms. However, this comes at the expense of more complex and demanding models, making their deployment harder on the constrained devices traditionally involved in the HAR process. The efficiency of DL deployment is thus yet to be explored. We thoroughly investigated the application of TensorFlow Lite simple conversion, dynamic, and full integer quantization compression techniques. We applied those techniques not only to convolutional neural networks (CNNs), but also to long short-term memory (LSTM) networks, and a combined version of CNN and LSTM. We also considered two use case scenarios, namely cascading compression and stand-alone compression mode. This paper reports the feasibility of deploying deep networks onto an ESP32 device, and how TensorFlow compression techniques impact classification accuracy, energy consumption, and inference latency. Results show that in the cascading case, it is not possible to carry out the performance characterization, whereas in the stand-alone case, dynamic quantization is recommended because it yields a negligible loss of accuracy. In terms of power efficiency, both dynamic and full integer quantization provide high energy saving with respect to the uncompressed models: between 31% and 37% for CNN networks, and up to 45% for LSTM networks. In terms of inference latency, dynamic and full integer quantization provide comparable performance.

Sensors
The increasing diffusion of tiny wearable devices and, at the same time, the advent of machine learning techniques that can perform sophisticated inference, represent a valuable opportunity for the development of pervasive computing applications. Moreover, pushing inference on edge devices can in principle improve application responsiveness, reduce energy consumption and mitigate privacy and security issues. However, devices with a small form factor and low power consumption, like those dedicated to wearable platforms, pose strict computational, memory, and energy requirements which result in challenging issues to be addressed by designers. The main purpose of this study is to empirically explore this trade-off through the characterization of memory usage, energy consumption, and execution time needed by different types of neural networks (namely multilayer and convolutional neural networks) trained for human activity recognition on board of a typical low-power wearable devic...