Papers by Christos Anagnostopoulos

IoT
Pervasive computing applications deal with the intelligence surrounding users that can facilitate their activities. This intelligence is provided in the form of software components incorporated in embedded systems or devices in close proximity to end users. One example of an infrastructure that can host intelligent pervasive services is the Edge Computing (EC) ecosystem. EC nodes can execute a number of tasks for data collected by devices present in the Internet of Things (IoT). In this paper, we propose an intelligent, proactive task management model based on demand. Demand depicts the number of users or applications interested in using the available tasks in EC nodes, thus characterizing their popularity. We rely on a Deep Machine Learning (DML) model, and more specifically on a Long Short-Term Memory (LSTM) network, to learn the distribution of demand indicators for each task and estimate the future interest in them. This information is combined with historical observations …
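A minimal sketch of the demand-forecasting step described above, assuming a per-task demand time series and a next-step prediction target; the window length, network size, and training loop are illustrative choices, not the paper's configuration:

```python
import torch
import torch.nn as nn

class DemandLSTM(nn.Module):
    """Toy LSTM that predicts the next demand value of one task."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                     # x: (batch, window, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])       # next-step demand estimate

demand = torch.rand(200, 1)                   # toy per-task demand series
windows = demand.unfold(0, 16, 1)             # sliding windows of length 16
X = windows[:-1].transpose(1, 2)              # (samples, 16, 1)
y = demand[16:]                               # next-step targets

model = DemandLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(50):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()
    opt.step()

print(model(X[-1:]).item())                   # forecast future interest in the task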
Data quality-aware task offloading in Mobile Edge Computing: An Optimal Stopping Theory approach
Future Generation Computer Systems

Applied Intelligence
Lack of knowledge about the underlying data distribution in distributed large-scale data can be an obstacle when issuing analytics and predictive modelling queries. Analysts find themselves having a hard time finding analytics/exploration queries that satisfy their needs. In this paper, we study how exploration query results can be predicted in order to avoid the execution of ‘bad’/non-informative queries that waste network, storage, and financial resources, as well as time, in a distributed computing environment. The proposed methodology involves clustering a training set of exploration queries along with the cardinality of the results (score) they retrieved, and then using query-centroid representatives to proceed with predictions. After the training phase, we propose a novel refinement process to increase the reliability of predicting the score of new unseen queries based on the refined query representatives. Comprehensive experimentation with real datasets shows that more reliable predictions …
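The cluster-then-predict idea can be pictured with a small sketch; the fixed-length query encoding, the number of clusters, and the toy score function are all assumptions for the example, not the paper's methodology:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
queries = rng.uniform(0, 1, size=(500, 4))               # toy query vectors
scores = queries.sum(axis=1) + rng.normal(0, 0.1, 500)   # toy result cardinalities

km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(queries)
centroid_score = np.array([scores[km.labels_ == k].mean() for k in range(20)])

def predict_score(q):
    """Predict an unseen query's score from its nearest centroid representative."""
    return centroid_score[km.predict(q.reshape(1, -1))[0]]

# Skip executing queries whose predicted score marks them as non-informative.
print(predict_score(rng.uniform(0, 1, 4)))
```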

Computing
Data management at the edge of the network can increase the performance of applications, as the processing is realized close to end users, limiting the observed latency in the provision of responses. Typical data processing involves the execution of queries/tasks defined by users or applications asking for responses in the form of analytics. Query/task execution can be realized at the edge nodes, which can undertake the responsibility of delivering the desired analytics to the interested users or applications. In this paper, we deal with the problem of allocating queries to a number of edge nodes. The aim is to further reduce latency by allocating queries to nodes that exhibit a low load and high processing speed, so that they can respond in the minimum time. Before any allocation, we propose a method for estimating the computational burden that a query/task will add to a node, and afterwards we proceed with the final assignment. The allocation is concluded …
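The allocation goal reads as a cost-minimization rule; below is a hedged sketch of one such rule, with invented node parameters and a scalar load estimate standing in for the paper's burden-estimation method:

```python
from dataclasses import dataclass

@dataclass
class EdgeNode:
    name: str
    load: float    # outstanding work units
    speed: float   # work units processed per second

def estimated_cost(query_size: float, node: EdgeNode) -> float:
    """Estimated seconds until the node finishes the query, given its load."""
    return (node.load + query_size) / node.speed

def allocate(query_size: float, nodes: list[EdgeNode]) -> EdgeNode:
    best = min(nodes, key=lambda n: estimated_cost(query_size, n))
    best.load += query_size    # commit the query to the chosen node
    return best

nodes = [EdgeNode("n1", 4.0, 2.0), EdgeNode("n2", 1.0, 1.0)]
print(allocate(3.0, nodes).name)   # picks the node that responds soonest
```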

Information
In recent years, there has been a significant increase in the use of mobile devices and their applications. Meanwhile, cloud computing has been considered the latest generation of computing infrastructure. There has also been a transformation of cloud computing ideas and their implementation so as to meet the demand for the latest applications. Mobile edge computing (MEC) is a computing paradigm that provides cloud services near the users at the edge of the network. Given the movement of mobile nodes between different MEC servers, the main aim is to connect to the best server at the right time, in terms of server load, in order to optimize the quality of service (QoS) of the mobile nodes. We tackle the offloading decision-making problem by adopting the principles of optimal stopping theory (OST) to minimize the execution delay in a sequential decision manner. A performance evaluation is provided using real-world data sets with baseline deterministic and …
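A hedged illustration of an OST-style sequential offloading rule: candidate servers are probed one at a time, each probe costs extra delay c, and the node stops at the first server whose load clears a reservation threshold. The uniform load model and the one-stage look-ahead rule are assumptions, not the paper's exact formulation:

```python
import random

def reservation_threshold(c: float) -> float:
    # For loads X ~ Uniform(0, 1), the threshold T solves E[(T - X)^+] = c,
    # i.e., T^2 / 2 = c, hence T = sqrt(2c).
    return (2 * c) ** 0.5

def choose_server(loads, c=0.02):
    T = reservation_threshold(c)
    for i, load in enumerate(loads):
        if load <= T or i == len(loads) - 1:
            return i, load          # stop here: offload to this server

loads = [random.random() for _ in range(10)]   # sequentially observed loads
print(choose_server(loads))
```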

International Journal of Data Science and Analytics
Regression analytics has been the standard approach to modeling the relationship between input and output variables, while recent trends aim to incorporate advanced regression analytics capabilities within data management systems (DMS). Linear regression queries are fundamental to exploratory analytics and predictive modeling. However, computing their exact answers leaves a lot to be desired in terms of efficiency and scalability. We contribute a novel predictive analytics model and an associated statistical learning methodology, which are efficient, scalable and accurate in discovering piecewise linear dependencies among variables by observing only regression queries and their answers issued to a DMS. We focus on in-DMS piecewise linear regression, and specifically on predicting the answers to mean-value aggregate queries, identifying and delivering the piecewise linear dependencies between variables to regression queries, and predicting the data-dependent variables within specific data subspaces defined by analysts and data scientists. Our goal is to discover, only through query-answer pairs, a piecewise linear approximation of the underlying data function that is competitive with the best piecewise linear approximation to the ground truth. Our methodology is analyzed, evaluated and compared with exact solutions and near-perfect approximations of the underlying relationships among variables, achieving orders-of-magnitude improvement in analytics processing.
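One way to picture learning piecewise linear dependencies from query-answer pairs alone is to cluster query centers and fit a local linear model per cluster; the data, partitioning, and model choice below are illustrative assumptions rather than the paper's learning methodology:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
q_centers = rng.uniform(-1, 1, size=(1000, 2))           # observed query centers
answers = np.where(q_centers[:, 0] < 0,                  # piecewise-linear ground truth
                   1 + 2 * q_centers[:, 1],
                   -1 + 0.5 * q_centers[:, 1]) + rng.normal(0, 0.05, 1000)

km = KMeans(n_clusters=8, n_init=10, random_state=1).fit(q_centers)
models = [LinearRegression().fit(q_centers[km.labels_ == k],
                                 answers[km.labels_ == k]) for k in range(8)]

def predict(query):
    """Answer a mean-value query from the linear model of its data subspace."""
    k = km.predict(query.reshape(1, -1))[0]
    return models[k].predict(query.reshape(1, -1))[0]

print(predict(np.array([-0.5, 0.3])))   # close to 1 + 2 * 0.3
```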

Applied Intelligence
We introduce a predictive modeling solution that provides high-quality predictive analytics over aggregation queries in Big Data environments. Our predictive methodology is generally applicable in environments in which large-scale data owners may or may not restrict access to their data and allow only aggregation operators like COUNT to be executed over their data. In this context, our methodology is based on historical queries and their answers to accurately predict the answers of ad-hoc queries. We focus on the widely used set-cardinality, i.e., COUNT, aggregation query, as COUNT is a fundamental operator both for internal data system optimizations and for aggregation-oriented data exploration and predictive analytics. We contribute a novel, query-driven Machine Learning (ML) model whose goals are to: (i) learn the query-answer space from past issued queries, (ii) associate the query space with local linear regression and associative function estimators, (iii) define query similarity, and (iv) predict the cardinality of the answer set of unseen incoming queries, referred to as the Set Cardinality Prediction (SCP) problem. Our ML model incorporates incremental ML algorithms for ensuring high-quality prediction results. The significance of the contribution lies in that it (i) is the only query-driven solution applicable over general Big Data environments, which include restricted-access data, and (ii) offers incremental learning adjusted for …
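The incremental, query-driven flavour of SCP can be sketched with an online regressor that is updated per observed (query, COUNT) pair; the query encoding and the choice of SGDRegressor are stand-ins for the paper's estimators:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor(learning_rate="constant", eta0=0.01)
rng = np.random.default_rng(2)

# Stream of past issued queries and their answered cardinalities.
for _ in range(2000):
    q = rng.uniform(0, 1, size=(1, 3))      # e.g., range-query boundaries
    count = np.array([100 * q.sum()])       # toy true answer-set cardinality
    model.partial_fit(q, count)             # incremental learning step

unseen = rng.uniform(0, 1, size=(1, 3))
print(model.predict(unseen))                # predicted COUNT for an unseen query
```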

Applied Intelligence
We focus on Internet of Things (IoT) environments where a network of sensing and computing devices is responsible for locally processing contextual data, reasoning, and collaboratively inferring the appearance of a specific phenomenon (event). Pushing processing and knowledge inference to the edge of the IoT network allows the complexity of the event reasoning process to be distributed into many manageable pieces and to be physically located at the source of the contextual information. This enables a huge amount of rich data streams to be processed in real time that would be prohibitively complex and costly to deliver on a traditional centralized Cloud system. We propose a lightweight, energy-efficient, distributed, adaptive, multiple-context-perspective event reasoning model under uncertainty on each IoT device (sensor/actuator). Each device senses and processes context data and infers events based on different local context perspectives: (i) expert knowledge on event representation, (ii) outliers inference, and (iii) deviation from locally predicted context. This novel approximate reasoning paradigm is achieved through a contextualized, collaborative, belief-driven clustering process, where clusters of devices are formed according to their belief in the presence of events. Our distributed and federated intelligence model …
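A toy fusion of the three local perspectives listed above; the weighting scheme and threshold are invented for illustration, and the paper's collaborative belief-driven clustering is only hinted at by the final comparison:

```python
def event_belief(rule_match: float, outlier_score: float,
                 prediction_error: float, w=(0.5, 0.3, 0.2)) -> float:
    """Combine the three local perspectives (each in [0, 1]) into one belief."""
    return w[0] * rule_match + w[1] * outlier_score + w[2] * prediction_error

# Devices whose beliefs agree would be clustered together; here we simply
# flag which devices locally believe the event is present.
beliefs = [event_belief(1.0, 0.8, 0.6), event_belief(0.1, 0.2, 0.0)]
print([b > 0.5 for b in beliefs])   # [True, False]
```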
Advanced Principal Component-Based Compression Schemes for Wireless Sensor Networks
ACM Transactions on Sensor Networks

Evolving Systems
We rest on the edge computing paradigm, where pushing processing and inference to the edge of the Internet of Things (IoT) allows the complexity of predictive analytics to be distributed into smaller pieces physically located at the source of the contextual information. This enables a huge amount of rich contextual data to be processed in real time that would be prohibitively complex and costly to deliver on a traditional centralized Cloud. We propose a lightweight, distributed, predictive intelligence mechanism that supports communication-efficient aggregation and predictive modeling within the edge network. Our idea is based on the capability of the edge nodes to (1) monitor the evolution of the sensed time-series contextual data, (2) locally determine (through prediction) whether to disseminate contextual data in the edge network or not, and (3) locally reconstruct undelivered contextual data in light of minimizing the required communication interaction at the expense of accurate analytics tasks. Based on this online decision making, we eliminate data transfer at the edge of the network, thus saving network resources, by exploiting the evolving nature of the captured contextual data. We provide a comprehensive analytical, experimental and comparative evaluation of the proposed mechanism against other mechanisms found in the literature over real contextual datasets, and show the benefits stemming from its adoption in edge computing environments.
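A minimal dual-prediction sketch of points (1)-(3): sender and receiver share a predictor, the sender transmits a reading only when the predictor misses by more than a tolerance, and the receiver reconstructs suppressed readings from the shared state. The hold-last-value predictor and the tolerance are assumptions, not the paper's mechanism:

```python
def dual_prediction(series, tol=0.5):
    est = series[0]                       # shared state at sender and receiver
    sent, reconstructed = 1, [est]        # the first reading is always sent
    for x in series[1:]:
        if abs(x - est) > tol:            # predictor misses: transmit reading
            est = x
            sent += 1
        reconstructed.append(est)         # receiver's view of the stream
    return sent, reconstructed

sent, rec = dual_prediction([10, 10.1, 10.2, 12, 12.1, 12.05])
print(sent, rec)                          # 2 transmissions instead of 6
```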

A History Model for Activity Coordination Towards Attack Probabilistic Reasoning
The Pervasive Computing paradigm has raised issues such as conceptual semantic descriptions and the management of ambient information resources. Probabilistic theory, on the other hand, provides knowledge representation schemes that model uncertainty. History models related to activities exploit both semantic and probabilistic modeling. Issues such as attack prediction and the classification of activities and intentions are of high importance in ubiquitous environments. In this paper, we propose a novel history context model that distills the history of certain activity-oriented contexts in order to infer collaborative activities leading to attacks. We evaluate this model by focusing on attack context modeling, based on a distributed honeynet Intrusion Detection System (IDS) architecture. A Breadth-and-Depth Bayesian classifier and a probabilistic inference algorithm over well-defined fuzzy conceptual information, by means of an ontology, are introduced for context histories.
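The probabilistic side of the model can be pictured as a sequential Bayesian update of the attack posterior over an activity history; the likelihood table below is invented purely for illustration:

```python
def posterior_attack(prior, observations, likelihoods):
    """likelihoods[obs] = (P(obs | attack), P(obs | benign))."""
    p = prior
    for obs in observations:
        la, lb = likelihoods[obs]
        p = la * p / (la * p + lb * (1 - p))   # Bayes rule per observed context
    return p

likelihoods = {"port_scan": (0.7, 0.05), "login_fail": (0.4, 0.1)}
print(posterior_attack(0.01, ["port_scan", "login_fail"], likelihoods))
```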

Journal of Parallel and Distributed Computing, Jul 1, 2011
We investigate the delivery of information in ad hoc networks. We consider information sources and information consumers, and the network in between. Information has a certain quality indicator that fades over time. Consumers (applications that process incoming data) can receive and process disseminated information from its generation time until the associated quality reaches the lowest possible level. We adopt optimal stopping theory and an optimal online search algorithm in order to study the problem of optimally scheduling information consumption. The assumptions of our study include an efficient epidemic information dissemination scheme, which is nowadays a popular scheme for wireless sensor networks. We adopt the latter scheme in a combined setting where receiving nodes delay the reporting of information to applications in search of better quality, while the overall network optimizes transmissions through the epidemic abstraction. Our findings are quite promising for the engineering of delay-tolerant applications (and the relevant middleware) in ad hoc networks.
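A hedged, simulation-style illustration of the stopping problem: a node holds a copy whose quality fades each step, may receive a fresher epidemic copy, and reports once waiting no longer improves the expected quality. The decay/arrival model and the myopic one-stage look-ahead rule are assumptions, not the paper's optimal policy:

```python
import random

def report_time(q0=0.5, horizon=20, decay=0.9, p_new=0.3):
    q = q0
    for t in range(horizon):
        expected_next = p_new * 1.0 + (1 - p_new) * decay * q
        if q >= expected_next:        # one-stage look-ahead: stop and report
            return t, q
        q *= decay                    # keep waiting: quality fades
        if random.random() < p_new:   # a fresh full-quality copy arrives
            q = 1.0
    return horizon, q

print(report_time())                  # (reporting step, reported quality)
```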
Context awareness in mobile computing: a survey
On the use of epidemical information spreading in mobile computing environments
Intelligent Contextual Information Collection in Internet of Things
International Journal of Wireless Information Networks, 2016
Semantic web service discovery: methods, algorithms and tools
Swarm intelligence in autonomic computing: the particle swarm optimization case

Big Data, 2015
Solving the missing-value (MV) problem with small estimation errors in large-scale data environments is a notoriously resource-demanding task. The most widely used MV imputation approaches are computationally expensive because they explicitly depend on the volume and the dimension of the data. Moreover, as datasets and their user communities continuously grow, the problem can only be exacerbated. In an attempt to deal with this problem, in our previous work [1] we introduced a novel framework coined Pythia, which employs a number of distributed data nodes (cohorts), each of which contains a partition of the original dataset. To perform MV imputation, Pythia, based on specific machine and statistical learning structures (signatures), selects the most appropriate subset of cohorts to locally perform a Missing Value substitution Algorithm (MVA). This selection relies on the principle that the selected subset of cohorts maintains the most relevant partition of the dataset. In addition, as Pythia uses only part of the dataset for imputation and accesses different cohorts in parallel, it improves efficiency, scalability and accuracy compared against a single machine (coined Godzilla), which uses the entire massive dataset to compute imputation requests. Although this paper is an extension of our previous work, we particularly investigate the robustness of the Pythia framework and show that Pythia is independent of any MVA and signature construction algorithms. To facilitate our research, we considered two well-known MVAs (namely the K-nearest neighbor and expectation-maximization imputation algorithms), as well as two machine and neural computational learning signature construction algorithms based on adaptive vector quantization and competitive learning. We provide comprehensive experiments to assess the performance of Pythia against Godzilla and showcase the benefits stemming from this framework.
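A schematic of the cohort-routing idea, assuming cohort centroids as a crude stand-in for the learned signatures and sklearn's KNNImputer as the local MVA; in the paper the signatures are learned structures and the MVAs are pluggable:

```python
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(3)
cohorts = [rng.normal(loc=c, size=(200, 3)) for c in (0.0, 5.0, 10.0)]
signatures = [c.mean(axis=0) for c in cohorts]           # one summary per cohort

def impute(record):
    """Route the incomplete record to the most relevant cohort, impute there."""
    filled = np.nan_to_num(record)                       # crude routing view
    k = int(np.argmin([np.linalg.norm(filled - s) for s in signatures]))
    local = np.vstack([cohorts[k], record])
    return KNNImputer(n_neighbors=5).fit_transform(local)[-1]

print(impute(np.array([5.1, np.nan, 4.8])))              # imputed value near 5.0
```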
Epidemic Dissemination Controlled by Wireless Channel Awareness
We propose a cross-layer scheme for regulating power consumption in energy-constrained ad hoc wireless networks where information is disseminated in an adaptive epidemic manner. A time-optimized mechanism based on Optimal Stopping Theory utilises noise fluctuations to trigger the adaptation of transmission characteristics in such a fashion that a perceived net reward is maximized. This results in enhanced data delivery while the energy cost remains modest.
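The trigger can be sketched as a reservation-level stopping rule on sequentially observed channel quality, with a small delay cost per slot; the uniform quality model is an assumption, not the paper's channel model:

```python
import random

def transmit_when_good(c=0.02, max_slots=50):
    # For quality X ~ Uniform(0, 1) and cost c per slot, the reservation
    # level T solves E[(X - T)^+] = c, i.e., (1 - T)^2 / 2 = c.
    T = 1 - (2 * c) ** 0.5
    for slot in range(max_slots):
        quality = random.random()     # perceived channel condition this slot
        if quality >= T:
            return slot, quality      # adapt transmission characteristics now
    return max_slots - 1, None

print(transmit_when_good())
```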

Journal of Systems and Software, 2015
Waste Management (WM) represents an important part of Smart Cities (SCs), with a significant impact on modern societies. WM involves a set of processes ranging from waste collection to the recycling of the collected materials. The proliferation of sensors and actuators enables the new era of the Internet of Things (IoT), which can be adopted in SCs and help in WM. Novel approaches that involve dynamic routing models combined with IoT capabilities could provide solutions that outperform existing models. In this paper, we focus on an SC where a number of collection bins are located in different areas with sensors attached to them. We study a dynamic waste collection architecture based on data retrieved by sensors. We pay special attention to the possibility of immediate WM service in high-priority areas, e.g., schools or hospitals, where the presence of dangerous waste or negative effects on human quality of living may impose the need for immediate collection. This is crucial when we focus on sensitive groups of citizens such as pupils, the elderly, or people living close to areas where dangerous waste is discarded. We propose novel algorithms aiming to provide efficient and scalable solutions to the dynamic waste collection problem through the management of the trade-off between immediate collection and its cost. We describe how the proposed system effectively responds to the demand as realized by sensor observations and alerts originating in high-priority areas. Our aim is to minimize the time required for serving high-priority areas while keeping the average expected performance at a high level. Comprehensive simulations on top of data retrieved from an SC validate the proposed algorithms on both quantitative and qualitative criteria, which are adopted to analyze their strengths and weaknesses. Local authorities could choose the model that best matches the needs and resources of each city.
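An illustrative scheduler for the stated trade-off: high-priority bins are served as soon as they alert, while ordinary bins wait until their sensed fill level justifies the collection cost. The thresholds and the bin model are invented for the example:

```python
from dataclasses import dataclass

@dataclass
class Bin:
    area: str
    fill: float          # sensed fill level in [0, 1]
    high_priority: bool  # e.g., near a school or hospital

def collection_plan(bins, fill_threshold=0.8):
    # High-priority bins alert early; ordinary bins wait until nearly full.
    urgent = [b for b in bins if b.high_priority and b.fill > 0.2]
    due = [b for b in bins if not b.high_priority and b.fill >= fill_threshold]
    # Serve urgent areas first, then the fullest ordinary bins.
    return urgent + sorted(due, key=lambda b: b.fill, reverse=True)

bins = [Bin("school", 0.4, True), Bin("mall", 0.9, False), Bin("park", 0.5, False)]
print([b.area for b in collection_plan(bins)])   # ['school', 'mall']
```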