Papers by Steven Prestwich

CEUR workshop proceedings, Dec 1, 2019
Human activity recognition is an area of interest in various domains such as elderly and health care, smart buildings and surveillance, with multiple approaches to solving the problem accurately and efficiently. For many years hand-crafted features were manually extracted from raw data signals, and activities were classified using support vector machines and hidden Markov models. To further improve on this method and to extract relevant features in an automated fashion, deep learning methods have been used. The most common of these methods are Long Short-Term Memory models (LSTMs), which can take the sequential nature of the data into consideration and outperform existing techniques, but which have two main pitfalls: longer training times and loss of distant past memory. A relatively new type of network, the Temporal Convolutional Network (TCN), overcomes these pitfalls, as it takes significantly less time to train than LSTMs and also has a greater ability to capture long-term dependencies. When paired with a Convolutional Auto-Encoder (CAE) to remove noise and reduce the complexity of the problem, our results show that both models perform equally well, achieving state-of-the-art results, but when tested for robustness on temporal data the TCN outperforms the LSTM. The results also show that, for industry applications, the TCN can accurately be used for fall detection or similar events within a smart-building environment.
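The listing does not include code; purely as an illustration of the kind of TCN architecture the abstract refers to, the sketch below stacks dilated causal 1D convolutions with residual connections in Keras. The window length, channel count, number of classes and layer sizes are assumptions for the example, not the authors' configuration.

```python
# Minimal sketch of a TCN-style classifier for windowed sensor data.
# Shapes (128-sample windows, 6 sensor channels, 6 activity classes) are
# illustrative assumptions, not the configuration used in the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

def tcn_block(x, filters, dilation):
    # Dilated causal convolutions with a residual connection.
    y = layers.Conv1D(filters, kernel_size=3, dilation_rate=dilation,
                      padding="causal", activation="relu")(x)
    y = layers.Conv1D(filters, kernel_size=3, dilation_rate=dilation,
                      padding="causal", activation="relu")(y)
    if x.shape[-1] != filters:
        x = layers.Conv1D(filters, kernel_size=1)(x)  # match channel count
    return layers.Add()([x, y])

inputs = layers.Input(shape=(128, 6))            # (time steps, channels)
x = inputs
for d in (1, 2, 4, 8):                           # exponentially growing receptive field
    x = tcn_block(x, filters=64, dilation=d)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(6, activation="softmax")(x)  # one unit per activity class
model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```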

arXiv (Cornell University), Jan 7, 2018
We propose denoising dictionary learning (DDL), a simple yet effective protection measure against adversarial perturbations. We examined denoising dictionary learning on MNIST and CIFAR10 perturbed by two different techniques, the fast gradient sign method (FGSM) and Jacobian saliency maps (JSMA). We evaluated it against five different deep neural networks (DNNs) representing the building blocks of most recent architectures, forming a successive progression of model complexity. We show that each model tends to capture different representations based on its architecture. For each model we recorded its accuracy both on the perturbed test data previously misclassified with high confidence and on the denoised data after reconstruction using dictionary learning. The reconstruction quality of each data point is assessed by means of the Peak Signal to Noise Ratio (PSNR) and the Structural Similarity Index (SSI). We show that after applying DDL, the reconstruction of the original data point from a noisy sample results in a correct prediction with high confidence.
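As a rough sketch of the denoising-dictionary-learning idea (learn a dictionary on clean data, then reconstruct perturbed inputs by sparse coding before classification), the snippet below uses scikit-learn. The dataset (8x8 digits), the random perturbation, the dictionary size and the logistic-regression classifier are stand-ins, not the paper's FGSM/JSMA attacks or DNN models.

```python
# Learn a sparse dictionary on clean data, then denoise perturbed inputs by
# reconstructing them from their sparse codes before classification.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)
X = X / 16.0                                    # scale pixels to [0, 1]

# Stand-in for an adversarial perturbation (the paper uses FGSM/JSMA).
rng = np.random.default_rng(0)
X_adv = np.clip(X + 0.25 * np.sign(rng.normal(size=X.shape)), 0.0, 1.0)

# Learn an overcomplete dictionary on the clean data.
dico = MiniBatchDictionaryLearning(n_components=128, alpha=1.0,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=8,
                                   random_state=0).fit(X)

# Denoise by projecting perturbed samples onto the dictionary.
codes = dico.transform(X_adv)
X_denoised = np.clip(codes @ dico.components_, 0.0, 1.0)

def psnr(a, b):
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(1.0 / mse)             # peak value is 1.0 after scaling

clf = LogisticRegression(max_iter=1000).fit(X, y)   # stand-in classifier
print("PSNR of denoised data :", round(psnr(X, X_denoised), 2))
print("accuracy on perturbed :", clf.score(X_adv, y))
print("accuracy on denoised  :", clf.score(X_denoised, y))
```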

Feature selection is used to select a subset of relevant features in machine learning, and is vital for simplification, improving efficiency and reducing overfitting. In filter-based feature selection, a statistic such as correlation or entropy is computed between each feature and the target variable to evaluate feature relevance. A relevance threshold is typically used to limit the set of selected features, and features can also be removed based on redundancy (similarity to other features). Some methods are designed for use with a specific statistic or certain types of data. We present a new filter-based method called Relevance-Redundancy Dominance that applies to mixed data types, can use a wide variety of statistics, and does not require a threshold. Finally, we provide preliminary results through extensive numerical experiments on public credit datasets.

Feature selection strategies based on filter methods have received attention from many researchers in statistics and machine learning. Their advantages are that they are fast, independent of the classifier/predictor method, scalable and easy to interpret. The RELIEF algorithm estimates the quality of attributes according to how well their values distinguish between instances that are near to each other. It can deal with discrete and continuous features but was initially limited to two-class problems. An extension, ReliefF, not only deals with multiclass problems but is also more robust and capable of dealing with incomplete and noisy data. The Relief family of methods is especially attractive because they may be applied in all situations, have low bias, include interaction among features and may capture local dependencies that other methods miss. However, they select features based only on relevance and do not remove redundant features.

Correlation-based Feature Selection (CFS) [8] is a simple filter algorithm that ranks feature subsets according to a correlation-based heuristic evaluation function. The bias of the evaluation function is toward subsets that contain features that are highly correlated with the class and uncorrelated with each other. Irrelevant features should be ignored because they will have low correlation with the class. Redundant features should be screened out as they will be highly correlated with one or more of the remaining features. An improved CFS variant, the Fast Correlation-Based Filter (FCBF) method, is based on symmetrical uncertainty (SU), defined as the ratio between the information gain (IG) and the entropy (H) of two features (a worked example of SU is sketched after this entry). FCBF was designed for high-dimensional data and has been shown to be effective in removing both irrelevant and redundant features (although it fails to take into consideration interactions between features). The INTERACT algorithm [28] uses the same goodness measure as the FCBF filter, but also includes the consistency contribution (c-contribution). The c-contribution of a feature indicates how significantly the elimination of that feature will affect consistency. The algorithm consists of two major parts. In the first part, the features are ranked in descending order based on their SU values. In the second part, features are evaluated one by one starting from the end of the ranked feature list. If the c-contribution of a feature is less than a given threshold the feature is removed, otherwise it is selected.
Finally, Minimum Redundancy-Maximum Relevance (MRMR) is a heuristic framework which minimizes redundancy while maximizing relevance, using a series of relevance and redundancy measures to select promising features for both continuous and discrete data sets. In particular, for discrete variables it applies mutual information, while for continuous variables it mainly uses the F-test and correlation.

The proposed method, RRD, is a univariate filter-based feature selection method which can use any suitable statistic to select a good subset of features in a dataset. The statistic can
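For concreteness, the snippet below computes the symmetrical uncertainty statistic mentioned above in its usual form, SU(X, Y) = 2 IG(X; Y) / (H(X) + H(Y)), and uses it to rank two toy discrete features. The data and the ranking loop are illustrative only and do not reproduce FCBF, INTERACT or the proposed RRD method.

```python
# Symmetrical uncertainty for discrete features, used here to rank features
# against a target.  Toy data; not the paper's method or experiments.
import numpy as np
from collections import Counter

def entropy(values):
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(x, y):
    # IG(X; Y) = H(Y) - H(Y | X)
    x, y = np.asarray(x), np.asarray(y)
    h_y_given_x = 0.0
    for v in np.unique(x):
        mask = (x == v)
        h_y_given_x += mask.mean() * entropy(y[mask])
    return entropy(y) - h_y_given_x

def symmetrical_uncertainty(x, y):
    denom = entropy(x) + entropy(y)
    return 0.0 if denom == 0 else 2.0 * information_gain(x, y) / denom

# Toy example: feature f1 is informative about the target, f2 is pure noise.
target = [0, 0, 0, 1, 1, 1, 1, 0]
f1     = [0, 0, 0, 1, 1, 1, 1, 1]
f2     = [0, 1, 0, 1, 0, 1, 0, 1]
for name, feat in sorted({"f1": f1, "f2": f2}.items(),
                         key=lambda kv: symmetrical_uncertainty(kv[1], target),
                         reverse=True):
    print(name, round(symmetrical_uncertainty(feat, target), 3))
```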
Value Interchangeability in Scenario Generation
Springer eBooks, 2013
Several types of symmetry have been identified and exploited in Constraint Programming, leading to large reductions in search time. We present a novel application of one such form of symmetry: detecting dynamic value interchangeability in the random variables of a 2-stage stochastic problem. We use a real-world problem from the literature: finding an optimal investment plan to strengthen a transportation network, given that a future earthquake probabilistically destroys links in the network. Detecting interchangeabilities enables us to bundle together many equivalent scenarios, drastically reducing the size of the problem and allowing the exact solution of cases previously considered intractable and solved only approximately.

Springer eBooks, 1995
Constraint logic programming has been applied to cost minimization problems such as job-shop scheduling with some success, using the (depth-first) branch and bound method. Recent work has shown that problem-specific heuristics can improve the performance of CLP systems on combinatorial optimisation problems. In this paper we take an orthogonal approach, by developing a generic parallel branch and bound strategy which improves existing CLP strategies in several ways: by avoiding the sometimes prohibitive overheads common to existing implementations; by speeding up convergence to optimal solutions; and by speeding up the proof of optimality for suboptimal solutions. The latter two improvements exploit parallelism in novel ways, which can be smoothly integrated with Or-parallelism. We evaluate these ideas on a set of job-shop scheduling problems, in some cases achieving order of magnitude speedups.

Remote Sensing, Mar 29, 2020
Scene classification is an important aspect of image/video understanding and segmentation. However, remote-sensing scene classification is a challenging image recognition task, partly due to the limited training data, which causes deep-learning Convolutional Neural Networks (CNNs) to overfit. Another difficulty is that images often have very different scales and orientation (viewing angle). Yet another is that the resulting networks may be very large, again making them prone to overfitting and unsuitable for deployment on memory- and energy-limited devices. We propose an efficient deep-learning approach to tackle these problems. We use transfer learning to compensate for the lack of data, and data augmentation to tackle varying scale and orientation. To reduce network size, we use a novel unsupervised learning approach based on k-means clustering, applied to all parts of the network: most network reduction methods use computationally expensive supervised learning methods, and apply only to the convolutional or fully connected layers, but not both. In experiments, we set new standards in classification accuracy on four remote-sensing and two scene-recognition image datasets.
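As a rough illustration of weight clustering for network reduction (not the authors' scheme), the sketch below clusters the filters of a single convolutional layer with k-means and replaces each filter by its centroid; the layer shape and cluster count are made up.

```python
# Compress one conv layer by k-means weight sharing: cluster its filters and
# replace each filter with its cluster centroid.  Illustrative shapes only.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Pretend these are trained weights of a conv layer: 64 filters of size 3x3x32.
filters = rng.normal(size=(64, 3, 3, 32))

flat = filters.reshape(len(filters), -1)        # one row per filter
k = 16                                          # keep only 16 distinct filters
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(flat)

# Each original filter is replaced by its centroid; storage drops roughly
# from 64 filters to 16 centroids plus 64 small cluster indices.
compressed = km.cluster_centers_[km.labels_].reshape(filters.shape)

error = np.linalg.norm(filters - compressed) / np.linalg.norm(filters)
print("relative reconstruction error:", round(error, 3))
```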

arXiv (Cornell University), Mar 8, 2019
When creating benchmarks for SAT solvers, we need SAT instances that are easy to build but hard to solve. A recent development in the search for such methods has led to the Balanced SAT algorithm, which can create k-SAT instances with m clauses of high difficulty, for arbitrary k and m. In this paper we introduce the No-Triangle SAT algorithm, a SAT instance generator based on the cluster coefficient graph statistic. We empirically compare the two algorithms by fixing the arity and the number of variables, but varying the number of clauses. The hardest instances that we find are produced by No-Triangle SAT. Furthermore, difficult instances from No-Triangle SAT have a different number of clauses than difficult instances from Balanced SAT, potentially allowing a combination of the two methods to find hard SAT instances for a larger array of parameters.
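The sketch below only illustrates the general idea of steering a random k-SAT generator with a triangle-based graph statistic: candidate clauses are rejected when they would close a triangle in the variable-interaction graph using edges from more than one clause (for 3-SAT the triangle inside a single clause is unavoidable). It is an assumption-laden toy, not the No-Triangle SAT algorithm itself.

```python
# Toy triangle-aware random k-SAT generator (idea sketch only).
import random
from itertools import combinations

def generate(n_vars, n_clauses, k=3, seed=0, max_tries=100000):
    rng = random.Random(seed)
    adj = {v: set() for v in range(1, n_vars + 1)}   # variable-interaction graph
    clauses, tries = [], 0
    while len(clauses) < n_clauses and tries < max_tries:
        tries += 1
        vs = rng.sample(range(1, n_vars + 1), k)
        # Reject if two chosen variables are already adjacent, or share a
        # common neighbour outside the clause: either case would create a
        # triangle whose edges come from different clauses.
        if any(b in adj[a] or (adj[a] & adj[b]) - set(vs)
               for a, b in combinations(vs, 2)):
            continue
        for a, b in combinations(vs, 2):
            adj[a].add(b)
            adj[b].add(a)
        clauses.append([v if rng.random() < 0.5 else -v for v in vs])
    return clauses

cnf = generate(n_vars=100, n_clauses=150)
print(len(cnf), "clauses generated; first three:", cnf[:3])
```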
Chapter 2. CNF Encodings
Frontiers in Artificial Intelligence and Applications, 2021
Before a combinatorial problem can be solved by current SAT methods, it must usually be encoded in conjunctive normal form, which facilitates algorithm implementation and allows a common file format for problems. Unfortunately there are several ways of encoding most problems and few guidelines on how to choose among them, yet the choice of encoding can be as important as the choice of search algorithm. This chapter reviews theoretical and empirical work on encoding methods, including the use of Tseitin encodings, the encoding of extensional and intensional constraints, the interaction between encodings and search algorithms, and some common sources of error. Case studies are used for illustration.
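As a minimal example of the Tseitin idea mentioned above, the snippet below introduces an auxiliary variable t for the subformula (a AND b) and emits the three clauses defining t <-> (a AND b) in DIMACS-style integer notation.

```python
# Tseitin-style definition of an auxiliary variable for a conjunction.
# Positive integers are variables, negation is a negative integer.
def tseitin_and(t, a, b):
    # t -> a, t -> b, (a AND b) -> t
    return [[-t, a], [-t, b], [-a, -b, t]]

a, b, t = 1, 2, 3
print(tseitin_and(t, a, b))    # [[-3, 1], [-3, 2], [-1, -2, 3]]
```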

Constraint satisfaction problems can be SAT-encoded in more than one way, and the choice of encoding can be as important as the choice of search algorithm. Theoretical results are few but experimental comparisons have been made between encodings, using both local and backtrack search algorithms. This paper compares local search performance on seven encodings of graph colouring benchmarks. Two of the encodings are new, and one of these gives generally better results than known encodings. We also find better results than expected for the log encoding and one of its variants.
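For reference, the snippet below builds the standard direct encoding of graph k-colouring (one Boolean variable per vertex-colour pair); the paper's two new encodings and the log encoding are not reproduced here.

```python
# Direct SAT encoding of graph k-colouring: x(v, c) is true iff vertex v
# receives colour c.  Shown only as a reference point for the comparison.
def direct_encoding(edges, n_vertices, k):
    var = lambda v, c: v * k + c + 1          # map (vertex, colour) to a DIMACS variable
    clauses = []
    for v in range(n_vertices):
        clauses.append([var(v, c) for c in range(k)])        # at least one colour
        for c1 in range(k):
            for c2 in range(c1 + 1, k):
                clauses.append([-var(v, c1), -var(v, c2)])   # at most one colour
    for (u, w) in edges:
        for c in range(k):
            clauses.append([-var(u, c), -var(w, c)])         # adjacent vertices differ
    return clauses

# A triangle needs 3 colours.
cnf = direct_encoding(edges=[(0, 1), (1, 2), (0, 2)], n_vertices=3, k=3)
print(len(cnf), "clauses")
```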

A Generic Approach to Combining Stochastic Algorithms With Systematic Constraint Solvers
Constraint satisfaction and optimization problems arise in many artificial intelligence applications, and a variety of algorithms have been proposed for their solution. Two quite different approaches are currently in competition: systematic algorithms such as forward checking with tree-search, and stochastic algorithms such as hill climbing with penalty functions. Neither has been shown to be consistently better than the other, and combining features of both is an active research area. This paper investigates a generic approach to creating hybrids of algorithms from the two classes. Our method performs stochastic search in the space of search strategies for a systematic constraint solver; the solver is used to probe the constrained space in order to compute an objective function. The idea is straightforward and clearly very flexible, but at first glance it appears prohibitively expensive; it is also not obvious how much effort should be spent evaluating a strategy. We describe a cheap evaluation technique, and show that for sufficiently large problems the hybrids may outperform their systematic and stochastic counterparts. There is a significant overhead which degrades the hybrids' performance on smaller problems in particular, but we describe ways of greatly reducing this overhead. The paper is structured as follows: Section 2 surveys existing approaches to solving constraint problems; Section 3 describes our method; Section 4 evaluates its performance; Section 5 describes its extension to constrained optimization problems; Section 6 mentions related work; and Section 7 draws some conclusions and discusses future work.
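A minimal sketch of the generic scheme described above, under several simplifying assumptions: the "strategy" is just a variable ordering for a backtracking solver on 8-queens, the probe is a node-limited depth-first search, and the stochastic search is plain hill climbing over orderings. The paper's cheap evaluation technique is not reproduced.

```python
# Hill-climb in the space of search strategies (variable orderings) for a
# systematic solver, scoring each strategy with a bounded solver probe.
import random

N = 8  # board size for n-queens (columns are variables, rows are values)

def consistent(assign, var, val):
    return all(val != v and abs(val - v) != abs(var - u)
               for u, v in assign.items())

def probe(order, node_budget=500):
    # Depth-first search following the given variable order, with a node
    # budget; returns (solved, nodes used) as a cheap quality estimate.
    nodes = 0
    def dfs(i, assign):
        nonlocal nodes
        if i == len(order):
            return True
        var = order[i]
        for val in range(N):
            nodes += 1
            if nodes > node_budget:
                return False
            if consistent(assign, var, val):
                assign[var] = val
                if dfs(i + 1, assign):
                    return True
                del assign[var]
        return False
    return dfs(0, {}), nodes

def hill_climb(iters=100, seed=0):
    rng = random.Random(seed)
    order = list(range(N))
    best_solved, best_nodes = probe(order)
    for _ in range(iters):
        cand = order[:]
        i, j = rng.sample(range(N), 2)          # mutate: swap two variables
        cand[i], cand[j] = cand[j], cand[i]
        solved, nodes = probe(cand)
        # Prefer strategies that solve within the budget using fewer nodes.
        if (solved, -nodes) > (best_solved, -best_nodes):
            order, best_solved, best_nodes = cand, solved, nodes
    return order, best_solved, best_nodes

print(hill_climb())
```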
Hybrid local search on two multicolouring models

Proceedings of the 1993 ACM SIGPLAN symposium on Partial evaluation and semantics-based program manipulation - PEPM '93, 1993
Systems must be guided by an unfolding strategy, telling them which atoms to unfold and when to stop unfolding. Online strategies exploit knowledge accumulated during the unfolding itself, for example in a goal stack, while offline strategies are fixed before unfolding begins. Online strategies are more powerful, but a major overhead for large programs is the analysis time spent on each atom, which increases as the knowledge grows. We describe an online strategy whose analysis time for each atom is independent of the amount of knowledge about that atom. This reduces transformation times for programs with large search spaces by an order of magnitude, while retaining the power of online analysis. Correctness, termination and nontriviality are shown.
The Relation Between Complete and Incomplete Search
Studies in Computational Intelligence, 2008
This chapter compares complete and incomplete search methods, discusses hybrid approaches, contrasts modelling techniques, and speculates that the boundary between the two is more blurred than it might seem.
Symmetry breaking and implied constraints can speed up both exhaustive search and the search for a single solution. We experiment with both types of constraint, using three search algorithms (backtracking, local and hybrid) to find single solutions for SAT encodings of three combinatorial problems (clique, set cover and balanced incomplete block design generation). Both show strong positive and negative effects.
Balanced Incomplete Block Design as Satisfiability
Balanced incomplete block design generation is a standard combinatorial problem from design theory. Constraint programming has recently been applied to the problem using a mixture of binary and non-binary constraints, with special techniques for symmetry breaking. We describe a new binary constraint model and apply search algorithms indirectly via satisfiability encoding. The encoded problems turn out to be hard for current algorithms, and symmetry
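For readers unfamiliar with the problem, a BIBD with parameters <v, b, r, k, lambda> can be viewed as a v x b 0/1 incidence matrix with fixed row sums, column sums and pairwise row scalar products; the small checker below verifies the Fano plane, the classic <7, 7, 3, 3, 1> design, against these constraints. This is a worked example of the matrix model, not the paper's SAT encoding.

```python
# Check the BIBD constraints: rows sum to r, columns sum to k, and every
# pair of rows has scalar product lambda.
from itertools import combinations

def is_bibd(matrix, v, b, r, k, lam):
    assert len(matrix) == v and all(len(row) == b for row in matrix)
    rows_ok = all(sum(row) == r for row in matrix)
    cols_ok = all(sum(row[j] for row in matrix) == k for j in range(b))
    pairs_ok = all(sum(x * y for x, y in zip(r1, r2)) == lam
                   for r1, r2 in combinations(matrix, 2))
    return rows_ok and cols_ok and pairs_ok

fano = [
    [1, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
]
print(is_bibd(fano, v=7, b=7, r=3, k=3, lam=1))   # True
```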

Coloration Neighbourhood Search With Forward Checking
Annals of Mathematics and Artificial Intelligence - AMAI, 2002
Two contrasting search paradigms for solving combinatorial problems are systematic backtracking and local search. The former is often effective on highly structured problems because of its ability to exploit consistency techniques, while the latter tends to scale better on very large problems. Neither approach is ideal for all problems, and a current trend in artificial intelligence is the hybridisation of search techniques. This paper describes a use of forward checking in local search: pruning coloration neighbourhoods for graph colouring. The approach is evaluated on standard benchmarks and compared with several other algorithms. Good results are obtained; in particular, one variant finds improved colourings on geometric graphs, while another is very effective on equipartite graphs. Its application to other combinatorial problems is discussed.

Electronic Notes in Theoretical Computer Science, 2001
In software development, a metric is the measurement of some characteristic of a program's performance. When developing software for parallel architectures, metrics can play a very useful role in tuning properties such as task granularity and load balancing. A current approach to parallel software development is the use of analysis or visualisation tools to reveal important aspects of the parallel execution, often by post-execution analysis of trace files. For example, it may be useful to detect the creation of a large number of fine-grained tasks, which cause significant runtime overheads. A simple metric to estimate this danger is the number of parallel tasks created per second. This may be computed by dividing the trace into time slots, and counting the number of created tasks in each slot. Such metrics require parallel events like task creation and completion to be assigned time stamps. However, precise times are hard to obtain in asynchronous parallel systems. The difficulty is exacerbated by the well-known "probe effect" whereby the act of monitoring performance affects the performance itself. This inaccuracy may render a metric meaningless if it relies on the order in which events occur, or on the precise duration of tasks. Another danger is that the choice of time slot length may create artifacts, which becomes obvious if a visualisation tool allows the user to zoom in on parts of the trace file. The designer of parallel performance metrics must therefore take great care to ensure that they are meaningful.

This paper argues that small inaccuracies in event times should not produce large effects on metrics, and hence on numbers, graphs, pictures or animations produced by analysis or visualisation tools. This is not sufficient to guarantee good metrics but it is a useful necessary condition. The requirement that small changes have small effects is characteristic of continuous functions, and it is therefore proposed that metrics be defined as continuous functions of event times. To bridge the gap between discrete events and continuous functions, metrics can be defined as integrals (over time slots) of simpler functions called trace abstractions. A trace abstraction need not be continuous in time: it may be a simple step function that changes value when an event occurs. Such functions can easily be integrated over time slots, and a trace abstraction obeying certain conditions yields a metric that is insensitive to event time inaccuracy. Practical sufficient conditions are provided for such trace abstractions, and rules are provided for the composition of complex trace abstractions from simple ones. As a bonus, continuous metrics are shown to be insensitive to time slot length and hence to be well-behaved under changes in time scale.
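A small sketch of the central idea, under illustrative assumptions: the number of live tasks is treated as a step function of event times (a trace abstraction), and the per-slot metric is its integral over the slot divided by the slot length, so small shifts in event times change the metric only slightly.

```python
# Metric as the integral of a step-function trace abstraction over a time
# slot.  The event trace below is made up for illustration.
def mean_live_tasks(events, slot_start, slot_end):
    # events: sorted list of (time, delta) with delta = +1 (create) / -1 (finish)
    level, prev, area = 0, slot_start, 0.0
    for t, delta in events:
        if t >= slot_end:
            break
        if t > slot_start:
            area += level * (t - prev)   # constant level since the last event
            prev = t
        level += delta
    area += level * (slot_end - prev)    # tail of the slot
    return area / (slot_end - slot_start)

trace = [(0.1, +1), (0.4, +1), (0.9, -1), (1.3, +1), (1.7, -1), (2.5, -1)]
for start in (0.0, 1.0, 2.0):
    print(f"slot [{start}, {start + 1.0}):",
          round(mean_live_tasks(trace, start, start + 1.0), 3))
```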
Workshop on Modelling and …, 2004
Constrained CP-nets. Steve Prestwich (1), Francesca Rossi (2), Kristen Brent Venable (2), Toby Walsh (1). (1) Cork Constraint Computation Centre, University College Cork, Ireland. Email: s.prestwich@cs.ucc.ie, tw@4c.ucc.ie. (2) Department of Pure and Applied Mathematics, ...