Proceedings of the AAAI Conference on Human Computation and Crowdsourcing
Understanding the dynamics of a crowdsourcing application and controlling the quality of the data it generates is challenging, partly due to the lack of tools to do so. Provenance is a domain-independent means to represent what happened in an application, which can help verify data and infer their quality. It can also reveal the processes that led to a data item and the interactions of contributors with it. Provenance patterns can manifest real-world phenomena such as a significant interest in a piece of content, providing an indication of its quality, or even issues such as undesirable interactions within a group of contributors. This paper presents an application-independent methodology for analyzing provenance graphs, constructed from provenance records, to learn about such patterns and to use them for assessing some key properties of crowdsourced data, such as their quality, in an automated manner. Validating this method on the provenance records of CollabMap, an online crowdsou...
Proceedings of the AAAI Conference on Artificial Intelligence
We consider the problem of task allocation in crowdsourcing systems with multiple complex workflows, each of which consists of a set of inter-dependent micro-tasks. We propose Budgeteer, an algorithm to solve this problem under a budget constraint. In particular, our algorithm first calculates an efficient way to allocate budget to each workflow. It then determines the number of inter-dependent micro-tasks and the price to pay for each task within each workflow, given the corresponding budget constraints. We empirically evaluate it on a well-known crowdsourcing-based text correction workflow using Amazon Mechanical Turk, and show that Budgeteer can achieve similar levels of accuracy to current benchmarks, but is on average 45% cheaper.
Adaptive Agents and Multi-Agents Systems, May 6, 2013
In this paper, we present AgentSwitch, a prototype agent-based platform to solve the electricity tariff selection problem. AgentSwitch incorporates novel algorithms to make predictions of hourly energy usage as well as detect (and suggest to the user) deferrable loads that could be shifted to off-peak times to maximise savings. To take advantage of group discounts from energy retailers, we develop a new scalable collective energy purchasing mechanism, based on the Shapley value, that ensures individual members of a collective (interacting through AgentSwitch) fairly share the discounts. To demonstrate the effectiveness of our algorithms we empirically evaluate them individually on real-world data (with up to 3000 homes in the UK) and show that they outperform the state of the art in their domains. Finally, to ensure individual components are accountable in providing recommendations, we provide a novel provenance-tracking service to record the flow of data in the system, and therefore provide users with a means of checking the provenance of suggestions from AgentSwitch and assessing their reliability.
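As an illustration of the Shapley-value idea mentioned in the abstract above, the following is a minimal sketch, not the scalable mechanism from the paper. It computes exact Shapley values by enumerating every ordering of the members, which is only feasible for a handful of homes. The household names, usage figures, and the `collective_discount` rule are all hypothetical.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over all orderings. Cost grows factorially with the number of players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


# Hypothetical annual consumption in kWh for three homes.
usage = {"home_a": 3000, "home_b": 5000, "home_c": 2000}


def collective_discount(coalition):
    """Hypothetical retailer rule: no discount for a single home; otherwise
    the group saves 0.005 currency units per kWh of its combined usage."""
    if len(coalition) < 2:
        return 0.0
    return 0.005 * sum(usage[h] for h in coalition)


shares = shapley_values(list(usage), collective_discount)
for home, share in shares.items():
    print(f"{home}: {share:.2f}")
```

Under these assumed figures the shares add up to the discount earned by the whole group, and homes that bring more consumption to the collective receive a larger share.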
Adaptive Agents and Multi-Agents Systems, May 5, 2014
Crowdsourcing is a multi-agent task allocation paradigm that involves up to millions of workers, of varying reliability and availability, performing large numbers of micro-tasks. A key challenge is to crowdsource, at minimal cost and with predictable accuracy, complex tasks that involve different types of interdependent micro-tasks structured into complex workflows. In this paper, we propose the first crowdsourcing algorithm that solves this problem. Our algorithm, called BudgetFix, determines the number of interdependent micro-tasks and the price to pay for each task given budget constraints. Moreover, BudgetFix provides quality guarantees on the accuracy of the output of each phase of a given workflow. BudgetFix is empirically evaluated on a well-known crowdsourcing-based text correction workflow using Amazon Mechanical Turk, and is shown to provide similar accuracy to the state-of-the-art algorithm for this workflow while being on average 32% cheaper.
SOFSEM 2018: Theory and Practice of Computer Science, 2017
In this paper we present UML2PROV, an approach addressing the gap between application design, through UML diagrams, and provenance design, using PROV-Template. PROV-Template is a declarative approach that enables software engineers to develop programs that generate provenance following the PROV standard. The main contributions of this paper are: (i) a mapping strategy from UML diagrams (UML State Machine and Sequence diagrams) to templates, (ii) a code generation technique that creates libraries, which can be deployed in an application by creating suitable artefacts for provenance generation, and (iii) a demonstration of the feasibility of UML2PROV implemented with Java, and a preliminary quantitative evaluation that shows benefits regarding aspects such as design, development and provenance capture.
Data provenance is a form of knowledge graph providing an account of what a system performs, describing the data involved, and the processes carried out over them. It is crucial to ascertaining the origin of data, validating their quality, auditing applications' behaviours, and, ultimately, making them accountable. However, instrumenting applications, especially legacy ones, to track the provenance of their operations remains a significant technical hurdle, hindering the adoption of provenance technology. UML2PROV is a software-engineering methodology that facilitates the instrumentation of provenance recording in applications designed with UML diagrams. It automates the generation of (1) templates for the provenance to be recorded and (2) the code to capture values required to instantiate those templates from an application at run time, both from the application’s UML diagrams. By so doing, UML2PROV frees application developers from manual instrumentation of provenance capturing whi...
Provenance network analytics is a novel data analytics approach that helps infer properties of data, such as quality or importance, from their provenance. Instead of analysing application data, which are typically domain-dependent, it analyses the data's provenance as represented using the World Wide Web Consortium's domain-agnostic PROV data model. Specifically, the approach proposes a number of network metrics for provenance data and applies established machine learning techniques over such metrics to build predictive models for some key properties of data. Applying this method to the provenance of real-world data from three different applications, we show that it can successfully identify the owners of provenance documents, assess the quality of crowdsourced data, and identify instructions from chat messages in an alternate-reality game with high levels of accuracy. By so doing, we demonstrate the different ways the proposed provenance network metrics can be used in analysing data, providing the foundation for provenance-based data analytics.
Major natural or man-made disasters such as Hurricane Katrina or the 9/11 terror attacks pose significant challenges for emergency responders. First, they have to develop an understanding of the unfolding event either using their own resources or through third parties such as the local population and agencies. Second, based on the information gathered, they need to deploy their teams in a flexible manner, ensuring that each team performs tasks in the most effective way. Third, given the dynamic nature of a disaster space, and the uncertainties involved in performing rescue missions, information about the disaster space and the actors within it needs to be managed to ensure that responders are always acting on up-to-date and trusted information. Against this background, this paper proposes a novel disaster response system called HAC-ER. HAC-ER interweaves humans and agents, both robotic and software, in social relationships that augment their individual and collective capabiliti...
PROV-TEMPLATE is a declarative approach that enables designers and programmers to design and generate provenance compatible with the PROV standard of the World Wide Web Consortium. Designers specify the topology of the provenance to be generated by composing templates, which are provenance graphs containing variables, acting as placeholders for values. Programmers write programs that log values and package them up in sets of bindings, a data structure associating variables and values. An expansion algorithm generates instantiated provenance from templates and sets of bindings in any of the serialisation formats supported by PROV. A quantitative evaluation shows that sets of bindings have a size that is typically 40 percent of that of expanded provenance templates and that the expansion algorithm is suitably tractable, operating in fractions of milliseconds for the type of templates surveyed in the article. Furthermore, the approach shows four significant software engineering benefits: separation of responsibilities, provenance maintenance, potential runtime checks and static analysis, and provenance consumption. The article gathers quantitative data and qualitative benefits descriptions from four different applications making use of PROV-TEMPLATE. The system is implemented and released in the open-source library ProvToolbox for provenance processing.
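The template-plus-bindings idea described in the abstract above can be pictured with a small sketch. This is a deliberately simplified illustration in plain Python, not the ProvToolbox API or the article's expansion algorithm: the tuple-based statement format, the `expand` function, and the `ex:` identifiers are assumptions made for the example, and each variable is bound to exactly one value.

```python
def expand(template, bindings):
    """Return a copy of the template in which every variable
    (a string starting with 'var:') is replaced by its bound value."""
    def substitute(term):
        if isinstance(term, str) and term.startswith("var:"):
            return bindings[term[len("var:"):]]
        return term

    return [tuple(substitute(term) for term in statement) for statement in template]


# Hypothetical template: a provenance graph with placeholders for a
# report and its author.
template = [
    ("entity", "var:report"),
    ("agent", "var:author"),
    ("wasAttributedTo", "var:report", "var:author"),
]

# Hypothetical set of bindings logged by the application at run time.
bindings = {"report": "ex:report1", "author": "ex:alice"}

expanded = expand(template, bindings)
for statement in expanded:
    print(statement)
```

The sketch shows the separation of responsibilities the abstract mentions: the template fixes the shape of the provenance once, while the application only has to log the small dictionary of values.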
ProvStore is the first online public provenance repository supporting the new PROV standards by W3C. It allows users and applications to store and (optionally) publish the provenance of their data on the Web. Provenance documents can be transformed, visualized, and shared in various serializations, with all the functionality also available to automated applications via a RESTful API (OAuth supported).
It is commonly believed that provenance can be utilised to form assessments about the quality, reliability or trustworthiness of data. Once presented with contradictory or questionable information, users can seek further validation by referring to its provenance. While there has been some effort to design principled methods to analyse provenance, the focus has mostly been on offline use of provenance. How to use provenance at runtime, i.e., as the application runs, to help users make decisions, has been barely investigated. In this paper, we propose a generic and application-independent approach to interpret provenance of data to make online decisions. We evaluate the system in CollabMap, an online crowd-sourcing mapping application, to make decisions about the quality of its data and to determine when the crowd's contributions to a task are deemed to be complete.
Provenance is a domain-independent means to represent what happened in an application, which can help verify data and infer data quality. Provenance patterns can manifest real-world phenomena such as a significant interest in a piece of content, providing an indication of its quality, or even issues such as undesirable interactions within a group of contributors. This paper presents an application-independent methodology for analyzing data based on the network metrics of provenance graphs to learn about such patterns and to relate them to data quality in an automated manner. Validating this method on the provenance records of CollabMap, an online crowdsourcing mapping application, we demonstrated an accuracy level of over 95% for the trust classification of data generated by the crowd therein.
AgentSwitch: towards smart electricity tariff selection
In this paper, we present AgentSwitch, a prototype agent-based platform to solve the electricity tariff selection problem. AgentSwitch incorporates novel algorithms to make predictions of hourly energy usage as well as detect (and suggest to the user) deferrable loads that could be shifted to off-peak times to maximise savings. To take advantage of group discounts from energy retailers, we develop a new scalable collective energy purchasing mechanism, based on the Shapley value, that ensures individual members of a collective (interacting through AgentSwitch) fairly share the discounts. To demonstrate the effectiveness of our algorithms we empirically evaluate them individually on real-world data (with up to 3000 homes in the UK) and show that they outperform the state of the art in their domains. Finally, to ensure individual components are accountable in providing recommendations, we provide a novel provenance-tracking service to record the flow of data in the system, and therefore provide users with a means of checking the provenance of suggestions from AgentSwitch and assessing their reliability.
Provenance for the People: A User-Centered Look at the W3C PROV Standard through an Online Game
In the information age, tools for examining the validity of data are invaluable. Provenance is one such tool, and the PROV model proposed by the World Wide Web Consortium in 2013 offers a means of expressing provenance in a machine-readable format. In this paper, we examine from a user’s standpoint notions of provenance, the accessibility of the PROV model, and the general attitudes towards history and the verifiability of information in modern data society. We do this through the medium of an online game designed to explore these issues and present the findings of the study along with a discussion of some of its implications.
Proceedings of the 5th Annual ACM Web Science Conference, 2013
In this paper, we present a software tool to help emergency planners at Hampshire County Council in the UK to create maps for high-fidelity crowd simulations that require evacuation routes from buildings to roads. The main feature of the system is a crowdsourcing mechanism that breaks down the problem of creating evacuation routes into microtasks that a contributor to the platform can execute in less than a minute. As part of the mechanism we developed a consensus-based trust mechanism that filters out incorrect contributions and ensures that the individual tasks are complete and correct. To drive people to contribute to the platform, we experimented with different incentive mechanisms and applied these over different time scales, the aim being to evaluate what incentives work with different types of crowds, including anonymous contributors from Amazon Mechanical Turk. The results of the 'in the wild' deployment of the system show that the system is effective at engaging contributors to perform tasks correctly and that users respond to incentives in different ways. More specifically, we show that purely social motives are not good enough to attract a large number of contributors and that contributors are averse to the uncertainty in winning rewards. Taken together, our results suggest that a combination of incentives may be the best approach to harnessing the maximum number of resources to get socially valuable tasks (such as for planning applications) performed on a large scale.
Provenance is a record that describes the people, institutions, entities, and activities involved in producing, influencing, or delivering a piece of data or a thing. The W3C Provenance Working Group has just published the PROV family of specifications, which include a data model for provenance on the Web. The working group introduces a notion of valid PROV document whose intent is to ensure that a PROV document represents a consistent history of objects and their interactions that is safe to use for the purpose of reasoning and other kinds of analysis. Valid PROV documents satisfy certain definitions, inferences, and constraints, specified in PROV-CONSTRAINTS. This paper discusses the design of ProvValidator, an online service for validating provenance documents according to PROV-CONSTRAINTS. It discusses the algorithmic design of the validator, the complexity of the algorithm, how we demonstrated compliance with the standard, and its REST API.
Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, 2006
Current computational trust models are usually built either on an agent's direct experience of an interaction partner (interaction trust) or reports provided by third parties about their experiences with a partner (witness reputation). However, both of these approaches have their limitations. Models using direct experience often result in poor performance until an agent has had a sufficient number of interactions to build up a reliable picture of a particular partner, and witness reports rely on self-interested agents being willing to freely share their experience. To this end, this paper presents Certified Reputation (CR), a novel model of trust that can overcome these limitations. Specifically, CR works by allowing agents to actively provide third-party references about their previous performance as a means of building up the trust in them of their potential interaction partners. By so doing, trust relationships can quickly be established with very little cost to the involved parties. Here we empirically evaluate CR and show that it helps agents pick better interaction partners more quickly than models that do not incorporate this form of trust.
Trust is a fundamental concern in large-scale open distributed systems. It lies at the core of all interactions between the entities that have to operate in such uncertain and constantly changing environments. Given this complexity, these components, and the ensuing system, are increasingly being conceptualised, designed, and built using agent-based techniques and, to this end, this paper examines the specific role of trust in multi-agent systems. In particular, we survey the state of the art and provide an account of the main directions along which research efforts are being focused. In so doing, we critically evaluate the relative strengths and weaknesses of the main models that have been proposed and show how, fundamentally, they all seek to minimise the uncertainty in interactions. Finally, we outline the areas that require further research in order to develop a comprehensive treatment of trust in complex computational settings.
Trust and reputation are central to effective interactions in open multi-agent systems (MAS) in which agents that are owned by a variety of stakeholders continuously enter and leave the system. This openness means existing trust and reputation models cannot readily be used since their performance suffers when there are various (unforeseen) changes in the environment. To this end, this paper presents FIRE, a trust and reputation model that integrates a number of information sources to produce a comprehensive assessment of an agent's likely performance in open systems. Specifically, FIRE incorporates interaction trust, role-based trust, witness reputation, and certified reputation to provide trust metrics in most circumstances. FIRE is empirically evaluated and is shown to help agents gain better utility (by effectively selecting appropriate interaction partners) than our benchmarks in a variety of agent populations. It is also shown that FIRE is able to effectively respond to changes that occur in an agent's environment.
Contemporary and near-future military coalition environments present a number of challenges for military planning. Not only must military planners create plans against a backdrop of strict time constraints and uncertain information, they must also coordinate their planning efforts with other planning staff (often from different organizational, linguistic and cultural communities). This paper examines the potential for semantic wikis to support collaborative planning activities in the face of these challenges. Whilst we do not claim ...
Papers by Trung Huỳnh