Privacy in Graphs
2012, 5th Ares Plenary Meeting. Advanced Research on Information Security and Privacy.
Abstract
In this brief presentation on graph anonymization, we look at some graph modifier operators and different types of adversary information queries.
Related papers
2008
In this paper, we focus on the problem of preserving the privacy of sensitive relationships in graph data. We refer to the problem of inferring sensitive relationships from anonymized graph data as link re-identification. We propose five different privacy preservation strategies, which vary in terms of the amount of data removed (and hence their utility) and the amount of privacy preserved.
Technical Report, TR–IIIA–2010–04, IIIA-CSIC, 2010
In this document we review the state of the art on graph privacy with special emphasis on applications to online social networks, and we review how six different operators modify local topologies when activity data is included. We consider an aspect which has not been greatly covered in the specialized literature on graph privacy: the addition, deletion and disaggregation of nodes. We also cover the following key considerations: (i) choice of six different operators to modify the graph; (ii) simulated annealing to find the optimum graph using a fitness function based on information loss and disclosure risk; (iii) use of heuristics to choose graph elements (nodes, edges) to be modified, as a probability weighted by the distribution of an element's statistical characteristics (degree, clustering coefficient and path length) in the original graph; (iv) re-linking of nodes: a heuristic which finds the topology whose statistical characteristics are closest to those of the original neighborhood; (v) in the case of the aggregation of two nodes, we choose adjacent nodes rather than isomorphic topologies, in order to maintain the overall structure of the graph; (vi) incorporation of network activity as a weight on the topology characteristics; (vii) a statistically knowledgeable attacker who is able to search for regions of the graph based on statistical characteristics and map those onto a given node and its immediate neighborhood.
Journal of Telecommunications and Information Technology, 2019
The growing popularity of social networks and the increasing need for publishing related data mean that protection of privacy becomes an important and challenging problem in social networks. This paper describes the (k, l)-anonymity model used for social network graph anonymization. The method is based on edge addition and is utility-aware, i.e. it is designed to generate a graph that is similar to the original one. Different strategies are evaluated to this end and the results are compared based on common utility metrics. The outputs confirm that the naïve idea of adding some random or even minimum number of possible edges does not always produce useful anonymized social network graphs, thus creating some interesting alternatives for graph anonymization techniques.
2015
Massive graphs have become pervasive in a wide variety of data domains. However, they are generally more difficult to anonymize because the structural information buried in a graph can be leveraged by an attacker to breach sensitive attributes. Furthermore, the increasing sizes of graph data sets present a major challenge to anonymization algorithms. In this paper, we will address the problem of privacy-preserving data mining of massive graph data sets. We design a MapReduce framework to address the problem of attribute disclosure in massive graphs. We leverage the MapReduce framework to create a scalable algorithm that can be used for very large graphs. Unlike existing literature in graph privacy, our proposed algorithm focuses on the sensitive content at the nodes rather than on the structure. This is because content-centric perturbation at the nodes is a more effective way to prevent attribute disclosure than structural reorganization. One advantage of the approach is that s...
Lecture Notes in Computer Science, 2019
When releasing graph data (e.g., social network) to public or third parties, data privacy becomes a major concern. It has been shown that state-of-the-art graph anonymization techniques suffer from a lack of strong defense against De-Anonymization (DA) attacks mostly because of the bias towards utility preservation. In this paper, we propose GAGA, an Efficient Genetic Algorithm for Graph Anonymization, that simultaneously delivers high anonymization and utility preservation. To address the vulnerability against DA attacks especially when the adversary can re-identify the victim not only based on some information about the neighbors of a victim but also some knowledge on the structure of the neighbors of the victim's neighbors, GAGA puts the concept of k(d)-neighborhood-anonymity into action by developing the first general algorithm for any d distance neighborhood. GAGA also addresses the challenge of applying minimum number of changes to the original graph to preserve data utilities via an effective and efficient genetic algorithm. Results of our evaluation show that GAGA anonymizes the graphs in a way that it is more resistant to modern DA attacks than existing techniques: GAGA (with d=3) improves the defense against DA techniques by reducing the DA rate by at least a factor of 2.7× in comparison to the baseline. At the same time it preserves the data utilities to a very high degree: it is the best technique for preserving 11 out of 16 utilities. Finally, GAGA provides application-oriented level of control to users via different tunable parameters.
ACM Transactions on Information and System Security
When people use social applications and services, their privacy faces a potentially serious threat. In this article, we present a novel, robust, and effective de-anonymization attack on mobility trace data and social data. First, we design a Unified Similarity (US) measurement, which takes account of local and global structural characteristics of data, information obtained from auxiliary data, and knowledge inherited from ongoing de-anonymization results. By analyzing the measurement on real datasets, we find that some data can potentially be de-anonymized accurately and the rest can be de-anonymized at a coarse granularity. Utilizing this property, we present a US-based De-Anonymization (DA) framework, which iteratively de-anonymizes data with an accuracy guarantee. Then, to de-anonymize large-scale data without knowledge of the overlap size between the anonymized data and the auxiliary data, we generalize DA to an Adaptive De-Anonymization (ADA) framework. By smartly working on ...
Frontiers in Big Data, 2019
Complex networks gathered from our online interactions provide a rich source of information that can be used to try to model and predict our behavior. While this has very tangible benefits that we have all grown accustomed to, there is a concrete privacy risk in sharing potentially sensitive data about ourselves and the people we interact with, especially when this data is publicly available online and unprotected from malicious attacks. k-anonymity is a technique aimed at reducing this risk by obfuscating the topological information of a graph that can be used to infer the nodes' identity. In this paper we propose a novel algorithm to enforce k-anonymity based on a well-known result in extremal graph theory, the Szemerédi regularity lemma. Given a graph, we start by computing a regular partition of its nodes. The Szemerédi regularity lemma ensures that such a partition exists and that the edges between the sets of nodes behave almost randomly. With this partition, we anonymize the graph by randomizing the edges within each set, obtaining a graph that is structurally similar to the original one yet the nodes within each set are structurally indistinguishable. We test the proposed approach on real-world networks extracted from Facebook. Our experimental results show that the proposed approach is able to anonymize a graph while retaining most of its structural information.
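The randomization step described in this abstract can be illustrated with a small sketch. It assumes the node partition is already given (computing a Szemerédi regular partition is not shown), and the function name `randomize_within_sets` and the edge-list representation are hypothetical choices made for this illustration, not the paper's implementation. Edges between different sets are left untouched, while edges inside each set are redrawn uniformly at random, keeping the number of edges per set fixed.

```python
import random
from itertools import combinations


def randomize_within_sets(edges, partition, rng=None):
    """Redraw the edges inside each partition set uniformly at random.

    edges:     iterable of (u, v) pairs of an undirected simple graph
    partition: list of node collections covering every node (assumed given)
    Returns a set of edges stored as (small, large) tuples.
    """
    rng = rng or random.Random()
    # Map each node to the index of the set that contains it.
    block_of = {node: i for i, block in enumerate(partition) for node in block}
    # Normalise edges and drop self-loops and duplicates.
    simple = {tuple(sorted(e)) for e in edges if e[0] != e[1]}

    result = set()
    inside = [0] * len(partition)
    for u, v in simple:
        if block_of[u] == block_of[v]:
            inside[block_of[u]] += 1   # count it, then redraw it later
        else:
            result.add((u, v))         # edges between sets are kept as-is

    for i, block in enumerate(partition):
        candidates = list(combinations(sorted(block), 2))
        # Same number of edges inside the set, placed at random positions.
        result.update(rng.sample(candidates, inside[i]))
    return result
```

Keeping the per-set edge count fixed is one simple way to stay close to the original density; it is an assumption of this sketch rather than something the abstract specifies.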
We propose a graph-based framework for privacy preserving data publication, which is a systematic abstraction of existing anonymity approaches and privacy criteria. Graph is explored for dataset representation, background knowledge specification, anonymity operation design, as well as attack inferring analysis. The framework is designed to accommodate various datasets including social networks, relational tables, temporal and spatial sequences, and even possible unknown data models. The privacy and utility measurements of the anonymity datasets are also quantified in terms of graph features. Our experiments show that the framework is capable of facilitating privacy protection by different anonymity approaches for various datasets with desirable performance.
ACM Transactions on Database Systems, 2014
We present efficient algorithms for releasing useful statistics about graph data while providing rigorous privacy guarantees. Our algorithms work on data sets that consist of relationships between individuals, such as social ties or email communication. The algorithms satisfy edge differential privacy, which essentially requires that the presence or absence of any particular relationship be hidden.
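As a minimal illustration of edge differential privacy, the sketch below releases the simplest possible graph statistic, the number of edges, with the standard Laplace mechanism. This is not one of the algorithms of the paper above; the function names are hypothetical. Adding or removing one relationship changes the edge count by exactly 1, so the sensitivity is 1 and Laplace noise of scale 1/ε suffices.

```python
import random


def count_undirected_edges(edges):
    """Number of distinct undirected edges; (u, v) and (v, u) count once."""
    return len({frozenset(e) for e in edges})


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_edge_count(edges, epsilon, rng=None):
    """Edge count released under epsilon-edge differential privacy.

    Neighbouring graphs differ in one edge, so the true count differs
    by at most 1 (sensitivity 1) and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return count_undirected_edges(edges) + laplace_noise(1.0 / epsilon, rng)
```

A smaller ε gives stronger protection but a noisier answer; for example, `private_edge_count(edges, epsilon=0.5)` adds noise with twice the scale of `epsilon=1.0`. The result is a real number, which a publisher may round afterwards without affecting the guarantee.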
With an abundance of social network data being released, the need to protect sensitive information within these networks has become an important concern of data publishers. To achieve this objective, various notions of k-anonymization have been proposed for social network graphs. In this paper we focus on the complexity of optimization problems that arise from trying to anonymize graphs, establishing that optimally k-anonymizing the label sequences of edge-labeled graphs is intractable. We show how this result implies intractability for other notions of k-anonymization in literature. We also consider the case of bipartite social network graphs which arise from the representation of distinct entities, such as movies and viewers, patients and drugs, or products and customers. Within this setting we demonstrate that, although k-anonymizing edge-labeled graphs is intractable for k ≥ 3, polynomial time algorithms exist for arbitrary bipartite graphs when k = 2 and for unlabeled bipartite graphs irrespective of the value of k. Finally, in this paper we extend the study of attribute disclosure within the context of social networks by defining t-closeness, a measure of how effectively an adversary can determine sensitive information about members of a k-anonymous social network.
