Privacy in Graphs

David Nettleton

2012, 5th Ares Plenary Meeting. Advanced Research on Information Security and Privacy.

Abstract

In this brief presentation on graph anonymization, we look at some graph modifier operators and different types of adversary information queries.

Lise Getoor

2008

In this paper, we focus on the problem of preserving the privacy of sensitive relationships in graph data. We refer to the problem of inferring sensitive relationships from anonymized graph data as link re-identification. We propose five different privacy preservation strategies, which vary in terms of the amount of data removed (and hence their utility) and the amount of privacy preserved.

Data Privacy for Simply Anonymized Social Network Logs - Some Novel Proposals for Graph Alteration

David Nettleton

Technical Report, TR–IIIA–2010–04, IIIA-CSIC, 2010

In this document we review the state of the art on graph privacy with special emphasis on applications to online social networks, and we review how six different operators modify local topologies when activity data is included. We consider an aspect which has not been greatly covered in the specialized literature on graph privacy: adding, deleting and disaggregating nodes. We also cover the following key considerations: (i) choice of six different operators to modify the graph; (ii) simulated annealing to find the optimum graph using a fitness function based on information loss and disclosure risk; (iii) use of heuristics to choose graph elements (nodes, edges) to be modified, as a probability weighted by the distribution of an element's statistical characteristics (degree, clustering coefficient and path length) in the original graph; (iv) re-linking of nodes: a heuristic which finds the topology whose statistical characteristics are closest to those of the original neighborhood; (v) in the case of the aggregation of two nodes, we choose adjacent nodes rather than isomorphic topologies, in order to maintain the overall structure of the graph; (vi) incorporation of network activity as a weight on the topology characteristics; (vii) a statistically knowledgeable attacker who is able to search for regions of the graph based on statistical characteristics and map those onto a given node and its immediate neighborhood.
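
Two of the per-node statistical characteristics named above, degree and clustering coefficient, can be computed directly from an edge list. The following is a minimal sketch under stated assumptions, not code from the technical report: the graph is undirected and simple, and the function names `build_adjacency`, `local_clustering` and `node_statistics` are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build {node: set(neighbors)} for an undirected simple graph.

    Self-loops are ignored (an assumption of this sketch).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    nbrs = adj[node]
    d = len(nbrs)
    if d < 2:
        return 0.0  # convention: undefined cases are reported as 0
    links = sum(1 for x, y in combinations(nbrs, 2) if y in adj[x])
    return 2.0 * links / (d * (d - 1))


def node_statistics(edges):
    """Return {node: (degree, clustering coefficient)} for every node."""
    adj = build_adjacency(edges)
    return {n: (len(adj[n]), local_clustering(adj, n)) for n in adj}
```

For a triangle `a`, `b`, `c` with an extra node `d` attached to `c`, node `a` has degree 2 and clustering coefficient 1, while `c` has degree 3 and clustering coefficient 1/3. Values like these could serve as the weights for choosing which elements to modify, or as the features a statistically knowledgeable attacker searches for.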

A Novel Graph-modification Technique for User Privacy-preserving on Social Networks

Hamideh Erfani

Journal of Telecommunications and Information Technology, 2019

The growing popularity of social networks and the increasing need for publishing related data mean that protection of privacy becomes an important and challenging problem in social networks. This paper describes the (k, l)-anonymity model used for social network graph anonymization. The method is based on edge addition and is utility-aware, i.e. it is designed to generate a graph that is similar to the original one. Different strategies are evaluated to this end and the results are compared based on common utility metrics. The outputs confirm that the naïve idea of adding some random or even minimum number of possible edges does not always produce useful anonymized social network graphs, thus creating some interesting alternatives for graph anonymization techniques.

Big Graph Privacy

Ken Barker

2015

Massive graphs have become pervasive in a wide variety of data domains. However, they are generally more difficult to anonymize because the structural information buried in the graph can be leveraged by an attacker to breach sensitive attributes. Furthermore, the increasing sizes of graph data sets present a major challenge to anonymization algorithms. In this paper, we will address the problem of privacy-preserving data mining of massive graph data sets. We design a MapReduce framework to address the problem of attribute disclosure in massive graphs. We leverage the MapReduce framework to create a scalable algorithm that can be used for very large graphs. Unlike existing literature in graph privacy, our proposed algorithm focuses on the sensitive content at the nodes rather than on the structure. This is because content-centric perturbation at the nodes is a more effective way to prevent attribute disclosure than structural reorganization. One advantage of the approach is that s...

When the Attacker Knows a Lot: The GAGA Graph Anonymizer

Arash Alavi

Lecture Notes in Computer Science, 2019

When releasing graph data (e.g., social networks) to the public or to third parties, data privacy becomes a major concern. It has been shown that state-of-the-art graph anonymization techniques suffer from a lack of strong defense against De-Anonymization (DA) attacks, mostly because of the bias towards utility preservation. In this paper, we propose GAGA, an Efficient Genetic Algorithm for Graph Anonymization, that simultaneously delivers high anonymization and utility preservation. To address the vulnerability against DA attacks, especially when the adversary can re-identify the victim based not only on some information about the neighbors of a victim but also on some knowledge of the structure of the neighbors of the victim's neighbors, GAGA puts the concept of k(d)-neighborhood-anonymity into action by developing the first general algorithm for any d-distance neighborhood. GAGA also addresses the challenge of applying the minimum number of changes to the original graph to preserve data utilities via an effective and efficient genetic algorithm. Results of our evaluation show that GAGA anonymizes the graphs in a way that is more resistant to modern DA attacks than existing techniques: GAGA (with d=3) improves the defense against DA techniques by reducing the DA rate by a factor of at least 2.7 in comparison to the baseline. At the same time it preserves the data utilities to a very high degree: it is the best technique for preserving 11 out of 16 utilities. Finally, GAGA provides an application-oriented level of control to users via different tunable parameters.

General Graph Data De-Anonymization

Selena He

ACM Transactions on Information and System Security

When people utilize social applications and services, their privacy faces a potentially serious threat. In this article, we present a novel, robust, and effective de-anonymization attack on mobility trace data and social data. First, we design a Unified Similarity (US) measurement, which takes account of local and global structural characteristics of data, information obtained from auxiliary data, and knowledge inherited from ongoing de-anonymization results. By analyzing the measurement on real datasets, we find that some data can potentially be de-anonymized accurately and the rest can be de-anonymized at a coarse granularity. Utilizing this property, we present a US-based De-Anonymization (DA) framework, which iteratively de-anonymizes data with an accuracy guarantee. Then, to de-anonymize large-scale data without knowledge of the overlap size between the anonymized data and the auxiliary data, we generalize DA to an Adaptive De-Anonymization (ADA) framework. By smartly working on ...

You Can't See Me: Anonymizing Graphs Using the Szemerédi Regularity Lemma

Andrea TORSELLO

Frontiers in Big Data, 2019

Complex networks gathered from our online interactions provide a rich source of information that can be used to try to model and predict our behavior. While this has very tangible benefits that we have all grown accustomed to, there is a concrete privacy risk in sharing potentially sensitive data about ourselves and the people we interact with, especially when this data is publicly available online and unprotected from malicious attacks. k-anonymity is a technique aimed at reducing this risk by obfuscating the topological information of a graph that can be used to infer the nodes' identity. In this paper we propose a novel algorithm to enforce k-anonymity based on a well-known result in extremal graph theory, the Szemerédi regularity lemma. Given a graph, we start by computing a regular partition of its nodes. The Szemerédi regularity lemma ensures that such a partition exists and that the edges between the sets of nodes behave almost randomly. With this partition, we anonymize the graph by randomizing the edges within each set, obtaining a graph that is structurally similar to the original one yet the nodes within each set are structurally indistinguishable. We test the proposed approach on real-world networks extracted from Facebook. Our experimental results show that the proposed approach is able to anonymize a graph while retaining most of its structural information.

Graph-Based Privacy-Preserving Data Publication

Jianwei Qian

We propose a graph-based framework for privacy-preserving data publication, which is a systematic abstraction of existing anonymity approaches and privacy criteria. Graphs are explored for dataset representation, background knowledge specification, anonymity operation design, as well as attack inference analysis. The framework is designed to accommodate various datasets including social networks, relational tables, temporal and spatial sequences, and even possible unknown data models. The privacy and utility measurements of the anonymized datasets are also quantified in terms of graph features. Our experiments show that the framework is capable of facilitating privacy protection by different anonymity approaches for various datasets with desirable performance.

Private Analysis of Graph Structure

G. Yaroslavtsev

ACM Transactions on Database Systems, 2014

We present efficient algorithms for releasing useful statistics about graph data while providing rigorous privacy guarantees. Our algorithms work on data sets that consist of relationships between individuals, such as social ties or email communication. The algorithms satisfy edge differential privacy, which essentially requires that the presence or absence of any particular relationship be hidden.
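
As a rough illustration of what edge differential privacy asks for, the sketch below releases a noisy edge count using the standard Laplace mechanism. It is not one of the algorithms from the paper; it is only the textbook mechanism applied to the simplest graph statistic. Adding or removing one relationship changes the edge count by exactly 1, so Laplace noise with scale 1/ε is enough. The function name `noisy_edge_count` and its parameters are hypothetical.

```python
import random


def noisy_edge_count(edges, epsilon, rng=None):
    """Release the number of undirected edges with epsilon-edge-DP noise.

    The edge count has sensitivity 1 under the addition or removal of a
    single edge, so Laplace noise with scale 1/epsilon is added.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    # Treat (u, v) and (v, u) as the same undirected edge.
    true_count = len({frozenset(e) for e in edges})
    # The difference of two i.i.d. exponentials with rate epsilon is
    # Laplace-distributed with mean 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise
```

A smaller `epsilon` gives stronger privacy and a noisier answer. Statistics whose value can change a lot when a single edge changes need more care than this sketch provides.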

Complexity of Social Network Anonymization

Sean Chester

With an abundance of social network data being released, the need to protect sensitive information within these networks has become an important concern of data publishers. To achieve this objective, various notions of k-anonymization have been proposed for social network graphs. In this paper we focus on the complexity of optimization problems that arise from trying to anonymize graphs, establishing that optimally k-anonymizing the label sequences of edge-labeled graphs is intractable. We show how this result implies intractability for other notions of k-anonymization in literature. We also consider the case of bipartite social network graphs which arise from the representation of distinct entities, such as movies and viewers, patients and drugs, or products and customers. Within this setting we demonstrate that, although k-anonymizing edge-labeled graphs is intractable for k ≥ 3, polynomial time algorithms exist for arbitrary bipartite graphs when k = 2 and for unlabeled bipartite graphs irrespective of the value of k. Finally, in this paper we extend the study of attribute disclosure within the context of social networks by defining t-closeness, a measure of how effectively an adversary can determine sensitive information about members of a k-anonymous social network.

IAEME Publication

IAEME PUBLICATION, 2013

Developing privacy-preserving mechanisms for sharing data across networks for research purposes and business decisions has become a topic of current research interest. L. Sweeney et al. (2002) [26] developed the concept of k-anonymity, a model for protecting privacy which requires that, for a database to be k-anonymous, each record be indistinguishable from at least k-1 other records with respect to their quasi-identifiers. Despite the k-anonymity model, an intruder may gain access to the sensitive information if a set of nodes share similar attributes. In this paper we systematically analyze the pure structure anonymization mechanisms and models proposed in the literature. We also make a detailed study of the k-degree-l-diversity anonymity model, which takes into consideration the structural information and sensitive labels of individuals as well. We also study the algorithmic impact of adding noise nodes to the original graph, and rigorously analyze the theoretical limitations of the appended noise nodes and their impact.
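
The k-anonymity condition stated above can be checked mechanically: group the records by their quasi-identifier values and verify that every group has at least k members. The sketch below is a minimal illustration, assuming records are dictionaries; the function name `is_k_anonymous` and the example field names are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `records` is shared by at least k records."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(size >= k for size in groups.values())


# Hypothetical toy table: "zip" and "age" are the quasi-identifiers,
# "disease" is the sensitive attribute.
records = [
    {"zip": "080", "age": "30-39", "disease": "flu"},
    {"zip": "080", "age": "30-39", "disease": "cold"},
    {"zip": "081", "age": "40-49", "disease": "flu"},
]
```

Here `is_k_anonymous(records, ["zip", "age"], 2)` is false, because the third record is alone in its group. The check says nothing about the sensitive values inside each group, which is the weakness the abstract points to when a group shares similar attributes.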

Design Document: Graph Privacy Software Suite V1.0

David Nettleton

Technical Report, TR-IIIA-2010-05, IIIA-CSIC, 2010

This document describes the first version (V1.0) of the graph privacy software suite. It consists of some initial assumptions, together with a textual description of the main routine (simulated annealing) and the six graph modifier operators. This is followed by a structure diagram of the whole system and the pseudo code of each of the main functions, organized in a modular design. A companion document [TR-IIIA-2010-04] details the theoretical background to the work.

DATA PRIVACY FOR SIMPLY ANONYMIZED SOCIAL NETWORK LOGS REPRESENTED AS GRAPHS-CONSIDERATIONS FOR GRAPH ALTERATION OPERATIONS

David Nettleton

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 19(Supplement-1), pp. 107-125 (2011), 2011

In this paper we review the state of the art on graph privacy with special emphasis on applications to online social networks, and we consider some novel aspects which have not been greatly covered in the specialized literature on graph privacy. The following key considerations are covered: (i) choice of different operators to modify the graph; (ii) information loss based on the cost of graph operations in terms of statistical characteristics (degree, clustering coefficient and path length) in the original graph; (iii) computational cost of the operations; (iv) in the case of the aggregation of two nodes, the choice of similar adjacent nodes rather than isomorphic topologies, in order to maintain the overall structure of the graph; (v) a statistically knowledgeable attacker who is able to search for regions of a simply anonymized graph based on statistical characteristics and map those onto a given node and its immediate neighborhood.

An Anonymiser Tool for Sensitive Graph Data

Charini Nanayakkara

EYRE'20 workshop co-located with CIKM, 2020

Analysis of graph data is extensively conducted in numerous domains to learn the relationships between and behaviour of connected entities. Many graphs contain sensitive data, for example social network users and their posts, or genealogical records such as birth and death certificates. This has limited the use and publication of such sensitive graph data sets. While there are various techniques available to anonymise tabular data, anonymising graph data while maintaining the node and edge structure of the original graph, such as node attributes and the similarities between nodes, is a challenging task. In this paper, we present a web tool which can anonymise sensitive graph data while maintaining the similarity structure of the original graph by employing a cluster-based mapping of sensitive to public attribute values, as well as randomly shifting date values. Our demonstration will illustrate the tool on two example data sets of historical birth records.

Data Anonymization in Social Networks

Abdelouahid Lyhyaoui

2020

Privacy is a concern of social network users. Social networks are a source of valuable data for scientific or commercial analysis. Therefore, anonymizing social network data before releasing it becomes an important issue. The nodes in the network represent the individuals and the links among them denote their relationships. Nevertheless, publishing a social graph directly by simply removing the names of people who contributed to this graph raises important privacy issues. In particular, some inference attacks on the published graph can lead to deanonymizing certain nodes, learning the existence of a social relation between two nodes or even using the structure of the graph itself to deduce the value of certain sensitive attributes. In this paper, we present a brief yet systematic review of the existing anonymization techniques for privacy preserving publishing of social network data. We identify the challenges in privacy preserving publishing of social network data comparing to the ...

k-Anonymization of Social Networks by Vertex Addition

Sean Chester

With an abundance of social network data being released, the need to protect sensitive information within these networks has become an important concern of data publishers. In this paper we focus on the popular notion of k-anonymization as applied to node degrees in a social network. Given such a network N, the problem we study is to transform N to N′, such that the degree of each node in N′ is attained by at least k − 1 other nodes in N′. Unlike previous work, we permit modifications to the node set, rather than the edge set, and this offers unique advantages with respect to the utility of the released anonymized network. We study both vertex-labeled and unlabeled graphs, since instances of each occur in real-world social networks. Under the constraint of minimum node additions, we show that on vertex-labeled graphs, the problem is NP-complete. For unlabeled graphs, we give an efficient (near-linear) algorithm and show that it gives solutions that are optimal modulo k, a guarantee that is novel in the literature. Additionally, we demonstrate empirically that commonly studied structural properties of the network, such as clustering coefficient, are only slightly distorted by the anonymization procedure.
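
The target condition on N′ (every degree value is shared by at least k nodes) is easy to verify, even though reaching it with minimal changes is the hard part. The sketch below only checks the condition; it does not implement the paper's vertex-addition algorithm. The function names are hypothetical, and the graph is assumed to be undirected and given as a list of edge pairs.

```python
from collections import Counter


def degree_of_each_node(edges, isolated_nodes=()):
    """Return {node: degree} for an undirected graph given as edge pairs.

    `isolated_nodes` lists nodes with no edges, which would otherwise
    be missing from the result.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    for node in isolated_nodes:
        degree.setdefault(node, 0)
    return dict(degree)


def is_k_degree_anonymous(edges, k, isolated_nodes=()):
    """True if each node's degree is attained by at least k - 1 other nodes."""
    degree = degree_of_each_node(edges, isolated_nodes)
    nodes_per_degree = Counter(degree.values())
    return all(count >= k for count in nodes_per_degree.values())
```

A path `a-b-c-d` has degrees 1, 2, 2, 1, so it is 2-degree-anonymous but not 3-degree-anonymous. A star with three leaves is not 2-degree-anonymous, because the center is the only node of degree 3.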

SHORTEST PATHS ANONYMIZATION ON WEIGHTED GRAPHS

H.-y. Kao, Shyue-Liang Wang

International Journal of Software Engineering and Knowledge Engineering, 2013

Anonymizing Social Networks

Michael Hay

Advances in technology have made it possible to collect data about individuals and the connections between them, such as email correspondence and friendships. Agencies and researchers who have collected such social network data often have a compelling interest in allowing others to analyze the data. However, in many cases the data describes relationships that are private (e.g., email correspondence) and sharing the data in full can result in unacceptable disclosures. In this paper, we present a framework for assessing the privacy risk of sharing anonymized network data. This includes a model of adversary knowledge, for which we consider several variants and make connections to known graph theoretical results. On several real-world social networks, we show that simple anonymization techniques are inadequate, resulting in substantial breaches of privacy for even modestly informed adversaries. We propose a novel anonymization technique based on perturbing the network and demonstrate empirically that it leads to substantial reduction of the privacy threat. We also analyze the effect that anonymizing the network has on the utility of the data for social network analysis.
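
One simple way to perturb a network is to delete m randomly chosen edges and then insert m randomly chosen edges that were not in the original graph. The abstract does not spell out the perturbation used, so the sketch below is an assumption made for illustration, with a hypothetical function name `perturb_edges`. It enumerates all non-edges, so it is only suitable for small graphs.

```python
import random
from itertools import combinations


def perturb_edges(nodes, edges, m, seed=None):
    """Delete m random edges, then insert m random non-edges.

    Returns a set of frozenset({u, v}) edges. The number of edges is
    preserved, and inserted edges are drawn from pairs that were not
    connected in the original graph.
    """
    rng = random.Random(seed)
    original = {frozenset(e) for e in edges}
    if m > len(original):
        raise ValueError("cannot delete more edges than the graph has")
    # Sort before sampling so that a fixed seed gives a repeatable result.
    removed = set(rng.sample(sorted(original, key=sorted), m))
    non_edges = [
        frozenset(pair)
        for pair in combinations(sorted(nodes), 2)
        if frozenset(pair) not in original
    ]
    if m > len(non_edges):
        raise ValueError("not enough non-edges to insert")
    added = set(rng.sample(non_edges, m))
    return (original - removed) | added
```

Larger values of `m` make re-identification harder but distort the graph more, which is the privacy and utility trade-off the abstract describes.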
