Abstract
Location data are routinely available to a plethora of mobile apps and third-party web services. The resulting datasets are increasingly available to advertisers for targeting and are also requested by governmental agencies for law enforcement purposes. While the re-identification risk of such data has been widely reported, the discriminative power of mobility has received much less attention. In this study we fill this void with an open and reproducible method. We explore how the growing number of geotagged footprints left behind by social network users in photo-sharing services can be used to infer demographic information from mobility patterns. In particular, we provide the first detailed analysis of ethnic mobility patterns in two metropolitan areas. This analysis allows us to examine questions pertaining to spatial segregation and the extent to which ethnicity can be inferred using only location data. Our results reveal that even a few location records at a coarse ...