Codebook

2010, Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - ICSE '10

https://doi.org/10.1145/1806799.1806821

Abstract

Large-scale software engineering requires communication and collaboration to successfully build and ship products. We conducted a survey with Microsoft engineers on inter-team coordination and found that the most impactful problems concerned finding and keeping track of other engineers. Since engineers are connected by their shared work, a tool that discovers connections in their work-related repositories can help. Here we describe the Codebook framework for mining software repositories. It is flexible enough to address all of the problems identified by our survey with a single data structure (graph of people and artifacts) and a single algorithm (regular language reachability). Codebook handles a larger variety of problems than prior work, analyzes more kinds of work artifacts, and can be customized by and for end-users. To evaluate our framework's flexibility, we built two applications, Hoozizat and Deep Intellisense. We evaluated these applications with engineers to show effectiveness in addressing multiple inter-team coordination problems.
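The "single data structure, single algorithm" idea in the abstract can be illustrated with a small sketch. This is not Codebook's implementation: it assumes a toy graph stored as labeled edge triples and a query given as a hand-written DFA over edge labels, then searches the product of graph nodes and DFA states. All node names, edge labels (`changed_by`, `committed_by`, `reports_to`, and so on), and the function name are hypothetical.

```python
from collections import deque

def regular_reachability(edges, start, dfa, dfa_start, dfa_accept):
    """Return every node reachable from `start` along a path whose
    sequence of edge labels is accepted by the DFA.

    edges:      iterable of (source, label, target) triples
    dfa:        dict mapping (state, label) -> next state
    dfa_accept: set of accepting DFA states
    """
    # Index outgoing edges by source node.
    adjacency = {}
    for src, label, dst in edges:
        adjacency.setdefault(src, []).append((label, dst))

    # Breadth-first search over (graph node, DFA state) pairs.
    seen = {(start, dfa_start)}
    queue = deque(seen)
    results = set()
    while queue:
        node, state = queue.popleft()
        if state in dfa_accept:
            results.add(node)
        for label, nxt in adjacency.get(node, []):
            nxt_state = dfa.get((state, label))
            if nxt_state is None:
                continue  # this label is not allowed in the current state
            pair = (nxt, nxt_state)
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return results

# Hypothetical graph of people and artifacts.
EDGES = [
    ("Parser.cs", "changed_by", "checkin42"),
    ("Parser.cs", "changed_by", "checkin43"),
    ("checkin42", "committed_by", "alice"),
    ("checkin43", "committed_by", "bob"),
    ("checkin43", "fixes", "bug7"),
    ("bug7", "opened_by", "carol"),
    ("alice", "reports_to", "dana"),
    ("bob", "reports_to", "dana"),
]

# DFA for the label pattern: changed_by committed_by reports_to*
WHO_TOUCHED = {
    (0, "changed_by"): 1,
    (1, "committed_by"): 2,
    (2, "reports_to"): 2,
}

people = regular_reachability(EDGES, "Parser.cs", WHO_TOUCHED, 0, {2})
print(sorted(people))
```

In this sketch, answering a different coordination question means writing a different label pattern; the graph and the search routine stay the same. Tracking visited (node, state) pairs keeps the search finite even when the graph contains cycles.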
