Community evolution on Stack Overflow

https://doi.org/10.1371/JOURNAL.PONE.0253010

Abstract

Question and answer (Q&A) websites are a medium through which people communicate and help each other. Stack Overflow is one of the most popular Q&A websites about programming, where millions of developers seek help or provide assistance. Activity on Stack Overflow is moderated by the user community, which uses a voting system to promote high-quality content. The website was created in 2008 and has accumulated a large amount of crowd wisdom about the software development industry. Here we analyse this data to examine trends in the grouping of technologies and their users into different subcommunities. We analysed all questions, answers, votes and tags from Stack Overflow between 2008 and 2020. We generated a series of user-technology interaction graphs and applied community detection algorithms to identify the biggest user communities for each year, and to examine which technologies those communities incorporate, how they are interconnected and how they evolve through time. The biggest and most persistent communities were related to web development. In general, there is little movement between communities; users tend to either stay within the same community or not acquire any score at all. Community evolution reveals the popularity of different programming languages and frameworks on Stack Overflow over time. These findings give insight into the user community on Stack Overflow and reveal long-term trends in the software development industry.
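To make the graph-based approach in the abstract concrete, the following is a minimal sketch of building a yearly user-technology graph and grouping its nodes into communities. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not name the detection algorithm, so a simple weighted label propagation stands in for it, and the record layout `(year, user, tag, score)`, the sample data, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample: (year, user, tag, score the user earned on posts with that tag).
RECORDS = [
    (2019, "alice", "javascript", 12),
    (2019, "alice", "html", 5),
    (2019, "bob", "javascript", 7),
    (2019, "bob", "css", 3),
    (2019, "carol", "python", 9),
    (2019, "carol", "pandas", 2),
    (2019, "dave", "python", 6),
    (2019, "dave", "pandas", 4),
    (2020, "alice", "python", 3),
]


def build_graph(records, year):
    """Build a weighted bipartite user-tag graph for one year as an adjacency dict."""
    graph = defaultdict(lambda: defaultdict(float))
    for rec_year, user, tag, score in records:
        # Assumption: users who earned no score in a tag do not interact with it.
        if rec_year != year or score <= 0:
            continue
        u, t = ("user", user), ("tag", tag)
        graph[u][t] += score
        graph[t][u] += score
    return graph


def label_propagation(graph, max_rounds=20):
    """Stand-in community detection: every node repeatedly adopts the label
    that carries the largest total edge weight among its neighbours."""
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):  # fixed visiting order keeps the result deterministic
            totals = defaultdict(float)
            for neighbour, weight in graph[node].items():
                totals[labels[neighbour]] += weight
            # Ties go to the smallest label in sorted order.
            best = max(sorted(totals), key=totals.get)
            # Keep the current label when it is already tied for the best.
            if totals.get(labels[node], 0.0) == totals[best]:
                continue
            labels[node] = best
            changed = True
        if not changed:
            break
    return labels


def communities_for_year(records, year):
    """Return the communities of one year as sets of nodes, largest first."""
    labels = label_propagation(build_graph(records, year))
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


if __name__ == "__main__":
    for year in (2019, 2020):
        for community in communities_for_year(RECORDS, year):
            tags = sorted(name for kind, name in community if kind == "tag")
            users = sorted(name for kind, name in community if kind == "user")
            print(year, tags, users)
```

On the sample data, the 2019 graph separates into a web-related group (`javascript`, `html`, `css`) and a data-related group (`python`, `pandas`). Running the same function for consecutive years and comparing each user's community is one way to study the movement between communities that the abstract describes.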
