Ethics in the mining of software repositories
AbstractResearch in Mining Software Repositories (MSR) is research involving human subjects, as the repositories usually contain data about developers’ and users’ interactions with the repositories and with each other. The ethics issues raised by such research therefore need to be considered before beginning. This paper presents a discussion of ethics issues that can arise in MSR research, using the mining challenges from the years 2006 to 2021 as a case study to identify the kinds of data used. On the basis of contemporary research ethics frameworks we discuss ethics challenges that may be encountered in creating and using repositories and associated datasets. We also report some results from a small community survey of approaches to ethics in MSR research. In addition, we present four case studies illustrating typical ethics issues one encounters in projects and how ethics considerations can shape projects before they commence. Based on our experience, we present some guidelines and practices that can help in considering potential ethics issues and reducing risks.