Empirical Software Engineering

Papers
(The median citation count of Empirical Software Engineering is 3. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. Only papers published in the past four years, i.e., from 2020-03-01 to 2024-03-01, are included.)
Article | Citations
Pandemic programming | 141
Testing machine learning based systems: a systematic mapping | 120
An exploratory study of smart contracts in the Ethereum blockchain platform | 99
FixMiner: Mining relevant fix patterns for automated program repair | 91
Predictors of well-being and productivity among software professionals during the COVID-19 pandemic – a longitudinal study | 73
Sampling in software engineering research: a critical review and guidelines | 73
What do Programmers Discuss about Deep Learning Frameworks | 50
A practical guide on conducting eye tracking studies in software engineering | 48
The state of adoption and the challenges of systematic variability management in industry | 46
The practitioners’ point of view on the concept of technical debt and its causes and consequences: a design for a global family of industrial surveys and its first results from Brazil | 43
The who, what, how of software engineering research: a socio-technical framework | 42
Detection, assessment and mitigation of vulnerabilities in open source dependencies | 41
The impact of automated feature selection techniques on the interpretation of defect models | 39
A comprehensive study of bloated dependencies in the Maven ecosystem | 39
A privacy and security analysis of early-deployed COVID-19 contact tracing Android apps | 36
Perceived diversity in software engineering: a systematic literature review | 34
Code cloning in smart contracts: a case study on verified contracts from the Ethereum blockchain platform | 34
Automated patch assessment for program repair at scale | 33
Systematic mapping study on domain-specific language development tools | 33
Test case selection and prioritization using machine learning: a systematic literature review | 32
Understanding and improving the quality and reproducibility of Jupyter notebooks | 31
Topic modeling in software engineering research | 31
An empirical investigation on the relationship between design and architecture smells | 31
Promises and challenges of microservices: an exploratory study | 29
AI lifecycle models need to be revised | 29
Formal methods in dependable systems engineering: a survey of professionals from Europe and North America | 28
Enjoy your observability: an industrial survey of microservice tracing and analysis | 28
Lags in the release, adoption, and propagation of npm vulnerability fixes | 27
On the time-based conclusion stability of cross-project defect prediction models | 26
How software engineering research aligns with design science: a review | 25
On the feasibility of automated prediction of bug and non-bug issues | 25
MSRBot: Using bots to answer questions from software repositories | 25
Practical relevance of software engineering research: synthesizing the community’s voice | 24
An empirical investigation of performance overhead in cross-platform mobile development frameworks | 24
Out of sight, out of mind? How vulnerable dependencies affect open-source projects | 24
On the need of preserving order of data when validating within-project defect classifiers | 23
Problems with SZZ and features: An empirical study of the state of practice of defect prediction data collection | 23
An exploratory study on confusion in code reviews | 23
Evaluating the agreement among technical debt measurement tools: building an empirical benchmark of technical debt liabilities | 23
Software development with feature toggles: practices used by practitioners | 22
Predicting the objective and priority of issue reports in software repositories | 22
Wait for it: identifying “On-Hold” self-admitted technical debt | 22
The secret life of test smells - an empirical study on test smell evolution and maintenance | 21
How do i refactor this? An empirical study on refactoring trends and topics in Stack Overflow | 20
On the assessment of software defect prediction models via ROC curves | 20
How does combinatorial testing perform in the real world: an empirical study | 20
The ‘as code’ activities: development anti-patterns for infrastructure as code | 19
Security analysis of permission re-delegation vulnerabilities in Android apps | 19
An empirical study of IoT topics in IoT developer discussions on Stack Overflow | 19
Automated demarcation of requirements in textual specifications: a machine learning-based approach | 19
Software provenance tracking at the scale of public source code | 19
A teamwork effectiveness model for agile software development | 19
Self-admitted technical debt practices: a comparison between industry and open-source | 19
Why are many businesses instilling a DevOps culture into their organization? | 19
On the impact of security vulnerabilities in the npm and RubyGems dependency networks | 18
World of code: enabling a research workflow for mining and analyzing the universe of open source VCS data | 18
Explicit programming strategies | 18
Empirical evaluation of tools for hairy requirements engineering tasks | 18
Game-based Sprint retrospectives: multiple action research | 17
A longitudinal study of static analysis warning evolution and the effects of PMD on software quality in Apache open source projects | 17
Can Offline Testing of Deep Neural Networks Replace Their Online Testing? | 17
Topic recommendation for software repositories using multi-label classification algorithms | 17
Industry practices and challenges for the evolvability assurance of microservices | 16
Spearheading agile: the role of the scrum master in agile projects | 16
Code and commit metrics of developer productivity: a study on team leaders perceptions | 16
Development of recommendation systems for software engineering: the CROSSMINER experience | 16
How agile teams make self-assignment work: a grounded theory study | 16
Locating faults with program slicing: an empirical analysis | 16
Strategies to manage quality requirements in agile software development: a multiple case study | 15
Beyond the virus: a first look at coronavirus-themed Android malware | 15
Predicting unstable software benchmarks using static source code features | 15
Finding the sweet spot for organizational control and team autonomy in large-scale agile software development | 15
A study of the performance of general compressors on log files | 15
Test smells 20 years later: detectability, validity, and reliability | 14
Learning to recognize actionable static code warnings (is intrinsically easy) | 14
PHANTOM: Curating GitHub for engineered software projects using time-series clustering | 14
The significance of bug report elements | 14
Better software analytics via “DUO”: Data mining algorithms using/used-by optimizers | 14
Publish or perish, but do not forget your software artifacts | 14
A comprehensive study on software aging across android versions and vendors | 14
Analysing app reviews for software engineering: a systematic literature review | 14
A family of experiments on test-driven development | 14
On systematically building a controlled natural language for functional requirements | 13
Investigating types and survivability of performance bugs in mobile apps | 13
Feature requests-based recommendation of software refactorings | 13
StateAFL: Greybox fuzzing for stateful network servers | 13
On the privacy of mental health apps | 13
Building the perfect game – an empirical study of game modifications | 13
GitHub Discussions: An exploratory study of early adoption | 13
An empirical study of the characteristics of popular Minecraft mods | 13
Understanding shared links and their intentions to meet information needs in modern code review: | 13
Resource and dependency based test case generation for RESTful Web services | 13
On the relationship between bug reports and queries for text retrieval-based bug localization | 13
API compatibility issues in Android: Causes and effectiveness of data-driven detection techniques | 13
An Empirical Investigation of Relevant Changes and Automation Needs in Modern Code Review | 12
Software engineering whispers: The effect of textual vs. graphical software design descriptions on software design communication | 12
Can pre-trained code embeddings improve model performance? Revisiting the use of code embeddings in software engineering tasks | 12
A large scale analysis of mHealth app user reviews | 12
Are datasets for information retrieval-based bug localization techniques trustworthy? | 12
From anecdote to evidence: the relationship between personality and need for cognition of developers | 12
An empirical study on changing leadership in agile teams | 12
To what extent do DNN-based image classification models make unreliable inferences? | 12
A multi-dimensional analysis of technical lag in Debian-based Docker images | 12
Automatically recommending components for issue reports using deep learning | 12
Assessment of off-the-shelf SE-specific sentiment analysis tools: An extended replication study | 12
A longitudinal explanatory case study of coordination in a very large development programme: the impact of transitioning from a first- to a second-generation large-scale agile development method | 12
Reuse and maintenance practices among divergent forks in three software ecosystems | 12
Automated end-to-end management of the modeling lifecycle in deep learning | 12
Maintenance-related concerns for post-deployed Ethereum smart contract development: issues, techniques, and future challenges | 12
CROKAGE: effective solution recommendation for programming tasks by leveraging crowd knowledge | 11
Ethics in the mining of software repositories | 11
The entrepreneurial logic of startup software development: A study of 40 software startups | 11
TaintBench: Automatic real-world malware benchmarking of Android taint analyses | 11
Search-based fairness testing for regression-based machine learning systems | 11
How to Better Distinguish Security Bug Reports (Using Dual Hyperparameter Optimization) | 11
A configurable method for benchmarking scalability of cloud-native applications | 11
Automated issue assignment: results and insights from an industrial case | 11
Deep security analysis of program code | 11
Evaluating network embedding techniques’ performances in software bug prediction | 11
Uniform and scalable sampling of highly configurable systems | 11
Breaking bad? Semantic versioning and impact of breaking changes in Maven Central | 11
SMBFL: slice-based cost reduction of mutation-based fault localization | 10
How does code readability change during software evolution? | 10
CGT-FL: using cooperative game theory to effective fault localization in presence of coincidental correctness | 10
An empirical study of Q&A websites for game developers | 10
Characterizing the evolution of statically-detectable performance issues of Android apps | 10
Learning from what we know: How to perform vulnerability prediction using noisy historical data | 10
Demystifying the challenges and benefits of analyzing user-reported logs in bug reports | 10
How Scrum adds value to achieving software quality? | 10
What makes a popular academic AI repository? | 9
A comparative study and analysis of developer communications on Slack and Gitter | 9
Too many images on DockerHub! How different are images for the same system? | 9
Improving energy-efficiency by recommending Java collections | 9
Testing self-healing cyber-physical systems under uncertainty with reinforcement learning: an empirical study | 9
An approach and benchmark to detect behavioral changes of commits in continuous integration | 9
Standing on shoulders or feet? An extended study on the usage of the MSR data papers | 9
Comparing the results of replications in software engineering | 9
Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection | 9
Automated test reuse for highly configurable software | 9
Developer-centric test amplification | 9
A unified multi-task learning model for AST-level and token-level code completion | 9
Automatic team recommendation for collaborative software development | 9
An exploratory study on the introduction and removal of different types of technical debt in deep learning frameworks | 9
Using code reviews to automatically configure static analysis tools | 9
A first look at Android applications in Google Play related to COVID-19 | 9
Gamification in software engineering: the mediating role of developer engagement and job satisfaction | 9
Systematic literature review on software quality for AI-based software | 9
Software testing and Android applications: a large-scale empirical study | 9
Automating system test case classification and prioritization for use case-driven testing in product lines | 9
The effects of continuous integration on software development: a systematic literature review | 9
Will you come back to contribute? Investigating the inactivity of OSS core developers in GitHub | 9
A gamification solution for improving Scrum adoption | 9
An empirical study on self-admitted technical debt in Dockerfiles | 8
Interaction-based creation and maintenance of continuously usable trace links between requirements and source code | 8
Pull request latency explained: an empirical overview | 8
Towards effective assessment of steady state performance in Java software: are we there yet? | 8
Using black-box performance models to detect performance regressions under varying workloads: an empirical study | 8
Using a balanced scorecard to identify opportunities to improve code review effectiveness: an industrial experience report | 8
The Teamwork Process Antecedents (TPA) questionnaire: developing and validating a comprehensive measure for assessing antecedents of teamwork process quality | 8
Where were the repair ingredients for Defects4j bugs? | 8
Learning by sampling: learning behavioral family models from software product lines | 8
Do I really need all this work to find vulnerabilities? | 8
Identifying self-admitted technical debt in issue tracking systems using machine learning | 8
An empirical study of question discussions on Stack Overflow | 8
Characteristics of method extractions in Java: a large scale empirical study | 8
Dynamical analysis of diversity in rule-based open source network intrusion detection systems | 8
Do code review measures explain the incidence of post-release defects? | 8
Understanding and improving artifact sharing in software engineering research | 8
Developers perception of peer code review in research software development | 8
Is GitHub’s Copilot as bad as humans at introducing vulnerabilities in code? | 8
Why and what happened? Aiding bug comprehension with automated category and causal link identification | 8
A fine-grained data set and analysis of tangling in bug fixing commits | 8
Generating API tags for tutorial fragments from Stack Overflow | 7
The Relation of Test-Related Factors to Software Quality: A Case Study on Apache Systems | 7
FACER: An API usage-based code-example recommender for opportunistic reuse | 7
FeatCompare: Feature comparison for competing mobile apps leveraging user reviews | 7
Revisiting reopened bugs in open source software systems | 7
Towards an evidence-based theoretical framework on factors influencing the software development productivity | 7
Efficient static analysis and verification of featured transition systems | 7
Understanding developers’ privacy and security mindsets via climate theory | 7
On the fulfillment of coordination requirements in open-source software projects: An exploratory study | 7
Software product-line evaluation in the large | 7
Revisiting process versus product metrics: a large scale analysis | 7
On effort-aware metrics for defect prediction | 7
Evolving software system families in space and time with feature revisions | 7
Information correspondence between types of documentation for APIs | 7
SPVF: security property assisted vulnerability fixing via attention-based models | 7
Evaluating the robustness of source code plagiarism detection tools to pervasive plagiarism-hiding modifications | 7
On the usage, co-usage and migration of CI/CD tools: A qualitative analysis | 7
Weighted software metrics aggregation and its application to defect prediction | 7
AndroEvolve: automated Android API update with data flow analysis and variable denormalization | 7
Characterizing usages, updates and risks of third-party libraries in Java projects | 7
Automated driver management for Selenium WebDriver | 7
A conceptual model for unifying variability in space and time: Rationale, validation, and illustrative applications | 7
FindICI: Using machine learning to detect linguistic inconsistencies between code and natural language descriptions in infrastructure-as-code | 7
Flair: efficient analysis of Android inter-component vulnerabilities in response to incremental changes | 7
What to share, when, and where: balancing the objectives and complexities of open source software contributions | 7
Automatic identification of self-admitted technical debt from four different sources | 7
An empirical study on release notes patterns of popular apps in the Google Play Store | 7
Learning lenient parsing & typing via indirect supervision | 7
Embedding API dependency graph for neural code generation | 7
Does class size matter? An in-depth assessment of the effect of class size in software defect prediction | 6
On the analysis of non-coding roles in open source development | 6
Mining and relating design contexts and design patterns from Stack Overflow | 6
Static detection of equivalent mutants in real-time model-based mutation testing | 6
FIXME: synchronize with database! An empirical study of data access self-admitted technical debt | 6
GreenHub: a large-scale collaborative dataset to battery consumption analysis of android devices | 6
An empirical study of same-day releases of popular packages in the npm ecosystem | 6
Helping or not helping? Why and how trivial packages impact the npm ecosystem | 6
Analysing Time-Stamped Co-Editing Networks in Software Development Teams using git2net | 6
A study of how Docker Compose is used to compose multi-component systems | 6
Understanding peer review of software engineering papers | 6
Omni: automated ensemble with unexpected models against adversarial evasion attack | 6
Investigating design anti-pattern and design pattern mutations and their change- and fault-proneness | 6
A systematic literature review on trust in the software ecosystem | 6
On the adequacy of static analysis warnings with respect to code smell prediction | 6
Towards cost-benefit evaluation for continuous software engineering activities | 6
An empirical study on the use of SZZ for identifying inducing changes of non-functional bugs | 6
Predicting health indicators for open source projects (using hyperparameter optimization) | 6
E-APR: Mapping the effectiveness of automated program repair techniques | 6
CT-IoT: a combinatorial testing-based path selection framework for effective IoT testing | 6
Learning actionable analytics from multiple software projects | 6
Mining Python fix patterns via analyzing fine-grained source code changes | 6
Revisiting the VCCFinder approach for the identification of vulnerability-contributing commits | 6
Training students in evidence-based software engineering and systematic reviews: a systematic review and empirical study | 6
TraceSim: An Alignment Method for Computing Stack Trace Similarity | 6
Mutation testing in the wild: findings from GitHub | 6
DebtFree: minimizing labeling cost in self-admitted technical debt identification using semi-supervised learning | 6
An automated framework for the extraction of semantic legal metadata from legal texts | 6
Exploring Performance Assurance Practices and Challenges in Agile Software Development: An Ethnographic Study | 6
Deep learning approaches for bad smell detection: a systematic literature review | 5
An empirical study of developers’ discussions about security challenges of different programming languages | 5
Characterizing refactoring graphs in Java and JavaScript projects | 5
Practitioner’s view of the success factors for software outsourcing partnership formation: an empirical exploration | 5
A qualitative study of developers’ discussions of their problems and joys during the early COVID-19 months | 5
On the use of commit-relevant mutants | 5
CsmithEdge: more effective compiler testing by handling undefined behaviour less conservatively | 5
Empirical analysis of security vulnerabilities in Python packages | 5
How do Android developers improve non-functional properties of software? | 5
From one to hundreds: multi-licensing in the JavaScript ecosystem | 5
On the preferences of quality indicators for multi-objective search algorithms in search-based software engineering | 5
Revisiting the debate: Are code metrics useful for measuring maintenance effort? | 5
Styler: learning formatting conventions to repair Checkstyle violations | 5
Static test flakiness prediction: How Far Can We Go? | 5
Open-source software product line extraction processes: the ArgoUML-SPL and Phaser cases | 5
The forgotten role of search queries in IR-based bug localization: an empirical study | 5
Conclusion stability for natural language based mining of design discussions | 5
Evaluating refactorings for disciplining #ifdef annotations: An eye tracking study with novices | 5
On the Removal of Feature Toggles | 5
A machine and deep learning analysis among SonarQube rules, product, and process metrics for fault prediction | 5