Research Highlights

SCAM 2025 Distinguished Artifact Award
Oct 2025
Refactoring-Aware Patch Integration Across Structurally Divergent Java Forks

Daniel Ogenrwot · John Businge

When two software variants diverge through independent refactoring, replaying patches from one fork to another fails silently — renamed methods, moved classes, and restructured call hierarchies cause integration to break even when the underlying logic is sound.

RePatch addresses this by making patch integration refactoring-aware: it detects structural drift between forks and resolves mismatches before applying patches. On a benchmark of previously failing cross-fork patches, RePatch recovered 52.8% of integrations that naive replay could not handle.

Fig. 1 — Patch Integration from Source to Target Variant

Source and target variants share a common codebase up to the fork date and stay synchronized until the divergence date, after which they evolve independently. The resulting structural drift causes a plain cherry-pick to fail; RePatch inverts refactorings on both sides before replaying the patch.

Fig. 1: Illustration of patch integration from source to target variant
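
The invert-then-replay idea can be illustrated with a toy sketch. This is not RePatch's implementation: it assumes the only refactorings are method renames on the target side, that the rename list is already known, and that a patch is a single line replacement. All function names and the Java snippet are hypothetical.

```python
import re


def rename_identifier(code: str, old: str, new: str) -> str:
    """Rename whole-word occurrences of an identifier (toy refactoring)."""
    return re.sub(rf"\b{re.escape(old)}\b", new, code)


def apply_patch(code: str, old_line: str, new_line: str) -> str:
    """Naive replay: replace one exact context line, or fail."""
    if old_line not in code:
        raise ValueError("patch context not found")
    return code.replace(old_line, new_line, 1)


def refactoring_aware_apply(
    target: str,
    renames: list[tuple[str, str]],  # (old_name, new_name) applied by the target
    old_line: str,
    new_line: str,
) -> str:
    # 1. Invert the target's renames so its code matches the pre-refactoring shape.
    for old, new in reversed(renames):
        target = rename_identifier(target, new, old)
    # 2. Replay the patch against the de-refactored code.
    target = apply_patch(target, old_line, new_line)
    # 3. Re-apply the renames so the result fits the target's current structure.
    for old, new in renames:
        target = rename_identifier(target, old, new)
    return target


# Hypothetical example: the target renamed computeTotal to calculateTotal.
target_code = (
    "int calculateTotal(int a, int b) {\n"
    "    return a + b;\n"
    "}\n"
    "int r = calculateTotal(1, 2);\n"
)
patched = refactoring_aware_apply(
    target_code,
    renames=[("computeTotal", "calculateTotal")],
    old_line="int r = computeTotal(1, 2);",
    new_line="int r = computeTotal(1, 3);",
)
```

Calling `apply_patch` directly on `target_code` raises `ValueError` because the patch still refers to `computeTotal`. A real tool would need to detect the refactorings automatically and handle moves, extractions, and name collisions, none of which this sketch attempts.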
Artifact
  • Open-source — fully reproducible on GitHub
  • Benchmark dataset — cross-fork Java patch corpus
  • Tool — refactoring-aware patch integrator
  • Distinguished Artifact Award, IEEE SCAM 2025
Keywords
Software Variants · Patch Integration · Refactoring · APR · Empirical SE
AIware 2026 · FSE 2026 · Dataset Track
2026
AgenticFlict: Merge Conflicts in AI Coding Agent Pull Requests

Daniel Ogenrwot · John Businge

AI coding agents (Copilot, Devin, Cursor, etc.) are increasingly submitting pull requests to real repositories. AgenticFlict mines 142K+ agentic PRs from 59K+ repositories, identifying 29K PRs with merge conflicts and extracting 336K+ fine-grained conflict regions, revealing that conflicts are both frequent and often substantial in AI-generated contributions.

AgenticFlict dataset curation workflow
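
As a rough illustration of what a fine-grained conflict region is, the sketch below splits a file containing standard Git conflict markers into regions. It is not the AgenticFlict pipeline: the `ConflictRegion` type and `extract_conflict_regions` function are hypothetical, and the parser handles only the `<<<<<<<` / `=======` / `>>>>>>>` markers (plus the optional `|||||||` base section, which it skips).

```python
from dataclasses import dataclass, field


@dataclass
class ConflictRegion:
    start_line: int                       # 1-based line of the '<<<<<<<' marker
    end_line: int = 0                     # 1-based line of the '>>>>>>>' marker
    ours: list[str] = field(default_factory=list)
    theirs: list[str] = field(default_factory=list)


def extract_conflict_regions(text: str) -> list[ConflictRegion]:
    """Collect every conflict region found in a conflicted file's text."""
    regions: list[ConflictRegion] = []
    current = None
    side = None
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith("<<<<<<<"):
            current = ConflictRegion(start_line=number)
            side = "ours"
        elif current is None:
            continue                      # ordinary, non-conflicting line
        elif line.startswith("|||||||"):
            side = "base"                 # diff3-style base section, ignored here
        elif line.startswith("======="):
            side = "theirs"
        elif line.startswith(">>>>>>>"):
            current.end_line = number
            regions.append(current)
            current, side = None, None
        elif side == "ours":
            current.ours.append(line)
        elif side == "theirs":
            current.theirs.append(line)
    return regions


# Hypothetical conflicted file content.
sample = (
    "a = 0\n"
    "<<<<<<< HEAD\n"
    "x = 1\n"
    "=======\n"
    "x = 2\n"
    ">>>>>>> agent-branch\n"
    "b = 3\n"
)
regions = extract_conflict_regions(sample)
```

Region counts and sizes (for example `len(region.ours) + len(region.theirs)`) are the kind of per-PR measurements such a dataset could aggregate, though the exact metrics used by AgenticFlict are not stated here.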
EMSE Journal · ASE 2024
2024 – 2025
PatchTrack: ChatGPT's Influence on Pull Request Outcomes

Daniel Ogenrwot · John Businge

Studying 338 pull requests from 255 GitHub repositories with self-admitted ChatGPT usage, PatchTrack finds that full adoption of AI-generated code is infrequent, with a median integration rate of 25%. Developers use selective extraction and iterative refinement rather than direct acceptance, showing AI's influence extends beyond patch generation to the entire code review process.

PatchTrack study method overview
ACSE 2020 · Dataset 2021 · arXiv 2024
2020 – present
Code Smells & Software Quality

Design smells (structural anti-patterns like God Classes and Feature Envy) accumulate quietly in growing codebases, increasing maintenance cost and bug density. This research line compares smell occurrence across desktop and mobile platforms, characterizes the structural roles smells play during software evolution, and releases benchmark datasets for reproducible detection research.

Topics
Design Smells · Code Quality · Mobile Apps · Software Evolution · Benchmark Datasets
SEDE 2026 · Network Science
2026
Software Ecosystems & Dependency Networks

Analyzing the Maven Central ecosystem as a network of 1.3M nodes and 20.9M edges, this work reveals scale-free topology and identifies critical infrastructure hubs whose failure could cascade across thousands of downstream projects. A companion study examines the AirQo data pipeline serving environmental monitoring across Africa.
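
To make the notion of a hub whose failure cascades downstream concrete, here is a minimal sketch that counts transitive dependents for each artifact in a small dependency graph. It is an illustration under assumptions, not the study's method: the edge format `(dependent, dependency)`, the `downstream_reach` function, and the artifact names are all hypothetical.

```python
from collections import defaultdict, deque


def downstream_reach(edges):
    """Map each artifact to the number of artifacts that depend on it,
    directly or transitively. `edges` holds (dependent, dependency) pairs."""
    dependents = defaultdict(set)         # dependency -> direct dependents
    nodes = set()
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)
        nodes.update((dependent, dependency))

    reach = {}
    for node in nodes:
        seen = set()
        queue = deque([node])
        while queue:                      # breadth-first walk over reverse edges
            current = queue.popleft()
            for d in dependents.get(current, ()):
                if d != node and d not in seen:
                    seen.add(d)
                    queue.append(d)
        reach[node] = len(seen)
    return reach


# Hypothetical toy ecosystem: app -> web -> core -> log, and cli -> core.
edges = [("app", "web"), ("web", "core"), ("cli", "core"), ("core", "log")]
reach = downstream_reach(edges)
hub = max(reach, key=reach.get)
```

In this toy graph `log` is the hub, since every other artifact ultimately depends on it. Running one traversal per node is quadratic in the worst case, so a graph with millions of nodes would need a more scalable approach than this sketch.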

Topics
Dependency Networks · Graph Mining · Maven Ecosystem · Software Supply Chain · Data Pipelines