Contribution: We provide initial algorithms and a framework for assisted root cause analysis, built on a modular architecture with collection, identification, analysis, and presentation steps. The framework automatically creates pre-structured data from large, heterogeneous datasets, enriches it with additional information from the continuous integration (CI) system, and adds fine-grained default and user-defined labels that support root cause analysis of failures.

Background: Projects with hundreds of thousands of lines of code and several thousand CI workflows per day cannot rely on manual prelabeling and qualitative interviews to meaningfully improve broken CI job runs.

Evaluation: We evaluated our approach by measuring manual root cause analysis times across several CI jobs. The data we used is publicly available through the Kubernetes and OpenShift projects, so other researchers can reproduce and extend our work.
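
As a rough illustration only, the four steps named in the contribution (collection, identification, analysis, presentation) could be chained as in the sketch below. It is not the framework's actual implementation: the `JobRun` record, its field names, the input keys (`job`, `result`, `log`), the regex-based labeling, and the example label names are all assumptions made for this sketch.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class JobRun:
    """Hypothetical pre-structured record for one CI job run."""
    job: str
    passed: bool
    log: str
    metadata: dict = field(default_factory=dict)  # extra info from the CI system
    labels: list = field(default_factory=list)


# Illustrative default labels (regex -> label); assumed, not taken from the paper.
DEFAULT_RULES = {
    r"timed out|timeout": "timeout",
    r"connection refused|no route to host": "network",
    r"out of memory|OOMKilled": "resources",
}


def collect(raw_records):
    """Collection: turn heterogeneous raw records into structured JobRun objects."""
    core = ("job", "result", "log")
    return [
        JobRun(
            job=r["job"],
            passed=r["result"] == "SUCCESS",
            log=r.get("log", ""),
            # Enrichment: keep every remaining field as metadata.
            metadata={k: v for k, v in r.items() if k not in core},
        )
        for r in raw_records
    ]


def identify(runs):
    """Identification: keep only the failed runs."""
    return [run for run in runs if not run.passed]


def analyze(failed_runs, user_rules=None):
    """Analysis: attach default and user-defined labels based on log content."""
    rules = {**DEFAULT_RULES, **(user_rules or {})}
    for run in failed_runs:
        matched = [
            label
            for pattern, label in rules.items()
            if re.search(pattern, run.log, re.IGNORECASE)
        ]
        # Drop duplicate labels while preserving order; mark runs with no match.
        run.labels = list(dict.fromkeys(matched)) or ["unlabeled"]
    return failed_runs


def present(labeled_runs):
    """Presentation: summarize how often each label occurs among failures."""
    return Counter(label for run in labeled_runs for label in run.labels)


def run_pipeline(raw_records, user_rules=None):
    """Chain the four steps; each stage can be swapped out independently."""
    return present(analyze(identify(collect(raw_records)), user_rules))
```

Keeping each step as a plain function with a list in and a list (or summary) out is one way to keep such a design modular: a project could replace the regex rules in `analyze` with its own labeling logic without touching the other stages.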