Analyzing the Adoption of Database Management Systems Throughout the
Life Cycle of Open Source Projects
Abstract
Database Management Systems (DBMSs) are widely used to store, retrieve,
and manage the vast amounts of data that modern applications handle.
There are various DBMSs available in the industry. While a few studies
have examined the co-evolution of DBMSs and application source code,
there is a research gap in examining the adoption of DBMSs in real
systems. Knowing the most commonly used DBMSs, how frequently they are
used together, and their patterns of replacement can assist project
managers in making informed decisions about DBMS adoption. Therefore, we
conducted a historical investigation of 317 popular open source end-user
applications developed in Java and hosted on GitHub. We determined whether
these projects had, at any point, employed any of the top 50 DBMSs as
ranked by DB-Engines. We observed that MySQL is the most used
relational DBMS, followed by PostgreSQL and H2. Among non-relational
DBMSs, Redis is the predominant choice, followed by Cassandra.
Multi-model DBMSs are top-ranked in
Infrastructure Management projects. Furthermore, we found different
combinations of subsets of 11 DBMSs being used together at the beginning
of the project life cycle (e.g., PostgreSQL and MySQL). Halfway through
the project life cycle, we found combinations of 25 DBMSs being used
together (e.g., MS SQL Server and Oracle). Finally, at the end of the
life cycle, this number increases to 29 DBMSs (e.g., Redis and H2). We
also investigated the replacements of DBMSs. We mined sequential
patterns and discovered 20 cases in which projects replaced DBMSs. For
example, we observed 11 replacements of PostgreSQL across 8 projects in
our corpus, with MySQL being a dominant replacement choice that
superseded PostgreSQL in four instances. Conversely, no project switched
from MySQL to PostgreSQL. In summary, our study offers insights into the
patterns of DBMS adoption, co-use, and replacement tendencies.
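
To make the notion of a replacement concrete, the following is a minimal sketch, not the sequential pattern mining used in the study. It assumes each project history is a list of snapshots, each snapshot being the set of DBMSs in use at that point, and it treats a DBMS that disappears between two consecutive snapshots while another one appears as a replacement. The function name, the project names, and the histories are hypothetical and are not data from our corpus.

```python
from collections import Counter
from itertools import product


def find_replacements(snapshots):
    """Return (old, new) pairs where `old` disappears and `new` appears
    between two consecutive snapshots of a single project.

    Assumption: this simple set difference stands in for a replacement;
    the study itself mined sequential patterns.
    """
    pairs = []
    for before, after in zip(snapshots, snapshots[1:]):
        removed = before - after  # DBMSs dropped at this step
        added = after - before    # DBMSs introduced at this step
        pairs.extend(product(sorted(removed), sorted(added)))
    return pairs


# Hypothetical, made-up histories (illustration only).
histories = {
    "project-a": [{"PostgreSQL"}, {"PostgreSQL"}, {"MySQL"}],
    "project-b": [{"H2"}, {"H2", "Redis"}, {"H2", "Redis"}],
    "project-c": [{"PostgreSQL", "H2"}, {"MySQL", "H2"}],
}

# Count how often each (old, new) replacement occurs across all projects.
replacement_counts = Counter(
    pair
    for snapshots in histories.values()
    for pair in find_replacements(snapshots)
)
print(replacement_counts)
```

In this sketch, adding Redis alongside H2 in project-b is co-use rather than replacement, so it produces no pair; only a removal paired with an addition in the same step is counted.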