Developer practices are shifting to prioritize migrating databases to the cloud. According to IDC, spending on foundational cloud services grew by 38.5 percent in 2021 and shows no sign of slowing down. Merv Adrian of Gartner stated that the overall database management software market grew to just under $80 billion in revenue for 2021, up $14.5 billion from 2020. According to Gartner’s research, around half of this spend ($39.2 billion) goes into cloud-based databases, either as part of new application deployments or through migration of existing apps.
This shift reflects broader trends in developer practices. For example, developers today commonly design their applications around microservices, where each element runs separately and connects to other elements via APIs. Around 85 percent of companies have adopted microservices to modernize their applications, according to research by ClearPath Strategies and solo.io. These components frequently run in containers orchestrated by a platform like Kubernetes, which makes it easier to scale services up and down based on demand.
Running in the cloud makes implementation easier on the application side. However, it increases infrastructure complexity when it comes to data. Traditional applications use a single database for all their records alongside an application server and a web server. Modern applications often have a database instance for each microservice, so tens or hundreds of database instances may be needed. All of this should be taken into consideration before deciding which databases to use, how to deploy these instances, and how the data will be managed going forward.
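To make the database-per-microservice pattern concrete, here is a minimal Python sketch of how a team might keep track of one connection setting per service. The service names, the `<SERVICE>_DATABASE_URL` naming convention, and the URLs are all assumptions made up for illustration; they do not come from any particular platform.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseConfig:
    service: str
    url: str


def load_database_configs(services, environ=None):
    """Look up one connection URL per service, e.g. ORDERS_DATABASE_URL.

    The variable naming convention is hypothetical.
    """
    environ = os.environ if environ is None else environ
    configs, missing = {}, []
    for service in services:
        key = f"{service.upper()}_DATABASE_URL"
        url = environ.get(key)
        if url:
            configs[service] = DatabaseConfig(service, url)
        else:
            missing.append(key)
    if missing:
        # Fail early so a service never starts without its own database.
        raise KeyError(f"missing database settings: {', '.join(missing)}")
    return configs


# Illustrative settings only; real values would come from the environment.
example_env = {
    "ORDERS_DATABASE_URL": "postgresql://orders-db.internal/orders",
    "BILLING_DATABASE_URL": "mysql://billing-db.internal/billing",
}
configs = load_database_configs(["orders", "billing"], example_env)
for name, config in configs.items():
    print(name, "->", config.url)
```

Even in this toy form, the list of settings grows with every service added, which is the operational overhead described above.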
More choice over cloud services
Developers will likely run more databases as part of their applications, and they will run them in the cloud alongside those applications. This raises a few decisions to consider when migrating databases to the cloud.
What database to use
The first decision is which database to use. The database should take care of any transactions as they are created, and it will be the long-term data management tool for all those entries. Developers have to know the basics of how databases work in order to access, create, modify, and delete data. Beyond the basics, however, there are multiple databases worth considering. You may even want to run different databases for specific jobs.
These databases can be closed or open source. According to DB-Engines, open source databases have become more popular over the past few years, with just over 51 percent of the market since 2021. Developers tend to prefer open source databases when they are available, as they have more options for support. The right choice depends on the use case. For example, MySQL and PostgreSQL are great choices for general data management tasks. MongoDB is useful when your developers want to get started and work with data quickly, and where relational data formatting is less important to the use case.
How to deploy your chosen database
Beyond choosing a database, you also have options for how to deploy and run it. For example, you can host the database yourself on cloud infrastructure. This effectively means using the cloud provider’s infrastructure to create and run a server in the same way you would on your own physical machine. The benefit is full control over the implementation. The downside is that you also carry full responsibility for database management tasks like backup. Many developers would rather hand these tasks over to a provider.
As an alternative, you can choose a managed service, where a provider runs your database on your behalf. This involves giving the provider access to the database servers installed in your cloud account. Someone else then takes on the jobs around management, security, and backup for each server. This makes it simpler to get up and running than operating your own instances, but it can be more expensive over time because of the operational costs the service provider charges.
Alongside managed services, there are Database as a Service (DBaaS) options that take this even further with more automation. For developers who simply want to access and run everything through APIs, DBaaS offerings can make getting an application up and running easier. This often simplifies the operational side for developers, and it further reduces the overhead of managing many of the activities involved in running a database. On the other hand, DBaaS offerings can be more expensive over time, as ongoing operational costs may end up higher than the capital expenditure they replace.
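The cost trade-off can be reasoned about with simple arithmetic. The sketch below, in Python, assumes a deliberately simplified model: self-managed hosting has a one-off setup cost plus a monthly running cost, while a DBaaS has no setup cost but a higher monthly fee. The function name and all figures are placeholders for illustration, not real prices.

```python
import math


def break_even_month(upfront_cost, self_managed_monthly, dbaas_monthly):
    """Return the first month in which cumulative DBaaS spend exceeds
    cumulative self-managed spend, or None if it never does.

    Simplified model (an assumption): costs are flat per month.
    """
    monthly_gap = dbaas_monthly - self_managed_monthly
    if monthly_gap <= 0:
        return None  # DBaaS is never more expensive under this model
    # DBaaS overtakes once month * monthly_gap is strictly above upfront_cost.
    return math.floor(upfront_cost / monthly_gap) + 1


# Placeholder figures, chosen only to exercise the function.
month = break_even_month(
    upfront_cost=12_000, self_managed_monthly=1_000, dbaas_monthly=1_500
)
print("DBaaS becomes the more expensive option in month", month)
```

A real comparison would also need to account for staff time, support contracts, and growth in data volume, all of which this sketch leaves out.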
What your choices mean
In effect, you can decide how much your team is responsible for managing your data after migrating databases to the cloud. However, you should never forget this is a shared responsibility model. Whatever approach you take around your databases, you are responsible for your data over time.
DBaaS offerings normally include data management tasks like backup. This should make it easier for your team to protect data and recover it after any problem. Pro tip: keep a close eye on how your data is set up to ensure it is managed effectively. This could mean updating your schema to keep up with changes over time. It could also mean tuning your schema to match how your applications actually query the data.
When you first implement an application, you will have a certain set of data to manage. Over time, this set of data will evolve and change, meaning you may have more data to collect. While you can use a cloud database service to manage your data, you are responsible for regularly reviewing your schema and query design. This will help you check that your approach is still the right one. If carrying out this task presents an issue, consider bringing in expert consultants to help.
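As a small illustration of this kind of schema and query review, the Python sketch below uses the standard library's `sqlite3` module as a stand-in for a cloud database. The `orders` table, its columns, and the index name are hypothetical. It applies two schema changes (a new column and an index that matches how the application filters) and compares the query plan before and after.

```python
import sqlite3

# In-memory database standing in for a real service; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, i * 1.5) for i in range(1000)],
)


def plan_for(connection, sql, params=()):
    """Return the detail strings SQLite reports for a statement's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


query = "SELECT total FROM orders WHERE customer_id = ?"
plan_before = plan_for(conn, query, (7,))

# Schema change 1: the application now needs to record a new attribute.
conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'USD'")

# Schema change 2: add an index matching the application's filter column.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

plan_after = plan_for(conn, query, (7,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, and other databases expose plans through their own `EXPLAIN` variants, but the review habit is the same: check that the schema still fits the queries the application runs.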
Similarly, you should also review how much data you store over time. Holding on to data too long or duplicating it without good reason can be a risk, particularly if your organization processes and stores data classified as personally identifiable information, or PII. Be sure to prune your data, or you can end up storing much more than you actually need.
This sensitive data requires security and risk management resources. If you regularly delete unneeded data, you can save on costs and reduce your risk profile at the same time.
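A retention job is one way to put regular pruning into practice. The following Python sketch uses `sqlite3` as a stand-in database; the `customer_events` table, its columns, and the 365-day retention period are assumptions chosen for illustration, not a recommended policy.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 365  # assumed policy; set this from your own requirements


def prune_expired(connection, now, retention_days=RETENTION_DAYS):
    """Delete rows older than the retention window and return how many went.

    Timestamps are stored as UTC ISO 8601 strings in one consistent format,
    so plain string comparison orders them correctly.
    """
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    with connection:  # commits on success, rolls back on error
        cursor = connection.execute(
            "DELETE FROM customer_events WHERE created_at < ?", (cutoff,)
        )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_events (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT)"
)
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO customer_events (email, created_at) VALUES (?, ?)",
    [
        ("old@example.com", (now - timedelta(days=400)).isoformat()),
        ("recent@example.com", (now - timedelta(days=30)).isoformat()),
    ],
)

deleted = prune_expired(conn, now)
remaining = [row[0] for row in conn.execute("SELECT email FROM customer_events")]
print(f"deleted {deleted} row(s); remaining: {remaining}")
```

In a real deployment the same statement would typically run on a schedule, and backups and replicas holding the same PII would need their own retention handling.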
What else to consider when migrating databases to the cloud?
Managing data over time involves knowing why you are holding that data and what you will use it for. Your decisions here should also cover the infrastructure you use, and why you have decided on that approach. For example, alongside the decision to operate your applications in containers and orchestrate them using Kubernetes, you may also want to run your databases in the same fashion. This can help avoid lock-in, as you should be able to move your database containers to another location or provider when you need to.
Similarly, using an open source database, or a cloud service that is compatible with the standard open source builds, can provide you with a way to avoid lock-in. Don’t like your cloud provider’s service for MySQL? Then you can move your workload to another service, or implement your own MySQL server.
The challenge here is that not all cloud services based on open source are fully compatible with the upstream open source version. It’s important to ask whether you can easily change your approach once it is implemented. Some migration projects are easy to change after you have started. Others need you to get this right from the start. This is a reason to look carefully at the service you choose when migrating databases to the cloud, and to check whether it is fully compatible with your data needs and your database(s) of choice before committing to production.
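One practical way to test compatibility before committing is to run the SQL your application relies on against the candidate service and record what fails. The Python sketch below is a minimal version of that idea, assuming a standard DB-API 2.0 connection; the function name and sample statements are hypothetical, and `sqlite3` stands in for the candidate service so the example is self-contained.

```python
import sqlite3


def find_incompatible_statements(connection, statements, error_type):
    """Run each statement and return {statement: error message} for failures.

    `error_type` is the driver's base exception class (every DB-API driver
    defines its own). Work is rolled back after each statement so the
    check leaves no data behind.
    """
    failures = {}
    for sql in statements:
        cursor = connection.cursor()
        try:
            cursor.execute(sql)
        except error_type as exc:
            failures[sql] = str(exc)
        finally:
            cursor.close()
            connection.rollback()
    return failures


# Statements an application might depend on; NOW() is not available in SQLite.
statements = ["SELECT 1", "SELECT NOW()"]

conn = sqlite3.connect(":memory:")
failures = find_incompatible_statements(conn, statements, sqlite3.Error)
for sql, message in failures.items():
    print(f"incompatible: {sql!r} ({message})")
```

A statement list drawn from your real queries and migrations gives a more useful signal than a feature matrix, although it cannot catch behavioral differences such as performance or default settings.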
Using open source databases offers a degree of freedom for your workloads: you have the choice of how to run them and how to support them over time. You can use compatible services from your cloud provider of choice, or opt for a full database-as-a-service offering. However, you should not take that freedom for granted. You retain the responsibility for how your databases operate and how your data is organized. While you may move to the cloud for faster deployment and simpler operations, understanding your choices and your responsibilities can help you improve your data strategy in the long term.
Matt Yonkovit is Head of Open Source Strategy at Percona, an open source database company. He focuses on helping developers, architects, and DBAs get the most out of their data. He has been in the open-source database community for over 15 years working for MySQL AB, Sun Microsystems, Mattermost, and Percona.