Deploying and managing database changes can be a challenging task, especially when dealing with multiple environments and numerous database instances. In this blog post, we'll explore how Flyway simplifies the process of database deployments and enables scalability. With Flyway, you can effortlessly manage deployments across multiple database endpoints, ensuring consistency and efficiency. Let's dive into the world of scalable database deployments with Flyway.
I recently had a conversation with a client who shared their concerns about deploying to a large number of production databases. They highlighted the challenge they face in regularly updating and keeping these databases in sync with each other. One day they might have to deploy to 20 databases, and the next day it could be 2000. The current manual approach to deployments is simply not feasible without sacrificing the quality of deployments or development.
Fortunately, I'm no stranger to this problem. While I hadn't encountered it on such a large scale before, I was confident that a robust and scalable system could handle any load. This confidence led me to explore solutions to address the two main problems faced when deploying to a large number of production databases:
1) Having a Reliable way to keep the constantly changing estate of databases in sync, without manual deployments
Manually running each script for each database instance is time-consuming and prone to errors. To tackle this challenge, we need to automate the deployment process while ensuring synchronization across all databases. It becomes even more critical when compliance requirements come into play. We must avoid deploying changes to databases that don't require them and establish a reliable mechanism to track and manage the state of each database.
2) Deploying in a repeatable fashion
To achieve scalability, we require a deployment pipeline that can handle the entire estate of databases, regardless of their size, on a daily basis. The pipeline should enable deployments without the need to go in and manually script out certain aspects, but instead, a singular repeatable process that can. Building such a repeatable deployment pipeline can be extremely challenging without a tool like Flyway to support and streamline the process.
Versioned Scripts
One of the key features that make Flyway a powerful tool for scalable deployments is its support for versioned scripts. With Flyway, you no longer need to worry about the state of each individual database instance. Instead, you can maintain versioned migration scripts that represent incremental changes to the database schema and data. This approach simplifies the process of keeping track of modifications and ensures consistent deployments across different environments.
Let's consider an example to understand the pain points that arise when dealing with a larger number of databases without proper versioning. Suppose you rely on legacy tooling or manual scripting for database deployments. As the number of databases increases, you encounter a challenge when versions become mismatched. Suddenly, you cannot script out a single deployment for all databases, resulting in a significant amount of manual work.
As depicted in the image, some databases are newly provisioned in production and are out of sync with the rest. Additionally, there are databases that are missing the latest version and are one version behind. This situation prevents you from deploying a single script to all databases in a straightforward manner. Instead, you would need to script out different deployments for each database, some receiving all migration scripts, some receiving two or three, and some receiving only the new script. This manual approach undermines the automation efforts you had in mind initially.
However, with versioned scripts and Flyway, the engine itself becomes aware of the deployed scripts and their status. Flyway is intelligent enough to know which scripts each database needs, taking the burden off your shoulders, enabling fluid automation, and allowing for many further improvements to your process as the burden becomes lighter.
For example, in the image below, you can see that there are three pending scripts according to Flyway. If we had another DB only expecting one, we can simply tell Flyway to migrate both and watch the engine do the heavy lifting.
By utilizing Flyway's versioned scripts, you gain the ability to automate deployments efficiently and reliably. Flyway keeps track of the scripts that have been applied to each database, ensuring that only the necessary scripts are deployed during each migration. This simplifies the management of deployments across multiple databases, allowing for scalability and eliminating manual interventions.
Scaling, Automated Deployments
Now we know that the versions and individual script deployments are being handled by Flyway, we can focus on how we are actually going to deploy these Flyway commands, with the scripts needed for each DB in a potential 2000 estate.
To streamline this process, we can leverage Flyway, along with a templated YAML configuration. This setup allows for automated and parallel deployments to different databases and is completely scalable.
The complete code for this solution can be found in my repository, but please remember this requires careful implementation and examination.
1. Templated YAML Configuration
The main YAML file serves as the pipeline configuration. It includes variable groups, resources, and stages. The "resources" section specifies the location of the template repository. The "variables" section imports a template for each stage, which will be used to define the actual CLI needed to perform each stage, making scaling much easier with less code needed. The "stages" section defines the sequence of deployment steps, which we will outline below to maximise flyways use and repeatability, effectiveness and automation.
resources:
repositories:
- repository: templates
type: git
name: templates
variables:
- template: csv-vars.yml@templates
- template: build.yml@templates parameters: stage: Build displayName: Deploy Build executeBuild: true targetCredentials: Tundra_build_vars pipelineParameters: pipeline_vars - template: csv-deploy.yml@templates parameters: stage: TestDeploy displayName: Deploy Test dependsOn: Build baselineVersion: $(BASELINE_VERSION) flywayLicenseKey: $(FLYWAY_LICENSE_KEY) workingDirectory: $(WORKING_DIRECTORY) deploymentTargetsList: deployment-targets-batch-1.csv pipelineParameters: pipeline_vars
2. Build Stage
The build stage, defined in the "build.yml" template, is responsible for executing the build process. This stage can be customised to accommodate the specific build requirements of your project. In mine, it is responsible for validating all my scripts, both forward and undo, and stands up my database in a non-live environment. This reassures my changes are stable, as well as allowing me to automate some unit tests before deploying to a live environment.
The code for this, like every other stage mentioned, is in my repo!
3. Deployment Stages
The deployment stages, defined in the "csv-deploy.yml" template, handle the actual database deployments. They are designed to be executed in parallel, allowing for scalability and automation, for each new agent you desire you simply need to copy the five lines to call it as seen above, each that can deploy to as many endpoints as desired.
Each deployment stage requires parameters such as the stage name, display name, dependencies, baseline version, Flyway license key, working directory, and the CSV file containing the deployment targets.
The deployment stage, in itself simply passes the list of JDBCs given to it, to a copy of the PowerShell commands needed to deploy it. This means each agent can run a copy of the code, and deploy in parallel to its list. Remember, each time it deploys Flyway will work out what its target needs to be up to date!
stages: - stage: ${{parameters.stage}} pool: $(AGENT_POOL) displayName: ${{parameters.displayName}} dependsOn: ${{parameters.dependsOn}} jobs: - job: Deploy displayName: Deployment variables: - group: ${{parameters.pipelineParameters}} - group: redgate_global_vars steps: - task: CmdLine@2 inputs: script: | git checkout migrations git pull displayName: 'Commit and Push Migration Script' - task: PowerShell@2 displayName: 'Deploy From CSV Batch List' inputs: targetType: 'filePath' filePath: 'PowershellDeploymentDefinition.ps1' arguments: -deploymentTargetsList ${{ parameters.deploymentTargetsList }} -baselineVersion ${{ parameters.baselineVersion }} -flywayLicenseKey ${{ parameters.flywayLicenseKey }} -workingDirectory ${{ parameters.workingDirectory }}
4. PowerShell Deployment Script
The PowerShell script is responsible for executing the database deployments based on the JDBC targets provided in the CSV file. It loops through each row of the CSV file and deploys the changes to the respective databases in parallel. This is where we can see the flyway command, and notice how it has certain Flyway parameters attached to take into account varying versions and needs which can be customised.
param ($deploymentTargetsList, $baselineVersion, $flywayLicenseKey, $workingDirectory)
write-host "deploymentTargetsList: $deploymentTargetsList"
Import-csv -path $deploymentTargetsList |
Foreach-object {
$currentJDBC = $_.JDBCString
$deployment = "flyway migrate -outOfOrder='true' -baselineOnMigrate='true' -errorOverrides='S0001:0:I'- -baselineVersion='" + $baselineVersion + "' -licenseKey='" + $flywayLicenseKey + "' -configFiles=" + $workingDirectory + "\flyway.conf" + " -locations=filesystem:" + $workingDirectory + '\migrations' + " -url='" + $currentJDBC + "'"
write-host "flywayCommand:" + $deployment
write-host "deploying to: $currentJDBC"
Invoke-Expression $deployment
}
In this blog post, we explored an approach for automating database deployments using Flyway, a powerful database migration tool. By combining Flyway with a templated YAML configuration, a YAML caller script, and a PowerShell deployment script, we achieved scalable and automated deployments to multiple databases in parallel.
We began by understanding the importance of versioned scripts in Flyway, which allow for incremental changes to the database schema and data. With versioned scripts, we can ensure consistent deployments across different environments and eliminate the challenges of managing individual database instances manually. We then analysed the actual code needed to implement this solution inside Azure DevOps or another CI/CD tool (with some potential translation needed, happy to help).
Throughout the blog post, we emphasised the importance of scalability and automation in database deployments. The solution we presented allowed for seamless scaling as the number of databases increased. By leveraging parallel deployments, we saved time and ensured consistent deployments across multiple environments.
The full code for this solution, including the YAML files, PowerShell script, and CSV files, is available in my repository.
Feel free to explore the repository for a more detailed understanding of the implementation.
Thanks
Comments