Zero-downtime Clustered Deployment of WebDeploy Packages Via PowerShell

Web applications that require high scale or redundancy (very common in enterprise environments) must be deployed to multiple web servers, with traffic distributed to those servers by a load balancer or proxy of some type. These applications often require no-downtime deployments as well. In this post, we will review how our team deploys our MSDeploy/WebDeploy packages to multiple servers with no downtime.

The above is a simplified visualization of our CD pipeline workflow. We will review step 3.2 in this post; the other steps are described in these earlier posts:

https://www.dotnetcatch.com/2016/10/14/queue-tfsvsts-builds-via-powershell/
https://www.dotnetcatch.com/2016/11/03/cd-pipelines-for-net-in-thoughtworks-go/

Deployment Downtime

All traffic routes through the load balancer/proxy to be distributed to the available nodes. We must tell the load balancer to stop sending traffic to the node we want to deploy to; otherwise, users will receive errors as files are updated, added, or deleted during the deployment. Additionally, we have to wait for the node to complete any queued requests it has already received before we deploy.

Step one is to disable the node on the load balancer/proxy so it stops sending traffic.

Load Balancer API

We, like many, use a load balancer appliance to distribute web traffic to our clustered web applications. Regardless of the type of load balancer you use (hardware, software, SaaS, etc.), most of them offer some type of programmatic interface to control the creation, enabling, and disabling of nodes behind a virtual endpoint. That said, some APIs are easier to use than others. REST web services are ideal, but that wasn’t an option for us.

Load Balancer Node Monitoring

Beyond direct interaction with an interface, most load balancers also support some form of monitoring to automatically enable/disable endpoints. We have an HTML file that, if present, tells the load balancer the node is working and available to serve requests. The load balancer checks for the file on some schedule, typically every few seconds, with a timeout.

In our case we simply rename the file using the IIS Administration PowerShell cmdlets and then wait for 30 seconds to make sure the monitor has time to fail:

$monitorFilePath = "IIS:\Sites\$($websiteName)\monitor.htm"
$monitorFileDisabledName = 'monitor.disabled.htm'
$monitorFileDisabledPath = "IIS:\Sites\$($websiteName)\$monitorFileDisabledName"
Invoke-Command -ComputerName $server -ScriptBlock {
    Param($monitorFilePath, $monitorFileDisabledName, $stopWebsiteForDeploymentWaitSeconds)
    # Call an IIS cmdlet first; without this, Test-Path could not resolve the IIS:\ path
    Get-Website > $null
    Write-Host "test path = $(Test-Path $monitorFilePath)"
    if (Test-Path $monitorFilePath) {
        # Rename the monitor file so the load balancer health check starts failing
        $monitorFile = Get-Item $monitorFilePath
        $monitorFile | Rename-Item -NewName $monitorFileDisabledName
        Write-Host "Disabled monitor.htm file."

        # Wait for traffic to dissipate
        Write-Host "Waiting $stopWebsiteForDeploymentWaitSeconds seconds for load balancer to update..."
        Start-Sleep -Seconds $stopWebsiteForDeploymentWaitSeconds
    }
} -ArgumentList $monitorFilePath, $monitorFileDisabledName, $stopWebsiteForDeploymentWaitSeconds

We run this script from our GO! agent, but the file we want to rename lives on a remote server, so we wrap the IIS script in an Invoke-Command call and set ComputerName to the remote server. The IIS Administration cmdlets add an IIS: drive that lets Test-Path search the IIS site list, but that didn’t work unless we first called Get-Website. We never confirmed why; the most likely explanation is that the IIS: drive only exists once the WebAdministration module is loaded, and calling Get-Website triggers that load. If the monitor.htm file is found on the site, we change its name using the Rename-Item cmdlet and then call Start-Sleep to wait for the load balancer monitor to fail.
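The 30-second wait is a judgment call that depends on how the monitor is configured. As a rough sketch, assuming the load balancer polls at a fixed interval and marks a node down after a set number of consecutive failed checks, the worst-case wait could be estimated as follows. The function name, parameter names, and sample values are illustrative and not taken from our load balancer.

```powershell
# Hypothetical helper: estimates how long to wait after renaming the monitor
# file before the load balancer has marked the node as down.
# Assumes a fixed polling interval and a consecutive-failure threshold.
function Get-MonitorDrainSeconds {
    param(
        [Parameter(Mandatory)] [int] $IntervalSeconds,
        [Parameter(Mandatory)] [int] $FailuresToMarkDown,
        [int] $SafetyMarginSeconds = 5
    )
    # Worst case: the file is renamed right after a successful check, so each
    # required failure costs one full polling interval.
    return ($IntervalSeconds * $FailuresToMarkDown) + $SafetyMarginSeconds
}

# Example values only; substitute your own monitor settings.
$stopWebsiteForDeploymentWaitSeconds = Get-MonitorDrainSeconds -IntervalSeconds 5 -FailuresToMarkDown 3 -SafetyMarginSeconds 10
Write-Host "Waiting $stopWebsiteForDeploymentWaitSeconds seconds"
```

The safety margin is there to cover requests that were already in flight when the node was marked down.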

Stop the Site

Next, we want to stop the IIS site using the Stop-Website cmdlet. This will wait for all pending requests to complete and gracefully stop the website.

Invoke-Command -ComputerName $server { Param($websiteName); Stop-Website -Name "$websiteName" } -ArgumentList $websiteName

For more information about the Stop-Website cmdlet visit the following link:

https://technet.microsoft.com/en-us/library/hh867853(v=wps.630).aspx

Once the site is stopped, you can rename the monitor file back to its correct name, because the stopped site won’t respond to monitor check requests. This is only needed if you use virtual applications in IIS. We use a mix of websites and virtual applications under sites, so in some cases we stop the parent website and then deploy the virtual application. That deployment won’t replace the renamed monitor file in the website root, so without the rename the monitor would be stuck in a failed state.
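A minimal sketch of that rename-back step is shown here. The function name and its parameters are hypothetical; it mirrors the Get-Item and Rename-Item calls used when disabling the monitor and works against any provider path, so on the web server it could be pointed at "IIS:\Sites\$websiteName" once the WebAdministration module is loaded.

```powershell
# Hypothetical helper: renames the disabled monitor file back to its original
# name if the disabled copy exists. Returns $true when a file was restored.
function Restore-MonitorFile {
    param(
        [Parameter(Mandatory)] [string] $SiteRoot,
        [string] $MonitorFileName = 'monitor.htm',
        [string] $DisabledFileName = 'monitor.disabled.htm'
    )
    $disabledPath = Join-Path $SiteRoot $DisabledFileName
    if (Test-Path $disabledPath) {
        # Same pattern as the disable step, with the names swapped
        Get-Item $disabledPath | Rename-Item -NewName $MonitorFileName
        Write-Host "Restored $MonitorFileName."
        return $true
    }
    return $false   # nothing to restore
}
```

Returning a boolean lets the calling script log whether the restore actually happened, which is useful when the same script handles both websites and virtual applications.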

One final note here: my assumption is that Stop-Website is a blocking call and will return an error if the site cannot be stopped, but I haven’t found documentation explicitly stating this. We might end up adding some script to verify the site is stopped before continuing to the deployment.
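Such a verification step might look like the sketch below. It is not part of our current script. Wait-Until is a hypothetical helper that polls any condition with a timeout; the commented usage assumes the WebAdministration module's Get-WebsiteState cmdlet, whose Value property reports the site state.

```powershell
# Hypothetical helper: polls a condition until it returns $true or a timeout
# elapses. Returns $true on success and $false on timeout.
function Wait-Until {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds = 60,
        [int] $PollMilliseconds = 1000
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        if (& $Condition) { return $true }
        Start-Sleep -Milliseconds $PollMilliseconds
    } while ((Get-Date) -lt $deadline)
    return $false
}

# Intended use on the web server (requires the WebAdministration module):
#   $stopped = Wait-Until { (Get-WebsiteState -Name $websiteName).Value -eq 'Stopped' }
#   if (-not $stopped) { throw "Site '$websiteName' did not stop in time." }
```

Failing the deployment when the site does not stop is safer than deploying over a site that may still be serving requests.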

ASP.NET AppOffline

ASP.NET offers a really neat feature whereby you can instruct the site to answer all requests with an AppOffline page (App_Offline.htm) if it’s present. WebDeploy also supports enabling AppOffline during a deployment and even providing a custom AppOffline template page:

https://blogs.iis.net/msdeploy/webdeploy-3-5-rtw

This could be used to disable the monitor and stop the site from serving new requests (queued requests are processed normally). In our case, the monitor is a static HTML file and is not processed through the ASP.NET pipeline. Sites in AppOffline mode still serve static files so this approach doesn’t work in our situation. We could have changed our monitor file to be an .aspx file but we didn’t want to update our 50+ websites.
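For readers whose monitor does run through the ASP.NET pipeline, here is a sketch of how the rule could be switched on. The -enableRule:AppOffline flag comes from the WebDeploy 3.5 post linked above; the function name, package path, and server name are placeholders, and we do not use this in our own pipeline.

```powershell
# Hypothetical helper: builds msdeploy.exe arguments for a package sync with
# the AppOffline rule enabled. Package path and server are supplied by the caller.
function Get-AppOfflineSyncArgs {
    param(
        [Parameter(Mandatory)] [string] $PackagePath,
        [Parameter(Mandatory)] [string] $Server
    )
    return @(
        '-verb:sync',
        "-source:package=$PackagePath",
        "-dest:auto,computerName=$Server",
        '-enableRule:AppOffline'   # takes the app offline for the duration of the sync
    )
}

# Example (not executed here; the path and server are placeholders):
#   & msdeploy.exe (Get-AppOfflineSyncArgs -PackagePath 'C:\drop\MyWebsite.zip' -Server 'QAServer1')
```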

Deploy the Package

Finally we will deploy the WebDeploy/MSDeploy package and restart the website:

& $deployCommand /M:$server /Y -setParamFile:$setParamFilename $build.additionalMSDeployArgs -verbose 2>&1

Write-Host "Starting Website '$websiteName' on '$server'"
Invoke-Command -ComputerName $server { Param($websiteName); Start-Website -Name "$websiteName" } -ArgumentList $websiteName
Write-Host "Waiting $stopWebsiteForDeploymentWaitSeconds seconds for website to start..."
Start-Sleep -Seconds $stopWebsiteForDeploymentWaitSeconds

Again, we wait (reusing the same wait interval as before) to make sure the site has time to initialize and the load balancer monitor succeeds.
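Putting the steps in order, the per-node sequence can be sketched as below. This is an outline under assumptions, not our actual script: Invoke-RollingDeployment is a hypothetical function, and the step bodies are stand-ins for the snippets shown earlier. The point is that each server finishes every step before the next one starts, so the remaining nodes keep serving traffic.

```powershell
# Hypothetical orchestration: runs the per-node steps one server at a time.
# Each step is a script block that receives the server name.
function Invoke-RollingDeployment {
    param(
        [Parameter(Mandatory)] [string[]] $Servers,
        [Parameter(Mandatory)] [System.Collections.Specialized.OrderedDictionary] $Steps
    )
    foreach ($server in $Servers) {
        foreach ($stepName in $Steps.Keys) {
            Write-Host "[$server] $stepName"
            $step = $Steps[$stepName]
            & $step $server
        }
    }
}

# Stand-in steps; replace each body with the real work for that stage.
$steps = [ordered]@{
    'Disable monitor and wait' = { param($server) Write-Host "  would rename monitor.htm on $server" }
    'Stop website'             = { param($server) Write-Host "  would stop the site on $server" }
    'Deploy package'           = { param($server) Write-Host "  would run the deploy command for $server" }
    'Start website and wait'   = { param($server) Write-Host "  would start the site on $server" }
}

Invoke-RollingDeployment -Servers 'Server1', 'Server2' -Steps $steps
```

An ordered dictionary is used because a plain hashtable does not guarantee that the steps run in the order they were declared.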

Multiple Deployments

We perform the steps above for each server in the cluster. We use a JSON file to configure the nodes we wish to deploy to:

{
  "builds": [
    {
      "name": "MyWebsite (Pkg)",
      "teamProject": "MyTeamProject",
      "environments": [
        {
          "name": "DEV",
          "servers": [ "DEVServer1" ]
        },
        {
          "name": "QA",
          "servers": [ "QAServer1", "QAServer2" ]
        },
        {
          "name": "MOCK",
          "servers": [ "MOCKServer1" ]
        },
        {
          "name": "Release",
          "servers": [ "Server1", "Server2" ]
        }
      ]
    },
    {
      "name": "MyApp.Db.Sql.Deploy (Pkg)",
      "teamProject": "MyTeamProject",
      "environments": [
        {
          "name": "DEV"
        },
        {
          "name": "QA"
        },
        {
          "name": "MOCK"
        },
        {
          "name": "Release"
        }
      ]
    }
  ]
}

We parse the JSON and loop through the builds/environments to perform the deployments:

...
$builds = $deployConfigJson.builds

    foreach($build in $builds){
        $environment = $build.environments | Where-Object {$_.name -eq $environmentName} | Select-Object -First 1
        Write-Host "Environment = $environment"

        if ($environment -ne $null)
        {
            $buildName = $build.name
            Write-Host "Build = $buildName"
            $teamProject = $build.teamProject
            Write-Host "Team Project = $teamProject"
...
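Since the fragment above is elided, here is a self-contained sketch of the same idea: parse the configuration with ConvertFrom-Json and list every build/server pair for one environment. Get-DeployTargets is a hypothetical helper, and the inlined JSON is a trimmed copy of the file shown earlier.

```powershell
# Trimmed copy of the configuration shown above, inlined so the sketch runs on its own.
$deployConfigJson = @'
{
  "builds": [
    { "name": "MyWebsite (Pkg)", "teamProject": "MyTeamProject",
      "environments": [
        { "name": "DEV", "servers": [ "DEVServer1" ] },
        { "name": "QA",  "servers": [ "QAServer1", "QAServer2" ] }
      ] },
    { "name": "MyApp.Db.Sql.Deploy (Pkg)", "teamProject": "MyTeamProject",
      "environments": [ { "name": "DEV" }, { "name": "QA" } ] }
  ]
}
'@ | ConvertFrom-Json

# Hypothetical helper: emits one object per build/server pair for an environment.
function Get-DeployTargets {
    param(
        [Parameter(Mandatory)] $Config,
        [Parameter(Mandatory)] [string] $EnvironmentName
    )
    foreach ($build in $Config.builds) {
        $environment = $build.environments |
            Where-Object { $_.name -eq $EnvironmentName } |
            Select-Object -First 1
        if ($null -eq $environment) { continue }
        # Builds without a "servers" list (the database package here) yield no targets.
        foreach ($server in $environment.servers) {
            [pscustomobject]@{ Build = $build.name; Server = $server }
        }
    }
}

Get-DeployTargets -Config $deployConfigJson -EnvironmentName 'QA'
```

Flattening the configuration into build/server pairs keeps the deployment loop itself simple: each pair is one pass through the disable, stop, deploy, and start steps.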

Summary

Zero-downtime deployments can be a bit tricky to implement, and the process will likely differ across development shops based on their infrastructure, but it is possible. It is often hard to share these types of scripts broadly, but hopefully this post helps you think through the process.

The full deploy package script can be viewed here.

If this was helpful to you or you have questions, please comment below or hit me up on Twitter.