Resuming a managed instance

This page documents how to resume a suspended managed instance. This is useful when the Sales team would like to re-engage a customer whose managed instance was suspended.

General setup

Managed instances configuration is tracked in deploy-sourcegraph-managed. Make sure you have the latest revision of this repository checked out. For basic operations like accessing an instance for these steps, see managed instances operations.

First, ensure you have the prerequisites installed and up-to-date:

Many of the commands in this guide, as well as the commands in managed instances operations, use environment variables. Export the appropriate values before you start the resume so you don’t lose track:

# name of customer deployment (should match folder)
export CUSTOMER=<customer_or_instance_name>

# the previous active deployment, either `red` or `black`
export CURRENT_DEPLOYMENT=$(gcloud compute snapshots list --format json --sort-by '~creationTimestamp' --limit 1 | jq -r '.[0].name' | awk -F: '{ if ($1 ~ "-red-") {print "red"} else {print "black"} }')

# The value can be found in the Managed Instances vault
# https://my.1password.com/vaults/nwbckdjmg4p7y4ntestrtopkuu/allitems/d64bhllfw4wyybqnd4c3wvca2m
export TF_VAR_opsgenie_webhook=<OpsGenie Webhook value>

export TF_VAR_cf_origin_cert_base64=$(gcloud secrets versions access latest --project=sourcegraph-dev --secret="SOURCEGRAPH_WILDCARD_CERT" | base64)
export TF_VAR_cf_origin_private_key_base64=$(gcloud secrets versions access latest --project=sourcegraph-dev --secret="SOURCEGRAPH_WILDCARD_KEY" | base64)
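
Because later steps depend on these variables, it can help to confirm that none of them is empty before continuing. The following is a minimal sketch, not part of deploy-sourcegraph-managed: missing_vars is a hypothetical helper name, and it only checks that each named variable has a non-empty value, not that the value is correct.

```shell
# Sketch: print the names of variables that are unset or empty.
# missing_vars is a hypothetical helper, not a script from the repository.
# The body runs in a subshell so its temporary variables do not leak.
missing_vars() (
  missing=""
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="${missing}${missing:+ }${name}"
    fi
  done
  printf '%s\n' "$missing"
  # Succeed only when nothing is missing.
  [ -z "$missing" ]
)
```

For example, missing_vars CUSTOMER CURRENT_DEPLOYMENT TF_VAR_opsgenie_webhook TF_VAR_cf_origin_cert_base64 TF_VAR_cf_origin_private_key_base64 prints the names that still need a value and returns a non-zero status if any are missing.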

Make sure your copy of the deploy-sourcegraph-managed repository is up to date:

git checkout main
git pull origin main

Sourcegraph resume

0) Sourcegraph resume setup

Follow the general setup guide above. Use the same shell for all the commands in this guide unless otherwise stated, so that the exported variables remain available.

Now start a branch for the resume:

git checkout -b $CUSTOMER/resume-instance
# all the below steps are documented assuming you are in the customer deployment directory
cd $CUSTOMER

1) Prepare terraform module

Uncomment the following resources in infrastructure.tf:

  • google_compute_network_endpoint.primary
  • google_monitoring_uptime_check_config.primary
  • google_monitoring_alert_policy.primary

In terraform.tfvars, remove the empty deployments list and the empty disks map, then uncomment the previous deployments and disks variables.

Still in terraform.tfvars, replace <snapshot_name> with the name of the snapshot taken during the suspend process. The disk to restore from depends on the previous active deployment. Run gcloud compute snapshots list to obtain the snapshot name.

For example, if $CURRENT_DEPLOYMENT is red, restore the red disk as shown here; if it is black, use black as the key instead.

disks = {
  red = { from_snapshot = "default-red-data-disk--upgrade-from-<>" }
}

Then generate a plan and review it:

terraform plan -out resume.plan

If everything looks good to you, apply the plan:

terraform apply resume.plan
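
To avoid mistyping the disks entry edited earlier in this step, it can be generated from the deployment color and the snapshot name. This is only a sketch: render_disks is a hypothetical helper, not a script in deploy-sourcegraph-managed, and its output simply mirrors the disks example shown in this step.

```shell
# Sketch: print a disks map for terraform.tfvars.
# render_disks is a hypothetical helper, not a script from the repository.
# $1: deployment color (red or black), $2: snapshot name
render_disks() {
  case "$1" in
    red|black) ;;
    *) echo "deployment must be red or black, got: $1" >&2; return 1 ;;
  esac
  printf 'disks = {\n  %s = { from_snapshot = "%s" }\n}\n' "$1" "$2"
}
```

Calling render_disks "$CURRENT_DEPLOYMENT" followed by the snapshot name prints a block that can be compared against, or pasted into, terraform.tfvars. It deliberately does not edit the file itself, so the change still goes through a manual review and terraform plan.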

2) Make database writable

SSH into the deployment and wait until the Postgres services are healthy (instructions), then run the following:

../util/set-db-readonly.sh $CURRENT_DEPLOYMENT false
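
Waiting for the database to become healthy can be scripted with a small polling loop. The sketch below assumes you already have a command that exits with status 0 once Postgres is healthy; this page does not specify that command, so it is left as a placeholder. retry_until is a hypothetical helper name.

```shell
# Sketch: run a command repeatedly until it succeeds or attempts run out.
# retry_until is a hypothetical helper, not a script from the repository.
# Usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until() {
  retry_attempts="$1"
  retry_delay="$2"
  shift 2
  retry_i=1
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    # Give up once the last allowed attempt has failed.
    [ "$retry_i" -ge "$retry_attempts" ] && return 1
    retry_i=$((retry_i + 1))
    sleep "$retry_delay"
  done
}
```

For example, retry_until 30 10 followed by your own health check command polls every 10 seconds for up to 30 attempts. Run the set-db-readonly.sh command above only after it returns successfully.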

3) Verify instance health

Consult the upgrade process for more detail.

4) Commit your change

git add . && git commit -m "$CUSTOMER: resume instance" && git push origin HEAD

Then click the provided link to open a pull request in deploy-sourcegraph-managed.

IMPORTANT: DO NOT FORGET TO GET YOUR PR APPROVED AND MERGED. If you forget, the next person touching the instance will have a very bad time.

5) Upgrade as needed

Check whether the instance is running the latest version of Sourcegraph. First, obtain the current version:

cat $CURRENT_DEPLOYMENT/VERSION

Go to deploy-sourcegraph-docker and compare the current version with the latest tag.
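
The comparison can also be done in the shell. This is a sketch under assumptions: it assumes both the VERSION file and the tag are dotted numeric versions with an optional leading "v", and that your sort supports the -V (version sort) option. version_lt is a hypothetical helper name.

```shell
# Sketch: succeed when version $1 is strictly older than version $2.
# version_lt is a hypothetical helper, not a script from the repository.
# Assumes dotted numeric versions (optional leading "v") and sort -V.
version_lt() {
  ver_a="${1#v}"
  ver_b="${2#v}"
  # Equal versions are not "older"; otherwise the older one sorts first.
  [ "$ver_a" != "$ver_b" ] &&
    [ "$(printf '%s\n%s\n' "$ver_a" "$ver_b" | sort -V | head -n 1)" = "$ver_a" ]
}
```

For example, version_lt "$(cat $CURRENT_DEPLOYMENT/VERSION)" followed by the latest tag returns success when an upgrade is needed.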

Follow the upgrade process as needed.