Foot-guns are the common pitfalls that can lead to unintended consequences or issues when deploying Helm charts.

Here are some common foot-guns to avoid:
Beware of setting enabled: false in the values YAML file
You might think that setting enabled to false merely excludes the service from the upgrade, but that is NOT the case.
If you set enabled: false in the values file, the service will be DELETED from the cluster on the next upgrade, because Helm removes resources that the chart no longer renders.
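One way to see this coming before running the upgrade is to compare what is currently deployed with what the chart would render from the edited values. The sketch below assumes the deployed manifest has been saved with helm get manifest and the new one with helm template. The function names list_resources and removed_resources and the file names are made up for illustration, and the awk parsing only handles conventionally formatted manifests.

```shell
# Print "Kind/name" for each resource in a multi-document YAML manifest on stdin.
# Assumes conventional formatting: a top-level "kind:" and a two-space-indented
# "name:" directly under "metadata:".
list_resources() {
  awk '
    function flush() {
      if (kind != "" && name != "") print kind "/" name
      kind = ""; name = ""
    }
    /^---/                { flush(); in_meta = 0; next }
    /^kind:/              { kind = $2; in_meta = 0; next }
    /^metadata:/          { in_meta = 1; next }
    /^[^ ]/               { in_meta = 0 }
    in_meta && /^  name:/ { name = $2 }
    END                   { flush() }
  '
}

# List resources present in the deployed manifest ($1) but missing from the
# newly rendered manifest ($2): candidates for deletion by the upgrade.
removed_resources() {
  old_list=$(mktemp) && new_list=$(mktemp) || return 1
  list_resources < "$1" | sort > "$old_list"
  list_resources < "$2" | sort > "$new_list"
  comm -23 "$old_list" "$new_list"
  rm -f "$old_list" "$new_list"
}

# Example (release, chart, and namespace names are placeholders):
#   helm get manifest my-release --namespace my-namespace > deployed.yaml
#   helm template my-release my-chart -f values.yaml --namespace my-namespace > rendered.yaml
#   removed_resources deployed.yaml rendered.yaml
```

Any line printed by removed_resources is a resource that exists today and is absent from the new render, so it deserves a second look before the upgrade goes ahead.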
Always upgrade the LOWER environments before upgrading upper environments
Upgrade the environments in order, from lower to upper. This way, problems surface in a lower environment before they can reach the upper ones.
For example, if you have dev, SIT, UAT, and PROD environments, you should ALWAYS:
- Upgrade the services in the dev environment first
- Test the applications in the dev environment
- Upgrade the services in the SIT environment
- Test the applications in the SIT environment
- Upgrade the services in the UAT environment
- Test the applications in the UAT environment
- Only then upgrade the services in the PROD environment
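The ordering above can be enforced by a small wrapper that stops at the first environment where the upgrade or the tests fail. This is only a sketch: upgrade_env and smoke_test are hypothetical hooks standing in for your real upgrade command and test suite, and the environment names are taken from the example.

```shell
# Hypothetical hooks: replace the bodies with your real upgrade and test commands.
upgrade_env() { echo "upgrading $1"; }
smoke_test()  { echo "testing $1"; }

# Upgrade and test each environment in the order given, lowest first.
# Stops at the first failure so a bad change does not reach a higher environment.
promote() {
  for env in "$@"; do
    upgrade_env "$env" || { echo "upgrade failed in $env, stopping" >&2; return 1; }
    smoke_test "$env"  || { echo "tests failed in $env, stopping" >&2; return 1; }
  done
}

# Example: promote dev sit uat
```

PROD is left out of the example call on purpose, on the assumption that a production upgrade stays a separate, deliberate step.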
Get the helm changes reviewed by a senior engineer - Follow the Checklist
Since the helm charts are managed in a git repository, it is ALWAYS good practice to get the changes reviewed by a senior engineer.
Here is a checklist of the typical workflow. You can copy this checklist and follow it step by step when running a helm upgrade.
- START
- Clone the repository
- Check out the source branch for your environment. For example, art-master branch for the ART environment
git checkout art-master
- If you are already on the correct branch, make sure the working directory is clean by either stashing or discarding local changes
- Pull the latest changes
git pull
- Create and check out a new branch for the changes you are going to make
git checkout -b <branch-name>
- Make the necessary changes
- Add all the changes to the staging area
git add --all
- Commit the changes
git commit -m "Your commit message"
- Push the changes to the remote repository (origin)
# use `git push --set-upstream origin <branch-name>` for the first time
git push origin <branch-name>
- Create a pull request to the source branch. For example, art-master branch for the ART environment
- Get the pull request reviewed and merged by a senior engineer
- Once the pull request is merged, switch back to the source branch
git checkout art-master
- Pull the latest changes
git pull
- Upgrade the helm chart in the environment
- Test the applications in the environment
- END
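Because the checklist stages everything with git add --all, stray local files on the source branch can slip into the commit unnoticed. A small pre-flight check before creating the new branch could look like the following sketch. The function name preflight is made up for illustration, and the expected branch name is passed in as an argument.

```shell
# Succeeds only if the current branch is the expected one ($1) and the
# working directory has no modified, staged, or untracked files.
preflight() {
  current=$(git symbolic-ref --short HEAD 2>/dev/null)
  if [ "$current" != "$1" ]; then
    echo "on branch '$current', expected '$1'" >&2
    return 1
  fi
  if [ -n "$(git status --porcelain)" ]; then
    echo "working directory is not clean; stash or discard changes first" >&2
    return 1
  fi
}

# Example: preflight art-master && git pull
```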
Cancelling the upgrade process (CTRL+C in the middle of the upgrade) can lead to a broken state
- If you cancel the upgrade in the middle, the release can be left in a broken state. A broken state might look like:
  - The deployment image is not updated, but the secrets are updated
  - Backend services are not updated, but the frontend services are updated
- Recovering from a broken state requires manual intervention, and it is always good practice to keep manual intervention to a minimum.
- Make sure you use the appropriate flags while upgrading the helm chart (like --atomic, --debug, and --timeout), which leads to my next point.
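If an upgrade was interrupted anyway, a first diagnostic step is to check whether the release is stuck in a pending state. The sketch below relies on helm list --pending. The function name release_is_pending and the release and namespace names in the example are placeholders.

```shell
# Succeeds if release $1 in namespace $2 is in a pending state
# (pending-install, pending-upgrade, or pending-rollback).
release_is_pending() {
  helm list --pending --short --namespace "$2" | grep -Fqx -- "$1"
}

# Example (names are placeholders):
#   if release_is_pending my-release my-namespace; then
#     helm history my-release --namespace my-namespace   # find the last good revision
#     # then, after review: helm rollback my-release <revision> --namespace my-namespace
#   fi
```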
Use --atomic, --debug and --timeout flags when applying the helm chart (upgrade or install)
--atomic
- Ensures that if the upgrade fails, Helm will automatically roll back to the previous release to maintain cluster stability.
- Implicitly sets the --wait flag, causing Helm to wait for all resources to reach a ready state before considering the upgrade successful.
- If resources do not become ready within the specified timeout, the upgrade is deemed a failure, triggering a rollback.
- Reference: Helm Documentation > --atomic
--debug
- Provides detailed output during the upgrade process, invaluable for troubleshooting.
- Displays the rendered templates, executed commands, and other internal operations.
- Offers insights into the upgrade’s progression and any issues that arise.
- Reference: Helm Documentation > --debug
--timeout
- Specifies the maximum duration Helm will wait for Kubernetes operations (e.g., Jobs or Pods) to complete during the upgrade.
- The default is 5 minutes (5m0s).
- If operations exceed this duration without reaching a ready state and --atomic is set, Helm will initiate a rollback.
- Adjusting the timeout is essential for deployments requiring more time to become ready.
- Reference: Helm Documentation > --timeout
Example Usage
helm upgrade my-release my-chart --atomic --debug --timeout 3m
Use the --namespace flag when applying the helm chart (upgrade or install)
- Without --namespace, the helm release metadata is stored in the namespace of the context in which the helm command was run (in most cases, the default namespace).
- Meanwhile, the Kubernetes artefacts (deployment, service, etc.) are installed in the namespace specified in the template.
- Reference: Helm Documentation > --namespace
- Reference: Github Issue
Example Usage
helm upgrade my-release my-chart --atomic --debug --timeout 3m --namespace <namespace>
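To make these flags hard to forget, the upgrade command can be wrapped in a small function that refuses to run without an explicit namespace. This is a sketch, not a prescribed workflow: the function name safe_upgrade and the fallback of 5m (which mirrors Helm's own default timeout) are illustrative choices.

```shell
# Usage: safe_upgrade <release> <chart> <namespace> [timeout]
# Always passes --atomic and --debug, and requires an explicit namespace.
safe_upgrade() {
  if [ "$#" -lt 3 ] || [ -z "$3" ]; then
    echo "usage: safe_upgrade <release> <chart> <namespace> [timeout]" >&2
    return 2
  fi
  helm upgrade "$1" "$2" \
    --namespace "$3" \
    --atomic \
    --debug \
    --timeout "${4:-5m}"
}

# Example: safe_upgrade my-release my-chart my-namespace 3m
```

To find releases whose metadata already landed in an unexpected namespace, helm list --all-namespaces shows every release together with its namespace.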
What are some of these foot-guns you have faced and want to stay away from?
Resources:
- Helmchart Git Repository - alpha-helm-charts
- Helm Docs - Link