Flux is a GitOps continuous delivery tool that provides a framework for keeping a Kubernetes cluster in sync with source git repositories, OCI registries, and published Helm charts [1]. In this article, I will show how Flux can deploy a Helm chart, and then subsequently monitor for changes and auto-upgrade when new chart versions are published. ... Flux: automating Helm chart deployments with Flux
Flux is a GitOps continuous delivery tool that provides a framework for keeping a Kubernetes cluster in sync with source git repositories, OCI registries, and published Helm charts [1]. The recommended way to install Flux on a Kubernetes cluster is to bootstrap using the Flux CLI, so I will go through those details in this article. ... Flux: installing Flux on a Kubernetes cluster with bootstrap command
GitHub Actions provide the ability to define a build workflow, including the packaging and publishing of a Helm chart. This allows tools like Helm to refer to the URL of the public source project, add it as a remote Helm repository, and then use the packaged chart to deploy a workload to a Kubernetes cluster. ... GitHub: automated publish of Helm chart using GitHub Actions
GitLab Pipelines provide the ability to define a build workflow, including the packaging and publishing of a Helm chart to the GitLab Package Registry. This allows tools like Helm to refer to the public URL of the GitLab Package Registry, add it as a remote Helm repository, and then use the packaged chart. Pipeline job ... GitLab: pipeline to publish Helm chart to GitLab Package Registry
The “kubectl get all” command only returns a limited set of resources, namely: pods, services, daemon sets, deployments, replica sets, jobs, cronjobs, and stateful sets (not Ingress, Secrets, ConfigMap, CRD, etc.). While there is a rationale for this, it is often the case we need to see all the resources and custom resources defined in ... Kubernetes: showing all resources in a namespace
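A common workaround, sketched below, is to enumerate every namespaced resource type and query each one (the namespace name is an example):
# list every namespaced resource type, then get each one in the target namespace
ns=default
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n1 kubectl get -n $ns --ignore-not-found --show-kind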
Enabling the use of the GPU on your Mac M1 with the tensorflow-metal plugin can be challenging because there is a lot of conflicting documentation and older forum questions and replies. I’ve written this article for a Mac M1 running on macOS Sequoia 15.1.1. As of December 2024, you should pair Python 3.11 with TensorFlow ... Mac: tensorflow-metal pip module on M1 chip for GPU support
The random_id Terraform resource generates a value that can be used to create remote infrastructure that requires a unique identifier. The primary attribute it exposes is ‘.id’ which contains upper+lower+number characters, but it also has ‘.dec’ and ‘.hex’ equivalent representations that can be used to support infrastructure requiring a limited character set. As an example, ... Terraform: converting hex and decimal representation of random_id back to id
If you need to “import a Terraform module”, it is critical to understand that importing a module state is not a single bundled operation. Instead, you must import each of the resources inside the module individually. It is unfortunate that you cannot simply supply the module identifier and its variable values to import all its ... Terraform: importing a module by its individual resources
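As a sketch of what that looks like per resource (the module address, resource type, and IDs are illustrative):
# import each resource inside the module individually
terraform import 'module.mybucket.google_storage_bucket.bucket' my-bucket-id
# verify what is now tracked in state under the module
terraform state list module.mybucket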
Kyverno is an open-source project that manages and enforces policies within a Kubernetes cluster. The policy definitions are defined as yaml and deployed as Kubernetes objects. Kyverno has become popular for its Kubernetes-specific policy engine and declarative rule definitions (as opposed to a general policy engine like OPA/Gatekeeper that uses a domain-specific language). It ... Kubernetes: deploying Kyverno for cluster policy control
If you have specific intentions for a Kubernetes node pool/group (workload isolation, cpu type, etc.), then you can assign labels to attract workloads in conjunction with taints to repel workloads that do not have explicit tolerations applied. And although the generalized kubectl utility can assign labels and taints to specific nodes, the assignment of labels ... Kubernetes: targeting workloads to a node pool/group using taints and tolerations
If you have a set of resources in Terraform that are conditionally included based on the same criteria, instead of appending a “count/for_each” on every resource definition, consider refactoring them into a module. The conditional can then be placed on the module definition instead of polluting each resource definition. For example, if you had several ... Terraform: module for conditional include of related resources
The output of a Terraform child module is not shown by the root configuration/module. In order to have the child module output shown, you must explicitly define a root level output. For example, if your root had a child module defined like below, and you wanted to display the output of that child module, then ... Terraform: enabling the output of a child module
If you are using awk with a relatively indexed NF variable (number of fields), and get an error like below, this is because the input being parsed does not have the number of fields expected. awk: run time error: negative field index $-1 The root problem is the awk expression was expecting a certain number ... Bash: resolving awk run time error, negative field index
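A defensive sketch that guards the relative index with a check on NF:
# only reference $(NF-1) when at least two fields exist
echo "a b c" | awk '{ if (NF >= 2) print $(NF-1) }'    # prints: b
echo "single" | awk '{ if (NF >= 2) print $(NF-1) }'   # prints nothing instead of erroring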
Istio Ambient mode is a data plane model that eliminates the need for an envoy sidecar proxy on each of your workloads. This reduces resource overhead, timing issues between sidecar lifecycle and your containers, and the need to restart your workloads to upgrade proxy versions. In this article, I will show you how to install ... Minikube: Istio Ambient mode on Minikube
At the Bash command line interface, there is the concept of programmable completion and regular file/directory completion. This means that when you press <TAB>, the alternatives can be provided by a custom program or the filesystem hierarchy. There is always the chance that a program may introduce undesirable behavior to your auto-completion, and if ... Bash: falling back to file autocompletion if errors introduced by program autocompletion
GitHub Actions provide the ability to define a build workflow, and for projects that are building an OCI (Docker) image, there are custom actions available for running the Trivy container security scanner. In this article, I will show you how to modify your GitHub Action to run the Trivy security scanner against your image, and ... Github: security scanning built into GitHub Actions image build
GitLab Pipelines provide the ability to define a build workflow, and for projects that are building an OCI (Docker) image, there is a convenient method for doing container security scanning as part of the build process. Include Container Scanning As described in the official documentation, add the following include to your .gitlab-ci.yml pipeline definition. include: ... GitLab: security scanning built into GitLab Pipelines image build
Google Pub/Sub is a managed messaging platform providing a scalable, asynchronous, loosely-coupled solution for communication between application entities. It centers around the concept of a Topic (queue). A Publisher can put messages on the Topic, and a Subscriber can read messages from the Subscription on a Topic. In this article, I will first use the ... GCP: publishing and reading from Google PubSub Topic using Python client libraries
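Before reaching for the Python client libraries, the same round trip can be sketched with gcloud (topic and subscription names are examples):
gcloud pubsub topics create mytopic
gcloud pubsub subscriptions create mysub --topic=mytopic
gcloud pubsub topics publish mytopic --message="hello"
gcloud pubsub subscriptions pull mysub --auto-ack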
KEDA is an open-source event-driven autoscaler that greatly enhances the abilities of the standard HorizontalPodAutoscaler. It can scale based on internal metrics as well as external Scaler sources. In this article, I will illustrate how to install KEDA on a GKE cluster that has Workload Identity enabled, and then how to configure KEDA scaling events ... GCP: Installing KEDA on a GKE cluster with workload identity and testing Scalers
Although the simple ‘gcloud container operations list’ command is the easiest way to find recent upgrade events on your GKE cluster or nodepool, it returns only the recent events and does not provide a historical record. If you need to look at historical events, you can use the Logs Explorer web UI or the ‘gcloud ... GCP: historical log of GKE cluster and nodepool upgrades and scaling
Trivy is an open-source tool that can scan your containers and produce reports on known critical issues at the binary and OS package level. In this article, I will describe how to scan images directly from your local Debian/Ubuntu machine, whether you built the image locally or pulled it down remotely. Installation on Debian/Ubuntu Per ... Kubernetes: Trivy for container scanning from CLI
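Once installed, a minimal scan of a local or remote image looks like this (the image name is only an example):
# report only the most severe findings for an image
trivy image --severity HIGH,CRITICAL nginx:1.25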
If you find yourself low on disk space, check the size of the Systemd journal logs as a way to easily free up some space. $ sudo journalctl --disk-usage Archived and active journals take up 528.0M in the file system. # reduce size sudo journalctl --vacuum-size=300M There are automatic jobs that trim these Systemd journal ... Linux: reducing disk usage of Systemd journal logs
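Size is one criterion; the journal can also be trimmed by age:
# drop journal entries older than two weeks
sudo journalctl --vacuum-time=2weeks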
The openssl utility can be used to show the details of a certificate, including its ‘Not After’ expiration date in string format. This can be transformed into “how many days till expiration” with a bit of Bash date math. Create test certificate and key Using a line provided by Diego Woitasen for non-interactive self-signed certification ... Bash: calculating number of days till certificate expiration using openssl
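A minimal sketch of the date math, assuming a certificate file named cert.pem and GNU date:
# extract the 'Not After' date, then convert to whole days from now
expires=$(openssl x509 -enddate -noout -in cert.pem | cut -d= -f2)
echo $(( ( $(date -d "$expires" +%s) - $(date +%s) ) / 86400 )) days remaining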
GitLab pipelines are a convenient way to expose deployment/delivery tasks. But with their rudimentary web UI for variable input, it can be challenging for users to populate the required list of variables. One way of making it more convenient for end-users is to provide them a URL pre-populated with the specific branch and pipeline variable ... GitLab: URL shortcut to override pipeline variable values
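For example, a link of the following form opens the “Run pipeline” page with the branch and a variable pre-populated (the project path and variable name are illustrative):
https://gitlab.com/mygroup/myproject/-/pipelines/new?ref=main&var[DEPLOY_ENV]=staging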
Terraform has now been forked into the open-source OpenTofu project. The ‘tofu’ binary is a drop-in replacement for terraform, and this article will show you how to install it on Debian/Ubuntu. After installation, we will then use the Debian/Ubuntu Alternatives concept to supersede existing calls to ‘terraform’ to instead invoke ‘tofu’. Setup OpenTofu apt repository ... OpenTofu: installing OpenTofu on Debian/Ubuntu
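A sketch of the Alternatives step, assuming tofu was installed to /usr/bin and you want ‘terraform’ resolved from /usr/local/bin (paths may differ on your system):
# register tofu as an alternative implementation of the 'terraform' command
sudo update-alternatives --install /usr/local/bin/terraform terraform /usr/bin/tofu 10
terraform version   # should now report OpenTofu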
If you are using Terraform as a way to provide infrastructure/services to outside teams, your Terraform definitions and variables may initially be owned by your team alone, with all “tf apply” operations done by trusted internal group members at the CLI. But once there is a certain amount of maturity in the solution, the Terraform ... Terraform: external yaml file as a contribution model for outside teams
Base64 encoding a string may seem like a straightforward operation, but there are a couple of gotchas even when dealing with just simple ASCII strings. Avoid embedding a new line character into the encoding If you use the most straightforward method of Base64 encoding shown below, you have to remember that echo by default ... Bash: avoiding newline artifacts when Base64 encoding a string
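The safe patterns either suppress echo’s trailing newline or avoid echo entirely:
# -n suppresses the trailing newline that would otherwise be encoded
echo -n "mysecret" | base64
# printf adds no newline, and GNU base64 -w0 disables output line wrapping
printf '%s' "mysecret" | base64 -w0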
Mike Farah’s yq yaml processor has a rich set of operators and functions for advanced usage. In this article, I will illustrate how to update deeply nested elements in yaml. This can be done for both known paths as well as arbitrarily deep paths. Sample yaml We will use the following yaml files to ... yq: updating deeply nested elements
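A minimal sketch against a hypothetical file.yaml, first with a known path and then with one possible arbitrary-depth pattern:
# known path: set a deeply nested value in place
yq -i '.spec.template.metadata.labels.app = "myapp"' file.yaml
# arbitrary depth: update every 'image' key wherever it appears
yq -i '(.. | select(has("image")).image) = "nginx:1.25"' file.yaml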
Mike Farah’s yq yaml processor has a full-featured validation command that is very detailed in its reporting, but the yaml specification itself is very lenient, which means yq may accept scenarios you did not expect (e.g. an empty file). yq -v file.yaml >/dev/null ; echo "final result = $?" Luckily, the yq tips-and-tricks section ... yq: validate yaml syntax
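One pattern from those tips is to require the top-level node to be a map or sequence, so an empty file fails validation:
# exit status is non-zero unless the document is a map or sequence
yq --exit-status 'tag == "!!map" or tag == "!!seq"' file.yaml > /dev/null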
Keeping the Ubuntu system-level Python version and modules independent from those desired at each project level is a difficult task best managed by a purpose-built tool. There are many solutions in the Python ecosystem, but one that stands out for simplicity is pyenv and pyenv-virtualenv. pyenv allows you to install and switch between different versions ... Ubuntu: pyenv for managing multiple Python versions and environments
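The day-to-day flow looks roughly like this (the version and environment names are examples):
# install a specific interpreter, create an isolated env, and activate it
pyenv install 3.11.9
pyenv virtualenv 3.11.9 myproject-env
pyenv activate myproject-env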
It is relatively easy to experiment with a base LLama2 model on Ubuntu, thanks to llama.cpp written by Georgi Gerganov. The llama.cpp project provides a C++ implementation for running LLama2 models, and works even on systems with only a CPU (although performance would be significantly enhanced if using a CUDA-capable GPU). Download model and run ... Ubuntu: LLama2 model on Ubuntu using llama.cpp
If you need to dig deeper into the dependencies that brew has installed on your Mac, you can show a complete dependency tree using: brew deps --tree --installed If you need to see who relies on a specific target formula: brew uses --installed <targetFormulae> REFERENCES brew.sh, deps command syntax apple.stackexchange, ‘brew deps’ showing all ... Mac: list deep dependencies of Homebrew formulae
It is relatively easy to experiment with a base LLama2 model on M family Apple Silicon, thanks to llama.cpp written by Georgi Gerganov. The llama.cpp project provides a C++ implementation for running LLama2 models, and takes advantage of the Apple integrated GPU to offer a performant experience (see M family performance specs). Download model and ... Mac: LLama2 model on Apple Silicon and GPU using llama.cpp
minikube makes it easy to spin up a local Kubernetes cluster on macOS, and adding an Ingress is convenient with its built-in Addons. In this article, I want to take it one step further and show how to expose the Ingress via TLS (secure https) using a custom key/certificate chain. Prerequisites macOS Brew package manager ... minikube: installing minikube on Mac with secure TLS ingress
The Apple Virtualization Framework (AVF) provides the ability to run completely independent virtual machines on top of M family Apple Silicon. For example, you can run multiple versions of macOS virtualized for validating an application or its dependencies against different environments. Additionally, cloning an existing VM (with little cost thanks to APFS copy-on-write) allows you ... Mac: bare-metal virtualization on Apple Silicon with virtualbuddy
Although you could use brew to install Python directly, the cleaner way to manage Python versions and isolate Python virtual environments is by using pyenv and pyenv-virtualenv. pyenv allows you to install and switch between different versions of Python, while pyenv-virtualenv provides isolation of pip modules, for independence between projects. Install brew package manager Install ... Mac: multiple Python versions/virtualenv with brew and pyenv
If you are using ssh private/public keypair authentication, and get an almost immediate error like below: $ ssh -i id_rsa myuser@a.b.c.d -p 22 Received disconnect from a.b.c.d port 22:2: Too many authentication failures Disconnected from a.b.c.d port 22 Then try again using the ‘IdentitiesOnly’ option. ssh -o 'IdentitiesOnly yes' -i id_rsa myuser@a.b.c.d -p 22 The ... Bash: fixing “Too many authentication failures” for ssh with private key authentication
If you want a variable to reference another variable in Bash, that is possible using a concept called indirect reference. Below is a simple example where ‘varname’ contains the name of another variable ‘foo’. # our variable foo=bar # our reference varname=foo # the following syntax will ERROR and DOES NOT work !!! echo "${$varname}" ... Bash: indirect reference to evaluate a variable value
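The working syntax uses Bash’s indirection operator:
foo=bar
varname=foo
# ${!varname} expands to the value of the variable whose name is stored in varname
echo "${!varname}"   # prints: bar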
If your systemd service is failing with the following error message: XXX.service: Start request repeated too quickly The first thing to do is fix any underlying issues. Use ‘systemctl status <service>’, ‘journalctl -u <service>’, and search any log files produced by the service to understand why the service failed multiple times and exceeded its StartLimitBurst. ... Ubuntu: resolving systemd error, “Start request repeated too quickly”
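Once the root cause is fixed, clear the accumulated failure state so systemd will accept a new start request (service name is an example):
# reset the start-limit counter, then start again
sudo systemctl reset-failed myservice.service
sudo systemctl start myservice.service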
The Vault Secrets Operator is a Vault integration that runs inside a Kubernetes cluster and synchronizes Vault-level secrets to Kubernetes-level secrets. This secret synchronization happens transparently to the running workloads, without any need to retrofit existing images or manifests. In this article, I will show how to: install the Vault Secrets Operator (VSO), configure the ... Vault: synchronizing secrets from Vault to Kubernetes using Vault Secrets Operator
minikube makes it easy to spin up a local Kubernetes cluster, and adding an Ingress is convenient with its built-in Addons. In this article, I want to take it one step further and show how to use a custom key/certificate to expose a service using TLS (secure https). Prerequisites A container or virtual machine manager ... minikube: exposing a deployment using ingress with secure TLS
If there is a command that needs to run on startup/reboot, you can use the @reboot directive in cron to schedule it. Here is a simple example of logging the date to a file at startup. (crontab -l 2>/dev/null; echo "@reboot date >> ~/startup.log") | sort -u | crontab - The sort command avoids duplicate ... Bash: schedule a command that will be run at reboot using cron
In this article, I will detail how to use Vault JWT auth mode to isolate the secrets of two different deployments in the same Kubernetes cluster. This will be done by using two different Kubernetes Service Accounts, each of which generates a unique JWT tied to a different Vault role. JWT auth mode is ... Vault: JWT authentication mode with multiple roles to isolate secrets
If you are getting the following error when invoking an Ansible playbook or any of the Ansible related utilities: ERROR! Invalid callback for stdout specified: yaml This means Ansible is attempting to use the new YAML callback plugin, but cannot find the Ansible Galaxy community.general module. This module is installed by the ‘ansible’ pip module, ... Ansible: resolving error “Invalid callback for stdout specified: yaml”
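Either of these usually resolves it (use whichever matches how Ansible was installed):
# the full 'ansible' pip package bundles community.general
pip3 install ansible
# or install just the missing collection
ansible-galaxy collection install community.general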
HashiCorp Vault is a secret and encryption management system that allows your organization to secure sensitive information such as API keys, certificates, and passwords. In this article, I will show how a NodeJS Express web application deployed into a Kubernetes cluster can fetch a secret directly from the Vault server using the node-vault module. This ... Vault: NodeJS Express web app using node-vault to fetch secrets
Getting the sum of a list of numbers is a common scenario on the command line, whether you are checking local filesystem utilization or parsing infrastructure reports for total cpu counts. There are multiple ways this can be done, but one of the simplest is to use awk. Let’s use a sequence of numbers ... Bash: calculate sum from a list of numbers
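A minimal sketch using seq as the input source:
# accumulate the first field of each line, print the total at the end
seq 1 10 | awk '{ sum += $1 } END { print sum }'   # prints: 55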
HashiCorp Vault is a secret and encryption management system that allows your organization to secure sensitive information such as API keys, certificates, and passwords. In this article, I will show how a Java Spring Boot web application deployed into a Kubernetes cluster can fetch a secret directly from the Vault server using the Spring Cloud ... Vault: Spring Boot web app using Spring Cloud Vault to fetch secrets
HashiCorp Vault is a secret and encryption management system that allows your organization to secure sensitive information such as API keys, certificates, and passwords. It has tight integrations with Kubernetes that allows containers to fetch secrets without requiring hardcoding them into environment variables, files, or external services. The official docs already provide usage scenarios, so ... Vault: HashiCorp Vault deployed into Kubernetes cluster for secret management
If ssh private/public keypair authentication is failing, check the logs on the server side for permission errors. On Debian/Ubuntu check for these errors in “/var/log/auth.log”. # error if authorized_keys file has too wide a permission for others Authentication refused: bad ownership or modes for file /home/myuser/.ssh/authorized_keys # error if .ssh directory has too wide a ... Bash: fixing SSH authentication error “bad ownership or modes for file/directory”
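The standard tightening that resolves both errors:
# ssh requires the directory and key file to be private to the user
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys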
GitLab Agent for Kubernetes is an integration for the GitLab CI/CD pipeline that provides kubectl access from pipeline jobs, allowing Continuous Deployment into a live Kubernetes Cluster. However, the default role for this Agent is cluster-admin when doing a basic Helm install, which is far too permissive and needs to be scoped down to only ... GitLab: least privilege for Kube-API calls from GitLab Agent for Kubernetes
GitLab pipelines are frequently used for the building of binaries and publishing of images to container registries, but do not always follow through with Continuous Deployment to a live environment. One reason is that pipelines do not usually have access to the internal systems where these applications are meant to be deployed. In this article, ... GitLab: Continuous Deployment with Agent for Kubernetes and GitLab pipeline
The globally shared set of GitLab runners for CI/CD jobs works well for building binaries, publishing images, and reaching out to publicly available endpoints for services and infrastructure building. But the ability to run a private, self-managed runner can grant pipelines entirely new levels of functionality on several fronts: it can communicate openly to private, internal ... GitLab: self-managed runner for CI/CD jobs on GCP VM instances
If you receive an error similar to below when calling the GCP API using ADC login credentials with either gcloud or terraform: Cannot add the project "myproj-i1wsbbn8pkfeq3jhkcg0z4" to ADC as the quota project because the account in ADC does not have the "serviceusage.services.use" permission on this project. You might receive a "quota_exceeded" or "API not ... GCP: quota project error when invoking GCP API using ADC application-default
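If your account does have that permission on a project you control, pointing ADC at it usually clears the error (the project id is illustrative):
# record a quota project in your ADC credentials
gcloud auth application-default set-quota-project my-own-project-123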
If you have a previous investment in Ansible Configuration Management for command line automation, you may now want to invoke that same logic from a GitLab CI/CD pipeline. The cleanest way to provide Ansible to a pipeline job is to create a custom Docker image that contains all the Ansible binaries and required Galaxy modules. ... GitLab: invoking Ansible from a GitLab pipeline job
The first action you typically take after “git clone” is to change into the newly created directory. This can be accomplished at the Bash shell in a couple of ways. Here is the git URL we will use as an example for this article. # can be https or ssh, can end in .git or ... Bash: change into directory just created with git clone
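One compact sketch that derives the directory name from the URL, whatever its form:
url=https://github.com/fabianlee/someproject.git
git clone "$url"
# basename strips the path prefix and any trailing .git
cd "$(basename "$url" .git)"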
When a GitLab CI/CD pipeline needs to persist job output or a rendered report, it will typically save it as an artifact on the job, or perhaps write it to an external storage service or as a GitLab Release archive. But it is also capable of pushing this file to its own git repository, stored ... GitLab: add files to source repository as part of GitLab pipeline
When parsing a string that is divided by a separator char, getting the first N values OR last N values is a common scenario when dealing with: an IP address separated by periods, e.g. “10.11.12.13”; a file path separated by forward slashes, e.g. “/tmp/myfolder/subpath1/subpath2/subpath3”; or a fully qualified domain separated by periods, e.g. “sub1.sub2.my.domain.com”. Getting first N values Getting the first ... Bash: extracting first or last N octets, paths, or domain from string with fixed separator
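A sketch with cut, using rev to count from the right:
# first 2 octets
echo "10.11.12.13" | cut -d. -f1-2                 # 10.11
# last 2 octets: reverse, take the first 2 fields, reverse back
echo "10.11.12.13" | rev | cut -d. -f1-2 | rev     # 12.13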
The GitLab documentation shows how to use a ‘dotenv’ artifact to pass values from one job to another in a CI/CD pipeline. In this article, I want to show an example of this method, but also another method using a custom artifact. dotenv artifact for passing variable between jobs Here is how a variable set ... GitLab: passing values between two jobs in pipeline
GitLab CI/CD pipelines can be used to automatically build and push Docker images to the GitLab Container Registry. Beyond building a simple image, in this article I will show how to define a workflow that builds and pushes a multi-platform image (amd64,arm64,arm32) with manifest index to the GitLab Container Registry. This is enabled by using ... GitLab: automated build and publish of multi-platform container image with GitLab pipeline
If you are within the context of a CI/CD tool, you may run into the scenario where a newly applied git tag has initiated a pipeline action. Depending on the tool, the pipeline will provide you with either a SHA of the last commit and/or the tag name – but not the branch where the ... Git: find branch name of newly applied tag
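One way to recover candidate branches, assuming the tag name has been exported into a variable (TAG_NAME here is illustrative; your CI tool’s variable name will differ):
# list every branch whose history contains the tagged commit
git branch -a --contains "tags/${TAG_NAME}"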
In a previous article, I described how to expose a GitHub source repo as a public Helm repository by enabling GitHub Pages and running the chart-releaser utility. In this article, I want to remove the manual invocation of the chart-releaser, and instead place that into a GitHub Actions workflow that automatically publishes changes to the ... Helm: automated publishing of Helm repo with Github Actions
The only requirement for a public Helm chart repository is that it exposes a URL named “index.yaml”. So by adding a file named “index.yaml” to source control and enabling GitHub Pages to serve the file over HTTPS, you have the minimal basis for a public Helm chart repository. The backing Chart content (.tgz) can also ... Helm: manually publishing Helm repo on Github using chart-releaser
GitHub Actions provide the ability to define a build workflow based on GitHub repository events. The workflow steps are defined as yaml and can be triggered by various events, including a code push, branch, or tagging in the repository. In this article, I will show how to define workflow steps that build and push a ... Github: automated build and publish of multi-platform container image with Github Actions
Docker can build multi-platform images that use a manifest index (fat manifest list) by using the Docker buildx command with backing containerd runtime and QEMU for cross-platform emulation. Using a manifest index for multi-platform images simplifies application level orchestration by using the same name and version for all architectures. For example: # same image name ... Docker: building multi-platform images that use fat manifest list/index
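The core invocation is a single buildx command; --push uploads every platform variant plus the manifest index (the image name is illustrative):
docker buildx build \
  --platform linux/amd64,linux/arm64,linux/arm/v7 \
  -t registry.example.com/myapp:1.0.0 \
  --push .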
If you want to use Docker to build cross-platform images, the first step is to enable QEMU to run images targeted at other architectures via emulation. I assume you have installed Docker CE and its containerd runtime as described here, and are running on an x86_64 host. Test current ability to emulate other architectures # ... Docker: QEMU emulation to run arm64 images from native amd64 host
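A common way to register the QEMU binfmt handlers and then verify emulation:
# install binfmt handlers for all supported architectures
docker run --privileged --rm tonistiigi/binfmt --install all
# an arm64 image should now run on the amd64 host
docker run --rm --platform linux/arm64 alpine uname -m   # prints: aarch64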
Docker is a container platform that streamlines software delivery and provides isolation, scalability, and efficiency with less overhead than OS level virtualization. These instructions are taken from the official Docker for Ubuntu page, but I fine-tuned them per Ubuntu22+ standards. Uninstall older versions for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt ... Docker: installing Docker CE on Ubuntu
If you are administering a Kubernetes cluster that you have inherited or perhaps not visited in a while, then you may need to reacquaint yourself with: which Helm charts are installed into what namespaces, if there are chart updates available, and then what values were used for chart installation. Below are commands that can assist ... Helm: discovering Helm chart releases installed into Kubernetes cluster
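The first two questions can be answered with a couple of helm commands (release and namespace names are examples):
# all releases in all namespaces, with chart and app versions
helm list -A
# the values supplied at install/upgrade time for one release
helm get values myrelease -n mynamespace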
The need to configure a specific pod’s container arguments is a common Kubernetes administration task. As examples, you might need to enable verbose logging, set an explicit value to override a default, or configure a host name or port set in a container’s arguments. In the example below, we are targeting the ‘metrics-server’ in the ... Kubernetes: patching container arguments array with kubectl and jq
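A sketch of that jq round trip against metrics-server (the appended flag is illustrative):
kubectl get deployment metrics-server -n kube-system -o json \
  | jq '.spec.template.spec.containers[0].args += ["--metric-resolution=30s"]' \
  | kubectl apply -f -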
The HorizontalPodAutoscaler (HPA) allows you to dynamically scale the replica count of your Deployment based on basic CPU/memory resource metrics from the metrics-server. If you want scaling based on more advanced scenarios and you are already using the Prometheus stack, the prometheus-adapter provides this enhancement. The prometheus-adapter takes basic Prometheus metrics, and then synthesizes custom API ... Kubernetes: HorizontalPodAutoscaler evaluation based on Prometheus metric
The HorizontalPodAutoscaler (HPA) allows you to dynamically scale the replica count of your Deployment based on criteria such as memory or CPU utilization, which makes it a great way to manage spikes in utilization while still keeping your cluster size and infrastructure costs managed effectively. In order for HPA to evaluate CPU and memory utilization and take ... Kubernetes: implementing and testing a HorizontalPodAutoscaler
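The imperative shorthand for a CPU-based HPA (deployment name and thresholds are examples):
# scale between 1 and 5 replicas, targeting 50% average CPU utilization
kubectl autoscale deployment myapp --cpu-percent=50 --min=1 --max=5
kubectl get hpa myapp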
K3s is deployed by default with a metrics-server, but if you have a multi-node cluster it will fail unless you add the names of all the nodes to the kube-apiserver certificate. Symptoms of this problem include: the metrics-server deployment throwing x509 errors in its log; errors when you try to run “kubectl top pods”; no ... Kubernetes: fixing x509 certificate errors from metric-server on K3s cluster
Although jwt.io has become a common online destination for decoding JWT, this can also be done locally using jq. # populate JWT variable JWT=... # decode with jq utility echo $JWT | jq -R 'split(".") | .[0],.[1] | @base64d | fromjson' Attribution of credit goes to this gist. If you have not installed jq on ... Bash: decoding a JWT from the command line with jq
If you have just removed a module declaration from your Terraform configuration and now get a ‘Provider configuration not present’ error when running apply: Error: Provider configuration not present To work with module.mymodule_legacysyntax.null_resource.test_rs (orphan) its original provider configuration at module.mymodule_legacysyntax.provider["registry.terraform.io/hashicorp/null"] is required, but it has been removed. This occurs when a provider configuration is removed ... Terraform...
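If the module’s resources are truly meant to be gone, one way out is to drop the orphaned entries from state before applying (the address is taken from the error message above):
# remove the orphaned module's resources from state, then apply
terraform state rm 'module.mymodule_legacysyntax'
terraform apply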
If you are getting the following error when invoking ‘ansible’, ‘ansible-playbook’, ‘ansible-galaxy’ or any of the Ansible related utilities: ERROR: Ansible could not initialize the preferred locale: unsupported locale setting This means Ansible cannot find a locale ending in “.UTF-8”. Check the currently installed locales: $ locale -a Then export the LC_ALL variable to one ... Ansible: resolving ‘could not initialize the preferred locale: unsupported locale setting’
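For example, the C.UTF-8 locale is present on most modern distributions:
export LC_ALL=C.UTF-8
ansible --version   # should now run without the locale error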
Deployments and DaemonSets typically have more than one replica or desired replica count, and although kubectl default formatting will return columns summarizing how many are desired and how many are currently ready, an automated script needs to parse these values in order to determine full health. Similarly, pod status as well as the readiness ... Kubernetes: evaluating full readiness of deployment, daemonset, or pod
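A sketch of pulling the raw counts for a deployment so a script can compare them:
# ready vs desired replica counts; equal means fully rolled out
kubectl get deployment mydeployment -n mynamespace \
  -o jsonpath='{.status.readyReplicas}/{.spec.replicas}{"\n"}'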
It would be uncommon to have one monolithic Terraform configuration for all the infrastructure in your organization. More than likely, there are multiple groups and each has responsibility and ownership of certain components (e.g. networking, storage, authorization, Kubernetes). As an example, let’s say your responsibility is the Kubernetes cluster build. You may need the following ... Terraform: terraform_remote_state to pass values to other configurations
There are multiple options for creating a TLS secret using kustomize. One is to embed the certificate content as a base64 string directly in the data; the other is to use an external file. Below is an example kustomization.yaml file that serves as an entry point for both methods. --- apiVersion: kustomize.config.k8s.io/v1beta1 kind: Kustomization resources: ... Kubernetes: creating TLS secrets with kustomize using embedded or external content
If you need to create a new git branch for your project, one that is completely fresh and with no previous history or commits, that can be done using the “orphan” flag of the switch command if you are using git 2.23+. Fresh branch using ‘git switch’ # create fresh branch git switch --orphan my-fresh-branch ... Git: create a new empty branch with no history or commits
If you are attempting to run “terraform init” with a Google Cloud Storage backend and get the following error: Error: Failed to get existing workspaces: querying Cloud Storage failed: storage: bucket doesn't exist The first check should be that the Google Cloud Storage bucket indeed exists, using gsutil. project_id=myproject-123 gsutil ls -p $project_id If the ... Terraform: fixing error “querying Cloud Storage failed: storage: bucket doesn’t exist”
A simple grep can help you recursively search for a substring in all the files of a directory. Here is an example of looking for multi-doc yaml files: # recursively search directory, looking for line '---' # regex: caret means line starts with, dollar sign mean line ends with grep -sr '^---$' If you have ... Bash: counting number of times substring is found in directory
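To turn those matches into a single count across the directory:
# -o prints one line per match, so wc -l yields the total occurrences
grep -sro '^---$' . | wc -l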
The forked workflow is popularized by the Open Source community, where your contributions are made by keeping your own personal fork of a repository and submitting a GitLab Merge Request to a central repository. A GitLab Merge Request can be submitted from the web UI by clicking on “Merge requests” and manually selecting the ... GitLab: generating URL that can be used for Merge Request from fork to upstream
Anthos Service Mesh for GKE can be installed in the following modes: in-cluster ASM using the asmcli utility; managed ASM using the asmcli utility; managed ASM using the ‘gcloud container fleet’ command; or managed ASM using the Terraform asm submodule. If you need to determine the installation mode used on your GKE cluster, you can examine ... GCP: determining whether ASM is installed via asmcli or gcloud fleet
If you need to test for a file’s existence, content size, and whether it was recently modified, the ‘find‘ utility can provide this functionality in a single call. One scenario for this usage might be the cached results from a remote service call (database, REST service, etc). If fetching these results was a relatively costly ... Bash: testing if a file exists, has content, and is recently modified
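A compact sketch testing all three conditions in one find invocation (file name and age are examples):
# exists, non-empty (+0c), and modified within the last 60 minutes
if find /tmp/cached_results.json -mmin -60 -size +0c 2>/dev/null | grep -q .; then
  echo "cache still valid"
fi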
If you need to determine at the CLI whether a GKE cluster is managed using Standard or Autopilot mode, this is available by using gcloud to describe the cluster. # identify cluster and location gcloud container clusters list cluster_name=<clusterName> location_flag="--region=<region>" # OR --zone=<zone> # returns 'True' if GKE AutoPilot cluster # returns empty if standard ... GCP: determining whether GKE cluster mode is Standard or Autopilot
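The True/empty behavior described above suggests a describe call along these lines (a hedged sketch reusing the variables from the excerpt):
gcloud container clusters describe $cluster_name $location_flag \
  --format='value(autopilot.enabled)'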
As much as Terraform pushes to be the absolute system of record for resources it creates, often valid external processes are assisting in managing those same resources. Here are some examples of legitimate external changes: other company-approved Terraform scripts applying labeling to resources in order to track ownership and costs; security teams modifying IAM roles ... GKE: terraform lifecycle ‘ignore_changes’ to manage external changes to GKE cluster
GCP build triggers can easily handle Continuous Deployment (CD) when the source code is homed in a Google Cloud Source repository. But even if the system of record for your source is a remote GitHub repository, these same type of push and tag events can be consumed if you configure a connection and repository link. ... GCP: Cloud Run with build trigger coming from remote GitHub repository
Flask is a suitable web server during development, but if you are going to deploy in a production environment, a Python WSGI server such as Gunicorn should be used. This also applies to Python Flask apps deployed to GCP Cloud Run. Gunicorn is necessary to tune the worker and thread count of each instance to ... GCP: deploying a Python WSGI Gunicorn app on Cloud Run
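The entrypoint then typically looks like this (the app:app module path and the counts are examples; Cloud Run supplies the PORT variable):
# one worker process with several threads suits Cloud Run instance sizing
gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 app:app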
It is not uncommon when using kustomize to inherit a large set of resources or components. Perhaps a few of them need to be updated with patches to accommodate your environment. But if there are objects that are completely incompatible, it may be necessary to delete them. This can be done with a kustomize ‘$delete’ ... Kubernetes: using a delete patch with kustomize
At some point, there will be a system change significant enough that a maintenance window needs to be scheduled with customers. But that doesn’t mean the end-user traffic or client integrations will stop requesting the services. What we need to present to end-users is a maintenance page during this outage to indicate the overall solution ... GCP: Cloud Run/Function to handle requests to GKE cluster during maintenance
The centralized system keyring for apt was deprecated starting in Ubuntu 21, and is being replaced with an explicit path to the local gpg key in the ‘signed-by’ attribute. I have written more extensive articles on this subject [here,here], but from an Ansible perspective, this means ensuring the gpg key is downloaded to ‘/usr/share/keyrings’ with ... Ansible: adding custom apt repository with ‘signed-by’ gpg key
If you have a simple directory containing multiple template files that should be generated on a target host, the ‘with_fileglob‘ lookup plugin provides an easy way to render them. Below is an example rendering all the files from the ‘templates’ directory of a role. - name: create file out of every file in template directory ... Ansible: generating templates with deep directory structure using with_filetree
Whether you are working on scaling, performance, or high-availability, it can be useful to see exactly which Kubernetes worker node pods are being scheduled onto. Pods as distributed across worker nodes ns=default kubectl get pods -n $ns -o=custom-columns=NAME:.metadata.name,NODE:.spec.nodeName Pods as distributed across zones (GKE specific) If you wanted to take it one step further ... GKE: show pod distribution across nodes and zones
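For the zone view, the node-to-zone mapping comes from a well-known node label:
# show each node's zone via its topology label
kubectl get nodes -L topology.kubernetes.io/zone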
If you are managing GKE clusters using Anthos Config Management (ACM) and need to take advantage of newer features or enhancements in ConfigSync or PolicyController, upgrading these components can be done using the gcloud utility. # check current version of ACM on GKE clusters gcloud beta container fleet config-management version # select membership to upgrade ... GKE: upgrade Anthos Config Management for GKE cluster
If you are getting a warning similar to below when running a Python3 application: /usr/lib/python3/dist-packages/paramiko/transport.py:219: CryptographyDeprecationWarning: Blowfish has been deprecated This can be resolved by upgrading to the latest paramiko module. # check current version then upgrade pip3 show paramiko pip3 install paramiko --upgrade # check upgraded version pip3 show paramiko In my case, this ... Python: fixing ‘CryptographyDeprecationWarning: Blowfish has been deprecated’
In this article I will demonstrate how to take a Terraform configuration that is using a local state file and migrate its persistent state to a remote Google Cloud Storage bucket (GCS). We will then perform the migration again, but this time to bring the remote state back to a local file. We will illustrate ... Terraform: migrate state from local to remote Google Cloud Storage bucket and back
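After the gcs backend block is added (or removed), the migration itself is a single init call:
# terraform detects the backend change and offers to copy existing state
terraform init -migrate-state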
If you are using Anthos GKE on-premise and need to determine which node of your Admin Cluster is the master, query for the master role. The label is ‘node-role.kubernetes.io/master’. $ kubectl get nodes -l node-role.kubernetes.io/master NAME STATUS ROLES AGE VERSION gke-admin-master-adfwa Ready control-plane,master 7d v1.24.9-gke.100 # using wide will also show External and Internal IP ... GKE: Determine Anthos on-prem GKE master node and IP address
Listing all the pods belonging to a deployment can be done by querying its selectors, but using the deployment’s synthesized replicaset identifier allows for easier automation. # deployment name and namespace deployment_name=mydeployment deployment_ns=mynamespace # get replica set identifier for deployment dep_rs=$(kubectl describe deployment $deployment_name -n $deployment_ns | grep ^NewReplicaSet | awk '{print $2}') # get ... Kubernetes: list all pods in deployment
The dig utility is convenient for doing manual DNS resolution from your system. Additionally, it uses the same OS resolver libraries as your applications, which makes it more accurate than nslookup for emulating application issues, and its output is more suitable for machine parsing. # ensure 'dig' is installed sudo apt install -y bind9-dnsutils dig ... Bash: using dig for reverse DNS lookup by IP
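A reverse (PTR) lookup then looks like this:
# -x builds the in-addr.arpa query; +short prints just the answer
dig -x 8.8.8.8 +short   # dns.google.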
For troubleshooting DNS issues, running the dig utility directly from OpenWrt can be essential. This is easily done by installing the ‘bind-dig’ package as shown below. opkg update opkg install bind-dig REFERENCES Opkg official docs openwrt forums, bind-dig package missing
If you are upgrading from Ubuntu 20 to Ubuntu 22 using ‘do-release-upgrade’ and get a fatal error ‘Connection to the Snap Store failed’, this may be resolved by removing the ‘lxd’ package which is a lightweight container supervisor. sudo /etc/init.d/lxd stop sudo rm -fr /var/lib/lxd sudo dpkg --force depends -P lxd; sudo dpkg --force ... Ubuntu: ‘Connection to the Snap Store failed’ during upgrade from Ubuntu 20 to 22