Introduction
PostgreSQL is a powerful, open-source relational database system. This tutorial explains how PGO v5, the Postgres Operator from Crunchy Data, automates and simplifies deploying and managing PostgreSQL clusters on Kubernetes. Within a few minutes, you can have a production-grade Postgres cluster with high availability and backups for disaster recovery.
Prerequisites
Before you begin, you should:
- Deploy a Rcs Kubernetes Cluster.
- Deploy a Rcs Object Storage.
- Configure kubectl and git on your machine.
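As a quick sanity check before starting, a small script can confirm that both tools are on your PATH. This is a minimal sketch; the helper name missing_tools is made up for illustration and is not part of the tutorial.

```shell
# Hypothetical helper: print the name of every listed tool that is
# not available on PATH. Prints nothing when all tools are found.
missing_tools() {
  for tool in "$@"; do
    # "command -v" exits non-zero when the tool cannot be found.
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage:
#   missing_tools kubectl git
```

If the call prints nothing, both tools are installed; otherwise it lists what still needs to be installed.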
1. Install PGO, the Postgres Operator from Crunchy Data
Clone the official example repository from Crunchy Data.
$ git clone --depth=1 https://github.com/CrunchyData/postgres-operator-examples
$ cd postgres-operator-examples
Install PGO.
$ kubectl apply -k kustomize/install
Check if PGO is READY.
$ kubectl get pods -n postgres-operator
The result should look like:
NAME                   READY   STATUS    RESTARTS   AGE
pgo-59c4f987b6-6pj72   1/1     Running   0          44s
2. Prepare a Rcs Object Storage
A Rcs Object Storage stores the write-ahead log (WAL) files and daily backups of your Postgres cluster.
- Create a Rcs Object Storage.
- Create a bucket named postgres-demo-bucket inside that Object Storage.
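The bucket can also be created from the command line with any S3-compatible client. The sketch below assumes the AWS CLI is installed and already configured with the Access Key and Secret Key of your Object Storage; neither the AWS CLI nor the helper name create_bucket comes from this tutorial, and ewr1.vultrobjects.com is the example hostname used in section 3.

```shell
# Assumption: the AWS CLI is installed and configured with the keys of
# the Object Storage. The helper name is illustrative only.
create_bucket() {
  bucket=$1
  endpoint_host=$2
  # "aws s3 mb" makes a bucket; --endpoint-url points the CLI at an
  # S3-compatible service instead of AWS itself.
  aws s3 mb "s3://${bucket}" --endpoint-url "https://${endpoint_host}"
}

# Usage:
#   create_bucket postgres-demo-bucket ewr1.vultrobjects.com
```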
3. Prepare a Manifest for Your Postgres Cluster
The postgres-operator-examples repository contains several examples for creating Postgres clusters. This tutorial uses postgres-operator-examples/kustomize/s3 as the starting point.
Change directory to the kustomize/s3/ folder.
$ cd kustomize/s3
Copy the file s3.conf.example to s3.conf.
$ cp s3.conf.example s3.conf
Set your Rcs Object Storage Access Key and Secret Key in the s3.conf file. Here is the example content for this tutorial.
[global]
repo1-s3-key=OR70GNH<redacted>HVKG3X
repo1-s3-key-secret=MnsrWR5kKAZ<redacted>83P3b5J2BdY5pU
Open the file postgres.yaml and find the following section.
s3:
  bucket: "<YOUR_AWS_S3_BUCKET_NAME>"
  endpoint: "<YOUR_AWS_S3_ENDPOINT>"
  region: "<YOUR_AWS_S3_REGION>"
Replace "<YOUR_AWS_S3_ENDPOINT>" with the hostname of your Rcs Object Storage. Replace "<YOUR_AWS_S3_BUCKET_NAME>" with the bucket name from section 2. Replace "<YOUR_AWS_S3_REGION>" with any text. Here is the example content for this tutorial.
s3:
  bucket: "postgres-demo-bucket"
  endpoint: "ewr1.vultrobjects.com"
  region: "default"
Add repo1-s3-uri-style: path to the global section as follows:
global:
  repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
  repo1-s3-uri-style: path
Add a new section under the spec section so that the write-ahead log (WAL) is archived at least every 60 seconds:
spec:
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          archive_timeout: 60
Rcs Block Storage requires a minimum size of 10GB. Change storage: 1Gi to storage: 10Gi.
Add another repository, repo2, to the repos section, with a volume instead of an s3 entry. This creates another Rcs Block Storage to save the WAL files and daily backups. Here is an example configuration:
global:
  repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
  repo1-s3-uri-style: path
repos:
- name: repo1
  s3:
    bucket: "postgres-demo-bucket"
    endpoint: "ewr1.vultrobjects.com"
    region: "default"
- name: repo2
  volume:
    volumeClaimSpec:
      accessModes:
      - "ReadWriteOnce"
      resources:
        requests:
          storage: 10Gi
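For the s3.conf step earlier in this section, the file can be generated from environment variables instead of being edited by hand, which keeps the keys out of your editor history. This is a sketch only: the variable names S3_ACCESS_KEY and S3_SECRET_KEY and the helper write_s3_conf are assumptions for illustration.

```shell
# Hypothetical helper: write s3.conf from two environment variables.
# The body runs in a subshell, so a missing variable makes the function
# fail without terminating the calling script.
write_s3_conf() (
  out_file=${1:-s3.conf}
  : "${S3_ACCESS_KEY:?set S3_ACCESS_KEY first}"
  : "${S3_SECRET_KEY:?set S3_SECRET_KEY first}"
  # A newly created file is readable only by its owner, because it
  # holds credentials.
  umask 077
  printf '[global]\nrepo1-s3-key=%s\nrepo1-s3-key-secret=%s\n' \
    "$S3_ACCESS_KEY" "$S3_SECRET_KEY" > "$out_file"
)

# Usage (run inside kustomize/s3 after setting both variables):
#   write_s3_conf s3.conf
```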
The final content of the postgres.yaml should be as follows:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo-s3
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
  postgresVersion: 14
  instances:
    - dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 10Gi
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          archive_timeout: 60
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1
      configuration:
      - secret:
          name: pgo-s3-creds
      global:
        repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
        repo1-s3-uri-style: path
      repos:
      - name: repo1
        s3:
          bucket: "postgres-demo-bucket"
          endpoint: "ewr1.vultrobjects.com"
          region: "default"
      - name: repo2
        volume:
          volumeClaimSpec:
            accessModes:
            - "ReadWriteOnce"
            resources:
              requests:
                storage: 10Gi
4. Create a Postgres Cluster
Under the s3 folder, run the following command to create the Postgres cluster.
$ kubectl apply -k .
Check the running pods with kubectl get pods -n postgres-operator. The result should look like:
NAME                   READY   STATUS    RESTARTS   AGE
hippo-s3-00-7nt4-0     2/2     Running   0          96s
pgo-59c4f987b6-nzpnn   1/1     Running   0          10m
If you see a similar result, you have successfully deployed a Postgres cluster on Rcs Kubernetes Engine with the following features:
- The data of the Postgres cluster is saved in a Rcs Block Storage with a size of 10GB.
- Write-ahead log (WAL) files are uploaded to a Rcs Object Storage every 60 seconds.
- The WAL files and backups are stored both in another Rcs Block Storage in the same Kubernetes cluster and in a Rcs Object Storage in another region.
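Instead of re-running kubectl get pods by hand, a script can wait until the primary pod reports Ready. The following is a minimal sketch: it relies on the role=master label that this tutorial uses in section 5 to find the primary, and the helper name and default timeout are illustrative assumptions. Note that kubectl wait returns an error right away if no pod carries the label yet, so it may need to be retried shortly after the cluster is created.

```shell
# Sketch: block until the primary pod of a cluster is Ready, or time out.
# The 300-second default is an arbitrary choice for this example.
wait_for_primary() {
  cluster=$1
  timeout=${2:-300s}
  kubectl wait pod \
    --namespace postgres-operator \
    --selector "postgres-operator.crunchydata.com/cluster=${cluster},postgres-operator.crunchydata.com/role=master" \
    --for=condition=Ready \
    --timeout="$timeout"
}

# Usage:
#   wait_for_primary hippo-s3
```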
5. Connect to the Postgres cluster
The connection information for the Postgres cluster is stored in a secret named <clusterName>-pguser-<userName> in the postgres-operator namespace. In this tutorial, the secret is hippo-s3-pguser-hippo-s3.
Get the secret hippo-s3-pguser-hippo-s3.
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o yaml
The output should look as follows. The values in data are base64-encoded strings.
apiVersion: v1
data:
  dbname: aGlwcG8tczM=
  host: aGlwcG8tczMtcHJpbWFyeS5wb3N0Z3Jlcy1vcGVyYXRvci5zdmM=
  jdbc-uri: amRiYzpwb3N0Z3Jlc3FsOi8vaGlwcG8tczMtcHJpbWFyeS5wb3N0Z3Jlcy1vcGVyYXRvci5zdmM6NTQzMi9oaXBwby1zMz9wYXNzd29yZD1vJTNCJTVEQlhmWFo0azUlNUVUZyU0MDklM0IlMkJGTyUyQTduJTNCJnVzZXI9aGlwcG8tczM=
  password: bztdQlhmWFo0azVeVGdAOTsrRk8qN247
  port: NTQzMg==
  uri: cG9zdGdyZXNxbDovL2hpcHBvLXMzOm87JTVEQlhmWFo0azUlNUVUZyU0MDk7K0ZPJTJBN247QGhpcHBvLXMzLXByaW1hcnkucG9zdGdyZXMtb3BlcmF0b3Iuc3ZjOjU0MzIvaGlwcG8tczM=
  user: aGlwcG8tczM=
  verifier: U0NSQU0tU0hBLTI1NiQ0MDk2OnJsRlNIUERLU1VmNDE0KzNLNlN4Qmc9PSR4Z2dTbjgzaFk1QjZYSERoR2gxbjdvZmdIUWNUNnJRamZHUGwvdUVFQUVrPTo2YUFiSk9pUSs2cVVtUzZTNkpwbW1McFJXeDFFVGdFcTdKSVQ1UnozSmR3PQ==
kind: Secret
metadata:
  creationTimestamp: "2022-03-12T07:24:42Z"
  labels:
    postgres-operator.crunchydata.com/cluster: hippo-s3
    postgres-operator.crunchydata.com/pguser: hippo-s3
    postgres-operator.crunchydata.com/role: pguser
  name: hippo-s3-pguser-hippo-s3
  namespace: postgres-operator
  ownerReferences:
  - apiVersion: postgres-operator.crunchydata.com/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: PostgresCluster
    name: hippo-s3
    uid: 2743b032-51d0-46e7-ace8-fef49eb305a1
  resourceVersion: "3380"
  uid: f7664af8-f371-4a3a-b4ac-2871e2abda02
type: Opaque
Decode the value of the password field using the following command.
$ echo 'bztdQlhmWFo0azVeVGdAOTsrRk8qN247' | base64 --decode
Alternatively, run the following commands to get the decoded user, database name, and password directly.
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.user | base64decode}}'
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.dbname | base64decode}}'
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.password | base64decode}}'
Get the name of the pod that is the primary node of the Postgres cluster (pod/hippo-s3-00-7nt4-0 in this tutorial).
$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=master
Run port-forward to access the pod through localhost on port 5432. Replace hippo-s3-00-7nt4-0 with your pod name.
$ kubectl -n postgres-operator port-forward hippo-s3-00-7nt4-0 5432:5432
Then, connect to Postgres with your favorite database tool using the above credentials.
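If the database tool of choice is psql, the credential lookups above can be wrapped in two small functions. This is a sketch under stated assumptions: psql is installed locally, the port-forward from the previous step is running in another terminal, and the helper names secret_field and connect_via_port_forward are made up for this example.

```shell
# Read one field of the user secret and base64-decode it, using the
# same go-template approach as the commands in this section.
secret_field() {
  kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 \
    -o go-template="{{.data.$1 | base64decode}}"
}

# Assumption: psql is installed and localhost:5432 is forwarded to the
# primary pod. PGPASSWORD is the standard libpq password variable.
connect_via_port_forward() {
  PGPASSWORD=$(secret_field password) \
    psql --host localhost --port 5432 \
      --username "$(secret_field user)" \
      --dbname "$(secret_field dbname)"
}

# Usage:
#   connect_via_port_forward
```

Passing the password through PGPASSWORD keeps it out of the command line, where other users on the machine could see it in the process list.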
6. Customize the Postgres Cluster
Here are some customizations that you may need. Whenever you change the postgres.yaml file, run the following command to apply the changes to the Postgres cluster:
$ kubectl apply -k .
Daily Backup to S3
Add a schedules section to each repo to run a full backup automatically every day at 1 a.m. and an incremental backup every 4 hours. You can change the cron schedule expressions as needed.
Add the repo1-retention-full field to the global section to automatically remove old backups.
Here is an example configuration:
global:
  repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
  repo1-s3-uri-style: path
  repo1-retention-full: "14"
  repo1-retention-full-type: "count"
  repo2-retention-full: "14"
  repo2-retention-full-type: "count"
repos:
- name: repo1
  schedules:
    full: '0 1 * * *'
    incremental: "0 */4 * * *"
  s3:
    bucket: "postgres-demo-bucket"
    endpoint: "ewr1.vultrobjects.com"
    region: "default"
- name: repo2
  schedules:
    full: '0 1 * * *'
    incremental: "0 */4 * * *"
  volume:
    volumeClaimSpec:
      accessModes:
      - "ReadWriteOnce"
      resources:
        requests:
          storage: 10Gi
Add replicas to the Postgres Cluster
Add replicas: 3 to the entry under instances to run three Postgres instances: one primary and two replicas.
Here is an example configuration:
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
  postgresVersion: 14
  instances:
    - dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 10Gi
      replicas: 3
Enable Synchronous Replication
Synchronous Replication is useful for workloads that are sensitive to losing transactions.
The trade-offs are:
- Transactions take longer to commit.
- A crash of a synchronous replica can block writes to the primary.
Under the patroni section, add synchronous_mode: true and synchronous_commit: "on" as follows:
patroni:
  dynamicConfiguration:
    synchronous_mode: true
    postgresql:
      parameters:
        synchronous_commit: "on"
        archive_timeout: 60
Connection Pooling
Connection pooling helps you scale and maintain the connections between the application and the database, especially if your application uses a serverless architecture.
Add the proxy section under spec to enable connection pooling with the PgBouncer connection pooler. You can also specify the number of replicas of the pooler.
spec:
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-1
      replicas: 2
Run the following command to see the pods for connection pooling.
$ kubectl get pod -n postgres-operator -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=pgbouncer
Run the following command to see the secret. You should see the new attributes for connection pooling, including pgbouncer-host, pgbouncer-jdbc-uri, pgbouncer-port, and pgbouncer-uri.
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o yaml
Run port-forwarding to access the PgBouncer service through localhost on port 5432.
$ kubectl -n postgres-operator port-forward service/hippo-s3-pgbouncer 5432:5432
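The pgbouncer-* attributes can be decoded with a go-template as well, with one catch: a key containing a hyphen cannot be written as .data.pgbouncer-uri in a Go template, so the built-in index function is used to look it up by name. The helper name pgbouncer_field below is an assumption for illustration.

```shell
# Decode one of the pgbouncer-* fields of the user secret.
# "index .data KEY" is needed because the key contains a hyphen.
pgbouncer_field() {
  kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 \
    -o go-template="{{index .data \"pgbouncer-$1\" | base64decode}}"
}

# Usage:
#   pgbouncer_field host
#   pgbouncer_field port
#   pgbouncer_field uri
```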
Perform a Manual Backup
Create a manual section under the pgbackrest section as follows:
backups:
  pgbackrest:
    manual:
      repoName: repo1
      options:
      - --type=full
Annotate the Postgres cluster to trigger a one-off backup.
$ kubectl annotate -n postgres-operator postgrescluster hippo-s3 --overwrite postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
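The annotate command can be wrapped in a function so that each call writes a fresh annotation value. This sketch uses a UTC timestamp in place of the plain date output; that format, and the helper name trigger_manual_backup, are choices made for this example rather than anything PGO requires, as far as this tutorial shows.

```shell
# Sketch: annotate the cluster with a new, sortable UTC timestamp so
# that every call changes the annotation value.
trigger_manual_backup() {
  cluster=$1
  stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  kubectl annotate -n postgres-operator postgrescluster "$cluster" --overwrite \
    "postgres-operator.crunchydata.com/pgbackrest-backup=${stamp}"
}

# Usage:
#   trigger_manual_backup hippo-s3
#   kubectl get jobs -n postgres-operator   # look for the resulting backup job
```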
Here is the final postgres.yaml file with all the above customizations.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo-s3
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
  postgresVersion: 14
  instances:
    - dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 10Gi
      replicas: 3
  patroni:
    dynamicConfiguration:
      synchronous_mode: true
      postgresql:
        parameters:
          synchronous_commit: "on"
          archive_timeout: 60
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-1
      replicas: 2
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1
      configuration:
      - secret:
          name: pgo-s3-creds
      global:
        repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
        repo1-s3-uri-style: path
        repo1-retention-full: "14"
        repo1-retention-full-type: "count"
        repo2-retention-full: "14"
        repo2-retention-full-type: "count"
      manual:
        repoName: repo1
        options:
        - --type=full
      repos:
      - name: repo1
        schedules:
          full: '0 1 * * *'
          incremental: "0 */4 * * *"
        s3:
          bucket: "postgres-demo-bucket"
          endpoint: "ewr1.vultrobjects.com"
          region: "default"
      - name: repo2
        schedules:
          full: '0 1 * * *'
          incremental: "0 */4 * * *"
        volume:
          volumeClaimSpec:
            accessModes:
            - "ReadWriteOnce"
            resources:
              requests:
                storage: 10Gi
7. Troubleshooting Tips
Use kubectl describe to check the parameters and events of the Postgres cluster.
$ kubectl describe -n postgres-operator postgrescluster hippo-s3
Get all the events in the postgres-operator namespace.
$ kubectl get events -n postgres-operator --sort-by='.metadata.creationTimestamp'
Get the name of the primary pod. Replace hippo-s3 with your cluster name.
$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=master
Get the name of the pod that mounts the volume of the volume backup repository, named repo2 in this tutorial.
$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/data=pgbackrest
Get a shell into the pod that controls the volume backup repository. Replace <POD_NAME> with your pod name.
$ kubectl exec -n postgres-operator <POD_NAME> -it -- /bin/bash
In the shell, you can inspect the files inside the Rcs Block Storage that stores the data of repo2. Run df -h to get the mounted location of the Rcs Block Storage (/pgbackrest/repo2 in this tutorial).
Get a shell into the primary pod. Replace <POD_NAME> with your pod name.
$ kubectl exec -n postgres-operator <POD_NAME> -it -- /bin/bash
In this shell, you can use the pgbackrest tool, which backs up and restores the database. Here are some useful commands.
Get information about the backup repos.
$ pgbackrest info
Check the pgbackrest configuration.
$ pgbackrest check --stanza=db
List the available WAL files.
$ ls -l /pgdata/pg14_wal
Check which WAL files have been archived (uploaded). Files ending in .done mark segments that were archived successfully.
$ ls -l /pgdata/pg14_wal/archive_status
Check the pgbackrest log files.
$ ls /pgdata/pgbackrest/log
$ cat /pgdata/pgbackrest/log/db-backup.log
$ cat /pgdata/pgbackrest/log/db-expire.log
$ cat /pgdata/pgbackrest/log/db-stanza-create.log
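For a quick check that does not need an interactive shell, the primary pod lookup and pgbackrest info can be chained in one script. This is a minimal sketch; the helper names are illustrative, and it assumes kubectl exec selects the Postgres container by default (add -c with the container name if it does not in your cluster).

```shell
# Print the name of the primary pod (for example pod/hippo-s3-00-7nt4-0),
# using the same label selector as the command earlier in this section.
primary_pod() {
  kubectl get pod -n postgres-operator -o name \
    -l "postgres-operator.crunchydata.com/cluster=$1,postgres-operator.crunchydata.com/role=master"
}

# Run "pgbackrest info" inside the primary pod without opening a shell.
backup_info() {
  pod=$(primary_pod "$1")
  if [ -z "$pod" ]; then
    echo "no primary pod found for cluster $1" >&2
    return 1
  fi
  kubectl exec -n postgres-operator "$pod" -- pgbackrest info
}

# Usage:
#   backup_info hippo-s3
```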