Introduction
PostgreSQL is a powerful, open-source relational database system. This tutorial explains how the Postgres Operator from Crunchy Data v5 automates and simplifies deploying and managing PostgreSQL clusters on Kubernetes. Within a few minutes, you can have a production-grade Postgres cluster with high availability for disaster recovery.
Prerequisites
Before you begin, you should:
- Deploy a Rcs Kubernetes Cluster.
- Deploy a Rcs Object Storage.
- Configure
kubectl
andgit
in your machine.
1. Install PGO, the Postgres Operator from Crunchy Data
Clone the official example repository from Crunchy Data.
$ git clone --depth=1 https://github.com/CrunchyData/postgres-operator-examples $ cd cd postgres-operator-examples
Install the PGO.
$ kubectl apply -k kustomize/install
Check if PGO is READY.
$ kubectl get pods -n postgres-operator
The result should look like:
NAME READY STATUS RESTARTS AGE
pgo-59c4f987b6-6pj72 1/1 Running 0 44s
2. Prepare a Rcs Object Storage
A Rcs Object Storage stores the Write-Ahead-Logging files and daily backups of your Postgres cluster.
- Create a Rcs Object Storage.
- Create a bucket
postgres-demo-bucket
inside that Object Storage
3. Prepare a Manifest for Your Postgres Cluster
In the postgres-operator-examples
repository, there are multiple examples to create Postgres clusters. In this tutorial, you use the postgres-operator-examples/kustomize/s3
as the starting point.
Change directory to
kustomize/s3/
folder.$ cd kustomize/s3
Copy the file
s3.conf.example
tos3.conf
.$ cp s3.conf.example s3.conf
Set your Rcs Object Storage Access Key and Secret Key into s3.conf file. Here is the example content of this tutorial.
[global] repo1-s3-key=OR70GNH<redacted>HVKG3X repo1-s3-key-secret=MnsrWR5kKAZ<redacted>83P3b5J2BdY5pU
Open the file
postgres.yaml
and find the following section.s3: bucket: "<YOUR_AWS_S3_BUCKET_NAME>" endpoint: "<YOUR_AWS_S3_ENDPOINT>" region: "<YOUR_AWS_S3_REGION>"
Replace
"<YOUR_AWS_S3_ENDPOINT>"
with the Hostname of your Rcs Object Storage. Replace"<YOUR_AWS_S3_BUCKET_NAME>"
with the bucket name in section 2. Replace<YOUR_AWS_S3_REGION>
with any text. Here is the example content of this tutorial.s3: bucket: "postgres-demo-bucket" endpoint: "ewr1.vultrobjects.com" region: "default"
Add the
repo1-s3-uri-style: path
to theglobal
section as follows:global: repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1 repo1-s3-uri-style: path
Add a new section under the
spec
section to back up the Write-AHead-Logging (WAL) every 60 seconds:spec: patroni: dynamicConfiguration: postgresql: parameters: archive_timeout: 60
Rcs Block Storage requires to have a minimum size of 10GB. Change the
storage: 1Gi
tostorage: 10Gi
.Add another
repo2
to therepos
section, which has avolume
instead of ans3
. This creates another Rcs Block Storage to save the Write-AHead-Logging and daily backups. Here is an example configurationglobal: repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1 repo1-s3-uri-style: path repos: - name: repo1 s3: bucket: "postgres-demo-bucket" endpoint: "ewr1.vultrobjects.com" region: "default" - name: repo2 volume: volumeClaimSpec: accessModes: - "ReadWriteOnce" resources: requests: storage: 10Gi
The final content of the postgres.yaml
should be as follows:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: hippo-s3
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
postgresVersion: 14
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
patroni:
dynamicConfiguration:
postgresql:
parameters:
archive_timeout: 60
backups:
pgbackrest:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1
configuration:
- secret:
name: pgo-s3-creds
global:
repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
repo1-s3-uri-style: path
repos:
- name: repo1
s3:
bucket: "postgres-demo-bucket"
endpoint: "ewr1.vultrobjects.com"
region: "default"
- name: repo2
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
4. Create a Postgres Cluster
Under the s3 folder, run the following command to create the Postgres Cluster
$ kubectl apply -k .
Check the running pods with
kubectl get pods -n postgres-operator
. The result should look likeNAME READY STATUS RESTARTS AGE hippo-s3-00-7nt4-0 2/2 Running 0 96s pgo-59c4f987b6-nzpnn 1/1 Running 0 10m
If you see a similar result, you have successfully deployed a Postgres cluster on Rcs Kubernetes Engine with the following features:
- Save the data of the Postgres Cluster in a Rcs Block Storage with a size of 10GB.
- Upload Write-AHead-Logging (WAL) every 60 seconds to a Rcs Object Storage.
- The WAL files and backups are stored inside another Rcs Block Storage in the same Kubernetes cluster and Rcs Object Storage in another region.
5. Connect to the Postgres cluster
The information to connect to the Postgres Cluster is inside a secret that has the name <clusterName>-pguser-<userName>
in postgres-operator
namespace. In this tutorial, the secret is hippo-s3-pguser-hippo-s3
Get the secret
hippo-s3-pguser-hippo-s3
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o yaml
The output should look as follows. The value in
data
are base64-encoded strings.apiVersion: v1 data: dbname: aGlwcG8tczM= host: aGlwcG8tczMtcHJpbWFyeS5wb3N0Z3Jlcy1vcGVyYXRvci5zdmM= jdbc-uri: amRiYzpwb3N0Z3Jlc3FsOi8vaGlwcG8tczMtcHJpbWFyeS5wb3N0Z3Jlcy1vcGVyYXRvci5zdmM6NTQzMi9oaXBwby1zMz9wYXNzd29yZD1vJTNCJTVEQlhmWFo0azUlNUVUZyU0MDklM0IlMkJGTyUyQTduJTNCJnVzZXI9aGlwcG8tczM= password: bztdQlhmWFo0azVeVGdAOTsrRk8qN247 port: NTQzMg== uri: cG9zdGdyZXNxbDovL2hpcHBvLXMzOm87JTVEQlhmWFo0azUlNUVUZyU0MDk7K0ZPJTJBN247QGhpcHBvLXMzLXByaW1hcnkucG9zdGdyZXMtb3BlcmF0b3Iuc3ZjOjU0MzIvaGlwcG8tczM= user: aGlwcG8tczM= verifier: U0NSQU0tU0hBLTI1NiQ0MDk2OnJsRlNIUERLU1VmNDE0KzNLNlN4Qmc9PSR4Z2dTbjgzaFk1QjZYSERoR2gxbjdvZmdIUWNUNnJRamZHUGwvdUVFQUVrPTo2YUFiSk9pUSs2cVVtUzZTNkpwbW1McFJXeDFFVGdFcTdKSVQ1UnozSmR3PQ== kind: Secret metadata: creationTimestamp: "2022-03-12T07:24:42Z" labels: postgres-operator.crunchydata.com/cluster: hippo-s3 postgres-operator.crunchydata.com/pguser: hippo-s3 postgres-operator.crunchydata.com/role: pguser name: hippo-s3-pguser-hippo-s3 namespace: postgres-operator ownerReferences: - apiVersion: postgres-operator.crunchydata.com/v1beta1 blockOwnerDeletion: true controller: true kind: PostgresCluster name: hippo-s3 uid: 2743b032-51d0-46e7-ace8-fef49eb305a1 resourceVersion: "3380" uid: f7664af8-f371-4a3a-b4ac-2871e2abda02 type: Opaque
Decode the value for the root password using the following command
$ echo 'bztdQlhmWFo0azVeVGdAOTsrRk8qN247' | base64 --decode
Run the following commands to get the user, database name, and password
$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.user | base64decode}}' $ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.dbname | base64decode}}' $ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o go-template='{{.data.password | base64decode}}'
Get the pod's name that is the primary node of the Postgres Cluster (
pod/hippo-s3-00-7nt4-0
in this tutorial).$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=master
Run port-forward to access the pod through localhost with port 5432. Replace
hippo-s3-00-7nt4-0
with your pod name. Then, connect to your Postgres with your favorite database tool with the above credentials.$ kubectl -n postgres-operator port-forward hippo-s3-00-7nt4-0 5432:5432
6. Customize the Postgres Cluster
Here are some customizations that you may need. Whenever you change the postgres.yaml
file, make sure that you run the apply command as follows to apply the changes to the Postgres cluster with htis command:
$ kubectl apply -k .
Daily Backup to S3
Add a schedules
section to repo
to allow automatically full backup every day at 1 a.m. and incremental backup every 4 hours. You can change the Cron schedule expression as you want.
Add repo1-retention-full
field to global
field to automatically remove old backups.
Here is an example configuration:
global:
repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
repo1-s3-uri-style: path
repo1-retention-full: "14"
repo1-retention-full-type: "count"
repo2-retention-full: "14"
repo2-retention-full-type: "count"
repos:
- name: repo1
schedules:
full: '0 1 * * *'
incremental: "0 */4 * * *"
s3:
bucket: "postgres-demo-bucket"
endpoint: "ewr1.vultrobjects.com"
region: "default"
- name: repo2
schedules:
full: '0 1 * * *'
incremental: "0 */4 * * *"
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
Add replicas to the Postgres Cluster
Add replicas: 3
under the instances to get two more replicas in the Postgres Cluster.
Here is an example configuration:
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
postgresVersion: 14
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
replicas: 3
Enable Synchronous Replication
Synchronous Replication is useful for workloads that are sensitive to losing transactions.
The trade-offs are:
- Take longer for a transaction to commit
- A crash in synchronous replicas blocks writes to the primary.
Under the patroni
section, add the synchronous_mode: true
and synchronous_commit: "on"
as follows:
patroni:
dynamicConfiguration:
synchronous_mode: true
postgresql:
parameters:
synchronous_commit: "on"
archive_timeout: 60
Connection Pooling
Connection Pooling is useful when scaling and maintaining the connection between the application and the database, especially if you are using a serverless architecture for your application.
Add the
proxy
section underspec
to enable the connection pooling with PgBouncer connection pooler. You can also specify the number of replicas of the poolingspecs: proxy: pgBouncer: image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-1 replicas: 2
Run the following command to see the pods for connection pooling
$ kubectl get pod -n postgres-operator -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=pgbouncer
Run the following command to see the secrets. You should see the new attributes for the connection pooling including
pgbouncer-host
,pgbouncer-jdbc-uri
,pgbouncer-port
andpgbouncer-uri
.$ kubectl -n postgres-operator get secrets hippo-s3-pguser-hippo-s3 -o yaml
Run port-forwarding to access the PgBouncer service through localhost with port 5432.
$ kubectl -n postgres-operator port-forward service/hippo-s3-pgbouncer 5432:5432
Perform a Manual Backup
Create a section under the pgbackrest as follows:
backup: pgbackrest: manual: repoName: repo1 options: - --type=full
Annotate the Postgres Cluster to trigger a one-off backup
$ kubectl annotate -n postgres-operator postgrescluster hippo-s3 --overwrite postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
Here is the final postgres.yaml
file which all the above customizations.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: hippo-s3
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0
postgresVersion: 14
instances:
- dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
replicas: 3
patroni:
dynamicConfiguration:
synchronous_mode: true
postgresql:
parameters:
synchronous_commit: "on"
archive_timeout: 60
proxy:
pgBouncer:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-1
replicas: 2
backups:
pgbackrest:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1
configuration:
- secret:
name: pgo-s3-creds
global:
repo1-path: /pgbackrest/postgres-operator/hippo-s3/repo1
repo1-s3-uri-style: path
repo1-retention-full: "14"
repo1-retention-full-type: "count"
repo2-retention-full: "14"
repo2-retention-full-type: "count"
manual:
repoName: repo1
options:
- --type=full
repos:
- name: repo1
schedules:
full: '0 1 * * *'
incremental: "0 */4 * * *"
s3:
bucket: "postgres-demo-bucket"
endpoint: "ewr1.vultrobjects.com"
region: "default"
- name: repo2
schedules:
full: '0 1 * * *'
incremental: "0 */4 * * *"
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 10Gi
7. Troubleshooting Tips
Use
kubectl describe
to check parameters and events of the Postgres Cluster$ kubectl describe -n postgres-operator postgrescluster hippo-s3
Get all the events under the
postgres-operator
namespace$ kubectl get events -n postgres-operator --sort-by='.metadata.creationTimestamp'
Get the pod name of the primary pod. Replace
hippo-s3
with your cluster name$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/role=master
Get the pod name of the pod which mounts the volume of the volume backup repository named
repo2
in this tutorial.$ kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo-s3,postgres-operator.crunchydata.com/data=pgbackrest
Get a shell into the pod that controls the volume backup repository. Replace
<POD_NAME>
with your pod name$ kubectl exec -n postgres-operator <POD_NAME> -it -- /bin/bash
In the shell, you can query the files inside the Rcs Block Storage that stores the data of repo2. Run
df -h
to get the mounted location of the Rcs Block Storage (/pgbackrest/repo2
in this tutorial).Get a shell into the primary pod. Replace
<POD_NAME>
with your pod name$ kubectl exec -n postgres-operator <POD_NAME> -it -- /bin/bash
You can access the pgbackrest tool that backs up and restores the database in the shell. Here are some useful commands
Get information of the backup repos
$ pgbackrest info
Check the pgbackrest configuration
$ pgbackrest check --stanza=db
Find the list of available Write-AHead-Logging files
$ ls -l /pgdata/pg14_wal
Find the list of uploaded Write-AHead-Logging files
$ ls -l /pgdata/pg14_wal/archive_status
Check the pgbackrest log files
$ ls /pgdata/pgbackrest/log $ cat /pgdata/pgbackrest/log/db-backup.log $ cat /pgdata/pgbackrest/log/db-expire.log $ cat /pgdata/pgbackrest/log/db-stanza-create.log