diff --git a/docs/architecture/comparison.md b/docs/architecture/comparison.md index 75e1b06f..fde587d5 100644 --- a/docs/architecture/comparison.md +++ b/docs/architecture/comparison.md @@ -10,7 +10,7 @@ PgDog aims to be the de facto PostgreSQL proxy and pooler. Below is a feature co | Feature | PgBouncer | PgCat | PgDog | |-|-|-|-| -| [Connection pooler](../features/transaction-mode.md) | :material-check-circle-outline: | :material-check-circle-outline: | :material-check-circle-outline: | +| [Connection pooler](../features/connection-pooler/transaction-mode.md) | :material-check-circle-outline: | :material-check-circle-outline: | :material-check-circle-outline: | | Load balancer | Requires external TCP proxy | :material-check-circle-outline: | :material-check-circle-outline: | | [Read/write separation](../features/load-balancer/index.md) | No | Basic support | Advanced support handling edge cases | | [Failover](../features/load-balancer/healthchecks.md) | No | :material-check-circle-outline: | :material-check-circle-outline: | @@ -19,7 +19,7 @@ PgDog aims to be the de facto PostgreSQL proxy and pooler. 
Below is a feature co | [Metrics](../features/metrics.md) | Admin database only | OpenMetrics & admin database | OpenMetrics & admin database | | [Mirroring](../features/mirroring.md) | No | Partial support | :material-check-circle-outline: | | TLS | :material-check-circle-outline: | :material-check-circle-outline: | :material-check-circle-outline: | -| [Prepared statements](../features/prepared-statements.md) | :material-check-circle-outline: | Partial support | :material-check-circle-outline: | +| [Prepared statements](../features/connection-pooler/prepared-statements.md) | :material-check-circle-outline: | Partial support | :material-check-circle-outline: | | [Plugins](../features/plugins/index.md) | No | Hardcoded in core | :material-check-circle-outline: | | Session evariables in transaction mode | Partial support | Partial support | :material-check-circle-outline: | diff --git a/docs/client-drivers.md b/docs/client-drivers.md index 6ca59429..7036f175 100644 --- a/docs/client-drivers.md +++ b/docs/client-drivers.md @@ -64,7 +64,7 @@ We benchmarked this to be 5 times faster than normal `pg_query` parsing, which s ### Prisma -Prisma doesn't correctly use the `IN` clause with arrays, causing it to generate a very large number of unique prepared statements. This is not a big problem, but if left unchecked, can cause heavy memory usage in PgDog. Consider setting a lower prepared statements [cache limit](features/prepared-statements.md#cache-limit): +Prisma doesn't correctly use the `IN` clause with arrays, causing it to generate a very large number of unique prepared statements. This is not a big problem, but if left unchecked, can cause heavy memory usage in PgDog. 
Consider setting a lower prepared statements [cache limit](features/connection-pooler/prepared-statements.md#cache-limit): === "pgdog.toml" ```toml diff --git a/docs/configuration/pgdog.toml/general.md b/docs/configuration/pgdog.toml/general.md index 69c5cecb..ec0cfd29 100644 --- a/docs/configuration/pgdog.toml/general.md +++ b/docs/configuration/pgdog.toml/general.md @@ -62,7 +62,7 @@ Available options: - `statement` -See [transaction mode](../../features/transaction-mode.md) and [session mode](../../features/session-mode.md) for more details on each mode. +See [transaction mode](../../features/connection-pooler/transaction-mode.md) and [session mode](../../features/connection-pooler/session-mode.md) for more details on each mode. Default: **`transaction`** @@ -170,7 +170,7 @@ from abnormal conditions like hardware failure. ### `rollback_timeout` -How long to allow for `ROLLBACK` queries to run on server connections with unfinished transactions. See [transaction mode](../../features/transaction-mode.md) for more details. +How long to allow for `ROLLBACK` queries to run on server connections with unfinished transactions. See [transaction mode](../../features/connection-pooler/transaction-mode.md) for more details. Default: **`5_000`** (5s) @@ -379,7 +379,7 @@ Default: **`extended`** ### `prepared_statements_limit` Number of prepared statements that will be allowed for each server connection. If this limit is reached, the least used statement is closed -and replaced with the newest one. Additionally, any unused statements in the [global cache](../../features/prepared-statements.md) above this +and replaced with the newest one. Additionally, any unused statements in the [global cache](../../features/connection-pooler/prepared-statements.md) above this limit will be removed. 
Default: **`none`** (unlimited) @@ -388,7 +388,7 @@ Default: **`none`** (unlimited) ### `pub_sub_channel_size` -Enables support for [pub/sub](../../features/pub_sub.md) and configures the size of the background task queue. +Enables support for [pub/sub](../../features/connection-pooler/pub_sub.md) and configures the size of the background task queue. Default: **`none`** (disabled) @@ -476,7 +476,7 @@ Default: **`1_000`** !!! warning "Deprecated setting" This setting is deprecated. Use [`query_parser`](#query_parser) instead. -Force-enable query parsing to take advantage of its features in non-sharded databases, like [advisory locks](../../features/transaction-mode.md#advisory-locks) or managing [session state](../../features/transaction-mode.md#session-state). +Force-enable query parsing to take advantage of its features in non-sharded databases, like [advisory locks](../../features/connection-pooler/transaction-mode.md#advisory-locks) or managing [session state](../../features/connection-pooler/transaction-mode.md#session-state). ### `query_parser` diff --git a/docs/configuration/users.toml/users.md b/docs/configuration/users.toml/users.md index 6ba258d3..f59d4bed 100644 --- a/docs/configuration/users.toml/users.md +++ b/docs/configuration/users.toml/users.md @@ -94,8 +94,8 @@ Overrides [`min_pool_size`](../pgdog.toml/general.md#min_pool_size) for this use ### `pooler_mode` -Overrides [`pooler_mode`](../pgdog.toml/general.md) for this user. This allows users in [session mode](../../features/session-mode.md) to connect to the -same PgDog instance as users in [transaction mode](../../features/transaction-mode.md). +Overrides [`pooler_mode`](../pgdog.toml/general.md) for this user. This allows users in [session mode](../../features/connection-pooler/session-mode.md) to connect to the +same PgDog instance as users in [transaction mode](../../features/connection-pooler/transaction-mode.md). 
Default: **none** (defaults to `pooler_mode` from `pgdog.toml`) diff --git a/docs/enterprise_edition/control_plane/installation.md b/docs/enterprise_edition/control_plane/installation.md index 51701643..027a3f2f 100644 --- a/docs/enterprise_edition/control_plane/installation.md +++ b/docs/enterprise_edition/control_plane/installation.md @@ -13,7 +13,7 @@ helm repo add pgdogdev-ee https://helm-ee.pgdog.dev helm install control pgdogdev-ee/pgdog-control ``` -The chart has a few external requirements, [documented below](#requirements). +The chart has a few external requirements, [documented below](#dependencies). ## Dependencies diff --git a/docs/features/.pages b/docs/features/.pages index a7dacbb0..75cc5871 100644 --- a/docs/features/.pages +++ b/docs/features/.pages @@ -1,9 +1,9 @@ nav: - 'index.md' + - 'connection-pooler' - 'load-balancer' - 'sharding' - 'plugins' - - 'transaction-mode.md' - '...' - 'mirroring.md' - 'multi-tenancy.md' diff --git a/docs/features/connection-pooler/.pages b/docs/features/connection-pooler/.pages new file mode 100644 index 00000000..d696560f --- /dev/null +++ b/docs/features/connection-pooler/.pages @@ -0,0 +1,5 @@ +nav: + - 'index.md' + - 'transaction-mode.md' + - 'pub_sub.md' + - '...' diff --git a/docs/features/connection-recovery.md b/docs/features/connection-pooler/connection-recovery.md similarity index 91% rename from docs/features/connection-recovery.md rename to docs/features/connection-pooler/connection-recovery.md index 1127fe40..b2a201b5 100644 --- a/docs/features/connection-recovery.md +++ b/docs/features/connection-pooler/connection-recovery.md @@ -4,7 +4,7 @@ icon: material/connection # Connection recovery -PostgreSQL database connections are expensive to create so PgDog does its best not to close them unless absolutely necessary. In case a client disconnects before fully processing a query response, PgDog will attempt to preserve the connection using several recovery steps. 
+PostgreSQL database connections are expensive to create so PgDog does its best not to close them unless absolutely necessary. In case a client disconnects before fully processing a query response, PgDog will attempt to preserve the connection using several recovery methods. ## Abandoned transactions @@ -69,7 +69,7 @@ Just like [abandoned transactions](#abandoned-transactions), this protects Postg ### Configuration -Connection recovery is an optional feature, enabled by default. You can change how it behaves through configuration: +Connection recovery is an optional feature, **enabled** by default. You can change how it behaves through configuration: === "pgdog.toml" ```toml @@ -81,6 +81,8 @@ Connection recovery is an optional feature, enabled by default. You can change h connectionRecovery: recover ``` +The following connection recovery options are available: + | Configuration value | Description | |-|-| | `recover` | Attempt full connection recovery, including rollback and resynchronization. This is the default. | @@ -103,7 +105,7 @@ To make sure abandoned server connections don't block normal operations, PgDog s Just like server connections, PgDog can maintain client connections (application --> PgDog) during incidents. This helps preserve application-side connection pools and avoids re-creating thousands of connections unnecessarily. -While enabled by default, some applications don't behave well when their queries return errors instead of results. Therefore, this feature is configurable and can be disabled: +While **enabled** by default, some applications don't behave well when their queries return errors instead of results. 
Therefore, this feature is configurable and can be disabled: === "pgdog.toml" ```toml @@ -115,6 +117,8 @@ While enabled by default, some applications don't behave well when their queries clientConnectionRecovery: drop ``` +The following client connection recovery options are available: + | Configuration value | Description | |-|-| | `recover` | Attempt to maintain client connections open after database-related errors, like `checkout timeout`. | diff --git a/docs/features/connection-pooler/index.md b/docs/features/connection-pooler/index.md new file mode 100644 index 00000000..961c5770 --- /dev/null +++ b/docs/features/connection-pooler/index.md @@ -0,0 +1,55 @@ +--- +icon: material/transit-connection-variant +--- + +# Connection pooler + +PgDog is first and foremost a connection pooler. It can proxy thousands (even hundreds of thousands) of application connections with only a handful of actual PostgreSQL connections. This feature is essential to large and busy databases. Without connection pooling, it would be very difficult to use Postgres in production. + +## PgDog vs. other poolers + +The Postgres ecosystem has many other connection poolers, e.g., the ubiquitous PgBouncer, RDS Proxy, and others. So, why build PgDog and what makes it unique? + +### Connection state + +PgDog can handle `SET` commands and preserve connection state in [transaction mode](transaction-mode.md). For example, this command works without polluting the connection state for other clients: + +```postgresql +SET application_name TO 'sidekiq'; +``` + +PgDog preserves this and other session parameters in transaction mode, allowing multiple applications to use the same connection pool. This increases pool efficiency at the small cost of running a few extra `SET` commands. + +Additionally, it's common to use GUC settings to temporarily change connection state, e.g., to work with RLS (row-level security) or to execute long queries without triggering a statement timeout. 
Applications using PgBouncer need to bypass it and connect to the database directly. With PgDog, it just works. + +!!! note "Connection pinning" + Unlike RDS Proxy, PgDog doesn't pin sessions or have a query length limit that would trigger that behavior. + +### Multithreading + +PgDog is multithreaded and asynchronous. Under the hood, we use the popular [Tokio](https://tokio.rs) Rust async runtime. This allows PgDog to serve more queries per second on machines with multiple CPUs. + +While it's possible to achieve a similar effect with PgBouncer in port reuse mode (i.e., `so_reuseport`), what sets PgDog apart is its ability to reuse the _same_ connection pool to serve more clients inside the same process. This improves pool utilization and allows PgDog to keep the number of connections to PostgreSQL low while serving more queries per second than PgBouncer. + +Additionally, PgDog is easier to manage from an infrastructure/DevOps perspective, since a single multithreaded process will emit only one set of [metrics](../metrics.md). + +### Pub/sub + +If your application uses `LISTEN`/`NOTIFY`, e.g., [DBOS](https://dbos.dev) or another job queue, it would traditionally need to connect to Postgres directly. PgDog implements its own pub/sub queue and sends and receives `LISTEN`/`NOTIFY` messages for clients connected to it. + +This allows applications to use `LISTEN`/`NOTIFY` in [transaction mode](pub_sub.md), just like any other Postgres feature. + +### Connection recovery + +Unlike other poolers, PgDog takes extra care not to close connections to Postgres unless absolutely necessary. It goes as far as to roll back abandoned transactions and drain abandoned queries. This protects the database against connection storms created by buggy applications. + +You can read more about connection recovery methods [here](connection-recovery.md). 
+ +## Read more + +{{ next_steps_links([ + ("Transaction mode", "/features/connection-pooler/transaction-mode/", "Multiplex PostgreSQL server connections across thousands of clients."), + ("Pub/Sub", "/features/connection-pooler/pub_sub/", "Use LISTEN and NOTIFY through PgDog in transaction mode."), + ("Connection recovery", "/features/connection-pooler/connection-recovery/", "Recover interrupted server connections if clients abruptly disconnect."), + ("Prepared statements", "/features/connection-pooler/prepared-statements/", "Use named prepared statements efficiently in transaction mode."), +]) }} diff --git a/docs/features/prepared-statements.md b/docs/features/connection-pooler/prepared-statements.md similarity index 66% rename from docs/features/prepared-statements.md rename to docs/features/connection-pooler/prepared-statements.md index ea14946f..3b3f8c04 100644 --- a/docs/features/prepared-statements.md +++ b/docs/features/connection-pooler/prepared-statements.md @@ -16,8 +16,9 @@ entry and gives it a unique name. The `Parse` message is then renamed and sent to Postgres. This way, multiple clients can send the same prepared statement through PgDog without causing `"duplicate prepared statement"` errors. -
- Prepared statements +
+ Prepared statements +

Prepared statements flow

While the global cache helps with statement reuse, each client keeps its own mapping of prepared statement names. @@ -44,43 +45,51 @@ This limit is strictly enforced on server connections: if a prepared statement n Since clients re-use prepared statements, this limit isn't enforced for clients: they can prepare as many statements as they wish (and you have memory for). Each statement keeps a counter of when it's used by a client. If the counter reaches zero, i.e., all clients either closed it explicitly or disconnected, the statement is removed from the global cache. -### Tracking used statements +## Tracking used statements -The number of prepared statements and what they are can be tracked by executing this command on the [admin database](../administration/index.md): +The number of prepared statements and what they are can be tracked by executing this command on the [admin database](../../administration/index.md): -``` -SHOW PREPARED; -``` +=== "Command" -Additionally, each server connection entry in [`SHOW SERVERS`](../administration/servers.md) will report the number of currently prepared statements. + ``` + SHOW PREPARED; + ``` +=== "Output" + ``` + name | statement | rewrite | used_by | memory_used + -----------+-------------------------------------------------------+---------+---------+------------- + __pgdog_1 | SELECT abalance FROM pgbench_accounts WHERE aid = $1; | | 4 | 144 + (1 row) + ``` + +Additionally, each server connection entry in the admin [`SHOW SERVERS`](../../administration/servers.md) view will report the number of currently prepared statements. + +### Metrics + +The number of prepared statements in the global cache, and for each connection pool, is reported in OTEL and OpenMetrics [exporters](../metrics.md). 
## Simple protocol -While prepared statements are typically sent using the extended protocol (`Parse`, `Bind`, `Describe`), Postgres +While prepared statements are typically sent using the extended protocol (i.e., `Parse`, `Bind`, `Describe` messages), Postgres supports preparing statements using the `PREPARE` command, and executing them using the `EXECUTE` command. PgDog supports rewriting these prepared statements to make sure their names are globally unique, just like with the extended -protocol. - -For example: - -```postgresql -PREPARE test AS SELECT * FROM users; -``` +protocol, for example: -will be rewritten by PgDog to: +=== "Original statement" + ```postgresql + PREPARE test AS SELECT * FROM users; + ``` -```postgresql -PREPARE __pgdog_1 AS SELECT * FROM users; -``` +=== "Rewritten statement" + ```postgresql + PREPARE __pgdog_1 AS SELECT * FROM users; + ``` Statements sent over the simple protocol are not checked against the global cache. Each new statement is given a unique global name. Since this requires PgDog to parse _each_ incoming query, and that's computationally expensive, this feature is **disabled** by default. -!!! note - `full` extends `extended`: it rewrites named extended-protocol statements in addition to simple-protocol `PREPARE`/`EXECUTE`. - -You can enable it in [`pgdog.toml`](../configuration/pgdog.toml/general.md#prepared_statements): +You can enable simple statement rewrites in [`pgdog.toml`](../../configuration/pgdog.toml/general.md#prepared_statements): === "pgdog.toml" ```toml @@ -95,22 +104,9 @@ You can enable it in [`pgdog.toml`](../configuration/pgdog.toml/general.md#prepa Statements prepared using this method can be executed normally with `Bind` and `Execute` messages. Result data types can be inspected with `Describe`, just like a regular prepared statement. -!!! note "Sharding support" - Currently, `EXECUTE` of prepared statements requiring [sharding](sharding/index.md) isn't supported. 
By default, the statement - will be sent to all shards. +!!! warning "Sharding support" + Currently, `EXECUTE` command for [sharded](../sharding/index.md) prepared statements is not supported. Such commands will be sent to all shards. ## Unnamed statements -By default, unnamed (or anonymous) prepared statements are not cached and are sent to Postgres as-is. This works fine for most client drivers because they send the entire query in a single request. However, some drivers, like `go/pq` do not. - -To make those drivers work, consider caching and rewriting unnamed prepared statements, like so: - -=== "pgdog.toml" - ```toml - [general] - prepared_statements = "extended_anonymous" - ``` -=== "Helm chart" - ```yaml - preparedStatements: extended_anonymous - ``` +Unnamed (aka anonymous) prepared statements are not cached and are sent to Postgres connections as-is. diff --git a/docs/features/pub_sub.md b/docs/features/connection-pooler/pub_sub.md similarity index 54% rename from docs/features/pub_sub.md rename to docs/features/connection-pooler/pub_sub.md index d7e962e9..8389078c 100644 --- a/docs/features/pub_sub.md +++ b/docs/features/connection-pooler/pub_sub.md @@ -3,17 +3,13 @@ icon: material/publish --- # Pub/sub -!!! note - This feature is new and experimental. Please [report](https://github.com/pgdogdev/pgdog/issues) any issues you may run into. - Postgres has native support for pub/sub through [`LISTEN`](https://www.postgresql.org/docs/current/sql-listen.html) and [`NOTIFY`](https://www.postgresql.org/docs/current/sql-notify.html) commands. If you're not familiar, pub/sub stands for publish/subscribe and allows you to send and listen for arbitrary messages, in real time, by using Postgres as the message broker. -Historically, this feature was only available to clients who connect to Postgres directly. -PgDog supports this in [transaction mode](transaction-mode.md), removing this limitation. 
You can use `LISTEN` and `NOTIFY`, as if you're connected directly to Postgres, with thousands of clients. +Historically, this feature was only available to clients which connect to Postgres directly. PgDog supports this in [transaction mode](transaction-mode.md), removing this limitation. You can use `LISTEN` and `NOTIFY`, as if you're connected directly to Postgres, with thousands of clients. ## How it works -You can enable pub/sub support by configuring the asynchronous message channel size in [`pgdog.toml`](../configuration/pgdog.toml/general.md): +Pub/sub support is **disabled** by default and can be enabled by configuring the asynchronous message channel size in [`pgdog.toml`](../../configuration/pgdog.toml/general.md): === "pgdog.toml" ```toml @@ -25,21 +21,22 @@ You can enable pub/sub support by configuring the asynchronous message channel s pubSubChannelSize: 4096 ``` -Clients can then use Postgres pub/sub like normal. PgDog will intercept all commands and process them internally. How each command is handled is described below. +Clients can then start using Postgres `LISTEN`/`NOTIFY` commands like normal. PgDog will intercept and process them internally. How each command is handled is described below.
- Pub/sub + Pub/sub +

Pub/sub architecture

-### `LISTEN` +### LISTEN -When PgDog receives a `LISTEN channel` command, it will register itself with Postgres on the requested channel, on the client's behalf. It does that over a dedicated server connection. If multiple clients request to listen on the same channel, PgDog will register itself only once. +When PgDog receives a `LISTEN ` command, it will register itself with Postgres on the requested channel, on the client's behalf. It does that over a dedicated server connection. If multiple clients request to listen on the same channel, PgDog will register itself only once. -### `NOTIFY` +### NOTIFY -When PgDog receives a `NOTIFY channel, payload` command, it will place it into an asynchronous queue and forward it to Postgres over a dedicated connection. This ensures that multiple instances of PgDog all receive the notification. +When PgDog receives a `NOTIFY , ''` command, it will place it into an asynchronous queue and forward it to Postgres over a dedicated connection. This ensures that multiple instances of PgDog all receive the notification. Once Postgres sends the notification back, PgDog will fan it out to all registered clients. If you have thousands of listeners, sending them a message is cheap since it's handled by a [Tokio](https://docs.rs/tokio/latest/tokio/sync/broadcast/index.html) `broadcast` channel and not by Postgres. @@ -49,18 +46,18 @@ PgDog respects transactional guarantees offered by Postgres for notifications. I If the transaction is rolled back or has an error, buffered notifications are dropped. -!!! note +!!! info "NOTIFY performance regression" This feature protects clients from sending `NOTIFY` commands to Postgres inside transactions, which has a [known](https://news.ycombinator.com/item?id=44490510) performance problem. -### `UNLISTEN` +### UNLISTEN -`UNLISTEN channel` removes the client from the list of clients interested in messages sent to that channel. 
It won't receive any more notifications, but PgDog continues to listen for them until all clients have unsubscribed or disconnected. +`UNLISTEN ` removes the client from the list of clients interested in messages sent to that channel. It won't receive any more notifications, but PgDog continues to listen for them until all clients have unsubscribed or disconnected. -Once that happens, PgDog will send the `UNLISTEN channel` command to Postgres automatically. +Once that happens, PgDog will send the `UNLISTEN ` command to Postgres automatically. ### Trade-offs -Since PgDog handles all commands, clients will get an immediate acknowledgement as soon as it processes a command. However, it doesn't mean that the command is immediately executed. If the backlog is large, it could take a few milliseconds for a command to be forwarded to Postgres. +Since PgDog handles all commands, clients will get an immediate acknowledgement as soon as it processes a command. However, it doesn't mean that the command is immediately sent to Postgres. If there is a large backlog in the background queue, it could take a few milliseconds for a command to be forwarded to the server. The size of the backlog is controlled with the `pub_sub_channel_size` setting. Once the queue is full, clients will begin to wait until the commands are processed. @@ -68,6 +65,6 @@ The size of the backlog is controlled with the `pub_sub_channel_size` setting. O `LISTEN` and `UNLISTEN` messages are guaranteed to be delivered. A client will, eventually, start receiving notifications on a channel. If there are a lot of requests in the queue, this may take a little while (a few milliseconds, typically). -A `NOTIFY` message can be lost if the dedicated connection to Postgres is broken. PgDog will not attempt re-delivery for a message on connection error. This is done to avoid duplicate notifications. +A `NOTIFY` message can be lost if the dedicated connection to Postgres is broken. 
PgDog will not attempt re-delivery for a message on connection error. This is done to avoid duplicate notifications and follows the "at most once" guarantee principle. If this happens, PgDog will attempt to re-establish the connection immediately. All subsequent messages will be delivered over the new connection. diff --git a/docs/features/session-mode.md b/docs/features/connection-pooler/session-mode.md similarity index 66% rename from docs/features/session-mode.md rename to docs/features/connection-pooler/session-mode.md index 5d14f7f2..a7496567 100644 --- a/docs/features/session-mode.md +++ b/docs/features/connection-pooler/session-mode.md @@ -3,10 +3,9 @@ icon: material/speedometer-slow --- # Session mode -In session mode, PgDog allocates one PostgreSQL server connection per client. This ensures that all PostgreSQL features work as expected, including persistent session variables, settings, and -process-based features like [`LISTEN`/`NOTIFY`](pub_sub.md). Some batch-based tasks, like ingesting large amounts of data, perform better in session mode. +In session mode, PgDog allocates one PostgreSQL server connection per client. This ensures that all PostgreSQL features not supported in [transaction mode](transaction-mode.md) work as expected. -As development of PgDog progresses, more and more session-level features will be added to [transaction mode](transaction-mode.md). Eventually, we expect this mode to no longer be useful. +As development of PgDog progresses, more and more session-level features will be added to [transaction mode](transaction-mode.md). Eventually, we expect this mode to no longer be useful. However, some batch-based tasks, like ingesting large amounts of data, could sometimes perform better in session mode. 
## Configuration @@ -28,16 +27,16 @@ Session mode can be enabled globally or on a per-user basis: ## Performance Unlike [transaction mode](transaction-mode.md), session mode doesn't allow for client <-> server connection multiplexing, so the maximum number of allowed client connections -is controlled by the [`default_pool_size`](../configuration/pgdog.toml/general.md#default_pool_size) (or [`pool_size`](../configuration/users.toml/users.md#pool_size)) settings. +is controlled by the [`default_pool_size`](../../configuration/pgdog.toml/general.md#default_pool_size) (or [`pool_size`](../../configuration/users.toml/users.md#pool_size)) settings. For example, if your database pool size is 15, only 15 clients will be able to connect and use that database via PgDog at any given moment. -!!! note - In session mode, when the connection pool reaches full capacity, a client has to disconnect before another one can connect to PgDog. +### Full connection pools - Clients attempting to connect - will wait in a queue until a client disconnects. The maximum amount of time a client is allowed to wait is controlled by the [`checkout_timeout`](../configuration/pgdog.toml/general.md#checkout_timeout) setting. +In session mode, when the connection pool reaches full capacity, a client has to disconnect before another one can connect to PgDog. + +Clients attempting to connect will wait in a queue until a client disconnects. The maximum amount of time a client is allowed to wait is controlled by the [`checkout_timeout`](../../configuration/pgdog.toml/general.md#checkout_timeout) setting. ### Benefits of session mode @@ -46,6 +45,7 @@ Using PgDog in session mode is still an improvement over connecting to PostgreSQ when a client disconnects, the PostgreSQL server connection remains intact and can be reused by another client. #### Lazy connections + Until a client issues their first query, PgDog doesn't attach it to a server connection. 
This allows one set of clients to connect before the previous set disconnects, which is common when using zero-downtime deployment strategies like blue/green [^1]. diff --git a/docs/features/transaction-mode.md b/docs/features/connection-pooler/transaction-mode.md similarity index 82% rename from docs/features/transaction-mode.md rename to docs/features/connection-pooler/transaction-mode.md index 3ebf55c2..0d8d8d62 100644 --- a/docs/features/transaction-mode.md +++ b/docs/features/connection-pooler/transaction-mode.md @@ -12,8 +12,8 @@ All queries served by PostgreSQL run inside transactions. Transactions can be st PgDog takes advantage of this behavior and can split up transactions inside client connections and send them individually, in order, to the first available PostgreSQL server in the connection pool. -
- Load balancer +
+ Load balancer
In practice, this allows thousands of client connections to re-use just one PostgreSQL server connection. Most pools will have several server connections, so hundreds of thousands of clients can use the pooler to execute queries without exceeding the database connection limit. @@ -68,11 +68,11 @@ To avoid session-level state leaking between clients, PgDog tracks connection pa This is performed efficiently, and server parameters are updated only if they differ from the ones set on the client. -!!! note "Parsing SET commands" - PgDog automatically detects `SET` commands and uses the `pg_query` SQL parser to extract the GUC/session variable. This feature is **enabled** by default. +### Parsing SET commands + +PgDog automatically detects `SET` commands and uses the native PostgreSQL query parser to extract the GUC/session variable. This feature is enabled by default. - For deployments that don't normally need the parser (i.e. unsharded, read-only or no replicas), PgDog can selectively enable its parser for `SET` commands only. This is very fast - and shouldn't have a noticeable impact on pooler performance. +For deployments that don't normally need the parser (i.e. unsharded, replica-only deployments, or a primary with no replicas), PgDog can selectively enable its parser for `SET` commands only. This is very fast and shouldn't have a noticeable impact on pooler performance. ### Connection parameters @@ -82,9 +82,9 @@ Most Postgres connection drivers support passing parameters in the connection UR postgres://user@host:6432/db?options=-c%20statement_timeout%3D3s ``` -This sets the `statement_timeout` setting to `3s` (3 seconds). Each time this client +This example sets the `statement_timeout` setting to `3s` (3 seconds). 
Each time this client executes a transaction, the pooler will check the value for `statement_timeout` on the server connection, -and if it differs, issue a command to Postgres to update it: +and if it differs, issue a `SET` command to the Postgres server connection to update it: ```postgresql SET statement_timeout TO '3s'; @@ -102,6 +102,8 @@ For example: ```postgresql SELECT pg_advisory_lock(1234); +-- Do some work outside the database. +SELECT pg_advisory_unlock(1234); ``` In transaction mode, server connections are re-used between clients, so additional care needs to be taken to keep the server connection tied to the client that created the lock. @@ -110,14 +112,14 @@ In transaction mode, server connections are re-used between clients, so addition PgDog is able to detect advisory lock usage and will pin the server connection to the client connection until one of the following conditions is met: -1. The client releases the lock with `pg_advisory_unlock` +1. The client releases the lock by calling `pg_advisory_unlock()` 2. The client disconnects !!! note "Query parser" This feature requires the query parser to be enabled, which happens if the deployment is sharded - or is using the read/write split feature of the [load balancer](load-balancer/index.md). + or is using the read/write split feature of the [load balancer](../load-balancer/index.md). -If your PgDog deployment is unsharded and isn't using the [load balancer](load-balancer/index.md) for read/write separation, this feature is **disabled** by default. To enable it, turn on the query parser with the following setting: +If your PgDog deployment is unsharded and isn't using the [load balancer](../load-balancer/index.md) for read/write separation, this feature is **disabled** by default. 
To enable it, turn on the query parser with the following setting: === "pgdog.toml" ```toml @@ -129,7 +131,7 @@ If your PgDog deployment is unsharded and isn't using the [load balancer](load-b queryParser: session_control_and_locks ``` -This will scan all incoming queries for `pg_advisory_*` functions and selectively enable the query parser to handle them correctly. +This will scan all `SELECT` queries for `pg_advisory_*()` functions and selectively enable the query parser to handle them correctly. ### Performance diff --git a/docs/features/index.md b/docs/features/index.md index c72b596a..7a9a8dba 100644 --- a/docs/features/index.md +++ b/docs/features/index.md @@ -14,16 +14,16 @@ All features are configurable to fit your environment and can be toggled on/off. |---------|-------------| | [Load balancer](load-balancer/index.md) | Evenly distribute read queries between replicas and send write queries to the primary, allowing applications to connect to a single endpoint. | | [Health checks](load-balancer/healthchecks.md) | Ensure databases are up and can serve queries. Offline databases are blocked from serving queries. | -| [Transaction mode](transaction-mode.md) | Multiplex few PostgreSQL server connections between thousands of clients. | +| [Transaction mode](connection-pooler/transaction-mode.md) | Multiplex few PostgreSQL server connections between thousands of clients. | | [Hot reload](../configuration/index.md) | Update configuration at runtime without restarting PgDog. | | [Sharding](sharding/index.md) | Query routing, data migration and schema management to scale PostgreSQL horizontally. | -| [Prepared statements](prepared-statements.md) | Support for Postgres named prepared statements in transaction mode. | +| [Prepared statements](connection-pooler/prepared-statements.md) | Support for Postgres named prepared statements in transaction mode. 
| | [Plugins](plugins/index.md) | Pluggable libraries to add functionality to PgDog at runtime, without recompiling code. | | [Authentication](authentication.md) | Support for various PostgreSQL user authentication mechanisms, like SCRAM. | -| [Session mode](session-mode.md) | Compatibility mode with direct PostgreSQL connections. | +| [Session mode](connection-pooler/session-mode.md) | Compatibility mode with direct PostgreSQL connections. | | [Metrics](metrics.md) | Real time reporting, including Prometheus/OpenMetrics and an admin database. | | [Mirroring](mirroring.md) | Copy queries from one database to another in the background. | -| [Pub/Sub](pub_sub.md) | Support for `LISTEN`/`NOTIFY` in transaction mode. | +| [Pub/Sub](connection-pooler/pub_sub.md) | Support for `LISTEN`/`NOTIFY` in transaction mode. | | [Encryption](tls.md) | TLS encryption for client and server connections. | #### Operating system support diff --git a/docs/features/load-balancer/healthchecks.md b/docs/features/load-balancer/healthchecks.md index 12f5a26d..37191812 100644 --- a/docs/features/load-balancer/healthchecks.md +++ b/docs/features/load-balancer/healthchecks.md @@ -9,7 +9,8 @@ All databases load balanced by PgDog are regularly checked with health checks. A If a replica database fails a health check, it's temporarily removed from the load balancer, preventing it from serving queries for a configurable period of time.
- Healthchecks + Healthchecks +

Health checks prevent broken databases from serving queries.

## How it works @@ -23,8 +24,7 @@ PgDog performs two kinds of health checks to ensure applications don't accidenta If a connection or database fails a health check, it is **temporarily removed** from the load balancer and cannot serve any more queries. This prevents applications from continuously hitting a broken database until it's restarted by an administrator. -!!! note "99.99% uptime" - This strategy is very effective at reducing error rates in busy applications. If you are operating a large number of databases, hardware failures are relatively common and an effective load balancer is required to maintain 99.99% database uptime. +This strategy is very effective at reducing error rates in busy applications. If you are operating a large number of databases, hardware failures are relatively common and an effective load balancer is required to maintain 99.99% database uptime. ### Connection health check @@ -85,6 +85,13 @@ When PgDog is first started, it's possible that the database or the network is n The **default** value for this setting is **5 seconds** (`5_000` milliseconds). +!!! note "Disabling health checks" + If you want to disable database health checks, you can set the `idle_healthcheck_delay` setting to a large number, e.g. 100 years, in milliseconds: + + ```toml + [general] + idle_healthcheck_delay = 3155760000000 + ``` ### Primary database exception @@ -108,10 +115,7 @@ The amount of time the database is banned from serving traffic is controlled wit The **default** value is **5 minutes** (`300_000` milliseconds). -!!! note - A database will not be placed back into the load balancer until it passes a health check again. - - Make sure that `idle_healthcheck_interval` is set to a lower value than `ban_timeout`, so health checks have time to run before you expect the database to resume serving traffic. +A database will not be placed back into the load balancer until it passes a health check again. 
Make sure that `idle_healthcheck_interval` is set to a lower value than `ban_timeout`, so health checks have time to run before you expect the database to resume serving traffic. ### False positives @@ -170,7 +174,10 @@ This is configurable on startup only and will spin up an HTTP server on `http:// This health check looks at all configured connection pools, and if **at least one** is online, responds with `HTTP/1.1 200 OK`. If _all_ connection pools are down because of failed health checks, PgDog will respond with `HTTP/1.1 502 Bad Gateway`. -!!! note "Handling a lot of requests" - The HTTP health check uses existing internal state to answer requests and doesn't send queries to the connection pools. This makes it very quick and inexpensive, which ensures that massively distributed load balancers (like the AWS NLB) don't cause an unexpected influx of requests to the database. +#### Handling a lot of requests + +The HTTP health check uses existing internal state to answer requests and doesn't send queries to the connection pools. This makes it very quick and inexpensive, which ensures that massively distributed load balancers (like the AWS NLB) don't cause an unexpected influx of requests to the database. + +#### HTTPS To make configuration easier, the health check endpoint doesn't support HTTPS, so make sure to configure your load balancer to use plain HTTP only. diff --git a/docs/features/load-balancer/index.md b/docs/features/load-balancer/index.md index 78c6f6c9..9aaedc7f 100644 --- a/docs/features/load-balancer/index.md +++ b/docs/features/load-balancer/index.md @@ -9,7 +9,7 @@ icon: material/lan # Load balancer overview -PgDog understands the PostgreSQL wire protocol and uses its SQL parser to understand queries. This allows it to split read queries from write queries and distribute traffic evenly between databases. +PgDog understands the PostgreSQL wire protocol and uses the native PostgreSQL parser to understand queries. 
This allows it to split read queries from write queries and distribute traffic evenly between databases. Applications can connect to a single PgDog [endpoint](#single-endpoint), without having to manually manage multiple connection pools. @@ -18,13 +18,15 @@ Applications can connect to a single PgDog [endpoint](#single-endpoint), without When a query is received by PgDog, it will inspect it using the native Postgres SQL parser. If the query is a `SELECT` and the [configuration](../../configuration/pgdog.toml/databases.md) contains both primary and replica databases, PgDog will send it to one of the replicas. For all other queries, PgDog will send them to the primary.
- Load balancer + Load balancer +

Load balancer topology

Applications don't have to manually route queries between databases or maintain several connection pools internally. -!!! note "SQL compatibility" - PgDog's query parser is powered by the `pg_query` library, which extracts the Postgres native SQL parser directly from its source code. This makes it **100% compatible** with the PostgreSQL query language and allows PgDog to understand all valid PostgreSQL queries. +### SQL compatibility + +PgDog's query parser is powered by the `pg_query` library, which extracts the Postgres native SQL parser directly from its source code. This makes it **100% compatible** with the PostgreSQL query language and allows PgDog to understand all valid PostgreSQL queries. ## Load distribution @@ -165,7 +167,7 @@ This behavior is configurable in [pgdog.toml](../../configuration/pgdog.toml/gen readWriteSplit: exclude_primary ``` -#### Failover for reads +### Failover for reads In case one of your replicas fails, you can configure the primary to serve read queries temporarily while you (or your cloud vendor) bring the replica back up. This is configurable, like so: @@ -179,6 +181,22 @@ In case one of your replicas fails, you can configure the primary to serve read readWriteSplit: include_primary_if_replica_banned ``` +### Replicas optional + +Migrating applications to use replicas can take some time, especially if some queries are replica lag-sensitive, e.g., a read query issued immediately after a write. To make it easier to migrate to PgDog, you can disable replicas for reads, while explicitly opting specific queries in via [manual routing](manual-routing.md): + +=== "pgdog.toml" + ```toml + [general] + read_write_split = "prefer_primary" + ``` +=== "Helm chart" + ```yaml + readWriteSplit: prefer_primary + ``` + +Enabling this will make PgDog send all queries to the primary unless specified otherwise with a [query comment](manual-routing.md#query-comments) or a [session parameter](manual-routing.md#parameters). 
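The routing rules described above (reads to replicas, everything else to the primary, with `prefer_primary` and manual hints as overrides) can be sketched roughly as follows. This is an illustrative sketch only, not PgDog's implementation: `choose_role`, its parameters, and the ordering of the checks are assumptions, and the `startswith("SELECT")` test is a naive stand-in for the real SQL parser.

```python
from typing import Optional


def choose_role(query: str, has_replicas: bool, prefer_primary: bool = False,
                hint: Optional[str] = None) -> str:
    """Return 'primary' or 'replica' for one statement (hypothetical helper)."""
    # Assumption: without configured replicas, everything goes to the primary.
    if not has_replicas:
        return "primary"
    # An explicit routing hint (query comment or session parameter) wins.
    if hint in ("primary", "replica"):
        return hint
    # With prefer_primary, un-hinted queries stay on the primary.
    if prefer_primary:
        return "primary"
    # Naive read detection; a real parser inspects the parsed statement instead.
    is_read = query.lstrip().upper().startswith("SELECT")
    return "replica" if is_read else "primary"
```

A prefix check like this misclassifies statements that a real parser would catch (for example, queries that begin with `WITH`), which is one reason to rely on the parser rather than string matching.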
+ ## Learn more {{ next_steps_links(next_steps) }} diff --git a/docs/features/load-balancer/manual-routing.md b/docs/features/load-balancer/manual-routing.md index b8781004..d9b78694 100644 --- a/docs/features/load-balancer/manual-routing.md +++ b/docs/features/load-balancer/manual-routing.md @@ -139,17 +139,17 @@ SET LOCAL "pgdog"."role" TO "primary"; In this example, all transaction statements (including the `BEGIN` statement) will be sent to the primary database. Whether the transaction is committed or reverted, the value of `pgdog.role` will be reset to its previous value. -!!! note "Statement ordering" - To make sure PgDog intercepts the routing hint early enough in the transaction flow, you need to send all hints _before_ executing actual queries. +#### Statement ordering +To make sure PgDog intercepts the routing hint early enough in the transaction flow, you need to send all hints _before_ executing DDL/DML statements. - The following flow, for example, _will not_ work: +The following flow, for example, _will not_ work: - ```postgresql - BEGIN; - SELECT * FROM users WHERE id = $1; - SET LOCAL pgdog.role TO "primary"; -- The client is already connected to a server. - INSERT INTO users (id) VALUES ($1); -- If connected to a replica, this will fail. - ``` +```postgresql +BEGIN; +SELECT * FROM users WHERE id = $1; +SET LOCAL pgdog.role TO "primary"; -- The client is already connected to a server. +INSERT INTO users (id) VALUES ($1); -- If connected to a replica, this will fail. +``` @@ -171,6 +171,24 @@ If you've configured the desired database role (and/or shard) for each of your a Once it's disabled, PgDog will rely solely on the `pgdog.role` and `pgdog.shard` parameters to make its routing decisions. -### Session state & `SET` +### Session state and SET + +The query parser is used to intercept and interpret `SET` commands. 
If the parser is disabled and your application uses `SET` commands to configure the connection, PgDog will not be able to guarantee that all connections have the correct session settings in [transaction mode](../connection-pooler/transaction-mode.md). -The query parser is used to intercept and interpret `SET` commands. If the parser is disabled and your application uses `SET` commands to configure the connection, PgDog will not be able to guarantee that all connections have the correct session settings in [transaction mode](../transaction-mode.md). +You can keep the parser enabled for handling `SET` commands only as follows: + +=== "pgdog.toml" + ```toml + [general] + query_parser = "session_control" + ``` +=== "Helm chart" + ```yaml + queryParser: session_control + ``` + +Internally, PgDog uses a very fast regular expression to detect `SET` commands and turns on the query parser for that statement only. The regex supports comments as well, so the following example will be detected: + +```postgresql +/* api: users.create */ SET application_name TO 'sidekiq'; +``` diff --git a/docs/features/load-balancer/replication-failover.md b/docs/features/load-balancer/replication-failover.md index 3f3afd0d..7c86f337 100644 --- a/docs/features/load-balancer/replication-failover.md +++ b/docs/features/load-balancer/replication-failover.md @@ -4,7 +4,7 @@ icon: material/chart-timeline-variant # Replication and failover -PgDog has built-in functionality for monitoring the state of Postgres replica databases. If configured, it can also automatically detect when a replica is promoted and redirect write queries to the new primary, and ban replicas from serving traffic if they have fallen far behind in the replication stream. +PgDog has built-in functionality for monitoring the state of Postgres replica databases.
If configured, it can also automatically detect when a replica is promoted and redirect write queries to the new primary, or block replicas from serving traffic if they have fallen far behind in the replication stream. ## Replication @@ -25,7 +25,7 @@ In addition to fetching raw metrics, PgDog can calculate the replication lag (al | Primary LSN | Get the LSN from the primary using `pg_current_wal_lsn()`. | | Replica LSN | Get the LSN from each replica using `pg_last_wal_replay_lsn()` or `pg_last_wal_receive_lsn()`. | | LSN check | If the two LSNs are identical, replication lag is 0. | -| Calculate lag | If the two LSNs are different, replication lag is `now() - pg_last_xact_replay_timestamp()`. | +| Calculate lag | If the two LSNs are different, replication lag is `now() - pg_last_xact_replay_timestamp()` retrieved from the replica. | This formula assumes that when the replica's LSN is behind the primary, the primary is still receiving write requests. While this is not always the case, it will show replication lag growing over time if the replication stream is falling behind or is broken. @@ -33,7 +33,7 @@ This formula assumes that when the replica's LSN is behind the primary, the prim It is possible to calculate the exact replication delay in bytes by subtracting a replica LSN from the primary LSN. While this provides an exact measurement, that metric isn't very useful: it's hard to translate bytes into a measurement of how stale the data on the replica truly is. - Approximating the lag in milliseconds is more informative and will be reasonably accurate the majority of the time. + Approximating the lag in milliseconds is more informative and will be reasonably accurate most of the time. 
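The lag formula in the table above, along with the byte-based alternative mentioned in the note, can be sketched as follows. This is a minimal illustration, not PgDog's code: the function names are hypothetical, and the inputs are assumed to be values already fetched from `pg_current_wal_lsn()`, `pg_last_wal_replay_lsn()` and `pg_last_xact_replay_timestamp()`.

```python
from datetime import datetime, timedelta, timezone


def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' into a 64-bit WAL position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_in_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Exact replication delay in bytes: primary LSN minus replica LSN."""
    return parse_lsn(primary_lsn) - parse_lsn(replica_lsn)


def replication_lag(primary_lsn: str, replica_lsn: str,
                    last_replay_at: datetime, now: datetime) -> timedelta:
    """Approximate lag: zero when the LSNs match, otherwise the time elapsed
    since the replica last replayed a transaction."""
    if parse_lsn(primary_lsn) == parse_lsn(replica_lsn):
        return timedelta(0)
    return now - last_replay_at


now = datetime(2025, 1, 1, 12, 0, 5, tzinfo=timezone.utc)
replayed = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(replication_lag("0/3000060", "0/3000000", replayed, now))
```

The sketch ignores the case where the replica has not replayed any transaction yet and the timestamp is `NULL`. A real implementation would need to handle that explicitly.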
### Configuration @@ -66,10 +66,15 @@ Decreasing the value of `lsn_check_interval` will produce more accurate statisti It's common for PgDog deployments to be serving upwards of 30,000-50,000 queries per second per pooler process, so you can run the LSN check query quite frequently without noticeable impact on system latency. +#### Saturated connection pool + +If the connection pool is at capacity, either due to a database incident or an inefficient query, PgDog will create a standalone connection to fetch the LSN from each database in the configuration. This is to make sure that it can't miss +a [failover](#failover) event during a database incident. + ### Replica lag ban -!!! note "Experimental feature" - This feature is new and experimental. Please report any issues you encounter. +!!! note "New feature" + This feature is new and experimental. Please report any issues you may encounter. If a replica has fallen far behind the primary, it may start serving stale data to the application. This can cause hard to debug issues, so it's often best to remove this replica from the load balancer until it's able to catch up. @@ -98,14 +103,15 @@ Unlike [health check-triggered](healthchecks.md) bans, replica lag ban is not cl ## Failover -
- Failover -
- If the `pg_is_in_recovery()` function returns `true`, the database is configured as a standby. It can only serve read queries (e.g. `SELECT`) and is expected to be reasonably up-to-date with the primary database. Replica databases can be promoted to serve write queries. If that happens, `pg_is_in_recovery()` will start returning `false`. You can read more about this in the [PostgreSQL documentation](https://www.postgresql.org/docs/18/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL). +
+ Failover +

Failover event

+
+ !!! warning "Failover trigger" PgDog does not detect primary failure and **will not** call `pg_promote()`. It is expected that the databases are managed externally by another tool, like Patroni or AWS RDS, which handle replica promotion. @@ -140,6 +146,10 @@ Failover is disabled by default. To enable it, change all configured databases i On startup, PgDog will connect to each database, find out if they are in recovery, and automatically reload its configuration with the determined roles. +!!! info "Replication lag monitoring" + In order for PgDog to detect failover events, it needs to query the database for its [replication status](#configuration). Make sure to set + `lsn_check_delay` to a reasonable value (e.g., `0`) before enabling this feature. + ### Split brain If a replica is promoted while the existing primary is alive and serving queries, write queries can be routed to either database, causing data loss. This type of error is called "split brain", indicating that the database cluster no longer has an authoritative source of data it's managing. diff --git a/docs/features/load-balancer/transactions.md b/docs/features/load-balancer/transactions.md index 5d3a7ad7..042aef94 100644 --- a/docs/features/load-balancer/transactions.md +++ b/docs/features/load-balancer/transactions.md @@ -4,7 +4,7 @@ icon: material/swap-horizontal # Transactions -PgDog's load balancer is [transaction-aware](../transaction-mode.md) and will ensure that all statements inside a transaction are sent to the same PostgreSQL connection on just one database. +PgDog's load balancer is [transaction-aware](../connection-pooler/transaction-mode.md) and will ensure that all statements inside a transaction are sent to the same PostgreSQL connection on just one database. To make sure all queries inside a transaction succeed, PgDog will route all manually started transactions to the primary database. 
diff --git a/docs/features/sharding/manual-routing.md b/docs/features/sharding/manual-routing.md index d675f787..9950b436 100644 --- a/docs/features/sharding/manual-routing.md +++ b/docs/features/sharding/manual-routing.md @@ -51,7 +51,7 @@ The comment can appear anywhere in the query, as long as it's syntactically vali Since parsing comments is not free, this method is best used for infrequent commands, like schema migrations or queries executed manually by an administrator. For faster query routing, consider supplying the sharding key [directly](query-routing.md) in the query. -Additionally, using query comments with a high-cardinality value, like the `pgdog_sharding_key`, may substantially increase the size of the [prepared statements](../prepared-statements.md) cache. To avoid this, consider the [`SET`](#set) command instead. +Additionally, using query comments with a high-cardinality value, like the `pgdog_sharding_key`, may substantially increase the size of the [prepared statements](../connection-pooler/prepared-statements.md) cache. To avoid this, consider the [`SET`](#set) command instead. ## SET diff --git a/docs/features/sharding/query-routing.md b/docs/features/sharding/query-routing.md index f32b0958..2bb0dad6 100644 --- a/docs/features/sharding/query-routing.md +++ b/docs/features/sharding/query-routing.md @@ -29,7 +29,7 @@ WHERE payments.user_id = $1; -- Sharding key. ``` -Both regular queries and [prepared statements](../prepared-statements.md) are supported. So if your database driver is using placeholders instead of actual values, PgDog will extract the sharding key value from the extended protocol messages. +Both regular queries and [prepared statements](../connection-pooler/prepared-statements.md) are supported. So if your database driver is using placeholders instead of actual values, PgDog will extract the sharding key value from the extended protocol messages. 
### Supported syntax @@ -60,7 +60,7 @@ INSERT INTO payments (user_id, amount) VALUES ($1, $2) RETURNING * If the query is inserting a row into a [sharded table](../../configuration/pgdog.toml/sharded_tables.md), the query router will extract the sharding key, and route the query to the corresponding shard. -Just like for `SELECT` queries, both [prepared statements](../prepared-statements.md) and regular queries are supported. +Just like for `SELECT` queries, both [prepared statements](../connection-pooler/prepared-statements.md) and regular queries are supported. ### Supported syntax diff --git a/docs/features/tls.md b/docs/features/tls.md index 871b3152..1d6ec110 100644 --- a/docs/features/tls.md +++ b/docs/features/tls.md @@ -35,6 +35,33 @@ To enable encryption on the client, set the `sslmode` connection parameter. If y postgres://user:password@host:port/database?sslmode=prefer ``` +#### Rejecting unencrypted connections + +PgDog can reject connections from clients that choose not to use TLS encryption: + +=== "pgdog.toml" + ```toml + [general] + tls_client_required = true + ``` +=== "Helm chart" + ```yaml + tlsClientRequired: true + ``` + +This is helpful to enforce a security protocol but, in some rare scenarios, could limit which clients are allowed to connect. Most Postgres client drivers ship with TLS support bundled in so, in practice, enabling this feature is not going to be a problem. + +#### Self-signed certificate + +If you're deploying PgDog using our [Helm chart](../installation.md#kubernetes), you can configure it to generate a self-signed TLS certificate at deploy time: + +=== "Helm chart" + ```yaml + tlsGenerateSelfSignedCert: true + ``` + +This is useful for quickly deploying TLS in development or staging. For production deployments, you may want to load your own certificate that your clients can validate instead.
+ ### Connection modes PostgreSQL supports 4 modes for establishing encrypted connections, documented below: @@ -83,6 +110,64 @@ If you use `verify_ca` or `verify_full` and your certificate is not signed by a tlsServerCaCertificate: /path/to/ca/certificate.pem ``` +### Deploying on AWS RDS + +PgDog is commonly deployed in front of AWS RDS or Aurora. To make it easier to set up secure TLS, we are bundling the [RDS certificate bundle](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html#UsingWithRDS.SSL.CertificatesDownload) into the Helm chart and making it available to PgDog at runtime: + +=== "Helm chart" + ```yaml + rdsCertificateBundle: + enabled: true + ``` +=== "AWS GovCloud" + If deploying into the AWS GovCloud (US), you can change the bundle accordingly: + + ```yaml + rdsCertificateBundle: + type: govcloud + ``` + + +Once the bundle is loaded, you can switch to `verify_ca` (or `verify_full`) for server connections, which ensures that connections from PgDog to RDS are always encrypted _and_ authenticated: + +=== "pgdog.toml" + ```toml + [general] + tls_verify = "verify_full" + ``` +=== "Helm chart" + ```yaml + tlsVerify: verify_full + ``` + + +## Mutual TLS + +!!! note "New" + This is a new feature. Please report any issues you may run into. + +Mutual TLS (also known as **mTLS**) allows PgDog to authenticate connections received from the client using a mutually agreed upon certificate. If the client doesn't provide the right certificate (or doesn't have one), PgDog will reject the connection. This can be enabled by setting the client CA certificate in [`pgdog.toml`](../configuration/pgdog.toml/general.md): + +=== "pgdog.toml" + ```toml + [general] + tls_client_ca_certificate = "/path/to/client/ca.pem" + ``` +=== "Helm chart" + ```yaml + tlsClientCaCertificate: /path/to/client/ca.pem + ``` + +The certificate provided by the client doesn't have to be self-signed.
In fact, any certificate signed by any of the certs in the chain loaded via `tls_client_ca_certificate` is accepted. This allows an internal CA (Certificate Authority) to issue unique certificates to each application, while also making them short-lived (e.g., 30 days expiration) to satisfy security or compliance requirements. + +## TLS in practice + +PgDog terminates TLS from clients and opens a separate connection to Postgres. Traffic can be encrypted on both network hops, but it is not end-to-end encryption in the cryptographic sense: PgDog decrypts client traffic so it can read PostgreSQL protocol messages, route queries, and manage connections. + +For the client side, configure applications to use TLS when connecting to PgDog. Use `sslmode=verify-full` when clients should validate PgDog's certificate and hostname. If you want PgDog to authenticate clients during the TLS handshake, set [`tls_client_ca_certificate`](#mutual-tls). + +For the server side, set the `tls_verify` setting to `"verify_full"` so PgDog can validate the Postgres server certificate and hostname on each connection. It does not currently present a client TLS certificate to Postgres, so mutual TLS between PgDog and PostgreSQL is not currently supported. + ## Performance Encryption has some performance impact on latency and CPU utilization. While most TLS encryption algorithms are now implemented in hardware and are quite quick, you will still notice some impact on your query turnaround times when using TLS.
diff --git a/docs/images/failover.png b/docs/images/failover.png index 6ab253cf..1cdc96a9 100644 Binary files a/docs/images/failover.png and b/docs/images/failover.png differ diff --git a/docs/images/healthchecks.png b/docs/images/healthchecks.png index c0f74f4a..cb20ebd9 100644 Binary files a/docs/images/healthchecks.png and b/docs/images/healthchecks.png differ diff --git a/docs/images/prepared-statements-1.png b/docs/images/prepared-statements-1.png index 31911987..b1af8245 100644 Binary files a/docs/images/prepared-statements-1.png and b/docs/images/prepared-statements-1.png differ diff --git a/docs/images/pub_sub.png b/docs/images/pub_sub.png index e94e633d..fbcae3b6 100644 Binary files a/docs/images/pub_sub.png and b/docs/images/pub_sub.png differ diff --git a/docs/images/replicas.png b/docs/images/replicas.png index b7e289d1..6a4d42e6 100644 Binary files a/docs/images/replicas.png and b/docs/images/replicas.png differ diff --git a/docs/images/transaction-mode.png b/docs/images/transaction-mode.png index 7021da58..79ac37f5 100644 Binary files a/docs/images/transaction-mode.png and b/docs/images/transaction-mode.png differ diff --git a/docs/index.md b/docs/index.md index cb5ffa61..263fba30 100644 --- a/docs/index.md +++ b/docs/index.md @@ -32,7 +32,7 @@ Unlike those proxies, PgDog handles features that usually force a pooler to pin PgDog is also multithreaded, so a single instance can serve many more clients while still relying on the same small number of Postgres connections. -You can read more about how the connection pooler works [here](features/transaction-mode.md). +You can read more about how the connection pooler works [here](features/connection-pooler/transaction-mode.md). 
## Load balancer @@ -62,8 +62,8 @@ This documentation provides a detailed overview of all PgDog features, along wit ## Read more {{ next_steps_links([ - ("Features", "/features/", "Read more about PgDog features like load balancing, supported authentication mechanisms, TLS, health checks, and more."), - ("Administration", "/administration/", "Learn how to operate PgDog in production, like fetching real-time statistics from the admin database or updating configuration."), - ("Installation", "/installation/", "Install PgDog on your Linux server or on your Linux/Mac/Windows machine for local development."), - ("Configuration", "/configuration/", "Reference for PgDog configuration like maximum server connections, number of shards, and more."), + ("Installation", "/installation/", "Deploy PgDog with Helm on Kubernetes, run it on AWS ECS with Terraform, with Docker, with pre-built binaries, or by building from source."), + ("Connection pooler", "/features/connection-pooler/", "Multiplex thousands of application connections over a small number of PostgreSQL server connections."), + ("Load balancer", "/features/load-balancer/", "Distribute read queries across replicas and send write queries to the primary database."), + ("Sharding", "/features/sharding/", "Scale PostgreSQL horizontally with query routing, data migration, and schema management."), ]) }} diff --git a/docs/migrating-to-pgdog/from-pgbouncer.md b/docs/migrating-to-pgdog/from-pgbouncer.md index 647e5f2b..2dc756a6 100644 --- a/docs/migrating-to-pgdog/from-pgbouncer.md +++ b/docs/migrating-to-pgdog/from-pgbouncer.md @@ -162,9 +162,9 @@ Settings that control general connection pooler operations. | [`max_client_conn`](https://www.pgbouncer.org/config.html#max_client_conn) | N/A | PgDog doesn't place an upper bound on the number of client connections. | | [`max_db_connections`](https://www.pgbouncer.org/config.html#max_db_connections) | N/A | PgDog doesn't have a global database connection limit. 
Individual pools configure their own limits. | | [`server_round_robin`](https://www.pgbouncer.org/config.html#server_round_robin) | N/A | PgDog has its own [load balancing](../features/load-balancer/index.md) algorithms that are configured separately. | -| [`track_extra_parameters`](https://www.pgbouncer.org/config.html#track_extra_parameters) | N/A | PgDog tracks [all parameters](../features/transaction-mode.md#session-state) by default, including those that PgBouncer doesn't. | +| [`track_extra_parameters`](https://www.pgbouncer.org/config.html#track_extra_parameters) | N/A | PgDog tracks [all parameters](../features/connection-pooler/transaction-mode.md#session-state) by default, including those that PgBouncer doesn't. | | [`stats_period`](https://www.pgbouncer.org/config.html#stats_period) | [`stats_period`](../configuration/pgdog.toml/general.md#stats_period) | - | -| [`max_prepared_statements`](https://www.pgbouncer.org/config.html#max_prepared_statements) | [`prepared_statements_limit`](../configuration/pgdog.toml/general.md#prepared_statements_limit) | PgDog's [prepared statements](../features/prepared-statements.md) limit is soft and is only enforced on server connections. | +| [`max_prepared_statements`](https://www.pgbouncer.org/config.html#max_prepared_statements) | [`prepared_statements_limit`](../configuration/pgdog.toml/general.md#prepared_statements_limit) | PgDog's [prepared statements](../features/connection-pooler/prepared-statements.md) limit is soft and is only enforced on server connections. | | [`unix_socket_dir`](https://www.pgbouncer.org/config.html#unix_socket_dir) | N/A | PgDog doesn't support UNIX sockets. | | [`unix_socket_mode`](https://www.pgbouncer.org/config.html#unix_socket_mode) | N/A | Same as above. | | [`unix_socket_group`](https://www.pgbouncer.org/config.html#unix_socket_group) | N/A | Same as above. | @@ -227,7 +227,7 @@ Various connection-related and DNS-related settings. 
 | PgBouncer | PgDog | Notes |
 |-|-|-|
-| [`server_reset_query`](https://www.pgbouncer.org/config.html#server_reset_query) | N/A | Server state is [managed](../features/transaction-mode.md#session-state) by PgDog and different reset queries are used, depending on circumstances. |
+| [`server_reset_query`](https://www.pgbouncer.org/config.html#server_reset_query) | N/A | Server state is [managed](../features/connection-pooler/transaction-mode.md#session-state) by PgDog and different reset queries are used, depending on circumstances. |
 | [`server_reset_query_always`](https://www.pgbouncer.org/config.html#server_reset_query_always) | N/A | Same as above. |
 | [`server_check_query`](https://www.pgbouncer.org/config.html#server_check_query) | N/A | Not currently configurable. PgDog runs an empty query (`;`) by default. |
 | [`server_check_delay`](https://www.pgbouncer.org/config.html#server_check_delay) | N/A | Not currently supported. |
diff --git a/docs/roadmap.md b/docs/roadmap.md
index 9923d704..32af79fb 100644
--- a/docs/roadmap.md
+++ b/docs/roadmap.md
@@ -24,11 +24,11 @@ These features are required for PgDog to act as a replacement for PgBouncer and/
 | Feature | Status | Notes |
 |---------|--------|-------|
-| [Transactional pooler](features/transaction-mode.md) | :material-check-circle-outline: | |
-| [Session mode](features/session-mode.md) | :material-check-circle-outline: | No sharding or load balancing. |
+| [Transactional pooler](features/connection-pooler/transaction-mode.md) | :material-check-circle-outline: | |
+| [Session mode](features/connection-pooler/session-mode.md) | :material-check-circle-outline: | No sharding or load balancing. |
 | [Load balancer](features/load-balancer/index.md) | :material-check-circle-outline: | |
 | [Health checks & failover](features/load-balancer/healthchecks.md) | :material-check-circle-outline: | |
-| [Prepared statements](features/prepared-statements.md) | :material-check-circle-outline: | |
+| [Prepared statements](features/connection-pooler/prepared-statements.md) | :material-check-circle-outline: | |
 | [Metrics](features/metrics.md) | :material-check-circle-outline: | Admin database views contain more columns than PgBouncer. |
 | [Encryption](features/tls.md) | :material-check-circle-outline: | |
 | [Authentication](features/authentication.md) | :material-wrench: | Password authentication only. `scram-sha-256`, `md5` are supported. |
diff --git a/docs/user_guides/connection_pool.md b/docs/user_guides/connection_pool.md
index 42985985..dcc6d8fc 100644
--- a/docs/user_guides/connection_pool.md
+++ b/docs/user_guides/connection_pool.md
@@ -4,7 +4,7 @@ icon: material/pipe
 # Connection pools configuration
-When deploying a connection pooler for the first time, it can be challenging to know how many connections to provision for each [connection pool](../configuration/pgdog.toml/general.md#default_pool_size). The goal of using a connection pooler, like PgDog, is to reduce the total number of database connections by taking advantage of [transaction mode](../features/transaction-mode.md).
+When deploying a connection pooler for the first time, it can be challenging to know how many connections to provision for each [connection pool](../configuration/pgdog.toml/general.md#default_pool_size). The goal of using a connection pooler, like PgDog, is to reduce the total number of database connections by taking advantage of [transaction mode](../features/connection-pooler/transaction-mode.md).
 Sizing up the right value for that setting depends on your database, but follows a few general principles.
diff --git a/mkdocs.yml b/mkdocs.yml
index 23da805a..1001c714 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -96,6 +96,10 @@ plugins:
 'features/sharding/schema_management/primary_keys.md': 'features/sharding/sequences.md'
 'features/sharding/resharding/hash.md': 'features/sharding/resharding/move.md'
 'features/sharding/cross-shard/index.md': 'features/sharding/cross-shard-queries/index.md'
+'features/transaction-mode.md': 'features/connection-pooler/transaction-mode.md'
+'features/session-mode.md': 'features/connection-pooler/session-mode.md'
+'features/prepared-statements.md': 'features/connection-pooler/prepared-statements.md'
+'features/pub_sub.md': 'features/connection-pooler/pub_sub.md'
 extra:
 social: