diff --git a/docs/features/sharding/.pages b/docs/features/sharding/.pages
index 6c05145f..7da81ec3 100644
--- a/docs/features/sharding/.pages
+++ b/docs/features/sharding/.pages
@@ -1,11 +1,11 @@
nav:
- 'index.md'
- 'basics.md'
- - 'supported-queries.md'
+ - 'sharding-functions.md'
- 'query-routing.md'
- - 'manual-routing.md'
- 'cross-shard-queries'
- - 'sharding-functions.md'
+ - 'supported-queries.md'
+ - 'manual-routing.md'
- '...'
- 'resharding'
- 'internals'
diff --git a/docs/features/sharding/basics.md b/docs/features/sharding/basics.md
index 866e1ee0..fe89e4c9 100644
--- a/docs/features/sharding/basics.md
+++ b/docs/features/sharding/basics.md
@@ -7,36 +7,36 @@ Sharding a PostgreSQL database splits it, with all its tables and indices, betwe
## Terminology
-Some terms and expressions used in the documentation may be new to some readers. They are defined below.
+Some terms and expressions used in the documentation may be new to some readers. They are defined below:
| Term | Definition |
|------|------------|
-| Shard | PostgreSQL database(s) that contains a portion of the entire dataset. |
+| Shard | PostgreSQL database(s) that contains a portion of the entire dataset. In most deployments, that's an independent PostgreSQL server. |
| Sharding function | A function that takes some data and computes a shard number for where this data should be placed. |
| Shard number | A number between 0 and _N_ where _N_ is the total number of shards in the cluster. |
| Primary | A database that serves write queries like `INSERT`, `UPDATE`, `DELETE`, etc. |
| Replica | A database that has the same data as a primary and can only serve `SELECT` queries. |
-### Shards
+### Database shards
Each shard is responsible for a subset of the total data in the database cluster. Each shard can have multiple replica databases and only one primary. The primary is responsible for serving writes, like creating and updating rows, while replica databases are responsible for serving read queries.
-When we refer to "shards" in this documentation, we mean primary and replica databases collectively responsible for a subset of the database data.
+When we refer to "shards" in this documentation, we mean primary and replica databases collectively responsible for a subset of the database data. PgDog can simultaneously manage sharding and load balancing queries, so a database shard can actually encompass a primary and replicas without leaking that abstraction to the application.
### Sharding function
-A [sharding function](sharding-functions.md) is responsible for splitting data between shards. It's typically based on some kind of hash, divided by the total number of shards in the cluster, to obtain the shard number. The hash ensures that data is distributed uniformly between shards, no matter what that data is.
+A [sharding function](sharding-functions.md) is responsible for splitting data between shards. It's typically based on some kind of hash, divided by the total number of shards in the cluster, to obtain the shard number. The hash ensures that data is distributed uniformly between shards, no matter what that data is, for example:
```
shard_number = hash(data) % num_shards
```
-Alternatively, a sharding function can map the sharding keys directly to a shard number. This is done when the number of keys is relatively small and they represent large collections of evenly distributed data. Examples include country codes for geographic data or tenant IDs for multitenant applications.
+PgDog supports many different sharding functions, including a list-based and value-based function which can map the sharding keys directly to a shard number. This is done in situations where the number of keys is relatively small and they represent large collections of evenly distributed data, e.g., country codes for geographic data or tenant IDs for multitenant applications.
## Read more
{{ next_steps_links([
- ("Sharding functions", "sharding-functions.md", "Control how rows are distributed across shards."),
- ("Direct-to-shard queries", "query-routing.md", "Route queries directly to the correct shard."),
+ ("Sharding functions", "sharding-functions.md", "Control how rows or tables are distributed across shards."),
+ ("Direct-to-shard queries", "query-routing.md", "Route queries directly to the right shard without application changes."),
]) }}
diff --git a/docs/features/sharding/cross-shard-queries/index.md b/docs/features/sharding/cross-shard-queries/index.md
index e3fa5b4b..2e409c40 100644
--- a/docs/features/sharding/cross-shard-queries/index.md
+++ b/docs/features/sharding/cross-shard-queries/index.md
@@ -2,44 +2,48 @@
icon: material/multicast
---
-# Cross-shard queries overview
+# Cross-shard queries
-If a client can't or doesn't specify a sharding key in the query, PgDog will send that query to all shards in parallel, and combine the results automatically. To the client, this looks like the query was executed by a single database.
+If a client can't specify a sharding key, or doesn't specify one in the query, PgDog will send that query to all shards concurrently and combine the results automatically. To the client, this looks like the query was executed by a single database.
-
-
+
+
+ Cross-shard queries are sent to all shards concurrently.
-## How it works
+While this sounds simple on the surface, the actual implementation is anything but. It's described below, along with edge cases that are not yet supported.
-PgDog understands the Postgres protocol and query language. It can connect to multiple database servers, send the query to all of them, and collect [`DataRow`](#under-the-hood) messages as they are returned by each connection.
+## Cross-shard basics
-Once all servers finish executing the request, PgDog processes the result, performs any requested sorting, aggregation or row disambiguation, and sends the complete result back to the client, as if all rows came from one database server.
+PgDog understands the Postgres protocol and SQL query language. It can connect to multiple database servers, send the query to all of them, and collect [rows](#under-the-hood) as they are returned by each connection.
-Just like with [direct-to-shard](../query-routing.md) queries, each SQL command is handled differently, as documented below:
+Once all servers finish executing the request, PgDog processes the result, performs any requested sorting, aggregation, or row disambiguation, and sends the complete result back to the client, as if all rows came from one database server.
-- [`SELECT`](select.md)
-- [`INSERT`](insert.md)
-- [`UPDATE`, `DELETE`](update.md)
-- [`CREATE`, `ALTER`, `DROP`](ddl.md) (and other DDL statements)
-- [`COPY`](copy.md)
+Just like with [direct-to-shard](../query-routing.md) queries, each SQL command is handled differently, as documented below:
+| Commands | Summary |
+|-|-|
+| [SELECT](select.md) | PgDog implements a scatter/gather query engine to fetch rows from multiple shards concurrently. |
+| [INSERT](insert.md) | Statements targeting [omnisharded](omnishards.md) tables are sent to all shards concurrently. Sharded tables with automatic [primary key](../unique-ids.md) generation are sent to one shard only. |
+| [UPDATE and DELETE](update.md) | Statements are sent to all shards concurrently. Sharding key updates are partially supported. |
+| [DDL statements, e.g., CREATE, ALTER, DROP](ddl.md) | DDL is sent to all shards concurrently, to make sure the schema is identical on all shards. |
+| [COPY command](copy.md) | Rows sent via COPY are automatically distributed between all shards using the configured [sharding function](../sharding-functions.md). |
-## Under the hood
+### Under the hood
-PgDog implements the PostgreSQL wire protocol, which is well documented and stable. The messages sent by Postgres clients and servers contain all the necessary information about data types, column names and executed statements, which PgDog can use to present multi-database results as a single stream of data.
+PgDog implements the PostgreSQL wire protocol, which is well documented and stable. The messages sent by Postgres clients and servers contain all the necessary information about data types, column names, and executed statements, which PgDog can use to present multi-database results as a single stream of data.
The following protocol messages are especially relevant:
| Message | Description |
|-|-|
-| `DataRow` | Each `DataRow` message contains one tuple, for each row returned by the query. |
+| `DataRow` | Each `DataRow` message contains one tuple for each row returned by the query. |
| `RowDescription` | This message has the column names and data types returned by the query. |
| `CommandComplete` | Indicates that the query has finished returning results. PgDog uses it to start sorting and aggregation. |
The protocol has two formats for encoding tuples: text and binary. Text format is equivalent to calling the `to_string()` method on native types, while binary encoding sends them in network-byte order. For example:
-=== "Data"
+=== "Query"
```postgresql
SELECT 1::bigint, 2::integer, 'three'::VARCHAR;
```
@@ -50,7 +54,7 @@ The protocol has two formats for encoding tuples: text and binary. Text format i
| `INTEGER` | `"2"` | `00 00 00 02` |
| `VARCHAR` | `"three"` | `three` |
-Since PgDog needs to process rows before sending them to the client, we implemented parsing both formats for [most data types](select.md#supported-data-types).
+Since PgDog needs to process rows before sending them to the client, it implements parsing for both formats for most [data types](select.md#supported-data-types).
## Disabling cross-shard queries
@@ -67,15 +71,13 @@ If you don't want PgDog to route cross-shard queries, e.g., because you have a [
crossShardDisabled: true
```
-When this setting is enabled and a query doesn't have a sharding key, instead of executing the query, PgDog will return an error and abort the transaction.
+When this setting is enabled and a query doesn't have a sharding key, PgDog will return an error and abort the transaction instead of executing the query.
## Read more
{{ next_steps_links([
- ("Sharding functions", "../sharding-functions.md", "Control how rows are distributed across shards."),
- ("Cross-shard SELECT", "select.md", "Query data across all shards with automatic merging."),
- ("Cross-shard INSERT", "insert.md", "Insert rows that get routed to the correct shard."),
- ("Cross-shard UPDATE and DELETE", "update.md", "Modify or remove rows in tables spanning multiple shards."),
- ("DDL, e.g. CREATE TABLE", "ddl.md", "Run schema changes across all shards at once."),
- ("COPY command", "copy.md", "Bulk load data across shards with the COPY protocol."),
+ ("SELECT", "select.md", "Scatter/gather queries, sorting, aggregation, and supported SQL features."),
+ ("INSERT", "insert.md", "Omnisharded inserts, missing sharding keys, multiple tuples, and unique IDs."),
+ ("UPDATE and DELETE", "update.md", "Cross-shard writes, consistency, and sharding key updates."),
+ ("DDL", "ddl.md", "Cluster-wide schema changes, atomicity, and idempotent migrations."),
]) }}
diff --git a/docs/features/sharding/cross-shard-queries/insert.md b/docs/features/sharding/cross-shard-queries/insert.md
index a93fc793..656d21fa 100644
--- a/docs/features/sharding/cross-shard-queries/insert.md
+++ b/docs/features/sharding/cross-shard-queries/insert.md
@@ -4,63 +4,69 @@ icon: material/table-plus
# Cross-shard INSERT
-If an `INSERT` statement specifies only one sharding key, it's sent [directly](../query-routing.md#insert) to one of the shards. Otherwise, it becomes a cross-shard `INSERT` statement.
+An INSERT statement that doesn't specify a sharding key, has multiple sharding keys, or targets an [omnisharded](../omnishards.md) table is a cross-shard statement and requires special handling, as described below.
## How it works
-Cross-shard `INSERT` statements fall into one of three categories, each one handled differently:
+Cross-shard INSERT statements fall into one of three categories, each one being handled differently:
-1. `INSERT` targeting [omnisharded tables](#omnisharded-tables)
-2. `INSERT` targeting a [sharded table](#sharded-tables), with no sharding key specified
-3. `INSERT` with [multiple tuples](#multiple-tuples), each destined for a different shard
+| Statement | Handling |
+|-|-|
+| INSERT into an [omnisharded](#omnisharded-tables) table | Sent concurrently to all shards. |
+| INSERT a single tuple into a [sharded](#sharded-tables) table | Sent to one shard only, depending on the [primary key](#primary-keys) generation strategy. |
+| INSERT [multiple tuples](#multiple-tuples) into a [sharded](#sharded-tables) table | Rewritten into separate statements and sent separately to the matching shards. |
-By design, applications using PgDog don't need to concern themselves with this and can use the database normally. However, there are some trade-offs when using cross-shard queries, documented below.
+Applications using PgDog don't need to concern themselves with implementation details and can use the database normally. However, there are some trade-offs when using cross-shard queries which are documented below.
## Omnisharded tables
For queries that target [omnisharded](../omnishards.md) tables, the statement is sent to all shards concurrently. This ensures that the data is identical on all shards, for example:
```postgresql
-INSERT INTO request_logs
- (client_ip, request_path, response_code, created_at)
-VALUES
- ($1, $2, $3, $4)
+INSERT INTO cities (id, city_name, country_code, created_at)
+VALUES ($1, $2, $3, now())
```
-An identical row will be created on each shard. [Direct-to-shard](../query-routing.md#select) queries can then either fetch them directly or join with other sharded or omnisharded tables.
+This is a common pattern for tables that don't have a sharding key, or tables that contain small amounts of data. The same row will be created on all shards and queries can then either fetch it directly or join it with other sharded and omnisharded tables.
+
+### Omnisharded consistency
-### Consistency
+Unless [two-phase commit](../2pc.md) is enabled, inserts into omnisharded tables are not guaranteed to be atomic. It is possible for the statement to succeed on some of the shards and not others.
-Unless [two-phase commit](../2pc.md) is enabled, inserts into omnisharded tables are not guaranteed to be atomic. It is possible for the statement to succeed on some of the shards and not others. If you don't want to or can't enable 2pc, consider sending cross-shard inserts inside a transaction:
+If you don't want to or can't enable two-phase commit on your database shards, consider sending cross-shard inserts inside a transaction or writing idempotent statements, for example:
```postgresql
BEGIN;
-INSERT INTO request_logs
- (client_ip, request_path, response_code, created_at)
-VALUES
- ($1, $2, $3, $4);
+INSERT INTO cities (id, city_name, country_code, created_at)
+VALUES ($1, $2, $3, now())
+ON CONFLICT (city_name) DO NOTHING;
-- You will receive an ack or an error from all shards here.
COMMIT;
```
-This gives you a much higher chance of recording rows on all shards, since you will know if your statement violated some constraint (e.g., unique index or `NOT NULL` check) before committing the transaction.
+This gives you a much higher chance of writing rows on all shards, since you will know if your statement violated a constraint (e.g., unique index or `NOT NULL` check) before committing the transaction.
-### Primary keys
+!!! warning "Two-phase commit"
+ Enabling [two-phase commit](../2pc.md) is highly recommended. It's been tested and works well in production.
+
+### Primary keys in omnisharded tables
+
+While UUID primary keys are common and are easily generated by the application, it is still common to use `BIGSERIAL` (or `SERIAL`) columns to uniquely identify rows in databases.
+
+These are typically powered by a sequence to ensure non-recurring values are automatically generated for each new row. However, sharded databases can't use sequences directly because they are not aware of other shards and will produce duplicate values across databases.
-It's common practice to use `BIGSERIAL` columns as primary keys. These are powered by a sequence to ensure non-recurring values are automatically generated for each new row.
+To work around this, PgDog provides a way to generate [unique integers](../unique-ids.md) in the proxy using a distributed and shard-aware algorithm. Since the integer is generated inside PgDog before sending the query to Postgres, its value will be the same for all rows sent to each shard.
-Sharded databases can't use sequences directly because they are not aware of other shards and will produce duplicate values across databases. To circumvent this, PgDog provides a way to generate [unique integers](../unique-ids.md) in the proxy using a distributed and shard-aware algorithm.
+### Manual primary key generation
-To use unique IDs as primary keys (or in any other column) in omnisharded tables, you can call the `pgdog.unique_id()` function in the `VALUES` clause. For example:
+To use unique IDs as primary keys (or in any other column) in omnisharded tables, you can call the `pgdog.unique_id()` function directly in the `VALUES` clause of an INSERT statement, for example:
```postgresql
-INSERT INTO ip_logs
- (id, client_ip, created_at)
-VALUES
- (pgdog.unique_id(), $1, now());
+INSERT INTO cities (id, city_name, country_code, created_at)
+VALUES (pgdog.unique_id(), $1, $2, now());
```
-The function is evaluated inside PgDog which places the value it returns directly into the query. This works for all queries, including prepared statements.
+The function is evaluated inside PgDog, replacing the value it generates directly into the query. This works for all queries, including prepared statements.
Each call to `pgdog.unique_id()` generates a unique value, so it's possible to use it multiple times inside the same query and get different numbers, for example:
@@ -78,47 +84,37 @@ Each call to `pgdog.unique_id()` generates a unique value, so it's possible to u
(1 row)
```
-This function can be used with any tables, not just omnisharded ones, or independently of any tables at all. PgDog can also [automatically inject](#primary-key-injection) the function call into insert queries so this feature works with ORMs like ActiveRecord out of the box.
+This function can be used with any table, not just omnisharded ones, or independently of tables altogether.
-## Sharded tables
-`INSERT` statements targeting sharded tables will commonly provide the sharding key. A notable exception to this rule is tables that shard on the primary key, which is often database-generated, e.g., using a sequence.
+### Automatic primary key generation
-The simplest way to work around this is to use the `pgdog.unique_id()` function to create a unique identifier on the fly, for example:
+If you're using an ORM like ActiveRecord, Prisma, SQLAlchemy, etc., it's often not easy or possible to modify how it generates its INSERT statements. Thankfully, the SQL standard specifies a couple of ways for the client to make the database generate the primary key automatically, which PgDog can intercept and handle.
-```postgresql
-INSERT INTO users
- (id, email, created_at)
-VALUES
- (pgdog.unique_id(), $1, $2)
-RETURNING *;
-```
-
-However, if you prefer to use sequences instead, you can rely on [database-generated](../sequences.md) primary keys.
-
-Statements that don't include the primary key in the `INSERT` tuple will be sent to one of the shards, using the same round robin algorithm used for [omnisharded](#omnisharded-tables) tables. The shard will then generate the primary key value using PgDog's [sharded sequences](../schema_management/functions.md#pgdognext_id_seq).
+#### How it works
-For example, assuming the table `users` is sharded on the primary key `id`, omitting it from the `INSERT` statement will send it to only one of the shards:
-
-```postgresql
-INSERT INTO users (email, created_at) VALUES ($1, $2) RETURNING *;
-```
+It's common for some ORMs to omit columns that the database is expected to generate values for. Since PgDog has knowledge of the [database schema](../schema_management/index.md), it can detect this scenario and inject the call to `pgdog.unique_id()` automatically, for example:
-## Primary key injection
+=== "Original query"
+ ```postgresql
+ INSERT INTO cities (city_name, country_code, created_at)
+ VALUES ($1, $2, now())
+ ```
+=== "Rewritten query"
+ ```postgresql
+ INSERT INTO cities (id, city_name, country_code, created_at)
+ VALUES ($3, $1, $2, now())
+ ```
-It's common for ORMs to not specify the primary key during inserts at all, or use a `DEFAULT` placeholder, for example:
+ The value of parameter `$3` is automatically set by PgDog to the value returned by the [unique ID](../unique-ids.md) generator.
-```postgresql
-INSERT INTO "users" ("email", "created_at") VALUES ($1, $2)
-```
+PgDog can also [automatically inject](#automatic-primary-key-generation) the function call into INSERT queries, so this feature works with ORMs like ActiveRecord, Prisma, etc., out of the box.
-This presents a problem to sharded databases because the underlying sequence cannot be used reliably. PgDog can automatically inject the `"id"` column (or any other `PRIMARY KEY` column) into the query generated with its [unique ID](../unique-ids.md) generator, rewriting the query to:
+Another way to write that query is to use the `DEFAULT` keyword, which explicitly tells the database to use the configured default for the column. PgDog can handle that scenario as well and will inject its generated primary key into the row.
-```postgresql
-INSERT INTO "users" ("id", "email", "created_at") VALUES (pgdog.unique_id(), $1, $2)
-```
+#### Configuration
-This works for regular queries and prepared statements alike. This feature is **disabled** by default and can be enabled in [`pgdog.toml`](../../../configuration/pgdog.toml/rewrite.md):
+This primary key generation feature is still relatively new and **disabled** by default. To enable it, configure it in [`pgdog.toml`](../../../configuration/pgdog.toml/rewrite.md):
=== "pgdog.toml"
```toml
@@ -133,9 +129,53 @@ This works for regular queries and prepared statements alike. This feature is **
primaryKey: rewrite
```
-### Composite primary keys
+## Sharded tables
+
+INSERT statements targeting sharded tables will commonly provide the sharding key. A notable exception to this rule is tables that shard on the primary key, which is often database-generated, e.g., using a sequence.
+
+The simplest way to work around this is to use the `pgdog.unique_id()` function to create a unique identifier on the fly, for example:
+
+```postgresql
+INSERT INTO users (id, email, created_at)
+VALUES (pgdog.unique_id(), $1, $2)
+RETURNING *;
+```
+
+PgDog will inject a unique integer into the query and send it to the corresponding shard as a [direct-to-shard](../query-routing.md) statement.
+
+### Primary keys
+
+For sharded tables that have `BIGINT` (or `INTEGER`) primary keys, you can rely on the [unique ID](../unique-ids.md) generator, as you can with [omnisharded](#primary-keys-in-omnisharded-tables) tables.
+
+However, since our algorithm produces large numbers, this may not always be suitable for all applications, especially those that pass IDs directly to a JavaScript/TypeScript frontend that can't handle large integers. For this reason, we created [sharded sequences](../sequences.md).
+
+### Sharded sequences
+
+!!! note "Installation required"
+ For sharded sequences to work correctly, they have to be installed into the database first.
+ Read more about this [here](../sequences.md).
+
+Sharded sequences apply a hashing function to a PostgreSQL sequence, creating unique and monotonically increasing values with small gaps between them. They are useful for generating primary keys for sharded tables, since they create cluster-unique values and require no special handling or query rewriting inside PgDog.
+
+For tables that use sharded sequences for primary key generation and use the primary key for sharding, PgDog sends the query to only one shard (using round-robin routing) and lets the database generate the sharding key automatically, for example:
+
+```postgresql
+INSERT INTO users (email, created_at) VALUES ($1, $2)
+RETURNING id;
+```
+
+The `id` column will be generated by the database (and not PgDog), globally unique, and matched to the shard it's generated on, as guaranteed by the [sharded sequence](../sequences.md) implementation.
+
+!!! warning "Sharded tables only"
+ Make sure to **never** use sharded sequences with **omnisharded** tables. They are not guaranteed to generate the same value on all shards, even with [two-phase commit](../2pc.md), and could cause primary key drift across shards.
+
+## Composite primary keys
+
+!!! warning "Not currently supported"
+ Composite primary keys are not currently supported for primary key generation inside PgDog.
Primary key injection only works for `BIGINT` primary key columns. Composite primary keys or other data types are not currently supported, but are on the [roadmap](../../../roadmap.md).
+
## Multiple tuples
In order to create multiple rows at once, the PostgreSQL query syntax supports sending multiple tuples in one statement. For example:
@@ -148,7 +188,7 @@ VALUES
($4, $5, $6);
```
-In sharded databases, however, the individual tuples are likely to belong on different shards. To make this work, PgDog can automatically rewrite the statement and send each tuple to the right shard. Using the example above, the result of that operation produces two single-tuple statements:
+In sharded databases, however, the individual tuples are likely to belong on different shards. To make this work, PgDog can automatically rewrite the statement and send each tuple to the right shard. Using the example above, that operation produces two single-tuple statements:
=== "Statement 1"
```postgresql
@@ -167,7 +207,7 @@ In sharded databases, however, the individual tuples are likely to belong on dif
This works for all queries, including prepared statements. PgDog will rewrite all Postgres protocol messages (e.g., `Bind`, `Describe`, etc.) without the application having to change its queries.
-Since this feature has additional overhead by using multiple shards for each query, it is **disabled** by default and can be enabled in [`pgdog.toml`](../../../configuration/pgdog.toml/rewrite.md):
+This feature is relatively new and is **disabled** by default. It can be enabled in [`pgdog.toml`](../../../configuration/pgdog.toml/rewrite.md):
=== "pgdog.toml"
```toml
@@ -196,6 +236,6 @@ If a transaction isn't started and a multi-tuple statement is sent by the applic
Requiring transactions ensures that if one of the `INSERT` statements fails, e.g., because of a unique constraint violation, the transaction can be rolled back, leaving the database in a consistent state.
-!!! note "Consistency guarantees"
+!!! warning "Two-phase commit"
- Much like [omnisharded](#omnisharded-tables) table inserts, it's best to enable [2pc](../2pc.md) before attempting cross-shard multi-tuple inserts. This feature increases the likelihood that cross-shard transactions are atomic.
+ Much like [omnisharded](#omnisharded-tables) table inserts, it's best to enable [two-phase commit](../2pc.md) before attempting cross-shard multi-tuple inserts. This feature increases the likelihood that cross-shard transactions are atomic.
diff --git a/docs/features/sharding/cross-shard-queries/select.md b/docs/features/sharding/cross-shard-queries/select.md
index 7b4bfe30..28aedf2c 100644
--- a/docs/features/sharding/cross-shard-queries/select.md
+++ b/docs/features/sharding/cross-shard-queries/select.md
@@ -4,45 +4,46 @@ icon: material/table-search
# Cross-shard SELECT
-A cross-shard `SELECT` query doesn't have one or has several sharding keys, which requires it to be executed by all shards. PgDog can perform it in parallel, assembling the results from each shard automatically. This makes it a powerful scatter/gather engine, with data nodes powered by PostgreSQL.
+A cross-shard SELECT query has either no sharding key or multiple sharding keys, which requires it to be executed by multiple database shards. PgDog can perform this in parallel, assembling the results from each shard automatically. This makes it a powerful scatter/gather engine, with data nodes powered by regular PostgreSQL.
## How it works
-When PgDog receives a `SELECT` query with no (or multiple) sharding keys, it connects to all databases and sends the query to all of them in parallel.
+When PgDog receives a SELECT query with no sharding key, or with multiple sharding keys, it connects to all databases and sends the query to all of them in parallel.
-If the result needs post-processing, e.g., to support [sorting](#sorting) or [aggregation](#aggregates), it will buffer the rows in memory and perform the necessary operations. Otherwise, PgDog will stream the rows directly to the client.
+If the result needs post-processing, e.g., to support [sorting](#sorting) or [aggregation](#aggregates), PgDog will buffer the rows in memory and perform the necessary operations. Otherwise, it will stream the rows directly to the application.
-!!! note "Predicate push-down"
- PgDog pushes all filtering, sorting and aggregation statements to the database. If the query is correctly constructed, the shards will return very few rows, allowing searches of vast quantities of data without causing out-of-memory errors or latency issues in the proxy.
+### Predicate push-down
+
+PgDog pushes all filtering, sorting, and aggregation statements to the database. If the query is correctly constructed, the shards will return very few rows, allowing searches over vast quantities of data without causing out-of-memory errors or latency issues in the proxy.
## Supported features
-The SQL language allows for powerful data filtration and manipulation. While we aim to support most operations, currently, support for some cross-shard operations is limited as documented below:
+SQL allows powerful data filtering and manipulation. While we aim to support most operations, support for some cross-shard operations is currently limited as documented below:
-| Operation | Supported | Limitations |
+| Operation | Support | Limitations |
|-|-|-|
-| `SELECT` columns | :material-check: | None. |
-| `ORDER BY` | :material-check: | Columns must be part of the returned tuples. See [sorting](#sorting). |
-| `DISTINCT` / `DISTINCT BY`| :material-check: | Columns must be part of the returned tuples. |
-| `GROUP BY` | :material-wrench: | Columns must be part of the returned tuples. See [aggregates](#aggregates). |
-| `LIMIT` | :material-check: | None. |
-| `OFFSET` | :material-check: | Rows are filtered in-memory, so paginating becomes linearly more expensive with the number of pages. |
-| CTEs | :material-wrench: | CTE must refer to data located on the same shard. |
-| Window functions | :material-close: | Not currently supported. |
-| Subqueries | :material-wrench: | Subqueries must refer to data located on the same shard. |
+| `SELECT` columns | :material-check: | |
+| `ORDER BY` | Partial support | Columns must be part of the returned tuples. See [sorting](#sorting). |
+| `DISTINCT` / `DISTINCT BY` | Partial support | Columns must be part of the returned tuples. |
+| `GROUP BY` | Partial support | Columns must be part of the returned tuples. See [aggregates](#aggregates). |
+| `LIMIT` | :material-check: | |
+| `OFFSET` | :material-check: | Rows are filtered in memory, so pagination becomes linearly more expensive with the number of pages. |
+| CTEs | Partial support | CTE queries must refer to data located on the same shard, e.g., the same sharding keys or [omnisharded](../omnishards.md) tables. |
+| Window functions | :material-close: | Not currently supported, but on the roadmap. |
+| Subqueries | Partial support | Just like CTEs, subqueries must refer to data located on the same shard, e.g., the same sharding keys or [omnisharded](../omnishards.md) tables. |
## Sorting
-If the query contains an `ORDER BY` clause, PgDog can sort the rows returned from all shards automatically. This works by buffering data returned from all servers and sorting it in memory.
+If the query contains an `ORDER BY` clause, PgDog can sort the rows returned from all shards. This works by buffering the data returned from all servers and sorting it in memory.
Currently, two forms of the `ORDER BY` SQL syntax are supported:
| Syntax | Example | Notes |
|-|-|-|
| Order by column name | `ORDER BY id, email` | The column must be present in the returned tuples. |
-| Order by column position | `ORDER BY 1, 2` | The column is referenced by its position in the returned tuples. |
+| Order by column position | `ORDER BY 1, 2` | |
-Sorting by multiple columns is supported, including opposing sorting directions, for example:
+Sorting by multiple columns is supported, including opposing sort directions, for example:
```postgresql
SELECT id, email, created_at FROM users
@@ -51,9 +52,12 @@ ORDER BY
created_at DESC
```
-Note that columns in the `ORDER BY` clause are retrieved from the table. PgDog cannot sort by columns it doesn't receive from the databases.
+Note that columns in the `ORDER BY` clause are retrieved from the table. PgDog cannot sort by columns it doesn't receive from the database shards.
+
+### Sorting by functions
-### Functions
+!!! warning "Not currently supported"
+ PgDog doesn't currently support sorting rows by the result of a SQL function.
PgDog currently doesn't support sorting results using a function, for example:
@@ -61,7 +65,7 @@ PgDog currently doesn't support sorting results using a function, for example:
SELECT email FROM users ORDER BY coalesce(email, '')
```
-To make this work, we need to implement all SQL functions inside PgDog. This is on the roadmap, but not currently a priority since the query can be easily rewritten to execute the function inside the database:
+To make this work, we need to implement many SQL functions inside PgDog. This is on the roadmap, but it is not currently a top priority since the query can be easily rewritten to execute the function inside the database:
```postgresql
SELECT
@@ -85,7 +89,7 @@ FROM users ORDER BY
### ORMs
-ORMs like SQLAlchemy or ActiveRecord, more often than not, will write queries that work with PgDog out of the box. This is because they tend to fetch entire rows and use fully-qualified names in all parts of the statement, including the `ORDER BY` clause.
+ORMs like SQLAlchemy, ActiveRecord, Prisma, etc., will often generate queries that work with PgDog out of the box. This is because they tend to fetch entire rows and use fully qualified names in all parts of the statement, including the `ORDER BY` clause.
For example, this is what a [`first`](https://apidock.com/rails/ActiveRecord/FinderMethods/first) Rails/ActiveRecord query looks like:
@@ -97,20 +101,20 @@ The `users.id` column is present in the returned row, so PgDog can read it and s
## Aggregates
-Aggregates are transformative functions: instead of returning rows as-is, they return calculated summaries, like a sum or a count. Many aggregate functions are cumulative: the final value can be calculated from partial results returned by each shard.
+Aggregates are transformative functions: instead of returning rows as-is, they return calculated summaries, like a sum or a count. Many aggregate functions are cumulative: the final value can be calculated from the partial results returned by each shard.
If an aggregate function is supported (see list of supported functions below), this is handled by PgDog automatically:
-| Aggregate functions | Supported | Notes |
+| Aggregate functions | Support | Notes |
|-|-|-|
-| `count`, `count(*)` | :material-check: | Works for most [data types](#supported-data-types). |
-| `max`, `min`, `avg`, `sum` | :material-check: | Works for most [data types](#supported-data-types). |
-| `stddev`, `variance` | :material-check: | Works for most [data types](#supported-data-types).|
+| `count()`, `count(*)` | :material-check: | Works for most [data types](#supported-data-types). |
+| `max()`, `min()`, `avg()`, `sum()` | :material-check: | Works for most [data types](#supported-data-types). |
+| `stddev()`, `variance()` | :material-check: | Works for most [data types](#supported-data-types). Results are [approximated](#rewriting-queries). |
| `percentile_disc`, `percentile_cont` | :material-close: | Not currently supported and very expensive to calculate on large datasets. |
-| `*_agg` | :material-close: | Not currently supported. |
-| `json_*` | :material-close: | Not currently supported. |
+| `*_agg` | :material-close: | Not currently supported, but on the roadmap. |
+| `json_*` | :material-close: | Not currently supported, but on the roadmap. |
-Aggregate functions have to appear in the target clause of the statement (`SELECT [...]`), and can also be combined with sorting, for example:
+Aggregate functions must appear in the target clause of the statement (`SELECT [...]`) and can also be combined with sorting, for example:
```postgresql
SELECT COUNT(*), is_admin
@@ -119,18 +123,21 @@ GROUP BY 2
ORDER BY 1 DESC
```
-#### `HAVING` clause
+#### HAVING clause
+
+!!! warning "Not currently supported"
+ The `HAVING` clause is not currently supported but is on the roadmap.
The `HAVING` clause requires additional filtering of the results and is not currently supported. See [#695](https://github.com/pgdogdev/pgdog/issues/695) for more details.
-### Rewriting queries
+## Rewriting queries
-For some aggregate functions to work as expected, each shard may need to return columns and intermediate calculations not present in the original query.
+For some aggregate functions to work as expected, each shard may need to return columns and intermediate calculations that are not present in the original query.
-For example, to get an average of a column, we need to fetch the row `count`, multiply it by the `avg` of the column on each shard, and divide it by the total `count` of rows on all shards.
+For example, to get the average of a column, we need to fetch the row count from each shard, multiply it by the average of the column on each shard, and divide it by the total count of rows on all shards.
-If the `count` function isn't present in the query, PgDog will automatically rewrite the query to add it. This allows queries, like the following example, to just work without modifications:
+If the `count()` function is needed to compute an aggregate but isn't present in the query, PgDog will automatically rewrite the query to add it. This allows queries like the following example to work without modification:
=== "Original"
```postgresql
@@ -145,17 +152,17 @@ PgDog automatically removes the `count` column from the returned rows, so applic
The following aggregate functions take advantage of this feature:
-- `avg`
-- `stddev`
-- `variance`
-
-#### Configuration
+| Function | Description |
+|-|-|
+| `avg()` | Calculates the average of a column across multiple shards. |
+| `stddev()` | Uses an approximation of the actual standard deviation. |
+| `variance()` | Same as `stddev()`, the result is approximated. |
-This feature is **enabled** by default and requires no additional configuration.
+This feature is **enabled** by default for all cross-shard SELECT queries and requires no additional configuration.
## Supported data types
-The following table lists the data types supported by PgDog for ordering and aggregation. Since Postgres clients can request results in either text or binary format, each one must be handled separately:
+The following table lists the data types supported by PgDog for ordering and aggregation. Since Postgres clients can request results in either text or binary format, each format must be handled separately:
| Data type | Sorting | Aggregation | Text format | Binary format |
|-|-|-|-|-|
@@ -168,5 +175,5 @@ The following table lists the data types supported by PgDog for ordering and agg
| `UUID` | :material-check-circle-outline: | :material-check-circle-outline: | :material-check-circle-outline: | :material-check-circle-outline: |
| `VECTOR` | Only by L2 | :material-check-circle-outline: | :material-check-circle-outline: | :material-check-circle-outline: |
-!!! note "pgvector data types"
- `VECTOR` type doesn't have a fixed OID in Postgres because it comes from an extension (`pgvector`). We infer it from the `<->` operator used in the `ORDER BY` clause.
+!!! note "pgvector"
+ The `VECTOR` type doesn't have a fixed OID in Postgres because it comes from an extension (`pgvector`). We infer it from the `<->` operator used in the `ORDER BY` clause.
diff --git a/docs/features/sharding/index.md b/docs/features/sharding/index.md
index 66522693..85f7653a 100644
--- a/docs/features/sharding/index.md
+++ b/docs/features/sharding/index.md
@@ -2,45 +2,44 @@
icon: material/set-split
---
-# Sharding Postgres
+# Sharding PostgreSQL
-Sharding splits a PostgreSQL database and all its tables and indices between multiple machines. Each machine runs its own, independent PostgreSQL server, while PgDog takes care of routing queries and moving data between hosts.
+Sharding splits up a PostgreSQL database with all its tables and indices between multiple servers. Each machine runs its own, independent PostgreSQL server, while PgDog takes care of routing queries and moving data between databases.
+
+Applications are not aware of sharding and should continue to work as if they were using regular Postgres. PgDog's role is to make that possible.
+
+A lot of the features described in this section are stable, tested and are powering large, production databases. Others are still experimental and are marked accordingly. If you have any questions about how sharding works in PgDog, join our [Discord](https://discord.gg/CcBZkjSJdd).
## Intro to sharding
+
+
+ PgDog's sharding architecture
+
+
If you're not familiar with database sharding fundamentals, take a look at the [sharding basics](basics.md). Even if you're a seasoned database expert, it's good to have a refresher to confirm your understanding matches our implementation.
-[**→ Sharding basics**](basics.md)
+PgDog is somewhat similar in architecture to Vitess (sharding proxy for MySQL). Everything that has to do with sharding is handled internally and any abstraction that leaks to the client is usually considered a bug. You can report those [here](https://github.com/pgdogdev/pgdog/issues).
## Routing queries
-PgDog is a query router. It can extract sharding hints directly from the SQL using the PostgreSQL parser and send queries to one or more shards.
+PgDog is a query router. It can extract sharding hints directly from the SQL queries, using its built-in PostgreSQL parser, and send queries to one or more shards. Different types of queries which PgDog can currently handle are listed below:
-| Query router feature | Description |
+| Query | Description |
|-|-|
-| [**Direct-to-shard queries**](query-routing.md) | Automatic sharding key detection which sends the query to one shard only. |
-| [**Cross-shard queries**](cross-shard-queries/index.md) | Queries that don't have a sharding key are sent to all shards with results collected and transformed, as if they came from one database. |
-| [**Manual routing**](manual-routing.md) | Provide the sharding key in a query comment, or separately with a `SET` PostgreSQL command. |
-| [**Sharded COPY**](cross-shard-queries/copy.md) | Data sent via `COPY` commands is automatically split between all shards, using the configured [sharding function](sharding-functions.md). |
-
-## Managing data
-
-PgDog implements the logical replication protocol used by Postgres to move data between nodes.
+| [**Direct-to-shard queries**](query-routing.md) | Sharding key(s) are extracted directly from the query text and the query is sent to one shard only. |
+| [**Cross-shard queries**](cross-shard-queries/index.md) | Queries which don't have a sharding key are sent to all shards, with the results collected and transformed, as if they are coming from one database. |
+| [**Manually routed queries**](manual-routing.md) | Queries are routed explicitly using a query comment, or separately with a `SET` command, to a specific shard. |
+| [**COPY commands**](cross-shard-queries/copy.md) | Data sent with the `COPY` command is automatically sharded between all databases. |
### Data consistency
-To make sure data is atomically written in cross-shard transactions, PgDog supports PostgreSQL's prepared transactions and two-phase commit.
-
-[**→ Two-phase commit**](2pc.md)
-
-### Resharding
+To make sure data is atomically written in cross-shard transactions, PgDog supports PostgreSQL's prepared transactions and [two-phase commit](2pc.md).
-Resharding takes a database cluster with _N_ shards (where _N_ can be 1, for unsharded databases), and turns it into a cluster with _M_ databases. It uses logical replication to do this without downtime or impacting production operations.
+## Managing data
-[**→ Resharding**](resharding/index.md)
+PgDog implements the logical replication protocol used by PostgreSQL and can move data between databases, while distributing individual rows between shards. This process is called resharding and you can read more about how PgDog implements it [here](resharding/index.md).
### Schema management
-PgDog makes sure that the database schema is identical on all shards. It also has support for in-database primary key generation.
-
-[**→ Schema management**](schema_management/index.md)
+PgDog makes sure that the database [schema](schema_management/index.md) is identical on all shards. It also has support for [in-database](sequences.md) and [in-proxy](unique-ids.md) primary key generation, so you can continue to use `BIGINT` (and `INTEGER`) primary keys in sharded PostgreSQL deployments.
diff --git a/docs/features/sharding/query-routing.md b/docs/features/sharding/query-routing.md
index 2bb0dad6..022be690 100644
--- a/docs/features/sharding/query-routing.md
+++ b/docs/features/sharding/query-routing.md
@@ -5,21 +5,22 @@ icon: material/call-split
PgDog has a powerful parser that can extract sharding hints directly from SQL queries. Queries that refer to a column in one of the [sharded tables](../../configuration/pgdog.toml/sharded_tables.md) are sent directly to the corresponding database in the [configuration](../../configuration/pgdog.toml/databases.md).
-Direct-to-shard queries are foundational to horizontal database scaling. The more queries can be routed to just one database, the more requests can be served by the entire cluster.
+Direct-to-shard queries are foundational to horizontal database scaling. The more queries that can be routed to just one database, the more requests the entire sharded database cluster can serve.
## How it works
-PgDog is using the [pg_query](https://docs.rs/pg_query) library, which provides direct access to the native PostgreSQL parser. This allows PgDog to read and understand **100%** of valid SQL queries and commands.
+Under the hood, PgDog uses the [pg_query](https://docs.rs/pg_query) library, which provides direct access to the native PostgreSQL parser. This allows PgDog to read and understand all valid SQL queries and commands.
-
+
+ Direct-to-shard queries go to one shard at a time.
-PgDog is deployed as a proxy between Postgres shards and the application and takes care of routing queries between them. Each SQL command is different and is handled differently by our query router, as documented below.
+PgDog is deployed as a proxy between Postgres shards and the application, and takes care of routing queries between them. Each SQL command is different and is handled differently by the query router, as documented below.
## SELECT
-To route `SELECT` queries, the query router looks for a sharding key in the `WHERE` clause. For example, if your database is sharded by the `user_id` column, all queries that filter rows by that column, either directly or through a foreign key, can be sent to a single shard:
+To route SELECT queries, the query router looks for a sharding key in the `WHERE` clause. For example, if your database is sharded by the `user_id` column, all queries that filter rows by that column, either directly or through a foreign key, can be sent to a single shard:
```postgresql
SELECT * FROM payments
@@ -29,26 +30,32 @@ WHERE
payments.user_id = $1; -- Sharding key.
```
-Both regular queries and [prepared statements](../connection-pooler/prepared-statements.md) are supported. So if your database driver is using placeholders instead of actual values, PgDog will extract the sharding key value from the extended protocol messages.
+Both regular queries and [prepared statements](../connection-pooler/prepared-statements.md) are supported. If your database driver uses placeholders instead of actual values, PgDog will extract the sharding key value from the extended protocol messages.
+
+The sharding key doesn't have to appear in the top-level statement: PgDog's parser will recurse into subqueries and CTEs, if any, and find all matching filters. For example:
+
+```postgresql
+SELECT * FROM (SELECT * FROM users WHERE tenant_id = $1 /* sharding key */) t;
+```
### Supported syntax
-The `SELECT` query can express complex filtering logic and not all of it is currently supported. The following filters in the `WHERE` will work:
+The `SELECT` query can express complex filtering logic, and not all of it is currently supported. The following filters in the `WHERE` clause will work:
| Filter | Example |
|-|-|
-| Column equals to a value | `payments.user_id = $1` |
-| Column matches against a list | `payments.user_id IN ($1, $2, $3)`
+| Column equals a value | `payments.user_id = $1` |
+| Column matches a list | `payments.user_id IN ($1, $2, $3)` |
All other variations will be ignored and the query will be sent to [all shards](cross-shard-queries/index.md).
!!! note "Query router improvements"
- This is an area of constant improvement. Check back here for updates or [create an issue](https://github.com/pgdogdev/pgdog/issues/new) to request
- support for a particular filter you're using.
+ This is an area of constant improvement. Check back here for updates or [create an issue](https://github.com/pgdogdev/pgdog/issues/) to request
+ support for a particular filter or query you are using.
-If the query has multiple sharding key filters, all of them will be extracted and converged to a set of unique shard numbers.
+If the query has multiple sharding keys, all of them will be extracted and reduced to a set of unique shard numbers.
-For example, when filtering by a list of values, e.g., `WHERE user_id IN ($1, $2, $3)`, if all of them map to a single shard, the query will be sent to that shard only. If they map to two or more shards, it will be sent to all corresponding shards [concurrently](cross-shard-queries/index.md).
+For example, when filtering by a list of values, e.g., `WHERE user_id IN ($1, $2, $3)`, the query will be sent only to that shard if all values map to a single shard. If they map to two or more shards, it will be sent to all corresponding shards [concurrently](cross-shard-queries/index.md).
## INSERT
@@ -58,93 +65,124 @@ Insert queries are routed using the values in the `VALUES` clause, for example:
INSERT INTO payments (user_id, amount) VALUES ($1, $2) RETURNING *
```
-If the query is inserting a row into a [sharded table](../../configuration/pgdog.toml/sharded_tables.md), the query router will extract the sharding key, and route the query to the corresponding shard.
-
-Just like for `SELECT` queries, both [prepared statements](../connection-pooler/prepared-statements.md) and regular queries are supported.
+If the query is inserting a row into a [sharded table](../../configuration/pgdog.toml/sharded_tables.md), the query router will extract the sharding key and route the query to the corresponding shard. As with [SELECT](#select) queries, both [prepared statements](../connection-pooler/prepared-statements.md) and regular queries are supported.
### Supported syntax
-PgDog can automatically detect the sharding key in an `INSERT` statement, whether it specifies column names or not. It can also split multi-tuple inserts, by sending each tuple to their respective shard, for example:
+PgDog can automatically detect the sharding key in an INSERT statement, whether it specifies column names or not. This works because PgDog fetches the table definitions at proxy startup and knows which columns a particular table contains:
```postgresql
--- user_id is the sharding key ($1)
+-- user_id is the sharding key ($1).
INSERT INTO payments (user_id, amount) VALUES ($1, $2);
-- user_id is automatically detected as parameter $1
--- using schema inference
+-- using the table schema.
INSERT INTO payments VALUES ($1, $2);
-
--- Statement is rewritten into two inserts, and each is sent
--- to different shards
-INSERT INTO payments VALUES ($1, $2), ($3, $4);
```
-## UPDATE and DELETE
+If an INSERT statement contains multiple tuples, PgDog can rewrite it into separate statements and send them concurrently to their respective shards. This feature is still experimental and **disabled** by default. You can enable it in [`pgdog.toml`](../../configuration/pgdog.toml/rewrite.md):
-Both `UPDATE` and `DELETE` queries work identically to [`SELECT`](#select) queries. The query router looks inside the `WHERE` clause for sharding keys, and routes the query to the corresponding shard.
+=== "pgdog.toml"
+ ```toml
+ [rewrite]
+ split_inserts = "rewrite"
+ ```
+=== "Helm chart"
+ ```yaml
+ rewrite:
+ splitInserts: rewrite
+ ```
-If no `WHERE` clause is present, or it's filtering on a column not used for sharding, the query is sent to all shards [concurrently](cross-shard-queries/index.md), for example:
+Once enabled, PgDog will transform multi-tuple queries automatically and send them to their respective shards, for example:
+
+=== "Original statement"
+ ```postgresql
+ INSERT INTO payments VALUES ($1, $2), ($3, $4);
+ ```
+=== "Rewritten statements"
+ ```postgresql
+ -- Statement is rewritten into two, and each is sent to a different shard.
+ INSERT INTO payments VALUES ($1, $2);
+ INSERT INTO payments VALUES ($1, $2);
+ ```
+
+### Subqueries and CTEs
+
+!!! warning "Not supported yet"
+ Subqueries and CTEs are not presently supported for sharded INSERT statements.
+
+Currently, subqueries fetching data from _other_ shards are not supported in INSERT statements. For example, the following pattern _will not_ work with PgDog:
```postgresql
-UPDATE users SET banned = true;
+INSERT INTO users (tenant_id, email) VALUES ($1, (SELECT email FROM signups LIMIT 1));
```
-
+Unlike Citus, PgDog supports mutating a sharding key column with an UPDATE statement. Under the hood, it will move the row between shards, deleting it from the original shard and inserting it into the new one. See [sharding key updates](cross-shard-queries/update.md#sharding-key-updates) for more details.
## Foreign keys
-While it's best to choose a sharding column present in all tables, it is sometimes not desirable or possible to do so. For example, it's redundant to store a foreign key in a table that has a transitive relationship to another table:
+While it's best to choose a sharding column that's present in all tables, it is sometimes not desirable or possible to do so. For example, it's redundant to store a foreign key in a table that has a transitive relationship to another table:
-
+
+ Transitive foreign key relationships require special handling.
-In this example, the `order_items` table has a foreign key to `orders`, which in turn refers to `users`. This makes `order_items` related to `users` as well, but it doesn't need a foreign key to that table. However, this also means that table doesn't have a sharding key.
+In this example, the `order_items` table has a foreign key to `orders`, which in turn refers to `users`. This makes `order_items` related to `users` as well, but it doesn't need a foreign key to that table. However, this also means the table doesn't have its own sharding key.
To make querying the `order_items` table in a sharded database possible, the following workarounds are available:
| Workaround | Description |
|-|-|
-| Add sharding key column | Add the sharding key column to the table and backfill it with corresponding values. |
+| Add the sharding key column | Add the sharding key column to the table and backfill it with corresponding values. |
| [Manual routing](manual-routing.md) | Provide sharding hints to the query router via SQL comments or `SET` commands. |
-| Use joins | For `SELECT` queries only, refer to the table as part of a join to a table that has the sharding key column. All other queries would need to use [manual routing](manual-routing.md).|
+| Use joins | For [SELECT](#select) queries only, refer to the table as part of a join to another table that has the sharding key column. All other queries, e.g., INSERT, DELETE, etc., would need to use [manual routing](manual-routing.md).|
+
+Adding the sharding key column is often the best choice because it makes writing queries much easier. The sharding key is usually a compact data type, like a `BIGINT` or a `UUID`, so it doesn't take up much space and can be backfilled relatively quickly.
-Adding the sharding key column is often best, because it makes writing queries a lot easier. The sharding key is usually a compact data type, like a `BIGINT` or a `UUID`, so it doesn't take up much space, and can be backfilled relatively quickly. If backfilling, make sure to do so in small batches, so as to reduce impact on database performance.
+!!! note "Backfilling"
+ If backfilling the sharding key column, make sure to do so in small batches to reduce the impact on database performance.
### Sharding configuration
-If most or all of your tables have the sharding key and the column name is the same, you can add it to [pgdog.toml](../../configuration/pgdog.toml/sharded_tables.md) without specifying a table name, for example:
+If most or all of your tables have the sharding key and the column name is the same, you can add it to [`pgdog.toml`](../../configuration/pgdog.toml/sharded_tables.md) without specifying a table name, for example:
=== "pgdog.toml"
```toml
[[sharded_tables]]
database = "prod"
column = "user_id"
- data_type = "bigint"
```
=== "Helm chart"
```yaml
shardedTables:
- database: prod
column: user_id
- dataType: bigint
```
This will match all queries referring to all tables with the `user_id` column and route them to a shard accordingly.
-For the table storing the actual data referred to by the foreign keys, you can make another
-entry in the config, this time with the table name explicitly stated:
+For the table storing the actual data referred to by the foreign keys, you can add another entry to the configuration, this time with the table name explicitly stated:
=== "pgdog.toml"
```toml
@@ -152,7 +190,6 @@ entry in the config, this time with the table name explicitly stated:
database = "prod"
name = "users"
column = "id"
- data_type = "bigint"
```
=== "Helm chart"
```yaml
@@ -160,10 +197,9 @@ entry in the config, this time with the table name explicitly stated:
- database: prod
name: users
column: id
- dataType: bigint
```
-The latter will match queries referring to the `users.id` column only. Together with the `user_id` entry, all tables that contain the sharding key will be supported by the query router for direct-to-shard queries.
+The second entry will match queries referring to the `users.id` column only. Together with the `user_id` entry, all tables that contain the sharding key will be supported by the query router for direct-to-shard queries.
## Read more
diff --git a/docs/features/sharding/sharding-functions.md b/docs/features/sharding/sharding-functions.md
index 2999ab87..c84afddd 100644
--- a/docs/features/sharding/sharding-functions.md
+++ b/docs/features/sharding/sharding-functions.md
@@ -3,113 +3,106 @@ icon: material/function
---
# Sharding functions
-The sharding function inside PgDog transforms column values in SQL queries to specific shard numbers, which are in turn used for routing queries to one or more databases in the [configuration](../../configuration/pgdog.toml/databases.md).
+The sharding functions determine how to route SQL queries to one or more shard numbers. They can use arbitrary input data to make this decision, and PgDog supports multiple sharding functions. Once a shard number is determined, PgDog will send the query to one or more databases configured in [`pgdog.toml`](../../configuration/pgdog.toml/databases.md).
-## How it works
+## Supported functions
-The PgDog sharding function is based on PostgreSQL declarative partitions. This choice is intentional: it allows data to be sharded both inside PgDog and inside PostgreSQL, with the use of the same partition functions.
+Currently, PgDog supports two kinds of sharding function:
-PgDog supports all three PostgreSQL partition functions and uses them for sharding data between nodes:
+| Sharding function | Description |
+|-|-|
+| [Column-based](#column-based-sharding) | Uses one of the three supported Postgres partition functions and applies them to a specific column value, e.g., `tenant_id` to produce a shard number. |
+| [Schema-based](#schema-based-sharding) | Maps different PostgreSQL schemas (e.g., `public`) to different shard numbers, allowing you to physically separate different schemas. |
+
+## Column-based sharding
+
+The PgDog column sharding function is based on PostgreSQL declarative partitions. This choice is intentional: it allows data to be sharded both inside PgDog and inside PostgreSQL, using the same partition functions.
+
+PgDog supports all three PostgreSQL partition functions:
| Function | Description |
|-|-|
-| Hash | `PARTITION BY HASH` function, using a special hashing function implemented by both PgDog and Postgres. |
+| Hash | `PARTITION BY HASH` function, using an internal hashing function implemented by both PgDog and PostgreSQL. |
| List | `PARTITION BY LIST` function, used for splitting rows by an explicitly defined mapping of values to shard numbers. |
-| Range| `PARTITION BY RANGE` function, similar to list sharding, except the mapping is defined with a bounded range. |
+| Range | `PARTITION BY RANGE` function, similar to list sharding, except the mapping is defined using a bounded range. |
-The sharding functions are configurable in [`pgdog.toml`](../../configuration/pgdog.toml/sharded_tables.md) on a per-table and/or per-column basis.
+The sharding functions are configurable in [`pgdog.toml`](../../configuration/pgdog.toml/sharded_tables.md) on a per-table and/or per-column basis, for example:
-!!! note "Multiple sharding functions"
- Since sharding is configured for each table or column name, this allows storing tables
- with different sharding functions in the same database.
+=== "pgdog.toml"
+ ```toml
+ [[sharded_tables]]
+ database = "prod"
+ column = "tenant_id"
+ ```
+=== "Helm chart"
+ ```yaml
+ shardedTables:
+ - database: prod
+ column: tenant_id
+ ```
- While this works for some [cross-shard](cross-shard-queries/index.md) queries, joins between tables using a different sharding function are not possible for [direct-to-shard](query-routing.md) queries.
+By default, PgDog uses the hash-based function, which distributes data evenly, on average, between all shards. PgDog currently supports sharding on all integers (including `BIGINT`, `INTEGER`, and `SMALLINT`), text (including `VARCHAR`), and UUID columns.
+By default, the sharded tables configuration uses the integer data type, but you can specify a different one as follows:
-## Hash
+=== "pgdog.toml"
+ ```toml
+ [[sharded_tables]]
+ database = "prod"
+ column = "tenant_id"
+ data_type = "uuid" # or "varchar"
+ ```
+=== "Helm chart"
+ ```yaml
+ shardedTables:
+ - database: prod
+ column: tenant_id
+ dataType: uuid # or varchar
+ ```
-The hash function evenly distributes data between all shards. It ingests bytes and returns a single 64-bit unsigned integer which we then modulo by the number of shards in the configuration.
+The data type needs to be known at runtime so PgDog can safely parse and interpret queries without talking to the database. This also allows it to resolve the data type in ambiguous situations, e.g., when using [query comments](manual-routing.md#query-comment) for routing queries.
-
-`hash(user_id) mod shards`
-
+### Table/column matching
+The sharded tables configuration uses greedy matching to find tables and columns. For example, if the configuration only specifies the `column`, the config will match all tables that have that column. This is especially useful when the database schema follows a convention for naming columns.
-The hash function is used by default when configuring sharded tables in [`pgdog.toml`](../../configuration/pgdog.toml/sharded_tables.md):
+To match a specific table/column combination, you can specify the table name as follows:
=== "pgdog.toml"
```toml
[[sharded_tables]]
database = "prod"
- column = "user_id"
- data_type = "bigint"
+ name = "users"
+ column = "company_id"
```
=== "Helm chart"
```yaml
shardedTables:
- database: prod
- column: user_id
- dataType: bigint
+ name: users
+ column: company_id
```
-All queries referencing the `user_id` column will be automatically sent to the matching shard(s) and data in those tables will be split between all data nodes evenly. See below for a list of [supported](#supported-data-types) data types. Each can be specified as follows:
-
-=== "Integers"
- !!! note "Integer types"
- Different integer types are treated the same by the query router. If you're using `BIGINT`, `INTEGER` or `SMALLINT` as your sharding key, you can specify `bigint` in the configuration:
-
- === "pgdog.toml"
- ```toml
- [[sharded_tables]]
- database = "prod"
- column = "user_id"
- data_type = "bigint"
- ```
- === "Helm chart"
- ```yaml
- shardedTables:
- - database: prod
- column: user_id
- dataType: bigint
- ```
-=== "Text"
- !!! note "Text types"
- `VARCHAR`, `VARCHAR(n)`, and `TEXT` use the same encoding and are treated the same by the query router. For either one, you can specify `varchar` in the configuration:
- === "pgdog.toml"
- ```toml
- [[sharded_tables]]
- database = "prod"
- column = "serial_number"
- data_type = "varchar"
- ```
- === "Helm chart"
- ```yaml
- shardedTables:
- - database: prod
- column: serial_number
- dataType: varchar
- ```
-=== "UUID"
- !!! note "UUID types"
- Only UUIDv4 is currently supported for sharding in the query router.
- === "pgdog.toml"
- ```toml
- [[sharded_tables]]
- database = "prod"
- column = "unique_id"
- data_type = "uuid"
- ```
- === "Helm chart"
- ```yaml
- shardedTables:
- - database: prod
- column: unique_id
- dataType: uuid
- ```
-
-## List
-
-The list sharding function distributes data between shards according to a value <-> shard mapping. It's useful for low-cardinality sharding keys, like country codes or region names, or when you want to control how your data is distributed between the data nodes. The most common use case for this is [multitenant](../multi-tenancy.md) systems.
+This makes PgDog's sharding configuration flexible and forgiving of the realities of running PostgreSQL in production. As long as you can find and configure all required sharding keys, query routing will work as expected.
+
+!!! note "Multiple sharding functions"
+ Since sharding is configured for each table or column name, this allows you to store tables
+ with different sharding functions in the same database.
+
+ While this works for some [cross-shard](cross-shard-queries/index.md) queries, joins between tables using different sharding functions are not going to work for [direct-to-shard](query-routing.md) queries.
+
+
+### Why Postgres partitions
+
+We often get asked why we chose PostgreSQL partitions for sharding Postgres. There are indeed better hash functions, e.g., rendezvous hashing, which minimizes the amount of data movement needed when changing the number of shards later on.
+
+Partition functions allow you to reshard data both inside PgDog and inside Postgres. For example, if you already have partitioned several tables (usually the biggest and most heavily used ones) and you just want to move those to different PostgreSQL servers, you can do so with logical replication, or even with `COPY`.
+
+This makes the initial step for sharding your database that much easier and doesn't require you to use our (currently experimental) [resharding](resharding/index.md) implementation.
+
+### List-based sharding
+
+The list sharding function distributes data between shards according to a value <-> shard mapping. It's useful for low-cardinality sharding keys, such as country codes or region names, or when you want to control how your data is distributed between the data nodes. The most common use case for this is [multitenant](../multi-tenancy.md) systems.
To enable this sharding function on a table or column, you need to specify additional value <-> shard mappings in [`pgdog.toml`](../../configuration/pgdog.toml/sharded_tables.md), for example:
@@ -132,14 +125,14 @@ To enable this sharding function on a table or column, you need to specify addit
shard: 0
```
-This example will route all queries with `user_id` equal to one, two, or three to shard zero. Unlike [hash](#hash) sharding, a value <-> shard mapping is required for _all_ values of the sharding key. If a value is used that doesn't have a mapping, the query will be sent to [all shards](cross-shard-queries/index.md).
+This example will route all queries with `user_id` equal to `1`, `2`, and `3` to shard zero. Unlike [hash](#column-based-sharding) sharding, a value <-> shard mapping is usually required for _all_ values of the sharding key. If a value is used that doesn't have a mapping and a [fallback](#fallback-shard) routing configuration isn't specified, the query will be sent to [all shards](cross-shard-queries/index.md).
!!! note "Required configuration"
- The `[[sharded_tables]]` configuration entry is still required for list and range sharding. It specifies the data type of the column, which tells PgDog how to parse its value at runtime.
+ The `[[sharded_tables]]` configuration entry is still required for list-based sharding. It specifies the data type of the column, which tells PgDog how to parse its value at runtime.
-## Range
+### Range-based sharding
-Sharding by range is similar to [list](#list) sharding, except instead of specifying the values explicitly, you can specify a bounding range. All values that are included in the range will be sent to the specified shard, for example:
+Sharding by range is similar to [list](#list-based-sharding) sharding, except instead of specifying the values explicitly, you can specify a bounded range. All values included in the range will be sent to the specified shard, for example:
=== "pgdog.toml"
```toml
@@ -162,34 +155,58 @@ Sharding by range is similar to [list](#list) sharding, except instead of specif
shard: 0
```
-This will route queries that refer to the `user_id` column, with values between 1 and 100 (exclusively), to shard zero. For open-ended ranges, you can specify either the `start` or the `end` value. The start value is included in the range, while the end value is excluded.
+This example will route queries that refer to the `user_id` column with values between 1 and 100 (exclusively) to shard zero. For open-ended ranges, you can specify either the `start` or the `end` value. The start value is included in the range, while the end value is excluded (same as PostgreSQL partitions).
!!! note "Required configuration"
- The `[[sharded_tables]]` configuration entry is still required for list and range sharding. It specifies the data type of the column, which tells PgDog how to parse its value at runtime.
+ The `[[sharded_tables]]` configuration entry is still required for range-based sharding. It specifies the data type of the column, which tells PgDog how to parse its value at runtime.
-## Supported data types
+### Fallback shard
+
+If you don't want to specify an exhaustive list of values, PgDog accepts a default (or fallback) mapping which will match all queries that are not otherwise configured using other `[[sharded_mapping]]` entries:
+
+=== "pgdog.toml"
+ ```toml
+ [[sharded_mappings]]
+ database = "prod"
+ column = "user_id"
+ kind = "default"
+ shard = 1
+ ```
+=== "Helm chart"
+ ```yaml
+ shardedMappings:
+ - database: prod
+ column: user_id
+ kind: default
+ shard: 1
+ ```
+
+This is identical to `PARTITION OF [...] DEFAULT` behavior in PostgreSQL.
-PostgreSQL has dozens of data types. PgDog supports a subset of those for sharding purposes and they are listed below.
+## Supported data types
!!! note "Work in progress"
This list will continue to get longer as the development of PgDog continues. Check back soon or [create an issue](https://github.com/pgdogdev/pgdog/issues) to request support for a data type you need.
-| Data type | Hash | List | Range |
-|-|-|-|-|
-| `BIGINT` / `INTEGER` / `SMALLINT` | :material-check-circle-outline: | :material-check-circle-outline: | :material-check-circle-outline: |
-| `VARCHAR` / `TEXT` | :material-check-circle-outline: | :material-check-circle-outline: | No |
-| `UUID` | :material-check-circle-outline: | :material-check-circle-outline: | No |
+
+PostgreSQL has dozens of data types. PgDog supports a subset of those for sharding purposes, and they are listed below:
+
+| Data type | Hash | List | Range | Configuration |
+|-|-|-|-|-|
+| `BIGINT` / `INTEGER` / `SMALLINT` | :material-check-circle-outline: | :material-check-circle-outline: | :material-check-circle-outline: | `"bigint"` |
+| `VARCHAR` / `TEXT` | :material-check-circle-outline: | :material-check-circle-outline: | No | `"varchar"` |
+| `UUID` | :material-check-circle-outline: | :material-check-circle-outline: | No | `"uuid"` |
## Schema-based sharding
In addition to splitting the tables themselves, PgDog can shard Postgres databases by placing different schemas on different shards. This is useful for multi-tenant applications that have stricter separation between their users' data.
-When enabled, PgDog will route queries that fully qualify tables based on their respective schema names.
+When enabled, PgDog will route queries that fully qualify tables with their schema names. Additionally, it can use the `search_path` session variable to infer the schema name for specified tables and use that for routing queries instead.
### Schema-to-shard mapping
-Schemas are mapped to their shards in [pgdog.toml](../../configuration/pgdog.toml/sharded_schemas.md), for example:
+Just like [column-based sharding](#column-based-sharding), schemas can be mapped to their shards with configuration in [`pgdog.toml`](../../configuration/pgdog.toml/sharded_schemas.md):
=== "pgdog.toml"
```toml
@@ -214,7 +231,7 @@ Schemas are mapped to their shards in [pgdog.toml](../../configuration/pgdog.tom
shard: 1
```
-Queries that include the schema name in the tables they are referring to can be routed to the right shard. For example:
+Queries that include the schema name in the tables they refer to can be routed to the right shard. For example:
```postgresql
SELECT * FROM customer_a.users WHERE email = $1;
@@ -222,11 +239,17 @@ SELECT * FROM customer_a.users WHERE email = $1;
Since the `users` table is fully qualified as `customer_a.users`, the query will be routed to shard zero.
-### DDL
+Alternatively, the application can dynamically set the `search_path` session variable to the desired schema before executing the query, for example:
+
+```postgresql
+SET search_path TO customer_a, public;
+```
-Unlike other sharding functions, schema-based sharding will also route DDL (e.g., `CREATE TABLE`, `CREATE INDEX`, etc.) queries to their respective shard, as long as the entity name is fully qualified.
+Schemas are evaluated in the order specified in the statement, and the first schema that matches a configuration entry in `pgdog.toml` is chosen for routing all subsequent queries.
-For example:
+### DDL
+
+Unlike column-based sharding functions, schema-based sharding will also route DDL statements (e.g., `CREATE TABLE`, `CREATE INDEX`, etc.) to the appropriate shard, as long as the entity name is fully qualified or `search_path` is set:
```postgresql
CREATE TABLE customer_b.users (
@@ -237,13 +260,24 @@ CREATE TABLE customer_b.users (
CREATE UNIQUE INDEX ON customer_b.users USING btree(email);
```
-Both of these DDL statements will be sent to shard one, because they explicitly refer to tables in schema `customer_b`, which is mapped to shard one.
+Alternatively, you can:
+
+```postgresql
+SET search_path TO customer_b, public;
+
+CREATE TABLE users (
+ id BIGSERIAL PRIMARY KEY,
+ email VARCHAR NOT NULL
+);
+```
+
+All of these DDL statements will be sent to shard one, because they explicitly refer to tables in schema `customer_b`, which is mapped to shard one in the configuration.
### Default routing
-If a schema isn't mapped to a shard number, PgDog will fallback to using other configured sharding functions. If none are set, the query will be sent to all shards.
+If a schema isn't mapped to a shard number, PgDog will fall back to using other configured sharding functions. If none are set, the query will be sent to all shards.
-To avoid this behavior and send all other queries to a particular shard, you can add a default schema mapping:
+To avoid this behavior and send all other queries to a particular shard, you can add a default schema mapping, for example:
=== "pgdog.toml"
```toml
@@ -260,9 +294,15 @@ To avoid this behavior and send all other queries to a particular shard, you can
This will send all queries that don't specify a schema or use a schema without a mapping to shard zero.
+### Why shard on schema
+
+Schema-based sharding is straightforward to deploy and use, since it has clear data separation and will almost always route requests as [direct-to-shard](query-routing.md) queries. That makes it 100% compatible with all PostgreSQL features, while allowing you to scale your database horizontally.
+
## Read more
{{ next_steps_links([
+ ("Direct-to-shard queries", "query-routing.md", "Route queries to a single shard whenever the sharding key is known."),
+ ("Cross-shard queries", "cross-shard-queries/index.md", "Run queries that span multiple shards."),
("COPY command", "cross-shard-queries/copy.md", "Bulk load data across shards with the COPY protocol."),
("Two-phase commit", "2pc.md", "Atomic transactions spanning multiple shards."),
]) }}
diff --git a/docs/images/cross-shard.png b/docs/images/cross-shard.png
index fa9cc07f..ff4d7f9a 100644
Binary files a/docs/images/cross-shard.png and b/docs/images/cross-shard.png differ
diff --git a/docs/images/fk.png b/docs/images/fk.png
index ca959481..083b9656 100644
Binary files a/docs/images/fk.png and b/docs/images/fk.png differ
diff --git a/docs/images/intro.png b/docs/images/intro.png
index 645ed2cd..a87eba99 100644
Binary files a/docs/images/intro.png and b/docs/images/intro.png differ
diff --git a/docs/index.md b/docs/index.md
index 263fba30..ba1171a7 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -32,7 +32,7 @@ Unlike those proxies, PgDog handles features that usually force a pooler to pin
PgDog is also multithreaded, so a single instance can serve many more clients while still relying on the same small number of Postgres connections.
-You can read more about how the connection pooler works [here](features/connection-pooler/transaction-mode.md).
+You can read more about how the connection pooler works [here](features/connection-pooler/index.md).
## Load balancer