When the optimizer determines that parallel query is the fastest execution strategy for a particular query, it will create a query plan which includes a Gather node. Here is a simple example:
EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Gather  (cost=1000.00..217018.43 rows=1 width=97)
   Workers Planned: 2
   ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..216018.33 rows=1 width=97)
         Filter: (filler ~~ '%x%'::text)
(4 rows)
In all cases, the Gather
node will have exactly one
child plan, which is the portion of the plan that will be executed in
parallel. If the Gather
node is at the very top of the plan
tree, then the entire query will execute in parallel. If it is somewhere
else in the plan tree, then only the portion of the plan below it will run
in parallel. In the example above, the query accesses only one table, so
there is only one plan node other than the Gather
node itself;
since that plan node is a child of the Gather
node, it will
run in parallel.
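For contrast, here is a sketch of a plan in which the Gather node is not at the very top of the plan tree. The shape shown (a Finalize Aggregate above the Gather node, a Partial Aggregate below it) is typical of a parallel aggregate; the cost, row, and width estimates are omitted because they depend entirely on the data and configuration:

EXPLAIN SELECT count(*) FROM pgbench_accounts WHERE filler LIKE '%x%';

 Finalize Aggregate
   ->  Gather
         Workers Planned: 2
         ->  Partial Aggregate
               ->  Parallel Seq Scan on pgbench_accounts
                     Filter: (filler ~~ '%x%'::text)

In a plan of this shape, the Finalize Aggregate step runs only in the leader, while everything below the Gather node runs in parallel.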
Using EXPLAIN, you can see the number of
workers chosen by the planner. When the Gather
node is reached
during query execution, the process which is implementing the user's
session will request a number of background
worker processes equal to the number
of workers chosen by the planner. The total number of background
workers that can exist at any one time is limited by both
max_worker_processes and
max_parallel_workers, so it is possible for a
parallel query to run with fewer workers than planned, or even with
no workers at all. The optimal plan may depend on the number of workers
that are available, so this can result in poor query performance. If this
occurrence is frequent, consider increasing
max_worker_processes
and max_parallel_workers
so that more workers can be run simultaneously, or alternatively reducing
max_parallel_workers_per_gather so that the planner
requests fewer workers.
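All three limits are ordinary configuration parameters, so they can be inspected from any session and, given sufficient privileges, changed with ALTER SYSTEM or in postgresql.conf. The sketch below assumes the built-in defaults and superuser access; note in particular that a change to max_worker_processes only takes effect after a server restart:

-- Inspect the current limits (the defaults noted in comments may differ per installation).
SHOW max_worker_processes;            -- cluster-wide cap on background workers (default 8)
SHOW max_parallel_workers;            -- cap on workers used by parallel queries (default 8)
SHOW max_parallel_workers_per_gather; -- workers the planner may request per Gather node (default 2)

-- Raise the cluster-wide limits (max_worker_processes requires a restart to take effect).
ALTER SYSTEM SET max_worker_processes = 16;
ALTER SYSTEM SET max_parallel_workers = 16;

-- Or, per session, ask the planner to request fewer workers per Gather node.
SET max_parallel_workers_per_gather = 1;

-- EXPLAIN ANALYZE reports both "Workers Planned" and "Workers Launched",
-- which shows whether a query actually got the workers the planner asked for.
EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';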
Every background worker process which is successfully started for a given
parallel query will execute the portion of the plan below
the Gather
node. The leader will also execute that portion
of the plan, but it has an additional responsibility: it must also read
all of the tuples generated by the workers. When the parallel portion of
the plan generates only a small number of tuples, the leader will often
behave very much like an additional worker, speeding up query execution.
Conversely, when the parallel portion of the plan generates a large number
of tuples, the leader may be almost entirely occupied with reading the
tuples generated by the workers and performing any further processing
steps which are required by plan nodes above the level of the
Gather
node. In such cases, the leader will do very
little of the work of executing the parallel portion of the plan.
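On server versions that provide the parallel_leader_participation parameter (an assumption; it is not described in this section), the leader's involvement in the parallel portion of the plan can be switched off for experimentation, which makes the division of labour described above easy to observe:

-- Assumes a version that supports parallel_leader_participation (on by default).
SET parallel_leader_participation = off;  -- leader only reads the workers' tuples
EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
RESET parallel_leader_participation;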