How to Find and Stop Running Queries on PostgreSQL

2022-06-20 Owl be stopping those pesky queries!

Your PostgreSQL server is seizing up, with some naughty queries consuming too many resources or blocking others. Don’t panic! You can stop those problem queries and stabilize your system.

In this post we’ll look at stopping queries via SQL, techniques for finding problematic queries, and the occasionally useful ability to cancel via operating system tools.

Stopping Queries via SQL in Two Steps

Here’s the basic process to find and stop a query. Note you’ll need to connect as a user with adequate permissions to do so, such as an admin account.

1. Find the pid

PostgreSQL creates one process per connection, and it identifies each process with its operating system process ID, or pid. In order to cancel a query, you need to know the pid for the connection it’s running on.

One way to find this out is with the pg_stat_activity view, which provides information about the live queries. For example, try this query:

SELECT pid, state, backend_start, substr(query, 0, 100) q
FROM pg_stat_activity
WHERE backend_type = 'client backend'
ORDER BY backend_start;

The substr call limits the displayed SQL to 100 characters, to avoid long queries taking up a lot of screen space. The backend_type filter avoids showing background server processes, such as the autovacuum launcher. Sorting by backend_start shows the longest running connections first, which often shows problematic long-running transactions.

Here’s an example of running this on my development server:

stagingi_inventev=# SELECT pid, state, backend_start, substr(query, 0, 100) q
FROM pg_stat_activity
WHERE backend_type = 'client backend'
ORDER BY backend_start;
 pid | state  |         backend_start         |                                                  q
-----+--------+-------------------------------+-----------------------------------------------------------------------------------------------------
  73 | active | 2022-06-17 18:57:15.850803+00 | SELECT pid, state, backend_start, substr(query, 0, 100) q FROM pg_stat_activity WHERE backend_type
  77 | idle   | 2022-06-17 18:57:33.567574+00 | SELECT /* long-running query */ pg_sleep(10000);
(2 rows)

Another way to find pid’s is through the operating system, using tools like htop. This is useful if you have access to your PostgreSQL server and want to find queries that are consuming the most resources.

Okay, you have a pid, now let’s stop that query!

2. Terminate, or Cancel, the Process

PostgreSQL has two query-stopping functions, with differences discussed below.

The “harsh” one is pg_terminate_backend, which you can use like:

SELECT pg_terminate_backend(pid);

I default to using this one, reasoning explained below.

The “kinder” function is pg_cancel_backend, which you can use like:

SELECT pg_cancel_backend(pid);

When using either, replace pid with the pid you found in step 1. For example, to terminate the long-running query from above:

SELECT pg_terminate_backend(77);

And boom, it’s gone.

There are two differences that make pg_terminate_backend “harsher”.

First, pg_terminate_backend completely stops the process, causing the connection to close. This rolls back any open transaction on the connection, freeing all locks it holds.

In contrast, pg_cancel_backend interrupts only the running query, leaving the connection open. The current transaction or savepoint is aborted. Thus, if the connection is using savepoints, it can still keep the surrounding transaction open, with pending data changes and locks.

Second, pg_terminate_backend applies immediately*, whilst pg_cancel_backend can be deferred at certain points in the backend process lifecycle. So sometimes, you might run pg_cancel_backend and see nothing happen for a while. Specifically, this can occur when the process is reading input from the client, such as an incoming query. The backend process defers handling the cancellation until all the input has been read, since otherwise the connection could not be left open and functional.

(*Ackshuallly, pg_terminate_backend can also fail to apply immediately, but it’s much less likely. Small sections of the code also defer handling it. In theory, these sections don’t spend much time executing, but never say never. If you’re interested in diving into the source, start with ProcessInterrupts in src/backend/tcop/postgres.c.)

I default to using pg_terminate_backend. Normally, when I need to stop a query, I want to stop the whole application process that started it, roll back all data changes, and release all locks. When using pg_cancel_backend, there’s a risk that the application’s error-handling code will rollback the transaction/savepoint, and carry on to run similar queries. And it may continue to hold problematic locks.

Discovering Vexacious Running Queries

If you want to stop more than one query, it can be onerous to run pg_terminate_backend() on them one-by-one. You can use SQL to find the bad queries and generate the terminate statements, in order to get rid of them all with ease. Here are a couple of examples.

Queries Blocking a Particular Process

If you’re executing an ALTER TABLE and find it blocked, waiting for a table lock, you may want to terminate the connections holding locks on that table. This will allow the ALTER TABLE to proceed.

For example, I recently worked on an application with a few long-running transactions that blocked a database migration. These long-running queries were safe to terminate as the responsible application process would re-run later and fill in any gaps.

You can find the blocked pid of ALTER TABLE with a query against pg_stat_activity like so:

SELECT pid, state, backend_start, substr(query, 0, 100) q
FROM pg_stat_activity
WHERE backend_type = 'client backend' AND query LIKE 'ALTER TABLE%'
ORDER BY backend_start;

With the blocked pid, you can use this query with pg_blocking_pids to generate SQL to terminate the blocking processes:

SELECT 'SELECT pg_terminate_backend(' || pid || ');'
FROM pg_stat_activity
WHERE pid = ANY (pg_blocking_pids(blockedpid));

To run this query, replace blockedpid with the pid of the blocked process. Then copy-paste the output lines and run them.

Here’s an example session using these queries to unblock an ALTER TABLE:

mydb=# SELECT pid, state, backend_start, substr(query, 0, 100) q
  FROM pg_stat_activity
  WHERE backend_type = 'client backend' AND query LIKE 'ALTER TABLE%'
  ORDER BY backend_start;
 pid | state  |         backend_start         |                       q
-----+--------+-------------------------------+------------------------------------------------
 613 | active | 2022-06-17 20:25:06.604326+00 | ALTER TABLE auth_user ADD COLUMN age int null;
(1 row)

mydb=# SELECT 'SELECT pg_terminate_backend(' || pid || ');'
  FROM pg_stat_activity
  WHERE pid = ANY (pg_blocking_pids(613));
             ?column?
----------------------------------
 SELECT pg_terminate_backend(77);
(1 row)

mydb=# SELECT pg_terminate_backend(77);
 pg_terminate_backend
----------------------
 t
(1 row)

mydb=# SELECT pid, state, backend_start, substr(query, 0, 100) q
  FROM pg_stat_activity
  WHERE backend_type = 'client backend' AND query LIKE 'ALTER TABLE%'
  ORDER BY backend_start;
 pid | state |         backend_start         |                       q
-----+-------+-------------------------------+------------------------------------------------
 613 | idle  | 2022-06-17 20:25:06.604326+00 | ALTER TABLE auth_user ADD COLUMN age int null;
(1 row)

Cowabunga!

Queries Against a Particular Table

Sometimes you may just want to terminate all queries running against a given table. This can be appropriate to stop overload from a particular misbehaving application process.

This query will generate SQL to terminate all running queries that look like they’re using a particular table called auth_user:

SELECT 'SELECT pg_terminate_backend(' || pid || ');'
FROM pg_stat_activity
WHERE query LIKE '%auth_user%'
  AND pid != pg_backend_pid();

To use this query, change the matched table name in LIKE '%auth_user%' before running it. Then copy-paste the output lines and run them.

Matching query with LIKE is a bit blunt, as it has the chance for false positives, but it is simple. The comparison with pg_backend_pid is required to avoid matching the current connection.

Here’s an example of using this query to terminate all queries to a table called library_book:

mydb=# SELECT 'SELECT pg_terminate_backend(' || pid || ');'
  FROM pg_stat_activity
  WHERE query LIKE '%library_book%'
    AND pid != pg_backend_pid();
           ?column?
-------------------------------
 SELECT pg_terminate_backend(123);
 SELECT pg_terminate_backend(124);
(1 row)

mydb=# SELECT pg_terminate_backend(123);
SELECT pg_terminate_backend(124);
 pg_terminate_backend
-------------------
 t
(1 row)

 pg_terminate_backend
-------------------
 t
(1 row)

🔥🔥🔥

Connections Open Longer Than N Seconds

A final example: how to filter out connections to those open longer than N seconds. This is quite the blunt hammer, but you may try it in emergencies.

SELECT 'SELECT pg_terminate_backend(' || pid || ');'
FROM pg_stat_activity
WHERE backend_type = 'client backend'
      AND pid != pg_backend_pid()
      AND backend_start < NOW() - '10 seconds'::interval;

Adjust '10 seconds' as appropriate.

For example:

mydb=# SELECT 'SELECT pg_terminate_backend(' || pid || ');'
FROM pg_stat_activity
WHERE backend_type = 'client backend'
      AND backend_start < now() - '10 seconds'::interval
      AND pid != pg_backend_pid();
              ?column?
------------------------------------
 SELECT pg_terminate_backend(2675);
 SELECT pg_terminate_backend(2676);
(2 rows)

mydb=# SELECT pg_terminate_backend(2675);
SELECT pg_terminate_backend(2676);
 pg_terminate_backend
----------------------
 t
(1 row)

WARNING:  PID 2676 is not a PostgreSQL server process
 pg_terminate_backend
----------------------
 f
(1 row)

😅 Note in this case it seems pid 2676 completed between generating the SQL and running it.

Stopping Queries Through the Operating System

If you have access to your PostgreSQL server, you can also stop queries by signalling the relevant operating system processes. This requires you to have shell access to your PostgreSQL server, which isn’t so common these days with the rise of managed database platforms.

You can use a tool like htop to check the running backend processes. You may be able to find the problematic ones based on high CPU or memory usage.

Backend processes appear as children of the main PostgreSQL server process. Be careful you pick the right pids since if you kill the main process, the whole PostgreSQL server shuts down.

On Linux/macOS/Unixes

On Unix OSs, you can terminate a running query by sending its process SIGTERM (the terminate signal) with kill:

$ kill -SIGTERM pid

Replace pid with the backend process pid.

And you can cancel a process by sending SIGINT (the interrupt signal):

$ kill -SIGINT pid

🫡

On Windows

Windows does not have a kill command, so PostgreSQL provides the pg_ctl tool to send signals to processes. You can use it to terminate a query by sending the terminate signal like so:

$ pg_ctl kill TERM pid

Replace pid with the target backend pid.

You can use it to cancel a query by sending the interrupt signal like so:

$ pg_ctl kill INT pid

…bye bye query!

Killing Query Processes

Using the OS, we can also stop a process immediately, guaranteed, by “killing” it. This is where the “kill” utility’s name comes from. (And its dated, violent terminology.) Doing so can be useful in rare situations where a process carries on consuming resources after termination, potentially due to a bug in PostgreSQL.

This action is, however, very drastic. Immediate termination stops the process instantly, without any chance to do cleanup. PostgreSQL’s design should prevent data loss (of committed rows), but you may miss other incidental data like log messages.

To kill a process, send it the KILL signal.

On Unixes:

$ kill -KILL pid

On Windows:

$ pg_ctl kill KILL pid

💥 Kaboom!

Fin

May you rarely need to use this knowledge,

—Adam

😸😸😸 Check out my new book on using GitHub effectively, Boost Your GitHub DX! 😸😸😸

One summary email a week, no spam, I pinky promise.

Tags: postgresql