How to Check Multiple Tables Are Empty in SQL2020-06-08
I encountered this problem recently for a Django ticket for speeding up part of Django’s test runner.
We have a list of tables to truncate.
How can we figure out which are already empty, so we can skip issuing a somewhat costly
TRUNCATE TABLE on them?
You might think the database could skip
TRUNCATE TABLE on empty tables, but it normally does work beyond removing the rows such as resetting auto-increment counters.
TRUNCATE TABLE is even equivalent to
DROP TABLE and
CREATE TABLE, which requires removing and creating the table file on disk.
You might think the easiest way to find out which tables are empty would be to count the rows in each table:
This works, but it’s slow.
Database systems don’t keep a counter of the number of rows in each table, because the overhead to keep it consistent with transactions would be too high.
COUNT(*) means touching every individual row, obtaining read locks on each of them.
Since we only care about seeing if the table is empty, if we find at least one row we know it’s not. Therefore we can instead query for a single row:
This query returns an empty result set if the table is empty, or a single ID an arbitrary row otherwise.
We don’t care about the ID itself. We can therefore select a static value instead:
This query again returns an empty result set if the table is empty, or
1 if not.
We can use this technique against multiple tables at once using SQL’s
By selecting the table names as our static values, under the same column name, the combined result set will be a list of the non-empty tables:
UNION many such queries together - databases tend to have a limit on query length or opened tables, but it’s quite high.
And this is the fastest way I can think of for checking which tables in a list are empty, accurately.
What about INFORMATION_SCHEMA?
I should mention
INFORMATION_SCHEMA.TABLES, the information schema metadata table-of-tables.
In theory this is the right place to ask meta-level questions about your tables, including which ones are empty.
On MariaDB/MySQL, this includes a
table_rows column that you could use to check which tables are empty.
Unfortunately it’s only an estimate, so you might get false positives or negatives.
Other databases have similar estimates, such as SQL Server’s
sys.partitions.rows, but I believe they all carry risks of false positives and negatives.
Hope this helps,
Are your Django project's tests slow? Read Speed Up Your Django Tests now!
One summary email a week, no spam, I pinky promise.
© 2020 All rights reserved.