After working a lot more with PostgreSQL, I happened to stumble upon a few things, mostly related to query optimisation, that at least for me weren't readily apparent. Please note that this is all with regard to PostgreSQL 9.3, but unless otherwise noted it should still be the case for 9.4 as well.
Slow DELETEs
Obviously, indexes are key to good query performance. If you're already using EXPLAIN, with or without ANALYZE, chances are good you know how and why your queries perform like they do. However, I encountered a problem with a DELETE query where the output from EXPLAIN was as expected, i.e. it was using the optimal plan, but the performance was still abysmal: a query to delete about 300 elements in bulk, like DELETE FROM table WHERE id IN (...);, was quite fast to remove the elements themselves (as tested from a separately running psql), but still took about three minutes(!) to complete.
In this scenario the table in question was about a million rows long, had a primary key index on the id column and was referenced by three other tables via foreign key constraints; no other triggers were running on any of the tables involved.
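To make the setup concrete, a rough sketch of the situation (table and column names are made up purely for illustration) could look like this:

    -- Hypothetical schema: one large table referenced by several
    -- child tables via foreign keys.
    CREATE TABLE items (
        id   bigserial PRIMARY KEY,
        data text
    );

    CREATE TABLE comments (
        id      bigserial PRIMARY KEY,
        item_id bigint NOT NULL REFERENCES items (id)
    );
    -- ... plus two more referencing tables along the same lines.
    -- Note that REFERENCES does not create an index on comments.item_id.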
The postgres server process in question wasn't doing anything interesting: it was basically taking up one core with semop calls, as reported by strace, while no other I/O was observed.
At that point I finally turned to #postgresql. Since the information I could share wasn't very helpful, there were no immediate replies, but one hint finally helped me fix the problem. It turns out that the (kinda obvious) solution was to check for missing indexes on the foreign key columns of the referencing tables. With two out of three indexes missing, I proceeded to CREATE INDEX CONCURRENTLY ...; and about five minutes later the DELETE was running in a few milliseconds.
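Against the hypothetical schema sketched above, the fix amounts to something like:

    -- Index the referencing column of each child table that was missing
    -- one, so the foreign key check on DELETE becomes an index lookup
    -- instead of a scan over the whole child table.
    CREATE INDEX CONCURRENTLY comments_item_id_idx ON comments (item_id);
    -- ... and the same for the other referencing table without an index.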
The really frustrating part here is that none of the available statistics guided me in the search; EXPLAIN ANALYZE apparently doesn't include the runtime for foreign key constraint checks, and they don't show up anywhere else either. In hindsight this is something I should've checked earlier (and from now on I will), but it's also a weakness of the analyze framework that it doesn't help the developer see the slowdowns involved in a situation like this.
Common Table Expressions
Refactoring queries to reuse results via WITH queries is absolutely worth it; in my case it improved the runtime of rather large queries by a large factor. This is something that can be seen from the query plan, so when you're using the same expressions twice, start looking into this and see if it helps both readability (don't repeat yourself) and performance.
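As a minimal sketch (with made-up tables): the CTE is computed once and then referenced in two places, instead of repeating the same subquery twice.

    -- Hypothetical example: the aggregate over orders is computed once
    -- in the CTE and reused both in the join and in the subquery.
    WITH totals AS (
        SELECT customer_id, sum(amount) AS total
        FROM   orders
        GROUP  BY customer_id
    )
    SELECT c.name, t.total
    FROM   customers c
    JOIN   totals t ON t.customer_id = c.id
    WHERE  t.total > (SELECT avg(total) FROM totals);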
JSON result construction
We need nested JSON output in a few cases. In 9.3 (there are some better functions available in 9.4) this means that a combination like row_to_json(row(json_agg(...))) was necessary to get proper nesting of aggregated sub-objects, as well as to wrap the end result in another object, because the output had to be formatted as a JSON object (with curly brackets) instead of a JSON array (with square brackets).
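As a sketch with made-up tables (orders and order_items), that pattern looks roughly like this:

    -- One JSON object per order, with the matching items aggregated
    -- into a nested array; row(...) provides the outer curly brackets.
    SELECT row_to_json(row(o.id, json_agg(i.product)))
    FROM   orders o
    JOIN   order_items i ON i.order_id = o.id
    GROUP  BY o.id;
    -- Produces rows like {"f1": 1, "f2": ["widget", "gadget"]};
    -- the generic f1/f2 keys come from the anonymous row constructor.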
Technicalities aside, the JSON support is pretty good, and since the initial code was written I've discovered that in many cases we don't actually have multiple results to aggregate with json_agg, so not using that method again improves performance significantly.
That means that instead of something like

SELECT row_to_json(row(json_agg(...))) FROM ... JOIN ... GROUP BY id;

where the input to json_agg is a single row from the JOIN, we can write

SELECT row_to_json(row(ARRAY[...])) FROM ... JOIN ...;

which, if you examine the output of EXPLAIN, avoids the sort caused by the GROUP BY clause. The convenience of json_agg here doesn't really justify the significant slowdown caused by the aggregation function.
Note that the array constructed via ARRAY[] is properly converted to JSON, so the end result is again proper JSON.
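Applied to the hypothetical orders example from above, the rewrite would look roughly like this:

    -- Same shape of output, but the nested array is built directly with
    -- an ARRAY constructor; no aggregate, no GROUP BY, no sort step.
    SELECT row_to_json(row(o.id, ARRAY[i.product]))
    FROM   orders o
    JOIN   order_items i ON i.order_id = o.id;
    -- Only valid when the JOIN yields exactly one row per order;
    -- the single-element array still comes out as proper JSON.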