Mettre à jour PostgreSQL pour améliorer les performances

create table geants( idg serial primary key, genre char(1), taille smallint, masse smallint, actif boolean, devise varchar(128), pw smallint, heureux boolean, couleur varchar(8), veteran boolean, clan smallint, gabarit varchar(8), revenu integer, pm smallint, berserk boolean, tutelaire smallint, ere varchar(10), cyclope boolean); CREATE TABLE WITH serie(i, r1) AS (SELECT generate_series(1,10000000,1), random()) insert into geants(genre, taille, masse, actif, devise, pw, heureux, couleur, veteran, clan, gabarit, revenu, pm, berserk, tutelaire, ere, cyclope) select case when random() < 0.45 then 'M' else 'F' end, 200 + (trunc(random() * 200 + 1)), 300 + (trunc(random() * 200 + 1)), case when random() < 0.01 then false when random() > 0.5 and random() < 0.99 then false else true end, upper(md5(random()::text)), (trunc(random()*100 + 1)), case when random() < 0.1 then false else true end, case when r1 <= 0.29998 then 'GRIS' when r1 <= 0.59998 then 'NOIR' else 'BLEU' end, case when random() < 0.9 then false else true end, (trunc(random()*1000 + 1)), case when r1 <= 0.29999 then 'PETIT' when r1 <= 0.59999 then 'MOYEN' else 'GRAND' end, (trunc(random()*1000000 + 1)), (trunc(random()*10 + 1)), case when r1 <= 0.3 then true when r1 <= 0.6 then false else null end, (trunc(random()*10 + 1)), case when r1 <= 0.30001 then 'TAUREAU' when r1 <= 0.60001 then 'LICORNE' else 'DRAGON' end, case when r1 <= 0.30002 then true when r1 <= 0.60002 then false else null end from serie; INSERT 0 10000000 create table lancers(dtl timestamp, idg integer, perf integer); CREATE TABLE WITH serie(i) AS (SELECT generate_series(25000000,1,-1)) insert into lancers(dtl, idg, perf) select current_timestamp - (i || ' minutes')::interval, trunc(random() * 10000000 + 1), case when random() <= 0.001 then 50000 + trunc(random() * 50000 + 1) else trunc(random() * 50000 + 1) end from serie; INSERT 0 25000000 create index on lancers(perf, dtl); CREATE INDEX analyze geants; ANALYZE analyze lancers; ANALYZE \d+ Liste des relations Schéma | Nom | Type | Propriétaire | Taille | Description --------+----------------+----------+--------------+------------+------------- public | geants | table | postgres | 1116 MB | public | geants_idg_seq | séquence | postgres | 8192 bytes | public | lancers | table | postgres | 1056 MB | (3 lignes)

select version(); version ---------------------------------------------------------------------------------------------------------------------------------------- PostgreSQL 14.8 (Debian 14.8-1.pgdg+~20230519.2215.gf8320cc) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit (1 ligne) explain select idg, perf from (select idg, perf, rank() over (order by perf desc) from lancers) as lancerst where rank <= 5 order by perf desc; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Subquery Scan on lancerst (cost=0.56..1988101.52 rows=8333387 width=8) Filter: (lancerst.rank <= 5) -> WindowAgg (cost=0.56..1675599.52 rows=25000160 width=16) -> Index Scan Backward using lancers_perf_dtl_idx on lancers (cost=0.56..1300597.12 rows=25000160 width=8) JIT: Functions: 7 Options: Inlining true, Optimization true, Expressions true, Deforming true (7 lignes) select idg, perf from (select idg, perf, rank() over (order by perf desc) from lancers) as lancerst where rank <= 5 order by perf desc; idg | perf ---------+-------- 9099587 | 100000 3578666 | 100000 2112692 | 99998 8481544 | 99998 9029854 | 99996 (5 lignes) Durée : 51879,843 ms (00:51,880)

Toujours plus de 50 secondes, PostgreSQL 14 n'avait rien apporté par rapport à PostgreSQL 13.
Le même test avec PostreSQL 15 :

pg_upgradecluster 14 -v 15 main select version(); version ---------------------------------------------------------------------------------------------------------------------------------------- PostgreSQL 15.3 (Debian 15.3-1.pgdg+~20230519.2210.g4729d1e) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit (1 ligne) explain select idg, perf from (select idg, perf, rank() over (order by perf desc) from lancers) as lancerst where rank <= 5 order by perf desc; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Subquery Scan on lancerst (cost=0.56..2074167.74 rows=25000160 width=8) -> WindowAgg (cost=0.56..1824166.14 rows=25000160 width=16) Run Condition: (rank() OVER (?) <= 5) -> Index Scan Backward using lancers_perf_dtl_idx on lancers (cost=0.56..1449163.74 rows=25000160 width=8) JIT: Functions: 7 Options: Inlining true, Optimization true, Expressions true, Deforming true (7 lignes) select idg, perf from (select idg, perf, rank() over (order by perf desc) from lancers) as lancerst where rank <= 5 order by perf desc; idg | perf ---------+-------- 5119689 | 100000 6151298 | 99997 8200569 | 99996 8343259 | 99993 4922338 | 99992 (5 lignes) Temps : 51,302 ms

51 millisecondes, le temps d'exécution est divisé par 1000. C'est une amélioration signée David Rowley qui fait toute la différence "Improve the performance of window functions that use row_number(), rank(), dense_rank() and count() ". Le travail réellement effectué est proportionnel au N de la requête top-N via la partie "Run Condition" du plan d'exécution.
Mon exemple est loin d'être la seule utilisation possible. Cette amélioration ouvre notamment la voie à des fonctionnalités de pagination efficaces basées sur row_number(), comme l'a relevé Markus Winand.

Conclusion

Mettre à jour PostgreSQL permet de bénéficier de nouvelles syntaxes efficaces mais aussi d'améliorer les performances des requêtes sans modifier les syntaxes existantes.

Mettre à jour PostgreSQL pour améliorer les performances

Mise à jour : 21/05/2023