The indices are indeed missing, probably forgotten during schema update. Adding them now is of course trivial, but in the end futile. For the indices to have possitive effect, planner costs would have to be altered. This was done on mentat-alt experimentally, but it breaks the fragile equilibrium, breaking many other cases and leading to suboptimal plans there.
Based on the work on this issue, #6251 was devised as the best course of action, the work should continue there and this issue should be abandoned. Reassigning to Pavel for consideration.
#6251 was devised as the best course of action, the work should continue there and this issue should be abandoned. Reassigning to Pavel for consideration.
Okay, let's see what comes up from #6251 (it might be that different set of indices will solve/mitigate the issue, or we pursue different form, or we drop the idea - and columns - altoghether).
With PG13, this might actually be finally possible. The multi-column statistics are supposed to be extended for the ANY operator, leading to better plans with sensible (compatible with other queries) planner costs. Details here: https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development, a long discussion about three separate improvements, the first one is of interest in this case.
The database on mentat-dev was updated to PGv13beta1, there are actually testing debian packages available from the postgres apt repository. The events table with all of its configuration was migrated using @pg_dump@/pg_restore from mentat-alt. The difference is that mentat-dev is of a much lower perforance budget and also the table does not fit into the RAM there.
The testing shows, that the improvements in PG13 are helping tremendously with this workload. Out of the box (with events table and it's settings transferred), the querying works as designed:
The absolute run-times are not comparable. The tests show that it is best to use the extended query form with aggregation comparison for prefixes longer than /16. For shorter prefixes, the original form offers better performance (as matching events are quite abundant).
All in all, this seems like a viable approach with PG13, as was intended originally.
Did you test with full data (not fitting into RAM), or only part of it (fitting into RAM)? Because if you can get queries under 2 secs on full data on this iron, I'm curious what final times on our boxes will be.
Anyway, seems we can postpone more convoluted solution attempts for now.
Did you test with full data (not fitting into RAM), or only part of it (fitting into RAM)? Because if you can get queries under 2 secs on full data on this iron, I'm curious what final times on our boxes will be.
This was with full data, the events table as exported from mentat-alt is 39GB, 53GB with indices, the machine has 32GB of RAM. Not that it matters, this is from cache and the indices and corresponding pages can fit into RAM.
The run times on mentat-hub for such a query were about 250ms if I remember correctly.