JSONB data occupy a lot of precious RAM (even though compressed by PostgreSQL), especially because they have to be fetched into RAM together with the rest of the db tuples during table scan and index scan queries (which we unfortunately rely on).
After usage of JSONB data is minimized within the Mentat code (see #4274), the events table can be split and the 'extend' part of the API modified accordingly. Only small parts of the full-length jsonb IDEA data will then be hot in memory: presumably only single events viewed by the user in the Hawat 'show' interface, plus at most hundreds of events from reporting, which are usually hours old (days at most, if relapse kicks in).
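A possible target schema for the split could look as follows. This is only a sketch; the table and column names (`events_json`, `id`, `event`) are assumptions, not the actual Mentat schema, and the payload is kept as `bytea` to match the current `events.event` column:

```sql
-- Metadata stays in the existing events table; only the bulky payload moves out.
-- The full IDEA payload lives in its own table and is fetched only when a
-- single event is shown in Hawat or pulled by the reporter.
CREATE TABLE events_json (
    id    text PRIMARY KEY REFERENCES events(id),
    event bytea NOT NULL  -- compressed IDEA JSON, as in the current events.event
);
```

Keeping the payload table keyed by the same id duplicates only the ID index (accounted for in the statistics below), while table scans over metadata no longer drag the payload through the buffer cache.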
From a teleconference brainstorming with Radko: the upgrade of the db and code could be done with relatively short downtime (the longest step will be CLUSTER, which can be postponed to off-hours).

1. Create the new empty table.
2. Enable the API 'extend' code to deal with jsonb in the separate table, but fall back to jsonb in the original table (to be able to cope with the transition period).
3. Stop incoming data.
4. Run the last run of the reporter and statistician.
5. Alter the old table to allow NULL in its jsonb column.
6. Set up the new storage code (splitting into a metadata table and a jsonb data table).
7. Enable incoming data (the jsonb column in the metadata table now gets NULLs).
8. Enable the reporter and statistician.
9. Repeat: select old jsonb into new jsonb LIMIT XXX; until done (the limit is to prevent lock congestion).
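The backfill step above can be sketched as the following statement, repeated until it inserts 0 rows. Table and column names and the batch size are assumptions for illustration:

```sql
-- Copy payloads into the new table in small batches to avoid lock congestion;
-- the NOT EXISTS anti-join makes each run pick up only not-yet-copied rows,
-- so the statement is safe to re-run until it reports "INSERT 0 0".
INSERT INTO events_json (id, event)
SELECT e.id, e.event
FROM events e
WHERE e.event IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM events_json j WHERE j.id = e.id)
LIMIT 10000;
```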
Statistics gathering was performed to better understand this. The following numbers are for mentat-hub on 2018-10-24.
Current table size: 171GB
Number of rows: 125.3M (125 314 536)
Aggregate size of event BYTEA: 110GB (118 092 392 057) - obtained by SELECT sum(octet_length(event)) FROM events;
Aggregate indices size: 34.8GB (note: REINDEX has not been run for a long time, so the indices are considerably bloated)
ID index size: 11GB (this would be duplicated for the data table)
That leaves us with (171 - 110) + 34.8 + 11 = 106.8GB per 125.3M events, i.e. roughly 0.85GB per 1M rows. With the current physical memory and long-term memory load, this would get us to at least 260M events fully cached in RAM. A considerable improvement.
The VACUUM and CLUSTER commands are very important as those are the points where the events table is compacted. They should be part of the automatic migration.
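The compaction commands might be run as sketched below; the index name `events_pkey` is an assumption, and the actual clustering index should be whichever one the queries order by:

```sql
-- CLUSTER rewrites the table in index order, reclaiming dead space left by
-- the migration; it takes an exclusive lock, hence the off-hours scheduling.
CLUSTER events USING events_pkey;
-- Refresh planner statistics and visibility information after the rewrite.
VACUUM ANALYZE events;
```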