Your software
My Mautic version is: 3.2.0
My PHP version is: 7.3.26
My Database type and version is: MySQL v8
Your problem
My Mautic database is pretty huge at the moment, it’s 56 GB total. My biggest tables are page_hits (24 GB), email_stats (19 GB), and campaign_lead_event_log (6GB).
There are leads in my database that are from 3-4 years ago, that are not involved in any campaigns, and are essentially contacts I will never be adding into any future campaigns or be contacting through Mautic. This is typically the case with my entire database — once they are signed up, they are contacted over 3-4 weeks and then never contacted again. I’m thinking I can just delete older contacts.
To improve performance, I’m wondering if I can just do a standard ‘delete from’ query and delete all records that are older than 6 months old, from those two tables (page_hits and email_stats).
My questions are:
Would that cause any consistency issues?
Would deleting these records improve performance in any way?
Is there anything I should be aware of? Or any alternatives to increase performance?
Btw, I’ve been using Mautic for 3-4 years now, and apart from being a bit slow when dealing with a large number of contacts, it’s been a total gamechanger in terms of saving me money. Thank you!
So if I just delete records from page_hits and email_stats (the biggest two tables), but I leave the campaign_lead_event_log table alone, I should be all good?
@instamaker There is a cronjob which you can add which purges data older than X days.
It might make sense if you run this command to keep the database at a decent level.
Yes the cron job helps a lot of cleaning out old data. If you want you can even clean out inactive leads. The page hit records you can delete without huge consequences, unless you really need this data. I deleted once the email stats, but repented afterwards, since I lost open rate statistics, indicators of inactive leads and a few other things.
Most importantly: make sure that for every lead the youngest entry in campaign_lead_event_log does NOT get deleted!
I tried to adapt my sql query
My existing query was like this: DELETE FROM lead_event_log where bundle="lead" and object="import" AND date_added < (NOW() - INTERVAL 30 DAY);
I try to build a query which selects for every lead_id, campaign_id combination the latest entry:
# select for every combination of lead_id and campaign id the last event log id
select max(id),lead_id, campaign_id, count(id) FROM campaign_lead_event_log group by lead_id, campaign_id order by lead_id
(i added lead_id, campaign_id and the number of items for debugging reasons in the select statement)
If you combine this with my old query you would get something like this:
# select all entries with a triggered date oder than 30 days and which are not the last entry in the log per leadid and campaign
select * FROM campaign_lead_event_log
WHERE date_triggered < (NOW() - INTERVAL 30 DAY)
and campaign_lead_event_log.id not in ( select max(id) FROM campaign_lead_event_log group by lead_id, campaign_id )
Of course this query needs now some debugging love.
@ekke What do you think? Is this somewhere near your own solution of the plugin you build?