Mautic database is huge, can I manually delete old records?

Your software
My Mautic version is: 3.2.0
My PHP version is: 7.3.26
My Database type and version is: MySQL v8

Your problem
My Mautic database is pretty huge at the moment, it’s 56 GB total. My biggest tables are page_hits (24 GB), email_stats (19 GB), and campaign_lead_event_log (6GB).

There are leads in my database that are from 3-4 years ago, that are not involved in any campaigns, and are essentially contacts I will never be adding into any future campaigns or be contacting through Mautic. This is typically the case with my entire database — once they are signed up, they are contacted over 3-4 weeks and then never contacted again. I’m thinking I can just delete older contacts.

To improve performance, I’m wondering if I can just do a standard ‘delete from’ query and delete all records that are older than 6 months old, from those two tables (page_hits and email_stats).

My questions are:

  • Would that cause any consistency issues?
  • Would deleting these records improve performance in any way?
  • Is there anything I should be aware of? Or any alternatives to increase performance?

Btw, I’ve been using Mautic for 3-4 years now, and apart from being a bit slow when dealing with a large number of contacts, it’s been a total gamechanger in terms of saving me money. Thank you!

1 Like

Most importantly: make sure that for every lead the youngest entry in campaign_lead_event_log does NOT get deleted!

So if I just delete records from page_hits and email_stats (the biggest two tables), but I leave the campaign_lead_event_log table alone, I should be all good?

Thank you! :slight_smile:

@instamaker There is a cronjob which you can add which purges data older than X days.
It might make sense if you run this command to keep the database at a decent level.

View the docs here, look for Clean up old data

M

3 Likes

Yes the cron job helps a lot of cleaning out old data. If you want you can even clean out inactive leads. The page hit records you can delete without huge consequences, unless you really need this data. I deleted once the email stats, but repented afterwards, since I lost open rate statistics, indicators of inactive leads and a few other things.

Hey Instamaker

You could do this commands on the shell (backup + on your own risk!)

We do them regularly but I could give you no garanty.

There is a new plugin which does some campaign_lead_event_log cleanups:
https://github.com/Leuchtfeuer/mautic-housekeeping-bundle

**Update: **
Sorry made a mistake in the audit log statement. Fixed it in the gist.

Greetings
Sebastian

7 Likes

I am at your debt master.

2 Likes

FYI we just set this repo to “private” for the time being, because it is a bit too aggressive.
See my comment above:

Most importantly: make sure that for every lead the youngest entry in campaign_lead_event_log does NOT get deleted!

Read: The next version of the plugin will honor that and leave campaign status intact even for very old contacts. Please be patient :slight_smile:

3 Likes

Thanks for the hint ekke! I will try to add this to my sql query!

Good Morning everybody

Regarding the comment from ekke:

Most importantly: make sure that for every lead the youngest entry in campaign_lead_event_log does NOT get deleted!

I tried to adapt my sql query

My existing query was like this:
DELETE FROM lead_event_log where bundle="lead" and object="import" AND date_added < (NOW() - INTERVAL 30 DAY);

I try to build a query which selects for every lead_id, campaign_id combination the latest entry:

# select for every combination of lead_id and campaign id the last event log id 
select max(id),lead_id, campaign_id, count(id) FROM campaign_lead_event_log  group by lead_id, campaign_id order by lead_id

(i added lead_id, campaign_id and the number of items for debugging reasons in the select statement)

If you combine this with my old query you would get something like this:

# select all entries with a triggered date oder than 30 days and which are not the last entry in the log per leadid and campaign 

select * FROM campaign_lead_event_log 
	WHERE date_triggered < (NOW() - INTERVAL 30 DAY) 
	and campaign_lead_event_log.id not in ( select max(id) FROM campaign_lead_event_log  group by lead_id, campaign_id )

Of course this query needs now some debugging love.

@ekke What do you think? Is this somewhere near your own solution of the plugin you build?

Greetings
Sebastian

2 Likes

Looking forward to the update on the plugin too, thank you. Please let us know when you think you’ll have an update ready :slight_smile:

Hello @ekke, I’m very interesting by your plugin.

Pierre

Looking forward to try your plugins … give us a post when it’s released!

Hi ekke — I hope you’re well. I wanted to check if you had an update on the mautic housekeeping bundle? Would you be able to keep the plugin open for use?

My mautic queries are massively hogging up the database and I was wondering if there was a way to clean up campaign_lead_event_log. You mentioned that I should keep the youngest entry in campaign_lead_event_log. Would you happen to know what the query might end up looking like?

Hello,
Here is an article about making your DB smaller documented with screenshots and commands.
Good luck!

4 Likes

Hello there. Was this plugin ever made public? I can’t seem to find it.

Thanks,

Hi there, pls send me your github handle and I can let you in :slight_smile:

Thanks,

1 Like

What it take to be invited to try the plugin?

2 more beta users to give :wink:
→ Do send me your Github handle, @techbill