Always long running queries against email_stats (5+ hours)

My Mautic version is: 4.2.0
My PHP version is: 7.4
My Database type and version is: MySQL 5.7 (percona)

My problem is: We’ve been running Mautic for about 4 or 5 years now, and there’s always at least 1, multi-hour, huge query running, taking at least a few hours to complete, usually between 4 to 6 hours.

Being the Systems Admin, not the Mautic user, I can’t speak much to what exactly they’re doing, if anything, to cause these, but in any case it seems like if this was normal for others, no one would be using Mautic as these queries cause pretty everything else about the system to be extremely sluggish.

When we first rolled out Mautic, we purchased a (custom?) support package, again, I’m not exactly clear on the details of that, but to my understanding we should have received support on the initial rollout, and a year afterwards – this problem has existed since the original setup. We’ve migrated to 4 different hosting environments to give the database more resources, upgraded the Mautic software to each new major version, but nothing has made any measurable impact.

I’ve tried a handful of different cron implementations to make sure they weren’t just piling on top one another, as of today, we’re trying a similar implementation to what’s mentioned on Broadcast sending stop suddenly during the process no issues from Cron Job or Queue

I don’t know how to compare the size of our setup to others, but I imagine other orgs are running much larger installs.

I won’t post the entire query, but hopefully this is useful:

SELECT ch.lead_id FROM email_stats ch WHERE (ch.date_sent >= '2022-04-25 20:13:45') AND (ch.lead_id IN ( .... snip ... thousands of numeric values ... snip ....)) AND (NOT EXISTS (SELECT null FROM lead_frequencyrules fr WHERE (fr.lead_id = ch.lead_id) AND (fr.frequency_time IS NOT NULL AND fr.frequency_number IS NOT NULL))) GROUP BY ch.lead_id HAVING count(ch.lead_id) >= '3'

Samples from show full processlist

           db: mautic
      Command: Query
         Time: 13706
        State: Sending data
           db: mautic
      Command: Query
         Time: 12254
        State: Sending data

Steps I have tried to fix the problem:

  • Migrated to different hosting environments, each greater than the previous in terms of CPU/MEM/DISK IO
  • Our devs/DBA have spent (too) many hours over the years attempting to find a solution

Please let me know if anyone needs more information, has suggestions, comments, etc.

Hi there,

Can you post some more info:
Full email_stats count/size
Full email_stats count where ch.date_sent > $foo
Size of the indexes
What storage engine the database/tables are

I dont think lead_frequencyrules is being used, but that should not affect the query too much.
That query maybe coming from a campaign, that’s why that has so many leads.

If possible to run the real query with EXPLAIN, you may be able to see what part exactly it’s stuck, or strace -p in the worse case.

I don’t have too much knowledge on the internals/query tunning of mysql, but AFAIK it can only do one index a time, and that list with a lot of IN is not going to be using the index (at least if this was postgres, it will try to do a (parallelo) seq scan, that will inevitability be slow

Also, did ever tried to use MariaDB instead? Mautic can run fine on mariadb:10.5.10 with ~ 30k leads/100k emails/week with 1 year retation, but I also have some smaller mautic running on percona-server:5.7
the only time I had a CPU spike was when I was using this segment: Slow segment runtime

Check tour db size. Email_stats.tokens table can grow huge the whole email is preserved there. What is your db size?

Info requested:

Can you post some more info:
Full email_stats count/size

SELECT COUNT(id) FROM email_stats;
+-----------+
| COUNT(id) |
+-----------+
|  11282840 |
+-----------+

@joeyk @renatocron

Ran: SELECT SUM(char_length(email_stats.tokens)) FROM email_stats
Which came out to 38572298081 bytes, ~38GB

Full email_stats count where ch.date_sent > $foo

We’re actually already purging +1 year from email_stats, so all the numbers above are for 1 year

mautic> SELECT MIN(date_sent), MAX(date_sent), COUNT(*) FROM email_stats;
+---------------------+---------------------+----------+
| MIN(date_sent)      | MAX(date_sent)      | COUNT(*) |
+---------------------+---------------------+----------+
| 2021-04-27 07:10:03 | 2022-04-27 13:33:23 | 11282840 |
+---------------------+---------------------+----------+
1 row in set (5.209 sec)

Size of the indexes

3.7GB for email_stats

What storage engine the database/tables are

InnoDB

Comments

If possible to run the real query with EXPLAIN, you may be able to see what part exactly it’s stuck, or strace -p in the worse case.

+----+--------------------+-------+------------+-------+----------------------------------------------------------------------------------------------+--------------------+---------+-------------------+---------+----------+-------------+
| id | select_type        | table | partitions | type  | possible_keys                                                                                | key                | key_len | ref               | rows    | filtered | Extra       |
+----+--------------------+-------+------------+-------+----------------------------------------------------------------------------------------------+--------------------+---------+-------------------+---------+----------+-------------+
|  1 | PRIMARY            | ch    | NULL       | index | IDX_CA0A262555458D,stat_email_search,stat_email_search2,email_date_sent,email_date_read_lead | IDX_CA0A262555458D | 5       | NULL              | 7957533 |    16.66 | Using where |
|  2 | DEPENDENT SUBQUERY | fr    | NULL       | ref   | IDX_AA8A57F455458D                                                                           | IDX_AA8A57F455458D | 4       | mautic.ch.lead_id |       1 |    81.00 | Using where |
+----+--------------------+-------+------------+-------+----------------------------------------------------------------------------------------------+--------------------+---------+-------------------+---------+----------+-------------+

Link to full output for explain (43870 lead ids in this particular query)

Also, did ever tried to use MariaDB instead? Mautic can run fine on mariadb:10.5.10 with ~ 30k leads/100k emails/week with 1 year retation, but I also have some smaller mautic running on percona-server:5.7

This is probably my next step if I can’t find anything else.
I really doubt there’s a change that will make that much of a difference between the MySQL versions, but it’d be worth trying if there’s nowhere else to look.

Is it “safe” to cleanup the email_stats.token column? What data is lost?

Hey, great investigation so far.
I happened to walk in the same shoe, and did an experiment. Here is a blog post with step by step commands and explanations including effect of the treatment:

3 Likes

Thank you for the info

As you have the query and the data, I suggest you run it again (with explain) on the mariadb to see if any improvements may happen

Please keep us Updated

Thanks @joeyk and @renatocron for the info/suggestions.

So, this kinda answered my initial question, “No, this isn’t normal for a mautic install”, which is what I suspected, the fact that others are having to cleanup similar tables is at least re-assuring we’re on the right path.

Next steps we’re taking:

  • Pruning a bunch of old, anonymous leads, and consequentially, orphaned rows in other tables
  • Pruning email_stats.tokens for any emails more than 30 days old

I’ll post an update once complete and see if it’s going to “resolve” this issue.

As an update, we performed the following:

DB Cleanup

  • Pruned email_stats.tokens, setting anything older than 30 days to NULL
  • Pruned ~1.2 million anonymous leads – we were under the impression the maintenance:cleanup would be handling this, but it appears not
  • OPTIMIZED all tables

Results

  • No longer have queries for 5 hours :smile:
  • The longest queries appear to be related to mautic:segment:update, running between 4 and 6 minutes, which is still not great – any suggestions for optimization is welcome here.
1 Like

Great post!!

Hi, good news about speeding up your Mautic.
Segment processing time depends on how many segments are you running for how many contacts.
Also:
It depends on your filters. “contains” for example slows down your queries, try to avoid that.
Maybe you want to run your segments selectively. Not all of them needs to be ran all the time?
Here is a video:

Much appreciate everyone’s input here.

After a week or so of tinkering with this, happy to report our Mautic install is in a much better state.
We can run a full set of segment updates multiple times per day now, as opposed to 1 full one every day and a half or so.

In an effort to give back, I’m releasing the scripts we use to run our Mautic crons, probably only really useful for decently large installs, but nonetheless, you can find them at

I converted ours to this setup about the time I opened this thread and it’s been going great, feedback welcome :slight_smile:

1 Like