I have been using Mautic for a short while on smaller projects and recently decided to try it at a larger scale.
Here’s what was done:
1. Install Mautic
2. Connect it to Salesforce, with fields mapped only to pull from Salesforce (no writes desired, yet)
3. Run the integration scripts to pull/fetch
4. Troubleshoot the problems that came up in step 3
Here’s more information about the environment:
- We have approximately 2,300,000 records in salesforce (75% leads, 25% accounts)
- Mautic is running in a Docker container (I have many of these running elsewhere at a smaller scale with far fewer records, so this configuration is familiar territory to me)
- Nothing other than Mautic is running there; it has been given 2 CPUs and 8GB of RAM to start
- I’m also familiar with Symfony3 and have been writing code for far too long; consider me an advanced to hardcore level user
Here are the issues I’ve run into so far:
1. Running the integration script with the timespan of “the past 8 years” proved fruitless: it surpassed 6GB of RAM use (which was the php.ini memory_limit, actually passed in as -dmemory_limit=8G to the console script). Yes, we have 8 years’ worth of data there, and yes, some of those old records are useful (they are still customers!)
2. Running the integration script from a supervisor script that calls it “once per day for each day in the past 8 years” proved useful, as this let it fetch the records in chunks and pull in all of the 2.3M records. This was my fix for #1, but it still used 5GB+ of RAM when it hit days that saw 250K+ modified records. Later on I broke this down further to hourly runs, once it started reading records “to be updated in salesforce”.
3. This import ran over the past 3 days and covered records spanning the past 8 years. What I discovered is that when my script got to 3 days ago (read: the day the first sets of records from 8 years ago were imported into Mautic), it seemed to trigger an update of Salesforce for those records. This is a bit counter-intuitive, as I don’t have any fields “pointing” back to Salesforce in the mapping yet, nor any timeline records for these contacts (besides their recent import to Mautic).
4. Now, #3 wouldn’t be an issue in and of itself; I can doctor some dates in the database to skip past that “update that will probably be skipped anyway”. But there is another problem here, involving SOSL and the data that was imported (see picture below). This one actually has me concerned about a potential issue (read: one day this is going to bite when not expected), whereas much of the above is mainly just feedback.
5. “Do not subscribe” entries do not seem to be created. During Salesforce setup, and before running the integration scripts, I mapped a field we use, “EmailOptOut” (boolean), to the “Do not send contact by email” (or similarly named) field. We can import this manually afterward, but I expected this field to be mapped accordingly. The docs are fairly thin on what to expect from this mapping; I’ll review the code once #4 is dealt with.
6. Much of the data didn’t have to be pulled from Salesforce at all. Allowing the Salesforce integration to supply additional filters (on top of the mapping), so that only records matching specific criteria are fetched, would have reduced the impact (both in API requests to SF and in load on the Mautic system/db) and made this process a bit less painful. This is more of a feature request than a problem, as over the coming months I’ll likely be flushing out more of these “large scale” issues we encounter.
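For anyone wanting to try the chunked approach from #2, here is a rough Python sketch of that kind of supervisor script. It is not my actual script. It only splits a date range into fixed windows (daily, or hourly for busy periods) and builds one console call per window. The console path `app/console`, the command name `mautic:integration:fetchleads` and its `--integration`, `--start-date` and `--end-date` options are assumptions written from memory, so check them against your own Mautic version before running anything.

```python
from datetime import datetime, timedelta
import subprocess


def time_windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        # Clip the last window so it never runs past `end`.
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


def build_command(window_start, window_end, console="app/console", memory_limit="8G"):
    """Build the argument list for one fetch run.

    ASSUMPTION: the command name and option names below are from memory
    and may differ in your Mautic version; `console` is a hypothetical path.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    return [
        "php",
        f"-dmemory_limit={memory_limit}",
        console,
        "mautic:integration:fetchleads",
        "--integration=Salesforce",
        f"--start-date={window_start.strftime(fmt)}",
        f"--end-date={window_end.strftime(fmt)}",
    ]


def run_backfill(start, end, step, dry_run=True):
    """Run (or, with dry_run, just print) one fetch per window, oldest first."""
    for window_start, window_end in time_windows(start, end, step):
        cmd = build_command(window_start, window_end)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the backfill at the first failing window,
            # so it can be resumed from that window's start date.
            subprocess.run(cmd, check=True)
```

Calling `run_backfill(datetime(2010, 1, 1), datetime(2018, 1, 1), timedelta(days=1))` prints one command per day without executing anything; switching `step` to `timedelta(hours=1)` gives the hourly breakdown. Each window runs in its own PHP process, so memory is released between chunks, which is the whole point of splitting the run.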
Is anyone else using Mautic with millions of records, or does anyone have suggestions or comments? I’m currently blocked on #4, so any help would be appreciated!!