Quality of Service Disruption Alert

Incident Report for Maropost Marketing Cloud

Postmortem

Executive Summary: 

At approximately 9:43 AM EST, customers alerted us that they could no longer log in to the application. We initially saw intermittent errors, and the team began investigating. The team found that a recent deployment for campaign queues was affecting a limited set of functionality for clients; no backend jobs were affected. The team rolled back the code change and redeployed the application. The application was fully restored for all clients at 11:05 AM EST.

Components Impacted: 

  1. Marketing Cloud application 

Clients Impacted:

  • All customers. 

Root causes: 

  • During a deployment, a bug was introduced into the application that was not identified in QA, in UAT, or after deployment

Mitigation and resolution: 

  • Rolled back recent deployment 

  • We have added a login alert to detect when login is non-functional
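
As an illustration only, a login alert of this kind could be sketched as below. This is a minimal sketch and not a description of Maropost's actual monitoring: the class name LoginAlert, the record method, and the threshold of three consecutive failures are all hypothetical. It assumes some external probe attempts a login at a regular interval and reports whether it succeeded.

```python
from dataclasses import dataclass


@dataclass
class LoginAlert:
    """Hypothetical alert that fires after repeated failed login probes."""

    threshold: int = 3              # assumed value: failures in a row before alerting
    consecutive_failures: int = 0   # running count of back-to-back failures

    def record(self, login_ok: bool) -> bool:
        """Record one probe result; return True when the alert should fire."""
        if login_ok:
            # A successful login clears the failure streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold


# Example: one success, three failures in a row, then recovery.
alert = LoginAlert(threshold=3)
results = [alert.record(ok) for ok in [True, False, False, False, True]]
```

Requiring several consecutive failures, rather than alerting on the first one, is a design choice that trades a short detection delay for fewer false alarms from a single transient error.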

Posted Mar 12, 2025 - 18:19 EDT

Resolved

Our Engineering Team has confirmed that system health has been fully restored to baseline norms.

All systems should now be fully operational for all clients. Thank you for your patience as we worked through this.

A Post Mortem of this incident will be posted within 2 business days.

If you continue to experience issues, please submit a support ticket by sending an email to support@maropost.com.
Posted Mar 07, 2025 - 12:35 EST

Monitoring

Our Engineering Team has successfully implemented the solution to restore system availability. We are continuing to monitor system health to ensure that all aspects of our platform are fully recovered.

We apologize for the inconvenience and will post an update as soon as we've confirmed the issue is fully resolved.
Posted Mar 07, 2025 - 11:09 EST

Update

We are continuing to work on a fix for this issue.
Posted Mar 07, 2025 - 10:45 EST

Identified

Subsequent investigation has revealed the cause of the incident reported at 10:00 AM Eastern Time. Unfortunately, an unforeseen combination of factors led to the disruption in service.

Our Engineering Team is working to implement a solution and restore system availability as quickly as possible.

We apologize for the impact to your business. Maropost’s standard procedure is to maintain high availability of its platform for all clients.

Thank you for your patience as we work to resolve this issue.
Posted Mar 07, 2025 - 10:37 EST

Investigating

At 10:00 AM Eastern Time, we became aware of a service disruption affecting all clients.

Our Engineering Team is currently assessing the impact. We will post a status update as quickly as possible.

We apologize for the impact to your business. Thank you for your patience as we work to resolve this issue.
Posted Mar 07, 2025 - 10:23 EST
This incident affected: Cloud Cluster (Application, API Server), App Cluster (Application, API Server), Cloud1 Cluster (Application, API Server), CA1 Cluster (Application, API Server), and EU1 Cluster (Application, API Server).