PHP Upgrade & Issues with NextGEN Gallery (Updated)

UPDATE: 01/11/2018 UTC 17:23: All servers have now been updated to PHP 7.0.27, which includes a fix for the regression that appeared in PHP 7.0.26.

UPDATE: 11/30/2017 UTC 18:53: We have now rolled back our PHP version to 7.0.25. Your NextGEN plugin should function as it did prior to our PHP upgrade.

With our PHP upgrade from 7.0.25 to 7.0.26, we have discovered an issue with the NextGEN Gallery plugin. The upgrade causes a PHP fatal error and segfault with this particular plugin. We have found this to be a reported issue on the plugin’s support forums as well:

After discussions with the plugin’s developers, we found the bug to be within the PHP release itself and not the plugin. We are currently in the process of reverting to an earlier version of PHP to avoid this issue until a new release with a fix is available.

As always, if you have any questions, please do not hesitate to reach out to our support team:

Holiday Season Support Hours

It’s been a very busy year at Pressable and the holiday season is a great time to say thanks to everyone who has supported us and worked alongside us this year. All of the support, encouragement, and belief in what we do is wonderful and we can’t thank everyone enough!

With that in mind, our help desk will have reduced staffing/support on a few set days so that our employees can celebrate the holidays and the year’s accomplishments with their family and friends. Please note these reduced staff days and consider addressing any potential issues in advance. Thank you!

  • Thursday, November 23 – Emergency Support Only
  • Friday, November 24 – Emergency Support Only
  • Monday, December 25 – Emergency Support Only
  • Monday, January 1 – Emergency Support Only

Our help desk will be working regular hours over the season outside of these days.

Thanks again for your continued support! Happy Holidays!

Temporary Support Outage

Our upstream support provider reported two short outages totaling seven minutes on Tue Aug 29 from 22:03 – 22:05 and again from 16:36 – 16:41.

If you sent a ticket during either of these windows, it may need to be resent.

This report only affects support tickets, not the Pressable infrastructure itself, which remains fully operational.

Our service provider is reporting all systems are up. We are monitoring the situation throughout the day and will report back if anything changes.



5/18 Partial Outage (Resolved)

4:00 PM UTC: The partial outage was caused by a DDoS attack on a specific customer’s site, which we have now mitigated. We will continue to monitor the situation, but as of now this outage has been resolved.

3:30 PM UTC: We are aware of and investigating a partial outage impacting some customer sites.

Investigating Outage

UPDATE: 2017-07-19 18:26 UTC

On Thursday, July 13, 2017, a subset of Pressable customer sites (including our own site) experienced an outage caused by a failure in a database server. Customers with sites reliant on this database server experienced 42 minutes of downtime. A smaller subset of the impacted sites experienced a further 15 minutes of downtime 2.5 hours after resolution of the first outage.


The investigation into the underlying cause of the database server failure is ongoing. We know that several database queries that never completed were allowed to create temporary tables on disk, resulting in more than 1TB of disk space being consumed in a very short period of time. The database server disk became 100% full, which led to the database server failing.

Pressable has failover and redundant systems in place, but promoting a replicated database “slave” to become the database “master” is not an automated process.

While the underlying cause that led to the database failure may not have been avoidable, gaps in our alerting caused the outage to last far longer than it needed to or should have.
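As an illustration of the kind of alerting involved, the following is a minimal sketch of a disk-usage check that escalates from a warning to a page as a volume fills. It is not Pressable's actual monitoring: the function names and the 80% and 90% thresholds are assumptions chosen for the example.

```python
import shutil


def disk_alert_level(used_bytes, total_bytes, warn_at=0.80, page_at=0.90):
    """Classify disk usage as 'ok', 'warn', or 'page'.

    The thresholds are hypothetical defaults, not values from the incident.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    fraction = used_bytes / total_bytes
    if fraction >= page_at:
        return "page"  # an on-call engineer should be paged
    if fraction >= warn_at:
        return "warn"  # worth a notification, not yet urgent
    return "ok"


def check_path(path="/"):
    """Read real usage for the volume containing `path` and classify it."""
    usage = shutil.disk_usage(path)
    return disk_alert_level(usage.used, usage.total)
```

A check like this only helps if the "page" result reaches the right engineers, which is the gap described above.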

The 15-minute outage that occurred 2.5 hours later was our fault. Once the original database master failed, our engineers worked to reinitialize it as a replicated slave of the new master. Unfortunately, the engineer used the new master server to create the backup to import on the slave. This resulted in read/write locks against the databases and tables.

What We’re Doing

  • We’ve made updates to our alerting to ensure that the right engineers are paged when hosts and services that are critical to serving site traffic trigger monitoring.
  • We’ve also deployed updates that set a maximum allowed query execution time; queries that reach or exceed that limit are “killed”. When architecting our new platform, we avoided adding this limit because we wanted to provide an environment that was more flexible for working with larger datasets (importing, exporting, querying).
  • We’re reviewing our processes and procedures for recovering from situations like this and implementing tools that are more automated and remove the potential for error.
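
The query execution limit described above can be sketched roughly as follows. This is a simplified illustration, not our production implementation: the `RunningQuery` type, the `queries_to_kill` function, and the 30-second limit in the usage example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningQuery:
    """A hypothetical snapshot of one in-flight database query."""
    query_id: int
    elapsed_seconds: float
    sql: str


def queries_to_kill(running, max_seconds):
    """Return the ids of queries that have reached or exceeded the limit."""
    return [q.query_id for q in running if q.elapsed_seconds >= max_seconds]


# Example usage with an assumed 30-second limit.
snapshot = [
    RunningQuery(1, 0.4, "SELECT option_value FROM wp_options"),
    RunningQuery(2, 30.0, "SELECT * FROM wp_posts ORDER BY post_date"),
    RunningQuery(3, 912.7, "SELECT * FROM wp_postmeta GROUP BY meta_key"),
]
doomed = queries_to_kill(snapshot, max_seconds=30)
```

The trade-off is the one noted in the list: a hard limit protects the server from runaway queries, but it can also interrupt legitimate long-running imports and exports.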

We’d like to apologize to our customers that were impacted by this outage.

This is the first failure of a database master server that caused downtime for customer sites since launching our “v2” platform over 16 months ago. Several of the tools, safeguards, and features built into the platform worked in this scenario. Some, unfortunately, didn’t or failed in ways we didn’t think possible or exposed gaps in process and alerting.

Around 3:00 AM CST, we began receiving reports of users being unable to log in to the WordPress dashboard or of their sites returning a 500 error. Members of our systems team were notified immediately. We were able to resolve the issue after approximately 30 to 40 minutes.

We will provide everyone with a post mortem after we have finished investigating the issue. We are still looking into the cause.