
Site creation/Cloning Issues

We are currently having issues deploying new sites and cloning sites in our ORD Datacenter. Our team is working to resolve this as quickly as possible. Please note that this only affects new site creations and clones of existing sites. It does not affect functionality on existing sites.

We will continue to post updates here as we look for a resolution. If you have any further questions, please submit a ticket via your my.pressable.com control panel.

UPDATE 4/4/2015: Deploying sites on our systems is functioning once again. We are still working on the cloning process at this point. 

UPDATE 4/9/2015: We are still working on getting clones working through our automated system. If you are currently waiting on a clone to finish or need one done, please submit a ticket via your my.pressable.com control panel and we will manually clone your site for you.

RESOLVED: Chicago and Virginia Datacenter Outages.

Due to connectivity issues at our upstream provider, we are currently experiencing downtime at our Chicago and Virginia Datacenters. We are working with our provider to correct this issue. We will continue to update this post as we get more information.

UPDATE 4:41 PM CST: After further investigation, it appears that this outage is only affecting customers in our Chicago Datacenter. We are still trying to gather more information from our provider so we can provide a possible ETA.

UPDATE 5:15 PM CST: We are still working with our provider in order to diagnose this issue. We should have a more detailed update very soon. 

UPDATE 6:30 PM CST: We are seeing some clusters begin to come online. We are working on bringing the rest of our clusters back to full functionality now. We will update once that is done.

UPDATE 7:47 PM CST: We have restored functionality across our systems. If you are continuing to see issues with your site(s), please submit a ticket via your my.pressable.com control panel.

 

RESOLVED: WooCommerce SQL Injection Vulnerability

Earlier today the Wordfence Security team released the details of a WooCommerce SQL Injection Vulnerability. Our systems are already at work patching this popular plugin across sites on our systems. We’ll provide an update when the process has been completed.

UPDATE March 14th, 7:55AM CST: At this time all sites on our systems have been updated to the latest (patched) version of WooCommerce. If you have any questions, please don’t hesitate to reach out.

IAD Datacenter Issues

We are experiencing an issue with sites in our IAD Datacenter that is causing them to not load properly.

We believe this is related to issues that Rackspace is currently having with their cloud block storage services. We are working with them directly and awaiting further details, and we will update the status blog as more information becomes available.

If you have questions or concerns, please create a help desk ticket via your https://my.pressable.com panel or join us in our community lounge at http://chat.pressable.com for updates while we await further details.

UPDATE Feb 27, 2015 @ 5:00 AM Central: We were able to confirm that this is an issue occurring at Rackspace with their Cloud Block Storage service. You can find more details and information on their status page: https://status.rackspace.com/index/viewincidents?start=1425013200

We will update again as more information is available.

UPDATE Feb 27, 2015 @ 5:55 AM Central: Rackspace has resolved the issue on their end and we are now working on re-establishing stability on our end. We will update again as soon as this is taken care of.

UPDATE Feb 27, 2015 @ 6:10 AM Central: We have now restored functionality across our IAD datacenter and all sites are now functioning normally. If you continue to experience problems, please submit a help desk ticket via your https://my.pressable.com panel.

Slow connections causing issues with sites.

We are currently experiencing an attack similar to the one we had two weeks ago. We have isolated the target, which is the “galaxy01” cluster.

We are working with our provider now to resolve this as quickly as we can.

UPDATE: 4:35 P.M.- We’re working on identifying which specific set of servers is being attacked and will have a fix in place as soon as we can identify this.

UPDATE: 5:07 P.M.- We have identified the IP that was causing this attack and have banned it. Connections have returned to normal, which means that sites should be loading properly now. If you are not seeing your site(s) coming online, please submit a ticket via your my.pressable.com control panel.
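
For readers curious how a single offending IP can be spotted, here is a minimal sketch. It assumes you already have a list with one entry per open connection, keyed by client IP; the function name, the threshold, and the sample addresses (taken from the reserved documentation ranges) are all made up for illustration and are not a description of our actual tooling.

```python
from collections import Counter


def find_heavy_hitters(connection_ips, threshold):
    """Return (ip, count) pairs at or above threshold, busiest first."""
    counts = Counter(connection_ips)
    return [(ip, n) for ip, n in counts.most_common() if n >= threshold]


# Hypothetical sample: one list entry per open connection.
sample = ["203.0.113.7"] * 500 + ["198.51.100.2"] * 3 + ["192.0.2.44"] * 5

# Any IP holding 100 or more connections is flagged for review.
print(find_heavy_hitters(sample, threshold=100))
```

The threshold is a judgment call: set it too low and busy but legitimate visitors get flagged, so a result like this is a starting point for investigation rather than an automatic ban list.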

If you have any other questions, please feel free to submit a support ticket via your my.pressable.com control panel.

Current Outage Breakdown and Full Information

Howdy,

We have been flooded with help desk requests, tweets, emails, and phone calls requesting more information, and we have had a difficult time keeping up, getting responses out to everyone, and answering all questions.

We are starting a new status blog post to help answer as many of the most frequently asked questions as we can. You can continue to receive the latest updates at the bottom of this post.

For customers who have questions/concerns regarding the outage, please join us in our chat lounge to discuss: hipchat.com/g5gQ8vl9S

Exactly what happened and what is going on?

Yesterday morning, we encountered issues with caching servers that had previously been built and optimized to handle load from the outage early last week. It was determined that this was a result of bandwidth limitations on our internal network.

As a result, we built new servers that had double the network throughput, but we continued to see the same issues as before. From here, we decided to subdivide caching traffic based on cluster and saw significant improvement in the situation.

After seeing improvement, we began to see internal bandwidth limitations on database servers and are currently working on adding additional database servers to help with this.

What is being done to fix it?

The first step in addressing the original issue from yesterday morning was to subdivide internal caching traffic based on cluster. Doing so has helped significantly and has helped us see bottlenecks in other places, most importantly our database servers.

As a result, we are working to add additional database servers as a means of addressing these internal bandwidth limitations as they pertain to databases.
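
To illustrate what subdividing caching traffic by cluster can mean in practice, here is a rough sketch. It assumes each cluster gets its own dedicated pool of cache hosts and that a request is routed within that pool by hashing its cache key. The cluster names appear in these posts, but the host names, pool sizes, and the `pick_cache_server` function are hypothetical and do not reflect our actual configuration.

```python
import hashlib

# Hypothetical cluster-to-cache-pool mapping; host names and pool sizes
# are invented for illustration only.
CACHE_POOLS = {
    "galaxy01": ["cache-galaxy01-a", "cache-galaxy01-b"],
    "bode": ["cache-bode-a", "cache-bode-b"],
    "thor": ["cache-thor-a"],
}


def pick_cache_server(cluster, cache_key, pools=CACHE_POOLS):
    """Choose a cache host from the pool dedicated to the given cluster."""
    pool = pools[cluster]  # raises KeyError for an unknown cluster
    # A stable hash keeps the same key on the same host across calls.
    digest = hashlib.sha256(cache_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


print(pick_cache_server("bode", "example.com/index.html"))
```

The point of the design is isolation: traffic for one cluster never lands on another cluster's cache hosts, so a surge in one pool does not consume the internal bandwidth of the others.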

What is the ETA on completing a fix?

Providing an ETA in this situation is very difficult. It relies on us knowing exactly how quickly we can get new servers up, optimized, and working reliably. These times are unknown because they need to include some time to monitor the implementation and verify improvement.

We are all hands on deck and all working very hard to have stability restored to the system.

We do not currently have and will not likely provide an ETA in this situation. The best thing to do is to keep checking the current status at the bottom of this post.

We expect things to be working normally within a number of hours.

What is the current status?

As of 9:35AM, Jan. 21, 2015: Sites are currently up and down, intermittently. We are currently working to re-provision portions of our architecture but the rate at which we can add servers is currently limited and we are working with our provider to have our rate limits pushed up. Once new servers are up, we will begin seeing sites stay up consistently and running at normal speed.

UPDATE 10:45AM, Jan. 21, 2015: Our provider has raised our rate limits and we are able to provision servers at a more rapid pace. We are still working on getting new database servers up and will update again as soon as we have more information to share.

UPDATE 1:50PM, Jan. 21, 2015: We are currently finalizing the deployment of several new machines and are monitoring these for improvements. We are expecting to see improvements in several clusters as soon as these deploys are finished. We will update again as soon as we have more information to share.

UPDATE 5:10 PM, Jan. 21, 2015: Our team is still working to fine tune the new hardware deployments brought online. We’re continuing to monitor the situation and make adjustments as needed.

UPDATE 8:00 PM, Jan. 21, 2015: Our team is still working to get the new hardware deployments brought online and in rotation. We are expecting to see improvements in several clusters as soon as these deploys are finished. We’re continuing to monitor the situation and make adjustments as needed.

UPDATE 11:30 PM, Jan. 21, 2015: We’re continuing to monitor the situation and make adjustments as needed. Sites will continue to go up and down intermittently until the new hardware deployments are brought online. We will update again as soon as we have more information to share.

UPDATE 9:15 AM, Jan. 22, 2015: After discussions with our provider, we are in the middle of rolling out changes that we are hoping will help resolve this in the near future. Thank you for hanging in there with us as we look for a resolution.

UPDATE 10:30 AM: We have isolated a single cluster that was causing trouble for the others. For the time being, all clusters except that one are up and running “normally.” As we work on that cluster and test changes/fixes, though, the others may be affected by it. For customers on our bode cluster, we are working on a fix right now and looking to possibly move customers off this cluster as soon as possible. We are still working on a finalized course of action for these customers.

UPDATE 1:35 PM: We have created a new cluster and have moved a large chunk of customers off of bode and onto a new cluster named “hydra.” If you previously had a site on bode and that has been moved, you will likely see it begin working within the next 1 – 2 hours as DNS changes over. We are finalizing plans for customers that do not have DNS pointed at us and will communicate this as soon as we know what we will be doing with this set of customers.
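
If you would like to check whether your domain has picked up the DNS change yet, a small script along these lines can help. This is only a sketch: the addresses below come from the reserved documentation range and are placeholders, not the real addresses of any cluster, and the function names are our own invention. The example uses a stand-in resolver so it runs without network access; drop the `resolver` argument to perform a live lookup.

```python
import socket


def resolved_addresses(hostname, resolver=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses the hostname currently resolves to."""
    _name, _aliases, addresses = resolver(hostname)
    return set(addresses)


def points_at_cluster(hostname, cluster_ips, resolver=socket.gethostbyname_ex):
    """True if every address returned for the hostname belongs to the cluster."""
    addresses = resolved_addresses(hostname, resolver)
    return bool(addresses) and addresses <= set(cluster_ips)


# Placeholder addresses; substitute the ones given to you by support.
NEW_CLUSTER_IPS = {"203.0.113.10", "203.0.113.11"}


def fake_resolver(hostname):
    # Stand-in for a live lookup so the example runs offline.
    return (hostname, [], ["203.0.113.10"])


print(points_at_cluster("example.com", NEW_CLUSTER_IPS, resolver=fake_resolver))
```

Keep in mind that your local resolver may cache the old answer for a while, so a negative result shortly after a change does not necessarily mean something is wrong.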

For customers who have questions/concerns regarding the outage, please join us in our chat lounge to discuss: hipchat.com/g5gQ8vl9S

Slow/Unresponsive Sites

Howdy,

We are investigating an issue causing slow/unresponsive sites and page loads resulting in 502s for some sites on our network.

As soon as we have further information we will update this status blog.

If you have questions or concerns, please contact us via your https://my.pressable.com control panel.

UPDATE 9:54AM Central: We are still looking into a root cause for this issue. Sites will continue to go up and down while we troubleshoot and clear this up.

UPDATE 11:06AM Central: Service has returned to normal at this point and sites should be loading properly. We are still working to identify the root cause of the issue and ensure that it has been properly addressed. For now, sites are up and functional and we will keep an eye out for any further potential issues.

UPDATE 12:02 PM: We are continuing to investigate the issue from this morning to find a resolution. The team is working now to make sure this gets resolved as quickly as possible.

UPDATE 1:00 PM: Our team is still working to bring services back to normal functionality. Some sites may have begun responding over the past hour, however, our system is still not back to 100%. We’ll provide more updates as we progress.

UPDATE 2:10 PM: We are continuing to investigate the issue from this morning to find a resolution. The team is working now to make sure this gets resolved as quickly as possible. If you haven’t done so, please submit a support ticket via your my.pressable.com control panel or send an email to help@pressable.com and we can answer any other questions you may have.

UPDATE 6:50PM Central: We are continuing to see some sites function normally and others experience issues. Our team is working on addressing the issues across the network and is making progress. We hope to have things back up and running shortly. We will update again as soon as more information is available.

UPDATE 9:00PM Central: At this time we have new caching servers up for each galaxy in our network. These are helping with load but sites are still flipping between up and down intermittently. We will update again once these new caching servers are stable and returning site speeds to normal.

UPDATE 12:00AM Central, Jan. 21, 2015: At this time we are seeing nearly all sites back up and running. If you are still experiencing issues, please let us know and we will address them accordingly. We are still working on maintaining stability and speed at the moment.

UPDATE 5:40AM Central, Jan. 21, 2015: Though most clusters remain up, sites are continuing to go in and out intermittently. We are continuing to work on the stability and speed issues at hand. Doing so will help bring sites back online consistently.

UPDATE 9:36AM, Jan 21st, 2015: We have posted a full breakdown of the current situation and will be providing further updates on this status blog post:
http://status.pressable.com/2015/01/21/current-outage-breakdown-and-full-information/

Please see the link above for further updates on this situation.

Connectivity Issues inside of Chicago DC

We’re currently investigating new connectivity issues inside of our Chicago Datacenter. These currently appear to be unrelated to issues that were present for the past 48 hours. We’ll provide an update as we have more information about the current problems.

UPDATE 1:30PM CST: Everything should be operating normally at this time. We’re still evaluating systems and the stability of things. If you’re still experiencing issues, please submit a ticket so our team can take a look for you.

RESOLVED: Chicago Data Center Outage

We wanted to provide this post as a notification that issues related to our Chicago Data Center Outage have been resolved. Our team will be continuing to work through the emails and tickets related to this issue and we’ll be providing a full postmortem tomorrow.

We sincerely appreciate your patience and understanding during this truly trying experience. The kind emails, tweets and messages we’ve received have been a true blessing.

RESOLVED: Chicago Data Center Outage

We are currently experiencing an outage in our Chicago Datacenter. All sites hosted in this data center are currently unresponsive and not loading.

We are addressing this issue now and will have things back up and operational shortly.

UPDATE 5:30PM CST: We’re still experiencing issues with the network backbone at our Chicago DC. We’re working with our provider to determine what the cause is and how we can mitigate this traffic to restore services.

UPDATE 6:05PM CST: We’re still working with our provider to determine the cause of the increased traffic along our backbone. We apologize for the delay in getting things back up and operational.

UPDATE 6:30PM CST: We’re still working with our provider to determine mitigating steps we can take for our internal network. We’ll provide an update in 30 minutes.

UPDATE 7:00PM CST: We’re currently engaging more senior members of our provider’s team to help troubleshoot the issues. We’ll provide an update in 30 minutes.

UPDATE 7:35PM CST: We’re still working with our provider to diagnose the current issues. We’ll provide an update in 30 minutes.

UPDATE 8:30PM CST: Our team is still working to mitigate the effects of traffic on our backbone. We’ve started to push out configuration changes which we expect to help reduce the impact, but it will be a bit longer before we know the impact of these changes. We’ll provide an update in 30 minutes.

UPDATE 10:05PM CST: Our team is still working to restore services. We’ll provide an update in 30 minutes.

UPDATE 10:40PM CST: Our team is beginning some work to bring servers back online and then will continue to evaluate issues. We do not expect services to return to normal at this time and will provide another update in 30 minutes.

UPDATE Jan 10th, 2015 @ 12:30AM CST: Our team is still working on bringing servers back online and undergoing internal troubleshooting processes. We will continue to update the status blog as often as possible.

UPDATE Jan 10th, 2015 @ 1:25AM CST: We’re currently experiencing delays in the restoration of services as Rackspace is having provider line issues. We’ll continue our efforts as best we can while Rackspace works to resolve their provider issue. (https://status.rackspace.com/)

UPDATE Jan 10th, 2015 @ 2:45AM CST: The team is still working to bring services back online following issues from yesterday. We will continue to update the status blog as we make progress.

UPDATE Jan 10th, 2015 @ 6:05AM CST: Our team is still working to bring services back online. Some sites may have begun responding over the past hour, however, capacity is still not back to 100%. We’ll provide more updates as we progress.

UPDATE Jan 10th, 2015 @ 7:30AM CST: The team is still bringing services back online from the outage. We will provide more updates as we have them.

UPDATE Jan 10th, 2015 @ 9:00AM CST: We’re still working on bringing services back online related to this outage. We’ll provide more information as progress is made.

UPDATE Jan 10th, 2015 @ 10:00AM CST: Our team is still working to resolve issues related to connectivity and site availability. We’ll provide an update when we have more information.

UPDATE Jan 10th, 2015 @ 11:45AM CST: The team is making some configuration changes while we work to bring additional capacity online. We’ll provide an update when there is more progress.

UPDATE Jan 10th, 2015 @ 1:20PM CST: The team is making progress and we hope to have all systems back up and running soon. Some sites may have begun responding over the past hour, however, capacity is still not back to 100%. We’ll provide more updates as we progress.

UPDATE Jan 10th, 2015 @ 3:30PM CST: The team has made updates to the system that will help get things running and stable. Capacity is still not back to 100%. We’ll provide more updates as we progress.

UPDATE Jan 10th, 2015 @ 6:20PM CST: Sites are beginning to be served now, but you may encounter intermittent 502 errors while things continue to settle. We’ll provide more updates as we progress.

UPDATE Jan 10th, 2015 @ 9:20PM CST: Customers with sites on our “Thor” cluster should see their sites being served properly now. We’re finalizing some work on our “Bode” and “Galaxy01” clusters and expect those to be functional shortly. We’ll provide more updates as we progress.

UPDATE Jan 11th, 2015 @ 9:15AM CST: We wanted to get another note out to let everyone know that we’ve seen stability restored (as of last night) and are currently seeing all systems online. We sincerely regret the experience provided over the last several days, but do appreciate your continued patience. Tomorrow we’ll be providing a more detailed analysis of the issues and our steps to correct these issues moving forward.