Author Archives: Pressable

Caching layer degradation – Fixed and Stable

We’ve found some issues with our memcached cluster, which we’re working to resolve as soon as possible. Symptoms include slower sites and, in some cases, pages that return a “504 Timeout” error.

Sorry for the trouble.

Update February 12, 2013 8:46 PM: The problem was fixed at 7:30 PM CST. We’ve been monitoring the situation for the past hour and things have been stable, so we consider the issue resolved.

Database and Service Interruptions

We’re aware of an issue in our database cluster that’s causing some sites to be slow or unavailable. We’re working on it and will update you with more details as they become available.

UPDATE 10:45AM CST: The database connection issues are still ongoing and we’re continuing to investigate the cause.

Update on the botnet attack of February 7, 2013

We’re starting to get things under control. We’ve blocked 2832 unique IP addresses so far. We’re continuing to monitor the situation and to isolate the affected customers from the customer who is being attacked.

What we know so far

A customer’s website is under a botnet attack, and we are seeing 190,000 requests/second made to one IP address.
These requests seem to be coming from about 3000 unique IP addresses.
Our firewall was reaching a CPU maximum of about 90% while this was happening; our alarms go off when it hits 51%.
Blocking all 3000 IPs on the firewall is not a good idea, so we’ve “null routed” the destination IP address.
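
To put the figures above in perspective, here is a minimal Python sketch of how request sources might be summarized from a log sample. The function name summarize_sources and its inputs are hypothetical and are not taken from our tooling; only the 190,000 requests/second and 3000 IP figures come from this post.

```python
from collections import Counter


def summarize_sources(source_ips, window_seconds):
    """Summarize a sample of request source IPs (hypothetical helper).

    Returns (unique_ip_count, overall_requests_per_second,
    average_requests_per_second_per_ip).
    """
    counts = Counter(source_ips)
    total_requests = sum(counts.values())
    unique_ips = len(counts)
    overall_rps = total_requests / window_seconds
    avg_rps_per_ip = overall_rps / unique_ips if unique_ips else 0.0
    return unique_ips, overall_rps, avg_rps_per_ip


# Using the approximate figures reported above:
# 190,000 requests/second spread over about 3000 source addresses.
avg_per_ip = 190_000 / 3_000
print(round(avg_per_ip, 1))  # 63.3
```

At roughly 63 requests/second per address, each bot on its own is unremarkable; it is the aggregate aimed at a single destination that causes the load, which fits with dropping traffic at the destination address rather than filtering thousands of sources.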

What are we doing to bring customers back?

We are assigning new IPs to the affected customers (several hundred) who shared the same IP address with this customer.
If we control your DNS, this change will happen within the next 30 minutes. If we don’t, we’ll be contacting you to let you know what the IP address should be.

One site under major DDoS attack, affecting rest of network

One of the websites hosted with us is under a major denial of service attack. We’re working with our network security team to isolate the traffic to this site, so we can restore service to normal.

Currently we’re seeing 10,000 requests/second just to this domain on our load balancers.

Minor issue with content delivery network (CDN)

We experienced intermittent issues with our content delivery network. The symptom is that your site doesn’t look correct: all the text loads, but images and stylesheets don’t.

The outage lasted about 20 minutes. All systems are functioning now.

January 27, 2013 all systems functioning normally again

As of 2:53 PM on January 27, 2013, all systems are functioning normally again. We had intermittent issues across our network.

Here’s what happened.

One of our 4 memcached servers ran out of memory and locked up in the process. As a result, our database servers were seeing 8x the average number of calls. Because our monitors alerted us to higher-than-normal database activity, we started investigating the issue there.
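
As a rough illustration of why losing one cache server out of four can multiply database load this much, here is a back-of-the-envelope Python sketch. It assumes keys are spread evenly across the servers, that every request for the failed server's keys falls through to the database, and that traffic is otherwise unchanged. None of these assumptions has been verified against the actual incident, and the function names are hypothetical.

```python
def db_load_multiplier(hit_rate, servers, failed):
    """Database load relative to normal when `failed` of `servers` caches are down.

    Normal database load is proportional to the miss rate (1 - hit_rate).
    With some servers down, only the surviving fraction of keys can still hit.
    """
    surviving_fraction = (servers - failed) / servers
    degraded_miss_rate = 1 - hit_rate * surviving_fraction
    return degraded_miss_rate / (1 - hit_rate)


def implied_hit_rate(multiplier, servers, failed):
    """Solve db_load_multiplier(...) == multiplier for the normal hit rate."""
    surviving_fraction = (servers - failed) / servers
    return (multiplier - 1) / (multiplier - surviving_fraction)


# Figures from the post: 1 of 4 servers down, 8x the usual database calls.
print(round(implied_hit_rate(8, 4, 1), 3))  # 0.966
```

Under these assumptions, an 8x jump corresponds to a normal cache hit rate of roughly 97%. The higher the usual hit rate, the more dramatic the effect of losing even one cache server.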

What did we learn?

It turns out our monitoring of the memcached systems isn’t as good as we thought it was. Had we known that one of the memcached servers was out of commission, we would’ve been able to identify and fix the problem directly, rather than investigating what was causing the spike in database usage.
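
For illustration only, here is a minimal Python sketch of the kind of per-server check that would catch a locked-up memcached node. It is not our actual monitoring. It uses the "stats" command from memcached's text protocol; the helper names, the 2-second timeout, and the message wording are assumptions made for this example.

```python
import socket


def fetch_stats(host, port=11211, timeout=2.0):
    """Send the memcached text-protocol "stats" command and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        data = b""
        # The reply is a series of "STAT <name> <value>" lines ending with "END".
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data


def parse_stats(raw):
    """Turn the raw "stats" reply into a dict of name -> value strings."""
    stats = {}
    for line in raw.decode("ascii", errors="replace").splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def memory_ratio(stats):
    """Fraction of the configured cache memory currently holding items."""
    return int(stats["bytes"]) / int(stats["limit_maxbytes"])


def check_server(host, port=11211):
    """Return a one-line status; a server that does not answer is CRITICAL."""
    try:
        stats = parse_stats(fetch_stats(host, port))
        ratio = memory_ratio(stats)
    except (OSError, KeyError, ValueError, ZeroDivisionError) as exc:
        return f"CRITICAL: {host}:{port} not responding correctly ({exc})"
    return f"OK: {host}:{port} responding, cache memory {ratio:.0%} used"
```

A memcached server that is merely full normally evicts old items rather than failing, so the key signal in this sketch is whether each server answers at all within the timeout. The memory figure is reported as context only.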

Database Issue

We are currently experiencing an issue that is causing sites to display database connection errors.

We are working on having this resolved as soon as possible and will update with more information as it becomes available.

Update:

All sites are back up now, and the total downtime was about 15 minutes.

If you have any questions, please send an email to help@zippykid.com and we can address your concerns via our help desk.