Louis Kessler’s Behold Blog » Blog Entry

Up and Down - Sat, 21 Nov 2015

My websites, beholdgenealogy.com, lkessler.com, and gensoftreviews.com, have been experiencing unfortunate periods of downtime over the past couple of months.

Yesterday, I put in my third ticket with Netfirms, my hosting company, about this. My first ticket was on Oct 15, and I titled it “My sites are down a lot”:

Could you please check the server that my sites: www.beholdgenealogy.com, www.lkessler.com and gensoftreviews.com are on. Is there some problem with the server, because Uptime Robot has reported 150 incidents of my sites going down, several times every day, for the past two weeks. These are confirmed by a second monitoring service I use: pingdom.com. This seems to have started about 2 a.m. on Sept 29. The most recent incident was about 14 minutes ago (9:20pm EST) for about 5 minutes. My sites timed out when I tried accessing them and it also affected my lkessler.com email.

Is there some bad service or problem on the server that is hosting my sites?
 
Anything you can do to improve this situation will be appreciated. Thanks.

An hour and a half later, after a few contacts back and forth on the ticket, they “fixed” it:

Our Network Operations Team has corrected the issue causing some customers to have issues accessing their sites and email. You should once again have full access to your services. Thank you for your patience as we worked to correct the issue.

So then things improved, and my monitoring services only reported a couple of incidents each day. That was a bit higher than the less than one per day I was averaging prior to all this, but still acceptable. But it didn’t take long, and on November 16th I put in my second ticket, titled “My sites are again down a lot”:

I am having the same sort of problems that I had about a month ago where all my sites are going down a lot, i.e. a dozen times each today.

Please see my ticket 14587925 - Then your Network Operations Team corrected the issue. Please look at this again.

About six hours later:

Our Network Operations Team has corrected the issue causing some customer’s websites to display a blank white page. Your website should be fully operational once more. We truly thank you for your patience as we worked to correct the issue.

So things were good for 3 more days. Then yesterday, it was horrendous, worse than ever. I had just released version 1.2 of Behold and I had reports that the download wouldn’t install. So I checked, and my sites were down about half the time I tried to access them. I figured something was preventing the whole installation program from downloading. So I put in my third ticket, titled “Once Again, my Sites are Going Down”:

[Image: screenshot of the third ticket]

I do know what’s going on. My websites are on a shared server at Netfirms. My sites are served along with a few hundred other sites on the server. These are huge machines, powerful with lots of RAM. The MySQL databases for GenSoftReviews and the Behold Blog and Forum are served by another huge SQL server at Netfirms. Together, these servers have kept my sites fast and responsive, including my WordPress-based sites, which were slow at IXWebHosting before I moved my sites to Netfirms in 2008.

What’s happening now is one of a few possible things. There might be someone on the same server as me who has a programming problem on his web page. It starts running into infinite loops and drags the server down. Another possibility is that there is someone on the server who is an abuser. He may be sending out millions of spam emails, or doing something malicious which he’s covering up. The third possibility is that one of the shared sites has had a lucky streak and become extremely popular, with millions of hits coming in to its web pages. And then there’s the possibility that some abuser out there is running a denial of service attack against one of the sites on this server, sending millions of hits on purpose to grind the server to a halt.

Which it is, I don’t know, because the Netfirms Technical Management won’t say. They only say they corrected the “issue”.

As part of yesterday’s troubles, I called Netfirms’ priority support by phone. My “priority” support left me on hold for 32 minutes before the tech support rep answered. We spent over 20 minutes talking about the situation. I told him I understand the reasons, but I expect that this fix may again be only temporary.

The rep mentioned that I could consider a Virtual Private Server (VPS). Yes, I knew that, as I’ve researched and considered them before. A VPS would indeed prevent other sites on the same server from affecting my sites. Now I’m not so much worried about the extra cost and the extra work on my part to maintain a VPS. The real problem is that a VPS gets only a limited share of its server’s RAM and CPU power. That means if I have one of those lucky popular events (e.g. a super-popular web post, or piles of downloads), my VPS won’t be able to handle it, whereas a shared server would have the power to do so.

So far, in the 30 hours since the third “issue” was “corrected”, pingdom has reported 3 incidents. I could live with that, but it’s still more than the less than one incident per day that should be the norm.
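To put the incident counts in this post on the same footing, here is a rough Python sketch that scales each count to a per-day rate. The helper name `incidents_per_day` is made up for illustration, and it assumes “two weeks” means exactly 14 days and that the 150 incidents in the first ticket are a combined total for all the monitored sites.

```python
def incidents_per_day(incidents: int, hours: float) -> float:
    """Scale an incident count seen over `hours` to a 24-hour rate."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return incidents * 24.0 / hours


# Figures quoted in the post; the 14-day window is an assumption.
first_ticket = incidents_per_day(150, 14 * 24)  # 150 incidents in two weeks
since_fix = incidents_per_day(3, 30)            # 3 incidents in 30 hours

print(f"Before the first ticket: {first_ticket:.1f} per day")
print(f"Since the third fix:     {since_fix:.1f} per day")
```

Under those assumptions, the rate before the first ticket works out to roughly 10.7 incidents per day, and the rate since the third fix to 2.4 per day, which is well down but still above the norm of less than one per day.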

If another build-up of downtime happens, I’ll definitely be asking Netfirms to try to find who or what is causing this, or otherwise to move my sites to a different server. As long as the problem is not happening on or to one of my own sites, that should fix things and bring back good reliability.

The joys of websites.
