  #1  
Old 25.11.2007, 20:54
The Architect
 
Join Date: May 2005
Location: Zollikon, Switzerland
Posts: 2,995
Groaned at 3 Times in 3 Posts
Thanked 418 Times in 115 Posts
mark has a reputation beyond repute
Recent forum downtime

Update 09.12.2007: Photo system back online. Details here.
Update 06.12.2007: 500 missing attachments restored. Details here.

The last four days have seen the biggest outage in English Forum's history. Between early Thursday morning and late Sunday evening I know that many of you were unable to get your "fix". Your employers probably noticed that the amount of actual work you did shot up, and maybe you rediscovered some important parts of your personal lives with all that free time. I wish I could say the same about the last four days...

Before we talk about what happened and why, let's start with what we have right now. This is mostly good news.
  • The forum has lost only three hours of messages. These hours were in the early morning of Thursday 22nd November, when most of you were asleep. So for practical purposes - no posts were lost.
  • Almost everybody has their same avatar and/or profile picture. Some of you who had updated it recently will find that it doesn't display. You can easily fix this by uploading it again.
  • Some forum functions such as the photo gallery do not work. They will be fixed over time. In the case of the photo gallery many of the pictures were lost.
  • All files attached to posts were lost. You can read about how you can help fix this situation here.
So all in all, as far as forum disasters go, I'm quite happy with the result. A lot of other stuff has been lost, but it's all behind the scenes stuff to do with the smooth maintenance of the server. In my own personal case it represents a lot of time and effort down the drain. Hopefully I'll be faster with some stuff the second time around.

So what happened and why did it take so long to fix? I'll try to keep the technical information for the end, because I realise that most of you aren't interested, but some of you are. Early in the morning on Thursday I discovered that the server that hosts English Forum was down. This is my own server, which lives in a data center (raised floors, aircon, backup power and all that) and it also hosts many other domains, though as far as traffic and server load go, English Forum takes up most of the resources.

To cut a long story short, in the process of trying to resolve the situation (which was caused by virtual disc files expanding to the point where the system which carried them ran out of space) I lost the server. Totally. Toasted. I had to reinstall it from scratch and go to my backups.

So why did it take four days? Well that's where the fun began. Because I'm a cautious kind of fellow I have all the backups made to a completely separate (non-virtual, physical) drive. This drive was not affected, but I quickly found that most of my backups were corrupted and unusable. To make matters worse, the magic files I needed which tie all the backups together were not there. Why? The crash happened in the middle of a backup. Why didn't I have other backups? Well I should have, but it turns out the script which was supposed to save old copies of backups had not been working (just the script which was supposed to stop the discs becoming full).
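
A script that saves old copies of backups can be quite small. The following is only a minimal sketch in Python, not the script that ran on the server; the function name `archive_backup`, the timestamped file naming and the default of seven retained copies are all assumptions made for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def archive_backup(latest: Path, archive_dir: Path, keep: int = 7,
                   now: Optional[datetime] = None) -> Path:
    """Copy the latest backup into archive_dir under a timestamped name,
    then delete the oldest copies so that at most `keep` remain.

    All names and the retention count are hypothetical.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    archive_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = archive_dir / f"{stamp}-{latest.name}"
    shutil.copy2(latest, target)
    # The timestamp prefix sorts chronologically, so a plain name sort
    # lists the oldest copies first.
    copies = sorted(archive_dir.glob(f"*-{latest.name}"))
    for old in copies[:-keep]:
        old.unlink()
    return target
```

Because the pruning step deletes files, a script like this is worth checking from time to time: if it fails silently, the older copies are simply never created.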

Some of you work in the IT industry, and some of you may have written disaster recovery plans. I've written a few in my time, but they are very seldom tested and are often a work of fiction. Anyone who has been through this themselves will tell you that a backup is worthless unless you actually try to restore it. But I digress.

Faced with the reality of the situation I had to work with what I had. I had to manually cobble together the 60 or so other domains on the server and restore what I could from various collections of older backups. It also took time to deal with all the unhappy customers (all were really great and understanding, which helped a lot). Since English Forum would kill an unoptimised server (due to the high load), I couldn't even start to put it in place until a lot of other groundwork was in place and the other sites were up and running. All in all, considering the amount of corruption on the backup drive, pretty much everyone got almost everything back.

We have no loss of message content because I back the English Forum database up every hour. Call me paranoid, but with the number of posts I knew that in a disaster situation losing posts is bad for a forum. Unfortunately everything else around the forum was a little different. I did have backups, but most were unusable. The reason we lost a couple of hours was because the first two database backups were corrupt, but the third was ok.
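
Falling back from a corrupt backup to the next older one can be automated. The sketch below is an illustration only: it assumes the hourly dumps are gzip-compressed files matching `*.sql.gz` in a single directory, which the post does not state, and the function names are hypothetical. It walks the files from newest to oldest and returns the first one that decompresses cleanly.

```python
import gzip
import zlib
from pathlib import Path
from typing import Optional


def is_readable_gzip(path: Path) -> bool:
    """Return True if the whole file decompresses without an error.

    Reading to the end makes the gzip module check the stored CRC.
    """
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(1 << 20):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        return False


def newest_usable_backup(backup_dir: Path) -> Optional[Path]:
    """Return the most recently modified *.sql.gz file that passes the
    gzip check, or None if every candidate is damaged.

    The *.sql.gz naming is an assumption, not taken from the post.
    """
    candidates = sorted(
        backup_dir.glob("*.sql.gz"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    for candidate in candidates:
        if is_readable_gzip(candidate):
            return candidate
    return None
```

A clean decompression only shows that the archive is intact. It does not prove that the dump inside would restore into a working database, so an actual test restore is still needed.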

Why were my backups corrupted? I don't know. Could have been a physical problem with the disc, or the operating system may have done it. If you have a file of several hundred megabytes, it only takes a single byte of corruption to destroy it. The larger the files, the higher the failure rate. My backup drive is less than a year old, and I'd never seen any problems with it. Suffice to say, I'll be checking the integrity of the backups more thoroughly from now on.
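
One common way to check the integrity of backups is to record a checksum when each file is written and compare it again later. The sketch below assumes a hypothetical convention of storing a SHA-256 digest in a `.sha256` file next to each backup; it illustrates the idea and is not the procedure used on this server.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that files of several hundred megabytes do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(backup: Path) -> Path:
    """Store the digest in a '.sha256' sidecar file (hypothetical naming)."""
    sidecar = backup.with_name(backup.name + ".sha256")
    sidecar.write_text(sha256_of(backup) + "\n")
    return sidecar


def verify_checksum(backup: Path) -> bool:
    """Return True only if the backup still matches its recorded digest."""
    sidecar = backup.with_name(backup.name + ".sha256")
    if not sidecar.exists():
        return False
    return sidecar.read_text().strip() == sha256_of(backup)
```

A check like this catches a file that changes after it was written, even by a single byte. It cannot catch a backup that was already damaged when the digest was recorded, which is one more reason to try a real restore now and then.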

Unfortunately I still have much work to do, so you won't see me on the forum very often in the next couple of weeks, but rest assured I'll be thinking of ways to avoid a repeat of this incident. Anyone who has worked in this field knows it's the sort of thing that can keep you awake at night.

I'd like to thank Lob Rockster, gregv and swissbob who helped me out with a few things during the recovery. I'm sure they and the rest of the hard working EF moderator team will be doing their best over the next few weeks to help us bounce back from this. I'd also like to thank those of my commercial customers who host their domains on the same server for their patience and understanding through this difficult time.
__________________
Help: How to get more from searches | How to use tags | Please do not PM me for help on permits, jobs or salaries

Last edited by mark; 12.12.2007 at 02:08. Reason: removing forum downtime message - completed
The following 104 users would like to thank mark for this useful post:
adrian76, ali_the_nomad, BenK, Big Dream, billie, Blonaybear, Bookworm, Brianb_ie, brüder, Bubugala, caninecounselor, chipmaker, chrisch, ChrisW, clive7, Colonelboris, Cooper, CornerFlax, couta, Crumbs, dalehauskins, Darkphoenix, dbsb, Dodger, draculin, eejit, flow23, Galatea, Gav, grumpygrapefruit, helencho, i-b-deborah, Jack, jannewbold, Jekyll, jemma, jonnyt, jot, kfcfriend, Kittster, konijn, krlock3, lucy_sg, magyir, mannie organ, Maple Leaf, Mark75, Martin79, Mikeybroomers, mila cruz, mimi1981, miniMia, MissBehaving, muze7, möpp, Natasha, NatsBrit, Oldhand, Ollie, outrage, Pacman, panamahat, pat, PlantHead, Polorise, quinallex, readingsteve, ric, Ritchi, Rob, robban, RSargeant, SamC, SamCole, Scott, seb23, Simon, smackerjack, Smitty, southie, Suermel, Sutter, telandy, terryhall, thoean, tigerli, tildaoz, timpy, Uncle Max, undercovermoles, Woodsie, WorldTraveller, Yorkie
  #2  
Old 26.11.2007, 02:17
The Architect
 
Join Date: May 2005
Location: Zollikon, Switzerland
Posts: 2,995
Groaned at 3 Times in 3 Posts
Thanked 418 Times in 115 Posts
mark has a reputation beyond repute
Re: Recent forum downtime

Changed the content of the initial post with more useful info and opened the thread for discussion.
  #3  
Old 26.11.2007, 02:47
Member
 
Join Date: Sep 2007
Location: Zurich
Posts: 169
Groaned at 1 Time in 1 Post
Thanked 68 Times in 48 Posts
eejit has made some interesting contributions
Re: Recent forum downtime

Having been a database administrator in a past life, I have had my share of weekends spent restoring from backups.

All your (and the other admins') effort is very much appreciated. I'm off to visit that Donate button. Even if you're doing it as a hobby, some beers are probably in order.
  #4  
Old 26.11.2007, 08:06
Forum Veteran
 
Join Date: Mar 2006
Location: Zurich
Posts: 603
Groaned at 1 Time in 1 Post
Thanked 441 Times in 205 Posts
Idgie has a reputation beyond repute
Re: Recent forum downtime

As always: Thank you so much, Mark. I am time and again impressed with the care and effort with which you run this forum to the benefit of us all. I wish you a good "personal recovery" after all this extra work.

Thanks!!!!
Idgie
  #5  
Old 26.11.2007, 08:27
Forum Veteran
 
Join Date: Jun 2007
Location: Zürich
Posts: 898
Groaned at 6 Times in 5 Posts
Thanked 665 Times in 338 Posts
Woodsie has a reputation beyond repute
Re: Recent forum downtime

Legendary effort mate. I was having nightmares that all the info on the forum would be lost and we would have to start from scratch. Thanks for all the hours and hard work getting it back up and running again. Bet you'll be testing those recoveries now.
  #6  
Old 26.11.2007, 08:56
Forum Veteran
 
Join Date: Oct 2006
Location: Thurgau
Posts: 2,060
Groaned at 5 Times in 4 Posts
Thanked 634 Times in 378 Posts
telandy has an excellent reputation
Re: Recent forum downtime

Lost this weekend without the forum, that Donate button will be pressed. Fantastic job and much appreciated by all.
  #7  
Old 26.11.2007, 09:44
Forum Veteran
 
Join Date: Apr 2007
Location: Lausanne / Weybridge UK
Posts: 501
Groaned at 10 Times in 5 Posts
Thanked 241 Times in 166 Posts
NatsBrit is considered knowledgeable
Re: Recent forum downtime

Thank you guys. Your hard work is VERY MUCH appreciated. My weekend was just not the same without my hourly fix of EF!!!!

Thanks again
  #8  
Old 26.11.2007, 10:14
Forum Veteran
 
Join Date: Jul 2007
Location: Lörrach/DE
Posts: 644
Groaned at 6 Times in 6 Posts
Thanked 568 Times in 294 Posts
Dodger has an excellent reputation
Re: Recent forum downtime

Great work, Mark. And thanks for getting everything up and running once again.

It's a good thing that it was just a server crash. At first I thought that the IT dept at work had blocked EF: now that would be a real disaster...
  #9  
Old 26.11.2007, 10:45
Guest
 
Posts: n/a
Re: Recent forum downtime

Mate I feel your pain, been there got the t-shirt. It's just very unfortunate that the backups went as well.

One customer I worked with in the past insisted on backing up 3 TB of data to tape, which took about 3-4 days to carry out. For some reason they couldn't get through their thick heads that if the server/disks went they'd lose up to 4 days of data.

This warehouse was continually growing to add insult to injury due to their poor archiving/it management of records.

Well after 2yrs of all roses and sunshine it finally happened, and you can warn people all you like.

I just regurgitated the e-mails I'd held in a little folder warning them every 3 months that if it happened they would be in big trouble to their managers.

Oh well I got a gold star and 3 managers who ignored the advice were shown the door! It may look good on your budget now, but in the end it will cost you a job..

What was really funny was the data which was being held. I'm not going to go into this, but it's along the lines of criticality to a UK govt system, with what was lost on 2 cds by courier in England..(I might send you a pm Mark with the actual data stored, should bring a smile after all the carnage this w/e)

So again well done mate and time to write a disaster recovery plan to cover the disaster recovery plan I reckon. (Damn tape drives eh!)

It's only now you begin to realise what a community EF is and how many people rely on its content. Another time to step back and be proud of your creation mate I reckon <raises glass>

Time for a beer I reckon...You + Helpers have earned it.

All the best

Karl
  #10  
Old 26.11.2007, 10:45
Guest
 
Posts: n/a
Re: Recent forum downtime

phewwww

Thanks a lot guys for putting so much effort into 're-animating' the EF back to life again. It's much appreciated by me as well!!
  #11  
Old 26.11.2007, 11:43
Forum Veteran
 
Join Date: May 2007
Location: Meisenberg Zug
Posts: 863
Groaned at 19 Times in 11 Posts
Thanked 284 Times in 182 Posts
Zug bound is considered knowledgeable
Re: Recent forum downtime

I don't have the foggiest idea what you were talking about. I am just grateful it is back up. I thought I had buggered up our home computer when it first went down. It wouldn't be the first time, I am ashamed to say.
Thank you so much, you are all stars
  #12  
Old 26.11.2007, 11:45
Junior Member
 
Join Date: Mar 2007
Location: Lutry
Posts: 57
Groaned at 0 Times in 0 Posts
Thanked 48 Times in 21 Posts
stout2104 has made some interesting contributions
Re: Recent forum downtime

I would also like to thank you for the hard work. Working in IT myself, I can feel the pain.

May I request that part of my donation goes into buying your wives/girlfriends/partners a bunch of flowers? I am sure they did not see much of you this weekend and should be thanked as well!
  #13  
Old 26.11.2007, 11:51
Guest
 
Posts: n/a
Re: Recent forum downtime

I thought our IT dept had blocked it because it's open on my computer all day. Got a bit paranoid until I saw the message about the crash later in the day.

Got absolutely no idea about computers and servers and their internal gubbins but hats off to the sterling work you made at the weekend putting the widgets, nuts and bolts back together on the EF.
  #14  
Old 26.11.2007, 12:08
JVC
 
Posts: n/a
Re: Recent forum downtime

My thanks too Mark, from another one who realises what a pain this kind of thing can be.
  #15  
Old 26.11.2007, 12:21
Member
 
Join Date: Apr 2007
Location: ZH
Posts: 116
Groaned at 1 Time in 1 Post
Thanked 15 Times in 12 Posts
DeadLast has no particular reputation at present
Re: Recent forum downtime

You f-*/)(§ useless /&%! Get yer backups in %&$=/ order! I mean, I always have everything reliably backed up and.. oh no.. wait..

(Just kidding! Thanks for all your efforts, Mark. Been there a few times myself, though in my case almost always through my own stupid fault. Last few days have been v. boring without EF.)
  #16  
Old 26.11.2007, 12:27
solidsnake
 
Posts: n/a
Re: Recent forum downtime

@mark

Thanks for the efforts Mark -- as an IT person myself, I can feel your pain

May I suggest that you check out rsnapshot (www.rsnapshot.org). It's a great system for keeping backups always available. I use this in combination with regular tape backups and they have saved my hide more than once.

cheers,

Ben
  #17  
Old 26.11.2007, 13:20
Forum Veteran
 
Join Date: Apr 2007
Location: Used to be Zurich
Posts: 1,706
Groaned at 94 Times in 64 Posts
Thanked 1,989 Times in 870 Posts
My2pups has a reputation beyond repute
Re: Recent forum downtime

Mark - I am sure that you have had loads of suggestions and probably don't need any more, but I use a company called SIAG for personal and corporate backup solutions (www.siag.ch). They are Swiss, and your data gets stored inside a mountain in Gstaad (kind of cool). The actual product is called SwissVault. Check it out.

fduvall
  #18  
Old 26.11.2007, 13:37
Forum Veteran
 
Join Date: Mar 2007
Location: Die Südkürve
Posts: 1,790
Groaned at 11 Times in 10 Posts
Thanked 1,030 Times in 552 Posts
terryhall has a reputation beyond repute
Re: Recent forum downtime

Yep, cheers Mark - sorry for your lost weekend

I used to run a forum myself a few years ago. Thankfully we never had any problems along these lines, but just from knowing a small bit of the back end of a vBulletin forum I can imagine how much has gone on unseen and needs to be put back - ouch

Just to echo everyone else's sentiments, I was gutted on Thursday and Friday that I actually had to do some work for a change

(p.s. - have put all my attachments back in as requested in the link)
  #19  
Old 26.11.2007, 15:24
Forum Veteran
 
Join Date: Apr 2007
Location: Zurich
Posts: 1,201
Groaned at 2 Times in 2 Posts
Thanked 1,079 Times in 547 Posts
möpp has a reputation beyond repute
Re: Recent forum downtime

A huge thank you to Mark and all others who've got the system working again. Can't even begin to imagine the gallons of coffee needed for that!
  #20  
Old 26.11.2007, 17:25
Member
 
Join Date: Jan 2007
Location: Lausanne
Posts: 175
Groaned at 3 Times in 3 Posts
Thanked 68 Times in 35 Posts
SamC has no particular reputation at present
Re: Recent forum downtime

Cheers Mark et al. I was sooooo glad to be able to get my fix again today.

Tags
downtime, server crash



