How to get the sack in IT


SimonD
21st March 2007, 08:56 PM
http://www.smh.com.au/news/technology/oops-how-to-ctrl-alt-delete-48-billion/2007/03/22/1174153207365.html

You can just see the techo's face draining of color.

Good ol' format c: /u - it always fixes the problem

Zep
21st March 2007, 10:46 PM
Jeez, I've been in the IT industry for over 30 years, so I'm one of the old birds who flew into the wire and lived to tell the tale. And everyone laughs at me when I keep repeating to the bold young guns, "Have you got a working backup? Where's your working backup? Make sure you've got a working backup!". I get stuff like "Naw, we don't need it - the RAID will be safe enough!"

Peh!

And yet, when the excrement impacts the spinning turbine blades... :eek: ...they turn to me and beg me to rescue them. So I tell them, "Have you got a working backup?..." ;)

Hokulele
21st March 2007, 11:38 PM
*Hokulele remembers something and scurries off to make sure the backup system is really working.*

Hokulele
21st March 2007, 11:40 PM
I remember when I was working in IT how one department answered "Yup, our backup tape fires up automatically every night at 9:00pm." When we went to check, there was a lovely pile of blank tapes from the last two months. Not a single byte of data. Not only should one make a backup, one should also check to make sure it contains everything one thought it did.

Zep
21st March 2007, 11:46 PM
That particularly nasty gotcha is far more common than most IT managers of today would like to know about... :eek:

The best way to tell if your backup worked is to actually do a full restore off the backup media, then confirm the restored data/system does what the brochure says it should by pounding it. With real users. Otherwise you have NOT done a backup, and this thought will nag your conscience until you DO do it properly! ;)
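For the mechanical half of that check (the pounding by real users can't be scripted), here is a minimal sketch that compares a restored tree against the live one, file by file. It assumes the restore lands in an ordinary directory; `compare_restore` and the other names are made up for illustration, not taken from any backup product.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_restore(source: Path, restored: Path) -> dict[str, list[str]]:
    """Report files missing from, changed in, or extra in the restored tree."""
    src, dst = tree_digests(source), tree_digests(restored)
    return {
        "missing": sorted(set(src) - set(dst)),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
        "extra": sorted(set(dst) - set(src)),
    }
```

A pile of blank tapes would show up here as every file listed under "missing". Files that legitimately changed since the backup ran will appear under "changed", so this is best run against a snapshot taken at backup time.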

ob986s
22nd March 2007, 07:28 AM
As I am one of those terrible people known as IT sales guys, I have been on the other side of these issues for years.

When you guys forget to check the back-up, it's guys like me that get pounded on for selling you a "back-up solution that doesn't work".

For the record, I thank you guys for testing your back-ups, as it will save the skins of my fellow resellers :)

Also for the record, I sold Tivoli (TSM, ADSM) backup solutions with IBM tape solutions for years. I am now, thankfully, out of hardware but still do DR solutions.

Thanks for posting, that article will now be used by me for marketing purposes


Jon

SphereGuy
22nd March 2007, 10:35 AM
Our critical data is backed up twice to two different locations, then one of those locations backs up the backup file. Each night. This is in addition to the RAIDs and mirrored drives. Tested once a week on a random file or database. Yes, it will never happen to me...again.
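A weekly random-file test like that could be scripted roughly as follows. This is only a sketch under assumptions: the backup is taken to be a plain mirror of the source directory, and `spot_check` is a hypothetical helper, not anyone's actual setup.

```python
import filecmp
import random
from pathlib import Path


def spot_check(source: Path, backup: Path, rng: random.Random):
    """Pick one random file under source and compare it byte for byte
    with the file at the same relative path under backup.

    Returns (relative_path, ok).
    """
    files = sorted(p for p in source.rglob("*") if p.is_file())
    if not files:
        raise ValueError("nothing to check: source tree has no files")
    chosen = rng.choice(files)
    rel = chosen.relative_to(source)
    copy = backup / rel
    # shallow=False forces a content comparison rather than trusting
    # file size and modification time alone.
    ok = copy.is_file() and filecmp.cmp(chosen, copy, shallow=False)
    return rel.as_posix(), ok
```

Passing in the `random.Random` instance makes a failed check repeatable with the same seed. One random file a week only samples the backup; it complements a periodic full restore rather than replacing it.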

bruto
22nd March 2007, 03:29 PM
I probably mentioned this once before, but a few years ago I went to a huge barn sale in which the residue of a now-defunct business was being sold, including a bunch of old computer programs, disks, books etc. (and a nice Hewlett Packard calculator for a buck ;) ), and among them were cases and cases of floppies to which various programs and large amounts of data had been backed up. Spreadsheets, accounts, years of customer records, you name it. I bought a bunch simply for the cases. Whoever had done the backups had decided to economize by formatting DD disks, and even single sided disks, as HD. I only bought a few dozen, but not a single one passed a disk check.

jimlintott
22nd March 2007, 03:44 PM
I've learned (by others' mistakes) that checking that your backups work is as important as the backups themselves. I worked in a retail store where the accounting software corrupted its files and sure enough the backups didn't work. Really a bad deal.

I know I probably shouldn't say it but I will anyway. When I read this on Slashdot yesterday I got the impression that it all started when an IT guy was doing a format and a reinstall of a popular OS. He goofed and formatted the wrong drive. So I lay some of the blame on an OS where formatting and reinstalling to fix a problem is an acceptable and sometimes only solution. (Hey, I felt the urge to slam Windows. :))

Ultimately the fault lies at the feet of the guy who screwed up a simple reinstall.

It's scary to think of how much money and time is tied up in backups that might not work.

ETA: Watching the look of fear turn to one of relief in your boss's face is one of the rewards for backups that work.

DoubtingStephen
22nd March 2007, 05:10 PM
I used to do field service on big network management systems. I'd ask my customers when they last did a backup, and if they did not know I'd walk them through it once.

Every now and then a customer would be dismissive of the need for doing backups, and I'd say that I had never met anyone that regretted making a backup.

Gord_in_Toronto
22nd March 2007, 08:34 PM
I used to do field service on big network management systems. I'd ask my customers when they last did a backup, and if they did not know I'd walk them through it once.

Every now and then a customer would be dismissive of the need for doing backups, and I'd say that I had never met anyone that regretted making a backup.

Oliver North?

Zep
22nd March 2007, 08:50 PM
Our critical data is backed up twice to two different locations, then one of those locations backs up the backup file. Each night. This is in addition to the RAIDs and mirrored drives. Tested once a week on a random file or database. Yes, it will never happen to me...again.

Trvth!

quixotecoyote
22nd March 2007, 08:51 PM
According to the article the tech didn't get blamed for it, oddly enough.

Zep
22nd March 2007, 08:57 PM
I know I probably shouldn't say it but I will anyway. When I read this on Slashdot yesterday I got the impression that it all started when an IT guy was doing a format and a reinstall of a popular OS. He goofed and formatted the wrong drive. So I lay some of the blame on an OS where formatting and reinstalling to fix a problem is an acceptable and sometimes only solution. (Hey, I felt the urge to slam Windows. :))

One of my scars is from allowing Windows to "take control" during an upgrade and do what it thought was "best". Bye-bye server primary system boot partition! :eek: I'm certain I went all sorts of pale grey-green colours when I discovered it (about 60 seconds into a full format...!). Scraped out of that one - used a second boot partition plus a tested tape backup (hallelujah!) to go back to the previous night. :relieved:

I now know: If you are doing any system partition upgrade or install, unplug/disconnect the disks you don't want touched! Sounds drastic, but not as drastic as explaining the disappearance of your SQL server plus payroll and accounting database to your boss... :boxedin: :blush:

Kopji
22nd March 2007, 09:53 PM
So yeah, I saw that article.
I thought, how do you end up managing backups in Alaska?
Do you get caught sleeping with the boss's wife? Lose those incriminating photos from last year's Christmas party? Fall asleep during a meeting with the CIO?

Mongrel
23rd March 2007, 10:25 AM
I now know: If you are doing any system partition upgrade or install, unplug/disconnect the disks you don't want touched! Sounds drastic, but not as drastic as explaining the disappearance of your SQL server plus payroll and accounting database to your boss... :boxedin: :blush:

Whilst they're not often used much above personal level, make sure you unplug any external USB hard drives that you archive to as well (that includes MP3 players) ;)

hodgy
23rd March 2007, 05:21 PM
Pah, you're all big-girls-blouses
I don't back up anything - I like the risk...

SphereGuy
24th March 2007, 09:31 AM
Even if you don't check your backups, don't you think you should at least check them before a major change/upgrade/maintenance?

Soapy Sam
25th March 2007, 03:57 PM
I had three copies of a lot of photos- several GB. One USB drive died.
To try to recover data from the faulty drive, I moved a load of files off a second USB drive to the EIDE drive on my laptop.
Which promptly crashed.
All three copies gone in one morning.

Fortunately, I managed to get the laptop drive back in action with no data loss. (WinXP died due to a battery problem).

They are all on DVD now- as well as other places.

CFLarsen
26th March 2007, 01:20 AM
So yeah, I saw that article.
I thought, how do you end up managing backups in Alaska?
Do you get caught sleeping with the boss's wife? Lose those incriminating photos from last year's Christmas party? Fall asleep during a meeting with the CIO?

Not managing backups everywhere else.

Diamond
26th March 2007, 03:13 AM
I had three copies of a lot of photos- several GB. One USB drive died.
To try to recover data from the faulty drive, I moved a load of files off a second USB drive to the EIDE drive on my laptop.
Which promptly crashed.
All three copies gone in one morning.

Fortunately, I managed to get the laptop drive back in action with no data loss. (WinXP died due to a battery problem).

They are all on DVD now- as well as other places.

That brings an interesting question: how do you back up digital data (like family pictures) for a long time? Is there a personalized, secure and reliable data archive out there?

CFLarsen
26th March 2007, 03:20 AM
That brings an interesting question: how do you back up digital data (like family pictures) for a long time? Is there a personalized, secure and reliable data archive out there?

How long is "long"?

SimonD
26th March 2007, 04:53 AM
thanks for posting, that article will now be used by me for marketing purposes
Jon

No worries

According to the article the tech didn't get blamed for it, oddly enough.

I was trying to be funny

bruto
26th March 2007, 08:09 AM
That brings an interesting question: how do you back up digital data (like family pictures) for a long time? Is there a personalized, secure and reliable data archive out there?

Family pictures work well on film. :D

Seriously, what medium will ever beat Kodachrome?

I think the only way to ensure long term survival is to recopy your data periodically to whatever is the best new medium of the moment and remember to check it periodically, including being sure that you have whatever is needed to read the information. Even if the information survives, it's no good if the equipment to read it does not.
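The "recopy and check" routine can be made less of a chore with a checksum manifest written when the archive is first made and re-verified after every recopy. A minimal sketch, assuming the manifest is a JSON file kept outside the archived folder; the function names are invented for illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """SHA-256 of a whole file (reads it into memory; fine for photos)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_manifest(root: Path, manifest: Path) -> None:
    """Record a digest for every file under root.

    Keep the manifest file outside root so it does not list itself.
    """
    entries = {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    manifest.write_text(json.dumps(entries, indent=2, sort_keys=True))


def verify_copy(root: Path, manifest: Path) -> list:
    """Return the recorded files that are missing or no longer match."""
    entries = json.loads(manifest.read_text())
    bad = []
    for rel, digest in entries.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != digest:
            bad.append(rel)
    return sorted(bad)
```

Run `write_manifest` once on the original, then `verify_copy` against each new medium after copying and again at every periodic check. An empty list means every recorded file is present and bit-identical. It says nothing about whether you can still open the format, which is the other half of the problem.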

Alareth
1st April 2007, 05:30 PM
How long is "long"?

Let's poll the porn industry to find out.

Arkan_Wolfshade
6th April 2007, 09:56 AM
Jeez, I've been in the IT industry for over 30 years, so I'm one of the old birds who flew into the wire and lived to tell the tale. And everyone laughs at me when I keep repeating to the bold young guns, "Have you got a working backup? Where's your working backup? Make sure you've got a working backup!". I get stuff like "Naw, we don't need it - the RAID will be safe enough!"

Peh!

And yet, when the excrement impacts the spinning turbine blades... :eek: ...they turn to me and beg me to rescue them. So I tell them, "Have you got a working backup?..." ;)

Hahhahahaha! I've had two drives fail, one on each channel, of a RAID 5-0 before at the same time. Damned straight I was glad I had full/differential/transactional backups.

baron
6th April 2007, 10:15 AM
The tech is one of the last people who should take the blame. Head of IT, for one, should have been fired instantly, followed quickly by any team leaders who had the slightest input into, or knowledge of, the data security procedures. This story is so bizarre I'm wondering if something hasn't been missed out. Even a basic tape backup procedure would have prevented this, let's not even go into the world of online backups, SANs and data replication. Amazing.

rockoon
9th April 2007, 09:14 AM
It's not like it's something that we haven't all done before.

SphereGuy
12th April 2007, 11:14 AM
It's not like it's something that we haven't all done before.

Okay, show of hands, how many people here cost their company millions of dollars because you didn't do a simple back up?

Freethinker
13th April 2007, 08:12 AM
Okay, show of hands, how many people here cost their company millions of dollars because you didn't do a simple back up?

I've saved mine that much because I didn't throw out unofficial CD backups I did on my personal machine every week. Working copy crashed and a tech trashed the on-site backup. Off-site backup couldn't be located. :bwall: I pulled an unauthorized, unverified, audit-failing backup from my drawer, and using my personal change notes managed to get the system working again in a few weeks. Still cost them hundreds of thousands to get everything verified.

The reason I backed up my own stuff is that I was burned before by admins forgetting to change the tapes.

Worst part was I didn't get any recognition for saving their butts because it would have required them to acknowledge that their major screw-up was fixed by an engineer who violated policy.

Azure
13th April 2007, 06:24 PM
Okay, show of hands, how many people here cost their company millions of dollars because you didn't do a simple back up?

Does 100 bucks count?

:p

delphi_ote
13th April 2007, 11:50 PM
The department is asking lawmakers to approve a supplemental budget request for $US220,700 to cover the excess costs incurred during the six-week recovery effort, including about $US128,400 in overtime and $US71,800 for computer consultants.
They stored hundreds of thousands of dollars worth of data tracking billions of dollars worth of data on a single hard drive with one backup hard drive and gave one tech easy access to both? You've got to be *********** kidding me. When I worked at a data center, anyone who even thought of an idea like this would've been murdered on the spot!
Former Revenue Commissioner Bill Corbus said no one was ever blamed for the incident.

"Everybody felt very bad about it and we all learned a lesson. There was no witch hunt," Corbus said.
Of course not. Systematic problems are never anyone's fault. Let's all give ourselves a hug and a raise.

delphi_ote
13th April 2007, 11:51 PM
Head of IT, for one, should have been fired instantly, followed quickly by any team leaders who had the slightest input into, or knowledge of, the data security procedures.
Exactly.

geni
14th April 2007, 12:14 PM
The tech is one of the last people who should take the blame. Head of IT, for one, should have been fired instantly, followed quickly by any team leaders who had the slightest input into, or knowledge of, the data security procedures. This story is so bizarre I'm wondering if something hasn't been missed out. Even a basic tape backup procedure would have prevented this, let's not even go into the world of online backups, SANs and data replication. Amazing.

At a guess I would say that security and privacy legislation was an issue.

delphi_ote
14th April 2007, 12:21 PM
At a guess I would say that security and privacy legislation was an issue.
And yet one tech clearly had read/write access to both the original and backup. Security and privacy my ass!

geni
14th April 2007, 01:44 PM
And yet one tech clearly had read/write access to both the original and backup. Security and privacy my ass!

But he had been vetted (ha). Outside contractors would not have been.

delphi_ote
15th April 2007, 01:29 AM
But he had been vetted (ha). Outside contractors would not have been.
You're right. No way an inside man would have a conflict of interest. This organization is clearly top notch! :p

fishbob
16th April 2007, 07:31 PM
http://www.smh.com.au/news/technology/oops-how-to-ctrl-alt-delete-48-billion/2007/03/22/1174153207365.html

You can just see the techo's face draining of color.

Good ol' format c: /u - it always fixes the problem


I don't remember this story making the local news.
The fix must be in.

NiallM
17th April 2007, 07:20 AM
Been there and seen it happen many times.

One mainframe product which I supported was a really powerful tool which allowed you to migrate data while systems were in full flight.

Storage manufacturers loved the package, because they could install new drives, and migrate all data to them without the customer experiencing any downtime.

I got a call from a field engineer one day. He'd migrated 15 TB of data, after which he discovered that he had forgotten to run an agent on a system which shared the data. Updates from that system would be lost. Did I have any suggestions?

The only question I could ask him was the "Do they have a good backup?" one. He knew it too. I felt so bad for the guy. He told his management and the customer, helped them do a full system restore, and collected his cards.

Easy mistake. Just like the operator in a bank who intended to vary two tape drives offline. The command was: "V F000-F001, offline"

The first "f" wasn't typed hard enough.

"v 0000-F002, offline" was interpreted by the console which then varied every disk, console, network device, ATM, etc. in that range, offline. He picked up his coat and left.
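To put a number on how much one dropped keystroke widened that command, here is the arithmetic on the two address ranges exactly as quoted above. It is a sketch that treats each range as inclusive four-digit hex addresses; `devices_in_range` is a made-up helper, not how the console parses anything.

```python
def devices_in_range(spec: str) -> int:
    """Count addresses in an inclusive hex range such as 'F000-F001'."""
    low, high = (int(part, 16) for part in spec.split("-"))
    if low > high:
        raise ValueError(f"range runs backwards: {spec}")
    return high - low + 1


print(devices_in_range("F000-F001"))  # 2
print(devices_in_range("0000-F002"))  # 61443
```

Two tape drives versus up to 61,443 possible device addresses, although only the addresses with real devices behind them would actually have gone offline.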