
Tell me the last month of the forum didn't go bye-bye...


Recommended Posts

  • Members

Tell me there's a legit reason for feeling this is a slight to all you hold important in (forum) life. Or are you just feigning concern for the sake of setting up a complaint like the rest of the well-groomed effeminates here?


  • Members

It looks like the data is gone (coming back?) but that the software update from a few weeks ago is still in place (the one that fills your inbox with notices about threads you've posted in having replies).

 

FWIW, if any of the tech team is listening... I have a near-perfect data-loss record going back to 1998. My formula starts with either RAID 10 or a variant thereof that uses three mirrors of stripes. Each stripe lives in a separate enclosure and contains a complete copy of the data. Each enclosure is connected by Fibre Channel (or SCSI back in the day) to two or more hosts, any of which can "own" the filesystem (one host at a time). All frequently changing data lives on the external storage; internal storage holds only the OS and swap, with internal RAID 1+0. (Note - Sun/Solaris bias.) I tend to prefer smaller disks for increased I/O bandwidth -- i.e. I will spec 4x72GB instead of 1x300GB every time.
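For a back-of-the-envelope feel for that small-disks-over-big-disks trade-off, here's a toy Python model. The function names and numbers are mine, purely illustrative, not from any real deployment:

```python
# Toy model of the layout above: three mirrors of stripes, each stripe a
# full copy of the data in its own enclosure. Numbers are illustrative.

def usable_capacity_gb(disks_per_stripe, disk_gb):
    """Each stripe holds one complete copy; extra mirrors add redundancy, not space."""
    return disks_per_stripe * disk_gb

def raw_capacity_gb(disks_per_stripe, disk_gb, mirrors=3):
    return disks_per_stripe * disk_gb * mirrors

# The 4x72GB-over-1x300GB preference: similar usable space per stripe,
# but four spindles instead of one to serve concurrent I/O.
print(usable_capacity_gb(4, 72))      # 288 GB usable per stripe
print(raw_capacity_gb(4, 72))         # 864 GB of raw disk behind it
print(usable_capacity_gb(1, 300))     # 300 GB usable, but only 1 spindle
```

The point being: you pay roughly 3x raw disk for the triple mirror, but the per-stripe spindle count is what buys you I/O parallelism.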

 

Three mirrors of stripes is especially useful for doing "cold snapshot" backups - you disconnect one stripe, leaving yourself with RAID 10. Then bring the database copy up on a spare host, get it to a recoverable state, shut it down clean, and archive it. Less fraught with peril than trying to keep track of 5 years' worth of archive logs. (Note - Oracle bias.)
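The safety property that makes this work can be sketched in a few lines of Python. This is placeholder logic only (a real detach goes through the volume manager, not Python), but it shows why you need three copies before detaching one:

```python
# Sketch of the "cold snapshot" trick: detach one of three mirrors and the
# live array is still a redundant RAID 10 while the backup runs.

def detach_one_mirror(mirror_count):
    """Hypothetical helper: split one mirror off for a cold backup."""
    if mirror_count < 3:
        raise RuntimeError("refusing: detach would leave no live redundancy")
    live = mirror_count - 1   # 2+ copies remain => still a mirror of stripes
    snapshot = 1              # frozen copy to recover, quiesce, and archive
    return live, snapshot

live, snap = detach_one_mirror(3)
assert live == 2 and snap == 1   # still redundant during the backup window
```

With only two mirrors, detaching one would leave a single unprotected copy serving production, which is exactly the exposure the third mirror exists to avoid.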

 

RAID 5 is "verboten," as a single-disk failure leaves you with very poor performance and a long recovery time. Hardware RAID controllers in the hosts are also never used, as these tend to have poor failure recovery characteristics -- like when you order a replacement from the vendor and find out that they've changed something in the firmware to make it "better," but now you can't read your data. Or the card is just not available anymore. Hardware RAID 5 is the worst of both worlds: it looks like you have a good solution in a board room, but the minute you hit a failure in the datacenter, you are foxtrot uniform charlie kilo echo delta.
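The "very poor performance" part of degraded RAID 5 falls out of the parity math itself. A toy byte-level demo (assuming simple XOR parity, which is what RAID 5 uses; the 4-disk layout here is invented for illustration):

```python
# Why degraded RAID 5 hurts: a read that lands on the failed disk must
# read *every* surviving disk and XOR them together to reconstruct the
# data. A degraded mirror still answers from a single disk.

from functools import reduce

def parity(blocks):
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, byte_tuple)
                 for byte_tuple in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # 3 data disks (toy stripe)
p = parity(data)                      # the parity disk

# Disk 1 dies; rebuild its block from the surviving disks plus parity:
survivors = [data[0], data[2], p]     # N-1 reads instead of 1
rebuilt = parity(survivors)
assert rebuilt == data[1]             # correct data, expensive path
```

Every read miss on the dead disk pays that N-1-read, one-XOR cost until the rebuild finishes, which is the long window the post is warning about.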

 

My formula is also not particularly expensive. It's largely an upfront capital investment coupled with good management. Ongoing costs are mostly power (cooling) and increased replacement parts.


  • Members
Tell me there's a legit reason for feeling this is a slight to all you hold important in (forum) life. Or are you just feigning concern for the sake of setting up a complaint like the rest of the well-groomed effeminates here?

 

 

Call me effeminate again and I will hit you with my purse. :)


  • Members
Hardware RAID controllers in the hosts are also never used, as these tend to have poor failure recovery characteristics....

So you prefer to set up software RAID? Like, within the OS? Is there a drawback, say more processing overhead and reduced speed?

 

I see what you mean, though. I've had a RAID card {censored} itself and I wasn't able to import the array on a new card.


  • Members

Definitely prefer to set up software RAID. Even on the one occasion where I elected to use hardware RAID, I let the hardware handle the stripes but handled the mirroring in software. That occasion was a 3-enclosure triple mirror, each enclosure RAID 5. The enclosures each had 1GB write-behind cache, redundant power supplies, and redundant dedicated battery backup. I/Os per second were 5:1 reads:writes, with writes on average being twice as big as reads. It performed very well, except under a single-disk failure condition. If I had to do it again with the exact same hardware, I would prefer a couple of standby disks to going RAID 5. Bringing the standbys online would be more expensive, but it wouldn't be a continual resource drain until replacement, which might not happen for a day or two.

 

Technically speaking, there is more processing overhead to making two or three writes instead of one, but it is negligible, especially if you are setting up for maximum I/O and doing one FC host bus adapter per enclosure. There is no calculation involved when you're dealing with RAID 10, since the disk geometry is either identical (JBOD enclosures) or handled by the enclosure itself and there is no parity disk like in RAID5. It is also necessary to have external disks which do not communicate with one another -- necessitating software RAID -- when you are looking for appropriate levels of availability. At that point, failure-fencing also comes into play.
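The "negligible overhead, no calculation" claim can be made concrete by counting the I/Os on each write path. This is a toy comparison (the I/O counts are the textbook small-write figures, not measurements from the setup described above):

```python
# A 3-way software mirror just issues the same write three times.
# RAID 5 pays the classic small-write read-modify-write: read old data,
# read old parity, compute new parity, write both back.

def mirror_write_ios(copies=3):
    return {"reads": 0, "writes": copies}     # no parity math at all

def raid5_small_write_ios():
    # read old data + read old parity, then write new data + new parity
    return {"reads": 2, "writes": 2}

assert mirror_write_ios()["reads"] == 0       # pure fan-out, trivially cheap
assert raid5_small_write_ios()["reads"] == 2  # every small write pays 2 reads
```

The mirror's extra writes fan out in parallel (one HBA per enclosure, as above), whereas RAID 5's extra reads sit in the latency path of every small write.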

 

Note also that a software RAID 10 has virtually no penalty at all when it comes to reading data. The only decision it has to make is which disk to ask for the data, and it only has to read it from one disk...
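That "only decision" is small enough to fit in one line. A hypothetical scheduler sketch, assuming the driver tracks per-disk queue depth:

```python
# Mirror-read scheduling: any copy can answer, so ask the least busy one.

def pick_mirror(queue_depths):
    """Return the index of the mirror with the shortest request queue."""
    return min(range(len(queue_depths)), key=lambda i: queue_depths[i])

# One read, from one disk -- no reconstruction, no parity, no peer traffic.
assert pick_mirror([4, 1, 3]) == 1
```

In practice this is also why mirrors can serve more read IOPS than a single disk: independent reads land on different copies concurrently.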

 

I do this type of work for the telecom industry, where excellent performance is highly desirable, but data loss is absolutely unacceptable.


  • CMS Author
Definitely prefer to set up software RAID. Even on the one occasion where I elected to use hardware RAID, I let the hardware handle the stripes but handled the mirroring in software. That occasion was a 3-enclosure triple mirror, each enclosure RAID 5. The enclosures each had 1GB write-behind cache, redundant power supplies, and redundant dedicated battery backup. I/Os per second were 5:1 reads:writes, with writes on average being twice as big as reads. It performed very well, except under a single-disk failure condition. If I had to do it again with the exact same hardware, I would prefer a couple of standby disks to going RAID 5. Bringing the standbys online would be more expensive, but it wouldn't be a continual resource drain until replacement, which might not happen for a day or two.

 

Technically speaking, there is more processing overhead to making two or three writes instead of one, but it is negligible, especially if you are setting up for maximum I/O and doing one FC host bus adapter per enclosure. There is no calculation involved when you're dealing with RAID 10, since the disk geometry is either identical (JBOD enclosures) or handled by the enclosure itself and there is no parity disk like in RAID5. It is also necessary to have external disks which do not communicate with one another -- necessitating software RAID -- when you are looking for appropriate levels of availability. At that point, failure-fencing also comes into play.

 

Note also that a software RAID 10 has virtually no penalty at all when it comes to reading data. The only decision it has to make is which disk to ask for the data, and it only has to read it from one disk...

 

I do this type of work for the telecom industry, where excellent performance is highly desirable, but data loss is absolutely unacceptable.

 

Wes, has anyone from Nucleus or the HC admin team ever reached out to you? If not, they should. My end of things is more applications than infrastructure, but you don't work in the biz for over two decades without picking up on all aspects, even if only by osmosis. What you describe echoes what I've heard from well-executed shops.

 

I wouldn't be opposed to you sending this directly to Phil, Dendy, or Craig A. Over the years there have been far too many "oops" moments at HC, and it really needs to be managed.


  • Members

Craig - Never heard from anybody. I find this stuff frustrating; best practices do take time to learn and build for a given environment -- and cost money -- but they're not really that hard, or that expensive. What I usually find when talking to other business owners is that many seem content with an "it will probably be okay" approach to data security. I am a lot more proactive, because experience tells me that failures WILL happen, and I budget vis-a-vis business impact. One thing that I really love about my job is that we are not driven by a quarterly profit cycle; I am able to plan effectively for managing new equipment throughout its lifetime, and my budget is tied to *total* cost of ownership.

 

Mogwix: it's far worse than any nightmare. ;)

 

Wes


  • Members

The alternative is to simplify to the point of inherent speed and reliability. Over-integration has brought forth utter failures in a variety of industries.

 

Also, going with industry-standard solutions cuts down on learning curves, since the learning is spread across many early adopters who have better resources for this sort of thing.


  • Members

Since this thread has digressed a bit into RAIDs, I have a question as I'm building a storage server on my home LAN soon. If a corrupt bit of data gets written onto one mirror, doesn't it get written onto the next simultaneously (think corrupt FAT or something similarly catastrophic)? In the interest of keeping costs down while having ALMOST perfect integrity, does it make sense to have storage drives and then image them (overwrite) to another similar drive on a schedule (say every 24 hours)? That way, when a catastrophic drive failure occurs, you only lose a maximum of 24 hours of data (not good for banks, but maybe better integrity for my home use).

 

If I'm out of line asking here, just ignore me - sorry for being so OT


  • CMS Author
Since this thread has digressed a bit into RAIDs, I have a question as I'm building a storage server on my home LAN soon. If a corrupt bit of data gets written onto one mirror, doesn't it get written onto the next simultaneously (think corrupt FAT or something similarly catastrophic)? In the interest of keeping costs down while having ALMOST perfect integrity, does it make sense to have storage drives and then image them (overwrite) to another similar drive on a schedule (say every 24 hours)? That way, when a catastrophic drive failure occurs, you only lose a maximum of 24 hours of data (not good for banks, but maybe better integrity for my home use).

 

If I'm out of line asking here, just ignore me - sorry for being so OT

 

It all depends on what you mean by "if a corrupt bit of data gets written."

 

 

 

If the bad data exists in cache (or whatever source memory hosts it) and is then written, every write will repeat the bad data.

 

If the write fails or the target memory location is bad, then the bad write may not repeat. It *could* repeat if subsequent writes or targets are also bad -- hence there's no such thing as perfect integrity -- but the odds decrease exponentially with each added redundancy.
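The first case (bad data already in the source buffer) is worth seeing concretely. A small Python demo of the distinction, where the checksum layer is my addition, not something the RAID provides:

```python
# If the corruption happens *before* the write, every mirror faithfully
# stores the same bad bytes -- RAID replicates, it doesn't validate.
# Catching it later takes an integrity check (or backups) on top.

import hashlib

good = b"payload"
corrupt = b"pAyload"                   # bit flipped in cache before the write

mirrors = [corrupt, corrupt, corrupt]  # all copies agree... on the wrong data
assert all(m == corrupt for m in mirrors)

# A checksum recorded while the data was known-good exposes the damage:
expected = hashlib.sha256(good).hexdigest()
assert hashlib.sha256(mirrors[0]).hexdigest() != expected
```

Which is also the answer to the scheduled-imaging idea: the 24-hour image helps precisely because it preserves a copy from before the corrupt write, something a live mirror can't do.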


  • Members

this place used to be the go-to hangout for us. it was very fun and informative

 

now i poke my head in every few months and see the topics haven't updated.

 

seems like that one conversion (live? jive?) tanked it and it's never been even a ghost of what it was.

 

i'm having a blast with my conversion to digital FOH and losing my racks in the process; i mix on ipad alone on about 1/3 of shows now. setup time is ridiculously fast with the lack of interconnect and having previous shows in presets.

 

this place just doesn't have the traffic to support our previous level of interaction, and that is sad. i have no idea how to fix that


  • Members
The alternative is to simplify to the point of inherent speed and reliability. Over-integration has brought forth utter failures in a variety of industries.

 

Truer words were never spoken!

 

In fact, I increased our actual availability (measured over 5 years) on one platform by getting rid of the high-availability clustering software and reverting to a manual failover solution. The HA software looked great on paper, but it was so intertwined and integrated that it caused maintenance troubles to the point where routine maintenance (like replacing failed disks) would trigger an unnecessary outage. And when anything went wrong with the HA software, it was a real bugger to do *anything* with the platform.

 

The bit about being a late adopter is key to reliability, also. I tend to stay about two generations behind everything but security patches.

 

Dogoth - Yes, if you write corrupt data, then the corruption will be mirrored. The job of the RAID is to accurately store what you tell it to store, not to second-guess you. The solution is simple: 1) don't write corrupt data, 2) have corruption-tolerant data, 3) have a backup plan.

 

If you have an operating system that actually writes corrupt file allocation tables, then you need to get rid of that operating system.

 

PSG - audio-joy is running a copy of this site for the ad revenue. They used to actively scrape it, but that stopped shortly after I reported them to Dendy last year.

 

Wes

 


  • Members
this place used to be the go-to hangout for us. it was very fun and informative

 

now i poke my head in every few months and see the topics haven't updated.

 

seems like that one conversion (live? jive?) tanked it and it's never been even a ghost of what it was.

 

i'm having a blast with my conversion to digital FOH and losing my racks in the process; i mix on ipad alone on about 1/3 of shows now. setup time is ridiculously fast with the lack of interconnect and having previous shows in presets.

 

this place just doesn't have the traffic to support our previous level of interaction, and that is sad. i have no idea how to fix that

 

Where do you go now instead? PM me.


Archived

This topic is now archived and is closed to further replies.
