
DataBackup


Here at Software Bazaar, we have a few projects related to data backup:

While “data backup” is not the primary purpose of the distributed wiki, it looks like it will handle most of the desirable features of a backup system:

"Data Recovery - 9 Things to Look for when Considering Data Backup Solutions":

1. Is it automatic?
2. Is it simple to use?
3. Is it secure? Your data must be off-site.
4. Is it confidential? (not necessary for a public wiki)
5. Is the data compressed? (not necessary until the data grows beyond a megabyte; ErasureCode as well as standard data compression can both reduce the amount of disk space used; see the sketch after this list)
6. Is the system informative?
7. Is the system flexible, capable of backing up any file size or type? (is this relevant to a public wiki?)
8. Is it versatile? Make sure that it has both automatic and manual backup facilities. You want … set and forget, but you also want the “back up my data now” feature, so that you have the benefit of not having to worry through the time between finishing your thesis or proposal and knowing it is really secure.
9. Is it backed up more than once? Don’t rely on a system that only has one copy. Make sure that it has several copies stored on at least 2, preferably 3, off-site computers.
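To make item 5 concrete, here is a toy illustration (not any existing backup tool) of the simplest possible erasure code: one XOR parity block over several equal-sized data blocks, so a single lost block can be rebuilt while storing far less than a full extra copy. The function names and sample data are invented for this sketch; a real system would use a proper code such as Reed-Solomon.

```python
# Toy single-parity erasure code (RAID-4 style). Only meant to show the
# space-versus-redundancy idea behind item 5, not to be production code.

def make_parity(blocks):
    """XOR equal-sized data blocks together into one parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def recover_missing(blocks, parity):
    """Rebuild at most one missing block (given as None) from the parity."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) > 1:
        raise ValueError("single parity can only recover one lost block")
    if missing:
        rebuilt = bytearray(parity)
        for block in blocks:
            if block is not None:
                for i, byte in enumerate(block):
                    rebuilt[i] ^= byte
        blocks[missing[0]] = bytes(rebuilt)
    return blocks

data = [b"wiki-rev-0001...", b"wiki-rev-0002...", b"wiki-rev-0003..."]
parity = make_parity(data)                 # one extra block instead of three full copies
damaged = [data[0], None, data[2]]         # pretend the second block was lost
assert recover_missing(damaged, parity)[1] == data[1]
```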

scary stories

Various perspectives on the 2006 Alaska Department of Revenue incident:

more scary stories

Jamie Zawinski

Jamie Zawinski talks about backups.

related systems that don't quite do everything I want

In a comment to "Trying Dreamhost for Backup", Mathew Newton (2007) mentions: “It seems to me that there are loads of home users running their own 24/7 servers at home nowadays … and with this comes the common desire for robust backups. As mentioned above, it soon occurs to many that offsite is the (only) way to go. Hence, it strikes me that one of the best (and cheapest) solutions to all of this is to set up reciprocal backup arrangements between yourself and someone else in the same boat …”
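One way such a reciprocal arrangement might look in practice is sketched below: each partner runs a small script (for example from cron) that pushes their own data to the other party's machine with rsync over ssh. The host name, user, and paths are made up for illustration.

```python
# Sketch of one half of a reciprocal backup arrangement. The other party
# would run the mirror-image script pointing back at this machine.
import subprocess

LOCAL_DATA = "/srv/wiki/data/"                                   # hypothetical: what I want backed up
REMOTE = "backupuser@friends-server.example.org:/backups/my-wiki/"  # hypothetical partner machine

def push_backup():
    """Mirror LOCAL_DATA to the partner's machine, keeping replaced files around."""
    subprocess.run(
        [
            "rsync", "-az", "--delete",
            "--backup", "--backup-dir=../my-wiki-old",  # keep old versions of replaced/deleted files
            LOCAL_DATA, REMOTE,
        ],
        check=True,
    )

if __name__ == "__main__":
    push_backup()
```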

"Mount Strongspace to a folder in Ubuntu"

https://wiki.ubuntu.com/ContinuousBackups seems like exactly what I want. Well, other than the fact that it hasn’t been implemented yet.
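In the meantime, a crude approximation of “continuous backup” can be had with a short polling loop that copies files as soon as they change. The source and destination paths below are hypothetical; the destination could be, say, an sshfs mount of a remote machine.

```python
# Bare-bones "continuous backup" approximation: poll for modified files and
# copy them to a (presumably off-site) mounted directory as they change.
import shutil, time
from pathlib import Path

SRC = Path("/home/me/documents")              # hypothetical source directory
DST = Path("/mnt/offsite-backup/documents")   # hypothetical mount of a remote machine
seen = {}                                     # path -> last-seen modification time

while True:
    for f in SRC.rglob("*"):
        if not f.is_file():
            continue
        mtime = f.stat().st_mtime
        if seen.get(f) != mtime:              # new or modified since the last pass
            target = DST / f.relative_to(SRC)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target)
            seen[f] = mtime
    time.sleep(30)                            # check again every 30 seconds
```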

Wikipedia:Tandem_Computers built custom hardware designed to be fault-tolerant. I suspect that (with a bit of programming) we could build a fault-tolerant wiki on top of (modern) off-the-shelf computers (and the internet) without modification. Or is there something I missed?

"Consumerium ... plans to cascade data between three wikis". Why “three” exactly?
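A back-of-the-envelope answer to “why three?”: if each copy is lost independently with some yearly probability p, the chance of losing every copy in the same year is p to the power n, so each extra copy helps by a factor of 1/p but with diminishing returns. The 10% figure below is only an assumed round number for illustration, not a measured failure rate.

```python
# Why three copies? Chance of losing *all* copies in a year, assuming each
# copy fails independently with yearly probability p (p = 0.10 is an assumption).
p = 0.10
for n in (1, 2, 3, 4):
    print(f"{n} copies: chance of total loss = {p**n:.4%} per year")
# 1 copy  -> 10%        2 copies -> 1%
# 3 copies -> 0.1%      4 copies -> 0.01%   (diminishing returns past three)
```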

ideas

(Later I need to sort out “goals” and “nice things to have” from “how to test that it’s working” and “possible ways to implement it”.)

The basic idea here is “I know any one hard drive will fail after an average of 5 years. And any one web site will eventually go offline because of power loss or other causes (fire, flood, etc.). How do I access my files after such an event occurs?”.

I am most interested in building a “fault tolerant wiki”, but these two kinds of “distributed data store” (a fault-tolerant wiki and a distributed backup store) are so very similar … in fact, I’m not sure whether it’s better to build the wiki on top of a distributed data store, or to build a distributed data store on top of the distributed wiki.
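Either way around, the core primitive looks much the same: a content-addressed store in which every saved revision is filed under the hash of its bytes, so any server that still holds the object can return a provably identical copy, which is what “no one can delete a file” below means in practice. The directory layout and function names in this sketch are invented for illustration.

```python
# Minimal content-addressed revision store: each revision is written under
# the SHA-256 of its content, so a fetched copy can always be verified.
import hashlib
from pathlib import Path

STORE = Path("/var/wiki-objects")     # hypothetical: one such directory per participating server

def put_revision(text):
    """Write a revision into the store and return its content id."""
    data = text.encode("utf-8")
    rev_id = hashlib.sha256(data).hexdigest()
    path = STORE / rev_id
    if not path.exists():             # identical content is stored only once
        STORE.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return rev_id

def get_revision(rev_id):
    """Fetch a revision and verify it was not corrupted or tampered with."""
    data = (STORE / rev_id).read_bytes()
    if hashlib.sha256(data).hexdigest() != rev_id:
        raise IOError("stored object does not match its id")
    return data.decode("utf-8")
```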

When I say “no one can delete a file”, I mean that a perfect copy of that version of the file can immediately be recovered from some other server, no matter which server gets hit by a “potential dataloss event” such as:

Possible implementations of a distributed wiki:

Where does the software sit?

Testability: how do I know the backup software is really working? Have you heard the scary stories about tape drives with broken heads that only recorded zeros, but no one discovered that fact until a hard drive failed and they needed to restore that data?
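One concrete answer is to make restore tests routine: after each backup run, restore a random sample of files to a scratch directory and compare them byte-for-byte against the originals. The paths and function names below are illustrative, and the restore step itself is assumed to have already filled the scratch directory.

```python
# Spot-check a backup by comparing restored files against the originals.
import hashlib, random
from pathlib import Path

ORIGINALS = Path("/srv/wiki/data")        # hypothetical live data
RESTORED = Path("/tmp/restore-test")      # hypothetical: filled by an actual restore step

def digest(path):
    return hashlib.sha256(path.read_bytes()).hexdigest()

def spot_check(sample_size=20):
    """Return True only if every sampled restored file matches its original."""
    files = [p for p in ORIGINALS.rglob("*") if p.is_file()]
    for original in random.sample(files, min(sample_size, len(files))):
        restored = RESTORED / original.relative_to(ORIGINALS)
        if not restored.exists() or digest(restored) != digest(original):
            print(f"MISMATCH: {original}")
            return False
    return True
```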

failure modes

"Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?" by Bianca Schroeder and Garth A. Gibson. “Unfortunately, many aspects of disk failures in real systems are not well understood, probably because the owners of such systems are reluctant to release failure data or do not gather such data.” “The work in this paper is part of a broader research agenda with the long-term goal of providing a better understanding of failures in IT systems”

Is there any reason not to gather and release data on disk failures in the distributed wiki?

The fact that a particular node has gone offline will be obvious. What other data should we collect to put that fact in context? Perhaps (see the sketch after this list):

* manufacturer and model number of the drive;
* whether the drive is stand-alone or inside some sort of RAID array;
* temperature at time of failure;
* …
* what exactly failed: hard drive, power supply, motherboard, CPU, memory, or what?
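As a starting point, here is a sketch of what a single failure report could look like if nodes in the distributed wiki published this data. The field names, example values, and node name are all provisional inventions; JSON just keeps the record easy to share (for example with the CFDR mentioned below).

```python
# Provisional shape of one failure report from a distributed-wiki node.
import dataclasses, datetime, json
from typing import Optional

@dataclasses.dataclass
class FailureReport:
    node_id: str                            # which wiki node went offline
    noticed_at: str                         # ISO 8601 time at which peers noticed it
    drive_model: str = "unknown"            # manufacturer and model number
    in_raid_array: bool = False             # stand-alone drive or part of a RAID array
    temperature_c: Optional[float] = None   # temperature at time of failure, if known
    failed_part: str = "unknown"            # hard drive, power supply, motherboard, ...

report = FailureReport(
    node_id="node-17.example.net",          # made-up node name
    noticed_at=datetime.datetime.now(datetime.timezone.utc).isoformat(),
    drive_model="ExampleCorp ST-1000",      # made-up model number
    failed_part="hard drive",
)
print(json.dumps(dataclasses.asdict(report), indent=2))
```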

Ideally, we would gather enough data to help us make decisions that improve reliability without increasing costs. But not so much data that the cost of collecting the data outweighs any possible benefit.

The computer failure data repository (CFDR) http://cfdr.usenix.org/ seems to be the appropriate place to publish this disk failure data.

