Distributed Storage Systems for Linux?
elambrecht asks: "We've got a _lot_ of data we'd like to archive and make sure it is accessible via the web 24/7. We've been using a NetApp for this, but that solution is just waaaay too expensive to scale. We want to move to using a cluster of Linux boxes that redundantly store and serve up the data. What are the best packages out there for this? GFS? MogileFS?"
Lustre (Score:5, Informative)
Lustre for ClusterFS (Score:2)
2.4 only.
The system itself is developed in a VERY cathedral-like style by a company called ClusterFS, who is selling the 2.6 version. My guess is they'll release it for free when Linux 2.8 or 3.0 is released, so that they always give away the obsolete version and sell the new one.
Re:Lustre for ClusterFS (Score:2)
needful? (Score:2)
I've seen systems still running 2.2, but it's not pretty. 2.6 is the way to go these days. Lustre is about the ONLY reason anyone should be using anything older.
How big? (Score:1)
I suppose there's a difference between serving just 500GB or a few terabytes.
What else... (Score:1)
Re:How big? (Score:1)
If it is only that, you could just stick 8 disks in a nice server system, that gets you a few terabytes. Use your filesystem of least mistrust, such as reiserfs, XFS or JFS.
RAID 1+0 or 10 is very nice, but you still have a single point of failure. NetRAID and some kind of fail-over redundancy with another machine may be the way. You will not get a full SAN/NAS this way, but you also avoid a lot of complexity.
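To put a rough number on the "8 disks" suggestion: in RAID 1+0 every disk has a mirror partner, so only half the raw space is usable. A minimal sketch of that arithmetic follows. The 400 GB disk size is an assumption for illustration only, since the post does not name one, and the function name is made up here.

```python
def raid10_usable_gb(disk_count: int, disk_size_gb: float) -> float:
    """Usable capacity of a RAID 1+0 array built from two-way mirrors."""
    if disk_count < 4 or disk_count % 2 != 0:
        raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
    # Each mirrored pair stores one disk's worth of data.
    return (disk_count // 2) * disk_size_gb


# 8 disks as in the post; 400 GB per disk is an assumed, illustrative size.
raw_gb = 8 * 400
usable_gb = raid10_usable_gb(8, 400)
print(f"raw: {raw_gb} GB, usable: {usable_gb} GB")
```

So "a few terabytes" raw shrinks to roughly half that once mirrored, which is worth knowing before sizing the box.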
Panasas -- check it out (Score:3, Informative)
Re:Panasas -- check it out (Score:2)
Panasas is a GPL Violator, as per conversations with them during a demo. They have a proprietary module that is built against only RHEL kernels. They confirmed that this module uses kernel headers and other parts of the kernel, but they refuse to release the source to any customer.
If you are ever in a situation where you need to upgrade kernels in a hurry, switch
Re:Panasas -- check it out (Score:2)
A self contained module is not a GPL violation. Having a GPL'd wrapper that can load your proprietary module is not a GPL violation. Having a driver that has headers and kernel source in the compiled version would make it a GPL violation since it is a "derivative".
Our crystal ball is fuzzy! (Score:5, Insightful)
If you can afford NetApp, why not keep with NetApp? A bunch of Linux boxes is not a storage solution. Indeed, what does Linux have to do with anything? We're talking storage here. What are you planning to do - put in 200 of them with internal SATA drives? Yeah, that'll be a lot cheaper to maintain...
I'm not shilling for NetApp, but if you really have "a lot" of data to put "on the web" "24/7" then you need some kind of real storage solution like a NetApp or one of their competitors.
Now go away and please take Cliff with you.
look at the rest NetApp is 4th place (Score:3, Informative)
so go ask them about what you want
really you can admin your white boxes (that become a NAS) or you can get a NAS
are you thinking SAN?
also talk to Apple, they do some nice products, as well as Sun
what's this for, large data?
video data go talk to SGI and their XFS products
really it depends on what you're doing. NetApp is great for a company file system of documents but bad if you want to get the most out of your storage and you do mostly vid
Re:look at the rest NetApp is 4th place (Score:2)
NAS/NetApp is so overrated, more like 7th place. The only reason it has made headway is because it is so close to NFS. And everyone can do NFS commands.
Re:Our crystal ball is fuzzy! (Score:2, Insightful)
Hey man, don't tell that to Google.
Re:Our crystal ball is fuzzy! (Score:1)
I don't think the Slashdot editors can spell either. An "editor like Cliffy" recently allowed the word "persue" to appear in a headline.
forced to get NAS windows 2000 embedded (Score:1)
I did look into getting a linux NAS but the solutions out t
Re:forced to get NAS windows 2000 embedded (Score:2)
They are linux based, web management, have the scsi port, and they sell backup and restore software.
I use one, but I back it up as a linux server with Veritas Netbackup to an existing tape robot on another linux server.
AFS ?? (Score:4, Interesting)
Re:AFS ?? (Score:2)
Re:AFS ?? (Score:2)
Centera (Score:5, Informative)
I'm biased but this is a high level Linux based storage system done right. It's not easy to create a coherent storage system out of lots of separate machines; the software that runs on this cluster does a lot of work. This thing is fully redundant with no single point of failure, dynamically expandable without even taking it offline, it scales to 100s of terabytes and manages all that content continuously (scanning for corruption and fixing it, garbage collecting, etc.). The cluster has redundant backend networks and parallel paths everywhere; it even uses reiserfs to store the data. There's a lot of good engineering in this unit and they sell it at a decent price compared to NAS boxes.
Check it out:
http://www.emc.com/products/systems/centera.jsp [emc.com]
I do work for EMC (like I said.. I'm biased) but I don't speak for them, my opinions are my own.
Storage clustering is simply hard to do while still presenting a low level filesystem interface. Tossing that out and creating file storage as a high level service with a richer interface seems like the right approach to me. Show me a storage clustering solution that doesn't do that and I'll show you something full of bugs, expandability issues, limitations, and pain points.
Re:Centera (Score:1)
Of course, the Centera always did sound interesting. But last I heard you still needed to write to the Centera API, no block access for you (or real NAS type either). But I did hear murmurs of this changing.
Anything to avoid a Celerra.
Re:Centera (Score:2)
There are at least two products you can put in front of Centera to make it look like a standard filesystem: CUA (a Centera specific EMC product) and Legato Disk Extender. The tradeoff is that by interfacing with it like a filesystem you re-introduce the limitations of filesystems and lose all the automatic functionality the API gives you. If you look a
Re:Centera (Score:1)
I was just laid off from Xiotech (with about 100 others), one of the smaller SAN vendors..
But we call the Centera a "data jail". It's like the roach motel..
Data checks in, but it don't check out. It can't scale beyond a 42U rack enclosure. It's a bunch of little servers striped together to form a big NAS with a metadata controller in the middle.
Just bashing the 800lb gorilla...
OTOH, If you're hiring, I'm willing to tell you your products could rule the world!
Regards.
P
Re:Centera (Score:3, Insightful)
Ug. It's just not true. Most applications that are built to work with Centera include functionality to migrate in/out of the system just like most applications that are built to work with tape can both put data on and get it back. The difference is tape sucks, Centera doesn't.
It can't scale beyond a 42U rack enclosure.
Also not true. I have worked extensively with a 3 rack install with about 50TB of data on it. I believe all versi
Re:Centera (Score:2)
I work for EMC but I don't speak for them, my thoughts are my own even if I sound like an EMC cheerleader/sock puppet.
Ask Google (Score:3, Interesting)
Even easier (Score:2)
Clustering filesystems- an overview (Score:4, Informative)
I did some research on clustering filesystems for work a while ago. Here's the Cliffs-notes version:
Re:Clustering filesystems- an overview (Score:2)
Converting extra Windows(tm) workstation space? (Score:5, Interesting)
A barely-related subject - I've been wondering whether there's some way to collect the unused space on all the Windows workstations around here into a shared space for storage.
This is purely a speculative exercise, but I keep wondering if some combination of:
Yes, I know it's kind of silly, and performance seems like it would be pretty pathetic, but the more I think about it, the more I want to see if I could actually do it (think pretty much the same mindset that the IP-over-carrier-pigeon guys had...)
Heck, it might conceivably actually WORK for a large-but-infrequently-accessed historical repository or something...
Or has someone already started some sort of "Virtual ATA-over-ethernet-from-a-file driver for Windows" project and spoiled my fun?...
Re:Converting extra Windows(tm) workstation space? (Score:3, Interesting)
http://perlfs.sourceforge.net/ [sourceforge.net]
Build it first, optimize later.
FYI.. The multi-threaded filesystem version exists, I just haven't bundled it up pretty for distribution. Now someone needs to create a multi-threaded samba to share it out.
Re:Converting extra Windows(tm) workstation space? (Score:2)
That's actually the part I'm not sure about - I know I could e.g. format an old 6GB HDD and then use dd to make a filesystem image that I could mount, but I haven't done any digging to find out if it's possible to directly create a ('standard') filesystem as an image file. (Hints welcome...)
Perlfs looks interesting but it appears as though it hasn't been updated in a while (the homepage talks about adding support for linux "2.5" at some point...)
Re:Converting extra Windows(tm) workstation space? (Score:3, Informative)
Huh? Just run mkfs.whatever on your file. Should work without problems. Your filesystem is as large as it would be on an equally large blockdevice.
Example:
$ mkfs.ext3 file
mke2fs 1.36 (05-Feb-2005)
file is not a block special device.
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
1784 inodes, 7116 blocks
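For anyone who wants to reproduce this without dd'ing an old disk: the only prerequisite is a plain file of the right size. Here is a minimal sketch in Python, assuming you want an image about as large as the one in the output above (7116 blocks of 1024 bytes). The temporary directory and the `create_image` helper are hypothetical names used just for this example.

```python
import os
import tempfile

BLOCK_SIZE = 1024   # "Block size=1024" in the mke2fs output above
BLOCK_COUNT = 7116  # "7116 blocks" in the mke2fs output above


def create_image(path: str, size_bytes: int) -> None:
    """Create a zero-filled file of size_bytes to hold a filesystem."""
    with open(path, "wb") as image:
        # truncate() extends the empty file to the requested length. On most
        # Linux filesystems the result is sparse, so it uses little real
        # disk space until data is written into it.
        image.truncate(size_bytes)


image_path = os.path.join(tempfile.mkdtemp(), "file")  # hypothetical location
create_image(image_path, BLOCK_SIZE * BLOCK_COUNT)
image_size = os.path.getsize(image_path)
print(image_path, image_size)
```

After that, running mkfs.ext3 on the file works as shown in the post, and the image can typically be mounted through a loop device (`mount -o loop`).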
Re:Converting extra Windows(tm) workstation space? (Score:2)
Thanks - that saves my lazy butt from having to actually look at the man pages or whatever.
Now I have no excuse not to try it...
Re:Converting extra Windows(tm) workstation space? (Score:2)
Sorry about that. I'm busy. You did get me motivated to put up the new multi-threaded capable version of perlfs today though. Maybe I'll even edit the home page at some point.
And yes.. it's somewhat actively maintained. No, I haven't worked on it with the 2.6 kernels yet.
Re:Converting extra Windows(tm) workstation space? (Score:2)
StarWind will allow you to export images or even drives and partitions over iSCSI. WAY less overhead and way faster than SMB. I can only get about 6MB/sec over 100Mbit Ethernet with SMB, but I can get ~10.5MB/sec with iSCSI.
Simply add a number of these disk images across your network and then mount them on your Linux system. You can use LVM to manage these volumes also.
I have to warn you that 100Mbit Ethernet is a bit slow for data storage on a network especially when y
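To see how close those two numbers come to the wire limit, here is a quick back-of-the-envelope sketch. It assumes the link is 100 Mbit/s Fast Ethernet and ignores all framing and protocol overhead, so the ceiling is optimistic. The function names are made up for this example.

```python
LINK_MBIT_PER_S = 100  # assumed: the link in the post is 100 Mbit/s Ethernet


def link_ceiling_mb_per_s(link_mbit_per_s: float) -> float:
    """Raw link rate in megabytes per second, ignoring protocol overhead."""
    return link_mbit_per_s / 8


def utilization(measured_mb_per_s: float, link_mbit_per_s: float) -> float:
    """Fraction of the raw link rate that a measured throughput reaches."""
    return measured_mb_per_s / link_ceiling_mb_per_s(link_mbit_per_s)


ceiling = link_ceiling_mb_per_s(LINK_MBIT_PER_S)
smb_share = utilization(6.0, LINK_MBIT_PER_S)     # SMB figure from the post
iscsi_share = utilization(10.5, LINK_MBIT_PER_S)  # iSCSI figure from the post
print(f"ceiling: {ceiling} MB/s, SMB: {smb_share:.0%}, iSCSI: {iscsi_share:.0%}")
```

By this estimate the iSCSI figure is already most of the way to what the link can carry, so a faster network would matter more than further protocol tuning.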
We use OpenAFS (Score:4, Insightful)
We've moved to using Linux based OpenAFS servers. A high quality 3U box (qsol.com [qsol.com]) loaded with 16x 300GB ATA drives costs about $8.5K and provides us about 3.5TB (2 drives for parity, 2 drives for hot-swap). That works out to $2.5K/TB. If your risk tolerance is higher than mine, you can bring that up to $8K/5.5TB (about $1.5K/TB). We really want 99.999% availability, so just to be safe, we keep a 100% redundant read-only copy on a second machine (AFS supports this beautifully, including automatic fail-over).
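The per-terabyte figures can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted in the post. It assumes decimal units (1 TB = 1000 GB), and the helper names are invented for the example.

```python
def usable_tb(drives: int, drive_gb: float, parity: int, spares: int) -> float:
    """Capacity left after setting aside parity and hot-spare drives."""
    data_drives = drives - parity - spares
    return data_drives * drive_gb / 1000  # assumes 1 TB = 1000 GB


def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    return price_usd / capacity_tb


# 16 x 300GB with 2 parity and 2 hot-swap drives, as described in the post.
capacity = usable_tb(16, 300, parity=2, spares=2)
safe_cost = cost_per_tb(8500, 3.5)   # the quoted $8.5K box at "about 3.5TB"
risky_cost = cost_per_tb(8000, 5.5)  # the quoted $8K/5.5TB alternative
print(f"{capacity:.1f} TB usable, ${safe_cost:,.0f}/TB vs ${risky_cost:,.0f}/TB")
```

The results land slightly under the rounded $2.5K/TB and $1.5K/TB in the post, which is close enough for budgeting.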
OpenAFS has a couple of features that make it better than NFS (client-side cache, for instance), but it also has a few drawbacks, like no files >2GB.
Re:We use OpenAFS (Score:2, Informative)
Copied from release notes:
For UNIX, 1.3.81 is the latest version in the 1.4 release cycle. Starting
in 1.3.70, platforms with pthreads support provide a volserver which like
the fileserver and butc backup system uses pthreads. Solaris versions 8
and above, AIX, IRIX, OpenBSD, Darwin, MacOS and Linux clients support
large (>2gb) files, and provided fileservers have this option enabled.
HP-UX may also support large files, but has not yet been verified. We hope
sites which can do s
Re:We use OpenAFS (Score:1)
hello from M over in PSF =)
Re:We use OpenAFS (Score:1)
i was talking to Z at the installfest 2 weekends ago and he told me about this migration. it sounded pretty cool. we have been eying a similar migration too (primarily for authentication, not storage), but we're waaaaaay short on resources here =(
Is data partitionable? (Score:1)
After all, as a collection there is an immense amount of data on "the world wide web" but since it's partitioned, scaling isn't an issue.
Even before the web, the universe of ftp, gopher, news, and other servers held gobs and gobs of data, nicely partitioned.
When answering questions like this, it would help to know the organization of the data and if it can
The IBRIX file system is a strong runner for this. (Score:2, Informative)
aRchive.org (Score:3, Informative)
http://www.archive.org/web/petabox.php [archive.org]
They are on the order of petabytes
Get a copy of this month's LinuxJournal (Score:1)
Avoid Filesystems if you need scalability (Score:2)
The theoretical upper limit of any file system is set by two things: the address space and the efficiency of the data structure.
In a 32 bit system, that means that, in theory you could fit 4.2 billion objects into a file system... but don't try it. NTFS craps out at between 15 and 50 million depending on whose numbers you are willing to listen to, EXT3 sta
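The "4.2 billion" figure is just the number of distinct values a 32-bit identifier can hold. A tiny sketch of that arithmetic, with a hypothetical helper name:

```python
def max_addressable_objects(address_bits: int) -> int:
    """Number of distinct identifiers an address of this width can express."""
    return 2 ** address_bits


limit_32 = max_addressable_objects(32)
print(f"32-bit limit: {limit_32:,} objects")
```

That is the theoretical ceiling only. As the post says, real filesystems tend to struggle long before they get anywhere near it.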
silly worker drones... what about CXFS?!!!! (Score:1)
Re:silly worker drones... what about CXFS?!!!! (Score:2)