
Benchmarking Linux Filesystems Part II

Zonk posted more than 8 years ago | from the some-of-this-content-may-be-inappropriate-for-young-readers dept.

Data Storage 255

Anonymous Coward writes "Linux Gazette has a new filesystem benchmarking article, this time using the 2.6 kernel and showing ReiserFS v4. The second round of benchmarks include both the metrics from the first filesystem benchmark and the second in two matrices." From the article: "Instead of a Western Digital 250GB and Promise ATA/100 controller, I am now using a Seagate 400GB and Maxtor ATA/133 Promise controller. The physical machine remains the same, there is an additional 664MB of swap and I am now running Debian Etch. In the previous article, I was running Slackware 9.1 with custom compiled filesystem utilities. I've added a small section in the beginning that shows the filesystem creation and mount time, I've also added a graph showing these new benchmarks." We reported on the original benchmarks in the first half of last year.


Very interesting article... (4, Interesting)

toofast (20646) | more than 8 years ago | (#14410134)

An interesting analysis in every aspect, and it's fine and dandy for the person who uses 400 GB drives and an ATA controller on a 500MHz computer, but I'd like to see how the filesystems compare on a bigass RAID system run by a Power5 server, or a few Itaniums that typically serve a few hundred connected users. Something a bit more "enterprise" - where the choice of a filesystem is a bit more critical than on a small server or a home PC.

Re:Very interesting article... (2, Insightful)

CastrTroy (595695) | more than 8 years ago | (#14410172)

I'd like to see how they perform on a 12 GB Disk on a P2 266. You really start to see the differences when working on older hardware.

Re:Very interesting article... (1)

chrismcdirty (677039) | more than 8 years ago | (#14410218)

Reiser4 kills old disks, supposedly. I (mistakenly) used it on my (at the time) year old laptop, and after about 4-6 months I kept getting something like "drive seek complete" errors. It got to the point where it wouldn't boot because of the errors. So I had to reinstall everything, and no data could be saved since reiser4 hated my drive. Been running Reiser3 ever since, and I haven't had any problems.... yet.

Re:Very interesting article... (3, Interesting)

Captain Segfault (686912) | more than 8 years ago | (#14410489)

It is completely absurd for a filesystem to kill a disk. If you were getting those errors (with the "drive ready" and "seek complete" bits being set being most common) it *strongly* suggests that either your disk is broken or it is improperly powered.

If you're still actually using that disk, have a look at it with smartctl. In particular, run "smartctl -t long" on it, and have a look at the results. If it doesn't pass that, don't even think of trusting it with your data.
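A sketch of that check, assuming smartmontools is installed; the device path /dev/sda is an example, not something from the post:

```shell
# Kick off the drive's long (full-surface) self-test. The test runs
# inside the drive firmware, so the command returns immediately.
smartctl -t long /dev/sda

# Some minutes to hours later, read back the results:
smartctl -l selftest /dev/sda   # self-test log: look for "Completed without error"
smartctl -H /dev/sda            # overall SMART health assessment
```

If the self-test log shows read failures or the health check reports anything other than PASSED, the drive itself is suspect, whatever filesystem is on it.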

Re:Very interesting article... (2, Interesting)

chrismcdirty (677039) | more than 8 years ago | (#14410665)

'Kill' was a little strong for how I meant to use it. What I really meant to say (though I can't now find any data backing me up) is that Reiser4 deals with the disk so intensely that it uncovers flaws and errors that other filesystems may (A) never find, or (B) live with.

I'd look through the Namesys page, but it's large and the TOC didn't reveal any warnings... or I wasn't looking hard enough.

Re:Very interesting article... (1)

cli_man (681444) | more than 8 years ago | (#14410410)

I ran 3 Squid cache servers on pretty old, cheap hardware using Reiser, and I found the best way to work it was to have 4 or 5 drives for the cache for best performance. Reiser worked great, but on the older equipment they had trouble with concurrent open files etc. By adding more drives I could have a lower-end PIII handling 2,000 - 3,000 open files at a time, pretty impressive I would say.

I think trying on a P2 266 is a bad idea (5, Interesting)

H4x0r Jim Duggan (757476) | more than 8 years ago | (#14410467)

Reiser is not designed for slow CPUs. AFAIK, a key part of the design was that Hans Reiser realised CPUs were vastly underused: IO resources were maxed out while CPUs sat idle. So he found ways to use the CPU to make more efficient use of the IO resources. This benchmark on a 500MHz machine will of course show Reiser in a bad light, and moving down to a 266MHz machine will make it even worse.

For a decent benchmark of how filesystems work on modern hardware: use modern hardware.

Re:I think trying on a P2 266 is a bad idea (0)

Anonymous Coward | more than 8 years ago | (#14410878)

I'm fairly certain using more CPU in most cases is RETARDED.

Re:I think trying on a P2 266 is a bad idea (1)

Trifthen (40989) | more than 8 years ago | (#14410899)

What this says to me is to never use Reiser on a DB machine. Sure, the disk churn is much more prevalent on such a beast, but the CPU(s) aren't exactly sitting around idle, either.

It actually sounds like Reiser would do really well as a disk controller in a dedicated drive array. I wonder if anyone has put embedded Linux on such a device, to act as a Reiser RAID controller...

Re:I think trying on a P2 266 is a bad idea (1)

wavq (216458) | more than 8 years ago | (#14410908)

I used to work for a company where 10,000 files in a single directory was considered *small*. Try ten or a hundred times that many, and watch EXT(2|3) come to a grinding halt. Reiser (v3) happily obliged with these kinds of loads.

Any test can be made to highlight the good/bad given "properly" (ahem) chosen parameters.

Re:I think trying on a P2 266 is a bad idea (2, Informative)

StarHeart (27290) | more than 8 years ago | (#14411002)

I am pretty sure that ext3 fixed that with htree indexing. Htree has been around for a while.
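For reference, a sketch of how htree (the dir_index feature) can be switched on for an existing ext3 filesystem; the device name is an example, and the filesystem must be unmounted for the fsck step:

```shell
# Enable the dir_index feature flag (a no-op if it is already set).
tune2fs -O dir_index /dev/sdb1

# Rebuild indexes for directories created before the flag was set;
# -D tells e2fsck to optimize (reindex) directories, -f forces a full check.
e2fsck -fD /dev/sdb1
```

Filesystems created with recent e2fsprogs get dir_index by default, so this is mostly relevant to filesystems made with older tools.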

Re:Very interesting article... (1)

110010001000 (697113) | more than 8 years ago | (#14410191)

Toofast, if you (or anyone) would like to donate to me such a setup I would be happy to reperform the benchmarks and share them with the community. I was only able to test with the equipment I had on hand. Just drop me an email.

Thanks

Interesting? How about a DECENT one? (5, Interesting)

diegocgteleline.es (653730) | more than 8 years ago | (#14410842)

I'm *sick* of reading filesystem benchmarks by people who don't even bother to read the documentation of the filesystems they compare.

OK, so ext3 is not the fastest filesystem on earth. But it has some default options which make it suck even more than it usually does, and those options are *documented* in Documentation/filesystems/ext3.txt:

* Ext3 commits its journal every 5 seconds. This is because the ext3 developers are paranoid about your data and prefer to protect it rather than win on benchmarks. Committing every 5 seconds ensures you don't lose more than 5 seconds of work, but it hurts on benchmarks. Other filesystems don't do it; if you are doing a FAIR comparison, override the default with the "commit" mount option.

* Ext3's default journaling mode is slower than those of XFS, JFS or reiserfs, because it's safer. When ext3 is going to write some metadata to the journal, it takes care of first writing to disk the data associated with that metadata. The XFS and JFS journaling modes do *not* care about this, nor should they: journaling was designed to keep filesystem integrity intact, not data. Ext3 does it as an "extra", and it's slower because of that. But if you want to do a fair comparison, you should use the "data=writeback" mount option, which makes ext3 behave like XFS and JFS with respect to journaling. Reiserfs's default journaling mode is like XFS/JFS, but you can make it behave like the ext3 default with "data=ordered".

ext3 is not going to beat the others by using those mount options, but it won't suck so much, and the comparison will be more fair. And remember: ext3 trades speed for data integrity. There's nothing wrong with XFS and JFS, but _I_ use ext3.
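A sketch of those mount options in use; the device and mount point are examples, not from the comment:

```shell
# ext3 with metadata-only journaling and a longer commit interval,
# to match XFS/JFS journaling semantics for a like-for-like benchmark:
mount -t ext3 -o data=writeback,commit=30 /dev/sdb1 /mnt/bench

# Or the opposite direction: reiserfs forced into ext3's safer default,
# journaling the data-before-metadata ordering as well:
mount -t reiserfs -o data=ordered /dev/sdb1 /mnt/bench
```

Either choice is defensible; the point is that both sides of the comparison should be run with the same journaling guarantees.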

Re:Very interesting article... (1)

flaming-opus (8186) | more than 8 years ago | (#14410994)

You raise an excellent point, though the only way to get ahold of enough hardware to make that test interesting is to get the system vendor to provide the hardware, in which case you often have limited ability to publish any results they don't like. (Been there, didn't publish that)

Furthermore, once you get into that high-end of a system, you're generally not all that interested in "general purpose" benchmarks. I have a lot of experience benchmarking filesystems on high-end systems. (15GBytes/s and so on) In those cases you're benchmarking everything: the application, the filesystem, the filesystem settings, the operating system, the OS settings, multipathing drivers, san environments, raid controllers, down to even the disk drives in the raids. It's hard to isolate the filesystem from this mess, except in the performance of the particular application.

In a sense, generic benchmarks only make sense on small servers and workstations, as you run a diverse set of applications and have a limited set of hardware that changes only modestly with time (though 500MHz is getting pretty antique there, dude). Benchmarking a dual 2.4GHz Dell slab with a mirrored pair of 10k SCSI drives might be a little more useful, as there are a LOT of those out there running Linux. Benchmark mail-serving, web-serving, file-serving. Since these are the sweet spot for Linux servers, benchmarking them would probably be most instructive to the broadest group of people. The microbenchmarks Mr. Piszcz runs are a little too workstation-like for my tastes. I don't consider workstation disk performance to be all that important, at least compared to server tasks.

pretty dry, except... (1, Funny)

samyool (450631) | more than 8 years ago | (#14410171)

wow, what a dry article.

However, scroll to the bottom. More latin translations than you can shake a stick at, including my personal favorite:

I have a terrible hangover.
    Crapulam terriblem habeo.

-S

Need to be careful... (3, Insightful)

Conor Turton (639827) | more than 8 years ago | (#14410190)

One thing this does show is that you need to be very careful to match the filesystem type to the main tasks the PC is going to be used for. There's no real clear winner, as each has major gains or deficiencies in some areas. One very interesting point was the vast difference in the amount of available space after partitioning and formatting between the different filesystems.

I would agree (2, Informative)

jd (1658) | more than 8 years ago | (#14410304)

From a brief examination of the benchmarks, I'd say the following would seem to hold up:


  • JFS: Great for software development, as it allows rapid file and directory reads, writes, creates and deletes
  • XFS: Seems to work best with much more stable content. Creating and mounting the partition is also fast, and the FS overhead seemed low. Should be good for static databases, particularly if you're going to use a network filing system to access the drive, say using a SAN.
  • Reiser4: Surprisingly, I didn't see Reiser4 really shine at a whole lot in the benchmarks. The massive mount time tells me it needs to be a local drive that only needs mounting the once. Just not sure what sort of data would be best on it.
  • Ext2/Ext3: Mediocre at almost everything. Distros like Fedora that mandate the initial install ONLY use Ext3 are being stupid. The best fall-back filing systems if you can't find anything better for what you want the partition to do, but should never be used in specialized contexts.

Re:I would agree (4, Interesting)

lawpoop (604919) | more than 8 years ago | (#14410450)

I'm no expert by any means, but I think the idea behind the ReiserFS is breaking down the FS paradigm from the file level to the line level.

There is the classic example from the Reiser website. If your password file gets hacked, you have to ditch the whole file if you're using traditional file systems. You only know whether or not the file's been changed. However, with the Reiser system, it can tell you *what line*, and thus which user/password, was changed.

That's just a taste of where you can go with the ReiserFS. There are other things coming down the pipe; check out the reiser website for a better idea of the new features that ReiserFS promises.

How is ext3 mediocre? (1)

mrcparker (469158) | more than 8 years ago | (#14410494)

It seemed to be either first or second at most of the benchmarks. I really don't consider that mediocre.

I was pretty surprised by ext3's performance. I also read the article.

Re:How is ext3 mediocre? (0)

Anonymous Coward | more than 8 years ago | (#14410672)

I must agree with the above. From the tests performed, I was very surprised and impressed with ext2 and ext3, both of which I had moved away from for the 'newer breed' of filesystems, assuming, falsely, that newer is generally better (why make something that is not better than what already exists? And if you can study what already exists, you can improve upon it). However, the tests performed are not necessarily conclusive, as others have stated (more enterprise setup, etc.).

Also, these tests don't include things like file storage efficiency (sectors used, etc.), stability, longevity, etc.

Still, all in all the results are interesting.

How is ext3 mediocre? Default limitations is how (1)

WoodstockJeff (568111) | more than 8 years ago | (#14410944)

I have personally had to deal with the results of forgetting to change from EXT3 to something else when setting up one of our servers. It took a year, but one of the database files reached that magical compiled-in limit of 4GB... Fortunately, I caught it shortly after it happened, and was able to rearrange things to keep the server from getting too far out of sync with the rest of the cluster.

EXT3 has a lot going for it, but the default compile options (at least the ones used by several of the popular packagers) make it incompatible with large files. "Large files" is, of course, a relative term, but more than a few people deal with 4+GB files nowadays, like DVD ISOs, so it's not just billion-record databases that blow up EXT2/3.

If I wanted small files, I'd have used FAT32! :)
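A generic way to sanity-check large-file support is to create a sparse file just past the 4GiB mark; this sketch is mine, not a command from the post, and it costs almost no real disk space because only one block is actually written:

```shell
# Seek 4096 MiB into a new file, then write a single 1 MiB block.
# Logical size becomes 4097 MiB (just over the 4GiB boundary), but
# only ~1 MiB of real space is allocated on sparse-capable filesystems.
dd if=/dev/zero of=bigfile bs=1M count=1 seek=4096

# Report the logical size in bytes: 4097 * 1048576 = 4296015872.
stat -c '%s' bigfile
```

If the dd fails with "File too large" (EFBIG), the filesystem (or how it was built) can't handle files past the limit.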

Re:I would agree (5, Insightful)

Anonymous Coward | more than 8 years ago | (#14410546)

Ext2/Ext3: Mediocre at almost everything. Distros like Fedora that mandate the initial install ONLY use Ext3 are being stupid. The best fall-back filing systems if you can't find anything better for what you want the partition to do, but should never be used in specialized contexts.

Huh? Sorry, did you read the same graphs or are you just trolling?

This article shows that ext2 and ext3 are close to the top performer in most tests and do not have many "worst-case scenarios" (unlike, e.g. Reiser3 and Reiser4).

If there is anything that you can conclude after reading this study, it is that ext3 is a reasonably good default choice for a filesystem.

wrong conclusions (1)

penguin-collective (932038) | more than 8 years ago | (#14410596)

What I take away from these benchmarks is that Ext3 is still the most reasonable choice: mature, well supported, and good overall performance.

JFS, XFS, and ReiserFS are small players with a fraction of the user community and a fraction of the tools and support; their performance would have to be astounding in comparison to Ext3 to even consider them, but it isn't.

Unfortunately, benchmark-happy people like you, people who optimize for the wrong thing, are far too frequent in this industry.

Re:wrong conclusions (1)

HidingMyName (669183) | more than 8 years ago | (#14410863)

I think XFS supports cached block access (using DMAPI, if I recall). This helps with low-level access, so that the dump utility can operate on a live filesystem (although write activity during the dump could still cause inconsistency if a snapshot is not used). Ext2FS/Ext3FS don't have such support as far as I know.

Re:I would agree (1, Redundant)

david.given (6740) | more than 8 years ago | (#14410647)

Reiser4: Surprisingly, I didn't see Reiser4 really shine at a whole lot in the benchmarks. The massive mount time tells me it needs to be a local drive that only needs mounting the once. Just not sure what sort of data would be best on it.

I think the ReiserFS mount times in the benchmark are misleading. From my experience, mkreiserfs creates an extremely basic file system; the first time you mount it, the file system driver itself will do a lot of heavy housekeeping, which takes ages. Subsequent mounts are much faster.

In fact, I find the whole benchmark a bit dubious. A lot of the operations will vary wildly in speed depending on how much data is currently in the buffer cache or not. This means that performing the benchmarks in a different order is going to vastly change the results... couldn't he at least put a 'sync' in every now and again?

Re:I would agree (1)

JonXP (850946) | more than 8 years ago | (#14410708)

From TFA:

NOTE1: Between each test run, a 'sync' and 10 second sleep were performed.

Re:I would agree (1)

david.given (6740) | more than 8 years ago | (#14410728)

NOTE1: Between each test run, a 'sync' and 10 second sleep were performed.

D'oh!

But what's the sleep in aid of? It'll achieve precisely nothing --- the sync will block until all I/O is complete.

Re:I would agree (1)

pegr (46683) | more than 8 years ago | (#14410882)

NOTE1: Between each test run, a 'sync' and 10 second sleep were performed.

D'oh!

But what's the sleep in aid of? It'll achieve precisely nothing --- the sync will block until all I/O is complete.

 
Maybe it's to flush from the internal drive cache to the platters? Just because the OS says the data is flushed doesn't mean the data is flushed...
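The hygiene step being debated can be sketched as a wrapper around each benchmark run. The workload here is a placeholder, and the sleep length is parameterized (the article's NOTE1 used 10 seconds); whether the drive's own write cache actually needs that pause is exactly the open question above:

```shell
#!/bin/sh
# Sketch of the per-run hygiene TFA describes: run, sync, then pause.
SLEEP=${SLEEP:-1}   # article used 10; shortened default for quick checks

run_benchmark() {
    # Placeholder workload -- substitute the real benchmark command here.
    dd if=/dev/zero of=testfile bs=1M count=10 2>/dev/null
}

for run in 1 2 3; do
    run_benchmark
    sync            # blocks until the kernel has flushed its dirty pages
    sleep "$SLEEP"  # grace period for the drive's own write cache to drain
done
rm -f testfile
```

The sync guarantees the OS-level flush; the sleep is a (heuristic) concession to the drive's internal cache, which sync cannot see.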

Re:I would agree (2, Interesting)

m50d (797211) | more than 8 years ago | (#14410701)

Reiser4: Surprisingly, I didn't see Reiser4 really shine at a whole lot in the benchmarks. The massive mount time tells me it needs to be a local drive that only needs mounting the once. Just not sure what sort of data would be best on it.

Reiser4 now defaults to journalling everything - file data as well as metadata. If they left it like that, then no wonder it's slower - but it's the best choice if data integrity is important.

Re:I would agree (1)

Tet (2721) | more than 8 years ago | (#14410819)

Reiser4 now defaults to journalling everything - file data as well as metadata. If they left it like that, then no wonder it's slower - but it's the best choice if data integrity is important.

Best choice for you, perhaps. If data integrity is important, then reiserfs is the last place I'd be looking. I'd be going with ext3 with data journalling enabled.

Re:I would agree (1)

m50d (797211) | more than 8 years ago | (#14410836)

I meant best choice of those tested. I'd certainly like to see benchmarking of reiser4 against ext3 with data journaling.

Re:I would agree (2, Insightful)

smoker2 (750216) | more than 8 years ago | (#14410869)

Ext2/Ext3: Mediocre at almost everything. Distros like Fedora that mandate the initial install ONLY use Ext3 are being stupid. The best fall-back filing systems if you can't find anything better for what you want the partition to do, but should never be used in specialized contexts.
How the hell did you come up with that opinion?

Ext3 came 1st or 2nd in 24 out of the 40 tests done. If you were producing an OS for general-purpose computing, would you use a specialist fs or the best-performing general-purpose one?

You seem to have good words for JFS and XFS, though, and XFS had only 13 1st or 2nd places!

How do you work out that Ext3 is "mediocre" from those figures?

(you sound like you run debian)

I would disagree with you (0)

Anonymous Coward | more than 8 years ago | (#14410957)

ext2/3 are actually the clear speed winners.

Almost the only useful part of the article is the table "File Benchmark II Data". Scrolling through all the bar graphs is a waste of time. (Unfortunately, the first few pieces of data appear only in bar graphs, and not in the table.) The "All Test Times" plot would be very useful, except half the points don't have a corresponding label on the x-axis!

And what's important for the kind of evaluation you're doing are only the real world benchmarks like "UnTAR Kernel 2.6.14.4 Tarball", "Copy 2.6.14.4 Kernel Source Tree", "Mount Filesystem", and so on. ext2/3 are the fastest in almost all of these benchmarks. Then comes XFS and JFS, then ReiserFS, and finally Reiser4, which appears to be something of a dog so far as these benchmarks are concerned.

Obviously benchmarks don't tell the whole story. There are advanced features for Reiser4, ReiserFS, XFS, and JFS which could easily be more important for a user than these small differences in speed. And although ext2 is apparently fast for real world use, it is not journalled, which should disqualify it for most users.

The more fundamental benchmarks like "Make 10,000 Directories" are not useful for choosing which filesystem to use. ext2/3 stink on ice when making large numbers of directories, but it's not obvious what fraction of time users spend making directories. Plus, most directories that are created will eventually be removed, at which point ext2/3 win back most of the speed that they lost. What the fundamental benchmarks are good for is figuring out why a particular filesystem is unusually fast or slow at some benchmark, and so how that filesystem could be tuned or otherwise improved.

Anyway, it really looks like ext3 is a decent choice for the general user. I think it's bad to mandate the use of ext2/ext3 at install time, but that's a separate issue.

no reason to switch (1, Informative)

Anonymous Coward | more than 8 years ago | (#14410340)

Actually, what I take from this is there's no need to switch from a safe, standard EXT3 FS which is the default of many distros.

Re:Need to be careful... (4, Insightful)

Raphael (18701) | more than 8 years ago | (#14410476)

One very interesting point was the vast difference in the amount of available space after a partition and format between the different filesystems.

Unfortunately, that graph is rather misleading. The ext2 and ext3 filesystems keep some percentage of the disk space as "reserved", and only root can write to this reserved area. This is useful if the disk contains /var or other directories holding log files, mail queues and the like: even if a normal user has filled the disk to 100%, it is still possible for processes owned by root to store some files until an administrator can fix the problem.

On the other hand, if your filesystem contains only /home or other directories in which users are not competing for disk space with processes owned by root, then it does not make much sense to have a lot of disk space reserved for root. That is why you should think about how the filesystem is going to be used when you create it, and set the amount of reserved space accordingly.

The default behavior for both ext2 and ext3 is to reserve 5% of the disk space for root. You can see it in the section Creating the Filesystems from the article:

4883860 blocks (5.00%) reserved for the super user
You can change this behavior with the -m option, which specifies the percentage of the disk space that is reserved. The article did not mention how the filesystem would have been used in production, but I would guess that -m 0 or maybe -m 1 could have been used in this case. That would have provided a fair comparison, and suddenly you would have seen all filesystems in the same range (close to 373GB available), except maybe for Reiser3.
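A sketch of the -m option in action, using a throwaway image file so no real disk or root access is needed (requires e2fsprogs; the image path is an example):

```shell
# Build a 64 MiB scratch image in place of a real partition.
dd if=/dev/zero of=/tmp/resv.img bs=1M count=64 2>/dev/null

# Format as ext3 with 1% reserved for root instead of the 5% default.
# -F skips the "not a block device" confirmation, -q quiets the output.
mkfs.ext3 -F -q -m 1 /tmp/resv.img

# Confirm: the reserved block count should be ~1% of the block count.
tune2fs -l /tmp/resv.img | grep -i 'reserved block count'
```

The same knob can be turned after the fact with `tune2fs -m <pct>` on an existing filesystem, without reformatting.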

For instance, servers. (0)

Anonymous Coward | more than 8 years ago | (#14410967)

If you are thinking about a file server, it's very important to match the file system to the type of network protocol (NFS, SMB, whatever).

Ext3 and NFS work OK, except that writing a bunch of small files (as in untarring to an NFS-mounted dir) is as slow as, well, swimming in cold tar :P. An untar that takes less than a second on a local ext3 filesystem can take over a minute on the same filesystem mounted via NFS. Yes, that's run from the same fast server. It's a bad interaction, having to do with how NFS and ext3 deal with write commits. Even with all the NFS tweaks in place, it's still a couple of orders of magnitude slower.

Similarly, if your Samba server uses a non-Windows authorization backend, such as NIS or MIT Kerberos/LDAP, your filesystem really doesn't matter -- it's going to be slower than needed due to the overhead of translating the protocols.

A web or other server that does primarily read, read, read all day, should avoid ReiserFS, because if you get slashdotted you want your CPU as free as possible. That's not the only factor, but the point is that a lightweight, non-journalled FS can be your friend if you aren't writing to disk much.

First Prime Factorization Post (1)

2*2*3*75011 (900132) | more than 8 years ago | (#14410213)

Instead of a Western Digital 2*5*5*5GB and Promise ATA/2*2*5*5 controller, I am now using a Seagate 2*2*2*2*5*5GB and Maxtor ATA/7*19 Promise controller. The physical machine remains the same, there is an additional 2*2*2*83MB of swap and I am now running Debian Etch. In the previous article, I was running Slackware 7*13/(2*5) with custom compiled factorization utilities.

Re:First Prime Factorization Post (1)

creimer (824291) | more than 8 years ago | (#14410254)

Is "First Prime Factorization" what you do when you have too much time on your hands? Personally, I would put my time to better use by playing Quake 4 and blowing zombies to kibbles 'n' bits on a fast hard drive subsystem. :P

Re:First Prime Factorization Post (-1, Offtopic)

Anonymous Coward | more than 8 years ago | (#14410427)

Is "First Prime Factorization" is what you do when you have too much time on your hands? Persoanlly, I would put my time to better use by playing Quake 4 and blowing zombies to kibbles 'n' bits on a fast hard drive subsystem. :P

Personally, I would put my time to better use by touching women where they pee.

Re:First Prime Factorization Post (-1, Offtopic)

Anonymous Coward | more than 8 years ago | (#14410509)

Why the urethra fixation, bub?

Hardware mismatch (5, Interesting)

lostlogic (831646) | more than 8 years ago | (#14410260)

It is widely known that Reiser filesystems are heavy on CPU usage, v4 more than v3. These benchmarks seem to show a CPU-bound I/O situation as opposed to an I/O-bound one. As an earlier comment pointed out, the hardware used in this test was a 500MHz CPU. My slowest computer is a 1000MHz system, which is usually I/O limited, not CPU limited. I'd be interested to see these same benchmarks run on real hardware, or some more complex benchmarks (random R/W, DB load, etc.). The hardware used for this test would be suitable for a fileserver, but not much else. In that situation, E2, E3 or XFS are probably the right choices, as the article points out. What about desktop loads, enterprise loads, or something more interesting?

Re:Hardware mismatch (3, Informative)

Hextreme (900318) | more than 8 years ago | (#14410334)

This was definitely an issue in testing here. The wide range of "winning" filesystems for the different tests clearly indicates the bottleneck is somewhere other than the disk. In most modern systems, this isn't an issue.

From TFA: ReiserFS takes a VERY long time to mount the filesystem. I included this test because I found it actually takes minutes to hours mounting a ReiserFS filesystem on a large RAID volume.

Looks like this guy makes a habit out of using systems with 500MHz CPUs... my dual 3GHz xeon box mounts a 1.2TB raid5 array formatted with ReiserFS in about 33 seconds, give or take a couple seconds.

Re:Hardware mismatch (0)

Anonymous Coward | more than 8 years ago | (#14410480)

It's a filesystem.
It shouldn't take a massive machine to mount a drive.
A dual 3GHz is massive for serving out files.
If all you are doing is using Samba or netatalk to serve files, even 500MHz is overkill.

Re:Hardware mismatch (2, Insightful)

Clover_Kicker (20761) | more than 8 years ago | (#14410632)

> If all you are doing is using samba or netatalk to serve files
> even 500mhz is overkill.

Not for ReiserV4 :)

Seriously though, there's nothing wrong with designing a new filesystem to take advantage of modern CPU horsepower as long as everyone understands the system requirements.

Here's what's missing (5, Interesting)

CastrTroy (595695) | more than 8 years ago | (#14410279)

Here's what's missing. They forgot to tell you how well the drive performed after being used for 1 year, and having constantly moved data from one place to another, and constantly deleting and creating new data. It would have been a better test if the drive was about 75% full, with data from 2 years of use, and then the same tests were performed.

Re:Here's what's missing (1)

phoenix.bam! (642635) | more than 8 years ago | (#14410317)

Good idea. You should get right on that. Don't forget to keep accurate logs, as well as make us pretty graphs to show how well each filesystem performs.

Thanks.

Re:Here's what's missing (1)

ebrandsberg (75344) | more than 8 years ago | (#14410734)

Worse: it doesn't say what mount parameters were used, or if any tuning was done. You can change the performance characteristics significantly if you tune the mount parameters. I suspect that reiser4 was in a failsafe mode for data integrity, while the others were doing a bit more caching.

Re:Here's what's missing (1)

drinkypoo (153816) | more than 8 years ago | (#14410770)

Given that filesystem creation was shown, we can probably safely assume that no tuning was done, and that if he had specified mount options, he probably would have shown us those too... though that last part is a bit more questionable.

SATA? (1)

ruiner13 (527499) | more than 8 years ago | (#14410293)

How about some SATA benchmarks? PATA is good, but I suspect things will be much improved with SATA and NCQ. Does anyone have any links?

Re:SATA? (2, Informative)

MarcQuadra (129430) | more than 8 years ago | (#14410434)

IIRC NCQ isn't 100% fully-baked on Linux yet, so even NCQ-capable controllers and drives won't take advantage of it yet. I just upgraded my home file server with NCQ-capable gear and I don't think it's using it yet, even though I'm running the latest kernel.

There are patches for libATA that enable NCQ, but they're not in the mainline yet.

The only thing worse than testing without the new technologies would be testing with half-baked implementations of them. Let's wait until NCQ is done before we try testing with it.

Re:SATA? (1)

undeadly (941339) | more than 8 years ago | (#14410508)

How about some SATA benchmarks? PATA is good, but I suspect things will be much improved with SATA and NCQ. Does anyone have any links?

Most won't notice any speed difference when moving from PATA to SATA. On PATA you typically have two hard disks on the same controller, which hurts performance when using both disks at the same time. With SATA this is not a problem, assuming you have enough SATA ports available. NCQ may reduce desktop performance, and is most useful for server-like environments. For more info, search Storage Review [storagereview.com]

Forget SATA (1)

WindBourne (631190) | more than 8 years ago | (#14410633)

I want to know the SANTA benchmark. How did he travel all over the world and when will he not be able to handle anymore?

Meaningless? (0)

bombshelter13 (786671) | more than 8 years ago | (#14410306)

So he's benchmarked two different rounds of file systems on two almost completely different hardware setups (different drives, different controllers, different amounts of RAM) and produced completely meaningless results? This is news how?

Warning (2, Informative)

c0dedude (587568) | more than 8 years ago | (#14410315)

Remember, fastest!=best. Some filesystems cannot shrink. Some cannot change size at all. If you're doing anything with LVM or RAID, generally ext3 is the way to go. If you're just formatting a disk and using it without anything on top of it, these FS's may be for you. Then again, ext3 looks damn good in the tests as stands. XFS looks like the clear loser.

Re:Warning (1)

Dionysus (12737) | more than 8 years ago | (#14410443)

How often do you shrink a volume, anyway? I think speed when reading the data you expect to have on a given volume is much more important than whether the filesystem can shrink or not. I don't know about JFS, but both XFS and ReiserFS let you grow the filesystem. If you are using LVM and RAID, both ReiserFS and XFS are great.

Re:Warning (1)

lividdr (775594) | more than 8 years ago | (#14410519)

Problem is that EXT2/EXT3 don't do online resizing. I see there are kernel and e2fsprogs patches to support it, but I keep seeing 'Make sure you have a very good backup' in the notes. Reiserfs, at least, does online grow very nicely.

Sucks to find out your ext3 /usr is a pinch too small for the new OOo 2.0 build you just did and have to kill off just about everything (or reboot into single) just to unmount /usr.

Re:Warning (3, Insightful)

drinkypoo (153816) | more than 8 years ago | (#14410839)

XFS does things that ext? and Reiser can't do. Reiser does things other FSes don't do as well. It's a true 64-bit filesystem and it supports insanely large filesystems, up to 9 million terabytes in 64 bit mode (with a 64 bit kernel.) It even provides realtime support, although I guess that's still beta in linux? It can be defragged and even dumped while live. It has insanely quick crash recovery. And of course, it does other stuff too; check the project page [sgi.com] . XFS may not be the fastest filesystem - it may even be the slowest - but it's got features no other filesystem has. If you need them, XFS is the winner. Hell, if you just trust XFS more than you trust other filesystems, it's the winner. (Sorry, but I wasn't sleeping when reiser was eating everyone's data, and ext3 handles corruption much more poorly than any of the other Journaled options.)

Something's up (1, Interesting)

Anonymous Coward | more than 8 years ago | (#14410327)

I'll leave aside the fact that all the other benchmarks I've seen are very favourable to Reiser4 and this is very unfavourable, and concentrate on the discrepancies.

Reiser4 is the slowest in searching, creating and removing. It performs a lot better when tarring and untarring, which indicates that reading and writing is much better than other filesystems. However, when you get to copying and creating large files, it loses again.

Why the discrepancy? These benchmarks contradict others, but don't make sense when taken alone. I'm inclined to believe the other benchmarks.

how to lie with statistics (4, Insightful)

Clover_Kicker (20761) | more than 8 years ago | (#14410329)

I love the CPU utilization graph for "touch 10,000 files".

A quick glance shows ReiserV4 as much more CPU intensive, you have to look at the scale to realize it only used 0.3% more CPU.

agreed: please rescale CPU utilization graphs (0)

Anonymous Coward | more than 8 years ago | (#14410387)

I have to agree with you, I got fooled looking at these charts. It would be a lot more helpful to me from a practical standpoint (i.e. which filesystem to choose) if all CPU graphs were scaled from 0 to 100. That would help me understand which differences were important, and which were irrelevant.

Re:how to lie with statistics (1)

j0ebaker (304465) | more than 8 years ago | (#14410764)

ReiserFS is much more CPU intensive when it comes to writing small files. Remember how the partition is broken up into small clusters (a size of 4KB comes to mind, but this is customizable). Other filesystems use each of those chunks as the smallest amount of space that can be allocated to a file, whereas ReiserFS uses disk space much more efficiently by packing more than one file into a cluster.

So after touching 10,000 files it would have been interesting to analyze how much space on the disk was used by each of ReiserFS's competitors.

I like to use ReiserFS on IMAP mail servers where the sizes of individual messages can be very small.

On the other hand, I've seen huge Beowulf clusters where client machines touch files over NFS and the CPU load on the fileserver went ballistic precisely because ReiserFS was trying to make such efficient use of the space.
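A back-of-the-envelope sketch of the space effect described above. The 4KB block size and the 600-byte average file size are illustrative assumptions, and real filesystems add inode and metadata overhead that this ignores:

```python
# Compare block-rounded allocation (ext2/3-style) with ideal tail packing
# for 10,000 small files. Purely illustrative arithmetic.
BLOCK = 4096          # common default block size, bytes
files = 10_000
avg_size = 600        # hypothetical average size of a small mail message, bytes

# every file occupies a whole number of blocks
block_rounded = files * ((avg_size + BLOCK - 1) // BLOCK) * BLOCK
# ideal tail packing: no per-file rounding at all
tail_packed = files * avg_size

print(f"block-rounded: {block_rounded / 2**20:.1f} MiB")
print(f"tail-packed:   {tail_packed / 2**20:.1f} MiB")
```

With mail-sized files, block-rounded allocation wastes most of each block, which is exactly the space tail packing reclaims - at the cost of the extra CPU work the parent describes.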

Re:how to lie with statistics (0, Troll)

Cyno (85911) | more than 8 years ago | (#14410969)

Yeah, right, they're lying to you. Those evil anti-Reiser zealots. They only provided 40 graphs of various benchmarks, and only 80% of those scale up from 0. It's so obvious how biased these results are against ReiserV4. It's so full of lies I bet Microsoft funded this benchmark.

I use the same machine! (1)

denverradiosucks (653647) | more than 8 years ago | (#14410361)

[quote]
    COMPUTER: Dell Optiplex GX1
         CPU: Pentium III 500MHZ
         RAM: 768MB
        SWAP: 2200MB
  CONTROLLER: Maxtor Promise ATA/133 TX2 - IN PCI SLOT #1
 DRIVES USED: 1] Seagate 400GB ATA/100 8MB CACHE 7200RPM
              2] Maxtor 61.4GB ATA/66 2MB CACHE 5400RPM
DRIVE TESTED: The Seagate 400GB.
[/quote]

It's comforting to know I'm not the only one still using one of these! Those are almost the exact same specs as my linux server!

Re:I use the same machine! (0)

Anonymous Coward | more than 8 years ago | (#14410686)

Me too: I have one of these exact machines but with only 512MB RAM and 1GB of swap space. It hosts several ATA/100 HDDs grouped with LVM and formatted as a single 0.5TB XFS volume. The OS - GNU/Linux - boots and runs from an Ultra160 18GB Seagate drive.

Until yesterday, when a drunken slip of the finger entered init 0 instead of 1 [doh!], it had happily been running for ~200 days, providing rock-solid and high-speed file serving of mostly large files [hence XFS].

I've just obtained another almost identical machine [GX110, Pentium III @ 1GHz] which will be outfitted almost identically, except with 2x the power in just about everything that counts: the CPU is twice as fast, RAM will be 1GB, and it will be stuffed with 1TB of cheapo ATA disks. An Ultra320 SCSI disk for the OS will finish the job.

Luckily I get these boxes free from work as they are retired, and normally just have to spring for the SCSI kit and extra RAM: nonetheless, if anyone is looking to build themselves a great value-for-money fileserver I would certainly recommend one of these old GX* boxes as a base. They're miles better than the Dell crap I have to look after these days [Optiplex GX280, for example].

somewhat worthless (5, Insightful)

aachrisg (899192) | more than 8 years ago | (#14410375)

His benchmark data is ruined by using a grossly unrealistic piece of hardware - modern fast hard disks coupled with a CPU which is absurdly slower than anything you can buy.

Re:somewhat worthless (1)

cli_man (681444) | more than 8 years ago | (#14410461)

I am always surprised when I go into companies and see the old equipment they are running. It is not unusual to find a machine in the 500MHz range with an IDE RAID controller holding a few hundred gigs of space. Not everyone has the budget for the newest and greatest, and not everyone needs the processing speed. Many people are running fileservers that store a ton of info but don't actually process anything.

Re:somewhat worthless (1)

Josh (2625) | more than 8 years ago | (#14410544)

Other people have mentioned that it is not uncommon to use slower CPUs for fileservers since more CPU is often overkill. But even for workstations, one situation that should be of interest to many people is compiling one or more large source trees - there CPU usage of the filesystem is very relevant.

Slow processors, compiling (1)

Stunning Tard (653417) | more than 8 years ago | (#14411027)

As you and others have pointed out, running the benchmarks with a slower CPU is useful.
So I'd agree that these tests aren't worthless, but they're only a start.

Also useful would be running these tests with a faster CPU to see how things change. The CPU might be a bottleneck in some cases, and it would be interesting to see how the picture changes. CPU utilization went to 100% in many of his tests.

You could also try some tests with a filesystem mounted in memory to see where seek time becomes a bottleneck, because flash drives might yet overtake hard drives for price AND speed. Some people use flash drives regardless of the cost.

These tests are also application-independent, which limits their usefulness a little. When somebody benchmarks a new 3D video card they'll start with 3DMark, but then they'll continue on and test with actual games.

So I'd like to see some practical benchmarks. Compiling something large is a great start. Then try various database loads. Some workstation or home pc desktop apps. Games. Some of the tests done by the folks at StorageReview.com might be relevant too.

Sample size (2, Insightful)

rongage (237813) | more than 8 years ago | (#14410423)

Am I reading this "benchmark" correctly? Did he base his results on a sample size of 1?

At the very least, you run multiple times and average the results to give statistically meaningful numbers. I can't think of ANY time when a sample size of 1 was meaningful for anything.

What would be really interesting is to come up with a reasonable UCL and LCL for each test, and then calculate a Cpk for each test. It's one thing to say "I got these results one time"; it's something much more impressive to say "I can achieve this result +-10%".

Of course, if a particular benchmark can't even hit a Cpk of 1, then maybe there is room for improvement in the coding of the driver.

For those of you who haven't done much with statistics, Cpk is a measure of "capability" in a machine or process. It shows how repeatable the measured process is. A higher number indicates that you have a highly targeted, low-deviation process, whereas a low number (1 or less) indicates that your process is incapable of repeatability and/or accuracy.
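For anyone who wants to try this on their own runs, here is a minimal sketch of the Cpk calculation described above. The timings and the +-10% spec limits are made up for illustration:

```python
import statistics

def cpk(samples, lsl, usl):
    """Process capability index: the smaller distance from the mean to a
    spec limit, measured in units of three standard deviations."""
    mu = statistics.mean(samples)
    sigma = statistics.stdev(samples)   # sample standard deviation
    return min(usl - mu, mu - lsl) / (3 * sigma)

# Hypothetical repeated timings (seconds) for one benchmark, with
# made-up spec limits of the 10-second target +-10%.
times = [10.1, 9.8, 10.0, 10.2, 9.9]
print(round(cpk(times, lsl=9.0, usl=11.0), 2))  # → 2.11
```

A Cpk above 1.33 is conventionally considered a capable process, so with these made-up numbers the benchmark would be comfortably repeatable.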

Sample size == 3 (0)

Anonymous Coward | more than 8 years ago | (#14410727)

NOTE5: All tests were run 3 times and the average was taken,
              if any tests were questionable, they were re-run and
              checked with the previous average for consistency.

Re:Sample size (1)

bubulubugoth (896803) | more than 8 years ago | (#14410765)

No, you didn't read TFA correctly.

He took 3 samples of each test and published the average.

The author states this at the beginning of TFA, where he was explaining the methodology and the test cases.

Re:Sample size (1)

Atzanteol (99067) | more than 8 years ago | (#14410881)

Am I reading this "benchmark" correctly? Did he base his results on a sample size of 1?

No you're not, and no he didn't. FTFA:

NOTE5: All tests were run 3 times and the average was taken, if any tests were questionable, they were re-run and checked with the previous average for consistency.

It would be nice if... (4, Insightful)

bhirsch (785803) | more than 8 years ago | (#14410437)

There were some current (recent 2.6 kernel with XFS, JFS, possibly Reiser4, etc) benchmarks done on highend servers (or at least something with drives a few steps up from the CompUSA weekly special), especially if anyone wants to see Linux succeed in the enterprise.

Normalized results (3, Informative)

dtfinch (661405) | more than 8 years ago | (#14410441)

Based on the geometric mean of all the benchmark times for each filesystem, which effectively weights all benchmarks equally:
JFS won
EXT2 and EXT3 took 17% longer than JFS
XFS took 29% longer than JFS
Reiser3 took 38% longer than JFS
Reiser4 took 52% longer than JFS

Now, 1.52 seconds is not a whole lot longer to wait than 1 second. With any luck we'll see a post from Hans explaining why Reiser4 took longer, or what sacrifices were made to make the others faster, if there are any.
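A sketch of the normalization used above, with made-up numbers; the geometric mean is what lets benchmarks on wildly different time scales count equally:

```python
import math

def geometric_mean(times):
    """Geometric mean: the appropriate average when combining ratios,
    so every benchmark is weighted equally regardless of its scale."""
    return math.exp(sum(math.log(t) for t in times) / len(times))

# Hypothetical times (seconds) for two filesystems across three benchmarks
# of very different magnitudes.
fs_a = [2.0, 50.0, 0.5]
fs_b = [4.0, 25.0, 1.0]

ratio = geometric_mean(fs_b) / geometric_mean(fs_a)
print(f"B took {ratio:.2f}x as long as A overall")  # → 1.26x
```

Note that an arithmetic mean of the same numbers would be dominated by the 50-second benchmark and call A the loser, which is exactly the distortion the geometric mean avoids.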

Re:Normalized results (5, Insightful)

phoenix.bam! (642635) | more than 8 years ago | (#14410702)

Reiser uses much more CPU for filesystem tasks. ReiserFS is a modern filesystem meant to run on modern machines. This machine is only 500MHz, and therefore Reiser performs poorly. Had this been a 2GHz machine (standard now, 4x faster than the test machine), or even a 1GHz one (outdated, and still 2x as fast), Reiser would have performed much better.

If you want to use parts from 1997 to build a computer, Reiser is not for you. 500MHz is at least 8-year-old technology if I remember correctly.

Old Comps (1)

Zaurus (674150) | more than 8 years ago | (#14410900)

> 500mhz is at least 8 year old technology if I remember correctly.

Close. When I bought a computer in Jan 1998 (8 years ago this month), the fastest processor available from Dell/Gateway was a PII-300. I'd say it's more 6-7 year-old tech.

Re:Normalized results (1)

Nimey (114278) | more than 8 years ago | (#14410912)

About six years. 500 MHz processors came out in late 1999.

Re:Normalized results (0, Insightful)

Anonymous Coward | more than 8 years ago | (#14410999)

Don't use that software-garbage excuse of "there's more CPU, let's always use it because we can".

That's why stock Dells and HPs are so much god damn slower than much worse-specced machines.

If that's the concept behind Reiser, I can only guess a large portion of the Linux population is retarded.

JFS ... (2, Interesting)

Pegasus (13291) | more than 8 years ago | (#14410930)

Of course JFS won, since it was designed to be as simple as possible... it originated on OS/2, after all. On a machine like the one used in this test, that is a huge advantage.

Of course Reiser4 was slow (1, Insightful)

Anonymous Coward | more than 8 years ago | (#14410464)

Everyone knows Reiser4 uses a lot of CPU, and these guys run the test on a 500MHz machine!!

Uhm, whats with the chart? (1)

FunkyELF (609131) | more than 8 years ago | (#14410471)

I'm looking at the all-test-times chart and it seems to misrepresent the time taken to cat a 1GB file to /dev/null. http://linuxgazette.net/122/misc/piszcz/group002/image018.png [linuxgazette.net] shows REISERv3 as the 4th best in the last set of data points, but... http://linuxgazette.net/122/misc/piszcz/group002/image017.png [linuxgazette.net] shows it as the clear loser. Also, the data at the bottom of the article confirms it. WTF?? I call shenanagins (sp?) ~ELF

IDE Drives Cause other Overheads (4, Insightful)

j0ebaker (304465) | more than 8 years ago | (#14410491)

It would be interesting to see the results of the same tests run against a SCSI drive system, where there is less I/O overhead, to see if the results differ.
There are other considerations here as well: what about the I/O elevator's tuning options?
Yes, I'd much rather see this test run against a SCSI drive, or better yet against a RAM drive for pure software performance.

Cheers, fellow Slashdotters!
-Joe Baker

Re:IDE Drives Cause other Overheads (1)

oglueck (235089) | more than 8 years ago | (#14410573)

The IO scheduler should not matter, as it is only important when multiple processes access the disk.

Re:IDE Drives Cause other Overheads (1)

j0ebaker (304465) | more than 8 years ago | (#14410658)

Wouldn't a single journaling-filesystem transaction be considered three independent writes?
I've also learned that the first part of a hard drive is the fastest. I trust that the author used the same partitioning scheme for each test, to be fair. If I'd known the first part of the hard drive was faster, my laptop's swap partition would be the first partition on the drive instead of the last.

Re:IDE Drives Cause other Overheads (2, Informative)

oglueck (235089) | more than 8 years ago | (#14410750)

Wouldn't a single journaling-filesystem transaction be considered three independent writes?

No. A single transaction comes from a single thread. So the IO scheduler has no freedom here. It consists of these operations:

1. write redo log
2. write
3. clear redo log

They must occur in exactly this order. There are flush operations involved as well but I am not an expert here.
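A toy user-space sketch of that ordering, with fsync() standing in for the filesystem's write barriers. Real journals log at the block level and batch many transactions, which this ignores entirely:

```python
import json
import os
import tempfile

def journaled_write(journal_path, data_path, offset, payload):
    """Preserve the redo-log ordering: the log entry must be durable
    before the in-place write, and the write durable before the log
    entry is cleared. fsync() is the barrier between steps."""
    # 1. write the redo log entry and force it to stable storage
    with open(journal_path, "w") as j:
        json.dump({"offset": offset, "data": payload.decode()}, j)
        j.flush()
        os.fsync(j.fileno())
    # 2. do the real in-place write and force it out
    with open(data_path, "r+b") as d:
        d.seek(offset)
        d.write(payload)
        d.flush()
        os.fsync(d.fileno())
    # 3. only now retire the redo log entry
    with open(journal_path, "w") as j:   # opening with "w" truncates the log
        j.flush()
        os.fsync(j.fileno())

# tiny demo on throwaway files
with tempfile.TemporaryDirectory() as tmp:
    jpath = os.path.join(tmp, "journal")
    dpath = os.path.join(tmp, "data")
    open(dpath, "wb").write(b"........")
    journaled_write(jpath, dpath, 2, b"ab")
    print(open(dpath, "rb").read())  # → b'..ab....'
```

If a crash happens after step 1 but before step 3, recovery replays the still-present log entry; that is why the order can never be rearranged by an I/O scheduler.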

Part III (1, Funny)

renrutal (872592) | more than 8 years ago | (#14410594)

Part III of the test should feature the filesystem behaviour during a Slashdot Effect.

I must say the filesystem they're trying in the current effect is really failing. No pages served booh!

Ok, I'll be blunt. (0)

Anonymous Coward | more than 8 years ago | (#14410638)

Aside from specific needs like constant-speed streaming (XFS), it's a processor thing:

Do you have lots of CPU time available?

Yes: ReiserFS will extract the most from your computer.

No: Don't use ReiserFS; maybe JFS or something...

Just my personal opinion, not endorsed by anyone here.

Outdated hardware... (3, Informative)

tetabiate (55848) | more than 8 years ago | (#14410666)

Anyway, how is the average user supposed to be concerned by these results?
In my daily work I manage hundreds of GB's of data and have hardly seen a significative difference between XFS, JFS and ReiserFS v.3 on relatively modern hardware (Tyan S2882 Pro motherboard, two Opteron 244 processors, 4 GB RAM and two 250-GB SATA HD's) running OpenSuSE 10. I put the most important data on a XFS partition but also have a small ReiserFS partition which can be read from Windows.

-- Help us to save our cousins the great apes, do not use cell phones.

Re:Outdated hardware... (1)

Gramie2 (411713) | more than 8 years ago | (#14410773)

"significative" is a perfectly cromulent word.

a couple of comments as AC (0)

Anonymous Coward | more than 8 years ago | (#14410679)

1) The physical machine is the same? But you've just said you've replaced the HD and the HD controller!

2) I notice in small print at the bottom what I believe to be the case too, after looking at the overall figures: XFS seems to be the best performer overall in terms of CPU load and filesystem speed for day-to-day tasks. Okay, it loses big time on a few items. I'd never realised how painfully crap ReiserFS is for many, many files... and yet it's constantly been 'bigged up' as the choice to make for Maildir systems. Why?? Who would do something like use ReiserFS for a mail server?

let's clarify things a little (0)

Karaman (873136) | more than 8 years ago | (#14410721)

ext3 without -j (the journal mounting option) is no more than an ext2 partition, and -j with the default commit interval of 2 to 4 seconds is just pushing your luck with AC blackouts. As for XFS, it repairs itself without external tools :) P.S. Just for the record: I have used ext3, ext2 and XFS so far for / and data partitions, and to tell the truth I don't like ext2 at all because I get errors too often. Unlike it, XFS has never failed me except once, when I had to mount a partition read-only to repair it (still debugging this case). ext3 without -j is just the same, although my brain didn't realize it until repairing the partition (brain still intact) proved fruitless. ext3 with -j was slower than my grandma.

Bad graphs to prove a point (2, Informative)

Anonymous Coward | more than 8 years ago | (#14410740)

The total free-space graph is a poor statistical representation.

It starts at 345GB and goes to 375GB on the y scale. This makes the difference between 355 and 370 look like a huge jump rather than the roughly 4% increase it actually is.

He does it again in "make 10,000 directories": 99.5% is not double the CPU use of 97%.
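A quick sketch of the arithmetic, using the 355 and 370 values read off the free-space graph; the point is that a truncated axis compares bar heights above the axis floor, not the values themselves:

```python
def visual_ratio(a, b, axis_min):
    # On a bar chart whose y axis starts at axis_min instead of 0,
    # bar heights compare as (value - axis_min), not as the values.
    return (b - axis_min) / (a - axis_min)

a, b = 355.0, 370.0   # approximate free-space values from the graph, GB
actual = 100 * (b - a) / a
apparent = visual_ratio(a, b, 345.0)

print(f"actual increase: {actual:.1f}%")                 # → 4.2%
print(f"apparent bar-height ratio on a 345-375 axis: {apparent:.1f}x")  # → 2.5x
```

So a ~4% difference in the data is drawn as one bar two and a half times taller than the other.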

That's what I call ironic! (1, Offtopic)

aconkling (916504) | more than 8 years ago | (#14410741)

Are those graphs really created in MS Excel?

Nice stats... but wrong... (3, Interesting)

strredwolf (532) | more than 8 years ago | (#14410806)

You know, I was looking at all the stats from this roundup... and while I'm glad they have one nice stat (how much space the FS itself takes, leaving the rest for data), I'm not happy that there is no "We've loaded it up, let's see how much is left" statistic.

What am I saying? I want to know how efficient these filesystems are at packing data onto the HD.

  • I know Reiser v3 has "tail packing", which takes small files and the ends of files that stick out past a block boundary and packs them into "sub-blocks" to save space. ext2/3 is stuck at the block boundary (even though you can adjust the size of these blocks).
  • I don't know if ext2/3 has been enhanced to pack small files into inode data.
  • JFS and XFS do not have a tail-packing feature, and are likewise stuck at (adjustable) block boundaries.


I'm glad that you get more data out of Reiser v4, JFS, and XFS at formatting time, but my feeling is that Reiser v4 (once profiled, tweaked and refined for speed and space) will pack data tighter than anyone else. Meanwhile, I'm looking for something like ext3 that packs better.

Thanks! (0)

Anonymous Coward | more than 8 years ago | (#14410874)

It's missing the most critical stuff from the tests, but I guess those things are hard to measure without manually creating a hardware failure.

I'm glad for this information, though. It affirms my choice of Ext-3 as the best all-around filesystem for my Linux servers and workstations. It's not the fastest, certainly not the slowest, but it's well-supported with utilities, and standard in every bootdisk kernel.

Poor benchmark writeup. MS Excel graphs? (2, Informative)

Srdjant (650988) | more than 8 years ago | (#14410880)

What's with the Microsoft Excel-style graphs? They're not very precise or professional-looking.
You would have thought the author would use something better, like gnuplot.

The author's opinion "Personally, I still choose XFS for filesystem performance and scalability." is largely irrelevant here and sounds like bias, although the author acknowledges this.

There is no discussion of the results. The text between the graphs only mentions superficially what is obvious to anyone looking at the graphs.

Seems a far cry from the very nicely done BSD and Linux benchmark at http://bulk.fefe.de/scalability/ [bulk.fefe.de]

A bit ridiculous... (0)

Anonymous Coward | more than 8 years ago | (#14411008)

Well. Wow. A benchmark that runs for between 0.03 and 0.07 seconds sure is quite precise, free from random variations and stuff. I bet most of the time was taken by spawning the "touch" process 10k times anyway.

About the "free space" issue: some filesystems count the space used by their internal data structures, some don't; so for instance Reiser shows less free space after formatting, but that doesn't mean you can put less data on it...

Also, the results are pretty useless. I'm not really interested in knowing how long it takes to sequentially access a small number (10K) of files which are cached in RAM anyway, so it's CPU-limited and not disk-limited. I'm more interested in what happens when the dataset is larger than RAM, how intelligently stuff is cached, etc. These make a lot of difference in the way the computer feels, between sluggish and responsive, but there is really no benchmark for that...

All I know is that reiser4 is the only filesystem that made my crap slow laptop hard drive at least usable. That's a good enough benchmark for me...

Flash / SWF (0, Redundant)

fire-eyes (522894) | more than 8 years ago | (#14411037)

Editors: Please don't post links to garbage-ridden pages like this. I got at least three or four prompts in Konqueror 3.5 to save a .swf file or cancel.

those FS are junk (0)

Anonymous Coward | more than 8 years ago | (#14411047)

no NTFS, no fucking care.