[arch-general] Btrfs more than twice as fast compared to ext4

Nathan Wayde kumyco at konnichi.com
Mon Mar 15 11:14:35 CET 2010


On 13/03/10 03:05, Shridhar Daithankar wrote:
> Hi,
>
> Just wanted to share an interesting experience I had today.
>
> Check http://ghodechhap.net/btrfs.performance.txt

Maybe you're looking for http://docs.python.org/library/filecmp.html

One cannot help but think that you took a disk-bound process and turned 
it into a CPU-bound one. Since you're only interested in which files are 
different, you should have just used `cmp` instead of `md5sum`; the 
latter is overkill, and I'd assume calling an external command that many 
times can't be very nice either.
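A minimal sketch of that cmp-style approach, written for Python 3 and 
assuming both trees are ordinary directories. The name differing_files 
is hypothetical (it is not part of filecmp); it walks the first tree and 
compares same-named files byte by byte with filecmp.cmp(..., 
shallow=False), so no checksum is computed and no external process is 
started.

```python
import filecmp
import os


def differing_files(left, right):
    """Return relative paths of files present in both trees whose contents differ."""
    different = []
    for dirpath, _dirnames, filenames in os.walk(left):
        rel_dir = os.path.relpath(dirpath, left)
        for name in filenames:
            a = os.path.join(dirpath, name)
            b = os.path.join(right, rel_dir, name)
            # Skip broken symlinks and files that exist in only one tree;
            # those are not content differences.
            if not (os.path.isfile(a) and os.path.isfile(b)):
                continue
            # shallow=False forces a byte-by-byte comparison instead of
            # trusting matching size/mtime.
            if not filecmp.cmp(a, b, shallow=False):
                different.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(different)


if __name__ == "__main__":
    import sys

    # Usage: python differing.py /usr/lib /temp/lib
    for path in differing_files(sys.argv[1], sys.argv[2]):
        print(path)
```

Note that dircmp(...).diff_files only reports files in the top-level 
directory and uses the default shallow comparison, so this sketch does 
more work per file and its numbers would not match a dircmp run directly.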

Here are some comparisons. They use /usr/lib; I figured 75000 files 
should be a good test. I made this as deliberately unfair/incomparable 
as possible, because I wanted to show the potential overhead of calling 
md5sum that many times.

[[ky] ~]# }} time python -c "import filecmp; print len(filecmp.dircmp('/usr/lib', '/temp/lib').diff_files)"
80

real	2m24.240s
user	0m10.123s
sys	0m10.669s

That looks reasonable on this crappy 5400 rpm (SATA) laptop hard disk 
with ext4.

You'll note that the test below is pretty much just to see how much time 
calling md5sum takes. /tmp/a is a 1-byte file (it contains the character 
a, to give md5sum as simple a job as possible). /tmp is a tmpfs, not 
that it matters, as /tmp/a most likely remains cached the entire time.

[[ky] ~]# }} time find /temp/lib -type f | wc -l
75272

real	0m0.532s
user	0m0.140s
sys	0m0.383s

[[ky] ~]# }} time find /temp/lib -type f -exec md5sum /tmp/a \;

real	2m6.781s
user	0m2.200s
sys	0m15.409s

The disk-status light didn't come on at all during those 2 minutes, 
while I could hear my CPU fan going crazy the whole time (1.6 GHz). I 
should note that the light remained on the entire time during the 
filecmp run, and the CPU stayed low (800 MHz) for most of that time as 
well.
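
To put a rough number on the per-call overhead, here is a small Python 3 
sketch that times spawning an external command N times against hashing 
the same file N times in-process with hashlib. The helper names 
(md5_in_process, time_spawns, time_in_process) are made up for this 
example, and it assumes md5sum is on the PATH when run as a script; 
treat the timings as indicative only.

```python
import hashlib
import subprocess
import time


def md5_in_process(path, chunk_size=1 << 20):
    """Hash a file with hashlib, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def time_spawns(command, count):
    """Seconds taken to run `command` (an argv list) `count` times."""
    start = time.perf_counter()
    for _ in range(count):
        subprocess.run(command, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def time_in_process(path, count):
    """Seconds taken to hash `path` `count` times without spawning anything."""
    start = time.perf_counter()
    for _ in range(count):
        md5_in_process(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    # /tmp/a is the 1-byte file from the test above; 1000 runs keeps it short.
    runs = 1000
    print("external md5sum: %.3fs" % time_spawns(["md5sum", "/tmp/a"], runs))
    print("in-process md5:  %.3fs" % time_in_process("/tmp/a", runs))
```

Since the file is tiny and cached, nearly all of the time in the first 
measurement should be process creation rather than hashing.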


