[Bioclusters] 3ware 7850 RAID 5 performance

Joe Landman bioclusters@bioinformatics.org
27 Aug 2002 14:51:15 -0400


On Tue, 2002-08-27 at 10:37, Vsevolod Ilyushchenko wrote:

> BTW - what does JBOD mean in this context? How is it different from
> RAID 0?

JBOD in this context means taking the individual drives and
concatenating (appending) them into one very large drive.

RAID0 would be striping across that same set of disks, so that large
reads and writes hit all the spindles at once instead of filling one
drive at a time.  This level of RAID should get you the best raw
performance (though with no redundancy).

> >   RAID on these systems are going to be limited to the speed of the
> > slowest disk.  If the disk is in PIO modes rather than UDMA modes, then
> > I could imagine that you have that sort of write speed. 
> 
> How would I check that?

I would take a look at the card documentation.  I can't do this myself
right now, though (I'm in the middle of building a minicluster).
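
If these were plain IDE drives visible to the kernel as /dev/hd*,
something like

	hdparm -i /dev/hda

would list the supported and currently selected (U)DMA modes for the
drive.  The 3ware card presents its units as SCSI devices, though, so
you will probably need the card's own documentation or tools to see the
per-drive transfer modes.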

>  >  It is also
>  > possible, that if you are using a journaling file system such as XFS,
> > and you are pointing your log to write to a single disk somewhere else,
> > that is likely to be your bottleneck.
> 
> The filesystem is a simple ext3.

Ok...  ext3 is not known to be a high-performing fs.  It is a little
slower than ext2 on most operations.  Ext3 is essentially ext2 plus a
journal, and there are mount options that control how that journaling
behaves.

Try adding things like

	noatime,data=ordered

to the mount options in the /etc/fstab.
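
For example, assuming the array is /dev/sda1 mounted at /data (adjust
the device and mount point to match your system), the entry would look
something like:

	/dev/sda1    /data    ext3    defaults,noatime,data=ordered    1 2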

> >   Which file system are you using?  What is the nature of your test
> > (large block reads/writes), and specifically how are you testing? 
> 
> Testing was done with bonnie++.

Results should be reasonable.
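
For a sanity check, a run along these lines should exercise the array
(the mount point is an assumption; -s should be at least twice RAM so
the cache can't hide the disks):

	bonnie++ -d /data -s 4096 -u root

The -s 4096 (MB) matches the 4 GB file size you mention below.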

>  > What
> > is the machine the card is plugged into? 
> 
> The machine has two 1.26 Ghz CPUs and 2 Gb of RAM, so the file size used 
> in bonnie++ testing was 4 Gb.

Good.  This shouldn't be a factor, unless you are slamming the network
and overloading the PCI bus at the same time you are slamming the
file system.  What is your user/system load during the tests?
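
You can watch this while a test runs with

	vmstat 1

keeping an eye on the us/sy/id CPU columns and the bi/bo block I/O
rates.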

>   What is the reported speed for
> > 
> > 	hdparm -tT /dev/raid_device
> > 
> > where /dev/raid_device is the device that appears to be your big single
> > disk.  Are you using LVM?  Software RAID atop a JBOD? ???
> 
> Hdparm's numbers are surprisingly high:
> 
> /dev/sda1:
>   Timing buffer-cache reads:   128 MB in  0.49 seconds =261.22 MB/sec
>   Timing buffered disk reads:  64 MB in  1.89 seconds = 33.86 MB/sec

I would say low.  If /dev/sda1 is a single disk rather than the device
which represents the entire RAID, I could understand the above.  Your
buffered disk reads should be around 30 MB/s times N, where N is the
number of disks in the stripe.  If N=1, or if you have the card in the
JBOD config, these results make sense.
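
One quick check on what the kernel actually sees (device names here are
an assumption; a 7850 exporting its drives individually would show up
as several SCSI disks):

	for d in /dev/sd?; do hdparm -tT $d; done

If that loop finds several disks instead of one, the card is exporting
the drives as standalone/JBOD units rather than as a single RAID5 unit.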

> No software RAID is used, just the card's RAID 5.

Have you called the 3ware folks?

> 
> > If you run the following, how long does it take?
> > 
> > 	/usr/bin/time --verbose dd if=/dev/zero of=big bs=10240000 count=100
> > 
> > On my wimpy single spindle file system, this takes 42 wall clock
> > seconds, and 7 system seconds.  This corresponds to a write speed of
> > about 24.4 MB/s.
> 
> 13 wall clock and 7 system seconds.

13 wall clock seconds to write a 1 GB file corresponds to a write speed
of about 78.8 MB/s.  I would guess at a 3 or 4 way stripe.  If you are
running this off of a PCI32 bus you are nearly maxing it out.  If you
are running this off of a PCI64 or PCI-66MHz bus, you should have more
room for higher performance.
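
The arithmetic, for reference:

	100 * 10,240,000 bytes = 1,024,000,000 bytes (~1 GB)
	1,024,000,000 bytes / 13 s ~ 78.8 MB/s

Plain 32-bit/33 MHz PCI tops out at 133 MB/s theoretical, so ~80 MB/s
sustained is close to its practical ceiling.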

> However, writing a 4 Gb file took 6 minutes total!!! (And only 30 system 
> seconds.) Another data point: writing a 2 Gb file took 1:52 total and 14 
> system seconds. Something is very wrong here.

I agree that something sounds rather odd.  Try that dd command with the
count changed from 100 to 200 (a 2 GB file) and 400 (a 4 GB file).
Then see whether the wall clock time scales linearly with the file
size; if it doesn't, something other than raw disk speed is in the way.
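
That is:

	/usr/bin/time --verbose dd if=/dev/zero of=big bs=10240000 count=200
	/usr/bin/time --verbose dd if=/dev/zero of=big bs=10240000 count=400

At 78.8 MB/s these should take roughly 26 and 52 seconds; the 1:52 and
6:00 you report are far off that.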

> > If you run the following after creating the 1 GB file, how long does it
> > take?
> > 
> > 	/usr/bin/time --verbose md5sum big
> > 
> > On the same wimpy single spindle file system, this takes 50 seconds for
> > a read of about 20 MB/s. 
> 
> 34 seconds for the 1 Gb file, 2:26 for the 4-Gb file. This scales 
> reasonably.

Actually, while it scales, the absolute numbers don't make much sense
for a RAID array.  34 seconds for the 1 GB file corresponds to a read
speed of about 30.1 MB/s.
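
For reference:

	1024 MB / 34 s  ~ 30.1 MB/s  (1 GB file)
	4096 MB / 146 s ~ 28.1 MB/s  (4 GB file, 2:26)

So the reads scale linearly with size, just at a constant ~30 MB/s.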

I think something may be wrong with the setup.  It appears you are
getting single-spindle read speeds.  Depending upon how many disks are
attached in the RAID5 unit, you should be seeing roughly 1.3 - 4x
better than single-spindle read speed.  Your write speed seems to be
about 78 MB/s, indicating a 3- or 4-way stripe (or just lots of cache).
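
For example, if all eight ports on the 7850 are populated (an
assumption; adjust for your actual drive count):

	8 drives * ~30 MB/s = ~240 MB/s raw

which is far more than the PCI bus can move, so a sustained 28-30 MB/s
read means only about one spindle's worth of data is flowing.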

I would call 3ware first, and tell them how you have it setup.  Ask them
for help.

-- 
Joseph Landman, Ph.D
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615