Message-Id: <200107231407.f6NE7iW10457@jen.americas.sgi.com>
To: Jani Jaakkola
cc: "linux-xfs@oss.sgi.com"
Subject: Re: nfs/local performance with software raid5 and xfs and SMP
In-Reply-To: Message from Jani Jaakkola of "Mon, 23 Jul 2001 14:31:49 +0300."
Date: Mon, 23 Jul 2001 09:07:44 -0500
From: Steve Lord
Sender: owner-linux-xfs@oss.sgi.com

> On Thu, 19 Jul 2001, Steve Lord wrote:
>
> > > A little update,
> > >
> > > At least, some better news:
> > > I did *NOT* see the problem (yet!) with the noapic option
> > > with 2.4.7pre8-SMP
> >
> >
> > I am informed that there is a fix in the latest kernels in the raid5 code
> > to fix some stall issues which ext3 hit. So it may be that the answer here
> > is to run the latest kernels. I still do not like the fact we do non-stripe
> > friendly I/O to the log, and want to look into that.
> >
> > Jani, this may also be the source of your problems, we did experience
> > an almost complete lockup here with the 1.0.1 rpm kernel. If you get a
> > chance can you run the latest cvs kernel?
>
> OK, now I have the newest CVS kernel. Now I am getting the bad clientid
> mount problem on my sw RAID5 partition (this could be caused by the old
> kernel saving a garbled log)
>
> Mount says:
>
> mount: wrong fs type, bad option, bad superblock on /dev/md0,
>        or too many mounted file systems
>
> And kernel says:
>
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed
> XFS: log mount failed

There is definitely a problem with the log on raid5 right now. We did not
change anything on our end, so it looks like the underlying code may have
broken things. We are looking into this.

As for performance, the stall appears to be fixed; now we are just dealing
with the issue of non-aligned and variable sized writes. We have some
ideas about stripe aligning log writes, but this may take a while. An
external log should go faster. On the delete end, though, there is a
synchronous write to the log, so the number of ops per second your journal
device can do is important for delete performance.

Steve

>
> In case anyone is interested, I have saved output of xfs_logprint
> and it is available from http://www.cs.helsinki.fi/u/jjaakkol/xfslog.txt
>
> I am starting xfs_repair on the RAID partition now.. It went through
> nicely and quickly and the files are still there. Hmm, the unlink()
> performance is definitely better, but still not very good (this is just
> how it feels, I have not yet done any real benchmarks).
>
> Hmm, I guess I'll have to get my hands on some other benchmark software
> than bonnie and try it with ext2, xfs+internal log and xfs+external log.
>
> - Jani
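
Steve's remarks about non-aligned, variable sized log writes on raid5 and about the synchronous log write on delete can be made concrete with a little arithmetic. The sketch below is only an illustration, not XFS or md code: the function names, the 64 KiB chunk size, and the 4-disk array (3 data chunks plus parity per stripe) are all assumptions made for the example. It counts how many stripes a write covers completely, where parity can be computed from the new data alone, and how many it covers only partly, where old data or parity generally has to be read back first. It also gives a rough ceiling on deletes per second if each delete waits for a fixed number of synchronous log writes.

```python
def classify_write(offset, length, chunk_size, data_disks):
    """Return (full_stripes, partial_stripes) touched by one write.

    One RAID5 stripe holds chunk_size * data_disks bytes of data.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    stripe = chunk_size * data_disks
    end = offset + length
    first = offset // stripe        # index of the first stripe touched
    last = (end - 1) // stripe      # index of the last stripe touched
    full = partial = 0
    for s in range(first, last + 1):
        s_start, s_end = s * stripe, (s + 1) * stripe
        if offset <= s_start and end >= s_end:
            full += 1               # whole stripe rewritten
        else:
            partial += 1            # needs a read-modify-write cycle
    return full, partial


def max_deletes_per_second(log_sync_writes_per_second, sync_writes_per_delete=1):
    """Rough upper bound on the delete rate.

    Assumes (for illustration) that every delete waits for
    sync_writes_per_delete synchronous writes to the log device.
    """
    return log_sync_writes_per_second / sync_writes_per_delete


CHUNK = 64 * 1024    # hypothetical chunk size
DATA_DISKS = 3       # hypothetical 4-disk RAID5: 3 data chunks + 1 parity

print(classify_write(0, 192 * 1024, CHUNK, DATA_DISKS))         # (1, 0)
print(classify_write(32 * 1024, 32 * 1024, CHUNK, DATA_DISKS))  # (0, 1)
print(max_deletes_per_second(100))                              # 100.0
```

With these assumed numbers, a 32 KiB log write in the middle of a stripe is a partial-stripe update, which is the kind of I/O the stripe-aligned log idea is meant to avoid. The second function shows why a log device that manages about 100 synchronous writes per second would cap deletes at roughly that rate, whether the log is internal or external.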