Re: realtime partition support?

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: realtime partition support?
From: Phil Karn <karn@xxxxxxxxxxxx>
Date: Fri, 07 Jan 2011 19:59:27 -0800
Cc: karn@xxxxxxxx, xfs@xxxxxxxxxxx
In-reply-to: <20110108021728.GA28803@dastard>
References: <4D2724E0.9020801@xxxxxxxxxxxx> <20110108021728.GA28803@dastard>
Reply-to: karn@xxxxxxxx
Sender: Phil Karn <karn@xxxxxxxx>
User-agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv: Gecko/20101207 Thunderbird/3.1.7
On 1/7/11 6:17 PM, Dave Chinner wrote:

> Experimental, implemented in linux and mostly just works, but is
> largely untested and not really recommended for any sort of
> production use.


>> Throughput on large files would be almost as fast as if everything were
>> on the SSD.
> Not at all. The data is still written to the rotating disk, so
> the presence of the SSD won't change throughput rates at all.

My point is that the big win of the SSD comes from its lack of
rotational and seek latency. SSDs really shine on random small reads. On
large sequential reads and writes, modern rotating disks can shovel data
almost as quickly as an SSD once the head is in the right place and the
data has started flowing. But a disk can't get there until it has walked
the directory tree, read the inode for the file in question and finally
seeked to the file's first extent. If all that metadata resided on SSD,
the conventional drive could get to that first extent that much more
quickly.

Yeah, the SSD is typically still faster than a rotating drive on
sequential reads -- but only by a factor of 2:1, not dozens or hundreds
of times. On sequential writes, many rotating drives are actually faster
than many SSDs.
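
As a rough illustration of that arithmetic, here is a minimal sketch that
adds up lookup latency and streaming time for three layouts. Every figure
in it (10 ms per disk access, 0.1 ms per SSD access, 100 and 200 MB/s
streaming rates, five metadata reads per open) is an assumed round number,
not a measurement, and the function names are made up for the example.

```python
def read_time_s(file_mb, meta_ios, meta_access_ms, data_access_ms, data_mb_s):
    """Rough time to open and stream one file, in seconds.

    meta_ios       -- metadata reads (directory blocks, inode) before the data
    meta_access_ms -- latency of each metadata read on the metadata device
    data_access_ms -- latency to reach the first extent on the data device
    data_mb_s      -- sequential throughput of the data device
    """
    latency_s = (meta_ios * meta_access_ms + data_access_ms) / 1000.0
    transfer_s = file_mb / data_mb_s
    return latency_s + transfer_s


# Assumed round numbers, for illustration only.
DISK_MS, DISK_MB_S = 10.0, 100.0   # rotating disk: access latency, streaming rate
SSD_MS, SSD_MB_S = 0.1, 200.0      # SSD: access latency, streaming rate (2:1)
META_IOS = 5                       # hypothetical metadata reads per file open


def compare(file_mb):
    """Return the estimated read time for each layout, keyed by name."""
    return {
        "disk_only": read_time_s(file_mb, META_IOS, DISK_MS, DISK_MS, DISK_MB_S),
        "meta_on_ssd": read_time_s(file_mb, META_IOS, SSD_MS, DISK_MS, DISK_MB_S),
        "ssd_only": read_time_s(file_mb, META_IOS, SSD_MS, SSD_MS, SSD_MB_S),
    }


for size_mb in (0.1, 1000.0):
    print(size_mb, compare(size_mb))
```

Under these assumptions the metadata-on-SSD layout shortens the small-file
case several-fold, while the large-file case stays bound by the streaming
rate of whichever device holds the data.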

> I'd also expect it to be much, much slower than just using the
> rotating disk for the standard data device - the SSD will make no
> difference as metadata IO is not the limiting factor.

No? I'm having a very hard time getting XFS on rotating SATA drives to
come close to Reiser or ext4 when extracting a large tarball (e.g., the
Linux source tree) or when doing rm -rf. I've improved it by playing
with logbsize and logbufs but Reiser is still much faster, especially at
rm -rf. The only way I've managed to get XFS close is by essentially
disabling journaling altogether, which I don't want to do. I've tried
building XFS with an external journal and giving it a loopback device
connected to a file in /tmp. Then it's plenty fast. But unsafe.
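
For reference, a setup of this kind might be sketched as below. The script
only builds and prints the command lines; it does not run them. The device
paths, mount point, log size and buffer values are placeholders chosen for
illustration, not the ones actually used here. The option names (logdev,
size, logbufs, logbsize) are the ones documented for mkfs.xfs and the XFS
mount options; the helper functions are hypothetical.

```python
def mkfs_command(data_dev, log_dev=None, log_size="128m"):
    """Build an mkfs.xfs command line, optionally with an external log."""
    args = ["mkfs.xfs"]
    if log_dev:
        # -l logdev=...,size=... places the journal on a separate device.
        args += ["-l", "logdev={},size={}".format(log_dev, log_size)]
    args.append(data_dev)
    return args


def mount_command(data_dev, mountpoint, log_dev=None, logbufs=8, logbsize="256k"):
    """Build a mount command line with the log tuning options."""
    opts = ["logbufs={}".format(logbufs), "logbsize={}".format(logbsize)]
    if log_dev:
        # An external log must be named again at mount time.
        opts.append("logdev={}".format(log_dev))
    return ["mount", "-t", "xfs", "-o", ",".join(opts), data_dev, mountpoint]


if __name__ == "__main__":
    # Placeholder devices: /dev/sdb1 for data, /dev/loop0 for the log.
    print(" ".join(mkfs_command("/dev/sdb1", log_dev="/dev/loop0")))
    print(" ".join(mount_command("/dev/sdb1", "/mnt/test", log_dev="/dev/loop0")))
```

Pointing log_dev at an SSD partition instead of a loop device would be the
safer variant of the same layout.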

As I understand it, the problem is all that seeking to the internal
journal. I'd like to try putting the journal on an SSD partition but I
can't figure out how to do that with an existing XFS file system without
rebuilding it.

Turning off the write barrier also speeds things up considerably, but
that also makes me nervous. My system doesn't have a RAID controller
with a nonvolatile cache but it is plugged into a UPS (actually a large
solar power system with a battery bank) so unexpected loss of power is
unlikely. Can I safely turn off the barrier?

If I correctly understand how drive write caching works, then even a
kernel panic shouldn't keep data that's already been sent to the drive
from being written out to the media. Only a power failure could do that,
or possibly the host resetting the drive. The BIOS will eventually reset
all the hardware, but that won't happen until some time after the panic.
