On Tue, May 07, 2002 at 11:04:02AM -0500, Stephen Lord wrote:
> First of all, I presume when you fail over to the second node, you
> unmount the xfs filesystem on one box and mount it on the second one
> before the IP address takeover happens? If not then there is a good
> chance of corruption.
Yes, that's the way it's done. Most cluster implementations also
implement some kind of fencing to make sure the original server doesn't
continue accessing the disk after the second server takes over.
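To make the ordering concrete, here is a rough Python sketch of the
sequence. It only builds and prints the commands, it does not run them.
The device, mount point, address and interface names are made up for
illustration, and dropping the service address on the old server first
is my own assumption about a sensible order. Fencing is left out.

```python
def failover_plan(device, mountpoint, address, interface):
    """Return (node, argv) steps in the order they should run.

    All arguments are hypothetical example values supplied by the caller.
    """
    return [
        # Old server: stop answering clients, then let go of the disk.
        ("old", ["ip", "addr", "del", address, "dev", interface]),
        ("old", ["umount", mountpoint]),
        # New server: mount and export first, take over the address last.
        ("new", ["mount", "-t", "xfs", device, mountpoint]),
        ("new", ["exportfs", "-a"]),
        ("new", ["ip", "addr", "add", address, "dev", interface]),
    ]


if __name__ == "__main__":
    plan = failover_plan("/dev/sdb1", "/export", "192.0.2.10/24", "eth0")
    for node, argv in plan:
        print(node + ":", " ".join(argv))
```

The point is only that the filesystem is unmounted on the old server
before it is mounted on the new one, and that clients cannot reach the
new server until the mount is in place.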
> That aside, I think that without code changes, the only way to make this
> work is to ensure that the /dev entry used to mount the filesystem on
> both boxes has the same major/minor number. If not then I think the
> stale handles will result. Note this is just a guess on my part right
> now, I have not looked at this code in a while.
Recently a patch was accepted to allow the user to specify the device-id
to use for nfs exports - that should take care of the problem with
different major/minor numbers on the two servers. But of course, having
two identical servers is the easiest way.
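If you want to check whether the two servers agree, something like the
following Python sketch prints the major/minor pair of a device node.
Run it on both servers and compare the output. The function names are
mine, and the device path is whatever you actually mount from.

```python
import os
import stat
import sys


def split_rdev(rdev):
    """Split a raw device number into a (major, minor) tuple."""
    return os.major(rdev), os.minor(rdev)


def device_numbers(path):
    """Return (major, minor) for the block device node at path."""
    st = os.stat(path)
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError(path + " is not a block device")
    return split_rdev(st.st_rdev)


if __name__ == "__main__":
    # Usage: python devnum.py <device node>
    major, minor = device_numbers(sys.argv[1])
    print("%d:%d" % (major, minor))
```

If the two servers print different pairs for the exported device, that
is the case where the stale handles described above would be expected.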
You also need to put /var/lib/nfs on shared storage, and to use the same
"servicename" on both servers (-n flag in newest version of nfs-utils)
for locks to work properly.
There are other caveats as well, but basically it's possible to get it to
work with a little effort.
And for out-of-the-box solutions there are of course NAS devices that you
can buy that implement this :-)