Date: Tue, 27 May 2003 14:06:50 +0200
From: Andi Kleen
To: Michael Sinz
Cc: Andi Kleen, linux-xfs@oss.sgi.com
Subject: Re: Tomorrow
Message-ID: <20030527120650.GA22306@wotan.suse.de>
In-Reply-To: <3ED344C0.1010700@wgate.com>

On Tue, May 27, 2003 at 06:58:08AM -0400, Michael Sinz wrote:
> When we did this for the Amiga (oh so many years ago) it was a royal
> PITA. We ended up punting for the most part on anything that was
> outside of the ISO-Latin-1 code page and even there we had a problem
> due to some "differences" of opinion by certain language groups what
> was supposed to happen.

I wrote a C Library for the Amiga a long time ago and in the end I left it
all for locale.library because it was too nasty to do by itself.

> This gets worse when you look at behavior patterns due to the fact that
> a file, especially one accessed over the network, may be accessed by
> a machine with different locale settings and thus have slightly different
> rules as to what is the lowercase form of an uppercase letter or wordform.

AFAIK the SMB protocol handles this.

> >You either only support UTF-8 Unicode (shifting the burden of conversion
> >to user space) or you need to store a "codepage" per filesystem. Linux
> >seems
> >to go towards the UTF-8 route. The kernel already has some code for this
> >(JFS does it), but it will be not pretty.
>
> I have not looked at the JFS code at all but this can not be very pretty
> if they supported the locale preferences. (Unless, in the last 10 years

JFS has a code page as a mount option, or you can use UTF-8. The locale
code to support this is a generic kernel subsystem, also used by VFAT.

> there was some new agreement such that case conversion for all locales
> are consistant with eachother)

Yes there is: Unicode/UTF-8. That is where all the Linux distributions
are going too. For legacy SMB support you will still need to support
codepages, but that could be done by Samba.

For XFS I guess it would be enough to just support UTF-8. Supporting
different code pages is probably not too useful anymore.

-Andi
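
To make the code page point concrete, here is a rough userspace sketch in
Python. It is only an illustration, not how JFS, VFAT, or the kernel NLS code
does it: the helper names fold_by_codepage and same_name_utf8 are made up, the
sample file name is invented, and Unicode normalization is ignored. It shows
that the "lowercase form" of the same stored bytes depends on which code page
you assume, while a UTF-8-only rule needs no per-filesystem setting.

```python
# Hypothetical helpers for illustration only; not taken from any filesystem code.

def fold_by_codepage(raw: bytes, codepage: str) -> bytes:
    """Lowercase a stored name under the assumption that it uses `codepage`."""
    return raw.decode(codepage).lower().encode(codepage)


def same_name_utf8(a: bytes, b: bytes) -> bool:
    """Case-insensitive comparison assuming both names are valid UTF-8.

    Uses Unicode case folding, which does not depend on the locale of the
    machine doing the comparison. Normalization (NFC/NFD) is left out here.
    """
    return a.decode("utf-8").casefold() == b.decode("utf-8").casefold()


if __name__ == "__main__":
    raw = b"\xc4RGER.TXT"  # invented name with one non-ASCII byte

    # Byte 0xC4 is a capital A-umlaut in Latin-1 but a box-drawing
    # character in cp437, so the lowercase results differ.
    print(fold_by_codepage(raw, "latin-1"))
    print(fold_by_codepage(raw, "cp437"))

    # With UTF-8 as the only encoding there is nothing to configure.
    upper = "\u00c4RGER.TXT".encode("utf-8")
    lower = "\u00e4rger.txt".encode("utf-8")
    print(same_name_utf8(upper, lower))
```

With a code page the filesystem has to be told (for example via a mount
option) how to interpret byte 0xC4 before it can fold case at all; with UTF-8
only, that decision goes away and conversion becomes a user space problem.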