
XFS and DPX files

To: xfs@xxxxxxxxxxx
Subject: XFS and DPX files
From: "AndrewL733@xxxxxxx" <AndrewL733@xxxxxxx>
Date: Sat, 31 Oct 2009 08:26:28 -0400
User-agent: Thunderbird (X11/20090822)

For many years and with great success, I have been capturing and editing high bandwidth video on Linux systems with XFS filesystems exported via Samba. However, I am currently running into a problem and I am wondering if somebody has some hints about how to solve it.

In the past, I worked with video formats such as MXF and QuickTime, in which a video clip is represented by a single file (or by a handful of files -- one video, several audio). I now find myself having to deal with DPX files. Unlike MXF or QuickTime, the DPX format creates one file for each frame of video or film. For American video, that's about 30 files per second, 1800 files per minute, and so on.

I have a high-performance 10-gigabit-based NAS that allows me to capture and play back "single file" uncompressed HD video streams (up to 160 MB/sec per stream) without any problems. I can also PLAY BACK so-called 2K DPX video, which has the "1 file per frame" structure and a higher data rate than uncompressed HD -- a bit over 300 MB/sec. However, when I go to WRITE DPX files, that's where the trouble begins. I am having trouble even when recording "standard definition" DPX files at a data rate of only about 40 MB/sec, or 1.3 MB/file.
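As a quick sanity check on those figures, here is a minimal sketch of the arithmetic. It assumes exactly 30 frames per second (the post says "about 30"), and the function names are only illustrative.

```python
# Rough arithmetic for a one-file-per-frame workload.
# Assumption: exactly 30 frames/sec; the post only says "about 30".

FPS = 30              # frames (and therefore files) per second
SD_RATE_MB_S = 40.0   # approximate SD DPX data rate quoted in the post


def files_created(seconds, fps=FPS):
    """Number of frame files created after `seconds` of recording."""
    return int(seconds * fps)


def mb_per_file(rate_mb_s, fps=FPS):
    """Average file size implied by a data rate and a frame rate."""
    return rate_mb_s / fps


if __name__ == "__main__":
    print("files per minute:", files_created(60))
    print("files per hour:  ", files_created(3600))
    print("MB per SD frame: ", round(mb_per_file(SD_RATE_MB_S), 2))
```

At these assumed numbers, an hour of recording creates on the order of a hundred thousand files, even though the aggregate data rate is modest.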

This is what I am observing:

1) When I begin recording, I can see that data immediately starts moving across the network at a steady rate of about 41-42 MB/sec, and data also starts getting written to the hardware RAID at the same steady rate (3ware 9650 + 16 x 7200 RPM enterprise-class SATA disks).

2) After about 3 minutes of recording, or after about 6000 files have been written, suddenly my server is no longer writing to the RAID subsystem. The data continues to come in through the network interface, but the writing stops. When I look at vmstat, I can see that "outgoing blocks" pretty much grind to a halt at this point.

3) Then, after about 5-10 seconds of pause, the system begins writing to the RAID again. All the while, the data has been coming into the network interface at a fairly steady 41-42 MB/sec. The writing never seems to "catch up" and about 10 seconds after the writing begins again, the client application stops sending data because it senses that it has "dropped frames".

4) Samba logging at level 5 shows some curious errors now and then about "xfs_quota" failing -- but they don't seem to be concentrated just at the point where the writing stops.

  [2009/10/30 15:00:59, 3] lib/sysquotas.c:sys_get_quota(433)
  sys_get_xfs_quota() failed for mntpath[/mnt/vol1] bdev[/dev/sdb1]
  qtype[2] id[502]: No such file or directory.

By the way, I tried mounting my XFS filesystems without quota support. I no longer see these messages, but I still have the same problem: the system stops writing to the disks after about 3 minutes.

5) If I export an iSCSI target from the exact same NAS (via iSCSI Enterprise Target, for example), mount it on my Windows machine and format it as NTFS, I don't have any trouble capturing for an hour or more. So, there is clearly nothing wrong with the network or the cabling or the RAID subsystem itself.

6) Similarly, if I format my storage with EXT3 instead of XFS and export the volume via Samba, I don't have any trouble recording for the same very long periods of time. I DO observe a very different pattern of writing to the storage, however. While 41-42 MB/sec comes in steadily over the network interface, with ext3-formatted disks the NAS writes to the storage at about 200-250 MB/sec every now and then. Then there is no writing activity for about 4-5 seconds. Then another burst of 200-250 MB/sec. And the pattern continues.

7) My NAS system is running a plain-vanilla kernel.org kernel. It is a 64-bit system with 3.2 GHz Quad Core Intel 5482 CPUs and 4 GB of RAM. However, I see EXACTLY the same behavior on an even more powerful Nehalem-based system with a 2.93 GHz Quad Core CPU, 6 GB of RAM and the very latest kernel. So, I don't think it has anything to do with the XFS version or the hardware, for instance. And as I said above, I don't have trouble handling much higher data rates when I am only creating a few files per hour, versus creating 30 files per second.
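The numbers in points 3 and 6 can be put side by side with a little arithmetic. The sketch below is only a back-of-the-envelope check: it assumes the incoming rate is the 41.5 MB/sec midpoint and that the disks can sustain the 200-250 MB/sec seen in the ext3 bursts. The function names are illustrative.

```python
# Back-of-the-envelope backlog arithmetic for a write pause.
# Assumptions: 41.5 MB/sec incoming (midpoint of 41-42), and a disk
# rate taken from the 200-250 MB/sec bursts observed with ext3.

INCOMING_MB_S = 41.5


def backlog_mb(pause_seconds, incoming=INCOMING_MB_S):
    """Data that arrives over the network while nothing is written."""
    return pause_seconds * incoming


def catch_up_seconds(backlog, disk_mb_s, incoming=INCOMING_MB_S):
    """Time to drain `backlog` while new data keeps arriving."""
    headroom = disk_mb_s - incoming
    if headroom <= 0:
        raise ValueError("disk rate must exceed the incoming rate")
    return backlog / headroom


if __name__ == "__main__":
    for pause in (5, 10):
        pending = backlog_mb(pause)
        print(pause, "s pause ->", pending, "MB pending,",
              round(catch_up_seconds(pending, 250.0), 2), "s to drain at 250 MB/sec")
```

Under these assumptions, a 5-10 second pause leaves roughly 200-400 MB pending. That amount is in line with what ext3 appears to flush in each burst, so on raw throughput alone the RAID should be able to catch up within a couple of seconds.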

My hunch is that the problem is related to the number of files I am creating per second. Could it be that XFS is not handling this situation well, whereas this doesn't pose a problem for EXT3 or iSCSI/NTFS? I am wondering if there are any specific XFS formatting or mounting options that would make a huge difference (size of log, sectorsize, agsize, inode size, allocation group count, log buffers at mounting, etc).
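One way to test this hunch without the capture client is to imitate the file-creation pattern directly on the server. The following is a minimal sketch, not the application's actual write pattern. It writes fixed-size placeholder files (not valid DPX) at a paced rate and records any write that takes unusually long. The function name, file naming scheme and stall threshold are all assumptions made for illustration. It also bypasses Samba and uses ordinary buffered writes, so a slow write only shows up here once the kernel starts throttling the writer.

```python
import os
import tempfile
import time


def write_frames(directory, count, size_bytes, fps, stall_threshold=0.5):
    """Write `count` files of `size_bytes` each, paced at `fps` files/sec.

    Returns a list of (file_index, seconds) for every create+write+close
    that took longer than `stall_threshold` seconds.
    """
    payload = b"\0" * size_bytes   # placeholder data, not a real DPX frame
    interval = 1.0 / fps
    stalls = []
    next_due = time.monotonic()
    for i in range(count):
        start = time.monotonic()
        path = os.path.join(directory, "frame_%07d.dpx" % i)
        with open(path, "wb") as f:
            f.write(payload)
        elapsed = time.monotonic() - start
        if elapsed > stall_threshold:
            stalls.append((i, elapsed))
        # Pace the loop so files are created at roughly `fps` per second.
        next_due += interval
        delay = next_due - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return stalls


if __name__ == "__main__":
    # Tiny demo in a temporary directory. For a real test, point
    # `directory` at the XFS mount and use something like
    # count=10000, size_bytes=1300000, fps=30.
    with tempfile.TemporaryDirectory() as d:
        slow = write_frames(d, count=30, size_bytes=64 * 1024, fps=30)
        print(len(os.listdir(d)), "files written,", len(slow), "slow writes")
```

Running the same script against an XFS mount and an ext3 mount on the same RAID, while watching vmstat, would help show whether the pause follows the filesystem or the Samba layer.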

Any ideas here? Is this a known issue? And is there a workaround? Any help would be greatly appreciated.

