Hi,

My name is Shane Brubaker and I work at the Joint Genome Institute. We are facing a scalability problem when running large numbers of short jobs through SGE and a workflow system that we wrote.

We are running large numbers (10,000 to 100,000) of jobs that are very short (1 second each). Admittedly, one second is too short for a job and will produce a lot of overhead no matter what, but there are times when it is difficult to change our code to produce longer jobs, and we'd like to provide some facility to do this with at least minimal overhead.

Also, when our file systems have more than a few thousand files in one directory, things slow down tremendously, and it becomes impossible to even ls the directory. It can also crash our file servers. We are using NFS.

I have come up with a strategy of using an array job and having the workflow system, which is written in Perl, concatenate the smaller task files onto the end of a set of master logs and then remove the smaller files, using system calls, as it goes. This actually worked quite well for 10,000 jobs, keeping the directory from growing and greatly improving performance. However, when I went to 100,000 jobs, the number of files grew faster than they could be concatenated, and the system is now slowly going through that huge directory and trying to append the smaller files, even though the array job has long since finished.

I am wondering if anyone has experience with this and can recommend a solution. I am also curious whether the SGE folks have any plans to add a master log capability for array jobs. Finally, if you have any general advice on fast ways to append files and ways to deal with large directories, I would really appreciate it.

Thanks,
Shane Brubaker
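
To make the concatenate-and-remove step described above concrete, here is a minimal sketch of the idea in Python. It is not the original Perl code: the function name `append_and_remove`, the `.out` suffix, and the directory layout are all hypothetical, and the sketch does the append in-process rather than through shell commands, which is a simplification for illustration only.

```python
import os
import shutil


def append_and_remove(task_dir, master_log, suffix=".out"):
    """Append every task file in task_dir to master_log, then delete it.

    Hypothetical helper: the suffix and layout are assumptions, not the
    conventions of the workflow system described in the post.
    Returns the number of task files that were merged.
    """
    merged = 0
    with open(master_log, "ab") as master:
        # Take a snapshot of the matching names first, so that files are
        # not deleted while the directory is still being iterated.
        with os.scandir(task_dir) as entries:
            names = sorted(
                e.name for e in entries
                if e.is_file() and e.name.endswith(suffix)
            )
        for name in names:
            path = os.path.join(task_dir, name)
            # Copy the small file onto the end of the master log.
            with open(path, "rb") as task:
                shutil.copyfileobj(task, master)
            # Remove the small file only after it has been appended.
            os.remove(path)
            merged += 1
    return merged
```

Calling `append_and_remove("/path/to/task_dir", "/path/to/master.log")` repeatedly while the array job runs would keep draining the directory; whether this keeps up with 100,000 one-second tasks on NFS is exactly the open question in the post, and the sketch makes no claim about that.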