Filesystems

Usage of the filesystems is subject to the Disk Usage Policy.

Tahoma has several filesystems available with different characteristics for different purposes. Taking some time to understand and select the proper filesystem(s) to use for your workflow will make a huge difference in performance!

Generally speaking, use the /tahoma filesystem if you are sharing persistent data across the entire machine. Use /big_scratch or /fast_scratch for the best performance within a single node, but remember that files there will be deleted when your job ends! Copy them to /tahoma, /home, or /archive (see notes below!) if you need to keep them after a job.
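
The essential pattern, as a minimal sketch (the paths and program name are placeholders, not Tahoma conventions):

    cd /big_scratch                        # run in fast node-local scratch
    /tahoma/<project_id>/bin/my_program    # hypothetical executable
    cp output.dat /tahoma/<project_id>/    # copy results back before the job ends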

Notes on usage for each filesystem

/home

  • A shared resource used by all users of Tahoma.

  • A 331 TB file system.

  • Provides storage for source code, submission scripts, small input files, and possibly executables.

/tahoma

  • Each project has a /tahoma/<project_id> directory for storing project-specific files.

  • Global 10 PB BeeGFS file system visible to all the compute nodes and the login nodes.

  • Users should direct program output to /tahoma and not /home. /tahoma is the preferred file system to store input, output and executables.

  • /tahoma has more space, higher bandwidth and lower latency than the /home file system.

  • /tahoma is for temporary storage; it is not backed up. Files older than 60 days in the /tahoma file system are subject to deletion without notification.
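
As an illustrative check (the project path is a placeholder), files approaching the 60-day limit can be listed with find:

    # List files not modified for more than 50 days (at risk of the 60-day purge).
    find /tahoma/<project_id> -type f -mtime +50 -ls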

/archive

  • /archive is a 22 PB disk/tape library shared by several PNNL systems.

  • Tahoma users have a /archive/<login_id> directory for storing input and output files and submission scripts.

  • The way users store files can strongly influence retrieval time (on the order of weeks rather than days).

  • EFFICIENT STORAGE: a small number of large files is the most efficient way to store data on /archive.

  • INEFFICIENT STORAGE: a large number of small files is the least efficient method. It can take weeks to retrieve these files.

  • To store a large number of files, first bundle them into a single compressed archive (e.g., tar czf <name>.tar.gz <dir>), then store that one large file; see the sketch after this list.

  • /archive is intended for output files you want to save and large input files you intend to reuse.
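
A minimal sketch of this workflow (file and directory names are placeholders):

    # Bundle many small files into one compressed archive before storing.
    tar czf results.tar.gz results/
    cp results.tar.gz /archive/<login_id>/

    # Later, retrieve the single large file and unpack it.
    cp /archive/<login_id>/results.tar.gz .
    tar xzf results.tar.gz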

For more information on /archive, aka Aurora, see Data File Storage (Aurora).

/fast_scratch

  • Each Tahoma node has a RAM disk mounted as /fast_scratch, sized at half of the node's RAM (~150 GB or ~700 GB, depending on the node's overall memory).

  • /fast_scratch is used to store ephemeral results, input files, and executables to reduce access time; it is appropriate for high-performance I/O on files that do not need to be shared between nodes.

  • The /fast_scratch file system has the lowest latency and the highest per-node bandwidth of all the file systems.

  • /fast_scratch is volatile memory: files on /fast_scratch will be deleted after the job terminates. To preserve files, copy them from /fast_scratch to /tahoma in the submission script (do not store them on /home). Do not keep files you want to save (e.g., checkpoint files) only on /fast_scratch: if the job terminates before the copy completes, the data will be lost.

  • Please direct output files to the /tahoma file system.

  • /fast_scratch uses system RAM for storage, so it will take away from the memory available for your application! Use with care (see the sketch after this list).

  • Best practice: Use /big_scratch instead of /fast_scratch. If you believe that your use of /big_scratch is slowing down your calculations considerably, AND if you understand how RAM disks work, then try /fast_scratch. Even then, it is advised that you first discuss the performance problems (and the possibility of using /fast_scratch) with MSC Consulting; please email msc-consulting@pnnl.gov.
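
A minimal sketch for gauging how much memory the RAM disk is consuming (standard Linux commands, run on a compute node):

    # Show the size and current usage of the /fast_scratch RAM disk.
    df -h /fast_scratch

    # Show memory remaining for your application; data in /fast_scratch reduces it.
    free -h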

/big_scratch

  • Each Tahoma compute node has a 1.8 or 7 TB local disk mounted as /big_scratch.

  • /big_scratch is solid state storage local to the node; used for high-performance I/O on files that do not need to be shared between nodes.

  • /big_scratch is ephemeral storage: files on /big_scratch will be deleted after the job terminates. To preserve files, copy them from /big_scratch to /tahoma in the submission script, as in the sketch after this list (do not store them on /home). Do not keep files you want to save (e.g., checkpoint files) only on /big_scratch: if the job terminates before the copy completes, the data will be lost.

  • Please direct output files to the /tahoma file system.
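
A minimal submission-script sketch of this pattern, assuming a Slurm-style batch script (the directives, paths, and program name are placeholders; adapt them to your project):

    #!/bin/bash
    #SBATCH --nodes=1                  # placeholder scheduler directive

    # Stage input from the shared file system to fast node-local storage.
    cp /tahoma/<project_id>/input.dat /big_scratch/
    cd /big_scratch

    /tahoma/<project_id>/bin/my_program input.dat   # hypothetical executable

    # Copy results back to /tahoma before the job ends; /big_scratch is wiped.
    cp output.dat /tahoma/<project_id>/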

/emslfs

  • Each Tahoma login and data mover node has the /emslfs filesystem mounted.

  • /emslfs is a shared filesystem that is pervasively mounted on EMSL systems. It is a convenient way to move data between systems (see the sketch after this list). It should never be used for input or output of jobs.

  • /emslfs is mounted as a convenience for users who already have data there. Directories are not created for you automatically. Contact MSC Consulting if you need directories created.
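
A minimal sketch of using /emslfs to stage data into project space (run on a login or data mover node; both paths are placeholders, and your /emslfs directory must already exist):

    # Copy staged data from the shared EMSL file system into project space.
    rsync -av /emslfs/<your_dir>/dataset/ /tahoma/<project_id>/dataset/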