Disk Usage Policy

Important

The processing, storage, or transmittal of sensitive data [e.g. Personally Identifiable Information (PII), Official Use Only (OUO)] is prohibited on the Cascade Information System. Due diligence must be used to prevent inadvertent disclosure of invention, patent, or other sensitive information. It is your responsibility to protect access to the information.

/home

/home is the directory for storing data and executables while a user has an active EMSL project. Each user has an individual /home sub-directory. When a user no longer has any active EMSL projects, the data will be deleted and the disk space reclaimed 90 days after the end of the last project in which the user participated.

Jobs running from the /home file system can degrade system performance, either by accessing too many files simultaneously or by exceeding disk storage capacity.

Important

Do not submit jobs from /home. Jobs must be run from /tahoma, /fast_scratch, or /big_scratch.

/home is a shared resource without enforced quotas. When excessive amounts of disk space are used, those who are using the most space will be asked to reduce their use. Repeated excessive use of disk space can lead to loss of system privileges. This directory is not a permanent storage system. Files to be saved should be stored on your local machine or in /archive.
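Because /home has no enforced quotas, it can help to check your own usage from time to time. The following is a minimal sketch using the standard du, sort, and head utilities; the function name report_usage and the cutoff of 10 entries are illustrative choices, not part of the system.

```shell
# Report the total size of a directory and its largest immediate entries.
# Usage: report_usage [DIR]   (DIR defaults to $HOME)
report_usage() {
    dir="${1:-$HOME}"
    # Total size, human-readable
    du -sh "$dir"
    # Largest non-hidden entries in KB, biggest first (10 is an arbitrary cutoff)
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n 10
}
```

Running report_usage with no argument summarizes your /home sub-directory. Hidden files are counted in the total but not listed individually.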

/tahoma

Each EMSL project has a /tahoma/<project_id> directory for storing project-specific files.

Users should direct program output to /tahoma and not /home. /tahoma is the preferred file system to store input, output, and executables.

While an EMSL project is active, users are responsible for managing the data in their project's /tahoma sub-directory and are requested to keep its total size below 1.0 TB. The project sub-directory will be reclaimed and all of its data deleted 90 days after the end of the project period.
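One way to keep an eye on the requested 1.0 TB ceiling is a small threshold check. This is a sketch under stated assumptions: the function name check_usage is hypothetical, and 1.0 TB is taken here as 1024^3 KB, which may not match how the limit is actually measured.

```shell
# Warn when a directory exceeds a size threshold.
# Usage: check_usage DIR [LIMIT_KB]
check_usage() {
    dir="$1"
    limit_kb="${2:-1073741824}"   # assumption: 1.0 TB treated as 1024^3 KB
    # du -sk prints the total size in KB followed by the path
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    if [ "$used_kb" -gt "$limit_kb" ]; then
        echo "OVER: $dir uses ${used_kb} KB (limit ${limit_kb} KB)"
        return 1
    fi
    echo "OK: $dir uses ${used_kb} KB (limit ${limit_kb} KB)"
}
```

For example, check_usage /tahoma/<project_id> reports whether the project sub-directory is under the default threshold. On a large directory tree du can take a while and adds file system load, so run it sparingly.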

For long-term storage of important data, users should move files from /tahoma to /archive. Files should be compressed with tar and gzip prior to storage on /archive.

Important

Jobs must be run from /tahoma, /fast_scratch, or /big_scratch. Do not submit jobs from /home.

/archive

The Aurora archive is for long-term archiving of data. Each EMSL project has a /archive sub-directory available to the users who are project participants. Storing data in personal sub-directories outside of the project sub-directories is discouraged. Data should remain with the project where it is easily identifiable and accessible by other project participants.

Important

Do not store large numbers of small files on /archive. Instead, combine and compress datasets into fewer, larger files using tar and gzip, then move the resulting .tgz file(s) into /archive and delete the source data. Example using tar and gzip:

tar -czvf filename.tgz /sub-directory/to/archive
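Since the source data is deleted after archiving, it is worth confirming that the tarball is readable and that the copy arrived intact before removing anything. The following sketch shows one way to do that with tar, cp, and cmp; the function name archive_dir and its argument layout are hypothetical, and it assumes the destination is an ordinary mounted directory.

```shell
# Pack a directory into one .tgz, verify it, copy it to a destination,
# and delete the source only after the copy has been checked.
# Usage: archive_dir SRC_DIR DEST_DIR
archive_dir() {
    src="$1"
    dest="$2"
    parent=$(dirname "$src")
    name=$(basename "$src")
    tgz="$parent/$name.tgz"
    # -C keeps the paths stored in the tarball relative to the parent directory
    tar -czf "$tgz" -C "$parent" "$name" || return 1
    # Listing the tarball reads it end to end and fails if it is corrupt
    tar -tzf "$tgz" > /dev/null || return 1
    cp "$tgz" "$dest/" || return 1
    # Compare the copy byte for byte with the original tarball
    cmp -s "$tgz" "$dest/$name.tgz" || return 1
    # Only now remove the source data and the local tarball
    rm -rf "$src" "$tgz"
}
```

A call such as archive_dir /tahoma/<project_id>/run01 /archive/<project_id> would leave a single run01.tgz in the archive; run01 is a placeholder directory name. If any step fails, the function returns early and the source data is left untouched.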

Important

Single files larger than 2 TB cannot be copied to /archive directly. Instead, use HSI to transfer these large files. Instructions for using HSI can be found at:

https://www.emsl.pnnl.gov/MSC/UserGuide/compute_resources/aurora.html#transferring-files-with-hsi

/fast_scratch and /big_scratch

/fast_scratch is a RAM-backed file system of ~300 GB or ~1400 GB per node, and /big_scratch is a local disk of 1.8 TB or 7 TB mounted on each node. Both are intended for ephemeral results, input files, and executables while a job is running. When the job terminates, everything in /big_scratch and /fast_scratch is deleted and cannot be recovered.
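Because scratch contents do not survive the job, results need to be copied back to persistent storage before the job ends. The following is a minimal sketch of that stage-in, run, stage-out pattern; the function name run_staged and its argument order are hypothetical, and it makes no assumption about which batch scheduler launches it.

```shell
# Stage input to scratch, run a command there, and copy results back.
# Usage: run_staged INPUT_DIR SCRATCH_DIR OUTPUT_DIR COMMAND [ARGS...]
run_staged() {
    input="$1"; scratch="$2"; output="$3"
    shift 3
    # Private working directory inside the scratch area
    work=$(mktemp -d "$scratch/job.XXXXXX") || return 1
    cp -R "$input"/. "$work"/ || return 1
    # Run the command with the scratch working directory as its cwd
    status=0
    ( cd "$work" && "$@" ) || status=$?
    # Copy everything back before the scratch space is wiped
    mkdir -p "$output" || return 1
    cp -R "$work"/. "$output"/ || return 1
    rm -rf "$work"
    return "$status"
}
```

Inside a job script this might be called as run_staged /tahoma/<project_id>/input /big_scratch /tahoma/<project_id>/output ./my_program, where my_program is a placeholder executable included in the input directory. Results are copied back even when the command fails, and the command's exit status is passed through.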

Important

/fast_scratch uses system RAM for storage, so it will take away from the memory available for your application! Use with care.
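Since every byte written to /fast_scratch counts against node memory, it can be useful to check how much of it is in use while a job runs. This sketch relies only on the POSIX df utility; the function name scratch_space is hypothetical, and it assumes /fast_scratch appears to df as an ordinary mounted file system.

```shell
# Print used and available space (in KB) for the file system holding DIR.
# Usage: scratch_space DIR
scratch_space() {
    # -P forces the portable one-line-per-file-system format,
    # -k reports sizes in 1024-byte blocks
    df -Pk "$1" | awk 'NR == 2 { print "used_kb=" $3, "avail_kb=" $4 }'
}
```

Calling scratch_space /fast_scratch from within a job prints the space currently consumed, which on a RAM-backed file system is memory no longer available to the application.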

Important

Jobs must be run from /tahoma, /fast_scratch, or /big_scratch. Do not submit jobs from /home.