Disk Usage Policy
Important
The processing, storage, or transmittal of sensitive data [e.g. Personally Identifiable Information (PII), Official Use Only (OUO)] is prohibited on the Cascade Information System. Due diligence must be used to prevent inadvertent disclosure of invention, patent, or other sensitive information. It is your responsibility to protect access to the information.
/home
Directory for storage of data and executables while users have an active EMSL project. /home sub-directories are available for each individual user. When a user no longer has active EMSL projects, the data will be deleted and the disk space reclaimed 90 days after the end of the last project on which the user is a participant.
Jobs running from the /home file system can degrade system performance, either by accessing too many files simultaneously or by exceeding disk storage capacity.
Important
Do not submit jobs from /home. Jobs must be run from /tahoma, /fast_scratch, or /big_scratch.
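As an illustration of the rule above, a submission wrapper could refuse to proceed when the working directory is not on one of the permitted file systems. This is a minimal sketch only: the function name `is_allowed_job_dir` is hypothetical, it matches on the path string alone, and it does not call any scheduler.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a site-provided tool): succeed only if the given
# absolute path lies under a file system from which jobs may be run.
is_allowed_job_dir() {
    case "$1" in
        /tahoma|/tahoma/*|/fast_scratch|/fast_scratch/*|/big_scratch|/big_scratch/*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example use before submitting a job (pwd -P resolves symbolic links):
#   is_allowed_job_dir "$(pwd -P)" || echo "Do not submit jobs from here" >&2
```

Passing the output of `pwd -P` rather than `$PWD` avoids being misled by a symbolic link that points back into /home.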
/home is a shared resource without enforced quotas. When excessive amounts of disk space are used, those who are using the most space will be asked to reduce their use. Repeated excessive use of disk space can lead to loss of system privileges. This directory is not a permanent storage system. Files to be saved should be stored on your local machine or in /archive.
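Because /home has no enforced quotas, it can help to check where your own space is going before being asked to reduce it. The following is a minimal sketch using standard `du`, `sort`, and `head`; the function name `largest_entries` and its defaults are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the largest sub-directories directly under a
# directory, sizes in KiB, largest first. The directory's own total is
# included in the listing (normally as the first line).
largest_entries() {
    local dir=${1:-$HOME} count=${2:-10}
    du -k -d 1 -- "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example: show the ten largest sub-directories of your home directory.
#   largest_entries "$HOME" 10
```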
/tahoma
Each EMSL project has a /tahoma/<project_id> directory for storing project-specific files.
Users should direct program output to /tahoma and not /home. /tahoma is the preferred file system to store input, output, and executables.
While an EMSL project is active, users are responsible for managing data in their project /tahoma sub-directory and are requested to keep its data storage below 1.0 TB. The project sub-directories will be reclaimed and all data deleted 90 days after the end of the project period.
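A project's usage can be compared against the requested 1.0 TB ceiling with `du`. The sketch below is illustrative only: the names `TAHOMA_LIMIT_KB` and `over_limit` are hypothetical, and it assumes 1.0 TB means 1 TiB (1024^3 KiB), which the policy does not specify.

```shell
#!/usr/bin/env bash
# Assumption: 1.0 TB is treated as 1 TiB, expressed here in KiB.
TAHOMA_LIMIT_KB=$((1024 * 1024 * 1024))

# Hypothetical helper: succeed (exit 0) if the directory uses more than the
# limit, return 1 if it is within the limit, and 2 if usage cannot be read.
over_limit() {
    local dir=$1 limit=${2:-$TAHOMA_LIMIT_KB} used
    used=$(du -sk -- "$dir" 2>/dev/null | cut -f1)
    [ -n "$used" ] || return 2
    [ "$used" -gt "$limit" ]
}

# Example (replace <project_id> with your own project directory name):
#   over_limit "/tahoma/<project_id>" && echo "Project storage exceeds 1.0 TB"
```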
For long-term storage of important data, users should move files from /tahoma to /archive. Files should be compressed with tar and gzip prior to storage on /archive.
Important
Jobs must be run from /tahoma or /fast_scratch or /big_scratch. Do not submit jobs from /home.
/archive
The Aurora archive is for long-term archiving of data. Each EMSL project has a /archive
sub-directory available to the users who are project participants. Storing data in personal
sub-directories outside of the project sub-directories is discouraged. Data should remain with
the project where it is easily identifiable and accessible by other project participants.
Important
Do not store large numbers of small files on /archive. Instead, compress datasets into fewer, larger files using tar and gzip, then move the .tgz file(s) into /archive and delete the source data. TAR/GZIP example:
tar -czvf filename.tgz /sub-directory/to/archive
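The compress, move, and delete steps described above could be combined as in the sketch below. It is an illustration under assumptions, not a site-provided script: the function name `archive_dataset` is hypothetical, and the listing check with `tar -tzf` is an extra safeguard added here so the source is only deleted after the archive has been read back.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pack a directory into <name>.tgz next to it, confirm
# the archive can be listed, move it to the destination directory, and only
# then delete the source directory.
archive_dataset() {
    local src=$1 dest_dir=$2 name parent tarball
    name=$(basename -- "$src")
    parent=$(dirname -- "$src")
    tarball="$parent/$name.tgz"

    # -C stores paths relative to the parent directory, not absolute paths.
    tar -czf "$tarball" -C "$parent" "./$name" || return 1
    # Verify that the archive is readable before touching the source data.
    tar -tzf "$tarball" > /dev/null || return 1
    mv -- "$tarball" "$dest_dir"/ || return 1
    rm -rf -- "$src"
}

# Example (both paths are placeholders):
#   archive_dataset /tahoma/<project_id>/run42 /archive/<project_id>
```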
Important
Single files larger than 2 TB cannot be copied to /archive directly. Instead, use HSI to transfer these large files. Instructions for using HSI can be found at:
https://www.emsl.pnnl.gov/MSC/UserGuide/compute_resources/aurora.html#transferring-files-with-hsi
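To see in advance which files would need HSI, the files above the size limit can be listed with `find`. This is a sketch under assumptions: the names `HSI_LIMIT_BYTES` and `files_needing_hsi` are hypothetical, 2 TB is taken to mean 2 TiB, and the sketch only lists files; it does not invoke HSI itself.

```shell
#!/usr/bin/env bash
# Assumption: the 2 TB limit is treated as 2 TiB, expressed in bytes.
HSI_LIMIT_BYTES=$((2 * 1024 * 1024 * 1024 * 1024))

# Hypothetical helper: list regular files under a directory whose size is
# strictly greater than the limit (the "c" suffix makes find count bytes).
files_needing_hsi() {
    local dir=$1 limit=${2:-$HSI_LIMIT_BYTES}
    find "$dir" -type f -size +"${limit}"c
}

# Example (path is a placeholder):
#   files_needing_hsi /tahoma/<project_id>
```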
/fast_scratch and /big_scratch
/fast_scratch is backed by ~300 GB or ~1400 GB of RAM per node, and /big_scratch by 1.8 TB or 7 TB of local disk mounted per node. Both are intended for storage of ephemeral results, input files, and executables while a job is running. After the job terminates, the entire contents of both /big_scratch and /fast_scratch are deleted, and that data is unrecoverable.
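Because scratch contents do not survive the job, a job script typically needs to stage inputs in and copy results out before it exits. The following is a minimal sketch of that pattern, not a site-provided tool: the function name `run_in_scratch` and its argument order are assumptions, and the command to run is whatever your job would normally execute.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy inputs into a private scratch work directory,
# run a command there, copy everything back to a results directory, and
# clean up. Returns the exit status of the command that was run.
run_in_scratch() {
    local scratch=$1 input=$2 results=$3 work status
    shift 3
    work=$(mktemp -d "$scratch/job.XXXXXX") || return 1
    cp -R "$input"/. "$work"/ || return 1

    # Run the remaining arguments as the job command inside the work directory.
    ( cd "$work" && "$@" )
    status=$?

    # Save results before the scratch space is wiped at job end.
    mkdir -p "$results" || return 1
    cp -R "$work"/. "$results"/ || return 1
    rm -rf -- "$work"
    return "$status"
}

# Example (paths and program name are placeholders):
#   run_in_scratch /big_scratch /tahoma/<project_id>/in /tahoma/<project_id>/out ./my_program
```

Copying results back even when the command fails keeps partial output available for debugging, at the cost of some extra I/O.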
Important
/fast_scratch uses system RAM for storage, so it will take away from the memory available for your application! Use with care.
Important
Jobs must be run from /tahoma, /fast_scratch, or /big_scratch. Do not submit jobs from /home.