
The system was built by IBM and is, in most respects, a quarter of a thin island of SuperMUC. See Alexey's presentation from the science week 2014 for an overview of the system.

Nodes

There are 126 compute nodes with names n003 to n128 and two login/data nodes, n001 and n002 (a quick way to check these figures on a node is sketched after the list):

  • 2x Intel Xeon CPU E5-2680 v1 with 8 cores @ 2.7 GHz + hyper-threading
  • Sandy Bridge architecture
  • 64 kB L1 per core, 256 kB L2 cache per core, 20 MB L3 cache shared for all 8 cores
  • 64 GB RAM, login: 128 GB
  • 250 GB hard disk (no RAID!), login: 300 GB RAID1
  • Ethernet 1 Gbit/s, login: 10 Gbit/s
  • connection to other nodes: Mellanox InfiniBand 40 Gbit/s
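
All of the above can be verified quickly from a shell on a node; the following is a minimal sketch using standard Linux tools (the exact output format depends on the installed tool versions, and ibstat is only present if the InfiniBand diagnostics package is installed):

    # Logical CPUs: 2 sockets x 8 cores x 2 hardware threads = 32
    nproc
    lscpu | grep -E 'Socket|Core|Thread'

    # Installed RAM (64 GB on compute nodes, 128 GB on the login nodes)
    free -g

    # Size of the local disk
    df -h /

    # InfiniBand adapter and link speed
    ibstat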

Fat node

For tasks with high memory demands, we have the fat node:

  • 2x Intel Xeon E5-2680 v2 with 10 cores @ 2.8 GHz + hyper-threading
  • Ivy Bridge architecture
  • 64 kB L1 per core, 256 kB L2 cache per core, 25 MB L3 cache shared for all 10 cores
  • 768 GB RAM
  • 18 TB hard disk (Hardware RAID 5)
  • two Ethernet adapters (1 Gbit/s)
  • one Mellanox Ethernet adapter (10 Gbit/s, dual port)
  • one Mellanox InfiniBand adapter (56 Gbit/s, single port)

Storage

Home

The home directories have a maximum size of 100 GB. They are located on the LRZ NAS server and mounted via NFS; the bandwidth to the home directories is only on the order of 10 GB/s, so the connection is much slower than GPFS. There is an automatic weekly backup of the home directories to LRZ's tape system.
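
If you want to check how close you are to the 100 GB limit, ordinary shell tools are enough; a small sketch (LRZ may additionally provide a dedicated quota command, which is not covered here):

    # Total size of your home directory (compare against the 100 GB limit)
    du -sh $HOME

    # Space reported by the NFS mount itself
    df -h $HOME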

GPFS

The GPFS is a very fast parallel file system with 250 TB of capacity and a throughput of 200 GB/s. Each project has 100 GB under /gpfs/work/${GID}/${UID}. There is a separate subdirectory for each user, but the total storage is shared among all project members. For convenience, this subdirectory is available in the shell as $WORK.
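
A quick way to see where your work directory lives and how much of the shared project quota is already in use is sketched below (dirname strips your user subdirectory and leaves the project directory):

    # Your personal subdirectory in the project's work area
    echo $WORK                      # /gpfs/work/<GID>/<UID>

    # Combined usage of all project members (the 100 GB quota is shared)
    du -sh "$(dirname "$WORK")"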

Part of the GPFS is reserved as a scratch area ($SCRATCH) for everyone to use. Data in scratch is available only for a limited time and may be removed when the space is needed. Always transfer critical data to your project storage!
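
A typical pattern is to let a job write its large intermediate output under $SCRATCH and to copy only the results worth keeping to $WORK afterwards. A minimal sketch, where the directory names "myrun" and "results" are just placeholders:

    # Work in a run-specific directory under the shared scratch area
    mkdir -p $SCRATCH/myrun
    cd $SCRATCH/myrun

    # ... run the simulation here, writing into results/ ...

    # Copy the critical output to the project storage before it expires
    cp -r results $WORK/myrun-results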

Users who have a SuperMUC project can access their data at /smgpfs/work(scratch)/${GID} through three 10 Gbit/s Ethernet links.
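
Data can then be copied between the SuperMUC file system and the local project area with standard tools. A hedged sketch, assuming the data sits in the work branch and using a made-up dataset name (replace ${GID} with your project's group ID; it is not necessarily set as a shell variable):

    # Pull a data set from the SuperMUC GPFS into the local work area
    rsync -a --progress /smgpfs/work/${GID}/my_dataset/ $WORK/my_dataset/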

Data on GPFS is protected by RAID6: for every batch of 10 disks there are two redundant disks, so if one or two disks fail at the same time, the data can be recovered and replacement disks can be inserted without any data loss. If you need to store large data sets for an extended period of time, you can apply for a tape backup at LRZ. Please contact the staff for details.

On a compute node

Each compute node has its own local disk with 250 GB of storage (no RAID!). If your program requires significant temporary storage at runtime, you are best advised to use ${LOCAL_SCRATCH} from within your job. The directory is created for that particular job only and is automatically removed after your job finishes or is killed. You cannot access ${LOCAL_SCRATCH} from the login nodes.
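
Inside a job script the usual pattern is to do the I/O-heavy work in ${LOCAL_SCRATCH} and to copy the final output to $WORK before the job ends, since the directory is wiped afterwards. A minimal sketch with a placeholder program name:

    # Run the I/O-heavy part on the node-local disk
    cd $LOCAL_SCRATCH
    my_program > output.dat        # my_program stands for your own executable

    # Save the result to GPFS before the job ends and the directory is removed
    cp output.dat $WORK/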

Software

The operating system is SUSE Linux Enterprise Server (SLES) 11. A lot of software is available via the module system and just needs to be activated. Please have a look at https://www.lrz.de/services/compute/supermuc/programming/
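
Software from the module system is activated per shell session (or inside a job script) with the usual environment-modules commands; the package name below is only an example:

    # List the software available on the cluster
    module avail

    # Activate a package for the current session
    module load intel

    # Show what is currently loaded
    module list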
