Supercomputer access

From Miguel Caro's wiki
Revision as of 14:57, 27 August 2021

This is miscellaneous information about how to use the supercomputing infrastructure available to our group.

CSC

CSC (Center for Scientific Computing, www.csc.fi) is the national supercomputing center of Finland, and where we get (by far) most of our CPU time. CSC resources are made available for free to Finland-based academic researchers on the basis of scientific excellence: plans for future research and past track record. CSC offers a number of services:

  • Mahti: a supercomputer with lots of CPU power (mostly CPU based). We typically use it for VASP and other DFT packages, and for large-scale molecular dynamics with LAMMPS and TurboGAP.
  • Puhti: a supercluster with a heterogeneous architecture (CPUs, GPUs, different memory configurations, etc.). Here we run a variety of tasks, including small array jobs (small-scale MD with TurboGAP, miscellaneous Python jobs, etc.). A specific kind of task we carry out here is training GAP potentials with gap_fit (a program in the QUIP suite), for which we use the fat nodes (hugemem partition), with 1.5 TB and 768 GB of RAM per node. We also have our own dedicated fat node (Sumo, see its own section), which boasts 3 TB of RAM, for the most demanding GAP fitting tasks.
  • Pouta: virtual machines.
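Jobs on Mahti and Puhti go through the Slurm batch system. As an illustration only, a hugemem gap_fit job script on Puhti might look roughly like the sketch below; the account name, resource numbers, and gap_fit options are placeholders, not a tested recipe.

```shell
#!/bin/bash
# Hypothetical Puhti batch script for a gap_fit training run on the hugemem partition.
# All names and numbers below are placeholders; check CSC's documentation for current limits.
#SBATCH --partition=hugemem
#SBATCH --account=project_projectnumber
#SBATCH --time=24:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=40
#SBATCH --mem=700G

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# at_file points at the training configurations; the descriptor and
# regularization options for the actual fit go after it.
srun gap_fit at_file=train.xyz
```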

For new users

  1. First, you need to get a CSC account. Go to my.csc.fi, where you can log on with your Aalto credentials.
  2. Second, you need to ask Miguel to add you to an existing project.
  3. For each project, you need to accept the terms of use for each service. In my.csc.fi, go to Projects and click on the project for which you want to enable compute services (there is a list on the right). Then accept the terms and conditions (usually for Mahti and Puhti). Some services might not be available for all projects.
  4. Now you're all set to log on to CSC machines.

Using Mahti

To log on to Mahti, do ssh -X username@mahti.csc.fi. If you want to avoid entering your password every time you log on, use an ssh key.

Once on Mahti, you land on your Linux home. The Mahti filesystem is organized into:

  • Home: your usual Linux home, located at /users/username, with 10 GB/100k files quota (quotas on CSC filesystems affect storage space and number of files). This is for you to put your configuration files and miscellaneous things.
  • Project application directories, located at /projappl/project_projectnumber. These usually have a 50 GB/100k files quota for everyone in that project. Only files stored under the project directory "belong" to the project when computing the quota. These directories are normally used for application data, i.e., "programs".
  • Scratch directories, located at /scratch/project_projectnumber. Usual quotas here are 1 TB/1M files. These are the directories where you store data which are input and/or output of your calculations.
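The ssh key mentioned above can be set up roughly as follows; the key file name and the "mahti" host alias are illustrative choices, not CSC requirements.

```shell
# One-time ssh key setup for password-less logins (key name and alias are illustrative).
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
# Generate an ed25519 key pair; -N "" skips the passphrase (consider using one in practice).
ssh-keygen -t ed25519 -f "$HOME/.ssh/id_ed25519_csc" -N "" -q
# Copy the public key to Mahti (run manually; it asks for your password one last time):
#   ssh-copy-id -i ~/.ssh/id_ed25519_csc.pub username@mahti.csc.fi
# Optional host alias, so that "ssh mahti" replaces "ssh -X username@mahti.csc.fi":
cat >> "$HOME/.ssh/config" <<'EOF'
Host mahti
    HostName mahti.csc.fi
    User username
    IdentityFile ~/.ssh/id_ed25519_csc
    ForwardX11 yes
EOF
```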

You can only write to projappl and scratch directories if you belong to the relevant project. If you want to check:

  • Which projects you're part of, type groups
  • The quotas and current usage of filesystem space, type csc-workspaces
  • The remaining CPU time in the projects you belong to, type csc-projects. You can also check this info (with some delay) on my.csc.fi.
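For example, a quick status check after logging in might look like this; csc-workspaces and csc-projects exist only on CSC machines, so they are guarded here to make the snippet safe to run anywhere.

```shell
# Show the project groups the current user belongs to (standard Linux command).
groups
# CSC-specific helpers; skip silently when run outside a CSC login node.
command -v csc-workspaces >/dev/null 2>&1 && csc-workspaces || true
command -v csc-projects >/dev/null 2>&1 && csc-projects || true
```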

Aalto (Triton)

Aalto (Sumo)