Setting up modelblocks jobs on unity

 

Introductory docs: 

Unity intro: https://osu.teamdynamix.com/TDClient/1929/ASC/KB/ArticleDet?ID=61519

Overview to supercomputing: https://www.osc.edu/resources/getting_started/new_user_resource_guide

 

Step-by-step Setup

  1. Request unity account access from your advisor.  Your unity login will be your OSU name.#, with your university password.
  2. Login to unity: ssh <name.#>@unity.asc.ohio-state.edu
  3. From your home directory, download miniconda3 for linux: wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
  4. Install miniconda:  bash Miniconda3-latest-Linux-x86_64.sh
  5. Log out and then log back in again, to make the conda command available.
  6. Clone modelblocks into your unity workspace: git clone https://github.com/modelblocks/modelblocks-release.git
  7. Go to the modelblocks-release directory and run the modelblocks unity setup: make mb_on_unity.  This will:
    • Create a ~/tmp/ directory, which is necessary for certain Unix commands;
    • Create a new conda environment named mb;
    • Install dependencies using conda (e.g. compilers, python, PyTorch); and
    • Download armadillo from source and install it to conda environment mb.
  8. Copy /fs/project/schuler.77/mb_on_unity/example.bashrc to ~/.bashrc.  Alternatively, lines 26-34 of that file can be copied and pasted into your existing ~/.bashrc.  These lines must be executed in order to assign the environment variables required for compiling/running armadillo-based c++ programs (see the sketch following this list).
  9. Log out and then log back in again, to make the bashrc modifications available.
  10. Go to the modelblocks-release directory and run make.  This will create a workspace directory for you.
  11. Go to the workspace directory and make any binary to configure modelblocks, e.g. make bin/indent.  This complains a lot about missing files, creates a bunch of user-...-directory.txt files in the config directory, and then warns you that they need to be updated.
  12. Edit modelblocks-release/config/user-<directoryname>-directory.txt where applicable to point to the shared corpus or compling resource accessible on unity.  The address is /fs/project/schuler.77/corpora/<corpusname> or /fs/project/schuler.77/compling/<softwarename>.  (Or rsync a copy from the ling servers.)  Repeat for all third-party corpora or software required for your make job.
  13. Copy example job script /fs/project/schuler.77/example.sbatch to ~
  14. Modify the job script to use your target, and rename the script with a descriptive name.
  15. Submit the job with sbatch <myjob>.sbatch, check on it with squeue -u <username>, and cancel it with scancel <jobid> (sbatch prints the job ID at submission, and squeue displays it).
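
For reference, the environment variables mentioned in step 8 typically look something like the sketch below.  This is only an illustration, not the contents of the shared example.bashrc: the paths assume a default miniconda3 install and the mb conda environment, so copy the actual lines from /fs/project/schuler.77/mb_on_unity/example.bashrc rather than these.

    # Illustrative sketch only -- copy the real lines from example.bashrc (step 8).
    # Assumes a default miniconda3 install location and the 'mb' conda environment.
    MB_ENV="$HOME/miniconda3/envs/mb"
    export LD_LIBRARY_PATH="$MB_ENV/lib:$LD_LIBRARY_PATH"   # runtime linking for armadillo-based binaries
    export LIBRARY_PATH="$MB_ENV/lib:$LIBRARY_PATH"         # link-time library search path
    export CPATH="$MB_ENV/include:$CPATH"                   # armadillo header search path
    export TMPDIR="$HOME/tmp"                               # scratch directory created in step 7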

Some gotchas:

Compiling doesn't seem to work with -std=c++17 in the PBS case, but it compiles when made on the login node (c++17 is required for the try_emplace() calls).  Try testing compilation on an interactive node and also in a batch job with the appropriate modules loaded.
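
A quick way to run such a test on an interactive node is sketched below; the compiler module name is a placeholder, so check module avail for what unity actually provides.

    sinteractive                      # request an interactive compute node (defaults: 1 core, 1 hour)
    module avail gcc                  # list available compiler modules; names vary by site
    module load gcc                   # placeholder: load a compiler recent enough for -std=c++17
    source activate mb                # conda environment from the setup steps above
    cd ~/modelblocks-release/workspace
    make bin/indent                   # or whichever armadillo-based binary fails to build in batch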

 

How to submit jobs:

  • Batch

    Create a job script named myjob.sbatch with a text editor.

    Submit batch job with the command: sbatch myjob.sbatch

    SBATCH files are similar to UNIX shell scripts, but have additional header lines called 'directives' that are specific to the batch scheduling system (see the example script after this list).

    Details for directives can be found at https://www.osc.edu/supercomputing/batch-processing-at-osc

  • Interactive access to command line

    Session defaults: 1 hour of wall time, 1 CPU core

    Start session with the command: sinteractive

    Interactive sessions can be used to set up environments, do job debugging, and otherwise prepare for full jobs, but generally won't have access to significant computing resources.
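
As mentioned above, a minimal sbatch script might look like the sketch below.  The resource requests and the make target are placeholders; the shared example.sbatch from the setup steps is the real template.

    #!/bin/bash
    #SBATCH --job-name=mb-myjob           # descriptive job name
    #SBATCH --time=04:00:00               # wall time limit (HH:MM:SS)
    #SBATCH --nodes=1                     # single node
    #SBATCH --cpus-per-task=4             # CPU cores for the job
    #SBATCH --mem=16G                     # memory request

    source activate mb                    # load required software as described below
    cd ~/modelblocks-release/workspace
    make <your-target>                    # the modelblocks target this job should build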

Whether running interactively or by batch script, two major setup steps are often needed before running the target job (a combined example appears at the end of this section).

These involve loading the required software either:

  • by pre-packaged 'modules' made available by OSC/Unity:

    module load <package-names> (e.g. cuda torch/7-asc ...)
     
  • or else by user-defined conda environments:

    source activate <my-env-name>
     

For more information on setting up conda environments, see: https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf

OSC/Unity module setup details, including creating your own modules, can be found in the OSC documentation: https://www.osc.edu/resources/getting_started/howto/howto_locally_installing_software
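
Putting the two setup steps together, the top of a batch script or the start of an interactive session often looks something like the sketch below (the module name is only an example from above and may not match what is installed on unity).

    module load cuda                      # example pre-packaged module; check 'module avail' for actual names
    source activate mb                    # user-defined conda environment created during setup
    cd ~/modelblocks-release/workspace
    make <your-target>                    # then run the target from the modelblocks workspace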

 

Old Step-by-step Setup

  1. Request unity account access from your PI.  Your unity login will be your name.# and university password.
  2. Login to unity: ssh <name.#>@unity.asc.ohio-state.edu
  3. Clone modelblocks into your unity workspace: git clone https://github.com/modelblocks/modelblocks-release.git
  4. Go to the modelblocks-release directory and run make.  It will complain a lot and create a bunch of 'user-<corpusname>-directory.txt' files.
  5. Edit modelblocks-release/config/user-<corpusname>-directory.txt where applicable to point to the shared corpus resource accessible on unity.  The address is /fs/project/schuler.77/corpora/<corpusname>.  (Or rsync a copy from the ling servers.)  Repeat for all third-party corpora or tools required for your make job.
  6. Download miniconda3 for linux (wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh) and install it (bash Miniconda3-latest-Linux-x86_64.sh)
  7. Copy conda environment from /fs/project/schuler.77/modelblocks.yml to ~
  8. Generate conda environment with conda env create --name mb
  9. Activate environment: source activate mb
  10. Install Torch via pip: pip install -f https://download.pytorch.org/whl/torch_stable.html torch==1.5.0+cu101 torchvision==0.6.0+cu101
  11. Update conda deps with: conda env update --file modelblocks.yml
  12. Copy /fs/project/schuler.77/example_.bashrc to ~/.bashrc
  13. Modify ~/.bashrc to replace the default user with your name.# everywhere
  14. Create tmp directory: mkdir /home/<name.#>/tmp (unix commands like sort will use this)
  15. Copy example job script /fs/project/schuler.77/example.pbs to ~
  16. Modify job script to use your target.  Rename script with descriptive name.
  17. Submit job: qsub <myjob>.pbs
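
For reference, a minimal PBS script for this older workflow might look like the following sketch; the resource requests and the make target are placeholders, and the shared example.pbs is the real template.

    #!/bin/bash
    #PBS -N mb-myjob                      # descriptive job name
    #PBS -l walltime=04:00:00             # wall time limit
    #PBS -l nodes=1:ppn=4                 # one node, four cores

    source activate mb                    # conda environment from steps 8-9 above
    cd ~/modelblocks-release/workspace
    make <your-target>                    # the modelblocks target this job should build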
