Setting up modelblocks jobs on unity


 

Introductory docs: 

Unity intro: https://osuasc.teamdynamix.com/TDClient/1929/Portal/KB/ArticleDet?ID=61519

Overview of supercomputing: https://www.osc.edu/resources/getting_started/new_user_resource_guide

 

Step-by-step Setup

  1. Request unity account access from your PI.  Your unity login uses the same name.# and password as your university account.
  2. Login to unity: ssh <name.#>@unity.asc.ohio-state.edu
  3. Clone modelblocks into your unity workspace: git clone https://github.com/modelblocks/modelblocks-release.git
  4. Go to the modelblocks-release directory and run make.  It will complain a lot and create a bunch of 'user-<corpusname>-directory.txt' files.
  5. Edit modelblocks-release/config/user-<corpusname>-directory.txt where applicable to point to the shared corpus on unity at /fs/project/schuler.77/corpora/<corpusname> (or rsync a copy from the ling servers).  Repeat for every third-party corpus or tool that your make job requires.
  6. Download the miniconda3 installer for linux with: wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh  Then run the installer: bash Miniconda3-latest-Linux-x86_64.sh
  7. Go to the modelblocks-release directory and run make mb_on_unity.  This will:
    • Create a ~/tmp/ directory, which is necessary for certain Unix commands;
    • Create a new conda environment named mb;
    • Install dependencies using conda (e.g. compilers, python, PyTorch); and
    • Download armadillo from source and install it to conda environment mb.
  8. Copy /fs/project/schuler.77/mb_on_unity/example.bashrc to ~/.bashrc and replace the default user with your name.#.  Alternatively, copy lines 26-34 of this file into your current ~/.bashrc.  These lines must be executed to set the environment variables required before compiling or running armadillo-based C++ programs.
  9. Copy example job script /fs/project/schuler.77/example.pbs to ~
  10. Modify job script to use your target.  Rename script with descriptive name.
  11. Submit job: qsub <myjob>.pbs
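
Step 5 gets repetitive when a job needs several corpora. The following is a minimal sketch, assuming each user-<corpusname>-directory.txt file holds nothing but the corpus path on a single line (check one of the generated files before relying on this). The function name set_corpus_dir is hypothetical and not part of modelblocks.

```shell
# Hypothetical helper: point one modelblocks corpus config file at the
# shared corpus directory on unity.
# Assumption: the config file contains only the corpus path on one line.
set_corpus_dir() {
  config_dir="$1"                               # e.g. modelblocks-release/config
  corpus="$2"                                   # corpus name as used in the file name
  base="${3:-/fs/project/schuler.77/corpora}"   # shared corpus root from step 5
  target="$config_dir/user-$corpus-directory.txt"

  if [ ! -d "$config_dir" ]; then
    echo "config directory not found: $config_dir" >&2
    return 1
  fi

  # Overwrite the file with the shared path for this corpus.
  printf '%s\n' "$base/$corpus" > "$target"
  echo "wrote $target"
}
```

Usage, from the directory that contains modelblocks-release: set_corpus_dir modelblocks-release/config <corpusname>. A third argument can override the corpus root, for example when pointing at an rsync'd copy.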

Some gotchas:

Compiling with -std=c++17 (required for try_emplace() function calls) does not seem to work inside a PBS job, although the same code compiles when made on the login node.  Try testing compilation both on an interactive node and in a PBS job with the relevant modules loaded.
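
To compare the two contexts, a small probe can be run once on the login or interactive node and once inside a PBS job. This is a minimal sketch: the function name check_cxx17 is hypothetical, g++ as the default compiler is an assumption, and the probe does not load any modules itself.

```shell
# Hypothetical probe: check whether a compiler accepts -std=c++17 and can
# build and run a program that calls try_emplace().
# Assumption: g++ is the compiler of interest unless another is passed in.
check_cxx17() {
  cxx="${1:-g++}"
  workdir=$(mktemp -d) || return 2

  # Tiny C++17 program; try_emplace() is not available before C++17.
  cat > "$workdir/probe.cpp" <<'EOF'
#include <map>
int main() {
  std::map<int, int> m;
  m.try_emplace(1, 2);
  return m.at(1) == 2 ? 0 : 1;
}
EOF

  if "$cxx" -std=c++17 "$workdir/probe.cpp" -o "$workdir/probe" 2>"$workdir/err.log" \
      && "$workdir/probe"; then
    echo "OK: $cxx built and ran the C++17 probe"
    status=0
  else
    echo "FAIL: $cxx could not build or run the C++17 probe" >&2
    status=1
  fi

  rm -rf "$workdir"
  return "$status"
}
```

Running check_cxx17 in both places, together with command -v g++ and g++ --version, shows whether the PBS job is picking up a different or older compiler than the login node.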

 

How to submit jobs:

  • Batch

    Create job script named myjob.pbs with text editor.

    Submit batch job with the command: qsub myjob.pbs

    PBS files are similar to UNIX shell scripts, but have additional header commands called 'directives' that are specific to the batch scheduling system.

    Details for directives can be found at https://www.osc.edu/supercomputing/batch-processing-at-osc

  • Interactive access to command line

    Session defaults: 1 hour of wall time, 1 CPU core

    Start session with the command: qsub -I (capital 'eye' for interactive)

    Interactive sessions can be used to set up environments, do job debugging, and otherwise prepare for full jobs, but generally won't have access to significant computing resources.
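
As an illustration of the batch route, the sketch below writes a minimal PBS job script. The function name write_job_script is hypothetical, and the resource requests, the conda environment name mb, and the location of modelblocks-release under $HOME are assumptions to adjust for a real job; /fs/project/schuler.77/example.pbs remains the reference example.

```shell
# Hypothetical helper: write a minimal PBS job script for a make target.
# Assumptions: 1 hour of wall time, 1 core, conda environment "mb", and
# modelblocks-release cloned into $HOME.
write_job_script() {
  job_name="$1"               # short descriptive job name
  target="$2"                 # make target to build
  out="${3:-$job_name.pbs}"   # output file, defaults to <job_name>.pbs

  # Lines starting with "#PBS" are directives read by the batch scheduler.
  cat > "$out" <<EOF
#!/bin/bash
#PBS -N $job_name
#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=1
#PBS -j oe

# Load the user-defined conda environment before building.
source activate mb

cd "\$HOME/modelblocks-release"
make $target
EOF
  echo "$out"
}
```

On unity the generated file can then be submitted with qsub, for example: qsub "$(write_job_script myjob <target>)".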

Whether running interactively or by batch script, some setup is usually needed before running the target job.

This means loading the required software either:

  • by pre-packaged 'modules' made available by OSC/Unity:

    module load <package-names> (e.g. cuda torch/7-asc ...)
     
  • or else by user defined conda environments:

    source activate <my-env-name>
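
Both kinds of setup can be wrapped in one function so that interactive sessions and job scripts share the same preamble. This is only a sketch: setup_env is a hypothetical name, and it assumes the module command and conda are already available in the current shell.

```shell
# Hypothetical helper: load modules and/or activate a conda environment,
# failing with a message when a tool is missing in the current shell.
# Usage: setup_env <conda-env-name or ""> [module names...]
setup_env() {
  env_name="${1:-}"
  [ "$#" -gt 0 ] && shift   # remaining arguments are module names

  if [ "$#" -gt 0 ]; then
    if ! command -v module >/dev/null 2>&1; then
      echo "module command not available in this shell" >&2
      return 1
    fi
    module load "$@" || return 1
  fi

  if [ -n "$env_name" ]; then
    if ! command -v conda >/dev/null 2>&1; then
      echo "conda not found on PATH" >&2
      return 1
    fi
    source activate "$env_name" || return 1
  fi
}
```

For example, setup_env mb activates only the conda environment, while setup_env "" cuda torch/7-asc loads only the modules.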
     

For more information on setting up conda environments, see: https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf

OSC/Unity module setup details, including how to create your own modules, can be found in the OSC documentation: https://www.osc.edu/resources/getting_started/howto/howto_locally_installing_software

 

Old Step-by-step Setup

  1. Request unity account access from your PI.  Your unity login uses the same name.# and password as your university account.
  2. Login to unity: ssh <name.#>@unity.asc.ohio-state.edu
  3. Clone modelblocks into your unity workspace: git clone https://github.com/modelblocks/modelblocks-release.git
  4. Go to the modelblocks-release directory and run make.  It will complain a lot and create a bunch of 'user-<corpusname>-directory.txt' files.
  5. Edit modelblocks-release/config/user-<corpusname>-directory.txt where applicable to point to the shared corpus on unity at /fs/project/schuler.77/corpora/<corpusname> (or rsync a copy from the ling servers).  Repeat for every third-party corpus or tool that your make job requires.
  6. Download the miniconda3 installer for linux with: wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh  Then run the installer: bash Miniconda3-latest-Linux-x86_64.sh
  7. Copy the conda environment file /fs/project/schuler.77/modelblocks.yml to ~
  8. Generate conda environment with conda env create --name mb
  9. Activate environment: source activate mb
  10. Install Torch via pip: pip install -f https://download.pytorch.org/whl/torch_stable.html torch==1.5.0+cu101 torchvision==0.6.0+cu101
  11. Update conda deps with: conda env update --file modelblocks.yml
  12. Copy /fs/project/schuler.77/example_.bashrc to ~/.bashrc
  13. Modify ~/.bashrc to replace the default user with your name.# everywhere
  14. Create tmp directory: mkdir /home/<name.#>/tmp (unix commands like sort will use this)
  15. Copy example job script /fs/project/schuler.77/example.pbs to ~
  16. Modify job script to use your target.  Rename script with descriptive name.
  17. Submit job: qsub <myjob>.pbs
