Python Job Runner

A blocking search creates a search job and runs the search synchronously in 'blocking' mode: the job is returned only after the search has finished and all the results are in. When you create a search job, you set the parameters of the job as a dictionary of key-value pairs. I also have a list of items, and I want to run the same job for each item in the list: the exact same code, just with the item name injected in. I could do this with a Python loop, but my list contains about 30 items, and if one fails, the whole job fails and so does every other item in the list.

To best understand the below information, you should already have an understanding of:

  • Using the command line to: navigate within directories, create/copy/move/delete files and directories, and run their intended programs (aka 'executables').

CHTC provides several copies of Python that can be used to run Python code in jobs. See our list of supported versions here: CHTC Supported Python

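This guide details the steps needed to:

  1. Prepare a portable copy of the Python packages your code needs
  2. Write a script that unpacks Python and your packages and runs your Python code
  3. Submit jobs that use that script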

If you want to build your own copy of base Python, see this archived page: Building a Python installation

If you want to use conda to manage your Python package dependencies, read this guide as background material, then read our guide on using conda.

Python version    Name of Python installation file
Python 2.7        python27.tar.gz
Python 3.6        python36.tar.gz
Python 3.7        python37.tar.gz
Python 3.8        python38.tar.gz
If your code uses specific Python packages (like numpy, matplotlib, scikit-learn, etc.), follow the directions below to download and prepare the packages you need for job submission. If your job does not require any extra Python packages, skip to parts 2 and 3.

You are going to start an interactive job that runs on the HTC build servers and that downloads a copy of Python. You will then install your packages to a folder and zip those files to return to the submit server.

These instructions are primarily about adding packages to a fresh install of Python; if you want to add packages to a pre-existing package folder, there will be notes below in boxes like this one.

A. Submit an Interactive Job

Create the following special submit file on the submit server, calling it something like build.sub.
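The submit file itself is not reproduced in this copy of the guide. A minimal sketch, written out with a shell heredoc, might look like the following. The file name build.sub comes from the text and python37.tar.gz is one of the tarballs in the table above; the log file name, the resource requests, the bare-filename form of transfer_input_files, and the +IsBuildJob line (assumed here to be the attribute that routes the job to the build servers) are assumptions to check against CHTC's current documentation.

```shell
# Write a sketch of the interactive build submit file.
# Values marked "assumption" are illustrative, not taken from CHTC's guide.
cat > build.sub <<'EOF'
# build.sub: interactive job for installing Python packages

universe = vanilla
log = interactive.log

# Choose one python##.tar.gz from the table of available versions.
# Assumption: given as a bare file name; CHTC may require a full location.
transfer_input_files = python37.tar.gz

# Assumption: site-specific attribute that marks this as a build job.
+IsBuildJob = true

# Assumption: modest resource requests for a package build.
request_cpus = 1
request_memory = 4GB
request_disk = 2GB

queue
EOF
```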

The only thing you should need to change in the above file is the name of the python##.tar.gz file in the 'transfer_input_files' line. Several versions of Python are available to build from; see the table above.

If you want to add packages to a pre-existing package directory, add the tar.gz file with the packages to the transfer_input_files line, separated from the Python tarball by a comma (for example: python##.tar.gz, packages.tar.gz).

Once this submit file is created, you will start the interactive job by running the following command:
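The command is missing from this copy of the guide. With HTCondor, an interactive job is normally requested by passing the -i option to condor_submit, so the command is most likely the one below. The guard around it is an addition for illustration: it only keeps the snippet from failing on a machine where HTCondor is not installed.

```shell
submit_file=build.sub   # the name suggested earlier; use your own if it differs

if command -v condor_submit >/dev/null 2>&1; then
    # -i asks HTCondor for an interactive session instead of a batch run
    condor_submit -i "$submit_file"
else
    echo "condor_submit not found; run this on the submit server" >&2
fi
```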

It may take a few minutes for the build job to start.

B. Install the Packages

Once the interactive build job starts, you should see the Python tarball that you specified inside the working directory (running ls will list it).

We'll now unzip the copy of Python and set the PATH variable to reference that version of Python:
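A sketch of those two steps, assuming the python37.tar.gz tarball from the table and assuming that it unpacks into a directory named python with its executables under python/bin. Neither detail is stated in this copy of the guide, so adjust both to what you actually see after unpacking.

```shell
python_tarball=python37.tar.gz   # assumption: the version chosen in build.sub

# Unpack the tarball if it is present in the working directory.
if [ -f "$python_tarball" ]; then
    tar -xzf "$python_tarball"
else
    echo "$python_tarball not found in $PWD" >&2
fi

# Put the unpacked Python first on PATH so that "python3" resolves to it.
# Assumption: the tarball unpacks to ./python with executables in ./python/bin.
export PATH="$PWD/python/bin:$PATH"
```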

To make sure that your setup worked, check the version that Python reports by running python3 --version.

You can also run which python3 to make sure the copy of Python that is now active is the one you just installed. It should return a path that includes the prefix /var/lib/condor/, indicating that it is installed in your job's working directory.
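The two checks can be combined into a short snippet. This is a sketch rather than the guide's own commands: command -v is used as a portable equivalent of which, and the path is compared against the /var/lib/condor/ prefix named in the text.

```shell
# Report the version of whichever python3 is first on PATH.
python3 --version || echo "python3 could not be run" >&2

# Look up the full path of that python3 (empty if none is found).
python_path=$(command -v python3 || true)

case "$python_path" in
    /var/lib/condor/*) echo "using the job's copy: $python_path" ;;
    "")                echo "no python3 found on PATH" ;;
    *)                 echo "using a different copy: $python_path" ;;
esac
```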

If you're using Python 2, use python2 instead of python3 above (and in what follows). The output should match the version number that you want to be using!

If you brought along your own package directory, un-tar it here and skip the directory creation step below.

First, create a directory to put your packages into (for example, by running mkdir packages).

You can choose what name to use for this directory. If you have different sets of packages that you use for different jobs, you could use a more descriptive name than 'packages'.

To install the Python packages, run the following command:
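The command is missing from this copy of the guide; a likely form uses pip's --target option to install into the package directory. In this sketch the directory name packages follows the text, and the install_packages helper is a hypothetical wrapper added so that the package names can be passed as arguments.

```shell
package_dir="$PWD/packages"
mkdir -p "$package_dir"   # no-op if the directory already exists

# Hypothetical helper: install every package named on the command line
# into $package_dir instead of into the Python installation itself.
install_packages() {
    python3 -m pip install --target="$package_dir" "$@"
}

# Example call, with package1 and package2 as placeholders:
#   install_packages package1 package2
```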

Replace package1 package2 with the names of packages you want to install. pip should download all dependent packages and install them. Certain packages may take longer than others.

C. Finish Up

Right now, if we exit the interactive job, nothing will be transferred back because we haven't created any new files in the working directory, just sub-directories. In order to transfer back our installation, we will need to compress it into a tarball file. Not only will HTCondor then transfer back the file, it is generally easier to transfer a single, compressed tarball file than an uncompressed set of directories.

Run the following command to create your own tarball of your packages:
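A sketch of that command, assuming the directory is named packages as in the text. The tarball name packages.tar.gz is an assumption; substitute your own names if they differ.

```shell
mkdir -p packages   # already present if you followed the steps; harmless otherwise

# -c create an archive, -z compress it with gzip, -f write it to the named file
tar -czf packages.tar.gz packages/

# List the first few entries to confirm the archive contains the directory.
tar -tzf packages.tar.gz | head -n 5
```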

Again, you can use a different name for the tar.gz file, if you want.

We now have our packages bundled and ready for CHTC! You can now exit the interactive job and the tar.gz file with your Python packages will return to the submit server with you (this sometimes takes a few extra seconds after exiting).

In order to use CHTC's copy of Python and the packages you have prepared in an HTCondor job, we will need to write a script that unpacks both Python and the packages and then runs our Python code. We will use this script as the executable of our HTCondor submit file.

A sample script appears below. After the first line, the lines starting with hash marks are comments. You should replace 'my_script.py' with the name of the script you would like to run, and modify the Python version numbers to be the same as what you used above to install your packages.

If you have additional commands you would like to be run within the job, you can add them to this base script. Once your script does what you would like, give it executable permissions with chmod +x.
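The sample script is not reproduced in this copy of the guide. The sketch below writes one possible version to a file and then makes it executable. The file name run_python.sh, the tarball names, the python/bin layout, and the use of PYTHONPATH to expose the package directory are assumptions consistent with the steps above, not CHTC's exact script.

```shell
cat > run_python.sh <<'EOF'
#!/bin/bash

# Unpack the Python installation (match the version used to build packages).
tar -xzf python37.tar.gz
# Unpack the packages prepared in the interactive job (omit if none).
tar -xzf packages.tar.gz

# Use the unpacked Python and make the packages importable.
export PATH="$PWD/python/bin:$PATH"
export PYTHONPATH="$PWD/packages"

# Run the Python code.
python3 my_script.py
EOF

# Give the script executable permissions.
chmod +x run_python.sh
```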

Arguments in Python

To pass arguments to a Python script within a job, you'll need to use the following syntax in your main executable script, in place of the generic command above:
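The line itself is missing from this copy of the guide. A sketch of what it likely looks like, shown as a dry run: set -- supplies two hypothetical argument values in place of the submit file, and echo prints the command rather than running it. In the real wrapper script, drop both and keep only the python3 line.

```shell
# Stand-in for the two arguments the submit file would pass (hypothetical values).
set -- sample_a 10

# Dry run: print the command that the wrapper script would execute.
# Quoting "$1" and "$2" keeps arguments that contain spaces intact.
echo python3 my_script.py "$1" "$2"
```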

Here, $1 and $2 are the first and second arguments passed to the bash script from the submit file (see below), which are then sent on to the Python script. For more (or fewer) arguments, add (or remove) numbered variables in the same way, such as $3.

In addition, your Python script will need to be able to accept arguments from the command line. There is an explanation of how to do this in this Software Carpentry lesson.

A sample submit file can be found in our helloworld example page. You should make the following changes in order to run Python jobs:

  • Your executable should be the script that you wrote above.

  • Modify the CPU/memory request lines. Test a few jobs for disk space/memory usage in order to make sure your requests for a large batch are accurate!
    Disk space and memory usage can be found in the log file after the job completes.
  • Change transfer_input_files to include the Python tarball (python##.tar.gz), your package tarball if you made one, and your Python script.
  • If your script takes arguments (see the box from the previous section), include those in the arguments line.
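As a sketch of how those changes fit together, the snippet below writes a hypothetical job submit file. Only the executable, arguments, transfer_input_files, and request lines are the point here. The file names, argument values, and resource amounts are placeholders, and the Python tarball is listed by bare file name as an assumption (CHTC may expect a different location for it).

```shell
cat > python_job.sub <<'EOF'
# python_job.sub: sketch of a submit file for one Python job

universe = vanilla
log = python_job.log
output = python_job.out
error = python_job.err

# The wrapper script that unpacks Python and runs the code.
executable = run_python.sh
# Passed to the wrapper as $1 and $2 (placeholder values).
arguments = sample_a 10

# Python itself, the package tarball, and the Python script.
transfer_input_files = python37.tar.gz, packages.tar.gz, my_script.py

# Placeholder requests: test a few jobs, then adjust from the log file.
request_cpus = 1
request_memory = 1GB
request_disk = 1GB

queue
EOF
```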

How big is your package tarball?

If your package tarball is larger than 100 MB, you should NOT transfer the tarball using transfer_input_files. Instead, you should use CHTC's web proxy, squid. In order to request space on squid, email the research computing facilitators at chtc@cs.wisc.edu.
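One way to check the size before deciding how to transfer the tarball is sketched below. The helper name and the tarball name packages.tar.gz are hypothetical, and 100 MB is taken here to mean 100 * 1024 * 1024 bytes, which is an assumption about how the limit is counted.

```shell
limit_bytes=$((100 * 1024 * 1024))   # assumption: 1 MB = 1024 * 1024 bytes

# Hypothetical helper: succeeds if the named file is larger than the limit.
tarball_too_large() {
    # wc -c counts bytes; the arithmetic expansion strips any padding spaces
    size_bytes=$(($(wc -c < "$1")))
    [ "$size_bytes" -gt "$limit_bytes" ]
}

if [ -f packages.tar.gz ]; then
    if tarball_too_large packages.tar.gz; then
        echo "packages.tar.gz is over 100 MB: ask about squid"
    else
        echo "packages.tar.gz is small enough for transfer_input_files"
    fi
fi
```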
