Xcrypt - Job Level Parallel Script Language

Download

Clone the repository from GitHub:
% git clone https://github.com/xcrypt-job/xcrypt

Changelog

See the commit log of the Git repository.

What is Xcrypt?

Computational scientists often perform large-scale simulations in research and development, for example in car body design and drug discovery. For parameter sweeps or optimal parameter searches, such simulations often form Plan-Do-Check-Act (PDCA) cycles, that is, iterations of many sequential or parallel job executions with different parameters.

PDCA cycles should be automated. However, typical computational scientists find existing general-purpose scripting languages, such as Perl or Ruby, hard to use for preparing input files, generating a job script for each job, extracting the necessary parts from output files to analyze results, and managing many asynchronously running jobs. GUI-based workflow tools are an alternative, but some complicated workflows are difficult to describe with them.

Therefore, we are developing a job-level parallel script language named Xcrypt that helps with such automation.

The goal of Xcrypt is to give computational scientists, who are typically familiar with C or FORTRAN but not with general-purpose scripting languages such as Perl and Ruby, a simple way to automate workflows that consist of many program runs and the dependencies among them.

Unlike existing workflow tools, Xcrypt is required to be not only simple but also flexible as a programming language; we should be able to implement anything from simple parameter sweeps to complicated search algorithms in Xcrypt. We met these requirements by starting with Perl, a general-purpose scripting language, and extending it with features that free programmers from tedious tasks such as writing job scripts for batch systems, generating and analyzing a huge number of input and output files, and managing the states of asynchronously running jobs.

In addition, we provide a mechanism that enables “Perl wizards” to add helpful “spells” (e.g., smart search algorithms) as modules in a way that end-users can use easily.

Thanks to these features, Xcrypt users can run a wide variety of workflows just by writing simple scripts.

We aim to make peta/exa-scale computing easy to achieve by combining Xcrypt with lower-layer parallelization implemented using OpenMP, MPI, and/or XcalableMP.

Example

    use base qw(limit core);
    limit::initialize (30);
    %template = (
        'id'      => 'example',
        'RANGE0'  => [1..5000],
        'exe0'    => './a.out',
        'arg0_0@' => '"input$_[0]"',
        'arg0_1@' => '"output$_[0]"',
        'queue'   => 'medium',
    );
    @jobs = prepare (%template);
    submit (@jobs);
    sync (@jobs);

This is a simple example of an Xcrypt end-user script. It submits 5,000 jobs that each execute the same program “a.out” with different command-line arguments, while limiting the number of simultaneously running jobs to 30.

As in this example, a typical Xcrypt script consists of:

a preamble that specifies the modules to be loaded, initializes some global parameters, and so on,
a definition of the jobs to be submitted, and
function calls that prepare for submitting the jobs (e.g., by creating working directories), submit the jobs, and wait for the jobs to finish.

Job Definition

Jobs are defined declaratively as a Perl hash whose entries hold the parameter values. With a parameter named RANGEn, a single hash can define a whole sequence of jobs. In that case, a parameter can take a different value for each job: give it a name ending with @ and, as its value, a string that the Perl interpreter evaluates in an environment where $_[n] is bound to the corresponding value in RANGEn.
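The expansion described above can be sketched in plain Perl. This is only an illustration of the idea, not Xcrypt's implementation: the helper expand_template is hypothetical, it handles RANGE0 only, and the "id_value" naming of the generated jobs is an assumption made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Xcrypt): expand one template hash into
# a list of per-job hashes, one for each value in RANGE0.
sub expand_template {
    my (%template) = @_;
    my @jobs;
    for my $value (@{ $template{RANGE0} }) {
        # Assumed naming scheme for this sketch: "<id>_<value>".
        my %job = (id => "$template{id}_$value");
        for my $key (sort keys %template) {
            next if $key eq 'id' || $key eq 'RANGE0';
            if ($key =~ /^(.*)\@$/) {
                # Key ends with '@': compile the string as a sub body and
                # call it so that $_[0] is the current RANGE0 value.
                my $name = $1;
                my $code = eval "sub { $template{$key} }";
                die "cannot compile '$key': $@" unless defined $code;
                $job{$name} = $code->($value);
            } else {
                # Ordinary key: the same value is shared by every job.
                $job{$key} = $template{$key};
            }
        }
        push @jobs, \%job;
    }
    return @jobs;
}

my @jobs = expand_template(
    'id'      => 'example',
    'RANGE0'  => [1 .. 3],
    'exe0'    => './a.out',
    'arg0_0@' => '"input$_[0]"',
    'arg0_1@' => '"output$_[0]"',
);
print "$_->{id}: $_->{exe0} $_->{arg0_0} $_->{arg0_1}\n" for @jobs;
```

Each generated job carries its own arg0_0 and arg0_1 values (input1/output1, input2/output2, and so on), while exe0 is identical for all of them.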

Submitting/Synchronizing Jobs

Defined jobs are submitted imperatively by the submit() function. Before submitting, we need to call prepare() to make a working directory and copy the necessary files for each job to be submitted. All submitted jobs are executed asynchronously; we can wait for them to finish by calling sync().
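The submit-then-synchronize pattern can be imitated with local processes, as in the following rough sketch. The functions demo_submit and demo_sync are hypothetical stand-ins written for this page; Xcrypt's own submit() and sync() hand jobs to a batch scheduler instead of forking, and the sketch assumes a Unix-like system where fork is available.

```perl
use strict;
use warnings;

# Hypothetical stand-in for submit(): start every job without waiting.
sub demo_submit {
    my (@jobs) = @_;
    for my $job (@jobs) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            exit 0;               # child: pretend the job ran successfully
        }
        $job->{pid} = $pid;       # parent: remember the child and move on
    }
    return @jobs;
}

# Hypothetical stand-in for sync(): block until every job has finished.
sub demo_sync {
    my (@jobs) = @_;
    for my $job (@jobs) {
        waitpid($job->{pid}, 0);  # wait for this job's process to exit
        $job->{status} = $? >> 8; # keep its exit code
    }
    return @jobs;
}

my @jobs = map { +{ id => "example_$_" } } 1 .. 3;
demo_submit(@jobs);               # returns as soon as all jobs are started
demo_sync(@jobs);                 # returns only after all jobs have exited
print "$_->{id} finished with status $_->{status}\n" for @jobs;
```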

Defining/Using Modules

To limit the number of simultaneously running jobs, the user script shown above uses the limit module, which is implemented as shown below.

    package limit;
    use NEXT;
    use Thread::Semaphore;
    my $smph;
    sub initialize {
        $smph = Thread::Semaphore->new($_[0]);
    }
    sub before {
        my $self = shift;
        $smph->down;
        $self->NEXT::before();
    }
    sub after {
        my $self = shift;
        $self->NEXT::after();
        $smph->up;
    }

An Xcrypt module is defined as an extension to core, the class for job objects. In the core class and its subclasses, the methods named before and after have special meanings: they are invoked asynchronously before a job is submitted and after the job has finished, respectively.

Thanks to this mechanism, a wide variety of functionality can be developed as modules, and end-users can use it simply by listing module names for (multiple) class inheritance. For instance, we have also implemented modules for dry execution and for controlling the order in which jobs are submitted through a declarative description of the dependencies among them.
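How the before and after methods chain through the inheritance hierarchy can be tried out with the following self-contained sketch. The classes DemoCore, DemoLimit, and DemoUser are hypothetical stand-ins for core, a module such as limit, and the class a user script obtains through inheritance; the hooks are called synchronously here, whereas Xcrypt invokes them asynchronously.

```perl
use strict;
use warnings;

my @trace;    # records the order in which the hooks run

# Stand-in for the base job class (far simpler than Xcrypt's core).
package DemoCore;
sub new    { my ($class, %args) = @_; return bless {%args}, $class; }
sub before { my $self = shift; push @trace, "core before $self->{id}"; }
sub after  { my $self = shift; push @trace, "core after $self->{id}"; }

# Stand-in for an extension module that wraps the base hooks.
package DemoLimit;
use NEXT;
sub before {
    my $self = shift;
    push @trace, "limit before $self->{id}";  # e.g. acquire a semaphore here
    $self->NEXT::before();                    # continue along the class chain
}
sub after {
    my $self = shift;
    $self->NEXT::after();
    push @trace, "limit after $self->{id}";   # e.g. release the semaphore here
}

# Stand-in for the class a user script gets by inheriting from both.
package DemoUser;
our @ISA = ('DemoLimit', 'DemoCore');

package main;
my $job = DemoUser->new(id => 'example_1');
$job->before();
push @trace, "run example_1";   # a real system would run the job here
$job->after();
print "$_\n" for @trace;
```

The module's before hook runs ahead of the base hook and its after hook runs behind it, so a module can wrap a job's lifetime, which is what a semaphore-based limit needs.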

Job Script Generation

When submit() is invoked, the Xcrypt runtime generates a job script for the batch scheduler (e.g., NQS, Torque, LSF, or SGE) based on the information in a job object. Batch schedulers differ from one another in their command-line interfaces, job script specifications, and so on. To support a wide variety of them, Xcrypt provides a mechanism that enables programmers or system administrators to define a new batch scheduler by writing a Perl-based configuration script. The following script is an example of such a configuration. Each parameter value may be either a string or a function object, which makes configurations easy to write while remaining flexible enough for the various specifications of batch schedulers.

    $jobsched::jobsched_config{"NQS"} = {
        qsub_command  => "/usr/local/bin/qsub",
        qdel_command  => "/usr/local/bin/qdel -K",
        qstat_command => "/usr/local/bin/qstat",
        jobscript_option_queue  => '# @$-q ',
        jobscript_option_stdout => '# @$-o ',
        jobscript_option_stderr => '# @$-e ',
        extract_req_id_from_qsub_output => sub {
            my (@lines) = @_;
            if ($lines[0] =~ /([0-9]*).nqs/) {
                return $1;
            } else {
                return -1;
            }
        },
        ...
    }
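A runtime can accept both kinds of values by checking whether a value is a code reference, as in the sketch below. The helper resolve_option, the rule that a string value is completed by appending the argument, the function-valued stdout option, and the sample qsub output line are all assumptions made for illustration; they are not taken from Xcrypt itself.

```perl
use strict;
use warnings;

# A small configuration mixing plain strings and function objects.
my %config = (
    jobscript_option_queue  => '# @$-q ',
    # Function-valued option, invented for this sketch.
    jobscript_option_stdout => sub { '# @$-o ' . $_[0] . '.log' },
    extract_req_id_from_qsub_output => sub {
        my (@lines) = @_;
        return $lines[0] =~ /([0-9]+)\.nqs/ ? $1 : -1;
    },
);

# Hypothetical helper: turn a configuration value into a job script line.
sub resolve_option {
    my ($value, $arg) = @_;
    return $value->($arg) if ref($value) eq 'CODE';  # function object: call it
    return $value . $arg;                            # string: append the argument
}

my $queue_line  = resolve_option($config{jobscript_option_queue},  'medium');
my $stdout_line = resolve_option($config{jobscript_option_stdout}, 'example_1');

# Sample scheduler output, made up for this sketch.
my $req_id = $config{extract_req_id_from_qsub_output}->(
    'Request 1234.nqs submitted to queue: medium.');

print "$queue_line\n$stdout_line\nrequest id: $req_id\n";
```

A string is enough when an option is a fixed prefix, while a function object covers schedulers whose syntax needs computed text.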

Libraries for Input File Generation and Output File Extraction

Of course, we can use legacy Unix tools such as grep, sed, and awk to generate input files and extract data from output files for a huge number of jobs. However, users who are unfamiliar with regular expressions do not find it easy to generate a large number of FORTRAN namelists that differ slightly from one another, or to extract certain elements from an output file that represents a matrix.

Therefore, Xcrypt provides higher-level generation and extraction libraries. We improved their usability by specializing them for computational science, for example for modifying a FORTRAN namelist and for extracting data from output files by specifying row and column numbers.
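To give a feeling for extraction by row and column numbers, here is a minimal sketch in plain Perl. The function extract_cell is hypothetical and is not the interface of Xcrypt's libraries; it assumes whitespace-separated output and 1-based row and column numbers.

```perl
use strict;
use warnings;

# Hypothetical helper (not Xcrypt's library API): return the element at a
# 1-based row and column of whitespace-separated matrix output.
sub extract_cell {
    my ($text, $row, $col) = @_;
    my @lines = grep { /\S/ } split /\n/, $text;  # ignore blank lines
    return undef if $row < 1 || $row > @lines;
    my @fields = split ' ', $lines[$row - 1];     # split on runs of whitespace
    return undef if $col < 1 || $col > @fields;
    return $fields[$col - 1];
}

my $output = <<'END';
1.0  2.0  3.0
4.0  5.0  6.0
7.0  8.0  9.0
END

my $cell = extract_cell($output, 2, 3);
print "row 2, column 3: $cell\n";
```

The caller names a position in the matrix and needs no regular expression.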

Papers

Masaru Ueno, Tasuku Hiraishi, Motoharu Hibino, Takeshi Iwashita, Hiroshi Nakashima. Multilingualization Based on RPC for Job-Level Parallel Script Language, Xcrypt. IPSJ Transactions on Programming, Vol. 6, No. 2, pp. 55--68, 2013-8. pdf link
Tasuku Hiraishi, Masaru Ueno, Tatsuya Abe, Motoharu Hibino, Takeshi Iwashita, Hiroshi Nakashima. Xcrypt on Lisp: A Scripting System for Job Level Parallel Programming in Lisp. International Lisp Conference, pp. 107--114, Kyoto, 2012-10-23 . pdf link
Tasuku Hiraishi, Tatsuya Abe, Takeshi Iwashita, Hiroshi Nakashima. Xcrypt: A Perl Extension for Job Level Parallel Programming. Second International Workshop on High-performance Infrastructure for Scalable Tools WHIST 2012 (held as part of ICS'12), Venice, Italy, 2012-6-29 . pdf
Tasuku Hiraishi, Tatsuya Abe, Yohei Miyake, Takeshi Iwashita, Hiroshi Nakashima. Xcrypt: Flexible and Intuitive Job-Parallel Script Language. Symposium on Advanced Computing Systems and Infrastructures (SACSIS2010), pp. 183--191, Nara, Japan, 2010-5 . (In Japanese)
Tasuku Hiraishi, Takeshi Iwashita, Hiroshi Nakashima. Towards Seamless and Highly-Productive Parallel Script Language. The 119th IPSJ Workshop on High Performance Computing (HOKKE-2009), pp. 175--180, Sapporo, 2009-2 . (In Japanese) pdf
Tasuku Hiraishi, Tatsuya Abe, Yohei Miyake, Takeshi Iwashita, Hiroshi Nakashima. Development of Xcrypt: Highly Productive Parallel Script Language. IPSJ Summer Programming Symposium 2009, pp. 67--73, Nasushiobara, Tochigi, . (In Japanese) pdf

Presentations

Hiroshi Nakashima, Tasuku Hiraishi, Tatsuya Abe, Yohei Miyake. Progress Report of Xcrypt: What You Can Do Now with the Parallel Script Language. WPSE2010, Kyoto, 2010-2. link
Tasuku HIRAISHI. Xcrypt Tutorial. PC Cluster Workshop in Kyoto 2010, Kyoto, 2010-2 . (In Japanese) link
T2K Open Supercomputer Alliance. Seamless and Highly-Productive Parallel Programming Environment for High-Performance Computing. SC09 Exhibition, Portland, Oregon, 2009-11 . pdf
Tasuku HIRAISHI, Hiroshi NAKASHIMA. Xcrypt: Highly-Productive Parallel Script Language. Third French-Japanese Workshop Petascale Applications, Algorithms and Programming (PAAP2009), Kyoto University, 2009-4 . link