XGrid PList format

From CSclasswiki


Generating PList Files

--Thiebaut 23:30, 27 October 2008 (UTC)


Example

Here's a simple batch file:

{
    jobSpecification =     {
        applicationIdentifier = "com.apple.xgrid.cli";
        inputFiles =  { "getStats.pl" =  {
                fileData = <2321202f 7573722f 62696e2f 7065726c 202d770a 23206765 74537461 74732e70   
                                 ...
                                 2020207d 0a0a7d0a 0a6d6169 6e28293b 0a>;
                isExecutable = YES; };
        };
        name = "getStats.pl";
        schedulerHints =  {
            0 = mathgrid5;
        };
        submissionIdentifier = abc;
        taskSpecifications =         {
            0 = { arguments = ( "pdb100d.ent" );  command = "getStats.pl"; };
        };
    };
}

The batch file corresponds to the following command:

 getStats.pl pdb100d.ent

In this case we assume that the getStats.pl program will fetch the file pdb100d.ent from a Web server and process it, so the batch file embeds only the getStats.pl program itself. The program's contents are the part that reads:

               fileData = <2321202f 7573722f 62696e2f 7065726c 202d770a 23206765 74537461 74732e70   
                                ...
                                2020207d 0a0a7d0a 0a6d6169 6e28293b 0a>;

The contents of the file are written in hexadecimal, in groups of 4 bytes (8 hexadecimal digits) separated by spaces.
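
To see what the hexadecimal stands for, the groups can be decoded back into bytes. The following is a minimal sketch, assuming Python 3; the helper name decode_file_data is ours and is not part of XGrid. It decodes the first five groups of the listing above, which turn out to be the first line of the Perl script.

```python
def decode_file_data(hex_groups):
    """Turn the space-separated hex groups of a fileData field into bytes."""
    # Drop all whitespace first so that line breaks and spacing are irrelevant.
    hex_digits = "".join(hex_groups.split())
    return bytes.fromhex(hex_digits)


# First five 4-byte groups of the getStats.pl listing shown above.
first_groups = "2321202f 7573722f 62696e2f 7065726c 202d770a"
print(repr(decode_file_data(first_groups).decode("ascii")))
# prints '#! /usr/bin/perl -w\n'
```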

Generating Hexadecimal Content

We can generate the hexadecimal description of the file ourselves with the hexdump command, which is included in most Linux distributions and in Mac OS X. The command is:

  hexdump  -v -e ' "" 4/1 "%02x" " "'  targetFileName

Here's an example, followed by the output it produces:

  echo "hello world" > dummy
  hexdump  -v -e ' "" 4/1 "%02x" " "'  dummy

  68656c6c 6f20776f 726c640a
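
Where hexdump is not available, the same grouping can be produced directly in Python. This is only a sketch, assuming Python 3; the function name to_plist_hex is hypothetical, and it mimics just the output of the format string used above, not hexdump in general. A final group shorter than four bytes is emitted as is.

```python
def to_plist_hex(data, group_size=4):
    """Return the bytes of `data` as hex digits, one space between 4-byte groups."""
    groups = [data[i:i + group_size].hex()
              for i in range(0, len(data), group_size)]
    return " ".join(groups)


# Same content that `echo "hello world" > dummy` writes (echo appends a newline).
print(to_plist_hex(b"hello world\n"))
# prints 68656c6c 6f20776f 726c640a
```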

A Python Script

Now that we know how to generate the hexadecimal version of any file, generating a batch file in PList format from a program file and a data file is straightforward.

The Python script below shows how to generate a complete batch file for one program and one data file.

For example, to create a batch file that runs the program getStats.pl on the PDB file pdb100d.ent and to submit it to the XGrid, simply type:

chmod +x makeBatch.py                     (needed only once, to make file executable)

makeBatch.py getStats.pl pdb100d.ent > batch.plist

xgrid -job batch batch.plist | getXGridOutput.py

where getXGridOutput.py is a script we generated in an earlier programming example (see Running a C Program on the XGrid), and which gathers the results and prints them out automatically when the XGrid job is finished.


#! /Library/Frameworks/Python.framework/Versions/Current/bin/python
# D. Thiebaut
#
# Syntax:
#        makeBatch.py programName dataFileName > batchFile
#
# a simple python program that generates an XGrid batch file whose
# contents are shown below.  It uses the hexdump utility to generate
# the hexcode that is used by plist files to represent the contents of
# program or data files.


import sys
import subprocess

batch = """
{
    jobSpecification =     {
        applicationIdentifier = "com.apple.xgrid.cli";
        inputFiles = {
            "%s" = { fileData = <%s>; };
            "%s" = { fileData = <%s>; isExecutable = YES; };
        };
        name = "%s";
        schedulerHints =  { 0 = mathgrid9; };
        submissionIdentifier = abc;
        taskSpecifications = {
            0 = { arguments = ( "%s" );  command = "%s";  };
        };
    };
}
"""

def getPlistFormat( fileName ):
    # The popen2 module no longer exists in Python 3, so use subprocess.
    # Passing the arguments as a list runs hexdump without a shell, which
    # keeps file names containing spaces or quotes intact.
    output = subprocess.check_output(
        [ "hexdump", "-v", "-e", ' "" 4/1 "%02x" " "', fileName ] )
    return output.decode( "ascii" ).strip()

def main():
    # both a program name and a data file name are required
    if len( sys.argv ) < 3:
        print( "Syntax: makeBatch.py programName dataFileName" )
        sys.exit( 1 )


    progName = sys.argv[ 1 ]
    dataFileName = sys.argv[ 2 ]

    progPList = getPlistFormat( progName )
    dataPlist = getPlistFormat( dataFileName )

    if progName.find( "/" )!= -1:
        progName = progName.split( '/' )[-1]

    if dataFileName.find( "/" )!= -1:
        dataFileName = dataFileName.split( '/' )[-1]
    
    #--- create a new batch file ---
    newBatch = batch % ( dataFileName, dataPlist,
                         progName, progPList,
                         progName, dataFileName,
                         progName )

    #--- print the new batch file to stdout ---
    print( newBatch )

if __name__ == "__main__":
    main()
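
Before submitting a generated batch file, it can be useful to check that the embedded files decode back to the originals. The sketch below is one possible way to do that, assuming Python 3 and the `"name" = { fileData = <...>` layout shown on this page; the function name embedded_files and the regular expression are ours, and this is not a general PList parser.

```python
import re

# Matches:  "name" = { fileData = <hex groups>
ENTRY = re.compile(r'"([^"]+)"\s*=\s*\{\s*fileData\s*=\s*<([^>]*)>')


def embedded_files(batch_text):
    """Map each inputFiles name in a batch file to its decoded bytes."""
    files = {}
    for name, hex_groups in ENTRY.findall(batch_text):
        # Remove spaces and line breaks, then convert the hex digits to bytes.
        files[name] = bytes.fromhex("".join(hex_groups.split()))
    return files


# A tiny hand-made fragment for illustration, not real makeBatch.py output.
sample = '''
inputFiles = {
    "dummy" = { fileData = <68656c6c 6f20776f 726c640a>; };
};
'''
print(embedded_files(sample))
# prints {'dummy': b'hello world\n'}
```

In practice one would read batch.plist into a string, call embedded_files on it, and compare each entry with the bytes of the corresponding file on disk.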