=================
Process generator
=================

Experimental feature and unofficial extension to the CWL standards.

A process generator is a CWL Process type that executes a concrete CWL
process (CommandLineTool, Workflow or ExpressionTool) which produces
CWL files as output, then executes the CWL that was generated. The
intention is to have a formalized way to express a pre-processing or
bootstrapping step in which a CWL description is generated by another
program (such as from a template, or conversion from another workflow
language).

The ProcessGenerator is a subtype of CWL process, so it must define
its inputs and outputs. The "run" field is similar to the "run" field
of a workflow step -- it specifies a tool to run that will create new
CWL as output.

.. code:: yaml

  - name: ProcessGenerator
    type: record
    inVocab: true
    extends: cwl:Process
    documentRoot: true
    fields:
      - name: class
        jsonldPredicate:
          "_id": "@type"
          "_type": "@vocab"
        type: string
      - name: run
        type: [string, cwl:Process]
        jsonldPredicate:
          _id: "cwl:run"
          _type: "@id"
          subscope: run
        doc: |
          Specifies the process to run.

Process generator example (pytoolgen.cwl):

.. code:: yaml

  #!/usr/bin/env cwl-runner
  cwlVersion: v1.0
  $namespaces:
    cwltool: "http://commonwl.org/cwltool#"
  class: cwltool:ProcessGenerator
  inputs:
    script: string
    dir: Directory
  outputs: {}
  run:
    class: CommandLineTool
    inputs:
      script: string
      dir: Directory
    outputs:
      runProcess:
        type: File
        outputBinding:
          glob: main.cwl
    requirements:
      InlineJavascriptRequirement: {}
      cwltool:LoadListingRequirement:
        loadListing: shallow_listing
      InitialWorkDirRequirement:
        listing: |
          ${
            var v = inputs.dir.listing;
            v.push({entryname: "inp.py", entry: inputs.script});
            return v;
          }
    arguments: [python, inp.py]
    stdout: main.cwl

The process generator has two required inputs: "script" and "dir". It
runs the command line tool listed inline in "run" with the input
object, which is required to have those parameters.
Note: the input object may contain additional parameters which are
intended for the generated CWL when it is executed.

The command line tool populates the working directory using
InitialWorkDirRequirement. It uses the listing from "dir" and adds a
new file literal called "inp.py" which contains the text from the
input parameter "script". Then it runs "python inp.py".

The output of this command line tool is the File parameter
"runProcess". In this example, the "inp.py" script, when run, is
expected to print the CWL description to standard output, which will
be captured in the "runProcess" output parameter.

Next, the ProcessGenerator will load the file in the "runProcess"
parameter, which in this example is "main.cwl". Finally, it will
execute the process with the input object that was originally provided
to the process generator. The output of the generated script is used
as the output for ProcessGenerator as a whole.

Here's an example (zing.cwl) that uses pytoolgen.cwl.

.. code:: yaml

  #!/usr/bin/env cwltool
  {cwl:tool: pytoolgen.cwl, script: {$include: "#attachment-1"}, dir: {class: Directory, location: .}}
  --- |
  import os
  import sys
  print("""
  cwlVersion: v1.0
  class: CommandLineTool
  inputs:
    zing: string
  outputs: {}
  arguments: [echo, $(inputs.zing)]
  """)

The first line ``#!/usr/bin/env cwltool`` means that this file can be
given the executable bit (+x) and then run directly.

This is a multi-part YAML file. The first section is a CWL input
object.

The input object uses "cwl:tool" to indicate that this input object
should be used as input to execute "pytoolgen.cwl".

The parameter ``script: {$include: "#attachment-1"}`` takes the text
from the second part of the file (following the YAML division marker
``--- |``) and assigns it as a string value to "script".

The "dir" parameter is not doing much in this example, but by
capturing the whole directory it allows the Python script to refer to
files in the current directory.
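To make the generation stage more concrete, the following sketch
mimics in plain Python what the inline command line tool in
pytoolgen.cwl is described as doing with the "script" value: write it
to "inp.py" in a fresh working directory, run Python on it, and
collect standard output as the generated CWL text. This is only an
illustration and not how ``cwltool`` implements the feature; the
function name ``generate_cwl`` and the ``SCRIPT`` constant are
hypothetical, and staging of the "dir" listing is left out.

.. code:: python

  import subprocess
  import sys
  import tempfile
  from pathlib import Path


  def generate_cwl(script_text):
      """Write script_text to inp.py, run it, and return what it printed."""
      with tempfile.TemporaryDirectory() as workdir:
          # Stands in for the InitialWorkDirRequirement file literal "inp.py".
          (Path(workdir) / "inp.py").write_text(script_text)
          # Stands in for "arguments: [python, inp.py]"; the captured stdout
          # plays the role of "stdout: main.cwl" in the real tool.
          result = subprocess.run(
              [sys.executable, "inp.py"],
              cwd=workdir,
              capture_output=True,
              text=True,
              check=True,
          )
          return result.stdout


  # Same script body as the second part of zing.cwl (unused imports omitted).
  SCRIPT = '''
  print("""
  cwlVersion: v1.0
  class: CommandLineTool
  inputs:
    zing: string
  outputs: {}
  arguments: [echo, $(inputs.zing)]
  """)
  '''

  if __name__ == "__main__":
      print(generate_cwl(SCRIPT))

The returned text corresponds to the contents of "main.cwl". Loading
that description and executing it with the original input object is
the part the ProcessGenerator itself takes care of, so it is not
modelled here.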
In zing.cwl the script simply prints CWL as a string, but it could of
course do something much more complex: generate code from a template,
select among several possible workflows based on the input, convert
from another workflow language, etc.

When this is executed, the following steps happen:

#. pytoolgen.cwl is loaded and executed with the first part of the
   file as the input object.
#. The "script" parameter contains the contents of the second part.
   The inline command line tool creates a file called "inp.py" with
   the contents of "script".
#. The inline command line tool runs python on "inp.py" and collects
   the output, which is the CWL description for a trivial "echo" tool.
#. It loads the CWL description and executes it with any additional
   parameters declared in the input object or command line.

Example runs
------------

Note: this requires the ``cwltool`` flags ``--enable-ext`` and
``--enable-dev``.

You can set these with the environment variable ``CWLTOOL_OPTIONS``.

.. code::

  $ export CWLTOOL_OPTIONS="--enable-dev --enable-ext"
  $ ./zing.cwl
  INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758
  INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl'
  INFO [job d3626216-d7d8-4322-bc21-4d469634cc9a] /tmp/8sez90gb$ python \
      inp.py > /tmp/8sez90gb/main.cwl
  INFO [job d3626216-d7d8-4322-bc21-4d469634cc9a] completed success
  usage: ./zing.cwl [-h] --zing ZING [job_order]
  ./zing.cwl: error: the following arguments are required: --zing

.. code::

  $ ./zing.cwl --zing blurf
  INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758
  INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl'
  INFO [job a580b69d-2b88-4268-904e-ed105ba7c85e] /tmp/ujff239o$ python \
      inp.py > /tmp/ujff239o/main.cwl
  INFO [job a580b69d-2b88-4268-904e-ed105ba7c85e] completed success
  INFO [job main.cwl] /tmp/f_7bxncq$ echo \
      blurf
  blurf
  INFO [job main.cwl] completed success
  {
      "runProcess": {
          "location": "file:///home/peter/work/cwltool/tests/wf/generator/main.cwl",
          "basename": "main.cwl",
          "class": "File",
          "checksum": "sha1$8c160b680fb2cededef3228a53425e595b8cdf48",
          "size": 111,
          "path": "/home/peter/work/cwltool/tests/wf/generator/main.cwl"
      }
  }
  INFO Final process status is success

.. code::

  $ echo "zing: zoop" > job.yml
  $ ./zing.cwl job.yml
  INFO /home/peter/work/cwltool/venv3/bin/cwltool 3.1.20211112163758
  INFO Resolved './zing.cwl' to 'file:///home/peter/work/cwltool/tests/wf/generator/zing.cwl'
  INFO [job 9073a083-dc79-4719-8762-1c024480605c] /tmp/meeo3d19$ python \
      inp.py > /tmp/meeo3d19/main.cwl
  INFO [job 9073a083-dc79-4719-8762-1c024480605c] completed success
  INFO [job main.cwl] /tmp/2pqdz5nq$ echo \
      zoop
  zoop
  INFO [job main.cwl] completed success
  {
      "runProcess": {
          "location": "file:///home/peter/work/cwltool/tests/wf/generator/main.cwl",
          "basename": "main.cwl",
          "class": "File",
          "checksum": "sha1$8c160b680fb2cededef3228a53425e595b8cdf48",
          "size": 111,
          "path": "/home/peter/work/cwltool/tests/wf/generator/main.cwl"
      }
  }
  INFO Final process status is success
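
The script embedded in zing.cwl prints a fixed description. As a rough
sketch of a generator that builds its output from data instead, the
script below constructs the same kind of "echo" tool as a dictionary
and prints it as JSON. It relies on the assumption that the generated
description may be written in JSON, which the CWL standards allow
alongside YAML. The helper name ``make_echo_tool`` is hypothetical and
not part of ``cwltool``.

.. code:: python

  import json


  def make_echo_tool(param_name):
      """Return a CommandLineTool description that echoes one string input."""
      return {
          "cwlVersion": "v1.0",
          "class": "CommandLineTool",
          "inputs": {param_name: "string"},
          "outputs": {},
          # Parameter reference in the same form used in zing.cwl.
          "arguments": ["echo", "$(inputs.%s)" % param_name],
      }


  if __name__ == "__main__":
      # The process generator captures standard output as main.cwl.
      print(json.dumps(make_echo_tool("zing"), indent=2))

Used as the second part of a file like zing.cwl, a script along these
lines should produce a tool equivalent to the one in the runs above;
changing the argument to ``make_echo_tool`` would change the name of
the parameter the generated tool requires.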