Constructing Workflows from Script Applications
For programming and executing complex applications on grid infrastructures, scientific workflows have been proposed as convenient high-level alternative to solutions based on general-purpose programming languages, APIs and scripts. GridSpace is a collaborative programming and execution environment, which is based on a scripting approach and it extends Ruby language with a high-level API for invoking operations on remote resources. In this paper we describe a tool which enables to convert the GridSpace application source code into a workflow representation which, in turn, may be used for scheduling, provenance, or visualization. We describe how we addressed the issues of analyzing Ruby source code, resolving variable and method dependencies, as well as building workflow representation. The solutions to these problems have been developed and they were evaluated by testing them on complex grid application workflows such as CyberShake, Epigenomics and Montage. Evaluation is enriched by representing typical workflow control flow patterns.