Service Based Support For Scientific Workflows

Title: Service Based Support For Scientific Workflows
Author: Chandra, Sandeep
Advisors: Dr. Mladen A. Vouk, Committee Chair
Dr. Peter R. Wurman, Committee Member
Dr. Munindar P. Singh, Committee Member
Abstract: A Problem Solving Environment (PSE) is a computer-based system that provides all the computational facilities necessary to solve a target class of problems; that is, it efficiently supports a specific set of scientific workflows. A special class of PSEs consists of those that rely on networks. Network-based PSEs are collections of distributed applications, interfaces, libraries, databases, programs, tools, clients, and intelligent agents, which facilitate user interaction and cooperative execution of the components charged with the solution tasks. Effective and efficient communication among the PSE components is therefore essential. This need has resulted in a proliferation of communication building blocks, or middleware, for distributed scientific computing and problem solving. The most recent, and quite promising, option is the availability of network-based services, sometimes called Web Services. This work evaluates the feasibility, usability, and effectiveness of service-based support for scientific workflows. A successful open-source proof-of-concept architecture and framework was developed and assessed using a bioinformatics scientific problem-solving workflow. Access to the data, computations, and user interfaces is based on the services architecture and on standards such as XML for data descriptions, WSDL for service descriptions, SOAP for service delivery, and UDDI for service registration and brokering. This service-oriented approach facilitates integration of disparate data acquisition and analysis applications that participate in complex scientific problem-solving processes. The developed framework, called the Scientific Data Management Service Workflow System, or SDMSWS, was found to be sufficiently flexible and versatile to allow appliance-like composition and use of standard-conforming workflow services.
An open-source solution, such as SDMSWS, enables services from different organizations, platforms, and domains to interoperate seamlessly through standard interoperability protocols. This is contrasted with similar systems that implement proprietary service solutions. In order to compare SDMSWS with other systems that support scientific problem-solving workflows, it was necessary to develop a set of measures. They include measures related to end-user issues, workflow issues, services issues, networking issues, and a variety of more general issues. The selection and definition of the metrics, and the rationale behind them, are discussed. The comparative analysis reveals that the open-source nature of SDMSWS has some distinct advantages. For example, the framework minimizes the customization and integration effort required for new component workflow services, and, unlike commercial solutions, it is relatively independent of the service domains and can incorporate any network-based service that conforms to standard Web Services descriptions and protocols. It is the finding of this work that complex distributed scientific workflows can be, and should be, supported using open-source service-based solutions. However, current standards for workflow description, interchange, and execution were found wanting, and further work will be needed before one can depend on them in practical scientific workflow environments.
Date: 2002-12-27
Degree: MS
Discipline: Computer Science

Files in this item

Files Size Format View
etd.pdf 428.4Kb PDF View/Open
