Automated Composition of Big Data services


Workshop Title

Automated Composition of Big Data services


Workshop Abstract

Big Data applications involve multiple stages (collection and cleaning, data lake creation and management, analytics parallelization, etc.), each comprising multiple components, so coping with a growing number of services in a flexible and efficient manner is becoming an increasingly critical issue.

Semantic Web metadata promises to enable greater access not only to content but also to Big Data services. Users and software agents can then discover, invoke, compose, and monitor Big Data resources that offer services with particular properties, and should be able to do so with a high degree of automation if desired.

The use of shared ontologies increases interoperability between different service platforms, and the additional abstraction layer of proprietary ontologies will facilitate the integration of heterogeneous data sets.

Ontologies may also enable formal consistency verification and advanced queries over Big Data compositions. Therefore, we argue that Semantic Web metadata is ideally suited to support heterogeneous Big Data environments.

This workshop presents the TOREADOR approach to the automated composition of Big Data services, including data preparation, representation and storage, analytics parallelization, and visualization.

In particular, we focus on solutions based on OWL-S (formerly DAML-S), an ontology of services that makes these functionalities possible.



Workshop Outline

  1. Big Data environments as Services platforms
    • The stages of a Big Data application: preparation, storage (data lake and streams setup), analytics, display
  2. Introduction to Service selection
    • Invoking individual services: constructing valid messages based on the published signature/interface of a service
    • Library invocation via glue code
  3. Negotiating SLAs & communications
    • Protocols used for agreeing upon non-functional objectives
  4. Composing Services
    • Using workflows to achieve goals based on available Services/Agents
  5. Process models
    • Atomic (functional) or composite (conversational)
    • Determining what to expose
      1. Just interaction points
      2. Additional process information for reasoning
  6. Describe the profile and advertise
    • OWL-S ontologies for service description
    • Sample of metadata exposing input & outputs, preconditions and effects
  7. Bind to a transport mechanism via OWL-S grounding
    • Provide (or augment an existing) WSDL document and bind to it
    • From OWL-S to an executable workflow language
  8. The TOREADOR platform
    • Sample OWL-S descriptions
    • Executable workflows for Big Data analytics (BDA)
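
To give a flavour of items 4 to 6, the following is a minimal sketch of input/output-driven composition, not the TOREADOR implementation. It assumes each service is described only by the concepts it consumes and produces (a heavily simplified stand-in for an OWL-S profile, ignoring preconditions and effects). The class `ServiceProfile`, the function `compose`, and all service and concept names are hypothetical and chosen to mirror the stages in item 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceProfile:
    """Toy stand-in for a service profile: a name plus input and output concepts."""
    name: str
    inputs: frozenset
    outputs: frozenset


def compose(services, available, goal):
    """Greedy forward chaining over service descriptions.

    Repeatedly picks the first service whose inputs are all available and adds
    its outputs to the known concepts, until the goal concept is produced.
    Returns the ordered list of service names, or None if the goal is unreachable.
    """
    known = set(available)
    plan = []
    remaining = list(services)
    while goal not in known:
        runnable = [s for s in remaining if s.inputs <= known]
        if not runnable:
            return None  # no remaining service can fire
        step = runnable[0]
        plan.append(step.name)
        known |= step.outputs
        remaining.remove(step)
    return plan


# Hypothetical catalogue, deliberately listed out of order.
CATALOGUE = [
    ServiceProfile("display", frozenset({"Model"}), frozenset({"Dashboard"})),
    ServiceProfile("analytics", frozenset({"StoredData"}), frozenset({"Model"})),
    ServiceProfile("storage", frozenset({"CleanData"}), frozenset({"StoredData"})),
    ServiceProfile("preparation", frozenset({"RawData"}), frozenset({"CleanData"})),
]

plan = compose(CATALOGUE, {"RawData"}, "Dashboard")
print(plan)
```

Starting from raw data, the sketch chains the four stages in dependency order regardless of how the catalogue is sorted. A greedy strategy like this may include services that are not strictly needed for the goal; a real composer would also reason over preconditions, effects, and non-functional constraints such as SLAs.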


Workshop Organizers





Dr. Ernesto Damiani

Full Professor at the Department of Computer Science and Research Coordinator at SESAR Lab, Università degli Studi di Milano.


Dr. Sadegh M. Astaneh

PhD, Senior Researcher at SESAR Lab, Università degli Studi di Milano.