7th International Provenance and Annotation Workshop

9th - 10th July 2018

Keynote: From Workflows to Provenance and Reproducibility: Looking Back and Forth

Computational notions of data provenance have been studied in different contexts such as databases, programming languages, and scientific workflows. While the different communities overlap to some extent, much of the research has been conducted independently, with limited cross-fertilization and often without explicit recognition of the different assumptions, perspectives, and problems under investigation. In this talk, I will trace some of the origins, research questions, approaches, and results on provenance, with the aim of highlighting what’s similar and what’s different in the various subareas and communities. Based on an understanding of the past, we can also aim to better understand the present research challenges and focus on important new problems, e.g., the use of provenance to support reproducibility in science – provided we agree on what we mean by reproducibility and provenance, respectively. Thus, this “Tour de Provenance” will include various “stops” and revisit some of the conceptual foundations, as well as possible sources of confusion due to the lack of a common terminology or shared understanding about provenance and reproducibility. I will conclude by venturing to look back and forth and suggest future research questions and opportunities in provenance.

Bertram Ludäscher is a professor at the School of Information Sciences at the University of Illinois, Urbana-Champaign, and directs the Center for Informatics Research in Science and Scholarship (CIRSS). He is also a faculty affiliate with the National Center for Supercomputing Applications (NCSA) and the Department of Computer Science at Illinois. Until 2014 he was a professor at the Department of Computer Science at the University of California, Davis. His research interests range from scientific data and workflow management to knowledge representation and reasoning. Until 2004 he was a research scientist at the San Diego Supercomputer Center (SDSC) and an adjunct faculty member at the CSE Department at UC San Diego. He received his M.S. in computer science from the University of Karlsruhe (now part of K.I.T.) and his PhD from the University of Freiburg, Germany.

IPAW Accepted Papers

Shawn Bowers, Timothy McPhillips and Bertram Ludaescher. Validation and Inference of Schema-Level Workflow Data-Dependency Annotations

Jacek Cała and Paolo Missier. Provenance Annotation and Analysis to Support Process Re-Computation

Carlos Sáenz-Adán, Luc Moreau, Beatriz Pérez, Simon Miles and Francisco J. García-Izquierdo. Automating Provenance Capture in Software Engineering with UML2PROV

Joshua Valdez, Matthew Kim, Michael Rueschman, Susan Redline and Satya Sahoo. Classification of Provenance Triples for Scientific Reproducibility: A Comparative Evaluation of Deep Learning Models in the ProvCaRe Project

Renan Souza and Marta Mattoso. Provenance of Dynamic Adaptations in User-steered Dataflows

Belfrit Victor Batlajery, Mark Weal, Adriane Chapman and Luc Moreau. Belief Propagation through Provenance Graphs

Iman Naja and Nicholas Gibbins. Using Provenance to Efficiently Propagate SPARQL Updates on RDF Source Graphs

Benjamin Ujcich, Adam Bates and William Sanders. A Provenance Model for the European Union General Data Protection Regulation

Michael Johnson, Adriane Chapman, Luc Moreau, Poshak Gandhi and Carlos Sáenz-Adán. Using the Provenance from Astronomical Workflows

Joao Felipe Pimentel, Paolo Missier, Leonardo Murta and Vanessa Braganholo. Versioned-PROV: A PROV extension to support mutable data entities

Elliot Fairweather, Pinar Alper and Vasa Curcin. Simulated Domain-specific Provenance

Abdussalam Alawini, Leshang Chen, Susan Davidson, Stephen Fisher and Junhyong Kim. Discovering Similar Workflows via Provenance Clustering: a Case Study


