D3.1.1: An architecture for provenance systems

2006, University of Southampton, …

Abstract

The paper proposes an architecture for provenance systems, emphasizing a hierarchical structure called the p-structure for process documentation that facilitates the retrieval of p-assertions. It details functionality for recording and querying p-assertions through an interface based on a stateless recording protocol (PReP), ensuring that documentation remains unchanged and reflects original execution. The architecture addresses various operational requirements, including access control and identity management, crucial for effective data retrieval and documentation in heterogeneous environments.
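The recording and querying behaviour summarised above can be illustrated with a minimal sketch. It is not the PReP protocol or the report's actual interface: the names `PAssertion`, `ProvenanceStore`, `record`, and `query`, and the fields `interaction_key`, `asserter`, and `local_id`, are all hypothetical. The sketch assumes that the p-structure can be approximated by nesting p-assertions under an interaction and then under the asserting actor, that "stateless" means each recording call carries everything the store needs, and that "remains unchanged" means a recorded p-assertion is never overwritten.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PAssertion:
    """One immutable statement about a step of a process (hypothetical fields)."""
    interaction_key: str  # assumed: identifies the interaction being documented
    asserter: str         # assumed: the actor making the assertion
    local_id: str         # assumed: unique per asserter within an interaction
    content: str


@dataclass
class ProvenanceStore:
    """Append-only store; the nested dicts stand in for a hierarchical p-structure."""
    _tree: Dict[str, Dict[str, Dict[str, PAssertion]]] = field(default_factory=dict)

    def record(self, pa: PAssertion) -> bool:
        """Store one p-assertion; each call is self-contained.

        Returns False, leaving the store untouched, if the slot is already taken.
        """
        view = self._tree.setdefault(pa.interaction_key, {}).setdefault(pa.asserter, {})
        if pa.local_id in view:
            return False  # existing documentation is never overwritten
        view[pa.local_id] = pa
        return True

    def query(self, interaction_key: str, asserter: Optional[str] = None) -> List[PAssertion]:
        """Walk the hierarchy: one interaction, optionally narrowed to one asserter."""
        views = self._tree.get(interaction_key, {})
        selected = views.values() if asserter is None else [views.get(asserter, {})]
        return [pa for view in selected for pa in view.values()]
```

Keying the hierarchy on the interaction first means a query for everything documented about one exchange is a single lookup, and rejecting duplicates rather than replacing them keeps the stored documentation as originally recorded.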

Copyright © 2005, 2006 by the PROVENANCE consortium. The PROVENANCE project receives research funding from the European Commission's Sixth Framework Programme.