From 26/11/2025 To 26/11/2025
Starts at 16:00 until 17:00
GHGA virtual lecture series "Advances in Data-Driven Biomedicine" (Christian Fufezan)
- Address: virtual
- Language: English
- Registration necessary: Yes
Christian Fufezan from Heidelberg University and GlaxoSmithKline will speak in the GHGA lecture series "Advances in Data-Driven Biomedicine" on “urgap - unified resource governance and data provenance”.
Abstract:
Data-intensive research now depends on repeated processing of large file collections, yet current practice often duplicates effort and obscures lineage. We describe urgap, a cloud-native framework for file-based data engineering that makes identity and provenance intrinsic to every artifact. To achieve this, urgap relies on location-agnostic data identity, captured by the urgap canonical file signature (ucfs), and on a "Provenance as Code" (PaC) architecture. Outputs carry their own history, enabling safe reuse and automatic skipping of redundant steps across projects and platforms. Furthermore, urgap enables standardized microservices and exposes all encapsulated processes as RESTful endpoints and Model Context Protocol (MCP) servers, the executors of modern agentic AI approaches. I will present urgap, an open-source foundation for file-based data engineering that facilitates standardized data provenance, aligns with FAIR principles, and addresses the increasingly distributed nature of data generation and consumption in rapidly developing environments. It reduces operational costs and environmental impact while enhancing adaptability to emerging technologies and ensuring compatibility across cloud providers and orchestration platforms.
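For readers unfamiliar with the idea of location-agnostic identity, the following is a minimal illustrative sketch in Python. It is not urgap's API and does not reproduce the ucfs definition, neither of which is specified in the abstract. It assumes, purely for illustration, that a file's identity is a SHA-256 hash of its bytes and that a processing step is keyed by its name, parameters, and input signatures. All function names (`content_signature`, `step_key`, `run_once`, `count_bytes`, `demo`) are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def content_signature(path: Path) -> str:
    """Hash the bytes of a file; its name and location play no role."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def step_key(step_name: str, params: dict, input_signatures: list) -> str:
    """Derive a key from what a step does and the content it consumes."""
    record = {"step": step_name, "params": params, "inputs": input_signatures}
    encoded = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def run_once(cache: dict, step_name: str, params: dict, inputs: list, func):
    """Run func only if this (step, params, input content) combination is new.

    Returns (result, reused), where reused is True when the step was skipped.
    """
    key = step_key(step_name, params, [content_signature(p) for p in inputs])
    if key in cache:
        return cache[key], True
    result = func(inputs, **params)
    cache[key] = result
    return result, False


def count_bytes(inputs, multiplier=1):
    """Hypothetical stand-in for an expensive processing step."""
    return sum(p.stat().st_size for p in inputs) * multiplier


def demo():
    """Process two byte-identical files stored at different paths."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "a.txt"
        copy = Path(tmp) / "sub" / "copy.txt"
        copy.parent.mkdir()
        original.write_bytes(b"same content")
        copy.write_bytes(b"same content")

        cache = {}
        first = run_once(cache, "count", {"multiplier": 2}, [original], count_bytes)
        second = run_once(cache, "count", {"multiplier": 2}, [copy], count_bytes)
        same_identity = content_signature(original) == content_signature(copy)
        return same_identity, first, second
```

In this sketch, calling `demo()` processes the first file and then reuses the cached result for the copy, because both files hash to the same signature even though they live at different paths. A real system would also have to persist the cache and record the provenance alongside each output, which this example leaves out.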
Registration
