There is little doubt that we have entered an era where digital data underpins modern science and wider research endeavours. To support this, numerous infrastructures have been designed and built to store these data, ranging from proprietary on-premises systems through to commercial clouds and hybrids of both. Such implementations provide a range of functions during the research lifecycle, from provisioning and cataloguing data assets through to storing and presenting data to computing platforms.
These functions require advanced data infrastructures that can respond to increasing demands for high performance and scale, and that support rich access models for increasingly complex data. Importantly, research data workflows differ from traditional enterprise patterns of data movement, transactions, and growth. Conventional corporate storage systems are typically not fit for purpose as research storage systems, from either a performance or a business process perspective.
To date, the implementation of research data platforms has largely advanced in an ad hoc way, often driven by the urgent need to deliver operational infrastructure within a constrained budget, and sometimes driven more by what is available in the market than by what would provide a powerful, flexible, and extensible system.
In recent work we have proposed and documented an abstract Research Data Reference Architecture (RDRA), which serves as a framework for guiding and classifying real-world systems, and a Research Data Implementation Architecture (RDIA) to guide implementers. This workshop is designed both to build a record of real research data infrastructures and to measure them against the RDRA. This will both validate the RDRA and provide a practical resource for implementers.
This workshop is a continuation of a very successful AeRO Forum held at SCA 2024.
Steering Committee
| David Abramson | University of Queensland |
| Jake Carroll | University of Queensland |
| Bronis R. de Supinski | Lawrence Livermore National Laboratory |
| Osamu Tatebe | University of Tsukuba |
| Manish Parashar | University of Utah |
Venue
- Sands Expo and Convention Centre, Marina Bay Sands, Singapore
- Room O6 – Orchid Jr 4312 (Level 4)
Schedule
| Time | Agenda Item | Person Responsible | Institution | Talk Title | Remote Y/N? |
| 09:00-09:20 | The RDRA and RDIA summary; workshop structure. | David Abramson | University of Queensland (UQ) | | N |
| 09:20-09:40 | Speaker #1 | Beth Holtz | Princeton/Tigerdata | TigerDATA | Y |
| 09:40-10:00 | Speaker #2 | John Westlund | LLNL | Livermore Computing Research Data Reference Architecture Alignment | N |
| 10:00-10:20 | Speaker #3 | Dieter Kranzlmüller | Leibniz Supercomputing Centre (LRZ) | The German National Research Data Initiative: A view from the Leibniz Supercomputing Centre (LRZ) | N |
| 10:30-11:00 | Morning Tea Break – Bayview Foyer | | | | N |
| 11:00-11:20 | Speaker #4 | Osamu Tatebe | Tsukuba | Documenting and Classifying Research Data Storage Infrastructures at HPCI | N |
| 11:20-11:40 | Speaker #5 | Jake Carroll | UQ | UQRDM alignment to the RDIA. | N |
| 11:40-12:00 | Speaker #6 | Jess David Tate | Utah U | Utah’s Regional Approach to Research Data Management | Y |
| 12:00-12:20 | Speaker #7 | Chris Maestas | IBM | IBM Global Data Platform for Research Data Storage Reference Architecture | N |
| 12:30-13:30 | Lunch | ||||
| 13:30-13:50 | Speaker #8 | Julia Gusakova | Microsoft | TBA | N |
| 13:50-14:10 | Speaker #9 | Leslie Almberg | Arcitecta/UoM | Research Data Implementation Architecture – University of Melbourne, Mediaflux Magic | |
| 14:10-14:30 | Speaker #10 | Jeffrey Tay | VAST | VAST Alignment to the RDRA | N |
| 14:30-14:50 | Speaker #11 | Werner Scholz | Xenon | Disaggregated RDSS using VAST Data and Versity ScoutAM | N |
| 15:00-15:30 | Afternoon tea | ||||
| 15:30-15:50 | Speaker #12 | Rizwan Sathakkathulla | UNSW | UNSW Research Storage Infrastructure – current and conceptual target state | N |
| 15:50-16:10 | Speaker #13 | Ikki Fujiwara | NII | Japan’s common infrastructure for academic data management, publication and discovery | N |
| 16:10-16:30 | Speaker #14 | Chris Schlipalius | Pawsey | Long Term Storage Research Data Infrastructure at The Pawsey Supercomputing Research Centre | N |
| 16:30-16:50 | Speaker #15 | Rachana Ananthakrishnan | Globus/UoC | Globus: platform for data driven research | N |
| 16:50-17:10 | Speaker #16 | Paul Hiew | NSCC | Introduction to NSCC’s systems and processes | N |
| 17:10-17:30 | Speaker #17 | Shinji Kikuchi | RIKEN | Case Study of R2DMS in RIKEN | Y |
| 17:30-18:00 | Summary discussion, thank you to all participants, wrap up and next steps. | David Abramson, Jake Carroll | UQ | | N |