
Documenting and Classifying Research Data Infrastructure

There is little doubt that we have entered an era where digital data underpins modern science and wider research endeavours. To support this, numerous infrastructures have been designed and built to store these data, ranging from proprietary on-premises systems through to commercial clouds and hybrids of both. Such implementations provide a range of functions during the research lifecycle, from provisioning and cataloguing data assets through to storing and presenting data to computing platforms.

These functions require advanced data infrastructures that can respond to increasing demands for performance and scale, and that support rich access models for increasingly complex data. Importantly, research data workflows differ from traditional enterprise patterns of data movement, transactions, and growth. As a result, conventional corporate storage systems are typically not fit for purpose as research storage systems, from either a performance or a business process perspective.

To date, the implementation of research data platforms has largely advanced in an ad hoc way, often driven by the urgent need to deliver operational infrastructure within a constrained budget, and sometimes driven more by what is available in the market than by what would provide a powerful, flexible, and extensible system.

In recent work we have proposed and documented an abstract Research Data Reference Architecture (RDRA), which serves as a framework for guiding and classifying real-world systems, and a Research Data Implementation Architecture (RDIA) to guide implementers. This workshop is designed both to build a record of real research data infrastructures and to measure them against the RDRA. Doing so will validate the RDRA and provide a practical resource for implementers.

This workshop is a continuation of a very successful AeRO Forum held at SCA 2024.

Steering Committee

David Abramson, University of Queensland
Jake Carroll, University of Queensland
Bronis R. de Supinski, Lawrence Livermore National Labs
Osamu Tatebe, University of Tsukuba
Manish Parashar, University of Utah

Venue

  • Sands Expo and Convention Centre, Marina Bay Sands, Singapore
  • Room O6 – Orchid Jr 4312 (Level 4)

Schedule

| Time | Agenda Item | Person Responsible | Institution | Talk Title | Remote Y/N? |
| 09:00-09:20 | The RDRA and RDIA summary; workshop structure. | David Abramson | University of Queensland (UQ) | | N |
| 09:20-09:40 | Speaker #1 | Beth Holtz | Princeton/Tigerdata | TigerDATA | Y |
| 09:40-10:00 | Speaker #2 | John Westlund | LLNL | Livermore Computing Research Data Reference Architecture Alignment | N |
| 10:00-10:20 | Speaker #3 | Dieter Kranzlmüller | Leibniz Supercomputing Centre (LRZ) | The German National Research Data Initiative: A view from the Leibniz Supercomputing Centre (LRZ) | N |
| 10:30-11:00 | Morning Tea Break – Bayview Foyer | | | | N |
| 11:00-11:20 | Speaker #4 | Osamu Tatebe | Tsukuba | Documenting and Classifying Research Data Storage Infrastructures at HPCI | N |
| 11:20-11:40 | Speaker #5 | Jake Carroll | UQ | UQRDM alignment to the RDIA. | N |
| 11:40-12:00 | Speaker #6 | Jess David Tate | Utah U | Utah’s Regional Approach to Research Data Management | Y |
| 12:00-12:20 | Speaker #7 | Chris Maestas | IBM | IBM Global Data Platform for Research Data Storage Reference Architecture | N |
| 12:30-13:30 | Lunch | | | | |
| 13:30-13:50 | Speaker #8 | Julia Gusakova | Microsoft | TBA | N |
| 13:50-14:10 | Speaker #9 | Leslie Almberg | Arcitecta/UoM | Research Data Implementation Architecture – University of Melbourne, Mediaflux Magic | |
| 14:10-14:30 | Speaker #10 | Jeffrey Tay | VAST | VAST Alignment to the RDRA | N |
| 14:30-14:50 | Speaker #11 | Werner Scholz | Xenon | Disaggregated RDSS using VAST Data and Versity ScoutAM | N |
| 15:00-15:30 | Afternoon tea | | | | |
| 15:30-15:50 | Speaker #12 | Rizwan Sathakkathulla | UNSW | UNSW Research Storage Infrastructure – current and conceptual target state | N |
| 15:50-16:10 | Speaker #13 | Ikki Fujiwara | NII | Japan’s common infrastructure for academic data management, publication and discovery | N |
| 16:10-16:30 | Speaker #14 | Chris Schlipalius | Pawsey | Long Term Storage Research Data Infrastructure at The Pawsey Supercomputing Research Centre | N |
| 16:30-16:50 | Speaker #15 | Ranchana Anathakrishnan | Globus/UoC | Globus: platform for data driven research | N |
| 16:50-17:10 | Speaker #16 | Paul Hiew | NSCC | Introduction to NSCC’s systems and processes | N |
| 17:10-17:30 | Speaker #17 | Shinji Kikuchi | Riken | Case Study of R2DMS in RIKEN | Y |
| 17:30-18:00 | Summary discussion, thank you to all participants, wrap up and next steps. | David Abramson, Jake Carroll | UQ | | N |
