eResearch Collaboration Projects: supporting CSIRO's digital science and research

2020-03-10T03:59:51Z (GMT) by John Zic, Justin Baker
Background
CSIRO is Australia’s largest research agency and is a recognised leader in a diverse set of science domains: Agricultural Sciences, Environment/Ecology, Plant and Animal Sciences, Geosciences, Chemistry and Materials Science. CSIRO also manages research infrastructure like the Australia Telescope National Facility (ATNF), the Marine Research Vessel RV Investigator and the Pawsey Supercomputing Centre.

For many years in Australia, and also worldwide [2], research and science have undergone transformational changes with the introduction of new instruments and advanced facilities with matching increases in storage and computing capabilities. Individual researchers were taking a bespoke approach to matching these technologies and capabilities to the way that research and science were carried out. Wider adoption of new practices required social change (in the practice of science and research) and these changes remained fragmented and tailored to specific sciences or even projects. Organisations, by and large, varied enormously in their support of these new practices.

As far back as 2007 [1], CSIRO eResearch practitioners advocated that science and research practices within CSIRO adapt to deal with these challenges. Much like the rest of the world, practices matured over the years: in CSIRO’s health and biosecurity, oceanographic and atmospheric research, radio astronomy, agriculture and food as well as geological and other earth sciences.

However, a significant shift occurred in 2018, when the CSIRO Board formally recognised the need to support the new “digital” science and research at an organisational level. CSIRO developed strategic digital transformation initiatives, including CSIRO’s Managed Data Ecosystem (MDE), Missions and the Digital Academy [4].

The aim of the MDE is to connect current and new platforms in a seamless way and improve interoperability between datasets so users will be able to easily find and work on multiple datasets. It will provide a set of tools and approaches enabling CSIRO and partners to improve our collaboration, mining and analysis of data.

CSIRO Missions are major scientific and collaborative research programs aimed at making significant breakthroughs in one of six major challenges facing Australia: resilient and valuable environments, food security and quality, health and well-being, future industries, sustainable energy and resources, and regional security.

CSIRO's Digital Academy is focused on investing in the digital capability of our staff and involves a rethink in planning for a digitally driven research environment. It provides a learning opportunity for our staff, helping define the digital talent, skills and new ways of working. The Academy will help attract and retain new digital talent within the Australian innovation system, develop new digital skills and mindsets in Australia’s scientists and facilitate digital talent accessibility and collaboration across Australia’s innovation system.

Existing Support for “Digital” Science through “eResearch” initiatives
CSIRO Scientific Computing Services group has been providing a dedicated eResearch service since 2011 [3]. This service is delivered through “eResearch Collaboration Projects” (eRCPs), which now deliver specialist capabilities including Machine Learning, Data Analytics, Scientific Visualisation, Workflow Management and Science Data Handling into research and science projects.

The eRCP process is run as a competitive grant process and continues to be very successful.

In the latest cycle, forty Scientific Computing Services specialists successfully completed and delivered over sixty eRCP outcomes from a total of eighty submissions. The underlying capabilities are delivered by members from each of the teams in the Scientific Computing Services group: Technical Solutions; Data Analytics and Visualisation; Research Software Engineering; and Modelling and Dataflow. The eRCP process also provides a mechanism to promote and introduce new tools and frameworks to CSIRO’s research community, e.g. Jupyter and R/Shiny.

Specialists from the Scientific Computing program are then assigned to work on one or more approved eRCPs. Over the six-month cycle, the resource allocation is around 0.2 FTE, with each staff member allocated three eRCPs per cycle. Importantly, eRCPs are provided to CSIRO researchers and scientists at no additional charge.

The eRCP program has been enormously successful over the years, with demand outstripping the capacity to allocate staff to projects. It has delivered a range of useful outcomes, including, for example, an augmented reality tool for analysing bushfire plumes over Tasmania; a dashboard to interrogate cotton crop physiological measurements; and an online platform to monitor algal blooms across multiple water bodies.

Scientific Computing specialists also provide dedicated support to CSIRO researchers, based around the same set of core capabilities, via an entirely separate funding model known as “pan deployments”, as well as through secondments. In both cases, CSIRO projects fund the specialists’ time at larger allocations, often extending over 12 months or more. In a sense, this acts like a contractor service for Business Units, providing them with highly specialised skills without the need to recruit new staff of their own.

Future Plans
CSIRO Scientific Computing will respond to the major initiatives (MDE, Digital Academy and Missions) as follows:

MDE
- Redirect Scientific Computing expertise currently working on eRCPs and pan deployments to MDE related activities. In the first instance, these specialists will apply their skills and domain knowledge to one of several nominated pilots, helping design and build foundational components of the MDE.
- Over time, it is anticipated that those same specialists will contribute to the ongoing development and enhancement of additional MDE components in line with its progressive organisational rollout.

Digital Academy
- Develop/adapt training content as appropriate for the Digital Academy, for example by making use of existing Software Carpentry material for HPC usage while customising appropriate aspects for our own computing environment.
- Deliver training content to CSIRO staff. This has already proven very successful in the machine learning area, with hundreds of staff attending sessions, and is expected to continue to grow over time.

Missions
- Scientific Computing will continue to provide CSIRO researchers with the eResearch support they need to tackle the significant scientific challenges addressed by the Missions.

REFERENCES
1. J. A. Taylor, J. Zic, and J. Morrissey, “Building CSIRO e-Research Capabilities,” in eResearch Australasia, 2008.

2. T. Hey, S. Tansley, and K. Tolle, “The Fourth Paradigm: Data-Intensive Scientific Discovery,” Microsoft Research, 2009.

3. S. Moskwa, “The Accelerated Computing Initiative,” in eResearch Australasia, 2012.

4. CSIRO Chief Executive's Report 2018-19: https://www.csiro.au/en/About/Ourimpact/Reporting-our-impact/Annual-reports/18-19-annual-report/part-1/chiefexecutive-report

ABOUT THE AUTHOR(S)
Dr John Zic is the Executive Manager of CSIRO’s Scientific Computing Services.

Mr Justin Baker is Leader of the Scientific Computing Data Analytics and Visualisation Team.

License

CC BY 4.0