University of Western Sydney eResearch team plan / roadmap 2013-2015 by Peter Sefton, Peter Bugeia, Andrew Cheetham, Sarah Chaloner and Toby O'Hara. Creative Commons. © University of Western Sydney
About this document
This document represents the plan for eResearch team operations and projects from 2013-2015. It was approved by the eResearch Steering Committee in June 2013.
eResearch, which broadly speaking is the application of advanced IT and communications technology to the pursuit of research, is vital to being competitive in research of all kinds, but particularly in domains where working at internet scale is key to research outcomes. This includes climate change research and a wide variety of other scientific endeavours as well as the Digital Humanities, where UWS is building capability. The eResearch portfolio also embraces basic essential infrastructure for data management and reporting, which is key to researchers' compliance with the Australian Code for the Responsible Conduct of Research, and with predicted changes to funding-body policies. There is basic information on eResearch at UWS at the eResearch web site.
This roadmap also provides an indication of expected growth and uptake of the services described. This will be used as a benchmark to plot progress. Budget requests and expected needs are based on these growth estimates.
The following diagram shows how ITS, eResearch and other stakeholders will support eResearch infrastructure for UWS, with shared generic infrastructure at the bottom, and discipline/project specific tools and services at the top.
The eResearch unit at UWS is currently resourced as a small consultancy reporting to the Pro Vice Chancellor, Research. The team aims to catalyse developments in eResearch, and as such does not take an operational role in providing IT services. Instead it liaises between the major support services for research (Information Technology Services (ITS), the Library and the Office of Research Services) and the research community to drive the adoption of eResearch practices and build eResearch infrastructure.
The eResearch team will continue to play a lead role in driving infrastructure projects in the three major strands of eResearch: data management, computing, and collaboration services. 2013 will see the start of a change-management process to promote the use of eResearch tools and techniques to researchers, as the infrastructure projects come online.
Note: the use of the word services throughout this document emphasises that we are talking about much more than computing infrastructure here. eResearch projects all include integration with institutional processes, service desk, service catalogue etc as appropriate to a core business system.
Relationship to UWS research plan
The proposed eResearch Roadmap (eRR) supports and enables the four objectives of the UWS Research Plan 2012-14.
Objective 1 – Increase external research income to the University
Attracting research income requires a reputation and track record that is visible; Pillar 1 provides access to and exposure of both publications and research data, in the UWS research publications repository [1] and in the ANDS Research Data Australia (RDA) [2]. Automated feeding of data sets to RDA is being provided by the ANDS-funded "Seeding the Commons" project [3].
Objective 2 – Increase the number of fields of research at UWS operating above or well above world standard
Similarly, quality research requires exposure, as in Objective 1. In addition, researchers should be able to concentrate on their research rather than on the tools required to undertake the research, analysis and storage of data. Pillars 1 & 2 support researchers through the application of automated tools (e.g. the ANDS-funded "Data Capture" project [4]) and seamless access to high performance compute facilities with associated data storage access.
Objective 3 – Increase the number and concentration of funded research partnerships
Research partnerships are vital to extend the University's research penetration and impact. Pillar 3 provides the collaborative tools required to enable and support rapid and effective collaboration regardless of geographical and temporal separation. A fine example is the recently acquired NeCTAR grant ($1.3M) to set up and implement a Virtual Laboratory for the multiple partners in the project: Above and Beyond Speech, Language and Music: A Virtual Lab for Human Communication Science (HSCvLAB) [5].
Objective 4 – Ensure UWS attracts and graduates high quality HDR students to its areas of research strength.
HDR candidates now have a much higher level of expectation for modern collaborative tools as well as access to high speed and quality compute and analysis capabilities. Pillars 2 & 3 both provide these tools and services to ensure that UWS presents an attractive and technically advanced research environment for HDR candidates.
Summary of Projects
The summary table below shows infrastructure projects to deliver eResearch services, starting in a staggered fashion, over the period from 2011 to 2015.
Operational support for these developing services will largely rest with ITS, ORS and the Library. During this time, eResearch will expand into areas it is currently not able to address via a new institutional change-management project: promoting eResearch practice throughout the UWS research community, and providing a development service to help groups adopt eResearch tools.
Research and data Repository
Basic storage service available to all UWS
Integrated Service established
Integrated service established across all three infrastructure pillars with staff and student development programs in place
"Virtual Research Environments"
UWS eResearch Culture
Roadmap for eResearch support
In scope for eResearch as of 2013
At this stage the main focus is on operationalising support for eResearch. At current capacity the group can:
- Lead infrastructure projects which will build capability in our stakeholders, resourced using UWS, and/or external funds, such as the Research Data Repository (ITS budget) and Australian National Data Service (ANDS) projects. A three-year plan for infrastructure projects is set out below.
- Assist in IT procurement and deployment for researchers, for example in sourcing research data storage and computing resources.
- Provide advisory services, for example assisting with grant writing and sitting on selection panels for technical staff to support researchers.
- Serve on project steering committees for eResearch projects (for example the MARCS Institute-led NeCTAR virtual laboratory).
- Assist in selecting and developing technology that has an impact on research, for example assisting with the current review of repository systems being conducted in the library.
- Coordinate Intersect NSW consulting activities:
  - Small projects conducted by Intersect as part of the eResearch Analyst service they provide.
  - Short training courses available to members on request.
  - Additional paid consulting. Examples include setting up a High Performance Computer for IIE, and writing a data management plan for HIE.
- Act as mentors on undergraduate Computer Science projects to encourage development of software.
Of the above, the major focus is on the first item (leading infrastructure projects), as solid IT and organisational infrastructure is the basis for good eResearch practice.
New activities starting 2013
Subject to the level of staffing provided in the 2013 budget, the following activities will commence:
1. Outreach, change management and training to promote eResearch:
A. Systematic development and deployment of professional development programs across the university, for example training in eResearch methods for Higher Degree by Research Students (HDRS) and their supervisors.
B. Coordination of technology user groups, seminars, workshops and interest groups to drive eResearch practices. This could include:
i. Research computing user group.
ii. Reproducible research practitioners network.
iii. Digital humanities interest group.
C. Communications via university channels
D. Evangelising eResearch, as services delivered by projects come on-line.
Resourcing: The budget submitted for 2013-2015 for eResearch includes an eResearch outreach role.
2. Driving adoption of eResearch by rolling out high-impact eResearch tools to key research groups. This includes adapting eResearch tools (for data management, analysis, visualisation and modelling, as well as communications) for use at UWS, and deploying the growing array of tools that have been funded by the Australian National Data Service (ANDS) and now NeCTAR.
Resourcing: There is very limited capacity to pursue this without additional technical staff, but as the basic data management, compute and collaboration tools are established, the two eResearch technical staff will be freed to work on this strand.
3. Direct eResearch collaboration, such as:
A. Developing software for research projects.
B. Detailed collaboration with researchers on specific research projects.
Resourcing: this class of activities is best done at the institute or research centre level. There is precedent for this in recent efforts at the Hawkesbury Institute for the Environment to hire a data manager, and in Short & McCarthy's proposed research centre in the Digital Humanities, which includes two software developers with a dotted-line report to eResearch. Below we present matrices that show a coordinated approach to building eResearch capability across the major stakeholder groups.
Detailed view of projects
There are three eResearch infrastructure 'pillars'.
- Data management: storage, data catalogues
- Computing: high performance computing, cloud computing for analysis, modelling and visualisation.
- Collaboration tools: virtual labs, virtual research environments.
In 2011 the eResearch team developed an Interim Strategy for a Research Data Repository, which is now established as a formal project, with a steering committee chaired by a representative of the PVCR.
For each project there is a timeline showing a broad-brush picture of the developing service, as well as a table listing the impacts and budget implications on various stakeholder groups. These are all in draft and for discussion.
It is important to keep in mind the interconnectedness of the three 'pillars', as each component will influence the others. Some overlap will exist, such as inserting a standardised file sharing (collaboration) layer within the UWS Research Data Store. The current budget proposal is for three strands of project-based activity, with a fourth strand resourced separately:
- Basic funding for storage and computing infrastructure growth which is essential maintenance of the services established by the RDR project.
- A continuation of the RDR project with a scope that will cover not only more services in Pillar 1 (data management) but also integration with computing services (Pillar 2) and some basic collaboration services (Pillar 3).
- A project to scope the provision of one or more collaboration services (AKA VREs or Virtual Research Environments). This will produce best practice guidelines for collaborating using existing tools, and produce a more detailed roadmap for investment.
- A fourth strand will be undertaken using eResearch resources, to establish eResearch within the university.
Short business cases are being produced for the first three projects above – once those bids are finalised the approved versions will be added to this document.
Once the timelines for each project are established it will be possible to show an overall roadmap.
Data Management Pillar
The Research Data Repository (RDR) provides the infrastructure and the catalogue function for researchers to be able to store, share, cite, and archive research data, in accordance with policies, requirements, and best data management practices. Included are the services and information required to define, promote, and provide the RDR to researchers.
- Build services and acquire software to identify, capture and manage key research data in institutes, research centres and schools from its time of creation to archiving, dissemination and citation, allowing world-class high profile research that is visible and will score highly in current and future research metrics as well as ensure compliance with relevant codes.
- Link managed research data storage to flexible on-demand computing services to enable research data processing, analysis, modelling and visualisation.
- Provide essential basic collaboration services at the data and computing level, allowing UWS researchers to establish partnerships within Australia and globally (Objective 3, research plan).
The RDR will be rolled out over the next few years and firmly established by mid-2013. Full adoption by large numbers of users will take time, partly because of the length of the research lifecycle, such as the time between preparing grant applications and finalising projects, or the timespan of a PhD over several years.
The RDR is projected to grow as follows:
The chart below displays the same information as the table above.
These estimates assume that there are a number of latent potential users who have sufficient connections to eResearch by word of mouth, who will also immediately see the value of the service and begin using it. Following 2013, additional users will be brought in through concerted communications and further word of mouth, but the growth may be slower. It is also impossible to estimate the use in terms of storage size, as large teams may have small data storage needs (their data does not consume as much storage space), and small teams may have large data storage needs.
RDR budget for 2013-2015 approved
New infrastructure for research data storage
Existing ad hoc services ported to new infrastructure
Basic "R Drive" services via ITS Service Desk
Data collections identified for inclusion in the Research Data Catalogue
Working Research Data Catalogue with data collections described and sent to Research Data Australia (per Australian National Data Service requirements).
Integration between the Research Data Catalogue and the Research Data Store
Trial integration between Research Data Catalogue and easy-to-use file-sharing technology (e.g. Dropbox.com)
Data Capture app (1) at HIE which will feed data and data descriptors to the Research Data Store and the Research Data Catalogue
Material explaining and promoting good data management practices (including use of RDR to meet many best practices requirements)
10 or more Pilot research group users
Recruitment via selection of research projects with internal grants or projects of strategic importance
Promotion of Basic "R Drive" services across UWS
Procurement of additional storage
Finalise Research Data Catalogue support strategy
Integration of the RDSI to the Research Data Catalogue
Integration of additional file-sharing technology
Integration with the Compute strategy (standard approach for UWS provided compute and storage)
File sharing and collaboration layer for the Data Store
Research Data Catalogue functional improvements (eg researcher interface, improved integration, improved automation)
Draft storage and data archive strategy
Data Capture applications fit for use within three (3) institutes, such as HIE, MARCS and IIE
Recruitment of users via external grant processes
Data management plans for four (4) Institutes or Schools, such as HIE, MARCS, IIE, and SCEM.
Strategy for constructing a revised UWS Data Management Policy
RDR is "Business as usual"
RDR strategy for storage and access of data under ethics restrictions in place
Researcher interface improvements – automated data discovery and descriptions ready to review and submit
Data Catalogue maintenance and support
Finalise and implement data archive strategy
Procurement of archive infrastructure
Integrated with computing resources
Draft Needs Assessment Tool – to be integrated with ITSM (ServiceNow) as a submittable
Draft Data Management Policy
Strategy for discovering and including historical data – extraction project
Integrated with computing and collaboration projects.
Data Catalogue maintenance and support
Researcher Needs Assessment Tool finalised and implemented
eResearch consulting formalised around the Needs Assessment Tool
Data Management Policy implemented
Implementation of historical data extraction project
|ITS||Service established to provide storage in simple and medium complexity scenarios.|
|Library||Library to manage RDC. Research data librarians will continue to work with stakeholders. Integrate RDC with the Institutional Research Publications Repository and associated workflows.|
|ORS||Incremental changes to ORS systems|
|Intersect/AERO||No direct impact|
|eResearch team||Product owner of RDR as proxy for PVCR|
|Institutes||Each institute will have some direct data-capture for key research data flowing through to the catalogue and archival storage|
|Schools||As per institutes, above|
Computing Pillar
Peter Bugeia of Intersect is developing an interim UWS Research Computation strategy which will recommend changes to the way high performance computing services, servers, desktop and laptop computing, and software licences are procured and provided to researchers. The computing strategy will also look at IT service integration with the research project lifecycle and existing ORS systems and processes.
A key objective of the strategy and resultant project is to provide easy pathways for researchers to identify and obtain the computing environment, software and technical support they need to do their research.
Benefits are expected to include:
- Researchers will be able to focus their efforts on their research rather than worrying about the infrastructure.
- Better infrastructure planning and budgeting made possible through early identification of research project computational requirements.
- Greater consolidation of computational infrastructure with economy of scale benefits.
- Co-ordinated access to external cloud and supercomputing facilities.
UWS Research Computing resources are projected to grow as follows:
The chart below displays the same information as the table above.
In order to estimate in a consistent manner, the following equivalents have been applied:
- 1 high intensity virtual machine = 2 medium intensity virtual machines
- 1 low intensity virtual machine = 0.5 medium intensity virtual machines
Virtual machines are considered the best measure for growth, as there may be small research teams using several virtual machines, or large research teams that only need one high intensity machine; however, the number of researchers is also estimated, to be able to plot user uptake.
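The equivalences above amount to a weighted sum over virtual machine counts. The following is a minimal sketch of that arithmetic, not part of the roadmap itself: the function name `medium_vm_equivalents`, the `MEDIUM_EQUIVALENTS` table and the example counts are all hypothetical and chosen only for illustration.

```python
# Weights in medium-intensity VM units, taken from the equivalences above:
# 1 high = 2 medium, 1 low = 0.5 medium.
MEDIUM_EQUIVALENTS = {
    "low": 0.5,
    "medium": 1.0,
    "high": 2.0,
}


def medium_vm_equivalents(counts):
    """Convert per-intensity VM counts into medium-intensity equivalents.

    `counts` maps an intensity name ("low", "medium", "high") to a number
    of virtual machines. Names and structure are assumptions for this sketch.
    """
    unknown = set(counts) - set(MEDIUM_EQUIVALENTS)
    if unknown:
        raise ValueError(f"unknown VM intensity: {sorted(unknown)}")
    return sum(MEDIUM_EQUIVALENTS[name] * n for name, n in counts.items())


# Hypothetical counts, not figures from this roadmap.
example = {"low": 4, "medium": 3, "high": 2}
print(medium_vm_equivalents(example))  # 4*0.5 + 3*1 + 2*2 = 9.0
```

Normalising to a single unit in this way lets growth be plotted on one axis even when research groups use a mix of machine sizes.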
These estimates assume that there are a number of latent potential users who have sufficient connections to eResearch by word of mouth, who will also immediately see the value of the service and begin using it. Following 2013, additional users will be brought in through concerted communications and further word of mouth, but the uptake may be slower.
Interim strategy endorsed
Research Computing Strategy 3-year project budget approved in principle.
2013 project budget approved
Project governance in place
Project resources in place
Project scope, project plan established, project kicked off
IT services identified
Modifications to RHESYS scoped and implemented
ORS operational changes established with linkages to new IT services
Research Project Computing strategy templates established
Existing research software licences reviewed and renegotiated
Additional researcher software licences identified, migrated, agreements negotiated
Establishment of 1 to 3 researcher user groups around researcher software.
5 to 10 research projects with a computing strategy defined
10 to 30 researchers using virtual environments to fulfil their computing needs
An increase in the number of researchers using national supercomputing facilities to fulfil their computational needs.
Development of a "research computing" induction training module
On line resources which researchers can access to inform and assist them with obtaining research computing.
Research groups with special needs defined.
All new research projects with a computing strategy defined
Integration with the Research Data Repository project
Increased use of virtual desktops and servers by researchers
Establishment of a best-of-breed research software register
Increased researcher self-service provisioning of computing infrastructure
Increased self-help and cross-discipline user groups
|2015||Further deployment TBA|
Service catalogue for computational requirements
New service-desk scripts.
Additional service calls from researchers.
Service requests auto-triggered by ORS systems.
Possible introduction of a researcher self-service provisioning model using software such as OpenStack.
Additional research-related software licences, increased site licences.
Possible renegotiation of research-related software licence terms with vendors to allow operation in virtual environments.
An overall reduction in non-standard and ad-hoc environments
Increased support for "non-standard" environments
Every research project to have a "research project computing strategy". This implies some changes to RHESYS (data entry) plus operational changes.
Linkage of RHESYS to IT service system.
Provide subject matter expertise
Assist with updates to researcher on-line resources, the development of additional training material, researcher outreach, communication and change management
UWS to ask AERO to maintain basic information on research computing including definitions, examples etc
Help UWS develop strategy & procedures
Lead stakeholder in rollouts
Develop user group meetings and assist with networking between computing-intensive researchers
Increased governance burden and overall co-ordination
|Institutes||UWS eResearch to organize briefing sessions and training as appropriate, starting early 2013|
UWS eResearch to organize briefing sessions and training as appropriate, starting mid-late 2013
Collaboration Pillar (Virtual Laboratories and Virtual Research Environment)
The third major component of eResearch is collaboration, including project communications, dissemination of research, and collaborative work based on research data and computational processes. While many disciplines and projects need very specific software, there are some common services that will be useful across many, if not all, research projects and groups. Virtual Research Environments (VREs) fill this gap, providing a web presence for a group, repository services for data, and ways to share and discuss research, with links to the RDR and compute facilities.
VREs installed for institutes (3 to 5)
One or more VRE platforms supported at UWS
Generic VRE available institution-wide
Integrated with data management and compute
Participate in project to choose and deploy one or more VREs.
Service catalogue to include VREs
Host software on ITS-sourced infrastructure
|Library||Repository and catalogue systems will need to provide feeds to VREs.|
|ORS||Research systems will need to provide feeds on projects/grants etc for use in the VRE for authorisation and for information.|
Intersect may offer support/hosting for eResearch applications on top of RDSI storage and NeCTAR cloud services.
|eResearch team||Lead project (as with RDR). Develop strategy/policy documents.|
|Institutes||TBA- post scoping study|
|Schools||TBA- post scoping study|
The following diagrams reflect our current planning for the rollout of the Research Data Repository (computing and collaboration services are not shown on these diagrams as work on those is at an earlier stage).