In August, Dr Marta Teperek began her new role at Delft University in the Netherlands. In her usual style of doing things properly and thoroughly, she has contributed this blog reflecting on the lessons learned in the process of setting up Cambridge University’s highly successful Research Data Facility.
On 27-28 June 2017 I attended the Jisc’s Research Data Network meeting at the University of York. I was one of several people invited to talk about experiences of setting up RDM services in a workshop organised by Stephen Grace from London South Bank University and Sarah Jones from the Digital Curation Centre. The purpose of the workshop was to share lessons learned and help those that were just starting to set up research data services within their institutions. Each of the presenters prepared three slides: 1. What went well, 2. What didn’t go so well, 3. What they would do differently. All slides from the session are now publicly available.
For me the session was extremely useful not only because of the exchange of practices and learning opportunity, but also because the whole exercise prompted me to critically reflect on Cambridge Research Data Management (RDM) services. This blog post is a recollection of my thoughts on what went well, what didn’t go so well and what could have been done differently, as inspired by the original workshop’s questions.
What went well
RDM services at Cambridge started in January 2015 – quite late compared to other UK institutions. The late start meant however that we were able to learn from others and to avoid some common mistakes when developing our RDM support. The Jisc’s Research Data Management mailing list was particularly helpful, as it is a place used by professionals working with research data to look for help, ask questions, share reflections and advice. In addition, Research Data Management Fora organised by the Digital Curation Centre proved to be not only an excellent vehicle for knowledge and good practice exchange, but also for building networks with colleagues in similar roles. In addition, Cambridge also joined the Jisc Research Data Shared Service (RDSS) pilot, which aimed to create a joint research repository and related infrastructure. Being part of the RDSS pilot not only helped us to further engage with the community, but also allowed us to better understand the RDM needs at the University of Cambridge by undertaking the Data Asset Framework exercise.
In exchange for all the useful advice received from others, we aimed to be transparent about our work as well. We therefore regularly published blog posts about research data management at Cambridge on the Unlocking Research blog. There were several additional advantages of the transparent approach: it allowed us to reflect on our activities, it provided an archival record of what was done and rationale for this and it also facilitated more networking and comments exchange with the wider RDM community.
Engaging Cambridge community with RDM
Our initial attempts to engage research community at Cambridge with RDM was compliance based: we were telling our researchers that they must manage and share their research data because this was what their funders require. Unsurprisingly however, this approach was rather unsuccessful – researchers were not prepared to devote time to RDM if they did not see the benefits of doing so. We therefore quickly revised the approach and changed the focus of our outreach to (selfish) benefits of good data management and of effective data sharing. This allowed us to build an engaged RDM community, in particular among early career researchers. As a result, we were able to launch two dedicated programmes, further strengthening our community involvement in RDM: the Data Champions programme and also the Open Research Pilot Project. Data Champions are (mostly) researchers, who volunteered their time to act as local experts on research data management and sharing to provide advice and specialised training within their departments.The Open Research Pilot Project is looking at the benefits and barriers to conducting Open Research.
In addition, ensuring that the wide range of stakeholders from across the University were part of the RDM Project Group and had an oversight of development and delivery of RDM services, allowed us to develop our services quite quickly. As a result, services developed were endorsed by wide range of stakeholders at Cambridge and they were also developed in a relatively coherent fashion. As an example, effective collaboration between the Office of Scholarly Communication, the Library, the Research Office and the University Information Services allowed integration between the Cambridge research repository, Apollo, and the research information system, Symplectic Elements.
What didn’t go so well
One of the aspects of our RDM service development that did not go so well was the business case development. We started developing the RDM business case in early 2015. The business case went through numerous iterations, and at the time of writing of this blog post (August 2017), financial sustainability for the RDM services has not yet been achieved.
One of the strongest factors which contributed to the lack of success in business case development was insufficient engagement of senior leadership with RDM. We have invested a substantial amount of time and effort in engaging researchers with RDM and by moving away from compliance arguments, to the extent that we seem to have forgotten that compliance- and research integrity-based advocacy is necessary to ensure the buy in of senior leadership.
In addition, while trying to move quickly with service development, and at the same time trying to gain trust and engagement in RDM service development from the various stakeholder groups at Cambridge, we ended up taking part in various projects and undertakings, which were sometimes loosely connected to RDM. As a result, some of the activities lacked strategic focus and a lot of time was needed to re-define what the RDM service is and what it is not in order to ensure that expectations of the various stakeholders groups could be properly managed.
What could have been done differently
There are a number of things which could have been done differently and more effectively. Firstly, and to address the main problem of insufficient engagement with senior leadership, one could have introduced dedicated, short sessions for principal investigators on ensuring effective research data management and research reproducibility across their research teams. Senior researchers are ultimately those who make decisions at research-intensive institutions, and therefore their buy-in and their awareness of the value of good RDM practice is necessary for achieving financial sustainability of RDM services.
In addition, it would have been valuable to set aside time for strategic thinking and for defining (and re-defining, as necessary) the scope of RDM services. This is also related to the overall branding of the service. In Cambridge a lot of initial harm was done due to negative association between Open Access to publications and RDM. Due to overarching funders’ and government’s requirements for Open Access to publications, many researchers started perceiving Open Access to publications merely as a necessary compliance condition. The advocacy for RDM at Cambridge started as ‘Open Data’ requirements, which led many researchers to believe that RDM is yet another requirement to comply with and that it was only about open sharing of research data. It took us a long time to change the messages and to rebrand the service as one supporting researchers in their day to day research practice and that proper management of research data leads to efficiency savings. Finally, only research data which are management properly from the very start of the research process can be then easily shared at the end of the project.
Finally, and which is also related to the focusing and defining of the service, it would have been useful to decide on a benchmarking strategy from the very beginning of the service creation. What is the goal(s) of the service? Is it to increase the number of shared datasets? Is it to improve day to day data management practice? Is to to ensure that researchers know how to use novel tools for data analysis? And, once the goal(s) is decided, design a strategy to benchmark the progress towards achieving this goal(s). Otherwise it can be challenging to decide which projects and undertakings are worth continuation and which ones are less successful and should be revised or discontinued. In order to address one aspect of benchmarking, Cambridge led the creation of an international group aiming to develop a benchmarking strategy for RDM training programmes, which aims to create tools for improving RDM training provision.
My final reflection is to re-iterate that the questions asked of me by the workshop leaders at the Jisc RDN meeting really inspired me to think more holistically about the work done towards development of RDM services at Cambridge. Looking forward I think asking oneself the very same three questions: what went well, what did not go so well and what you would do differently, might become for a useful regular exercise ensuring that RDM service development is well balanced and on track towards its intended goals.
Published 24 August 2017
Written by Dr Marta Teperek