Tag Archives: Open Research

We are going OPEN – the Open Research experiment has begun!

There has been much discussion recently about the reproducibility crisis and about the growing distrust among the public in the quality of research. As illustrated in our ‘Case for Open Research’ series of blog posts, one of the main reasons for this is that researchers are currently rewarded for the number of papers they publish in high impact factor journals, and not necessarily for the quality of work that they are doing.

Indeed, Cambridge researchers clearly indicated that the lack of incentives to do anything other than publishing in these types of journals is one of the main blockers discouraging them from adopting a more open research practice.

Joining forces with the Wellcome Trust

The Office of Scholarly Communication started talking about these problems with the Open Research team at the Wellcome Trust. The Wellcome Trust are natural allies, as they have consistently led their researchers towards greater openness. They were one of the first funding bodies to introduce policies on Open Access and on data management and sharing. Now the Wellcome Trust is moving towards proactively supporting Open Research beyond enforcing their compliance requirements.

To promote immediate and transparent research sharing, they have recently launched the Wellcome Open Research platform which allows researchers to submit articles about virtually any research output and get published within a couple of days. The Wellcome Trust is now considering making Open Research one of their strategic priorities.

We quickly realised that we have a lot of shared interests, and joining forces to tackle the problem together made a lot of sense. We came up with the idea to launch the Open Research Pilot Project.

The Open Research Pilot – understanding the barriers to “openness”

We conceived the project as a two year experiment, which would allow us to gain an understanding of what is needed for researchers to share and get credit for all outputs of the research process. These include non-positive results, protocols, source code, presentations and other research outputs beyond the remit of traditional publications.

The Project aims to understand the barriers preventing researchers from sharing (including resource and time implications), as well as what the incentives are. The Project aims to utilise the new Wellcome Open Research publishing platform, together with other channels, to share these outputs.

The invitation to take part in the Pilot was sent to all researchers at Cambridge funded by the Wellcome Trust. Participating researchers had to commit to sharing of research outputs beyond traditional publications and to engage with the Project, by participating in Project meetings and contributing to Project publications.

Is ‘doing the right thing’ enough incentive?

Our biggest question was whether anyone would be willing to participate in the Pilot. We did not offer any incentive other than encouraging researchers to contribute to the greater good. The only support available to those who wanted to take part in the project was that offered by the Wellcome Trust and Cambridge Open Research team members, but there was no financial aid available to prospective participants. We thought that, regardless of the outcome, inviting researchers would be a good exercise to go through – if no one applied, we would have learnt that doing ‘the right thing’ was not a good enough motivator.

Thankfully, we received several fantastic applications from individual researchers and research groups who demonstrated great interest in and motivation for Open Research. We initially planned to work with two research groups, but given the quality of applications received and passion for Open Research expressed by the applicants, we decided to extend the scope of the project to four research groups. We have selected researchers doing different types of research, with the aim of learning about distinct problems in sharing that are experienced in diverse research disciplines:

  •       Dr Laurent Gatto – is doing computational biology research, with a special focus on proteomics data. His interest is: How to effectively share research data and the code needed to reproduce them?
  •       Dr David Savage – is researching molecular pathogenesis of the consequences of obesity. His question is: What are the problems with sharing data coming from human participants?
  •       Dr Benjamin Steventon – is a developmental biologist generating and analysing large-scale imaging datasets. He would like to know: Are there image repositories allowing one to share large image datasets in a re-usable way?
  •       Dr Marta Costa and Dr Greg Jefferis (and others) – researchers leading the work on two collaborative projects: Connectomics and Virtual Fly Brain, which will create interactive tools to interrogate Drosophila neural network connections. They would like to understand: What are the issues with sharing complex interactive datasets? How to ensure long-term preservation of complex digital objects?


So what motivated these researchers to apply for the project? We asked this question at the application stage and were positively surprised by the altruistic answers that we received. Our researchers were largely driven by a desire to improve the research process. We have seen responses like:

  • “Openness in research, from data and software to publication, is a central pillar of good research.”
  • “I am very concerned (disappointed as a scientist) by the current wave of ‘unreproducible’ and/or ‘irrelevant’ research, and am very passionate about contributing to improving scientific endeavour in this regard.”
  • “I am very enthusiastic about exploiting new ways of sharing my research output beyond the established peer-review journal system.”
  • “I believe that sharing research outputs fully, including data and code are essential to accelerate research, and I have benefitted from it in my own research.”

Summarising, researchers expressed a great desire for contributing to a cultural change. Researchers wanted to change the way in which research was disseminated and to increase research transparency and reproducibility.

Let’s get to work

We all met (the researchers, Wellcome Trust and Cambridge Open Research teams) on Friday 27 January to officially start the two year project. Each research group was appointed a facilitator – a dedicated member of the Cambridge Open Research team to support researchers during the Project. Research groups will meet with their facilitators on a monthly basis in order to discuss shareable research outputs and to decide on best ways to disseminate these outputs. Every six months all project members will meet together to discuss the barriers to sharing discovered and to assess the progress of the Project.

One of the main goals of the Project is to learn what the barriers and incentives are for Open Research and to share these findings with others interested in the subject to inform policy development. Therefore, we will be regularly publishing blog posts on the Unlocking Research blog and on the Wellcome Open Research blog with case studies describing what we have discovered while working together. There will be an update from each research group every six months. We will also be publicly sharing all main outputs of the Project.

We are all extremely excited about going “Open” and we suggest that anyone interested in the Open Research practice watches this space.

Published 08 February 2017
Written by Dr Marta Teperek
Creative Commons License

Creating a research data community

Are research institutions engaging their researchers with Research Data Management (RDM)? And if so, how are they doing it? In this post, Rosie Higman (@RosieHLib), Research Data Advisor, University of Cambridge, and Hardy Schwamm (@hardyschwamm),  Research Data Manager, Lancaster University explore the work they are doing in their respective institutions.

Whilst funder policies were the initial catalyst for many RDM services at UK universities, there are many reasons to engage with RDM, from increased impact to moving towards Open Research as the new normal. And a growing number of researchers are keen to get involved! These reasons also highlight the need for a democratic, researcher-led approach if the behavioural change necessary for RDM is to be achieved. Following initial discussions online and at the Research Data Network event in Cambridge on 6 September, we wanted to find out whether and how others are engaging researchers beyond reiterating funder policies.

At both Cambridge and Lancaster we are starting initiatives focused on this, respectively Data Champions and Data Conversations. The Data Champions at Cambridge will act as local experts in RDM, advocating at a departmental level and helping the RDM team to communicate across a fragmented institution. We also hope they will form a community of practice, sharing their expertise in areas such as big data and software preservation. The Lancaster University Data Conversations will provide a forum for researchers from all disciplines to share their data experiences and knowledge. The first event will be on 30 January 2017.

Having presented our respective plans to the RDM Forum (RDMF16) in Edinburgh on 22nd November, we ran breakout sessions where small groups discussed the approaches that our own and other universities were taking. The results, summarised below, highlight the different forms that engagement with researchers can take.

Targeting our training

RDM workshops seem to be the most common way research data teams are engaging with researchers, typically targeting postgraduate research students and postdoctoral researchers. A recurrent theme was the need to target workshops for specific disciplinary groups, including several workshops run jointly between institutions where this meant it was possible to get sufficient participants for smaller disciplines. Alongside targeting disciplines some have found inviting academics who have experience of sharing their data to speak at workshops greatly increases engagement.

As well as focusing workshops so they are directly applicable to particular disciplines, several institutions have had success in linking their workshop to a particular tangible output, recognising that researchers are busy and are not interested in a general introduction. Examples of this include workshops around Data Management Plans, and embedding RDM into teaching students how to use databases.

An issue many institutions are having is getting the timing right for their workshops: too early and research students won’t have any data to manage or even be thinking about it; too late and students may have got into bad data management habits. Finding the goldilocks time which is ‘just right’ can be tricky. Two solutions to this problem were proposed: having short online training available before a more in-depth training later on, and having a 1 hour session as part of an induction followed by a 2 hour session 9-18 months into the PhD.

Tailored support

Alongside workshops, the most popular way to get researchers interested in RDM was through individual appointments, so that the conversation can be tailored to their needs, although this obviously presents a problem of scalability when most institutions only have one individual staff member dedicated to RDM.

There are two solutions to this problem which were mentioned during the breakout session. Firstly, some people are using a ‘train the trainer’ approach to involve other research support staff who are based in departments and already have regular contact with researchers. These people can act as intermediaries and are likely to have a good awareness of the discipline-specific issues which the researchers they support will be interested in.

The other option discussed was holding drop-in sessions within departments, where researchers know the RDM team will be on a regular basis. These have had mixed success at many institutions but seem to work better when paired with a more established service such as the Open Access or Impact team.

What RDM services should we offer?

We started the discussion at the RDM Forum thinking about extending our services beyond sheer compliance in order to create an “RDM community” where data management is part of good research practice and contributes to the Open Research agenda. This is the thinking behind the new initiatives at Cambridge and Lancaster.

However, there were also some critical or sceptical voices at our RDMF16 discussions. How can we promote an RDM community when we struggle to persuade researchers to be compliant with institutional and funder policies? All RDM support teams are small and have many other tasks aside from advocacy and training. Some expressed concern that they lack the skills to market their services beyond the traditional methods used by libraries. We need to address and consider these concerns about capacity and skill sets as we attempt to engage researchers beyond compliance.


It is clear from our discussions that there is a wide variety of RDM-related activities at UK universities which stretch beyond enforcing compliance, but engaging large numbers of researchers is an ongoing concern. We also realised that many RDM professionals are not very good at practising what we preach and sharing our materials, so it’s worth highlighting that training materials can be shared on the RDM training community on Zenodo as long as they have an open license.

Many thanks to the participants at our breakout session at the RDMForum 16, and Angus Whyte for taking notes which allowed us to write this piece. You can follow previous discussions on this topic on Gitter.

Published on 30 November
Written by Rosie Higman and Hardy Schwamm
Creative Commons License

Walking the talk – reflections on working ‘openly’

As part of Open Access Week 2016, the Office of Scholarly Communication is publishing a series of blog posts on open access and open research. In this post Dr Lauren Cadwallader discusses her experience of researching openly.

Earlier this year I was awarded the first Altmetric.com Annual Research grant to carry out a proof-of-concept study looking at using altmetrics as a way of identifying journal articles that eventually get included into a policy document. As part of the grant condition I am required to share this work openly. “No problem!” I thought, “My job is all about being open. I know exactly what to do.”

However, it’s been several years since I last carried out an academic research project and my previous work was carried out with no idea of the concept of open research (although I’m now sharing lots of it here!). Throughout my project I kept a diary documenting my reflections on being open (and researching in general) – mainly the mistakes I made along the way and the lessons I learnt. This blog summarises those lessons.

To begin at the beginning

I carried out a PhD at Cambridge not really aware of scholarly best practice. The Office of Scholarly Communication didn’t exist. There wasn’t anyone to tell me that I should share my data. My funder didn’t have any open research related policies. So I didn’t share because I didn’t know I could, or should, or why I would want to.

I recently attended The Data Dialogue conference and was inspired to hear many of the talks about open data but also realised that although I know some of the pitfalls researchers fall into I don’t quite feel equipped to carry out a project and have perfectly open and transparent methods and data at the end. Of course, if I’d been smart enough to attend an RDM workshop before starting my project I wouldn’t feel like this!

My PhD supervisor and the fieldwork I carried out had instilled in me some practices that are useful for carrying out open research:

Lesson #1. Never touch your raw data files

This is something I learnt from my PhD and found easy to apply here. Altmetric.com sent me the data I requested for my project and I immediately saved it as the raw file and saved another version as my working file. That made it easy when I came to share my files in the repository as I could include the raw and edited data. Big tick for being open.
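As an illustration of this lesson, here is a minimal Python sketch of one way to keep a raw file untouched: copy it to a working file and then remove write permission from the original. The function name `make_working_copy` and the `_working` file suffix are assumptions made for this example; they are not part of the workflow described in the post.

```python
import shutil
import stat
from pathlib import Path


def make_working_copy(raw_path: Path, working_dir: Path) -> Path:
    """Copy raw_path into working_dir, then mark the original as read-only."""
    working_dir.mkdir(parents=True, exist_ok=True)
    working_path = working_dir / f"{raw_path.stem}_working{raw_path.suffix}"
    if working_path.exists():
        # Never silently replace a working file that may already contain edits.
        raise FileExistsError(f"{working_path} already exists")
    # Copy first, so the working copy keeps the original (writable) permissions.
    shutil.copy2(raw_path, working_path)
    # Drop write permission on the raw file to guard against accidental edits.
    raw_path.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return working_path
```

With this approach, both the raw file and the edited working file are still available when the time comes to deposit them.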

Getting dirty with the data

Lesson #2. Record everything you do

Another thing I was told to do during my PhD lab work was to record everything you do. And that is all well and good in the lab or the field but what about when you are playing with your data? I found I started cleaning up the spreadsheet Altmetric.com sent and I went from having 36 columns to just 12 but I hadn’t documented my reasons for excluding large swathes of data. So I took a step back and filled out my project notebook explaining my rationale. Documenting every decision at the time felt a little bit like overkill but if I need to articulate my decisions for excluding data from my analysis in the future (e.g. during peer review) then it would be helpful to know what I based my reasoning on.

Lesson #3. Date things. Actually, date everything

I’d been typing up my notes about why some data is excluded and others not so it informs my final data selection, and I noticed that I’d been making decisions and notes as I went along but not recording when. If I’m trying to unpick my logic at a later date it is helpful if I know when I made a decision. Which decision came first? Did I have all my ‘bright ideas’ on the same day, and is the reason they don’t look so bright now that I was sleep deprived (or hungover in the case of my student days) and not thinking straight? Recording dates is actually another trick I learnt as a student – data errors can be picked up as lab or fieldwork errors if you can work back and see what you did when – but one I had forgotten to apply thus far. In fact, it was only at this point that I began dating my diary entries…
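Lessons #2 and #3 can be combined in a small, dated decision log. The Python sketch below is only one possible approach, not the method used in the project (which relied on a notebook): it removes columns from a list of rows and appends the date and reason for each exclusion to a CSV file. The function names, the log layout and the column names used in the test data are all hypothetical.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Optional

# Layout of the decision log; these field names are an assumption of this sketch.
LOG_FIELDS = ["date", "column", "decision", "reason"]


def log_decision(log_path: Path, column: str, decision: str, reason: str,
                 on: Optional[date] = None) -> None:
    """Append one dated decision to the log, writing the header on first use."""
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": (on or date.today()).isoformat(),
            "column": column,
            "decision": decision,
            "reason": reason,
        })


def drop_columns(rows: list, exclusions: dict, log_path: Path,
                 on: Optional[date] = None) -> list:
    """Return copies of rows without the excluded columns, logging each reason."""
    for column, reason in exclusions.items():
        log_decision(log_path, column, "excluded", reason, on)
    return [{key: value for key, value in row.items() if key not in exclusions}
            for row in rows]
```

Because every entry carries a date, the log can later answer both "why was this column excluded?" and "which decision came first?".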

Lesson #4. A tidy desk(top) is a tidy mind

I was working on this project just one day a week over the summer so every week I was having to refresh my mind as to where I stopped the week before and what my plans were that week. I was, of course, now making copious notes about my plans and dating decisions so this was relatively easy. However, upon returning from a week’s holiday, I opened my data files folder and was greeted by 10 different spreadsheets and a few other files. It took me a few moments to work out which files I needed to work on, which made me realise I needed to do some housekeeping.

Aside from making life easier now, it will make the final write up and sharing easier if I can find things and find the correct version. So I went from messy computer to tidy computer and could get back to concentrating on my analysis rather than worrying if I was looking at the right spreadsheet.


Lesson #5. Version control

One morning I had been working on my data adding in information from other sources and everything was going swimmingly when I realised that I hadn’t included all of my columns in my filters and now my data was all messed up. To avoid weeping in my office I went for a cup of tea and a biscuit.

Upon returning to my desk I crossed my fingers and managed to recover an earlier version of my spreadsheet using a handy tip I’d found online. Phew! I then repeated my morning’s work. Sigh. But at least my data was once again correct. Instead of relying on handy tips discovered by frantic Googling, just use version control. Archive your files periodically and start working on a new version. Tea and biscuits cannot solve everything.
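The advice to archive files periodically can be approximated with a few lines of Python. This is a minimal sketch under assumptions: the function name `archive_version` and the timestamped naming scheme are invented for the example, and a dedicated version control system such as Git would be a more robust alternative for text-based files.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def archive_version(working_path: Path, archive_dir: Path,
                    now: Optional[datetime] = None) -> Path:
    """Copy the working file into archive_dir under a timestamped name."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{working_path.stem}_{stamp}{working_path.suffix}"
    if target.exists():
        # An archived version is a record; refuse to overwrite it.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(working_path, target)
    return target
```

Calling this before each risky step, such as applying filters or merging in data from other sources, leaves a copy to return to if something goes wrong.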

Getting it into the Open

After a couple more weeks of problem free analysis it was time to present my work as a poster at the 3:AM Altmetrics conference. I’ve made posters before so that was easy. It then dawned on me at about 3pm the day I needed to finish the poster that perhaps I should share a link to my data. Cue a brief episode of swearing before realising I sit 15ft away from our Research Data Advisor and she would help me out! After filling out the data upload form for our institutional repository to get a placeholder record and therefore DOI for my data, I set to work making my spreadsheet presentable.

Lesson #6. Making your data presentable can be hard work if you are not prepared

I only have a small data set but it took me a lot longer than I thought it would to make it sharable. Part of me was tempted just to share the very basic data I was using (the raw file from Altmetric.com plus some extra information I had added) but that is not being open to reproducibility. People need to be able to see my workings so I persevered.

I’d labelled the individual sheets and the columns within those sheets in a way that was intelligible to me but not necessarily to other people so they all needed renaming. Then I had to tidy up all the little notes I’d made in cells and put those into a Read Me file to explain some things. And then I had to actually write the Read Me file and work out the best format for it (a neutral text file or pdf is best).
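For anyone who prefers to generate the Read Me file rather than type it by hand, the following Python sketch writes a plain-text file listing each column with a description, followed by any notes that previously lived in spreadsheet cells. The function name, the layout of the file and the example content in the usage note are all assumptions made for illustration.

```python
from pathlib import Path


def write_readme(path: Path, title: str, columns: dict, notes: list) -> str:
    """Write a plain-text README describing each column, and return its text."""
    lines = [title, "", "Columns:"]
    for name, description in columns.items():
        lines.append(f"  {name}: {description}")
    if notes:
        lines += ["", "Notes:"]
        lines += [f"  - {note}" for note in notes]
    text = "\n".join(lines) + "\n"
    path.write_text(text, encoding="utf-8")
    return text
```

For example, `write_readme(Path("README.txt"), "Example dataset", {"doi": "Article identifier"}, ["Cell comments were moved here."])` produces a short text file that can be deposited alongside the data.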

I thought I was finished but as our Research Data Advisor pointed out, my spreadsheets were returning a lot of errors because of the formula I was using (it was taking issue with me asking it to divide something by 0) and that I should share one file that included the formulae and one with just the numbers.

If I’d had time, I would have gone for a cup of tea and a biscuit to avoid weeping in the office but I didn’t have time for tea or weeping. Actually producing a spreadsheet without formulae turned out to be simple once I’d Googled how to do it and then my data files were complete. All I then needed to do was send them to the Data team and upload a pdf of my poster to the repository. Job done! Time to head to the airport for the conference!
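The two fixes described above, avoiding divide-by-zero errors and sharing a file that contains only numbers, can also be sketched outside a spreadsheet. The Python below is an illustration under assumptions rather than the approach taken in the project: the function names and the column names in the usage note are hypothetical, and a ratio with a zero denominator is written as an empty cell instead of an error.

```python
import csv
from pathlib import Path
from typing import Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def write_values_only(rows: list, out_path: Path, numerator_key: str,
                      denominator_key: str, result_key: str = "ratio") -> None:
    """Write rows plus a computed ratio column as plain values (no formulae)."""
    if not rows:
        raise ValueError("rows must contain at least one record")
    fieldnames = list(rows[0].keys()) + [result_key]
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            ratio = safe_ratio(float(row[numerator_key]),
                               float(row[denominator_key]))
            # Leave the cell empty rather than emitting a division error.
            writer.writerow({**row, result_key: "" if ratio is None else ratio})
```

A call such as `write_values_only(rows, Path("values_only.csv"), "mentions", "articles")` would then give a values-only file to share next to the version that keeps the formulae.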

Lesson #7. Making your work open is very satisfying.

Just over three weeks have passed since the conference and I’m amazed that already my poster has been viewed on the repository 84 times and my data has been viewed 153 times! Wowzers! That truly is very satisfying and makes me feel that all the effort and emergency cups of tea were worth it. As this was a proof-of-concept study I would be very happy for someone to use my work, although I am planning to keep working on it. Seeing the usage stats of my work and knowing that I have made it open to the best of my ability is really encouraging for the future of this type of research. And of course, when I write these results up with publication in mind it will be as an open access publication.

But first, it’s time for a nice relaxed cup of tea.

Published 25 October 2016
Written by Dr Lauren Cadwallader
Creative Commons License