Tag Archives: open access

How open is Cambridge? 2017 edition

Welcome to Open Access Week 2017. The Office of Scholarly Communication at Cambridge is celebrating with a series of blog posts, announcements and events. In today’s blog post we revisit the question about the openness of Cambridge. 

For Open Access Week last year I looked at how open Cambridge was using the extremely useful Lantern tool, developed by Cottage Labs, which is the basis of the Wellcome Trust’s compliance tool. If you haven’t used it before, Lantern takes a list of DOIs, PMIDs, or PMCIDs and runs these through a variety of sources to try and determine the Open Access status of each publication. I found that, for publications in 2015, 51.8% of all of Cambridge’s research publications were available in at least one ‘Open Access’ source. How did Cambridge’s 2016 publications fare? Read on to find out.

Using the same method as last year, I first obtained a list of DOIs from Web of Science (n=9416) and Scopus (n=9124) for articles, proceedings papers and reviews published in 2016. Combining and deduplicating these lists returned 10,674 unique DOIs (~29 publications/day). I also refreshed the 2015 publication data using the latest Web of Science and Scopus information, which returned 10,090 unique DOIs. Year-on-year, this represents a 5.8% increase in the total number of publications attributable to Cambridge – more than inflation!
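The combine-and-deduplicate step is straightforward to reproduce. Below is a minimal sketch of one way to do it (the DOIs shown are made up for illustration; the real lists came from Web of Science and Scopus exports). One point worth making explicit: DOIs are case-insensitive, so they should be normalised before comparison or the same paper can be counted twice.

```python
def normalise_doi(doi: str) -> str:
    """DOIs are case-insensitive, so trim and lower-case before comparing."""
    return doi.strip().lower()

def deduplicate(wos_dois, scopus_dois):
    """Combine two DOI lists and return the unique DOIs in first-seen order."""
    seen = set()
    unique = []
    for doi in list(wos_dois) + list(scopus_dois):
        key = normalise_doi(doi)
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

# Tiny worked example: two overlapping two-item lists yield three unique DOIs.
wos = ["10.1000/abc", "10.1000/XYZ"]
scopus = ["10.1000/xyz", "10.1000/def"]
print(len(deduplicate(wos, scopus)))  # 3
```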

The deduplicated DOI lists for 2015 and 2016 (20,764 DOIs in total) were fed into Lantern and analysed in combination with information from Web of Science and the University’s institutional repository Apollo.

Figure 1. Distribution of papers published in 2015 and 2016 that have a DOI, according to the Open Access sources in which they can be found. 57.5% of 2016’s articles appear in at least one Open Access source, a 4 percentage point increase over 2015. One third of all papers published in 2016 are available in Apollo.

Very pleasingly, the percentage of publications available in at least one Open Access source increased to 57.5% in 2016, compared to only 53.4% for 2015 publications. Given that the total number of publications also increased during this period, this result is doubly exciting. In raw numbers, this means that while 5384 publications were Open Access in 2015, an impressive 6135 publications were made Open Access in 2016.
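As a quick sanity check, the headline percentages can be reproduced from the raw counts given above:

```python
# Raw counts from the text: OA publications / total unique DOIs per year.
oa_2015, total_2015 = 5384, 10090
oa_2016, total_2016 = 6135, 10674

pct_2015 = round(100 * oa_2015 / total_2015, 1)  # 53.4
pct_2016 = round(100 * oa_2016 / total_2016, 1)  # 57.5

print(pct_2015, pct_2016)  # 53.4 57.5
```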

Most of this increase can be attributed to the much larger share of publications that appear in Apollo, which is now the largest source of Open Access material for the University of Cambridge. An additional 822 publications were deposited in Apollo in 2016 compared to 2015, which is a 30% increase in one year alone.

You can now find more of the University’s research outputs in Apollo than in any other Open Access source. And because we operate an extremely popular Request a Copy service, potentially all of the publications held in Apollo, even those that are restricted and under embargo, are available to anyone in the world. You just need to ask.

Published 23 October 2017
Written by Dr Arthur Smith
Creative Commons License

Open Access policy, procedure & process at Cambridge

First up, HEFCE’s Open Access policy:

At the outset, let’s be clear: the HEFCE Open Access policy applies to all researchers working at all UK HEIs. If an HEI wants to submit a journal article for consideration in REF 2021 the article must appear in an Open Access repository (although there is a long list of exceptions). Keen observers will note that in the above flowchart HEFCE’s policy is enforced based on deposit within three months of acceptance. This requirement has caused significant consternation amongst researchers and administrators alike; however, during the first two years of the policy (i.e. until 31 March 2018) publications deposited within three months of publication will still be eligible for the REF. At Cambridge, we have been recording manuscript deposits that meet this criterion as exceptions to the policy[1].

Next up, the RCUK Open Access policy. This policy is straightforward to implement, the only complication being payment of APCs, which is contingent on sufficient block grant funding. Otherwise, the choice for authors is usually quite obvious: does the journal have a compliant embargo? No? Then pay for immediate open access.
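That decision can be sketched as a very simple rule. The function below is a deliberate simplification (the six-month default and the parameter names are illustrative assumptions; actual RCUK embargo limits vary by discipline, and the real decision also depends on block grant availability):

```python
def rcuk_route(embargo_months: int, max_embargo: int = 6, funds_available: bool = True) -> str:
    """Crude sketch of the green-vs-gold choice described in the text.

    Illustrative only: real RCUK compliance checks involve discipline-specific
    embargo limits and the state of the institution's block grant.
    """
    if embargo_months <= max_embargo:
        return "green: deposit the accepted manuscript, embargo is compliant"
    if funds_available:
        return "gold: pay an APC for immediate open access"
    return "non-compliant: embargo too long and no APC funds available"

print(rcuk_route(12))  # gold: pay an APC for immediate open access
```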

One extra feature of the RCUK Open Access policy not captured here is the Europe PMC deposit requirement for MRC and BBSRC funded papers. Helpfully, the policy document makes no mention of this requirement; rather, this feature of the policy appears only in the accompanying FAQs. I’m no expert, but this seems like the wrong way to write policies.

Finally, we have the COAF policy, possibly the single most complicated OA policy to enforce anywhere in the world. The most challenging part of the COAF policy is the Europe PMC deposit requirement. It is often difficult to know whether a journal will indeed deposit the paper in Europe PMC, and if, for whatever reason, the publisher doesn’t immediately deposit the paper, it can take months of back-and-forth with editors, journal managers and publishing assistants to complete the deposit. This is an extremely burdensome process, though the blame should be laid squarely at the publishers’ door. How hard is it to update a PMC record? Does it really take two months to update the Creative Commons licence?

This leads us to one of the more unusual parts of the COAF policy: publications are considered journals if they are indexed in Medline. That means we will occasionally receive book chapters that need to meet the journal OA policy. Most publishers are unwilling to make such publications OA in line with COAF’s journal requirements so they are usually non-compliant.

What happens if you are foolish enough to try to combine these policies into one process? Well, as you might expect, you get something very complicated:

This flowchart, despite its length, still doesn’t capture every possible policy outcome and is missing several nuances related to the payment of APCs, but nonetheless, it gives an idea of the enormous complexity that underlies the decision-making process behind every article deposited in Apollo and in other repositories across the UK.
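To give a flavour of that complexity in code form, here is a deliberately simplified sketch of how the three policies interact for a single paper. Every threshold, funder name and field name below is an illustrative assumption, not a transcription of the actual flowchart:

```python
from dataclasses import dataclass

@dataclass
class Paper:
    days_from_acceptance_to_deposit: int
    funders: tuple          # e.g. ("RCUK",), ("MRC",), ()
    in_europe_pmc: bool = False
    embargo_months: int = 0

def policy_issues(p: Paper, max_embargo: int = 6) -> list:
    """Toy merge of the HEFCE, RCUK and COAF checks discussed above."""
    issues = []
    # HEFCE: deposit within three months (~92 days) of acceptance.
    if p.days_from_acceptance_to_deposit > 92:
        issues.append("HEFCE: deposited late; may need recording as an exception")
    # RCUK: the embargo must be compliant, otherwise an APC is needed.
    if any(f in ("RCUK", "MRC", "BBSRC") for f in p.funders) and p.embargo_months > max_embargo:
        issues.append("RCUK: embargo too long; pay an APC or find another route")
    # COAF / Europe PMC: MRC- and BBSRC-funded papers must also appear in Europe PMC.
    if any(f in ("MRC", "BBSRC") for f in p.funders) and not p.in_europe_pmc:
        issues.append("Europe PMC: deposit missing; chase the publisher")
    return issues

# A single MRC-funded paper can trip all three policies at once.
paper = Paper(days_from_acceptance_to_deposit=120, funders=("MRC",),
              in_europe_pmc=False, embargo_months=12)
for issue in policy_issues(paper):
    print(issue)
```

Even this toy version needs three separate checks per paper, which is some indication of why the real flowchart runs to many pages.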

[1] Within the University’s CRIS, Symplectic Elements, only one date range is possible, so we have chosen to monitor compliance from the acceptance date. Publications deposited within the ‘transitional’ three-months-from-publication window receive an ‘Other’ exception within Elements that contains a short note to this effect.

Published 18 September 2017
Written by Dr Arthur Smith
Creative Commons License

What I wish I’d known at the start – setting up an RDM service

In August, Dr Marta Teperek began her new role at Delft University in the Netherlands. In her usual style of doing things properly and thoroughly, she has contributed this blog reflecting on the lessons learned in the process of setting up Cambridge University’s highly successful Research Data Facility.

On 27-28 June 2017 I attended Jisc’s Research Data Network meeting at the University of York. I was one of several people invited to talk about experiences of setting up RDM services in a workshop organised by Stephen Grace from London South Bank University and Sarah Jones from the Digital Curation Centre. The purpose of the workshop was to share lessons learned and help those who were just starting to set up research data services within their institutions. Each of the presenters prepared three slides: 1. What went well, 2. What didn’t go so well, 3. What they would do differently. All slides from the session are now publicly available.

For me the session was extremely useful not only because of the exchange of practices and learning opportunity, but also because the whole exercise prompted me to critically reflect on Cambridge Research Data Management (RDM) services. This blog post is a recollection of my thoughts on what went well, what didn’t go so well and what could have been done differently, as inspired by the original workshop’s questions.

What went well

RDM services at Cambridge started in January 2015 – quite late compared to other UK institutions. The late start meant, however, that we were able to learn from others and to avoid some common mistakes when developing our RDM support. Jisc’s Research Data Management mailing list was particularly helpful, as it is a place where professionals working with research data look for help, ask questions, and share reflections and advice. In addition, the Research Data Management Fora organised by the Digital Curation Centre proved to be an excellent vehicle not only for knowledge and good practice exchange, but also for building networks with colleagues in similar roles. Cambridge also joined the Jisc Research Data Shared Service (RDSS) pilot, which aimed to create a joint research repository and related infrastructure. Being part of the RDSS pilot not only helped us to engage further with the community, but also allowed us to better understand the RDM needs at the University of Cambridge by undertaking the Data Asset Framework exercise.

In exchange for all the useful advice received from others, we aimed to be transparent about our work as well. We therefore regularly published blog posts about research data management at Cambridge on the Unlocking Research blog. This transparent approach had several additional advantages: it allowed us to reflect on our activities, it provided an archival record of what was done and the rationale for it, and it facilitated more networking and exchange of comments with the wider RDM community.

Engaging the Cambridge community with RDM

Our initial attempts to engage the research community at Cambridge with RDM were compliance-based: we told our researchers that they must manage and share their research data because this was what their funders required. Unsurprisingly, this approach was rather unsuccessful – researchers were not prepared to devote time to RDM if they did not see the benefits of doing so. We therefore quickly revised the approach and changed the focus of our outreach to the (selfish) benefits of good data management and effective data sharing. This allowed us to build an engaged RDM community, in particular among early career researchers. As a result, we were able to launch two dedicated programmes that further strengthened our community involvement in RDM: the Data Champions programme and the Open Research Pilot Project. Data Champions are (mostly) researchers who volunteer their time to act as local experts on research data management and sharing, providing advice and specialised training within their departments. The Open Research Pilot Project is looking at the benefits of, and barriers to, conducting Open Research.

In addition, ensuring that a wide range of stakeholders from across the University were part of the RDM Project Group, with oversight of the development and delivery of RDM services, allowed us to develop our services quite quickly. As a result, the services were endorsed by a wide range of stakeholders at Cambridge and were developed in a relatively coherent fashion. As an example, effective collaboration between the Office of Scholarly Communication, the Library, the Research Office and the University Information Services enabled integration between the Cambridge research repository, Apollo, and the research information system, Symplectic Elements.

What didn’t go so well

One aspect of our RDM service development that did not go so well was the business case. We started developing the RDM business case in early 2015. It went through numerous iterations, and at the time of writing this blog post (August 2017), financial sustainability for the RDM services has not yet been achieved.

One of the strongest factors contributing to the lack of success in business case development was insufficient engagement of senior leadership with RDM. We had invested a substantial amount of time and effort in engaging researchers with RDM by moving away from compliance arguments, to the extent that we seem to have forgotten that compliance- and research integrity-based advocacy is necessary to ensure the buy-in of senior leadership.

In addition, while trying to move quickly with service development, and at the same time trying to gain trust and engagement from the various stakeholder groups at Cambridge, we ended up taking part in various projects and undertakings that were sometimes only loosely connected to RDM. As a result, some of the activities lacked strategic focus, and a lot of time was needed to re-define what the RDM service is and what it is not, in order to ensure that the expectations of the various stakeholder groups could be properly managed.

What could have been done differently

There are a number of things that could have been done differently and more effectively. Firstly, and to address the main problem of insufficient engagement with senior leadership, we could have introduced dedicated, short sessions for principal investigators on ensuring effective research data management and research reproducibility across their research teams. Senior researchers are ultimately those who make decisions at research-intensive institutions, and therefore their buy-in and their awareness of the value of good RDM practice are necessary for achieving financial sustainability of RDM services.

In addition, it would have been valuable to set aside time for strategic thinking and for defining (and re-defining, as necessary) the scope of RDM services. This is also related to the overall branding of the service. In Cambridge a lot of initial harm was done by the negative association between Open Access to publications and RDM. Because of overarching funders’ and government’s requirements for Open Access to publications, many researchers came to perceive Open Access to publications merely as a necessary compliance condition. The advocacy for RDM at Cambridge started from ‘Open Data’ requirements, which led many researchers to believe that RDM was yet another requirement to comply with, and that it was only about open sharing of research data. It took us a long time to change the messages and to rebrand the service as one that supports researchers in their day-to-day research practice, and to emphasise that proper management of research data leads to efficiency savings. Finally, only research data which are managed properly from the very start of the research process can then be easily shared at the end of the project.

Finally, and also related to focusing and defining the service, it would have been useful to decide on a benchmarking strategy from the very beginning of the service’s creation. What are the goals of the service? Is it to increase the number of shared datasets? Is it to improve day-to-day data management practice? Is it to ensure that researchers know how to use novel tools for data analysis? And, once the goals are decided, design a strategy to benchmark progress towards achieving them. Otherwise it can be challenging to decide which projects and undertakings are worth continuing, and which are less successful and should be revised or discontinued. In order to address one aspect of benchmarking, Cambridge led the creation of an international group developing a benchmarking strategy for RDM training programmes, which aims to create tools for improving RDM training provision.

Final reflections

My final reflection is to re-iterate that the questions asked of me by the workshop leaders at the Jisc RDN meeting really inspired me to think more holistically about the work done towards the development of RDM services at Cambridge. Looking forward, I think asking oneself the very same three questions – what went well, what did not go so well and what you would do differently – might become a useful regular exercise for ensuring that RDM service development is well balanced and on track towards its intended goals.


Published 24 August 2017
Written by Dr Marta Teperek

Creative Commons License