POV: Why Science Should Be (Even More) Open
Sharing research data is vital

Although it presents many new challenges, the basic idea of sharing research data is in keeping with the scientific tradition of openness and collegiality that dates back to the founding of the Royal Society in 1660 by early scientists. Photo courtesy of Flickr contributor Peter Hopper
“POV,” a new addition to BU Today, is an opinion page that provides timely commentaries from students, faculty, and staff on a variety of issues: on-campus, local, state, national, or international. Anyone interested in submitting a piece, which should be about 700 words long, should contact Rich Barlow at barlowr@bu.edu.
Isaac Newton, revered as one of our greatest scientists because of his pioneering work in calculus, gravity, and optics, never called himself a scientist—the word was not coined until 100 years after his death. He would, however, have considered himself an alchemist, as he spent much of his time studying and writing about alchemy. One major difference between early scientists and alchemists was their degree of openness. Alchemy emphasized secret, hidden knowledge, while the sciences had a culture of open sharing.
Like alchemists, early scientists shared their knowledge through personal connections and letters, which is not surprising, since many of them were alchemists. But the founding of the Royal Society in 1660 gave protoscientists a public forum for sharing their discoveries. In 1665 the society began publication of the Philosophical Transactions of the Royal Society, the first modern, peer-reviewed scientific journal. Still being published today, it is available online through the BU Libraries.
By publishing their findings, scientists received credit for their discoveries and recognition from their colleagues in exchange for making their work public. This culture of scientific sharing through journals continues; scientists are often judged by the quantity and quality of their publications, and by how often their works are cited by other scientists. This publication record has a major (perhaps excessive) impact on hiring and tenure decisions. Scientists also share in other, informal ways, such as laboratories sharing tissue samples with other researchers.
Computers and the internet have changed traditional scientific publishing and communication, as the economic model built around the printing and distributing of journals does not apply to the digital age. One response to these changes is the open access movement. Harvard faculty led the way in adopting a policy that requires that articles written by faculty be freely available; BU faculty followed with a weaker, but still significant policy of encouraging, not requiring, open access. There has also been a trend towards requiring that the results of publicly financed research be freely available, based on the idea that the public paid for the research and has a right to see the results. This idea is reflected in the 2008 mandate from the National Institutes of Health (NIH) that all peer-reviewed journal articles about NIH-funded research be made available through the online database PubMed Central.
The open access movement, while new, retains the journal article as the center of scientific culture and the primary method of sharing results. Another movement, advocating the wider sharing of scientific data, seeks to bring another kind of openness to science. There has always been some sharing of scientific data, often among informal networks of colleagues in similar fields, not unlike networks of alchemists sharing their secrets with the select few.
Now, digital technologies have made wider sharing of large amounts of data practical, and in some fields this online sharing has become standard. One example is gene and protein sequences available through the National Center for Biotechnology Information (NCBI). But the data being shared are just a drop compared to the ocean of research data being generated by scientists worldwide.
Sharing research data strengthens science in several ways. At a basic level, it can help identify errors and even fraud, and assist scientists attempting to reproduce the original study. In other cases, the data can be used in new and original ways, especially important when gathering new data would be problematic, as in studies of fragile ecosystems or when redoing a study would be prohibitively expensive. Sharing data can also provide new avenues for prestige and recognition in the scientific community; the original researcher would be acknowledged in any new research based on the data, and at least one study has shown that sharing the data behind an article increases the article’s citation rate.
As with open access, sharing of data is beginning to be mandated by funding agencies. The NIH requires in many cases that recipients of grants share their final research data, and in 2011 the National Science Foundation (NSF) mandated that all grant applications include a data management plan, with policies for the reuse and redistribution of data.
These data-sharing requirements are going to increase. In February 2013, a White House Office of Science and Technology Policy memorandum directed all federal agencies with more than $100 million in annual research expenditures to develop plans to make the published results, and in many cases the data, of the research they fund freely available.
In response to these developments, the culture of science needs to shift toward more open sharing of research data. It won’t be easy; legitimate concerns about issues such as confidentiality and ownership of intellectual property must be addressed. There will be resistance because many scientists feel territorial about their data, and worry the data might be misunderstood or misused if made public.
Major research institutions such as BU need to support scientists in making this change. The University has already taken some steps, but it needs to do more.
- Researchers at BU need support in developing and implementing the data management plans required by the NIH, the NSF, and other funding agencies. BU Libraries, in consultation with other offices, has created a research data management website that provides information and practical advice on creating such plans, and also offers classes and consultation on data management. Similarly, Information Services & Technology provides the technical infrastructure for many researchers to store and manage their data and intends to expand its offerings in this area.
- BU researchers need places to share their data. No single solution fits all research data, which can range in size from a few megabytes to several petabytes and come in a bewildering array of formats. In many cases, discipline-specific repositories such as NCBI are the appropriate vehicle for sharing data. Some of these repositories are government-funded, others are supported by consortia of research institutions, and BU should participate in such consortia where feasible.
- Because appropriate repositories don’t yet exist for all data, there are some cases where the researcher’s institution needs to step up and provide a home for the data. The Association of Research Libraries, with others, has put forward a strong proposal for a Shared Access Research Ecosystem (SHARE), “a network of digital repositories at universities, libraries, and other research institutions across the United States that will provide long-term public access to federally funded research articles and data.”
- BU should support the SHARE project, and a good first step would be expanding the capabilities of OpenBU, our institutional repository, so that it can support research data as well as articles.
- BU researchers need a clear policy on how the new costs of data management and sharing will be met. Funding agencies are reluctant to allow specific line items for data management in a grant, considering it part of overhead. So it will be up to BU as an institution to provide and pay for the infrastructure required.
- Finally, BU researchers need the University to recognize the importance of data sharing to the advancement of scientific knowledge. One way would be to consider a researcher’s data—whether it is shared, how often it is cited—as well as publications in hiring and tenure decisions.
Although it presents many new challenges, the basic idea of sharing research data is in keeping with the scientific tradition of openness and collegiality that dates back to the founding of the Royal Society, if not before, and that helped distinguish science from alchemy. This shift in the culture of science should be embraced rather than feared.
David Fristrom, head of the Science & Engineering Library, can be reached at fristrom@bu.edu.
BU Today reserves the right to reject or edit submissions. The views expressed are solely those of the author and are not intended to represent the views of Boston University.