What is open data?

Open access is not just for publications.  Indeed, access to the data that supports an article may be as important as access to the article itself.  Open data is research data that is freely available online for anyone to download, copy, and reuse, with no financial, legal or technical barriers.

Open data enhances the reproducibility and transparency of research by allowing other investigators to verify authors’ findings.  Freely available data also accelerates scientific discovery by allowing anyone to analyze the data in ways that its creators did not anticipate.

Adapted from the Scholarly Publishing and Academic Resources Coalition (SPARC).

Where can I find open data?

You may be familiar with freely available data from state and national government organizations and surveys.  Examples include the National Cancer Institute Genomic Data Commons, a data sharing and analysis platform that provides genomic datasets and the tools to analyze them, and the National Health and Nutrition Examination Survey (NHANES), a series of studies that assess the health and nutritional status of Americans.

Increasingly, research institutes, projects, labs and individuals are making their data freely available, either because a journal or funder requires it or simply because they want others to reuse their work (and get credit when they do!).  Freely available data can be found in many data repositories, which store and preserve data and provide long-term access to it.

For a local twist on open data, check out Analyze Boston, where you can find freely available datasets from the city of Boston, or the Personal Genome Project, a project started at Harvard Medical School that invites participants to publicly share their personal genetic, health and trait data.

If you need help finding open data, or want to learn more about making your data freely available, please email us at hhsl@tufts.edu.

Post contributed by Laura Pavlech
