Getting Data Shared – the next step and the real work begins

A commentary (linked here) in today’s Nature Genetics, by a group of which I am a part, covers something I care a lot about: sharing data effectively. Although it’s mostly social engineering, we describe how we have built a standards-based approach, who is using it, and how it helps us at the Harvard Stem Cell Institute. My view is that it’s time to take ‘science commons’, i.e. structured interaction amongst large groups of scientists with a common purpose, as a serious paradigm that will actually bear fruit. My money’s on this approach, and the Harvard Stem Cell Institute is helping support my idea by building (still in the early stages) an HSCI commons.

“Reformatting data is a full-time job for many researchers, even before the minimum reporting guidelines, terminologies and formats of each field are taken into consideration. In this issue, we present a Commentary and a Perspective suggesting solutions to these problems that have been developed by a process of community consultation and open review to which the journal was a party. In the Commentary, Susanna-Assunta Sansone and colleagues identify one central problem, namely that “most repositories are designed for specific assay types, necessitating the fragmentation of complex datasets,” and they offer a unified view of the metadata formatting that will be needed to ensure that biomedical research datasets become interoperable. This solution is the overarching ISA framework, where the acronym stands for ‘Investigation’ (the project context), ‘Study’ (a unit of research) and ‘Assay’ (analytical measurement) (p 121). This proposal shifts the sets of reporting standards agreed upon by each community into the infrastructure and formatting of the data files themselves. Sansone and colleagues also list a set of participant communities that can pioneer the approach and teach by example.”

Link to the announcement at the Harvard School of Public Health

Sansone, S., Rocca-Serra, P., Field, D., Maguire, E., Taylor, C., Hofmann, O., Fang, H., Neumann, S., Tong, W., Amaral-Zettler, L., Begley, K., Booth, T., Bougueleret, L., Burns, G., Chapman, B., Clark, T., Coleman, L., Copeland, J., Das, S., de Daruvar, A., de Matos, P., Dix, I., Edmunds, S., Evelo, C., Forster, M., Gaudet, P., Gilbert, J., Goble, C., Griffin, J., Jacob, D., Kleinjans, J., Harland, L., Haug, K., Hermjakob, H., Sui, S., Laederach, A., Liang, S., Marshall, S., McGrath, A., Merrill, E., Reilly, D., Roux, M., Shamu, C., Shang, C., Steinbeck, C., Trefethen, A., Williams-Jones, B., Wolstencroft, K., Xenarios, I., & Hide, W. (2012). Toward interoperable bioscience data. Nature Genetics, 44(2), 121-126. DOI: 10.1038/ng.1054
