A commentary (linked here) in today’s Nature Genetics, by a group of which I am a part, covers something I care a lot about: sharing data effectively. Although it’s mostly social engineering, we describe how we have built a standards-based approach, who is using it, and how it helps us at the Harvard Stem Cell Institute. My view is that it’s time to take ‘science commons’, i.e. structured interaction amongst large groups of scientists with common purpose, as a serious paradigm that will actually bear fruit. My money’s on this approach, and the Harvard Stem Cell Institute is helping to support my idea of building (still in its early stages) an HSCI commons.
“Reformatting data is a full-time job for many researchers, even before the minimum reporting guidelines, terminologies and formats of each field are taken into consideration. In this issue, we present a Commentary and a Perspective suggesting solutions to these problems that have been developed by a process of community consultation and open review to which the journal was a party. In the Commentary, Susanna-Assunta Sansone and colleagues identify one central problem, namely that “most repositories are designed for specific assay types, necessitating the fragmentation of complex datasets,” and they offer a unified view of the metadata formatting that will be needed to ensure that biomedical research datasets become interoperable. This solution is the overarching ISA framework, where the acronym stands for ‘Investigation’ (the project context), ‘Study’ (a unit of research) and ‘Assay’ (analytical measurement) (p 121). This proposal shifts the sets of reporting standards agreed upon by each community into the infrastructure and formatting of the data files themselves. Sansone and colleagues also list a set of participant communities that can pioneer the approach and teach by example.”

Links to the announcement at Harvard School of Public Health
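For readers who think in code, the three ISA levels named in the editorial can be pictured as a simple nested structure. What follows is only a rough sketch of that hierarchy: the class and field names (`measurement_type`, `technology`, `assay_technologies`) and the example project are my own hypothetical choices for illustration, not the ISA-Tab file format or any real ISA software API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    # One analytical measurement (hypothetical fields, not the ISA spec).
    measurement_type: str
    technology: str


@dataclass
class Study:
    # A unit of research; it may group assays of different types.
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    # The overall project context that ties related studies together.
    title: str
    studies: List[Study] = field(default_factory=list)

    def assay_technologies(self) -> List[str]:
        # Collect the distinct technologies used anywhere in the investigation,
        # in the order they are first encountered.
        seen: List[str] = []
        for study in self.studies:
            for assay in study.assays:
                if assay.technology not in seen:
                    seen.append(assay.technology)
        return seen


# A made-up example: one study that mixes two assay types under one project.
example = Investigation(
    title="Hypothetical stem cell differentiation project",
    studies=[
        Study(
            title="Time course of differentiation",
            assays=[
                Assay("transcription profiling", "DNA microarray"),
                Assay("metabolite profiling", "mass spectrometry"),
            ],
        )
    ],
)
```

The point of the sketch is that a study with mixed assay types stays together under a single investigation, rather than being fragmented across repositories that each accept only one assay type.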
Sansone, S., Rocca-Serra, P., Field, D., Maguire, E., Taylor, C., Hofmann, O., Fang, H., Neumann, S., Tong, W., Amaral-Zettler, L., Begley, K., Booth, T., Bougueleret, L., Burns, G., Chapman, B., Clark, T., Coleman, L., Copeland, J., Das, S., de Daruvar, A., de Matos, P., Dix, I., Edmunds, S., Evelo, C., Forster, M., Gaudet, P., Gilbert, J., Goble, C., Griffin, J., Jacob, D., Kleinjans, J., Harland, L., Haug, K., Hermjakob, H., Sui, S., Laederach, A., Liang, S., Marshall, S., McGrath, A., Merrill, E., Reilly, D., Roux, M., Shamu, C., Shang, C., Steinbeck, C., Trefethen, A., Williams-Jones, B., Wolstencroft, K., Xenarios, I., & Hide, W. (2012). Toward interoperable bioscience data. Nature Genetics, 44(2), 121-126. DOI: 10.1038/ng.1054