Tu Yong (涂勇) blog http://blog.sciencenet.cn/u/tuic Scientific data sharing, information ecology


Science is data, and data are science: Science magazine's new measures in data management

2013-3-20 10:45 | Category: Research notes | Keywords: SOM, Science magazine, data storage

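Science is data-driven, and it increasingly relies on data of many different kinds. Against this background, Brooks Hanson, Deputy Editor for physical sciences at Science, together with Andrew Sugden and Bruce Alberts, published an editorial titled "Making Data Maximally Available". Overall, Science has laid out detailed rules for the data in its papers, and some of the practices and ideas in the editorial (such as public databases and SOM) can serve as a reference for data preservation at leading Chinese journals. Science is also placing stricter requirements on authors. First, the data to be provided now extends to computer code, so that the research process can be reproduced as far as possible. Second, supporting resources beyond the main-text references will be made public, and authors must provide a statement about their data as part of the acknowledgements. These requirements will increase authors' workload, but they will make it easier for readers to obtain related resources and will recognize the contribution of data producers. That helps integrate data with the literature and further promotes rigor in scholarly papers.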

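In terms of specific policy, journals have adopted stricter measures for preserving scientific data. Besides moving publication online, they have added supporting online material (SOM) to expand data presentation and availability.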

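Science encourages authors to provide their data in one of two ways: by depositing the data in a public database or, when no such database is available, by including the data in the SOM. For very large databases without a plausible home, Science requires authors to sign an archiving agreement, under which the author archives the data on an institutional website and a copy of the data is held at Science. Such an agreement is only a stopgap; permanent, community-maintained archives are needed to support data preservation.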

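As data and analyses have grown more complex, Science is extending its data access requirement to the computer code involved in creating or analyzing the data. Science will also ask authors to produce a single list that combines the references from the main paper and the SOM (this list will be available in the online version). It will provide a template that constrains the SOM to methods and data descriptions, as an aid to reviewers and readers. Authors will also be asked to include, as part of their acknowledgements, a specific statement on the availability and curation of the data. Science recognizes that exceptions to these general requirements may be needed, for example to protect the privacy of individuals, when data or materials were obtained from third parties, or for security reasons.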


The full English text follows: http://www.sciencemag.org/content/331/6018/649.full

Making Data Maximally Available

Brooks Hanson1, Andrew Sugden2, Bruce Alberts3

1Brooks Hanson is Deputy Editor for physical sciences at Science.

2Andrew Sugden is Deputy Editor for biological sciences and International Managing Editor at Science.

3Bruce Alberts is Editor-in-Chief of Science.



Science is driven by data. New technologies have vastly increased the ease of data collection and consequently the amount of data collected, while also enabling data to be independently mined and reanalyzed by others. And society now relies on scientific data of diverse kinds; for example, in responding to disease outbreaks, managing resources, responding to climate change, and improving transportation. It is obvious that making data widely available is an essential element of scientific research. The scientific community strives to meet its basic responsibilities toward transparency, standardization, and data archiving. Yet, as pointed out in a special section of this issue (pp. 692–729), scientists are struggling with the huge amount, complexity, and variety of the data that are now being produced.


Recognizing the long shelf-life of data and their varied applications, and the close relation of data to the integrity of reported results, publishers, including Science, have increasingly assumed more responsibility for ensuring that data are archived and available after publication. Thus, Science and other journals have strengthened their policies regarding data, and as publishing moved online, added supporting online material (SOM) to expand data presentation and availability. But it is a growing challenge to ensure that data produced during the course of reported research are appropriately described, standardized, archived, and available to all.




Science's policy for some time has been that “all data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science” (see www.sciencemag.org/site/feature/contribinfo/). Besides prohibiting references to data in unpublished papers (including those described as “in press”), we have encouraged authors to comply in one of two ways: either by depositing data in public databases that are reliably supported and likely to be maintained or, when such a database is not available, by including their data in the SOM. However, online supplements have too often become unwieldy, and journals are not equipped to curate huge data sets. For very large databases without a plausible home, we have therefore required authors to enter into an archiving agreement, in which the author commits to archive the data on an institutional Web site, with a copy of the data held at Science. But such agreements are only a stopgap solution; more support for permanent, community-maintained archives is badly needed.


To address the growing complexity of data and analyses, Science is extending our data access requirement listed above to include computer codes involved in the creation or analysis of data. To provide credit and reveal data sources more clearly, we will ask authors to produce a single list that combines references from the main paper and the SOM (this complete list will be available in the online version of the paper). And to improve the SOM, we will provide a template to constrain its content to methods and data descriptions, as an aid to reviewers and readers. We will also ask authors to provide a specific statement regarding the availability and curation of data as part of their acknowledgements, requesting that reviewers consider this a responsibility of the authors. We recognize that exceptions may be needed to these general requirements; for example, to preserve the privacy of individuals, or in some cases when data or materials are obtained from third parties, and/or for security reasons. But we expect these exceptions to be rare.


As gatekeepers to publication, journals clearly have an important part to play in making data publicly and permanently available. But the most important steps for improving the way that science is practiced and conveyed must come from the wider scientific community. Scientists play critical roles in the leadership of journals and societies, as reviewers for papers and grants, and as authors themselves. We must all accept that science is data and that data are science, and thus provide for, and justify the need for the support of, much-improved data curation.





https://blog.sciencenet.cn/blog-376111-672075.html
