Data Management

A. Agreement

– NPMTI is a multi-faceted, complex initiative aimed at developing a national, multi-crop disease forecasting tool. The network of scientists from Land Grant universities, USDA-ARS, and cooperating national laboratories involved in the Initiative must therefore work cooperatively in sharing and publishing data. Primary data, which are collected in field research plots, are needed for secondary analysis (e.g., modeling) by members of NPMTI. In addition, it is anticipated that users (private and public entities) outside of the network will request access to data generated by the Initiative.

B. Public Access to Data

– is generally expected within 4 years of project initiation. Once data are publicly available, outside users may use them by citing or acknowledging the data set; no additional permission is needed.

C. Guidelines

– presented below apply to use of data by NPMTI members during the project period:

1.    Data – There are two types of NPMTI data, defined below. Anyone who creates data, whether primary or secondary, will be considered a “Data Creator.” Anyone who wishes to use NPMTI data will be referred to as a “Data User.”

a.    Primary Data: These are typically field data, such as data regarding pathogen populations and disease development. Primary data are collected via an NPMTI experimental protocol, which typically specifies the data collection method, structure, storage, and process for sharing. These data will be aggregated in NPMTI data repositories accessible to the full Initiative, using a data model that attributes the Data Creator.

b.    Secondary Data: When primary data undergo a significant transformation through, for instance, data analysis or modeling, the outputs are considered secondary data. In this instance, the data analyst or modeler is considered the Data Creator of these secondary data and receives the same attribution rights and responsibilities afforded to primary Data Creators. Secondary data will also be aggregated in NPMTI repositories accessible to the full Initiative, using a data model that attributes the Data Creator, or the source in the case of weather data and historic outbreak data.

2.    Software – We will also develop and share software in NPMTI. This includes digital tools, machine learning and other algorithms, and decision support tools developed by NPMTI members. The software will be available in an NPMTI software repository accessible to the full team, with documentation that attributes the original Software Creators. Anyone who wishes to use NPMTI software will be referred to as a “Software User.”

3.    Quality Assurance – The role of the NPMTI Executive Committee is to identify quality, content, and format standards for data and software. The role of the Data Creator is to assure that the data are of the highest quality, with no known errors and no changes expected once they are uploaded to the Initiative’s database. Similarly, it is the role of the Software Creator to assure that the software is of the highest quality, with appropriate error handling, change management, and documentation. Further, since some software will produce secondary data, it is the role of both the Data Creator and the Software Creator to assure that the secondary data are “clean.”
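
As an illustration only, the attribution requirement in the guidelines above could be represented along the following lines. This is a minimal sketch, not the Initiative's actual data model: the class name DataRecord, all field names, and the validate function are hypothetical, and the checks shown do not stand in for the standards to be set by the Executive Committee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataRecord:
    """One dataset entry in a hypothetical repository (all names are illustrative)."""
    record_id: str
    kind: str                      # "primary" or "secondary"
    creator: Optional[str] = None  # the Data Creator, when a person or team made the data
    source: Optional[str] = None   # external source, e.g. weather or historic outbreak data
    derived_from: tuple = ()       # record_ids that a derived record was built from

    def attribution(self) -> str:
        """Return the name to credit: the Data Creator if set, otherwise the source."""
        credited = self.creator or self.source
        if credited is None:
            raise ValueError(f"record {self.record_id} has no creator or source")
        return credited


def validate(record: DataRecord) -> list:
    """Collect basic problems; an empty list means the record passes these checks."""
    problems = []
    if record.kind not in ("primary", "secondary"):
        problems.append("kind must be 'primary' or 'secondary'")
    if record.creator is None and record.source is None:
        problems.append("missing attribution (creator or source)")
    return problems


# Example records with placeholder names.
field = DataRecord("plot-001", "primary", creator="A. Researcher")
model_output = DataRecord("model-001", "secondary", creator="B. Modeler",
                          derived_from=("plot-001",))
weather = DataRecord("wx-001", "secondary", source="Example weather service")
```

In this sketch, a secondary record keeps its own Data Creator while derived_from points back to the primary records it was built from, so both the field researcher and the modeler remain attributable.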