How to Contribute Data to MaizeGDB

MaizeGDB accepts data that meets these criteria:

  • Data adheres to the FAIR data principles
  • Data and metadata have been published in a publicly available scientific journal
  • Primary data is deposited in the appropriate repository (like NCBI)
  • Data is licensed as public data (See https://creativecommons.org/licenses).
  • Comprehensive metadata (information about your dataset) is provided and meets MaizeGDB standards for the data type
  • Additional criteria may be required depending on the data type

MaizeGDB will accept data directly from researchers. Some types of data must also be deposited at standard repositories, as MaizeGDB is not a permanent repository for primary data. The MaizeGDB team can advise on the correct repository and help with the submission, if needed. More detailed information on criteria by data type is below.


It is particularly important that all sequence data be submitted to NCBI's GenBank (US), EBI's ENA (Europe), or the DDBJ (Japan). These repositories share all of their data on a daily basis, so data submitted to any one of them will be visible at all three.


BECOME A COMMUNITY CURATOR
Members of the maize community are also encouraged to become community curators. Community curators can contribute comments to most data types, including gene loci, gene models, markers, et cetera. To create a community annotator account, follow the link at the top of the page marked "login/register" and check the box labeled "I am interested in being a MaizeGDB curator" when you fill out the form to "Create an Annotation Account". You will be contacted via e-mail when your account is activated.

If you already have a MaizeGDB login but are not a community curator, click here to request permission to be a curator.



How to contribute:


NEW GENES, GENE FUNCTION, AND GENE TO GENE MODEL ASSOCIATIONS
If your research identifies new genes, provides evidence of gene function, and/or links gene loci to gene models, please submit this information directly to MaizeGDB rather than relying on MaizeGDB curators to read your paper and extract the information for loading into the database. Direct submission helps MaizeGDB and the maize community and increases the visibility of your research. Far more maize papers are published each month than can be curated. Send your paper to a MaizeGDB curator to have your gene function information loaded into the MaizeGDB database.

Notes can be added directly to records at MaizeGDB by researchers. To add a note, you will need a community curation account. Log in to the site using the login/register link displayed at the top of any MaizeGDB page. Once logged in, click "Add free text annotation" in the annotation section of most data record displays.


GENOME ASSEMBLIES
Preparing a genome assembly for hosting at MaizeGDB is a lengthy process. The best practice is to contact a MaizeGDB sequence curator before submitting your assembly to GenBank, so we can work together to ensure you comply with MaizeGDB and long-term repository requirements. The MaizeGDB metadata template contains information required by GenBank (also ENA and DDBJ) for genome assembly submissions, along with additional information required by MaizeGDB for hosting genomes.

To be hosted at MaizeGDB, Genome Assemblies must:
  • Comply with nomenclature protocols established by the maize community and MaizeGDB. Information about naming assemblies and gene models is available here. Please contact a MaizeGDB curator to get an identifier for your genome assembly and annotation.
  • Use chromosome names prefaced with "chr", for example, chr1, chr2, et cetera. Unplaced scaffolds should be named individually and NOT combined into one sequence. That is, there should be no "chr0" or "chrUnplaced"; GenBank will not accept such combined sequences.
  • Be submitted to GenBank, EBI's ENA or DDBJ. All three share data daily. Submitting contigs is not sufficient - the assembly itself must be submitted. This will involve creating BioSample, BioProject, and WGS records. Be as complete and as accurate as possible.
  • A MaizeGDB Genome Assembly and Annotation Metadata Template must be filled in. Request a template from a MaizeGDB sequence curator or download the template from here.
  • If possible, sibling seed and pedigree should be deposited at GRIN. The accession should be specified in the metadata.
  • A contact person from your team/consortium should be assigned to coordinate with MaizeGDB staff.
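
The chromosome naming rules in the list above can be checked before submission with a short script. The following is a minimal sketch, not an official MaizeGDB or GenBank validator: the function names `fasta_names` and `check_names` are hypothetical, and the only rules checked are the ones stated above (a "chr" prefix on numbered chromosomes, no pooled "chr0" or "chrUnplaced" sequence) plus a duplicate-name check added as an assumption.

```python
# Names the guidelines explicitly rule out for pooled unplaced sequence.
FORBIDDEN = {"chr0", "chrunplaced"}


def fasta_names(lines):
    """Yield the sequence name (first token after '>') of each FASTA header."""
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            yield tokens[0] if tokens else ""


def check_names(names):
    """Return a list of human-readable problems found in the sequence names."""
    problems = []
    seen = set()
    for name in names:
        # Assumption: every sequence in an assembly needs a unique name.
        if name in seen:
            problems.append(f"duplicate sequence name: {name!r}")
        seen.add(name)
        if name.lower() in FORBIDDEN:
            problems.append(f"pooled unplaced sequence not allowed: {name!r}")
        elif name.isdigit():
            # A bare number such as "1" is missing the "chr" prefix.
            problems.append(f"chromosome name should start with 'chr': {name!r}")
    return problems


if __name__ == "__main__":
    example = [">chr1 Zea mays", "ACGT", ">2", "ACGT", ">chr0", "NNNN"]
    for problem in check_names(fasta_names(example)):
        print(problem)
```

To check a real assembly, pass an open file handle to `fasta_names` in place of the example list. A clean result here does not guarantee acceptance by GenBank, ENA, or DDBJ, which apply their own checks.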

The assembly metadata template requires BioSample and BioProject accessions from GenBank, ENA, or DDBJ.
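
Before filling in the template, it can help to confirm that the accessions look like BioProject and BioSample accessions. The sketch below assumes the commonly seen prefixes (PRJNA, PRJEB, PRJDB for BioProject; SAMN, SAME, SAMD for BioSample). The dictionary keys `bioproject` and `biosample` are hypothetical and are not the field names of the MaizeGDB template. A pattern match only catches typos; it does not confirm that a record exists.

```python
import re

# Commonly seen accession prefixes at the three partner repositories:
#   BioProject: PRJNA (NCBI), PRJEB (ENA), PRJDB (DDBJ)
#   BioSample:  SAMN (NCBI), SAME/SAMEA (ENA), SAMD (DDBJ)
BIOPROJECT = re.compile(r"^PRJ(NA|EB|DB)\d+$")
BIOSAMPLE = re.compile(r"^SAM(N|E[A-Z]?|D)\d+$")

# Hypothetical field names; the real template may label these differently.
PATTERNS = {"bioproject": BIOPROJECT, "biosample": BIOSAMPLE}


def accession_problems(metadata):
    """Return the fields whose accession is missing or does not match."""
    problems = []
    for field, pattern in PATTERNS.items():
        value = (metadata.get(field) or "").strip()
        if not value:
            problems.append(f"{field}: missing")
        elif not pattern.match(value):
            problems.append(f"{field}: unexpected format {value!r}")
    return problems


if __name__ == "__main__":
    # Placeholder accessions for illustration only, not real records.
    print(accession_problems({"bioproject": "PRJNA000000", "biosample": "SAM123"}))
```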

To submit your data to EBI's ENA, see instructions here. Note that ENA submission requires command line skills as there is no longer a web portal for submitting genome assemblies.

To submit your data to GenBank, see these tutorials. Note that NCBI checks for vector/primer contamination, so you may need to warn NCBI in advance about known chloroplast sequence in the maize nuclear genome.


ANNOTATIONS
Genome annotations (gene models) can be hosted at MaizeGDB in the form of browser tracks and/or downloads of GFF or FASTA files. It is best to have both transcript and protein FASTA. Additional data is also accepted, for example, SNP alignments, orthologs, et cetera. Contact a MaizeGDB sequence curator for more information.

See the nomenclature guidelines for naming your gene models.


OTHER NUCLEOTIDE SEQUENCES, INCLUDING INDIVIDUAL GENES


PROTEIN SEQUENCES

Protein sequences should be submitted to GenBank, EBI's UniProt, or the DDBJ.


NEXT GENERATION SEQUENCE READS
Next generation sequence reads should be submitted to NCBI's Sequence Read Archive (SRA).


MAPPED SEQUENCE READS AND OTHER EXPRESSION DATA
Gene expression data should be submitted to NCBI's Gene Expression Omnibus (GEO).

MaizeGDB can also host gene and protein expression tracks on genome browsers. Contact a MaizeGDB team member for more information.


MAIZE SNPS
NCBI's dbSNP no longer accepts non-human SNPs. Maize SNPs should be submitted to EBI's EVA. It is important to submit SNPs to EVA because EVA will provide permanent identifiers for each SNP, and will collapse identical SNPs into one record, maintaining the original submission identifier as well as a consensus identifier.

MaizeGDB will also host tracks of aligned SNPs on the genome browsers. Note that MaizeGDB does not have sufficient personnel to do the alignments. Contact a MaizeGDB team member for more information.


GENOTYPE AND PHENOTYPE DATA

MaizeGDB has a long history of curating genotype and phenotype data from individual genes or small sets of genes. Detailed lab-driven data about any gene is important to the maize community, as this is the best functional data we can have. We want to curate as much of this type of data as possible. This includes detailed descriptions of mutant phenotypes and phenotypic changes to mutant expression in various genotypes. This type of data can be submitted by email to the MaizeGDB curators.


MAPS

The MaizeGDB team welcomes genetic maps. Please contact a MaizeGDB curator for more information.


METABOLOMICS, IONOMICS AND OTHER DATA TYPES

Contact a MaizeGDB team member if you have a dataset you would like hosted at MaizeGDB which is not listed here. We will work with you to see if and how your data can be hosted. To learn more about data repositories for these types of data, please see the FAIR data page.



FAQs

WHERE DOES THE DATA STORED AT MAIZEGDB COME FROM?
  • The original data was inherited from the MaizeDB and ZmDB projects.
  • Sequence data comes from GenBank, genome assembly and annotation groups, and other research groups that are producing genomic, transcriptomic, and proteomic sequence data for maize.
  • Other types of bulk data are contributed by community members, usually in standard file formats such as GFF, VCF, and BED, and are added to the database by members of the MaizeGDB Team.
  • Public data from published literature are hand curated and entered record-by-record by MaizeGDB and community curators.


HOW DO I BECOME A COMMUNITY CURATOR?


HOW CAN COMMUNITY MEMBERS CONTRIBUTE DATA?
MaizeGDB staff members regularly attend the Plant and Animal Genome Conference in San Diego, California and the Annual Maize Genetics Conference. To schedule a meeting at either of these conferences, use the feedback form at the top of this page to contact the MaizeGDB team, or contact a specific MaizeGDB member.

To request hosting for your data, contact MaizeGDB directly using the feedback button at the top of the page, or contact a specific MaizeGDB member.

Notes can be added directly to records at MaizeGDB by researchers. To add a note, you will need a community curation account. Log in to the site using the login/register link displayed at the top of any MaizeGDB page. Once logged in, click "Add free text annotation" in the annotation section of most data record displays.

Although researchers are encouraged to use a standard long-term repository that is appropriate for the type of data, large datasets can be made available through MaizeGDB by special arrangement. Use the feedback form at the top of this page to contact the MaizeGDB team or contact a specific MaizeGDB member to find out what arrangements can be made to accommodate your data.

If possible, it is best to contact the MaizeGDB Team before you begin to generate large datasets so that a standardized format can be agreed upon and so that a customized pipeline can be created for importing your data in a timely and efficient manner.


WHEN CAN I EXPECT DATA I GENERATED TO APPEAR AT MAIZEGDB?
The MaizeGDB database is typically updated on the first Tuesday of each month.

If you have not contacted us to make specific arrangements to accommodate your data, you should expect it to appear at the MaizeGDB site only if our data curators have curated your paper, usually because it was recommended by the Editorial Board. Use the feedback form at the top of this page to contact the MaizeGDB team to find out what arrangements can be made to make your data available through MaizeGDB.


THE AGENCIES THAT FUND OUR RESEARCH HAVE ENCOURAGED ME TO CONTRIBUTE DATA TO MAIZEGDB. WHAT CAN I DO TO ENSURE THAT MAIZEGDB WILL TAKE MY DATA?
If you wish to contribute a large dataset, you should contact the MaizeGDB team to make special arrangements for its inclusion at MaizeGDB. Note that contacting MaizeGDB personnel well in advance and committing funds from your grants to cover the cost of personnel to curate your data into MaizeGDB are the best ways of ensuring that MaizeGDB can accommodate your requests for data storage.

In general, if the data you are generating have historically been stored at MaizeGDB (e.g., your project is planning to generate genetic maps using a new set of probes), it is very easy for us to commit to including your data in the database. However, if you are proposing to create data of a type that is not currently stored at MaizeGDB, more work would be required of the staff at MaizeGDB (e.g., it may be necessary to make new tables in which to store your data and new data displays would be needed for the website).

Unless you have contacted the MaizeGDB team, please do not assume that we can accommodate your data. We are happy to make special arrangements to create new tables and data displays (e.g., we collaborate with the FSU Cytogenetic Map of Maize Project to make their cytological images and data available).

In summary, we encourage you to contact us prior to reporting to the funding agencies that we will take any and all maize data your project plans to generate.


WHAT SHOULD I DO IF I AM PLANNING A GENOME ASSEMBLY PROJECT OR PLANNING TO ANNOTATE THE MAIZE GENOME?