How to submit data to ClinVar

ClinVar welcomes submissions from clinical testing labs, researchers, locus-specific databases, expert panels, and professional societies. The scope of the submission may be as small as a single variant. The database has a flexible data model, so submissions may be minimal or very detailed. Submissions may be structured to provide summary data about a variant (variant level/aggregate data) or reports about variants per case (case-level) as long as the combination of variants per individual is not identifiable according to NIH guidelines.

If you wish to obtain accessions from ClinVar for publication and keep your data private until publication, please contact us at clinvar@ncbi.nlm.nih.gov and we will assign accessions for you. These accessions must be included in your final submission. You may also submit your data after publication; if that is your preference, please reference your citation in your original submission.

ClinVar assumes that the submitter has obtained appropriate consent for the level of data being submitted, which will be available for unrestricted distribution.

Novel variants submitted to ClinVar are in turn submitted to dbSNP or dbVar, as appropriate, for accessioning. Submission to ClinVar is therefore considered to be in compliance with NIH's Genomic Data Sharing policy.

Formats accepted

A Data Dictionary, which provides detailed definitions of the data elements in ClinVar and how they are stored in the submission spreadsheets, the XML and the database, is available from the home page. A FAQ about submitting data to ClinVar is also available.

Spreadsheet

Two Excel spreadsheet templates are available for simpler submissions, one aimed at more general variant-level data and one designed for more specific case-level data. The spreadsheet templates are available on the ClinVar ftp site:

ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/submission_templates/

The spreadsheet templates have version numbers so the spreadsheet format can be tracked as we refine the submission process. Updates to the spreadsheet templates are documented in a README file on the ftp site:

ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/submission_templates/README.txt

We will attempt to make our submission processing backwards compatible so that older versions of the templates may continue to be used. Please note that our processing of the spreadsheets depends on the names of the sheets in the workbook and the names of the columns, so please do not edit these.

Note to Safari users:  To download a spreadsheet from our FTP site, please use a browser other than Safari. Safari will probably redirect you to Finder, and give you the impression that you need credentials for the ftp site.  You do not need login credentials to access content on ClinVar's ftp site. You will not encounter this problem with Firefox or Chrome.

tsv/csv files

In addition to explicit spreadsheets, you may format a submission that emulates a spreadsheet. A set of files, each with the same base name, should be constructed to correspond to each tab in the spreadsheet, e.g. the following files would correspond to the tabs in SubmissionTemplateLite.xlsx:

FILENAME.SubmissionInfo.csv
FILENAME.Variant.csv
FILENAME.ExpEvidence.csv

and the following files would correspond to a set of tabs in SubmissionTemplate.xlsx:

FILENAME.SubmissionInfo.csv
FILENAME.Variant.csv
FILENAME.AggregateData.csv

The first line of each file must provide the names of the columns for which data are being provided. For example, if the submission contains only required fields, only the names of those columns need to be provided. In other words, the submission is interpreted by the name of the tab and the name of the column, not by the order of the columns. As a consequence, every submission must use the names of the tabs and columns exactly as they appear in the submission template.
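
As a rough illustration of this layout, the Python sketch below writes one CSV file per tab, all sharing a base name, with the column names on the first line. The tab names are the ones listed above; the helper name write_submission_files, the base name, and any column names you pass in are placeholders for this example only, so take the real column names from the submission template.

```python
import csv
from pathlib import Path


def write_submission_files(base_name, tabs, out_dir="."):
    """Write one CSV per tab as BASENAME.TABNAME.csv and return the paths.

    `tabs` maps a tab name (e.g. "Variant") to a list of row dicts.
    The header line is built from the row keys, so only the columns
    that actually carry data are written.
    """
    written = []
    for tab_name, rows in tabs.items():
        # Collect column names in first-seen order across all rows.
        columns = []
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)
        path = Path(out_dir) / f"{base_name}.{tab_name}.csv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=columns)
            writer.writeheader()    # first line: column names
            writer.writerows(rows)  # missing cells are left empty
        written.append(path)
    return written
```

Because the files are interpreted by tab and column name, the sketch makes no attempt to reproduce the column order of the template.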

XML

Detail-rich submissions are more readily submitted by XML. The xsd is available on the ftp site:

ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/clinvar_submission.xsd

Documentation for review status (e.g. expert panel, professional society)

ClinVar is a data archive, and relies on domain experts to evaluate available evidence and to submit current interpretations of simple or complex variants. ClinVar then represents the level of review any variation has achieved by reporting a review status, represented graphically by a number of golden stars. The ClinGen group has established infrastructure to evaluate submitters wishing to be recognized as expert panels or providers of practice guidelines. If you have any questions about the process of obtaining such recognition, please contact us.

Minimal content

The minimal data required for submission are submitter information, how the data were collected, a valid variant description (either HGVS or genomic location and change), and a clinical assertion and/or phenotype. Utility may be increased by providing supporting evidence, such as number of observations of the variant, allele frequency, co-segregations, and mode of inheritance for variant-level data. For case-level data, submission of affected status, presence of family history, biallelic variant occurrence, and ethnicity is encouraged to support review.
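
The minimal-content rule can be pre-checked before sending a submission. The sketch below is one possible way to do that in Python; the dictionary keys (submitter, collection_method, hgvs, location, change, clinical_assertion, phenotype) are hypothetical labels invented for this example and are not ClinVar column names.

```python
def missing_minimal_content(record):
    """Return the minimal-content items that are absent from `record`.

    The keys used here are hypothetical labels for this sketch,
    not column names from the ClinVar templates.
    """
    def has(key):
        value = record.get(key)
        return value is not None and bool(str(value).strip())

    missing = []
    if not has("submitter"):
        missing.append("submitter information")
    if not has("collection_method"):
        missing.append("how the data were collected")
    # A variant description is either HGVS, or genomic location plus change.
    if not (has("hgvs") or (has("location") and has("change"))):
        missing.append("variant description")
    # At least one of clinical assertion and phenotype is needed.
    if not (has("clinical_assertion") or has("phenotype")):
        missing.append("clinical assertion and/or phenotype")
    return missing
```

An empty list means the record carries every minimal item; this check says nothing about whether the values themselves are valid.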

Where to submit

If you wish to submit data to ClinVar, please review the documentation and spreadsheet templates on the ftp site. If you have any questions, please address them to clinvar@ncbi.nlm.nih.gov.  You may send your submission as an attachment to the same address.  If your file is too large to transfer by email, let us know and we will arrange for an ftp site for you.

Please note, if you already have a handle for submissions to dbSNP, you may continue to use this application, but the content will not be as rich as if you use ClinVar's submission forms.

Submission hints

This section provides hints to help make the submission process smooth.  Please note we also provide an FAQ specific to submitting data.

HGVS expressions

Consider using Mutalyzer to check that your HGVS expressions are valid.

Please be as specific as possible in describing the sequence variants that have been observed. If you have information on multiple nucleotide changes that result in the same protein change, please submit the information as it relates to each nucleotide change.

Phenotype

Our submission interfaces provide multiple options to report phenotype. These options support making a distinction about whether you are reporting

  • the relationship of a variant to a specific disorder or condition
    • Single disorder. Complete either Phenotype ID (Condition ID) and Phenotype ID (Condition ID) value columns  or Preferred phenotype name column. If you submit data in all three, the value in the Preferred phenotype name column will be added to the set of names that are valid for the disorder specified by identifiers. There is no need to submit the name that is already associated with an identifier.
    • Multiple disorders in the same individual or set of aggregate data. If you are submitting multiple phenotypes for a variant to indicate they are observed together in the same individual or set of individuals, and the interpretation applies to the set of phenotypes, please report that on one row of the spreadsheet, or one TraitSet in xml.
    • Multiple disorders for the same variant, but different individuals. If you are submitting multiple phenotypes to indicate that the variant has been observed with one or the other phenotype, please store the variant on multiple rows, one per phenotype.
  • phenotypes (clinical features) observed in the set of individuals (aggregate data) or individual about which you are making a report
  • indication for genetic testing

Please note this FAQ about reporting phenotype.

We strongly recommend you submit phenotypic information based on identifiers from groups that provide concepts, their definitions, and stable identifiers for those concepts. Staff of ClinVar, GTR, and MedGen maintain this tutorial to aid your evaluation. If you include a MIM number for the phenotype, please ensure that the MIM number is for the disease, not the gene.

Allele Origin

There are several allowed values for allele origin; several of these, like "maternal", are subsets of "germline". For some display and search purposes, all submissions with allele origin terms other than "somatic" are grouped into "germline"; however, ClinVar does retain whichever specific term you provide on your submission.
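
The grouping described above amounts to a two-way split. A minimal Python sketch, assuming the submitted term arrives as a plain string (the function name display_group is made up for this example and is not part of any ClinVar interface):

```python
def display_group(allele_origin):
    """Map a submitted allele origin term to its display/search group.

    Every term other than "somatic" is grouped under "germline".
    The submitted term itself is not altered.
    """
    term = allele_origin.strip().lower()
    return "somatic" if term == "somatic" else "germline"
```

For example, display_group("maternal") returns "germline", while the record would still carry "maternal" as the submitted term.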

Submitting control data

ClinVar does accept control data, for example, data from a specific ethnic group for variants that are reported elsewhere to be pathogenic. These data may be submitted with ClinVar's spreadsheet template; please specify that the affected status for these individuals is "no".

Submissions with publications/ hold until published

ClinVar accepts submissions at any point in the publication cycle. ClinVar supports submissions that are held in private until a paper is published or three months have passed after the time of submission. To request the hold, submit hold until published as the value for Release status. On spreadsheets, this column has the title Release status. In XML, the element name is ReleaseStatus, so an XML submission should contain <ReleaseStatus>hold until published</ReleaseStatus>.
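
If you generate your XML programmatically, the element can be produced as in the Python sketch below, which uses only the standard library. It builds the element in isolation; where ReleaseStatus sits within a full submission is not shown here, so consult the xsd for its position.

```python
import xml.etree.ElementTree as ET

# Build the ReleaseStatus element with the hold value from the text above.
release_status = ET.Element("ReleaseStatus")
release_status.text = "hold until published"

# Serialize to a string for inspection.
release_xml = ET.tostring(release_status, encoding="unicode")
print(release_xml)
```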

Releases

ClinVar will retain the submission on hold until notified by the submitter or the journal of the publication, or for three months, whichever comes first. Submitters will be notified in advance of any release.
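
To estimate when a held submission might become public, the "whichever comes first" rule can be sketched in Python as follows. This is only an illustration: it assumes that "three months" means three calendar months from the submission date and that the release is triggered on the date of the publication notice, neither of which is spelled out above. The function names are invented for the example.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return `start` shifted by whole calendar months, clamping the day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def expected_release(submitted, publication_notice=None):
    """Earlier of the publication notice and three months after submission.

    Assumption: "three months" is read as three calendar months.
    """
    hold_limit = add_months(submitted, 3)
    if publication_notice is None:
        return hold_limit
    return min(publication_notice, hold_limit)
```

With no publication notice, the function simply returns the end of the three-month hold.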

Last updated: 2014-10-17T08:23:05-04:00