In practice, the VA-Spec schemas used to represent actual data are Profiles: schemas that constrain and/or extend the core Statement, Study Result, and Evidence Line classes to support a specific type of variant knowledge.
The VA-Spec defines a Profiling Methodology that specifies the types of specializations and extensions permitted in authoring profiles, as illustrated in the diagram and detailed in the ‘Profiling Tasks’ table below.
Examples of specializations defined in Variant Pathogenicity profiles.
(A) Core Proposition and Statement classes, showing a subset of their attributes. (B) ACMG-based Variant Pathogenicity profiles derived from these core classes, with profiling specializations in green. Text in curly braces denotes enumerations, which in some cases are nested inside fields of a MappableConcept. The actual VA-Spec v1.0 schemas for these profiles are here and here.
Profiling tasks supported by the VA-Spec, and illustrated in the example above, include:
| Profiling Task | Example |
| --- | --- |
| Define domain-specific subtypes of general-purpose Core Model classes | Specialization of Proposition into VariantPathogenicityProposition |
| Define new attributes to capture domain-specific information | The Statement qualifiers geneContextQualifier and alleleOriginQualifier |
| Define or import classes for domain entities that profiles are about | The VariantPathogenicityProposition profile uses MolecularVariation and CategoricalVariation classes imported from VRS and CatVRS, and a Condition class defined in the VA-Spec itself. |
| Constrain values of core attributes to take specific types as values | Restricting the VariantPathogenicityStatement.object field to take a Condition as its value |
| Define value sets and bind them to select attributes | Restricting nested fields in the MappableConcept object taken by VariantPathogenicityStatement.classification to a set of enumerated values based on ACMG Guideline terminology. |
| Refine cardinality of select attributes | Making Statement.classification a required field in the ACMG Variant Pathogenicity Statement. |
Inheritance-Based Profiling
Description: Specializes generic VA core classes for a particular type of knowledge, through formal definition of concrete subclasses.
Mechanism: Relies on the bespoke GKS Metaschema Processor (MSP) inherits and extends functions, and the requisite tooling, to implement class inheritance and attribute extension, which are not natively supported by JSON Schema.
Application: Used in authoring “Base Profiles” for Propositions and Study Results, which can be used/referenced within Statement and Evidence Line profiles.
Rationale: Allows for the types of attribute extension and addition that are applied in these Base Profiles (e.g. to specialize Proposition subject and object attributes, and to create specific Proposition qualifiers and StudyResult data items).
Inheritance-Based Profiling Example:
    # From the source yaml file where the Variant Pathogenicity Proposition Base Profile is authored
    VariantPathogenicityProposition:
      inherits: ClinicalVariantProposition   # MSP inherits keyword
      maturity: trial use
      type: object
      description: A proposition describing the role of a variant in causing a heritable condition.
      properties:
        objectCondition:
          extends: object   # MSP extends keyword
          oneOf:
            - $ref: Condition
            - $refCurie: gks.core:iriReference
          description: The :ref:`Condition` for which the variant impact is stated.
        penetranceQualifier:   # Addition of new qualifier attribute
          $refCurie: gks.core:MappableConcept
          description: Reports the penetrance of the pathogenic effect...
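To make the effect of the inherits and extends keywords more concrete, the following is a minimal Python sketch of how a processor could flatten a profile into a single set of properties. It is not the GKS Metaschema Processor: the function name `resolve_properties`, the abridged parent class, and the rule that an extending property replaces the property it extends are all assumptions made for illustration.

```python
# Hypothetical, heavily abridged class definitions modeled on the YAML above.
# The parent's properties are invented placeholders, not the real schema.
SCHEMA = {
    "ClinicalVariantProposition": {
        "type": "object",
        "properties": {
            "subjectVariant": {"description": "The variant the proposition is about."},
            "object": {"description": "The entity the variant is related to."},
        },
    },
    "VariantPathogenicityProposition": {
        "inherits": "ClinicalVariantProposition",
        "type": "object",
        "properties": {
            "objectCondition": {
                "extends": "object",
                "oneOf": [{"$ref": "Condition"}, {"$refCurie": "gks.core:iriReference"}],
            },
            "penetranceQualifier": {"$refCurie": "gks.core:MappableConcept"},
        },
    },
}


def resolve_properties(classes, name):
    """Return the flattened properties of `name`, following `inherits` links."""
    definition = classes[name]
    parent = definition.get("inherits")
    # Start from the parent's already-flattened properties, if there is a parent.
    props = resolve_properties(classes, parent) if parent else {}
    for prop_name, prop_def in definition.get("properties", {}).items():
        prop_def = dict(prop_def)  # shallow copy so the input is not mutated
        extended = prop_def.pop("extends", None)
        if extended is not None:
            # Assumption: the specialized property replaces the one it extends,
            # keeping any parent keys that the child does not override.
            merged = dict(props.pop(extended))
            merged.update(prop_def)
            prop_def = merged
        props[prop_name] = prop_def
    return props


flat = resolve_properties(SCHEMA, "VariantPathogenicityProposition")
print(sorted(flat))  # ['objectCondition', 'penetranceQualifier', 'subjectVariant']
```

Under these assumptions, the generic object attribute disappears from the flattened profile and objectCondition takes its place, while penetranceQualifier is simply added alongside the inherited attributes.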
Composition-Based Profiling
Description: Defines subschemas that layer additional constraints on top of VA core attributes to refine the values they are able to take.
Mechanism: Relies on schema composition using the native JSON Schema allOf keyword, which does not result in the creation of concrete subclasses for each profile.
Application: Used in authoring “Community Profiles” that add guideline-specific constraints to the core Statement and Evidence Line classes. These profiles can leverage base Proposition profiles to represent the semantics of the possible fact that a Statement asserts or that an Evidence Line evaluates evidence against.
Rationale: Allows implementers to define simple constraints for Statement and Evidence Line profiles in a way that does not require running bespoke MSP tooling.
Composition-Based Profiling Example:
    # From the source yaml file where the Variant Pathogenicity Statement ACMG 2015 Community Profile is authored
    VariantPathogenicityStatement:
      description: A Statement describing the role of a variant in causing an inherited condition.
      # JSON Schema 'allOf' keyword used for schema composition
      allOf:
        - $ref: "/ga4gh/schema/va-spec/1.0.0/base/json/Statement"
        # list of property definitions that further constrain attributes in the base Statement class
        - properties:
            # A constraint on the Statement.proposition attribute requiring it to take a VariantPathogenicityProposition
            proposition:
              $ref: "/ga4gh/schema/va-spec/1.0.0/base/json/VariantPathogenicityProposition"
              description: A proposition about the pathogenicity of a variant, the validity of which is assessed and reported by the Statement.
            # A constraint on the code field nested within a MappableConcept that requires the 'strength' attribute to take specific values.
            strength:
              description: The strength of support that an ACMG 2015 Variant Pathogenicity statement is determined to provide for or against the proposed pathogenicity of the assessed variant.
              properties:
                primaryCoding:
                  code:
                    enum:
                      - definitive
                      - likely
                  system:
                    const: ACMG Guidelines, 2015
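The layering behavior of allOf can be illustrated with a small Python sketch that checks an instance against every subschema in turn. This is a toy checker covering only allOf, properties, required, enum, and const; it ignores $ref, so the base class is inlined as a hypothetical stand-in (`BASE_STATEMENT`). The sketch also assumes that each nested field is reached through an explicit properties keyword. A real implementation would use a complete JSON Schema validator.

```python
def validate(instance, schema, path="$"):
    """Collect violations for a tiny subset of JSON Schema keywords.

    Supports only allOf, properties, required, enum and const. $ref is
    ignored, so referenced schemas must be inlined. Not a real validator.
    """
    errors = []
    # allOf: the instance must satisfy every listed subschema.
    for sub in schema.get("allOf", []):
        errors += validate(instance, sub, path)
    if "const" in schema and instance != schema["const"]:
        errors.append(f"{path}: expected {schema['const']!r}")
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path}: {instance!r} not in {schema['enum']}")
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required field {key!r}")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                errors += validate(instance[key], sub, f"{path}.{key}")
    return errors


# Hypothetical stand-in for the core Statement class (heavily abridged).
BASE_STATEMENT = {"required": ["proposition"]}

# Profile layer modeled on the 'strength' constraint in the YAML above.
ACMG_PROFILE = {
    "allOf": [
        BASE_STATEMENT,
        {
            "properties": {
                "strength": {
                    "properties": {
                        "primaryCoding": {
                            "properties": {
                                "code": {"enum": ["definitive", "likely"]},
                                "system": {"const": "ACMG Guidelines, 2015"},
                            }
                        }
                    }
                }
            }
        },
    ]
}

# Illustrative instances only; field values are not taken from real data.
good = {
    "proposition": {},
    "strength": {"primaryCoding": {"code": "likely", "system": "ACMG Guidelines, 2015"}},
}
bad = {"strength": {"primaryCoding": {"code": "strong", "system": "ACMG Guidelines, 2015"}}}

print(validate(good, ACMG_PROFILE))  # []
for problem in validate(bad, ACMG_PROFILE):
    print(problem)
```

The second instance fails both layers at once: the base layer reports the missing proposition, and the profile layer reports the out-of-range strength code. No subclass is created at any point; the profile is just the conjunction of the base constraints and the added ones.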
We recognize that using different mechanisms and ad hoc tooling to author different subsets of profiles is not ideal; this approach was adopted given the technologies and bandwidth available at this point in development.
Future versions of the VA-Spec will adopt a single, coherent, and consistent technical approach and tooling support for profile authoring, which will likely leverage the LinkML Framework (in particular, LinkML Map).