VerbChunk (v1)

Description

Non-recursive verb groups, which may include modals, auxiliary verbs, and medial adverbs, and which end at the head verb or predicate adjective.

Properties

Synopsis

Properties marked with * are required.

Property          Type              Origin
---------------   ---------------   ---------------
neg               string            native
tense             string            native
voice             string            native
text              string            Span (v5)
end               integer           Interval (v5)
start             integer           Interval (v5)
targets           array of string   Interval (v5)
timeUnit          string            Region (v5)
classification    object            Annotation (v6)
classifications   object            Annotation (v6)
document          string            Annotation (v6)
label             string            Annotation (v6)
labelset          array of string   Annotation (v6)
labelsetUri       string            Annotation (v6)
id *              string            Thing (v1)

Native

neg

  • type: string

  • required: no

Indicates whether or not the verb is negated. Possible values are YES and NO.


tense

  • type: string

  • required: no

Provides tense information for the verb. Example values include BeVBG, BeVBN, FutCon, HaveVBN, Pas, PasCon, PasPer, PasPerCon, Per, Pre, PreCon, PrePer, PrePerCon, SimFut, SimPas, SimPre, and none.


voice

  • type: string

  • required: no

Indicates if the verb group is active or passive. Possible values include ACTIVE, PASSIVE, or NONE.
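Taken together, the native properties can sit alongside the inherited span offsets as in the following sketch. The @type URI, the id, and all property values here are illustrative assumptions, not normative examples from this specification.

```python
# A sketch of a VerbChunk annotation carrying the three native
# properties next to inherited start/end offsets. The "@type" URI
# and all values are illustrative.
verb_chunk = {
    "@type": "http://vocab.lappsgrid.org/VerbChunk",
    "properties": {
        "id": "vc_1",       # required, inherited from Thing (v1)
        "start": 14,
        "end": 30,
        "neg": "NO",        # the verb group is not negated
        "tense": "PrePer",  # present perfect
        "voice": "ACTIVE",  # active rather than passive voice
    },
}

props = verb_chunk["properties"]
print(props["neg"], props["tense"], props["voice"])  # -> NO PrePer ACTIVE
```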


Inherited from Span (v5)

text

  • type: string

  • required: no

The surface string in the primary data covered by this span.


Inherited from Interval (v5)

end

  • type: integer

  • required: no

The ending offset in the primary data. This point is exclusive. For time intervals, the unit is determined by the timeUnit metadata key. For text intervals, the unit is Unicode code point.


start

  • type: integer

  • required: no

The starting offset in the primary data. This point is inclusive. For time intervals, the unit is determined by the timeUnit metadata key. For text intervals, the unit is Unicode code point.


targets

  • type: array of string

  • required: no

IDs of a sequence of annotations covering the region of primary data referred to by this annotation. Used as an alternative to start and end to point to component annotations (for example, a token sequence) rather than directly into primary data, or to link two or more annotations (for example, in a coreference annotation).
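The two addressing modes described for start, end, and targets can be sketched as follows; the sentence, token IDs, and offsets are hypothetical.

```python
# Two ways a span-like annotation can point at primary data:
# directly via code-point offsets, or indirectly via target IDs.
text = "The committee has been meeting all week."

# Direct addressing: start is inclusive, end is exclusive.
direct = {"id": "vc_1", "start": 14, "end": 30}
print(text[direct["start"]:direct["end"]])  # -> has been meeting

# Indirect addressing: targets lists the IDs of component
# annotations (here, hypothetical Token annotations) instead
# of start/end offsets.
tokens = {
    "tok_3": {"start": 14, "end": 17, "text": "has"},
    "tok_4": {"start": 18, "end": 22, "text": "been"},
    "tok_5": {"start": 23, "end": 30, "text": "meeting"},
}
indirect = {"id": "vc_2", "targets": ["tok_3", "tok_4", "tok_5"]}
covered = " ".join(tokens[t]["text"] for t in indirect["targets"])
print(covered)  # -> has been meeting
```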


Inherited from Region (v5)

timeUnit

  • type: string

  • required: no

Specifies the unit of time on which the measurement is based. Can be seconds or milliseconds, or, in the case of annotations on a VideoDocument, frames. If not specified, milliseconds (in whole numbers) are assumed.

Note

This metadata is only relevant for time-based annotations.


Inherited from Annotation (v6)

classification

  • type: object

  • required: no

A map from label values to their “score” numbers provided by a classifier. The score can be probability, similarity, confidence, or any other real number that was used to determine the label value.

Optional on top of the label property. However, when this property is used, the label property must be one of the keys, and the keys must match the values defined in the labelset or labelsetUri annotation metadata.
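The constraint above (label must be one of the keys of classification, and every key must come from the labelset) can be sketched as a pair of checks; the labels and scores below are illustrative.

```python
# A sketch of the consistency rule between "label",
# "classification", and the view-level "labelset".
labelset = ["ACTIVE", "PASSIVE", "NONE"]
properties = {
    "label": "PASSIVE",
    "classification": {"ACTIVE": 0.12, "PASSIVE": 0.85, "NONE": 0.03},
}

# The chosen label must be one of the scored keys...
assert properties["label"] in properties["classification"]
# ...and every scored key must be drawn from the labelset.
assert all(k in labelset for k in properties["classification"])
print("consistent")
```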


classifications

  • type: object

  • required: no

Alias for the classification metadata. Here for historical reasons.


document

  • type: string

  • required: no

The identifier of the document that the annotation is over.


label

  • type: string

  • required: no

A label given to this object by classification. The value must be a simple string value of the label and must be one of the values defined in the labelset or labelsetUri annotation metadata.

For example, for the Sentence subtype, this could be used to indicate the type of sentence, such as “declarative”, “interrogative”, “exclamatory”, etc. For NamedEntity subtype, this could be used to indicate the type of named entity, such as “PER”, “ORG”, “LOC”, “MISC” (following the CoNLL-2003 labels).

For non-linguistic annotations, for example for TimeFrame, this could be used to indicate the type of the time frame, such as “speech”, “music”, “noise”, “bars-and-tones”, etc, for acoustic classification. Or “slate”, “lower-third”, “credits” for visual classification of video frames.

Note

Annotations from a type of classifier model must have this property.


labelset

  • type: array of string

  • required: no

When an annotation object contains results of a classification task, this metadata is used to specify the label values used in classification. Individual annotations must then have a label property whose value is one of the values in this list.

Note

Annotations from a classifier app must have this metadata or labelsetUri metadata.

Note

Not all labels specified in the labelset need to occur in the output annotations. For example, a labelset can contain a catch-all negative label that is not interesting enough to keep in the output annotations.


labelsetUri

  • type: string

  • required: no

A URI to an externally defined labelset. Since the labelset metadata is a list of simple strings, this URI can be used to point to a more detailed definition of the labelset. This can be a JSON-LD document or a SKOS concept scheme, for example.

Note

Annotations from a classifier app must have this metadata or labelset metadata.


Inherited from Thing (v1)

id

  • type: string

  • required: yes

A unique identifier for the annotation or document. Uniqueness is relative to the view the annotation is in or the list of documents at the top level of a MMIF file.


JSON Schema

{
  "additionalProperties": true,
  "properties": {
    "id": {
      "description": "A unique identifier for the annotation or document. Uniqueness is relative to the view the annotation is in or the list of documents at the top level of a MMIF file.",
      "type": "string"
    },
    "document": {
      "type": "string",
      "description": "The identifier of the document that the annotation is over."
    },
    "labelset": {
      "items": {
        "type": "string"
      },
      "type": "array",
      "description": "When an annotation object contains results of a classification task, this metadata is used to specify the label values used in classification. Individual annotations must then have a <code>label</code> property whose value is one of the values in this list. <br><br> [note] Annotations from a classifier app must have this metadata or <code> labelsetUri </code> metadata. [/note]<br><br> [note] Not all labels specified in the <code>labelset</code> need to occur in the output annotations. For example, a <code>labelset</code> can contain a <i>catch-all</i> negative label that is not interesting enough to keep in the output annotations. [/note]"
    },
    "labelsetUri": {
      "type": "string",
      "description": "A URI to an externally defined labelset. Since the <code>labelset</code> metadata is a list of simple strings, this URI can be used to point to a more detailed definition of the labelset. This can be a JSON-LD document or a SKOS concept scheme, for example. <br><br> [note] Annotations from a classifier app must have this metadata or <code> labelset </code> metadata. [/note]"
    },
    "label": {
      "type": "string",
      "description": "A label given to this object by classification. The value must be a simple string value of the label and must be one of the values defined in the <code>labelset</code> or <code>labelsetUri</code> annotation metadata. <br><br> For example, for the <code>Sentence</code> subtype, this could be used to indicate the type of sentence, such as \"declarative\", \"interrogative\", \"exclamatory\", etc. For <code>NamedEntity</code> subtype, this could be used to indicate the type of named entity, such as \"PER\", \"ORG\", \"LOC\", \"MISC\" (following the CoNLL-2003 labels). <br><br> For non-linguistic annotations, for example for <code>TimeFrame</code>, this could be used to indicate the type of the time frame, such as \"speech\", \"music\", \"noise\", \"bars-and-tones\", etc, for acoustic classification. Or \"slate\", \"lower-third\", \"credits\" for visual classification of video frames. <br><br> [note] Annotations from a type of classifier model must have this property. [/note]"
    },
    "classifications": {
      "additionalProperties": {
        "type": "number"
      },
      "type": "object",
      "description": "Alias for the <code>classification</code> metadata. Here for historical reasons."
    },
    "classification": {
      "additionalProperties": {
        "type": "number"
      },
      "type": "object",
      "description": "A map from label values to their \"score\" numbers provided by a classifier. The score can be probability, similarity, confidence, or any other real number that was used to determine the label value. <br><br> <i>Optional</i> on top of the <code>label</code> property. However, when this property is used, the <code>label</code> property must be one of the keys, and the keys must match the values defined in the <code>labelset</code> or <code>labelsetUri</code> annotation metadata."
    },
    "timeUnit": {
      "type": "string",
      "description": "Specifies the unit of time on which the measurement is based. Can be *seconds* or *milliseconds*, or, in the case of annotations on a VideoDocument, *frames*. If not specified, *milliseconds* (in whole numbers) are assumed. <br><br> [note] This metadata is only relevant for time-based annotations. [/note]"
    },
    "start": {
      "type": "integer",
      "description": "The starting offset in the primary data. This point is inclusive. For time intervals, the unit is determined by the *timeUnit* metadata key. For text intervals, the unit is Unicode code point."
    },
    "end": {
      "type": "integer",
      "description": "The ending offset in the primary data. This point is exclusive. For time intervals, the unit is determined by the *timeUnit* metadata key. For text intervals, the unit is Unicode code point."
    },
    "targets": {
      "items": {
        "type": "string"
      },
      "type": "array",
      "description": "IDs of a sequence of annotations covering the region of primary data referred to by this annotation. Used as an alternative to *start* and *end* to point to component annotations (for example a token sequence) rather than directly into primary data, or to link two or more annotations (for example in a coreference annotation)."
    },
    "text": {
      "type": "string",
      "description": "The surface string in the primary data covered by this span."
    },
    "tense": {
      "type": "string",
      "description": "Provides tense information for the verb. Example values include BeVBG, BeVBN, FutCon, HaveVBN, Pas, PasCon, PasPer, PasPerCon, Per, Pre, PreCon, PrePer, PrePerCon, SimFut, SimPas, SimPre, and none."
    },
    "voice": {
      "type": "string",
      "description": "Indicates if the verb group is active or passive. Possible values include ACTIVE, PASSIVE, or NONE."
    },
    "neg": {
      "type": "string",
      "description": "Indicates whether or not the verb is negated. Possible values are YES and NO."
    }
  },
  "required": [
    "id"
  ],
  "type": "object"
}
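As an illustration of how the required list and the primitive types in the schema above constrain an annotation's properties, here is a minimal stdlib-Python sketch; it enforces only the required keys and primitive types (it is not a full JSON Schema validator), and the property subset and instance are illustrative.

```python
# A minimal sketch of checking an annotation's properties against a
# subset of the schema above: only "required" membership and primitive
# types are enforced; not a full JSON Schema validator.
schema = {
    "required": ["id"],
    "properties": {
        "id": {"type": "string"},
        "start": {"type": "integer"},
        "end": {"type": "integer"},
        "neg": {"type": "string"},
    },
}
TYPES = {"string": str, "integer": int}

def check(instance, schema):
    # Keys listed in "required" that the instance lacks.
    missing = [k for k in schema["required"] if k not in instance]
    # Known keys whose values have the wrong primitive type.
    wrong = [k for k, v in instance.items()
             if k in schema["properties"]
             and not isinstance(v, TYPES[schema["properties"][k]["type"]])]
    return missing, wrong

missing, wrong = check({"start": 14, "end": 30, "neg": "NO"}, schema)
print(missing, wrong)  # -> ['id'] []
```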