VerbChunk (v1)

Description

Non-recursive verb groups, which may include modals, auxiliary verbs, and medial adverbs, and which end at the head verb or predicate adjective.

Properties

Synopsis

Properties marked with * are required.

Property          Type              Origin
---------------   ---------------   ---------------
neg               string            native
tense             string            native
voice             string            native
text              string            Span (v5)
end               integer           Interval (v5)
start             integer           Interval (v5)
targets           array of string   Interval (v5)
timeUnit          string            Region (v5)
classification    object            Annotation (v6)
classifications   object            Annotation (v6)
document          string            Annotation (v6)
label             string            Annotation (v6)
labelset          array of string   Annotation (v6)
labelsetUri       string            Annotation (v6)
id *              string            Thing (v1)

Native

neg

  • type: string

  • required: no

Indicates whether or not the verb is negated. Possible values are YES and NO.


tense

  • type: string

  • required: no

Provides tense information for the verb. Example values include BeVBG, BeVBN, FutCon, HaveVBN, Pas, PasCon, PasPer, PasPerCon, Per, Pre, PreCon, PrePer, PrePerCon, SimFut, SimPas, SimPre, and none.


voice

  • type: string

  • required: no

Indicates if the verb group is active or passive. Possible values include ACTIVE, PASSIVE, or NONE.
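Taken together, the native properties can sit alongside the inherited span offsets as in the following sketch. The @type URI, the id, and all property values here are illustrative assumptions, not normative examples from this specification.

```python
# A sketch of a VerbChunk annotation carrying the three native
# properties next to inherited start/end offsets. The "@type" URI
# and all values are illustrative.
verb_chunk = {
    "@type": "http://vocab.lappsgrid.org/VerbChunk",
    "properties": {
        "id": "vc_1",       # required, inherited from Thing (v1)
        "start": 14,
        "end": 30,
        "neg": "NO",        # the verb group is not negated
        "tense": "PrePer",  # present perfect
        "voice": "ACTIVE",  # active rather than passive voice
    },
}

props = verb_chunk["properties"]
print(props["neg"], props["tense"], props["voice"])  # -> NO PrePer ACTIVE
```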


Inherited from Span (v5)

text

  • type: string

  • required: no

The surface string in the primary data covered by this span.


Inherited from Interval (v5)

end

  • type: integer

  • required: no

The ending offset in the primary data. This point is exclusive. For time intervals, the unit is determined by the timeUnit metadata key. For text intervals, the unit is Unicode code point.


start

  • type: integer

  • required: no

The starting offset in the primary data. This point is inclusive. For time intervals, the unit is determined by the timeUnit metadata key. For text intervals, the unit is Unicode code point.


targets

  • type: array of string

  • required: no

IDs of a sequence of annotations covering the region of primary data referred to by this annotation. Used as an alternative to start and end to point to component annotations (for example, a token sequence) rather than directly into primary data, or to link two or more annotations (for example, in a coreference annotation).
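The two addressing modes described for start, end, and targets can be sketched as follows; the sentence, token IDs, and offsets are hypothetical.

```python
# Two ways a span-like annotation can point at primary data:
# directly via code-point offsets, or indirectly via target IDs.
text = "The committee has been meeting all week."

# Direct addressing: start is inclusive, end is exclusive.
direct = {"id": "vc_1", "start": 14, "end": 30}
print(text[direct["start"]:direct["end"]])  # -> has been meeting

# Indirect addressing: targets lists the IDs of component
# annotations (here, hypothetical Token annotations) instead
# of start/end offsets.
tokens = {
    "tok_3": {"start": 14, "end": 17, "text": "has"},
    "tok_4": {"start": 18, "end": 22, "text": "been"},
    "tok_5": {"start": 23, "end": 30, "text": "meeting"},
}
indirect = {"id": "vc_2", "targets": ["tok_3", "tok_4", "tok_5"]}
covered = " ".join(tokens[t]["text"] for t in indirect["targets"])
print(covered)  # -> has been meeting
```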


Inherited from Region (v5)

timeUnit

  • type: string

  • required: no

Specifies the unit of time on which the measurement is based. Can be seconds or milliseconds, or, in the case of annotations on a VideoDocument, frames. If not specified, milliseconds (in whole numbers) are assumed.

Note

This metadata is only relevant for time-based annotations.


Inherited from Annotation (v6)

classification

  • type: object

  • required: no

A map from label values to their “score” numbers provided by a classifier. The score can be probability, similarity, confidence, or any other real number that was used to determine the label value.

Optional on top of the label property. However, when this property is used, the label property must be one of the keys, and the keys must match the values defined in the labelset or labelsetUri annotation metadata.
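The constraint above (label must be one of the keys of classification, and every key must come from the labelset) can be sketched as a pair of checks; the labels and scores below are illustrative.

```python
# A sketch of the consistency rule between "label",
# "classification", and the view-level "labelset".
labelset = ["ACTIVE", "PASSIVE", "NONE"]
properties = {
    "label": "PASSIVE",
    "classification": {"ACTIVE": 0.12, "PASSIVE": 0.85, "NONE": 0.03},
}

# The chosen label must be one of the scored keys...
assert properties["label"] in properties["classification"]
# ...and every scored key must be drawn from the labelset.
assert all(k in labelset for k in properties["classification"])
print("consistent")
```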


classifications

  • type: object

  • required: no

Alias for the classification metadata. Here for historical reasons.


document

  • type: string

  • required: no

The identifier of the document that the annotation is over.


label

  • type: string

  • required: no

A label given to this object by classification. The value must be a simple string value of the label and must be one of the values defined in the labelset or labelsetUri annotation metadata.

For example, for the Sentence subtype, this could be used to indicate the type of sentence, such as “declarative”, “interrogative”, “exclamatory”, etc. For NamedEntity subtype, this could be used to indicate the type of named entity, such as “PER”, “ORG”, “LOC”, “MISC” (following the CoNLL-2003 labels).

For non-linguistic annotations, for example for TimeFrame, this could be used to indicate the type of the time frame, such as “speech”, “music”, “noise”, “bars-and-tones”, etc, for acoustic classification. Or “slate”, “lower-third”, “credits” for visual classification of video frames.

Note

Annotations from a type of classifier model must have this property.


labelset

  • type: array of string

  • required: no

When an annotation object contains results of a classification task, this metadata is used to specify the label values used in classification. Individual annotations must then have a label property whose value is one of the values in this list.

Note

Annotations from a classifier app must have this metadata or labelsetUri metadata.

Note

Not all labels specified in the labelset need to occur in the output annotations. For example, a labelset can contain a catch-all negative label that is not interesting enough to keep in the output annotations.


labelsetUri

  • type: string

  • required: no

A URI to an externally defined labelset. Since the labelset metadata is a list of simple strings, this URI can be used to point to a more detailed definition of the labelset. This can be a JSON-LD document or a SKOS concept scheme, for example.

Note

Annotations from a classifier app must have this metadata or labelset metadata.


Inherited from Thing (v1)

id

  • type: string

  • required: yes

A unique identifier for the annotation or document. Uniqueness is relative to the view the annotation is in or the list of documents at the top level of a MMIF file.


JSON Schema

{
  "additionalProperties": true,
  "properties": {
    "id": {
      "description": "A unique identifier for the annotation or document. Uniqueness is relative to the view the annotation is in or the list of documents at the top level of a MMIF file.",
      "type": "string"
    },
    "document": {
      "type": "string",
      "description": "The identifier of the document that the annotation is over."
    },
    "labelset": {
      "items": {
        "type": "string"
      },
      "type": "array",
      "description": "When an annotation object contains results of a classification task, this metadata is used to specify the label values used in classification. Individual annotations must then have a <code>label</code> property whose value is one of the values in this list. <br><br> [note] Annotations from a classifier app must have this metadata or <code> labelsetUri </code> metadata. [/note]<br><br> [note] Not all labels specified in the <code>labelset</code> need to occur in the output annotations. For example, a <code>labelset</code> can contain a <i>catch-all</i> negative label that is not interesting enough to keep in the output annotations. [/note]"
    },
    "labelsetUri": {
      "type": "string",
      "description": "A URI to an externally defined labelset. Since the <code>labelset</code> metadata is a list of simple strings, this URI can be used to point to a more detailed definition of the labelset. This can be a JSON-LD document or a SKOS concept scheme, for example. <br><br> [note] Annotations from a classifier app must have this metadata or <code> labelset </code> metadata. [/note]"
    },
    "label": {
      "type": "string",
      "description": "A label given to this object by classification. The value must be a simple string value of the label and must be one of the values defined in the <code>labelset</code> or <code>labelsetUri</code> annotation metadata. <br><br> For example, for the <code>Sentence</code> subtype, this could be used to indicate the type of sentence, such as \"declarative\", \"interrogative\", \"exclamatory\", etc. For <code>NamedEntity</code> subtype, this could be used to indicate the type of named entity, such as \"PER\", \"ORG\", \"LOC\", \"MISC\" (following the CoNLL-2003 labels). <br><br> For non-linguistic annotations, for example for <code>TimeFrame</code>, this could be used to indicate the type of the time frame, such as \"speech\", \"music\", \"noise\", \"bars-and-tones\", etc, for acoustic classification. Or \"slate\", \"lower-third\", \"credits\" for visual classification of video frames. <br><br> [note] Annotations from a type of classifier model must have this property. [/note]"
    },
    "classifications": {
      "additionalProperties": {
        "type": "number"
      },
      "type": "object",
      "description": "Alias for the <code>classification</code> metadata. Here for historical reasons."
    },
    "classification": {
      "additionalProperties": {
        "type": "number"
      },
      "type": "object",
      "description": "A map from label values to their \"score\" numbers provided by a classifier. The score can be probability, similarity, confidence, or any other real number that was used to determine the label value. <br><br> <i>Optional</i> on top of the <code>label</code> property. However, when this property is used, the <code>label</code> property must be one of the keys, and the keys must match the values defined in the <code>labelset</code> or <code>labelsetUri</code> annotation metadata."
    },
    "timeUnit": {
      "type": "string",
      "description": "Specifies the unit of time on which the measurement is based. Can be *seconds* or *milliseconds*, or, in the case of annotations on a VideoDocument, *frames*. If not specified, *milliseconds* (in whole numbers) are assumed. <br><br> [note] This metadata is only relevant for time-based annotations. [/note]"
    },
    "start": {
      "type": "integer",
      "description": "The starting offset in the primary data. This point is inclusive. For time intervals, the unit is determined by the *timeUnit* metadata key. For text intervals, the unit is Unicode code point."
    },
    "end": {
      "type": "integer",
      "description": "The ending offset in the primary data. This point is exclusive. For time intervals, the unit is determined by the *timeUnit* metadata key. For text intervals, the unit is Unicode code point."
    },
    "targets": {
      "items": {
        "type": "string"
      },
      "type": "array",
      "description": "IDs of a sequence of annotations covering the region of primary data referred to by this annotation. Used as an alternative to *start* and *end* to point to component annotations (for example a token sequence) rather than directly into primary data, or to link two or more annotations (for example in a coreference annotation)."
    },
    "text": {
      "type": "string",
      "description": "The surface string in the primary data covered by this span."
    },
    "tense": {
      "type": "string",
      "description": "Provides tense information for the verb. Example values include BeVBG, BeVBN, FutCon, HaveVBN, Pas, PasCon, PasPer, PasPerCon, Per, Pre, PreCon, PrePer, PrePerCon, SimFut, SimPas, SimPre, and none."
    },
    "voice": {
      "type": "string",
      "description": "Indicates if the verb group is active or passive. Possible values include ACTIVE, PASSIVE, or NONE."
    },
    "neg": {
      "type": "string",
      "description": "Indicates whether or not the verb is negated. Possible values are YES and NO."
    }
  },
  "required": [
    "id"
  ],
  "type": "object"
}
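As an illustration of how the required list and the primitive types in the schema above constrain an annotation's properties, here is a minimal stdlib-Python sketch; it enforces only the required keys and primitive types (it is not a full JSON Schema validator), and the property subset and instance are illustrative.

```python
# A minimal sketch of checking an annotation's properties against a
# subset of the schema above: only "required" membership and primitive
# types are enforced; not a full JSON Schema validator.
schema = {
    "required": ["id"],
    "properties": {
        "id": {"type": "string"},
        "start": {"type": "integer"},
        "end": {"type": "integer"},
        "neg": {"type": "string"},
    },
}
TYPES = {"string": str, "integer": int}

def check(instance, schema):
    # Keys listed in "required" that the instance lacks.
    missing = [k for k in schema["required"] if k not in instance]
    # Known keys whose values have the wrong primitive type.
    wrong = [k for k, v in instance.items()
             if k in schema["properties"]
             and not isinstance(v, TYPES[schema["properties"][k]["type"]])]
    return missing, wrong

missing, wrong = check({"start": 14, "end": 30, "neg": "NO"}, schema)
print(missing, wrong)  # -> ['id'] []
```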