mmif.serialize package¶
Core package to provide serialization and deserialization of MMIF format.
model module¶
The model module contains the classes used to represent an
abstract MMIF object as a live Python object.
The MmifObject class or one of its derivatives is subclassed by
all other classes defined in this SDK, except for MmifObjectEncoder.
These objects are generally instantiated from JSON, either as a string or as an already-loaded Python dictionary. This base class provides the core functionality for deserializing MMIF JSON data into live objects and serializing live objects into MMIF JSON data. Specialized behavior for the different components of MMIF is added in the subclasses.
This module defines two main collection types:
DataList: List-like collections that support integer/slice indexing. For ID-based access, use indexing or get at the container level. For example, for DocumentList, use its parent Mmif object’s getter methods to access documents by ID (e.g., mmif['doc1']).
DataDict: Dict-like collections that support string key access.
- class mmif.serialize.model.DataDict(mmif_obj: bytes | str | dict | None = None, *_)[source]¶
Bases:
MmifObject, Generic[T, S]
- get(key: T, default=None) S | None[source]¶
Dictionary-style safe access with optional default value.
This method provides pythonic dict behavior - returns the value for the given key, or a default value if the key is not found.
- Parameters:
key – The key to look up
default – The value to return if key is not found (default: None)
- Returns:
The value associated with the key, or the default value
Examples¶
# Access contains metadata:
timeframe_meta = view.metadata.contains.get(AnnotationTypes.TimeFrame)
if timeframe_meta is None:
    print("No TimeFrame annotations in this view")

# With custom default:
value = some_dict.get('key', default={})
- class mmif.serialize.model.DataList(mmif_obj: bytes | str | list | None = None, *_)[source]¶
Bases:
MmifObject, Generic[T]
The DataList class is an abstraction that represents the various lists found in a MMIF file, such as documents, subdocuments, views, and annotations.
- Parameters:
mmif_obj (Union[str, list]) – the data that the list contains
- deserialize(mmif_json: str | list) None[source]¶
Passes the input data into the internal deserializer.
- get(key: str, default=None) T | None[source]¶
Deprecated since version 1.1.3: Do not use in new code. Will be removed in 2.0.0. Use container-level access or positional indexing instead.
Deprecated method for retrieving list elements by string ID.
- Parameters:
key – the key to search for
default – the default value to return if the key is not found (defaults to None)
- Returns:
the value matching that key, or the default value if not found
Examples¶
Old pattern (deprecated, do not use):
view = mmif.views.get('v1') # DeprecationWarning!
New patterns to use instead:
# For ID-based access, use container:
view = mmif['v1']

# Or with safe access:
view = mmif.get('v1', default=None)

# For positional access:
view = mmif.views[0]
See Also¶
__getitem__ : List-style positional access with integers
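The deprecation behavior described above can be illustrated with a minimal, standard-library-only sketch. The class `IdList` and its dict-based items are hypothetical stand-ins, not the SDK's DataList; the sketch only shows how a list-like collection might keep positional indexing while warning on string-ID lookups.

```python
import warnings


class IdList:
    """Hypothetical list-like collection; not the SDK's DataList."""

    def __init__(self, items):
        self._items = list(items)  # each item is a dict with an 'id' key

    def __getitem__(self, index):
        # Positional access only: integers and slices.
        if not isinstance(index, (int, slice)):
            raise TypeError("use integer or slice indexing")
        return self._items[index]

    def get(self, key, default=None):
        # Deprecated ID-based lookup, kept for backward compatibility.
        warnings.warn(
            "get() on a list-like collection is deprecated; "
            "use container-level access instead",
            DeprecationWarning,
            stacklevel=2,
        )
        for item in self._items:
            if item["id"] == key:
                return item
        return default


views = IdList([{"id": "v1"}, {"id": "v2"}])
first = views[0]  # positional access, no warning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    found = views.get("v2")  # still works, but emits a DeprecationWarning
```

Passing `stacklevel=2` makes the warning point at the caller's line rather than at the collection's own code, which tends to make migration easier.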
- class mmif.serialize.model.MmifObject(mmif_obj: bytes | str | dict | None = None, *_)[source]¶
Bases:
object
Abstract superclass for MMIF related key-value pair objects.
Any MMIF object can be initialized as an empty placeholder or an actual representation with a JSON formatted string or equivalent dict object argument.
This superclass has four specially designed instance variables, and these variable names cannot be used as attribute names for MMIF objects.
_unnamed_attributes: Can only be either None or an empty dictionary. If it’s set to None, the class won’t take any Additional Attributes in the JSON schema sense. If it’s an empty dict, users can add any key-value pairs to the class, as long as the key is not a “reserved” name, and those additional attributes will be stored in this dict while in memory.
_attribute_classes: This is a dict from a key name to the specific Python class to use for deserializing the value. Note that a key name in this dict does NOT have to be a named attribute, but it is recommended to be one.
_required_attributes: This is a simple list of names of attributes that are required in the object. When serializing, an object will skip its empty (e.g. zero-length, or None) attributes unless they are in this list. Otherwise, the serialized JSON string would have empty representations (e.g. "", []).
_exclude_from_diff: This is a simple list of names of attributes that should be excluded from the diff calculation in __eq__.
# TODO (krim @ 8/17/20): this dict is, however, a duplicate of the type hints in the class definition. Maybe there is a better way to utilize type hints (e.g. getting them programmatically), but for now developers should be careful to add types to hints as well as to this dict.
Also note that those special attributes MUST be set in the __init__() before calling super method, otherwise deserialization will not work.
Also, a subclass that has one or more named attributes must set those attributes in __init__() before calling the super method. When serializing a MmifObject, all empty attributes are ignored, so optional named attributes must be left empty (len == 0), NOT None. Any None-valued named attribute will cause issues with the current implementation.
- Parameters:
mmif_obj – JSON string or dict to initialize an object. If not given, an empty object will be initialized, sometimes with an ID value automatically generated, based on its parent object.
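As a rough illustration of the serialization rule described for _required_attributes, the sketch below drops empty attributes unless they are listed as required. The helper `to_json` and the sample attribute names are hypothetical and only model the rule; they are not the SDK's implementation.

```python
import json


def to_json(attributes, required):
    """Serialize a plain dict, skipping empty values unless required."""

    def blank(value):
        # Simplified emptiness rule for this sketch: None or zero-length.
        return value is None or (
            isinstance(value, (str, list, dict)) and len(value) == 0
        )

    kept = {
        name: value
        for name, value in attributes.items()
        if name in required or not blank(value)
    }
    return json.dumps(kept, sort_keys=True)


attributes = {"id": "d1", "text": "", "targets": [], "label": "speech"}
# 'text' is empty but required, so it survives; 'targets' is dropped.
result = to_json(attributes, required=["id", "text"])
```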
- deserialize(mmif_json: str | dict) None[source]¶
Takes a JSON-formatted string or a simple dict that’s json-loaded from such a string as an input and populates object’s fields with the values specified in the input.
- Parameters:
mmif_json – JSON-formatted string or dict from such a string that represents a MMIF object
- disallow_additional_properties() None[source]¶
Call this method in __init__() to prevent the insertion of unnamed attributes after initialization.
- get(obj_id, default=None)[source]¶
High-level safe getter that returns a default value instead of raising KeyError.
This method wraps __getitem__() with exception handling, making it safe to query for objects that might not exist. Available on all MmifObject subclasses.
- Parameters:
obj_id – An attribute name or object identifier (document ID, view ID, annotation ID, or property name depending on the object type). For Mmif objects: when annotation ID is given as a “short” ID (without view ID prefix), searches from the first view.
default – The value to return if the key is not found (default: None)
- Returns:
The object/value searched for, or the default value if not found
Examples¶
Safe access pattern (works on all MmifObject subclasses):
# On Mmif objects:
view = mmif.get('v1', default=None)  # Returns None if not found
doc = mmif.get('doc1', default=None)

# On Annotation/Document objects:
label = annotation.get('label', default='unknown')
author = document.get('author', default='anonymous')
See Also¶
__getitem__ : Direct access that raises KeyError when not found
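The relationship between get and __getitem__ can be sketched with a small, self-contained class. `Container` is a hypothetical stand-in used only to show the wrapping pattern (catching KeyError and returning a default); the SDK's actual lookup logic is more involved.

```python
class Container:
    """Hypothetical stand-in for an object with ID-based lookup."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def __getitem__(self, obj_id):
        # Direct access: raises KeyError when the ID is unknown.
        return self._objects[obj_id]

    def get(self, obj_id, default=None):
        # Safe access: convert the KeyError into a default value.
        try:
            return self[obj_id]
        except KeyError:
            return default


container = Container({"v1": "view one"})
present = container.get("v1")
absent = container.get("v9", default="missing")
```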
- id_delimiter: ClassVar[str] = ':'¶
- static is_empty(obj) bool[source]¶
Returns True if the obj is None or “empty”. Emptiness is first defined as having zero length, but objects that lack a __len__ method need an additional check.
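A minimal sketch of such an emptiness check follows. The handling of objects without __len__ is an assumption made for illustration (they are treated as non-empty here); the SDK's additional check may differ.

```python
def is_empty(obj):
    # None is always considered empty.
    if obj is None:
        return True
    # Sized objects are empty when their length is zero.
    if hasattr(obj, "__len__"):
        return len(obj) == 0
    # Assumption: unsized objects (e.g. numbers) count as non-empty here.
    return False
```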
- reserved_names: Set[str] = {'_attribute_classes', '_contextual_attributes', '_exclude_from_diff', '_required_attributes', '_unnamed_attributes', 'reserved_names'}¶
- serialize(pretty: bool = False, include_context: bool = True) str[source]¶
Generates JSON representation of an object.
- Parameters:
pretty – If True, returns string representation with indentation.
include_context – If False, excludes contextual attributes from serialization. Contextual attributes hold information that varies at runtime (e.g., timestamps) and do not constitute the core information of the MMIF object. This is useful for comparing two MMIF objects for equality.
- Returns:
JSON string of the object.
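To make the effect of the two flags concrete, here is a standard-library sketch that works on plain dicts. The function `serialize_dict`, the set `CONTEXTUAL_KEYS`, and the sample data are hypothetical; the sketch only shows why dropping runtime-dependent fields helps equality checks.

```python
import json

CONTEXTUAL_KEYS = {"timestamp"}  # hypothetical contextual attribute names


def serialize_dict(data, pretty=False, include_context=True):
    if not include_context:
        # Drop attributes that vary from run to run.
        data = {k: v for k, v in data.items() if k not in CONTEXTUAL_KEYS}
    return json.dumps(data, indent=2 if pretty else None)


run_a = {"app": "http://example.org/app", "timestamp": "2024-01-01T00:00:00"}
run_b = {"app": "http://example.org/app", "timestamp": "2024-06-30T12:00:00"}

# The full serializations differ, but the core content is identical.
same_core = serialize_dict(run_a, include_context=False) == serialize_dict(
    run_b, include_context=False
)
```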
- set_additional_property(key: str, value: Any) None[source]¶
Method to set values in _unnamed_attributes.
- Parameters:
key – the attribute name
value – the desired value
- Returns:
None
- Raise:
AttributeError if additional properties are disallowed by disallow_additional_properties()
- view_prefix: ClassVar[str] = 'v_'¶
- class mmif.serialize.model.MmifObjectEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶
Bases:
JSONEncoder
Encoder class to define behaviors of de-/serialization
- default(obj: MmifObject)[source]¶
Overrides default encoding behavior to prioritize MmifObject.serialize().
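The general pattern of an encoder that defers to an object's own serialize() can be sketched with the standard json module. `Point` and `SerializeAwareEncoder` are hypothetical names for illustration, and re-parsing the serialized string with json.loads is a choice made for this sketch, not necessarily what MmifObjectEncoder does.

```python
import json


class Point:
    """Hypothetical object that knows how to serialize itself."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def serialize(self):
        return json.dumps({"x": self.x, "y": self.y})


class SerializeAwareEncoder(json.JSONEncoder):
    def default(self, obj):
        # Prefer the object's own serialize() when it has one.
        if hasattr(obj, "serialize"):
            return json.loads(obj.serialize())
        # Otherwise fall back to the standard behavior (raises TypeError).
        return super().default(obj)


encoded = json.dumps({"origin": Point(0, 0)}, cls=SerializeAwareEncoder)
```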
mmif module¶
The mmif module contains the classes used to represent a full MMIF
file as a live Python object.
The Mmif class is a high-level container that provides convenient
string-based access to documents, views, and annotations via mmif[id].
The underlying documents and views attributes are list-like collections
that use integer indexing; use container-level access for ID-based lookups.
See the specification docs and the JSON Schema file for more information.
- class mmif.serialize.mmif.Mmif(mmif_obj: bytes | str | dict | None = None, *, validate: bool = True)[source]¶
Bases:
MmifObject
MmifObject that represents a full MMIF file.
This is a high-level container object that provides convenient string-based access to documents, views, and annotations using their IDs. The underlying collections (documents and views) are list-like and use integer indexing, but Mmif itself accepts string IDs for convenient access.
- Parameters:
mmif_obj – the JSON data
validate – whether to validate the data against the MMIF JSON schema.
Examples¶
Accessing objects by ID (high-level, convenient):
mmif = Mmif(mmif_json)
doc = mmif['m1']     # Get document by ID
view = mmif['v1']    # Get view by ID
ann = mmif['v1:a1']  # Get annotation by long-form ID

# Safe access with default:
doc = mmif.get('m99', default=None)
Accessing via underlying lists (positional access):
first_doc = mmif.documents[0]  # First document
last_view = mmif.views[-1]     # Last view
all_views = mmif.views[1:4]    # Slice of views
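Long-form IDs such as 'v1:a1' combine a view ID and a short annotation ID, joined by the id_delimiter value (':') listed for MmifObject. The sketch below shows one way such an ID could be split; `split_long_id` is a hypothetical helper, not an SDK function.

```python
ID_DELIMITER = ":"  # same value as the documented id_delimiter


def split_long_id(long_id):
    # 'v1:a1' -> ('v1', 'a1'); an ID without the delimiter has no view part.
    view_id, sep, short_id = long_id.partition(ID_DELIMITER)
    if not sep:
        return None, long_id
    return view_id, short_id


annotation_parts = split_long_id("v1:a1")
document_parts = split_long_id("m1")
```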
- add_document(document: Document, overwrite=False) None[source]¶
Appends a Document object to the documents list.
Fails if there is already a document with the same ID or a view with the same ID in the MMIF object.
- Parameters:
document – the Document object to add
overwrite – if set to True, will overwrite an existing document with the same ID
- Raises:
KeyError – if overwrite is set to False and an existing object (document or view) with the same ID exists
- Returns:
None
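The collision rule for add_document (documents and views share one ID space, and overwrite bypasses the check) can be modeled with a small sketch. `Registry` and its dict-based storage are hypothetical simplifications, not the SDK's data structures.

```python
class Registry:
    """Hypothetical container; documents and views share one ID space."""

    def __init__(self):
        self.documents = {}
        self.views = {}

    def add_document(self, doc_id, document, overwrite=False):
        # An ID is taken if either a document or a view already uses it.
        taken = doc_id in self.documents or doc_id in self.views
        if taken and not overwrite:
            raise KeyError(f"ID already in use: {doc_id}")
        self.documents[doc_id] = document


registry = Registry()
registry.add_document("d1", {"mime": "video/mp4"})
try:
    registry.add_document("d1", {"mime": "audio/wav"})
    collided = False
except KeyError:
    collided = True
# With overwrite=True the second document replaces the first.
registry.add_document("d1", {"mime": "audio/wav"}, overwrite=True)
```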
- add_view(view: View, overwrite=False) None[source]¶
Appends a View object to the views list.
Fails if there is already a view with the same ID or a document with the same ID in the MMIF object.
- Parameters:
view – the View object to add
overwrite – if set to True, will overwrite an existing view with the same ID
- Raises:
KeyError – if overwrite is set to False and an existing object (document or view) with the same ID exists
- Returns:
None
- generate_capital_annotations()[source]¶
Automatically converts any “pending” temporary properties from Document objects to Annotation objects. The generated Annotation objects are then added to the last View in the views list.
See https://github.com/clamsproject/mmif-python/issues/226 for rationale behind this behavior and discussion.
- get_alignments(at_type1: str | TypesBase, at_type2: str | TypesBase) Dict[str, List[Annotation]][source]¶
Finds views where alignments between two given annotation types occurred.
- Parameters:
at_type1 – the first annotation type to search for alignments
at_type2 – the second annotation type to search for alignments
- Returns:
a dict that is keyed by view IDs (str) and has lists of alignment Annotation objects as values.
- get_all_views_contain(*at_types: TypesBase | str) List[View][source]¶
Returns the list of all views in the MMIF if given types are present in that view’s ‘contains’ metadata.
- Parameters:
at_types – a list of types or just a type to check for. When given more than one type, all types must be found.
- Returns:
the list of views that contain the type
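The "all types must be found" rule can be sketched over plain dicts. The function `views_containing` and the simplified view structure (a dict with a 'contains' mapping) are assumptions for illustration only.

```python
def views_containing(views, *at_types):
    # A view matches only if every requested type is in its 'contains' keys.
    return [v for v in views if all(t in v["contains"] for t in at_types)]


views = [
    {"id": "v1", "contains": {"TimeFrame": {}}},
    {"id": "v2", "contains": {"TimeFrame": {}, "Alignment": {}}},
    {"id": "v3", "contains": {"Token": {}}},
]
both = [v["id"] for v in views_containing(views, "TimeFrame", "Alignment")]
single = [v["id"] for v in views_containing(views, "TimeFrame")]
```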
- get_all_views_with_error() List[View][source]¶
Returns the list of all views in the MMIF that have errors.
- Returns:
the list of views that contain errors but no annotations
- get_annotations_between_time(start: int | float, end: int | float, time_unit: str = 'ms', at_types: List[str | TypesBase] = []) Iterator[Annotation][source]¶
Finds annotations that are anchored between the given time points.
- Parameters:
start – the start time point
end – the end time point
time_unit – the unit of the input time points. Default is ms.
at_types – a list of annotation types to filter with. Any type in this list will be included in the return.
- Returns:
an iterator of Annotation objects that are anchored between the given time points
- get_document_by_id(doc_id: str) Document[source]¶
Deprecated since version 1.1.0: Will be removed in 2.0.0. Use general __getitem__() method instead, e.g., mmif[doc_id].
Finds a Document object with the given ID.
- Parameters:
doc_id – the ID to search for
- Returns:
a reference to the corresponding document, if it exists
- Raises:
KeyError – if there is no corresponding document
- get_document_location(m_type: DocumentTypes | str, path_only=False) str | None[source]¶
Method to get the location of first document of given type.
- Parameters:
m_type – the type to search for
path_only – if True, returns resolved file system path instead of location URI
- Returns:
the value of the location field in the corresponding document
- get_documents_by_app(app_id: str) List[Document][source]¶
Method to get all Document objects, queried by the name of the app that generated them.
- Parameters:
app_id – the app name to search for
- Returns:
a list of documents matching the requested app name, or an empty list if the app is not found
- get_documents_by_property(prop_key: str, prop_value: str) List[Document][source]¶
Method to retrieve documents by an arbitrary key-value pair in the document properties objects.
- Parameters:
prop_key – the metadata key to search for
prop_value – the metadata value to match
- Returns:
a list of documents matching the requested metadata key-value pair
- get_documents_by_type(doc_type: str | DocumentTypes) List[Document][source]¶
Method to get all documents where the type matches a particular document type, which should be one of the CLAMS document types.
- Parameters:
doc_type – the type of documents to search for, must be one of the Document types defined in the CLAMS vocabulary.
- Returns:
a list of documents matching the requested type, or an empty list if none found.
- get_documents_in_view(vid: str | None = None) List[Document][source]¶
Method to get all Document objects, queried by the ID of the view that contains them.
- Parameters:
vid – the source view ID to search for
- Returns:
a list of documents matching the requested source view ID, or an empty list if the view is not found
- get_documents_locations(m_type: DocumentTypes | str, path_only=False) List[str | None][source]¶
This method returns the file paths of documents of given type. Only top-level documents have locations, so we only check them.
- Parameters:
m_type – the type to search for
path_only – if True, returns resolved file system paths instead of location URIs
- Returns:
a list of the values of the location fields in the corresponding documents
- get_end(annotation: Annotation) int | float[source]¶
An alias to get_anchor_point method with start=False.
- get_last_error() str | None[source]¶
Returns the last error message found in the views.
- Returns:
the error message in human-readable format, or None if no error is found
- get_start(annotation: Annotation) int | float[source]¶
An alias to get_anchor_point method with start=True.
- get_view_by_id(view_id: str) View[source]¶
Deprecated since version 1.1.0: Will be removed in 2.0.0. Use general __getitem__() method instead, e.g., mmif[view_id].
Finds a View object with the given ID.
- Parameters:
view_id – the ID to search for
- Returns:
a reference to the corresponding view, if it exists
- Raises:
Exception – if there is no corresponding view
- get_view_contains(at_types: TypesBase | str | List[str | TypesBase]) View | None[source]¶
Returns the last view appended that contains the given types in its ‘contains’ metadata.
- Parameters:
at_types – a list of types or just a type to check for. When given more than one type, all types must be found.
- Returns:
the view, or None if the type is not found
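Returning the last matching view amounts to scanning the views list from the end. A minimal sketch over plain dicts follows; `last_view_containing` and the simplified view structure are hypothetical.

```python
def last_view_containing(views, at_types):
    # Accept a single type or a list of types.
    if isinstance(at_types, str):
        at_types = [at_types]
    # Walk the views from newest to oldest and stop at the first match.
    for view in reversed(views):
        if all(t in view["contains"] for t in at_types):
            return view
    return None


views = [
    {"id": "v1", "contains": {"TimeFrame": {}}},
    {"id": "v2", "contains": {"Token": {}}},
    {"id": "v3", "contains": {"TimeFrame": {}, "Token": {}}},
]
latest = last_view_containing(views, "TimeFrame")
nothing = last_view_containing(views, ["TimeFrame", "Alignment"])
```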
- get_view_with_error() View | None[source]¶
Returns the last view appended that contains an error.
- Returns:
the view, or None if no error is found
- get_views_contain(*at_types: TypesBase | str) List[View][source]¶
Returns the list of all views in the MMIF if given types are present in that view’s ‘contains’ metadata.
- Parameters:
at_types – a list of types or just a type to check for. When given more than one type, all types must be found.
- Returns:
the list of views that contain the type
- get_views_for_document(doc_id: str) List[View][source]¶
Returns the list of all views that have annotations anchored on a particular document. Note that when the document is inside a view (generated during the workflow’s running), doc_id must be prefixed with the view_id.
- get_views_with_error() List[View][source]¶
Returns the list of all views in the MMIF that have errors.
- Returns:
the list of views that contain errors but no annotations
- new_view() View[source]¶
Creates an empty view with a new ID and appends it to the views list.
- Returns:
a reference to the new View object
- sanitize()[source]¶
Sanitizes a Mmif object by running some safeguards. Concretely, it performs the following before returning the JSON string.
validating output using built-in MMIF jsonschema
removing non-existing annotation types from contains metadata
- serialize(sanitize: bool = False, autogenerate_capital_annotations: bool = True, **kwargs) str[source]¶
Serializes the MMIF object to a JSON string.
- Parameters:
sanitize – If True, performs some sanitization before returning the JSON string. See sanitize() for details.
autogenerate_capital_annotations – If True, automatically convert any “pending” temporary properties from Document objects to Annotation objects. See generate_capital_annotations() for details.
kwargs – Keyword arguments to pass to the parent’s serialize method (e.g., pretty=True, include_context=False).
- Returns:
JSON string of the MMIF object.
- static validate(json_str: bytes | str | dict) None[source]¶
Validates a MMIF JSON object against the MMIF Schema. Note that this method operates before processing by MmifObject._load_str, so it expects @ and not _ for the JSON-LD @-keys.
- Raises:
jsonschema.exceptions.ValidationError – if the input fails validation
- Parameters:
json_str – a MMIF JSON dict or string
- Returns:
None
view module¶
The view module contains the classes used to represent a MMIF view
as a live Python object.
In MMIF, views are created by apps in a workflow that are annotating data that was previously present in the MMIF file.
The View class is a high-level container that provides convenient
string-based access to annotations via view[id]. The underlying
annotations attribute is a list-like collection that uses integer indexing;
use container-level access for ID-based lookups.
- class mmif.serialize.view.Contain(mmif_obj: bytes | str | dict | None = None, *_)[source]¶
Bases:
DataDict[str, str]
Contain object that represents the metadata of a single annotation type in the contains metadata of a MMIF view.
- class mmif.serialize.view.View(view_obj: bytes | str | dict | None = None, parent_mmif=None, *_)[source]¶
Bases:
MmifObject
View object that represents a single view in a MMIF file.
A view is identified by an ID, and contains certain metadata, a list of annotations, and potentially a JSON-LD @context IRI.
This is a high-level container object that provides convenient string-based access to annotations using their IDs. The underlying annotations collection is list-like and uses integer indexing, but View itself accepts string IDs for convenient access.
If view_obj is not provided, an empty View will be generated.
- Parameters:
view_obj – the JSON data that defines the view
Examples¶
Accessing annotations by ID (high-level, convenient):
view = mmif['v1']
ann = view['v1:a1']   # Get annotation by ID
doc = view['v1:td1']  # Get document by ID

# Safe access with default:
ann = view.get('v1:a999', default=None)
Accessing via underlying list (positional access):
first_ann = view.annotations[0]    # First annotation
last_ann = view.annotations[-1]    # Last annotation
some_anns = view.annotations[1:5]  # Slice of annotations
- add_annotation(annotation: Annotation, overwrite=False) Annotation[source]¶
Adds an annotation to the current view.
Fails if there is already an annotation with the same ID in the view, unless overwrite is set to True.
- Parameters:
annotation – the mmif.serialize.annotation.Annotation object to add
overwrite – if set to True, will overwrite an existing annotation with the same ID
- Raises:
KeyError – if overwrite is set to False and an annotation with the same ID exists in the view
- Returns:
the same Annotation object passed in as annotation
- add_document(document: Document, overwrite=False) Annotation[source]¶
Appends a Document object to the annotations list.
Fails if there is already a document with the same ID in the annotations list.
- Parameters:
document – the Document object to add
overwrite – if set to True, will overwrite an existing document with the same ID
- Returns:
the added Document object (as an Annotation)
- get_annotation_by_id(ann_id) Annotation[source]¶
Deprecated since version 1.1.0: Will be removed in 2.0.0. Use general Mmif.__getitem__() method instead to retrieve any annotation across the MMIF, or View.__getitem__() to retrieve annotations within the view.
Thinly wraps the Mmif.__getitem__ method and returns an Annotation object. Note that although this method is under the View class, it can be used to retrieve any annotation across the entire MMIF.
- Parameters:
ann_id – the ID of the annotation to retrieve.
- Returns:
the found mmif.serialize.annotation.Annotation object.
- Raises:
KeyError – if the annotation with the given ID is not found
- get_annotations(at_type: str | TypesBase | None = None, **properties) Generator[Annotation, None, None][source]¶
Look for certain annotations in this view, specified by parameters
- Parameters:
at_type – @type of the annotations to look for. When this is None, any @type will match.
properties – properties of the annotations to look for. When given more than one property, all properties must match. Note that annotation type metadata are specified in the contains view metadata, not in individual annotation objects.
- get_document_by_id(doc_id) Document[source]¶
Deprecated since version 1.1.0: Will be removed in 2.0.0. Use general Mmif.__getitem__() method instead to retrieve any document across the MMIF, or View.__getitem__() to retrieve documents within the view.
Thinly wraps the Mmif.__getitem__ method and returns a Document object. Note that although this method is under the View class, it can be used to retrieve any document across the entire MMIF.
- Parameters:
doc_id – the ID of the document to retrieve.
- Returns:
the found mmif.serialize.annotation.Document object.
- Raises:
KeyError – if the document with the given ID is not found
- get_error() str | None[source]¶
Get the “text” representation of the error that occurred during processing. The text representation is supposed to be human-readable. When this view does not have any error, returns None.
- new_annotation(at_type: str | TypesBase, aid: str | None = None, overwrite=False, **properties) Annotation[source]¶
Generates a new mmif.serialize.annotation.Annotation object and adds it to the current view.
Fails if there is already an annotation with the same ID in the view, unless overwrite is set to True.
- Parameters:
at_type – the desired @type of the annotation.
aid – the desired ID of the annotation, when not given, the mmif SDK tries to automatically generate an ID based on Annotation type and existing annotations in the view.
overwrite – if set to True, will overwrite an existing annotation with the same ID.
- Raises:
KeyError – if overwrite is set to False and an annotation with the same ID exists in the view.
- Returns:
the generated mmif.serialize.annotation.Annotation
- new_contain(at_type: str | TypesBase, **contains_metadata) Contain | None[source]¶
Adds a new element to the contains metadata.
- Parameters:
at_type – the @type of the annotation type being added
contains_metadata – any metadata associated with the annotation type
- Returns:
the generated Contain object
- new_textdocument(text: str, lang: str = 'en', did: str | None = None, overwrite=False, **properties) Document[source]¶
Generates a new mmif.serialize.annotation.Document object, particularly typed as TextDocument, and adds it to the current view.
Fails if there is already a text document with the same ID in the view, unless overwrite is set to True.
- Parameters:
text – text content of the new document
lang – ISO 639-1 code of the language used in the new document
did – the desired ID of the document, when not given, the mmif SDK tries to automatically generate an ID based on Annotation type and existing documents in the view.
overwrite – if set to True, will overwrite an existing document with the same ID
- Raises:
KeyError – if overwrite is set to False and a document with the same ID exists in the view
- Returns:
the generated mmif.serialize.annotation.Document
- class mmif.serialize.view.ViewMetadata(viewmetadata_obj: bytes | str | dict | None = None, *_)[source]¶
Bases:
MmifObject
ViewMetadata object that represents the metadata object within a MMIF view.
- Parameters:
viewmetadata_obj – the JSON data that defines the metadata
- add_app_configuration(config_key: str, config_value: str | int | float | bool | None | List[str | int | float | bool | None]) None[source]¶
Add a configuration key-value pair to the app_configuration dictionary.
- add_parameter(param_key: str, param_value: str)[source]¶
Add a single runtime parameter to the view metadata. Note that parameter value must be a string.
- add_parameters(**runtime_params: str)[source]¶
Add runtime parameters as a batch (dict) to the view metadata. Note that parameter values must be strings.
- get_app_configuration(config_key: str) str | int | float | bool | None | List[str | int | float | bool | None][source]¶
Get a configuration value from the app_configuration dictionary.
- new_contain(at_type: str | TypesBase, **contains_metadata) Contain | None[source]¶
Adds a new element to the contains dictionary.
- Parameters:
at_type – the @type of the annotation type being added
contains_metadata – any metadata associated with the annotation type
- Returns:
the generated Contain object
annotation module¶
The annotation module contains the classes used to represent a
MMIF annotation as a live Python object.
In MMIF, annotations are created by apps in a workflow as a part
of a view. For documentation on how views are represented, see
mmif.serialize.view.
- class mmif.serialize.annotation.Annotation(anno_obj: bytes | str | dict | None = None, *_)[source]¶
Bases:
MmifObject
MmifObject that represents an annotation in a MMIF view.
- add_property(name: str, value: str | int | float | bool | None | List[str | int | float | bool | None] | List[List[str | int | float | bool | None]] | Dict[str, str | int | float | bool | None] | Dict[str, List[str | int | float | bool | None]]) None[source]¶
Adds a property to the annotation’s properties.
- Parameters:
name – the name of the property
value – the property’s desired value
- Returns:
None
- aligned_to_by(alignment: Annotation) Annotation | None[source]¶
Retrieves the other side of the Alignment annotation that has this annotation on one side.
- Parameters:
alignment – Alignment annotation that has this annotation on one side
- Returns:
the annotation that this annotation is aligned to (other side of Alignment), or None if this annotation is not used in the Alignment.
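The "other side" lookup can be sketched with plain dicts and ID strings. The property names 'source' and 'target' for the two sides of an Alignment are assumptions in this sketch, and `other_side` is a hypothetical helper that works on IDs rather than Annotation objects.

```python
def other_side(annotation_id, alignment):
    # 'source' and 'target' are assumed names for the two aligned sides.
    source, target = alignment["source"], alignment["target"]
    if annotation_id == source:
        return target
    if annotation_id == target:
        return source
    return None  # the annotation is not part of this alignment


alignment = {"id": "al1", "source": "tf1", "target": "tok1"}
from_source = other_side("tf1", alignment)
from_target = other_side("tok1", alignment)
unrelated = other_side("tf9", alignment)
```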
- property at_type: TypesBase¶
- static check_prop_value_is_simple_enough(value: str | int | float | bool | None | List[str | int | float | bool | None] | List[List[str | int | float | bool | None]] | Dict[str, str | int | float | bool | None] | Dict[str, List[str | int | float | bool | None]]) bool[source]¶
- get(prop_name: str, default=None)[source]¶
Safe property access with optional default value.
Searches for an annotation property by name and returns its value, or a default value if not found. This method searches in multiple locations with the following priority:
1. Direct properties (in annotation.properties)
2. Ephemeral properties (view-level metadata from contains)
3. Special fields (@type, properties)
This allows convenient access to properties without explicitly checking the properties object or view-level metadata.
- Parameters:
prop_name – The name of the property to retrieve
default – The value to return if the property is not found (default: None)
- Returns:
The property value, or the default value if not found
Examples¶
# Access annotation properties:
label = annotation.get('label', default='unknown')
start_time = annotation.get('start', default=0)

# Access @type:
at_type = annotation.get('@type')

# Safe access with custom default:
targets = annotation.get('targets', default=[])
See Also¶
__getitem__ : Direct property access that raises KeyError when not found
get_property : Alias for this method
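The prioritized search described for this method can be modeled as a first-hit-wins scan over several mappings. The function `lookup` and the three sample dicts are hypothetical; they mirror the documented order (direct, ephemeral, special) without reproducing the SDK's internals.

```python
_MISSING = object()  # sentinel so that stored None values are still found


def lookup(prop_name, direct, ephemeral, special, default=None):
    # Search the three sources in priority order; the first hit wins.
    for source in (direct, ephemeral, special):
        value = source.get(prop_name, _MISSING)
        if value is not _MISSING:
            return value
    return default


direct = {"label": "speech", "start": 1000}
ephemeral = {"timeUnit": "milliseconds", "label": "from-view-metadata"}
special = {"@type": "TimeFrame"}

label = lookup("label", direct, ephemeral, special)
unit = lookup("timeUnit", direct, ephemeral, special)
targets = lookup("targets", direct, ephemeral, special, default=[])
```

Using a sentinel instead of checking for None keeps a legitimately stored None value from being mistaken for a missing property.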
- get_all_aligned() Iterator[Annotation][source]¶
Generator to iterate through all alignments and aligned annotations. Note that this generator will yield the Alignment annotations as well. Every odd-numbered yield will be an Alignment annotation, and every even-numbered yield will be the aligned annotation. If there’s a specific annotation type that you’re looking for, you need to filter the generated results outside.
- Returns:
yields the alignment annotation and the aligned annotation. The order is decided by the order of appearance of Alignment annotations in the MMIF
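The alternating yield order can be illustrated with a small generator over plain dicts. As in the description above, each Alignment is yielded first, followed by the annotation on its other side; the 'source' and 'target' keys and the function `all_aligned` are assumptions for this sketch.

```python
def all_aligned(annotation_id, alignments):
    # For each alignment that involves the annotation, yield the alignment
    # first and then the ID of the annotation on its other side.
    for alignment in alignments:
        sides = (alignment["source"], alignment["target"])
        if annotation_id in sides:
            yield alignment
            yield sides[1] if sides[0] == annotation_id else sides[0]


alignments = [
    {"id": "al1", "source": "tf1", "target": "tok1"},
    {"id": "al2", "source": "tf2", "target": "tok2"},
    {"id": "al3", "source": "tok3", "target": "tf1"},
]
results = list(all_aligned("tf1", alignments))
aligned_only = results[1::2]  # skip the Alignment entries
```

Slicing with a step of two is one simple way to filter the output outside the generator when only the aligned annotations are needed.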
- get_property(prop_name: str, default=None)[source]¶
Safe property access with optional default value.
Searches for an annotation property by name and returns its value, or a default value if not found. This method searches in multiple locations with the following priority:
1. Direct properties (in annotation.properties)
2. Ephemeral properties (view-level metadata from contains)
3. Special fields (@type, properties)
This allows convenient access to properties without explicitly checking the properties object or view-level metadata.
- Parameters:
prop_name – The name of the property to retrieve
default – The value to return if the property is not found (default: None)
- Returns:
The property value, or the default value if not found
Examples¶
# Access annotation properties:
label = annotation.get('label', default='unknown')
start_time = annotation.get('start', default=0)

# Access @type:
at_type = annotation.get('@type')

# Safe access with custom default:
targets = annotation.get('targets', default=[])
See Also¶
__getitem__ : Direct property access that raises KeyError when not found
get_property : Alias for this method
- property id: str¶
- property long_id: str¶
- property parent: str¶
- class mmif.serialize.annotation.AnnotationProperties(mmif_obj: bytes | str | dict | None = None, *_)[source]¶
Bases:
MmifObject, MutableMapping[str, T]
AnnotationProperties object that represents the properties object within a MMIF annotation.
- Parameters:
mmif_obj – the JSON data that defines the properties
- class mmif.serialize.annotation.Document(doc_obj: bytes | str | dict | None = None, *_)[source]¶
Bases:
Annotation
Document object that represents a single document in a MMIF file.
A document is identified by an ID, and contains certain attributes and potentially contains the contents of the document itself, metadata about how the document was created, and/or a list of subdocuments grouped together logically.
If
document_objis not provided, an empty Document will be generated.- Parameters:
document_obj – the JSON data that defines the document
- add_property(name: str, value: str | int | float | bool | None | List[str | int | float | bool | None]) None[source]¶
Adds a property to the document’s properties.
Unlike the parent Annotation class, properties added to a Document object can be lost during serialization unless the document belongs to a Mmif object. This is because we want to keep Document objects as “read-only” as possible. Thus, if you want to add a property to a Document object, either
- add the document to a Mmif object (either in the documents list or in a view from the views list), or
- write directly to Document.properties instead of using this method (writing directly is not recommended).
With the former method, the SDK will record the added property as an Annotation annotation object, separate from the original Document object. See Mmif.generate_capital_annotations() for more.
A few notes to keep in mind:
- You can’t overwrite an existing property of a Document object.
- A MMIF can have multiple Annotation objects with the same property name but different values. When this happens, the SDK will only keep the latest value (in order of appearance in the views list) of the property, effectively overwriting the previous values.
- Parameters:
name – the name of the property
value – the property’s desired value (note: Document accepts fewer value types than Annotation)
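The "latest value wins" note above can be illustrated with plain dictionaries. This is only a sketch of the rule as stated, not SDK code: latest_values and the (view_id, properties) pairs are hypothetical, and the list order stands in for the order of views in the views list.

```python
def latest_values(views):
    """Collapse per-view property records, letting later views override earlier ones."""
    latest = {}
    for _view_id, props in views:
        latest.update(props)  # a later view replaces any earlier value for the same name
    return latest


# Hypothetical records: two views each attach properties to the same document.
views = [
    ("v1", {"author": "alice", "language": "en"}),
    ("v2", {"author": "bob"}),
]

resolved = latest_values(views)
```

Here resolved keeps "language" from the first view, while "author" takes the value from the second view because that view appears later.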
- get(prop_name, default=None)[source]¶
Safe property access with optional default value for Document objects.
Searches for a document property by name and returns its value, or a default value if not found. Documents have a more complex property hierarchy than regular annotations:
Priority order (highest to lowest):
1. Special fields (‘id’, ‘location’)
2. Pending properties (added after creation, to be serialized as Annotation objects)
3. Ephemeral properties (from existing Annotation annotations or view metadata)
4. Original properties (in document.properties)
This allows convenient access to all document properties regardless of where they’re stored internally.
- Parameters:
prop_name – The name of the property to retrieve
default – The value to return if the property is not found (default: None)
- Returns:
The property value, or the default value if not found
Examples¶
# Access document properties:
mime = document.get('mime', default='application/octet-stream')
location = document.get('location')
# Access properties added after creation (pending):
author = document.get('author', default='anonymous')
publisher = document.get('publisher')
# Access ephemeral properties from Annotation objects:
sentiment = document.get('sentiment', default='neutral')
See Also¶
add_property : Add a new property to the document
Mmif.generate_capital_annotations : How pending properties are serialized
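One way to picture the four-level priority order of get is a layered mapping. The sketch below uses collections.ChainMap from the standard library, which searches its mappings from first to last. It is an illustration only: the four dictionaries are hypothetical stand-ins, and nothing here claims that the SDK stores document properties this way.

```python
from collections import ChainMap

# Hypothetical layers, listed from highest to lowest priority.
special = {"id": "d1", "location": "file:///data/video.mp4"}
pending = {"author": "alice"}
ephemeral = {"author": "bob", "sentiment": "positive"}
original = {"mime": "video/mp4", "sentiment": "unknown"}

# ChainMap returns the value from the first mapping that contains the key.
lookup = ChainMap(special, pending, ephemeral, original)

author = lookup.get("author")              # pending outranks ephemeral
sentiment = lookup.get("sentiment")        # ephemeral outranks original
mime = lookup.get("mime")                  # only present in original
publisher = lookup.get("publisher", "n/a")  # absent everywhere, so the default is used
```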
- get_property(prop_name, default=None)[source]¶
Safe property access with optional default value for Document objects.
Searches for a document property by name and returns its value, or a default value if not found. Documents have a more complex property hierarchy than regular annotations:
Priority order (highest to lowest):
1. Special fields (‘id’, ‘location’)
2. Pending properties (added after creation, to be serialized as Annotation objects)
3. Ephemeral properties (from existing Annotation annotations or view metadata)
4. Original properties (in document.properties)
This allows convenient access to all document properties regardless of where they’re stored internally.
- Parameters:
prop_name – The name of the property to retrieve
default – The value to return if the property is not found (default: None)
- Returns:
The property value, or the default value if not found
Examples¶
# Access document properties:
mime = document.get('mime', default='application/octet-stream')
location = document.get('location')
# Access properties added after creation (pending):
author = document.get('author', default='anonymous')
publisher = document.get('publisher')
# Access ephemeral properties from Annotation objects:
sentiment = document.get('sentiment', default='neutral')
See Also¶
add_property : Add a new property to the document
Mmif.generate_capital_annotations : How pending properties are serialized
- property location: str | None¶
The location property must be a legitimate URI. That is, if the document is a local file, the file:// scheme must be used. Returns None when no location is set.
- location_address() str | None[source]¶
Retrieves the full address from the document location URI. Returns None when no location is set.
- location_path(nonexist_ok=True) str | None[source]¶
Retrieves a path that’s resolved to a pathname in the local file system. To obtain the original value of the “path” part in the location string (before resolving), use the properties.location_path_literal method. Returns None when no location is set.
- Parameters:
nonexist_ok – if False, raise FileNotFoundError when the resolved path doesn’t exist
- location_scheme() str | None[source]¶
Retrieves URI scheme of the document location. Returns None when no location is set.
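To make the scheme, address, and path parts of a location URI concrete, here is a small sketch built on urllib.parse.urlparse from the standard library. The helper split_location is hypothetical, and treating the "full address" as host plus path is an assumption made for illustration, not a statement about how the SDK computes it.

```python
from urllib.parse import urlparse


def split_location(location):
    """Break a location URI into the parts the location_* methods describe."""
    if location is None:
        return None  # mirrors "Returns None when no location is set"
    parsed = urlparse(location)
    return {
        "scheme": parsed.scheme,
        # Assumption: "full address" means host (if any) followed by the path.
        "address": parsed.netloc + parsed.path,
        "path": parsed.path,
    }


local = split_location("file:///var/archive/video.mp4")
remote = split_location("https://example.com/media/video.mp4")
```

For the local file the host part is empty, so the address and the path are identical. For the remote document the address also carries the host name.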
- property text_language: str¶
- property text_value: str¶
- class mmif.serialize.annotation.DocumentProperties(mmif_obj: bytes | str | dict | None = None, *_)[source]¶
Bases: AnnotationProperties
DocumentProperties object that represents the properties object within a MMIF document.
- Parameters:
mmif_obj – the JSON data that defines the properties
- property location: str | None¶
The location property must be a legitimate URI. That is, if the document is a local file, the file:// scheme must be used. Returns None when no location is set.
- location_address() str | None[source]¶
Retrieves the full address from the document location URI. Returns None when no location is set.
- location_path() str | None[source]¶
Deprecated since version 1.0.2: Will be removed in 2.0.0. Use location_path_resolved() instead.
- location_path_literal() str | None[source]¶
Retrieves only the path name of the document location (hostname is ignored). Returns None when no location is set.
- location_path_resolved(nonexist_ok=True) str | None[source]¶
Retrieves only the path name of the document location (hostname is ignored), and then tries to resolve the path name in the local file system. This method should be used when the document scheme is file or empty. For other schemes, users should install the mmif-locdoc-<scheme> plugin.
Returns None when no location is set. Raises ValueError when no code is found to resolve the given location scheme.
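The per-scheme behaviour described above (a built-in handler for file or empty schemes, plugins for the rest, ValueError otherwise) resembles a registry keyed by scheme. The sketch below shows that general pattern only. RESOLVERS, register, and resolve_location are hypothetical names, the plugin discovery mechanism of the SDK is not modeled, and the nonexist_ok existence check is left out.

```python
from urllib.parse import urlparse

RESOLVERS = {}  # hypothetical registry: URI scheme -> resolver callable


def register(scheme):
    """Decorator that registers a resolver for one URI scheme."""
    def decorator(func):
        RESOLVERS[scheme] = func
        return func
    return decorator


@register("file")
@register("")  # an empty scheme is treated like a local file
def resolve_local(parsed):
    return parsed.path


def resolve_location(location):
    """Resolve a location URI to a path, dispatching on its scheme."""
    if location is None:
        return None
    parsed = urlparse(location)
    resolver = RESOLVERS.get(parsed.scheme)
    if resolver is None:
        raise ValueError(f"no resolver registered for scheme {parsed.scheme!r}")
    return resolver(parsed)


with_scheme = resolve_location("file:///tmp/a.txt")
without_scheme = resolve_location("/tmp/a.txt")
```

A plugin for another scheme would, in this picture, simply add one more entry to the registry. Until it does, a location with that scheme raises ValueError.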
- location_scheme() str | None[source]¶
Retrieves URI scheme of the document location. Returns None when no location is set.
- property text_language: str¶
- property text_value: str¶
- class mmif.serialize.annotation.Text(text_obj: bytes | str | dict | None = None, *_)[source]¶
Bases: MmifObject
- property lang: str¶
- property value: str¶