clams.source package

Module contents

class clams.source.WorkflowSource(common_documents_json: List[str | dict] | None = None, common_metadata_json: str | dict | None = None)

Bases: object

A WorkflowSource object is used at the beginning of a CLAMS workflow to populate a new MMIF file with media.

The same WorkflowSource object can be used repeatedly to generate multiple MMIF objects.

Parameters:
  • common_documents_json – a list of JSON documents (strings or dicts) that should be common to all MMIF objects produced by this workflow source.

  • common_metadata_json – JSON (string or dict) for metadata that should be common to all MMIF objects produced by this workflow source.

add_document(document: str | dict | Document) → None

Adds a document to the working source MMIF.

When you’re done, fetch the source MMIF with produce().

Parameters:

document – the document to add, as a JSON dict or string, or as a MMIF Document object

change_metadata(key: str, value)

Adds or changes a metadata entry in the working source MMIF.

Parameters:
  • key – the desired key of the metadata property

  • value – the desired value of the metadata property

from_data(doc_lists: Iterable[List[str | dict | Document]], metadata_objs: Iterable[str | dict | MmifMetadata | None] | None = None) → Generator[Mmif, None, None]

Provided with an iterable of document lists and an optional iterable of metadata objects, generates MMIF objects produced from that data.

doc_lists and metadata_objs should be matched pairwise, so that if they are zipped together, each pair defines a single MMIF object from this workflow source.

Parameters:
  • doc_lists – an iterable of document lists to generate MMIF from

  • metadata_objs – an iterable of metadata objects paired with the document lists

Returns:

a generator of the MMIF objects produced from the data
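The pairwise matching described for from_data() can be illustrated with plain Python. The sketch below is not the clams implementation: pair_inputs is a hypothetical helper, plain dicts stand in for Mmif objects, and the handling of a missing metadata iterable (pairing every document list with None) is an assumption.

```python
from itertools import repeat
from typing import Any, Dict, Iterable, Iterator, List, Optional


def pair_inputs(
    doc_lists: Iterable[List[Any]],
    metadata_objs: Optional[Iterable[Any]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield one plain-dict stand-in per (document list, metadata) pair."""
    if metadata_objs is None:
        # Assumption: with no metadata iterable, each document list is paired with None.
        metadata_objs = repeat(None)
    # zip() matches the two iterables position by position.
    for docs, metadata in zip(doc_lists, metadata_objs):
        yield {"documents": list(docs), "metadata": metadata}


# Two document lists, so two stand-in objects are generated.
doc_lists = [["video-1", "transcript-1"], ["video-2"]]
metadata_objs = [{"batch": "a"}, None]
paired = list(pair_inputs(doc_lists, metadata_objs))
```

Because the pairing is positional, the two iterables need to be in the same order; a metadata entry of None simply means no extra metadata for that document list.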

mmif: Mmif

prime() → None

Primes the WorkflowSource with a fresh MMIF object.

Call this method if you want to reset the WorkflowSource without producing a MMIF object with produce().

produce() → Mmif

Returns the source MMIF and resets the WorkflowSource.

Call this method once you have added all the documents for your workflow.

Returns:

the current MMIF object that has been prepared
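The add, produce and prime cycle documented above can be modeled with a small stand-in class. This is a sketch only, not the clams source code: SourceSketch is a hypothetical name, plain dicts and strings stand in for Mmif and Document objects, and re-seeding the common documents and metadata on every reset is inferred from the parameter descriptions.

```python
import copy
from typing import Any, Dict, List, Optional


class SourceSketch:
    """Hypothetical stand-in for the add/produce/prime cycle; not the clams code."""

    def __init__(
        self,
        common_documents: Optional[List[Any]] = None,
        common_metadata: Optional[Dict[str, Any]] = None,
    ) -> None:
        self._common_documents = list(common_documents or [])
        self._common_metadata = dict(common_metadata or {})
        self.working: Dict[str, Any] = {}
        self.prime()

    def prime(self) -> None:
        # Start a fresh working object seeded with the common parts.
        self.working = {
            "metadata": copy.deepcopy(self._common_metadata),
            "documents": copy.deepcopy(self._common_documents),
        }

    def add_document(self, document: Any) -> None:
        self.working["documents"].append(document)

    def change_metadata(self, key: str, value: Any) -> None:
        self.working["metadata"][key] = value

    def produce(self) -> Dict[str, Any]:
        # Hand back the prepared object, then reset for the next one.
        produced = self.working
        self.prime()
        return produced


source = SourceSketch(common_documents=["shared-doc"], common_metadata={"project": "demo"})
source.add_document("video-1")
first = source.produce()

source.add_document("video-2")
source.prime()  # discards video-2 without producing anything
source.add_document("video-3")
second = source.produce()
```

In this model, first holds the shared document plus video-1, while second holds the shared document plus video-3 only, since the call to prime() dropped video-2 before anything was produced.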