Cheshire3 Object Model - Document


class cheshire3.baseObjects.Document(data, creator='', history=[], mimeType='', parent=None, filename='', tagName='', byteCount=0, byteOffset=0, wordCount=0)

A Document is a wrapper for raw data and its metadata.

A Document is the raw data which will become a Record. It may be processed into a Record by a Parser, or into another Document type by a PreParser. Documents may be stored in a DocumentStore if necessary, but can generally be discarded once processed. A Document may be anything from a JPG file, to an unparsed XML file, to a string containing a URL. This allows for future compatibility with new formats, which can be incorporated into the system by implementing a Document type and a PreParser.
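The wrapper idea described above can be illustrated with a minimal sketch. It does not use Cheshire3 itself: SketchDocument and gunzip_preparser are hypothetical stand-ins invented for this example, and only the constructor argument names (data, creator, history, mimeType, parent, filename) are taken from the signature above. The stand-in's get_raw method takes no arguments, which may differ from the real class. The sketch assumes a PreParser turns one document into another and records where the data came from in history.

```python
import gzip


class SketchDocument:
    """Hypothetical stand-in for a Document: raw data plus metadata."""

    def __init__(self, data, creator='', history=None, mimeType='',
                 parent=None, filename=''):
        self.data = data
        self.creator = creator
        # Copy the list so that documents never share one history object.
        self.history = list(history) if history else []
        self.mimeType = mimeType
        self.parent = parent
        self.filename = filename

    def get_raw(self):
        """Return the raw data held by this document."""
        return self.data


def gunzip_preparser(doc):
    """Hypothetical PreParser: turn a gzipped document into a plain one."""
    return SketchDocument(
        gzip.decompress(doc.get_raw()),
        creator='gunzip_preparser',
        # Record the creator of the source document as provenance.
        history=doc.history + [doc.creator],
        mimeType='text/xml',
        filename=doc.filename,
    )


# Wrap some compressed bytes together with their metadata.
compressed = SketchDocument(
    gzip.compress(b'<record><title>Example</title></record>'),
    creator='file_loader',
    mimeType='application/gzip',
    filename='example.xml.gz',
)

# The PreParser yields a new document; the original is left untouched
# and can be discarded.
plain = gunzip_preparser(compressed)
```

In this sketch the new format (gzip) is supported by adding only a transformation function, which mirrors the extension route the paragraph describes: a Document type plus a PreParser.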


get_raw(session)

Return the raw data associated with this document.


The following implementations are included in the distribution by default:

class cheshire3.document.StringDocument(data, creator='', history=[], mimeType='', parent=None, filename=None, tagName='', byteCount=0, byteOffset=0, wordCount=0)