pycognaize.index.Index

class Index(token, url)

Bases: object

Use this abstract class to create task-specific Index subclasses, which allow comparison and matching between pycognaize documents

Parameters:
  • token (str)

  • url (str)
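
The subclassing pattern described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: `IndexSketch` is a hypothetical stand-in that mirrors the two documented abstract methods so the example runs without pycognaize or credentials, documents are plain strings rather than `Document` objects, and `full_index` is assumed to map document ids to encodings (the shape that `response_to_dict` is documented to return). The Jaccard-overlap scoring is an arbitrary choice made for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, FrozenSet, Optional, Tuple


class IndexSketch(ABC):
    """Hypothetical stand-in mirroring the abstract methods of Index."""

    @abstractmethod
    def build(self, document: Any) -> Any:
        """Return a task-specific encoding of the document."""

    @abstractmethod
    def match(self, document: Any, full_index: dict) -> Tuple[Any, float]:
        """Return the best match and a confidence value from 0 to 100."""


class WordSetIndex(IndexSketch):
    """Toy subclass: encodes a text document as its set of lowercase words."""

    def build(self, document: str) -> FrozenSet[str]:
        return frozenset(document.lower().split())

    def match(
        self, document: str, full_index: Dict[str, FrozenSet[str]]
    ) -> Tuple[Optional[str], float]:
        query = self.build(document)
        best_id, best_score = None, 0.0
        for doc_id, encoding in full_index.items():
            union = query | encoding
            # Jaccard similarity scaled to the documented 0..100 range.
            score = 100.0 * len(query & encoding) / len(union) if union else 0.0
            if score > best_score:
                best_id, best_score = doc_id, score
        return best_id, best_score


index = WordSetIndex()
full_index = {
    "a": index.build("annual report 2020"),
    "b": index.build("invoice for services"),
}
best_id, confidence = index.match("annual report 2021", full_index)
print(best_id, confidence)
```

In a real subclass, `build` and `match` would operate on pycognaize `Document` objects, and the inherited `build_and_store` and `match_and_get` methods would handle storage and retrieval.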

Methods

build

Build an encoding for a document, which can then be used for comparison and matching

build_and_store

Builds an encoded document and stores it in the session

match

Match a given document with an existing document in the index

match_and_get

Match a given document with an existing document in the index

response_to_dict

Returns a dictionary representing the response where keys are document ids and values are the encoded documents

Attributes

id

abstract build(document)

Build an encoding for a document, which can then be used for comparison and matching. Returns the final encoding. The type of the encoding is task-specific, so any type is allowed.

Parameters:

document (Document) – The document object to be built

Return type:

Any

build_and_store(document)

Builds an encoded document and stores it in the session

Parameters:

document (Document)

Return type:

Response

abstract match(document, full_index)

Match a given document with an existing document in the index. Returns a tuple with the matched document and a confidence value from 0 to 100.

Parameters:

document (Document) – The document to be matched with an existing one

full_index (dict) – The document encoding

Return type:

Document

match_and_get(document)

Match a given document with an existing document in the index

Parameters:

document (Document)

Return type:

Document

static response_to_dict(response)

Returns a dictionary representing the response where keys are document ids and values are the encoded documents