pycognaize.document.tag.html_tag.HTMLTag

class HTMLTag(raw_value, raw_ocr_value, is_table, html_id, field_id, tag_id, row_index, col_index, xpath, is_td=True)

Bases: HTMLTagABC

Parameters:
  • raw_value (str)

  • raw_ocr_value (str)

  • is_table (bool)

  • html_id (Union[str, List[str]])

  • field_id (Optional[str])

  • tag_id (Optional[str])

  • row_index (int)

  • col_index (int)

  • xpath (str)

  • is_td (bool)
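
The parameter list above can be read as a plain record of ten fields. As a minimal sketch, the stand-in dataclass below mirrors the documented names, types, and the one documented default (is_td=True). HTMLTagSketch is a hypothetical name used for illustration only; it is not pycognaize's class and does not reproduce any of its behavior. All argument values are invented examples.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class HTMLTagSketch:
    """Hypothetical stand-in that mirrors the documented HTMLTag parameters.

    Not pycognaize's implementation: it only records argument names,
    types, and the documented default for is_td.
    """
    raw_value: str
    raw_ocr_value: str
    is_table: bool
    html_id: Union[str, List[str]]  # a single id or a list of ids
    field_id: Optional[str]
    tag_id: Optional[str]
    row_index: int
    col_index: int
    xpath: str
    is_td: bool = True


# Illustrative values only; real values would come from pycognaize data.
tag = HTMLTagSketch(
    raw_value="1,250",
    raw_ocr_value="1,250",
    is_table=True,
    html_id=["cell-3-2"],
    field_id=None,
    tag_id=None,
    row_index=3,
    col_index=2,
    xpath="/html/body/table/tr[4]/td[3]",
)
print(tag.is_td, tag.row_index, tag.col_index)
```

With the library installed, the same arguments would presumably be passed to HTMLTag imported from pycognaize.document.tag.html_tag, the module path given at the top of this page.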

Methods

construct_from_raw

Build an HTMLTag from pycognaize raw data.

set_class_confidence

to_dict

Converts tag to dict

Attributes

col_index

field_id

html_id

is_table

is_td

raw_ocr_value

raw_value

Returns the adjusted value.

row_index

tag_id

xpath

classmethod construct_from_raw(raw, html)

Build an HTMLTag from pycognaize raw data.

Parameters:
  • raw (dict) – pycognaize field's tag info

  • html (HTML)

Return type:

HTMLTag

property raw_value

Returns the adjusted value.

to_dict()

Converts tag to dict

Return type:

dict