pycognaize.common.utils.find_first_word_coords

find_first_word_coords(text, ocr_data, case_sensitive=False, sort=False, clean=True, cleanup_regex=re.compile('[^a-zA-Z\\\\d)\\\\[\\\\](-.,]'))

Detect the coordinates of the first occurrence of text in ocr_data, if any.

If the text is not found in ocr_data, None is returned.

ocr_data is a list of dictionaries, each holding information about a single word. Each word dictionary has the following keys: confidence, right, left, top, bottom, ocr_text, word_id_number.
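To make the expected shape of ocr_data concrete, here is a small hand-written sample. Only the key names are taken from the description above; the values, their units (pixels are assumed here), and the helper has_expected_keys are hypothetical and exist purely for illustration.

```python
# Hypothetical sample of ocr_data: two words on one line of a page.
# Key names follow the documentation; values and units are made up.
ocr_data = [
    {
        "word_id_number": 0,
        "ocr_text": "Total",
        "confidence": 98.2,
        "left": 100, "right": 160, "top": 50, "bottom": 70,
    },
    {
        "word_id_number": 1,
        "ocr_text": "Revenue:",
        "confidence": 95.7,
        "left": 170, "right": 260, "top": 48, "bottom": 72,
    },
]

# The seven keys each word dictionary is documented to carry.
EXPECTED_KEYS = {
    "confidence", "right", "left", "top", "bottom",
    "ocr_text", "word_id_number",
}


def has_expected_keys(word: dict) -> bool:
    """Return True if the word dictionary carries every documented key."""
    return EXPECTED_KEYS <= set(word)


all_valid = all(has_expected_keys(word) for word in ocr_data)
```

A check like this can be useful before calling the function, since a word dictionary missing a coordinate key would presumably cause a lookup error.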

Parameters:

case_sensitive (bool) – If True, the search will be case-sensitive
sort (bool) – If True, ocr_data will be ordered by the word_id_number key before searching
clean (bool) – If True, disregard all non-alphanumeric characters in the search
cleanup_regex (re._pattern_type) – Optional. The regex to use for cleanup (has an effect only if clean=True)
text (str) – The text to search for
ocr_data (list) – List of word dictionaries
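The effect of the default cleanup_regex can be checked in isolation. The sketch below assumes that the pattern in the signature corresponds to the raw string used here, and that cleanup deletes every matched character; neither is confirmed by this page. The helper clean_text is hypothetical.

```python
import re

# Assumption: raw-string form of the default pattern shown in the signature.
# Inside the character class, "(-." is a range covering ( ) * + , - .
cleanup_regex = re.compile(r"[^a-zA-Z\d)\[\](-.,]")


def clean_text(s: str) -> str:
    """Delete every character the pattern matches (assumed usage)."""
    return cleanup_regex.sub("", s)


cleaned = clean_text("Total: $1,234.50 (USD)")
print(cleaned)  # Total1,234.50(USD)
```

Under these assumptions the default keeps letters, digits, square brackets, and the characters ( ) * + , - . while dropping whitespace, colons, currency symbols, and other punctuation. Pass a different compiled pattern as cleanup_regex if that is too strict or too lenient for a given document.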

Return type:

Optional[dict]

Returns:

Dictionary with word coordinates (keys: left, right, top, bottom, matched_words), or None if the text is not found. matched_words contains the original word coordinate data for the matched words.
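The following is a minimal, self-contained sketch of the documented behaviour, not the pycognaize implementation. The function name find_first_sketch, the whitespace tokenisation of text, the word-by-word matching, and the bounding-box rule (minimum left/top, maximum right/bottom over the matched words) are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumption: raw-string form of the default pattern in the signature.
DEFAULT_CLEANUP = re.compile(r"[^a-zA-Z\d)\[\](-.,]")


def find_first_sketch(text, ocr_data, case_sensitive=False, sort=False,
                      clean=True, cleanup_regex=DEFAULT_CLEANUP) -> Optional[dict]:
    """Hypothetical stand-in that mimics the documented parameters."""

    def norm(s: str) -> str:
        # Optionally strip characters matched by the cleanup pattern,
        # then fold case unless the search is case-sensitive.
        if clean:
            s = cleanup_regex.sub("", s)
        return s if case_sensitive else s.lower()

    # Optionally order the words by their word_id_number key.
    words = sorted(ocr_data, key=lambda w: w["word_id_number"]) if sort else list(ocr_data)

    # Assumed tokenisation: split the query on whitespace.
    targets = [t for t in (norm(t) for t in text.split()) if t]
    if not targets:
        return None

    tokens = [norm(w["ocr_text"]) for w in words]
    n = len(targets)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == targets:
            matched = words[i:i + n]
            # Assumed rule: the result is the box enclosing all matched words.
            return {
                "left": min(w["left"] for w in matched),
                "right": max(w["right"] for w in matched),
                "top": min(w["top"] for w in matched),
                "bottom": max(w["bottom"] for w in matched),
                "matched_words": matched,
            }
    return None


# Hypothetical sample data; values and units are made up.
sample = [
    {"word_id_number": 0, "ocr_text": "Total", "confidence": 98.2,
     "left": 100, "right": 160, "top": 50, "bottom": 70},
    {"word_id_number": 1, "ocr_text": "Revenue:", "confidence": 95.7,
     "left": 170, "right": 260, "top": 48, "bottom": 72},
]

result = find_first_sketch("total revenue", sample)
missing = find_first_sketch("profit", sample)
strict = find_first_sketch("total revenue", sample, clean=False)
```

In this sketch, result covers both words, missing is None because the text does not occur, and strict is also None because, with clean=False, the trailing colon in "Revenue:" prevents a match. The real function may tokenise and match differently.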