pycognaize.common.table_utils.assign_indices_to_tables
- assign_indices_to_tables(tables, all_tables=None, threshold=0.4)
- If the document is an XBRL document,
the function matches the tables based on the ordering of all tables.
- If it’s not an XBRL document,
the tables are grouped by page and, within each page, sorted by their left edge and ordered horizontally and vertically.
- Returns a dict whose keys are indices based on the above-mentioned ordering
and whose values are the corresponding tables.
- Parameters:
tables – a list of tables that need to be indexed
all_tables (Optional[list]) – a list of all tables in the document. This parameter is required if the tables are from an XBRL document
threshold (float) – intersection threshold
- Return type:
dict
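
The non-XBRL ordering described above can be illustrated with a minimal, self-contained sketch. It does not call pycognaize and is not the library's implementation: the `Box` class, the `vertical_overlap` helper, and the `index_tables_sketch` function are hypothetical names introduced here. The sketch also assumes that `threshold` decides whether two tables on the same page sit side by side, and that the returned keys are consecutive integers starting at 0. Neither assumption is confirmed by the documentation above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a table's location; not a pycognaize type."""
    name: str
    page: int
    left: float
    top: float
    right: float
    bottom: float


def vertical_overlap(a: Box, b: Box) -> float:
    """Share of the shorter box's height that the two boxes have in common."""
    shared = min(a.bottom, b.bottom) - max(a.top, b.top)
    shorter = min(a.bottom - a.top, b.bottom - b.top)
    return max(shared, 0.0) / shorter if shorter > 0 else 0.0


def index_tables_sketch(boxes, threshold=0.4):
    """Group by page, order top-to-bottom then left-to-right, number from 0."""
    by_page = defaultdict(list)
    for box in boxes:
        by_page[box.page].append(box)

    ordered = []
    for page in sorted(by_page):
        rows = []  # each row is a list of boxes that sit side by side
        for box in sorted(by_page[page], key=lambda b: b.top):
            # Assumption: a box joins the current row when it overlaps the
            # row's first box vertically by at least `threshold`.
            if rows and vertical_overlap(rows[-1][0], box) >= threshold:
                rows[-1].append(box)
            else:
                rows.append([box])
        for row in rows:
            ordered.extend(sorted(row, key=lambda b: b.left))

    # Assumption: keys are consecutive integers in the resulting order.
    return dict(enumerate(ordered))


# Usage: two tables side by side on page 1, one below them, one on page 0.
boxes = [
    Box("a", page=1, left=300, top=100, right=500, bottom=200),
    Box("b", page=1, left=50, top=110, right=250, bottom=210),
    Box("c", page=1, left=50, top=400, right=250, bottom=500),
    Box("d", page=0, left=50, top=50, right=250, bottom=150),
]
result = index_tables_sketch(boxes)
print([result[i].name for i in sorted(result)])  # ['d', 'b', 'a', 'c']
```

Grouping into rows before sorting by the left edge keeps two neighbouring tables in reading order even when their top edges differ by a few pixels. Whether the library resolves such cases the same way is not stated above.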