table-extract
Extract structured tables from HTML or CSV sources into normalized rows and columns.
Why install it
Research workflows frequently encounter useful data trapped in tables. This tool gives agents a portable table extraction primitive without requiring a full scraping stack.
Inputs
- source_type: html, csv, or auto
- path: optional local file path for the source
- html_text: optional raw HTML content
- csv_text: optional raw CSV content
- table_index: zero-based table index for HTML extraction
- header_row: whether the first row should be treated as the header
Outputs
- tables: extracted tables with columns and rows
- detected_count: number of tables detected in the source
- warnings: extraction warnings
- metadata: summary metadata about the extraction
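To make the input and output fields concrete, here is a minimal sketch of HTML table extraction using only the Python standard library. It is not the tool's actual implementation. The names `_TableCollector` and `extract_html_tables` are hypothetical, as are the `column_N` fallback names and the choice to return every table when `table_index` is omitted. The `metadata` field is left out because its contents are not specified here, and nested tables are not handled.

```python
from html.parser import HTMLParser


class _TableCollector(HTMLParser):
    """Hypothetical helper: collects each <table> as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in malformed HTML


def extract_html_tables(html_text, table_index=None, header_row=True):
    """Return a dict shaped like the documented outputs (without metadata)."""
    collector = _TableCollector()
    collector.feed(html_text)
    collector.close()

    warnings = []
    selected = collector.tables
    if table_index is not None:
        # table_index is zero-based, as in the Inputs list.
        if 0 <= table_index < len(collector.tables):
            selected = [collector.tables[table_index]]
        else:
            selected = []
            warnings.append(f"table_index {table_index} is out of range")

    tables = []
    for rows in selected:
        if header_row and rows:
            columns, body = rows[0], rows[1:]
        else:
            # Assumption: synthesize column names when there is no header row.
            width = max((len(r) for r in rows), default=0)
            columns = [f"column_{i}" for i in range(width)]
            body = rows
        tables.append({"columns": columns, "rows": body})

    return {
        "tables": tables,
        "detected_count": len(collector.tables),
        "warnings": warnings,
    }
```

Calling `extract_html_tables("<table><tr><th>Name</th></tr><tr><td>Ada</td></tr></table>")` on this sketch yields one table with columns `["Name"]` and rows `[["Ada"]]`.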
Local development
python -m unittest discover -s tests -p 'test_*.py'
Example invocation
python -u table_extract/__main__.py < input.json
With input.json containing:
{
"html_text": "<table><tr><th>Name</th></tr><tr><td>Ada</td></tr></table>",
"source_type": "html"
}
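When the HTML contains quotes or newlines, hand-writing input.json gets error-prone. A short sketch of building the payload and piping it to the tool from Python is below. The helper names `build_payload` and `run_table_extract` are hypothetical, and the sketch assumes the tool prints a JSON result to stdout, which is not stated above.

```python
import json
import subprocess
import sys


def build_payload(html_text, source_type="html"):
    """Assemble the same fields as the input.json example."""
    return {"html_text": html_text, "source_type": source_type}


def run_table_extract(payload, script="table_extract/__main__.py"):
    """Pipe the payload to the tool on stdin, mirroring `< input.json`.

    Assumption: the tool writes its result as JSON on stdout.
    """
    completed = subprocess.run(
        [sys.executable, "-u", script],
        input=json.dumps(payload),  # json.dumps escapes quotes and newlines
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)
```

Calling `run_table_extract(build_payload(html))` from the repository root would then stand in for the shell invocation.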