Ask HN: What's a good library/command line tool to extract tables from PDFs?
5 points | alfarez | 2 years ago | 4 comments

UglyToad | 2 years ago
There are probably newer AI-powered tools, but Tabula is the main library I know of: https://github.com/tabulapdf/tabula-java

andrewio | 2 years ago
You can use a PDF parser tool to extract data from PDF tables. I'm building parsio.io - we use pre-trained AI-powered parsers to parse PDF tables: https://parsio.io/table-extraction/. Another example is Tabula (free).

phiv | 2 years ago
There is also this option: https://docs.ropensci.org/tabulizer/

phiv | 2 years ago
Have not tried it, but this has been in my bookmarks a while: https://github.com/camelot-dev/excalibur