Clean those PDFs with Tabula
Tabula is a made-by-journalists tool for data in PDF documents
Hello, reporters! This week, we’re featuring guest contributor Teddy Maiorca again:
Have you ever looked at a PDF full of data and wished those tables would just turn into a spreadsheet, so you didn’t have to enter each row by hand?
Look no further than Tabula, a free, open-source tool that lets you extract data locked inside PDF files in a matter of seconds.
Ditch the hassle of trying to copy rows from a table and paste them into Excel. Just upload your PDF to Tabula, click and drag to draw a box around the table you want to pull data from and, voilà, you can export your data as a CSV or Excel file.
If you have an especially large PDF and don’t have the time to draw a box around every table, hit the “automatically detect tables” button. Tabula can even save your selection as a template if you have multiple PDFs with similar layouts.
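If you have a whole folder of PDFs and are comfortable with a little scripting, you can skip the clicking altogether. Here is a rough sketch in Python, assuming you have Java installed and have downloaded the tabula-java command-line jar (the extraction engine behind Tabula). The jar location, the file names and the helper names `build_tabula_command` and `extract_folder` are placeholders made up for this example, so adjust them to your own setup.

```python
import subprocess
from pathlib import Path

# Placeholder: point this at wherever you saved the tabula-java jar.
TABULA_JAR = "tabula.jar"


def build_tabula_command(pdf_path, csv_path, pages="all", guess=True):
    """Build the tabula-java command line that writes a PDF's tables to CSV."""
    cmd = [
        "java", "-jar", TABULA_JAR,
        "--pages", pages,            # "all", or something like "1-3,5"
        "--format", "CSV",           # tabula-java also accepts TSV and JSON
        "--outfile", str(csv_path),
    ]
    if guess:
        # Ask tabula-java to guess which part of each page holds the table.
        cmd.append("--guess")
    cmd.append(str(pdf_path))
    return cmd


def extract_folder(folder):
    """Run tabula-java on every PDF in a folder, one CSV per PDF."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        cmd = build_tabula_command(pdf, pdf.with_suffix(".csv"))
        subprocess.run(cmd, check=True)  # raises if Java or the jar is missing
```

Calling `extract_folder("budget_pdfs")` on a (hypothetical) folder of reports would leave a CSV next to each PDF. Automatic guessing will not be right for every layout, so spot-check the output against the original tables before you report from it.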
It works on Mac, Windows and Linux and was created by journalists, for journalists. Liberate that data, reporters!
Teddy Maiorca is a University of Missouri student who is currently working as a graphics editor for the Columbia Missourian.