
I'm looking to extract the text from a PDF. The text inside the PDF is arranged in columns, with no formatting at all and some spaces between each column.
Is there a way to read the PDF and get the text as XML, TXT, or something else? I want a way to make sure I get the data and columns without errors or shifted values.

2007-03-06 15:28:19 · 2 answers · asked by Cat 9 6 in Computers & Internet Programming & Design

2 answers

A PDF is designed mainly for viewing, so you can't pull the text straight out of it the way you would from a plain text file.

To extract text you will have to download a copy of the PDF file, then convert it to text with a PDF converter. Save the new file with a new name.

Here is a free online converter. I tried it: it works great and is easy to use, but I don't know how long it will be available.

Free PDF Online Converter – may not be available forever – today is 2/20/07
http://media-convert.com/convert/index.php?pg=doc&sid=bx0driteh0n6mn9mykk4ghpuwo

2007-03-06 15:36:39 · answer #1 · answered by TheHumbleOne 7 · 0 0
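
Once a converter has produced plain text, the columns still have to be split without shifting values when a cell is empty. Below is a minimal sketch of one way to do that, assuming the converter preserves the horizontal layout with spaces (for example, `pdftotext -layout` from Poppler or Xpdf does this). It treats any run of character positions that is blank in every line as a column gutter. The function names `find_gutters` and `split_columns`, the `min_gap` parameter, and the sample data are all made up for illustration and do not come from the thread.

```python
def find_gutters(lines, min_gap=2):
    """Return (start, end) spans of positions that are blank in every line."""
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    blank = [all(row[i] == " " for row in padded) for i in range(width)]

    gutters, start = [], None
    # The trailing False closes any run that reaches the end of the line.
    for i, is_blank in enumerate(blank + [False]):
        if is_blank and start is None:
            start = i
        elif not is_blank and start is not None:
            # Ignore narrow gaps and leading indentation (start == 0).
            if i - start >= min_gap and start > 0:
                gutters.append((start, i))
            start = None
    return gutters


def split_columns(text, min_gap=2):
    """Split layout-preserved text into rows of cells, keeping empty cells."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    gutters = find_gutters(lines, min_gap)
    # Columns are the stretches between consecutive gutters.
    edges = [0] + [pos for span in gutters for pos in span] + [None]
    spans = list(zip(edges[0::2], edges[1::2]))
    return [[line[a:b].strip() for a, b in spans] for line in lines]


# Hypothetical sample standing in for converter output: a left-aligned
# name column and two right-aligned numeric columns, with missing cells.
rows = [("Name", "Qty", "Price"), ("Widget", "10", "2.50"),
        ("Long name", "", "9.99"), ("Bolt", "5", "")]
sample = "\n".join(f"{name:<12}{qty:>3}{price:>8}" for name, qty, price in rows)

table = split_columns(sample)
for row in table:
    print(row)
```

Because the split is by character position rather than by whitespace, the empty quantity on the "Long name" row stays an empty cell instead of pulling the price one column to the left. This only works if the converter keeps the layout; text that is reflowed or uses proportional spacing would need a tool that reports word coordinates instead.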

The only thing I can think of is using Adobe Acrobat Professional. It's used to create PDFs and extract content from them, and it will keep your formatting.

The Syko Ward

2007-03-06 23:37:31 · answer #2 · answered by The Syko Ward 5 · 0 0
