I often need to convert Adobe Portable Document Format (PDF) files to other formats, mostly text. As an author, I sometimes have to convert PDF files to text so that I can copy and paste certain portions for articles and reviews. As a news reporter, I often get large files in PDF format, and the need to convert these to text is obvious. Those PDF files sometimes contain graphics that I require for news reporting. I have been using the old Print Screen and save system, which requires a graphics program to cut out the graphic. If I could extract the graphic more easily, it would make my job that much easier.
Supreme Court decisions are a good example. Many are in PDF. It is helpful to convert them to text so that I can copy and paste portions of the decisions into my articles. Other times I need to convert the whole file with graphics, as in the case of military manuals or other PDF files I might download from Archive.org or Project Gutenberg.
I have searched for and used several online options. They do all right, but tend to produce some gobbledygook in the translation. They do not seem to work well with graphics.
During my search I ran across PDFZilla. Having experience with Mozilla and FileZilla, I was hoping this was free. It is not, however. As a disclaimer, I should note that I downloaded the trial version, which only converts 50% of the file and has other limitations. I am writing this review based on that version and am told I can get a license for writing this review.
I will give you the bad first. You can select where the output file goes, but you cannot create a new folder while saving a file. You should plan to set up your folders before converting PDF files, or separate the files into folders later if you need to. Also, the windows cannot be resized for easier viewing.
You can see the various conversion options in Figure 1. I did not use the PDF to Excel Converter because I do not have files that I need converted into that format. At least I have not run across any yet.
Two options I like are the PDF Merger and PDF Cutter. The PDF Merger is a good option, although the trial version adds a watermark. I had two particular files to test: two SEABEE manuals that I had downloaded, which came as Volume I and Volume II. Though the word “Merger” is misspelled in the window for this option, PDFZilla merged the two files seamlessly. I can find this option useful at times. The one problem with this option is that it does not give you the option to export the file to a folder you select. You can select the output folder by clicking on the link at the bottom.
The other option, PDF Cutter, allows you to cut out pages. I had some problems testing this one. At first I chose a file over 700 pages long with graphics. This file sent the PDF Cutter into a tizzy. It uses drag-and-drop functionality; that is, you drag the PDF to the cutter window. The file did not show up, and I eventually had to restart the computer to clear the memory. The second time I tried a plain Supreme Court text PDF. It displayed (50% of it) and I edited out a few pages. The edited output file appeared in the folder that I selected it from. You can resize the file, but not the window. The window should be about the same size as the main screen so that you can see the file more easily. Again, the trial version adds a watermark to the cut file.
Transferring from PDF to Word was difficult for me, but might not be for you if you use Microsoft Word. I use OpenOffice.org, a Microsoft Word clone. It has to import the Microsoft Word DOC file into ODT format. It does well, but the formatting is odd. In most cases I would not use that option anyway, preferring PDF to text.
The PDF to Text converter works well, saving in RTF or TXT format. While I am on this feature, I will go over some of the options on the screen.
On the upper-right portion of the window you will see a drop-down menu for output formats. I selected PDF to txt and this is what shows. You can use the drop-down menu to select other formats in case you mistakenly push the wrong option from the main screen. Just above that is the Options area, where you can select the first page and last page to convert and whether or not you desire page breaks. Below that you can select the folder you desire to output to and, of course, the button to start the conversion is at the bottom.
You can drag-and-drop the PDF to the screen. Or you can choose the file by selecting the green cross on the File Menu. Next to the green cross is a folder with a green cross. This option allows you to add an entire folder of PDF files for bulk conversion. This is very useful to convert bulk PDF files to a specific format. The red X allows you to delete some of the PDF files if you do not desire to convert them. Thus you could select a folder and delete those you do not want to convert.
The PDF to text option works well, especially with straight text PDF files such as articles and Supreme Court decisions. There will be some formatting required. For example, if you convert a file with a large table of contents, you might have to remove the table of contents depending on your need. But it converts the text well. I have used the PDF to RTF format and it converts everything including the graphics, but the graphics appear upside down. You can rotate the graphics, depending on the program you use for your final print. You can also save the graphic and adjust it for your need in a graphics program.
The PDF to HTML option seems to convert the text flawlessly, even to recreating the format of the PDF. In some cases this makes a long HTML file with narrow columns. The trial version does not seem to convert any graphics, but the registered version might. It also automatically adds links to other portions of the HTML file. For example, the file I converted has an appendix, and the HTML added several links to that appendix and various graphics.
The other option I found useful was the Image to PDF Converter. I just happened to download several graphics for a technical document. PDFZilla converted them into a PDF file with watermarks. I then tried to convert that document to text, but only got the watermarks. The Image to PDF converter, as you see in the figure, allows you to add your own document title, an author, subject and keywords. You can also select an output path for the file and adjust the quality of the file, which will in turn make it smaller or larger.
Overall the trial version seems to work well. I would give it 3.5 out of five stars. I do not know if the registered version, allowing all functions, will work better. Some of the faults I cite may be corrected in future upgrades.
You could, of course, register for Acrobat Pro. This requires a subscription of $14.99 a month (as of this writing) and a one-year commitment. That is $179.88 per year. This is a good option for a large company having to edit or convert a large number of files. In addition to that, you convert online in “the cloud,” where anyone can have access to your files and the national government normally does.
PDFZilla is a program on your computer which converts the file on your computer. Although the NSA still has access to the file, it makes it difficult for others. If you are converting files for research for, say, a new invention idea, you probably do not want it floating on a “cloud” where any devil can intercept your work and develop your idea first. It is still a little more difficult to hack into your computer. The other advantage is that PDFZilla is $49.95 (as of this writing).
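For readers who want to check the arithmetic behind that price comparison, here is a minimal sketch in Python. It uses the two prices quoted above. It assumes that the $49.95 for PDFZilla is a one-time purchase, which I have not confirmed, and the function names are my own.

```python
from decimal import Decimal

# Prices quoted in this review (as of this writing).
ACROBAT_MONTHLY = Decimal("14.99")   # Acrobat Pro subscription, per month
PDFZILLA_PRICE = Decimal("49.95")    # ASSUMPTION: treated as a one-time fee


def subscription_cost(monthly, months):
    """Total paid for a subscription after the given number of months."""
    return monthly * months


def break_even_months(one_time, monthly):
    """Smallest whole number of months at which the cumulative
    subscription cost reaches or exceeds the one-time price."""
    months = 0
    while subscription_cost(monthly, months) < one_time:
        months += 1
    return months


yearly = subscription_cost(ACROBAT_MONTHLY, 12)
print(yearly)                                              # 179.88
print(yearly - PDFZILLA_PRICE)                             # 129.93
print(break_even_months(PDFZILLA_PRICE, ACROBAT_MONTHLY))  # 4
```

Under that assumption, the subscription passes the PDFZilla price in the fourth month, and the one-year commitment costs $129.93 more.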
This program seems to be relatively new and may not convert everything absolutely flawlessly. What it does convert for me makes the price worth it. They could probably even squeeze a few more dollars from poor reporters such as me.