by ukoenig » Sun May 06, 2012 9:27 am
Marco,
I couldn't find a freeware tool.
Here is another product that works from the command line:
http://www.verypdf.com
A test:
Description:
Convert text based PDF files to plain text files.
Convert scanned PDF files to plain text files by OCR technology.
Usage: pdf2txtocr.exe [options] <PDF-file> <Text-file>
-firstpage <int> : first PDF page to convert
-lastpage <int> : last PDF page to convert
-res <int> : set resolution, the unit is DPI (default is 300 dpi)
-ownerpwd <string> : set owner password for encrypted PDF file
-userpwd <string> : set user password for encrypted PDF file
-layout : maintain original physical layout
-noc : don't insert page breaks 0x0C between pages in text file
-bitcount <int> : set color depth when render PDF page to image data,
it can be set 1, 8, 24, default is 8bit
-ocr : enable OCR function for scanned PDF file
-lang <string> : choose the language for OCR engine
-text <string> : add additional text at end of each text page,
this parameter supports the following variables:
%PageNumber%: current page number
%PageCount% : total page count of PDF file
-$ <string> : input your License Key
Examples:
pdf2txtocr.exe C:\in.pdf C:\out.txt
pdf2txtocr.exe -firstpage 1 -lastpage 1 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -res 300 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ownerpwd 123 -userpwd 456 C:\in.pdf C:\out.txt
pdf2txtocr.exe -layout C:\in.pdf C:\out.txt
pdf2txtocr.exe -noc C:\in.pdf C:\out.txt
pdf2txtocr.exe C:\in.tif C:\out.txt
pdf2txtocr.exe C:\in.jpg C:\out.txt
pdf2txtocr.exe C:\in.bmp C:\out.txt
pdf2txtocr.exe C:\in.png C:\out.txt
pdf2txtocr.exe -ocr -lang eng C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 1 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 8 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -bitcount 24 C:\in.pdf C:\out.txt
pdf2txtocr.exe -ocr -lang deu C:\in.pdf C:\out.txt
pdf2txtocr.exe -lang deu C:\in.tif C:\out.txt
pdf2txtocr.exe -text "PageText %PageNumber% of %PageCount%" C:\in.pdf C:\out.txt
Following command line will OCR all PDF files in D:\temp\ folder to text files:
for %F in (D:\temp\*.pdf) do pdf2txtocr.exe -ocr -lang deu "%F" "%~dpnF.txt"
Following command line will OCR all PDF files in D:\temp\ folder and subdirectories to text files:
for /r D:\temp %F in (*.pdf) do pdf2txtocr.exe -ocr "%F" "%~dpnF.txt"
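If you prefer to drive the two loops above from a script instead of the cmd prompt, here is a rough Python sketch (not from VeryPDF). It only uses the -ocr and -lang options listed in the help text above. The function names build_commands and run_all are made up for illustration, and it assumes pdf2txtocr.exe can be found on the PATH. Like %~dpnF.txt, the text file is written next to each PDF with the same base name.

```python
import subprocess
from pathlib import Path

# Assumption: the executable is reachable via PATH; otherwise put the full path here.
EXE = "pdf2txtocr.exe"


def build_commands(folder, recursive=False, lang=None):
    """Return one argument list per PDF in folder (hypothetical helper).

    recursive=False mirrors the plain for loop,
    recursive=True mirrors the for /r loop.
    """
    pattern = "**/*.pdf" if recursive else "*.pdf"
    commands = []
    for pdf in sorted(Path(folder).glob(pattern)):
        args = [EXE, "-ocr"]
        if lang:
            # -lang is passed through unchanged, e.g. "deu" or "eng"
            args += ["-lang", lang]
        # Output goes next to the input: same folder, same name, .txt extension
        args += [str(pdf), str(pdf.with_suffix(".txt"))]
        commands.append(args)
    return commands


def run_all(commands):
    """Run the commands one after another; stop at the first failure."""
    for args in commands:
        subprocess.run(args, check=True)
```

A call like run_all(build_commands(r"D:\temp", recursive=True, lang="deu")) would then do the same job as the for /r line. Building the argument lists first makes it easy to print and check them before anything is converted.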
Following command line will OCR all PDF files from D:\temp\ folder and output text files to C:\test folder:
for %F in (D:\temp\*.pdf) do pdf2txtocr.exe -ocr "%F" "C:\test\%~nF.txt"
Best Regards
Uwe
Since 1995 (the first release of FW 1.9)
I have worked with FW.
If you have any questions about special functions, maybe I can help.