pdf - pdfbeads not working for this file but hocr2pdf does, any ideas? - Ask Ubuntu
I can't seem to make pdfbeads produce an OCR'ed PDF from this hOCR file. The hocr2pdf program works with it, but pdfbeads does not. What's special about this file? The hOCR was produced with tesseract:
tesseract -psm 1 -l eng 00000001.tif out hocr
This, for example, creates a PDF, but not a searchable one:
pdfbeads *tif > new.pdf
/usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require': iconv deprecated in future, use string#encode instead.
[deprecation] requiring "rmagick" deprecated. use "rmagick" instead
prepared data processing 00000001.tif
/var/lib/gems/1.9.1/gems/pdfbeads-1.1.1/lib/pdfbeads/pdfpage.rb:445: warning: jbig2 compression complete. pages:1 symbols:401 log2:9
processed 00000001.tif
However, pdfbeads does work with other files for which I have created TIFFs.
Example files (TIF + hOCR) are here:
https://drive.google.com/drive/folders/0b0bk4vnmvvv6bupgrglaczfyx0e?usp=sharing
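One thing that may be worth checking is whether each TIFF has an hOCR file with the same basename next to it. The tesseract command above writes its output under the base name `out` (`out.html` or `out.hocr`, depending on the tesseract version), not `00000001`. The sketch below only reports what is on disk. It assumes pdfbeads pairs an image with its hOCR by basename, which I have not confirmed, and `check_hocr` is a hypothetical helper name, not part of pdfbeads.

```shell
#!/bin/sh
# Hypothetical helper: for each image given, report whether an hOCR file
# with the same basename (.html or .hocr) exists alongside it.
# Assumption: pdfbeads matches hOCR to images by basename (unverified).
check_hocr() {
    for img in "$@"; do
        base="${img%.*}"    # strip the extension, e.g. 00000001.tif -> 00000001
        if [ -e "$base.html" ] || [ -e "$base.hocr" ]; then
            echo "$img: hOCR found"
        else
            echo "$img: no matching hOCR"
        fi
    done
}
```

Running `check_hocr *.tif` in the directory with the images prints one line per file. If a page shows up as having no matching hOCR, re-running tesseract with the image's basename as the output base (`00000001` instead of `out`) would be the obvious thing to try.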