OCR for Neobook using Tesseract Engine

Questions about NeoBook PlugIns

Moderator: Neosoft Support

Postby Wrangler » Fri May 04, 2012 12:11 pm

Here is the result from a JPG screen grab of a message in my email program:

Hello,
You are receiving this email because you are watching the topic, "OCR for Neobook using Iesseract Hmgine" at NeoSot't Support Forum. This topic has received a reply since your last visit. You can use the following link
to view the replies made, no more notifications will be sent until you visit the topic.
http://wwmneosoftwai-e.com/fozum/viewt"p=LLLJE4E;#L;ZJ£-£54
If you no longer wish to watch this topic you can either click the “Stop watching this topic link" found at the bottom of the topic above, or by clicking the following link:
http: //wwmneosoftware . com/forum/viewtopic.php?t=19347iu.nwatch=topic

It doesn't seem to handle URLs very well. Can it be made to "learn"?

Interestingly, it handles the URLs fine with Recognize HTML:

<div class='ocr_page' id='page_1' title='image ""; bbox 0 0 6152 872'>
<div class='ocr_carea' id='block_1_1' title="bbox 19 29 176 71">
<p class='ocr_par'>
<span class='ocr_line' id='line_1_1' title="bbox 19 29 176 71"><span class='ocr_word' id='word_1_1' title="bbox 19 29 176 71"><span class='ocrx_word' id='xword_1_1' title="x_wconf -2"><strong>Hello,</strong></span></span></span>
</p>
</div>
<div class='ocr_carea' id='block_1_2' title="bbox 19 150 6098 252">
<p class='ocr_par'>
<span class='ocr_line' id='line_1_2' title="bbox 19 150 6098 192"><span class='ocr_word' id='word_1_2' title="bbox 19 150 103 184"><span class='ocrx_word' id='xword_1_2' title="x_wconf -2"><strong>You</strong></span></span> <span class='ocr_word' id='word_1_3' title="bbox 133 158 212 184"><span class='ocrx_word' id='xword_1_3' title="x_wconf -1"><strong>are</strong></span></span> <span class='ocr_word' id='word_1_4' title="bbox 245 150 496 192"><span class='ocrx_word' id='xword_1_4' title="x_wconf -2"><strong>receiving</strong></span></span> <span class='ocr_word' id='word_1_5' title="bbox 524 150 632 184"><span class='ocrx_word' id='xword_1_5' title="x_wconf -3"><strong>this</strong></span></span> <span class='ocr_word' id='word_1_6' title="bbox 663 150 800 184"><span class='ocrx_word' id='xword_1_6' title="x_wconf -2"><strong>email</strong></span></span> <span class='ocr_word' id='word_1_7' title="bbox 830 150 1024 184"><span class='ocrx_word' id='xword_1_7' title="x_wconf -2"><strong>because</strong></span></span> <span class='ocr_word' id='word_1_8' title="bbox 1057 158 1140 192"><span class='ocrx_word' id='xword_1_8' title="x_wconf -2"><strong>you</strong></span></span> <span class='ocr_word' id='word_1_9' title="bbox 1169 158 1248 184"><span class='ocrx_word' id='xword_1_9' title="x_wconf -1"><strong>are</strong></span></span> <span class='ocr_word' id='word_1_10' title="bbox 1278 150 1504 192"><span class='ocrx_word' id='xword_1_10' title="x_wconf -2"><strong>watching</strong></span></span> <span class='ocr_word' id='word_1_11' title="bbox 1533 150 1614 184"><span class='ocrx_word' id='xword_1_11' title="x_wconf -2"><strong>the</strong></span></span> <span class='ocr_word' id='word_1_12' title="bbox 1645 150 1800 192"><span class='ocrx_word' id='xword_1_12' title="x_wconf -3"><strong>topic,</strong></span></span> <span class='ocr_word' id='word_1_13' title="bbox 1844 150 1952 184"><span class='ocrx_word' id='xword_1_13' title="x_wconf 
-3"><strong>"OCR</strong></span></span> <span class='ocr_word' id='word_1_14' title="bbox 1982 150 2064 184"><span class='ocrx_word' id='xword_1_14' title="x_wconf -2"><strong>for</strong></span></span> <span class='ocr_word' id='word_1_15' title="bbox 2091 150 2288 184"><span class='ocrx_word' id='xword_1_15' title="x_wconf -3"><strong>Neobook</strong></span></span> <span class='ocr_word' id='word_1_16' title="bbox 2315 150 2456 192"><span class='ocrx_word' id='xword_1_16' title="x_wconf -2"><strong>using</strong></span></span> <span class='ocr_word' id='word_1_17' title="bbox 2484 150 2736 184"><span class='ocrx_word' id='xword_1_17' title="x_wconf -2"><strong>Tesseract</strong></span></span> <span class='ocr_word' id='word_1_18' title="bbox 2764 150 2956 192"><span class='ocrx_word' id='xword_1_18' title="x_wconf -4"><strong>Engine“</strong></span></span> <span class='ocr_word' id='word_1_19' title="bbox 2990 155 3044 184"><span class='ocrx_word' id='xword_1_19' title="x_wconf -1"><strong>at</strong></span></span> <span class='ocr_word' id='word_1_20' title="bbox 3072 150 3268 184"><span class='ocrx_word' id='xword_1_20' title="x_wconf -3"><strong>NeoSoft</strong></span></span> <span class='ocr_word' id='word_1_21' title="bbox 3298 150 3492 192"><span class='ocrx_word' id='xword_1_21' title="x_wconf -2"><strong>Support</strong></span></span> <span class='ocr_word' id='word_1_22' title="bbox 3523 150 3680 184"><span class='ocrx_word' id='xword_1_22' title="x_wconf -2"><strong>Forum.</strong></span></span> <span class='ocr_word' id='word_1_23' title="bbox 3716 150 3826 184"><span class='ocrx_word' id='xword_1_23' title="x_wconf -3"><strong>This</strong></span></span> <span class='ocr_word' id='word_1_24' title="bbox 3858 150 3996 192"><span class='ocrx_word' id='xword_1_24' title="x_wconf -1"><strong>topic</strong></span></span> <span class='ocr_word' id='word_1_25' title="bbox 4024 150 4106 184"><span class='ocrx_word' id='xword_1_25' title="x_wconf 
-1"><strong>has</strong></span></span> <span class='ocr_word' id='word_1_26' title="bbox 4140 150 4362 184"><span class='ocrx_word' id='xword_1_26' title="x_wconf -3"><strong>received</strong></span></span> <span class='ocr_word' id='word_1_27' title="bbox 4392 158 4416 184"><span class='ocrx_word' id='xword_1_27' title="x_wconf -1"><strong>a</strong></span></span> <span class='ocr_word' id='word_1_28' title="bbox 4448 150 4586 192"><span class='ocrx_word' id='xword_1_28' title="x_wconf -3"><strong>reply</strong></span></span> <span class='ocr_word' id='word_1_29' title="bbox 4616 150 4751 184"><span class='ocrx_word' id='xword_1_29' title="x_wconf -1"><strong>since</strong></span></span> <span class='ocr_word' id='word_1_30' title="bbox 4783 158 4893 192"><span class='ocrx_word' id='xword_1_30' title="x_wconf -3"><strong>your</strong></span></span> <span class='ocr_word' id='word_1_31' title="bbox 4924 150 5032 184"><span class='ocrx_word' id='xword_1_31' title="x_wconf -1"><strong>last</strong></span></span> <span class='ocr_word' id='word_1_32' title="bbox 5060 150 5220 184"><span class='ocrx_word' id='xword_1_32' title="x_wconf -3"><strong>visit.</strong></span></span> <span class='ocr_word' id='word_1_33' title="bbox 5257 150 5342 184"><span class='ocrx_word' id='xword_1_33' title="x_wconf -2"><strong>You</strong></span></span> <span class='ocr_word' id='word_1_34' title="bbox 5372 158 5454 184"><span class='ocrx_word' id='xword_1_34' title="x_wconf -1"><strong>can</strong></span></span> <span class='ocr_word' id='word_1_35' title="bbox 5481 158 5563 184"><span class='ocrx_word' id='xword_1_35' title="x_wconf -1"><strong>use</strong></span></span> <span class='ocr_word' id='word_1_36' title="bbox 5594 150 5676 184"><span class='ocrx_word' id='xword_1_36' title="x_wconf -1"><strong>the</strong></span></span> <span class='ocr_word' id='word_1_37' title="bbox 5708 150 5958 192"><span class='ocrx_word' id='xword_1_37' title="x_wconf 
-3"><strong>following</strong></span></span> <span class='ocr_word' id='word_1_38' title="bbox 5988 150 6098 184"><span class='ocrx_word' id='xword_1_38' title="x_wconf -3"><strong>link</strong></span></span></span>
<span class='ocr_line' id='line_1_3' title="bbox 20 210 2448 252"><span class='ocr_word' id='word_1_39' title="bbox 20 215 74 244"><span class='ocrx_word' id='xword_1_39' title="x_wconf -1"><strong>to</strong></span></span> <span class='ocr_word' id='word_1_40' title="bbox 101 210 216 244"><span class='ocrx_word' id='xword_1_40' title="x_wconf -3"><strong>view</strong></span></span> <span class='ocr_word' id='word_1_41' title="bbox 244 210 325 244"><span class='ocrx_word' id='xword_1_41' title="x_wconf -1"><strong>the</strong></span></span> <span class='ocr_word' id='word_1_42' title="bbox 357 210 548 252"><span class='ocrx_word' id='xword_1_42' title="x_wconf -1"><strong>replies</strong></span></span> <span class='ocr_word' id='word_1_43' title="bbox 577 210 708 252"><span class='ocrx_word' id='xword_1_43' title="x_wconf -2"><strong>made,</strong></span></span> <span class='ocr_word' id='word_1_44' title="bbox 747 218 802 244"><span class='ocrx_word' id='xword_1_44' title="x_wconf -1"><strong>no</strong></span></span> <span class='ocr_word' id='word_1_45' title="bbox 830 218 941 244"><span class='ocrx_word' id='xword_1_45' title="x_wconf -3"><strong>more</strong></span></span> <span class='ocr_word' id='word_1_46' title="bbox 971 210 1332 244"><span class='ocrx_word' id='xword_1_46' title="x_wconf -2"><strong>notifications</strong></span></span> <span class='ocr_word' id='word_1_47' title="bbox 1362 210 1472 244"><span class='ocrx_word' id='xword_1_47' title="x_wconf -3"><strong>will</strong></span></span> <span class='ocr_word' id='word_1_48' title="bbox 1503 210 1556 244"><span class='ocrx_word' id='xword_1_48' title="x_wconf -2"><strong>be</strong></span></span> <span class='ocr_word' id='word_1_49' title="bbox 1590 215 1699 244"><span class='ocrx_word' id='xword_1_49' title="x_wconf -1"><strong>sent</strong></span></span> <span class='ocr_word' id='word_1_50' title="bbox 1727 210 1865 244"><span class='ocrx_word' id='xword_1_50' title="x_wconf 
-1"><strong>until</strong></span></span> <span class='ocr_word' id='word_1_51' title="bbox 1896 218 1980 252"><span class='ocrx_word' id='xword_1_51' title="x_wconf -2"><strong>you</strong></span></span> <span class='ocr_word' id='word_1_52' title="bbox 2006 210 2148 244"><span class='ocrx_word' id='xword_1_52' title="x_wconf -3"><strong>visit</strong></span></span> <span class='ocr_word' id='word_1_53' title="bbox 2176 210 2258 244"><span class='ocrx_word' id='xword_1_53' title="x_wconf -1"><strong>the</strong></span></span> <span class='ocr_word' id='word_1_54' title="bbox 2289 210 2448 252"><span class='ocrx_word' id='xword_1_54' title="x_wconf -2"><strong>topic.</strong></span></span></span>
</p>
</div>
<div class='ocr_carea' id='block_1_3' title="bbox 19 315 6130 497">
<p class='ocr_par'>
<span class='ocr_line' id='line_1_4' title="bbox 19 315 6130 380"><span class='ocr_word' id='word_1_55' title="bbox 19 315 6130 380"><span class='ocrx_word' id='xword_1_55' title="x_wconf -7"><strong>http://wwmneosoftware.com/forum/viewt</strong></span></span></span>
</p>
<p class='ocr_par'>
<span class='ocr_line' id='line_1_5' title="bbox 22 455 4856 497"><span class='ocr_word' id='word_1_56' title="bbox 22 455 74 489"><span class='ocrx_word' id='xword_1_56' title="x_wconf -1"><strong>If</strong></span></span> <span class='ocr_word' id='word_1_57' title="bbox 104 463 187 497"><span class='ocrx_word' id='xword_1_57' title="x_wconf -2"><strong>you</strong></span></span> <span class='ocr_word' id='word_1_58' title="bbox 215 463 270 490"><span class='ocrx_word' id='xword_1_58' title="x_wconf -1"><strong>no</strong></span></span> <span class='ocr_word' id='word_1_59' title="bbox 302 455 467 497"><span class='ocrx_word' id='xword_1_59' title="x_wconf -2"><strong>longer</strong></span></span> <span class='ocr_word' id='word_1_60' title="bbox 494 455 607 489"><span class='ocrx_word' id='xword_1_60' title="x_wconf -2"><strong>wish</strong></span></span> <span class='ocr_word' id='word_1_61' title="bbox 637 460 690 489"><span class='ocrx_word' id='xword_1_61' title="x_wconf -1"><strong>to</strong></span></span> <span class='ocr_word' id='word_1_62' title="bbox 718 455 860 489"><span class='ocrx_word' id='xword_1_62' title="x_wconf -3"><strong>watch</strong></span></span> <span class='ocr_word' id='word_1_63' title="bbox 888 455 996 489"><span class='ocrx_word' id='xword_1_63' title="x_wconf -2"><strong>this</strong></span></span> <span class='ocr_word' id='word_1_64' title="bbox 1028 456 1167 497"><span class='ocrx_word' id='xword_1_64' title="x_wconf -2"><strong>topic</strong></span></span> <span class='ocr_word' id='word_1_65' title="bbox 1197 463 1280 497"><span class='ocrx_word' id='xword_1_65' title="x_wconf -2"><strong>you</strong></span></span> <span class='ocr_word' id='word_1_66' title="bbox 1310 463 1392 490"><span class='ocrx_word' id='xword_1_66' title="x_wconf -3"><strong>can</strong></span></span> <span class='ocr_word' id='word_1_67' title="bbox 1420 455 1588 490"><span class='ocrx_word' id='xword_1_67' title="x_wconf 
-2"><strong>either</strong></span></span> <span class='ocr_word' id='word_1_68' title="bbox 1618 455 1756 489"><span class='ocrx_word' id='xword_1_68' title="x_wconf -4"><strong>click</strong></span></span> <span class='ocr_word' id='word_1_69' title="bbox 1784 455 1865 489"><span class='ocrx_word' id='xword_1_69' title="x_wconf -2"><strong>the</strong></span></span> <span class='ocr_word' id='word_1_70' title="bbox 1900 455 2036 497"><span class='ocrx_word' id='xword_1_70' title="x_wconf -3"><strong>‘Stop</strong></span></span> <span class='ocr_word' id='word_1_71' title="bbox 2062 455 2288 497"><span class='ocrx_word' id='xword_1_71' title="x_wconf -3"><strong>watching</strong></span></span> <span class='ocr_word' id='word_1_72' title="bbox 2318 456 2424 489"><span class='ocrx_word' id='xword_1_72' title="x_wconf -2"><strong>this</strong></span></span> <span class='ocr_word' id='word_1_73' title="bbox 2456 456 2596 497"><span class='ocrx_word' id='xword_1_73' title="x_wconf -2"><strong>topic</strong></span></span> <span class='ocr_word' id='word_1_74' title="bbox 2627 455 2760 489"><span class='ocrx_word' id='xword_1_74' title="x_wconf -4"><strong>link“</strong></span></span> <span class='ocr_word' id='word_1_75' title="bbox 2795 455 2932 489"><span class='ocrx_word' id='xword_1_75' title="x_wconf -2"><strong>found</strong></span></span> <span class='ocr_word' id='word_1_76' title="bbox 2963 460 3015 489"><span class='ocrx_word' id='xword_1_76' title="x_wconf -2"><strong>at</strong></span></span> <span class='ocr_word' id='word_1_77' title="bbox 3045 455 3126 489"><span class='ocrx_word' id='xword_1_77' title="x_wconf -1"><strong>the</strong></span></span> <span class='ocr_word' id='word_1_78' title="bbox 3156 456 3327 489"><span class='ocrx_word' id='xword_1_78' title="x_wconf -2"><strong>bottom</strong></span></span> <span class='ocr_word' id='word_1_79' title="bbox 3354 455 3408 489"><span class='ocrx_word' id='xword_1_79' title="x_wconf 
-1"><strong>of</strong></span></span> <span class='ocr_word' id='word_1_80' title="bbox 3438 456 3518 489"><span class='ocrx_word' id='xword_1_80' title="x_wconf -2"><strong>the</strong></span></span> <span class='ocr_word' id='word_1_81' title="bbox 3551 456 3688 497"><span class='ocrx_word' id='xword_1_81' title="x_wconf -2"><strong>topic</strong></span></span> <span class='ocr_word' id='word_1_82' title="bbox 3719 456 3874 497"><span class='ocrx_word' id='xword_1_82' title="x_wconf -3"><strong>above,</strong></span></span> <span class='ocr_word' id='word_1_83' title="bbox 3914 463 3968 489"><span class='ocrx_word' id='xword_1_83' title="x_wconf -2"><strong>or</strong></span></span> <span class='ocr_word' id='word_1_84' title="bbox 3996 456 4053 497"><span class='ocrx_word' id='xword_1_84' title="x_wconf -1"><strong>by</strong></span></span> <span class='ocr_word' id='word_1_85' title="bbox 4084 455 4305 497"><span class='ocrx_word' id='xword_1_85' title="x_wconf -3"><strong>clicking</strong></span></span> <span class='ocr_word' id='word_1_86' title="bbox 4335 456 4414 489"><span class='ocrx_word' id='xword_1_86' title="x_wconf -1"><strong>the</strong></span></span> <span class='ocr_word' id='word_1_87' title="bbox 4448 455 4697 497"><span class='ocrx_word' id='xword_1_87' title="x_wconf -3"><strong>following</strong></span></span> <span class='ocr_word' id='word_1_88' title="bbox 4728 455 4856 490"><span class='ocrx_word' id='xword_1_88' title="x_wconf -3"><strong>link:</strong></span></span></span>
</p>
</div>
<div class='ocr_carea' id='block_1_4' title="bbox 508 2384 540 2531">
<p class='ocr_par'>
<span class='ocr_line' id='line_1_6' title="bbox 516 2384 534 2396"><span class='ocr_word' id='word_1_89' title="bbox 516 2384 534 2396"><span class='ocrx_word' id='xword_1_89' title="x_wconf -5"><strong>Y</strong></span></span></span>
</p>
<p class='ocr_par'>
<span class='ocr_line' id='line_1_7' title="bbox 510 2410 538 2425"><span class='ocr_word' id='word_1_90' title="bbox 510 2410 538 2425"><span class='ocrx_word' id='xword_1_90' title="x_wconf -12"><strong>44</strong></span></span></span>
<span class='ocr_line' id='line_1_8' title="bbox 520 2440 540 2454"><span class='ocr_word' id='word_1_91' title="bbox 520 2440 540 2454"><span class='ocrx_word' id='xword_1_91' title="x_wconf -5"><em>I‘</em></span></span></span>
<span class='ocr_line' id='line_1_9' title="bbox 509 2465 536 2481"><span class='ocr_word' id='word_1_92' title="bbox 509 2465 536 2481"><span class='ocrx_word' id='xword_1_92' title="x_wconf -9"><strong>4-</strong></span></span></span>
<span class='ocr_line' id='line_1_10' title="bbox 508 2493 535 2511"><span class='ocr_word' id='word_1_93' title="bbox 508 2493 535 2511"><span class='ocrx_word' id='xword_1_93' title="x_wconf -8"><strong>4»</strong></span></span></span>
</p>
</div>
<div class='ocr_carea' id='block_1_5' title="bbox 19 576 1924 618">
<p class='ocr_par'>
<span class='ocr_line' id='line_1_11' title="bbox 19 576 1924 618"><span class='ocr_word' id='word_1_94' title="bbox 19 576 1924 618"><span class='ocrx_word' id='xword_1_94' title="x_wconf -5"><strong>http://www.neosoftware.com/forum/viewtopic.php?t=19347&unwatch=topic</strong></span></span></span>
</p>
</div>
</div>
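
Since the URLs come through more cleanly in the HTML result, one option is to pull the words back out of that markup. The following is only a minimal sketch in Python (not NeoBook script), assuming the markup always follows the nesting shown above: an `ocr_word` span with a `bbox` title wrapping an `ocrx_word` span with an `x_wconf` title. The names `HocrWords` and `hocr_words` are made up for this illustration and are not part of the plugin.

```python
from html.parser import HTMLParser


class HocrWords(HTMLParser):
    """Collect (text, bbox, x_wconf) for every ocr_word span (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.words = []     # finished words, in document order
        self._bbox = None   # bbox of the ocr_word span that is currently open
        self._conf = None   # x_wconf value of the nested ocrx_word span
        self._text = []     # text chunks seen inside the current word
        self._depth = 0     # span nesting depth inside the current word

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        attr = dict(attrs)
        cls = attr.get("class", "")
        title = attr.get("title", "")
        if cls == "ocr_word":
            # title looks like "bbox 19 29 176 71"
            self._bbox = tuple(int(n) for n in title.split()[1:5])
            self._conf, self._text, self._depth = None, [], 1
        elif self._bbox is not None:
            self._depth += 1
            if cls == "ocrx_word" and title.startswith("x_wconf"):
                # split() also copes with a title wrapped across two lines
                self._conf = int(title.split()[1])

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "span" or self._bbox is None:
            return
        self._depth -= 1
        if self._depth == 0:  # the ocr_word span itself just closed
            self.words.append(("".join(self._text).strip(), self._bbox, self._conf))
            self._bbox = None


def hocr_words(hocr):
    """Return a list of (text, bbox, x_wconf) tuples from an hOCR string."""
    parser = HocrWords()
    parser.feed(hocr)
    parser.close()
    return parser.words


# First word of the output quoted above.
sample = (
    "<span class='ocr_line' id='line_1_1' title=\"bbox 19 29 176 71\">"
    "<span class='ocr_word' id='word_1_1' title=\"bbox 19 29 176 71\">"
    "<span class='ocrx_word' id='xword_1_1' title=\"x_wconf -2\">"
    "<strong>Hello,</strong></span></span></span>"
)
for text, bbox, conf in hocr_words(sample):
    print(text, bbox, conf)
```

Joining the `text` values with spaces gives a plain-text version built from the HTML result. The `x_wconf` numbers are passed through exactly as the engine reports them; the sketch does not try to interpret them.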
Wrangler
--------------
"You never know about a woman. Whether she'll laugh, cry or go for a gun." - Louis L'Amour

Windows 7 Ultimate SP1 64bit
16GB Ram
Asus GTX 950 OC Strix
Software made with NeoBook
http://highdesertsoftware.com
User avatar
Wrangler
 
Posts: 1507
Joined: Thu Mar 31, 2005 11:40 pm
Location: USA

Postby TechMediaPlugins2010 » Fri May 04, 2012 12:15 pm

Handle URL?

Try copying the result text and pasting it into an HTML editor, like Dreamweaver. It works well for me. But this is just a bonus.

*******

PS: I understand what you said now.
Last edited by TechMediaPlugins2010 on Fri May 04, 2012 12:19 pm, edited 1 time in total.
Advanced Plugins for NeoBook
www.techmedia-plugins.com.br
TechMediaPlugins2010
 
Posts: 298
Joined: Wed Jun 23, 2010 1:45 pm
Location: Rio de Janeiro - Brazil

Postby Wrangler » Fri May 04, 2012 12:18 pm

The HTML is OK. The plain-text URL is not right:

http: //wwmneosoftware . com/forum/viewtopic.php?t=19347iu.nwatch=topic

Postby TechMediaPlugins2010 » Fri May 04, 2012 12:20 pm

It is generated by the Tesseract engine.

Postby domino » Fri May 04, 2012 12:24 pm

Just to say nice work, Alberto.

I've run a few files of the type I need to work with and it handles the text very well - just another 7000 to go (really!)

I'll give it a more rigorous checkout tomorrow, and check out the html stuff as well....

Thanks...
User avatar
domino
 
Posts: 275
Joined: Sat Apr 02, 2005 7:11 am
Location: Notts UK

Postby TechMediaPlugins2010 » Fri May 04, 2012 12:24 pm

I did a little research. It seems Tesseract has some difficulty with hyperlinks.

Postby Wrangler » Fri May 04, 2012 12:31 pm

Not blaming you, Al. I know it's the engine. But the funny thing is that it recognizes the URLs fine with HTML recognition.

Even the best OCR programs out there require a little cleanup. It's to be expected.
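
As an illustration of that kind of cleanup, here is a minimal Python sketch (not NeoBook script, and not something the plugin does) that closes the stray spaces seen in the plain-text URL above. The function name `tidy_url_spacing` is made up for this example. It only removes whitespace after the scheme and around dots; misread characters such as "wwm" or "iu.n" are left exactly as they are, because guessing at them could easily make things worse.

```python
import re

# A scheme followed by stray whitespace, e.g. "http: //"
_SCHEME_GAP = re.compile(r"\b(https?):\s+//")
# Whitespace on both sides of a dot between word characters, e.g. "neosoftware . com"
_DOT_GAP = re.compile(r"(?<=\w)\s+\.\s+(?=\w)")


def tidy_url_spacing(line):
    """Close gaps the OCR inserted inside a URL; no character is ever changed."""
    if not re.search(r"\bhttps?:", line):
        return line  # no URL on this line, so leave ordinary prose untouched
    line = _SCHEME_GAP.sub(r"\1://", line)
    return _DOT_GAP.sub(".", line)


raw = "http: //wwmneosoftware . com/forum/viewtopic.php?t=19347iu.nwatch=topic"
print(tidy_url_spacing(raw))
```

The rule is applied to the whole line, so a line that mixes a URL with ordinary text containing a spaced-out dot would be changed there as well. For one-URL-per-line text like the email above that should not matter.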

Postby TechMediaPlugins2010 » Fri May 04, 2012 12:36 pm

I know, Pete. Cheers

Postby Wrangler » Fri May 04, 2012 12:39 pm

Anyway, good job, buddy. Now I'll have to start looking for things to OCR. :)

Postby HPW » Fri May 04, 2012 2:53 pm

Hello,

I just tried to test it and it failed.
I unpacked it into a directory tmOCR under my Plugin folder, preserving the 'tessdata' directory.
When I start and press 'Create plugin', I get the Rectangle_Host filled with 'OCR Plugin ACTIVE' on a grey background.
Then I press 'Load Image' and choose OCR_TEST.bmp.
The variable ImageOCR gets filled with the correct name+path.

ImageOCR=C:\Programme\NeoBook5\Plugins\tmOCR\OCR_TEST.bmp
Rectangle_Host.tmOCRLastError=000 Ok

Then I press 'RECOGNIZE'.
The debugger shows, with a green sign:
tmOCRLoadRecognize "Rectangle_Host"

But nothing happens. No text. No new variable in the debugger.
Same with 'RECOGNIZE HTML'.

Neither [Rectangle_host.tmOCRText] nor [Rectangle_host.tmOCRTextHTML] is set.

Running on German Windows XP Pro with the latest NeoBook.

What is the second parameter of:

tmOCRCreate "Rectangle_Host" ""

The readme talks about a scale_faktor to set.
Where and how is it set?

Where can I use 'lgGerman', and does it then use 'deu.traineddata'?

Bug: The About box shows the tmBase plugin.

Is there no NBR version of the plugin?


Regards

Hans-Peter
Hans-Peter
User avatar
HPW
 
Posts: 2521
Joined: Fri Apr 01, 2005 11:24 pm
Location: Germany

Postby Wrangler » Fri May 04, 2012 3:00 pm

Hans: It worked for me with jpg format, but I didn't try bmp. Can you try with a jpg and see if it's format related?

Postby TechMediaPlugins2010 » Fri May 04, 2012 3:04 pm

You are the second person who can't use it.

Please put tesseract.dll in the same folder as the app. To test, please put tesseract.dll in the Windows system folder.

To use the German language, you have to download the files and use:

tmOCRLanguage with lgGerman

To modify the scale factor, use:

tmOCRScaleFactor with an integer, like 2 or 3

**

As for the "About" box, I just forgot to update it; it is not a commercial plugin.

Postby TechMediaPlugins2010 » Fri May 04, 2012 3:11 pm

There's a new version at the same link, with the "About" box fixed.

Postby HPW » Fri May 04, 2012 3:18 pm

Please put tesseract.dll in the same folder as the app. To test, please put tesseract.dll in the Windows system folder.


I have it in the same folder because I unzipped everything.
I also tried windows/system, without any more success.

To use the German language, you have to download the files


I have downloaded 'deu.traineddata', so is it used by 'lgGerman'?

The BMP comes with the plugin, so it should work.
But I also converted it to jpg and tif with NeoPaint, with no success.

Edit: Downloaded the new zip and the About box is fixed. The other problems remain.
Edit: Going to bed now.
Edit: Going to bed now.

Regards

Hans-Peter

Postby TechMediaPlugins2010 » Fri May 04, 2012 3:28 pm

The only reason the plugin would not work is that it can't find tesseract.dll and/or the tessdata folder.

Please COMPILE the example pub and put the EXE in a folder.

In that folder, put tesseract.dll and the TESSDATA folder, with the files that are in the zip, so the structure will be:

Application Folder
EXE
TESSERACT.DLL
\TESSDATA
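
The layout above can be double-checked outside NeoBook before running the compiled EXE. This is just a rough Python sketch under a few assumptions: the function name `check_ocr_layout` is made up, it is not part of the plugin, and the language files are assumed to be named `<code>.traineddata` (as with the 'deu.traineddata' file mentioned earlier in the thread). Whether 'lgGerman' really maps to 'deu' is not confirmed here.

```python
from pathlib import Path


def check_ocr_layout(app_dir, lang_codes=()):
    """Return a list of problems found in the folder layout; an empty list means OK."""
    app = Path(app_dir)
    if not app.is_dir():
        return [f"application folder not found: {app}"]

    problems = []
    # Compare lower-cased names, since Windows file names ignore case.
    entries = {p.name.lower(): p for p in app.iterdir()}

    if "tesseract.dll" not in entries:
        problems.append("tesseract.dll is missing next to the EXE")

    tessdata = entries.get("tessdata")
    if tessdata is None or not tessdata.is_dir():
        problems.append("tessdata folder is missing next to the EXE")
        return problems  # cannot look for language files without the folder

    present = {p.name.lower() for p in tessdata.iterdir()}
    for code in lang_codes:
        if f"{code}.traineddata" not in present:
            problems.append(f"tessdata/{code}.traineddata is missing")
    return problems


if __name__ == "__main__":
    # Check the current folder, asking for the German data file as an example.
    for problem in check_ocr_layout(".", ("deu",)):
        print(problem)
```

An empty result only says that the files are where the layout above expects them; it cannot tell whether the DLL actually loads on a given Windows installation.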


PS: The plugin works with almost any file format.
