Add font information to result iterator #9
Comments
Hi Charles,
Commented out the release for now and have it running. Will test more this
Thanks
Hi Walter,

    IntPtr TessResultIteratorWordFontAttributes(
        IntPtr handle,
        ref bool is_bold, ref bool is_italic, ref bool is_underlined, ref bool is_monospace,
        ref bool is_serif, ref bool is_smallcaps, ref int pointsize, ref int font_id);

The benefit of just having it return an IntPtr here is that you only need to dereference it if required, i.e. when the font hasn't been created yet. Implementation-wise, I was thinking we could maintain a dictionary of Font objects, keyed by font_id, in the result iterator or even the page, and only create a new one if it doesn't exist. Pseudo code for this would probably look like this:
You'll of course need to dereference the string pointer using Marshal.PtrToStringAnsi, assuming the return value is a null-terminated ANSI string.

Assumptions

I've made a few assumptions here, mainly the following:
Good luck!
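
The cached-font idea described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: `FontInfo` and `FontInfoCache` are hypothetical names that do not come from the wrapper, only two of the attributes are carried to keep it short, and it assumes (as the comment does) that font_id uniquely identifies a font and that the native name is a null-terminated ANSI string.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical holder for a font; not a type from the wrapper.
public sealed class FontInfo
{
    public FontInfo(int id, string name, bool isBold, bool isItalic)
    {
        Id = id;
        Name = name;
        IsBold = isBold;
        IsItalic = isItalic;
    }

    public int Id { get; }
    public string Name { get; }
    public bool IsBold { get; }
    public bool IsItalic { get; }
}

// Hypothetical cache keyed by font_id, e.g. owned by the result iterator or the page.
public sealed class FontInfoCache
{
    private readonly Dictionary<int, FontInfo> fonts = new Dictionary<int, FontInfo>();

    public int Count
    {
        get { return fonts.Count; }
    }

    // namePtr stands in for the IntPtr returned by the native call.
    public FontInfo GetOrCreate(int fontId, IntPtr namePtr, bool isBold, bool isItalic)
    {
        FontInfo font;
        if (!fonts.TryGetValue(fontId, out font))
        {
            // Cache miss: only now is the native string dereferenced.
            string name = Marshal.PtrToStringAnsi(namePtr);
            font = new FontInfo(fontId, name, isBold, isItalic);
            fonts.Add(fontId, font);
        }
        return font;
    }
}
```

On a cache hit the name pointer is never read, which is the only marshalling this sketch saves; the boolean attributes are still passed in on every call.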
A possible enhancement might be to use an enum for the font attributes (Bold, Italic, etc.); check out http://stackoverflow.com/questions/8447/enum-flags-attribute for an example of what I'm talking about.
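
A minimal sketch of what such a flags enum might look like, assuming the six boolean attributes from the native signature map one-to-one onto flags. The names `FontAttributes` and `FontAttributesBuilder` are hypothetical and not part of the wrapper.

```csharp
using System;

// Hypothetical flags enum; each attribute gets its own bit so values can be combined.
[Flags]
public enum FontAttributes
{
    None = 0,
    Bold = 1 << 0,
    Italic = 1 << 1,
    Underlined = 1 << 2,
    Monospace = 1 << 3,
    Serif = 1 << 4,
    SmallCaps = 1 << 5
}

public static class FontAttributesBuilder
{
    // Folds the individual booleans reported by the native call into one value.
    public static FontAttributes FromFlags(
        bool isBold, bool isItalic, bool isUnderlined,
        bool isMonospace, bool isSerif, bool isSmallCaps)
    {
        FontAttributes result = FontAttributes.None;
        if (isBold) result |= FontAttributes.Bold;
        if (isItalic) result |= FontAttributes.Italic;
        if (isUnderlined) result |= FontAttributes.Underlined;
        if (isMonospace) result |= FontAttributes.Monospace;
        if (isSerif) result |= FontAttributes.Serif;
        if (isSmallCaps) result |= FontAttributes.SmallCaps;
        return result;
    }
}
```

A caller could then test a single attribute with `(attrs & FontAttributes.Bold) != 0` or `attrs.HasFlag(FontAttributes.Bold)` instead of carrying six separate booleans around.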
Hi Charles,

Regarding your suggestions for the Font change, I was looking to only

IN RESULTITERATOR.CS: (after GetText)

IN PAGE.CS: (after GetMeanConfidence)

IN BASEAPI.CS: (after ResultIteratorGetConfidence)
CallingConvention.Cdecl, EntryPoint =

I will work out how to use GitHub before getting lost in the act of coding.
There are two main reasons I was thinking of using a cached font object, in no particular order:
As I mentioned previously, I'm assuming the fonts are uniquely identified by font id.
Hi all, I recently had a use for this feature, so I took a stab at implementing it (forty2/tesseract@fadf20d). Shortly thereafter, I came across pull request #275 from last June, which covers some of the same territory, plus more. The primary difference between that code and mine is that I went ahead and implemented the suggested cached font objects rather than directly exposing the API with its 8-ish "out" params. Still, I didn't feel right about opening a second pull request for the same feature without at least some discussion. So, what can I do to help get this feature merged in some form or another? Thanks!
Hi Zach,
As for #275, well, to be honest it just fell by the wayside :(
I left some thoughts as noted on your commits. Also, after actually looking at the implementation, I don't think the cache in this case would be giving you much, since you're effectively marshalling all the font information each time anyway. While you could change it to only marshal the font name if it's not in the cache, I'm not sure it's worth it. It's also something that could be added retrospectively if required, without any API changes.
Sorry, that's "as notes", not "noted".
Font information isn't yet available from the result iterator. There are two parts to completing this:
Also make sure that you free any resources that need freeing (check the doc for TessResultIteratorWordFontAttributes).
Signature retrieved from capi.h from https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-3.02.02-win32-lib-include-dirs.zip