Re: [tesseract-ocr] Re: Tesseract training ground truth: I'm confused about the box files

2024-09-05 Thread Mateusz Matela
nop wrote: Ehm: 1. Tesseract v3 (legacy) engine training is based on characters. 2. Tesseract LSTM engine (tesseract >=v4) training script is based on lines (groups of words). Box files reflect that. And yes - box files are important. Zdenko On Fri, 12 Jul 2024 at 14:14, Mateusz Matela wrote

[tesseract-ocr] Re: Tesseract training ground truth: I'm confused about the box files

2024-07-12 Thread Mateusz Matela
file and let the training script autogenerate them. In that case the reported error rates were crazy, like 99% instead of 0.5%. This suggests that conclusion 3 is correct. On Wednesday, 10 July 2024 at 15:17:07 UTC+2, Mateusz Matela wrote: > Hi all, > > Sorry if double posting, my previou

[tesseract-ocr] Tesseract training ground truth: I'm confused about the box files

2024-07-10 Thread Mateusz Matela
Hi all, Sorry if double posting, my previous message didn't appear and I don't see any info about waiting for acceptance or something. I was searching for this topic in this forum and it was mentioned a few times, but I couldn't find a clear and definitive explanation. How does the information

Re: Quoted phrase doesn't match when stemming and synonyms combined.

2023-01-12 Thread Mateusz Matela
rocess. In Perl you can use forking via the process manager cpan module, most other languages do it as well (but not as well imo) On Jan 11, 2023, at 8:47 AM, Mateusz Matela wrote: After reindexing with SGF the document matches, as expected. Still, it looks like SGF was designed to work well

Re: Quoted phrase doesn't match when stemming and synonyms combined.

2023-01-11 Thread Mateusz Matela
ys be considered temporary and replaceable, it’s not a database, it’s a search tool to search a data set, and if done with that in mind you put the index on replaceable hardware and expect/have a plan for them to simply die and be replaced On Jan 11, 2023, at 6:27 AM, Mateusz Matela wrote: W

Re: Quoted phrase doesn't match when stemming and synonyms combined.

2023-01-11 Thread Mateusz Matela
On 11.01.2023 at 12:04, Dave wrote: Hmm. As an experiment what happens when you use a range of three or four with the quotes using the tilde in the query? You mean query like "test polskie"~1 ? Yes, it does match. Unfortunately it's not a workaround I can use because the query is provided

Quoted phrase doesn't match when stemming and synonyms combined.

2023-01-11 Thread Mateusz Matela
Hi, My query is 'test polskie'. I use MorfologikFilter for Polish stemming, it turns 'polskie' into 'polski' + 'polskie'. I also use SynonymGraphFilter which turns 'polski' into 'pol'. Here's what I see in query analysis (token position in parentheses): Tokenizer: test(1) polskie(2) MF: test(1)

[jira] [Closed] (HTTPASYNC-135) HttpAsyncMethods.createHead() methods creates HttpGet objects

2018-03-22 Thread Mateusz Matela (JIRA)
[ https://issues.apache.org/jira/browse/HTTPASYNC-135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mateusz Matela closed HTTPASYNC-135. So inconvenient :P OK, looks good. > HttpAsyncMethods.createHead() methods creates Http

[jira] [Commented] (HTTPASYNC-135) HttpAsyncMethods.createHead() methods creates HttpGet objects

2018-03-17 Thread Mateusz Matela (JIRA)
[ https://issues.apache.org/jira/browse/HTTPASYNC-135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403547#comment-16403547 ] Mateusz Matela commented on HTTPASYNC-135: -- If you're wa

[jira] [Commented] (HTTPASYNC-135) HttpAsyncMethods.createHead creates Get

2018-03-11 Thread Mateusz Matela (JIRA)
[ https://issues.apache.org/jira/browse/HTTPASYNC-135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16394479#comment-16394479 ] Mateusz Matela commented on HTTPASYNC-135: -- I agree this kind of issue is

[jira] [Commented] (HTTPASYNC-135) HttpAsyncMethods.createHead creates Get

2018-03-07 Thread Mateusz Matela (JIRA)
[ https://issues.apache.org/jira/browse/HTTPASYNC-135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389266#comment-16389266 ] Mateusz Matela commented on HTTPASYNC-135: -- That's weird. I was l

[jira] [Created] (HTTPASYNC-135) HttpAsyncMethods.createHead creates Get

2018-03-06 Thread Mateusz Matela (JIRA)
Mateusz Matela created HTTPASYNC-135: Summary: HttpAsyncMethods.createHead creates Get Key: HTTPASYNC-135 URL: https://issues.apache.org/jira/browse/HTTPASYNC-135 Project: HttpComponents

Re: GWT compiler hangs when "Compiling permutation 0"

2015-04-22 Thread Mateusz Matela
Related issues for reference, it's hard to find them. https://code.google.com/p/google-web-toolkit/issues/detail?id=7857 https://code.google.com/p/google-web-toolkit/issues/detail?id=9184 -- You received this message because you are subscribed to the Google Groups "Google Web Toolkit" group. To

[Dpp-robot] [dpp:bugs] #836 Eclipse hangs after class field rename

2014-10-01 Thread Mateusz Matela
--- ** [bugs:#836] Eclipse hangs after class field rename** **Status:** open **Group:** 13.3.29 **Created:** Wed Oct 01, 2014 04:16 PM UTC by Mateusz Matela **Last Updated:** Wed Oct 01, 2014 04:16 PM UTC **Owner:** nobody 1. Saros session started by sharing one of the source folders in a

[jira] [Commented] (SOLR-3967) Mapping error: langid.enforceSchema option checks source field instead of target field

2013-02-03 Thread Mateusz Matela (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569777#comment-13569777 ] Mateusz Matela commented on SOLR-3967: -- Yes, the fix looks good. Th

[jira] [Closed] (SOLR-3967) Mapping error: langid.enforceSchema option checks source field instead of target field

2013-02-03 Thread Mateusz Matela (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mateusz Matela closed SOLR-3967. > Mapping error: langid.enforceSchema option checks source field instead of > target

[jira] [Created] (SOLR-3967) Mapping error: langid.enforceSchema option checks source field instead of target field

2012-10-19 Thread Mateusz Matela (JIRA)
Mateusz Matela created SOLR-3967: Summary: Mapping error: langid.enforceSchema option checks source field instead of target field Key: SOLR-3967 URL: https://issues.apache.org/jira/browse/SOLR-3967

[jira] [Updated] (VELOCITY-806) Can't access map values when get(Object key) is overriden

2011-07-01 Thread Mateusz Matela (JIRA)
[ https://issues.apache.org/jira/browse/VELOCITY-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mateusz Matela updated VELOCITY-806: Priority: Minor (was: Major) > Can't access map values when get(Object

[jira] [Created] (VELOCITY-806) Can't access map values when get(Object key) is overriden

2011-07-01 Thread Mateusz Matela (JIRA)
: Bug Components: Engine Affects Versions: 1.7 Reporter: Mateusz Matela I try to add a java.util.Map instance to Velocity context and access its values from a template, for example: ${map.get("someKey")} It doesn't work if I use my own implementation of Map.get(Object key)