If you are uploading a PDF, then you must be doing it via Tika or via an extract handler (which uses Tika under the covers).
Try running a standalone Tika of the same version on the file and see what it outputs. Perhaps something in those specific PDF pages confuses Tika. For example, if the English text used a different font, Adobe may have encoded each letter individually and thereby broken the flow.

PDF is not a content format, but a presentation format. These things happen.

Regards,
   Alex

On Tue, 8 Sep 2020 at 09:11, <ad...@ukr.net> wrote:
>
> Thank you for the support,
>
> I upload the PDF file page by page. In this case, left-to-right (LTR) or
> right-to-left (RTL) reading applies to the whole document, not to each
> specific text block (separately for Arabic, separately for English).
>
> I can see the same behavior in the output of the /select call as well as
> the /browse call.
>
> Almost sure the problem is during upload
> <filter class="solr.ASCIIFoldingFilterFactory"/>
>
> But adding this to the
> <analyzer type="index"> and later to another analyzer does not change the
> result.
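
For the standalone check suggested in the reply, here is a minimal sketch in Python. It assumes Java is installed and that you have downloaded a tika-app jar matching the Tika version bundled with your Solr. The jar path, the PDF path, and the helper names are hypothetical. The `--text` and `--encoding` options are tika-app's plain-text output and output-encoding switches. The `direction_runs` helper is only an illustration: it reports the order of LTR and RTL runs as they are stored in the extracted text, which may help show whether the order is already wrong before Solr's analyzers see it.

```python
import subprocess
import unicodedata


def tika_text_command(tika_jar, pdf_path):
    """Build the tika-app command line for plain-text extraction.

    tika_jar and pdf_path are placeholders supplied by the caller.
    """
    return ["java", "-jar", str(tika_jar),
            "--text", "--encoding=UTF-8", str(pdf_path)]


def extract_text(tika_jar, pdf_path):
    """Run tika-app and return the text it extracts (needs Java and the jar)."""
    result = subprocess.run(
        tika_text_command(tika_jar, pdf_path),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def direction_runs(text):
    """Return the sequence of 'LTR'/'RTL' runs in stored (logical) order."""
    runs = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            direction = "LTR"
        elif bidi in ("R", "AL"):
            direction = "RTL"
        else:
            continue  # skip digits, spaces, punctuation (weak or neutral)
        if not runs or runs[-1] != direction:
            runs.append(direction)
    return runs
```

Calling `extract_text` on a single problem page and comparing the result with what /select returns should indicate whether the text is already out of order when Tika produces it, or only after indexing.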