Hi Edwin,

I do not know, but my guess would be that the regex counts each character as 1, regardless of how many bytes it takes in the encoding used.
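
To illustrate that guess outside of Solr, here is a minimal Python sketch comparing the character count of a string with its byte count. It assumes UTF-8 as the encoding, and the helper name char_and_byte_lengths is made up for the illustration. It says nothing definitive about how Solr's regex engine counts.

```python
def char_and_byte_lengths(text, encoding="utf-8"):
    """Return (number of characters, number of encoded bytes) for text."""
    return len(text), len(text.encode(encoding))


# 100 ASCII letters: one byte per character in UTF-8.
english = "a" * 100

# 100 Chinese characters: three bytes per character in UTF-8.
chinese = "中" * 100

print(char_and_byte_lengths(english))  # (100, 100)
print(char_and_byte_lengths(chinese))  # (100, 300)
```

If the guess holds, a subject of 100 Chinese characters would be 300 bytes in UTF-8 but still only 100 characters long, so it would not match /.{255,}.*/.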

Regards,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 3 Jan 2018, at 16:43, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
>
> Thanks for the reply.
>
> I am doing the search on existing data that has already been indexed, and
> it is likely to be a one time thing.
>
> This subject:/.{255,}.*/ works for English characters. However, there are
> Chinese characters in some of the records. The length seems to be more than
> 255, but it does not show up in the results.
>
> Do you know how the length for Chinese characters and other languages is
> being determined?
>
> Regards,
> Edwin
>
>
> On 3 January 2018 at 23:01, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
>> Do that during indexing as Emir suggested. Specifically, use an
>> UpdateRequestProcessor chain, probably with the Clone and FieldLength
>> processors:
>> http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/FieldLengthUpdateProcessorFactory.html
>>
>> Regards,
>> Alex.
>>
>> On 31 December 2017 at 22:00, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
>> wrote:
>>> Hi,
>>>
>>> Would like to check, if it is possible to query a field which has data of
>>> more than a certain length?
>>>
>>> Like for example, I want to query the field subject that has more than 255
>>> bytes. Is it possible?
>>>
>>> I am currently using Solr 6.5.1.
>>>
>>> Regards,
>>> Edwin