Yeah, it’s always a question of “how much is enough/too much”. That looks reasonable for alphatitle, but what about title? Your original question was that the sort order changes depending on which field you sort on. If your title field uses a tokenizer that splits the input, or doesn’t have the same analysis chain (particularly the lowercasing and patternreplace), then I’d expect the order to change.
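To make that concrete, here is a rough Python sketch of what the alphatitle chain from the thread should do to the three titles. It is only an approximation, not Solr itself: the function names are made up for illustration, Python's lower()/strip() stand in for LowerCaseFilter/TrimFilter, and I'm assuming the empty "replacement" in the admin output means "replace with nothing". The second function shows the filter-ordering problem: run the pattern replace before lowercasing and the uppercase letters are dropped.

```python
import re

# Assumed stand-in for PatternReplaceFilter(pattern="([^a-z])", replace="all")
# with an empty replacement string.
NON_LETTERS = re.compile(r"[^a-z]")


def alphatitle_key(title: str) -> str:
    """Approximate the chain from the thread: the whole value is one token
    (KeywordTokenizer), then lowercase, trim, and strip everything but a-z."""
    return NON_LETTERS.sub("", title.lower().strip())


def pattern_before_lowercase_key(title: str) -> str:
    """Hypothetical misordered chain: the pattern replace runs first, so
    uppercase letters are removed before they can be lowercased."""
    return NON_LETTERS.sub("", title.strip()).lower()


titles = [
    "Money orders",
    "Finance, consolidation and rescheduling of debts",
    "Rights in former German Islands in Pacific",
]

# Show both keys side by side for each title.
for t in titles:
    print(alphatitle_key(t), "|", pattern_before_lowercase_key(t))

# Order you would get if the indexed sort key matched alphatitle_key.
print(sorted(titles, key=alphatitle_key))
```

If the indexed alphatitle values really look like the first key, an ascending sort should start with the "Finance..." title, so it is worth comparing these keys against what the analysis page shows for the same input.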
Best,
Erick

> On Jul 15, 2020, at 4:49 PM, David Hastings <hastings.recurs...@gmail.com> wrote:
>
> thanks, ill check the admin, didnt want to send a big block of text but:
>
> Tokenizer:
>   org.apache.lucene.analysis.core.KeywordTokenizerFactory
>     class: solr.KeywordTokenizerFactory
>     luceneMatchVersion: 7.1.0
>
> Token Filters:
>   org.apache.lucene.analysis.core.LowerCaseFilterFactory
>     class: solr.LowerCaseFilterFactory
>     luceneMatchVersion: 7.1.0
>   org.apache.lucene.analysis.miscellaneous.TrimFilterFactory
>     class: solr.TrimFilterFactory
>     luceneMatchVersion: 7.1.0
>   org.apache.lucene.analysis.pattern.PatternReplaceFilterFactory
>     pattern: ([^a-z])
>     replace: all
>     class: solr.PatternReplaceFilterFactory
>     replacement:
>     luceneMatchVersion: 7.1.0
>
> Query Analyzer:
> <http://192.168.1.33:7300/solr/#/mega/analysis?analysis.fieldname=alphatitle>
> org.apache.solr.analysis.TokenizerChain
>
> Tokenizer:
>   org.apache.lucene.analysis.core.KeywordTokenizerFactory
>     class: solr.KeywordTokenizerFactory
>     luceneMatchVersion: 7.1.0
>
> Token Filters:
>   org.apache.lucene.analysis.core.LowerCaseFilterFactory
>     class: solr.LowerCaseFilterFactory
>     luceneMatchVersion: 7.1.0
>   org.apache.lucene.analysis.miscellaneous.TrimFilterFactory
>     class: solr.TrimFilterFactory
>     luceneMatchVersion: 7.1.0
>   org.apache.lucene.analysis.pattern.PatternReplaceFilterFactory
>     pattern: ([^a-z])
>     replace: all
>     class: solr.PatternReplaceFilterFactory
>     replacement:
>     luceneMatchVersion: 7.1.0
>
> On Wed, Jul 15, 2020 at 4:47 PM Erick Erickson <erickerick...@gmail.com> wrote:
>
>> I’d look two places:
>>
>> 1> try the admin/analysis page from the admin UI. In particular, look at
>> what tokens actually get in the index.
>>
>> 2> again, the admin UI will let you choose the field (alphatitle and
>> title) and see what the actual indexed tokens are.
>>
>> Both have the issue that I don’t know what tokenizer you are using. For
>> sorting it better be something like KeywordTokenizer. Anything that
>> breaks up the input into separate tokens will produce surprises.
>>
>> And unless you have lowercaseFilter in front of your patternreplace,
>> you’re removing uppercase characters.
>>
>> Best,
>> Erick
>>
>>> On Jul 15, 2020, at 3:06 PM, David Hastings <hastings.recurs...@gmail.com> wrote:
>>>
>>> howdy,
>>> i have a field that sorts fine all other content, and i cant seem to debug
>>> why it wont sort for me on this one chunk of it.
>>>
>>> "sort":"alphatitle asc", "debugQuery":"on", "_":"1594733127740"}},
>>> "response":{"numFound":3,"start":0,"docs":[
>>> { "title":"Money orders",
>>> { "title":"Finance, consolidation and rescheduling of debts",
>>> { "title":"Rights in former German Islands in Pacific", },
>>>
>>> its using a copyfield from "title" to "alphatitle" that replaces all
>>> punctuation
>>> pattern: ([^a-z])
>>> replace: all
>>> class: solr.PatternReplaceFilterFactory
>>>
>>> and if i use just title it flips:
>>>
>>> "title":"Finance, consolidation and rescheduling of debts"},
>>> { "title":"Rights in former German Islands in Pacific"},
>>> { "title":"Money orders"}]
>>>
>>> and im banging my head trying to figure out what it is about this
>>> content in particular that is not sorting the way I would expect.
>>> don't suppose someone would be able to lead me to a good place to look?