Hi Charlie,

I was applying the synonyms at index time only. The synonyms/acronyms come from the published journals' XML files, so I wasn't expecting to maintain them myself. If it worked, I was hoping to update the synonyms file automatically.
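
As a rough illustration of what generating the synonyms file automatically could look like, here is a minimal Python sketch. It assumes the acronym/expansion pairs have already been extracted from the journal XML into a dict; the function name `to_synonym_lines` and the sample data are hypothetical, and the output simply mirrors the `ACRONYM=>ACRONYM expansion` layout quoted further down in this thread. It does no escaping or de-duplication.

```python
def to_synonym_lines(acronyms):
    """Format {acronym: expansion} pairs as 'ACRONYM=>ACRONYM expansion' lines.

    Assumption: this mirrors the layout quoted in the thread; it is not
    a statement about the best synonym format to use.
    """
    lines = []
    for acronym, expansion in sorted(acronyms.items()):
        lines.append(f"{acronym}=>{acronym} {expansion}")
    return lines


# Hypothetical sample; real pairs would come from the journal XML files.
sample = {
    "SRN": "Stroke Research Network",
    "SRM": "standardised response mean",
}

if __name__ == "__main__":
    print("\n".join(to_synonym_lines(sample)))
```

The lines are sorted by acronym only so that regenerated files diff cleanly between runs.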

As I just explained to Bernd, I'm finding that because I'm just using the acronyms supplied in the documents, there's some overlap in the words used and it's giving me unexpected results. For example, if I enter diabetes it finds the acronym DM for diabetes mellitus, which then coincides with an author's initials and puts them at the top of the list, which is completely wrong. Or is it? Perhaps I was looking for an author DM. Just too much noise to be useful, I think.

Thanks for your input anyway.

Shaun

On Fri, 15 Jan 2021 at 11:18, Charlie Hull <ch...@opensourceconnections.com> wrote:

> I'm wondering if you should be using these acronyms at index time, not
> search time. It will make your index bigger and you'll have to re-index
> to add new synonyms (as they may apply to old documents) but this could
> be an occasional task, and in the meantime you could use query-time
> synonyms for the new ones.
>
> Maintaining 9000 synonyms in Solr's synonyms.txt file seems unwieldy to me.
>
> Cheers
>
> Charlie
>
> On 15/01/2021 09:48, Shaun Campbell wrote:
> > I have a medical journals search application and I've a list of some 9,000
> > acronyms like this:
> >
> > MSNQ=>MSNQ Multiple Sclerosis Neuropsychological Screening Questionnaire
> > SRN=>SRN Stroke Research Network
> > IGBP=>IGBP isolated gastric bypass
> > TOMADO=>TOMADO Trial of Oral Mandibular Advancement Devices for Obstructive sleep apnoea–hypopnoea
> > SRM=>SRM standardised response mean
> > SRT=>SRT substrate reduction therapy
> > SRS=>SRS Sexual Rating Scale
> > SRU=>SRU stroke rehabilitation unit
> > T2w=>T2w T2-weighted
> > Ab-P=>Ab-P Aberdeen participation restriction subscale
> > MSOA=>MSOA middle-layer super output area
> > SSA=>SSA site-specific assessment
> > SSC=>SSC Study Steering Committee
> > SSB=>SSB short-stretch bandage
> > SSE=>SSE sum squared error
> > SSD=>SSD social services department
> > NVPI=>NVPI Nausea and Vomiting of Pregnancy Instrument
> >
> > I tried to put them in a synonyms file, either just with a comma between,
> > or with an arrow in between and the acronym repeated on the right like
> > above, and no matter what I try I'm getting really strange search results.
> > It's like words in one acronym are matching with the same word in another
> > acronym and then searching with that acronym which is completely unrelated.
> >
> > I don't think Solr can handle this, but does anyone know of any crafty
> > tricks in Solr to handle this situation where I can either search by the
> > acronym or by the text?
> >
> > Shaun
> >
>
> --
> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> <www.o19s.com>
> Founding member of The Search Network <https://thesearchnetwork.com/>
> and co-author of Searching the Enterprise
> <https://opensourceconnections.com/about-us/books-resources/>
> tel/fax: +44 (0)8700 118334
> mobile: +44 (0)7767 825828