Re: Copy field a source of copy field

Erick Erickson Tue, 18 Jul 2017 15:32:10 -0700

OK, I take it back. Keepwords handle multiple words just fine. So I
have to rewind.

I'm having no trouble at all applying multiple, successive keepwords
filters, even when there are multiple words on a single line in the
keepwords file. Your use of shingles in here is probably going to
confuse things, so I'd probably recommend taking that out until you
work out what's happening with multiple keepwords filters, then add it
back in.

The images you pasted almost look like you're showing the contents of
elevate.xml, but I suspect that's bogus.

But I think this is an XY problem: you're asking how to chain
copyFields, and we got off into talking about chaining keepwords and
the like. You state:

"So, the requirements here, are to be able to find all species in
species files (step one) and then make a facet with species in file
genus, step two."

Then you say:

"And the second one (genus), which contains genus that has to be for
facet purposes, like this"

How are those reconciled? Do you want facets on the genus+species? Or
just on the genus? Or both? So let's just start over.

What's also missing is why you think you need keepwords in the first
place. Is this a free-text field you're trying to extract
genus/species from? Or do you have the genus/species extracted
already?

Give us two docs, a sample search, and what you want as the outcome.
If you just want to facet on genus, then simply copyField to a
"genus" field whose analysis strips out everything but the genus
(however you implement that; sub-species may make it tricky).
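
As a rough sketch only, a genus facet field might look something like
the fragment below. It assumes the genus names are being pulled out of
a free-text field with a keepwords file. The names "genus_facet_type",
"genus_facet" and the source field "description" are made up for
illustration; genus.txt is the file from your own config.

```xml
<!-- Hypothetical field type: keeps only tokens that are listed in genus.txt -->
<fieldType name="genus_facet_type" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Every token not found in genus.txt is dropped -->
    <filter class="solr.KeepWordFilterFactory" words="genus.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>

<!-- Hypothetical facet field; not stored, since it is only used for faceting -->
<field name="genus_facet" type="genus_facet_type" indexed="true" stored="false" multiValued="true"/>

<!-- "description" is an assumed name for the free-text source field -->
<copyField source="description" dest="genus_facet"/>
```

Faceting with facet.field=genus_facet would then count the indexed
tokens, i.e. whatever survives the keepwords filter.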

Ditto if you want to facet on species. Just a species_facet field that
you put whatever you want into. Or just use KeywordTokenizer for
species if you're guaranteed that you want the whole field.

You can then use copyField to copy as you wish.
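
Here is a minimal sketch of the KeywordTokenizer variant, assuming the
species name already arrives in a field of its own. The names
"species_exact", "species" and "species_facet" are hypothetical, and
"text_general" is assumed to be defined in your schema (it is in the
stock configsets).

```xml
<!-- Hypothetical field type: the whole field value becomes one lowercased token -->
<fieldType name="species_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Assumed searchable source field holding the already-extracted species name -->
<field name="species" type="text_general" indexed="true" stored="true"/>

<!-- Hypothetical facet field: one facet value per complete species name -->
<field name="species_facet" type="species_exact" indexed="true" stored="false"/>

<!-- copyField copies the raw input of the source before any analysis, -->
<!-- and the destination of one copyField does not feed another copyField. -->
<copyField source="species" dest="species_facet"/>
```

Since copyFields don't chain, every destination has to name the
original source field directly and do its own analysis.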

Best,
Erick

On Tue, Jul 18, 2017 at 2:23 PM, tstusr <ulfrhe...@gmail.com> wrote:
> Well, for me it's kind of strange because it's working only with words that
> have blank spaces. It seems that maybe I'm not explaining well.
>
> My field is defined as follows:
>
>   <fieldType name="genus_type" class="solr.TextField"
> positionIncrementGap="0">
>     <analyzer type="index">
>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       <charFilter class="solr.MappingCharFilterFactory"
> mapping="mapping/mapping-ISOLatin1Accent.txt"/>
>       <charFilter class="solr.PatternReplaceCharFilterFactory"
> pattern="[0-9]+|(\-)(\s*)" replacement=""/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.ShingleFilterFactory" maxShingleSize="3"
> outputUnigrams="true"/>
>       <filter class="solr.KeepWordFilterFactory" words="species.txt"
> ignoreCase="true"/>
>       <filter class="solr.KeepWordFilterFactory" words="genus.txt"
> ignoreCase="true"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.StandardTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
>   </fieldType>
>
> We have 2 KWF files, "species" and then "genus". It seems that it is
> only working with genus.
>
> Since I'm not able to use copy fields, what choices do I have?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425p4346665.html
> Sent from the Solr - User mailing list archive at Nabble.com.
