Thank you all. I learned a lot from you guys.

On Thu, May 23, 2019 at 3:54 PM Nicolas Franck <nicolas.fra...@ugent.be>
wrote:

> In that case you'll have to duplicate that field:
>
> id: $name_of_file
> id_t: $name_of_file
>
> The first field should be marked as "string", and set to be the key field.
> Id-fields cannot be tokenized.
>
> The second field is a derivative (you can just copy the contents, or use
> copyField), and should be set to a field type that does tokenization. In
> this case you'll need a field type that uses n-grams:
>
>
> https://lucene.apache.org/solr/guide/6_6/tokenizers.html#Tokenizers-N-GramTokenizer
>
> otherwise you'll end up using wildcard queries ( _id_s:my* ) that do not
> perform very well.
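
To make the n-gram point concrete for myself, here is a minimal sketch in plain Python (not Solr code) of what such a tokenizer roughly produces and why a partial file name can then be found with an exact term lookup instead of a wildcard scan. The function names and the sample file names are made up for illustration; the gram sizes 2 to 3 are an arbitrary assumption that loosely mirrors the minGramSize/maxGramSize settings of the N-Gram tokenizer.

```python
def ngrams(text, min_size=2, max_size=3):
    """Return all lowercased character n-grams of sizes min_size..max_size."""
    text = text.lower()
    grams = []
    for size in range(min_size, max_size + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


def build_index(filenames, min_size=2, max_size=3):
    """Map every n-gram to the set of file names that contain it."""
    index = {}
    for name in filenames:
        for gram in ngrams(name, min_size, max_size):
            index.setdefault(gram, set()).add(name)
    return index


def search(index, fragment):
    """Look the fragment up as a single term; no scan over all names."""
    return sorted(index.get(fragment.lower(), set()))


# Hypothetical file names, for illustration only.
index = build_index(["report_2019.pdf", "myfile.docx", "summary.docx"])
print(search(index, "my"))
print(search(index, "doc"))
```

In this toy version a fragment only matches if its length falls within the indexed gram sizes, which is the trade-off to keep in mind when choosing those sizes.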
>
> On 23 May 2019, at 09:39, Mohomed Rimash <rim...@yaalalabs.com> wrote:
>
> Yes, in that case your file name should be the key field of each document
> you add to Solr.
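
As a rough sketch of the documents this implies, the snippet below builds one document per file with the file name in both `id` and `id_t`, as suggested further up in the thread. The function name and the paths are hypothetical, and the schema is assumed to already define `id` as the string key field and `id_t` as the tokenized one.

```python
import json
from pathlib import PurePosixPath


def build_docs(paths):
    """Build one document per file, keyed by its file name."""
    docs = []
    for path in paths:
        name = PurePosixPath(path).name
        docs.append({
            "id": name,    # untokenized key field ("string" type)
            "id_t": name,  # tokenized copy, used for partial-name search
        })
    return docs


# Hypothetical paths, for illustration only.
payload = json.dumps(build_docs(["/data/docs/report_2019.pdf",
                                 "/data/docs/myfile.docx"]))
print(payload)
```

A JSON array like this is the kind of body that can be posted to a core's update handler. One caveat: if two directories hold files with the same name, the later document replaces the earlier one, so using the full path as `id` may be the safer choice.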
>
> On Thu, 23 May 2019 at 12:32, luckydog xf <luckydo...@gmail.com> wrote:
>
> Thanks, guys.
>
> I *don't mean to be a bother*, I just want to confirm: I know it's doable to
> search for keywords, but what I want is the *file name(s)* of the files that
> contain the string. Is the answer still yes?
>
> Thanks again.
>
> On Thu, May 23, 2019 at 2:20 PM Jörn Franke <jornfra...@gmail.com> wrote:
>
> You can do much more than grep. I recommend getting a book on Solr and
> reading through it. Then you will have the full context and can see whether
> it is useful for you.
>
> On 23.05.2019 at 07:44, luckydog xf <luckydo...@gmail.com> wrote:
>
> Hi, list,
>
> A quick question: we have tons of Microsoft docx and PDF files (some PDFs
> are scanned copies), and we want to load them into Apache Solr, search for
> a few keywords contained in the files, and get the matching file names
> back.
>
> # It's the same thing as `grep -r KEYWORD /PATH/XXX` on a Linux system.
>
> Is it doable?
>
> Thanks,
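
For the grep-style request above, a query that searches the extracted text and returns only the file names could look roughly like the sketch below. It only builds the request URL and sends nothing. The core name `docs` and the field name `content` are assumptions, as is the idea that each document's `id` holds the file name; `q`, `fl` and `rows` are standard query parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, keyword, rows=10):
    """Build a select URL that returns only the id (file name) of each hit."""
    params = {
        "q": f'content:"{keyword}"',  # "content" is a hypothetical text field
        "fl": "id",                   # return just the key field
        "rows": rows,                 # number of hits per page
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical host and core, for illustration only.
url = build_select_url("http://localhost:8983/solr", "docs", "KEYWORD")
print(url)
```

The keyword is put into the query as a quoted phrase without escaping, so a keyword that itself contains double quotes would need extra handling.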
>