Thank you all. I learned a lot from you guys.

On Thu, May 23, 2019 at 3:54 PM Nicolas Franck <nicolas.fra...@ugent.be> wrote:
> In that case you'll have to duplicate that field:
>
>   id: $name_of_file
>   id_t: $name_of_file
>
> The first field should be of type "string" and set as the key field.
> Id fields cannot be tokenized.
>
> The second field is a derivative (you can just copy the contents, or use
> copyField) and should be given a field type that does tokenization. In this
> case you'll need a field type that uses n-grams:
>
> https://lucene.apache.org/solr/guide/6_6/tokenizers.html#Tokenizers-N-GramTokenizer
>
> Otherwise you'll end up using wildcard queries ( _id_s:my* ), which do not
> perform very well.
>
> On 23 May 2019, at 09:39, Mohomed Rimash <rim...@yaalalabs.com> wrote:
>
> Yes, in that case the file name should be the key field of each document
> you add to Solr.
>
> On Thu, 23 May 2019 at 12:32, luckydog xf <luckydo...@gmail.com> wrote:
>
> Thanks guys.
>
> *Don't mean to be a bother*, I just want to confirm: I know it's doable to
> search for keywords, but what I want is the *FileName(s)* that contain the
> string. Is the answer still yes?
>
> Thanks again.
>
> On Thu, May 23, 2019 at 2:20 PM Jörn Franke <jornfra...@gmail.com> wrote:
>
> You can do much more than grep. I recommend getting a book on Solr and
> reading through it. Then you get the full context and can see whether it
> is useful for you.
>
> On 23.05.2019 at 07:44, luckydog xf <luckydo...@gmail.com> wrote:
>
> Hi, list,
>
> A quick question: we have tons of Microsoft docx/PDF files (some PDFs are
> scanned copies), and we want to load them into Apache Solr, search for a
> few keywords contained in the files, and get the matching filenames back.
>
> # It's the same thing as `grep -r KEYWORD /PATH/XXX` on a Linux system.
>
> Is it doable?
>
> Thanks,
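
For anyone reading this thread later, here is a rough Python sketch of why the n-gram copy of the file name (the `id_t` field above) allows substring lookups without wildcard queries. It is not Solr code and does not call any Solr API: the functions `ngrams`, `build_index`, and `search`, the gram sizes (2 to 4), and the sample file names are all made up for illustration. The idea is only that every substring of an allowed length is stored as its own term at index time, so a query becomes an exact term lookup.

```python
def ngrams(text, min_gram=2, max_gram=4):
    """Return every character n-gram of `text` with a length in [min_gram, max_gram]."""
    text = text.lower()
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams


def build_index(filenames, min_gram=2, max_gram=4):
    """Map each n-gram to the set of file names that contain it (a toy inverted index)."""
    index = {}
    for name in filenames:
        for gram in ngrams(name, min_gram, max_gram):
            index.setdefault(gram, set()).add(name)
    return index


def search(index, term):
    """Exact term lookup: no wildcard scan over the indexed terms is needed."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical file names, used only for this illustration.
files = ["annual_report.docx", "myfile.pdf", "notes.pdf"]
index = build_index(files)

print(search(index, "my"))      # ['myfile.pdf']
print(search(index, "pdf"))     # ['myfile.pdf', 'notes.pdf']
print(search(index, "report"))  # [] because "report" is longer than max_gram
```

The last lookup shows the trade-off in this sketch: a search term longer than the largest gram size finds nothing unless the query side is also split into grams, and larger gram sizes make the index bigger. The gram sizes in a real field type would need to be chosen with that in mind.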