1- How can I put the file extension into my index? I'm using Nutch to crawl web pages and sending Nutch's data to Solr for indexing, and I have no idea how to put the file extension into my index.
2- Please give me some helpful links about MIME types. I'm new to Solr and don't know anything about MIME types.
Please note that I have to index Nutch's data, and I couldn't find useful commands in the Nutch tutorial for advanced indexing!
Thank you very much

On Mon, Sep 12, 2011 at 6:07 PM, Jaeger, Jay - DOT <jay.jae...@dot.wi.gov> wrote:

> Some possibilities:
>
> 1) Put the file extension into your index (that is what we did when we were
> testing indexing documents with Solr)
> 2) Put a mime type for the document into your index.
> 3) Put the whole file name / URL into your index, and match on part of the
> name. This will give some false positives.
>
> JRJ
>
> -----Original Message-----
> From: ahmad ajiloo [mailto:ahmad.aji...@gmail.com]
> Sent: Monday, September 12, 2011 5:58 AM
> To: solr-user@lucene.apache.org
> Subject: Fwd: How to serach on specific file types ?
>
> Hello
> I want to search on articles. So I need to find only specific files like doc,
> docx, and pdf.
> I don't need any html pages. Thus the result of our search should only
> consist of doc, docx, and pdf files.
> Can you help me?
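
As a rough illustration of possibility 1 above, here is a minimal Python sketch, not a definitive implementation. It derives a file extension from a document URL and builds Solr query parameters that restrict results to doc, docx, and pdf using the standard `fq` filter parameter. The field name `extension` and both helper functions are hypothetical: the field would have to be declared in the Solr schema and filled in at indexing time (for example by a custom Nutch indexing filter), which this sketch does not cover.

```python
from posixpath import splitext
from urllib.parse import urlencode, urlsplit


def file_extension(url):
    """Return the lowercase extension of the URL path without the dot ('' if none)."""
    # Only the path is inspected, so query strings and fragments are ignored.
    path = urlsplit(url).path
    return splitext(path)[1].lstrip(".").lower()


def build_select_params(user_query, extensions, field="extension"):
    """Build URL-encoded parameters for a Solr select request.

    `field` is a hypothetical schema field holding the file extension.
    The filter query (fq) narrows the result set without changing scoring.
    """
    fq = "{}:({})".format(field, " OR ".join(sorted(extensions)))
    return urlencode({"q": user_query, "fq": fq, "wt": "json"})


if __name__ == "__main__":
    print(file_extension("http://example.com/papers/Thesis.PDF?download=1"))
    print(build_select_params("article", ["doc", "docx", "pdf"]))
```

The resulting parameter string would be appended to the select URL of your own Solr instance. The same `fq` pattern should work for possibility 2 if a MIME type field is indexed instead of the extension; values containing characters such as `/` would need quoting in the filter.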