Re: Which is a good XPath generator?

2010-07-25 Thread Geert-Jan Brits
I am assuming (like Li, I think) that you want to induce a structure/schema from an HTML example so you can use that schema to extract data from similar HTML-structured pages. Another term often used in the literature for that is "Wrapper Induction". Besides the DOM, using CSS classes often gives good disti
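To make the idea in this message concrete, here is a minimal sketch of extracting records from similarly structured pages with XPath expressions keyed on CSS classes. The sample markup, the class names (`product`, `name`, `price`), and the helper `extract_products` are all hypothetical and chosen only for illustration. The sketch uses the limited XPath subset in Python's standard `xml.etree.ElementTree`, so it assumes well-formed markup; real-world HTML would usually need a more tolerant parser first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from the thread).
SAMPLE = """<html><body>
<div class="product"><span class="name">Widget</span><span class="price">9.99</span></div>
<div class="product"><span class="name">Gadget</span><span class="price">19.50</span></div>
</body></html>"""


def extract_products(page):
    """Return (name, price) tuples for every div whose class is 'product'."""
    root = ET.fromstring(page)
    rows = []
    # The class attribute, rather than the element's position in the DOM,
    # is what distinguishes the records of interest.
    for node in root.findall(".//div[@class='product']"):
        name = node.find("span[@class='name']")
        price = node.find("span[@class='price']")
        rows.append((name.text, float(price.text)))
    return rows


if __name__ == "__main__":
    for name, price in extract_products(SAMPLE):
        print(name, price)
```

The same expressions can be reused on any page that follows the same class conventions, which is the sense in which they act as a simple "wrapper".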

Solr 4.0 and lucene-analyzers

2010-07-25 Thread Pavel Minchenkov
Hi, If I generate the Solr Maven artifacts from trunk, they have a dependency on lucene-analyzers:4.0-dev, which can't be resolved. Maybe I'm doing something wrong? Thanks. -- Pavel Minchenkov

Re: Novice seeking help to change filters to search without diacritics

2010-07-25 Thread Erick Erickson
Use copyField in your schema file. The copyField destination takes its own analyzer, so the original can fold diacritics and the copy need not. dismax might help you at query time on this... HTH Erick On Sat, Jul 24, 2010 at 11:40 PM, HSingh wrote: > > > : Usually people set up two fields, one with diacritics and one
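The two-field approach described in this message can be sketched outside Solr as follows. This is only an illustration of what "folding" means, not how a Solr analyzer is implemented: the function `fold_diacritics`, the helper `make_document`, and the field names `name_exact` and `name_folded` are hypothetical.

```python
import unicodedata


def fold_diacritics(text):
    """Strip combining marks so that, for example, 'café' becomes 'cafe'."""
    # NFD splits an accented character into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_document(name):
    """Build a document holding both the original and the folded form."""
    return {
        "name_exact": name,                    # keeps diacritics
        "name_folded": fold_diacritics(name),  # diacritics removed
    }


if __name__ == "__main__":
    print(make_document("crème brûlée"))
```

A query could then be run against the folded field to match regardless of diacritics, or against the exact field when they matter.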

Re: filter query on timestamp slowing query???

2010-07-25 Thread oferiko
britske wrote: > > just wanted to mention a possible other route, which might be entirely > hypothetical :-) > > *If* you could query on internal docid (I'm not sure that it's available > out-of-the-box, or if you can at all) > your original problem, quoted below, could imo be simplified to ask

RE: filter query on timestamp slowing query???

2010-07-25 Thread Jonathan Rochkind
britske wrote: >> *If* you could query on internal docid (I'm not sure that it's available >> out-of-the-box, or if you can at all) >> your original problem, quoted below, could imo be simplified to asking for >> the last docid inserted (that match the other criteria from your use-case) >> and in t

how to Protect data

2010-07-25 Thread Girish Pandit
Hi, I was asked about protecting data. The search index data is stored in index files, and when I open those files I can clearly see the data, i.e. text such as name, address, postal code, etc. Is there any way I can hide the data? I mean some kind of da

Re: a bug of solr distributed search

2010-07-25 Thread Li Li
where is the link of this patch? 2010/7/24 Yonik Seeley : > On Fri, Jul 23, 2010 at 2:23 PM, MitchK wrote: >> why do we do not send the output of TermsComponent of every node in the >> cluster to a Hadoop instance? >> Since TermsComponent does the map-part of the map-reduce concept, Hadoop >> onl

Re: a bug of solr distributed search

2010-07-25 Thread Li Li
The Solr version I used is 1.4. 2010/7/26 Li Li : > where is the link of this patch? > > 2010/7/24 Yonik Seeley : >> On Fri, Jul 23, 2010 at 2:23 PM, MitchK wrote: >>> why do we do not send the output of TermsComponent of every node in the >>> cluster to a Hadoop instance? >>> Since TermsComponent

"SELECT" on a Rich Document to download/display content

2010-07-25 Thread Girish Pandit
Hi, I indexed a Word document. When I do a select, it shows the file name. How can I display the content? Also, if I add "hl=true", is this going to show me the line with the highlight from the Word document? I am using the URL below to do the select: http://localhost:8983/solr/select/?q=Management it sho
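As a rough sketch of the request discussed in this message, the following builds a select URL with highlighting switched on. `hl` and `hl.fl` are standard Solr highlighting parameters; the field name `content` and the helper `build_select_url` are assumptions made for illustration, and highlighting only returns snippets for fields that are stored in the index.

```python
from urllib.parse import urlencode


def build_select_url(base, query, highlight_field):
    """Return a select URL that asks Solr to highlight matches in one field."""
    params = {
        "q": query,
        "hl": "true",              # turn highlighting on
        "hl.fl": highlight_field,  # field to take snippets from (assumed name)
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Base URL taken from the message; "content" is a hypothetical field name.
    print(build_select_url("http://localhost:8983/solr/select/", "Management", "content"))
```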

Re: a bug of solr distributed search

2010-07-25 Thread MitchK
Good morning, https://issues.apache.org/jira/browse/SOLR-1632 - Mitch Li Li wrote: > > where is the link of this patch? > > 2010/7/24 Yonik Seeley : >> On Fri, Jul 23, 2010 at 2:23 PM, MitchK wrote: >>> why do we do not send the output of TermsComponent of every node in the >>> cluster to a

question about relevance

2010-07-25 Thread Bharat Jain
Hello All, I have an index which stores multiple objects belonging to a user, e.g. -> Identifies user object type e.g. userBasic or userAdv > MAPS to userBasicInfoObject -> MAPS to userAdvInfoObject Now when I am doing some query I get multipl