Re: Is relevance score related to position of the term?

2011-01-27 Thread Em
Hi, no, you misunderstood me; I only said that Solr does not *usually* care about positions. Lucene has SpanNearQuery, which considers the positions of the query's terms relative to each other. Furthermore, there is a SpanFirstQuery, which boosts occurrences of a term at the beginning of a speci

Re: Solr for noSQL

2011-01-27 Thread Gora Mohanty
On Fri, Jan 28, 2011 at 6:00 AM, Jianbin Dai wrote: [...] > Do we have data import handler to fast read in data from noSQL database, > specifically, MongoDB I am thinking to use? [...] Have you tried the links that a Google search turns up? Some of them look like pretty good prospects. Regards,

Re: NOT operator not working

2011-01-27 Thread Ahmet Arslan
--- On Fri, 1/28/11, abhayd wrote: > From: abhayd > Subject: NOT operator not working > To: solr-user@lucene.apache.org > Date: Friday, January 28, 2011, 8:45 AM > > i have a field in xml file Accessory Data > / Memory > solr schema field declared as    name="deviceType" type="text" > indexe

NOT operator not working

2011-01-27 Thread abhayd
i have a field in xml file Accessory Data / Memory solr schema field declared as I am trying to eliminate results by using NOT. For example I want all devices for a term except where DeviceType is not Accessory* SO here is what i m trying /solr/select?indent=on&version=2.2&q=(sharp+AND+-devi

Re: Solr for noSQL

2011-01-27 Thread Dai Jianbin 00901725
Do we have any performance measurements? Would it be much slower compared to other DIH? > There are no special connectors available to read from the key-value > stores like memcache/cassandra/mongodb. You would have to get a Java > client library for the DB and code your own dataimporthandler > datasourc

Re: Solr for noSQL

2011-01-27 Thread Dennis Gearon
Why not make one's own DIH handler, Lance? Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/

Re: DismaxParser Query

2011-01-27 Thread Isan Fulia
Hi all, I am currently using Solr 1.4.1. Do I need to apply a patch for the extended dismax parser? On 28 January 2011 03:42, Erick Erickson wrote: > In general, patches are applied to the source tree and it's re-compiled. > See: http://wiki.apache.org/solr/HowToContribute#Working_With_Patches > > Thi

Re: SolrCloud Questions for MultiCore Setup

2011-01-27 Thread Lance Norskog
Hello- I have not used SolrCloud. On 1/27/11, Em wrote: > > Hi, > > excuse me for pushing this for a second time, but I can't figure it out by > looking at the source code... > > Thanks! > > > >> Hi Lance, >> >> thanks for your explanation. >> >> As far as I know in distributed search i have to

Re: Solr for noSQL

2011-01-27 Thread Lance Norskog
There are no special connectors available to read from the key-value stores like memcache/cassandra/mongodb. You would have to get a Java client library for the DB and code your own dataimporthandler datasource. I cannot recommend this; you should make your own program to read data and upload to Solr

Re: Tika config in ExtractingRequestHandler

2011-01-27 Thread Lance Norskog
The tika.config file is obsolete. I don't know what replaces it. On 1/27/11, Erlend Garåsen wrote: > > If this configuration file is the same as the tika-mimetypes.xml file > inside Nutch' conf file, I have an example. > > I was trying to implement language detection for Solr and thought I had >

Re: configure httpclient to access solr with user credential on third party host

2011-01-27 Thread Jayendra Patil
This should help HttpClient client = new HttpClient(); client.getParams().setAuthenticationPreemptive(true); AuthScope scope = new AuthScope(AuthScope.ANY_HOST,AuthScope.ANY_PORT); client.getState().setCredentials(scope, new UsernamePasswordCredentials(user, password)); Regards, Jayendra On

Re: Searching for negative numbers very slow

2011-01-27 Thread Simon Wistow
On Thu, Jan 27, 2011 at 11:32:26PM +, me said: > If I do > > qt=dismax > fq=uid:1 > > (or any other positive number) then queries are as quick as normal - in > the 20ms range. For what it's worth uid is a TrieIntField with precisionStep=0, omitNorms=true, positionIncrementGap=0

Solr for noSQL

2011-01-27 Thread Jianbin Dai
Hi, Is there a data import handler that can quickly read data from a NoSQL database, specifically MongoDB, which I am thinking of using? Or, more generally, how does Solr work with NoSQL databases? Thanks. Jianbin

Re: DIH clean=false

2011-01-27 Thread Chris Hostetter
: Then for clean=false, my understanding is that it won't blow off existing : index. For data that exist in index and db table (by the same uniqueKey) : it will update the index data regardless if there is actual field update. : For existing index data but not existing in table (by comparing un

Re: Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Chris Hostetter
: Subject: Import Handler for tokenizing facet string into multi-valued : solr.StrField.. : In-Reply-To: <1296123345064-2361292.p...@n3.nabble.com> : References: <1296123345064-2361292.p...@n3.nabble.com> -Hoss

Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene

2011-01-27 Thread Simon Wistow
On Tue, Jan 25, 2011 at 01:28:16PM +0100, Markus Jelsma said: > Are you sure you need CMS incremental mode? It's only advised when running on > a machine with one or two processors. If you have more you should consider > disabling the incremental flags. I'll test again but we added those to get b

Searching for negative numbers very slow

2011-01-27 Thread Simon Wistow
If I do qt=dismax fq=uid:1 (or any other positive number) then queries are as quick as normal - in the 20ms range. However, any of fq=uid:\-1 or fq=uid:[* TO -1] or fq=uid:[-1 to -1] or fq=-uid:[0 TO *] then queries are incredibly slow - in the 9 *s
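
When timing the filter variants above it helps to send them URL-encoded exactly, since the backslash and brackets are easy to mangle by hand. A minimal sketch, assuming a Solr instance at a hypothetical `http://localhost:8983/solr/select` and a placeholder main query `example` (neither appears in the original message):

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Filter queries taken from the message: one positive value and three negative forms.
FILTERS = ["uid:1", r"uid:\-1", "uid:[* TO -1]", "-uid:[0 TO *]"]


def select_url(fq, base="http://localhost:8983/solr/select"):
    """Build a select URL for one filter query.

    The base URL and the main query "example" are assumptions for illustration.
    """
    params = [("q", "example"), ("qt", "dismax"), ("fq", fq), ("debugQuery", "on")]
    return base + "?" + urlencode(params)


def fq_of(url):
    """Decode the fq parameter again, to confirm nothing was lost in encoding."""
    return parse_qs(urlsplit(url).query)["fq"][0]


URLS = [select_url(fq) for fq in FILTERS]
```

Fetching each URL in turn and comparing the reported QTime values would show whether only the negative forms are slow; `debugQuery=on` is included so the parsed filter can be inspected in the response.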

Re: Is relevance score related to position of the term?

2011-01-27 Thread cyang2010
Just a little clarification: when I say position of the term, I mean the position of the term within the field. For example, "Jamie Lee" -- Lee is in the second position of the name field. "Lee Jamie" -- Lee is in the first position of the name field in this case. -- View this message in context

Re: Is relevance score related to position of the term?

2011-01-27 Thread cyang2010
Hi Em, Thanks for the reply. Basically you are saying there is no built-in solution that uses the position of the term to influence the relevancy score. In my scenario, I will get those two documents with the same score. The order depends on the sequence of indexing. Thanks, Cyang -- View t

Re: DismaxParser Query

2011-01-27 Thread Erick Erickson
In general, patches are applied to the source tree and it's re-compiled. See: http://wiki.apache.org/solr/HowToContribute#Working_With_Patches This is pretty easy, and I do know that "some people" have applied the eDismax patch to the 1.4 code line, but I haven't done it myself. Best Erick On Th

Re: configure httpclient to access solr with user credential on third party host

2011-01-27 Thread Darniz
Thanks, exactly. I asked my domain hosting provider and he provided me with a different port. I am wondering whether I can specify credentials without the port. I mean, when I open the browser and type www.mydomainmame/solr I get the Tomcat auth login screen. In the same way, can I configure the http clien

disappearing MBeans

2011-01-27 Thread matthew sporleder
I am using JMX to monitor my replication status and am finding that my MBeans are disappearing. I turned on debugging for JMX and found that solr seems to be deleting the mbeans. Is this a bug? Some trace info is below.. here's me reading the mbean successfully: Jan 27, 2011 5:00:02 PM ServerCo

Re: SolrCloud Questions for MultiCore Setup

2011-01-27 Thread Em
Hi, excuse me for pushing this for a second time, but I can't figure it out by looking at the source code... Thanks! > Hi Lance, > > thanks for your explanation. > > As far as I know in distributed search i have to tell Solr what other > shards it has to query. So, if I want to query a sp

Re: Is relevance score related to position of the term?

2011-01-27 Thread Em
Hi Cyang, usually Solr isn't looking at the position of a term. However, there are solutions out there for considering the term's position when calculating a doc's score. Furthermore: If two docs got the same score, I think they are ordered the way they were found in the index. Does this answer

Is relevance score related to position of the term?

2011-01-27 Thread cyang2010
Let me describe the question using an example: if I search for "Lee" on the name field as an exact term match, the returned results can be: Lee Jamie, Jamie Lee. Will Solr grant a higher score to "Lee Jamie" than to "Jamie Lee" based on the position of the term in the name field of each document? From what I know, the sc

EmbeddedSolr issues

2011-01-27 Thread Karthik Manimaran
Hi, Am getting the following messages while using EmbeddedSolr to retrieve the Term Vectors. I also happened to go through https://issues.apache.org/jira/browse/SOLR-914 . Should I ignore these messages and proceed or should I make any changes? [#|2011-01-27T11:56:34.593-0500|INFO|glassfish3

Re: Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Dennis Schafroth
Thanks for the hints! Sorry about stealing the thread "query range in multivalued date field" Mistakenly responded to it. cheers, :-Dennis On 27/01/2011, at 16.48, Erik Hatcher wrote: > Beyond what Erick said, I'll add that it is often better to "do this from the > outside" and send in mul

Re: Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Erik Hatcher
Beyond what Erick said, I'll add that it is often better to "do this from the outside" and send in multiple actual end-user displayable facet values. When you send in a field like "Water -- Irrigation ; Water -- Sewage", that is what will get stored (if you have it set to stored), but what you

Re: Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Erick Erickson
Tokenization is fine with facets, that caution is about, say, faceting on the tokenized body of a document where you have potentially a huge number of unique tokens. But if there is a controlled number of distinct values, you shouldn't have to do anything except index to a tokenized field. I'd rem

Re: Tika config in ExtractingRequestHandler

2011-01-27 Thread Erlend Garåsen
If this configuration file is the same as the tika-mimetypes.xml file inside Nutch' conf file, I have an example. I was trying to implement language detection for Solr and thought I had to invoke some Tika functionality by this configuration file in order to do so, but found out that I could

RE: DismaxParser Query

2011-01-27 Thread Jonathan Rochkind
Yes, I think nested queries are the only way to do that, and yes, nested queries like Daniel's example work (I've done it myself). I haven't really tried to get into understanding/demonstrating _exactly_ how the relevance ends up working on the overall master query in such a situation, but it s

Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Stefan Matheis
Simo, it's freenode.net On Thu, Jan 27, 2011 at 4:16 PM, Simone Tripodi wrote: > Hi Paul, > sorry I'm late but I've been in the middle of a conf call :( On which > IRC server the #solr channel is? I'll reach you ASAP. > Thanks a lot! > Simo > > http://people.apache.org/~simonetripodi/ > http://ww

Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Simone Tripodi
Hi Paul, sorry I'm late but I've been in the middle of a conf call :( Which IRC server is the #solr channel on? I'll reach you ASAP. Thanks a lot! Simo http://people.apache.org/~simonetripodi/ http://www.99soft.org/ On Thu, Jan 27, 2011 at 4:00 PM, Paul Libbrecht wrote: > > On 27 Jan 2011, at

Re: Tika config in ExtractingRequestHandler

2011-01-27 Thread Adam Estrada
I believe that as long as Tika is included in a folder that is referenced by solrconfig.xml you should be good. Solr will automatically throw mime types to Tika for parsing. Can anyone else add to this? Thanks, Adam On Thu, Jan 27, 2011 at 5:06 AM, Erlend Garåsen wrote: > > The wiki page for th

Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Paul Libbrecht
On 27 Jan 2011, at 12:42, Simone Tripodi wrote: > thanks a lot for your feedbacks, much more than appreciated! :) One more anomaly I found: the license is in the output of the pom.xml. I think this should not be the case. *My* license should be there, not the license of the archetype. Or? paul

Re: Question About Writing Custom Query Parser Plugin

2011-01-27 Thread Erik Hatcher
Yes, you need to create both a QParserPlugin and a QParser implementation. Look at Solr's own source code for the LuceneQParserPlugin/LuceneQParser and build it like that. Baking the surround query parser into Solr out of the box would be a useful contribution, so if you care to give it a litt

Re: Question About Writing Custom Query Parser Plugin

2011-01-27 Thread Ahsan |qbal
Anyone? On Thu, Jan 27, 2011 at 1:27 PM, Ahson Iqbal wrote: > Hi All > > I want to integrate lucene Surround Query Parser with solr 1.4.1, and for > that I > am writing Custom Query Parser Plugin, To accomplish this task I should > write a > sub class of "org.apache.solr.search.QParserPlugin" an

Detect Out of Memory Errors

2011-01-27 Thread saureen
Hi, is there a way to detect out-of-memory errors in Solr so that I could implement some functionality such as restarting Tomcat or alerting me via email whenever such an error is detected? -- View this message in context: http://lucene.472066.n3.nabble.com/Detect-Out-of-Memory-

AW: DismaxParser Query

2011-01-27 Thread Daniel Pötzinger
It may also be an option to mix the query parsers. Something like this (not tested): q={!lucene}field1:test OR field2:test2 _query_:{!dismax qf=fields}+my dismax -bad So you have the benefits of both the lucene and dismax parsers. -Original Message- From: Erick Erickson [mailto:erickerick...
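
The mixed-parser query above can be assembled programmatically. This is a sketch under assumptions, not a tested recipe: it wraps the dismax part in double quotes after `_query_:`, which is the form the Solr nested-query documentation uses (the message shows it without quotes), and the helper name `nested_query` is made up for illustration.

```python
from urllib.parse import urlencode


def nested_query(lucene_part, dismax_part, qf):
    """Combine a Lucene-syntax clause with a nested dismax clause.

    Assumes qf is a single field name; a multi-field qf would need its
    own quoting inside the local params.
    """
    # Escape backslashes and double quotes so the dismax text can sit
    # safely inside the quoted _query_ value.
    escaped = dismax_part.replace("\\", "\\\\").replace('"', '\\"')
    return '{!lucene}%s _query_:"{!dismax qf=%s}%s"' % (lucene_part, qf, escaped)


q = nested_query("field1:test OR field2:test2", "+my dismax -bad", "fields")
query_string = urlencode({"q": q})
```

Sending `query_string` to the select handler with `debugQuery=on` would show how each parser interpreted its part.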

Re: DismaxParser Query

2011-01-27 Thread Erick Erickson
What version of Solr are you using, and could you consider either 3x or applying a patch to 1.4.1? Because eDismax (extended dismax) handles the full Lucene query language and probably works here. See the Solr JIRA 1553 at https://issues.apache.org/jira/browse/SOLR-1553 Best Erick On Thu, Jan 27,

Re: How to find Master & Slave are in sync

2011-01-27 Thread Erick Erickson
Let's back up a moment and ask why you are doing this from scripts, because this feels like an XY problem, see: http://people.apache.org/~hossman/#xyproblem What are you trying to accomplish by swapping cores on the master and slave? Solr 1.4 has conf

Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Dennis Schafroth
Hi, I'm pretty new to Solr coding, but I'm looking for hints about how (if not already done) to implement a PatternTokenizer that would index this into multivalued fields of solr.StrField for faceting. Ex. Water -- Irrigation ; Water -- Sewage should be tokenized into Water Irrigation Sewage
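
One way to get the values described above is to split the string on the client before indexing and send each piece as a separate value of the multivalued field. A minimal sketch, assuming the only separators are `--` and `;` and that repeated values (such as the second "Water") should be dropped; the function name is hypothetical:

```python
import re


def facet_values(raw):
    """Split a facet string on ' -- ' and ' ; ' and drop repeated values,
    keeping the order in which values first appear."""
    seen = set()
    values = []
    for token in re.split(r"\s*(?:--|;)\s*", raw.strip()):
        if token and token not in seen:
            seen.add(token)
            values.append(token)
    return values


values = facet_values("Water -- Irrigation ; Water -- Sewage")
# values == ["Water", "Irrigation", "Sewage"]
```

Each element of `values` would then be sent as its own value for the solr.StrField, so the stored and faceted values are the same displayable strings.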

Re: DismaxParser Query

2011-01-27 Thread lee carroll
With dismax you get to say things like: match all terms if fewer than 3 terms are entered, else match term-x. It produces highly flexible and relevant matches and works very well in lots of common search use cases. Field boosting allows further tuning. If you have rigid rules like the last one you quote i

Re: Post PDF to solr with asp.net

2011-01-27 Thread Gora Mohanty
On Thu, Jan 27, 2011 at 3:44 PM, Andrew McCombe wrote: > Hi > > We are trying to post some PDF documents to solr for indexing using ASP.net > but cannot find any documentation or a library that will allow posting of > binary data. [...] Do not have much idea of ASP.net, but SolrNet ( http://code.

Re: How to find Master & Slave are in sync

2011-01-27 Thread Shanmugavel SRD
Markus, The problem here is if I call the below two URLs immediately after replication then I am getting both the index versions as same. In my python script I have added code to swap the online core on master with offline core on master and online core on slave with offline core on slave, if bo

Re: DismaxParser Query

2011-01-27 Thread Isan Fulia
It worked by making mm=0 (it acted as OR operator) but how to handle this field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) On 27 January 2011 17:06, lee carr

Re: query range in multivalued date field

2011-01-27 Thread Erick Erickson
Range queries work on multivalued fields. I suspect the date math conversion is fooling you. For instance, NOW/HOUR first rounds down to the current hour, *then* subtracts one hour. If you attach &debugQuery=on (or check the debug checkbox in the admin full search page), you'll see the exact result
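
The round-then-subtract order can be checked with a toy evaluator. This is a simplified emulation for illustration only, not Solr's own date math parser: it handles just the HOUR and DAY units, and the expression `NOW/HOUR-1HOUR` is an assumed example of the kind of date math being discussed.

```python
import re
from datetime import datetime, timedelta

UNITS = {"HOUR": timedelta(hours=1), "DAY": timedelta(days=1)}


def round_down(dt, unit):
    """Truncate dt to the start of the given unit."""
    if unit == "HOUR":
        return dt.replace(minute=0, second=0, microsecond=0)
    if unit == "DAY":
        return dt.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError("unsupported unit: " + unit)


def date_math(expr, now):
    """Apply the operations after NOW strictly left to right."""
    if not expr.startswith("NOW"):
        raise ValueError("expression must start with NOW")
    result = now
    for op, count, unit in re.findall(r"([/+-])(\d*)([A-Z]+)", expr[3:]):
        if op == "/":
            result = round_down(result, unit)
        else:
            delta = UNITS[unit] * int(count)
            result = result + delta if op == "+" else result - delta
    return result


now = datetime(2011, 1, 27, 17, 42, 10)
rounded_then_shifted = date_math("NOW/HOUR-1HOUR", now)  # 2011-01-27 16:00:00
shifted_only = date_math("NOW-1HOUR", now)               # 2011-01-27 16:42:10
```

Comparing the two results shows why a document stamped between 16:00 and 16:42 matches a range starting at the first expression but not one starting at the second.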

Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Paul Libbrecht
On 27 Jan 2011, at 12:42, Simone Tripodi wrote: > thanks a lot for your feedbacks, much more than appreciated! :) Good time sync. I need it right now. > * Yes it also packs a Solr webapp, it is needed to embed it in > Tomcat. Do you think it could be a useful feature having also webapp > .war

Re: configure httpclient to access solr with user credential on third party host

2011-01-27 Thread Upayavira
Looks like you are connecting to Tomcat's AJP port, not the HTTP one. Connect to the Tomcat HTTP port and I suspect you'll have greater success. Upayavira On Wed, 26 Jan 2011 22:45 -0800, "Darniz" wrote: > > Hello, > i uploaded solr.war file on my hosting provider and added security > constrain

Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Simone Tripodi
Hi Paul, thanks a lot for your feedback, much more than appreciated! :) Going through your comments: * Yes, it also packs a Solr webapp; it is needed to embed it in Tomcat. Do you think it would be a useful feature to also have a webapp .war as output? If it helps, I'm open to adding it as well. * s

Re: DIH and duplicate content

2011-01-27 Thread Markus Jelsma
http://wiki.apache.org/solr/Deduplication On Thursday 27 January 2011 12:32:29 Rosa (Anuncios) wrote: > Is there a way to avoid duplicate content in a index at the moment i'm > uploading my xml feed via DIH? > > I would like to have only one entry for a given description. I mean if > the desci

Re: DismaxParser Query

2011-01-27 Thread lee carroll
sorry ignore that - we are on dismax here - look at mm param in the docs you can set this to achieve what you need On 27 January 2011 11:34, lee carroll wrote: > the default operation can be set in your config to be "or" or on the query > something like q.op=OR > > > > On 27 January 2011 11:26,

Re: DismaxParser Query

2011-01-27 Thread Bijeet Singh
The DisMax query parser internally hard-codes its operator to OR. This is quite unlike the Lucene query parser, for which the default operator can be configured using the solrQueryParser in schema.xml Regards, Bijeet Singh On Thu, Jan 27, 2011 at 4:56 PM, Isan Fulia wrote: > but q="keyword1 key

Re: DismaxParser Query

2011-01-27 Thread lee carroll
the default operation can be set in your config to be "or" or on the query something like q.op=OR On 27 January 2011 11:26, Isan Fulia wrote: > but q="keyword1 keyword2" does AND operation not OR > > On 27 January 2011 16:22, lee carroll > wrote: > > > use dismax q for first three fields an

DIH and duplicate content

2011-01-27 Thread Rosa (Anuncios)
Hi, Is there a way to avoid duplicate content in an index at the moment I'm uploading my XML feed via DIH? I would like to have only one entry for a given description. I mean, if the description of a product already exists in the index, do not import the new product. Is there a built-in function? O

Re: DismaxParser Query

2011-01-27 Thread Isan Fulia
but q="keyword1 keyword2" does AND operation not OR On 27 January 2011 16:22, lee carroll wrote: > use dismax q for first three fields and a filter query for the 4th and 5th > fields > so > q="keyword1 keyword 2" > qf = field1,feild2,field3 > pf = field1,feild2,field3 > mm=something sensible f

Re: DismaxParser Query

2011-01-27 Thread lee carroll
Use dismax q for the first three fields and a filter query for the 4th and 5th fields, so: q="keyword1 keyword2" qf = field1,field2,field3 pf = field1,field2,field3 mm=something sensible for you defType=dismax fq="field4:(keyword3 OR keyword4) AND field5:(keyword5)" Take a look at the dismax docs for
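
The parameter set suggested above could be put together like this. A minimal sketch with assumptions stated: the helper name is hypothetical, the `mm` value `"1"` is only a placeholder for whatever is sensible for your data, and the field lists are joined with spaces because the dismax documentation describes `qf` and `pf` as whitespace-separated.

```python
from urllib.parse import urlencode


def dismax_params(keywords, query_fields, mm, filter_query):
    """Return request parameters for a dismax query plus a filter query."""
    fields = " ".join(query_fields)
    return {
        "defType": "dismax",
        "q": " ".join(keywords),
        "qf": fields,        # fields searched for each keyword
        "pf": fields,        # fields used for phrase boosting
        "mm": mm,            # minimum-should-match; placeholder value below
        "fq": filter_query,  # parsed by the default (Lucene) query parser
    }


params = dismax_params(
    ["keyword1", "keyword2"],
    ["field1", "field2", "field3"],
    "1",
    "field4:(keyword3 OR keyword4) AND field5:(keyword5)",
)
query_string = urlencode(params)
```

Keeping the rigid boolean part in `fq` means it restricts the result set without affecting the relevance score computed by dismax.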

query range in multivalued date field

2011-01-27 Thread ramzesua
Hi all. My range query on a multivalued date field works incorrectly. My schema has a field "requestDate" with the multiValued attribute: Some data from the index: 2.0 sale 11 sale 2011-01-26T08:18:35Z 2011-01-27T01:31:28Z 3.0 coldpop 111

Post PDF to solr with asp.net

2011-01-27 Thread Andrew McCombe
Hi We are trying to post some PDF documents to solr for indexing using ASP.net but cannot find any documentation or a library that will allow posting of binary data. Has anyone done this and if so, how? Regards Andrew McCombe iWeb Solutions Ltd.

Tika config in ExtractingRequestHandler

2011-01-27 Thread Erlend Garåsen
The wiki page for the ExtractingRequestHandler says that I can add the following configuration: /my/path/to/tika.config I have tried to google for an example of such a Tika config file, but haven't found anything. Erlend -- Erlend Garåsen Center for Information Technology Services Universi

Re: Does solr supports indexing of files other than UTF-8

2011-01-27 Thread Paul Libbrecht
At least in Java, UTF-8 transcoding is done on a stream basis. No issue there. paul On 27 Jan 2011, at 09:51, prasad deshpande wrote: > The size of docs can be huge, like suppose there are 800MB pdf file to index > it I need to translate it in UTF-8 and then send this file to index. Now > sup

DismaxParser Query

2011-01-27 Thread Isan Fulia
Hi all, The query for the standard request handler is as follows: field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2) OR field3:(keyword1 OR keyword2) AND field4:(keyword3 OR keyword4) AND field5:(keyword5) How can the same query be written for the dismax request handler? -- Thanks & Reg

Re: Does solr supports indexing of files other than UTF-8

2011-01-27 Thread prasad deshpande
The docs can be huge. Suppose there is an 800MB PDF file to index: I need to convert it to UTF-8 and then send it for indexing. Now suppose any number of clients can upload files at the same time; that will affect performance. And our product already supports localization

Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Paul Libbrecht
Simone, It's good that you did so! I had found this three days ago while googling. And I am starting to make sense of it. It works well. Two little comments: - you are saying that it packages a standalone multicore and a standalone app. But it actually also packs a webapp. At first, I had rej

Re: Does solr supports indexing of files other than UTF-8

2011-01-27 Thread Paul Libbrecht
Why is converting documents to UTF-8 not feasible? Nowadays any platform offers such services. Can you give a detailed failure description (maybe with the URL to a sample document you post)? paul On 27 Jan 2011, at 07:31, prasad deshpande wrote: > I am able to successfully index/search non-

Question About Writing Custom Query Parser Plugin

2011-01-27 Thread Ahson Iqbal
Hi All I want to integrate lucene Surround Query Parser with solr 1.4.1, and for that I am writing Custom Query Parser Plugin, To accomplish this task I should write a sub class of "org.apache.solr.search.QParserPlugin" and implement its two methods public void init(NamedList nl) public QPar

Re: How to group result when search on multiple fields

2011-01-27 Thread Stefan Matheis
On Thu, Jan 27, 2011 at 1:25 AM, cyang2010 wrote: > > Is "Field Collapsing" a new feature for solr 4.0 (not yet released yet)? > > That's at least what the Wiki tells you, yes.