Problem with org.apache.solr.handler.component.SearchHandler

2010-09-15 Thread Michał Flasiński
Hi, When I use version 1.4, I get an exception: ERROR [SolrCore] java.lang.NullPointerException     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)     at org.ap

Apache Hadoop Get Together Berlin October 2010 - this time with a huge Mahout focus

2010-09-15 Thread Isabel Drost
Hello, this is to announce the next Apache Hadoop Get Together sponsored by JTeam (http://www.jteam.nl) that will take place in newthinking store in Berlin. When: October 7th, 5p.m. Where: Newthinking store Berlin As always there will be slots of 30min each for talks on your Hadoop topic. After

Re: Geographic clustering

2010-09-15 Thread Joe Chesak
Charlie, I hear you! I'm looking for that same functionality. This problem is bigger than it looks. Your single-dimension example is a good starting point. It makes sense that when the user asks for all widgets priced between $0 and $100 he gets that information in facets. You have a couple

Re: How to install DuplicatesDetectorService

2010-09-15 Thread hellboy
OK. I need to find/prevent duplicates in the database using the Solr index. I use Django with Haystack integration. I use TextProfileSignature to smart-detect duplicates in text fields. solrconfig.xml wrote: > > > class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory"> >

Re: Geographic clustering

2010-09-15 Thread gwk
Hi Charlie, I think I understand what you mean, I had a similar requirement and this is what we made: http://www.mysecondhome.co.uk/search.html?view=map It allows full faceting on all fields the site allows in normal list search. Some information about my implementation is in my original t

Re: about SolrCloud

2010-09-15 Thread 郭芸
Hi all, I have found SolrCloud's webapp. It is in src/webapp/web. 2010-09-15 郭芸 From: 郭芸 Sent: 2010-09-15 10:33:49 To: User Solr Cc: Subject: about SolrCloud Dear All: I am studying SolrCloud now. I downloaded it from: https://svn.apache.org/repos/asf/lucene/solr/branches/cloud/ but I fo

How will solr behave if data importing is called while another importing operation is still ongoing?

2010-09-15 Thread yklxmas
Hello everyone, I've just started using Solr for one of my projects. I wonder if anyone could give me some advice on the approach we're taking. Basically we have a file system that has many XML files to be indexed by Solr. However, users might make changes to the files by using another editoria

Re: How will solr behave if data importing is called while another importing operation is still ongoing?

2010-09-15 Thread Gora Mohanty
On Wed, Sep 15, 2010 at 2:31 PM, yklxmas wrote: > > [...] > Basically we have a file system that have many xml files to be indexed by > solr. However, users might make changes to the files by using another > editorial system that will export xml to the file system. After xml is > exported, a call

Re: How to install DuplicatesDetectorService

2010-09-15 Thread hellboy
Is it possible to rewrite this code in Python: private static String getFuzzyHashing(MediaUnit unit) { TextProfileSignature tps = new TextProfileSignature(); // initialise with empty parameters to force default values of TextProfileSignature attributes

Re: How will solr behave if data importing is called while another importing operation is still ongoing?

2010-09-15 Thread yklxmas
Gora Mohanty-3 wrote: > > On Wed, Sep 15, 2010 at 2:31 PM, yklxmas wrote: >> >> [...] >> Basically we have a file system that have many xml files to be indexed by >> solr. However, users might make changes to the files by using another >> editorial system that will export xml to the file system

Re: How will solr behave if data importing is called while another importing operation is still ongoing?

2010-09-15 Thread Gora Mohanty
On Wed, Sep 15, 2010 at 4:21 PM, yklxmas wrote: [...] >> I'm using standard data import handler with file data source and xpath >> processor. so my script will be calling >> http://host:8983/solr/dataimport?command=full-import I am not sure if you are aware of this, but unless you are doing some
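
One way a calling script could avoid overlapping imports is to ask the data import handler for its status before issuing `full-import`. The following is a minimal sketch, not the poster's script: it assumes the handler answers `command=status` with an XML response containing a `status` string of `idle` or `busy`, and the sample responses are hand-written stand-ins rather than captured output. The check is not a lock, so two callers can still race.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Handler URL as quoted in the thread; "host" is a placeholder.
BASE = "http://host:8983/solr/dataimport"


def command_url(command):
    """Build the URL for a data import handler command."""
    return BASE + "?" + urlencode({"command": command})


def is_idle(status_xml):
    """Return True if a status response reports the handler as idle.

    Assumes the response carries <str name="status">idle|busy</str>.
    """
    root = ET.fromstring(status_xml)
    for node in root.iter("str"):
        if node.get("name") == "status":
            return (node.text or "").strip() == "idle"
    return False  # no status element found: treat as not safe to start


# Hand-written sample responses (assumed shape, not real server output).
SAMPLE_IDLE = '<response><str name="status">idle</str></response>'
SAMPLE_BUSY = '<response><str name="status">busy</str></response>'
```

A script would fetch `command_url("status")`, pass the body to `is_idle`, and only then request `command_url("full-import")`.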

Re: Geographic clustering

2010-09-15 Thread Charlie DeTar
Thanks Joe and gwk, You're both exactly on track, that's precisely what I'm looking for -- something like what MarkerCluster does, but where I can handle hundreds of thousands of documents and constrain by other facets and such. So I guess I'll look into reimplementing a component like gwk's. be

Re: Problem with org.apache.solr.handler.component.SearchHandler

2010-09-15 Thread Erick Erickson
What request did you submit when this happened? Because I don't think merely declaring the component matters unless you use it, so I doubt that'd make any difference... Best Erick 2010/9/15 Michał Flasiński > Hi, > When I use 1.4 version, I get exception: > > ERROR [SolrCore] java.lang.NullPoin

Re: How to install DuplicatesDetectorService

2010-09-15 Thread Erick Erickson
Have you looked at: http://wiki.apache.org/solr/Deduplication Best Erick On Wed, Sep 15, 2010 at 4:58 AM, hellboy wrote: > > Is there possible to rewrite this code to Python: > > private static String getFuzzyHashing(MediaUnit unit) { >

Color search for images

2010-09-15 Thread Shawn Heisey
My index consists of metadata for a collection of 45 million objects, most of which are digital images. The executives have fallen in love with Google's color image search. Here's a search for "flower" with a red color filter: http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc

Solr returning irrelevant results

2010-09-15 Thread Nguyen, Vincent (CDC/OSELS/PHITPO) (CTR)
Hi, I was running a query on the word "mining" and got results from documents that have nothing to do with mining. I got results with a score of 0.2997284 and less. It looks like Solr was querying the dsm.fulltext field for "mine" as well, which is ok except there were no "mine" words in the

Re: Solr returning irrelevant results

2010-09-15 Thread Yonik Seeley
On Wed, Sep 15, 2010 at 11:29 AM, Nguyen, Vincent (CDC/OSELS/PHITPO) (CTR) wrote: > I was running a query on the word "mining" and got results from > documents that have nothing to do with mining.  I got results with a > score of 0.2997284 and less.  It looks like Solr was querying the > dsm.fullt

RE: Solr returning irrelevant results

2010-09-15 Thread Nguyen, Vincent (CDC/OSELS/PHITPO) (CTR)
Sorry about that, I made it uppercase to emphasize it. The word was just "examined" Vincent Vu Nguyen Division of Science Quality and Translation Office of the Associate Director for Science Centers for Disease Control and Prevention (CDC) 404-498-6154 Century Bldg 2400 Atlanta, GA 30329

Re: Color search for images

2010-09-15 Thread Ken Krugler
On Sep 15, 2010, at 7:59am, Shawn Heisey wrote: My index consists of metadata for a collection of 45 million objects, most of which are digital images. The executives have fallen in love with Google's color image search. Here's a search for "flower" with a red color filter: http://www.

Boosting specific field value

2010-09-15 Thread Ravi Kiran
Hello, I am currently querying Solr for a "*primarysection*" which will return documents like - *q=primarysection:(Politics* OR Nation*)&fq=contenttype:("Blog" OR "Photo Gallery") pubdatetime:[NOW-3MONTHS TO NOW]*. Each document has several fields of which I am most interested in single val

Re: Color search for images

2010-09-15 Thread Paul Dlug
On Wed, Sep 15, 2010 at 12:41 PM, Ken Krugler wrote: > > On Sep 15, 2010, at 7:59am, Shawn Heisey wrote: > >> My index consists of metadata for a collection of 45 million objects, most >> of which are digital images.  The executives have fallen in love with >> Google's color image search.  Here's

Re: Color search for images

2010-09-15 Thread Shashi Kant
Shawn, I have done some research into this, machine-vision especially on a large scale is a hard problem, not to be entered into lightly. I would recommend starting with OpenCV - a comprehensive toolkit for extracting various features such as Color, Edge etc from images. Also there is a project LIR

Re: Color search for images

2010-09-15 Thread Shashi Kant
> > On a related note, I'm curious if anyone has run across a good set of > algorithms (or hopefully a library) for doing naive image > classification. I'm looking for something that can classify images > into something similar to the broad categories that Google image > search has (Face, Photo, Cl

RE: Solr returning irrelevant results

2010-09-15 Thread Nguyen, Vincent (CDC/OSELS/PHITPO) (CTR)
Actually, I think I found the issue. Some of the PDFs weren't OCR'ed very well and the text from the word "examined" was read as "~8 mined" Vincent Vu Nguyen Division of Science Quality and Translation Office of the Associate Director for Science Centers for Disease Control and Prevention (CDC)

Null Pointer Exception while indexing

2010-09-15 Thread andrewdps
What could be possible error for 14-Sep-10 4:28:47 PM org.apache.solr.common.SolrException log SEVERE: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(libgcj.so.90) at java.util.concurrent.FutureTask.get(libgcj.so.90)

Re: Null Pointer Exception while indexing

2010-09-15 Thread Yonik Seeley
On Wed, Sep 15, 2010 at 1:12 PM, andrewdps wrote: > > What could be possible error for > > 14-Sep-10 4:28:47 PM org.apache.solr.common.SolrException log > SEVERE: java.util.concurrent.ExecutionException: > java.lang.NullPointerException >   at java.util.concurrent.FutureTask$Sync.innerGet(libgcj.s

Re: Boosting specific field value

2010-09-15 Thread Erick Erickson
This seems like a simple query-time boost, although I may not be understanding your problem well. That is, q=source(bbc OR "associated press")^10 As for boosting more recent documents, see: http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents HTH Erick On We
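
A query-time boost of this kind can be assembled as below. This is a sketch under assumptions: the field is taken to be named `source`, and the usual `field:(...)` form is assumed to be what was meant, since the query as quoted has no colon after the field name.

```python
from urllib.parse import urlencode


def boosted_clause(field, values, boost):
    """Build a clause like field:(a OR "b c")^boost.

    Values containing a space are wrapped in double quotes so they
    are treated as phrases.
    """
    terms = ['"%s"' % v if " " in v else v for v in values]
    return "%s:(%s)^%d" % (field, " OR ".join(terms), boost)


clause = boosted_clause("source", ["bbc", "associated press"], 10)
query_string = urlencode({"q": clause})  # URL-encoded, ready to append to a select URL
```

Note that a clause placed in `q` this way also restricts the result set to matching documents; it does not merely reorder them.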

Re: Null Pointer Exception while indexing

2010-09-15 Thread andrewdps
I'm sorry, but how do I use that? Is that something to do with uninstalling "gcu" and installing jvm and openJDK? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Null-Pointer-Exception-while-indexing-tp1481154p1481285.html Sent from the Solr - User mailing list archive

Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Andre Bickford
I'm working on creating a Solr index search for a charitable organization. The Solr index stores documents of donors. Each donor document has the following five fields: Id, Name, Address, Gift Amount (multiValued), Gift Date (multiValued). In our relational database, there is a one-to-many relations

Re: Null Pointer Exception while indexing

2010-09-15 Thread andrewdps
I still get the same error when I try to index the mrc file... This was the previous version of the Java on our server. # java -version java version "1.5.0" gij (GNU libgcj) version 4.3.2 Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditio

Null Pointer Exception while indexing

2010-09-15 Thread andrewdps
What could be possible error for 14-Sep-10 4:28:47 PM org.apache.solr.common.SolrException log SEVERE: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(libgcj.so.90) at java.util.concurrent.FutureTask.get(libgcj.so.90)

Re: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Erick Erickson
One strategy is to denormalize all the way. That is, each Solr "document" is Gift Amount and Gift Date would not be multiValued. You'd create a different "document" for each gift, so you'd have multiple documents with the same Id, Name, and Address. Be careful, though, if you've defined Id as a Uni
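
The denormalization described here, one document per gift, might be sketched as follows. The field names (`donor_id`, `gift_amount`, `gift_date`) and the scheme for building a distinct `id` per gift are illustrative assumptions, not part of the original schema.

```python
def gift_documents(donor):
    """Flatten one donor record into one document per gift.

    Each output document repeats the donor's name and address and
    carries a single gift amount and date. The "id" is made unique
    per gift (donor id plus gift position) so that documents for the
    same donor do not overwrite each other if "id" is the unique key.
    """
    docs = []
    gifts = zip(donor["gift_amount"], donor["gift_date"])
    for n, (amount, date) in enumerate(gifts):
        docs.append({
            "id": "%s-%d" % (donor["id"], n),  # hypothetical per-gift key
            "donor_id": donor["id"],
            "name": donor["name"],
            "address": donor["address"],
            "gift_amount": amount,
            "gift_date": date,
        })
    return docs


example_donor = {
    "id": "d1",
    "name": "Ann Example",
    "address": "1 Main St",
    "gift_amount": [50, 200],
    "gift_date": ["2006-01-05", "2007-03-09"],
}
example_docs = gift_documents(example_donor)
```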

Re: Geographic clustering

2010-09-15 Thread Dennis Gearon
To me, it's a great idea. But I would prefer 2D areas superimposed over the map with a count per area, probably positioned near the median density point. Don't know of any applications that do this, or COULD do this, but intuitively, that feels like the right format. BTW, what is your usage for

RE: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Andre Bickford
Thanks for the response Erick. I did actually try exactly what you suggested. I flipped the index over so that a gift is the document. This solution certainly solves the previous problem, but introduces a new issue where the search results show duplicate donors. If a donor gave 12 times in a ye

using variables/properties in dataconfig.xml

2010-09-15 Thread Jason Chaffee
Is it possible to use the same type of property configuration in dataconfig.xml as is possible in solrconfig.xml? I tried it and it didn't seem to work. For example, ${solr.data.dir:/opt/search/store/solr/data} And in the dataconfig.xml, I would like to do this to configure the baseUrl
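
For readers unfamiliar with the `${name:default}` form, the sketch below shows what that substitution means: use the property if it is set, otherwise fall back to the text after the colon. It is an illustration of the syntax only, written in Python, and makes no claim about how Solr resolves properties internally or whether dataconfig.xml supports them.

```python
import re

# Matches ${name} or ${name:default}; the default may itself contain colons.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text, props):
    """Replace each ${name:default} with props[name] or the default."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)  # no value and no default given
    return _PLACEHOLDER.sub(repl, text)


template = "${solr.data.dir:/opt/search/store/solr/data}"
with_default = resolve(template, {})
with_override = resolve(template, {"solr.data.dir": "/tmp/solr-data"})
```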

Re: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Jonathan Rochkind
I might consider what Erick suggested to actually be 'normalization' rather than de-normalization! It's just that in Solr you only get one 'table'. Here's yet another approach, which will have its own trade-offs: Keep the document as it is, representing a donor. But in addition to indexing

Re: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Jonathan Rochkind
Okay if you _only_ need to offer full years as facet drill-downs, not within a year, and not multiple years at once, you could index: "<year>-<amount>" as a token in a multi-valued field. And zero-pad amount out to a buncha digits. 2006-0200 2007-1000 (big donor!) Now you could find ever
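
The zero-padded tokens can be generated as in this sketch. The range query is an extrapolation of where the message appears to be heading, and the field name `gift_year_amount` is hypothetical. Zero-padding matters because the tokens are compared as strings: without it, "2006-200" would sort after "2006-1000".

```python
def gift_token(year, amount, width=4):
    """Combine a year and a whole-number amount into one sortable token."""
    return "%04d-%0*d" % (year, width, amount)


def year_range_query(field, year, min_amount, max_amount, width=4):
    """Build a range query over the tokens of a single year."""
    low = gift_token(year, min_amount, width)
    high = gift_token(year, max_amount, width)
    return "%s:[%s TO %s]" % (field, low, high)


tokens = [gift_token(2006, 200), gift_token(2007, 1000)]
query = year_range_query("gift_year_amount", 2006, 100, 9999)
```

The `width` has to be large enough for the largest gift expected, since amounts wider than the padding break the string ordering.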

Re: Geographic clustering

2010-09-15 Thread Dennis Gearon
Nice work! I like the squares a lot better than the other style, for some reason. What blows my mind is how many second homes for sale there are in the Grand Caymans. WOW! Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and C

Re: Simple Filter Query (fq) Use Case Question

2010-09-15 Thread Andre Bickford
Hi Jonathan, Thank you very much for your creative suggestions. I also wondered if perhaps combining giftDate and giftAmount into one single token was a possible solution. I'll definitely explore this further using your ideas. I especially like your idea of combining the giftDate and giftAmount

Re: Boosting specific field value

2010-09-15 Thread Ravi Kiran
Erick, I'm afraid you misinterpreted my issue. If I query like you said, i.e. q=source(bbc OR "associated press")^10, I will ONLY get documents with source BBC or Associated Press...what I am asking is - if my query does not deal with source at all but uses some other field...since the

RE: Boosting specific field value

2010-09-15 Thread Jonathan Rochkind
Maybe you are looking for the 'bq' (boost query) parameter in dismax? http://wiki.apache.org/solr/DisMaxQParserPlugin#bq_.28Boost_Query.29 From: Ravi Kiran [ravi.bhas...@gmail.com] Sent: Wednesday, September 15, 2010 10:02 PM To: solr-user@lucene.apache.org
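
With dismax, a `bq` clause adds to the score of matching documents without restricting the result set, which is the behaviour being asked for. A minimal sketch of such a request follows; the search text and the `qf` field list are made up for illustration, and `source` is assumed to be the field name.

```python
from urllib.parse import urlencode

params = {
    "defType": "dismax",
    "q": "election",            # hypothetical user search text
    "qf": "headline body",      # hypothetical fields to search
    # Documents from these sources score higher; others still match.
    "bq": 'source:(bbc OR "associated press")^10',
}
query_string = urlencode(params)
```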

Re: Color search for images

2010-09-15 Thread Dennis Gearon
My guess is that they are leveraging text on the same web page. I'm sure there's some post doctoral types who could get a graphic shape analyzer, color analyzer, to at least say it's a flower. However, even Google would have to build new datacenters to have the horsepower to do that kind of gr

Re: Color search for images

2010-09-15 Thread Shashi Kant
> I'm sure there's some post doctoral types who could get a graphic shape > analyzer, color analyzer, to at least say it's a flower. > > However, even Google would have to build new datacenters to have the > horsepower to do that kind of graphic processing. > Not necessarily true. Like.com - whi

Re: Boosting specific field value

2010-09-15 Thread Ravi Kiran
Hello Mr. Rochkind, I am using StandardRequestHandler so I presume I cannot use the bq param, right? Is there a way we can mix dismax and standardhandler, i.e. use lucene syntax for query and use dismax style for bq using localparams/nested queries? I remember seeing your post

Re: Color search for images

2010-09-15 Thread Stephen Weiss
There's a project out there called LIRE (I heard about it on this list) that's supposed to create a lucene-based CBIR index for images. I wonder if this could be integrated with Solr? Personally I don't really care about the flower part, I'm more worried about searching whether the flower is r

Handling Aggregate Records/Roll-up in Solr

2010-09-15 Thread Thomas Martin
Can someone point me to the mechanism in Solr that might allow me to roll up or aggregate records for display? We have many items that are similar and only want to show a representative record to the user until they select that record. As an example - We carry a polo shirt and have 15 records