Re: getting a list of top page-ranked webpages

2010-09-16 Thread Dennis Gearon
This was supposed to be a question: > And, most popular in the world, per dominant culture in > each country, per religious majority, per language culture . > . . > Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded'

Re: Color search for images

2010-09-16 Thread Dennis Gearon
Sounds like someone is/has going to say/said: "Make it so, number one" There are some good links off of this article about the color Magenta, (like, uh, who knows what 'cyan' or 'magenta' are anyway? So I looked it up. Refilling my printer cartridges required an explanation.) http://en.wikipedi

Re: Simple Filter Query (fq) Use Case Question

2010-09-16 Thread Dennis Gearon
Yes, field collapsing is like faceting, only more so, and very useful, I believe. As my project gets going, I have already imagined uses for it. Dennis Gearon

Re: getting a list of top page-ranked webpages

2010-09-16 Thread Dennis Gearon
There's a great web page somewhere that shows the popularity as the subway map of Tokyo. And, most popular in the world, per dominant culture in each country, per religious majority, per language culture . . . Dennis Gearon

Re: DIH: alternative approach to deltaQuery

2010-09-16 Thread Lukas Kahwe Smith
On 17.09.2010, at 05:40, Lance Norskog wrote: > Database optimization is not like program optimization- it is wildly > unpredictable. well an RDBMS that cannot handle true != false as a NOP during the planning stage doesn't even do basics in optimization. But this approach is so much more eff

Re: Color search for images

2010-09-16 Thread Shawn Heisey
On 9/16/2010 7:45 AM, Shashi Kant wrote: Lire is a nascent effort and based on a cursory overview a while back, IMHO was an over-simplified version of what a CBIR engine should be. They use CEDD (color & edge descriptors). Wouldn't work for the kind of applications I am working on - which needs

Re: Solr Rolling Log Files

2010-09-16 Thread Lance Norskog
Rolling logfiles is configured in the servlet container, not Solr. Indexing logfiles is a pain because of multiline log outputs like Exceptions. Vladimir Sutskever wrote: Can SOLR be configured out of the box to handle rolling log files? Kind regards, Vladimir Sutskever Investment Bank - Te

Re: Null Pointer Exception while indexing

2010-09-16 Thread Lance Norskog
Andrew, you should download Solr from the Apache site. This packaging is wrong-headed. As to Java, a Linux person would know the system for picking which is the standard Java. andrewdps wrote: Also, the Solr Java properties look like this using gcj, despite setting java_home in /etc/profile

Re: Null Pointer Exception while indexing

2010-09-16 Thread Lance Norskog
Good eye, Thomas! Yes, GCJ is a non-starter. You're best off downloading Java 1.6 yourself, but I understand that it is easier to use the public package repositories. Thomas Joiner wrote: My guess would be that Jetty has some configuration somewhere that is telling it to use GCJ. Is it possib

Re: DIH: alternative approach to deltaQuery

2010-09-16 Thread Lance Norskog
Database optimization is not like program optimization- it is wildly unpredictable. What bugs me about the delta approach is using the last time DIH ran, rather than a timestamp from the DB. Oh well. Also, with SOLR-1499 you can query Solr directly to see what it has. Lukas Kahwe Smith wrote

Index partitioned/ Full indexing by MSSQL or MySQL

2010-09-16 Thread Tommy Molto
Hi, My company has a site of ads that has 2 types of data: active ads (ads that are valid) and inactive ads (that are no longer valid, but we have to show the page to get users to see related ads, related searches, etc). Some doubts crossed the mind of the team:

Re: Simple Filter Query (fq) Use Case Question

2010-09-16 Thread Andre Bickford
Thanks to everyone for your suggestions. It seems that creating the index using gifts as the top level entity is the appropriate approach so I can effectively filter gifts on both the gift amount and gift date without running into multiValued field issues. It introduces a problem of listing do

Re: SOLR interface with PHP using javabin?

2010-09-16 Thread onlinespend...@gmail.com
OK, thanks for the suggestion. Why do you recommend using JSON over simply using the built-in PHPSerializedResponseWriter? I find using an interface that requires the data to be parsed to be inefficient (this would include the aforementioned PHPSerializedResponseWriter as well). Wouldn't it

Re: getting a list of top page-ranked webpages

2010-09-16 Thread Ken Krugler
Hi Ian, On Sep 16, 2010, at 2:44pm, Ian Upright wrote: Hi, this question is a little off topic, but I thought since so many people on this list are probably experts in this field, someone may know. I'm experimenting with my own semantic-based search engine, but I want to test it with a large co

Re: Get all results from a solr query

2010-09-16 Thread Scott Gonyea
lol, note to self: scratch out IPs. Good thing firewalls exist to keep my stupidity at bay. Scott On Thu, Sep 16, 2010 at 2:55 PM, Scott Gonyea wrote: > If you want to do it in Ruby, you can use this script as scaffolding: > require 'rsolr' # run `gem install rsolr` to get this > solr  = RSolr.

Re: Get all results from a solr query

2010-09-16 Thread Scott Gonyea
If you want to do it in Ruby, you can use this script as scaffolding: require 'rsolr' # run `gem install rsolr` to get this solr  = RSolr.connect(:url => 'http://ip-10-164-13-204:8983/solr') total = solr.select({:rows => 0})["response"]["numFound"] rows  = 10 query = {   :rows   => rows,   :sta

getting a list of top page-ranked webpages

2010-09-16 Thread Ian Upright
Hi, this question is a little off topic, but I thought since so many people on this list are probably experts in this field, someone may know. I'm experimenting with my own semantic-based search engine, but I want to test it with a large corpus of web pages. Ideally I would like to have a list of the

Re: Get all results from a solr query

2010-09-16 Thread Shashi Kant
Start with a *:*, then the "numFound" attribute of the <result> element should give you the rows to fetch by a 2nd request. On Thu, Sep 16, 2010 at 4:49 PM, Christopher Gross wrote: > That will still just return 10 rows for me. Is there something else in > the configuration of solr to have it return all
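The two-request idea in this thread (ask for zero rows to read numFound, then fetch the documents) can be sketched roughly as follows. This is only a sketch under assumptions, not any poster's code: `fetch_all`, `select` and `fake_select` are hypothetical names, `select` stands in for whatever client call returns the parsed JSON response, and fetching in fixed-size pages rather than in one large second request is my own choice.

```python
from typing import Callable, Dict, Iterator


def fetch_all(select: Callable[[Dict[str, object]], dict],
              query: str = "*:*",
              page_size: int = 100) -> Iterator[dict]:
    """Yield every document matching `query`, one page at a time."""
    # First request: rows=0 returns no documents, only the total hit count.
    total = select({"q": query, "rows": 0})["response"]["numFound"]
    # Follow-up requests: walk the result set with start/rows.
    for start in range(0, total, page_size):
        page = select({"q": query, "start": start, "rows": page_size})
        yield from page["response"]["docs"]


def fake_select(params: Dict[str, object]) -> dict:
    """Hypothetical in-memory stand-in for a Solr select call (25 docs)."""
    docs = [{"id": i} for i in range(25)]
    start = int(params.get("start", 0))
    rows = int(params["rows"])
    return {"response": {"numFound": len(docs),
                         "docs": docs[start:start + rows]}}


# Three pages (10 + 10 + 5 documents) are fetched behind the scenes.
ids = [doc["id"] for doc in fetch_all(fake_select, page_size=10)]
```

Paging keeps each response small; if the index changes between requests, the total read in the first request may no longer match what the later pages return.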

RE: Re: Get all results from a solr query

2010-09-16 Thread Markus Jelsma
Not according to the wiki; http://wiki.apache.org/solr/CommonQueryParameters#rows   But you could always create an issue for this one.   -Original message- From: Christopher Gross Sent: Thu 16-09-2010 22:50 To: solr-user@lucene.apache.org; Subject: Re: Get all results from a solr quer

Re: Get all results from a solr query

2010-09-16 Thread Christopher Gross
That will still just return 10 rows for me. Is there something else in the configuration of solr to have it return all the rows in the results? -- Chris On Thu, Sep 16, 2010 at 4:43 PM, Shashi Kant wrote: > q=*:* > > On Thu, Sep 16, 2010 at 4:39 PM, Christopher Gross wrote: >> I have some que

Re: Get all results from a solr query

2010-09-16 Thread Shashi Kant
q=*:* On Thu, Sep 16, 2010 at 4:39 PM, Christopher Gross wrote: > I have some queries that I'm running against a solr instance (older, > 1.2 I believe), and I would like to get *all* the results back (and > not have to put an absurdly large number as a part of the rows > parameter). > > Is there

Get all results from a solr query

2010-09-16 Thread Christopher Gross
I have some queries that I'm running against a solr instance (older, 1.2 I believe), and I would like to get *all* the results back (and not have to put an absurdly large number as a part of the rows parameter). Is there a way that I can do that? Any help would be appreciated. -- Chris

Understanding Lucene's File Format

2010-09-16 Thread Giovanni Fernandez-Kincade
Hi, I've been trying to understand Lucene's file format and I keep getting hung up on one detail - how can Lucene quickly find the frequency data (or proximity data) for a particular term? According to the file formats page on the Lucene website

DIH: alternative approach to deltaQuery

2010-09-16 Thread Lukas Kahwe Smith
Hi, I think I have mentioned this approach before on this list, but I really think that the deltaQuery approach which is currently explained as the "way to do updates" is far from ideal. It seems to add a lot of redundant queries. I therefore propose to merge the initial import and delta querie

Re: DataImportHandler with multiline SQL

2010-09-16 Thread Lukas Kahwe Smith
On 16.09.2010, at 21:07, David Yang wrote: > Hi > > > > I am using the DIH to retrieve data, and as part of the process, I > wanted to create a temporary table and then import data from that. I > have played around a little with DIH and it seems like for a query like: > "select x; select y;" y

DataImportHandler with multiline SQL

2010-09-16 Thread David Yang
Hi I am using the DIH to retrieve data, and as part of the process, I wanted to create a temporary table and then import data from that. I have played around a little with DIH and it seems like for a query like: "select x; select y;" you can have select y to return no results and do random stuf

Re: SOLR interface with PHP using javabin?

2010-09-16 Thread Thomas Joiner
If you wish to interface to Solr from PHP, and decide to go with Yonik's suggestion to use JSON, I would suggest using http://code.google.com/p/solr-php-client/ It has served my needs for the most part. On Thu, Sep 16, 2010 at 1:33 PM, Yonik Seeley wrote: > On Thu, Sep 16, 2010 at 2:30 PM, onlin

Re: SOLR interface with PHP using javabin?

2010-09-16 Thread Yonik Seeley
On Thu, Sep 16, 2010 at 2:30 PM, onlinespend...@gmail.com wrote: >  I am planning on creating a website that has some SOLR search capabilities > for the users, and was also planning on using PHP for the server-side > scripting. > > My goal is to find the most efficient way to submit search queries

SOLR interface with PHP using javabin?

2010-09-16 Thread onlinespend...@gmail.com
I am planning on creating a website that has some SOLR search capabilities for the users, and was also planning on using PHP for the server-side scripting. My goal is to find the most efficient way to submit search queries from the website, interface with SOLR, and display the results back on

Re: Simple Filter Query (fq) Use Case Question

2010-09-16 Thread Dennis Gearon
Is a core a running piece of software, or just an index/config pairing? Dennis Gearon --- On Thu, 9/16/10, Jonathan Rochkind wrote: > From:

Re: Simple Filter Query (fq) Use Case Question

2010-09-16 Thread Dennis Gearon
So THAT'S what a core is! I have been wondering. Thank you very much! Dennis Gearon --- On Thu, 9/16/10, Jonathan Rochkind wrote: > From: J

Re: Simple Filter Query (fq) Use Case Question

2010-09-16 Thread Jonathan Rochkind
One solr core has essentially one index in it. (not only one 'field', but one indexed collection of documents) There are weird hacks, like I believe the spellcheck component kind of creates its own sub-indexes, not sure how it does that. You can have more than one core in a single solr instan

Re: Color search for images

2010-09-16 Thread Dennis Gearon
LOL! now that is one of the wisest things I've seen in a while. Dennis Gearon --- On Thu, 9/16/10, Shashi Kant wrote: > From: Shashi Kant

RE: Simple Filter Query (fq) Use Case Question

2010-09-16 Thread Dennis Gearon
This brings me to ask a question that's been on my mind for a while. Are indexes set up for the whole site, or a set of searches, with several different indexes for a site? How many instances does one Solr/Lucene instance have access to, (not counting shards/segments)? Dennis Gearon

Re: Color search for images

2010-09-16 Thread Dennis Gearon
That's impressive! So Google has BOUGHT some doctoral types, or highly specialized geeks, and is looking at X number of images. I bet the number of images on his video film library is at least several orders of magnitude above what Like deals with. Dennis Gearon

Re: Handling Aggregate Records/Roll-up in Solr

2010-09-16 Thread Dennis Gearon
Look for faceting or field collapsing. Dennis Gearon --- On Wed, 9/15/10, Thomas Martin wrote: > From: Thomas Martin > Subject: Handling

RE: Simple Filter Query (fq) Use Case Question

2010-09-16 Thread Dennis Gearon
There's something that works a little bit like 'DISTINCT' called field collapsing. Take a look in the archives for it. Dennis Gearon --- On

Re: Boosting specific field value

2010-09-16 Thread Jonathan Rochkind
Actually, I don't _think_ you need to use a nested query for my idea; you can just use "LocalParams". &q={!lucene} your query in lucene syntax I think that'll work, I think you can use "LocalParams" directly in the 'q', no need for nested query. If it will, it avoids the escaping nightmares with n

Re: Index update issue

2010-09-16 Thread Erik Hatcher
Be sure to issue a commit after updates (either with a separate <commit/> or append ?commit=true to your update requests). Out of curiosity are you using any Ruby library to speak to Solr? Or hand rolling some Net::HTTP stuff? Erik On Sep 16, 2010, at 9:29 AM, maggie chen wrote: Dear Al
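Appending commit=true to an update request, as suggested above, might look like the sketch below. The helper name `with_commit` and the host, port and path in the example are assumptions for illustration only; the helper just rewrites the URL and sends nothing.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_commit(update_url: str) -> str:
    """Return `update_url` with commit=true added to its query string."""
    parts = urlsplit(update_url)
    # Keep any parameters already present, then add (or overwrite) commit.
    query = dict(parse_qsl(parts.query))
    query["commit"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


# Assumed example endpoint; substitute the update URL of your own instance.
commit_url = with_commit("http://localhost:8983/solr/update")
```

Until a commit is issued, newly added documents are not visible to searches, which would fit the delay described in the original question.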

Re: Boosting specific field value

2010-09-16 Thread Ravi Kiran
Awesome, I also did not know about q.alt accepting lucene style... Thanks to both of you, Mr. Ackermann and Mr. Rochkind, I learnt more in just this thread than I have in the last 6 months of reading and dealing with solr. As you folks pointed out q.alt and nested queries are both great option

Re: Null Pointer Exception while indexing

2010-09-16 Thread Thomas Joiner
My guess would be that Jetty has some configuration somewhere that is telling it to use GCJ. Is it possible to completely remove GCJ from the system? Another possibility would be to uninstall Jetty, and then reinstall it, and hope that on the reinstall it would pick up on the OpenJDK. What distr

Re: Null Pointer Exception while indexing

2010-09-16 Thread andrewdps
Lance, We are on Solr Specification Version: 1.4.1 -- View this message in context: http://lucene.472066.n3.nabble.com/Null-Pointer-Exception-while-indexing-tp1481154p1488320.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Null Pointer Exception while indexing

2010-09-16 Thread andrewdps
Also, the Solr Java properties look like this using gcj, despite setting java_home in /etc/profile jetty.logs = /usr/local/vufind/solr/jetty/logs path.separator = : java.vm.name = GNU libgcj java.vm.specification.name = Java(tm) Virtual Machine Specification java.runtime.version = 1.5.0 java.home

Re: Null Pointer Exception while indexing

2010-09-16 Thread andrewdps
Thanks for all the suggestions. As far as Java is concerned, I'm worried to see different things. I'm afraid something is wrong with the settings. r...@zoombox:/etc# echo $JAVA_HOME /usr/lib/jvm/default-java r...@zoombox:/etc# java -version java version "1.6.0_0" OpenJDK Runtime Environmen

RE: using variables/properties in dataconfig.xml

2010-09-16 Thread Ephraim Ofir
No, it's not possible. See workaround: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com%3e Ephraim Ofir | Reporting and Host Development team | ICQ P: +972 3 7665510 | M: + 972 52 4888510 | F: +972 3

Index update issue

2010-09-16 Thread maggie chen
Dear All, I use Solr in Rails. When I add a new item, the index count takes a long time (one hour) to update. For example, if the index is now "97" and I add a new item, the index will become "98" in one hour. I checked all of the Solr config files, but I couldn't find the setting about that. I comment out

Re: Color search for images

2010-09-16 Thread Shashi Kant
> Lire looks promising, but how hard is it to integrate the content-based > search into Solr as opposed to Lucene?  I myself am not a Java developer.  I > have access to people who are, but their time is scarce. > Lire is a nascent effort and based on a cursory overview a while back, IMHO was an

Re: Color search for images

2010-09-16 Thread Shashi Kant
On Thu, Sep 16, 2010 at 3:21 AM, Lance Norskog wrote: > Yes, notice the flowers are all a medium-dark crimson red. There are a bunch > of these image-indexing & search technologies, but there is no (to my > knowledge) "finished technology"- it's very much an area of research. If you > want to sear

RE: Boosting specific field value

2010-09-16 Thread Jonathan Rochkind
Nice, I didn't know about q.alt. Or, alternately, yes, you could use a nested query, good call. Which, yes, I agree is kind of confusing at first. &qt=dismax # use dismax for the overall query &bq=whatever # so we can use bq, since we're using dismax &q=_query_:"{!lucene} solr-lucene sy

Re: Full text search in facet scope

2010-09-16 Thread Peter Karich
Hi, if you index your doc with text='operating system' with an additional keyword field='linux' (of type string, can be multivalued) then solr faceting should be what you want: solr/select?q=*:*&facet=true&facet.field=keyword&rows=10 or rows=0 depending on your needs Does this help? Regards, P
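A small sketch of building that faceting request and reading the counts back. The names `facet_url` and `pair_counts` are hypothetical, and the relative `solr/select` path is copied from the message above. The flat term/count list assumes Solr's default JSON layout for `facet_counts.facet_fields`; treat that as an assumption if your response writer is configured differently.

```python
from urllib.parse import urlencode


def facet_url(field: str, q: str = "*:*", rows: int = 0) -> str:
    """Build a select URL that facets on `field`."""
    params = [("q", q), ("facet", "true"),
              ("facet.field", field), ("rows", rows)]
    # Leave '*' and ':' unescaped so the URL reads like the one in the thread.
    return "solr/select?" + urlencode(params, safe="*:")


def pair_counts(flat: list) -> dict:
    """Turn a flat [term, count, term, count, ...] list into a dict."""
    return dict(zip(flat[::2], flat[1::2]))


url = facet_url("keyword", rows=10)
counts = pair_counts(["linux", 3, "windows", 5])  # made-up sample values
```

With rows=0 only the facet counts come back, which is enough when the page needs just the per-keyword numbers.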

Re: Null Pointer Exception while indexing

2010-09-16 Thread Yonik Seeley
On Wed, Sep 15, 2010 at 2:01 PM, andrewdps wrote: > I still get the same error when I try to index the mrc file... If you get the exact same error, then you are still using GCJ. When you type "java" it's probably going to GCJ because of your path (i.e. change it or directly specify the path to th

Full text search in facet scope

2010-09-16 Thread Bogdan Gusiev
I need to build a faceted search. Each facet consists of keywords that should also be applied to search query in addition to main query string For instance: I am searching for "operating system" and I need two facets "linux" and "windows". Each should append its keyword to query string to get cou

Re: Null Pointer Exception while indexing

2010-09-16 Thread Israel Ekpo
Try removing the data directory and then restart your Servlet container and see if that helps. On Thu, Sep 16, 2010 at 3:28 AM, Lance Norskog wrote: > Which version of Solr? 1.4?, 1.4.1? 3.x branch? trunk? if the 3.x or the > trunk, when did you pull it? > > > andrewdps wrote: > >> What could be

Re: Solr for statistical data

2010-09-16 Thread Peter Karich
Hi Kjetil, is this custom component (which performs group by + calcs stats) available somewhere? I would like to do something similar. Would you mind sharing it if it isn't already available? The grouping stuff sounds similar to https://issues.apache.org/jira/browse/SOLR-236 where you can have me

Re: Color search for images

2010-09-16 Thread Shawn Heisey
On 9/15/2010 10:50 AM, Shashi Kant wrote: Shawn, I have done some research into this, machine-vision especially on a large scale is a hard problem, not to be entered into lightly. I would recommend starting with OpenCV - a comprehensive toolkit for extracting various features such as Color, Edge

Solr for statistical data

2010-09-16 Thread Kjetil Ødegaard
Hi all, we're currently using Solr 1.4.0 in a project for statistical data, where we group and sum a number of "double" values. Probably not what most people use Solr for, but it seems to be working fine for us :-) We do have some challenges, especially with memory use, so I thought I'd check h

Re: Handling Aggregate Records/Roll-up in Solr

2010-09-16 Thread Markus Jelsma
You should " just flatten the representation of the shirt in the data model." On Wednesday 15 September 2010 22:23:17 Thomas Martin wrote: > Can someone point me to the mechanism in Solr that might allow me to > roll-up or aggregate records for display. We have many items that are > similar and o

Re: Boosting specific field value

2010-09-16 Thread Chantal Ackermann
Hi Ravi, with dismax, use the parameter "q.alt" which expects standard lucene syntax (instead of "q"). If "q.alt" is present in the query, "q" is not required. Add the parameter "qt=dismax". Chantal On Thu, 2010-09-16 at 06:22 +0200, Ravi Kiran wrote: > Hello Mr.Rochkind, >
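As a rough sketch of the parameter set described above: `dismax_alt_params` is a hypothetical helper, the field names in the example (`title`, `category`) are made up, and `bq` is included only because it comes up elsewhere in this thread.

```python
from typing import Optional
from urllib.parse import urlencode


def dismax_alt_params(lucene_query: str,
                      boost_query: Optional[str] = None) -> str:
    """Encode a dismax request that uses q.alt in place of q."""
    params = {"qt": "dismax", "q.alt": lucene_query}
    if boost_query:
        # Optional boost query, itself written in lucene syntax.
        params["bq"] = boost_query
    return urlencode(params)


# Hypothetical fields: search titles, boost documents in one category.
query_string = dismax_alt_params("title:solr", "category:sports^10")
```

Because no `q` is sent, the lucene-syntax string in `q.alt` is what gets parsed, as the message above describes.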

RE: Simple Filter Query (fq) Use Case Question

2010-09-16 Thread Chantal Ackermann
Hi Andre, changing the entity in your index from donor to gift changes of course the scope of your search results. I found it helpful to re-think such change from that "other" side (the result side). If the users of your search application look for individual gifts, in the end, then changing the i

bug in dataimport context ?

2010-09-16 Thread Marc Emery
Hi, It seems the dataimport context session setter is hardcoded to the "entitySession": private void putVal(String name, Object val, Map map) { if(val == null) map.remove(name); else entitySession.put(name, val); } shouldn't be rather like this: private void putVal(String name, Obj

Re: Null Pointer Exception while indexing

2010-09-16 Thread Lance Norskog
Which version of Solr? 1.4?, 1.4.1? 3.x branch? trunk? if the 3.x or the trunk, when did you pull it? andrewdps wrote: What could be possible error for 14-Sep-10 4:28:47 PM org.apache.solr.common.SolrException log SEVERE: java.util.concurrent.ExecutionException: java.lang.NullPointerException

Re: Color search for images

2010-09-16 Thread Lance Norskog
Yes, notice the flowers are all a medium-dark crimson red. There are a bunch of these image-indexing & search technologies, but there is no (to my knowledge) "finished technology"- it's very much an area of research. If you want to search the word 'flower' and index data that can find blobs of

Re: Color search for images

2010-09-16 Thread Li Li
Do you mean content-based image retrieval or just searching images by tag? If the former, you can try LIRE 2010/9/15 Shawn Heisey : >  My index consists of metadata for a collection of 45 million objects, most > of which are digital images.  The executives have fallen in love with > Google's color im