Null pointer exception when mixing highlighter & shards & q.alt

2010-09-06 Thread Ron Mayer
Short summary: * Using both highlighting and shards and q.alt is giving me a null pointer exception. * Really easy to work around; but since the similar cases without shards work, perhaps this should too. * If you think it should be fixed, point me in the right direction and I c

Re: Many sparse facets?

2010-09-06 Thread Ron Mayer
Jonathan Rochkind wrote: > I could certainly be wrong. If you have a facet with a LOT fewer unique > values than documents in the query, I'd be curious what happens if you try > facet.method=enum. Cool. I'll be trying that later. > > I'm definitely not an expert, just trying to help figure

RE: getting started - books/in dept material

2010-09-06 Thread Dennis Gearon
Not sure there's enough info there?(NOT, LOL!) ;-) Thanks very much, had missed that. Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Mon, 9/6/10, Markus Jelsma wr

RE: Many sparse facets?

2010-09-06 Thread Jonathan Rochkind
I could certainly be wrong. If you have a facet with a LOT fewer unique values than documents in the query, I'd be curious what happens if you try facet.method=enum. facet.enum.cache.minDf's documentation suggests it can affect memory usage with enum too, but seems more focused on when you ha
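
For anyone wanting to try the suggestion above, here is a minimal Python sketch that only builds the request URL. The parameter names (facet, facet.field, facet.method, facet.enum.cache.minDf) are Solr's own; the host, the /select path, and the field name "category" are assumptions for illustration, so adjust them to your setup.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, field, min_df=None):
    """Build a Solr select URL that facets on `field` using facet.method=enum.

    base_url and field are placeholders supplied by the caller; nothing here
    contacts a server.
    """
    params = [
        ("q", "*:*"),          # match all documents
        ("rows", "0"),         # only the facet counts are of interest
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),
    ]
    if min_df is not None:
        # Terms with a document frequency below this skip the filter cache.
        params.append(("facet.enum.cache.minDf", str(min_df)))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field name, for illustration only.
url = facet_query_url("http://localhost:8983/solr", "category", min_df=25)
```

Comparing response times and memory use between this and the default method on a sparse field is probably the quickest way to see which one suits the data.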

RE: Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread Dennis Gearon
Oh, THAT MOD! LOL! I thought it was some search engine specific acronym. Dennis Gearon --- On Mon, 9/6/10, Markus Jelsma wrote: > From: Ma

Re: Many sparse facets?

2010-09-06 Thread Ron Mayer
Jonathan Rochkind wrote: > What matters isn't how many documents have a value, so much > as how many unique values there are in the field total. If > there aren't that many, faceting can be done fairly quickly and fairly > efficiently. Really? Don't these 2 log file lines: INFO: UnInverted m

RE: Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread Markus Jelsma
The remainder of an arithmetic division http://en.wikipedia.org/wiki/Modulo_operation -Original message- From: Dennis Gearon Sent: Mon 06-09-2010 22:04 To: solr-user@lucene.apache.org; Subject: Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?) What is a 'simple MOD'? Den

RE: getting started - books/in dept material

2010-09-06 Thread Markus Jelsma
Did you miss the wiki? http://wiki.apache.org/solr/SolrResources   -Original message- From: Dennis Gearon Sent: Mon 06-09-2010 22:05 To: solr-user@lucene.apache.org; Subject: getting started - books/in dept material I really don't want to understand the code that is IN Solr/Lucene. S

Re: Alphanumeric wildcard search problem

2010-09-06 Thread Hasnain
Finally got it working, thanks for your help and support -- View this message in context: http://lucene.472066.n3.nabble.com/Alphanumeric-wildcard-search-problem-tp1393332p1429315.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Many sparse facets?

2010-09-06 Thread Jonathan Rochkind
What matters isn't how many documents have a value, so much as how many unique values there are in the field total. If there aren't that many, faceting can be done fairly quickly and fairly efficiently. Otherwise, the only thing I can think of is experimenting with the two different facet meth

Many sparse facets?

2010-09-06 Thread Ron Mayer
Is there a good way of handling a large number of facets that are quite sparse (most documents not having any value for most facets)? In my system I have quite a few documents (a few million, will soon grow to mid tens of millions), and our users are requesting an ever-increasing number of facets (curre

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread Andrzej Bialecki
On 2010-09-06 22:03, Dennis Gearon wrote: > What is a 'simple MOD'? md5(docId) % numShards -- Best regards, Andrzej Bialecki
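
The one-liner above can be spelled out as a small Python sketch. The function name shard_for and the sample ids are made up for illustration; this is not Solr code, just the modulo mapping described in the reply.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Return a shard index in [0, num_shards) as md5(docId) % numShards."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    # Interpret the 16-byte digest as one big integer, then take the remainder.
    return int.from_bytes(digest, "big") % num_shards


# Hypothetical document ids: the same id always maps to the same shard.
assignments = {doc_id: shard_for(doc_id, 4) for doc_id in ("doc-1", "doc-2", "doc-3")}
```

Note that the mapping depends on num_shards, so changing the shard count reassigns most ids; that is the weakness the consistent-hashing discussion in this thread is about.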

getting started - books/in dept material

2010-09-06 Thread Dennis Gearon
I really don't want to understand the code that is IN Solr/Lucene. So I'm looking for books on USING Solr/Lucene and configuring it plus making good queries. Any suggestions for current material? Dennis Gearon

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread Dennis Gearon
What is a 'simple MOD'? Dennis Gearon --- On Mon, 9/6/10, Andrzej Bialecki wrote: > From: Andrzej Bialecki > Subject: Re: SolrCloud distr

RE: Hardware Specs Question

2010-09-06 Thread Dennis Gearon
Very interesting stuff! I'm pretty sure everything will be non hard disk for intense applications FRONT line use by 10 years or sooner, with hard disk as backup/boot up. Dennis Gearon

RE: Hardware Specs Question

2010-09-06 Thread Toke Eskildsen
From: Dennis Gearon [gear...@sbcglobal.net]: > I wouldn't have thought that CPU was a big deal with the speed/cores of CPU's > continuously growing according to Moore's law and the change in Disk Speed > barely changing 50% in 15 years. Must have a lot to do with caching. I am not sure I follow yo

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread Andrzej Bialecki
On 2010-09-06 16:41, Yonik Seeley wrote: On Mon, Sep 6, 2010 at 10:18 AM, MitchK wrote: [...consistent hashing...] But it doesn't solve the problem at all, correct me if I am wrong, but: If you add a new server, let's call him IP3-1, and IP3-1 is nearer to the current resource X, then doc x wi

Re: Alphanumeric wildcard search problem

2010-09-06 Thread Hasnain
Hi Erik, So I took your advice and started fresh with Solr: I got myself the latest copy of Solr and started adding things gradually to the configuration files. Unfortunately, this still does not work. But I realized that searching for q=r-1* didn't return any results, but when I query like this q=r

Re: How to retrieve the full corpus

2010-09-06 Thread Markus Jelsma
You can use Luke to inspect a Lucene index. Check the schema browser in your Solr admin interface for an example. On Monday 06 September 2010 16:52:03 Roland Villemoes wrote: > Hi All, > > How can I retrieve all words from a Solr core? > I need a list of all the words and how often they occur in

Re: How to retrieve the full corpus

2010-09-06 Thread Andrzej Bialecki
On 2010-09-06 17:15, Yonik Seeley wrote: On Mon, Sep 6, 2010 at 10:52 AM, Roland Villemoes wrote: How can I retrieve all words from a Solr core? I need a list of all the words and how often they occur in the index. http://wiki.apache.org/solr/TermsComponent It doesn't currently stream thoug

Re: How to enable Unicode Support in Solr

2010-09-06 Thread Walter Underwood
On Sep 6, 2010, at 7:59 AM, Yonik Seeley wrote: > On Mon, Sep 6, 2010 at 10:30 AM, Walter Underwood > wrote: >> On Sep 6, 2010, at 1:49 AM, Lance Norskog wrote: >> >>> 1) The XML file must include the UTF-8 encoding metadata in the first line. >> >> If it requires that, it isn't a legal XML p

Re: How to retrieve the full corpus

2010-09-06 Thread Yonik Seeley
On Mon, Sep 6, 2010 at 10:52 AM, Roland Villemoes wrote: > How can I retrieve all words from a Solr core? > I need a list of all the words and how often they occur in the index. http://wiki.apache.org/solr/TermsComponent It doesn't currently stream though, so requesting *all* at once might take
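
Since the component does not stream, one way to avoid a single huge response is to page through the terms. The Python sketch below only builds the request URLs. The terms.* parameter names are TermsComponent's own; the /terms handler path, the host, and the field name "text" are assumptions, so check them against your solrconfig.xml.

```python
from urllib.parse import urlencode


def terms_page_url(base_url, field, last_term=None, page_size=1000):
    """Build a TermsComponent URL for one page of terms in index order.

    Pass the last term of the previous page as `last_term` to continue after it.
    """
    params = [
        ("terms", "true"),
        ("terms.fl", field),
        ("terms.limit", str(page_size)),
        ("terms.sort", "index"),   # index order gives a stable paging sequence
    ]
    if last_term is not None:
        params.append(("terms.lower", last_term))
        params.append(("terms.lower.incl", "false"))  # exclude the term already seen
    return f"{base_url}/terms?{urlencode(params)}"


# Hypothetical host and field name, for illustration only.
first_page = terms_page_url("http://localhost:8983/solr", "text")
next_page = terms_page_url("http://localhost:8983/solr", "text", last_term="lucene")
```

Each response lists terms with their document frequencies; repeating the call with the last term received walks the whole field without asking for everything at once.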

Re: Solr is indexing jdbc properties

2010-09-06 Thread savvas.andreas
It did work!! \o/ the admin page had been cached in FF.. Thanks very much Alex. -- Savvas On 6 September 2010 16:02, Savvas-Andreas Moysidis < savvas.andreas.moysi...@googlemail.com> wrote: > Hi Alex, > > Thanks very much for your reply and pointers. > That didn't work I'm afraid.. > > I'll t

Re: Solr is indexing jdbc properties

2010-09-06 Thread savvas.andreas
Hi Alex, Thanks very much for your reply and pointers. That didn't work I'm afraid.. I'll try to see if sql server supports a conversion function. Regards, -- Savvas On 6 September 2010 15:49, Alexey-34 [via Lucene] < ml-node+1426936-1702828919-54...@n3.nabble.com > wrote: > > http://wiki.apa

Re: How to retrieve the full corpus

2010-09-06 Thread mike anderson
You might check out Luke, the Lucene Index Toolbox. http://www.getopt.org/luke/ I know you can browse the index and get frequency counts, though I'm not sure if you can export the entire index as a list like what you're looking for. Hope this helps, Mike On Mon, Sep 6, 2010 at 10:52 AM, Roland

Re: How to enable Unicode Support in Solr

2010-09-06 Thread Yonik Seeley
On Mon, Sep 6, 2010 at 10:30 AM, Walter Underwood wrote: > On Sep 6, 2010, at 1:49 AM, Lance Norskog wrote: > >> 1) The XML file must include the UTF-8 encoding metadata in the first line. > > If it requires that, it isn't a legal XML parser. The encoding declaration is > optional and it defaults

How to retrieve the full corpus

2010-09-06 Thread Roland Villemoes
Hi All, How can I retrieve all words from a Solr core? I need a list of all the words and how often they occur in the index. med venlig hilsen/best regards Roland Villemoes Tel: (+45) 22 69 59 62 E-Mail: mailto:r...@alpha-solutions.dk Alpha Solutions A/S Borgergade 2, 3.sal, 1300 København K Te

Re: Solr is indexing jdbc properties

2010-09-06 Thread Alexey Serba
http://wiki.apache.org/solr/DataImportHandlerFaq#Blob_values_in_my_table_are_added_to_the_Solr_document_as_object_strings_like_B.401f23c5 Try to add convertType attribute to dataSource declaration, i.e. HTH, Alex On Mon, Sep 6, 2010 at 5:49 PM, savvas.andreas wrote: > > Hello, > > I am trying

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread Yonik Seeley
On Mon, Sep 6, 2010 at 10:18 AM, MitchK wrote: [...consistent hashing...] > But it doesn't solve the problem at all, correct me if I am wrong, but: If > you add a new server, let's call him IP3-1, and IP3-1 is nearer to the > current resource X, then doc x will be indexed at IP3-1 - even if IP2-1

Re: How to enable Unicode Support in Solr

2010-09-06 Thread Walter Underwood
On Sep 6, 2010, at 1:49 AM, Lance Norskog wrote: > 1) The XML file must include the UTF-8 encoding metadata in the first line. If it requires that, it isn't a legal XML parser. The encoding declaration is optional and it defaults to UTF-8. wunder -- Walter Underwood
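
The point about the optional encoding declaration can be checked with any conforming parser. Here is a small Python sketch using the standard library's ElementTree (not Solr's own parser, so it only illustrates the XML rule); the element names doc and field are arbitrary.

```python
import xml.etree.ElementTree as ET

text = "København"

# UTF-8 bytes with no XML declaration at all.
without_decl = f"<doc><field>{text}</field></doc>".encode("utf-8")

# The same bytes with an explicit declaration prepended.
with_decl = b'<?xml version="1.0" encoding="UTF-8"?>' + without_decl

parsed_without = ET.fromstring(without_decl).findtext("field")
parsed_with = ET.fromstring(with_decl).findtext("field")
```

Both parses yield the same string, which matches the statement that the declaration is optional when the bytes really are UTF-8. Problems usually start earlier, when the file is saved or sent in a different encoding.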

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread MitchK
Andrzej, thank you for sharing your experiences. > b) use consistent hashing as the mapping schema to assign documents to a > changing number of shards. There are many explanations of this schema on > the net, here's one that is very simple: > Boom. With the given explanation, I understan
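
For readers who have not seen the scheme, here is a minimal consistent-hashing ring in Python. It is a generic sketch of the technique, not what SolrCloud does; the class name HashRing, the shard names, and the number of virtual points per shard are all made up for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Position of a key on the ring, derived from its md5 digest."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class HashRing:
    def __init__(self, shards, points_per_shard=64):
        self._points_per_shard = points_per_shard
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> shard name
        for shard in shards:
            self.add(shard)

    def add(self, shard):
        """Place several virtual points for the shard on the ring."""
        for i in range(self._points_per_shard):
            point = _hash(f"{shard}#{i}")
            if point not in self._owner:
                bisect.insort(self._points, point)
                self._owner[point] = shard

    def shard_for(self, doc_id):
        """Return the shard owning the first point clockwise from the doc's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(doc_id)) % len(self._points)
        return self._owner[self._points[i]]


# Hypothetical shard names and document ids.
ring = HashRing(["shard1", "shard2", "shard3"])
docs = [f"doc-{n}" for n in range(200)]
before = {d: ring.shard_for(d) for d in docs}
ring.add("shard4")
moved = [d for d in docs if ring.shard_for(d) != before[d]]
```

After adding shard4, only the documents in `moved` change owner, and they all move to the new shard. The older copies on their previous shards still have to be deleted or reindexed separately, which is the concern raised in this thread.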

Re: anyone use hadoop+solr?

2010-09-06 Thread Yonik Seeley
On Mon, Sep 6, 2010 at 9:47 AM, MitchK wrote: > are there any discussions about SolrCloud-indexing? Not recently - personally I've been sidetracked by other stuff. Mapping docs to shards is the easy part... take a hash of the id, and then I imagine the shard id (the label for the index) can just

Solr is indexing jdbc properties

2010-09-06 Thread savvas.andreas
Hello, I am trying to index some data stored in an SQL Server database through DIH. My setup in data-config.xml is the following: However, when I run the indexer (invoking http://127.0.0.1:8983/solr/admin/dataimport.jsp?hand

Re: anyone use hadoop+solr?

2010-09-06 Thread MitchK
Yonik, are there any discussions about SolrCloud-indexing? I would be glad to join them, if I find some interesting papers about that topic. - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/anyone-use-hadoop-solr-tp485333p1426469.html Sent from the Solr - User maili

SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread Andrzej Bialecki
(I adjusted the subject to better reflect the content of this discussion). On 2010-09-06 14:37, MitchK wrote: Thanks for your detailed feedback Andrzej! From what I understood, SOLR-1301 becomes obsolete once Solr becomes cloud-ready, right? Who knows... I certainly didn't expect this code

London open-source search social - 13th Sept

2010-09-06 Thread Richard Marr
Hi all, Apologies for the short notice but we've booked a London Search Social for the 13th Sept. Come along if you fancy geeking out about search and related technology over a beer. Details on the meetup page. http://www.meetup.com/london-search-social/ Rich

Re: anyone use hadoop+solr?

2010-09-06 Thread Yonik Seeley
On Mon, Sep 6, 2010 at 8:37 AM, MitchK wrote: > 10 % numShards(10) ->  1 -> doc 10 will be indexed at shard 1... and what > about the older version at shard 2? I am no expert when it comes to > cloudComputing and the other stuff. > If you can point me to one or another reference where I can read a

Re: anyone use hadoop+solr?

2010-09-06 Thread MitchK
Thanks for your detailed feedback Andrzej! From what I understood, SOLR-1301 becomes obsolete once Solr becomes cloud-ready, right? > Looking into the future: eventually, when SolrCloud arrives we will be > able to index straight to a SolrCloud cluster, assigning documents to > shards throug

RE: Show a facet filter "All"

2010-09-06 Thread PeterKerk
Just talking about it helped me in deciding this is not something I want :) Thanks! :) -- View this message in context: http://lucene.472066.n3.nabble.com/Show-a-facet-filter-All-tp1421248p1425600.html

SOLR-1194 workaround

2010-09-06 Thread viruslviv
Hello Solr community, I have the problem described in SOLR-1194: Query Analyzer not Invoking for Custom FiledType - When we use Custom QParser Plugin. Does anybody know how to apply a fix for this instead of writing my own code that will perform the same job as the analyzer does? Thanks in advance, Ze

Re: anyone use hadoop+solr?

2010-09-06 Thread Andrzej Bialecki
On 2010-09-04 19:53, MitchK wrote: Hi, this topic started a few months ago, however there are some questions from my side, that I couldn't answer by looking at the SOLR-1301-issue nor the wiki-pages. Let me try to explain my thoughts: Given: a Hadoop-cluster, a solr-search-cluster and nutch as

Re: In Need of Direction; Phrase-Context Tracking / Injection (Child Indexes) / Dismissal

2010-09-06 Thread Jan Høydahl / Cominvent
Hi, Yes, the stemming and other features of Solr are nice. A search result from Solr gives you each occurrence of X in Y through highlighting - the regex highlighter is programmable to extract e.g. a sentence as context. You can also get the number of occurrences (term frequency TF) from the termvect

Re: FW: How to enable Unicode Support in Solr

2010-09-06 Thread Lance Norskog
1) The XML file must include the UTF-8 encoding metadata in the first line. 2) If you are using Tomcat: Tomcat comes without UTF-8 as the default. The Solr wiki gives the directions on how to fix this. 3) If you are using Windows: Windows does not use UTF-8 by default. Tracking down UTF-8 encodi

Re: FW: How to enable Unicode Support in Solr

2010-09-06 Thread Darx Oman
Hi amier try saving the xml file encoding as UTF-8 On Mon, Sep 6, 2010 at 11:08 AM, Darx Darx wrote: > > > > Date: Mon, 6 Sep 2010 10:10:25 +0500 > > Subject: How to enable Unicode Support in Solr > > From: am...@techarete.com > > > To: solr-user@lucene.apache.org > > > > I have an index that t

Re: How to enable Unicode Support in Solr

2010-09-06 Thread Peter Karich
Hi, Solr is only able to handle unicode (UTF-8). Make really sure that you push it into the index in the correct encoding. See my (accepted ;-)) answer: http://stackoverflow.com/questions/3086367/how-to-view-the-xml-documents-sent-to-solr/3088515#3088515 Regards, Peter. > I have an index that

Re: How to enable Unicode Support in Solr

2010-09-06 Thread Grijesh.singh
Solr supports the UTF-8 character set, so use any of the MappingCharFilterFactory/UnicodeNormalizationFilterFactory/ASCIIFoldingFilterFactory for searching and indexing of that type of data - Grijesh -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-enable-Unicode-Support-