Re: 'Minimum Should Match' on subquery level

2010-05-31 Thread Myron Chelyada
Thanks a lot for reply.

But I've already figured out that nested queries can help me to implement
what I was looking for.

-Myron

2010/5/28 Chris Hostetter 

>
> : I need to use Lucene's `minimum number should match` option of
> : BooleanQuery on Solr.
>
> unfortunately, the Lucene QueryParser doesn't support any way of
> manipulating the minNumberShouldMatch property of BooleanQueries specified
> in that syntax.
>
> I'm not sure of any way to do what you're looking for w/o some custom code
> (either customizing the QueryParser, or writing a QParser that modifies the
> BooleanQueries produced)
>
>
> -Hoss
>
>


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-31 Thread Geert-Jan Brits
NP ;-) .

Just to explain:

With tooltips I meant JS tooltips (not the native web browser tooltips).
Since sliders require JS anyway, presenting additional info in a JS tooltip
on drag doesn't limit the number of people able to view it.

I think this is OK from a usability standpoint, since I don't consider the
'nr of items left' info 100% essential (after all, lots of sites do well
without it at the moment).
Call it graceful degradation ;-)

As for mobile, I never realized that 'hover' is an issue on mobile, but
'on drag' is supported on mobile touch displays...

Moreover, getting a navigationally complex site like kayak.com /
tripadvisor.com to work well on mobile (from a usability perspective) is
pretty much a utopia anyway.
For these types of sites, specialized mobile sites (or apps, as is the case
for the above brands) are the way to go in my opinion.

Geert-Jan


2010/5/28 Mark Bennett 

> Haha!  Important tooltips are now "deprecated" in Web Applications.
>
> This is nothing "official", of course.
>
> But it's being advised to avoid important UI tasks that require cursor
> tracking, mouse-over, hovering, etc. in web applications.
>
> Why?  Many touch-centric mobile devices don't support "hover".  For me, I'm
> used to my laptop, where the touch pad or stylus *is* able to measure the
> pressure.  But the finger-based touch devices generally can't differentiate
> it, I guess.
>
> They *can* tell one gesture from another, but only by looking at the timing
> and shape.  And hapless hover ain't one of them.
>
> With that said, I'm still a fan of Tool Tips in desktop IDE's like Eclipse,
> or even on Web applications when I'm on a desktop.
>
> I guess the point is that, if it's a really important thing, then you need
> to expose it in another way on mobile.
>
> Just passing this on, please don't shoot the messenger.  ;-)
>
> Mark
>
> --
> Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
> Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513
>
>
> On Thu, May 27, 2010 at 2:55 PM, Geert-Jan Brits  wrote:
>
> > Perhaps you could show the 'nr of items left' as a tooltip of sorts when
> > the user actually drags the slider.
> > If the user doesn't drag (or hover over) the slider, 'nr of items left'
> > isn't shown.
> >
> > Moreover, initially a slider doesn't limit the results, so 'nr of items
> > left' shown for the slider would be the same as the overall number of
> > items left (thereby being redundant).
> >
> > I must say I haven't seen this implemented, but it would be rather easy
> > to adapt a slider implementation to show the nr on drag/hover (they
> > exist for jquery, scriptaculous and a bunch of other libs).
> >
> > Geert-Jan
> >
> > 2010/5/27 Lukas Kahwe Smith 
> >
> > >
> > > On 27.05.2010, at 23:32, Geert-Jan Brits wrote:
> > >
> > > > Something like sliders perhaps?
> > > > Of course only numerical ranges can be put into sliders (or a
> > > > concept that may be logically presented as some sort of ordering,
> > > > such as "bad, hmm, good, great").
> > > >
> > > > Use Solr's StatsComponent to show the min and max values.
> > > >
> > > > Have a look at tripadvisor.com for good uses/implementation of
> > > > sliders (price and review score are presented as sliders).
> > > > my 2c: try to make the possible input values discrete (like at
> > > > tripadvisor), which gives a better user experience and limits the
> > > > potential nr of queries (cache-wise advantage)
> > >
> > >
> > > yeah i have been pondering something similar. but i now realized that
> > > this way the user doesn't get an overview of the distribution without
> > > actually applying the filter. that being said, it would be nice to
> > > display 3 numbers with the sliders: the count of items that were
> > > filtered out on the lower and upper boundaries as well as the number
> > > of items still left (*).
> > >
> > > aside from this i just put a little tweak to my faceting online:
> > > http://search.un-informed.org/search?q=malaria&tm=any&s=Search
> > >
> > > if you deselect any of the checkboxes, it updates the counts. however i
> > > display both the count without and with those additional checkbox
> > > filters applied (actually i only display two numbers if they are not
> > > the same):
> > > http://screencast.com/t/MWUzYWZkY2Yt
> > >
> > > regards,
> > > Lukas Kahwe Smith
> > > m...@pooteeweet.org
> > >
> > > (*) if anyone has a slider that can do the above i would love to
> > > integrate that and replace the adoption year checkboxes with that
> >
>


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-31 Thread Gijs Kunze

On 5/28/2010 9:31 PM, Chris Hostetter wrote:

: Perhaps you could show the 'nr of items left' as a tooltip of sorts when the
: user actually drags the slider.

Years ago, when we were first working on building Solr, a coworker of mine
suggested using double bar sliders (ie: pick a range using a min and a
max) for all numeric facets and putting "sparklines" above them to give
the user a visual indication of the "spread" of documents across the
numeric spectrum.

it was a little more complicated than anything we needed -- and seemed
like a real pain in the ass to implement.  i still don't know of anyone
doing anything like that, but it's definitely an interesting idea.

The hard part is really just deciding what "quantum" interval you want
to use along the xaxis to decide how to count the docs for the y axis.
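
As a rough illustration of the "quantum" question, here is a minimal Python
sketch that buckets values into fixed-width intervals and renders the counts
as a text sparkline. The function names, the sample prices and the interval
width of 25 are made up for illustration; nothing here is Solr API.

```python
from collections import Counter

BARS = "▁▂▃▄▅▆▇█"

def bucket_counts(values, start, width, n_buckets):
    """Count values per half-open interval [start + i*width, start + (i+1)*width)."""
    counts = Counter()
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_buckets:
            counts[i] += 1
    return [counts[i] for i in range(n_buckets)]

def sparkline(counts):
    """Map each count to one of eight bar glyphs, scaled to the largest count."""
    top = max(counts) if counts and max(counts) > 0 else 1
    return "".join(BARS[(c * (len(BARS) - 1)) // top] for c in counts)

prices = [5, 12, 30, 31, 33, 40, 60, 61, 99]   # hypothetical document values
counts = bucket_counts(prices, start=0, width=25, n_buckets=4)
print(counts, sparkline(counts))
```

A wider interval smooths the line and a narrower one shows more detail, which
is exactly the trade-off described above.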

http://en.wikipedia.org/wiki/Sparkline
http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001OR


-Hoss

   
I love the idea of a sparkline at range-sliders. I think if I have time, 
I might add them to the range sliders on our site. I already have all 
the data since I show the count for a range while the user is dragging 
by storing the facet counts for each interval in javascript.


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-31 Thread Geert-Jan Brits
Interesting..

say you have a double slider with a discrete range (like tripadvisor et al.)
perhaps it would be a good guideline to use these discrete points for the
quantum interval for the sparkline as well?

Of course it then becomes the question which discrete values to use for the
slider. I tend to follow what tripadvisor does for its price-slider:
set a cap for the max price, and set a fixed interval ($25) for the discrete
steps. (of course there are edge cases like when no product hits the maximum
capped price)

I have also seen non-linear steps implemented, but I guess this doesn't go
well with the notion of sparklines.


Anyway, from an implementation standpoint it would be enough for Solr to
return the 'nr of items' per interval. From that, it would be easy to
calculate on the application-side the 'nr of items' for each possible
slider-combination.
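
The application-side calculation could look roughly like the sketch below:
given one count per interval, a running total answers "how many items lie
between slider stops lo and hi" for any pair. The counts and the function
names are invented for illustration.

```python
from itertools import accumulate

def prefix_sums(interval_counts):
    """prefix[i] = number of items in the first i intervals."""
    return [0] + list(accumulate(interval_counts))

def items_between(prefix, lo, hi):
    """Items in intervals lo..hi-1, i.e. between slider stops lo and hi."""
    return prefix[hi] - prefix[lo]

counts = [4, 0, 7, 3, 1]      # hypothetical per-interval facet counts
prefix = prefix_sums(counts)
print(items_between(prefix, 0, 5))   # whole range
print(items_between(prefix, 2, 4))   # between stops 2 and 4
```

With the running totals precomputed once, every slider movement is a single
subtraction, so no extra request is needed while the user drags.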

getting these values from solr would require (staying with the
price-example) one of:
- a new discretised price field, and doing a facet.field on it.
- the (continuous) price field already present, and doing 50 facet queries
(if you have 50 steps)
- another more elegant way ;-) . Perhaps an addition to StatsComponent that
returns all counts within a discrete (to be specified) step?  Would this
slow the StatsComponent code down a lot, or is the info already (almost)
present in StatsComponent for doing things like calculating stddev / means,
etc?
- something I'm completely missing...
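
For the facet-query option, the queries are mechanical to generate. A minimal
sketch, assuming a field called price, a step of 25 and 50 steps (all
hypothetical values); facet and facet.query are standard Solr parameters.
Note that [a TO b] is inclusive at both ends, so a document priced exactly on
a boundary would be counted in two adjacent intervals.

```python
from urllib.parse import urlencode

def interval_facet_queries(field, start, step, n_steps):
    """One inclusive range query per slider interval."""
    return [
        f"{field}:[{start + i * step} TO {start + (i + 1) * step}]"
        for i in range(n_steps)
    ]

def build_params(q, facet_queries):
    """Request parameters as (name, value) pairs; facet.query repeats."""
    params = [("q", q), ("rows", "0"), ("facet", "true")]
    params += [("facet.query", fq) for fq in facet_queries]
    return params

queries = interval_facet_queries("price", 0, 25, 50)
print(len(queries), queries[0], queries[-1])
print(urlencode(build_params("*:*", queries[:2])))
```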




2010/5/28 Chris Hostetter 

>
> : Perhaps you could show the 'nr of items left' as a tooltip of sorts when
> the
> : user actually drags the slider.
>
> Years ago, when we were first working on building Solr, a coworker of mine
> suggested using double bar sliders (ie: pick a range using a min and a
> max) for all numeric facets and putting "sparklines" above them to give
> the user a visual indication of the "spread" of documents across the
> numeric spectrum.
>
> it was a little more complicated than anything we needed -- and seemed
> like a real pain in the ass to implement.  i still don't know of anyone
> doing anything like that, but it's definitely an interesting idea.
>
> The hard part is really just deciding what "quantum" interval you want
> to use along the xaxis to decide how to count the docs for the y axis.
>
> http://en.wikipedia.org/wiki/Sparkline
> http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001OR
>
>
> -Hoss
>
>


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-31 Thread Geert-Jan Brits
May I ask how you implemented getting the facet counts for each interval? Do
you use a facet-query per interval?
And perhaps for inspiration a link to the site you implemented this ..

Thanks,
Geert-Jan

> I love the idea of a sparkline at range-sliders. I think if I have time, I
> might add them to the range sliders on our site. I already have all the data
> since I show the count for a range while the user is dragging by storing the
> facet counts for each interval in javascript.
>


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-31 Thread Lukas Kahwe Smith

On 31.05.2010, at 11:29, Geert-Jan Brits wrote:

> May I ask how you implemented getting the facet counts for each interval? Do
> you use a facet-query per interval?
> And perhaps for inspiration a link to the site you implemented this ..
> 
> Thanks,
> Geert-Jan
> 
>> I love the idea of a sparkline at range-sliders. I think if I have time, I
>> might add them to the range sliders on our site. I already have all the data
>> since I show the count for a range while the user is dragging by storing the
>> facet counts for each interval in javascript.
>> 


i guess the easiest is to do the intervals at index time, obviously less 
flexible.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org





Re: Luke browser does not show non-String Solr fields?

2010-05-31 Thread Ahmet Arslan

> Solr 1.4
> 
> > You haven't identified the version of Luke you're
> using.
> 
> Luke 1.0.1 (2010-04-01)
> 

I think with Solr you need to use Luke release 0.9.9.1 or 0.9.9,
because Solr 1.4.0 uses Lucene 2.9.1.


  


Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread olivier sallou
Hi,
I have created an index with several fields.
If I query my index in the admin section of solr (or via http request), I
get results for my search if I specify the requested field:
Query:   note:Aspergillus  (look for "Aspergillus" in field "note")
However, if I query the same word against all fields  ("Aspergillus" or
"all:Aspergillus") , I have no match in response from Solr.

Do you have any idea of what can be wrong with my index?

Regards

Olivier


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-31 Thread gwk

On 5/31/2010 11:29 AM, Geert-Jan Brits wrote:

May I ask how you implemented getting the facet counts for each interval? Do
you use a facet-query per interval?
And perhaps for inspiration a link to the site you implemented this ..

Thanks,
Geert-Jan

I love the idea of a sparkline at range-sliders. I think if I have time, I
might add them to the range sliders on our site. I already have all the data
since I show the count for a range while the user is dragging by storing the
facet counts for each interval in javascript.

Hi,

Sorry, seems I pressed send halfway through my mail and forgot about it. 
The site I implemented my numerical range faceting on is 
http://www.mysecondhome.co.uk/search.html and I got the facets by making 
a small patch for Solr (https://issues.apache.org/jira/browse/SOLR-1240) 
which does for numbers what date faceting does for dates.


The biggest issue with range-faceting is the double counting of edges 
(which also happens in date faceting, see 
https://issues.apache.org/jira/browse/SOLR-397). My patch deals with 
that by adding an extra parameter which allows you to specify which end of 
the range query should be exclusive.
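
The double counting is easy to reproduce outside Solr. A small sketch with
made-up prices: when both ends of every interval are inclusive, a value
sitting exactly on a boundary is counted twice, and making one end exclusive
fixes the totals. This only models the counting; it is not the code from the
SOLR-1240 patch.

```python
def count_ranges(values, edges, upper_inclusive):
    """Count values in consecutive ranges defined by edges.

    upper_inclusive=True  models [lo TO hi] for every range;
    upper_inclusive=False models a range whose upper end is exclusive.
    """
    counts = []
    for lo, hi in zip(edges, edges[1:]):
        if upper_inclusive:
            counts.append(sum(1 for v in values if lo <= v <= hi))
        else:
            counts.append(sum(1 for v in values if lo <= v < hi))
    return counts

prices = [100, 500, 500, 750, 999]      # hypothetical values
edges = [0, 500, 1000]
print(count_ranges(prices, edges, upper_inclusive=True))
print(count_ranges(prices, edges, upper_inclusive=False))
```

With inclusive ends the two counts add up to more than the five documents,
because both documents priced at 500 land in both ranges.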


A secondary issue is that you can't do filter queries with one end 
inclusive and one end exclusive (i.e. price:[500 TO 1000}). You can get 
around this by doing "price:({500 TO 1000} OR 500)". I've looked into 
the JavaCC code of Lucene to see if I could fix it so you could mix [] 
and {} but unfortunately I'm not familiar enough with it to get it to work.
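
The workaround can be wrapped in a small helper so callers do not have to
remember it. A sketch, assuming numeric bounds and a hypothetical function
name; it only builds the query string described above.

```python
def half_open_range(field, lo, hi):
    """Build a filter matching lo <= field < hi using only {} ranges.

    {lo TO hi} excludes both ends, so the lower bound is OR-ed back in,
    mirroring the price:({500 TO 1000} OR 500) form from the message.
    """
    return f"{field}:({{{lo} TO {hi}}} OR {lo})"

print(half_open_range("price", 500, 1000))
```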


Regards,

gwk


Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread Michael Kuhlmann
On 31.05.2010 11:50, olivier sallou wrote:
> Hi,
> I have created an index with several fields.
> If I query my index in the admin section of solr (or via http request), I
> get results for my search if I specify the requested field:
> Query:   note:Aspergillus  (look for "Aspergillus" in field "note")
> However, if I query the same word against all fields  ("Aspergillus" or
> "all:Aspergillus") , I have no match in response from Solr.

Querying "Aspergillus" without a field only works if you're using
DisMaxHandler.

Do you have a field "all"?

Try "*:Aspergillus" instead.


Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread Abdelhamid ABID
Check your request handler settings: what do you have in the query fields (qf)
entry?

On 5/31/10, olivier sallou  wrote:
>
> Hi,
> I have created an index with several fields.
> If I query my index in the admin section of solr (or via http request), I
> get results for my search if I specify the requested field:
> Query:   note:Aspergillus  (look for "Aspergillus" in field "note")
> However, if I query the same word against all fields  ("Aspergillus" or
> "all:Aspergillus") , I have no match in response from Solr.
>
> Do you have any idea of what can be wrong with my index?
>
> Regards
>
>
> Olivier
>



-- 
Abdelhamid ABID
Software Engineer- J2EE / WEB


Re: TikaEntityProcessor not working?

2010-05-31 Thread Noble Paul നോബിള്‍ नोब्ळ्
BinFileDataSource will only work with files. Try FieldStreamDataSource.

On Mon, May 31, 2010 at 3:30 AM, Brad Greenlee  wrote:

> Hi. I'm trying to get Solr to index a database in which one column is a
> filename of a PDF document I'd like to index. My configuration looks like
> this:
>
> 
>   url="jdbc:mysql://localhost/document_db" user="user" password="password"
> readOnly="true"/>
>  
>  
>
>   url="/some/path/${document.filename}" dataSource="ds-file" format="text">
>
>  
>
>  
> 
>
> I'm using Solr from trunk (as of two days ago). The import process
> completes without errors, and it picks up the columns from the database, but
> not the content from the PDF file. It is definitely trying to access the PDF
> file, for if I give it an incorrect path name, it complains. It doesn't seem
> to be attempting to index the PDF, though, as it completes in about 40ms,
> whereas if I import the PDF via the ExtractingRequestHandler, it takes about
> 11 seconds to index it.
>
> I've also tried the tika example in example-DIH and that doesn't seem to
> index anything, either. Am I doing something wrong, or is this just not
> working yet?
>
> Cheers,
>
> Brad
>
>


-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread Gijs Kunze

On 5/31/2010 11:50 AM, olivier sallou wrote:

Hi,
I have created an index with several fields.
If I query my index in the admin section of solr (or via http request), I
get results for my search if I specify the requested field:
Query:   note:Aspergillus  (look for "Aspergillus" in field "note")
However, if I query the same word against all fields  ("Aspergillus" or
"all:Aspergillus") , I have no match in response from Solr.

Do you have any idea of what can be wrong with my index?

Regards

Olivier

   
Look for the <defaultSearchField> tag in your schema.xml. The field 
defined in it determines the default field which is searched when no 
explicit field is specified in your query.


Regards,

gwk


AW: strange results with query and hyphened words

2010-05-31 Thread Markus.Rietzler
i am not very sure whether this helps me.

i see the point that there will be problems.

but

the default-config for index is:



and for query:

 


with these settings i don't find "profiauskunft" when searching for
"profi-auskunft" (analyse0.jpg)

if i use "catenateWords="1""


analysis.jsp says that there is a match (analyse1.jpg).


but in our live search "profi-auskunft" won't match "profiauskunft", it only
finds "profi-auskunft".
could anyone please clarify the output of analysis.jsp for me?
why is there a highlight in analysis.jsp but not a match when doing a search,
even from the admin panel?


when i have

profi  auskunft
       profiauskunft

does this mean "profi (auskunft profiauskunft)" will match the word "profi"
followed by "auskunft" or "profiauskunft"?
is this OR the same as i configure with defaultOperator in solrQueryParser-tag?
the "OR"-thing does only apply to the query-part, right? what will that mean in
the index part?


> -Original Message-
> From: Sascha Szott [mailto:sz...@zib.de]
> Sent: Sunday, 30 May 2010 19:01
> To: solr-user@lucene.apache.org
> Subject: Re: strange results with query and hyphened words
>
> Hi Markus,
>
> I was facing the same problem a few days ago and found an explanation in
> the mail archive that clarifies my question regarding the usage of
> Solr's WordDelimiterFilterFactory:
>
> http://markmail.org/message/qoby6kneedtwd42h
>
> Best,
> Sascha
>
> markus.rietz...@rzf.fin-nrw.de wrote:
> > i am wondering why a search term with hyphen doesn't match.
> >
> > my search term is "profi-auskunft". in WordDelimiterFilterFactory i have
> > catenateWords, so my understanding is that profi-auskunft would search
> > for profiauskunft. when i use the analyse panel in solr admin i see that
> > profi-auskunft matches a term "profiauskunft".
> >
> > the analyse will show
> >
> > Query Analyzer
> > WhitespaceTokenizerFactory
> > profi-auskunft
> > SynonymFilterFactory
> > profi-auskunft
> > StopFilterFactory
> > profi-auskunft
> >
> > WordDelimiterFilterFactory
> >
> > term position     1        2
> > term text         profi    auskunft
> >                            profiauskunft
> > term type         word     word
> >                            word
> > source start,end  0,5      6,14
> >                            0,15
> >
> > LowerCaseFilterFactory
> > SnowballPorterFilterFactory
> >
> > why is auskunft and profiauskunft in one column. how do they get
> > searched?
> >
> > when i search "profiauskunft" i have 230 hits, when i now search for
> > "profi-auskunft" i do get less hits. when i call the search with
> > debugQuery=on i see
> >
> > body:"profi (auskunft profiauskunft)"
> >
> > what does this query mean? profi and "auskunft or profiauskunft"?
> >
> >
> >
> >
> >  > positionIncrementGap="100">
> >
> >  
> >  
> >  
> >  
> >  
> >  
> >   >  ignoreCase="true"
> >  words="de/stopwords_de.txt"
> >  enablePositionIncrements="true"
> >  />
> >  
> >   > generateWordParts="1" generateNumberParts="1" catenateWords="1"
> > catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >  
> >   > language="German" protected="de/protwords_de.txt"/>
> >
> >
> >  
> >   > synonyms="de/synonyms_de.txt" ignoreCase="true" expand="true"/>
> >   >  ignoreCase="true"
> >  words="de/stopwords_de.txt"
> >  enablePositionIncrements="true"
> >  />
> >   > generateWordParts="1" generateNumberParts="1" catenateWords="1"
> > catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >  
> >   > language="German" protected="de/protwords_de.txt"/>
> >
> > 
> >
> >
> 
> 


Restricting the values returned by Facet Fields using Filter Query

2010-05-31 Thread Ninad Raut
Hi,

Is it possible to restrict the values returned by facet fields using filter
queries, so that faceting groups only on those documents that pass the
filter query given in the filter criteria?


I am under the assumption that fq is disjoint from facet.field function. Let
me know if my assumptions are right or wrong.

Regards,
Ninad R


Re: AW: XSLT for JSON

2010-05-31 Thread stockii

thx for your help.

the problem is that our app already has a suggest version implemented, and
now i want to use a new "version" of autosuggestion, but the response format
isn't the same, so it's not backward compatible. the client cannot change its
use of the response format ... =(

today i will try it out with velocity and the json.nl parameter. i didn't know
about these options. thx =)


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/XSLT-for-JSON-tp845386p858025.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Restricting the values returned by Facet Fields using Filter Query

2010-05-31 Thread Gijs Kunze

On 5/31/2010 12:01 PM, Ninad Raut wrote:

Hi,

Is it possible to restrict the values returned by Facet Fields using Filter
Queries to Group on only those documents will pass the filter query passed
in filter criteria??


I am under the assumption that fq is disjoint from facet.field function. Let
me know if my assumptions are right or wrong.

Regards,
Ninad R

   

Hi,

Filter queries do restrict the document set used for faceting. In fact 
you have to explicitly turn it off if you don't want it to, by using 
tagging/excluding (see 
http://wiki.apache.org/solr/SimpleFacetParameters#LocalParams_for_faceting).
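
A sketch of what tagging and excluding looks like in request parameters,
following the syntax documented on the wiki page above. The field name
doctype, the tag name dt and the value pdf are placeholders.

```python
from urllib.parse import urlencode

def facet_params(q, field, value, exclude_filter):
    """Build parameters that filter on field:value and facet on the same field.

    With exclude_filter=False the facet counts are restricted by the fq
    (the default behaviour). With exclude_filter=True the fq is tagged and
    the facet excludes that tag, so the counts ignore the filter.
    """
    if exclude_filter:
        fq = f"{{!tag=dt}}{field}:{value}"
        facet_field = f"{{!ex=dt}}{field}"
    else:
        fq = f"{field}:{value}"
        facet_field = field
    return [("q", q), ("fq", fq), ("facet", "true"), ("facet.field", facet_field)]

print(facet_params("*:*", "doctype", "pdf", exclude_filter=False))
print(facet_params("*:*", "doctype", "pdf", exclude_filter=True))
print(urlencode(facet_params("*:*", "doctype", "pdf", exclude_filter=True)))
```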


Regards,
gwk


Re: strange results with query and hyphened words

2010-05-31 Thread Sascha Szott

Hi Markus,


the default-config for index is:



and for query:



That's not true. The default configuration for query-time processing is:



By using this setting, a search for "profi-auskunft" will match 
"profiauskunft".


It's important to note that WordDelimiterFilterFactory's catenate* 
parameters should only be used in the index-time analysis stack. 
Otherwise the strange behaviour you mentioned (a search for profi-auskunft 
is translated into "profi followed by (auskunft or profiauskunft)") will 
occur.
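
To make the effect concrete, here is a deliberately simplified model of how
the phrase query body:"profi (auskunft profiauskunft)" is matched against
token positions. It ignores lowercasing and stemming, and the token positions
are hand-written from the analysis output quoted in this thread rather than
produced by Solr.

```python
def phrase_matches(doc_tokens, query_positions):
    """doc_tokens: list of (position, term) pairs from index-time analysis.
    query_positions: list of sets; slot i lists the terms accepted at
    offset i of the phrase. Returns True if some start position fits.
    """
    by_pos = {}
    for pos, term in doc_tokens:
        by_pos.setdefault(pos, set()).add(term)
    for start in by_pos:
        if all(by_pos.get(start + i, set()) & accepted
               for i, accepted in enumerate(query_positions)):
            return True
    return False

# Query "profi-auskunft" analyzed with catenateWords=1 at query time.
query = [{"profi"}, {"auskunft", "profiauskunft"}]

# Document text "profiauskunft": a single token, nothing to split.
doc_joined = [(1, "profiauskunft")]
# Document text "profi-auskunft" indexed with catenateWords=1.
doc_hyphen = [(1, "profi"), (2, "auskunft"), (2, "profiauskunft")]

print(phrase_matches(doc_joined, query))
print(phrase_matches(doc_hyphen, query))
```

In this model the phrase still demands "profi" in the first slot, which a
document containing only the single token "profiauskunft" cannot supply, so
only the hyphenated document matches.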


Best,
Sascha


-Original Message-
From: Sascha Szott [mailto:sz...@zib.de]
Sent: Sunday, 30 May 2010 19:01
To: solr-user@lucene.apache.org
Subject: Re: strange results with query and hyphened words

Hi Markus,

I was facing the same problem a few days ago and found an explanation in
the mail archive that clarifies my question regarding the usage of
Solr's WordDelimiterFilterFactory:

http://markmail.org/message/qoby6kneedtwd42h

Best,
Sascha

markus.rietz...@rzf.fin-nrw.de wrote:

i am wondering why a search term with hyphen doesn't match.

my search term is "profi-auskunft". in WordDelimiterFilterFactory i have
catenateWords, so my understanding is that profi-auskunft would search
for profiauskunft. when i use the analyse panel in solr admin i see that
profi-auskunft matches a term "profiauskunft".

the analyse will show

Query Analyzer
WhitespaceTokenizerFactory
profi-auskunft
SynonymFilterFactory
profi-auskunft
StopFilterFactory
profi-auskunft

WordDelimiterFilterFactory

term position     1        2
term text         profi    auskunft
                           profiauskunft
term type         word     word
                           word
source start,end  0,5      6,14
                           0,15

LowerCaseFilterFactory
SnowballPorterFilterFactory

why is auskunft and profiauskunft in one column. how do they get
searched?

when i search "profiauskunft" i have 230 hits, when i now search for
"profi-auskunft" i do get less hits. when i call the search with
debugQuery=on i see

body:"profi (auskunft profiauskunft)"

what does this query mean? profi and "auskunft or profiauskunft"?


Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread olivier sallou
Ok,
I use the default, i.e. the standard request handler.
Using "*:Aspergillus" does not work either.

I can try with DisMax but this means that I know all field names. My schema
knows a number of them, but some other fields are defined via dynamic fields
(I know the type, but I do not know their names).
Is there any way to query all fields including dynamic ones?

thanks

Olivier

2010/5/31 Michael Kuhlmann 

> On 31.05.2010 11:50, olivier sallou wrote:
> > Hi,
> > I have created an index with several fields.
> > If I query my index in the admin section of solr (or via http request), I
> > get results for my search if I specify the requested field:
> > Query:   note:Aspergillus  (look for "Aspergillus" in field "note")
> > However, if I query the same word against all fields  ("Aspergillus" or
> > "all:Aspergillus") , I have no match in response from Solr.
>
> Querying "Aspergillus" without a field does only work if you're using
> DisMaxHandler.
>
> Do you have a field "all"?
>
> Try "*:Aspergillus" instead.
>


Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread Michael Kuhlmann
On 31.05.2010 12:36, olivier sallou wrote:
> Is there any way to query all fields including dynamic ones?

Yes, using the *:term query. (Please note that the asterisk should not
be quoted.)

To answer your question, we need more details on your Solr
configuration, esp. the part of schema.xml that defines your "note" field.

Greetings,
Michael




Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread olivier sallou
I finally got a solution. As I use dynamic fields, I use copyField to copy
them to a global indexed field, and specify this field as defaultSearchField
in my schema.

The *:term with "standard" query type fails without this...

This solution doubles the amount of indexed data but works in all
cases...

In my schema I have:

Some other fields are "lowercase" or "int" types.
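
The schema fragment itself did not survive in the archive, but the general
shape of such a setup can be sketched. The snippet below generates the two
relevant schema.xml elements; the destination field name all is a guess based
on the all:Aspergillus query earlier in the thread, the source pattern * is an
assumption, and the destination field would still need its own indexed,
multiValued field definition.

```python
import xml.etree.ElementTree as ET

def catch_all_elements(dest, source_pattern="*"):
    """Return the copyField and defaultSearchField elements as XML strings."""
    copy = ET.Element("copyField", {"source": source_pattern, "dest": dest})
    default = ET.Element("defaultSearchField")
    default.text = dest
    return [ET.tostring(e, encoding="unicode") for e in (copy, default)]

for line in catch_all_elements("all"):
    print(line)
```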

Regards

2010/5/31 Michael Kuhlmann 

> On 31.05.2010 12:36, olivier sallou wrote:
> > Is there any way to query all fields including dynamic ones?
>
> Yes, using the *:term query. (Please note that the asterisk should not
> be quoted.)
>
> To answer your question, we need more details on your Solr
> configuration, esp. the part of schema.xml that defines your "note" field.
>
> Greetings,
> Michael
>
>
>


Re: strange results with query and hyphened words

2010-05-31 Thread Sascha Szott
Sorry Markus, I mixed up the index and query field in analysis.jsp. In 
fact, I meant that a search for profiauskunft matches profi-auskunft.


I'm not sure whether the case you are dealing with (search for 
profi-auskunft should match profiauskunft) is appropriately addressed by 
the WordDelimiterFilter. What about using the PatternReplaceCharFilter 
at query time to eliminate all intra-word hyphens?
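
The kind of pattern involved can be tried out in isolation. This is a plain
Python regex sketch of "remove hyphens that sit between two word characters";
it is not a Solr configuration, and the exact pattern to use in a char filter
is left as an assumption.

```python
import re

# A hyphen with a word character on both sides.
INTRA_WORD_HYPHEN = re.compile(r"(?<=\w)-(?=\w)")

def strip_intra_word_hyphens(text):
    """Join hyphenated words; leave leading, trailing and spaced hyphens alone."""
    return INTRA_WORD_HYPHEN.sub("", text)

print(strip_intra_word_hyphens("profi-auskunft"))
print(strip_intra_word_hyphens("a - b -c d-"))
```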


-Sascha

Sascha Szott wrote:

Hi Markus,


the default-config for index is:



and for query:



That's not true. The default configuration for query-time processing is:



By using this setting, a search for "profi-auskunft" will match
"profiauskunft".

It's important to note that WordDelimiterFilterFactory's catenate*
parameters should only be used in the index-time analysis stack.
Otherwise the strange behaviour you mentioned (a search for profi-auskunft
is translated into "profi followed by (auskunft or profiauskunft)") will
occur.

Best,
Sascha


-Original Message-
From: Sascha Szott [mailto:sz...@zib.de]
Sent: Sunday, 30 May 2010 19:01
To: solr-user@lucene.apache.org
Subject: Re: strange results with query and hyphened words

Hi Markus,

I was facing the same problem a few days ago and found an explanation in
the mail archive that clarifies my question regarding the usage of
Solr's WordDelimiterFilterFactory:

http://markmail.org/message/qoby6kneedtwd42h

Best,
Sascha

markus.rietz...@rzf.fin-nrw.de wrote:

i am wondering why a search term with hyphen doesn't match.

my search term is "profi-auskunft". in WordDelimiterFilterFactory i have
catenateWords, so my understanding is that profi-auskunft would search
for profiauskunft. when i use the analyse panel in solr admin i see that
profi-auskunft matches a term "profiauskunft".

the analyse will show

Query Analyzer
WhitespaceTokenizerFactory
profi-auskunft
SynonymFilterFactory
profi-auskunft
StopFilterFactory
profi-auskunft

WordDelimiterFilterFactory

term position     1        2
term text         profi    auskunft
                           profiauskunft
term type         word     word
                           word
source start,end  0,5      6,14
                           0,15

LowerCaseFilterFactory
SnowballPorterFilterFactory

why is auskunft and profiauskunft in one column. how do they get
searched?

when i search "profiauskunft" i have 230 hits, when i now search for
"profi-auskunft" i do get less hits. when i call the search with
debugQuery=on i see

body:"profi (auskunft profiauskunft)"

what does this query mean? profi and "auskunft or profiauskunft"?


Question about specifying the query analysis at query time

2010-05-31 Thread Marc Sturlese
Hey there,
I am facing a problem related to query analysis and stopwords. I have some
ideas on how to sort it out but would like to do it in the cleanest way
possible.
I am using dismax and I query 3 fields. These fields are defined as
"text" this way:

stopword.txt has the same words in the index and query analyzers. The thing
is, in some search requests (not all of them) I want to add some extra
stopwords (at query time). The 3 fields would have the same extra stopwords.
I want these extra stopwords to be indexed, but I want some
searches never to find these words.
All documents would be indexed with the same analyzer, but I want a different
one at search time depending on defined criteria. Before executing the
query I already know which one is needed.
What would be the best way to do this?
Thanks in advance


How bad is stopping Solr with SIGKILL?

2010-05-31 Thread Andrew Clegg

Hi folks,

I had a Solr instance (in Jetty on Linux) taken down by a process monitoring
tool (God) with a SIGKILL recently.

How bad is this? Can it cause index corruption if it's in the middle of
indexing something? Or will it just lose uncommitted changes? What if the
signal arrives in the middle of the commit process?

Unfortunately I can't tell exactly what it was doing at the time as
someone's deleted the logfile :-(

Thanks,

Andrew.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/How-bad-is-stopping-Solr-with-SIGKILL-tp858119p858119.html
Sent from the Solr - User mailing list archive at Nabble.com.


AW: strange results with query and hyphened words

2010-05-31 Thread Markus.Rietzler
> 
> Sorry Markus, I mixed up the index and query field in 
> analysis.jsp. In 
> fact, I meant that a search for profiauskunft matches profi-auskunft.
> 
> I'm not sure, whether the case you are dealing with (search for 
> profi-auskunft should match profiauskunft) is appropriately 
> addressed by the WordDelimiterFilter. 

ok, seems like this is the point.

> What about using the PatternReplaceCharFilter 
> at query time to eliminate all intra-word hyphens?
> 

ok, would be a way. 
i thought that catenateWords would help at this point, but it doesn't. 
i wonder then what's the difference between a pattern replacement and
catenateWords.

markus


AW: strange results with query and hyphened words

2010-05-31 Thread Markus.Rietzler

> I'm not sure, whether the case you are dealing with (search for 
> profi-auskunft should match profiauskunft) is appropriately 
> addressed by 
> the WordDelimiterFilter. What about using the 
> PatternReplaceCharFilter 
> at query time to eliminate all intra-word hyphens?
> 
Maybe it would be best to have Solr search for

"profi-auskunft" or "profiauskunft"

if I have "profi-auskunft" as the query. Maybe it is not a good idea
to remove the hyphen at all.
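
One way to try the "search for both forms" idea is to expand the term on the client before it reaches Solr. The following is only a sketch under assumptions: the class and method names are made up for illustration, the result is meant to be used as a plain Lucene-syntax query string, and quotes inside the term are not escaped.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical client-side helper, not part of Solr or Lucene.
final class HyphenQueryExpander {

    // Builds a query that matches the term as typed OR with its
    // intra-word hyphens removed.
    static String expand(String term) {
        Set<String> variants = new LinkedHashSet<>();
        variants.add(term);
        // Drop only hyphens that sit between two letters or digits.
        variants.add(term.replaceAll("(?<=[\\p{L}\\p{N}])-(?=[\\p{L}\\p{N}])", ""));

        StringBuilder query = new StringBuilder();
        for (String variant : variants) {
            if (query.length() > 0) {
                query.append(" OR ");
            }
            query.append('"').append(variant).append('"');
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("profi-auskunft"));
    }
}
```

A term without hyphens yields a single quoted phrase, so the helper can be applied to every query term without special-casing.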

markus


Re: Restricting the values returned by Facet Fields using Filter Query

2010-05-31 Thread Erik Hatcher

Maybe what you're looking for is facet.mincount=1 ?

Erik

On May 31, 2010, at 6:01 AM, Ninad Raut wrote:


> Hi,
>
> Is it possible to restrict the values returned by Facet Fields using Filter
> Queries to Group on only those documents will pass the filter query passed
> in filter criteria??
>
> I am under the assumption that fq is disjoint from facet.field function. Let
> me know if my assumptions are right or wrong.
>
> Regards,
> Ninad R




Re: TikaEntityProcessor not working?

2010-05-31 Thread Brad Greenlee

It is a file. Only the filename is stored in the database.

Brad


On May 31, 2010, at 2:59 AM, Noble Paul നോബിള്‍  नोब्ळ् wrote:



> BinFileDataSource  will only work with file, Try FieldStreamDataSource
>
> On Mon, May 31, 2010 at 3:30 AM, Brad Greenlee wrote:
>> Hi. I'm trying to get Solr to index a database in which one column is a
>> filename of a PDF document I'd like to index. My configuration looks like
>> this:
>>
>> url="jdbc:mysql://localhost/document_db" user="user" password="password"
>> readOnly="true"/>
>>
>> url="/some/path/${document.filename}" dataSource="ds-file" format="text">
>>
>> I'm using Solr from trunk (as of two days ago). The import process
>> completes without errors, and it picks up the columns from the database, but
>> not the content from the PDF file. It is definitely trying to access the PDF
>> file, for if I give it an incorrect path name, it complains. It doesn't seem
>> to be attempting to index the PDF, though, as it completes in about 40ms,
>> whereas if I import the PDF via the ExtractingRequestHandler, it takes about
>> 11 seconds to index it.
>>
>> I've also tried the tika example in example-DIH and that doesn't seem to
>> index anything, either. Am I doing something wrong, or is this just not
>> working yet?
>>
>> Cheers,
>>
>> Brad
>
> --
> Noble Paul | Systems Architect| AOL | http://aol.com

Re: Restricting the values returned by Facet Fields using Filter Query

2010-05-31 Thread Ninad Raut
Hi,
I tried a small POC and found that filter queries do restrict the
document set used for the "Group By" on a facet field.

This will help me restrict documents with some other filters when I am
grouping on the multi-valued "Buzz" field.
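
For anyone who wants to repeat the POC, a request that combines a filter query with faceting can be assembled roughly as below. This is a sketch under assumptions: the class name and the filter "source:blog" are invented for illustration; only the "Buzz" field and the fq, facet.field and facet.mincount parameters come from this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds the query-string part of a select request.
final class FacetRequest {

    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Facet counts are computed over the documents that match q AND fq.
    static String build(String filterQuery, String facetField) {
        return "q=" + encode("*:*")
                + "&fq=" + encode(filterQuery)
                + "&facet=true"
                + "&facet.field=" + encode(facetField)
                + "&facet.mincount=1";
    }

    public static void main(String[] args) {
        System.out.println(build("source:blog", "Buzz"));
    }
}
```

With facet.mincount=1, facet values that occur only in documents excluded by the filter are left out of the response instead of being listed with a count of 0.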

Thanks Gijs and Erik.

Regards,
Ninad R



On 5/31/10, Erik Hatcher  wrote:
> Maybe what you're looking for is facet.mincount=1 ?
>
>   Erik
>
> On May 31, 2010, at 6:01 AM, Ninad Raut wrote:
>
>> Hi,
>>
>> Is it possible to restrict the values returned by Facet Fields using
>> Filter
>> Queries to Group on only those documents will pass the filter query
>> passed
>> in filter criteria??
>>
>>
>> I am under the assumption that fq is disjoint from facet.field
>> function. Let
>> me know if my assumptions are right or wrong.
>>
>> Regards,
>> Ninad R
>
>


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-31 Thread gwk

On 5/31/2010 11:50 AM, gwk wrote:

On 5/31/2010 11:29 AM, Geert-Jan Brits wrote:
May I ask how you implemented getting the facet counts for each interval? Do
you use a facet-query per interval?
And perhaps for inspiration a link to the site you implemented this ..

Thanks,
Geert-Jan

I love the idea of a sparkline at range-sliders. I think if I have time, I
might add them to the range sliders on our site. I already have all the data
since I show the count for a range while the user is dragging by storing the
facet counts for each interval in javascript.


Hi,

Sorry, seems I pressed send halfway through my mail and forgot about 
it. The site I implemented my numerical range faceting on is 
http://www.mysecondhome.co.uk/search.html and I got the facets by 
making a small patch for Solr 
(https://issues.apache.org/jira/browse/SOLR-1240) which does the same 
thing for numbers that date faceting does for dates.

The biggest issue with range faceting is the double counting of edges 
(which also happens in date faceting, see 
https://issues.apache.org/jira/browse/SOLR-397). My patch deals with 
that by adding an extra parameter which allows you to specify which end 
of the range query should be exclusive.

A secondary issue is that you can't do filter queries with one end 
inclusive and one end exclusive (e.g. price:[500 TO 1000}). You can 
get around this by doing "price:({500 TO 1000} OR 500)". I've looked 
into the JavaCC code of Lucene to see if I could fix it so you could 
mix [] and {} but unfortunately I'm not familiar enough with it to get 
it to work.
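
The workaround above is easy to wrap in a small client-side helper so the filter string is built in one place. A minimal sketch, assuming a numeric field and a made-up class name; it only produces the string quoted above and does not touch the query parser.

```java
// Hypothetical helper: builds a filter that includes the lower bound and
// excludes the upper bound, using the "exclusive range OR lower value" trick.
final class HalfOpenRange {

    static String lowerInclusive(String field, long lower, long upper) {
        // {lower TO upper} excludes both ends; "OR lower" adds the lower end back.
        return field + ":({" + lower + " TO " + upper + "} OR " + lower + ")";
    }

    public static void main(String[] args) {
        System.out.println(lowerInclusive("price", 500, 1000));
    }
}
```

Adjacent slider intervals built this way (500 to 1000, 1000 to 1500, and so on) do not count a document on the shared edge twice.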


Regards,

gwk


Hi,

I was supposed to work on something else but I just couldn't resist, and 
just implemented some bar-graphs for the range sliders and I really like 
it. In my case it was really easy, all the data was already right there 
in javascript so it's not causing additional server side load. It's also 
really nice to see the graph updating when a facet is selected/changed.


Regards,

gwk



Re: Sites with Innovative Presentation of Tags and Facets

2010-05-31 Thread gwk

On 5/31/2010 4:24 PM, gwk wrote:

On 5/31/2010 11:50 AM, gwk wrote:

On 5/31/2010 11:29 AM, Geert-Jan Brits wrote:
May I ask how you implemented getting the facet counts for each interval? Do
you use a facet-query per interval?
And perhaps for inspiration a link to the site you implemented this ..

Thanks,
Geert-Jan

I love the idea of a sparkline at range-sliders. I think if I have time, I
might add them to the range sliders on our site. I already have all the data
since I show the count for a range while the user is dragging by storing the
facet counts for each interval in javascript.


Hi,

Sorry, seems I pressed send halfway through my mail and forgot about 
it. The site I implemented my numerical range faceting on is 
http://www.mysecondhome.co.uk/search.html and I got the facets by 
making a small patch for Solr 
(https://issues.apache.org/jira/browse/SOLR-1240) which does the same 
thing for numbers that date faceting does for dates.

The biggest issue with range faceting is the double counting of edges 
(which also happens in date faceting, see 
https://issues.apache.org/jira/browse/SOLR-397). My patch deals with 
that by adding an extra parameter which allows you to specify which end 
of the range query should be exclusive.

A secondary issue is that you can't do filter queries with one end 
inclusive and one end exclusive (e.g. price:[500 TO 1000}). You can 
get around this by doing "price:({500 TO 1000} OR 500)". I've looked 
into the JavaCC code of Lucene to see if I could fix it so you could 
mix [] and {} but unfortunately I'm not familiar enough with it to 
get it to work.


Regards,

gwk


Hi,

I was supposed to work on something else but I just couldn't resist, 
and just implemented some bar-graphs for the range sliders and I 
really like it. In my case it was really easy, all the data was 
already right there in javascript so it's not causing additional 
server side load. It's also really nice to see the graph updating when 
a facet is selected/changed.


Regards,

gwk

(Tried attaching an image, but it didn't work, so here it is: 
http://img249.imageshack.us/img249/7766/faceting.png)


deleteDocByID

2010-05-31 Thread stockii

Hello.

I have a little problem synchronizing my index with my database.

We have an extra table for delta-import; we cannot use a modified field =(

In this delta table, all the IDs that need updating are saved. After a
delta-import this table should be cleared. This works fine.

But when an item is deleted, I save this in my delta table with a flag
"is_deleted".

How can I add $deleteDocById to the row? I use a script for doc boosting and
I thought I could solve this problem the same way, but it doesn't want to
work =(
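This is my script:
function DeleteDoc(row) {
    var is_deleted = row.get('is_deleted');
    var id = row.get('update_id');

    // Flag the document for deletion using the id value itself,
    // not the literal string 'id'.
    if (is_deleted == true) {
        row.put('$deleteDocById', id);
    }

    // A DIH script transformer function has to return the row.
    return row;
}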



Can I use this script in my normal delta-import, or should I create a new
entity?

What is your solution? What do you think is the smartest way to delete these
docs from the index?

thx =) =) =)

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/deleteDocByID-tp858903p858903.html


Re: Solr 1.4 query fails against all fields, but succeed if field is specified.

2010-05-31 Thread Chris Hostetter

: > Is there any way to query all fields including dynamic ones?
: 
: Yes, using the *:term query. (Please note that the asterisk should not
: be quoted.)

uh ...no, completely incorrect.  you can not use "*" to denote 'all 
fields' in that way.

there is no syntax for "find this term in any field" ... every query 
involves a field of some kind.  if you want to be able to query against 
the text of "all" fields you need to use copyField to create some kind of 
"all" or "allText" field (you can name it whatever you want)



-Hoss



Re: deleteDocByID

2010-05-31 Thread stockii

oh sry, i solved it with this entity:
 

;) thx nabble
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/deleteDocByID-tp858903p858951.html


Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-31 Thread iorixxx

http://lucene.472066.n3.nabble.com/file/n859016/qa.writepublic.com.xml
qa.writepublic.com.xml 

I modified your schema.xml.

You need to restart Jetty and re-index your documents.

After that, in the Solr admin page, if you search for

prefix_full:"george clo" prefix_token:(george clo)

you will get the documents to use for suggestions.

After trying this, can you tell us if this is what you were looking for?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Does-SOLR-Allow-q-A-or-B-AND-C-or-D-tp849703p859016.html


Re: Luke browser does not show non-String Solr fields?

2010-05-31 Thread jlist9
Thanks for the suggestion. I tried 0.9.9.1 but saw the same problem.
I didn't see 0.9.9 on their download page.

On Mon, May 31, 2010 at 2:39 AM, Ahmet Arslan  wrote:
>
>> Solr 1.4
>>
>> > You haven't identified the version of Luke you're
>> using.
>>
>> Luke 1.0.1 (2010-04-01)
>>
>
> I think with solr you need to use Release 0.9.9.1 or  0.9.9
> Because solr 1.4.0 uses lucene 2.9.1
>
>
>
>


Re: Luke browser does not show non-String Solr fields?

2010-05-31 Thread Ahmet Arslan
> Thanks for the suggestion. I tried
> 0.9.9.1 but saw the same problem.
> I didn't see 0.9.9 on their download page.

http://www.getopt.org/luke/ has the 0.9.9 version. But that may not be the issue.

I suspect that trie-based fields are causing this, because they index each value at 
various levels of precision.

Do you have problems with types other than the trie-based ones (tint, tdate, etc.)?



  


Re: Luke browser does not show non-String Solr fields?

2010-05-31 Thread Chris Hostetter

: 1. Queries like "id:123" which work fine in /solr/admin web interface but
: returns nothing in Luke. Query "*:*" returns all records fine in Luke. I
: expect Luke returns the same result as /solr/admin since it's essentially
: a Lucene query?

you haven't told us what fieldtype you are using for the "id" field -- but 
i'm going to go out on a limb and guess it's TrieIntFieldType (or possibly 
a SortedIntFieldType) ... those field types encode their values in such a 
way that they sort lexicographically and produce faster range queries -- if 
Luke doesn't know about that special encoding, it can't search on them (or 
even display the terms properly)

Luke has a "view terms" feature right? ... look at the raw terms in your 
"id" field and i bet you'll see they look nothing like numbers -- and 
that's why you can't search on them as numbers in Luke

(when you search on them in Solr, Solr knows about your schema, and knows 
about your field types, and can do the proper encoding/decoding)
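
To make the point about encoded terms concrete, here is a toy sortable encoding for long values. It is explicitly NOT the format Lucene or Solr writes to the index (the class name and the hex layout are assumptions for illustration only); it just shows why a numeric value stored for lexicographic sorting stops looking like the number that was typed.

```java
// Toy illustration of an order-preserving term encoding for longs.
final class SortableLongDemo {

    static String encode(long value) {
        // Flipping the sign bit makes negative values sort before positive
        // ones; 16 zero-padded hex digits make string order match numeric order.
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        System.out.println("123 -> " + encode(123));
        System.out.println("plain strings, 9 before 10? " + ("9".compareTo("10") < 0));
        System.out.println("encoded terms, 9 before 10? " + (encode(9).compareTo(encode(10)) < 0));
    }
}
```

A tool that searches for the literal term "123" against terms encoded like this finds nothing, which is the behaviour described above for Luke.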



-Hoss



newbie question on how to batch commit documents

2010-05-31 Thread Steve Kuo
I have a newbie question on what is the best way to batch add/commit a large
collection of document data via solrj.  My first attempt was to write a
multi-threaded application that did the following.

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
for (Widget w : widges) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", w.getId());
    doc.addField("name", w.getName());
    doc.addField("price", w.getPrice());
    doc.addField("category", w.getCat());
    doc.addField("srcType", w.getSrcType());
    docs.add(doc);

    // commit docs to solr server
    server.add(docs);
    server.commit();
}

And I got this exception.

org.apache.solr.common.SolrException:
Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)

The solrj wiki/documents seemed to indicate that this happens because multiple
threads were calling SolrServer.commit(), which in turn called
CommonsHttpSolrServer.request(), resulting in multiple searchers.  My first
thought was to change the configs for autowarming.  But after looking at the
autowarm params, I am not sure what can be changed, or whether a different
approach is recommended.







Your help is much appreciated.


Re: Luke browser does not show non-String Solr fields?

2010-05-31 Thread Trey Grainger
I submitted a patch a few months back for a Solr Document Inspector which
allows one to see the indexed values for any document in a Solr index (
https://issues.apache.org/jira/browse/SOLR-1837). This is more or less a
port of Luke's DocumentReconstructor into Solr, but the tool additionally
has access to all the solr schema/field type information for display
purposes (i.e. Trie Fields are human-readable).

This won't help you search for values in an index or inspect anything at a
macro level (i.e. term counts across the index), but there are other tools
in Solr for that.  Given a UniqueID, however, you can view all the indexed
values for each field in that particular document.  You can always do a
search within Solr for the values you are looking for and then use this tool
to view the indexed values for any documents which match.

This may or may not help you (I can't tell what problem you are trying to
solve), but I thought it would be worth mentioning as one tool in your
toolbox.

-Trey



NPE error when extending DefaultSolrHighlighter

2010-05-31 Thread Gerald

I was looking at solr-386 and thought I would try to create a custom
highlighter for something I was doing. 

I created a class that looks something like this: 

public class CustomOutputHighlighter extends DefaultSolrHighlighter {
    @Override
    public NamedList doHighlighting(DocList docs, Query query,
            SolrQueryRequest req, String[] defaultFields) throws IOException {
        // Delegate to the default implementation first.
        NamedList highlightedValues =
                super.doHighlighting(docs, query, req, defaultFields);

        // do more stuff here

        return highlightedValues;
    }
}

and have replaced the  line in my solrconfig xml so that it
looks something like this: 

 

and left all the existing default highlighting parameters as-is 

The code compiles with no problem, and should simply perform the normal
highlighting (since all I am doing is calling the original doHighlighting
code and returning the results).  However, when I start Solr, I get an NPE
error: 

java.lang.NullPointerException
    at org.apache.solr.highlight.DefaultSolrHighlighter.init(DefaultSolrHighlighter.java:75)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:437)
    at org.apache.solr.core.SolrCore.initHighLighter(SolrCore.java:612)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:558)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
    at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
    at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
    at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
    at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
    at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
    at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
    at org.mortbay.jetty.Server.doStart(Server.java:210)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.mortbay.start.Main.invokeMain(Main.java:183)
    at org.mortbay.start.Main.start(Main.java:497)
    at org.mortbay.start.Main.main(Main.java:115)

It doesn't seem to even call my custom highlighter (I put a breakpoint in
which did not get hit).  Any ideas re what I am doing wrong?? 

If I use the default highlighter, I don't get this error and have no
problems 

I am using a copy of 1.5.0-dev solr ($Id: CHANGES.txt 906924 2010-02-05
12:43:11Z noble $) 

thanks for any advice 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/NPE-error-when-extending-DefaultSolrHighlighter-tp859670p859670.html


Re: newbie question on how to batch commit documents

2010-05-31 Thread Erik Hatcher
Move the commit outside your loop and you'll be in better shape.   
Better yet, enable autocommit in solrconfig.xml and don't commit from  
your multithreaded client, otherwise you still run the risk of too  
many commits happening concurrently.
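
The shape of "add in batches, commit once" can be sketched as below. To keep the example runnable without a Solr server, IndexClient and CountingClient are hypothetical stand-ins for the SolrJ server object, and documents are plain maps; in the real client the same two calls would be add(...) and commit() on the server instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the SolrJ server object used in this thread.
interface IndexClient {
    void add(List<Map<String, Object>> docs);
    void commit();
}

// Test double that only counts calls.
final class CountingClient implements IndexClient {
    int added;
    int commits;

    public void add(List<Map<String, Object>> docs) { added += docs.size(); }
    public void commit() { commits++; }
}

final class BatchIndexer {

    // Sends documents in batches of batchSize and commits exactly once.
    static void indexAll(IndexClient client, List<Map<String, Object>> docs, int batchSize) {
        List<Map<String, Object>> batch = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            batch.add(doc);
            if (batch.size() == batchSize) {
                client.add(new ArrayList<>(batch)); // send a full batch
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            client.add(new ArrayList<>(batch));     // send the remainder
        }
        client.commit();                            // one commit, after the loop
    }
}
```

With several indexing threads, each thread would still issue its own final commit, which is why leaving commits to autocommit on the server side is the safer option suggested above.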


Erik

On May 31, 2010, at 5:27 PM, Steve Kuo wrote:

> I have a newbie question on what is the best way to batch add/commit a large
> collection of document data via solrj.  My first attempt  was to write a
> multi-threaded application that did following.
>
> Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
> for (Widget w : widges) {
>     doc.addField("id", w.getId());
>     doc.addField("name", w.getName());
>     doc.addField("price", w.getPrice());
>     doc.addField("category", w.getCat());
>     doc.addField("srcType", w.getSrcType());
>     docs.add(doc);
>
>     // commit docs to solr server
>     server.add(docs);
>     server.commit();
> }
>
> And I got this exception.
>
> org.apache.solr.common.SolrException:
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>     at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)
>
> The solrj wiki/documents seemed to indicate that because multiple threads
> were calling SolrServer.commit() which in term called
> CommonsHttpSolrServer.request() resulting in multiple searchers.  My first
> thought was to change the configs for autowarming.  But after looking at the
> autowarm params, I am not sure what can be changed or perhaps a different
> approach is recommened.
>
> Your help is much appreciated.




Re: Storing different entities in Solr

2010-05-31 Thread Moazzam Khan
Thanks for the replies guys. I am not at work so I don't have the
exact schema but here's what it roughly looks like:


Request:
==
id
client_id
pm_id
pm2_id
title



Advisor:
==
id
person_id
address_id
bio
sector (IT, doctor, etc)



There's another table.

RequestAdvisor:
=
id
advisor_id
request_id

The idea of adding a prefix to primary keys does sound good. I can
just do advisor_123 and request_12345.
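
A small helper keeps the prefixing consistent between indexing and reading results back. This is a sketch with assumed names (TypedId and its methods are invented here); it presumes the type name itself is the part before the last underscore and that the database id is numeric.

```java
// Hypothetical helper for ids such as advisor_123 and request_12345.
final class TypedId {

    // Combines an entity type and a database id into one unique key.
    static String compose(String type, long dbId) {
        return type + "_" + dbId;
    }

    // Entity type stored in the key (text before the last underscore).
    static String typeOf(String key) {
        return key.substring(0, key.lastIndexOf('_'));
    }

    // Original database id stored in the key.
    static long dbIdOf(String key) {
        return Long.parseLong(key.substring(key.lastIndexOf('_') + 1));
    }

    public static void main(String[] args) {
        System.out.println(compose("advisor", 123));
        System.out.println(typeOf("request_12345") + " / " + dbIdOf("request_12345"));
    }
}
```

Splitting on the last underscore means a type name that itself contains an underscore still round-trips correctly.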





On Sun, May 30, 2010 at 9:22 PM, Bill Au  wrote:
> There is only one primary key in a single index.  If the id of your
> different document types do collide, you can simply add a prefix or suffix
> to make them unique.
>
> Bill
>
> On Fri, May 28, 2010 at 1:12 PM, Moazzam Khan  wrote:
>
>> Thanks for all your answers guys. Requests and consultants have a many
>> to many relationship so I can't store request info in a document with
>> advisorID as the primary key.
>>
>> Bill's solution and multicore solutions might be what I am looking
>> for. Bill, will I be able to have 2 primary keys (so I can update and
>> delete documents)? If yes, can you please give me a link or something
>> where I can get more info on this?
>>
>> Thanks,
>> Moazzam
>>
>>
>>
>> On Fri, May 28, 2010 at 11:50 AM, Bill Au  wrote:
>> > You can keep different type of documents in the same index.  If each
>> > document has a type field.  You can restrict your searches to specific
>> > type(s) of document by using a filter query, which is very fast and
>> > efficient.
>> >
>> > Bill
>> >
>> > On Fri, May 28, 2010 at 12:28 PM, Nagelberg, Kallin <
>> > knagelb...@globeandmail.com> wrote:
>> >
>> >> Multi-core is an option, but keep in mind if you go that route you will
>> >> need to do two searches to correlate data between the two.
>> >>
>> >> -Kallin Nagelberg
>> >>
>> >> -Original Message-
>> >> From: Robert Zotter [mailto:robertzot...@gmail.com]
>> >> Sent: Friday, May 28, 2010 12:26 PM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Re: Storing different entities in Solr
>> >>
>> >>
>> >> Sounds like you'll want to use a multiple core setup. One core for each
>> >> type
>> >> of "document"
>> >>
>> >> http://wiki.apache.org/solr/CoreAdmin
>> >> --
>> >> View this message in context:
>> >>
>> http://lucene.472066.n3.nabble.com/Storing-different-entities-in-Solr-tp852299p852346.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> >>
>> >
>>
>


Re: NPE error when extending DefaultSolrHighlighter

2010-05-31 Thread Koji Sekiguchi

(10/06/01 6:45), Gerald wrote:

> I was looking at solr-386 and thought I would try to create a custom
> highlighter for something I was doing.
>
> I created a class that looks something like this:
>
> public class CustomOutputHighlighter extends DefaultSolrHighlighter {
>     @Override
>     public NamedList doHighlighting(DocList docs, Query query,
>             SolrQueryRequest req, String[] defaultFields) throws IOException {
>         NamedList highlightedValues =
>                 super.doHighlighting(docs, query, req, defaultFields);
>
>         // do more stuff here
>
>         return highlightedValues;
>     }
> }
>
> and have replaced the  line in my solrconfig xml so that it
> looks something like this:
>
> and left all the existing default highlighting parameters as-is
>
> The code compiles with no problem, and should simply perform the normal
> highlighting (since all I am doing is calling the original doHighlighting
> code and returning the results).  However, when I start Solr, I get an NPE
> error:

Try adding a constructor that takes a SolrCore argument:

public CustomOutputHighlighter(SolrCore core) {
    super(core);
}

Koji

--
http://www.rondhuit.com/en/



Re: Luke browser does not show non-String Solr fields?

2010-05-31 Thread jlist9
The id field has type "long" in schema.xml. In Luke, they are shown
as "hex dump". When viewing a doc (returned by *:*), I pick the ID field
and press the "Show" button, Luke pops up a dialog that allows me
to change the "Show Content As" value. When I choose "Number",
I get an error message:

"Some values could not be properly represented in this format. They
are marked in grey and presented as a hex dump."

So it seems like Luke does not understand Solr's long type. Is this
not a native Lucene type?

On Mon, May 31, 2010 at 9:52 AM, Chris Hostetter
 wrote:
>
> : 1. Queries like "id:123" which work fine in /solr/admin web interface but
> : returns nothing in Luke. Query "*:*" returns all records fine in Luke. I
> : expect Luke returns the same result as /solr/admin since it's essentially
> : a Lucene query?
>
> you haven't told us what fieldtype you are using for the "id" field -- but
> i'm going to go out on a limb and guess it's TrieIntFieldType (or possibly
> a SortedIntFieldType) ... those field types encode their values in such a
> way that they sort lexicographically and produce faster range queries -- if
> Luke doesn't know about that special encoding, it can't search on them (or
> even display the terms properly)
>
> Luke has a "view terms" feature right? ... look at the raw terms in your
> "id" field and i bet you'll see they look nothing like numbers -- and
> that's why you can't search on them as numbers in Luke
>
> (when you search on them in Solr, Solr knows about your schema, and knows
> about your field types, and can do the proper encoding/decoding)
>
>
>
> -Hoss
>
>


Re: newbie question on how to batch commit documents

2010-05-31 Thread findbestopensource
Add the commit after the loop. I would advise doing the commit in a separate
thread. I keep a separate timer thread that commits every minute and
optimizes the index at the end of every day.
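
A timer-driven commit along these lines could look roughly like the sketch below. The names are assumptions: PeriodicCommitter is invented for illustration, and commitAction stands for whatever actually issues the commit (for example a call to commit() on the SolrJ server object). Indexing threads only mark the index as dirty; the single timer thread is the only one that commits.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: one background thread commits, and only when needed.
final class PeriodicCommitter {
    private final Runnable commitAction;
    private final AtomicBoolean dirty = new AtomicBoolean(false);
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    PeriodicCommitter(Runnable commitAction) {
        this.commitAction = commitAction;
    }

    // Called by indexing threads after they add documents.
    void markDirty() {
        dirty.set(true);
    }

    // Commits only if something was added since the last commit.
    boolean commitIfDirty() {
        if (dirty.compareAndSet(true, false)) {
            commitAction.run();
            return true;
        }
        return false;
    }

    // Runs commitIfDirty at a fixed rate, e.g. start(1, TimeUnit.MINUTES).
    void start(long period, TimeUnit unit) {
        timer.scheduleAtFixedRate(this::commitIfDirty, period, period, unit);
    }

    void stop() {
        timer.shutdown();
    }
}
```

Because all commits come from one thread, at most one commit is in flight from this client at a time. A daily optimize could be scheduled on the same timer in the same way.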

Regards
Aditya
www.findbestopensource.com


On Tue, Jun 1, 2010 at 2:57 AM, Steve Kuo  wrote:

> I have a newbie question on what is the best way to batch add/commit a
> large
> collection of document data via solrj.  My first attempt  was to write a
> multi-threaded application that did following.
>
> Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
> for (Widget w : widges) {
>     doc.addField("id", w.getId());
>     doc.addField("name", w.getName());
>     doc.addField("price", w.getPrice());
>     doc.addField("category", w.getCat());
>     doc.addField("srcType", w.getSrcType());
>     docs.add(doc);
>
>     // commit docs to solr server
>     server.add(docs);
>     server.commit();
> }
>
> And I got this exception.
>
> org.apache.solr.common.SolrException:
>
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
>
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>     at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)
>
> The solrj wiki/documents seemed to indicate that because multiple threads
> were calling SolrServer.commit() which in term called
> CommonsHttpSolrServer.request() resulting in multiple searchers.  My first
> thought was to change the configs for autowarming.  But after looking at
> the
> autowarm params, I am not sure what can be changed or perhaps a different
> approach is recommened.
>
>  class="solr.FastLRUCache"
>  size="512"
>  initialSize="512"
>  autowarmCount="0"/>
>
>  class="solr.LRUCache"
>  size="512"
>  initialSize="512"
>  autowarmCount="0"/>
>
>  class="solr.LRUCache"
>  size="512"
>  initialSize="512"
>  autowarmCount="0"/>
>
> Your help is much appreciated.
>