Non-prefix, hierarchical autocomplete? Would SOLR-1316 work? Solritas?

2010-06-19 Thread Andy
Hi,

I've seen some posts on using SOLR-1316 or Solritas for autocomplete. Wondered 
what is the best solution for my use case:

1) I would like to have a "hierarchical" autocomplete. For example, I have a 
"Country" dropdown list and a "City" textbox. A user would select a country 
from the dropdown list, and then type out the City in the textbox. Based on 
which country he selected, I want to limit the autocomplete suggestions to 
cities that are relevant for the selected country.

This hierarchy could be multi-level. For example, there may be a "Neighborhood" 
textbox. The autocomplete suggestions for "Neighborhood" would be limited to 
neighborhoods that are relevant for the city entered by the user in the "City" 
textbox.

2) I want to have autocomplete suggestions that include non-prefix matches. 
For example, if the user types "auto", the autocomplete suggestions should 
include terms such as "automata" and "build automation".

3) I'm doing autocomplete for tags. I would like to allow multi-word tags and 
use comma (",") as a separator for tags. So when the user hits the space bar, he 
is still typing out the same tag, but when he hits the comma key, he's starting 
a new tag.

Would SOLR-1316 or Solritas work for the above requirements? If they do, how do 
I set it up? I can't really find much documentation on SOLR-1316 or Solritas in 
this area.

Thanks.
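
To make requirements 2) and 3) concrete, here is a minimal in-memory sketch of the matching rules as described above. It is not how SOLR-1316 or Solritas work internally; the function names and the sample tag list are made up for illustration. On the Solr side, the same word-boundary matching would presumably need to come from index-time analysis rather than from code like this.

```python
def current_fragment(text):
    """Return the tag currently being typed: everything after the last comma.

    Spaces stay part of the tag; only a comma starts a new one.
    """
    return text.rsplit(",", 1)[-1].strip()


def matches(fragment, tag):
    """True if the fragment matches the tag starting at any word boundary.

    "auto" matches both "automata" and "build automation".
    """
    needle = " ".join(fragment.lower().split())
    words = tag.lower().split()
    return bool(needle) and any(
        " ".join(words[i:]).startswith(needle) for i in range(len(words))
    )


def suggest(text, tags):
    """Suggest tags for the fragment after the last comma in the text box."""
    fragment = current_fragment(text)
    return [tag for tag in tags if matches(fragment, tag)]


# Hypothetical tag vocabulary, for illustration only.
TAGS = ["automata", "build automation", "solr", "auto parts"]
print(suggest("solr, lucene, auto", TAGS))
```

With the sample list, typing "solr, lucene, auto" only considers "auto", while typing "solr, build auto" considers the multi-word fragment "build auto".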


  


Re: Non-prefix, hierarchical autocomplete? Would SOLR-1316 work? Solritas?

2010-06-19 Thread Andy
Forgot to add, I would like to order the autocomplete suggestions for 
tags/cities based on how many times they are present in the documents.
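
One way the country-restricted, count-ordered part could be expressed is with plain faceting: a filter query for the selected country, plus a facet on the city field sorted by count. The sketch below only builds the request URL. The field names `country` and `city`, the base URL, and the helper name are assumptions, and `facet.prefix` only does prefix matching on indexed terms, so it does not cover the non-prefix requirement.

```python
from urllib.parse import urlencode


def city_suggest_url(base_url, country, typed, limit=10):
    """Build a facet request that lists cities within one country.

    Field names "country" and "city" are hypothetical; quotes inside
    the country value are not escaped in this sketch.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),                        # only the facet counts are needed
        ("fq", 'country:"%s"' % country),     # restrict to the selected country
        ("facet", "true"),
        ("facet.field", "city"),
        ("facet.prefix", typed),              # prefix match on indexed terms
        ("facet.sort", "count"),              # most frequent values first
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),
    ]
    return base_url + "/select?" + urlencode(params)


print(city_suggest_url("http://localhost:8983/solr", "France", "par"))
```

A second level (neighborhoods within a city) would follow the same pattern with one more `fq` and a different `facet.field`.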

--- On Sat, 6/19/10, Andy  wrote:

> From: Andy 
> Subject: Non-prefix, hierarchical autocomplete? Would SOLR-1316 work? 
> Solritas?
> To: solr-user@lucene.apache.org
> Date: Saturday, June 19, 2010, 3:28 AM
> Hi,
> 
> I've seen some posts on using SOLR-1316 or Solritas for
> autocomplete. Wondered what is the best solution for my use
> case:
> 
> 1) I would like to have an "hierarchical" autocomplete. For
> example, I have a "Country" dropdown list and a "City"
> textbox. A user would select a country from the dropdown
> list, and then type out the City in the textbox. Based on
> which country he selected, I want to limit the autocomplete
> suggestions to cities that are relevant for the selected
> country.
> 
> This hierarchy could be multi-level. For example, there may
> be a "Neighborhood" textbox. The autocomplete suggestions
> for "Neighborhood" would be limited to neighborhoods that
> are relevant for the city entered by the user in the "City"
> textbox.
> 
> 2) I want to have autocomplete suggestions that includes
> non-prefix matches. For example, if the user type "auto",
> the autocomplete suggestions should include terms such as
> "automata" and "build automation".
> 
> 3) I'm doing autocomplete for tags. I would like to allow
> multi-word tags and use comma (",") as a separator for tags.
> So when the use hits the space bar, he is still typing out
> the same tag, but when he hits the comma key, he's starting
> a new tag.
> 
> Would SOLR-1316 or Solritas work for the above
> requirements? If they do how do I set it up? I can't really
> find much documentation on SOLR-1316 or Solritas in this
> area.
> 
> Thanks.
> 
> 
>       
> 





Re: performance sorting multivalued field

2010-06-19 Thread Marc Sturlese

Hey Erik,
I am currently sorting by a multiValued field. It appears to be a feature that
you may not know which of the values of the multiValued field puts the document
in that position. This is fine for me; I don't care for my tests.
What I need to know is whether there is any performance issue in all of this.
Thanks
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/performance-sorting-multivalued-field-tp905943p907502.html
Sent from the Solr - User mailing list archive at Nabble.com.


Specifiying multiple mlt.fl fields

2010-06-19 Thread Darren Govoni
Hi,
  I read the wiki and tried about a dozen variations such as:

...&mlt.fl=field1&mlt.fl=field2

and

...&mlt.fl=field1,field2&...

to specify more than one MLT field and it won't take. What's the trick?
Also, how to do it with SolrJ?

Nothing I try works. Solr 4.0 nightly build.

Any tips, very appreciated!

Darren




Re: Specifiying multiple mlt.fl fields

2010-06-19 Thread Sascha Szott

Hi Darren,

try mlt.fl=field1 field2

Best,
Sascha
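
For anyone building the request by hand: the space between the field names has to be URL-encoded (as `+` or `%20`). A minimal sketch of that, assuming a standard select handler at a made-up base URL; the helper name is hypothetical:

```python
from urllib.parse import urlencode


def mlt_url(base_url, query, fields):
    """Build a MoreLikeThis request with several mlt.fl fields.

    The field names are joined with a single space, as suggested above;
    urlencode turns that space into "+".
    """
    params = [
        ("q", query),
        ("mlt", "true"),
        ("mlt.fl", " ".join(fields)),
    ]
    return base_url + "/select?" + urlencode(params)


print(mlt_url("http://localhost:8983/solr", "id:1", ["field1", "field2"]))
```

From SolrJ the same space-separated string would presumably be passed as the value of the `mlt.fl` parameter, with the client taking care of the encoding.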

Darren Govoni wrote:

Hi,
   I read the wiki and tried about a dozen variations such as:

...&mlt.fl=field1&mlt.fl=field2

and

...&mlt.fl=field1,field2&...

to specify more than one MLT field and it won't take. What's the trick?
Also, how to do it with SolrJ?

Nothing I try works. Solr 4.0 nightly build.

Any tips, very appreciated!

Darren







Re: Specifiying multiple mlt.fl fields

2010-06-19 Thread Darren Govoni
It works! Thanks Sascha. I swear I tried that combination. Hehe.

On Sat, 2010-06-19 at 21:19 +0200, Sascha Szott wrote:

> Hi Darren,
> 
> try mlt.fl=field1 field2
> 
> Best,
> Sascha
> 
> Darren Govoni wrote:
> > Hi,
> >I read the wiki and tried about a dozen variations such as:
> >
> > ...&mlt.fl=field1&mlt.fl=field2
> >
> > and
> >
> > ...&mlt.fl=field1,field2&...
> >
> > to specify more than one MLT field and it won't take. What's the trick?
> > Also, how to do it with SolrJ?
> >
> > Nothing I try works. Solr 4.0 nightly build.
> >
> > Any tips, very appreciated!
> >
> > Darren
> >
> >
> >
> 




Use of EmbeddedSolrServer

2010-06-19 Thread Robert Naczinski
Hi,

Is there a best practice for using EmbeddedSolrServer?

Does anyone know a good link besides
http://wiki.apache.org/solr/Solrj#EmbeddedSolrServer?

Regards,

Robert


Re: Tips on recursive xml-parsing in dataConfig

2010-06-19 Thread Tor Henning Ueland
The case changed to not using those XML files at all. I ended up using
some other data files as sources, which had everything flat, so no
recursion was needed after all. But thanks for the input! :)

Best regards

On Tue, Jun 8, 2010 at 11:08 AM, Geert-Jan Brits  wrote:
> my bad, it looks like XPathEntityProcessor doesn't support relative xpaths.
>
> However, I quickly looked at the Slashdot example (which is pretty good
> actually) at http://wiki.apache.org/solr/DataImportHandler.
> From that I infer that you use only 1 entity per xml-doc. And within that
> entity use multiple field declararations with xpath-attributes to extract
> the values you want.
> So even though your xml-dcoument is nested (like most xml's are) your
> field-declarations are not.
>
> I think your best bet is to read the slashdot example and go from there.
>
> For now, I'm not entirely sure what you want a solr-document to be in your
> example. i.e:
> - 1 solr-document per 1 xml-document (as supplied)
> - or 1 solr-doc per CHAP  per PARA or per SUB?
>
> Once you know that, perhaps coming up with a decent pointer is easier.
>
> HTH,
> Geert-Jan
>
>
> 
>
> 2010/6/8 Tor Henning Ueland 
>
>> I have tried both to change the datasource per child node to use the
>> parent nodes name, and tried to making the Xpath`s relative, all
>> causing either exceptions telling that Xpath must start with /, or
>> nullpointer exceptions ( nsfgrantsdir document : null).
>>
>> Best regards
>>
>> On Mon, Jun 7, 2010 at 4:12 PM, Geert-Jan Brits  wrote:
>> > I'm guessing (I'm not familiar with the xml dataimport handler, but I am
>> > pretty familiar with Xpath)
>> > that your problem lies in having absolute xpath-queries, instead of
>> > relative xpath queries to your parent node.
>> >
>> > e.g: /DOK/TEKST/KAP is absolute ( the prefixed '/' tells it to be). Try
>> > 'KAP' instead.
>> > The same for all xpaths deeper in the tree.
>> >
>> > Geert-Jan
>> >
>> > 2010/6/7 Tor Henning Ueland 
>> >
>> >> Hi,
>> >>
>> >> I am doing some testing of dataimport to Solr from XML-documents with
>> >> many children in the children. To parse the children i some levels
>> >> down using Xpath goes fine, but the speed is very slow. (~1 minute per
>> >> document, on a quad Xeon server). When i do the same using the format
>> >> solr wants it, the parsing time is 0.02 seconds per document.
>> >>
>> >> I have published a quick example here:
>> >> http://pastebin.com/adhcEvRx
>> >>
>> >> My question is:
>> >>
>> >> I hope that i have done something wrong in the child-parsing  (as you
>> >> can see, it goes down quite a few levels). Can anybody point me in the
>> >> right direction so i can speed up the process?  I have been looking
>> >> around for some examples, but nobody gives examples of such deep data
>> >> indexing.
>> >>
>> >> PS: I know there are some bugs in the Xpath naming etc, but it is just
>> >> a rough example :)
>> >>
>> >> --
>> >> Best regars
>> >> Tor Henning Ueland
>> >>
>> >
>>
>>
>>
>> --
>> Mvh
>> Tor Henning Ueland
>>
>



-- 
Mvh
Tor Henning Ueland
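
For the archive, the "one entity per document, flat field declarations" idea from Geert-Jan's reply quoted above can be illustrated outside of DataImportHandler. The sketch below uses Python's ElementTree, not XPathEntityProcessor (which has its own restricted XPath support). The DOK/TEKST/KAP element names come from the thread; TITLE, HEAD, the field names and the sample document are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested document; only DOK/TEKST/KAP appear in the thread.
SAMPLE = """
<DOK>
  <TITLE>Example</TITLE>
  <TEKST>
    <KAP><HEAD>One</HEAD></KAP>
    <KAP><HEAD>Two</HEAD></KAP>
  </TEKST>
</DOK>
"""

# Flat list of fields: each one has a full path from the document root,
# so no nested entities or recursion are involved.
FIELDS = {
    "title": "TITLE",
    "chapter_heading": "TEKST/KAP/HEAD",
}


def extract(xml_text, fields):
    """Return one flat record per XML document; repeated nodes become lists."""
    root = ET.fromstring(xml_text)
    return {
        name: [el.text for el in root.findall(path)]
        for name, path in fields.items()
    }


print(extract(SAMPLE, FIELDS))
```

The nesting lives only in the paths, which is the point of the Slashdot-style configuration: the document is nested, the field declarations are not.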


Re: Autocompletion with Solritas

2010-06-19 Thread Ken Krugler

Hi Erik,

On Jun 18, 2010, at 6:58pm, Erik Hatcher wrote:

Have a look at suggest.vm - the "name" field is used in there too.   
Just those two places, layout.vm and suggest.vm.


That was the missing change I needed.

Thanks much!

-- Ken



  And I had already added a ## TODO in my local suggest.vm:

## TODO: make this more generic, maybe look at the request  
terms.fl?  or just take the first terms field in the response?


And also, ideally, there'd be a /suggest handler mapped with the  
field name specified there.  I simply used what was already  
available to put suggest in there easily.


Erik

On Jun 18, 2010, at 7:54 PM, Ken Krugler wrote:


Hi Erik,

On Jun 17, 2010, at 8:34pm, Erik Hatcher wrote:

Your wish is my command.  Check out trunk, fire up Solr (ant run-example),
index example data, hit http://localhost:8983/solr/browse - type in search box.


Just used jQuery's autocomplete plugin and the terms component for  
now, on the name field.  Quite simple to plug in, actually.  Check  
the commit diff.  The main magic is doing this:





Stupidly, though, jQuery's autocomplete seems to be hardcoded to  
send a q parameter, but I coded it to also send the same value as  
terms.prefix - but this could be an issue if hitting a different  
request handler where q is used for the actual query for filtering  
terms on.


Let's say, just for grins, that a different field (besides "name")  
is being used for autocompletion.


What would be all the places I'd need to hit to change the field,  
besides the terms.fl value in layout.vm? For example, what about  
browse.vm:


  $("input[type=text]").autoSuggest("/solr/suggest", {selectedItemProp: "name", searchObjProps: "name"});


I'm asking because I'm trying to use this latest support with an  
index that uses "product_name" for the auto-complete field, and I'm  
not getting any auto-completes happening.


I see from the Solr logs that requests are being made to /solr/terms  
during auto-complete that look like:


INFO: [] webapp=/solr path=/terms params={limit=10&timestamp=1276903135595&terms.fl=product_name&q=rug&wt=velocity&terms.sort=count&v.template=suggest&terms.prefix=rug} status=0 QTime=0


Which I'd expect to work, but doesn't seem to be generating any  
results.


What's odd is that if I try curling the same thing:

curl -v "http://localhost:8983/solr/terms?limit=10&timestamp=1276903135595&terms.fl=product_name&q=rug&wt=velocity&terms.sort=count&v.template=suggest&terms.prefix=rug"


I get an empty HTML response:

< Content-Type: text/html; charset=utf-8
< Content-Length: 0
< Server: Jetty(6.1.22)

If I just use what I'd consider to be the minimum set of parameters:

curl -v "http://localhost:8983/solr/terms?limit=10&terms.fl=product_name&q=rug&terms.sort=count&terms.prefix=rug"


Then I get the expected XML response:

< Content-Type: text/xml; charset=utf-8
< Content-Length: 225
< Server: Jetty(6.1.22)
<


0name="QTime">0name="product_name">7



Any ideas what I'm doing wrong?

Thanks,

-- Ken
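
A small sketch of the "minimum set of parameters" request described above, plus parsing of the XML that comes back, can help when checking the terms component independently of Velocity. The base URL and helper names are assumptions, and the sample response is an illustrative guess at the usual terms component layout, not the actual output from this thread (the term name here is invented).

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def terms_url(base_url, field, prefix, limit=10):
    """Build a terms component request without any Velocity parameters."""
    params = [
        ("terms.fl", field),
        ("terms.prefix", prefix),
        ("terms.sort", "count"),
        ("terms.limit", str(limit)),
    ]
    return base_url + "/terms?" + urlencode(params)


def parse_terms(xml_text, field):
    """Return (term, count) pairs for one field from a terms XML response."""
    root = ET.fromstring(xml_text)
    path = "lst[@name='terms']/lst[@name='%s']/int" % field
    return [(el.get("name"), int(el.text)) for el in root.findall(path)]


# Illustrative response shape only; the term "rug" is a made-up value.
SAMPLE = """<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst>
<lst name="terms"><lst name="product_name"><int name="rug">7</int></lst></lst>
</response>"""

print(terms_url("http://localhost:8983/solr", "product_name", "rug"))
print(parse_terms(SAMPLE, "product_name"))
```

If this plain request returns terms while the `wt=velocity` one returns an empty body, the template (suggest.vm here) is the likely place to look, which matches how the thread was resolved.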



On Jun 17, 2010, at 8:03 PM, Ken Krugler wrote:


I don't believe Solritas supports autocompletion out of the box.

So I'm wondering if anybody has experience using the LucidWorks  
distro & Solritas, plus the AJAX Solr auto-complete widget.


I realize that AJAX Solr's autocomplete support is mostly just  
leveraging the jQuery Autocomplete plugin, and hooking it up to  
Solr facets, but I was curious if there were any tricks or traps  
in getting it all to work.


Thanks,

-- Ken




Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Re: federated / meta search

2010-06-19 Thread Lance Norskog
https://issues.apache.org/jira/browse/LUCENE-1812

On Fri, Jun 18, 2010 at 7:26 PM, Otis Gospodnetic
 wrote:
> Lance, which project in Solr are you referring to?
>
>
> Thanks,
>
> Otis
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> ----- Original Message ----
>> From: Lance Norskog
>> To: solr-user@lucene.apache.org
>> Sent: Fri, June 18, 2010 8:16:46 PM
>> Subject: Re: federated / meta search
>>
>> Yes, you can do this. You need to have a common system for creating
>> unique ids for the documents.
>>
>> Also, there's an odd problem around relevance. Relevance scoring is
>> based on all of the terms in a field in the whole index, and there is
>> a "statistical fingerprint" of this for an index. With two indexes
>> from two sources, the terms in the documents will not have the same
>> "fingerprint". Relevance scores from one shard will not match the
>> meaning of a document's score in the other shard.
>>
>> There is a project to make this work in Solr, but it is not nearly
>> finished.
>>
>> Lance Norskog
>>
>> On Fri, Jun 18, 2010 at 4:28 AM, Sascha Szott <sz...@zib.de> wrote:
>>> Hi Joe & Markus,
>>>
>>> sounds good! Maybe I should better add a note on the Wiki page on
>>> federated search [1].
>>>
>>> Thanks,
>>> Sascha
>>>
>>> [1] http://wiki.apache.org/solr/FederatedSearch
>>>
>>> Joe Calderon wrote:
>>>> yes, you can use distributed search across shards with different
>>>> schemas as long as the query only references overlapping fields, i
>>>> usually test adding new fields or tokenizers on one shard and deploy
>>>> only after i verified its working properly
>>>>
>>>> On Thu, Jun 17, 2010 at 1:10 PM, Markus Jelsma
>>>> <markus.jel...@buyways.nl> wrote:
>>>>> Hi,
>>>>>
>>>>> Check out Solr sharding [1] capabilities. I never tested it with
>>>>> different schema's but if each node is queried with fields that it
>>>>> supports, it should return useful results.
>>>>>
>>>>> [1]: http://wiki.apache.org/solr/DistributedSearch
>>>>>
>>>>> Cheers.
>>>>>
>>>>> -----Original message-----
>>>>> From: Sascha Szott <sz...@zib.de>
>>>>> Sent: Thu 17-06-2010 19:44
>>>>> To: solr-user@lucene.apache.org;
>>>>> Subject: federated / meta search
>>>>>
>>>>> Hi folks,
>>>>>
>>>>> if I'm seeing it right Solr currently does not provide any support
>>>>> for federated / meta searching. Therefore, I'd like to know if
>>>>> anyone has already put efforts into this direction? Moreover, is
>>>>> federated / meta search considered a scenario Solr should be able
>>>>> to deal with at all or is it (far) beyond the scope of Solr?
>>>>>
>>>>> To be more precise, I'll give you a short explanation of my
>>>>> requirements. Assume, there are a couple of Solr instances running
>>>>> at different places. The documents stored within those instances
>>>>> are all from the same domain (bibliographic records), but it can
>>>>> not be ensured that the schema definitions conform to 100%. But
>>>>> lets say, there are at least some index fields that are present in
>>>>> all instances (fields with the same name and type definition).
>>>>> Now, I'd like to perform a search on all instances at the same
>>>>> time (with the restriction that the query contains only those
>>>>> fields that overlap among the different schemas) and combine the
>>>>> results in a reasonable way by utilizing the score information
>>>>> associated with each hit. Please note, that due to legal issues it
>>>>> is not feasible to build a single index that integrates the
>>>>> documents of all Solr instances under consideration.
>>>>>
>>>>> Thanks in advance,
>>>>> Sascha
>>
>> --
>> Lance Norskog
>> goks...@gmail.com



-- 
Lance Norskog
goks...@gmail.com


Re: Indexing HTML files in SOLR

2010-06-19 Thread Lance Norskog
Ah! You need a SolrJ program that uses Tika to parse the files and
upload the text. I think there is such a program already but do not
know where it is.

Lance

On Thu, Jun 17, 2010 at 6:13 AM, seesiddharth  wrote:
>
> Thank you so much for the reply...The link suggested by you is helpful but
> they have explain everything with use of curl command which I don't want to
> use.
> I was more interested in uploading the .html documents using HTTP web
> request.
> So I have stored all .html files at one location & then created HTML parser
> which will fetch the content from these html file & build an XML string
> (like <add><doc><field name="">..</field></doc></add>). Then I sent these
> XML string using HTTP web request method (in .net ) to solr server to
> add/update the document.
> Now I am able to search the data in solr of all uploaded documents.
> It will be great if u answer my question :
> Is there any better approach to achieve the same functionality ?
>
> Regards,
> Siddharth
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Indexing-HTML-files-in-SOLR-tp896530p902644.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Lance Norskog
goks...@gmail.com
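
For comparison with the approach described in the quoted message (parse the HTML yourself, build an add document, send it over HTTP), here is a minimal sketch of the parsing and XML-building half using only the Python standard library. It is not the SolrJ/Tika program mentioned above. The field names `id` and `text` and the helper names are assumptions; the resulting string would be POSTed to the update handler with an XML content type and followed by a commit.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_add_xml(doc_id, html):
    """Build a Solr <add> message for one HTML page.

    Field names "id" and "text" are hypothetical and must match the schema.
    ElementTree takes care of escaping characters such as & and <.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in (("id", doc_id), ("text", " ".join(parser.parts))):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


PAGE = "<html><head><style>p {color: red}</style></head>" \
       "<body><h1>Rugs</h1><p>Wool &amp; silk</p></body></html>"
print(html_to_add_xml("page1", PAGE))
```

Building the message with an XML library rather than string concatenation avoids malformed updates when page text contains markup characters.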


Minor bug in Solritas with post-facet search

2010-06-19 Thread Ken Krugler
I ran into one minor problem, where if I clicked a facet, and then  
tried a search, I'd get a 404 error.


I think the problem is with the fqs Velocity macro in  
VM_global_library.vm, where it's missing the #else to insert a '?'  
into the URL:


#macro(fqs $p)#foreach($fq in $p)#if($velocityCount>1)&#{else}?#{end}fq=$esc.url($fq)#end#end


Without this, the URL becomes /solr/browsefq=, instead of
/solr/browse?fq=


But I'm completely new to the world of Velocity templating, so I've  
got low confidence that this is the right way to fix it.


-- Ken


Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Minor bug with Solritas and price data

2010-06-19 Thread Ken Krugler
I noticed that my prices weren't showing up, even though I've got a  
price field.


I think the issue is with this line from hit.vm:

  #field('name') $!number.currency($doc.getFieldValue('price'))


The number.currency() function needs to get passed something that  
looks like a number, but $doc.getFieldValue() will return "[2.96]",  
because it could be a list of values.


The square brackets confuse number.currency, so you get no price.

I think this line needs to be:

  #field('name') $!number.currency($doc.getFirstValue('price'))


...since getFirstValue() returns a single value without brackets.

-- Ken


Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Re: Minor bug in Solritas with post-facet search

2010-06-19 Thread Erik Hatcher

Fixed.

<form> action URLs really shouldn't have query string parameters on
them anyway, nor do they appear to work if so, so I moved the fq's to
hidden input fields.


Adding the ? into the URLs gets tricky, and doing it in #fqs isn't the  
right place, as those are often tacked on after other parameters where  
the ? has already been added.


Erik
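
The underlying issue (who is responsible for the "?" versus the "&") is easier to see outside of Velocity. The sketch below is not the Solritas fix; it just shows the general pattern of collecting all parameters first and adding the separator once, so a macro that emits only the fq part never has to guess. The function name and the sample values are made up.

```python
from urllib.parse import urlencode


def browse_url(path, q=None, fqs=()):
    """Build a browse URL from an optional query and any number of fq values.

    All parameters are gathered into one list, so the "?" is added exactly
    once and only when there is at least one parameter.
    """
    params = []
    if q:
        params.append(("q", q))
    params.extend(("fq", fq) for fq in fqs)
    return path + ("?" + urlencode(params) if params else "")


print(browse_url("/solr/browse"))
print(browse_url("/solr/browse", fqs=["cat:electronics"]))
print(browse_url("/solr/browse", q="rug", fqs=["cat:electronics", "inStock:true"]))
```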

On Jun 19, 2010, at 8:46 PM, Ken Krugler wrote:

I ran into one minor problem, where if I clicked a facet, and then  
tried a search, I'd get a 404 error.


I think the problem is with the fqs Velocity macro in  
VM_global_library.vm, where it's missing the #else to insert a '?'  
into the URL:


#macro(fqs $p)#foreach($fq in $p)#if($velocityCount>1)&#{else}?#{end}fq=$esc.url($fq)#end#end


Without this, the URL becomes /solr/browsefq=, instead of
/solr/browse?fq=


But I'm completely new to the world of Velocity templating, so I've  
got low confidence that this is the right way to fix it.


-- Ken


Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g








Re: Minor bug with Solritas and price data

2010-06-19 Thread Erik Hatcher
That's not a bug with the example schema, as price is a single-valued  
field.  getFirstValue will work, yes, but isn't necessary when it's  
single valued.  If you've got multiple prices, you probably want  
something like:


   #foreach($price in $doc.getFieldValue('price'))$!number.currency($price)#end


Note that in your scenario it isn't returning a string with brackets  
(except to the UI) - it's truly an array within the template.


Though maybe what you want in your schema is a single valued price  
field?  :)


Erik
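
The single-valued versus multi-valued distinction above can be sketched in a few lines outside the template: the value is either one number or a list of numbers, and a formatter that expects a number needs to be handed each element, not the list. The function name and the dollar formatting are assumptions for illustration; the Velocity number tool formats according to its own locale settings.

```python
def format_prices(value):
    """Format a price field that may hold one value or a list of values."""
    values = value if isinstance(value, (list, tuple)) else [value]
    return ", ".join("$%.2f" % float(v) for v in values)


print(format_prices(2.96))         # single-valued field
print(format_prices([2.96]))       # multi-valued field holding one price
print(format_prices([2.96, 3.5]))  # multi-valued field holding two prices
```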

On Jun 19, 2010, at 9:12 PM, Ken Krugler wrote:

I noticed that my prices weren't showing up, even though I've got a  
price field.


I think the issue is with this line from hit.vm:

 #field('name') $!number.currency($doc.getFieldValue('price'))


The number.currency() function needs to get passed something that  
looks like a number, but $doc.getFieldValue() will return "[2.96]",  
because it could be a list of values.


The square brackets confuse number.currency, so you get no price.

I think this line needs to be:

 #field('name') $!number.currency($doc.getFirstValue('price'))


...since getFirstValue() returns a single value without brackets.

-- Ken


Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g