RE: Problem loading my codec sometimes

2013-09-25 Thread Scott Schneider
Thanks for your quick response!  My jar was in solr/lib.  I removed all the 
<lib> directives from solrconfig.xml, but I still get the error.  My solr.xml 
doesn't have sharedLib.

By the way, I am running Solr 4.4.0 with most of the default example files 
(including solr.xml).  My schema.xml and solrconfig.xml are from another 
project using Solr 3.6.  I modified them a bit to fix any obvious errors.

I still wonder why it can create a new index using my codec, but not load an 
index previously created with my codec.  In solrconfig.xml, I specify the 
CodecFactory along with the package name, whereas the codec name that is read 
from the index file has no package name.  Could that be the problem?  I think 
that's the way it's supposed to be.  Could it be that Solr has my jar in the 
classpath, but SPI is not registering my codec class from the jar?  I'm not 
familiar with SPI.
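
For readers equally unfamiliar with SPI: Lucene looks codecs up through Java's ServiceLoader mechanism, which reads a provider file named META-INF/services/org.apache.lucene.codecs.Codec from each jar on the classpath. That file lists fully qualified class names, while the name stored in the index is the short codec name. A minimal Python sketch for inspecting what a jar registers follows. The class name com.example.MyCodec and the in-memory demo jar are hypothetical and only there to make the example self-contained.

```python
import io
import zipfile

# Provider file that Java's ServiceLoader reads to discover Lucene codecs.
SPI_ENTRY = "META-INF/services/org.apache.lucene.codecs.Codec"


def registered_codecs(jar):
    """Return the codec class names a jar registers via SPI (empty list if none)."""
    with zipfile.ZipFile(jar) as zf:
        if SPI_ENTRY not in zf.namelist():
            return []
        text = zf.read(SPI_ENTRY).decode("utf-8")
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            names.append(line)
    return names


def make_demo_jar(class_names):
    """Build an in-memory jar holding only the provider file (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(SPI_ENTRY, "\n".join(class_names) + "\n")
    buf.seek(0)
    return buf


# Hypothetical class name; pass a real jar path to registered_codecs() instead.
print(registered_codecs(make_demo_jar(["com.example.MyCodec"])))
```

An empty result for the custom jar would suggest the provider file is missing, which is one possible reason for the "does not exist" error, though not the only one.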

What else can I try?

Thanks,
Scott


> -----Original Message-----
> From: Shawn Heisey [mailto:s...@elyograg.org]
> Sent: Tuesday, September 24, 2013 5:51 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Problem loading my codec sometimes
> 
> On 9/24/2013 6:32 PM, Scott Schneider wrote:
> > I created my own codec and Solr can find it sometimes and not other
> times.  When I start fresh (delete the data folder and run Solr), it
> all works fine.  I can add data and query it.  When I stop Solr and
> start it again, I get:
> >
> > Caused by: java.lang.IllegalArgumentException: A SPI class of type
> org.apache.lucene.codecs.Codec with name 'MyCodec' does not exist. You
> need to add the corresponding JAR file supporting this SPI to your
> classpath.The current classpath supports the following names:
> [SimpleText, Appending, Lucene40, Lucene3x, Lucene41, Lucene42]
> >
> > I added the JAR to the path and I'm pretty sure Java sees it, or else
> it would not be using my codec when I start fresh.  (I've looked at the
> index files and verified that it's using my codec.)  I suppose Solr is
> asking SPI for my codec based on the codec class name stored in the
> index files, but I don't see why this would fail when a fresh start
> works.
> 
> What I always recommend for those who want to use custom and contrib
> jars is that they put all such jars (and their dependencies) into
> ${solr.solr.home}/lib, don't use any <lib> directives in
> solrconfig.xml,
> and don't put the sharedLib attribute into solr.xml.  Doing it in any
> other way has a tendency to trigger bugs or causes jars to get loaded
> more than once.
> 
> The ${solr.solr.home} property defaults to $CWD/solr (CWD is current
> working directory for those who don't already know) and is the location
> of the solr.xml file.  Note that depending on the exact version of Solr
> and which servlet container you are using, there may actually be two
> solr.xml files, one which loads solr into your container and one that
> configures Solr.  I am referring to the latter.
> 
> If you are using the solr example and its directory layout, the
> directory you would need to put all jars into is example/solr/lib ...
> which is a directory that doesn't exist and has to be created.
> 
> http://wiki.apache.org/solr/Solr.xml%20%28supported%20through%204.x%29
> http://wiki.apache.org/solr/Solr.xml%204.4%20and%20beyond
> 
> Thanks,
> Shawn



Re: Problem loading my codec sometimes

2013-09-25 Thread Alexandre Rafalovitch
On Wed, Sep 25, 2013 at 2:10 PM, Scott Schneider <
scott_schnei...@symantec.com> wrote:

> I still wonder why it can create a new index using my codec, but not load
> an index previously created with my codec.


It could be the order of classpath initialization. For example, the
write/read functions happen after the full classpath is set up. But when you
open (not read/write) an index after a restart, Solr may need to check
something (e.g. your codec) before the full classpath is available.
In that case, putting things into the system-level classpath would resolve the issue.

Pure conjecture here, but that's exactly the kind of thing that triggers
classloader bugs.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


Re: Hash range to shard assignment

2013-09-25 Thread lochri
Great, thank you !!!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hash-range-to-shard-assignment-tp4091204p4091889.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Plugin for Kettle.

2013-09-25 Thread Srini.daruna
Hi Everyone,

Is there a way to index data into Solr from Kettle?
If so, could you please tell me how to do that?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Plugin-for-Kettle-tp4091877.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud High Availability during indexing operation

2013-09-25 Thread Erick Erickson
And do any of the documents have the same <uniqueKey>, which
is usually called "id"? Subsequent adds of docs with the same
<uniqueKey> replace the earlier one.

It's not definitive because it changes as merges happen (old copies
of docs that have been deleted or updated will be purged), but what
does your admin page show for "maxDoc"? If it's more than "numDocs"
then you have duplicate <uniqueKey>s. NOTE: if you optimize
(which you usually shouldn't) then maxDoc and numDocs will be
the same, so if you test this don't optimize.
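
The maxDoc/numDocs check above is just a subtraction. A minimal sketch, assuming the two values are read off the admin page by hand; the numbers used here are made up for illustration and are not from this thread.

```python
def deleted_or_replaced(max_doc, num_docs):
    """Docs still present in segments but marked deleted (includes overwritten ones)."""
    if num_docs > max_doc:
        raise ValueError("numDocs cannot exceed maxDoc")
    return max_doc - num_docs


# Hypothetical admin-page values, for illustration only.
print(deleted_or_replaced(max_doc=1000, num_docs=990))
```

A non-zero result only hints at overwritten or deleted documents, and it shrinks as merges purge old copies.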

Best,
Erick


On Tue, Sep 24, 2013 at 10:43 AM, Walter Underwood
 wrote:
> Did all of the curl update commands return success? Any errors in the logs?
>
> wunder
>
> On Sep 24, 2013, at 6:40 AM, Otis Gospodnetic wrote:
>
>> Is it possible that some of those 80K docs were simply not valid? e.g.
>> had a wrong field, had a missing required field, anything like that?
>> What happens if you clear this collection and just re-run the same
>> indexing process and do everything else the same?  Still some docs
>> missing?  Same number?
>>
>> And what if you take 1 document that you know is valid and index it
>> 80K times, with a different ID, of course?  Do you see 80K docs in the
>> end?
>>
>> Otis
>> --
>> Solr & ElasticSearch Support -- http://sematext.com/
>> Performance Monitoring -- http://sematext.com/spm
>>
>>
>>
>> On Tue, Sep 24, 2013 at 2:45 AM, Saurabh Saxena  
>> wrote:
>>> Doc count did not change after I restarted the nodes. I am doing a single
>>> commit after all 80k docs. Using Solr 4.4.
>>>
>>> Regards,
>>> Saurabh
>>>
>>>
>>> On Mon, Sep 23, 2013 at 6:37 PM, Otis Gospodnetic <
>>> otis.gospodne...@gmail.com> wrote:
>>>
>>>> Interesting. Did the doc count change after you started the nodes again?
>>>> Can you tell us about commits?
>>>> Which version? 4.5 will be out soon.
>>>>
>>>> Otis
>>>> Solr & ElasticSearch Support
>>>> http://sematext.com/
>>>> On Sep 23, 2013 8:37 PM, "Saurabh Saxena"  wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am testing High Availability feature of SolrCloud. I am using the
>>>>> following setup
>>>>>
>>>>> - 8 linux hosts
>>>>> - 8 Shards
>>>>> - 1 leader, 1 replica / host
>>>>> - Using Curl for update operation
>>>>>
>>>>> I tried to index 80K documents on replicas (10K/replica in parallel).
>>>>> During indexing process, I stopped 4 Leader nodes. Once indexing is done,
>>>>> out of 80K docs only 79808 docs are indexed.
>>>>>
>>>>> Is this an expected behaviour ? In my opinion replica should take care of
>>>>> indexing if leader is down.
>>>>>
>>>>> If this is an expected behaviour, any steps that can be taken from the
>>>>> client side to avoid such a situation.
>>>>>
>>>>> Regards,
>>>>> Saurabh Saxena
>>>>>

>
> --
> Walter Underwood
> wun...@wunderwood.org
>
>
>


Re: How to sort over all documents by score after Result Grouping / Field Collapsing

2013-09-25 Thread Erick Erickson
Nope, there's nothing in Solr that does this for you already,
you need to do something custom. You could write a component
that did this as the last step of writing the data to the client,
but that just moves the action from your front-end to custom
code in Solr that you have to maintain too, so I don't see the
benefit.

Best,
Erick

On Tue, Sep 24, 2013 at 11:46 AM, go2jun  wrote:
> Thanks Erick for your response.
>
> My goal is
> 1. try to search from solr. In the search result, we would like show no more
> than two results from the same source id.
> 2. For the search results, we would like all these results sorted by their
> score.
>
> So If I use solr result grouping to get the top two result from each group,
> then I need to un-group them.
>
> So my question is there any pure solr solution to handle this? I prefer it
> handle by solr other than my application, because the search result are very
> large.
>
> Thanks!
> Jun
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-to-sort-over-all-documents-by-score-after-Result-Grouping-Field-Collapsing-tp4091593p4091784.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR grouped query sorting on numFound

2013-09-25 Thread Erick Erickson
Hmmm, just specifying &sort= is _almost_ what you want,
except it sorts by the value of fields in the doc not numFound.

this shouldn't be hard to do on the client though, but you'd
have to return all the groups...
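
The client-side sort mentioned above could be sketched as follows in Python, assuming the whole grouped JSON response has already been parsed into a dict. The sample data reuses the three groups from the quoted response below; the function name is made up for this example.

```python
def groups_by_num_found(response, field):
    """Return the groups of a grouped Solr response, largest numFound first."""
    groups = response["grouped"][field]["groups"]
    return sorted(groups, key=lambda g: g["doclist"]["numFound"], reverse=True)


# Trimmed-down copy of the response quoted in this thread.
sample = {
    "grouped": {
        "rfp_stub": {
            "matches": 18470,
            "groups": [
                {"groupValue": "java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
                 "doclist": {"numFound": 3, "start": 0, "docs": []}},
                {"groupValue": "java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
                 "doclist": {"numFound": 5, "start": 0, "docs": []}},
                {"groupValue": "java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
                 "doclist": {"numFound": 6, "start": 0, "docs": []}},
            ],
        }
    }
}

for group in groups_by_num_found(sample, "rfp_stub"):
    print(group["doclist"]["numFound"], group["groupValue"])
```

This only works if every group is returned, which is exactly the memory concern raised in the question.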

FWIW,
Erick

On Tue, Sep 24, 2013 at 1:11 PM, Brent Ryan  wrote:
> We ran into 1 snag during development with SOLR and I thought I'd run it by
> anyone to see if they had any slick ways to solve this issue.
>
> Basically, we're performing a SOLR query with grouping and want to be able
> to sort by the number of documents found within each group.
>
> Our query response from SOLR looks something like this:
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":17,
>     "params":{
>       "indent":"true",
>       "q":"*:*",
>       "group.limit":"0",
>       "group.field":"rfp_stub",
>       "group":"true",
>       "wt":"json",
>       "rows":"1000"}},
>   "grouped":{
>     "rfp_stub":{
>       "matches":18470,
>       "groups":[{
>           "groupValue":"java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
>           "doclist":{"*numFound*":3,"start":0,"docs":[]
>           }},
>         {
>           "groupValue":"java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
>           "doclist":{"*numFound*":5,"start":0,"docs":[]
>           }},
>         {
>           "groupValue":"java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
>           "doclist":{"*numFound*":6,"start":0,"docs":[]
>           }},
> …
>
>
> The *numFound* shows the number of documents within that group.  Is there
> any way to perform a sort on *numFound* in SOLR?  I don't believe this is
> supported, but wondered if anyone there has come across this and if there
> were any suggested workarounds, given that the dataset is really too large to
> hold in memory on our app servers?


Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread Erick Erickson
I've sometimes seen this handled by clever tokenizing. For "Bill Rogers",
index (untokenized) something like
Bill|Bill Rogers
Rogers|Bill Rogers

Your suggester then is a simple term lookup (see TermsComponent)
which is quite fast. What you _don't_ get is autocorrect. But if you
use terms.prefix, you can also control whether it's whole word match
or not. To get whole-word in the above, you would set your prefix to
"Rogers|" for instance. Or you may want to leave off the "|" to see
more of an autocomplete-type response.

Then, of course, when you display this you need to only display what's
after the "|" (or whatever delimiter you use).

One other note, this will be case sensitive, so you probably want to
do casing yourself, index things like
rogers|Bill Rogers
and lowercase what you send in to terms component.
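
The indexing trick above can be sketched in a few lines of Python. This is only a local simulation of what a terms.prefix lookup would return; the function names are made up for this example and nothing here talks to Solr.

```python
def suggestion_terms(full_name, delimiter="|"):
    """One untokenized term per word: lowercased word, delimiter, full display name."""
    return [f"{word.lower()}{delimiter}{full_name}" for word in full_name.split()]


def display_value(term, delimiter="|"):
    """What to show the user: everything after the first delimiter."""
    return term.split(delimiter, 1)[1]


def lookup(terms, typed, whole_word=False, delimiter="|"):
    """Simulate a prefix lookup; appending the delimiter forces a whole-word match."""
    prefix = typed.lower() + (delimiter if whole_word else "")
    return sorted({display_value(t, delimiter) for t in terms if t.startswith(prefix)})


terms = suggestion_terms("Bill Rogers")
print(terms)
print(lookup(terms, "Rog"))                      # autocomplete-style prefix
print(lookup(terms, "Rog", whole_word=True))     # no whole word "rog"
print(lookup(terms, "Rogers", whole_word=True))  # whole-word match
```

Lowercasing on both the indexing side and the lookup side is what makes the match case-insensitive while the displayed name keeps its original casing.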

Best,
Erick



On Tue, Sep 24, 2013 at 2:01 PM, JMill  wrote:
> Hi,
>
> I'm using Solr's Suggester function to implement an autocomplete feature.
> I have it set up to check against the "username" and "name" fields.  The problem
> is that when running a query against the name, the second term after the
> whitespace (the surname) returns 0 results.  It works if the query is a partial
> name starting from the beginning, e.g. given the name "Bill Rogers", a query
> for "Rogers" will return 0 results whereas a query for "Bill" will return
> a positive result (Bill Rogers). As for the username, it's not working at all.
>
> I am after the following behaviour.
>
> Match any partial words in the fields "username" or "name" and return the
> results.  If there is a match in the field "name" then return the whole name,
> e.g. given the queries "Rogers" or "Bill", return "Bill Rogers" (not the
> single word that was a match).
>
> schema.xml extract
> ..
> 
>  
>  multiValued="true" omitNorms="true" omitTermFreqAndPositions="false" />
> ...
> 
> 
> ...
>
>  positionIncrementGap="100">
>  
>
>
>
>  
> 
>
>
> solrconfig.xml
>
> 
> 
>suggest
>org.apache.solr.spelling.suggest.Suggester
> name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup
>autocomplete  
>0.005
>true
>
> 
>
> 
>
> ..
>  name="/suggest">
>   
> true
> suggest
> true
> 5
> true
>   
>   
>  spellcheck
>   
> 


Re: Soft commit and flush

2013-09-25 Thread Erick Erickson
Why do you care? Curiosity or are you trying to find a
behavior you can count on?

I ask because "soft commits are about visibility, hard commits are
about durability", meaning you can't count on a soft commit
writing anything to disk at all. I suspect that in your tests the soft
commit had nothing to do with the changes on disk; those were
just a consequence of indexing more data triggering a flush
to disk and would have happened even if you hadn't done a soft
commit.

Hard commits are what you can control writes to disk with,
not soft commits.

Best,
Erick

On Tue, Sep 24, 2013 at 3:56 PM, Shawn Heisey  wrote:
> On 9/24/2013 5:51 AM, adfel70 wrote:
>>
>> My conclusion is that soft commit always flushes the data, but because of
>> the implementation of NRTCachingDirectoryFactory, the data will be written
>> to the disk when it's getting too big.
>
>
> The NRTCachingDirectoryFactory (which creates NRTCachingDirectory instances)
> used by default in newer Solr versions has default settings for some of its
> parameters that show up in the solr log:
>
> maxCacheMB=48.0 maxMergeSizeMB=4.0
>
> The constructor javadocs for NRTCachingDirectory show what circumstances
> will cause the directory to use RAM instead of flushing to disk:
>
> http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/store/NRTCachingDirectory.html#NRTCachingDirectory%28org.apache.lucene.store.Directory,%20double,%20double%29
>
> "We will cache a newly created output if 1) it's a flush or a merge and the
> estimated size of the merged segment is <= maxMergeSizeMB, and 2) the total
> cached bytes is <= maxCachedMB"
>
> Thanks,
> Shawn
>
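
Read literally, the javadoc sentence quoted above amounts to a two-part test. The Python sketch below paraphrases that sentence using the default values from the quoted log line (maxCacheMB=48.0, maxMergeSizeMB=4.0). It is not Lucene's actual code, and the function name is made up for this example.

```python
def would_cache(estimated_segment_mb, currently_cached_mb,
                max_merge_size_mb=4.0, max_cached_mb=48.0):
    """Paraphrase of the quoted rule for a flush or merge output.

    Condition 1: the estimated segment size is <= maxMergeSizeMB.
    Condition 2: the total cached so far is <= maxCachedMB.
    """
    return (estimated_segment_mb <= max_merge_size_mb
            and currently_cached_mb <= max_cached_mb)


print(would_cache(2.0, 10.0))   # small segment, cache not full
print(would_cache(5.0, 10.0))   # segment larger than maxMergeSizeMB
print(would_cache(2.0, 50.0))   # cache already above maxCachedMB
```

Under this reading, whether new data reaches the disk depends on segment and cache sizes rather than on whether a soft commit was issued.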


Re: SolrCloud setup - any advice?

2013-09-25 Thread Erick Erickson
Hmmm, I confess I haven't had a chance to play with this yet,
but have you considered docValues for some of your fields? See:
http://wiki.apache.org/solr/DocValues

And just to tantalize you:

> Since Solr4.2 to build a forward index for a field, for purposes of sorting, 
> faceting, grouping, function queries, etc.

> You can specify a different docValuesFormat on the fieldType 
> (docValuesFormat="Disk") to only load minimal data on the heap, keeping other 
> data structures on disk.

Do note, though:
> Not a huge improvement for a static index

this latter isn't a problem though since you don't have a static index

Erick

On Tue, Sep 24, 2013 at 4:13 AM, Neil Prosser  wrote:
> Shawn: unfortunately the current problems are with facet.method=enum!
>
> Erick: We already round our date queries so they're the same for at least
> an hour so thankfully our fq entries will be reusable. However, I'll take a
> look at reducing the cache and autowarming counts and see what the effect
> on hit ratios and performance are.
>
> For SolrCloud our soft commit (openSearcher=false) interval is 15 seconds
> and our hard commit is 15 minutes.
>
> You're right about those sorted fields having a lot of unique values. They
> can be any number between 0 and 10,000,000 (it's sparsely populated across
> the documents) and could appear in several variants across multiple
> documents. This is probably a good area for seeing what we can bend with
> regard to our requirements for sorting/boosting. I've just looked at two
> shards and they've each got upwards of 1000 terms showing in the schema
> browser for one (potentially out of 60) fields.
>
>
>
> On 21 September 2013 20:07, Erick Erickson  wrote:
>
>> About caches. The queryResultCache is only useful when you expect there
>> to be a number of _identical_ queries. Think of this cache as a map where
>> the key is the query and the value is just a list of N document IDs
>> (internal) where N is your window size. Paging is often the place where
>> this is used. Take a look at your admin page for this cache, you can see
>> the hit rates. But, the take-away is that this is a very small cache
>> memory-wise, varying it is probably not a great predictor of memory usage.
>>
>> The filterCache is more intense memory wise, it's another map where the
>> key is the fq clause and the value is bounded by maxDoc/8. Take a
>> close look at this in the admin screen and see what the hit ratio is. It
>> may be that you can make it much smaller and still get a lot of benefit.
>> _Especially_ considering it could occupy about 44G of memory:
>> (43,000,000 / 8) * 8192. And the autowarm count is excessive in
>> most cases from what I've seen. Cutting the autowarm down to, say, 16
>> may not make a noticeable difference in your response time. And if
>> you're using NOW in your fq clauses, it's almost totally useless, see:
>> http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/
>>
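
The 44G figure above can be reproduced with a couple of lines of Python. This assumes that 43,000,000 is the maxDoc of the index and that 8192 is the configured number of filterCache entries; the message does not spell out either, so treat both as a reading of the formula rather than a stated fact.

```python
def filter_cache_upper_bound_bytes(max_doc, cache_entries):
    """Worst case: every cached filter is a full bitset, one bit per document."""
    bytes_per_entry = max_doc // 8
    return bytes_per_entry * cache_entries


total = filter_cache_upper_bound_bytes(43_000_000, 8192)
print(total)                   # 44032000000 bytes
print(round(total / 1e9, 1))   # about 44 GB in decimal units
```

This is an upper bound; entries for filters that match few documents may be stored more compactly.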
>> Also, read Uwe's excellent blog about MMapDirectory here:
>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>> for some problems with over-allocating memory to the JVM. Of course
>> if you're hitting OOMs, well.
>>
>> bq: order them by one of their fields.
>> This is one place I'd look first. How many unique values are in each field
>> that you sort on? This is one of the major memory consumers. You can
>> get a sense of this by looking at admin/schema-browser and selecting
>> the fields you sort on. There's a text box with the number of terms
>> returned, then a / ### where ### is the total count of unique terms in the
>> field. NOTE: in 4.4 this will be -1 for multiValued fields, but you
>> shouldn't be sorting on those anyway. How many fields are you sorting on
>> anyway, and of what types?
>>
>> For your SolrCloud experiments, what are your soft and hard commit
>> intervals? Because something is really screwy here. Your sharding moving the
>> number of docs down this low per shard should be fast. Back to the point
>> above, the only good explanation I can come up with from this remove is
>> that the fields you sort on have a LOT of unique values. It's possible that
>> the total number of unique values isn't scaling with sharding. That is,
>> each shard may have, say, 90% of all unique terms (number from thin air).
>> Worth checking anyway, but a stretch.
>>
>> This is definitely unusual...
>>
>> Best,
>> Erick
>>
>>
>> On Thu, Sep 19, 2013 at 8:20 AM, Neil Prosser 
>> wrote:
>> > Apologies for the giant email. Hopefully it makes sense.
>> >
>> > We've been trying out SolrCloud to solve some scalability issues with our
>> > current setup and have run into problems. I'd like to describe our current
>> > setup, our queries and the sort of load we see and am hoping someone might
>> > be able to spot the massive flaw in the way I've been trying to set things
>> > up.
>> >
>> > We currently run Solr 4.0.0 in the old style Master/Slave replication. 

Re: SOLR grouped query sorting on numFound

2013-09-25 Thread Brent Ryan
ya, that's the problem... you can't sort by "numFound" and it's not
feasible to do the sort on the client because the grouped result set is too
large.

Brent


On Wed, Sep 25, 2013 at 6:09 AM, Erick Erickson wrote:

> Hmmm, just specifying &sort= is _almost_ what you want,
> except it sorts by the value of fields in the doc not numFound.
>
> this shouldn't be hard to do on the client though, but you'd
> have to return all the groups...
>
> FWIW,
> Erick
>
> On Tue, Sep 24, 2013 at 1:11 PM, Brent Ryan  wrote:
> > We ran into 1 snag during development with SOLR and I thought I'd run it
> by
> > anyone to see if they had any slick ways to solve this issue.
> >
> > Basically, we're performing a SOLR query with grouping and want to be
> able
> > to sort by the number of documents found within each group.
> >
> > Our query response from SOLR looks something like this:
> >
> > {
> >   "responseHeader":{
> >     "status":0,
> >     "QTime":17,
> >     "params":{
> >       "indent":"true",
> >       "q":"*:*",
> >       "group.limit":"0",
> >       "group.field":"rfp_stub",
> >       "group":"true",
> >       "wt":"json",
> >       "rows":"1000"}},
> >   "grouped":{
> >     "rfp_stub":{
> >       "matches":18470,
> >       "groups":[{
> >           "groupValue":"java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
> >           "doclist":{"*numFound*":3,"start":0,"docs":[]
> >           }},
> >         {
> >           "groupValue":"java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
> >           "doclist":{"*numFound*":5,"start":0,"docs":[]
> >           }},
> >         {
> >           "groupValue":"java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
> >           "doclist":{"*numFound*":6,"start":0,"docs":[]
> >           }},
> > …
> >
> >
> > The *numFound* shows the number of documents within that group.  Is there
> > anyway to perform a sort on *numFound* in SOLR ?  I don't believe this is
> > supported, but wondered if anyone their has come across this and if there
> > was any suggested workarounds given that the dataset is really too large
> to
> > hold in memory on our app servers?
>


Re: searching within documents

2013-09-25 Thread Erick Erickson
Because your "text" field type is completely broken.

for instance, at indexing time
> lowercasing before using WordDelimiterFilterFactory
means that one of the purposes of WDFF, breaking
tokens up on upper/lower case transitions can't happen.
Which you apparently intend since you have
splitOnCaseChange="1"

> you apply stemming at index time but not query time
(not finding q=acted)

For your query,
> you don't lowercase the input (contents:Sushant
not getting hits).

Please spend some time with the admin/analysis page
to understand the transformations at index time and
query time, that'll clarify a lot.
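
To see why filter order matters for the first point above, here is a toy Python simulation. It is not Solr's WordDelimiterFilterFactory, only a stand-in that splits ASCII tokens at lower-to-upper case transitions, and the token "PowerShot" is a made-up example.

```python
import re


def split_on_case_change(tokens):
    """Toy stand-in for splitOnCaseChange: break at lower-to-upper transitions."""
    out = []
    for token in tokens:
        out.extend(re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split())
    return out


def lowercase(tokens):
    return [t.lower() for t in tokens]


tokens = ["PowerShot"]
# Split first, then lowercase: the case transition is still visible.
print(lowercase(split_on_case_change(tokens)))
# Lowercase first, then split: nothing is left to split on.
print(split_on_case_change(lowercase(tokens)))
```

The admin/analysis page shows the same effect on the real analysis chain, one filter at a time.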

Best
Erick

On Tue, Sep 24, 2013 at 6:49 AM, Nutan  wrote:
> Why does it happen that for a few words it shows output and for a few it does
> not?
>
> For example,
> 1)
> q=contents:Sushant
>
> numfound is 0
>
> q=contents:sushant
>
> gives output
>
> 2)
> q=contents:acted
>
> numfound 0
>
> q=contents:well
>
> gives output
>
> This is the document:
> 
>   
> 13
> 
>   chetan
> 
> worst book
> solr,lucene
> Sushant acted well in kaipoche.
> 3 mistakes
> 0012345654334
> 
> 
>
> Please do reply. Help will be appreciated.
> Thanks in advance.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/searching-within-documents-tp4090173p4091713.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Sum function causing error in solr

2013-09-25 Thread Tanu Garg
Hi, I'm trying the sum function in Solr:

val:"sum(kcmeta/bookmark/count,10.00)"

It is giving this error:


<lst name="error">
  <str name="msg">undefined field: "kcmeta"</str>
  <int name="code">400</int>
</lst>


My field name is "kcmeta/bookmark/count"

Please suggest how to resolve this.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sum-function-causing-error-in-solr-tp4091901.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR grouped query sorting on numFound

2013-09-25 Thread Erick Erickson
But if it's too large on the client, wouldn't it also be too large on
the server? After all, you have to hold the entire set of groups in
memory since you can't know ahead of time which will be the largest.
Or at least the counts of them all. I suppose you could do some
two-pass process where you returned 1 doc/group with absolutely
minimal data (like score and ID) and then issued a second query that
got the data to display if (and only if) that suited your use-case.
Otherwise I'm afraid you're into custom Solr code.

Best,
Erick

On Wed, Sep 25, 2013 at 6:40 AM, Brent Ryan  wrote:
> ya, that's the problem... you can't sort by "numFound" and it's not
> feasible to do the sort on the client because the grouped result set is too
> large.
>
> Brent
>
>
> On Wed, Sep 25, 2013 at 6:09 AM, Erick Erickson 
> wrote:
>
>> Hmmm, just specifying &sort= is _almost_ what you want,
>> except it sorts by the value of fields in the doc not numFound.
>>
>> this shouldn't be hard to do on the client though, but you'd
>> have to return all the groups...
>>
>> FWIW,
>> Erick
>>
>> On Tue, Sep 24, 2013 at 1:11 PM, Brent Ryan  wrote:
>> > We ran into 1 snag during development with SOLR and I thought I'd run it
>> by
>> > anyone to see if they had any slick ways to solve this issue.
>> >
>> > Basically, we're performing a SOLR query with grouping and want to be
>> able
>> > to sort by the number of documents found within each group.
>> >
>> > Our query response from SOLR looks something like this:
>> >
>> > {
>> >   "responseHeader":{
>> >     "status":0,
>> >     "QTime":17,
>> >     "params":{
>> >       "indent":"true",
>> >       "q":"*:*",
>> >       "group.limit":"0",
>> >       "group.field":"rfp_stub",
>> >       "group":"true",
>> >       "wt":"json",
>> >       "rows":"1000"}},
>> >   "grouped":{
>> >     "rfp_stub":{
>> >       "matches":18470,
>> >       "groups":[{
>> >           "groupValue":"java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
>> >           "doclist":{"*numFound*":3,"start":0,"docs":[]
>> >           }},
>> >         {
>> >           "groupValue":"java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
>> >           "doclist":{"*numFound*":5,"start":0,"docs":[]
>> >           }},
>> >         {
>> >           "groupValue":"java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
>> >           "doclist":{"*numFound*":6,"start":0,"docs":[]
>> >           }},
>> > …
>> >
>> >
>> > The *numFound* shows the number of documents within that group.  Is there
>> > anyway to perform a sort on *numFound* in SOLR ?  I don't believe this is
>> > supported, but wondered if anyone their has come across this and if there
>> > was any suggested workarounds given that the dataset is really too large
>> to
>> > hold in memory on our app servers?
>>


Re: Sum function causing error in solr

2013-09-25 Thread Erick Erickson
Don't name your fields that way. The general advice
is to only name your fields as though they were
Java variables. The field name you're using is
too easily confused with a mathematical operation
after all.

Personally, I only use lower-case, underscore, and
perhaps numbers to avoid surprises like this.
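
The naming convention above is easy to check mechanically. A minimal Python sketch, assuming the rule is "lower-case letters, digits and underscores, not starting with a digit"; the last part is borrowed from Java identifier rules and is an assumption of this example, as is the suggested replacement name.

```python
import re

# Lower-case letters, digits and underscores; must not start with a digit.
SAFE_FIELD_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def is_safe_field_name(name):
    """True if the name follows the conservative convention described above."""
    return SAFE_FIELD_NAME.fullmatch(name) is not None


print(is_safe_field_name("kcmeta/bookmark/count"))   # False: "/" reads as division
print(is_safe_field_name("kcmeta_bookmark_count"))   # True
```

Renaming the field along these lines avoids the function parser reading "/" as an operator.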

Best,
Erick

On Wed, Sep 25, 2013 at 6:40 AM, Tanu Garg  wrote:
>
> Hi I m trying sum function in solr
>
> val:"sum(kcmeta/bookmark/count,10.00)"
>
> itis giving error:
>
> 
>   undefined field: "kcmeta"
>   400
> 
>
> My field name is "kcmeta/bookmark/count"
>
> Please suggest how to resolve this.
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Sum-function-causing-error-in-solr-tp4091901.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread Ing. Jorge Luis Betancourt Gonzalez
I've used a separate core for storing suggestions, based on what I saw in
https://github.com/cominvent/autocomplete. You can check the blog post at
www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/. This is
really flexible; on the downside, it does not use the suggester component, so
these are like regular queries against a separate core.

Greetings!

----- Original Message -----
From: "Erick Erickson" 
To: solr-user@lucene.apache.org
Sent: Wednesday, September 25, 2013 6:16:51
Subject: Re: Implementing Solr Suggester for Autocomplete (multiple columns)

I've sometimes seen this handled by clever tokenizing. For "Bill Rogers",
index (untokenized) something like
Bill|Bill Rogers
Rogers|Bill Rogers

Your suggester then is a simple term lookup (see TermsComponent)
which is quite fast. What you _don't_ get is autocorrect. But if you
use terms.prefix, you can also control whether it's whole word match
or not. To get whole-word in the above, you would set your prefix to
"Rogers|" for instance. Or you may want to leave off the "|" to see
more of an autocomplete-type response.

Then, of course, when you display this you need to only display what's
after the "|" (or whatever delimiter you use).

One other note, this will be case sensitive, so you probably want to
do casing yourself, index things like
rogers|Bill Rogers
and lowercase what you send in to terms component.

Best,
Erick



On Tue, Sep 24, 2013 at 2:01 PM, JMill  wrote:
> Hi,
>
> I'm using Solr's Suggester function to implement an autocomplete feature.
> I have it setup to check against the "username" and "name" fields.  Problem
> is when running  a query against the name, the second term, after
> whitespace (surename) returns 0 results.  Works if if query is a partial
> name starting from the begining e.g. Given the name "Bill Rogers", a query
> for Rogers will return 0 results whereas a query for "Bill" will return
> positive (Bill Rogers). As for the username, it's not working at.
>
> I am after the following behaviour.
>
> Match any partial words in the fields "username" or "name" and return the
> results.  If there is match in the field "name" the return the whole name
> e.g. given the queries "Rogers" or "Bill"" return "Bill Rogers (not the
> single word that was a match)".
>
> schema.xml extract
> ..
> 
>  
>  multiValued="true" omitNorms="true" omitTermFreqAndPositions="false" />
> ...
> 
> 
> ...
>
>  positionIncrementGap="100">
>  
>
>
>
>  
> 
>
>
> solrconfig.xml
>
> 
> 
>suggest
>org.apache.solr.spelling.suggest.Suggester
> name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup
>autocomplete  
>0.005
>true
>
> 
>
> 
>
> ..
>  name="/suggest">
>   
> true
> suggest
> true
> 5
> true
>   
>   
>  spellcheck
>   
> 

III International Winter School at UCI, February 17 to 28, 2014. See
www.uci.cu



AND/OR logic

2013-09-25 Thread PAVAN
Hi,

 I have a situation where, if the user enters any string to search, I need
to display the results in the following way.

1. It has to match the exact string the user entered.
 
  (or)

2. It has to split the string, search for the individual keywords (for every
string), and display the results.

Ex: suppose the user is searching for

 "royal enfield bikes"

First condition - it has to check for the whole string "royal enfield bikes"
Second condition - it has to display the results like   royal OR enfield
OR bikes
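
One way to express both conditions in a single query string is a boosted phrase OR-ed with the individual words, using standard Lucene query syntax. The Python sketch below only builds that string. The boost value of 10 is an arbitrary assumption, and the function does not escape query-syntax special characters in the user input.

```python
def phrase_or_terms(user_input, phrase_boost=10):
    """Build: "whole phrase"^boost OR word1 OR word2 ..."""
    words = user_input.split()
    phrase = '"' + " ".join(words) + '"'
    return f"{phrase}^{phrase_boost} OR " + " OR ".join(words)


print(phrase_or_terms("royal enfield bikes"))
```

With the boost, documents matching the exact phrase should rank above documents that match only some of the words.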

Can anyone give me suggestions.










--
View this message in context: 
http://lucene.472066.n3.nabble.com/AND-OR-logic-tp4091931.html
Sent from the Solr - User mailing list archive at Nabble.com.


Custom Request Handlers

2013-09-25 Thread PAVAN
Hi,


  I am new to Solr. Can anyone suggest how I can write my own custom
handlers? I need to filter queries based on 4 to 5 conditions.






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Custom-Request-Handlers-tp4091936.html
Sent from the Solr - User mailing list archive at Nabble.com.


autocomplete_edge type split words

2013-09-25 Thread elisabeth benoit
Hello,

I am using Solr 4.2.1 and I have an autocomplete_edge type defined in
schema.xml



  





   
  




 
  


When I have a request with more than one word, for instance "rue de la", my
request doesn't match my autocomplete_edge field unless I use quotes
around the query. In other words, q=rue de la doesn't work and q="rue de la"
works.

I've checked the request with debugQuery=on, and I can see that in the first
case the query is split into words. I don't understand why, since my field
type uses KeywordTokenizerFactory.

Does anyone have a clue how I can query my field without using quotes?

Thanks,
Elisabeth


Re: searching within documents

2013-09-25 Thread Nutan
Okay thanks,
I will surely read that page.
Thanks a lot.


On Wed, Sep 25, 2013 at 4:21 PM, Erick Erickson [via Lucene] <
ml-node+s472066n4091910...@n3.nabble.com> wrote:

> Because your "text" field type is completely broken.
>
> for instance, at indexing time
> > lowercasing before using WordDelimiterFilterFactory
> means that one of the purposes of WDFF, breaking
> tokens up on upper/lower case transitions can't happen.
> Which you apparently intend since you have
> splitOnCaseChange="1"
>
> > you apply stemming at index time but not query time
> (not finding q=acted)
>
> For your query,
> > you don't lowercase the input (contents:Sushant
> not getting hits).
>
> Please spend some time with the admin/analysis page
> to understand the transformations at index time and
> query time, that'll clarify a lot.
>
> Best
> Erick
>
> On Tue, Sep 24, 2013 at 6:49 AM, Nutan <[hidden email]> wrote:
>
> > Why does it happen that for a few words it shows output and for a few it
> > does not?
> >
> > For example,
> > 1)
> > q=contents:Sushant
> >
> > numfound is 0
> >
> > q=contents:sushant
> >
> > gives output
> >
> > 2)
> > q=contents:acted
> >
> > numfound 0
> >
> > q=contents:well
> >
> > gives output
> >
> > This is the document:
> > 
> >   
> > 13
> > 
> >   chetan
> > 
> > worst book
> > solr,lucene
> > Sushant acted well in kaipoche.
> > 3 mistakes
> > 0012345654334
> > 
> > 
> >
> > Please do reply.Help will be appreciated.
> > Thanks in advance.
> >
> >
> >
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/searching-within-documents-tp4090173p4091713.html
>
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
>




--
View this message in context: 
http://lucene.472066.n3.nabble.com/searching-within-documents-tp4090173p4091938.html
Sent from the Solr - User mailing list archive at Nabble.com.
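
Erick's point about filter order in the quoted reply can be illustrated with a toy model. This is only a sketch in plain Python, not Solr code: `split_on_case_change` is a hypothetical stand-in for what WordDelimiterFilterFactory does with splitOnCaseChange="1", and the token "PowerShot" is a made-up example.

```python
import re


def split_on_case_change(token):
    # Toy stand-in for splitOnCaseChange="1": break the token wherever a
    # lowercase letter is followed directly by an uppercase letter.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split()


def split_then_lowercase(text):
    # Word-delimiter step first, lowercase step second.
    tokens = []
    for word in text.split():
        tokens.extend(split_on_case_change(word))
    return [t.lower() for t in tokens]


def lowercase_then_split(text):
    # Lowercase step first: no case transitions are left to split on.
    tokens = []
    for word in text.lower().split():
        tokens.extend(split_on_case_change(word))
    return tokens


print(split_then_lowercase("PowerShot"))
print(lowercase_then_split("PowerShot"))
```

With the lowercase step first, the case transition is gone before the splitter sees the token, which is the effect Erick describes. The admin/analysis page shows the real behaviour for an actual field type.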

Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread JMill
Thanks.


On Wed, Sep 25, 2013 at 1:46 PM, Ing. Jorge Luis Betancourt Gonzalez <
jlbetanco...@uci.cu> wrote:

> I've used a separate core for storing suggestions, based on what I see
> in: https://github.com/cominvent/autocomplete. You can check the blog
> post at
> www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/.
> This is really flexible; on the downside, it does not use the suggester
> component, so these are like regular queries against a separate core.
>
> Greetings!
>
> ----- Original Message -----
> From: "Erick Erickson"
> To: solr-user@lucene.apache.org
> Sent: Wednesday, September 25, 2013 6:16:51
> Subject: Re: Implementing Solr Suggester for Autocomplete (multiple columns)
>
> I've sometimes seen this handled by clever tokenizing. For "Bill Rogers",
> index (untokenized) something like
> Bill|Bill Rogers
> Rogers|Bill Rogers
>
> Your suggester then is a simple term lookup (see TermsComponent)
> which is quite fast. What you _don't_ get is autocorrect. But if you
> use terms.prefix, you can also control whether it's whole word match
> or not. To get whole-word in the above, you would set your prefix to
> "Rogers|" for instance. Or you may want to leave off the "|" to see
> more of an autocomplete-type response.
>
> Then, of course, when you display this you need to only display what's
> after the "|" (or whatever delimiter you use).
>
> One other note, this will be case sensitive, so you probably want to
> do casing yourself, index things like
> rogers|Bill Rogers
> and lowercase what you send in to terms component.
>
> Best,
> Erick
>
>
>
> On Tue, Sep 24, 2013 at 2:01 PM, JMill wrote:
> > Hi,
> >
> > I'm using Solr's Suggester function to implement an autocomplete feature.
> > I have it set up to check against the "username" and "name" fields. The
> > problem is that when running a query against the name, the second term
> > after the whitespace (the surname) returns 0 results. It works if the
> > query is a partial name starting from the beginning, e.g. given the name
> > "Bill Rogers", a query for "Rogers" will return 0 results whereas a query
> > for "Bill" will return a positive result (Bill Rogers). As for the
> > username, it's not working at all.
> >
> > I am after the following behaviour.
> >
> > Match any partial words in the fields "username" or "name" and return the
> > results. If there is a match in the field "name", then return the whole
> > name, e.g. given the queries "Rogers" or "Bill", return "Bill Rogers" (not
> > the single word that was a match).
> >
> > schema.xml extract
> > ..
> >  />
> >  
> >  > multiValued="true" omitNorms="true" omitTermFreqAndPositions="false" />
> > ...
> > 
> > 
> > ...
> >
> >  > positionIncrementGap="100">
> >  
> >
> >
> >
> >  
> > 
> >
> >
> > solrconfig.xml
> >
> > 
> > 
> >suggest
> >org.apache.solr.spelling.suggest.Suggester
> > > name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup
> >autocomplete  
> >0.005
> >true
> >
> > 
> >
> > 
> >
> > ..
> >  > name="/suggest">
> >   
> > true
> > suggest
> > true
> > 5
> > true
> >   
> >   
> >  spellcheck
> >   
> > 
>
> 
> III International Winter School at UCI, February 17 to 28, 2014.
> See www.uci.cu
>
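
The delimiter trick Erick describes in the quoted message above can be sketched roughly as follows. The helper names and the field name `name_suggest` are hypothetical; `terms`, `terms.fl` and `terms.prefix` are the TermsComponent request parameters. The sketch only builds strings and does not talk to Solr.

```python
from urllib.parse import urlencode

DELIM = "|"


def index_terms(full_name):
    # One untokenized term per word: "<lowercased word>|<full name>".
    return [f"{word.lower()}{DELIM}{full_name}" for word in full_name.split()]


def terms_params(user_input, field="name_suggest", whole_word=False):
    # Lowercase the input to match the lowercased key; append the delimiter
    # when only whole-word matches are wanted.
    prefix = user_input.lower() + (DELIM if whole_word else "")
    return urlencode({"terms": "true", "terms.fl": field, "terms.prefix": prefix})


def display_value(term):
    # Show only what follows the delimiter.
    return term.split(DELIM, 1)[1]


print(index_terms("Bill Rogers"))
print(terms_params("Rog"))
print(display_value("rogers|Bill Rogers"))
```

Lowercasing only the part before the delimiter keeps the display value ("Bill Rogers") intact while making the lookup case-insensitive.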


Re: Sum function causing error in solr

2013-09-25 Thread Tanu Garg
Thanks, Erick.

I know we should name variables the way Java variables are named, but these
are predefined variables. I can't rename them.

Please suggest a solution that works within this constraint.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sum-function-causing-error-in-solr-tp4091901p4091927.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Sum function causing error in solr

2013-09-25 Thread Jack Krupansky

Escape any special characters in field names with backslash.

-- Jack Krupansky




Re: Sum function causing error in solr

2013-09-25 Thread Yonik Seeley
On Wed, Sep 25, 2013 at 6:40 AM, Tanu Garg  wrote:
> My field name is "kcmeta/bookmark/count"

You can always use the "field" function to handle whacky field
names... unquoted if it's simple enough, or quoted if it's not.
field(kcmeta/bookmark/count)
or
field("kcmeta/bookmark/count")
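
A small sketch of wrapping an awkward field name this way before passing it to a function such as sum(). The helper names are hypothetical, and the second argument to sum() is made up for illustration, since the original function call is not shown in this thread.

```python
def field_func(name):
    # Wrap a field name in the quoted form of the field() function.
    # Names containing a double quote are rejected rather than escaped,
    # to avoid guessing at the escaping rules.
    if '"' in name:
        raise ValueError("field name must not contain a double quote")
    return f'field("{name}")'


def sum_func(*args):
    # Build a sum(...) function call from already-formatted arguments.
    return "sum(" + ",".join(args) + ")"


print(sum_func(field_func("kcmeta/bookmark/count"), "1"))
```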


-Yonik
http://lucidworks.com


Re: AND/OR logic

2013-09-25 Thread Jack Krupansky
Is your requirement specifying that none of the alternative results should 
be shown if there are any primary results, or that the primary results 
should be shown first, or that both are acceptable?


You could do a phrase query for the full list of terms, and then do a second 
query with OR if the first returns no results.


Or, if the natural scoring doesn't already rank the phrase matches high 
enough, you could boost the phrase and OR it with a de-boosted, 
parenthesized OR query of the individual terms to show both, with the 
phrase matches first.


You could also do that second approach by simply using the edismax query 
parser with the pf parameter to boost the phrase of all terms.
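
As a rough sketch of the last two approaches: the first helper builds the boosted phrase ORed with the individual terms, and the second builds edismax parameters with pf. The field name `text`, the boost of 10 and mm=1 are assumptions for illustration, and the input is assumed to be plain words with no query-syntax characters that need escaping.

```python
from urllib.parse import urlencode


def boosted_query(user_input, phrase_boost=10):
    # "whole phrase"^boost OR (term1 OR term2 ...)
    terms = user_input.split()
    phrase = '"' + " ".join(terms) + '"'
    return f"{phrase}^{phrase_boost} OR ({' OR '.join(terms)})"


def edismax_params(user_input, field="text"):
    # mm=1 lets a document match on any single term; pf boosts documents
    # in which all the terms appear together as a phrase.
    return {"defType": "edismax", "q": user_input, "qf": field,
            "pf": field, "mm": "1"}


print(boosted_query("royal enfield bikes"))
print(urlencode(edismax_params("royal enfield bikes")))
```

The two-query approach needs no special syntax: send the phrase query first and fall back to the OR query only when numFound is 0.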


-- Jack Krupansky




Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread JMill
I followed the instructions and I am able to browse to
"http://localhost:8983/solr/ac/browse?q=ce&debugQuery=true", but I am not
getting any suggestions (I typed "c" in the Find textbox).

I wonder if loading the example data is the problem? The response I get
after executing the script "feed-ac.sh" (step 3) is the following.

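user$ ./feed-ac.sh
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">2239</int></lst>
</response>
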

Are you able to confirm whether this is the expected response?






Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread Ing. Jorge Luis Betancourt Gonzalez
The response does not show any error. Can you confirm that the data is in Solr? 
You should be able to see the numDocs stat in the admin UI. Which version of 
Solr are you using? I believe the example was tested on Solr 3.x, at least 
at the time I used it.
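
Besides the admin UI, the document count can also be checked with a match-all query that returns no rows. A minimal sketch follows, assuming the core is reachable at the `ac` URL mentioned earlier in the thread. The helper names are hypothetical, and the sample response body is made up to show the shape of a JSON select response.

```python
import json
from urllib.parse import urlencode


def count_url(base="http://localhost:8983/solr/ac"):
    # Match-all query, zero rows, JSON response: only the count is needed.
    return base + "/select?" + urlencode({"q": "*:*", "rows": 0, "wt": "json"})


def num_found(response_body):
    # Pull numFound out of a JSON select response.
    return json.loads(response_body)["response"]["numFound"]


# Made-up response body, for illustration only.
sample = '{"responseHeader":{"status":0},"response":{"numFound":42,"start":0,"docs":[]}}'
print(count_url())
print(num_found(sample))
```

To run it against a live server, the body could be fetched with `urllib.request.urlopen(count_url()).read()`.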

Regards,



Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread JMill
A simple query through the admin UI (*:*) confirms the data exists. The version
I'm working with is Solr 4.4.0. The autocomplete manual refers to 3.x. I
wonder if this is the problem?



Solr sorting situation!

2013-09-25 Thread Gustav
Hey guys, I know that sorting in Solr works as a cascade, but I'm kind of stuck
in a rough situation:


I've got the following sort fields in a Product:
ProductSequence (0~99, default=99), ProductName (text_general),
ProductType (0~1), ProductHasPrinciples (0~1), ProductInStock (0~1),
ProductPrice (0~999,99)

Then I must apply the following rules to these products (I'll include the
sorting I've done so far):

1 - Sort by the ProductSequence field

Ok, easy! &sort=ProductSequence asc

2 - Sort by the product name entered (that is, the products whose name
matches the search input come first), and sort those by
price ascending

So far, so good! &sort=ProductSequence asc, rint(query({!edismax
qf=ProductName v=searchTerm})) desc, price asc

3 - And now, my problem! Then sort the rest of the products, first those
with ProductType = 1, in which:
ProductHasPrinciples = 1 will be sorted by price ascending
ProductHasPrinciples = 0 will be sorted by price ascending

Oh boy! &sort=ProductSequence asc, rint(query({!edismax qf=ProductName
v=searchTerm})) desc, price asc, productType desc,ProductHasPrinciples desc,
price asc?

That didn't work, because it doesn't satisfy the third rule
("ProductType = 1"), so I tried:

&sort=ProductSequence asc, rint(query({!edismax qf=ProductName
v=searchTerm})) desc, productType desc,ProductHasPrinciples desc, price
asc?

Which also didn't work, because it breaks the second rule ("price ascending
sorting").


Any ideas/suggestions?

Thanks :)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-sorting-situation-tp4091966.html
Sent from the Solr - User mailing list archive at Nabble.com.


SOLR: Searching on OpenNLP fields is unstable

2013-09-25 Thread rashi gandhi
HI,



I am working on OpenNLP integration with Solr. I have successfully applied
the patch (LUCENE-2899-x.patch) to the latest Solr source code (branch_4x).

I have designed an OpenNLP analyzer and indexed data with it. The analyzer
declaration in schema.xml is as follows:



  





























 





 





And field declared for this analyzer:





Here is the problem: when I search over this field, Detail_Person, the
results are not consistent.



When I search for Detail_Person:brett, it returns one document.





But when I fire the same query again, it returns zero documents.



Searching is not stable on the OpenNLP field: sometimes it returns documents
and sometimes it does not, even though the documents are there.

If I search on non-OpenNLP fields, it works properly; the results are
stable and correct.

Please help me make the Solr results consistent.

Thanks in Advance.


Solr 4.4 Import from CSV to Multi-value field - Adds quote on last value

2013-09-25 Thread Mike L.
 
Solr Family,
 
    I'm a Solr 3.6 user who just pulled down 4.4 yesterday and noticed 
something a bit odd when importing into a multi-valued field. I wouldn't be 
surprised if there's a user-error on my end but hopefully there isn't a bug. 
Here's the situation.
 
I created some test data to import and one field needs to be split into a 
multi-valued field. This data resides within a .csv file and is structured like 
the following: 
 
(below are replacement field names. Also note - there are no quotes " within 
the data.)
 
field1|field2|field3|field4_valueA,field4_valueB,field4_valueC

http://[myserver]/solr/[my corename]/update?commit=true&separator=|&escape=\&stream.file=[location of file]&fieldnames=field1,field2,field3,field4&optimize=true&stream.contentType=application/csv&f.field4.split=true&f.field4.separator=%2C
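
For reference, the same request can be assembled with every special character percent-encoded, which rules out the pipe, backslash and comma being mangled in transit. This is only a sketch: the base URL and file path passed in are placeholders, and the parameter names are the ones used in the request above.

```python
from urllib.parse import urlencode


def csv_update_url(base, csv_path):
    # Same parameters as the request above; urlencode percent-encodes the
    # pipe, backslash, comma and slash characters.
    params = {
        "commit": "true",
        "separator": "|",
        "escape": "\\",
        "stream.file": csv_path,
        "fieldnames": "field1,field2,field3,field4",
        "optimize": "true",
        "stream.contentType": "application/csv",
        "f.field4.split": "true",
        "f.field4.separator": ",",
    }
    return base + "/update?" + urlencode(params)


# Placeholder server, core and path, for illustration only.
print(csv_update_url("http://localhost:8983/solr/mycore", "/tmp/test.csv"))
```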
 
After importing the data, I see results similar to the below for the 
multi-valued field, field4: 
 

field4_valueA
field4_valueB
field4_valueC"  (Why is there a trailing quote here?) 

 
I also noticed if only 1 value is being inserted into this multivalued field - 
there is no issue. It always happens on the last value.
 
Thanks in advance,
Cheers!
Mike

RE: Problem loading my codec sometimes

2013-09-25 Thread Chris Hostetter

: I still wonder why it can create a new index using my codec, but not 
: load an index previously created with my codec.  In solrconfig.xml, I 
: specify the CodecFactory along with the package name, whereas the codec 
: name that is read from the index file has no package name.  Could that 
: be the problem?  I think that's the way it's supposed to be.  Could it 
: be that Solr has my jar in the classpath, but SPI is not registering my 
: codec class from the jar?  I'm not familiar with SPI.

it's very possible that there is a classloader / SPI runtime race 
condition in looking up the codec names found in segment files.  This sort 
of classpath related runtime issue is extremely hard to write tests for.

Could you please file a bug and include...

 * the source of your codec (or a simple sample codec that you can 
   also use to reproduce the problem)
 * a zipped up copy of your entire solr home directory, including 
   the jar file containing your codec so we can verify the SPI files 
   are in there properly
    - no need to include an actual index here
 * some simple sample documents in xml or json that we can index 
   with the schema you are using 



-Hoss


Re: Solr 4.4 Import from CSV to Multi-value field - Adds quote on last value

2013-09-25 Thread Mike L.
 
Nevermind, I figured it out. Excel was applying a hidden quote on the data. 
Thanks anyway.
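
A quick way to catch this kind of hidden character before importing is to scan the exported file for quote characters. This is a minimal sketch with a hypothetical helper name, and the sample rows are made up to mirror the layout described in the original message.

```python
def find_stray_quotes(lines, quote='"'):
    # Return (line number, line text) for every line containing a quote.
    hits = []
    for number, line in enumerate(lines, start=1):
        if quote in line:
            hits.append((number, line.rstrip("\n")))
    return hits


# Made-up rows: the second one carries a trailing quote like the one seen
# on the last value of the multi-valued field.
sample = [
    "a|b|c|v1,v2,v3\n",
    'a|b|c|v1,v2,v3"\n',
]
print(find_stray_quotes(sample))
```

For a real file, pass the open file object instead of a list, for example `find_stray_quotes(open(path, encoding="utf-8"))`.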


RE: Problem loading my codec sometimes

2013-09-25 Thread Scott Schneider
Ah, I fixed it.  I wasn't properly including the org.apache.lucene.codecs.Codec 
file in my jar.  I wasn't sure if it was necessary in Solr, since I specify my 
factory in solrconfig.xml.  I think that's why I could create a new index, but 
not load an existing one.

Scott




RE: Problem loading my codec sometimes

2013-09-25 Thread Chris Hostetter

: Ah, I fixed it.  I wasn't properly including the 
: org.apache.lucene.codecs.Codec file in my jar.  I wasn't sure if it was 
: necessary in Solr, since I specify my factory in solrconfig.xml.  I 
: think that's why I could create a new index, but not load an existing 
: one.

Ah interesting.  

yes, you definitely need the SPI registration in the jar file so that it 
can resolve codec files found on disk when opening them -- the 
configuration in solrconfig.xml tells solr which codec to use when writing 
new segments, but it must respect the codec information in segments found 
on disk when opening them (that's how the index backcompat works), and 
those are looked up via SPI.
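
One way to check the registration described above is to look inside the jar for the service file named after the Codec class. The sketch below uses Python's zipfile module. The helper name is hypothetical, and so is the class name com.example.MyCodec used in the demonstration jar, which is built in memory.

```python
import io
import zipfile

# Service file that lists the Codec implementations a jar provides.
SPI_ENTRY = "META-INF/services/org.apache.lucene.codecs.Codec"


def registered_codecs(jar):
    # Return the class names listed in the jar's Codec service file,
    # or an empty list when the file is missing.
    with zipfile.ZipFile(jar) as zf:
        if SPI_ENTRY not in zf.namelist():
            return []
        text = zf.read(SPI_ENTRY).decode("utf-8")
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if line:
            names.append(line)
    return names


# In-memory jar with a hypothetical codec class, for illustration only.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr(SPI_ENTRY, "# my codec\ncom.example.MyCodec\n")
print(registered_codecs(demo))
```

An empty result for a plugin jar would match the symptom in this thread: a new index can be written using the factory configured in solrconfig.xml, but an existing one cannot be opened by codec name.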

Can you do me a favor please and still file an issue with these details?  
The attachments I asked about before would still be handy, but probably 
not necessary -- at a minimum could you show us the "jar tf" output of 
your plugin jar when you were having the problem.

Even if the codec factory code can find the configured codec on startup, 
we should probably throw a very loud error right away if that same codec 
can't be found by name using SPI to prevent people from running into 
confusing problems when making mistakes like this.
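
For anyone hitting the same error, the SPI registration described above can 
be checked without starting Solr. The following is only a rough sketch: it 
assumes the standard Java SPI layout, where a jar lists its codec classes in 
META-INF/services/org.apache.lucene.codecs.Codec. The function name 
registered_codecs and the class com.example.MyCodec are made up for 
illustration and are not part of Solr or Lucene.

```python
import io
import zipfile

# Standard Java SPI location for Lucene codec registrations.
SPI_ENTRY = "META-INF/services/org.apache.lucene.codecs.Codec"


def registered_codecs(jar):
    """Return the class names listed in the jar's Codec SPI file, or [] if it is absent."""
    with zipfile.ZipFile(jar) as zf:
        if SPI_ENTRY not in zf.namelist():
            return []
        text = zf.read(SPI_ENTRY).decode("utf-8")
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # SPI files allow '#' comments
        if line:
            names.append(line)
    return names


# Demo with an in-memory jar (hypothetical contents) so the sketch runs
# without a real plugin jar on disk.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr(SPI_ENTRY, "# registered codecs\ncom.example.MyCodec\n")
print(registered_codecs(demo))
```

An empty list for a real plugin jar would point to the situation in this 
thread: the codec class is on the classpath, so new segments can be written, 
but the codec cannot be looked up by name when existing segments are opened.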



-Hoss


Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread Ing. Jorge Luis Betancourt Gonzalez
Perhaps this could be the issue. I know that this works perfectly in Solr 3.6 
(the version I was using). I don't currently have a Solr 4.4 to run tests on, 
but what was done in that core should work in Solr 4.4. Perhaps some setting 
needs tweaking, but it's impossible to know without checking the logs. If 
there is any incompatibility, it should show up in the logs.

Regards,

- Original Message -
From: "JMill" 
To: solr-user@lucene.apache.org
Sent: Wednesday, September 25, 2013 11:10:32
Subject: Re: Implementing Solr Suggester for Autocomplete (multiple columns)

A simple query through admin (*:*) confirms the data exists. The version
I'm working with is solr 4.4.0. The autocomplete manual refers to 3.x. I
wonder if this is the problem?


On Wed, Sep 25, 2013 at 4:01 PM, Ing. Jorge Luis Betancourt Gonzalez <
jlbetanco...@uci.cu> wrote:

> The response does not show any error. Can you confirm that the data is in
> solr? You should be able to see the numDocs stats in the admin UI. Which
> version of Solr are you using? I believe that the example was tested on
> Solr 3.x, at least at the time I used it.
>
> Regards,
>
> - Original Message -
> From: "JMill" 
> To: solr-user@lucene.apache.org
> Sent: Wednesday, September 25, 2013 10:57:31
> Subject: Re: Implementing Solr Suggester for Autocomplete (multiple columns)
>
> I followed the instructions, I am able to browse to
> "http://localhost:8983/solr/ac/browse?q=ce&debugQuery=true" but I am not
> getting any suggestions (typed in c in Find Textbox).
>
> I wonder if loading the example data is the problem?  The response I get
> after executing the script  "feed-ac.sh" (step 3) is the following.
>
> user$ ./feed-ac.sh
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">0</int><int name="QTime">2239</int></lst>
> </response>
>
> Are you able to confirm if this is the expected response?
>
>
>
>
> On Wed, Sep 25, 2013 at 1:46 PM, Ing. Jorge Luis Betancourt Gonzalez <
> jlbetanco...@uci.cu> wrote:
>
> > I've used a separate core for storing suggestions, based on what I saw
> > in: https://github.com/cominvent/autocomplete. You can check the blog
> > post on
> > www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/.
> > This is really flexible; on the downside it does not use the suggester
> > component, so these are like regular queries against a separate core.
> >
> > Greetings!
> >
> > - Original Message -
> > From: "Erick Erickson" 
> > To: solr-user@lucene.apache.org
> > Sent: Wednesday, September 25, 2013 6:16:51
> > Subject: Re: Implementing Solr Suggester for Autocomplete (multiple
> > columns)
> >
> > I've sometimes seen this handled by clever tokenizing. For "Bill Rogers",
> > index (untokenized) something like
> > Bill|Bill Rogers
> > Rogers|Bill Rogers
> >
> > Your suggester then is a simple term lookup (see TermsComponent)
> > which is quite fast. What you _don't_ get is autocorrect. But if you
> > use terms.prefix, you can also control whether it's whole word match
> > or not. To get whole-word in the above, you would set your prefix to
> > "Rogers|" for instance. Or you may want to leave off the "|" to see
> > more of an autocomplete-type response.
> >
> > Then, of course, when you display this you need to only display what's
> > after the "|" (or whatever delimiter you use).
> >
> > One other note, this will be case sensitive, so you probably want to
> > do casing yourself, index things like
> > rogers|Bill Rogers
> > and lowercase what you send in to terms component.
> >
> > Best,
> > Erick
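
The "word|Full Name" scheme described in the message above can be sketched 
roughly as follows. This only simulates the idea in memory; it is not Solr 
code. The helper names (index_terms, display_value, suggest) are made up for 
illustration, and the prefix match stands in for what a terms.prefix request 
against TermsComponent would return.

```python
def index_terms(full_name, delimiter="|"):
    """Build one lowercased 'word|Full Name' term per word in the name."""
    return [f"{word.lower()}{delimiter}{full_name}" for word in full_name.split()]


def display_value(term, delimiter="|"):
    """Return only the part after the first delimiter, which is what the user sees."""
    return term.split(delimiter, 1)[1]


def suggest(terms, user_input, whole_word=False, delimiter="|"):
    """Simulate a prefix lookup over the sorted term list (assumed behaviour)."""
    prefix = user_input.lower() + (delimiter if whole_word else "")
    return [display_value(t, delimiter) for t in sorted(terms) if t.startswith(prefix)]


terms = index_terms("Bill Rogers")
print(terms)
print(suggest(terms, "Rog"))
print(suggest(terms, "Rog", whole_word=True))
```

Lowercasing both the indexed key and the user input is what makes the lookup 
case insensitive, while the part after the delimiter keeps the original 
casing for display.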
> >
> >
> >
> > On Tue, Sep 24, 2013 at 2:01 PM, JMill 
> > wrote:
> > > Hi,
> > >
> > > I'm using Solr's Suggester function to implement an autocomplete
> > > feature. I have it set up to check against the "username" and "name"
> > > fields. The problem is that when running a query against the name, the
> > > second term after whitespace (the surname) returns 0 results. It works
> > > if the query is a partial name starting from the beginning, e.g. given
> > > the name "Bill Rogers", a query for "Rogers" will return 0 results
> > > whereas a query for "Bill" will return a positive result (Bill Rogers).
> > > As for the username, it's not working at all.
> > >
> > > I am after the following behaviour.
> > >
> > > Match any partial words in the fields "username" or "name" and return
> > > the results.  If there is a match in the field "name" then return the
> > > whole name, e.g. given the queries "Rogers" or "Bill", return "Bill
> > > Rogers" (not the single word that was a match).
> > >
> > > schema.xml extract
> > > ..
> > >  > />
> > >  
> > >  stored="false"
> > > multiValued="true" omitNorms="true" omitTermFreqAndPositions="false" />
> > > ...
> > > 
> > > 
> > > ...
> > >
> > >  > > positionIncrementGap="100">
> > >  
> > >
> > >
> > >
> > >  
> > > 
> > >
> > >
> > > solrconfig.xml
> > >
> > > 
> > > 
> > >suggest
> > > name="classname">org.apache.solr.spelling.suggest.Suggester
> > > > > name="lookupImpl">org.apache.solr.spelling.sugg

Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread JMill
Could it be the jquery library that is the problem?   I opened up
solr-home/ac/conf/velocity/head.vm with an editor and I see a reference to
the jquery library but I can't seem to find the directory referenced,
 line:  

Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread Ing. Jorge Luis Betancourt Gonzalez
Try querying the core where the data has been imported, something like:

http://localhost:8983/solr/suggestions/select?q=uc

In the previous URL, "suggestions" is the name I gave to the core, so you 
should change it to match yours. If you get results, then the problem could be 
the jQuery dependency. I don't remember making any changes; as far as I know 
that js file is bundled with Solr (at least in 3.x). Perhaps you could change 
it to the correct jQuery version for Solr 4.4. If you go into the admin panel 
(in Solr 3.6):

http://localhost:8983/solr/admin/schema.jsp

and inspect the loaded code, you will see that the required file 
(jquery-1.4.2.min.js) gets loaded. Solr 4.4 should load a similar file, but 
perhaps a more recent version.

Perhaps you could change that part to something like:

  <script src="#{url_root}/js/lib/jquery-1.7.2.min.js"></script>

which is used at least on a Solr 4.1 that I have lying around here somewhere.

In any case you can test the suggestions using the URL that I suggested at the 
top of this mail; you should be able to see the possible results, though of 
course in a less fancy way.

- Original Message -
From: "JMill" 
To: solr-user@lucene.apache.org
Sent: Wednesday, September 25, 2013 13:59:32
Subject: Re: Implementing Solr Suggester for Autocomplete (multiple columns)

Could it be the jquery library that is the problem?   I opened up
solr-home/ac/conf/velocity/head.vm with an editor and I see a reference to
the jquery library but I can't seem to find the directory referenced,
 line:  

Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread JMill
That seems to work. I get back an xml containing a bunch of suggestions.
Can we agree that it's jquery that's the problem?


Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread Ing. Jorge Luis Betancourt Gonzalez
As far as I can tell it is. You can check that by looking at the console logs 
in your browser (Chrome, Firefox, etc.). There should be an error saying that 
the $ function is not found. In any case I'll try to set up a testing 
environment here, but I can only use Solr 4.1, which I have here. I haven't 
downloaded/tested the 4.4 version yet. Did you try replacing the line that 
includes the jquery-1.4.3.min.js with the new one?

- Original Message -
From: "JMill" 
To: solr-user@lucene.apache.org
Sent: Wednesday, September 25, 2013 14:44:53
Subject: Re: Implementing Solr Suggester for Autocomplete (multiple columns)

That seems to work. I get back an xml containing a bunch of suggestions.
Can we agree that it's jquery that's the problem?


Cross index join query performance

2013-09-25 Thread Peter Keegan
I'm doing a cross-core join query and the join query is 30X slower than
each of the 2 individual queries. Here are the queries:

Main query: http://localhost:8983/solr/mainindex/select?q=title:java
QTime: 5 msec
hit count: 1000

Sub query: http://localhost:8983/solr/subindex/select?q=+fld1:[0.1 TO 0.3]
QTime: 4 msec
hit count: 25K

Join query:
http://localhost:8983/solr/mainindex/select?q=title:java&fq={!join fromIndex=mainindex
toIndex=subindex from=docid to=docid}fld1:[0.1 TO 0.3]
QTime: 160 msec
hit count: 205

Here are the index spec's:

mainindex size: 117K docs, 1 segment
mainindex schema:
   
   
   docid

subindex size: 117K docs, 1 segment
subindex schema:
   
   
   docid

With debugQuery=true I see:
  "debug":{
    "join":{
      "{!join from=docid to=docid fromIndex=subindex}fld1:[0.1 TO 0.3]":{
        "time":155,
        "fromSetSize":24742,
        "toSetSize":24742,
        "fromTermCount":117810,
        "fromTermTotalDf":117810,
        "fromTermDirectCount":117810,
        "fromTermHits":24742,
        "fromTermHitsTotalDf":24742,
        "toTermHits":24742,
        "toTermHitsTotalDf":24742,
        "toTermDirectCount":24627,
        "smallSetsDeferred":115,
        "toSetDocsAdded":24742}},

Via profiler and debugger, I see 150 msec spent in the outer
'while(term!=null)' loop in: JoinQueryWeight.getDocSet(). This seems like a
lot of time to join the bitsets. Does this seem right?
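
For anyone reproducing the timing, here is a small sketch of building the 
join request with every special character escaped. It uses the parsed form 
of the filter shown in the debug output above ({!join from=... to=... 
fromIndex=...}). The helper name join_query_url is made up for illustration, 
and the sketch only builds the URL; it does not address the join cost itself.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def join_query_url(base, main_query, from_field, to_field, from_index, sub_query):
    """Build a select URL whose fq is a cross-core {!join} filter, fully URL-encoded."""
    fq = f"{{!join from={from_field} to={to_field} fromIndex={from_index}}}{sub_query}"
    params = [("q", main_query), ("fq", fq), ("debugQuery", "true")]
    return f"{base}/select?{urlencode(params)}"


url = join_query_url(
    "http://localhost:8983/solr/mainindex",
    "title:java",
    "docid",
    "docid",
    "subindex",
    "fld1:[0.1 TO 0.3]",
)
print(url)
```

Encoding the braces, brackets and spaces avoids depending on how a browser or 
HTTP client rewrites the raw URL before it reaches Solr.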

Peter


Re: Cross index join query performance

2013-09-25 Thread Peter Keegan
I forgot to mention - this is Solr 4.3

Peter



On Wed, Sep 25, 2013 at 3:38 PM, Peter Keegan wrote:

> I'm doing a cross-core join query and the join query is 30X slower than
> each of the 2 individual queries. Here are the queries:
>
> Main query: http://localhost:8983/solr/mainindex/select?q=title:java
> QTime: 5 msec
> hit count: 1000
>
> Sub query: http://localhost:8983/solr/subindex/select?q=+fld1:[0.1 TO 0.3]
> QTime: 4 msec
> hit count: 25K
>
> Join query:
> http://localhost:8983/solr/mainindex/select?q=title:java&fq={!join fromIndex=mainindex
>  toIndex=subindex from=docid to=docid}fld1:[0.1 TO 0.3]
> QTime: 160 msec
> hit count: 205
>
> Here are the index spec's:
>
> mainindex size: 117K docs, 1 segment
> mainindex schema:
> required="true" multiValued="false" />
> stored="true" multiValued="false" />
>docid
>
> subindex size: 117K docs, 1 segment
> subindex schema:
> required="true" multiValued="false" />
> required="false" multiValued="false" />
>docid
>
> With debugQuery=true I see:
>   "debug":{
> "join":{
>   "{!join from=docid to=docid fromIndex=subindex}fld1:[0.1 TO 0.3]":{
> "time":155,
> "fromSetSize":24742,
> "toSetSize":24742,
> "fromTermCount":117810,
> "fromTermTotalDf":117810,
> "fromTermDirectCount":117810,
> "fromTermHits":24742,
> "fromTermHitsTotalDf":24742,
> "toTermHits":24742,
> "toTermHitsTotalDf":24742,
> "toTermDirectCount":24627,
> "smallSetsDeferred":115,
> "toSetDocsAdded":24742}},
>
> Via profiler and debugger, I see 150 msec spent in the outer
> 'while(term!=null)' loop in: JoinQueryWeight.getDocSet(). This seems like a
> lot of time to join the bitsets. Does this seem right?
>
> Peter
>
>


Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread JMill
Not yet but I do see the "$" not found in console.

On Wednesday, September 25, 2013, Ing. Jorge Luis Betancourt Gonzalez <
jlbetanco...@uci.cu> wrote:
> As far as I can tell it is. You can check that by seeing the Console logs
on your browser (chrome, firefox, etc.). There should be an error saying
that the $ function it's not found. In any case I'll try to set up a
testing environment here, but I can only use solr 4.1, which I have here. I
haven't downloaded/tested the 4.4 version yet. Do you try replacing the
line that includes the jquery-1.4.3.min.js with the new one?

Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread Ing. Jorge Luis Betancourt Gonzalez
That's an indication that jQuery can't be loaded, and without jQuery the 
autocomplete plugin won't work. This plugin is used to show the popup list that 
shows up at the bottom of the input.

- Original Message -
From: "JMill" 
To: solr-user@lucene.apache.org
Sent: Wednesday, September 25, 2013 15:40:00
Subject: Re: Implementing Solr Suggester for Autocomplete (multiple columns)

Not yet but I do see the "$" not found in console.


Re: Implementing Solr Suggester for Autocomplete (multiple columns)

2013-09-25 Thread JMill
I guess we found the culprit.

On Wednesday, September 25, 2013, Ing. Jorge Luis Betancourt Gonzalez <
jlbetanco...@uci.cu> wrote:
> That's and indication that jQuery can't be loaded, and without jQuery the
autocomplete plugin won't work. This plugin is used to show the popup list
that show up at the bottom of the input.
>
> - Original Message -
> From: "JMill" 
> To: solr-user@lucene.apache.org
> Sent: Wednesday, September 25, 2013 15:40:00
> Subject: Re: Implementing Solr Suggester for Autocomplete (multiple
> columns)
>
> Not yet but I do see the "$" not found in console.
>
> On Wednesday, September 25, 2013, Ing. Jorge Luis Betancourt Gonzalez <
> jlbetanco...@uci.cu> wrote:
>> As far as I can tell it is. You can check that by seeing the Console logs
> on your browser (chrome, firefox, etc.). There should be an error saying
> that the $ function it's not found. In any case I'll try to set up a
> testing environment here, but I can only use solr 4.1, which I have here.
I
> haven't downloaded/tested the 4.4 version yet. Do you try replacing the
> line that includes the jquery-1.4.3.min.js with the new one?
>>
>> - Original Message -
>> From: "JMill" 
>> To: solr-user@lucene.apache.org
>> Sent: Wednesday, September 25, 2013 14:44:53
>> Subject: Re: Implementing Solr Suggester for Autocomplete (multiple
>> columns)
>>
>> That seems to work. I get back an xml containing a bunch of suggestions.
>> Can we agree that it's jquery that's the problem?
>>
>> On Wednesday, September 25, 2013, Ing. Jorge Luis Betancourt Gonzalez <
>> jlbetanco...@uci.cu> wrote:
>>> Try quering the core where the data has been imported, something like:
>>>
>>> http://localhost:8983/solr/suggestions/select?q=uc
>>>
>>> In the previous URL suggestions is the name I give to the core, so this
>> should change, if you get results, then the problem could be the jquery
>> dependency. I don't remember doing any change, as far as I know that js
>> file is bundled with solr (at leat in 3.x) version perhaps you could
> change
>> it the correct jquery version on solr 4.4, if you go into the admin panel
>> (in solr 3.6):
>>>
>>> http://localhost:8983/solr/admin/schema.jsp
>>>
>>> And inspect the loaded code, the required file (jquery-1.4.2.min.js)
gets
>> loaded in solr 4.4 it should load a similar file, but perhaps a more
> recent
>> version.
>>>
>>> Perhaps you could change that part to something like:
>>>
>>>   > src="#{url_root}/js/lib/jquery-1.7.2.min.js">
>>>
>>> Which is used at least on a solr 4.1 that I have laying aroud here
>> somewhere.
>>>
>>> In any case you can test the suggestions using the URL that I suggest on
>> the top of this mail, in that case you should be able to see the possible
>> results, of course in a less fancy way.
>>>
>>> - Original Message -
>>> From: "JMill" 
>>> To: solr-user@lucene.apache.org
>>> Sent: Wednesday, September 25, 2013 13:59:32
>>> Subject: Re: Implementing Solr Suggester for Autocomplete (multiple
>>> columns)
>>>
>>> Could it be the jquery library that is the problem?   I opened up
>>> solr-home/ac/conf/velocity/head.vm with an editor and I see a reference
> to
>>> the jquery library but I can't seem to find the directory referenced,
>>>  line:  

How to always tokenize on underscore?

2013-09-25 Thread Greg Preston
[Using SolrCloud 4.4.0]

I have a text field where the data will sometimes be delimited by
whitespace, and sometimes by underscore.  For example, both of the
following are possible input values:

Group_EN_1000232142_blah_1000232142abc_foo
Group EN 1000232142 blah 1000232142abc foo

What I'd like to do is have underscores treated as spaces for
tokenization purposes.  I've tried using a PatternReplaceFilterFactory
with:


  



  
  



  


but that seems to do the pattern replacement on each token, rather
than splitting tokens into multiple tokens based on the pattern.  So
with the input "Group_EN_1000232142_blah_1000232142abc_foo" I end up
with a single token of "group en 1000232142 blah 1000232142abc foo"
rather than what I want, which is 6 tokens: "group", "en",
"1000232142", "blah", "1000232142abc", "foo".

Is there a way to configure for the behavior I'm looking for, or would
I need to write a custom tokenizer?

Thanks!

-Greg


Re: How to always tokenize on underscore?

2013-09-25 Thread Jack Krupansky

Use the char filter instead:
http://lucene.apache.org/core/4_4_0/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceCharFilterFactory.html
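
The reason the char filter helps is that it rewrites the raw text before the 
tokenizer sees it, whereas a token filter only sees tokens that have already 
been cut. The following is a rough simulation of that ordering in plain 
Python, not Solr code. It assumes a whitespace tokenizer followed by 
lowercasing, which is a guess at the original field type, and the function 
names are made up for illustration.

```python
import re


def whitespace_tokenize(text):
    """Split on runs of whitespace, roughly like a whitespace tokenizer."""
    return text.split()


def token_filter_pipeline(text):
    """Tokenize first, then replace '_' inside each token: tokens are never split."""
    return [re.sub("_", " ", tok).lower() for tok in whitespace_tokenize(text)]


def char_filter_pipeline(text):
    """Replace '_' in the raw character stream first, then tokenize the result."""
    return [tok.lower() for tok in whitespace_tokenize(re.sub("_", " ", text))]


value = "Group_EN_1000232142_blah_1000232142abc_foo"
print(token_filter_pipeline(value))
print(char_filter_pipeline(value))
```

The first pipeline yields the single token with embedded spaces described in 
the question, and the second yields the six separate tokens that were wanted.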

-- Jack Krupansky




Re: How to always tokenize on underscore?

2013-09-25 Thread Greg Preston
This is exactly what I needed.  Thank you!

-Greg


On Wed, Sep 25, 2013 at 2:48 PM, Jack Krupansky  wrote:
> Use the char filter instead:
> http://lucene.apache.org/core/4_4_0/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceCharFilterFactory.html
>
> -- Jack Krupansky
>
> -Original Message- From: Greg Preston
> Sent: Wednesday, September 25, 2013 5:43 PM
> To: solr-user@lucene.apache.org
> Subject: How to always tokenize on underscore?
>
>
> [Using SolrCloud 4.4.0]
>
> I have a text field where the data will sometimes be delimited by
> whitespace, and sometimes by underscore.  For example, both of the
> following are possible input values:
>
> Group_EN_1000232142_blah_1000232142abc_foo
> Group EN 1000232142 blah 1000232142abc foo
>
> What I'd like to do is have underscores treated as spaces for
> tokenization purposes.  I've tried using a PatternReplaceFilterFactory
> with:
>
> positionIncrementGap="100">
>  
>
>
> replacement=" " replace="all" />
>  
>  
>
>
> replacement=" " replace="all" />
>  
>
>
> but that seems to do the pattern replacement on each token, rather
> than splitting tokens into multiple tokens based on the pattern.  So
> with the input "Group_EN_1000232142_blah_1000232142abc_foo" I end up
> with a single token of "group en 1000232142 blah 1000232142abc foo"
> rather than what I want, which is 6 tokens: "group", "en",
> "1000232142", "blah", "1000232142abc", "foo".
>
> Is there a way to configure for the behavior I'm looking for, or would
> I need to write a customer tokenizer?
>
> Thanks!
>
> -Greg


Re: AND/OR logic

2013-09-25 Thread PAVAN
Hi Jack,

Thank you for your reply.

I am doing my configuration in the following way. Can you please check
it and suggest any changes I need based on my requirement?

I am storing the query string in the following way.

If q="a b c d e f"

s=a+b+c+d+e+f
s1=a+b+c+d+e
s2=f
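
As a rough sketch of the split described above, the three parameters could be 
derived from the user's query on the client side like this. The function name 
split_query is made up for illustration, and it assumes terms are separated 
by whitespace and joined with "+" as in the example.

```python
def split_query(q):
    """Return (s, s1, s2): all terms, all but the last term, and the last term."""
    terms = q.split()
    if not terms:
        raise ValueError("query must contain at least one term")
    s = "+".join(terms)
    s1 = "+".join(terms[:-1])  # empty string when the query has a single term
    s2 = terms[-1]
    return s, s1, s2


print(split_query("a b c d e f"))
```

Note that for a single-term query s1 comes out empty, so the request handler 
that ANDs the s1 and s2 sub-queries would need a fallback for that case.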






   query
   debug
  
  
   
   
   
   
   *,score
   false
   all
   true
  
  
  3
  json

   AND
   fsw_title
   14
   {!edismax v=$s}
   mpId desc
   city:All OR _query_:"{!field f=city v=$c}"
   true
   mpId
   4
   true
   score desc
   edismax
   false
   tsw_title^15.0 tf_title^10.0 tsw_keywords^1
keywords^0.5
   fsw_title~1^50.0
   fsw_title~1^25.0
   sum(product(typeId,100),weightage)
   
   OR
   fsw_title
   14
   _query_:"{!edismax qf=$qfs1 pf=$pfx pf2=$pf2x v=$s1}" AND
_query_:"{!edismax qf=$qfs2 v=$s2}"
   mpId desc, score desc
   city:All OR _query_:"{!field f=city v=$c}"
   true
   mpId
   4
   true
   score desc
   false
   fsw_title^30 tsw_title^20 tf_title^15.0
keywords^1.0
   tsw_title^15.0 tf_title^10.0 tsw_keywords^1
keywords^0.5
   fsw_title~1^100.0
   fsw_title~1^50.0
   fsw_title~1^25.0
   product(typeId,100)





--
View this message in context: 
http://lucene.472066.n3.nabble.com/AND-OR-logic-tp4091931p4092059.html
Sent from the Solr - User mailing list archive at Nabble.com.