Re: Is it possible to grouping solr results by their domain ?

2012-04-09 Thread Jan Høydahl
Sure!

http://wiki.apache.org/solr/FieldCollapsing

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 9. apr. 2012, at 07:27, hadi wrote:

> I have crawled many sites with Nutch and am using Solr 3.4 to browse the results,
> but I want to group the results by their domain.
> For example, if I search for a site like "tabnak", the first result should contain only
> http://tabnak.ir, and the other results from that domain should not be shown.
> 
> For example, I want to hide results such as these (as Google does):
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Is-it-possible-to-grouping-solr-results-by-their-domain-tp3895995p3895995.html
> Sent from the Solr - User mailing list archive at Nabble.com.
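
As a rough illustration of the field collapsing feature linked above, the sketch below only builds a grouped query URL. The parameters group, group.field and group.limit are the ones described on the FieldCollapsing wiki page; the base URL and the field name "host" are assumptions for illustration, and the grouping field would need to be a single-valued indexed field in the actual schema.

```python
from urllib.parse import urlencode


def grouped_query_url(base_url, query, group_field, per_group=1):
    """Build a Solr select URL that collapses results on one field.

    base_url and group_field are placeholders; adjust them to the real
    Solr instance and schema.
    """
    params = [
        ("q", query),
        ("group", "true"),               # turn result grouping on
        ("group.field", group_field),    # one group per distinct value
        ("group.limit", str(per_group)), # documents returned per group
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and field name, for illustration only.
url = grouped_query_url("http://localhost:8983/solr/select", "tabnak", "host")
print(url)
```

With group.limit=1, each domain would contribute a single document to the result list, which is the behaviour asked for in the question.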



Cloud-aware request processing?

2012-04-09 Thread Benson Margulies
I'm working on a prototype of a scheme that uses SolrCloud to, in
effect, distribute a computation by running it inside of a request
processor.

If there are N shards and M operations, I want each node to perform
M/N operations. That, of course, implies that I know N.

Is that fact available anyplace inside Solr, or do I need to just configure it?
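
To make the M/N split concrete, here is a minimal sketch of how each shard could pick its share of the operations once N is known. It does not answer where N comes from inside Solr (that is the open question above); num_shards and shard_index are simply passed in as assumed inputs.

```python
def operations_for_shard(num_operations, num_shards, shard_index):
    """Return the range of operation indices handled by one shard.

    Operations are split as evenly as possible; the first
    (num_operations % num_shards) shards take one extra operation.
    """
    if num_shards <= 0 or not 0 <= shard_index < num_shards:
        raise ValueError("shard_index must be in [0, num_shards)")
    base, extra = divmod(num_operations, num_shards)
    start = shard_index * base + min(shard_index, extra)
    size = base + (1 if shard_index < extra else 0)
    return range(start, start + size)


# Example: M = 10 operations over N = 3 shards.
for shard in range(3):
    print(shard, list(operations_for_shard(10, 3, shard)))
```

Every operation index lands on exactly one shard, so the shards can work independently as long as they all agree on N and on their own index.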


Re: Cloud-aware request processing?

2012-04-09 Thread Jan Høydahl
Hi,

Instead of using Solr, you may want to have a look at Hadoop or another 
framework for distributed computation, see e.g. 
http://java.dzone.com/articles/comparison-gridcloud-computing

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 9. apr. 2012, at 13:41, Benson Margulies wrote:

> I'm working on a prototype of a scheme that uses SolrCloud to, in
> effect, distribute a computation by running it inside of a request
> processor.
> 
> If there are N shards and M operations, I want each node to perform
> M/N operations. That, of course, implies that I know N.
> 
> Is that fact available anyplace inside Solr, or do I need to just configure 
> it?



'No JSP support' error in embedded Jetty for solrCloud as of apache-solr-4.0-2012-04-02_11-54-55

2012-04-09 Thread Benson Margulies
Starting the leader with:

 java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=rnicloud
-DzkRun -DnumShards=3 -Djetty.port=9167  -jar start.jar

and browsing to

http://localhost:9167/solr/rnicloud/admin/zookeeper.jsp

I get:

HTTP ERROR 500

Problem accessing /solr/rnicloud/admin/zookeeper.jsp. Reason:

JSP support not configured
Powered by Jetty://


Re: Cloud-aware request processing?

2012-04-09 Thread Benson Margulies
Jan Høydahl,

My problem is intimately connected to Solr. It is not a batch job for
Hadoop; it is a distributed real-time query scheme. I hate to add yet
another complex framework if a Solr RP can do the job simply.

For this problem, I can transform a Solr query into a subset query on
each shard, and then let the SolrCloud mechanism.

I am well aware of the 'zoo' of alternatives, and I will be evaluating
them if I can't get what I want from Solr.

On Mon, Apr 9, 2012 at 9:34 AM, Jan Høydahl  wrote:
> Hi,
>
> Instead of using Solr, you may want to have a look at Hadoop or another 
> framework for distributed computation, see e.g. 
> http://java.dzone.com/articles/comparison-gridcloud-computing
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 9. apr. 2012, at 13:41, Benson Margulies wrote:
>
>> I'm working on a prototype of a scheme that uses SolrCloud to, in
>> effect, distribute a computation by running it inside of a request
>> processor.
>>
>> If there are N shards and M operations, I want each node to perform
>> M/N operations. That, of course, implies that I know N.
>>
>> Is that fact available anyplace inside Solr, or do I need to just configure 
>> it?
>


RE: Re: Cloud-aware request processing?

2012-04-09 Thread Darren Govoni

"...it is a distributed real-time query scheme..."

SolrCloud does this already. It treats all the shards like one-big-index, and you can 
query it normally to get "subset" results from each shard. Why do you have to 
re-write the query for each shard? Seems unnecessary.

--- Original Message ---
On 4/9/2012 08:45 AM Benson Margulies wrote:

Jan Høydahl,

My problem is intimately connected to Solr. it is not a batch job for
hadoop, it is a distributed real-time query scheme. I hate to add yet
another complex framework if a Solr RP can do the job simply.

For this problem, I can transform a Solr query into a subset query on
each shard, and then let the SolrCloud mechanism.

I am well aware of the 'zoo' of alternatives, and I will be evaluating
them if I can't get what I want from Solr.





Is http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster up to date?

2012-04-09 Thread Benson Margulies
I specify -Dcollection.configName=rnicloud, but the admin gui tells me
that I have a collection named 'collection1'.

And, as reported in a prior email, the admin UI URL in there seems wrong.


Re: JNDI in db-data-config.xml websphere

2012-04-09 Thread tech20nn
You have to use the exact JNDI name in db-data-config.xml, because unmanaged
threads in WebSphere do not have access to the java:comp/env namespace.

The resource name cannot be mapped to the WebSphere JDBC datasource name via
a reference definition in web.xml.

I am now using jndiName="jdbc/testdb" instead of
jndiName="java:comp/env/jdbc/testdb", and have also defined the WebSphere JDBC
datasource as "jdbc/testdb".

--
View this message in context: 
http://lucene.472066.n3.nabble.com/JNDI-in-db-data-config-xml-websphere-tp3884787p3896869.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Re: Cloud-aware request processing?

2012-04-09 Thread Benson Margulies
On Mon, Apr 9, 2012 at 9:50 AM, Darren Govoni  wrote:
> "...it is a distributed real-time query scheme..."
>
> SolrCloud does this already. It treats all the shards like one-big-index,
> and you can query it normally to get "subset" results from each shard. Why
> do you have to re-write the query for each shard? Seems unnecessary.

For reasons described in previous email that I won't repeat here.



Re: Solr is indexing but not showing results

2012-04-09 Thread Ahmet Arslan
>  stored="true"
> required="true"/>
>  stored="true"
> required="true"/>      

The string type is not tokenized; values are indexed verbatim. Use a different
type for full-text search, e.g. type="text".
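
The difference can be pictured with a toy model. This is not Solr's analysis chain, only a sketch that assumes a verbatim "string" field stores the whole value as one term while a "text" field stores lowercased whitespace tokens; the function names are made up for illustration.

```python
def string_field_terms(value):
    """A string-like field: the whole value is a single indexed term."""
    return {value}


def text_field_terms(value):
    """A text-like field: lowercased whitespace tokens (toy analyzer)."""
    return {token.lower() for token in value.split()}


def matches(term, indexed_terms):
    """A single-term query matches only if that exact term was indexed."""
    return term in indexed_terms


doc_value = "Solr is indexing"
print(matches("indexing", string_field_terms(doc_value)))          # False
print(matches("Solr is indexing", string_field_terms(doc_value)))  # True
print(matches("indexing", text_field_terms(doc_value)))            # True
```

In this model a single-word query finds nothing in the string field unless it equals the entire stored value, which would be consistent with documents being indexed but not showing up in results.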



Stumped on using a custom update request processor with SolrCloud

2012-04-09 Thread Benson Margulies
If you would be so kind as to look at
https://issues.apache.org/jira/browse/SOLR-3342, you will see that I
tried to use a working configuration for a URP of mine with SolrCloud,
and received in return an NPE.

Somehow or another, by default, the XmlUpdateRequestHandler ends up
using (I think) the PeerSync class to establish the indexibleId. When
I add in my URP, I am somehow turning this off, and I'm currently
stumped as to how to turn it back on.

If you don't care to read the JIRA, my relevant configuration is right
here. Is there something else I need in the 'defaults' list, or some
other processor I need to put in my chain?

   




  


  

  RNI




RE: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread Young, Cody
I believe you're looking for what's called "matrix counts".

Please see this JIRA issue. To my knowledge it has been committed in trunk but 
not 3.x.

https://issues.apache.org/jira/browse/SOLR-2898

This feature is accessed by using group.facet=true

Cody

-Original Message-
From: danjfoley [mailto:d...@micamedia.com] 
Sent: Saturday, April 07, 2012 7:02 PM
To: solr-user@lucene.apache.org
Subject: Re: To truncate or not to truncate (group.truncate vs. facet)

I've been searching for a solution to my issue, and this seems to come closest 
to it. But not exactly.

I am indexing clothing. Each article of clothing comes in many sizes and 
colors, and can belong to any number of categories.

For example take the following: I add 6 documents to solr as follows:

product, color, size, category

shirt A, red, small, valentines day
shirt A, red, large, valentines day
shirt A, blue, small, valentines day
shirt A, blue, large, valentines day
shirt A, green, small, valentines day
shirt A, green, large, valentines day

I'd like my facet counts to return as follows:

color

red (1)
blue (1)
green (1)

size

small (1)
large (1)

category

valentines day (1)

But they come back like this:

color:
red (2)
blue (2)
green (2)

size:
small (2)
large (2)

category
valentines day (6)

I see the group.facet parameter in version 4.0 does exactly this. However, how
can I make this happen now? There are all sorts of e-commerce systems out there
that facet exactly how I'm asking. I thought Solr was supposed to be the very
best, fastest search system, yet it doesn't seem to be able to facet correctly for
items with multiple values?

Am I indexing my data wrong?

How can I make this happen?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/To-truncate-or-not-to-truncate-group-truncate-vs-facet-tp3838797p3893744.html
Sent from the Solr - User mailing list archive at Nabble.com.
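
The counting that the question asks for can be sketched outside Solr. The snippet below is only a model of the idea, not Solr's implementation: a plain facet counts documents per value, while a grouped facet counts distinct groups (here, products) per value. The function names are hypothetical.

```python
from collections import Counter, defaultdict

# The six documents from the example: one per colour/size combination.
DOCS = [
    {"product": "shirt A", "color": c, "size": s, "category": "valentines day"}
    for c in ("red", "blue", "green")
    for s in ("small", "large")
]


def plain_facet_counts(docs, field):
    """Count documents per facet value."""
    return dict(Counter(doc[field] for doc in docs))


def grouped_facet_counts(docs, field, group_field):
    """Count distinct groups per facet value."""
    groups = defaultdict(set)
    for doc in docs:
        groups[doc[field]].add(doc[group_field])
    return {value: len(members) for value, members in groups.items()}


print(plain_facet_counts(DOCS, "color"))
print(grouped_facet_counts(DOCS, "color", "product"))
print(grouped_facet_counts(DOCS, "category", "product"))
```

Counting groups rather than documents gives 1 for every colour, size and category here, because all six documents belong to the same product.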


RE: how to correctly facet clothing multiple sizes and colors?

2012-04-09 Thread Robert Petersen
You *could* do it by making one and only one Solr document for each
clothing item, then just have the front end render all the sizes and
colors available for that item as size/color pickers on the product
page.  You can add all the colors and sizes to the one document in the
index so they are searchable also, but the caveat is that they won't
show up as a facet.  This is just one simple approach.
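
A minimal sketch of that one-document-per-item idea, assuming the variants arrive as flat rows like the ones in the question. The field names and the merge_variants helper are illustrative only; the merged fields would need to be multi-valued in the schema.

```python
def merge_variants(rows, key="product", multi_fields=("color", "size", "category")):
    """Collapse per-variant rows into one document per product.

    Each multi-valued field keeps the distinct values in first-seen order.
    """
    products = {}
    for row in rows:
        doc = products.setdefault(
            row[key], {key: row[key], **{f: [] for f in multi_fields}}
        )
        for field in multi_fields:
            if row[field] not in doc[field]:
                doc[field].append(row[field])
    return list(products.values())


rows = [
    {"product": "shirt A", "color": c, "size": s, "category": "valentines day"}
    for c in ("red", "blue", "green")
    for s in ("small", "large")
]
print(merge_variants(rows))
```

The front end could then render the color and size lists of the single document as pickers on the product page.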

-Original Message-
From: danjfoley [mailto:d...@micamedia.com] 
Sent: Saturday, April 07, 2012 7:04 PM
To: solr-user@lucene.apache.org
Subject: how to correctly facet clothing multiple sizes and colors?

I've been searching for a solution to my issue, and this seems to come
closest to it. But not exactly. 

I am indexing clothing. Each article of clothing comes in many sizes and
colors, and can belong to any number of categories. 

For example take the following: I add 6 documents to solr as follows: 

product, color, size, category 

shirt A, red, small, valentines day 
shirt A, red, large, valentines day 
shirt A, blue, small, valentines day 
shirt A, blue, large, valentines day 
shirt A, green, small, valentines day 
shirt A, green, large, valentines day 

I'd like my facet counts to return as follows: 

color 

red (1) 
blue (1) 
green (1) 

size 

small (1) 
large (1) 

category 

valentines day (1) 

But they come back like this: 

color: 
red (2) 
blue (2) 
green (2) 

size: 
small (2) 
large (2) 

category 
valentines day (6) 

I see the group.facet parameter in version 4.0 does exactly this.
However, how can I make this happen now? There are all sorts of
e-commerce systems out there that facet exactly how I'm asking. I
thought Solr was supposed to be the very best, fastest search system,
yet it doesn't seem to be able to facet correctly for items with
multiple values?

Am i indexing my data wrong? 

how can i make this happen?

--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-correctly-facet-clothing-multiple-sizes-and-colors-tp3893747p3893747.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: 'No JSP support' error in embedded Jetty for solrCloud as of apache-solr-4.0-2012-04-02_11-54-55

2012-04-09 Thread Ryan McKinley
zookeeper.jsp was removed (along with all JSP stuff) in trunk

Take a look at the cloud tab in the UI, or check the /zookeeper
servlet for the JSON raw output

ryan


On Mon, Apr 9, 2012 at 6:42 AM, Benson Margulies  wrote:
> Starting the leader with:
>
>  java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=rnicloud
> -DzkRun -DnumShards=3 -Djetty.port=9167  -jar start.jar
>
> and browsing to
>
> http://localhost:9167/solr/rnicloud/admin/zookeeper.jsp
>
> I get:
>
> HTTP ERROR 500
>
> Problem accessing /solr/rnicloud/admin/zookeeper.jsp. Reason:
>
>    JSP support not configured
> Powered by Jetty://


Re: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread danjfoley
I did get this working with version 4. However my facet queries still don't 
group.

Sent from my phone

- Reply message -
From: "Young, Cody [via Lucene]" 
Date: Mon, Apr 9, 2012 12:45 pm
Subject: To truncate or not to truncate (group.truncate vs. facet)
To: "danjfoley" 



I believe you're looking for what's called, "Matrix Counts"

Please see this JIRA issue. To my knowledge it has been committed in trunk but 
not 3.x.

https://issues.apache.org/jira/browse/SOLR-2898

This feature is accessed by using group.facet=true

Cody



--
View this message in context: 
http://lucene.472066.n3.nabble.com/To-truncate-or-not-to-truncate-group-truncate-vs-facet-tp3838797p3897422.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread Young, Cody
You tried adding the parameter

&group.facet=true ?

Cody

-Original Message-
From: danjfoley [mailto:d...@micamedia.com] 
Sent: Monday, April 09, 2012 10:09 AM
To: solr-user@lucene.apache.org
Subject: Re: To truncate or not to truncate (group.truncate vs. facet)

I did get this working with version 4. However my facet queries still don't 
group.

Sent from my phone




RE: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread Young, Cody
One other thing: I believe that you need to be using facet.field on
single-valued string fields for group.facet to function properly. Are the
fields you're faceting on multiValued=false?

Cody

-Original Message-
From: Young, Cody [mailto:cody.yo...@move.com] 
Sent: Monday, April 09, 2012 10:36 AM
To: solr-user@lucene.apache.org
Subject: RE: To truncate or not to truncate (group.truncate vs. facet)

You tried adding the parameter

&group.facet=true ?

Cody




Re: Solr is indexing but not showing results

2012-04-09 Thread srini
Hi, thanks for your reply. As per your suggestion, I changed the type of the
XML field to text.

   

But when I start Solr, it throws the following exception:
SEVERE: org.apache.solr.common.SolrException: Unknown fieldtype 'text'
specified on field XML

Any suggestions? (Thanks for your reply.)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-is-indexing-but-not-showing-results-tp3897176p3897626.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr is indexing but not showing results

2012-04-09 Thread Walter Underwood
You will need to define or customize a field type for text. 

The example schema.xml file that is installed with Solr 3.5 has several kinds
of text fields; "text_general" and "text_en" are good places to start. You can
use one of those, then customize it.

wunder

On Apr 9, 2012, at 11:27 AM, srini wrote:

> Hi Thanks for your reply. As per your suggestion I changed XML field type to
> text. 
> 
>  required="true"/>   
> 
> but when I start solr it is throwing following exception.
> SEVERE: org.apache.solr.common.SolrException: Unknown fieldtype 'text'
> specified on field XML
> 
> Any suggestions!!(Thanks for your reply)
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-is-indexing-but-not-showing-results-tp3897176p3897626.html
> Sent from the Solr - User mailing list archive at Nabble.com.







Re: Solr is indexing but not showing results

2012-04-09 Thread Jeevanandam Madanagopal
Srini -

This "text" field type comes as a sample configuration in the Solr distribution.
Check it; it may suit your needs!


  






  
  







  



-Jeevanandam
 
On Apr 10, 2012, at 12:08 AM, Walter Underwood wrote:

> You will need to define or customize a field type for text. 
> 
> The example schema.xml file that is installed with Solr 3.5 has a several 
> kinds of text fields, "text_general" and "text_en" are good places to start. 
> You can use one of those, then customize it.
> 
> wunder
> 
> On Apr 9, 2012, at 11:27 AM, srini wrote:
> 
>> Hi Thanks for your reply. As per your suggestion I changed XML field type to
>> text. 
>> 
>> > required="true"/>   
>> 
>> but when I start solr it is throwing following exception.
>> SEVERE: org.apache.solr.common.SolrException: Unknown fieldtype 'text'
>> specified on field XML
>> 
>> Any suggestions!!(Thanks for your reply)
>> 
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Solr-is-indexing-but-not-showing-results-tp3897176p3897626.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
> 
> 



Re: Solr is indexing but not showing results

2012-04-09 Thread Walter Underwood
That is not a good configuration. Synonyms should be expanded at index time, 
not query time. --wunder
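
To show what index-time expansion means, here is a toy model; it is not Solr's SynonymFilterFactory, and the synonym table and function names are assumptions made up for illustration. With expansion at index time, every synonym of a token is stored alongside it, so a plain single-term query can match without any query-side rewriting.

```python
# Hypothetical synonym table: each entry maps a term to its full synonym set.
SYNONYMS = {
    "tv": {"tv", "television"},
    "television": {"tv", "television"},
}


def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def index_terms(text, expand_synonyms=False):
    """Return the set of terms that would be stored for a document."""
    terms = set()
    for token in analyze(text):
        if expand_synonyms:
            terms |= SYNONYMS.get(token, {token})
        else:
            terms.add(token)
    return terms


doc = "Cheap TV"
print("television" in index_terms(doc))                        # False
print("television" in index_terms(doc, expand_synonyms=True))  # True
```

In this model the query side stays a simple term lookup; the cost is a larger index and the need to reindex whenever the synonym table changes.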

On Apr 9, 2012, at 11:43 AM, Jeevanandam Madanagopal wrote:

> Srini -
> 
> This "text" datatype comes as sample configuration in SOLR distribution. 
> Check this, it may suit your need!
> 
>  autoGeneratePhraseQueries="true">
>  
>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" 
> splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>
>  
>  
>
> ignoreCase="true" expand="true"/>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" 
> splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>
>  
>
> 
> 
> -Jeevanandam
> 
> On Apr 10, 2012, at 12:08 AM, Walter Underwood wrote:
> 
>> You will need to define or customize a field type for text. 
>> 
>> The example schema.xml file that is installed with Solr 3.5 has a several 
>> kinds of text fields, "text_general" and "text_en" are good places to start. 
>> You can use one of those, then customize it.
>> 
>> wunder
>> 
>> On Apr 9, 2012, at 11:27 AM, srini wrote:
>> 
>>> Hi Thanks for your reply. As per your suggestion I changed XML field type to
>>> text. 
>>> 
>>> >> required="true"/>   
>>> 
>>> but when I start solr it is throwing following exception.
>>> SEVERE: org.apache.solr.common.SolrException: Unknown fieldtype 'text'
>>> specified on field XML
>>> 
>>> Any suggestions!!(Thanks for your reply)
>>> 
>>> --
>>> View this message in context: 
>>> http://lucene.472066.n3.nabble.com/Solr-is-indexing-but-not-showing-results-tp3897176p3897626.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>> 






Re: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread danjfoley
I am using group.facet and it works fine for regular facet.field but not for 
facet.query

Sent from my phone

- Reply message -
From: "Young, Cody [via Lucene]" 
Date: Mon, Apr 9, 2012 1:38 pm
Subject: To truncate or not to truncate (group.truncate vs. facet)
To: "danjfoley" 



One other thing, I believe that you need to be using facet.field on single 
valued string fields for group.facet to function properly. Are the fields 
you're faceting on multiValued=false?

Cody



--
View this message in context: 
http://lucene.472066.n3.nabble.com/To-truncate-or-not-to-truncate-group-truncate-vs-facet-tp3838797p3897694.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr is indexing but not showing results

2012-04-09 Thread Jeevanandam Madanagopal
I agree only partially; it depends. For instance, some synonym mappings may not 
get expanded at index time (e.g. when the index is populated frequently from 
different sources). So it is good to apply synonyms at index time as well as 
query time to get complete coverage. Most of the time I have used similar 
settings to meet customer requirements.

For example:

Below is a sample text field type with synonyms at both index and query time 
(the two analyzers share the same tokenizer and filter structure, so a single 
common analyzer config would work too).

[fieldType XML not preserved by the archive]

-Jeevanandam


On Apr 10, 2012, at 12:18 AM, Walter Underwood wrote:

> That is not a good configuration. Synonyms should be expanded at index time, 
> not query time. --wunder
> 
> On Apr 9, 2012, at 11:43 AM, Jeevanandam Madanagopal wrote:
> 
>> Srini -
>> 
>> This "text" datatype comes as sample configuration in SOLR distribution. 
>> Check this, it may suit your need!
>> 
>> > autoGeneratePhraseQueries="true">
>> 
>>   
>>   >   ignoreCase="true"
>>   words="stopwords.txt"
>>   enablePositionIncrements="true"
>>   />
>>   > generateNumberParts="1" catenateWords="1" catenateNumbers="1" 
>> catenateAll="0" splitOnCaseChange="1"/>
>>   
>>   > protected="protwords.txt"/>
>>   
>> 
>> 
>>   
>>   > ignoreCase="true" expand="true"/>
>>   >   ignoreCase="true"
>>   words="stopwords.txt"
>>   enablePositionIncrements="true"
>>   />
>>   > generateNumberParts="1" catenateWords="0" catenateNumbers="0" 
>> catenateAll="0" splitOnCaseChange="1"/>
>>   
>>   > protected="protwords.txt"/>
>>   
>> 
>>   
>> 
>> 
>> -Jeevanandam
>> 
>> On Apr 10, 2012, at 12:08 AM, Walter Underwood wrote:
>> 
>>> You will need to define or customize a field type for text. 
>>> 
>>> The example schema.xml file that is installed with Solr 3.5 has several 
>>> kinds of text fields; "text_general" and "text_en" are good places to 
>>> start. You can use one of those, then customize it.
>>> 
>>> wunder
>>> 
>>> On Apr 9, 2012, at 11:27 AM, srini wrote:
>>> 
>>>> Hi Thanks for your reply. As per your suggestion I changed XML field type
>>>> to text.
>>>>
>>>> required="true"/>
>>>>
>>>> but when I start solr it is throwing following exception.
>>>> SEVERE: org.apache.solr.common.SolrException: Unknown fieldtype 'text'
>>>> specified on field XML
>>>>
>>>> Any suggestions!!(Thanks for your reply)
>>>>
>>>> --
>>>> View this message in context:
>>>> http://lucene.472066.n3.nabble.com/Solr-is-indexing-but-not-showing-results-tp3897176p3897626.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> 
> 
> 
> 
> 



Re: Solr is indexing but not showing results

2012-04-09 Thread Walter Underwood
There are some well-understood problems with query-time synonyms. Read about 
them here:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

Expanding synonyms at both index and query time causes a different problem, 
over-counting the score for any term in the synonym map.

wunder

On Apr 9, 2012, at 12:14 PM, Jeevanandam Madanagopal wrote:

> I agree partially, it actually depends. For instance during index time few of 
> the synonyms mapping may or may not expand (for e.g.. frequent data index 
> population from different source). So good apply at index time as well as 
> query time to achieve complete ratio. Mostly of the time I did similar 
> settings to meet customer requirements.
> 
> For example: 
> -
> Below sample text datatype with synonyms at index & query time (below config 
> has similar analyzer structure of tokenizer & filter; so we can keep commonly 
> one  config too.) 
> 
>  autoGeneratePhraseQueries="true">
>  
>
> ignoreCase="true" expand="true"/>
> words="stopwords.txt" enablePositionIncrements="true" />
> generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" 
> splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>
>  
>  
>
> ignoreCase="true" expand="true"/>
>words="stopwords.txt" enablePositionIncrements="true" />
> generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" 
> splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>
>  
>
> 
> -Jeevanandam
> 
> 

--
Walter Underwood
wun...@wunderwood.org





Re: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread Martijn v Groningen
The group.facet option only works for field facets (facet.field). Other
facet types (query, range and pivot) aren't supported yet.
The group.facet option works for both single and multivalued fields specified
in the facet.field parameter.

Martijn

On 9 April 2012 20:58, danjfoley  wrote:

> I am using group.facet and it works fine for regular facet.field but not
> for facet.query
>
> Sent from my phone
>
> - Reply message -
> From: "Young, Cody [via Lucene]"  >
> Date: Mon, Apr 9, 2012 1:38 pm
> Subject: To truncate or not to truncate (group.truncate vs. facet)
> To: "danjfoley" 
>
>
>
> One other thing, I believe that you need to be using facet.field on single
> valued string fields for group.facet to function properly. Are the fields
> you're faceting on multiValued=false?
>
> Cody
>
> -Original Message-
> From: Young, Cody [mailto:cody.yo...@move.com]
> Sent: Monday, April 09, 2012 10:36 AM
> To: solr-user@lucene.apache.org
> Subject: RE: To truncate or not to truncate (group.truncate vs. facet)
>
> You tried adding the parameter
>
> &group.facet=true ?
>
> Cody
>
> -Original Message-
> From: danjfoley [mailto:d...@micamedia.com]
> Sent: Monday, April 09, 2012 10:09 AM
> To: solr-user@lucene.apache.org
> Subject: Re: To truncate or not to truncate (group.truncate vs. facet)
>
> I did get this working with version 4. However my facet queries still
> don't group.
>
> Sent from my phone
>
> - Reply message -
> From: "Young, Cody [via Lucene]"  >
> Date: Mon, Apr 9, 2012 12:45 pm
> Subject: To truncate or not to truncate (group.truncate vs. facet)
> To: "danjfoley" 
>
>
>
> I believe you're looking for what's called, "Matrix Counts"
>
> Please see this JIRA issue. To my knowledge it has been committed in trunk
> but not 3.x.
>
> https://issues.apache.org/jira/browse/SOLR-2898
>
> This feature is accessed by using group.facet=true
>
> Cody
>

How to facet data from a multivalued field?

2012-04-09 Thread Thiago
Hello everybody,

I've already searched the forum for this topic, but I didn't find any
case like this. I apologize if this topic has already been discussed.

I'm having a problem in faceting a multivalued field. My field is called
series, and it has names of TV series like the big bang theory, two and a
half men ...

In this field I can have a lot of TV series names. For example:


   Two and a Half Men
   How I Met Your Mother
   The Big Bang Theory


What I want to do is count how many documents are related to each
series. I'm doing it using facet search on this field, but it's returning
each word separately. Like this:





   91
   91
   21
   45
   45
   21
   45
   45
   91
   21
   45






And what I want is something like:





   21
   45
   91






Is there any possible way to do it with facet search? I don't want the
terms, I just want each string including the white spaces. Do I have to
change my fieldtype to do this?

Thanks to everybody.

Thiago 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-facet-data-from-a-multivalued-field-tp3897853p3897853.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to facet data from a multivalued field?

2012-04-09 Thread Darren Govoni
Take a look at the field type (analyzer) used for that field.
Don't use one that tokenizes or stems the field.
You want to leave the text as is. I forget the exact setting for that,
but it's documented in there somewhere.
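
To illustrate why the field type matters, here is a minimal plain-Python sketch (not Solr code) of faceting on tokenized terms versus whole strings. The per-series document counts (21, 45, 91) and the stop word list are assumptions chosen to be consistent with the numbers in the question, and each document is assumed to carry a single series for simplicity. In the Solr example schema the untokenized type is usually the "string" type (solr.StrField).

```python
from collections import Counter

# Assumed stop words removed by a tokenizing text analyzer.
STOPWORDS = {"a", "and", "the"}

# Assumed number of documents per series (hypothetical data).
SERIES_DOCS = {
    "Two and a Half Men": 21,
    "How I Met Your Mother": 45,
    "The Big Bang Theory": 91,
}


def tokenized_facets(series_docs):
    """Facet counts when the field is lowercased and split into words."""
    counts = Counter()
    for name, doc_count in series_docs.items():
        for token in set(name.lower().split()) - STOPWORDS:
            counts[token] += doc_count
    return counts


def string_facets(series_docs):
    """Facet counts when each value is kept as one untokenized string."""
    return Counter(series_docs)


if __name__ == "__main__":
    print(sorted(tokenized_facets(SERIES_DOCS).items()))
    print(sorted(string_facets(SERIES_DOCS).items()))
```

With a tokenized field every word becomes its own facet value; with an untokenized field each series name is counted as a single value, white space included.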






Re: Strange behavior with search on empty string and NOT

2012-04-09 Thread Chris Hostetter

: Would it be a good idea to have Solr throw syntax error if an empty string
: query occurs? 

erick's explanation wasn't very precise ... 

solr doesn't have any special handling of "empty strings", but what you 
are searching for *might* be a totally valid query based on how the field 
type is configured (ie: StrField, or KeywordTokenizer, etc...)

in your case, you seem to be searching for "" in a field for which the 
analyzer produces no tokens for "", so it falls out of the query.


-Hoss


Re: Dynamically changing facet hierarchies and facet values

2012-04-09 Thread Chris Hostetter

: I have a use case where the facet hierarchies as well as facet names change
: very frequently.
: 
: For example:
: (Smartphones >> Android ) may become
: Smartphones >> GSM >> Android.
: 
: OR
:"Smartphone"  could be renamed to "Smart Phone"
: 
: If I use traditional hierarchical faceting, then every change would mean a
: re-index of a large number of documents.
: 
: Just curious to know how others have solved this problem in the past.

I've dealt with this in the past using a custom plugin for the faceting. 
basically each document had a category field that only contained the id# 
of a category it was directly in, and the actual hierarchy info was stored 
in an XML data file that the plugin loaded at init and used to build the 
query associated with each node by looking at all the categoryId numbers 
from all the descendant categories (optimizations can be made if you know 
documents are only mapped to leaf level categories, or if you can define 
your hierarchy in terms of other fields -- ie: catId#345 might be definable 
by the query "type:phone AND os:android AND tech:GSM")

for small hierarchies, you can do the same thing from any solr client that 
knows what hierarchy you have using many facet.queries - just put whatever 
info you need to remap the flat facet.query responses into a hierarchy as 
localparams on each facet.query.
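
A rough client-side sketch of that second approach, in Python. The hierarchy, field names, and counts are hypothetical, and the sketch only builds parameter strings and remaps a flat result; it does not talk to Solr. It assumes the "key" local param is used to label each facet.query with its path, so check the quoting rules for your own node names.

```python
# Hypothetical hierarchy kept on the client: node path -> query defining it.
HIERARCHY = {
    "Smartphones": "type:phone",
    "Smartphones/GSM": "type:phone AND tech:GSM",
    "Smartphones/GSM/Android": "type:phone AND tech:GSM AND os:android",
}


def build_facet_queries(hierarchy):
    """Return one facet.query value per node, labelled with its path."""
    return ["{!key='%s'}%s" % (path, query)
            for path, query in sorted(hierarchy.items())]


def to_tree(flat_counts):
    """Rebuild a nested tree from flat {path: count} facet.query results."""
    tree = {}
    for path, count in flat_counts.items():
        children = tree
        for part in path.split("/"):
            entry = children.setdefault(part, {"count": 0, "children": {}})
            children = entry["children"]
        entry["count"] = count
    return tree


if __name__ == "__main__":
    print(build_facet_queries(HIERARCHY))
    # Made-up counts standing in for the facet_queries section of a response.
    flat = {"Smartphones": 120, "Smartphones/GSM": 80,
            "Smartphones/GSM/Android": 50}
    print(to_tree(flat))
```

Renaming or moving a node then means editing the client-side mapping only; the indexed documents stay untouched as long as the underlying fields do not change.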




-Hoss


Boosting when matching specific field values

2012-04-09 Thread gseoeltru solr
I am using edismax when executing searches against a set of news articles. I
would like to also boost the scores of matched documents based on another
field in the documents, which I will call "source", which can be set to 3
possible strings. So if the "source" field has the value "a", then I want
to multiply the score by 1. If the "source" field has the value "b", then I
want to multiply the score by 2 ... and so on. What is the way to go about
doing this?
Any help here mucho appreciated!


Re: solr analysis-extras configuration

2012-04-09 Thread Chris Hostetter

: Further info: I can make this work if I stay out of tomcat -- I
: download a fresh solr binary distro, copy those five JARs from 'dist'
: and 'contrib' into example/solr/lib/, copy my solrconfig.xml and
: schema.xml, and run 'java -jar start.jar', and it works fine.  But
: trying to add those same JARs to my tomcat instance's solrhome/lib
: doesn't work.  Any ideas how to troubleshoot?

is there anything else about how you have tomcat+solr configured that 
might be causing tomcat to load *any* solr or lucene jars directly, 
instead of letting the solr.war file load them from your solr home dir?  
did you change anything about tomcat's classpath? did you copy any jars 
anywhere other than your solrhome/lib dir?

these kinds of "classloader hell" errors can happen if a parent 
classloader has already loaded some class that depends on (or is depended 
on by) another class loaded by the solr war.


-Hoss


Re: how to correctly facet clothing multiple sizes and colors?

2012-04-09 Thread Andrew Harvey
What we do in our application is exactly what Robert described. We index 
Products, not variants. The variant data (colour, size etc.) is denormalised 
into the product document at index time. We then facet on the variant 
attributes and get product count instead of variant count. 

What you're seeing are correct results. You are indexing 6 documents, as you 
said before. You actually only want to index one document with multi-valued 
fields. 
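
As a minimal sketch of that denormalisation step (plain Python, run before the documents are sent to the index), assuming the variant rows are available as dictionaries with the hypothetical keys shown here:

```python
from collections import Counter

# Variant rows as they might come from a product database (hypothetical keys).
VARIANTS = [
    {"product": "shirt A", "color": c, "size": s, "category": "valentines day"}
    for c in ("red", "blue", "green")
    for s in ("small", "large")
]

MULTI_VALUED = ("color", "size", "category")


def fold_variants(variants):
    """Fold variant rows into one document per product with multi-valued fields."""
    products = {}
    for row in variants:
        doc = products.setdefault(
            row["product"],
            {"product": row["product"], **{f: set() for f in MULTI_VALUED}},
        )
        for field in MULTI_VALUED:
            doc[field].add(row[field])
    return list(products.values())


def facet(docs, field):
    """Count documents per value, the way a field facet would."""
    counts = Counter()
    for doc in docs:
        counts.update(doc[field])
    return counts


if __name__ == "__main__":
    docs = fold_variants(VARIANTS)
    for field in MULTI_VALUED:
        print(field, dict(facet(docs, field)))
```

The six variant rows collapse into a single product document, so every facet value is counted once per product.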

Hope that's somehow helpful,

Andrew

On 10/04/2012, at 3:01, "Robert Petersen"  wrote:

> You *could* do it by making one and only one solr document for each
> clothing item, then just have the front end render all the sizes and
> colors available for that item as size/color pickers on the product
> page.  You can add all the colors and sized to the one document in the
> index so they are searchable also, but the caveat is that they won't
> show up as a facet.  This is just one simple approach.
> 


Re: Solr with UIMA

2012-04-09 Thread chris3001
Tommaso,
I apologize for my delayed response. Thank you very much for your time
looking into this!! 
I will try to replicate your efforts on my end this week.

Respectfully,
Chris

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3898094.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Suggester not working for digit starting terms

2012-04-09 Thread Erick Erickson
Is it possible that your fieldType definition for a_suggest is
stripping out the digits? Consider using TermsComponent
http://wiki.apache.org/solr/TermsComponent or the admin
page or Luke to examine the terms actually _in_ your
index. Or look at the admin/analysis page and give it some
sample input to determine what the results of the analysis
chain are.

Best
Erick

On Sat, Apr 7, 2012 at 3:24 PM, jmlucjav  wrote:
> Hi,
>
> I am using Suggester component, as advised in Solr3 book (using solr3.5):
>        
>                
>                        a_suggest
>                         name="classname">org.apache.solr.spelling.suggest.Suggester
>                         name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup
>                        a_suggest
>                        true
>                        100
>                
>        
>        
>                
>                        true
>                        a_suggest
>                        true
>                        5
>                        true
>                
>                
>                        suggest
>                
>        
>
> But, even if it works fine with words, it seems it does not work for terms
> starting with digits. For example:
> http://localhost:8983/solr/suggest?&q=500
> gets 0 results, but I know '500 $' is in the a_suggest field, as I can find
> many hits by:
> http://localhost:8983/solr/select/?q={!prefix f=a_suggest}500
>
> Am I missing something? I have been trying to play with
> spellcheck.onlyMorePopular and spellcheck.accuracy but I get the same
> results.
>
> thanks
> xab
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Suggester-not-working-for-digit-starting-terms-tp3893433p3893433.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Question on using dynamic fields

2012-04-09 Thread Erick Erickson
Hmmm, not sure about the dataconfig.xml file. What
are you trying to index? Is this DIH? Because
if you're simply posting Solr-formatted XML docs,
dataconfig.xml is irrelevant

You say you're not seeing the output. One of two
things is going on:
1> The data is not in the index. See the admin/schema browser
  page to examine what actually went in your index.
2> Try doing the query with fl=*. You may simply not be asking
  for the fields to be returned.

Best
Erick

On Sun, Apr 8, 2012 at 9:09 PM, Rakesh Varna  wrote:
> Hello Solr-users,
>   I am trying to index xml files which have the following tags: (I am
> using Solr 3.5 on Tomcat)
>
> 
> 0.98
> 0.767
> .
> ..
> ..
> ..
> 0.2873
> 
>
> The numbers after "theta" are not a continuous sequence and I do not know
> how many such tags are there. I thought this was a good candidate for
> dynamic fields and have the following schema for those tags:
>     stored="true"/>
> Is this correct? If so, what should I use in the data-config.xml file to
> index these tags?
>
> When I try the admin feature in the browser and query *:* , I don't see the
> theta fields in the response.
>
> If not, is dynamicFields a wrong choice? Is there another way of indexing
> these fields?
>
> Thanks in advance,
> Rakesh Varna


Re: Problem about range search

2012-04-09 Thread Erick Erickson
Hmmm, works fine for me using the "popularity" field in
the default schema.

What version of Solr are you using? What is your complete
handler definition?

Best
Erick

On Mon, Apr 9, 2012 at 12:10 AM, ZHANG Liang F
 wrote:
> Hi,
> I ran into a problem when trying range facet search. I had a schema define 
> like this:
>  
>   
>   
>   
>   
>   
>   
>  
>
> I try to set up a range search on "size" field which stands for the size of a 
> file. So I have the following requestHandler config in solrconfig.xml:
>  after
>  size
>  0
>  15728640
>  3145728
>
> But an error says:  Unable to range facet on 
> field:size{type=long,properties=indexed,stored,omitNorms,omitTermFreqAndPositions}
>
> It doesn't show any clue, and I also tried  tag, but got the same error.
>
> Could you please help to suggest?
>
> Thanks in advance!


Re: Re: Cloud-aware request processing?

2012-04-09 Thread Erick Erickson
I _think_ you need to look at the Zookeeper information, perhaps
something like ZkController.getCloudState or some such?

Warning: I haven't been in that code, so this is just a guess. But
since the SolrCloud stuff has to know this kind of info in order
to do distributed indexing, it's got to be available, but I confess
it's not clear where.

But I'm guessing here...

Best
Erick

On Mon, Apr 9, 2012 at 8:22 AM, Benson Margulies  wrote:
> On Mon, Apr 9, 2012 at 9:50 AM, Darren Govoni  wrote:
>> "...it is a distributed real-time query scheme..."
>>
>> SolrCloud does this already. It treats all the shards like one-big-index,
>> and you can query it normally to get "subset" results from each shard. Why
>> do you have to re-write the query for each shard? Seems unnecessary.
>
> For reasons described in previous email that I won't repeat here.
>
>>
>> --- Original Message ---
>> On 4/9/2012  08:45 AM Benson Margulies wrote: Jan Høydahl,
>> 
>> My problem is intimately connected to Solr. it is not a batch job for
>> hadoop, it is a distributed real-time query scheme. I hate to add yet
>> another complex framework if a Solr RP can do the job simply.
>> 
>> For this problem, I can transform a Solr query into a subset query on
>> each shard, and then let the SolrCloud mechanism.
>> 
>> I am well aware of the 'zoo' of alternatives, and I will be evaluating
>> them if I can't get what I want from Solr.
>> 
>> On Mon, Apr 9, 2012 at 9:34 AM, Jan Høydahl 
>> wrote:
>> > Hi,
>> >
>> > Instead of using Solr, you may want to have a look at Hadoop or
>> another framework for distributed computation, see e.g.
>> http://java.dzone.com/articles/comparison-gridcloud-computing
>> >
>> > --
>> > Jan Høydahl, search solution architect
>> > Cominvent AS - www.cominvent.com
>> > Solr Training - www.solrtraining.com
>> >
>> > On 9. apr. 2012, at 13:41, Benson Margulies wrote:
>> >
>> >> I'm working on a prototype of a scheme that uses SolrCloud to, in
>> >> effect, distribute a computation by running it inside of a request
>> >> processor.
>> >>
>> >> If there are N shards and M operations, I want each node to perform
>> >> M/N operations. That, of course, implies that I know N.
>> >>
>> >> Is that fact available anyplace inside Solr, or do I need to just
>> configure it?
>> >
>> 
>> 


Re: Boosting when matching specific field values

2012-04-09 Thread Chris Hostetter

: possible strings.   So if the "source" field has a value "a", then I want
: to multiply the score by 1. If the "source" field has a value "b", then I
: want to multiple the score by 2 ... and so on. What is the way to go about
: doing this ?

how long is your "and so on" list?

You could use the boost param of edismax to do this, by constructing a 
function that returns the appropriate value based on the results of your 
query (either using the termfreq() or query() functions) ... but if these 
mappings from values->score multipliers are generally static, you can also 
use ExternalFileField (it doesn't have to key off of the unique key field, 
it can key off of any single valued field) .. or if the mappings are 
*REALLY* static just compute them when indexing the docs.
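
A minimal sketch of the last option (computing the multiplier at index time), in Python on the client side. The field name source_boost, the multiplier for "c", and the default are assumptions for illustration; the sketch only prepares documents and request parameters and assumes source_boost is declared as a single-valued numeric field in the schema.

```python
# Static mapping from "source" value to score multiplier.
# "a" and "b" come from the question; "c": 3.0 is an assumed continuation.
SOURCE_BOOST = {"a": 1.0, "b": 2.0, "c": 3.0}


def with_boost(doc, default=1.0):
    """Return a copy of the document with a precomputed source_boost field."""
    boosted = dict(doc)
    boosted["source_boost"] = SOURCE_BOOST.get(doc.get("source"), default)
    return boosted


def edismax_params(user_query):
    """Request parameters that multiply the score by the stored field value."""
    return {"defType": "edismax", "q": user_query, "boost": "source_boost"}


if __name__ == "__main__":
    print(with_boost({"id": "1", "title": "some article", "source": "b"}))
    print(edismax_params("election results"))
```

The trade-off is that changing a multiplier means reindexing the affected documents, which is why this only suits mappings that rarely change.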


-Hoss


Re: how to correctly facet clothing multiple sizes and colors?

2012-04-09 Thread danjfoley
The problem with that approach is that if you selected, say, large and red you'd 
get back all the products with large and red as variants, not just the products 
with red in the large size, as would be expected.
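
A small plain-Python sketch of the mismatch, using a made-up product ("shirt B") that comes in red/small and blue/large only:

```python
# Hypothetical product document with the variant attributes folded into
# independent multi-valued fields.
PRODUCT_DOC = {
    "product": "shirt B",
    "color": {"red", "blue"},
    "size": {"small", "large"},
}

# The variants that actually exist: (product, color, size).
VARIANTS = [
    ("shirt B", "red", "small"),
    ("shirt B", "blue", "large"),
]


def matches_product_doc(doc, color, size):
    """Filtering on two independent multi-valued fields."""
    return color in doc["color"] and size in doc["size"]


def matches_any_variant(variants, color, size):
    """Filtering on real variants, where color and size must co-occur."""
    return any(c == color and s == size for _, c, s in variants)


if __name__ == "__main__":
    print(matches_product_doc(PRODUCT_DOC, "red", "large"))
    print(matches_any_variant(VARIANTS, "red", "large"))
```

The folded product document matches red AND large even though no red/large variant exists, because the link between a particular color and a particular size is lost once the values sit in separate fields.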

Sent from my phone

- Reply message -
From: "Andrew Harvey [via Lucene]" 
Date: Mon, Apr 9, 2012 5:21 pm
Subject: how to correctly facet clothing multiple sizes and colors?
To: "danjfoley" 



What we do in our application is exactly what Robert described. We index 
Products, not variants. The variant data (colour, size etc.) is denormalised 
into the product document at index time. We then facet on the variant 
attributes and get product count instead of variant count. 

What you're seeing are correct results. You are indexing 6 documents, as you 
said before. You actually only want to index one document with multi-valued 
fields. 

Hope that's somehow helpful,

Andrew

On 10/04/2012, at 3:01, "Robert Petersen"  wrote:

> You *could* do it by making one and only one solr document for each
> clothing item, then just have the front end render all the sizes and
> colors available for that item as size/color pickers on the product
> page.  You can add all the colors and sized to the one document in the
> index so they are searchable also, but the caveat is that they won't
> show up as a facet.  This is just one simple approach.
> 
> -Original Message-
> From: danjfoley [mailto:d...@micamedia.com] 
> Sent: Saturday, April 07, 2012 7:04 PM
> To: solr-user@lucene.apache.org
> Subject: how to correctly facet clothing multiple sizes and colors?
> 
> I've been searching for a solution to my issue, and this seems to come
> closest to it. But not exactly. 
> 
> I am indexing clothing. Each article of clothing comes in many sizes and
> colors, and can belong to any number of categories. 
> 
> For example take the following: I add 6 documents to solr as follows: 
> 
> product, color, size, category 
> 
> shirt A, red, small, valentines day 
> shirt A, red, large, valentines day 
> shirt A, blue, small, valentines day 
> shirt A, blue, large, valentines day 
> shirt A, green, small, valentines day 
> shirt A, green, large, valentines day 
> 
> I'd like my facet counts to return as follows: 
> 
> color 
> 
> red (1) 
> blue (1) 
> green (1) 
> 
> size 
> 
> small (1) 
> large (1) 
> 
> category 
> 
> valentines day (1) 
> 
> But they come back like this: 
> 
> color: 
> red (2) 
> blue (2) 
> green (2) 
> 
> size: 
> small (2) 
> large (2) 
> 
> category 
> valentines day (6) 
> 
> I see the group.facet parameter in version 4.0 does exactly this.
> However
> how can I make this happen now? There are all sorts of ecommerce systems
> out
> there that facet exactly how i'm asking. i thought solr is supposed to
> be
> the very best fastest search system, yet it doesn't seem to be able to
> facet
> correct for items with multiple values? 
> 
> Am i indexing my data wrong? 
> 
> how can i make this happen?
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-correctly-facet-clothing-multi
> ple-sizes-and-colors-tp3893747p3893747.html
> Sent from the Solr - User mailing list archive at Nabble.com.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-correctly-facet-clothing-multiple-sizes-and-colors-tp3893747p3898271.html
Sent from the Solr - User mailing list archive at Nabble.com.

SolrCloud versus a SearchComponent that rescores

2012-04-09 Thread Benson Margulies
Those of you insomniacs who have read my messages here over the last
few weeks might recall that I've been working on a request handler
that wraps the SearchHandler to rewrite queries and then reorder
results.

(I haven't quite worked out how to apply Grant's alternative
suggestions without losing the performance advantages I was looking
for in the first place.)

Today, I realized that the RequestHandler approach, as opposed to
search components, wasn't going to be viable. I was growing too much
dependency on internal Solr quirks.

So I refactored it into a pair of SearchComponents -- one to go first
and rewrite the query, and one to go after "query" and rescore.

And it works just fine - until I configure it into a SolrCloud
cluster. At which point it started coming up with very wrong answers.

I think that the reason is that I don't have an implementation of the
distributedProcess method, or, more generally, that I don't understand
the protocol on a SearchComponent when distributed processing is
happening. Has anyone written anything yet about these considerations?
I can run multiple processes under the debugger and see who gets called
with what, but I was hoping for some sort of short cut.


Re: Question on using dynamic fields

2012-04-09 Thread Rakesh Varna
Hi Erick,
   Thanks for the response. I am trying to index xml files in a directory.
I provide the xpath details, file location etc in data-config.xml. I will
try the 2 approaches that you have mentioned.

Regards,
Rakesh Varna

On Mon, Apr 9, 2012 at 3:38 PM, Erick Erickson wrote:

> Hmmm, not sure about the dataconfig.xml file. What
> are you trying to index? Is this DIH? Because
> if you're simply posting Solr-formatted XML docs,
> dataconfig.xml is irrelevant
>
> You say you're not seeing the output. One of two
> things is going on:
> 1> The data is not in the index. See the admin/schema browser
>  page to examine what actually went in your index.
> 2> Try doing the query with fl=*. You may simply not be asking
>  for the fields to be returned.
>
> Best
> Erick
>
> On Sun, Apr 8, 2012 at 9:09 PM, Rakesh Varna 
> wrote:
> > Hello Solr-users,
> >   I am trying to index xml files which have the following tags: (I am
> > using Solr 3.5 on Tomcat)
> >
> > 
> > 0.98
> > 0.767
> > .
> > ..
> > ..
> > ..
> > 0.2873
> > 
> >
> > The numbers after "theta" are not a continuous sequence and I do not know
> > how many such tags are there. I thought this was a good candidate for
> > dynamic fields and have the following schema for those tags:
> >>  stored="true"/>
> > Is this correct? If so, what should I use in the data-config.xml file to
> > index these tags?
> >
> > When I try the admin feature in the browser and query *:* , I don't see
> the
> > theta fields in the response.
> >
> > If not, is dynamicFields a wrong choice? Is there another way of indexing
> > these fields?
> >
> > Thanks in advance,
> > Rakesh Varna
>


Re: SolrCloud versus a SearchComponent that rescores

2012-04-09 Thread Mark Miller

On Apr 9, 2012, at 7:34 PM, Benson Margulies wrote:

> Those of you insomniacs who have read my messages here over the last
> few weeks might recall that I've been working on a request handler
> that wraps the SearchHandler to rewrite queries and then reorder
> results.
> 
> (I haven't quite worked out how to apply Grant's alternative
> suggestions without losing the performance advantages I was looking
> for in the first place.)
> 
> Today, I realized that the RequestHandler approach, as opposed to
> search components, wasn't going to be viable. I was growing too much
> dependency on internal Solr quirks.
> 
> So I refactored it into a pair of SearchComponents -- one to go first
> and rewrite the query, and one to go after "query" and rescore.
> 
> And it works just fine - until I configure it into a SolrCloud
> cluster. At which point it started coming up with very wrong answers.
> 
> I think that the reason is that I don't have an implementation of the
> distributedProcess method, or, more generally, that I don't understand
> the protocol on a SearchComponent when distributed processing is
> happening. Has anyone written anything yet about these considerations?
> I can put multiple processes in the debugging and see who gets called
> with what, but I was hoping for some sort of short cut.



Grant started something on this once: 
http://wiki.apache.org/solr/WritingDistributedSearchComponents
It's only a start though.

Unfortunately, to this point, adventurous souls have had to debug and study 
their way to understanding the distrib process mostly solo.

Perhaps we can encourage anyone that has written a distributed component to 
help jump in on that wiki page. Any takers?

- Mark Miller
lucidimagination.com

Re: Why this document does not match?

2012-04-09 Thread Alexander Ramos Jardim
Sorry for the answer.

2012/3/29 Erick Erickson 

> Alexander:
>
> Your images were stripped by one of our mail servers, so there's not
> much we can see ...
>
> But guessing, you aren't searching the fields you think you are:
> itemNameSearch:fifa12
> becomes
> itemNameSearch:fifa defaultSearchField:12
>

That's exactly what's happening! Why does this happen?


>
> where defaultSearchField is defined in your schema.xml file.
> Try itemNameSearch:(fifa 12) or similar.
>
> Using debugQuery=on should show this in the "parsed_query" section if I'm
> right.
>
> If that doesn't help, maybe you can post your info again?
>
>  upgrade?>
>

This has been discussed a lot, and my customer's sysadmin has agreed to
upgrade to Solr 3.5, but we won't be doing this in the next month.


>
> Best
> Erick
>
> On Wed, Mar 28, 2012 at 5:31 PM, Alexander Ramos Jardim
>  wrote:
> >
> > Hi,
> >
> > I have an issue with an old Solr 1.3 installation. I have a field
> configured in such a way that "fifa 12" and "fifa12" should match the same
documents, as can be seen in the screenshot below.
> >
> >
> >
> >
> > When I run the query itemNameSearch:fifa12, I get the following result:
> >
> >
> >
> >
> > That seems okay. But I have the following document on the index:
> >
> >
> > As my field is defined, I expected the query to match this document.
> This is not what is happening. Does anyone have any idea on what is wrong?
> >
> >
> > --
> > Alexander Ramos Jardim
>



-- 
Alexander Ramos Jardim


Re: SolrCloud versus a SearchComponent that rescores

2012-04-09 Thread Benson Margulies
That page seems to be saying that the 'distributed' APIs take place on
the leader, and the ordinary prepare/process APIs out at the leaves.
I'll set out to prove or disprove that tomorrow.


On Mon, Apr 9, 2012 at 8:17 PM, Mark Miller  wrote:
>
> On Apr 9, 2012, at 7:34 PM, Benson Margulies wrote:
>
>> Those of you insomniacs who have read my messages here over the last
>> few weeks might recall that I've been working on a request handler
>> that wraps the SearchHandler to rewrite queries and then reorder
>> results.
>>
>> (I haven't quite worked out how to apply Grant's alternative
>> suggestions without losing the performance advantages I was looking
>> for in the first place.)
>>
>> Today, I realized that the RequestHandler approach, as opposed to
>> search components, wasn't going to be viable. I was growing too much
>> dependency on internal Solr quirks.
>>
>> So I refactored it into a pair of SearchComponents -- one to go first
>> and rewrite the query, and one to go after "query" and rescore.
>>
>> And it works just fine - until I configure it into a SolrCloud
>> cluster. At which point it started coming up with very wrong answers.
>>
>> I think that the reason is that I don't have an implementation of the
>> distributedProcess method, or, more generally, that I don't understand
>> the protocol on a SearchComponent when distributed processing is
>> happening. Has anyone written anything yet about these considerations?
>> I can put multiple processes in the debugging and see who gets called
>> with what, but I was hoping for some sort of short cut.
>
>
>
> Grant started something on this once: 
> http://wiki.apache.org/solr/WritingDistributedSearchComponents
> It's only a start though.
>
> Unfortunately, to this point, adventurous souls have had to debug and study 
> there way to understanding the distrib process solo mostly.
>
> Perhaps we can encourage anyone that has written a distributed component to 
> help jump in on that wiki page. Any takers?
>
> - Mark Miller
> lucidimagination.com
>


Re: Is http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster up to date?

2012-04-09 Thread Mark Miller

On Apr 9, 2012, at 9:52 AM, Benson Margulies wrote:

> I specify -Dcollection.configName=rnicloud, but the admin gui tells me
> that I have a collection named 'collection1'.
> 
> And, as reported in a prior email, the admin UI URL in there seems wrong.


Sorry - that param name is not entirely clear, I guess. It's the name of the 
set of config files you are uploading. Later, you could point multiple 
collections at that set of config files by using that name. In that example, 
because you don't override the collection name, it takes the default, which is 
the SolrCore name, which is collection1. You can override the collection name 
by adding an attribute in solr.xml or by passing a param to CoreAdmin when 
creating a core dynamically. If you don't, it simply uses the SolrCore name as 
the name of the collection.

- Mark Miller
lucidimagination.com

Re: Why this document does not match?

2012-04-09 Thread Chris Hostetter

: > itemNameSearch:fifa defaultSearchField:12

: That's exactly what's happening! Why does this happen?

whitespace is meaningful to the query parser: it tells the query parser 
there are multiple clauses for a boolean query.

if you want to search for any words the user typed in the field 
"itemNameSearch" then you can either set the default search field to 
itemNameSearch...
df=itemNameSearch & q=fifa 12
...or put parens around the words you want to search for in a specific 
field...
q=itemNameSearch:(fifa 12)
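
A small sketch of how a client might build those two request strings so that the user's whitespace never splits the query across fields. The helper names are hypothetical; only the `q` and `df` parameters and the field name come from the thread.

```python
from urllib.parse import urlencode


def default_field_query(user_text, field):
    """Option 1: leave the text as typed and name the default field via df."""
    return urlencode({"q": user_text, "df": field})


def fielded_group_query(user_text, field):
    """Option 2: wrap the text in parens so every word is searched in `field`."""
    return urlencode({"q": f"{field}:({user_text})"})


print(default_field_query("fifa 12", "itemNameSearch"))
print(fielded_group_query("fifa 12", "itemNameSearch"))
```

Note that `urlencode` only handles URL escaping; any query-syntax characters typed by the user would still need separate escaping for the query parser.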

-Hoss


[CFP] Open Source Search Conference Oct 2, 2012

2012-04-09 Thread Erik Hatcher
Sending this on behalf of my friends at BasisTech -



Subject: Call for Presentations: Open Source Search Conference Oct. 2, 2012 
(Chantilly, VA)

==
Call for Presentations & Save the Date
Open Source Search Conference Oct 2, 2012 
(tutorials Oct. 1) in Chantilly, VA
http://www.basistech.com/conference/2012/oss/
==

The second annual Open Source Search Conference will be held on October 2, 2012 
in Chantilly, VA, and you are invited to submit a presentation. The conference 
will be attended by government employees and contractors who are evaluating, 
building, or using Apache Solr and other open source tools for search 
applications throughout the government.

This event is a unique opportunity to share tips and ideas to overcome 
challenges working with open source search projects. We are also looking for 
people who are interested in providing half- and full-day tutorials on the day 
before the conference (October 1, 2012). The tutorials should provide hands-on 
guidance for using or developing open source search applications.
For more information, visit: http://www.basistech.com/conference/2012/oss/

==Dates==
Conference: October 2, 2012
Tutorials: October 1, 2012

==Submission Instructions==
Please email submissions for conference presentations and tutorials to 
oss2...@basistech.com by April 23, 2012.
To submit a presentation or tutorial, e-mail the following information:
1. Title
2. Author
3. Brief Biography
4. Description of presentation or tutorial (100-150 words)
5. Brief description of author’s experience with Apache Solr and/or other open 
source tools
6. Specify whether the presentation or tutorial is targeted towards users or 
developers

==Suggested Topics==
1. Large-scale Apache Solr
* Solr at exabyte scale 
* High-load deployments
* Complex queries
2. Analytic interfaces
* Geospatial search
* Iterative Analytics using Solr (index reprocessing, etc.)
* Exploring and Discovering Big Data with Solr
* Linguistic plug-in use and development
* Document clustering (semantic, field collapsing, dynamic faceting)
* Language identification
* Search in a multilingual site
* Sentiment analysis
3. Text Mining
* Text analytics processing
* Entity extraction
* Name matching
4. Security
* Access control
* Index encryption
5. Case studies and user experiences
* Migrating to Solr from other search engines
* Other topics


==About the Conference==
The Open Source Search Conference is sponsored by Basis Technology, which has 
been producing government conferences since 2006 and focuses on topics 
including text analytics, human language technology, and the nexus of language, 
culture and technology for the federal community. For more information about 
our conferences, visit: http://www.basistech.com/conference.
Basis Technology provides software solutions for text analytics, information 
retrieval, and name resolution in over 40 languages. Our customers include 
leading software vendors, content providers, financial institutions, and 
government agencies in the defense and intelligence industry.



[Lucene Revolution] Agenda Updated!

2012-04-09 Thread Erik Hatcher
We've updated the agenda and keynotes for the upcoming Lucene Revolution 
conference, May 7-10 in Boston, MA.  We've got a lot of the committers coming, 
and Hoss' infamous "Stump the Chump" session, and many great talks.  All we're 
missing is you. It's not too late to sign up ;)

 http://www.lucenerevolution.com/agenda

We're unveiling a couple of new/revamped training classes, Solr 101 and Solr 
201 - the seats are filling up, so register soon.  I'm working like mad to 
complete the "Solr 201" materials and will be teaching one of those sessions 
myself. 

See you at the Revolution.

Erik



Re: SolrCloud versus a SearchComponent that rescores

2012-04-09 Thread Benson Margulies
Um, maybe I've hit a quirk?

In my solrconfig.xml, my special SearchComponents are installed only
for a specific QT. So, it looks to me as if that QT is not propagated
into the request out to the shards, and so they run the ordinary
request handler without my components in it.

Is this intended behavior I have to tweak via a distribution-aware
component, or perhaps a bug, or does it make no sense at all and I
need to look for some mistake of mine?


Re: Question on using dynamic fields

2012-04-09 Thread Rakesh Varna
Hi Erick,
   The schema browser says that no dynamic fields were indexed. Any idea
how I can specify dynamic fields through XPath when I only know the prefix
and nothing else?

Regards,
Rakesh Varna

On Mon, Apr 9, 2012 at 4:49 PM, Rakesh Varna  wrote:

> Hi Erick,
>Thanks for the response. I am trying to index xml files in a directory.
> I provide the xpath details, file location etc in data-config.xml. I will
> try the 2 approaches that you have mentioned.
>
> Regards,
> Rakesh Varna
>
>
> On Mon, Apr 9, 2012 at 3:38 PM, Erick Erickson wrote:
>
>> Hmmm, not sure about the dataconfig.xml file. What
>> are you trying to index? Is this DIH? Because
>> if you're simply posting Solr-formatted XML docs,
>> dataconfig.xml is irrelevant
>>
>> You say you're not seeing the output. One of two
>> things is going on:
>> 1> The data is not in the index. See the admin/schema browser
>>  page to examine what actually went in your index.
>> 2> Try doing the query with fl=*. You may simply not be asking
>>  for the fields to be returned.
>>
>> Best
>> Erick
>>
>> On Sun, Apr 8, 2012 at 9:09 PM, Rakesh Varna 
>> wrote:
>> > Hello Solr-users,
>> >   I am trying to index xml files which have the following tags: (I am
>> > using Solr 3.5 on Tomcat)
>> >
>> > 
>> > 0.98
>> > 0.767
>> > .
>> > ..
>> > ..
>> > ..
>> > 0.2873
>> > 
>> >
>> > The numbers after "theta" are not a continuous sequence and I do not
>> know
>> > how many such tags are there. I thought this was a good candidate for
>> > dynamic fields and have the following schema for those tags:
>> >   > >  stored="true"/>
>> > Is this correct? If so, what should I use in the data-config.xml file to
>> > index these tags?
>> >
>> > When I try the admin feature in the browser and query *:* , I don't see
>> the
>> > theta fields in the response.
>> >
>> > If not, is dynamicFields a wrong choice? Is there another way of
>> indexing
>> > these fields?
>> >
>> > Thanks in advance,
>> > Rakesh Varna
>>
>
>


Re: SolrCloud versus a SearchComponent that rescores

2012-04-09 Thread Mark Miller
Yeah, that's how it works - it ends up hitting the select request handler (this 
might be overridable with shards.qt). All the params are passed along, so in 
general, it will act the same as the top level req handler - but it can then 
remove the shards param so you don't have an infinite recursion of distrib 
requests (say in the case you put shards in the req handler in solrconfig). 

I think you have to investigate shards.qt  
Or look at adding those components to the std select handler as well. 
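
For the shards.qt route, a rough sketch of the request parameters a client might send. The handler name and shard addresses below are made up for illustration, and whether shards.qt actually routes the per-shard requests to the custom handler in this setup is exactly what would need to be verified.

```python
from urllib.parse import urlencode


def distributed_params(query, handler, shards):
    """Build params that ask each shard to use the same handler as the top-level request."""
    return {
        "q": query,
        "qt": handler,                # handler for the top-level request
        "shards": ",".join(shards),   # explicit shard list (hypothetical hosts)
        "shards.qt": handler,         # handler the per-shard requests should hit
    }


params = distributed_params(
    "some query",
    "/rescore",  # hypothetical name of the handler that carries the custom components
    ["host1:8983/solr", "host2:8983/solr"],
)
print(urlencode(params))
```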

Sent from my iPhone

On Apr 9, 2012, at 9:26 PM, Benson Margulies  wrote:

> Um, maybe I've hit a quirk?
> 
> In my solrconfig.xml, my special SearchComponents are installed only
> for a specific QT. So, it looks to me as if that QT is not propagated
> into the request out to the shards, and so they run the ordinary
> request handler without my components in it.
> 
> Is this intended behavior I have to tweak via a distribution-aware
> component, or perhaps a bug, or does it make no sense at all and I
> need to look for some mistake of mine?


Re: To truncate or not to truncate (group.truncate vs. facet)

2012-04-09 Thread danjfoley
Is this planned as a future feature? Is it in the bug tracker as a feature 
request yet? Just wondering how long until it is available. I could live 
without price counts for a bit.

Sent from my phone

- Reply message -
From: "Martijn v Groningen-2 [via Lucene]" 

Date: Mon, Apr 9, 2012 3:31 pm
Subject: To truncate or not to truncate (group.truncate vs. facet)
To: "danjfoley" 



The group.facet option only works for field facets (facet.field). Other
facet types (query, range and pivot) aren't supported yet.
The group.facet works for both single and multivalued fields specified in
the facet.field parameter.
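
As an illustration, the parameters for a grouped field-facet request could be assembled like this. The field names are taken from the column labels in the original question and are assumptions about the schema; `grouped_facet_params` is a hypothetical helper.

```python
from urllib.parse import urlencode


def grouped_facet_params(group_field, facet_fields, query="*:*"):
    """Return request params that group by one field and facet per group."""
    params = [
        ("q", query),
        ("group", "true"),
        ("group.field", group_field),
        ("group.facet", "true"),  # count groups, not documents, in field facets
        ("facet", "true"),
    ]
    # facet.field may repeat, so keep the params as a list of pairs.
    params.extend(("facet.field", field) for field in facet_fields)
    return params


params = grouped_facet_params("product", ["color", "size", "category"])
print(urlencode(params))
```

Any facet.query or facet.range parameters added to the same request would, per the note above, still be counted per document.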

Martijn

On 9 April 2012 20:58, danjfoley  wrote:

> I am using group.facet and it works fine for regular facet.field but not
> for facet.query
>
> Sent from my phone
>
> - Reply message -
> From: "Young, Cody [via Lucene]"  >
> Date: Mon, Apr 9, 2012 1:38 pm
> Subject: To truncate or not to truncate (group.truncate vs. facet)
> To: "danjfoley" 
>
>
>
> One other thing, I believe that you need to be using facet.field on single
> valued string fields for group.facet to function properly. Are the fields
> you're faceting on multiValued=false?
>
> Cody
>
> -Original Message-
> From: Young, Cody [mailto:cody.yo...@move.com]
> Sent: Monday, April 09, 2012 10:36 AM
> To: solr-user@lucene.apache.org
> Subject: RE: To truncate or not to truncate (group.truncate vs. facet)
>
> You tried adding the parameter
>
> &group.facet=true ?
>
> Cody
>
> -Original Message-
> From: danjfoley [mailto:d...@micamedia.com]
> Sent: Monday, April 09, 2012 10:09 AM
> To: solr-user@lucene.apache.org
> Subject: Re: To truncate or not to truncate (group.truncate vs. facet)
>
> I did get this working with version 4. However my facet queries still
> don't group.
>
> Sent from my phone
>
> - Reply message -
> From: "Young, Cody [via Lucene]"  >
> Date: Mon, Apr 9, 2012 12:45 pm
> Subject: To truncate or not to truncate (group.truncate vs. facet)
> To: "danjfoley" 
>
>
>
> I believe you're looking for what's called, "Matrix Counts"
>
> Please see this JIRA issue. To my knowledge it has been committed in trunk
> but not 3.x.
>
> https://issues.apache.org/jira/browse/SOLR-2898
>
> This feature is accessed by using group.facet=true
>
> Cody
>
> -Original Message-
> From: danjfoley [mailto:d...@micamedia.com]
> Sent: Saturday, April 07, 2012 7:02 PM
> To: solr-user@lucene.apache.org
> Subject: Re: To truncate or not to truncate (group.truncate vs. facet)
>
> I've been searching for a solution to my issue, and this seems to come
> closest to it. But not exactly.
>
> I am indexing clothing. Each article of clothing comes in many sizes and
> colors, and can belong to any number of categories.
>
> For example take the following: I add 6 documents to solr as follows:
>
> product, color, size, category
>
> shirt A, red, small, valentines day
> shirt A, red, large, valentines day
> shirt A, blue, small, valentines day
> shirt A, blue, large, valentines day
> shirt A, green, small, valentines day
> shirt A, green, large, valentines day
>
> I'd like my facet counts to return as follows:
>
> color
>
> red (1)
> blue (1)
> green (1)
>
> size
>
> small (1)
> large (1)
>
> category
>
> valentines day (1)
>
> But they come back like this:
>
> color:
> red (2)
> blue (2)
> green (2)
>
> size:
> small (2)
> large (2)
>
> category
> valentines day (6)
>
> I see the group.facet parameter in version 4.0 does exactly this. However
> how can I make this happen now? There are all sorts of ecommerce systems
> out there that facet exactly how i'm asking. i thought solr is supposed to
> be the very best fastest search system, yet it doesn't seem to be able to
> facet correct for items with multiple values?
>
> Am i indexing my data wrong?
>
> how can i make this happen?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/To-truncate-or-not-to-truncate-group-truncate-vs-facet-tp3838797p3893744.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

RE: Problem about range search

2012-04-09 Thread ZHANG Liang F
Hi, 
I just found the root cause. The definition for the 'long' type is not right. 
the previous definition was:



which doesn't support range query! now I changed to : , and it's done!
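
For reference, a sketch of the range-facet request once the field type supports it. The numbers are the ones from the handler config quoted below; which facet.range.* parameter each of them belonged to is my assumption, since the parameter names did not survive in the archive.

```python
def range_facet_params(field, start, end, gap):
    """Request params for a range facet on `field` (assumed parameter mapping)."""
    return {
        "facet": "true",
        "facet.range": field,
        "facet.range.start": str(start),
        "facet.range.end": str(end),
        "facet.range.gap": str(gap),
        "facet.range.other": "after",  # also count values above the last bucket
    }


def bucket_bounds(start, end, gap):
    """List the (lower, upper) bounds of the buckets those params describe."""
    return [(lo, min(lo + gap, end)) for lo in range(start, end, gap)]


params = range_facet_params("size", 0, 15728640, 3145728)
buckets = bucket_bounds(0, 15728640, 3145728)
print(params)
print(buckets)
```

With these values the facet covers 0 to 15 MiB in five 3 MiB buckets, plus one "after" count for anything larger.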

Thanks a lot!
Liang

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 2012年4月10日 6:53
To: solr-user@lucene.apache.org
Subject: Re: Problem about range search

Hmmm, works fine for me using the "popularity" field in the default schema.

What version of Solr are you using? What is your complete handler definition?

Best
Erick

On Mon, Apr 9, 2012 at 12:10 AM, ZHANG Liang F 
 wrote:
> Hi,
> I ran into a problem when trying range facet search. I had a schema defined 
> like this:
>  
>   
>   
>   
>   
>   
>    />
>  
>
> I try to set up a range search on "size" field which stands for the size of a 
> file. So I have the following requestHandler config in solrconfig.xml:
>  after
>  size
>  0
>  15728640
>  3145728
>
> But an error says:  Unable to range facet on 
> field:size{type=long,properties=indexed,stored,omitNorms,omitTermFreqA
> ndPositions}
>
> It doesn't show any clue, and I also tried  tag, but got the same error.
>
> Could you please help to suggest?
>
> Thanks in advance!


which approach is correct?

2012-04-09 Thread neosky
Here are my fields
101NGHGJGKGKLHJFKGJGKGK

the sequence field is from 300 bytes to 56K bytes, no spaces
I want to ngram from 3 to 8
NGH GHG HGJ ...
NGHG GHGJ HGJG ...
...
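
To make the intended tokens concrete, here is a plain-Python sketch of the 3-to-8 character n-grams. It says nothing about which Solr analysis chain is the right one; `char_ngrams` is a hypothetical helper, and I am assuming the sample sequence is the part of the value above that follows the id 101.

```python
def char_ngrams(sequence, min_n=3, max_n=8):
    """Return {n: [all overlapping substrings of length n]} for min_n..max_n."""
    grams = {}
    for n in range(min_n, max_n + 1):
        # A string of length L has L - n + 1 overlapping n-grams (none if L < n).
        grams[n] = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    return grams


seq = "NGHGJGKGKLHJFKGJGKGK"  # assumed sample sequence
grams = char_ngrams(seq)
print(grams[3][:3])
print(grams[4][:3])
print(sum(len(g) for g in grams.values()), "grams in total")
```

The total grows as roughly six times the sequence length, which is worth keeping in mind for the 56K-byte sequences.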
 
 
 
 
 
 
 
 
 













--
View this message in context: 
http://lucene.472066.n3.nabble.com/which-approach-is-correct-tp3898711p3898711.html
Sent from the Solr - User mailing list archive at Nabble.com.