Update specific field on Solr index

2011-10-13 Thread Chamnap Chhorn
Hello,

I'm working on Solr 1.4 with around 10 million documents. Usually, it's
fine. However, an issue arises when I add a new field to schema.xml: I
need to reindex the whole database for that new field. Indexing the whole
database with all its properties takes a long time. It would be better
if I could index only the new field and pass it to Solr.

I'm not using the DIH provided by Solr. I'm using my own rake script to index
documents; it's written in Ruby.

As far as I know, Lucene doesn't have an update operation. It has only DELETE
and ADD operations. If I send a partial document with only a few fields, the
new document will contain only those fields.

Is there any better way to do it?
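
Since Solr 1.4 offers no in-place field update, one workaround sometimes used is to read each document's stored fields back, merge in the new field, and re-add the whole document. The sketch below (in Python rather than the Ruby of the rake script) covers only the merge and the construction of the XML `<add>` message. It assumes every existing field is stored, and `merge_new_field` and `to_add_xml` are hypothetical helper names, not part of any Solr client.

```python
from xml.sax.saxutils import escape, quoteattr


def merge_new_field(stored_doc, new_fields):
    """Return a full document: the existing stored fields plus the new ones."""
    merged = dict(stored_doc)
    merged.update(new_fields)
    return merged


def to_add_xml(docs):
    """Build a Solr XML <add> message from a list of field dicts.

    List values are emitted as repeated <field> elements (multivalued fields).
    """
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            values = value if isinstance(value, list) else [value]
            for v in values:
                parts.append(
                    "<field name=%s>%s</field>" % (quoteattr(name), escape(str(v)))
                )
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


# Hypothetical stored document and new field, for illustration only.
stored = {"id": "book-1", "name": "Agile Web Development"}
full_doc = merge_new_field(stored, {"keyphrase": ["agile", "web"]})
print(to_add_xml([full_doc]))
```

If some fields are indexed but not stored, this approach loses them, so it only helps when the stored fields are enough to rebuild the document.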

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Score boosting

2010-07-08 Thread Chamnap Chhorn
Hi everyone,

I have a requirement to meet, but I can't figure out how to do it. I hope
someone can help me.

Here is the requirement: a book has several keyphrases (available for use in
searching). The author can buy a search result position for these
keyphrases or simply add keyphrases related to the book. I need to
implement a search that is affected by the position field.

I'm not sure how to implement this requirement. I hope someone can help
me!

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Score boosting

2010-07-11 Thread Chamnap Chhorn
Thanks for your reply. Do you have another solution? Here each keyphrase must
be matched exactly, as a whole word. The problem is that it is a
multivalued column.

Chamnap

On Thu, Jul 8, 2010 at 7:40 PM, osocurious2 wrote:

>
> Sounds like you want Payloads. I don't think you can guarantee a position,
> but you can boost relative to others. You can give one author/book a boost
> of 0 for the phrase Cooking, and another author/book a boost of .5 and yet
> another a boost of 1.0. For searches that include the phrase Cooking, the
> scores should reflect the boosts and the authors that bought the higher
> boost value will sort higher. These discuss Payloads (it isn't a trivial
> task by the way):
>  http://www.ultramagnus.org/?p=1
>
>
> http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/
> or use this to see other Solr-User group discussions on the topic:
>
>
> http://lucene.472066.n3.nabble.com/template/NodeServlet.jtp?tpl=search-page&node=472068&query=Using+Lucene's+payload+in+Solr
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Score-boosting-tp951214p951510.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
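
The relative boosting described in the quoted reply can be pictured with a toy model. This is not Lucene's payload API; it is only a sketch of the ordering such boosts would produce. The `phrase|boost` notation is an assumption about how the data might be laid out, and `parse_payload_terms` and `rank_for_phrase` are hypothetical names.

```python
def parse_payload_terms(raw_terms):
    """Split 'phrase|boost' strings into a {phrase: boost} dict (no boost means 0.0)."""
    boosts = {}
    for raw in raw_terms:
        phrase, _, boost = raw.partition("|")
        boosts[phrase] = float(boost) if boost else 0.0
    return boosts


def rank_for_phrase(books, phrase):
    """Return titles of books carrying the exact phrase, highest boost first."""
    matches = []
    for title, raw_terms in books.items():
        boosts = parse_payload_terms(raw_terms)
        if phrase in boosts:  # exact, case-sensitive match on the whole keyphrase
            matches.append((boosts[phrase], title))
    # Sort by descending boost; ties fall back to title order.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in matches]


# Made-up sample data for illustration.
books = {
    "Book A": ["Cooking|1.0", "web development|0.5"],
    "Book B": ["Cooking|0.5"],
    "Book C": ["Cooking"],
}
print(rank_for_phrase(books, "Cooking"))
```

As the reply notes, this gives a relative ordering among matching books rather than a guaranteed absolute position.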


Ranking position in solr

2010-07-12 Thread Chamnap Chhorn
I wonder whether there is a proper way to fulfill this requirement. A book has
several keyphrases. Each keyphrase consists of one to three words. The
author can either buy a keyphrase position or not. Note: each
author can buy more than one keyphrase. The keyphrase search must be exact
and case sensitive.

For example:
Book A, keyphrases: agile, web, development
Book B, keyphrases: css, html, web

Let's say the author of Book A buys search result position 1 for the keyphrase
"web", so his book should be in the first position. His book should be
listed before Book B.

Does anyone have suggestions on how to implement this in Solr?

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Ranking position in solr

2010-07-13 Thread Chamnap Chhorn
The problem is that every time I update elevate.xml, I need to restart the
Solr Tomcat service. This feature needs to be updated frequently. How would
I handle that?

Any ideas or other solutions?

On Mon, Jul 12, 2010 at 5:45 PM, Ahmet Arslan  wrote:

> > I wonder there is a proper way to
> > fulfill this requirement. A book has
> > several keyphrases. Each keyphrase consists from one word
> > to 3 words. The
> > author could either buy keyphrase position or don't buy
> > position. Note: each
> > author could buy more than 1 keyphrase. The keyphrase
> > search must be exact
> > and case sensitive.
> >
> > For example: Book A, keyphrases: agile, web, development
> > Book B, keyphrases:
> > css, html, web
> >
> > Let's say Author of Book A buys search result position 1
> > with keyphrase
> > "web", so his book should be in the first position. His
> > book should be
> > listed before the Book B.
> >
> > Anyone has any suggestions on how to implement this in
> > solr?
>
> http://wiki.apache.org/solr/QueryElevationComponent - which is used to
> "elevate" results based on editorial decisions - may help.
>
>
>
>
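
For the QueryElevationComponent suggested in the quoted reply, the purchased positions have to end up in an elevation file. A minimal sketch of generating that file from application data follows. The `<elevate>/<query text>/<doc id>` layout follows the component's documented file format, while `build_elevate_xml` and the sample ids are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_elevate_xml(purchases):
    """Build elevate.xml content from {query text: [doc ids in desired order]}."""
    root = ET.Element("elevate")
    for text, doc_ids in purchases.items():
        query = ET.SubElement(root, "query", {"text": text})
        for doc_id in doc_ids:
            ET.SubElement(query, "doc", {"id": doc_id})
    return ET.tostring(root, encoding="unicode")


# Made-up purchases: Book A bought the top slot for "web".
print(build_elevate_xml({"web": ["book-a"], "css": ["book-b"]}))
```

Generating the file from a database table keeps frequent changes manageable, though Solr still has to pick up the new file, which is the subject of the follow-up messages.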


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Ranking position in solr

2010-07-13 Thread Chamnap Chhorn
I'm using Solr 1.4 with only one core. The elevate XML file is quite big, and
I wonder whether Solr can handle that. How do I reload the core?

On Tue, Jul 13, 2010 at 4:12 PM, Ahmet Arslan  wrote:

> > The problem is that every time I
> > update the elevate.xml, I need to restart
> > solr tomcat service. This feature needs to be updated
> > frequently. How would
> > i handle that?
>
> You can reload core, without restarting tomcat, if you are using multi-core
> setup. Which version of solr are you using?
>
>
>
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Ranking position in solr

2010-07-14 Thread Chamnap Chhorn
I sent this command: curl http://localhost:8081/solr/update -F stream.body='
', but it doesn't reload.

It doesn't reload automatically after every commit or optimize unless I add
a new document and then commit.

Any idea?
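
The body of the curl command above was lost in the archive. Assuming it was a plain commit message, the equivalent request can be sketched as follows. The `<commit/>` body and the content type are assumptions, and the request is only prepared here, not sent.

```python
import urllib.request


def build_commit_request(update_url):
    """Prepare (but do not send) an XML commit request for Solr's update handler."""
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",  # assumed body; the original message lost its content
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


request = build_commit_request("http://localhost:8081/solr/update")
print(request.get_method(), request.full_url)
# To actually send it against a running Solr instance:
# urllib.request.urlopen(request).read()
```

The observation above, that nothing reloads unless a document was added first, suggests a commit with no pending changes may not open a new searcher, so the file would not be re-read in that case.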

On Tue, Jul 13, 2010 at 4:54 PM, Ahmet Arslan  wrote:

> > I'm using solr 1.4 and only one core.
> > The elevate xml file is quite big, and
> > i wonder can solr handle that? How to reload the core?
>
> Markus Jelsma's suggestion is more robust. You don't need to restart or
> reload anything. Put elevate.xml under data directory. It will reloaded
> automatically after every commit or optimize.
>
>
>
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


why spellcheck and elevate search components can't work together?

2010-07-19 Thread Chamnap Chhorn
In my solrconfig.xml, I set it up this way, but it doesn't work at all. Can
anyone help? Each component works without the other one.

  
string_ci
elevateListings.xml
false
  

  

  explicit
  20
  dismax
  name^2 full_text^1
  uuid
  2.2
  on
  0.1


  type:Listing


  false


  spellcheck


  elevateListings

  

If I remove the spellcheck component, the elevate component works (the result
also loads from elevateListings.xml).
If I remove the elevate component,
http://localhost:8081/solr/select/?q=redd&qt=mb_listings&spellcheck=true&spellcheck.collate=true
does work.

Any ideas?

Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
Hi,

I'm a Ruby developer with no background in Java at all. I need
*exclusive=true* to work on the elevation search component. However, it
needs a patch,
https://issues.apache.org/jira/browse/SOLR-1966. Could anyone give me
step-by-step instructions for applying it?


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
I'm using Solr 1.4, but exclusive=true doesn't work for me at all. I
wonder why that is?

Any ideas?

On Tue, Jul 20, 2010 at 6:40 AM, Erick Erickson wrote:

> I don't think you need to go there. That patch has been committed
> to the 3.x code line as well as the 1.4. You can get the 3x build here:
> http://hudson.zones.apache.org/hudson/job/Solr-3.x/
> Just try that...
>
> Or you can get the latest build if you're working with 1.4 here:
>
> http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/
>
> HTH
> Erick
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
Ah, I get what you mean. One more thing: I wonder whether the patch SOLR-1147
has been included in the last successful build or not?


Thanks,
Chamnap

On Tue, Jul 20, 2010 at 8:26 AM, Erick Erickson wrote:

> Not 1.4 released, 1.4 last successful build. I don't think the 1.4 release
> has the patch applied.
>
> See the second link I originally provided.
>
> HTH
> Erick
>


Re: How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
The other thing I want to ask is whether the latest build of Solr is stable.
I'm afraid it might bring other problems to my system.

Thanks,
Chamnap

On Tue, Jul 20, 2010 at 8:41 AM, Chamnap Chhorn wrote:

> Ah, I get what you mean. One more thing, I wonder the patch SOLR-1147 has
> included in the last successful build or not?
>
>
> Thanks,
> Chamnap
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: How to patch SOLR-1966?

2010-07-19 Thread Chamnap Chhorn
Is there an easy way to apply this patch to the Solr 1.4 release on my system?
I will be using it on the production server.

Thanks,
Chamnap

On Tue, Jul 20, 2010 at 8:49 AM, Chamnap Chhorn wrote:

> The other thing I want to ask is the latest build of solr is stable or not?
> I'm afraid it might bring some other problems to my system.
>
> Thanks,
> Chamnap
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


dismax request handler without q

2010-07-19 Thread Chamnap Chhorn
I wonder how I could make a query that returns *all books* that have the
keyphrase "web development" using the dismax handler. A book has multiple
keyphrases (keyphrase is a multivalued column). Do I have to pass the q
parameter?


Is this the correct one?
http://localhost:8081/solr/select?&q=hotel&fq=keyphrase:%20hotel
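
One common way to run dismax without user input is to leave q out and supply q.alt, which dismax falls back to when q is missing, and then restrict the results with a filter query. The sketch below only builds such a URL. The host and field name are taken from the message above, `keyphrase_only_url` is a hypothetical helper, and whether `defType` is needed depends on how the handler is configured.

```python
from urllib.parse import urlencode


def keyphrase_only_url(base_url, keyphrase):
    """Build a select URL that filters on one exact keyphrase and passes no q."""
    params = [
        ("defType", "dismax"),
        ("q.alt", "*:*"),  # match-all query, used by dismax when q is absent
        ("fq", 'keyphrase:"%s"' % keyphrase),  # filter queries use standard Lucene syntax
    ]
    return base_url + "?" + urlencode(params)


print(keyphrase_only_url("http://localhost:8081/solr/select", "web development"))
```

The helper does not escape double quotes inside the keyphrase, so input containing them would need extra handling.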

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: dismax request handler without q

2010-07-19 Thread Chamnap Chhorn
There is some default configuration in my solrconfig.xml that I didn't show
you. I'm a little confused after reading
http://wiki.apache.org/solr/DisMaxRequestHandler#q. I think q is for plain
user input.

On Tue, Jul 20, 2010 at 12:08 PM, olivier sallou
wrote:

> Hi,
> this is not very clear, if you need to query only keyphrase, why don't you
> query directly it? e.g. q=keyphrase:hotel ?
> Furthermore, why dismax if only keyphrase field is of interest? dismax is
> used to query multiple fields automatically.
>
> At least dismax do not appear in your query (using query type). It is set
> in
> your config for your default request handler?
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: dismax request handler without q

2010-07-19 Thread Chamnap Chhorn
I can't put q=keyphrase:hotel in my request when using the dismax handler. It
returns no results.

On Tue, Jul 20, 2010 at 1:19 PM, Chamnap Chhorn wrote:

> There are some default configuration on my solrconfig.xml that I didn't
> show you. I'm a little confused when reading
> http://wiki.apache.org/solr/DisMaxRequestHandler#q. I think q is for plain
> user input query.
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


LocalSolr distance in km?

2010-07-20 Thread Chamnap Chhorn
Hi,

I want to do a geo query with LocalSolr. However, it seems to support only
miles when calculating distances. Is there a quick way to use this search
component with Solr using km instead?
The other thing is that I want it to calculate distances starting from 500
meters up. How could I do this?
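
If the component only accepts and returns miles, one option is to convert on the client side: turn the km radius into miles before querying and convert the returned distances back. The sketch below does that and reads "from 500 meters up" as a minimum-distance filter, which is an interpretation. All function names are hypothetical, and nothing here calls LocalSolr itself.

```python
KM_PER_MILE = 1.609344  # international mile, exact by definition


def km_to_miles(km):
    """Convert a distance in kilometres to miles."""
    return km / KM_PER_MILE


def miles_to_km(miles):
    """Convert a distance in miles to kilometres."""
    return miles * KM_PER_MILE


def filter_min_distance(results_miles, min_km=0.5):
    """Convert (doc id, miles) pairs to km and drop those nearer than min_km."""
    converted = [(doc_id, miles_to_km(miles)) for doc_id, miles in results_miles]
    return [(doc_id, km) for doc_id, km in converted if km >= min_km]


radius_miles = km_to_miles(5)  # a 5 km search radius expressed in miles
print(round(radius_miles, 4))
print(filter_min_distance([("a", 0.2), ("b", 1.0)]))
```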

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


OutOfMemoryErrors

2010-08-16 Thread Chamnap Chhorn
I got this error. Could anyone explain it and how to solve it?

SEVERE: Exception invoking periodic operation:
java.lang.OutOfMemoryError: Java heap space
at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:114)
at
org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1337)
at
org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1601)
at
org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1590)
at java.lang.Thread.run(Thread.java:636)
Aug 17, 2010 1:56:43 AM
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
SEVERE: Error reading request, ignored
java.lang.OutOfMemoryError: Java heap space
Aug 17, 2010 2:00:45 AM
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
SEVERE: Error reading request, ignored
java.lang.OutOfMemoryError: Java heap space
Aug 17, 2010 2:09:44 AM org.apache.catalina.connector.CoyoteAdapter service
SEVERE: An exception or error occurred in the container during the request
processing
java.lang.OutOfMemoryError: Java heap space
Aug 17, 2010 2:10:32 AM
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler process
SEVERE: Error reading request, ignored
java.lang.OutOfMemoryError: Java heap space
Aug 17, 2010 2:12:12 AM org.apache.coyote.http11.Http11Processor process
SEVERE: Error finishing response
java.lang.OutOfMemoryError: Java heap space
Aug 17, 2010 2:15:16 AM org.apache.coyote.http11.Http11Protocol init


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: OutOfMemoryErrors

2010-08-17 Thread Chamnap Chhorn
Should I add this line with double quotes or not? If I don't, it
doesn't work at all in my /etc/init.d/tomcat6.

export CATALINA_OPTS="-Xms256m -Xmx1024m";
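
The quotes matter because the shell splits unquoted text on spaces, so without them only the first option is assigned to CATALINA_OPTS. A small sketch using Python's `shlex`, which applies POSIX-style word splitting for simple cases like this, shows the difference. The trailing semicolon is left out of the sample strings to keep the tokens clean.

```python
import shlex

unquoted = "export CATALINA_OPTS=-Xms256m -Xmx1024m"
quoted = 'export CATALINA_OPTS="-Xms256m -Xmx1024m"'

# Unquoted: the second JVM option becomes a separate word passed to export.
print(shlex.split(unquoted))

# Quoted: both JVM options stay inside the single assignment word.
print(shlex.split(quoted))
```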

On Tue, Aug 17, 2010 at 1:36 PM, Grijesh.singh wrote:

>
> put that line in your startup script or u can set as env var
> export CATALINA_OPTS=-Xms256m -Xmx1024m;
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/OutOfMemoryErrors-tp1181731p1182708.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: OutOfMemoryErrors

2010-08-17 Thread Chamnap Chhorn
Is there a way to verify that I have added it correctly?
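
One way to check is to look at the command line of the running Tomcat process (for example in `ps` output) and confirm that the heap flags appear there. The sketch below parses such a line. The sample command line is made up for illustration, and `heap_flags` is a hypothetical helper.

```python
import re


def heap_flags(command_line):
    """Extract -Xms/-Xmx settings from a JVM command line string."""
    found = {}
    for flag, size in re.findall(r"-(Xms|Xmx)(\d+[kKmMgG]?)", command_line):
        found[flag] = size
    return found


# Hypothetical process command line, as it might appear in `ps` output.
sample = (
    "/usr/bin/java -Xms256m -Xmx1024m "
    "-Dcatalina.base=/var/lib/tomcat6 org.apache.catalina.startup.Bootstrap start"
)
print(heap_flags(sample))
```

If the returned dict is empty for the real process, the export line was probably not picked up by the startup script.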

On Tue, Aug 17, 2010 at 2:41 PM, Chamnap Chhorn wrote:

> Should I add this line with double quote or not? because if I don't, it
> doesn't work at all in my /etc/init.d/tomcat6.
>
>
> export CATALINA_OPTS="-Xms256m -Xmx1024m";
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


possible to have multiple elevation file?

2010-08-23 Thread Chamnap Chhorn
Hi,

I need a separate elevation file for each site (around 200). I think one big
elevation file is difficult to manage. How could I manage each site's
elevation file separately?

Thanks
-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: possible to have multiple elevation file?

2010-08-23 Thread Chamnap Chhorn
Hi,

Here, I'm talking about the QueryElevationComponent
(http://wiki.apache.org/solr/QueryElevationComponent).
Does anyone have any ideas?

Thanks

On Mon, Aug 23, 2010 at 3:10 PM, Chamnap Chhorn wrote:

> Hi,
>
> I need multiple elevation file for each site (around 200). I think one big
> elevation file is difficult to manage. How could I manage each elevation
> file differently?
>
> Thanks
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Multiple partial word searching with dismax handler

2010-10-19 Thread Chamnap Chhorn
Hi,

I have a problem combining my query with multiple partial-word
searching in the dismax handler. To do multiple partial-word
searching, I use EdgeNGramFilterFactory, and my query must be something like
this: "name_ngram:sun name_ngram:hot" in q.alt, combined with my search
handler (
http://localhost:8081/solr/select/?q.alt=name_ngram:sun%20name_ngram:hot&qt=products).
I wonder how I can combine this with my search handler.
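
To make the partial-word matching concrete, here is a toy model of what an edge n-gram field does at index time: each word is expanded into its leading prefixes, so a query word matches when it equals one of them. This is only an illustration, not the Solr filter itself. The gram sizes and the lowercasing are assumptions, and the function names are hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading-edge n-grams of a token, shortest first."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def matches_all_prefixes(name, query):
    """True if every query word is a prefix of some word in name (case-insensitive)."""
    indexed = set()
    for word in name.lower().split():
        indexed.update(edge_ngrams(word))
    return all(term in indexed for term in query.lower().split())


print(edge_ngrams("sunny", max_gram=3))
print(matches_all_prefixes("Sunny Hotel", "sun hot"))
```

Under this model, "sun hot" matches a name such as "Sunny Hotel" because both query words are indexed prefixes of words in the name.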

Here is my search handler config:
  

  explicit
  20
  dismax
  name^200 full_text
  fap^15
  uuid
  2.2
  on
  0.1


  type:Product


  false


  spellcheck
  elevateProducts

  

If I query with this URL
http://localhost:8081/solr/select/?q.alt=name_ngram:sun%20name_ngram:hot&q=sun+hot&qt=products,
it doesn't return the correct results like the previous query does.

How could I configure this in my search handler with a boost score?

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Multiple partial word searching with dismax handler

2010-10-20 Thread Chamnap Chhorn
Can anyone suggest how to do multiple partial-word searching?

On Wed, Oct 20, 2010 at 11:42 AM, Chamnap Chhorn wrote:

> Hi,
>
> I have some problem with combining the query with multiple parital-word
> searching in dismax handler. In order to make multiple partial word
> searching, I use EdgeNGramFilterFactory, and my query must be something like
> this: "name_ngram:sun name_ngram:hot" in q.alt combined with my search
> handler (
> http://localhost:8081/solr/select/?q.alt=name_ngram:sun%20name_ngram:hot&qt=products).
> I wonder how I combine this with my search handler.
>
> Here is my search handler config:
>   
> 
>   explicit
>   20
>   dismax
>   name^200 full_text
>   fap^15
>   uuid
>   2.2
>   on
>   0.1
> 
> 
>   type:Product
> 
> 
>   false
> 
> 
>   spellcheck
>   elevateProducts
> 
>   
>
> If I query with this url 
> http://localhost:8081/solr/select/?q.alt=name_ngram:sun%20name_ngram:hot&q=sun
> hot&qt=products<http://localhost:8081/solr/select/?q.alt=name_ngram:sun%20name_ngram:hot&q=sun+hot&qt=products>,
> it doesn't show the correct answer like the previous query.
>
> How could configure this in my search handler with boost score?
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


QueryElevation Component is so slow

2010-10-28 Thread Chamnap Chhorn
Hi,

I'm using Solr 1.4 with the QueryElevationComponent for guaranteed search
positions. I have around 700,000 documents and a 1 MB elevation file. It turns
out to be quite slow according to the New Relic monitoring website:

Slowest Components Count Exclusive Total   QueryElevationComponent 1 506,858
ms 100% 506,858 ms 100% SolrIndexSearcher 1 2.0 ms 0% 2.0 ms 0%
org.apache.solr.servlet.SolrDispatchFilter.doFilter() 1 1.0 ms 0% 506,862 ms
100% QueryComponent 1 1.0 ms 0% 1.0 ms 0% DebugComponent 1 0.0 ms 0% 0.0 ms
0% FacetComponent 1 0.0 ms 0% 0.0 ms 0%

As you can see, QueryElevationComponent takes quite a lot of time. Any
suggestions on how to improve this?

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: QueryElevation Component is so slow

2010-10-28 Thread Chamnap Chhorn
Sorry for the bad paste. Here it is again.

Slowest Components                                     Count  Exclusive          Total
QueryElevationComponent                                1      506,858 ms  100%   506,858 ms  100%
SolrIndexSearcher                                      1      2.0 ms        0%   2.0 ms        0%
org.apache.solr.servlet.SolrDispatchFilter.doFilter()  1      1.0 ms        0%   506,862 ms  100%
QueryComponent                                         1      1.0 ms        0%   1.0 ms        0%
DebugComponent                                         1      0.0 ms        0%   0.0 ms        0%
FacetComponent                                         1      0.0 ms        0%   0.0 ms        0%

On Thu, Oct 28, 2010 at 4:57 PM, Chamnap Chhorn wrote:

> Hi,
>
> I'm using solr 1.4 and using QueryElevation Component for guaranteed search
> position. I have around 700,000 documents with 1 Mb elevation file. It turns
> out it is quite slow on the newrelic monitoring website:
>
> Slowest Components Count Exclusive Total   QueryElevationComponent 1
> 506,858 ms 100% 506,858 ms 100% SolrIndexSearcher 1 2.0 ms 0% 2.0 ms 0%
> org.apache.solr.servlet.SolrDispatchFilter.doFilter() 1 1.0 ms 0% 506,862 ms
> 100% QueryComponent 1 1.0 ms 0% 1.0 ms 0% DebugComponent 1 0.0 ms 0% 0.0 ms
> 0% FacetComponent 1 0.0 ms 0% 0.0 ms 0%
>
> As you could see, QueryElevationComponent takes quite a lot of time. Any
> suggestion how to improve this?
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: QueryElevation Component is so slow

2010-10-29 Thread Chamnap Chhorn
Does anyone have suggestions to improve the search?
Thanks

On 10/28/10, Chamnap Chhorn  wrote:
> Sorry for very bad pasting. I paste it again.
>
> Slowest Components  Count   Exclusive
>  Total
> QueryElevationComponent 1 506,858 ms
> 100%
> 506,858 ms 100%
> SolrIndexSearcher 1 2.0 ms
>  0% 2.0 ms 0%
> org.apache.solr.servlet.SolrDispatchFilter.doFilter() 1 1.0 ms
>  0% 506,862 ms 100%
> QueryComponent 1 1.0 ms
>  0%   1.0 ms 0%
> DebugComponent 1 0.0 ms
>  0% 0.0 ms 0%
> FacetComponent 1 0.0 ms
>  0% 0.0 ms     0%
>
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: QueryElevation Component is so slow

2010-10-29 Thread Chamnap Chhorn
Thanks for the reply.

I'm looking for a way to improve the speed of the search query. The
QueryElevationComponent is taking too much time, which is
unacceptable. The elevation file is only 1 MB. I wonder whether other
people use this component without speed problems. Am I
using it the wrong way, or is there a limit when using this component?

On 10/29/10, Lance Norskog  wrote:
> I do not know if this is accurate. There are direct tools to monitor
> these problems: jconsole, visualgc/visualvm, YourKit, etc. Often these
> counts allot many things to one place that should be spread out.
>
> --
> Lance Norskog
> goks...@gmail.com
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: QueryElevation Component is so slow

2010-10-30 Thread Chamnap Chhorn
Well, I use Solr 1.4.

There are 30,698 lines in my elevation file. I only need 20 results
returned at a time.

On Sun, Oct 31, 2010 at 9:12 AM, Lance Norskog  wrote:

> Now you got me interested. Always a bad thing ;)
>
> Looking at the QueryElevationComponent, I don't know enough to decide
> if it has algorithms that don't scale. It does something odd with
> sorting. It has a concurrent access path for each query, which should
> not be a problem. It has not changed much since a year ago.
>
> Which Solr release are you using? If it is the trunk or 3.x, it's
> possible that the Lucene API changes have left QE very slow.
>
> Also, how many lines are in your elevation file? How many "answers" do
> you supply per query?  It is possible that the special tricks QE does
> with sorting don't scale to large elevation database or large numbers
> of results per query.
>
> Lance
>
> On Fri, Oct 29, 2010 at 7:04 AM, Chamnap Chhorn 
> wrote:
> > Thanks for reply.
> >
> > I'm looking for how to improve the speed of the search query. The
> > QueryElevation Component is taking too much time which is
> > unacceptable. The size of elevation file is only 1 Mb. I wonder other
> > people using this component without problems (related to speed)? Am I
> > using it the wrong way or there is a limit when using this component?
> >
> > On 10/29/10, Lance Norskog  wrote:
> >> I do not know if this is accurate. There are direct tools to monitor
> >> these problems: jconsole, visualgc/visualvm, YourKit, etc. Often these
> >> counts allot many things to one place that should be spread out.
> >>
> >> On Fri, Oct 29, 2010 at 12:27 AM, Chamnap Chhorn
> >>  wrote:
> >>> anyone has some suggestions to improve the search?
> >>> thanks
> >>>
> >>> On 10/28/10, Chamnap Chhorn  wrote:
> >>>> Sorry for very bad pasting. I paste it again.
> >>>>
> >>>> Slowest Components                                     Count  Exclusive         Total
> >>>> QueryElevationComponent                                1      506,858 ms  100%  506,858 ms  100%
> >>>> SolrIndexSearcher                                      1      2.0 ms      0%    2.0 ms      0%
> >>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter()  1      1.0 ms      0%    506,862 ms  100%
> >>>> QueryComponent                                         1      1.0 ms      0%    1.0 ms      0%
> >>>> DebugComponent                                         1      0.0 ms      0%    0.0 ms      0%
> >>>> FacetComponent                                         1      0.0 ms      0%    0.0 ms      0%
> >>>>
> >>>> On Thu, Oct 28, 2010 at 4:57 PM, Chamnap Chhorn
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I'm using solr 1.4 and using QueryElevation Component for guaranteed
> >>>>> search
> >>>>> position. I have around 700,000 documents with 1 Mb elevation file.
> It
> >>>>> turns
> >>>>> out it is quite slow on the newrelic monitoring website:
> >>>>>
> >>>>> Slowest Components Count Exclusive Total   QueryElevationComponent 1
> >>>>> 506,858 ms 100% 506,858 ms 100% SolrIndexSearcher 1 2.0 ms 0% 2.0 ms
> 0%
> >>>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter() 1 1.0 ms 0%
> >>>>> 506,862
> >>>>> ms
> >>>>> 100% QueryComponent 1 1.0 ms 0% 1.0 ms 0% DebugComponent 1 0.0 ms 0%
> 0.0
> >>>>> ms
> >>>>> 0% FacetComponent 1 0.0 ms 0% 0.0 ms 0%
> >>>>>
> >>>>> As you could see, QueryElevationComponent takes quite a lot of time.
> Any
> >>>>> suggestion how to improve this?
> >>>>>
> >>>>> --
> >>>>> Chhorn Chamnap
> >>>>> http://chamnapchhorn.blogspot.com/
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Chhorn Chamnap
> >>>> http://chamnapchhorn.blogspot.com/
> >>>>
> >>>
> >>>
> >>> --
> >>> Chhorn Chamnap
> >>> http://chamnapchhorn.blogspot.com/
> >>>
> >>
> >>
> >>
> >> --
> >> Lance Norskog
> >> goks...@gmail.com
> >>
> >
> >
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: QueryElevation Component is so slow

2010-10-30 Thread Chamnap Chhorn
20

On Sun, Oct 31, 2010 at 9:44 AM, Lance Norskog  wrote:

> How many items for each query?
>
> On Sat, Oct 30, 2010 at 7:34 PM, Chamnap Chhorn 
> wrote:
> > Well, I use Solr 1.4.
> >
> > There are 30698 lines in my elevation file. I need only 20 results
> response
> > back at a time.
> >
> > On Sun, Oct 31, 2010 at 9:12 AM, Lance Norskog 
> wrote:
> >
> >> Now you got me interested. Always a bad thing ;)
> >>
> >> Looking at the QueryElevationComponent, I don't know enough to decide
> >> if it has algorithms that don't scale. It does something odd with
> >> sorting. It has a concurrent access path for each query, which should
> >> not be a problem. It has not changed much since a year ago.
> >>
> >> Which Solr release are you using? If it is the trunk or 3.x, it's
> >> possible that the Lucene API changes have left QE very slow.
> >>
> >> Also, how many lines are in your elevation file? How many "answers" do
> >> you supply per query?  It is possible that the special tricks QE does
> >> with sorting don't scale to large elevation database or large numbers
> >> of results per query.
> >>
> >> Lance
> >>
> >> On Fri, Oct 29, 2010 at 7:04 AM, Chamnap Chhorn <
> chamnapchh...@gmail.com>
> >> wrote:
> >> > Thanks for reply.
> >> >
> >> > I'm looking for how to improve the speed of the search query. The
> >> > QueryElevation Component is taking too much time which is
> >> > unacceptable. The size of elevation file is only 1 Mb. I wonder other
> >> > people using this component without problems (related to speed)? Am I
> >> > using it the wrong way or there is a limit when using this component?
> >> >
> >> > On 10/29/10, Lance Norskog  wrote:
> >> >> I do not know if this is accurate. There are direct tools to monitor
> >> >> these problems: jconsole, visualgc/visualvm, YourKit, etc. Often
> these
> >> >> counts allot many things to one place that should be spread out.
> >> >>
> >> >> On Fri, Oct 29, 2010 at 12:27 AM, Chamnap Chhorn
> >> >>  wrote:
> >> >>> anyone has some suggestions to improve the search?
> >> >>> thanks
> >> >>>
> >> >>> On 10/28/10, Chamnap Chhorn  wrote:
> >> >>>> Sorry for very bad pasting. I paste it again.
> >> >>>>
> >> >>>> Slowest Components                                     Count  Exclusive         Total
> >> >>>> QueryElevationComponent                                1      506,858 ms  100%  506,858 ms  100%
> >> >>>> SolrIndexSearcher                                      1      2.0 ms      0%    2.0 ms      0%
> >> >>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter()  1      1.0 ms      0%    506,862 ms  100%
> >> >>>> QueryComponent                                         1      1.0 ms      0%    1.0 ms      0%
> >> >>>> DebugComponent                                         1      0.0 ms      0%    0.0 ms      0%
> >> >>>> FacetComponent                                         1      0.0 ms      0%    0.0 ms      0%
> >> >>>>
> >> >>>> On Thu, Oct 28, 2010 at 4:57 PM, Chamnap Chhorn
> >> >>>> wrote:
> >> >>>>
> >> >>>>> Hi,
> >> >>>>>
> >> >>>>> I'm using solr 1.4 and using QueryElevation Component for
> guaranteed
> >> >>>>> search
> >> >>>>> position. I have around 700,000 documents with 1 Mb elevation
> file.
> >> It
> >> >>>>> turns
> >> >>>>> out it is quite slow on the newrelic monitoring website:
> >> >>>>>
> >> >>>>> Slowest Components Count Exclusive Total   QueryElevationComponent
> 1
> >> >>>>> 506,858 ms 100% 506,858 ms 100% SolrIndexSearcher 1 2.0 ms 0% 2.0
> ms
> >> 0%
> >> >>>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter() 1 1.0 ms 0%
> >> >>>>> 506,862
> >> >>>>> ms
> >> >>>>> 100% QueryComponent 1 1.0 ms 0% 1.0 ms 0% DebugComponent 1 0.0 ms
> 0%
> >> 0.0
> >> >>>>> ms
> >> >>>>> 0% FacetComponent 1 0.0 ms 0% 0.0 ms 0%
> >> >>>>>
> >> >>>>> As you could see, QueryElevationComponent takes quite a lot of
> time.
> >> Any
> >> >>>>> suggestion how to improve this?
> >> >>>>>
> >> >>>>> --
> >> >>>>> Chhorn Chamnap
> >> >>>>> http://chamnapchhorn.blogspot.com/
> >> >>>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> --
> >> >>>> Chhorn Chamnap
> >> >>>> http://chamnapchhorn.blogspot.com/
> >> >>>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Chhorn Chamnap
> >> >>> http://chamnapchhorn.blogspot.com/
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Lance Norskog
> >> >> goks...@gmail.com
> >> >>
> >> >
> >> >
> >> > --
> >> > Chhorn Chamnap
> >> > http://chamnapchhorn.blogspot.com/
> >> >
> >>
> >>
> >>
> >> --
> >> Lance Norskog
> >> goks...@gmail.com
> >>
> >
> >
> >
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Xml representation of indexed document

2012-03-09 Thread Chamnap Chhorn
Hi all,

I'm importing data with DIH in Solr 3.5. Is it possible to see the XML
representation of the indexed data from the browser?
I just want to make sure the data is indexed with the correct values,
and to use it for debugging.
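
As a rough illustration of what can be opened in a browser, the sketch below builds two URLs: a normal select request that returns the stored fields of one document as XML, and a request to the Luke request handler for the same document. The Solr location, the document id, the use of `uuid` as the unique key, and the `/admin/luke` path are all assumptions to adapt to the actual setup.

```javascript
// Hypothetical Solr location and document id; replace with your own.
var SOLR = "http://localhost:8081/solr";
var docId = "some-uuid";

// Stored fields of one document, returned as XML.
// Fields declared stored="false" will not appear in this view.
var storedView =
  SOLR + "/select?q=" + encodeURIComponent("uuid:" + docId) +
  "&fl=*&wt=xml&indent=on";

// Per-document index details via the Luke request handler, assuming it is
// registered at /admin/luke and that "id" looks up by unique key.
var indexedView = SOLR + "/admin/luke?id=" + encodeURIComponent(docId);

console.log(storedView);
console.log(indexedView);
```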

-- 
Chamnap


Re: Xml representation of indexed document

2012-03-10 Thread Chamnap Chhorn
Thanks Anupam and Paul.

Yes, it can't display unstored fields. I can't find a way to extract
unstored fields in Luke. Any idea?
In your project, which indexer do you use? Previously, I wrote a Ruby
script to index, but it took a lot of time. That's why I changed to DIH.


Chamnap


On Sat, Mar 10, 2012 at 4:41 PM, Paul Libbrecht  wrote:

> Chamnap,
>
> that'd be a view of the stored fields only (although Luke has some more to
> extract unstored fields).
> In my search projects I have an indexer and that component (not DIH) can
> display an "indexed view" of a document.
>
> maybe it helps.
>
> paul
>
>
> Le 10 mars 2012 à 08:57, Anupam Bhattacharya a écrit :
>
> > You can use Luke to view Lucene Indexes.
> >
> > Anupam
> >
> > On Sat, Mar 10, 2012 at 12:27 PM, Chamnap Chhorn <
> chamnapchh...@gmail.com>wrote:
> >
> >> Hi all,
> >>
> >> I'm doing data import using DIH in solr 3.5. I'm curious to know
> whether it
> >> is see the xml representation of indexed data from the browser. Is it
> >> possible?
> >> I just want to make sure these data is correctly indexed with correct
> value
> >> or for debugging purpose.
> >>
> >> --
> >> Chamnap
> >>
> >
> >
> >
> > --
> > Thanks & Regards
> > Anupam Bhattacharya
>
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Accessing other entities from DIH

2012-03-10 Thread Chamnap Chhorn
Hi all,

I'm using DIH in Solr 3.5 to import data from MySQL. My document has
some fields: name, category, text_spell, ...
text_spell is a multi-valued field that combines name and category
(category is a multi-valued field as well).


   

   


In this case, I would use ScriptTransformer to produce a new array of
[name, category], but from the example in the Solr wiki
(http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer),
it seems it can only access the current row in the current entity.
Is it possible to access other entities?

If that's not possible, how could I solve this problem? I know I could use a
UNION statement, but it duplicates the query and would degrade performance
as well. Any idea?

-- 
Chamnap


Re: Accessing other entities from DIH

2012-03-10 Thread Chamnap Chhorn
Thanks Mikhail.

Yeah, in this case CopyField is better. I can combine multiple fields into
a new field, right? Something like this:




Anyway, I might need to access both the child entity and the parent entity.
Can you give me some examples of how to use the context? I'm not a Java
developer, and the Solr wiki is a little abstract to me.
Or could you point me to some links that explain this in more detail?
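
A minimal sketch of what a two-argument transformer function might look like when attached to the child (category) entity. It assumes the parent entity is named `listing`, that the DIH `Context` object exposes `resolve(...)` for looking up variables such as `listing.name`, and that the child row has a `name` column. The `makeRow` and `fakeContext` helpers are hypothetical stand-ins so the function can be tried outside Solr. Inside DIH the row is a Java map, and a multi-valued field would need a `java.util.ArrayList` rather than a JavaScript array.

```javascript
// Transformer sketch: combine the parent listing name with the current
// category name into text_spell.
function addParentName(row, context) {
  var values = [];
  var parentName = context.resolve("listing.name"); // assumed lookup
  if (parentName) values.push(String(parentName));
  var categoryName = row.get("name");
  if (categoryName) values.push(String(categoryName));
  row.put("text_spell", values);
  return row; // DIH expects the (possibly modified) row back
}

// Hypothetical stand-in for the Java map DIH passes as the row.
function makeRow(data) {
  return {
    get: function (key) { return data[key]; },
    put: function (key, value) { data[key] = value; }
  };
}

// Hypothetical stand-in for the DIH context.
var fakeContext = {
  resolve: function (name) {
    return name === "listing.name" ? "Example Listing" : null;
  }
};

var out = addParentName(makeRow({ name: "Hotels" }), fakeContext);
console.log(out.get("text_spell"));
```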

Chamnap

On Sat, Mar 10, 2012 at 7:11 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Hello,
>
> First of all you can have an access to the context, where the parent entity
> fields can be obtained from (following your link):
>
> The semantics of execution is same as that of a java transformer. The
> method can have two arguments as in 'transformRow(Map ,
> Context context) in the abstract class 'Transformer' . As it is javascript
> the second argument may be omittted and it still works.
>
> then,
>
> generally it sounds like a copyfield
> http://wiki.apache.org/solr/SchemaXml#Copy_Fields have you considered it?
>
> On Sat, Mar 10, 2012 at 3:42 PM, Chamnap Chhorn  >wrote:
>
> > Hi all,
> >
> > I'm using DIH solr 3.5 to import data from mysql. In my document, I have
> > some fields: name, category, text_spell, ...
> > text_spell is a multi-valued field which combines from name and category
> > (category is a multi-value field as well).
> >
> >  >query="SELECT uuid, name from listings" pk="uuid">
> >>  query="SELECT `categories`.`name` FROM categories INNER JOIN
> > `listing_categories` ON
> > `categories`.`uuid`=`listing_categories`.`category_uuid`) WHERE
> > `listing_categories`.`listing_uuid`='${listing.uuid}'">
> >
> >   
> > 
> >
> > In this case, I would use ScriptTransformer to produce a new array of
> > [name, category], but the from the example in solr
> > wiki<http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer>,
> > it seems it could only access the current row in the current entity.
> > Is it possible to access other entities?
> >
> > If not possible, how could i solve this problem. I know I could use UNION
> > statement, but it duplicates the query and it would degrade the
> performance
> > as well. Any idea?
> >
> > --
> > Chamnap
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Lucid Certified
> Apache Lucene/Solr Developer
> Grid Dynamics
>
> <http://www.griddynamics.com>
>  
>


Re: Xml representation of indexed document

2012-03-10 Thread Chamnap Chhorn
Mikhail, the DIH interactive UI doesn't work well for me because I can't see
the XML of the indexed documents. I need to see it to make sure I'm doing it
right.

How do you make sure you're doing it right using the DIH interactive UI?

On Sat, Mar 10, 2012 at 7:14 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Hello,
>
> DIH has a cute interactive ui with debug/verbose features. Have you checked
> them?
>
> On Sat, Mar 10, 2012 at 10:57 AM, Chamnap Chhorn  >wrote:
>
> > Hi all,
> >
> > I'm doing data import using DIH in solr 3.5. I'm curious to know whether
> it
> > is see the xml representation of indexed data from the browser. Is it
> > possible?
> > I just want to make sure these data is correctly indexed with correct
> value
> > or for debugging purpose.
> >
> > --
> > Chamnap
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Lucid Certified
> Apache Lucene/Solr Developer
> Grid Dynamics
>
> <http://www.griddynamics.com>
>  
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


More explanation on row in DIH

2012-03-10 Thread Chamnap Chhorn
Hi all,

Could anyone please explain what a row is in DIH?
Let's say a listing can have multiple keyphrase_assets. A keyphrase_asset
is a comma-separated value ("hotel,bank,..."). I need to split it by comma
and index it into a multi-valued keyphrase field.

function fKeyphrasePosition(row) {
}


So will my fKeyphrasePosition function be executed for every row of
keyphrase_asset, or only once?
Because keyphrase is a dynamic field in schema.xml ("keyphrase_0",
"keyphrase_1", ...), I need to access the current index as well. I know I
can write a subquery to return the row number of keyphrase_asset. Is that the
only way to go?

It's really difficult for me to debug my JavaScript code. I tried to use
LogTransformer, but couldn't see where it outputs to.
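
For reference, a transformer is applied once to each row returned by the entity it is attached to. The sketch below shows one possible body for the function: it splits a comma-separated value and writes each phrase to a numbered field. The column name `keyphrase` and the `makeRow` helper are hypothetical, and the numbering here restarts for every row, so it does not solve numbering across several keyphrase_asset rows.

```javascript
// Split "hotel, bank,,spa" into trimmed, non-empty phrases.
// Written with plain loops so it stays close to older JavaScript engines.
function splitKeyphrases(raw) {
  var parts = String(raw).split(",");
  var phrases = [];
  for (var i = 0; i < parts.length; i++) {
    var p = parts[i].replace(/^\s+|\s+$/g, "");
    if (p.length > 0) phrases.push(p);
  }
  return phrases;
}

// Transformer sketch: called once per row of the entity it is attached to.
function fKeyphrasePosition(row) {
  var phrases = splitKeyphrases(row.get("keyphrase")); // hypothetical column
  for (var i = 0; i < phrases.length; i++) {
    row.put("keyphrase_" + i, phrases[i]);
  }
  return row;
}

// Hypothetical stand-in for the Java map DIH passes as the row.
function makeRow(data) {
  return {
    get: function (key) { return data[key]; },
    put: function (key, value) { data[key] = value; }
  };
}

var result = fKeyphrasePosition(makeRow({ keyphrase: "hotel, bank,,spa" }));
console.log(result.get("keyphrase_0"), result.get("keyphrase_1"), result.get("keyphrase_2"));
```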

-- 
Chamnap


Re: Index-time field boost with DIH

2012-05-24 Thread Chamnap Chhorn
Could anyone help me? I really need index-time field boosting.

On Thu, May 24, 2012 at 4:21 PM, Chamnap Chhorn wrote:

> Hi all,
>
> I want to do index-time boost field on DIH. Is there any way to do this? I
> see on this documentation, there is only $docBoost. How about field boost?
> Is it possible?
>
> Thanks
> http://chamnap.github.com/
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Index-time field boost with DIH

2012-05-24 Thread Chamnap Chhorn
I need index-time field boosting because the client buys a position
asset. Therefore, some documents, when matched, are more important than
others. That's what index-time boost does, right?

On Thu, May 24, 2012 at 10:10 PM, Walter Underwood wrote:

> Why? Query-time boosting is fast and more flexible.
>
> wunder
> Search Guy, Netflix & Chegg
>
> On May 24, 2012, at 6:11 AM, Chamnap Chhorn wrote:
>
> > Anyone could help me? I really need index-time field-boosting.
> >
> > On Thu, May 24, 2012 at 4:21 PM, Chamnap Chhorn  >wrote:
> >
> >> Hi all,
> >>
> >> I want to do index-time boost field on DIH. Is there any way to do
> this? I
> >> see on this documentation, there is only $docBoost. How about field
> boost?
> >> Is it possible?
> >>
> >> Thanks
> >> http://chamnap.github.com/
> >>
> >
> >
> >
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
>
>
>
>
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Index-time field boost with DIH

2012-05-24 Thread Chamnap Chhorn
Thanks for your reply.

I need to boost at the document level and at the field level as well. Only
queries that match certain fields should get the boost.

In DIH, there is $docBoost (boost at the document level), but there is no
documentation about field boost at all.
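
As a rough check of the boost function suggested in the quoted reply, log(max(docboost,1)), the arithmetic can be tried outside Solr. This sketch assumes that Solr's log() function is base 10 and that the boost value is multiplied into the base score, as the reply describes. The sample docboost values and the base score are hypothetical.

```javascript
// Assumption: Solr's log() function query is base 10, mirrored by Math.log10.
function boostFactor(docboost) {
  return Math.log10(Math.max(docboost, 1));
}

// Assumption: the boost value is multiplied into the base score.
function boostedScore(baseScore, docboost) {
  return baseScore * boostFactor(docboost);
}

var samples = [0.5, 1, 10, 100]; // hypothetical stored docboost values
samples.forEach(function (d) {
  console.log("docboost=" + d + " factor=" + boostFactor(d) +
              " score=" + boostedScore(4.5, d));
});
```

Under these assumptions a stored value of 1 or less gives a factor of 0, so the stored factors would need to be above 1 for a document to keep a non-zero score.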

On Thu, May 24, 2012 at 10:32 PM, Walter Underwood wrote:

> If you want different boosts for different documents, then use the "boost"
> parameter in edismax. You can store the factor in a field, then use it to
> affect the score.
>
> If you store it in a field named "docboost", you could use this in an
> edismax config in your solrconfig.xml.
>
>   log(max(docboost,1))
>
> This will be multiplied into the score for each document. I use the max()
> function to avoid problems with zero and negative values.
>
> wunder
>
> On May 24, 2012, at 8:19 AM, Chamnap Chhorn wrote:
>
> > I need to do index-time field boosting because the client buy position
> > asset. Therefore, some document when matched are more important than
> > others. That's what index time boost does, right?
> >
> > On Thu, May 24, 2012 at 10:10 PM, Walter Underwood <
> wun...@wunderwood.org>wrote:
> >
> >> Why? Query-time boosting is fast and more flexible.
> >>
> >> wunder
> >> Search Guy, Netflix & Chegg
> >>
> >> On May 24, 2012, at 6:11 AM, Chamnap Chhorn wrote:
> >>
> >>> Anyone could help me? I really need index-time field-boosting.
> >>>
> >>> On Thu, May 24, 2012 at 4:21 PM, Chamnap Chhorn <
> chamnapchh...@gmail.com
> >>> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> I want to do index-time boost field on DIH. Is there any way to do
> >> this? I
> >>>> see on this documentation, there is only $docBoost. How about field
> >> boost?
> >>>> Is it possible?
> >>>>
> >>>> Thanks
> >>>> http://chamnap.github.com/
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Chhorn Chamnap
> >>> http://chamnapchhorn.blogspot.com/
> >>
> >>
> >>
> >>
> >>
> >
> >
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
>
> --
> Walter Underwood
> wun...@wunderwood.org
>
>
>
>


-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Must require quote with single word token query?

2010-11-16 Thread Chamnap Chhorn
I have a question about single-token fields with a dismax query. For a
document to be found, I have to put quotes around the search query every time.
This is hard for me to do since it is part of full-text search.

Here is my solr query and field type definition (Solr 1.4):

  






  




With the query q=smart%20mobile&qf=keyphrase&debugQuery=on&defType=dismax,
Solr returns nothing. However, with quotes around the search query, q="smart
mobile"&qf=keyphrase&debugQuery=on&defType=dismax, the result is found.

Is it mandatory to quote the query for a single-token field?

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Must require quote with single word token query?

2010-11-17 Thread Chamnap Chhorn
Thanks for your reply. Here are some other details:

1. Keyphrase field definition:
   <field name="keyphrase" type="text_keyword" indexed="true" stored="false" multiValued="true"/>

2. I'm using solr 1.4.

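3. My dismax definition is the original configuration after installing Solr:
  <requestHandler name="dismax" class="solr.SearchHandler" >
    <lst name="defaults">
     <str name="defType">dismax</str>
     <str name="echoParams">explicit</str>
     <float name="tie">0.01</float>
     <str name="qf">
        text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
     </str>
     <str name="pf">
        text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
     </str>
     <str name="bf">
        popularity^0.5 recip(price,1,1000,1000)^0.3
     </str>
     <str name="fl">
        id,name,price,score
     </str>
     <str name="mm">
        2&lt;-1 5&lt;-2 6&lt;90%
     </str>
     <int name="ps">100</int>
     <str name="q.alt">*:*</str>
     <str name="hl.fl">text features name</str>
     <str name="f.name.hl.fragsize">0</str>
     <str name="f.name.hl.alternateField">name</str>
     <str name="f.text.hl.fragmenter">regex</str>
    </lst>
  </requestHandler>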

4. Here is the result returned back of the original query: smart mobile.






 0

 1
 

  on
  uuid,name,fap

  true
  smart mobile

  keyphrase
  dismax

 





 rawquerystring: smart mobile
 querystring: smart mobile
 parsedquery: +((DisjunctionMaxQuery((keyphrase:smart)) DisjunctionMaxQuery((keyphrase:mobile)))~2) ()
 parsedquery_toString: +(((keyphrase:smart) (keyphrase:mobile))~2) ()
 QParser: DisMaxQParser

 
 

 
  1.0

  
1.0


 1.0




 0.0



 0.0




 0.0



 0.0




 0.0



  
  
0.0


 0.0




 0.0



 0.0




 0.0




 0.0




 0.0


  

 



5. Here is the parsed query for "smart mobile" (with quotes), which returns the
result:

 rawquerystring: "smart mobile"
 querystring: "smart mobile"
 parsedquery: +DisjunctionMaxQuery((keyphrase:smart mobile)) ()
 parsedquery_toString: +(keyphrase:smart mobile) ()

 
  

4.503682 = (MATCH) sum of:
  4.503682 = (MATCH) fieldWeight(keyphrase:smart mobile in 13092), product of:
    1.0 = tf(termFreq(keyphrase:smart mobile)=1)
    10.29413 = idf(docFreq=1, maxDocs=21748)
    0.4375 = fieldNorm(field=keyphrase, doc=13092)


 

6. Here, I tried to use the automatic phrase query (pf parameter), but it
doesn't return any results either.
http://localhost:8081/solr/select?q=smart%20mobile&qf=keyphrase&pf=keyphrase&debugQuery=on&defType=dismax

 rawquerystring: smart mobile
 querystring: smart mobile
 parsedquery: +((DisjunctionMaxQuery((keyphrase:smart)) DisjunctionMaxQuery((keyphrase:mobile)))~2) DisjunctionMaxQuery((keyphrase:smart mobile))
 parsedquery_toString: +(((keyphrase:smart) (keyphrase:mobile))~2) (keyphrase:smart mobile)
 QParser: DisMaxQParser

 
 

Thanks
Chamnap

On Wed, Nov 17, 2010 at 8:10 PM, Erick Erickson wrote:

> Try qt=dismax or deftype=dismax, I was also getting 0 results with
> defType on 1.4.1. I'll see what's up with that...
>
> But if that doesn't work...
>
> May we see your dismax definition too? You shouldn't need the
> quotes, so something's wrong somewhere
>
> What version of Solr are you using?
>
> Also, please post the results of running your original query
> with &debugQuery=on
>
> Best
> Erick
>
> On Tue, Nov 16, 2010 at 10:28 PM, Chamnap Chhorn  >wrote:
>
> > I have one question related to single word token with dismax query. In
> > order
> > to be found I need to add the quote around the search query all the time.
> > This is quite hard for me to do since it is part of full text search.
> >
> > Here is my solr query and field type definition (Solr 1.4):
> > > positionIncrementGap="100">
> >  
> >
> >
> >
> > > words="stopwords.txt" enablePositionIncrements="true"/>
> > > ignoreCase="true" expand="false" />
> >
> >  
> >
> >
> > > stored="false" multiValued="true"/>
> >
> > With this query
> q=smart%20mobile&qf=keyphrase&debugQuery=on&defType=dismax,
> > solr returns nothing. However, with quote on the search query q="smart
> > mobile"&qf=keyphrase&debugQuery=on&defType=dismax, the result is found.
> >
> > Is it a must to use quote for a single word token field?
> >
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
> >
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Must require quote with single word token query?

2010-11-18 Thread Chamnap Chhorn
Well, this field is a keyphrase. I want to make it a case-insensitive
single-token field. It should match only when the user types exactly the
same text as the data in Solr.

What's wrong with that? Can it be done in another way?
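
The behaviour described in the quoted reply can be pictured with a toy model, which is not Solr code. It assumes the query parser splits unquoted input on whitespace before field analysis, that the field keeps each indexed value as a single lowercased token, and that every query term must match (as the ~2 in the parsed query suggests for two terms).

```javascript
// Keyword-style analysis: the whole value becomes one lowercased token.
function indexToken(value) {
  return value.toLowerCase();
}

// Query side: quoted input stays whole, unquoted input is split on spaces.
function queryTerms(q) {
  var quoted = q.match(/^"(.*)"$/);
  if (quoted) return [quoted[1].toLowerCase()];
  return q.toLowerCase().split(/\s+/).filter(function (t) {
    return t.length > 0;
  });
}

// A document matches only if every query term equals a whole indexed token.
function matches(q, indexedValues) {
  var tokens = indexedValues.map(indexToken);
  return queryTerms(q).every(function (t) {
    return tokens.indexOf(t) !== -1;
  });
}

var indexed = ["Smart Mobile"];
console.log(matches("smart mobile", indexed));   // false: "smart" alone is not a token
console.log(matches('"smart mobile"', indexed)); // true: the phrase is one token
```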

On Thu, Nov 18, 2010 at 6:08 PM, Ahmet Arslan  wrote:

> This happening because query parser pre-tokenizes your query using whites
> paces. It is tokenized before it reaches your query analyzer.
> And you are using KeywordTokenizer in your field definition.
>
> Is there a special reason for you to use KeywordTokenizer ?
>
>
> --- On Thu, 11/18/10, Chamnap Chhorn  wrote:
>
> > From: Chamnap Chhorn 
> > Subject: Re: Must require quote with single word token query?
> > To: solr-user@lucene.apache.org
> > Date: Thursday, November 18, 2010, 5:19 AM
> > Thanks for your reply. Here is some
> > other details:
> >
> > 1. Keyphrase field definition:
> > > type="text_keyword" indexed="true" stored="false"
> > multiValued="true"/>
> >
> > 2. I'm using solr 1.4.
> >
> > 3. My dismax definition is the original configuration after
> > install solr:
> >> class="solr.SearchHandler" >
> > 
> >   > name="defType">dismax
> >   > name="echoParams">explicit
> >   > name="tie">0.01
> >  
> > text^0.5 features^1.0 name^1.2
> > sku^1.5 id^10.0 manu^1.1 cat^1.4
> >  
> >  
> > text^0.2 features^1.1 name^1.5
> > manu^1.4 manu_exact^1.9
> >  
> >  
> > popularity^0.5
> > recip(price,1,1000,1000)^0.3
> >  
> >  
> > id,name,price,score
> >  
> >  
> > 2<-1 5<-2
> > 6<90%
> >  
> >   > name="ps">100
> >   > name="q.alt">*:*
> >  
> >  text
> > features name
> >  
> >   > name="f.name.hl.fragsize">0
> >  
> >   > name="f.name.hl.alternateField">name
> >   > name="f.text.hl.fragmenter">regex 
> > 
> >   
> >
> > 4. Here is the result returned back of the original query:
> > smart mobile.
> >
> > 
> >
> > 
> >
> > 
> >  0
> >
> >  1
> >  
> >
> >   on
> >   uuid,name,fap
> >
> >   true
> >   smart mobile
> >
> >   keyphrase
> >   dismax
> >
> >  
> > 
> > 
> >
> > 
> >
> >  smart mobile
> >
> >  smart mobile
> >   > name="parsedquery">+((DisjunctionMaxQuery((keyphrase:smart))
> > DisjunctionMaxQuery((keyphrase:mobile)))~2) ()
> >
> >   > name="parsedquery_toString">+(((keyphrase:smart)
> > (keyphrase:mobile))~2) ()
> >
> >  
> >  DisMaxQParser
> >
> >  
> >  
> >
> >  
> >   1.0
> >
> >   
> >  > name="time">1.0
> >
> >  > name="org.apache.solr.handler.component.QueryComponent">
> >   > name="time">1.0
> >
> > 
> >  > name="org.apache.solr.handler.component.FacetComponent">
> >
> >   > name="time">0.0
> > 
> >
> >  > name="org.apache.solr.handler.component.MoreLikeThisComponent">
> >   > name="time">0.0
> >
> > 
> >  > name="org.apache.solr.handler.component.HighlightComponent">
> >
> >   > name="time">0.0
> > 
> >
> >  > name="org.apache.solr.handler.component.StatsComponent">
> >   > name="time">0.0
> >
> > 
> >  > name="org.apache.solr.handler.component.DebugComponent">
> >
> >   > name="time">0.0
> >
> > 
> >
> >   
> >   
> >  > name="time">0.0
> >
> >  > name="org.apache.solr.handler.component.QueryComponent">
> >   > name="time">0.0
> >
> > 
> >  > name="org.apache.solr.handler.component.FacetComponent">
> >
> >   > name="time">0.0
> > 
> >
> >  > name="org.apache.solr.handler.component.MoreLikeThisComponent">
> >   > name="time">0.0
> >

Re: Must require quote with single word token query?

2010-11-19 Thread Chamnap Chhorn
Wow, I never knew this syntax before. What's it called?
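
The two query forms from the quoted reply can be assembled as follows. This is only a sketch of URL construction: the base URL is the one used earlier in this thread, and the helper names are made up for illustration.

```javascript
var BASE = "http://localhost:8081/solr/select";

// Inline form: the user input follows the local params directly.
function fieldQueryUrl(field, userInput) {
  var q = "{!field f=" + field + "}" + userInput;
  return BASE + "?q=" + encodeURIComponent(q);
}

// Dereferenced form: the user input travels in its own parameter (qq)
// and the local params refer to it with v=$qq.
function dereferencedUrl(field, userInput) {
  var q = "{!field f=" + field + " v=$qq}";
  return BASE + "?q=" + encodeURIComponent(q) +
         "&qq=" + encodeURIComponent(userInput);
}

var inlineUrl = fieldQueryUrl("keyphrase", "smart mobile");
var derefUrl = dereferencedUrl("keyphrase", "smart mobile");
console.log(inlineUrl);
console.log(derefUrl);
```

The dereferenced form keeps the raw user input out of the local-params string, which can be easier to build safely.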

On 11/19/10, Yonik Seeley  wrote:
> On Tue, Nov 16, 2010 at 10:28 PM, Chamnap Chhorn
>  wrote:
>> I have one question related to single word token with dismax query. In
>> order
>> to be found I need to add the quote around the search query all the time.
>> This is quite hard for me to do since it is part of full text search.
>>
>> Here is my solr query and field type definition (Solr 1.4):
>>    > positionIncrementGap="100">
>>      
>>        
>>        
>>        
>>        > words="stopwords.txt" enablePositionIncrements="true"/>
>>        > ignoreCase="true" expand="false" />
>>        
>>      
>>    
>>
>>    > stored="false" multiValued="true"/>
>>
>> With this query
>> q=smart%20mobile&qf=keyphrase&debugQuery=on&defType=dismax,
>> solr returns nothing. However, with quote on the search query q="smart
>> mobile"&qf=keyphrase&debugQuery=on&defType=dismax, the result is found.
>>
>> Is it a must to use quote for a single word token field?
>
> Yes, you must currently quote tokens if they contain whitespace -
> otherwise the query parser first breaks on whitespace before doing
> analysis on each part separately.
>
> Using dismax is an odd choice if you are only querying on keyphrase though.
> You might look at the field query parser - it is a basic single-field
> single-value parser with no operators (hence no need to escape any
> special characters).
>
> q={!field f=keyphrase}smart%20mobile
>
> or you can decompose it using param dereferencing (sometimes easier to
> construct)
>
> q={!field f=keyphrase v=$qq}&qq=smart%20mobile
>
> -Yonik
> http://www.lucidimagination.com
>

-- 
Sent from my mobile device

Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Must require quote with single word token query?

2010-11-24 Thread Chamnap Chhorn
I've looked at Solr local params. However, I can't figure out how to
integrate them with my full-text search using the dismax handler. Here is my
full-text search request handler.

  

  explicit
  20
  dismax
  name_ngram^20 name^40 postal_code address description
long_description location keyphrase short_description category telephone
email website
  name_ngram
  fap^10
  uuid
  2.2
  on
  0.1


  type:Listing


  false


  spellcheck
  elevateListings

  

Note: postal_code, keyphrase, category, telephone, email, and website have the
field type "text_keyword".

Thanks
On Sat, Nov 20, 2010 at 9:49 AM, Yonik Seeley wrote:

> On Fri, Nov 19, 2010 at 9:41 PM, Chamnap Chhorn 
> wrote:
> > Wow, i never know this syntax before. What's that called?
>
> I dubbed it "local params" since it adds local info to a parameter
> (think extra metadata, like XML attributes on an element).
>
> http://wiki.apache.org/solr/LocalParams
>
> It's used mostly to invoke different query parsers, but it's also used
> to add extra metadata to faceting commands too (and is required for
> stuff like multi-select faceting):
>
>
> http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams
>
>
> -Yonik
> http://www.lucidimagination.com
>
>
>
> > On 11/19/10, Yonik Seeley  wrote:
> >> On Tue, Nov 16, 2010 at 10:28 PM, Chamnap Chhorn
> >>  wrote:
> >>> I have one question related to single word token with dismax query. In
> >>> order
> >>> to be found I need to add the quote around the search query all the
> time.
> >>> This is quite hard for me to do since it is part of full text search.
> >>>
> >>> Here is my solr query and field type definition (Solr 1.4):
> >>> >>> positionIncrementGap="100">
> >>>  
> >>>
> >>>
> >>>
> >>> >>> words="stopwords.txt" enablePositionIncrements="true"/>
> >>> synonyms="synonyms.txt"
> >>> ignoreCase="true" expand="false" />
> >>>
> >>>  
> >>>
> >>>
> >>> >>> stored="false" multiValued="true"/>
> >>>
> >>> With this query
> >>> q=smart%20mobile&qf=keyphrase&debugQuery=on&defType=dismax,
> >>> solr returns nothing. However, with quote on the search query q="smart
> >>> mobile"&qf=keyphrase&debugQuery=on&defType=dismax, the result is found.
> >>>
> >>> Is it a must to use quote for a single word token field?
> >>
> >> Yes, you must currently quote tokens if they contain whitespace -
> >> otherwise the query parser first breaks on whitespace before doing
> >> analysis on each part separately.
> >>
> >> Using dismax is an odd choice if you are only querying on keyphrase
> though.
> >> You might look at the field query parser - it is a basic single-field
> >> single-value parser with no operators (hence no need to escape any
> >> special characters).
> >>
> >> q={!field f=keyphrase}smart%20mobile
> >>
> >> or you can decompose it using param dereferencing (sometimes easier to
> >> construct)
> >>
> >> q={!field f=keyphrase v=$qq}&qq=smart%20mobile
> >>
> >> -Yonik
> >> http://www.lucidimagination.com
> >>
> >
> > --
> > Sent from my mobile device
> >
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
> >
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Must require quote with single word token query?

2011-01-04 Thread Chamnap Chhorn
Sorry for the very late reply.

I can't get it to work with local params using my text_keyword field and a
multi-word query. What I want to achieve is a full-text search in which a
document ranks higher when the query matches its keyphrase.

Are there other ways to work around this?
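
One possible shape for this, following the nested query idea Jonathan
describes below, is to OR a dismax clause with a boosted 'field' clause on
keyphrase. The sketch below only builds the request parameters. The function
name, the "qq" parameter and the boost value are assumptions made for
illustration, and whether the boost on a nested clause behaves as hoped
should be checked with debugQuery=on.

```python
from urllib.parse import urlencode, parse_qs


def boosted_keyphrase_params(user_text, boost=10):
    # The outer query goes through the standard Lucene parser so that the
    # _query_ nested-query syntax is recognised; each clause then selects
    # its own parser through local params.
    q = ('_query_:"{!dismax v=$qq}" OR '
         '_query_:"{!field f=keyphrase v=$qq}"^%d' % boost)
    return {
        "defType": "lucene",  # override the handler's dismax default
        "q": q,
        "qq": user_text,      # hypothetical parameter name
    }


params = boosted_keyphrase_params("smart mobile")
print(urlencode(params))
```

The dismax clause is expected to pick up qf and the other defaults from the
request handler, while the second clause only matches documents whose
keyphrase equals the whole query.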

On Wed, Nov 24, 2010 at 9:13 PM, Jonathan Rochkind  wrote:

> Okay, we need to take a step back and think about what you are trying to
> do.
>
> Reading back in the thread and looking at your schema, you have a
> non-tokenized field whose terms can include whitespace.  There is in fact no
> good way to use that with dismax, dismax doesn't work that way. What you can
> do as Yonik suggests is use the 'field' query parser instead.  You can force
> the use of the 'field' query parser with 'local params', or you can even
> create a combined query with uses 'field' for one clause and 'dismax' for
> another, with nested query syntax.
>
> But every individual part of your query can only use one query parser at a
> time, there's no way to use both at once.
>
> But if you want to use that request handler, but force it to use 'field'
> _instead_, that can be easily done:
>
> &q=multi word query&defType=field&v=field_name
>
> You don't even need 'local params', although you can also do it with 'local
> params': &q={!field v=field_name}
>
> That's it. (except all those values need to be URI encoded).  But it won't
> be using dismax anymore, although it'll be using that request handler you
> have set up to default to dismax, you're telling it to use 'field' this time
> anyway.
>
> If that doesn't do what you want, why don't you take a step back and tell
> us what query behavior you are actually trying to create, and maybe someone
> can give you some ideas for accomplishing it.
>
> 
> From: Chamnap Chhorn [chamnapchh...@gmail.com]
> Sent: Wednesday, November 24, 2010 4:43 AM
> To: yo...@lucidimagination.com
> Cc: solr-user@lucene.apache.org
> Subject: Re: Must require quote with single word token query?
>
> I've looked at solr local params. However, I can't figure out how to
> integrate it with my full text search using dismax handler. Here is my full
> text search request handler.
>
>  
>
>  explicit
>  20
>  dismax
>  name_ngram^20 name^40 postal_code address description
> long_description location keyphrase short_description category telephone
> email website
>  name_ngram
>  fap^10
>  uuid
>  2.2
>  on
>  0.1
>
>
>  type:Listing
>
>
>  false
>
>
>  spellcheck
>  elevateListings
>
>  
>
> Note: postal_code, keyphrase, category, telephone, email, website has field
> type "text_keyword".
>
> Thanks

Multi-word exact keyword case-insensitive search suggestions

2011-01-12 Thread Chamnap Chhorn
Hi all,

I've been stuck on exact keyword matching for several days and hope you can
help. Here are the requirements:

   1. The field must match a multi-word keyword as a whole, case-insensitively
   2. Partial-word or single-word matches against this field are not allowed

I'd like to know the field type definition for this field and a sample Solr
query. I need to combine this search with my full-text search, which uses a
dismax query.
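
To make the two requirements concrete, here is a small model of the behaviour
I'm after. It is plain Python, not Solr code. It assumes an analysis chain
that keeps the whole value as a single token and lowercases it (in Solr terms
something like solr.KeywordTokenizerFactory followed by
solr.LowerCaseFilterFactory), and the function names are invented for the
example.

```python
def keyword_lowercase(value):
    """Model of a keyword tokenizer plus lowercase filter:
    the whole value becomes exactly one lowercased token."""
    return [value.lower()]


def index_keyphrases(values):
    """Collect the tokens of a multi-valued keyphrase field."""
    tokens = set()
    for value in values:
        tokens.update(keyword_lowercase(value))
    return tokens


def matches(indexed_tokens, query):
    """A query matches only if its single token equals an indexed token."""
    return keyword_lowercase(query)[0] in indexed_tokens


indexed = index_keyphrases(["Printing House", "Smart Mobile"])
print(matches(indexed, "PRINTING house"))   # whole phrase, any case
print(matches(indexed, "printing"))         # single word of the phrase
print(matches(indexed, "house printing"))   # reversed word order
```

Only the first call should report a match. A single word of the phrase and
the reversed phrase should not.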

Thanks
-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-13 Thread Chamnap Chhorn
Thanks for your reply. However, it doesn't work for my case at all. I think
the problem is with the query parser or something else: I'm forced to put
double quotes around the search query in order to get any results.

With quotes:

rawquerystring: "sim 010"
querystring: "sim 010"
parsedquery: +DisjunctionMaxQuery((keyphrase:sim 010)) ()
parsedquery_toString: +(keyphrase:sim 010) ()

Without quotes:

rawquerystring: smart mobile
querystring: smart mobile
parsedquery: +((DisjunctionMaxQuery((keyphrase:smart)) DisjunctionMaxQuery((keyphrase:mobile)))~2) ()
parsedquery_toString: +(((keyphrase:smart) (keyphrase:mobile))~2) ()

The intent is a full-text search, part of which searches the keyword field,
so I can't put quotes around the query.
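
As far as I understand the debug output above, the parser breaks the unquoted
query on whitespace before the field's analyzer ever sees it, so the field is
asked for "smart" and "mobile" separately and neither equals the single
indexed token. The following is a rough plain-Python model of that behaviour,
not the actual parser. The function names and sample tokens are made up for
illustration.

```python
def split_like_query_parser(raw):
    """Rough model: a quoted string stays one chunk; otherwise the text
    is broken on whitespace before any field analysis runs."""
    raw = raw.strip()
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        return [raw[1:-1]]
    return raw.split()


def keyphrase_hits(indexed_tokens, raw_query):
    # Each chunk is analysed on its own (lowercased as a single token).
    # Like the ~2 in the parsed query, every chunk is required to match.
    chunks = [chunk.lower() for chunk in split_like_query_parser(raw_query)]
    return bool(chunks) and all(chunk in indexed_tokens for chunk in chunks)


indexed = {"smart mobile", "sim 010"}
print(split_like_query_parser("smart mobile"))
print(keyphrase_hits(indexed, '"smart mobile"'))
print(keyphrase_hits(indexed, "smart mobile"))
```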

On Thu, Jan 13, 2011 at 10:30 PM, Adam Estrada <
estrada.adam.gro...@gmail.com> wrote:

> Hi,
>
> the following seems to work pretty well.
>
> positionIncrementGap="100">
>  
>
>  maxShingleSize="4" outputUnigrams="true"
> outputUnigramIfNoNgram="false" />
>  
>
>
>
> autoGeneratePhraseQueries="true">
>  
>
>
>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>
>  
>  
>
> ignoreCase="true" expand="true"/>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>
>  
>
>
>
>
>
>
>
>
>
> I ingest the source fields as text_ws (I know I've changed it a bit) and
> then copy the field to text. This seems to do what you are asking for.
>
> Adam
>
> On Thu, Jan 13, 2011 at 12:05 AM, Chamnap Chhorn  >wrote:
>
> > Hi all,
> >
> > I'm just stuck with exact keyword for several days. Hope you guys could
> > help
> > me. Here is the scenario:
> >
> >   1. It need to be matched with multi-word keyword and case insensitive
> >   2. Partial word or single word matching with this field is not allowed
> >
> > I want to know the field type definition for this field and sample solr
> > query. I need to combine this search with my full text search which uses
> > dismax query.
> >
> > Thanks
> > --
> > Chhorn Chamnap
> > http://chamnapchhorn.blogspot.com/
> >
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-14 Thread Chamnap Chhorn
Ahh, thanks guys for helping me!

Adam's solution doesn't work for me. Here is my field, field type, and Solr
query:


   
   
   
 




http://localhost:8081/solr/select?q=printing%20house&qf=keyphrase&debugQuery=on&defType=dismax


parsedquery: +((DisjunctionMaxQuery((keyphrase:smart)) DisjunctionMaxQuery((keyphrase:mobile)))~2) ()
parsedquery_toString: +(((keyphrase:smart) (keyphrase:mobile))~2) ()


No results are found.

Erick's solution works for me. However, I can't use a filter query, since
this is part of a full-text search: with fq, only the documents that match
the query exactly would be returned. I want the documents that match exactly
at the top, followed by the documents that match partially.

One problem is that when the user searches for a single word (e.g. "printing"
from the keyword "printing house"), that document is also included in the
search results. The other problem is that if the user searches in reverse
order (e.g. "house printing"), the document is also found.

Cheers

On Sat, Jan 15, 2011 at 3:31 AM, Erick Erickson wrote:

> This might work:
>
> Define your field to use WhitespaceTokenizer and LowerCaseFilterFactory
>
> Use a filter query referencing this field.
>
> If you wanted the words to appear in their exact order, you could just
> define
> the "pf" field in your dismax.
>
> Best
> Erick
>
> On Thu, Jan 13, 2011 at 8:01 PM, Estrada Groups <
> estrada.adam.gro...@gmail.com> wrote:
>
> > Ahhh...the fun of open source software ;-). Requires a ton of trial and
> > error! I found what worked for me and figured it was worth passing it
> along.
> > If you don't mind...when you sort everything out on your end, please post
> > results for the rest of us to take a gander at.
> >
> > Cheers,
> > Adam
> >

Re: Multi-word exact keyword case-insensitive search suggestions

2011-01-17 Thread Chamnap Chhorn
Is there no other way to meet this requirement?

On Sat, Jan 15, 2011 at 10:01 AM, Chamnap Chhorn wrote:

> Ahh, thanks guys for helping me!
>
> For Adam solution, it doesn't work for me. Here is my Field, FieldType, and
> solr query:
>
>  positionIncrementGap="100">
>
>
>
>  maxShingleSize="4" outputUnigrams="true"
> outputUnigramIfNoNgram="false" />
>  
> 
>
>  multiValued="true"/>
>
>
> http://localhost:8081/solr/select?q=printing%20house&qf=keyphrase&debugQuery=on&defType=dismax
>
>
> 
> +((DisjunctionMaxQuery((keyphrase:smart))
> DisjunctionMaxQuery((keyphrase:mobile)))~2) ()
> 
> +(((keyphrase:smart)
> (keyphrase:mobile))~2) ()
>  
>
> The result is not found.
>
> For erick solution, it works for me. However, I can't put filter query,
> since it's part of full text search. If I put fq, it would just return
> documents that match exactly as the query. I want to show those that match
> exactly on the top and the rest for documents that match partially.
>
> The problem is that when the user search a word (eg. "printing" of the
> keyword "printing house"), that document also include in the search results.
> The other problem is that if the user search the reverse order(eg. "house
> printing"), it's also found.
>
> Cheers
>
>
> On Sat, Jan 15, 2011 at 3:31 AM, Erick Erickson 
> wrote:
>
>> This might work:
>>
>> Define your field to use WhitespaceTokenizer and LowerCaseFilterFactory
>>
>> Use a filter query referencing this field.
>>
>> If you wanted the words to appear in their exact order, you could just
>> define
>> the "pf" field in your dismax.
>>
>> Best
>> Erick
>>

q.alt=*:* for every request?

2011-02-07 Thread Chamnap Chhorn
Hi,

I use the dismax handler with Solr 1.4.
Sometimes my requests come with q and fq, and sometimes they come without q
(only fq and q.alt=*:*). Is it OK to send q.alt=*:* with every request? Does
it have any side effects on performance?
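
In client code, what I have in mind looks roughly like the sketch below. As
far as I know, dismax only falls back to q.alt when q is missing or blank, so
the idea is to always send q.alt and add q only when the user typed
something. The function name dismax_params and the sample filter are
assumptions made for illustration.

```python
from urllib.parse import urlencode


def dismax_params(user_query=None, filters=()):
    """Always send q.alt; send q only when there is real user input."""
    params = [("defType", "dismax"), ("q.alt", "*:*")]
    if user_query and user_query.strip():
        params.append(("q", user_query.strip()))
    for fq in filters:
        params.append(("fq", fq))
    return params


with_q = dismax_params("printing house", ["type:Listing"])
without_q = dismax_params("", ["type:Listing"])
print(urlencode(with_q))
print(urlencode(without_q))
```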

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: problem when search grouping word

2011-02-24 Thread Chamnap Chhorn
There are many product names. How could I list them all, especially as the
list is growing fast?

On Thu, Feb 24, 2011 at 5:25 PM, Grijesh  wrote:

>
> may synonym will help
>
> -
> Thanx:
> Grijesh
> http://lucidimagination.com
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/problem-when-search-grouping-word-tp2566499p2566550.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: problem when search grouping word

2011-02-25 Thread Chamnap Chhorn
Any idea?

On Thu, Feb 24, 2011 at 6:49 PM, Chamnap Chhorn wrote:

> There are many product names. How could I list them all, and the list is
> growing fast as well?
>
>
> On Thu, Feb 24, 2011 at 5:25 PM, Grijesh  wrote:
>
>>
>> may synonym will help
>>
>> -
>> Thanx:
>> Grijesh
>> http://lucidimagination.com
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/problem-when-search-grouping-word-tp2566499p2566550.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>
>
> --
> Chhorn Chamnap
> http://chamnapchhorn.blogspot.com/
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: Changing the schema

2011-05-13 Thread Chamnap Chhorn
I wonder: if I add a new field to the schema, do I have to reindex?

If there is no need to reindex, can I just update schema.xml directly? After
that, should I restart the Tomcat service?

And if there is no need to reindex, what about the existing documents? If I
run a query on the new field, will it cause errors?

On Fri, May 13, 2011 at 1:38 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Brian,
>
> Yes, you do need to reindex.  We've used Hadoop with Solr to speed up
> indexing
> by orders of magnitude for some of our customers.  Something to consider.
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Brian Lamb 
> > To: solr-user@lucene.apache.org
> > Sent: Thu, May 12, 2011 11:53:27 AM
> > Subject: Changing the schema
> >
> > If I change the field type in my schema, do I need to rebuild the  entire
> > index? I'm at a point now where it takes over a day to do a full  import
> due
> > to the sheer size of my application and I would prefer not having  to
> reindex
> > just because I want to make a change  somewhere.
> >
> > Thanks,
> >
> > Brian Lamb
> >
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: how often do you boys restart your tomcat?

2011-07-26 Thread Chamnap Chhorn
I often restart the Tomcat service before its memory usage reaches the OS
limit. Usually it uses only 4 GB, but eventually it grows to 11 GB.

On Wed, Jul 27, 2011 at 8:42 AM, Bing Yu  wrote:

> I find that, if I do not restart the master's tomcat for some days,
> the load average will keep rising to a high level, solr become slow
> and unstable, so I add a crontab to restart the tomcat everyday.
>
> do you boys restart your tomcat ? and is there any way to avoid restart
> tomcat?
>



-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/


Re: How to improve this solr query?

2012-07-02 Thread Chamnap Chhorn
Hi Michael,

Thanks for the quick response. According to the documentation,
"facet.mincount" means that Solr returns only the facet values whose count is
at least that number. I just want to make sure that no facet values with a
zero count are returned.

I tried increasing it to 10, but it is still slow, even for the same query.

Actually, those 13 million documents are divided among 200 portals. I
already include "fq=portal_uuid: kjkjkjk" inside each nested query, but
it's still slow.
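
One variation I could try is sending the portal restriction as a normal
top-level fq parameter instead of inside each nested query's local params,
since as far as I can tell filter queries are applied (and cached) per
request rather than per nested clause. The sketch below only builds the
parameters. The function name, the qf1/qf2 parameter names and the field
lists are hypothetical stand-ins for my real ones.

```python
from urllib.parse import urlencode, parse_qs


def portal_search_params(q1, q2, portal_uuid):
    # Two nested dismax clauses, each reading its field list and its
    # query text from separate request parameters.
    q = ('_query_:"{!dismax qf=$qf1 v=$q1}" OR '
         '_query_:"{!dismax qf=$qf2 v=$q2}"')
    return {
        "q": q,
        "q1": q1,
        "q2": q2,
        "qf1": "name description",      # hypothetical non-string fields
        "qf2": "keyphrase category",    # hypothetical string fields
        "fq": "portal_uuid:%s" % portal_uuid,  # one top-level filter
    }


params = portal_search_params("apartment", '"apartment"', "kjkjkjk")
print(urlencode(params))
```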

On Mon, Jul 2, 2012 at 11:47 PM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:

> Hi Chamnap,
>
> The first thing that jumped out at me was "facet.mincount=1". Are you
> sure you need this? Increasing this number should drastically improve
> speed.
>
> Michael Della Bitta
>
> 
> Appinions, Inc. -- Where Influence Isn’t a Game.
> http://www.appinions.com
>
>
> On Mon, Jul 2, 2012 at 12:35 PM, Chamnap Chhorn 
> wrote:
> > Hi all,
> >
> > I'm using solr 3.5 with nested query on the 4 core cpu server + 17 Gb.
> The
> > problem is that my query is so slow; the average response time is 12 secs
> > against 13 millions documents.
> >
> > What I am doing is to send quoted string (q2) to string fields and
> > non-quoted string (q1) to other fields and combine the result together.
> >
> >
> facet=true&sort=score+desc&q2=*"apartment"*&facet.mincount=1&q1=*apartment*
> >
> &tie=0.1&q.alt=*:*&wt=json&version=2.2&rows=20&fl=uuid&facet.query=has_map:+true&facet.query=has_image:+true&facet.query=has_website:+true&start=0&q=
> > *
> >
> _query_:+"{!dismax+qf='.'+fq='..'+v=$q1}"+OR+_query_:+"{!dismax+qf='..'+fq='...'+v=$q2}"
> > *
> >
> &facet.field={!ex%3Ddt}sub_category_uuids&facet.field={!ex%3Ddt}location_uuid
> >
> > I have done solr optimize already, but it's still slow. Any idea how to
> > improve the speed? Am I done anything wrong?
> >
> > --
> > Chhorn Chamnap
> > http://chamnap.github.com/
>



-- 
Chhorn Chamnap
http://chamnap.github.com/


Re: How to improve this solr query?

2012-07-02 Thread Chamnap Chhorn
Hi Lance,

I didn't use wildcards at all. This is just a normal text search. I need a
string field because the value must be matched exactly, and since the value
is sometimes multi-word, quoting it is necessary.

By the way, even a very plain query takes at least 600 ms, and I'm not sure
why. On another Solr instance with a similar amount of data, it takes only
50 ms.

I see something strange in the response. It always contains:

build

What does that mean?

On Tue, Jul 3, 2012 at 10:02 AM, Lance Norskog  wrote:

> Wildcards are slow. Leading wildcards are even more slow. Is there
> some way to search that data differently? If it is a string, can you
> change it to a text field and make sure 'apartment' is a separate
> word?
>
> --
> Lance Norskog
> goks...@gmail.com
>



-- 
Chhorn Chamnap
http://chamnap.github.com/


Re: How to improve this solr query?

2012-07-02 Thread Chamnap Chhorn
Lance, I didn't use wildcards at all. I use only the following; the only
difference is whether the value is quoted.

q2=*"apartment"*
q1=*apartment*

On Tue, Jul 3, 2012 at 12:06 PM, Lance Norskog  wrote:

> &q2=*"apartment"*
> q1=*apartment*
>
> These are wildcards
>
> --
> Lance Norskog
> goks...@gmail.com
>



-- 
Chhorn Chamnap
http://chamnap.github.com/


Re: How to improve this solr query?

2012-07-03 Thread Chamnap Chhorn
Hi Erick and Michael,

I didn't type any asterisks. Sorry for the confusion. What I actually typed
are dots: I used them as placeholders because the real parameters contain
quite a lot of fields.

I'm doing it this way because I have both string fields and non-string
fields. The idea is to send the quoted value to the string fields and the
non-quoted value to the non-string fields. I have to do that in order to
match the string fields.

I have tried using pf, but it doesn't match the string field at all. Do you
have any good resources on how to use pf? I looked into several recent Solr
books, but they say very little about it.
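
For the record, this is roughly how I understand Erick's pf suggestion. The
sketch only builds the request parameters. The function name and the field
names and boosts are hypothetical, not my real configuration. Note that pf
only changes the ranking of documents that qf has already matched; it does
not add matches of its own.

```python
from urllib.parse import urlencode, parse_qs


def edismax_params(user_text):
    return {
        "defType": "edismax",
        "q": user_text,
        # qf decides which documents match at all.
        "qf": "name^40 description",     # hypothetical fields and boosts
        # pf adds a score boost when the whole query appears as a phrase.
        "pf": "name^50 description^10",  # hypothetical fields and boosts
    }


params = edismax_params("printing house")
print(urlencode(params))
```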

On Wed, Jul 4, 2012 at 3:51 AM, Erick Erickson wrote:

> Chamnap:
>
> I've seen various e-mail programs put the asterisk in for terms that
> are in bold face.
>
> The queries you pasted have lots of "*" characters in it, I suspect
> that they were just
> things you put in bold in your original, that may be the source of the
> confusion about
> whether you were using wildcards.
>
> But on to your question. If your q1 and q2 are the same words,
> wouldn't it just work to
> specify the "pf" (phrase field) parameter for edismax? That
> automatically takes the terms
> in the query and turns it into a phrase query that's boosted higher.
>
> And what's the use-case here? I think you might be making this more
> complex than
> it needs to be
>
> Best
> Erick
>
> On Tue, Jul 3, 2012 at 8:41 AM, Michael Della Bitta
>  wrote:
> > Chamnap,
> >
> > I have a hunch you can get away with not using *s.
> >
> > Michael Della Bitta
> >
> > --------
> > Appinions, Inc. -- Where Influence Isn’t a Game.
> > http://www.appinions.com
> >
> >
Re: How to improve this solr query?

2012-07-04 Thread Chamnap Chhorn
Hi Amit,

Thanks for your response.
1. Sometimes I have seen Solr not sort by score desc, so I added it
explicitly. I will have to check that again.
2. Both q1 and q2 do the searching, just on different fields. The string
fields must match exactly, so Solr needs the query to be quoted for them.
I combined the two with a nested query using the OR operator.
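
For reference, the q1/q2 setup described in point 2 can be sketched roughly
as follows in Python. This is only an illustration: the helper name and the
field names (name, description, category) are made up, because the real qf
and fq values are elided in the original query, and the fq part is left out
for the same reason.

```python
from urllib.parse import urlencode


def build_nested_query_params(term, text_fields, string_fields):
    """Build request params that OR a non-quoted and a quoted dismax query.

    Hypothetical helper: the field lists are placeholders, not the real schema.
    """
    # Escape embedded double quotes so the quoted variant stays one phrase.
    phrase = '"%s"' % term.replace('"', '\\"')

    def sub_query(fields, param_name):
        # v=$name makes Solr read the query text from another request parameter.
        return '_query_:"{!dismax qf=\'%s\' v=$%s}"' % (" ".join(fields), param_name)

    return {
        "q": sub_query(text_fields, "q1") + " OR " + sub_query(string_fields, "q2"),
        "q1": term,    # non-quoted, sent to the analyzed text fields
        "q2": phrase,  # quoted, sent to the exact-match string fields
        "rows": 20,
        "fl": "uuid",
        "wt": "json",
    }


params = build_nested_query_params("apartment", ["name", "description"], ["category"])
query_string = urlencode(params)  # ready to append to the /select URL
```

Keeping the user input in q1 and q2 and referencing it with v=$q1 / v=$q2
means the input never has to be spliced into the local-params string itself.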

I'll look into the bf, pf, and bq parameters more.

Thanks for the advice. :)

On Wed, Jul 4, 2012 at 2:28 PM, Amit Nithian  wrote:

> Couple questions:
> 1) Why are you explicitly telling solr to sort by score desc,
> shouldn't it do that for you? Could this be a source of performance
> problems since sorting requires the loading of the field caches?
> 2) Of the query parameters, q1 and q2, which one is actually doing
> "text" searching on your index? It looks like q1 is doing non-string
> related stuff, could this be better handled in either the bf or bq
> section of the edismax config? Looking at the sample though I don't
> understand how q1=apartment would hit non-string fields again (but see
> #3)
> 3) Are the "string" fields literally of string type (i.e. no analysis
> on the field), or are you saying string loosely to mean "text" field?
> pf ==> phrase fields ==> given a multiple word query, will ensure that
> the specified phrase exists in the specified fields separated by some
> slop ("hello my world" may match "hello world" depending on this slop
> value). The "qf" means that given a multi term query, each term exists
> in the specified fields (name, description whatever text fields you
> want).
>
> Best
> Amit
>
> On Mon, Jul 2, 2012 at 9:35 AM, Chamnap Chhorn 
> wrote:
> > Hi all,
> >
> > I'm using Solr 3.5 with a nested query on a 4-core CPU server with
> > 17 GB. The problem is that my query is very slow; the average response
> > time is 12 secs against 13 million documents.
> >
> > What I am doing is sending a quoted string (q2) to the string fields
> > and a non-quoted string (q1) to the other fields, then combining the
> > results.
> >
> >
> > facet=true&sort=score+desc&q2=*"apartment"*&facet.mincount=1&q1=*apartment*
> > &tie=0.1&q.alt=*:*&wt=json&version=2.2&rows=20&fl=uuid&facet.query=has_map:+true&facet.query=has_image:+true&facet.query=has_website:+true&start=0&q=
> > *_query_:+"{!dismax+qf='.'+fq='..'+v=$q1}"+OR+_query_:+"{!dismax+qf='..'+fq='...'+v=$q2}"*
> > &facet.field={!ex%3Ddt}sub_category_uuids&facet.field={!ex%3Ddt}location_uuid
> >
> > I have already run a Solr optimize, but it's still slow. Any idea how
> > to improve the speed? Have I done anything wrong?
> >
> > --
> > Chhorn Chamnap
> > http://chamnap.github.com/
>



-- 
Chhorn Chamnap
http://chamnap.github.com/


solr facet fields doesn't honor fq

2012-07-07 Thread Chamnap Chhorn
Hi all,

I have a question related to solr 3.5 on field facet. Here is my query:

http://localhost:8081/solr_new/select?tie=0.1&q.alt=*:*&q=bank&qf=nameaddress&fq=
*portal_uuid:+A4E7890F-A188-4663-89EB-176D94DF6774*&defType=dismax&*
facet=true*&facet.field=*location_uuid*&facet.field=*sub_category_uuids*

What I get back in the field facets is:
1. Some location_uuids that are in the current portal_uuid (facet count > 0)
2. Some location_uuids that are not in the current portal_uuid at all
(facet count = 0)

It seems that Solr doesn't honor the fq at all when returning field facets.
I need to add one more parameter, "facet.mincount=1", so that the
location_uuids in (2) are not returned.

I think Solr is faceting on all location_uuid values. It should scope the
faceting to the current portal_uuid. Any ideas?
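
A note on the behaviour above, with a small Python sketch. As far as I know,
facet.mincount defaults to 0, so values that exist in the field but match no
document in the filtered result are still listed, with a count of 0. Sending
facet.mincount=1 drops them on the server side. The same filtering can also
be done on the client: with wt=json, each facet field comes back by default
as a flat list alternating value and count. The sample values below are made
up for illustration and are not real output.

```python
def nonzero_facets(flat_facet_list):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict,
    keeping only the values whose count is greater than zero."""
    values = flat_facet_list[0::2]  # even positions hold the facet values
    counts = flat_facet_list[1::2]  # odd positions hold the matching counts
    return {value: count for value, count in zip(values, counts) if count > 0}


# Hypothetical facet_fields entry for location_uuid (illustrative values only).
sample = ["loc-a", 12, "loc-b", 0, "loc-c", 3]
in_portal = nonzero_facets(sample)
```

Letting Solr do the filtering with facet.mincount=1 is usually the simpler
option, since it also keeps the response smaller.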

-- 
Chhorn Chamnap
http://chamnap.github.com/


partial word searching

2010-04-21 Thread Chamnap Chhorn
Hi everyone,


I'm quite new to Solr 1.4. I have a requirement to support searching by
partial words ("sun hot" => "Sunway Hotel") as well as by full words
("sunway hotel" => "Sunway Hotel"). Currently, I can only search by full
words. Does anyone have any suggestions?
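
To make the requirement concrete, here is a small Python sketch of one common
way to get this behaviour: indexing every leading prefix of each word (edge
n-grams). Solr has an edge n-gram token filter for this purpose, as far as I
know; the code below is not Solr code, only an illustration of the matching
idea, and the function names and the minimum prefix length of 2 are
assumptions.

```python
def edge_ngrams(token, min_len=2):
    """Return every leading prefix of token that has at least min_len chars."""
    return {token[:n] for n in range(min_len, len(token) + 1)}


def index_terms(text, min_len=2):
    """Lowercase the text, split on whitespace, and collect all word prefixes."""
    terms = set()
    for token in text.lower().split():
        terms |= edge_ngrams(token, min_len)
    return terms


def matches(query, text, min_len=2):
    """True if every query word is a full word or a leading prefix of a word
    in the text (query words shorter than min_len never match)."""
    terms = index_terms(text, min_len)
    return all(token in terms for token in query.lower().split())


partial_hit = matches("sun hot", "Sunway Hotel")    # prefixes of both words
full_hit = matches("sunway hotel", "Sunway Hotel")  # full words also match
```

Because only leading prefixes are indexed, a query such as "way" would not
match "Sunway Hotel"; matching in the middle of a word would need full
n-grams instead, at the cost of a larger index.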

-- 
Chhorn Chamnap
http://chamnapchhorn.blogspot.com/