URL Redirect

2011-10-06 Thread Finotti Simone
Hello,

I have been assigned the task of migrating from Endeca to Solr.

The former engine allowed me to set keyword triggers that, when matched 
exactly, caused the web client to redirect to a specified URL.

Does that feature exist in Solr? If so, where can I get some info?

Thank you

Re: URL Redirect

2011-10-07 Thread Finotti Simone
Ok,
so I installed Tuckey's UrlRewriteFilter on Tomcat 6.0. I have the following configuration:

/solr-p <- the Solr configuration
/solr  <- the Tuckey configuration

I want to redirect requests such as
http://localhost:8080/solr/select/?q=Somename&...
to a different address, and this is promptly done with the rule:




  name: RuleAlessi
  from: ^/select/\?q=[Aa]lessi&.*$
  to:   /design/searchresult.asp/ene_m/4294950939/dept/design/


I have many of these rules and they work. If none of these rules match, I
want to *forward* a request such as
http://localhost:8080/solr/select/?q=NonMatchedQuery&...
to
http://localhost:8080/solr-p/select/?q=NonMatchedQuery&...

It works with the following rule:


  name: LastRule
  from: /(.*)
  to:   /solr-p/$1


but the browser's address bar then shows the 'solr-p' path, which is
not desired. If I change the rule to


  name: LastRule
  from: /(.*)
  to:   /$1


it seems that the forwarding is done correctly, but the response to
http://localhost:8080/solr/admin/ is

HTTP Status 400 - Missing solr core name in path

while if I access http://localhost:8080/solr-p/admin/ it works as expected.

According to Tomcat's log files, it seems that the forwarding is done
correctly.

Please, could anybody explain to me what's going on there?

Thanks


From: Ranveer Kumar [ranveer.s...@gmail.com]
Sent: Thursday, 6 October 2011 10:21
To: solr-user@lucene.apache.org
Subject: Re: URL Redirect

Tuckey can also help you if you are using Java.


Re: URL Redirect

2011-10-10 Thread Finotti Simone
Hi,
for those who may be interested, I resolved it (with a little help from the
urlrewrite user group :-) ) by using a type="proxy" rule.

S
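The routing this thread ends up with (redirect on a keyword hit, otherwise pass the request through to the real Solr webapp) can be sketched in Python. This is not UrlRewriteFilter itself: `route`, `REDIRECT_RULES`, and `BACKEND_PREFIX` are hypothetical names, and the single rule simply reuses the pattern and target quoted in the thread.

```python
import re

# Hypothetical rule table: (pattern matched against path + query, redirect target).
# The one entry reuses the "RuleAlessi" pattern and target from the thread.
REDIRECT_RULES = [
    (
        re.compile(r"^/select/\?q=[Aa]lessi&.*$"),
        "/design/searchresult.asp/ene_m/4294950939/dept/design/",
    ),
]

# Context path of the real Solr webapp in the thread.
BACKEND_PREFIX = "/solr-p"


def route(path_and_query):
    """Return ("redirect", url) on a keyword hit, else ("proxy", backend_path)."""
    for pattern, target in REDIRECT_RULES:
        if pattern.match(path_and_query):
            return ("redirect", target)
    # No keyword rule matched: hand the request to the backend unchanged.
    return ("proxy", BACKEND_PREFIX + path_and_query)


print(route("/select/?q=Alessi&rows=10"))
print(route("/select/?q=NonMatchedQuery&rows=10"))
```

The sketch only decides what should happen to a request. With a redirect the browser is told to go to the new URL; with a proxy the front webapp would fetch the backend response itself, so the backend path need not appear in the address bar.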

Solr response writer

2011-12-07 Thread Finotti Simone
Hello,
I need to change the HTTP status code of the query response if some conditions
are met.

Having analyzed the execution flow of a Solr query, it seems to me that the
best place to do this is the QueryResponseWriter. However, I haven't found a
way to change the HTTP response status (I need to set 307 instead of 200), so I
wonder whether this is possible at all with the plugin mechanism that Solr
(v3.4) currently provides.

Any insight would be greatly appreciated :-)

Thanks
S

Re: Solr response writer

2011-12-07 Thread Finotti Simone
That's the scenario:
I have an XML file that maps words W to URLs; when my web client issues a
search request, a query is sent to my Solr application. If, after stemming,
the query matches any word in W, the client must be redirected to the associated URL.

I agree that it should be handled outside, but we are currently in the process of
migrating from Endeca, which has a feature that allows this scenario. For this
reason, my boss asked whether it was somehow possible to keep that functionality in
the search engine.

thanks again


From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: Wednesday, 7 December 2011 14:12
To: solr-user@lucene.apache.org
Subject: Re: Solr response writer

First, could you tell us more about your use case?   Why do you want to change 
the response code?   HTTP 307 = Temporary redirect - where are you going to 
redirect?  Sounds like something best handled outside of Solr.

If you went down the route of creating your own custom response writer, then
you'd be locked into a single format (XML, or JSON, or whichever one you
subclassed).


Re: Solr response writer

2011-12-07 Thread Finotti Simone
I got your and Michael's point. Indeed, I'm not very skilled in web development,
so there may be something that I'm missing. Anyway, Endeca does something like
this:

1. accepts a query;
2. does the stemming;
3. checks whether the result of step 2 matches one of the redirectable words. If
so, it returns a URL; otherwise it returns the regular matching documents (our
product descriptions).

Do you think that in Solr I will be able to replicate this behaviour without
writing a custom plugin (request handler, response writer, etc.)? Maybe I'm a
little dense, but I fail to see how it would be possible...

S


From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: Wednesday, 7 December 2011 14:40
To: solr-user@lucene.apache.org
Subject: Re: Solr response writer

Either way (Endeca's 307, which seems crazy to me, or simply plucking a
"url" field off the first document returned by a search request), you're
getting a URL back to your client and then using that URL to send something
back to a user's browser, I presume. I personally wouldn't implement it with a
custom response writer; just get the URL from the standard Solr response.

    Erik

Re: Solr response writer

2011-12-07 Thread Finotti Simone
No, actually it's a .NET web service that queries Endeca (call it Wrapper). It
returns a collection of unique product IDs to its clients, and the client then
asks other web services for more detailed information about the given products.
As long as no URL redirection is involved, I think that SolrNet (
http://code.google.com/p/solrnet/ ) is good enough to let our Wrapper connect
to Solr, thus shielding the client from changes in the underlying search engine.

The Endeca C# API also returns a 'RedirectionUrl' property in one of its objects,
which is set to a URL if the text search matches a redirection rule; in this
case the Wrapper passes it down to its client (my fault here: I thought there
was some sort of redirection through the HTTP status code, but that's not the case).

The point is: since Solr doesn't have this feature, my only option is to
implement it in the "wrapping" web service itself, but to make it work correctly
I need access to how the search engine analyzes the words. AFAICS,
Solr only returns documents matching the request, so I'm missing something :-(

S

From: Michael Kuhlmann [k...@solarier.de]
Sent: Wednesday, 7 December 2011 15:29
To: solr-user@lucene.apache.org
Subject: Re: R: Solr response writer

Endeca is not only a search engine, it's part of a web application. You
can send a query to the Endeca engine and send the response directly to
the user; it's already fully rendered. (At least when you configure it
this way.)

Solr can't do this in any way. Solr responses are always pure technical
data, not meant to be delivered to an end user. An exception to this is
the VelocityResponseWriter, which can fill a web template.

Anything beyond the possibilities of the VelocityResponseWriter must be
handled by some web application that analyzes Solr's responses.

How do you want to display your product descriptions, the default case?
I don't think you want to show some XML data.

Solr is a great search engine, but not more. It's just a small subset of
commercial search frameworks like Endeca. Therefore, you can't simply
replace it; you'll need some web application.

However, you don't need a custom response writer in this case, nor do
you have to extend Solr in any way. At least not for this requirement.

-Kuli






Re: Solr response writer

2011-12-07 Thread Finotti Simone

Thank you Erik, I will work on your suggestion! It seems it could work,
provided I can boost matches on the "redirect" document type.

S

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: Wednesday, 7 December 2011 16:56
To: solr-user@lucene.apache.org
Subject: Re: Solr response writer

What you can do is index the "redirect" documents along with the associated 
words, and let Solr do the stemming.   Maybe add a "document type" field and if 
you get a match on a redirect document type, your web service can do what it 
needs to do from there.

Erik
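A minimal sketch of how the wrapping web service could act on Erik's suggestion, written in Python for brevity (the Wrapper in the thread is .NET). It assumes the Solr response has already been parsed from JSON into a dict, and that documents carry hypothetical `doc_type`, `url`, and `id` fields; none of these field names, nor the sample URL, come from the thread.

```python
def resolve(solr_response):
    """Return a redirect URL if a "redirect" document matched, else product IDs."""
    docs = solr_response.get("response", {}).get("docs", [])
    for doc in docs:
        # Assumed convention: redirect documents are tagged doc_type == "redirect".
        if doc.get("doc_type") == "redirect":
            return {"redirect_url": doc["url"], "product_ids": []}
    # Regular case: pass the matching product IDs on to the client.
    return {"redirect_url": None, "product_ids": [d["id"] for d in docs if "id" in d]}


# Made-up response with one redirect document and one product document.
sample = {
    "response": {
        "numFound": 2,
        "docs": [
            {"id": "r1", "doc_type": "redirect", "url": "/design/alessi/"},
            {"id": "p42", "doc_type": "product"},
        ],
    }
}
print(resolve(sample))
```

Scanning every returned document, rather than only the first, means the sketch does not depend on redirect documents being boosted to the top; whether that is the desired behaviour is a design choice for the Wrapper.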



Large RDBMS dataset

2011-12-14 Thread Finotti Simone
Hello,
I have a very large dataset (> 1 million records) in an RDBMS, which I want my
Solr application to pull data from.

The problem is that the document fields I have to index aren't in the same
table: I have to join records with two other tables. Well, in fact they are
views, but I don't think that this makes any difference.

This is the data import handler configuration that I've currently written:



  
  

  
  
  

  


It works, but it takes 1'38" to process 100 records, that is about 1 record/s! At
that rate, digesting the whole dataset would take about 1 Ms (roughly 12 days).
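The estimate above can be checked with a few lines of Python, assuming the measured rate (100 records in 1'38", i.e. 98 seconds) stays constant and taking exactly one million records as a lower bound for the dataset.

```python
SECONDS_PER_DAY = 24 * 60 * 60

sample_records = 100
sample_seconds = 1 * 60 + 38   # 1'38" measured for the sample
total_records = 1_000_000      # lower bound: the dataset has > 1 M records

seconds_per_record = sample_seconds / sample_records
total_seconds = total_records * seconds_per_record
total_days = total_seconds / SECONDS_PER_DAY

print(f"{seconds_per_record:.2f} s/record, {total_seconds:.0f} s, {total_days:.1f} days")
```

With these inputs the total comes to a little over 11 days, in line with the rounded figure quoted in the message.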

The problem is that for each record in "fd", Solr makes three distinct SELECTs
on the other three tables. Of course, this is very inefficient.

Is there a way to have Solr load every record from the four tables and join
them once they are already in memory?

TIA


Re: Large RDBMS dataset

2011-12-15 Thread Finotti Simone
Thank you (and all the others who spent time answering me) very much for your
insights!

I don't know how I managed to miss CachedSqlEntityProcessor, but it seems
to be just what I need.

bye


From: Gora Mohanty [g...@mimirtech.com]
Sent: Wednesday, 14 December 2011 16:39
To: solr-user@lucene.apache.org
Subject: Re: Large RDBMS dataset

On Wed, Dec 14, 2011 at 3:48 PM, Finotti Simone  wrote:
> Hello,
> I have a very large dataset (> 1 Mrecords) on the RDBMS which I want my Solr 
> application to pull data from.
[...]

> It works, but it takes 1'38" to parse 100 records: it means 1 rec/s! That 
> means that digesting the whole dataset would take 1 Ms (=> 12 days).

Depending on the size of the data that you are pulling from
the database, 1M records is not really that large a number.
We were doing ~75GB of stored data from ~7million records
in about 9h, including quite complicated transformers. I would
imagine that there is much room for improvement in your case
also. Some notes on this:
* If you have servers to throw at the problem, and a sensible
  way to shard your RDBMS data, use parallel indexing to
  multiple Solr cores, maybe on multiple servers, followed by
  a merge. In our experience, given enough RAM and adequate
  provisioning of database servers, indexing speed scales linearly
  with the total no. of cores.
* Replicate your database, manually if needed. Look at the load
  on a database server during the indexing process, and provision
  enough database servers to match the no. of Solr indexing servers.
* This point is leading into flamewar territory, but consider switching
  databases. From our (admittedly non-rigorous) measurements,
  MySQL was at least a factor of 2-3 faster than MS-SQL, with the
  same dataset.
* Look at cloud-computing. If finances permit, one should be able
  to shrink indexing times to almost any desired level. E.g., for the
  dataset that we used, I have little doubt that we could have shrunk
  the time down to less than 1h, at an affordable cost on Amazon EC2.
  Unfortunately, we have not yet had the opportunity to try this.

> The problem is that for each record in "fd", Solr makes three distinct SELECT 
> on the other three tables. Of course, this is absolutely inefficient.
>
> Is there a way to have Solr loading every record in the four tables and join 
> them when they are already loaded in memory?

For various reasons, we did not investigate this in depth,
but you could also look at Solr's CachedSqlEntityProcessor.

Regards,
Gora
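The idea behind caching the child tables, as opposed to issuing one SELECT per parent row, can be illustrated with plain Python dictionaries. This is only a sketch of the general technique, not of how CachedSqlEntityProcessor is implemented; the table contents, the `colors` field, and the `fd_id` join key are made up for the example.

```python
from collections import defaultdict

# Made-up rows standing in for the parent table "fd" and one joined view.
fd_rows = [{"id": 1, "name": "shoe"}, {"id": 2, "name": "bag"}]
color_rows = [
    {"fd_id": 1, "color": "black"},
    {"fd_id": 1, "color": "red"},
    {"fd_id": 2, "color": "brown"},
]


def build_cache(rows, key):
    """Read the child rows once and group them by join key."""
    cache = defaultdict(list)
    for row in rows:
        cache[row[key]].append(row)
    return cache


def join_in_memory(parents, children, key):
    """Attach child rows to each parent with a dict lookup instead of a query."""
    cache = build_cache(children, key)
    documents = []
    for parent in parents:
        doc = dict(parent)
        doc["colors"] = [child["color"] for child in cache.get(parent["id"], [])]
        documents.append(doc)
    return documents


for document in join_in_memory(fd_rows, color_rows, "fd_id"):
    print(document)
```

Each child table is read once, and every parent row then costs a dictionary lookup rather than a round trip to the database. The trade-off is that the child rows must fit in memory.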






boosting

2012-01-11 Thread Finotti Simone
Hello ML,
I wonder if it is possible to define a boost for certain fields in the schema.xml
configuration. So far, I have found ways to define a boost while indexing and
while querying, so I suspect the straight answer is no. Anyway, I'd like
confirmation, if possible.

Thank you in advance
S

Sorting on non-stored field

2012-03-14 Thread Finotti Simone
I was wondering: is it possible to sort a Solr result-set on a non-stored value?

Thank you

Spellchecker problem

2012-03-16 Thread Finotti Simone
Hello,
I have a configuration where a single master builds the Solr index and
replicates it to two slave Solr instances. Regular queries are sent only to those
two slaves. The configurations are the same on all instances (except for the
replication section, of course).

My problem: for a particular query, I expected the spellchecker
to give me a suggestion, but only one of the two instances answers as
I expected! I checked the data directory and discovered that the failing
instance has an almost empty data/spellchecker directory (12 KB against 7 MB on
the working instance). I don't understand this behaviour.

I tried to issue a spellcheck.build=true command, and this is what I got:


Problem accessing /solr/yoox_slave/select. Reason:

org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\sqladmin\LucidImagination\LucidWorksEnterprise\data\solr\cores\yoox_slave_1\spellchecker\write.lock

java.lang.RuntimeException: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\sqladmin\LucidImagination\LucidWorksEnterprise\data\solr\cores\yoox_slave_1\spellchecker\write.lock
    at org.apache.solr.spelling.IndexBasedSpellChecker.build(IndexBasedSpellChecker.java:92)
    at org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:110)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1406)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
    at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:129)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:59)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:122)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:110)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\sqladmin\LucidImagination\LucidWorksEnterprise\data\solr\cores\yoox_slave_1\spellchecker\write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:84)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:840)
    at org.apache.lucene.search.spell.SpellChecker.clearIndex(SpellChecker.java:470)
    at org.apache.solr.spelling.IndexBasedSpellChecker.build(IndexBasedSpellChecker.java:88)
    ... 27 more


Has anybody faced a similar problem? Can you point me to a solution?


Thank you in advance


Skip first word

2012-07-25 Thread Finotti Simone
Hi

is there a tokenizer and/or a combination of filters that removes the first term
from a field?

For example:
The quick brown fox

should be tokenized as:
quick
brown
fox

thank you in advance
S


Re: Skip first word

2012-07-26 Thread Finotti Simone
Hi Ahmet,
business asked me to apply EdgeNGram with minGramSize=1 on the first term and
with minGramSize=3 on the following terms.

We are developing a search suggestion mechanism. The idea is that if the user
types "D", the engine should suggest "Dolce & Gabbana", but if they type "G", it
should suggest other brands. Only if the user types "Gab" should it suggest "Dolce
& Gabbana".

Thanks
S

From: Ahmet Arslan [iori...@yahoo.com]
Sent: Wednesday, 25 July 2012 18:10
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

> is there a tokenizer and/or a combination of filter to
> remove the first term from a field?
>
> For example:
> The quick brown fox
>
> should be tokenized as:
> quick
> brown
> fox

There is no such filter that I know of. However, you can implement one by
modifying the source code of LengthFilterFactory or StopFilterFactory. They both
remove tokens. Out of curiosity, what is the use case for this?






Re: Skip first word

2012-07-27 Thread Finotti Simone
Hi Chantal,

if I understand correctly, this implies that I have to populate different
fields according to their length. Since I'm not aware of any logical condition
that can be applied to the copyField directive, this logic would have to be
implemented by the process that populates the Solr core. Is this assumption
correct?

That's kind of bad, because I'd like to keep this kind of "rule" in the Solr
configuration. Of course, if that's the only way... :)

Thank you 


From: Chantal Ackermann [c.ackerm...@it-agenten.com]
Sent: Thursday, 26 July 2012 18:32
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

Hi,

use two fields:
1. KeywordTokenizer (= single token) with ngram minsize=1 and maxsize=2 for 
inputs of length < 3,
2. the other one tokenized as appropriate with minsize=3 and longer for all 
longer inputs


Cheers,
Chantal


Re: Skip first word

2012-07-27 Thread Finotti Simone
Could you elaborate, please?

thanks
S


From: in.abdul [in.ab...@gmail.com]
Sent: Thursday, 26 July 2012 20:36
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

That is the best option; I have also used the shingle filter factory.

-
THANKS AND REGARDS,
SYED ABDUL KATHER




Re: Skip first word

2012-07-27 Thread Finotti Simone
Brilliant!
Thank you very much :)


From: Chantal Ackermann [c.ackerm...@it-agenten.com]
Sent: Friday, 27 July 2012 11:20
To: solr-user@lucene.apache.org
Subject: Re: Skip first word

Hi Simone,

no, I meant that you populate the two fields with the same input, best done via
the copyField directive.

The first field will contain ngrams of size 1 and 2. The other field will
contain ngrams of size 3 and longer (you might want to set a decent maxsize
there).

The query for the autocomplete list uses the first field when the input (typed
in by the user) is one or two characters long. Your example was: "D", "G", and
then "Do" or "Ga". That query searches only the single-token field, which
for the input "Dolce & Gabbana" contains only the ngrams "D" and "Do". So only
the input "D" or "Do" would result in a hit on "Dolce & Gabbana".
Once the user has typed in the third letter ("Dol" or "Gab"), you query the
second, more tokenized field, which for "Dolce & Gabbana" would contain the
ngrams "Dol", "Dolc", "Dolce", "Gab", "Gabb", "Gabba", etc.
Both inputs "Gab" and "Dol" would then return "Dolce & Gabbana".

1. First field type:




2. Second field type:





3. field declarations:







Chantal
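To see why the two fields behave as Chantal describes, here is a small Python simulation of the edge n-grams each field would hold. It is a sketch of the idea only, not Solr's analysis chain: the function names are hypothetical, whitespace splitting stands in for whatever tokenizer the second field uses, the maximum gram size of 15 is an arbitrary choice, "Gucci" is a made-up second brand, and matching is case-sensitive to mirror the examples in the message.

```python
def edge_ngrams(token, min_size, max_size):
    """Prefixes of `token` whose length lies in [min_size, max_size]."""
    upper = min(max_size, len(token))
    return [token[:n] for n in range(min_size, upper + 1)]


def short_field(value):
    """First field: whole value as one token, grams of size 1 and 2."""
    return set(edge_ngrams(value, 1, 2))


def long_field(value, max_size=15):
    """Second field: one token per word, grams of size 3 and longer."""
    grams = set()
    for token in value.split():
        grams.update(edge_ngrams(token, 3, max_size))
    return grams


def suggest(user_input, brands):
    """Pick the field by input length, as the message describes."""
    if len(user_input) < 3:
        return [b for b in brands if user_input in short_field(b)]
    return [b for b in brands if user_input in long_field(b)]


brands = ["Dolce & Gabbana", "Gucci"]
for typed in ["D", "G", "Gab", "Dol"]:
    print(typed, suggest(typed, brands))
```

In this simulation "G" does not reach "Dolce & Gabbana" because the short field only holds prefixes of the whole value, while "Gab" does because the long field holds prefixes of every word.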

Split XML configuration

2012-09-20 Thread Finotti Simone
Hi,

is it possible to split schema.xml and solrconfig.xml configurations? My 
configurations are getting quite large and I'd like to be able to partition 
them logically in multiple files.

thank you in advance,
S


Query filtering

2012-09-27 Thread Finotti Simone
Hello,
I'm running this query to return the top 10 facet values within a given "context",
specified via the fq parameter.

http://solr/core/select?fq=(...)&q=*:*&rows=0&facet.field=interesting_facet&facet.limit=10

Now I need to search for a term within the context AND within the previously
identified top 10 facet values.

Is there a way to do this with a single query?

thank you in advance,
S


Re: Query filtering

2012-09-28 Thread Finotti Simone
Hi Amit,
thank you for your answer, but I already knew how to do it with two distinct
queries: I was hoping for some way to do it with a single query :-) (maybe using
some advanced functionality such as nested queries...)

S


From: Amit Nithian [anith...@gmail.com]
Sent: Thursday, 27 September 2012 19:18
To: solr-user@lucene.apache.org
Subject: Re: Query filtering

I think one way to do this is issue another query and set a bunch of
filter queries to restrict "interesting_facet" to just those ten
values returned in the first query.

fq=interesting_facet:1 OR interesting_facet:2 etc&q=context:

Does that help?
Amit
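Amit's two-step approach could be scripted along these lines in Python. The sketch assumes the first response was requested as JSON and that the facet values arrive as a flat list (value, count, value, count, ...); the function names, the `context:design` filter, and the sample facet values are illustrative only.

```python
from urllib.parse import urlencode


def top_facet_values(flat_facets):
    """Extract values from a flat [value, count, value, count, ...] list."""
    return flat_facets[0::2]


def quote(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def second_query(term, context_fq, field, values):
    """Build the query string that restricts `field` to the given values."""
    facet_fq = "%s:(%s)" % (field, " OR ".join(quote(v) for v in values))
    return urlencode([("q", term), ("fq", context_fq), ("fq", facet_fq)])


flat = ["shoes", 120, "bags", 80, "belts", 15]  # sample first-query facet result
values = top_facet_values(flat)
print(second_query("leather", "context:design", "interesting_facet", values))
```

This still takes two round trips, as in Amit's suggestion; it only automates building the second request from the first response.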
