Re: build CMIS compatible Solr

2013-01-21 Thread Upayavira
We merely used Alfresco as the other side of the CMIS coin, to prove
that our connector was working, as colleagues had knowledge of it.

And yes, that link you found is to the connector.

Upayavira

On Sun, Jan 20, 2013, at 10:26 PM, Nicholas Li wrote:
> I think this might be the one you are talking about:
> https://github.com/sourcesense/solr-cmis
> 
> But I think Alfresco already has search functionality, similar to
> Solr.
> Then why did you want to use Solr to index docs out of Alfresco?
> 
> On Fri, Jan 18, 2013 at 8:00 PM, Upayavira  wrote:
> 
> > A colleague of mine when I was working for Sourcesense made a CMIS
> > plugin for Solr. It was one way, and we used it to index stuff out of
> > Alfresco into Solr. I can't search for it now, let me know if you can't
> > find it.
> >
> > Upayavira
> >
> > On Fri, Jan 18, 2013, at 05:35 AM, Nicholas Li wrote:
> > > I want to make something like Alfresco, but without that many
> > > features.
> > > And I'd like to utilise the searching ability of Solr.
> > >
> > > On Fri, Jan 18, 2013 at 4:11 PM, Gora Mohanty 
> > wrote:
> > >
> > > > On 18 January 2013 10:36, Nicholas Li  wrote:
> > > > > hi
> > > > >
> > > > > I am new to solr and I would like to use Solr as my document
> > > > > server, plus search engine. But solr is not CMIS compatible
> > > > > (while it should not be, as it is not built as a pure document
> > > > > management server). In that sense, I would build another layer
> > > > > on top of Solr so that the exposed interface would be CMIS
> > > > > compatible.
> > > > [...]
> > > >
> > > > May I ask why? Solr is designed to be a search engine,
> > > > which is a very different beast from a document repository.
> > > > In the open-source world, Alfresco ( http://www.alfresco.com/ )
> > > > already exists, can index into Solr, and supports CMIS-based
> > > > access.
> > > >
> > > > Regards,
> > > > Gora
> > > >
> >


Re: Solr 4.0 - timeAllowed in distributed search

2013-01-21 Thread Upayavira
And think about distributed search: you are going through a 'proxy'
which, as well as forwarding your docs, must also merge the docs from
the different shards into a single result set. That is likely to take
some time with 30,000 docs, and isn't a job that is needed in a
non-distributed search.

Upayavira

On Sun, Jan 20, 2013, at 08:55 PM, Walter Underwood wrote:
> If you are going to request 30,000 rows, you can give up on getting good
> performance. It is not going to happen.
> 
> Even without all the disk accesses, think about how much is sent over the
> network, then parsed by the client. The client cannot even start working
> with the data until it is all received and parsed.
> 
> wunder
> 
> On Jan 20, 2013, at 8:49 AM, Michael Ryan wrote:
> 
> > (This is based on my knowledge of 3.6 - not sure if this has changed in 4.0)
> > 
> > You are using rows=30000, which requires retrieving 30,000 documents from 
> > disk. In a non-distributed search, the QTime will not include the time it 
> > takes to retrieve these documents, but in a distributed search, it will. 
> > For a *:* query, the document retrieval will almost always be the slowest 
> > part of the query. I'd suggest measuring how long it takes for the response 
> > to be returned, or use rows=0.
> > 
> > The timeAllowed feature is very misleading. It only applies to a small 
> > portion of the query (which in my experience is usually not the part of the 
> > query that is actually slow). Do not depend on timeAllowed doing anything 
> > useful :)
> > 
> > -Michael
> > 
> > -Original Message-
> > From: Lyuba Romanchuk [mailto:lyuba.romanc...@gmail.com] 
> > Sent: Sunday, January 20, 2013 6:36 AM
> > To: solr-user@lucene.apache.org
> > Subject: Solr 4.0 - timeAllowed in distributed search
> > 
> > Hi,
> > 
> > I try to use timeAllowed in a query, both in distributed search with one shard 
> > and directly to the same shard.
> > I send the same query with timeAllowed=500 :
> > 
> >   - directly to the shard: QTime ~= 600 ms
> >   - through distributed search to the same shard: QTime ~= 7 sec.
> > 
> > I have two questions:
> > 
> >   - It seems that the timeAllowed parameter doesn't work for distributed
> >   search, does it?
> >   - What may be the reason that causes the query to the shard through
> >   distributed search to take much more time than to the shard directly (the
> >   same distribution remains without the timeAllowed parameter in the query)?
> > 
> > 
> > Test results:
> > 
> > Ask one shard through distributed search:
> > 
> > 
> > http://localhost:8983/solr/shard_2013-01-07/select?q=*:*&rows=30000&shards=127.0.0.1%3A8983%2Fsolr%2Fshard_2013-01-07&timeAllowed=500&partialResults=true&shards.info=true&debugQuery=true
> > 
> > <bool name="partialResults">true</bool>
> > <int name="status">0</int>
> > <int name="QTime">7307</int>
> > <lst name="params">
> >   <str name="q">*:*</str>
> >   <str name="shards">127.0.0.1:8983/solr/shard_2013-01-07</str>
> >   <str name="partialResults">true</str>
> >   <str name="shards.info">true</str>
> >   <str name="debugQuery">true</str>
> >   <str name="rows">30000</str>
> >   <str name="timeAllowed">500</str>
> > </lst>
> > <lst name="shards.info">
> >   <long name="numFound">29574223</long>
> >   <float name="maxScore">1.0</float>
> >   <long name="time">646</long>
> > </lst>
> > ...
> > 30,000 docs
> > ...
> > <str name="rawquerystring">*:*</str>
> > <str name="querystring">*:*</str>
> > <str name="parsedquery">MatchAllDocsQuery(*:*)</str>
> > <str name="parsedquery_toString">*:*</str>
> > <str name="QParser">LuceneQParser</str>
> > <lst name="timing">
> >   <double name="time">6141.0</double>
> >   <lst name="prepare">
> >     <double name="time">0.0</double>
> >     <!-- all six search components report 0.0 -->
> >   </lst>
> >   <lst name="process">
> >     <double name="time">6141.0</double>
> >     <lst name="org.apache.solr.handler.component.QueryComponent">
> >       <double name="time">6022.0</double>
> >     </lst>
> >     <!-- other components: 0.0 -->
> >     <lst name="org.apache.solr.handler.component.DebugComponent">
> >       <double name="time">119.0</double>
> >     </lst>
> >   </lst>
> > </lst>
> > 
> > Ask the same shard directly:
> > 
> > http://localhost:8983/solr/shard_2013-01-07/select?q=*:*&rows=30000&timeAllowed=500&partialResults=true&shards.info=true&debugQuery=true
> > 
> > <bool name="partialResults">true</bool>
> > <int name="status">0</int>
> > <int name="QTime">617</int>
> > <lst name="params">
> >   <str name="q">*:*</str>
> >   <str name="partialResults">true</str>
> >   <str name="shards.info">true</str>
> >   <str name="debugQuery">true</str>
> >   <str name="rows">30000</str>
> >   <str name="timeAllowed">500</str>
> > </lst>
> > ...
> > 30,000 docs
> > <str name="rawquerystring">*:*</str>
> > <str name="querystring">*:*</str>
> > <str name="parsedquery">MatchAllDocsQuery(*:*)</str>
> > <str name="parsedquery_toString">*:*</str>
> > <str name="QParser">LuceneQParser</str>
> > <lst name="timing">
> >   <double name="time">617.0</double>
> >   <lst name="prepare">
> >     <double name="time">0.0</double>
> >     <!-- all six search components report 0.0 -->
> >   </lst>
> >   <lst name="process">
> >     <double name="time">617.0</double>
> >     <lst name="org.apache.solr.handler.component.QueryComponent">
> >       <double name="time">516.0</double>
> >     </lst>
> >     <!-- other components: 0.0 -->
> >     <lst name="org.apache.solr.handler.component.DebugComponent">
> >       <double name="time">101.0</double>
> >     </lst>
> >   </lst>
> > </lst>
> > 
> > Thank you.
> > Best regards,
> > Lyuba
> 
> --
> Walter Underwood
> wun...@wunderwood.org
> 
> 
> 


Re: Tokenized keywords

2013-01-21 Thread Romita Saha
Hi,

I have a field defined in schema.xml named 'original'. I first copy 
this field to "modified" and apply filters on this field "modified."

<field name="original" type="string" indexed="true" stored="true"/>
<field name="modified" type="text_general" indexed="true" stored="true"/>

<copyField source="original" dest="modified"/>

I want to display in my response as follows:

original: Search for all the Laptops
modified: search laptop

Thanks and regards,
Romita Saha

Panasonic R&D Center Singapore
Blk 1022 Tai Seng Avenue #06-3530
Tai Seng Ind. Est. Singapore 534415
DID: (65) 6550 5383 FAX: (65) 6550 5459
email: romita.s...@sg.panasonic.com



From:   Mikhail Khludnev 
To: solr-user@lucene.apache.org, 
Date:   01/21/2013 03:48 PM
Subject:Re: Tokenized keywords



Romita,
That's exactly what is shown in the debugQuery output. If you can't find it
there, paste the output here and let's try to find it together. Also pay
attention to the explainOther debug parameter and the analysis page in the
admin UI.
On 21.01.2013 10:50, "Romita Saha" wrote:

> What I am trying to achieve is as follows.
>
> I query "Search for all the Laptops" and my tokenized key words are
> "search laptop" (I apply stopword filter to filter out words like
> for,all,the and i also user lowercase filter).
> I want to display these tokenized keywords using debugQuery.
>
> Thanks and regards,
> Romita
>
>
>
> From:   Dikchant Sahi 
> To: solr-user@lucene.apache.org,
> Date:   01/21/2013 02:26 PM
> Subject:Re: Tokenized keywords
>
>
>
> Can you please elaborate a bit more on what you are trying to achieve.
>
> Tokenizers work on the indexed field and don't affect how the values will
> be displayed. The response value comes from the stored field. If you want
> to see how your query is being tokenized, you can do it using the analysis
> interface or enable debugQuery to see how your query is being formed.
>
>
> On Mon, Jan 21, 2013 at 11:06 AM, Romita Saha
> wrote
>
> > Hi,
> >
> > I use some tokenizers to tokenize the query. I want to see the tokenized
> > query words displayed in the response. Could you kindly help me do
> > that.
> >
> > Thanks and regards,
> > Romita
>
>




Re: Tokenized keywords

2013-01-21 Thread Dikchant Sahi
A tokenizer changes how you search/index, not how you store. What I
understand is that you want to display the tokenized result always, and
not just for debug purposes.

debugQuery has performance implications and should not be used for what
you are trying to achieve.

Basically, what you need is a way to store the filtered and lowercased tokens
in the 'modified' field. What I see as a solution is either to
ingest the 'original' field with your desired tokens directly instead of
using copyfield, or to write some custom code to store/index only the filtered
and lowercased result; e.g. a custom transformer can be explored if you are
using the data import handler (see the sketch below).
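
A minimal sketch of such a DIH transformer (untested; class and field names
are illustrative, and it only lowercases and drops a fixed stopword list; it
would not stem "Laptops" to "laptop"):

import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

public class StopwordLowercaseTransformer extends Transformer {
  private static final Set<String> STOPWORDS =
      new HashSet<String>(Arrays.asList("for", "all", "the"));

  @Override
  public Object transformRow(Map<String, Object> row, Context context) {
    Object original = row.get("original");
    if (original != null) {
      StringBuilder sb = new StringBuilder();
      // lowercase, split on whitespace, and drop the stopwords
      for (String token : original.toString().toLowerCase().split("\\s+")) {
        if (!STOPWORDS.contains(token)) {
          sb.append(token).append(' ');
        }
      }
      // put the filtered text into 'modified' so it is stored that way
      // and comes back verbatim in the response
      row.put("modified", sb.toString().trim());
    }
    return row;
  }
}

It would be wired in on the entity, e.g.
transformer="StopwordLowercaseTransformer" (fully qualified class name).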


On Mon, Jan 21, 2013 at 1:47 PM, Romita Saha
wrote:

> Hi,
>
> I have a field defined in schema.xml named 'original'. I first copy
> this field to "modified" and apply filters on this field "modified."
>
> <field name="original" type="string" indexed="true" stored="true"/>
> <field name="modified" type="text_general" indexed="true" stored="true"/>
>
> <copyField source="original" dest="modified"/>
>
> I want to display in my response as follows:
>
> original: Search for all the Laptops
> modified: search laptop
>
> Thanks and regards,
> Romita Saha
>
> Panasonic R&D Center Singapore
> Blk 1022 Tai Seng Avenue #06-3530
> Tai Seng Ind. Est. Singapore 534415
> DID: (65) 6550 5383 FAX: (65) 6550 5459
> email: romita.s...@sg.panasonic.com
>
>
>
> From:   Mikhail Khludnev 
> To: solr-user@lucene.apache.org,
> Date:   01/21/2013 03:48 PM
> Subject:Re: Tokenized keywords
>
>
>
> Romita,
> That's exactly what is shown in the debugQuery output. If you can't find it
> there, paste the output here and let's try to find it together. Also pay
> attention to the explainOther debug parameter and the analysis page in the
> admin UI.
> On 21.01.2013 10:50, "Romita Saha" wrote:
>
> > What I am trying to achieve is as follows.
> >
> > I query "Search for all the Laptops" and my tokenized key words are
> > "search laptop" (I apply stopword filter to filter out words like
> > for,all,the and i also user lowercase filter).
> > I want to display these tokenized keywords using debugQuery.
> >
> > Thanks and regards,
> > Romita
> >
> >
> >
> > From:   Dikchant Sahi 
> > To: solr-user@lucene.apache.org,
> > Date:   01/21/2013 02:26 PM
> > Subject:Re: Tokenized keywords
> >
> >
> >
> > Can you please elaborate a bit more on what you are trying to achieve.
> >
> > Tokenizers work on the indexed field and don't affect how the values will
> > be displayed. The response value comes from the stored field. If you want
> > to see how your query is being tokenized, you can do it using the analysis
> > interface or enable debugQuery to see how your query is being formed.
> >
> >
> > On Mon, Jan 21, 2013 at 11:06 AM, Romita Saha
> > wrote
> >
> > > Hi,
> > >
> > > I use some tokenizers to tokenize the query. I want to see the tokenized
> > > query words displayed in the response. Could you kindly help me do
> > > that.
> > >
> > > Thanks and regards,
> > > Romita
> >
> >
>
>
>


Distance Range query

2013-01-21 Thread stefanocorsi
Hello,

I have an index containing items with a range of distance in which I would
like these items to be found when searched. Furthermore, in the same index,
I have the position of the item. For example:

Item A - 46.23211131, 10.3213213131 - 30 km
Item B - 45.23211131, 9.3213213131 - 50 km
...
Item N - 46.32132132, 10.213211321 - 100 km

I would like to be able to make the following query:

given a certain point X (let's say 46.323231223, 10.32132132131) I would
like to find all the items that are in range of that point, according to
each item's range: i.e. if an item has a 30 km range, it should find itself
within 30 km of point X; if an item has a 100 km range, it should find itself
within 100 km of point X, and so on...

I hope I have explained it in a decent way...

Is there a way to do it? With solr 3? With solr 4?

Thank you for your help...

Stefano




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Distance-Range-query-tp4034977.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4.0 - timeAllowed in distributed search

2013-01-21 Thread Lyuba Romanchuk
Hi Michael,

Thank you very much for your reply!

Does it mean that when timeAllowed is used, only the search is interrupted
and the document retrieval is not?

In order to check the total time of the query I ran curl under the linux time
command to measure the total time, including the retrieval of documents. If I
understood your answer correctly I should have gotten similar total times in
both cases, but according to the results each total time is close to its own
QTime rather than to the other:

   - for non-distributed: QTime=789 ms and total time ~1 sec
   - for distributed: QTime=7.75 sec and total time 7.9 sec.

Here is the output of the curls (direct_query.xml and distributed_query.xml
contain 30,000 documents in the reply):

Directly ask the shard:

time curl 'http://localhost:8983/solr/shard_2013-01-07/select?q=*:*&rows=30000&timeAllowed=500&partialResults=true&debugQuery=true' >& direct_query.xml

real    0m1.025s
user    0m0.008s
sys     0m0.053s

from direct_query.xml:



<bool name="partialResults">true</bool>
<int name="status">0</int>
<int name="QTime">789</int>
<lst name="params">
  <str name="rows">30000</str>
  <str name="q">*:*</str>
  <str name="timeAllowed">500</str>
  <str name="partialResults">true</str>
  <str name="debugQuery">true</str>
</lst>



Ask the shard through distributed search:


time curl 'http://localhost:8983/solr/shard_2013-01-07/select?q=*:*&rows=30000&shards=127.0.0.1%3A8983%2Fsolr%2Fshard_2013-01-07&timeAllowed=500&partialResults=true&shards.info=true&debug=true' >& distributed_query.xml

real    0m7.905s
user    0m0.010s
sys     0m0.052s


from distributed_query.xml:




<bool name="partialResults">true</bool>
<int name="status">0</int>
<int name="QTime">7750</int>
<lst name="params">
  <str name="q">*:*</str>
  <str name="debug">true</str>
  <str name="shards">127.0.0.1:8983/solr/shard_2013-01-07</str>
  <str name="partialResults">true</str>
  <str name="shards.info">true</str>
  <str name="rows">30000</str>
  <str name="timeAllowed">500</str>
</lst>
<lst name="shards.info">
  <long name="numFound">28193020</long>
  <float name="maxScore">1.0</float>
  <long name="time">895</long>
</lst>






Best regards,
Lyuba


On Sun, Jan 20, 2013 at 6:49 PM, Michael Ryan  wrote:

> (This is based on my knowledge of 3.6 - not sure if this has changed in
> 4.0)
>
> You are using rows=30000, which requires retrieving 30,000 documents from
> disk. In a non-distributed search, the QTime will not include the time it
> takes to retrieve these documents, but in a distributed search, it will.
> For a *:* query, the document retrieval will almost always be the slowest
> part of the query. I'd suggest measuring how long it takes for the response
> to be returned, or use rows=0.
>
> The timeAllowed feature is very misleading. It only applies to a small
> portion of the query (which in my experience is usually not the part of the
> query that is actually slow). Do not depend on timeAllowed doing anything
> useful :)
>
> -Michael
>
> -Original Message-
> From: Lyuba Romanchuk [mailto:lyuba.romanc...@gmail.com]
> Sent: Sunday, January 20, 2013 6:36 AM
> To: solr-user@lucene.apache.org
> Subject: Solr 4.0 - timeAllowed in distributed search
>
> Hi,
>
> I try to use timeAllowed in a query, both in distributed search with one
> shard and directly to the same shard.
> I send the same query with timeAllowed=500 :
>
>- directly to the shard: QTime ~= 600 ms
>- through distributed search to the same shard: QTime ~= 7 sec.
>
> I have two questions:
>
>- It seems that the timeAllowed parameter doesn't work for distributed
>search, does it?
>- What may be the reason that causes the query to the shard through
>distributed search to take much more time than to the shard directly (the
>same distribution remains without the timeAllowed parameter in the query)?
>
>
> Test results:
>
> Ask one shard through distributed search:
>
>
>
> http://localhost:8983/solr/shard_2013-01-07/select?q=*:*&rows=30000&shards=127.0.0.1%3A8983%2Fsolr%2Fshard_2013-01-07&timeAllowed=500&partialResults=true&shards.info=true&debugQuery=true
> 
> <bool name="partialResults">true</bool>
> <int name="status">0</int>
> <int name="QTime">7307</int>
> <lst name="params">
>   <str name="q">*:*</str>
>   <str name="shards">127.0.0.1:8983/solr/shard_2013-01-07</str>
>   <str name="partialResults">true</str>
>   <str name="shards.info">true</str>
>   <str name="debugQuery">true</str>
>   <str name="rows">30000</str>
>   <str name="timeAllowed">500</str>
> </lst>
> <lst name="shards.info">
>   <long name="numFound">29574223</long>
>   <float name="maxScore">1.0</float>
>   <long name="time">646</long>
> </lst>
> ...
> 30,000 docs
> ...
> <str name="rawquerystring">*:*</str>
> <str name="querystring">*:*</str>
> <str name="parsedquery">MatchAllDocsQuery(*:*)</str>
> <str name="parsedquery_toString">*:*</str>
> <str name="QParser">LuceneQParser</str>
> <lst name="timing">
>   <double name="time">6141.0</double>
>   <lst name="prepare">
>     <double name="time">0.0</double>
>     <!-- all six search components report 0.0 -->
>   </lst>
>   <lst name="process">
>     <double name="time">6141.0</double>
>     <lst name="org.apache.solr.handler.component.QueryComponent">
>       <double name="time">6022.0</double>
>     </lst>
>     <!-- other components: 0.0 -->
>     <lst name="org.apache.solr.handler.component.DebugComponent">
>       <double name="time">119.0</double>
>     </lst>
>   </lst>
> </lst>
>
> Ask the same shard directly:
>
>
> http://localhost:8983/solr/shard_2013-01-07/select?q=*:*&rows=30000&timeAllowed=500&partialResults=true&shards.info=true&debugQuery=true
> 
> <bool name="partialResults">true</bool>
> <int name="status">0</int>
> <int name="QTime">617</int>
> <lst name="params">
>   <str name="q">*:*</str>
>   <str name="partialResults">true</str>
>   <str name="shards.info">true</str>
>   <str name="debugQuery">true</str>
>   <str name="rows">30000</str>
>   <str name="timeAllowed">500</str>
> </lst>
> ...
> 30,000 docs
> <str name="rawquerystring">*:*</str>
> <str name="querystring">*:*</str>
> <str name="parsedquery">MatchAllDocsQuery(*:*)</str>
> <str name="parsedquery_toString">*:*</str>
> <str name="QParser">LuceneQParser</str>
> <lst name="timing">
>   <double name="time">617.0</double>
>   <lst name="prepare">
>     <double name="time">0.0</double>
>     <!-- all six search components report 0.0 -->
>   </lst>
>   <lst name="process">
>     <double name="time">617.0</double>
>     <lst name="org.apache.solr.handler.component.QueryComponent">
>       <double name="time">516.0</double>
>     </lst>
>     <!-- other components: 0.0 -->
>     <lst name="org.apache.solr.handler.component.DebugComponent">
>       <double name="time">101.0</double>
>     </lst>
>   </lst>
> </lst>
>
> Thank you.
> Best regards,
> Lyuba
>


Re: build CMIS compatible Solr

2013-01-21 Thread Ahmet Arslan
Hi Nicholas,

You might be interested in http://manifoldcf.apache.org/ .

You can use it to index several document repositories into solr.

http://manifoldcf.apache.org/release/release-1.0.1/en_US/end-user-documentation.html#repositoryconnectiontypes


--- On Mon, 1/21/13, Upayavira  wrote:

> From: Upayavira 
> Subject: Re: build CMIS compatible Solr
> To: solr-user@lucene.apache.org
> Date: Monday, January 21, 2013, 10:05 AM
> We merely used Alfresco as the other side of the CMIS coin, to prove
> that our connector was working, as colleagues had knowledge of it.
> 
> And yes, that link you found is to the connector.
> 
> Upayavira
> 
> On Sun, Jan 20, 2013, at 10:26 PM, Nicholas Li wrote:
> > I think this might be the one you are talking about:
> > https://github.com/sourcesense/solr-cmis
> > 
> > But I think Alfresco already has search functionality, similar to
> > Solr.
> > Then why did you want to use Solr to index docs out of Alfresco?
> > 
> > On Fri, Jan 18, 2013 at 8:00 PM, Upayavira wrote:
> > 
> > > A colleague of mine when I was working for Sourcesense made a CMIS
> > > plugin for Solr. It was one way, and we used it to index stuff out
> > > of Alfresco into Solr. I can't search for it now, let me know if
> > > you can't find it.
> > >
> > > Upayavira
> > >
> > > On Fri, Jan 18, 2013, at 05:35 AM, Nicholas Li wrote:
> > > > I want to make something like Alfresco, but without that many
> > > > features.
> > > > And I'd like to utilise the searching ability of Solr.
> > > >
> > > > On Fri, Jan 18, 2013 at 4:11 PM, Gora Mohanty wrote:
> > > >
> > > > > On 18 January 2013 10:36, Nicholas Li wrote:
> > > > > > hi
> > > > > >
> > > > > > I am new to solr and I would like to use Solr as my document
> > > > > > server, plus search engine. But solr is not CMIS compatible
> > > > > > (while it should not be, as it is not built as a pure
> > > > > > document management server). In that sense, I would build
> > > > > > another layer on top of Solr so that the exposed interface
> > > > > > would be CMIS compatible.
> > > > > [...]
> > > > >
> > > > > May I ask why? Solr is designed to be a search engine,
> > > > > which is a very different beast from a document repository.
> > > > > In the open-source world, Alfresco ( http://www.alfresco.com/ )
> > > > > already exists, can index into Solr, and supports CMIS-based
> > > > > access.
> > > > >
> > > > > Regards,
> > > > > Gora
> > > > >
> > >


Re: Response time in client was much longer than QTime in tomcat

2013-01-21 Thread pravesh
SOLR's QTime represents the actual time it spent on searching, whereas your
C# client response time is the total time spent sending the HTTP request and
getting back the response (which might also include parsing the results).


Regards
Pravesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Response-time-in-client-was-much-longer-than-QTime-in-tomcat-tp4034148p4034996.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Large data importing getting rollback with solr

2013-01-21 Thread ashimbose
Hi Gora,

Thank you for your suggestion.

I have tried with you below option,

> * Have never tried this, but one can set up multiple request handlers
>   in solrconfig.xml for each DIH instance that one plans to run.
>   These can run in parallel rather than the sequential indexing of
>   root entities in a single DIH instance.

Here I used two data config
1. data_conf1.xml
2. data_conf2.xml

I have used them like below



<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data_conf1.xml</str>
  </lst>
</requestHandler>
<requestHandler name="/dataimport1"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data_conf2.xml</str>
  </lst>
</requestHandler>

Now I have run them separately one after the other like below..
1. http://localhost:8080/solr/dataimport?command=full-import
2. http://localhost:8080/solr/dataimport1?command=full-import

Either one of them runs fine on its own. That is,
if I run dataimport first, it indexes successfully, but if I then run
dataimport1, it gives the error below:

Caused by: java.sql.SQLException: JBC0088E: JBC0002E: Socket timeout
detected: Read timed out

Then I restarted my tomcat and ran dataimport1 first and then dataimport;
the same problem occurs, only reversed:

dataimport1 runs perfectly but dataimport gives the timeout error.

Can anyone please help me, or suggest a different method to index large data?

Thanks
Ashim



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Large-data-importing-getting-rollback-with-solr-tp4034075p4035009.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to combine Qparsers in a plugin?

2013-01-21 Thread denl0

I have made a QParserPlugin to get the query-time join from lucene into Solr.
But I'm facing 2 major issues.
This is the code I'm using (with the right jars added to the config etc. it's
working):

public class TestParserPlugin extends QParserPlugin {
    @Override
    public QParser createParser(String string, SolrParams sp, SolrParams sp1,
            SolrQueryRequest sqr) {
        // pass local params (sp) and params (sp1) through separately;
        // the original passed sp1 for both, which drops the local params
        return new TestParser(string, sp, sp1, sqr);
    }

    @Override
    public void init(NamedList nl) {
    }

    public class TestParser extends QParser {
        public TestParser(String qstr, SolrParams localParams, SolrParams params,
                SolrQueryRequest req) {
            super(qstr, localParams, params, req);
        }

        @Override
        public org.apache.lucene.search.Query parse() throws
                org.apache.lucene.queryparser.classic.ParseException {
            IndexReader reader;
            try {
                reader = IndexReader.open(FSDirectory.open(new File(
                        "C:\\java\\apache-solr-4.0.0\\apache-solr-4.0.0\\example\\solr\\Books\\data\\index")));
                IndexSearcher searcher = new IndexSearcher(reader);
                BooleanQuery fromQuery = new BooleanQuery();
                fromQuery.add(new TermQuery(new Term("pageTxt", "crazy")),
                        BooleanClause.Occur.MUST);
                fromQuery.add(new TermQuery(new Term("pageTxt", "test")),
                        BooleanClause.Occur.SHOULD);
                return JoinUtil.createJoinQuery("pageId", true, "fileId",
                        fromQuery, searcher, ScoreMode.Avg);
            } catch (IOException ex) {
                Logger.getLogger(TestParserPlugin.class.getName())
                        .log(Level.SEVERE, null, ex);
            }
            return null;
        }
    }
}

I still have 2 questions concerning this code:
Is there a way to get the searcher instead of opening it from the fs every
time I use this plugin? (I think this will create a memory leak.)
Is it possible to combine this qparser with edismax etc. instead of building
queries myself using Occur.MUST etc.?


Note: I also asked this question to the author of the query-time join (on the
JIRA issue).



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-combine-Qparsers-in-a-plugin-tp4035011.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: access matched token ids in the FacetComponent?

2013-01-21 Thread Mikhail Khludnev
Dmitry,

First of all, FacetComponent is Solr's out-of-the-box functionality. It
runs after the search is done and accesses the bitSet of the found documents,
i.e. there are no spans (matched term positions) there at all.

StandardFacetsAccumulator sounds like the "brand new" lucene faceting
library; see http://shaierera.blogspot.com/. I don't think the spans are
accessible there either, but I don't know for certain.

Some time ago my team successfully prototyped a facet component backed by
spans ( blog.griddynamics.com/2011/10/solr-experience-search-parent-child.html )
but I don't suggest you go this way.
I can suggest you start from the following:
- supply PostFilter/DelegatingCollector
http://yonik.com/posts/advanced-filter-caching-in-solr/
- the DelegatingCollector will accept the scorer instance
- if this scorer is BooleanScorer2 (but not BooleanScorer!), you can access
the SpanQueryScorer in one of the legs and try to access the matched spans
- if you are in 3.x you'll have a problem with disjunction queries.

it seems challenging, doesn't it?
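
A very rough sketch of the PostFilter route (untested; walking from the
scorer to the span positions is the hard part and is left as a stub here):

import java.io.IOException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Scorer;
import org.apache.solr.search.DelegatingCollector;
import org.apache.solr.search.ExtendedQueryBase;
import org.apache.solr.search.PostFilter;

public class SpanInspectingFilter extends ExtendedQueryBase implements PostFilter {

  public SpanInspectingFilter() {
    setCache(false); // post filters must not be cached
    setCost(100);    // cost > 99 makes Solr run this after the main query
  }

  @Override
  public DelegatingCollector getFilterCollector(IndexSearcher searcher) {
    return new DelegatingCollector() {
      private Scorer scorer;

      @Override
      public void setScorer(Scorer scorer) throws IOException {
        this.scorer = scorer; // e.g. check here whether it is BooleanScorer2
        super.setScorer(scorer);
      }

      @Override
      public void collect(int doc) throws IOException {
        // inspect this.scorer's sub-scorers for a span scorer and read the
        // matched positions for this doc here (the challenging bit)
        super.collect(doc);
      }
    };
  }
}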

On 18.01.2013 17:40, "Dmitry Kan" wrote:

> Mikhail,
>
> Do you say, that it is not possible to access the matched terms positions
> in the FacetComponent? If that would be possible (somewhere in the
> StandardFacetsAccumulator class, where docids are available), then by
> knowing the matched term positions I can do some school simple math to
> calculate the sentence counts per doc id.
>
> Dmitry
>
> On Fri, Jan 18, 2013 at 2:45 PM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
> > Dmitry,
> >
> > It definitely seems like postprocessing the highlighter's output. Another
> > approach is:
> > - limit number of occurrences of a word in a sentence to 1
> > - play with facet by function patch
> > https://issues.apache.org/jira/browse/SOLR-1581 accomplished by tf()
> > function.
> >
> > It doesn't seem like much help.
> >
> > On Fri, Jan 18, 2013 at 12:42 PM, Dmitry Kan 
> wrote:
> >
> > > that we actually require the count of the sentences inside
> > > each document where the hits were found.
> > >
> >
> >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > Principal Engineer,
> > Grid Dynamics
> >
> > 
> >  
> >
>


Re: Tokenized keywords

2013-01-21 Thread Uwe Reh

Hi

Probably my note is nonsense, but sometimes one is blind and not able to
see simple things anymore.


Is this query what you are looking for?
 q=modified:(search+for+Laptops)&fl=original,modified

Sorry if my suggestion is too trivial.

Uwe


On 21.01.2013 09:17, Romita Saha wrote:

Hi,

I have a field defined in schema.xml named 'original'. I first copy
this field to "modified" and apply filters on this field "modified."

<field name="original" type="string" indexed="true" stored="true"/>
<field name="modified" type="text_general" indexed="true" stored="true"/>

<copyField source="original" dest="modified"/>

I want to display in my response as follows:

original: Search for all the Laptops
modified: search laptop

Thanks and regards,
Romita Saha




Re: How to combine Qparsers in a plugin?

2013-01-21 Thread Mikhail Khludnev
Sure!

- use the inherited member QParser.req.getSearcher()
- use the util method QParser.getParser("foo bar", "edismax", req).getQuery()

You are welcome.
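
A sketch of parse() along those lines (untested; it reuses the field names
from the original code and swaps the hand-built BooleanQuery for edismax;
needed imports include org.apache.lucene.search.Query,
org.apache.lucene.search.join.JoinUtil, org.apache.lucene.search.join.ScoreMode,
org.apache.solr.search.QParser and org.apache.solr.search.SolrIndexSearcher):

@Override
public org.apache.lucene.search.Query parse() throws
        org.apache.lucene.queryparser.classic.ParseException {
    try {
        // the searcher Solr already holds for this request; do not close it
        SolrIndexSearcher searcher = req.getSearcher();
        // let edismax build the "from" side of the join
        Query fromQuery = QParser.getParser("crazy test", "edismax", req).getQuery();
        return JoinUtil.createJoinQuery("pageId", true, "fileId", fromQuery,
                searcher, ScoreMode.Avg);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}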


On Mon, Jan 21, 2013 at 3:40 PM, denl0 wrote:

>
> I have made a Qparserplugin to get querytimejoin from lucene in to Solr.
> But
> I'm facing 2 major issues.
> This is the code I'm using (With the right jars added to config etc it's
> working.)
>
> public class TestParserPlugin extends QParserPlugin {
>     @Override
>     public QParser createParser(String string, SolrParams sp, SolrParams sp1,
>             SolrQueryRequest sqr) {
>         // pass local params (sp) and params (sp1) through separately
>         return new TestParser(string, sp, sp1, sqr);
>     }
>
>     @Override
>     public void init(NamedList nl) {
>     }
>
>     public class TestParser extends QParser {
>         public TestParser(String qstr, SolrParams localParams, SolrParams params,
>                 SolrQueryRequest req) {
>             super(qstr, localParams, params, req);
>         }
>
>         @Override
>         public org.apache.lucene.search.Query parse() throws
>                 org.apache.lucene.queryparser.classic.ParseException {
>             IndexReader reader;
>             try {
>                 reader = IndexReader.open(FSDirectory.open(new File(
>                         "C:\\java\\apache-solr-4.0.0\\apache-solr-4.0.0\\example\\solr\\Books\\data\\index")));
>                 IndexSearcher searcher = new IndexSearcher(reader);
>                 BooleanQuery fromQuery = new BooleanQuery();
>                 fromQuery.add(new TermQuery(new Term("pageTxt", "crazy")),
>                         BooleanClause.Occur.MUST);
>                 fromQuery.add(new TermQuery(new Term("pageTxt", "test")),
>                         BooleanClause.Occur.SHOULD);
>                 return JoinUtil.createJoinQuery("pageId", true, "fileId",
>                         fromQuery, searcher, ScoreMode.Avg);
>             } catch (IOException ex) {
>                 Logger.getLogger(TestParserPlugin.class.getName())
>                         .log(Level.SEVERE, null, ex);
>             }
>             return null;
>         }
>     }
> }
>
> I still have 2 questions concerning this code:
> Is there a way to get the searcher instead of opening it from the fs every
> time I use this plugin? (I think this will create a memory leak.)
> Is it possible to combine this qparser with edismax etc. instead of building
> queries myself using Occur.MUST etc.?
>
>
> Note: I also asked this question to the author of the query-time join (on
> the JIRA issue).
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-to-combine-Qparsers-in-a-plugin-tp4035011.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics


 


Re: access matched token ids in the FacetComponent?

2013-01-21 Thread Dmitry Kan
Mikhail,

Thanks for the guidance! This indeed sounds challenging, esp. given the
bonus of fighting with solr 3.x in light of disjunction queries. Although
moving to solr 4.0, if that makes life easier, should be ok.

But even before getting one's hands dirty, it would be good to know, if
this is going to fly performance wise. Has your span based implementation
been fast enough? Did it stand close to the native solr's faceting in terms
of performance?

On Mon, Jan 21, 2013 at 2:33 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Dmitry,
>
> First of all, FacetComponent is Solr's out-of-the-box functionality. It
> runs after the search is done and accesses the bitSet of the found documents,
> i.e. there are no spans (matched term positions) there at all.
>
> StandardFacetsAccumulator sounds like the "brand new" lucene faceting
> library; see http://shaierera.blogspot.com/. I don't think the spans are
> accessible there either, but I don't know for certain.
>
> Some time ago my team successfully prototyped a facet component backed by
> spans ( blog.griddynamics.com/2011/10/solr-experience-search-parent-child.html )
> but I don't suggest you go this way.
> I can suggest you start from the following:
> - supply PostFilter/DelegatingCollector
> http://yonik.com/posts/advanced-filter-caching-in-solr/
> - the DelegatingCollector will accept the scorer instance
> - if this scorer is BooleanScorer2 (but not BooleanScorer!), you can access
> the SpanQueryScorer in one of the legs and try to access the matched spans
> - if you are in 3.x you'll have a problem with disjunction queries.
>
> it seems challenging, doesn't it?
>
> On 18.01.2013 17:40, "Dmitry Kan" wrote:
>
> > Mikhail,
> >
> > Do you say, that it is not possible to access the matched terms positions
> > in the FacetComponent? If that would be possible (somewhere in the
> > StandardFacetsAccumulator class, where docids are available), then by
> > knowing the matched term positions I can do some school simple math to
> > calculate the sentence counts per doc id.
> >
> > Dmitry
> >
> > On Fri, Jan 18, 2013 at 2:45 PM, Mikhail Khludnev <
> > mkhlud...@griddynamics.com> wrote:
> >
> > > Dmitry,
> > >
> > > It definitely seems like postprocessing the highlighter's output. Another
> > > approach is:
> > > - limit number of occurrences of a word in a sentence to 1
> > > - play with facet by function patch
> > > https://issues.apache.org/jira/browse/SOLR-1581 accomplished by tf()
> > > function.
> > >
> > > It doesn't seem like much help.
> > >
> > > On Fri, Jan 18, 2013 at 12:42 PM, Dmitry Kan 
> > wrote:
> > >
> > > > that we actually require the count of the sentences inside
> > > > each document where the hits were found.
> > > >
> > >
> > >
> > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > > Principal Engineer,
> > > Grid Dynamics
> > >
> > > 
> > >  
> > >
> >
>


Unable to load mysql Driver with Apache Solr-4.0.0

2013-01-21 Thread Shameer Thaha Koya
Hi Team,
I am trying to integrate Apache Solr (apache-solr-4.0.0) with Tomcat and am 
getting the error below.

15:12:40

SEVERE

DataImporter

Delta Import Failed

java.lang.RuntimeException: 
org.apache.solr.handler.dataimport.DataImportHandlerException: Could not load 
driver:  com.mysql.jdbc.Driver Processing Document # 1
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)


... 11 more

Caused by: org.apache.solr.common.SolrException: Error loading class ' 
com.mysql.jdbc.Driver'

at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:436)

at 
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:889)

... 12 more

Caused by: java.lang.ClassNotFoundException:  com.mysql.jdbc.Driver

at java.net.URLClassLoader$1.run(Unknown Source)





I tried the below options of adding mysql-connector-java-5.1.22-bin.jar, but 
none worked:



1. added as part of war (solr.war/WEB-INF/lib )

2. added as part of web server (Tomcat/lib)

3. Created a new directory lib under example/solr, placed the mysql 
connector jar (mysql-connector-java-5.1.22-bin.jar) there, and referred to 
the lib path from solr.xml.

  



Could you guide me on how to get the mysql driver loaded for Solr. Any help 
highly appreciated. Thanks




Shameer Thaha
Senior Developer
Emirates Group IT  |Dubai,
M : 971-552775725





Re: access matched token ids in the FacetComponent?

2013-01-21 Thread Dmitry Kan
Mikhail,

the griddynamics blog's link returns "Sorry, the page you were looking for
in this blog does not exist." Could you check if it is still available,
would be interesting to see the details!

Dmitry

On Mon, Jan 21, 2013 at 2:33 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Dmitry,
>
> First of all, FacetComponent is Solr's out-of-the-box functionality. It
> runs after the search is done and accesses the bitSet of the found documents,
> i.e. there are no spans (matched term positions) there at all.
>
> StandardFacetsAccumulator sounds like the "brand new" lucene faceting
> library; see http://shaierera.blogspot.com/. I don't think the spans are
> accessible there either, but I don't know for certain.
>
> Some time ago my team successfully prototyped a facet component backed by
> spans ( blog.griddynamics.com/2011/10/solr-experience-search-parent-child.html )
> but I don't suggest you go this way.
> I can suggest you start from the following:
> - supply PostFilter/DelegatingCollector
> http://yonik.com/posts/advanced-filter-caching-in-solr/
> - the DelegatingCollector will accept the scorer instance
> - if this scorer is BooleanScorer2 (but not BooleanScorer!), you can access
> the SpanQueryScorer in one of the legs and try to access the matched spans
> - if you are in 3.x you'll have a problem with disjunction queries.
>
> it seems challenging, doesn't it?
>
> On 18.01.2013 17:40, "Dmitry Kan" wrote:
>
> > Mikhail,
> >
> > Do you say, that it is not possible to access the matched terms positions
> > in the FacetComponent? If that would be possible (somewhere in the
> > StandardFacetsAccumulator class, where docids are available), then by
> > knowing the matched term positions I can do some school simple math to
> > calculate the sentence counts per doc id.
> >
> > Dmitry
> >
> > On Fri, Jan 18, 2013 at 2:45 PM, Mikhail Khludnev <
> > mkhlud...@griddynamics.com> wrote:
> >
> > > Dmitry,
> > >
> > > It definitely seems like postprocessing the highlighter's output. Another
> > > approach is:
> > > - limit number of occurrences of a word in a sentence to 1
> > > - play with facet by function patch
> > > https://issues.apache.org/jira/browse/SOLR-1581 accomplished by tf()
> > > function.
> > >
> > > It doesn't seem like much help.
> > >
> > > On Fri, Jan 18, 2013 at 12:42 PM, Dmitry Kan 
> > wrote:
> > >
> > > > that we actually require the count of the sentences inside
> > > > each document where the hits were found.
> > > >
> > >
> > >
> > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > > Principal Engineer,
> > > Grid Dynamics
> > >
> > > 
> > >  
> > >
> >
>


RE: Unable to load mysql Driver with Apache Solr-4.0.0

2013-01-21 Thread Shameer Thaha Koya
Hi Team,

I got this fixed. Accidentally there was a space in the driver property of 
dataConfig, which caused the issue. Thanks

driver=" com.mysql.jdbc.Driver"


Shameer Thaha
Emirates Group IT  |Dubai,
M : 971-552775725


From: Shameer Thaha Koya
Sent: 21 January 2013 15:53
To: 'solr-user@lucene.apache.org'
Cc: shameer...@gmail.com
Subject: Unable to load mysql Driver with Apache Solr-4.0.0

Hi Team,
I am trying to integrate Apache Solr (apache-solr-4.0.0) with Tomcat and am 
getting the error below.

15:12:40

SEVERE

DataImporter

Delta Import Failed

java.lang.RuntimeException: 
org.apache.solr.handler.dataimport.DataImportHandlerException: Could not load 
driver:  com.mysql.jdbc.Driver Processing Document # 1
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)


... 11 more

Caused by: org.apache.solr.common.SolrException: Error loading class ' 
com.mysql.jdbc.Driver'

at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:436)

at 
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:889)

... 12 more

Caused by: java.lang.ClassNotFoundException:  com.mysql.jdbc.Driver

at java.net.URLClassLoader$1.run(Unknown Source)





I tried the below options of adding mysql-connector-java-5.1.22-bin.jar, but 
none worked:



1. added as part of war (solr.war/WEB-INF/lib )

2. added as part of web server (Tomcat/lib)

3. Created a new directory lib under example/solr, placed the mysql 
connector jar (mysql-connector-java-5.1.22-bin.jar) there, and referred to 
the lib path from solr.xml.

  



Could you guide me on how to get the mysql driver loaded for Solr. Any help 
highly appreciated. Thanks




Shameer Thaha
Senior Developer
Emirates Group IT  |Dubai,
M : 971-552775725





Re: How to combine Qparsers in a plugin?

2013-01-21 Thread denl0
It's working but I have a new problem.

The first query I use with the TestQueryParser always works. But the second
one always gives the same result.

After restarting solr it works once again.

Do I need to clear a cache or something else before reusing my query parser?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-combine-Qparsers-in-a-plugin-tp4035011p4035051.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Large data importing getting rollback with solr

2013-01-21 Thread Upayavira
Do you have programming skills? If so, I'd suggest you write your own
importer that allows you to control precisely what it is you are trying
to do. The DIH, in my book, is a great generic tool for low to medium
complexity tasks. It very much appears you are pushing beyond its
limits, and it would make sense for you to have more control, by using
your own code to do the indexing.
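
The skeleton of such an importer is small. A rough SolrJ sketch (untested;
the JDBC URL, SQL query and field names are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class MyImporter {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8080/solr");
    Connection db = DriverManager.getConnection("jdbc:mysql://localhost/mydb",
        "user", "pass");
    Statement st = db.createStatement();
    ResultSet rs = st.executeQuery("SELECT id, title FROM docs");
    int count = 0;
    while (rs.next()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", rs.getString("id"));
      doc.addField("title", rs.getString("title"));
      solr.add(doc);
      // commit in chunks, so one failure doesn't roll back the whole import
      if (++count % 10000 == 0) solr.commit();
    }
    solr.commit();
    rs.close();
    st.close();
    db.close();
    solr.shutdown();
  }
}

This way you decide yourself where the batch boundaries, commits and retries
go.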

Upayavira

On Mon, Jan 21, 2013, at 11:36 AM, ashimbose wrote:
> Hi Gora,
> 
> Thank you for your suggestion.
> 
> I have tried with you below option,
> 
> > * Have never tried this, but one can set up multiple request handlers
> >   in solrconfig.xml for each DIH instance that one plans to run.
> >   These can run in parallel rather than the sequential indexing of
> >   root entities in a single DIH instance.
> 
> Here I used two data config
> 1. data_conf1.xml
> 2. data_conf2.xml
> 
> I have used them like below
> 
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
> 
>   data_conf1.xml
> 
>   
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
> 
>   data_conf2.xml
> 
>   
> 
> Now I have run them separately one after the other like below..
> 1. http://localhost:8080/solr/dataimport?command=full-import
> 2. http://localhost:8080/solr/dataimport1?command=full-import
> 
> Either one of them runs fine on its own. That is,
> if I run dataimport first, it indexes successfully, but if I then run
> dataimport1, it gives the error below:
> 
> Caused by: java.sql.SQLException: JBC0088E: JBC0002E: Socket timeout
> detected: Read timed out
> 
> Then I restarted my tomcat and ran dataimport1 first and then dataimport;
> the same problem occurs, only reversed:
> 
> dataimport1 runs perfectly but dataimport gives the timeout error.
> 
> Can anyone please help me, or suggest a different method to index large
> data?
> 
> Thanks
> Ashim
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Large-data-importing-getting-rollback-with-solr-tp4034075p4035009.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to combine Qparsers in a plugin?

2013-01-21 Thread Jack Krupansky
The Solr query results cache will ensure that if you perform the exact same 
query and the index has not changed, you will get the exact same results.


-- Jack Krupansky

-Original Message- 
From: denl0

Sent: Monday, January 21, 2013 8:36 AM
To: solr-user@lucene.apache.org
Subject: Re: How to combine Qparsers in a plugin?

It's working but I have a new problem.

The first query I use with the TestQueryParser always works. But the second
one always gives the same result.

After restarting solr it works once again.

Do I need to clear a cache/or something esle before reusing my query parser?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-combine-Qparsers-in-a-plugin-tp4035011p4035051.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Distance Range query

2013-01-21 Thread David Smiley (@MITRE.org)
Hi Stefano.

I answered someone's question on stackoverflow which is basically the same
question you have:
http://stackoverflow.com/questions/13723891/lucene-4-0-spatial-calculate-max-distance-dynamically-using-indexed-documets/13764793#13764793

Essentially, you should index circles and then search by your query point.
That technique requires Solr 4.  I recommend the very latest 4.1, about to be
released, for indexing non-point shapes, as there are some rare bugs in 4.0.

I think you could also do it with Solr 3 spatial with something like:
{!frange l=0 u=9}sub(distField,geodist(sfield,lat,lon)).  However this
is internally a brute-force algorithm vs. Solr 4 indexing shapes which uses
an index (fast!).
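
For example (an untested sketch; the field name "circle" and the km-to-degree
conversion are illustrative): with a Solr 4 RPT field
(solr.SpatialRecursivePrefixTreeFieldType, geo="true") you index each item's
own range as a circle, e.g. for Item A with its 30 km range:

  Circle(46.23211131,10.3213213131 d=0.270)

(d is the radius in degrees; 30 km is roughly 30/111.2 = 0.270 degrees), and
then query with just the point, x (longitude) first:

  fq=circle:"Intersects(10.32132132131 46.323231223)"

Any item whose indexed circle contains the point matches, which is exactly
the per-item range behavior asked for.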

~ David


stefanocorsi wrote
> Hello,
> 
> I have an index containing items with a range of distance in which I would
> like these items to be found when searched. Furthermore, in the same
> index, I have the position of the item. For example.
> 
> Item A - 46.23211131, 10.3213213131 - 30 km
> Item B - 45.23211131, 9.3213213131 - 50 km
> ...
> Item N - 46.32132132, 10.213211321 - 100 km
> 
> I would like to be able to make the following query:
> 
> given a certain point X (let's say 46.323231223, 10.32132132131) I would
> like to find all the items that are in the range of that point, according
> to the item's range: i.e. if an item has a 30Km range, it should find
> itself in 30Km range from point X, if an item has a 100Km range, it should
> find itself in a 100Km range from point X, and so on...
> 
> I hope I have explained it in a decent way...
> 
> Is there a way to do it? With solr 3? With solr 4?
> 
> Thank you for your help...
> 
> Stefano





-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Distance-Range-query-tp4034977p4035083.html
Sent from the Solr - User mailing list archive at Nabble.com.


Full import through DIH leaving documents as uncommited

2013-01-21 Thread vijeshnair
I am using SOLR 4.0, and using DIH to import and index the catalog data from
my MySQL database. The DIH took around 55 minutes to complete the indexing,
but the number of documents processed and the number of documents which
actually got indexed did not match. So when I checked the update handler
statistics, it showed me roughly six hundred thousand documents as pending.
Here are the update handler stats:

commits: 2130
autocommit maxTime: 15000ms
autocommits: 169
soft autocommit maxTime: 1000ms
soft autocommits: 1958
optimizes: 0
rollbacks: 0
expungeDeletes: 0
docsPending: 622009
adds: 0
deletesById: 0
deletesByQuery: 0
errors: 12
cumulative_adds: 12191849
cumulative_deletesById: 0
cumulative_deletesByQuery: 1
cumulative_errors: 0

Now when I tried to commit manually, i.e. /update?commit=true, it threw an
out-of-memory error. It says the writer hit an OutOfMemoryError, so it cannot
commit. Here is the stack:

java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot
commit
at 
org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2717)
at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2875)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2855)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:531)
at
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:87)
at
org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1007)
at
org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
at
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)


It would be a really great help if you could tell me a way to commit those
pending documents. Shall I restart the tomcat? Any help will be much
appreciated.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Full-import-through-DIH-leaving-documents-as-uncommited-tp4035084.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: SOLR 4 getting stuck during restart

2013-01-21 Thread Dyer, James
Are you trying to build the dictionary using a warming query?  I think I saw 
this happen once before a long time ago.  I think if you are issuing a warming 
query with "spellcheck.build=true", then you might also want to use 
"spellcheck.collate=false".  If this fixes it, could you open a bug report in 
JIRA with the "before" and "after" queries so we can try and fix this?

https://issues.apache.org/jira/browse/SOLR
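
For reference, the warming entry in solrconfig.xml might then look like this
(a sketch; the query text is only illustrative):

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">solr</str>
      <str name="spellcheck">true</str>
      <str name="spellcheck.build">true</str>
      <str name="spellcheck.collate">false</str>
    </lst>
  </arr>
</listener>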

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: vijeshnair [mailto:vijeshkn...@gmail.com] 
Sent: Saturday, January 19, 2013 4:03 AM
To: solr-user@lucene.apache.org
Subject: SOLR 4 getting stuck during restart

I have my index-based spell checker configured, and the select request
handlers are configured with collation, i.e. spellcheck.collate=true.

For my testing I indexed 2 million records and thereafter generated the
index-based dictionary (I am evaluating the DirectSpellChecker; I am seeing
that my memory consumption is higher when I use DirectSpellChecker). Now I
wanted to modify some threshold parameters in the config file for the new
spellchecker component. So when I restarted my tomcat, the restart is getting
stuck at the following line:

INFO: QuerySenderListener sending requests to Searcher@332b9f79
main{StandardDirectoryReader(segments_1f:281 _2x(4.0.0.2):C1653773)}

Any comments? Am I missing something, or is there some misconfiguration?
Please help.

My temporary workaround: I removed the index-based dictionary which was
created before, and restarted. I will regenerate the dictionary now.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-4-getting-stuck-during-restart-tp4034734.html
Sent from the Solr - User mailing list archive at Nabble.com.




curl with dynamic url not working

2013-01-21 Thread nishi
While running the following curl command, the url parameters were never
passed when invoking:

curl
http://127.0.0.1:8080/solr/newsRSS_DIH?command=full-import&amp;clean=false&amp;url=http://www.example.com/news

LOG details:
INFO: [collection1] webapp=/solr path=/newsRSS_DIH
params={command=full-import} status=0 QTime=11 
Jan 21, 2013 10:19:14 AM org.apache.solr.handler.dataimport.DataImporter
doFullImport


If I run the command in a browser without curl, it works as expected and
indexes properly. Please let me know any pointer to resolve this issue.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/curl-with-dynamic-url-not-working-tp4035092.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Tokenized keywords

2013-01-21 Thread Jack Krupansky
If debugQuery doesn't give you what you want, you can write a custom search 
component which runs after the QueryComponent and extracts the info you want 
from the generated query and then simply adds it to the response in any form 
that you want.
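
A bare-bones sketch of such a component (untested; the class and response key
names are illustrative):

import java.io.IOException;

import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

public class ParsedQueryComponent extends SearchComponent {
  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    // nothing to do before the query is parsed
  }

  @Override
  public void process(ResponseBuilder rb) throws IOException {
    // QueryComponent has already parsed/analyzed the query by now,
    // so echo the resulting terms back to the client
    if (rb.getQuery() != null) {
      rb.rsp.add("parsedQueryTerms", rb.getQuery().toString());
    }
  }

  @Override
  public String getDescription() {
    return "Adds the parsed query to the response";
  }

  @Override
  public String getSource() {
    return null;
  }
}

It would be registered with <searchComponent> in solrconfig.xml and appended
to your request handler via last-components.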


-- Jack Krupansky

-Original Message- 
From: Romita Saha

Sent: Monday, January 21, 2013 1:49 AM
To: solr-user@lucene.apache.org
Subject: Re: Tokenized keywords

What I am trying to achieve is as follows.

I query "Search for all the Laptops" and my tokenized key words are
"search laptop" (I apply stopword filter to filter out words like
for,all,the and i also user lowercase filter).
I want to display these tokenized keywords using debugQuery.

Thanks and regards,
Romita



From:   Dikchant Sahi 
To: solr-user@lucene.apache.org,
Date:   01/21/2013 02:26 PM
Subject:Re: Tokenized keywords



Can you please elaborate a bit more on what you are trying to achieve.

Tokenizers work on the indexed field and don't affect how the values will be
displayed. The response value comes from the stored field. If you want to see
how your query is being tokenized, you can do it using the analysis interface
or enable debugQuery to see how your query is being formed.


On Mon, Jan 21, 2013 at 11:06 AM, Romita Saha
wrote


Hi,

I use some tokenizers to tokenize the query. I want to see the tokenized
query words displayed in the response. Could you kindly help me do
that.


Thanks and regards,
Romita




RE: curl with dynamic url not working

2013-01-21 Thread Markus Jelsma
Note the HTML entities in the URL? They should not be there. Your URL is
interpreted as if amp;clean=false is a parameter.


 
-Original message-
> From:nishi 
> Sent: Mon 21-Jan-2013 16:39
> To: solr-user@lucene.apache.org
> Subject: curl with dynamic url not working
> 
> While running the following curl command, the url never went with params
> while invoking:
> 
> curl
> http://127.0.0.1:8080/solr/newsRSS_DIH?command=full-import&amp;clean=false&amp;url=http://www.example.com/news
> 
> LOG details:
> INFO: [collection1] webapp=/solr path=/newsRSS_DIH
> params={command=full-import} status=0 QTime=11 
> Jan 21, 2013 10:19:14 AM org.apache.solr.handler.dataimport.DataImporter
> doFullImport
> 
> 
> If I run the command in browser without curl, it works as expected and
> indexed properly. Please let me know any pointer to resolve this issue.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/curl-with-dynamic-url-not-working-tp4035092.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 


Re: How to combine Qparsers in a plugin?

2013-01-21 Thread Jack Krupansky
The test for equality of two queries is via the "equals" method of the
generated Query objects. Is there any chance that you might be generating
custom Query objects which may not have the equals method implemented to
properly return false?


-- Jack Krupansky

-Original Message- 
From: denl0

Sent: Monday, January 21, 2013 10:18 AM
To: solr-user@lucene.apache.org
Subject: Re: How to combine Qparsers in a plugin?

No, maybe I explained it wrong. It returns the same result even if the query
is different. I've tried to output the queryString and it seems to change.

The response values stay the same even if I know they are wrong.

Restarting solr always works 1 time.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-combine-Qparsers-in-a-plugin-tp4035011p4035085.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: curl with dynamic url not working

2013-01-21 Thread Rafał Kuć
Hello!

Try surrounding your URL with ' characters, so the whole command looks
like this:

curl
'http://127.0.0.1:8080/solr/newsRSS_DIH?command=full-import&clean=false&url=http://www.example.com/news'


-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

> While running the following curl command, the url never went with params
> while invoking:

> curl
> http://127.0.0.1:8080/solr/newsRSS_DIH?command=full-import&clean=false&url=http://www.example.com/news

> LOG details:
> INFO: [collection1] webapp=/solr path=/newsRSS_DIH
> params={command=full-import} status=0 QTime=11 
> Jan 21, 2013 10:19:14 AM
> org.apache.solr.handler.dataimport.DataImporter
> doFullImport


> If I run the command in browser without curl, it works as expected and
> indexed properly. Please let me know any pointer to resolve this issue.



> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/curl-with-dynamic-url-not-working-tp4035092.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Full import through DIH leaving documents as uncommited

2013-01-21 Thread vijeshnair
I saw the following in the IndexWriter Java doc page

NOTE: if you hit an OutOfMemoryError then IndexWriter will quietly record
this fact and block all future segment commits. This is a defensive measure
in case any internal state (buffered documents and deletions) were
corrupted. Any subsequent calls to commit() will throw an
IllegalStateException. The only course of action is to call close(), which
internally will call rollback(), to undo any changes to the index since the
last commit. You can also just call rollback() directly.

So am I left with only the option of restarting the server?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Full-import-through-DIH-leaving-documents-as-uncommited-tp4035084p4035097.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: curl with dynamic url not working

2013-01-21 Thread Jack Krupansky
You need to enclose the full URL and parameters in quotes - the & was 
interpreted by the command shell.


-- Jack Krupansky

-Original Message- 
From: nishi

Sent: Monday, January 21, 2013 10:31 AM
To: solr-user@lucene.apache.org
Subject: curl with dynamic url not working

While running the following curl command, the url never went with params
while invoking:

curl
http://127.0.0.1:8080/solr/newsRSS_DIH?command=full-import&clean=false&url=http://www.example.com/news

LOG details:
INFO: [collection1] webapp=/solr path=/newsRSS_DIH
params={command=full-import} status=0 QTime=11
Jan 21, 2013 10:19:14 AM org.apache.solr.handler.dataimport.DataImporter
doFullImport


If I run the command in browser without curl, it works as expected and
indexed properly. Please let me know any pointer to resolve this issue.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/curl-with-dynamic-url-not-working-tp4035092.html
Sent from the Solr - User mailing list archive at Nabble.com. 
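
As a side note, the same request can be issued from SolrJ, which sidesteps
shell quoting entirely. A minimal sketch, assuming SolrJ 4.x and the handler
path and feed URL from the example above:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class TriggerDih {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://127.0.0.1:8080/solr");
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("command", "full-import");
        params.set("clean", "false");
        // Parameters are sent as proper HTTP params, so the embedded URL
        // needs no shell escaping.
        params.set("url", "http://www.example.com/news");
        QueryRequest request = new QueryRequest(params);
        request.setPath("/newsRSS_DIH");
        request.process(server);
    }
}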



Scoring differences solr versions

2013-01-21 Thread roySolr
Hi,

I have a question about scoring in Solr 4. I ran the same query on two
versions of Solr (same indexed docs). The scoring debug output:

*SOLR4:*
3.3243241 = (MATCH) sum of: 0.20717455 = (MATCH) max plus 1.0 times others
of: 0.19920631 = (MATCH) weight(firstname_search:g^50.0 in 783453)
[DefaultSimilarity], result of: 0.19920631 = score(doc=783453,freq=1.0 =
termFreq=1.0 ), product of: 0.11625154 = queryWeight, product of: 50.0 =
boost 3.4271598 = idf(docFreq=195811, maxDocs=2217897) 6.784133E-4 =
queryNorm 1.7135799 = fieldWeight in 783453, product of: 1.0 = tf(freq=1.0),
with freq of: 1.0 = termFreq=1.0 3.4271598 = idf(docFreq=195811,
maxDocs=2217897) 0.5 = fieldNorm(doc=783453) 0.007968252 = (MATCH)
weight(name_first_letter:g in 783453) [DefaultSimilarity], result of:
0.007968252 = score(doc=783453,freq=1.0 = termFreq=1.0 ), product of:
0.0023250307 = queryWeight, product of: 3.4271598 = idf(docFreq=195811,
maxDocs=2217897) 6.784133E-4 = queryNorm 3.4271598 = fieldWeight in 783453,
product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 3.4271598 =
idf(docFreq=195811, maxDocs=2217897) 1.0 = fieldNorm(doc=783453) 3.1171496 =
(MATCH) max plus 1.0 times others of: 3.1171496 = (MATCH)
weight(lastname_search:aalbers^50.0 in 783453) [DefaultSimilarity], result
of: 3.1171496 = score(doc=783453,freq=1.0 = termFreq=1.0 ), product of:
0.3251704 = queryWeight, product of: 50.0 = boost 9.586204 =
idf(docFreq=413, maxDocs=2217897) 6.784133E-4 = queryNorm 9.586204 =
fieldWeight in 783453, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 =
termFreq=1.0 9.586204 = idf(docFreq=413, maxDocs=2217897) 1.0 =
fieldNorm(doc=783453)

*SOLR3.1:*
3.3741257 = (MATCH) sum of: 0.25697616 = (MATCH) max plus 1.0 times others
of: 0.2490079 = (MATCH) weight(firstname_search:g^50.0 in 1697008), product
of: 0.11625154 = queryWeight(firstname_search:g^50.0), product of: 50.0 =
boost 3.4271598 = idf(docFreq=195811, maxDocs=2217897) 6.784133E-4 =
queryNorm 2.141975 = (MATCH) fieldWeight(firstname_search:g in 1697008),
product of: 1.0 = tf(termFreq(firstname_search:g)=1) 3.4271598 =
idf(docFreq=195811, maxDocs=2217897) 0.625 =
fieldNorm(field=firstname_search, doc=1697008) 0.007968252 = (MATCH)
weight(name_first_letter:g in 1697008), product of: 0.0023250307 =
queryWeight(name_first_letter:g), product of: 3.4271598 =
idf(docFreq=195811, maxDocs=2217897) 6.784133E-4 = queryNorm 3.4271598 =
(MATCH) fieldWeight(name_first_letter:g in 1697008), product of: 1.0 =
tf(termFreq(name_first_letter:g)=1) 3.4271598 = idf(docFreq=195811,
maxDocs=2217897) 1.0 = fieldNorm(field=name_first_letter, doc=1697008)
3.1171496 = (MATCH) max plus 1.0 times others of: 3.1171496 = (MATCH)
weight(lastname_search:aalbers^50.0 in 1697008), product of: 0.3251704 =
queryWeight(lastname_search:aalbers^50.0), product of: 50.0 = boost 9.586204
= idf(docFreq=413, maxDocs=2217897) 6.784133E-4 = queryNorm 9.586204 =
(MATCH) fieldWeight(lastname_search:aalbers in 1697008), product of: 1.0 =
tf(termFreq(lastname_search:aalbers)=1) 9.586204 = idf(docFreq=413,
maxDocs=2217897) 1.0 = fieldNorm(field=lastname_search, doc=1697008)


What is the reason for the difference in scores? Is something really
different in how Solr 4 calculates scores?

Thanks
Roy



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Scoring-differences-solr-versions-tp4035106.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to combine Qparsers in a plugin?

2013-01-21 Thread denl0
I build my query like this:

    Query q = QParser.getParser(qstr, "edismax", req).getQuery();
    return JoinUtil.createJoinQuery("pageFileId", true, "fileId", q,
            searcher, ScoreMode.Max);

Is it possible to make the query never compare as equal?
Here's an updated version of my code:

import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.join.JoinUtil;
import org.apache.lucene.search.join.ScoreMode;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.ExtendedDismaxQParserPlugin;
import org.apache.solr.search.QParser;

public class TestParserPlugin extends ExtendedDismaxQParserPlugin {

    @Override
    public QParser createParser(String qstr, SolrParams localParams,
            SolrParams params, SolrQueryRequest req) {
        // Pass localParams and params through separately (an earlier
        // version passed the same SolrParams object for both).
        return new TestParser(qstr, localParams, params, req);
    }

    @Override
    public void init(NamedList nl) {
    }

    public class TestParser extends QParser {

        public TestParser(String qstr, SolrParams localParams,
                SolrParams params, SolrQueryRequest req) {
            super(qstr, localParams, params, req);
        }

        @Override
        public Query parse()
                throws org.apache.lucene.queryparser.classic.ParseException {
            try {
                IndexSearcher searcher = req.getSearcher();
                // Parse the user query with edismax, then wrap it in a
                // join from pageFileId to fileId.
                Query q = QParser.getParser(qstr, "edismax", req).getQuery();
                return JoinUtil.createJoinQuery("pageFileId", true, "fileId",
                        q, searcher, ScoreMode.Max);
            } catch (IOException ex) {
                Logger.getLogger(TestParserPlugin.class.getName())
                        .log(Level.SEVERE, null, ex);
            }
            return null;
        }
    }
}



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-combine-Qparsers-in-a-plugin-tp4035011p4035108.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: curl with dynamic url not working

2013-01-21 Thread nishi
Thanks for the advice, I followed all the pointers mentioned:

curl
'http://127.0.0.1:8080/solr/newsRSS_DIH?command=full-import&clean=false&commit=true&url=http://www.example.com/news'


Now, I got the following error:

Jan 21, 2013 10:53:02 AM org.apache.solr.core.SolrDeletionPolicy
updateCommits
INFO: newest commit = 95
Jan 21, 2013 10:53:02 AM org.apache.solr.handler.dataimport.SolrWriter
commit
SEVERE: Exception while solr commit.
org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1310)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1422)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1200)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:560)
at
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:87)
at
org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1007)
at
org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
at
org.apache.solr.handler.dataimport.SolrWriter.commit(SolrWriter.java:107)
at
org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:304)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:252)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter
is closed


2nd question:
Also, I am planning to put similar curl commands into individual .sh files
and invoke them through cron jobs for scheduling. Any idea how to handle
errors/exceptions in the .sh files, since some of the commands may fail due
to errors like the one above? Is this the right approach for a critical
scheduled indexing process for our business data?
Appreciate all your advice.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/curl-with-dynamic-url-not-working-tp4035092p4035111.html
Sent from the Solr - User mailing list archive at Nabble.com.


Enable Logging in the Example App

2013-01-21 Thread O. Olson
Hi,
 
    I am really new to Solr, and I have never used anything similar to it
before, so please pardon my ignorance. I downloaded Solr 4.0 from
http://lucene.apache.org/solr/downloads.html and started it using the
command line:

>java -jar start.jar

This generates a number of INFO log messages to the console that I would
like a better way to view.

    What is the best way to send these log messages to a file? I see a logs
directory, but it seems to be empty. I first tried adding a log4j.properties
file to the “etc” directory, as mentioned in
http://wiki.apache.org/solr/SolrLogging. I then started Solr from the
command line:

>java -jar start.jar -Dlog4j.configuration=file:etc/log4j.properties

This does not give me any log files. I would appreciate any ideas in this
regard, i.e. the easiest way to get logging working in the example app.
 
Thank you,
O. O.


Re: How to combine Qparsers in a plugin?

2013-01-21 Thread Jack Krupansky
You may need to implement your own Query class that wraps the generated 
query and it could always return false for equals.


-- Jack Krupansky

-Original Message- 
From: denl0

Sent: Monday, January 21, 2013 10:54 AM
To: solr-user@lucene.apache.org
Subject: Re: How to combine Qparsers in a plugin?

I build my query like this:

   Query q = QParser.getParser(qstr, "edismax",
req).getQuery();
   return JoinUtil.createJoinQuery("pageFileId", true,
"fileId", q, searcher, ScoreMode.Max);
Is it possible to set that the query never equals.
Here's an updated version of my code:

public class TestParserPlugin extends ExtendedDismaxQParserPlugin{



   @Override
   public QParser createParser(String string, SolrParams sp, SolrParams
sp1, SolrQueryRequest sqr) {

   return new TestParserPlugin.TestParser(string, sp1, sp1, sqr);

   }

   @Override
   public void init(NamedList nl) {

   }


   public class TestParser extends QParser {

   public TestParser(String qstr, SolrParams localParams, SolrParams
params, SolrQueryRequest req) {
   super(qstr, localParams, params, req);
   }

   @Override
   public org.apache.lucene.search.Query parse() throws
org.apache.lucene.queryparser.classic.ParseException {
   //IndexReader reader;
   try {
   //IndexSearcher searcher=req.getSearcher();
   IndexSearcher searcher = req.getSearcher();
   Query q = QParser.getParser(qstr, "edismax",
req).getQuery();
   return JoinUtil.createJoinQuery("pageFileId", true,
"fileId", q, searcher, ScoreMode.Max);
   } catch (IOException ex) {

Logger.getLogger(TestParserPlugin.class.getName()).log(Level.SEVERE, null,
ex);
   }

   return null;
   }

   }






}



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-combine-Qparsers-in-a-plugin-tp4035011p4035108.html
Sent from the Solr - User mailing list archive at Nabble.com. 



RE: Scoring differences solr versions

2013-01-21 Thread Markus Jelsma
Could you provide an indented format instead? This is hard to debug, but I
suspect it's the queryNorm.
 
-Original message-
> From:roySolr 
> Sent: Mon 21-Jan-2013 17:00
> To: solr-user@lucene.apache.org
> Subject: Scoring differences solr versions
> 
> Hi,
> 
> I have some question about the scoring in SOLR4. I have the same query on 2
> versions of SOLR(same indexed docs). The debug of the scoring:
> 
> *SOLR4:*
> 3.3243241 = (MATCH) sum of: 0.20717455 = (MATCH) max plus 1.0 times others
> of: 0.19920631 = (MATCH) weight(firstname_search:g^50.0 in 783453)
> [DefaultSimilarity], result of: 0.19920631 = score(doc=783453,freq=1.0 =
> termFreq=1.0 ), product of: 0.11625154 = queryWeight, product of: 50.0 =
> boost 3.4271598 = idf(docFreq=195811, maxDocs=2217897) 6.784133E-4 =
> queryNorm 1.7135799 = fieldWeight in 783453, product of: 1.0 = tf(freq=1.0),
> with freq of: 1.0 = termFreq=1.0 3.4271598 = idf(docFreq=195811,
> maxDocs=2217897) 0.5 = fieldNorm(doc=783453) 0.007968252 = (MATCH)
> weight(name_first_letter:g in 783453) [DefaultSimilarity], result of:
> 0.007968252 = score(doc=783453,freq=1.0 = termFreq=1.0 ), product of:
> 0.0023250307 = queryWeight, product of: 3.4271598 = idf(docFreq=195811,
> maxDocs=2217897) 6.784133E-4 = queryNorm 3.4271598 = fieldWeight in 783453,
> product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 3.4271598 =
> idf(docFreq=195811, maxDocs=2217897) 1.0 = fieldNorm(doc=783453) 3.1171496 =
> (MATCH) max plus 1.0 times others of: 3.1171496 = (MATCH)
> weight(lastname_search:aalbers^50.0 in 783453) [DefaultSimilarity], result
> of: 3.1171496 = score(doc=783453,freq=1.0 = termFreq=1.0 ), product of:
> 0.3251704 = queryWeight, product of: 50.0 = boost 9.586204 =
> idf(docFreq=413, maxDocs=2217897) 6.784133E-4 = queryNorm 9.586204 =
> fieldWeight in 783453, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 =
> termFreq=1.0 9.586204 = idf(docFreq=413, maxDocs=2217897) 1.0 =
> fieldNorm(doc=783453)
> 
> *SOLR3.1:*
> 3.3741257 = (MATCH) sum of: 0.25697616 = (MATCH) max plus 1.0 times others
> of: 0.2490079 = (MATCH) weight(firstname_search:g^50.0 in 1697008), product
> of: 0.11625154 = queryWeight(firstname_search:g^50.0), product of: 50.0 =
> boost 3.4271598 = idf(docFreq=195811, maxDocs=2217897) 6.784133E-4 =
> queryNorm 2.141975 = (MATCH) fieldWeight(firstname_search:g in 1697008),
> product of: 1.0 = tf(termFreq(firstname_search:g)=1) 3.4271598 =
> idf(docFreq=195811, maxDocs=2217897) 0.625 =
> fieldNorm(field=firstname_search, doc=1697008) 0.007968252 = (MATCH)
> weight(name_first_letter:g in 1697008), product of: 0.0023250307 =
> queryWeight(name_first_letter:g), product of: 3.4271598 =
> idf(docFreq=195811, maxDocs=2217897) 6.784133E-4 = queryNorm 3.4271598 =
> (MATCH) fieldWeight(name_first_letter:g in 1697008), product of: 1.0 =
> tf(termFreq(name_first_letter:g)=1) 3.4271598 = idf(docFreq=195811,
> maxDocs=2217897) 1.0 = fieldNorm(field=name_first_letter, doc=1697008)
> 3.1171496 = (MATCH) max plus 1.0 times others of: 3.1171496 = (MATCH)
> weight(lastname_search:aalbers^50.0 in 1697008), product of: 0.3251704 =
> queryWeight(lastname_search:aalbers^50.0), product of: 50.0 = boost 9.586204
> = idf(docFreq=413, maxDocs=2217897) 6.784133E-4 = queryNorm 9.586204 =
> (MATCH) fieldWeight(lastname_search:aalbers in 1697008), product of: 1.0 =
> tf(termFreq(lastname_search:aalbers)=1) 9.586204 = idf(docFreq=413,
> maxDocs=2217897) 1.0 = fieldNorm(field=lastname_search, doc=1697008)
> 
> 
> What is the reason for differences in score? Is there something really
> different in calculating scores in SOLR 4?
> 
> Thanks
> Roy
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Scoring-differences-solr-versions-tp4035106.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 


Re: curl with dynamic url not working

2013-01-21 Thread nishi
Thanks. The above issue is resolved by setting the following parameters in
solrconfig.xml:


   -
simple
true

1
6
6
   -



Any pointers on cron and scripting for robust exception handling in the
scheduler would still be helpful. Thanks.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/curl-with-dynamic-url-not-working-tp4035092p4035133.html
Sent from the Solr - User mailing list archive at Nabble.com.


AutoComplete with FilterQuery for Full content

2013-01-21 Thread Sujatha Arun
Hi,

I need suggestions on Solr autocomplete for full content with a filter query.

I have currently implemented this as below:


   1. Solr version 3.6.1
   2. solr.StandardTokenizerFactory
   3. EdgeNGramFilterFactory with maxGramSize="25" minGramSize="1"
   4. Stored the content field
   5. Use the FastVectorHighlighter and a break iterator on WORD to return
   results based on the standard analyzer with a fragsize of 20, using the fq
   param as required

This seems to provide snippets, but at times they look like junk and are not
really relevant, as they are pieces of a sentence with the search term in
them. For example, on searching "river" the suggestion might be "the river
and ...", which does not really make sense as a suggestion.

So, the other options:


   - facets support fq but cannot be used for full-content tokenized text
   due to performance issues


   1. Can we use a tool that extracts keywords/phrases from the full
   content, which can either be indexed or written to a DB, and use that to
   serve the autocomplete?
   2. Any other methods?
   3. Are there any open-source tools for keyword extraction? Sematext has
   a commercial tool for this.
   4. Which would be better for autocomplete in terms of speed/performance
   - DB or index?

Any pointers?

Regards,
Sujatha


java.io.FileNotFoundException: /var/solrdata/coreX/index.20121111165300/_3m.fnm

2013-01-21 Thread Uomesh
Hi, 

I am getting the below exception while loading my core. I can see a _3j.fnm
file in my directory, but it is looking for _3m.fnm. Could you please help me
figure out what is causing this issue?

http-8080-14 Jan-21 11:10:00-ERROR [org.apache.solr.core.SolrCore] -
org.apache.solr.common.SolrException: Error executing default implementation
of CREATE
at
com.solr.handler.admin.CustomCoreAdminHandler.handleCreateAction(CustomCoreAdminHandler.java:87)
at
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:115)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:318)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:187)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:600)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:480)
at
com.solr.handler.admin.CustomCoreAdminHandler.handleCreateAction(CustomCoreAdminHandler.java:82)
... 17 more
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException:
/var/solr/data/coreX/index.2012165300/_3m.fnm (No such file or
directory)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1104)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:585)
... 19 more
Caused by: java.io.FileNotFoundException:
/var/solr/data/coreX/index.2012165300/_3m.fnm (No such file or
directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at 
org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:218)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:345)
at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:71)
at
org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:80)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:116)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:94)
at 
org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:105)
at
org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:27)
at
org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:78)
at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:709)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:72)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:375)
at
org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:38)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1093)
... 20 more




--
View this message in context: 
http://lucene.472066.n3.nabble.com/java-io-FileNotFoundException-var-solrdata-coreX-index-2012165300-3m-fnm-tp4035146.html
Sent from the Solr - User mailing list archive at Nabble.com.


Delete all Documents in the Example (Solr 4.0)

2013-01-21 Thread O. Olson
Hi,
 
    I am
attempting to use the example-DIH that comes with the Solr 4.0 download. In
/example, I start Solr using: 
 
java -Dsolr.solr.home="./example-DIH/solr/" -jar
start.jar
 
After playing with it for a while, I decided to delete all
documents in the index. The FAQ at 
http://wiki.apache.org/solr/FAQ#How_can_I_delete_all_documents_from_my_index.3F 
seems to say that I needed to use: 
 
http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>
http://localhost:8983/solr/update?stream.body=<commit/>
 
I put the above urls in my browser, but I simply get 404’s. I
then tried: 
 
http://localhost:8983/solr/update 
 
and I got a 404 too. I then looked at
/example-DIH/solr/solr/conf/solrconfig.xml and it seems to have <requestHandler
name="/update" class="solr.UpdateRequestHandler" />.
 
I am confused why I am getting a 404 if /update has a
handler? 
 
Thank you for any ideas.
O. O.


Re: Delete all Documents in the Example (Solr 4.0)

2013-01-21 Thread Shawn Heisey

On 1/21/2013 11:27 AM, O. Olson wrote:

http://localhost:8983/solr/update

and I got a 404 too. I then looked at
/example-DIH/solr/solr/conf/solrconfig.xml and it seems to have <requestHandler
name="/update" class="solr.UpdateRequestHandler" />.

I am confused why I am getting a 404 if /update has a
handler?


You need to send the request to /solr/corename/update ... if you are 
using the solr example, most likely the core is named "collection1" so 
the URL would be /solr/collection1/update.


There is a lot of information out there that has not been updated since 
before multicore operation became the default in Solr examples.


The example does have defaultCoreName defined, but I still see lots of 
people that run into problems like this, so I suspect that it isn't 
always honored.


Thanks,
Shawn



Re: Delete all Documents in the Example (Solr 4.0)

2013-01-21 Thread Alexandre Rafalovitch
I just tested that and /update does not seem to honor the default core
value (same 404 issue). Is that a bug?

Regards,
   Alex.


Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Mon, Jan 21, 2013 at 1:35 PM, Shawn Heisey  wrote:

> On 1/21/2013 11:27 AM, O. Olson wrote:
>
>> http://localhost:8983/solr/update
>>
>> and I got a 404 too. I then looked at
>> /example-DIH/solr/solr/conf/solrconfig.xml and it seems to have
>> <requestHandler name="/update" class="solr.UpdateRequestHandler" />.
>>
>> I am confused why I am getting a 404 if /update has a
>> handler?
>>
>
> You need to send the request to /solr/corename/update ... if you are using
> the solr example, most likely the core is named "collection1" so the URL
> would be /solr/collection1/update.
>
> There is a lot of information out there that has not been updated since
> before multicore operation became the default in Solr examples.
>
> The example does have defaultCoreName defined, but I still see lots of
> people that run into problems like this, so I suspect that it isn't always
> honored.
>
> Thanks,
> Shawn
>
>


Re: Solr cache considerations

2013-01-21 Thread Erick Erickson
Hmm, interesting. I'll have to look closer...

On Sun, Jan 20, 2013 at 3:50 PM, Walter Underwood  wrote:
> I routinely see hit rates over 75% on the document cache. Perhaps yours is 
> too small. Mine is set at 10240 entries.
>
> wunder
>
> On Jan 20, 2013, at 8:08 AM, Erick Erickson wrote:
>
>> About your question about document cache: Typically the document cache
>> has a pretty low hit-ratio. I've rarely, if ever, seen it get hit very
>> often. And remember that this cache is only hit when assembling the
>> response for a few documents (your page size).
>>
>> Bottom line: I wouldn't worry about this cache much. It's quite useful
>> for processing a particular query faster, but not really intended for
>> cross-query use.
>>
>> Really, I think you're getting the cart before the horse here. Run it
>> up the flagpole and try it. Rely on the OS to do its job
>> (http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html).
>> Find  a bottleneck _then_ tune. Premature optimization and all
>> that
>>
>> Several tens of millions of docs isn't that large unless the text
>> fields are enormous.
>>
>> Best
>> Erick
>>
>> On Sat, Jan 19, 2013 at 2:32 PM, Isaac Hebsh  wrote:
>>> Ok. Thank you everyone for your helpful answers.
>>> I understand that fieldValueCache is not used for resolving queries.
>>> Is there any cache that can help this basic scenario (a lot of different
>>> queries, on a small set of fields)?
>>> Does Lucene's FieldCache help (implicitly)?
>>> How can I use RAM to reduce I/O in this type of queries?
>>>
>>>
>>> On Fri, Jan 18, 2013 at 4:09 PM, Tomás Fernández Löbbe <
>>> tomasflo...@gmail.com> wrote:
>>>
 No, the fieldValueCache is not used for resolving queries. Only for
 multi-token faceting and apparently for the stats component too. The
 document cache maintains in memory the stored content of the fields you are
 retrieving or highlighting on. It'll hit if the same document matches the
 query multiple times and the same fields are requested, but as Eirck said,
 it is important for cases when multiple components in the same request need
 to access the same data.

 I think soft committing every 10 minutes is totally fine, but you should
 hard commit more often if you are going to be using transaction log.
 openSearcher=false will essentially tell Solr not to open a new searcher
 after the (hard) commit, so you won't see the new indexed data and caches
 wont be flushed. openSearcher=false makes sense when you are using
 hard-commits together with soft-commits, as the "soft-commit" is dealing
 with opening/closing searchers, you don't need hard commits to do it.

 Tomás


 On Fri, Jan 18, 2013 at 2:20 AM, Isaac Hebsh 
 wrote:

> Unfortunately, it seems (
> http://lucene.472066.n3.nabble.com/Nrt-and-caching-td3993612.html) that
> these caches are not per-segment. In this case, I want to (soft) commit
> less frequently. Am I right?
>
> Tomás, as the fieldValueCache is very similar to lucene's FieldCache, I
> guess it has a big contribution to standard (not only faceted) queries
> time. SolrWiki claims that it primarily used by faceting. What that says
> about complex textual queries?
>
> documentCache:
> Erick, After a query processing is finished, doesn't some documents stay
 in
> the documentCache? can't I use it to accelerate queries that should
> retrieve stored fields of documents? In this case, a big documentCache
 can
> hold more documents..
>
> About commit frequency:
> HardCommit: "openSearch=false" seems as a nice solution. Where can I read
> about this? (found nothing but one unexplained sentence in SolrWiki).
> SoftCommit: In my case, the required index freshness is 10 minutes. The
> plan to soft commit every 10 minutes is similar to storing all of the
> documents in a queue (outside to Solr), an indexing a bulk every 10
> minutes.
>
> Thanks.
>
>
> On Fri, Jan 18, 2013 at 2:15 AM, Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
>
>> I think fieldValueCache is not per segment, only fieldCache is.
 However,
>> unless I'm missing something, this cache is only used for faceting on
>> multivalued fields
>>
>>
>> On Thu, Jan 17, 2013 at 8:58 PM, Erick Erickson <
 erickerick...@gmail.com
>>> wrote:
>>
>>> filterCache: This is bounded by 1M * (maxDoc) / 8 * (num filters in
>>> cache). Notice the /8. This reflects the fact that the filters are
>>> represented by a bitset on the _internal_ Lucene ID. UniqueId has no
>>> bearing here whatsoever. This is, in a nutshell, why warming is
>>> required, the internal Lucene IDs may change. Note also that it's
>>> maxDoc, the internal arrays have "holes" for deleted documents.
>>>
>>> Note this is an _upper_ bound, if there are only a few docs tha

When a URL is a component of a query string's data?

2013-01-21 Thread Jack Park
There exist in my Solr index documents (several, actually) which harbor
http:// URL values. Trying to find documents with a particular URL fails.

The query is like this:
ResourceURLPropertyType:http://someserver.org/something

It fails due to the second ":".

If I substitute %3a into that query, e.g.
ResourceURLPropertyType:http$3a//someserver.org/something
the query goes through and finds nothing.

A fork in the road?
Make it a policy to swap %3a into all URL values going to Solr, then
use the same format in search.
or
Find another way to get the query to work with the full URL,
untouched, in the index.

Googling this one has been difficult due to the ambiguity of "url" in
query strings.

Thoughts?

Many thanks in advance
Jack


Re: Spatial+Dataimport full import results in OutOfMemory for a rectangle defining a line

2013-01-21 Thread David Smiley (@MITRE.org)
Javier,

Your minX is slightly greater than maxX, which is interpreted as a line that
wraps nearly the entire globe.  Is that what you intended?

If this is what you intended, then you got bitten by this unfixed bug:
https://issues.apache.org/jira/browse/LUCENE-4550
As a work-around, you could split that horizontal line into two equal pieces
and index them as separate values for the document.

~ David


Javier Molina wrote
> Hi,
> 
> I have been struggling the past days trying to spot what was causing my
> solr 4.0 to go Out Of Memory when doing a dataimport full-import
> 
> After troubleshooting I found that the problem is not related with data
> volume but instead with one particular record in my DB.
> 
> The offending record has a location value of  147.3376750002 -42.88601
> 147.337675 -42.88601
> 
> As you can see, the min and max latitude have the same value, that is the
> rectangle is indeed a line.
> 
> The location field uses the default definition for field type
> location_rpt,
> extract from schema.xml shown below.
> 
> 
>  class="solr.SpatialRecursivePrefixTreeFieldType"^M
> geo="true" distErrPct="0.025" maxDistErr="0.09"
> units="degrees"
> />
> 
> We have configured the delta settings for dataimport to take a particular
> id from our db records, and what it is interesting is that if instead of
> doing a full-import I issue a delta-import the operation succeed. Note
> that
> in both cases I am just importing the same particular record, in the case
> of full-import I specify start and rows parameters, as seen in the log.
> 
> My understanding is that defining a rectangle as a line it is still a
> valid
> area (so to speak) so the operation should succeed, nevertheless if those
> values are not a valid there should be a validation in place in order to
> prevent the OutOfMemoryError.
> 
> 
> See log below for an example of the run leading to the Out Of Memory
> Error.
> 
> Thanks in advance for your feedback.
> 
> Regards,
> Javier
> 
> 
> 21/01/2013 12:00:38 PM org.apache.solr.core.SolrCore execute
> INFO: [coreDap] webapp=/solr path=/dataimport
> params={optimize=false&clean=true&commit=true&start=1995&verbose=true&command=full-import&rows=1}
> status=0 QTime=10
> 21/01/2013 12:00:38 PM org.apache.solr.handler.dataimport.DataImporter
> doFullImport
> INFO: Starting Full Import
> 21/01/2013 12:00:38 PM
> org.apache.solr.handler.dataimport.SimplePropertiesWriter
> readIndexerProperties
> INFO: Read dataimport.properties
> 21/01/2013 12:00:38 PM org.apache.solr.core.SolrCore execute
> INFO: [coreDap] webapp=/solr path=/dataimport params={command=status}
> status=0 QTime=1
> 21/01/2013 12:00:38 PM org.apache.solr.core.SolrDeletionPolicy onInit
> INFO: SolrDeletionPolicy.onInit: commits:num=1
> 
> commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/DMS_dev/r01/solr/coreDap/data/index
> lockFactory=org.apache.lucene.store.NativeFSLockFactory@616181be;
> maxCacheMB=48.0
> maxMergeSizeMB=4.0),segFN=segments_mq,generation=818,filenames=[_12y_Lucene40_0.tip,
> _12w_Lucene40_0.prx, _12w_nrm.cfe, _12z_Lucene40_0.tim,
> _12z_Lucene40_0.frq, _12z.si, _12y_Lucene40_0.prx, _12z_nrm.cfe,
> _12w_Lucene40_0.frq, _12w.fdx, segments_mq, _12y_nrm.cfs, _12w.fnm,
> _12w_Lucene40_0.tip, _12w.si, _12w_Lucene40_0.tim, _12w.fdt, _12z_nrm.cfs,
> _12z.fnm, _12y.fdx, _12y.fnm, _12z.fdx, _12z_Lucene40_0.prx, _12y.fdt,
> _12z.fdt, _12y_nrm.cfe, _12y.si, _12z_Lucene40_0.tip, _12y_Lucene40_0.frq,
> _12w_nrm.cfs, _12y_Lucene40_0.tim]
> 21/01/2013 12:00:38 PM org.apache.solr.core.SolrDeletionPolicy
> updateCommits
> INFO: newest commit = 818
> 21/01/2013 12:00:38 PM org.apache.solr.search.SolrIndexSearcher <init>
> INFO: Opening Searcher@70f4d063 realtime
> 21/01/2013 12:00:38 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Creating a connection for entity dataCollection with URL:
> jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=
> myserver.mydomain.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=
> myserver.mydomain.com)))
> 21/01/2013 12:00:39 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Time taken for getConnection(): 40
> 21/01/2013 12:00:39 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Creating a connection for entity topic with URL: jdbc:oracle:thin:@
> (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver.mydomain.com
> )(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myserver.mydomain.com)))
> 21/01/2013 12:00:39 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Time taken for getConnection(): 47
> 21/01/2013 12:00:39 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Creating a connection for entity usergroup with URL:
> jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=
> myserver.mydomain.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=
> myserver.mydomain.com)))
> 21/01/2013 12:00:39 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Time taken for getCo

Re: Using Solr Spatial in conjunction with HBASE/Hadoop

2013-01-21 Thread David Smiley (@MITRE.org)
What good is a key-value store in the context of oakstream's question?

~ David



-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-Solr-Spatial-in-conjunction-with-HBASE-Hadoop-tp4034307p4035164.html
Sent from the Solr - User mailing list archive at Nabble.com.


Issues with docFreq/docCount on SolrCloud

2013-01-21 Thread Markus Jelsma
Hi,

We have a few trunk clusters running with two replicas for each shard. We
sometimes see results jumping positions for identical queries. We've tracked
it down to differences in docFreq and docCount between the leader and the
replicas. The only way to force all cores in the shard to be consistent is to
optimize or forceMerge the segments.

Is there anyone here who can give advice on this issue? For obvious reasons we
don't want to optimize 50GB of data on some regular basis, but we do want to
make sure the variations in docFreq/docCount do not lead to results jumping
positions in the result set for identical queries.

Like most of you, we already have small issues due to the lack of distributed
IDF; having this problem as well makes SolrCloud less predictable and harder
to debug.

Thanks,
Markus


Re: When a URL is a component of a query string's data?

2013-01-21 Thread Jack Krupansky
The colons are probably okay. It is probably the slashes causing the 
problem. An embedded slash now terminates the preceding term and starts a 
regular expression term (that is terminated by a second slash).


Solution: quote each slash with a backslash.

   ResourceURLPropertyType:http:\/\/someserver.org\/something

Or, enclose the URL in quotes.

   ResourceURLPropertyType:"http://someserver.org/something";

-- Jack Krupansky

-Original Message- 
From: Jack Park

Sent: Monday, January 21, 2013 1:41 PM
To: solr-user@lucene.apache.org
Subject: When a URL is a component of a query string's data?

There exists in my Solr index a document (several, actually) which
harbor http:// URL values. Trying to find documents with a particular
URL fails.

The query is like this:
ResourceURLPropertyType:http://someserver.org/something

Fails due to the second ":"

If I substitute %3a into that query, e.g.
ResourceURLPropertyType:http$3a//someserver.org/something
the query goes through and finds nothing.

A fork in the road?
Make it a policy to swap %3a into all URL values going to Solr, then
use the same format in search.
or
Find another way to get the query to work with the full URL,
untouched, in the index.

Googling this one has been difficult due to the ambiguity of "url" in
query strings.

Thoughts?

Many thanks in advance
Jack 



Re: When a URL is a component of a query string's data?

2013-01-21 Thread Jack Park
At the admin console, surrounding the URL with "" worked fine.

Many thanks
Jack

On Mon, Jan 21, 2013 at 11:24 AM, Jack Krupansky
 wrote:
> The colons are probably okay. It is probably the slashes causing the
> problem. An embedded slash now terminates the preceding term and starts a
> regular expression term (that is terminated by a second slash).
>
> Solution: quote each slash with a backslash.
>
>ResourceURLPropertyType:http:\/\/someserver.org\/something
>
> Or, enclose the URL in quotes.
>
>ResourceURLPropertyType:"http://someserver.org/something";
>
> -- Jack Krupansky
>
> -Original Message- From: Jack Park
> Sent: Monday, January 21, 2013 1:41 PM
> To: solr-user@lucene.apache.org
> Subject: When a URL is a component of a query string's data?
>
>
> There exists in my Solr index a document (several, actually) which
> harbor http:// URL values. Trying to find documents with a particular
> URL fails.
>
> The query is like this:
> ResourceURLPropertyType:http://someserver.org/something
>
> Fails due to the second ":"
>
> If I substitute %3a into that query, e.g.
> ResourceURLPropertyType:http$3a//someserver.org/something
> the query goes through and finds nothing.
>
> A fork in the road?
> Make it a policy to swap %3a into all URL values going to Solr, then
> use the same format in search.
> or
> Find another way to get the query to work with the full URL,
> untouched, in the index.
>
> Googling this one has been difficult due to the ambiguity of "url" in
> query strings.
>
> Thoughts?
>
> Many thanks in advance
> Jack


Solr and Unicode characters in strings

2013-01-21 Thread Jack Park
Here is a situation I now experience:

What Solr has:
economist and thus â€¦@en
What was sent:
economist and thus …@en
where those are just snippets from what I sent up -- the ellipsis was
created by Carrot2, and what comes back when I fetch the document with
that passage.

There is a hint in the Solr FAQ that the server must support UTF-8;
it's not clear how to do that from HttpSolrServer.
Other hints from around the web suggest I should be using a different
field than type = "string"

I should point out that I am running these developmental tests on the
Solr 4 example build with my schema.xml.

My question is this: what simple, say, utility call would return the
text to its original?
(perhaps that's the wrong question...)

Many thanks in advance
Jack
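
For what it's worth, text mangled this way (UTF-8 bytes decoded as
Windows-1252 somewhere in the chain) can sometimes be repaired after the
fact. A heuristic sketch only -- the real fix is making sure UTF-8 is used
end-to-end:

import java.nio.charset.Charset;

public class MojibakeRepair {
    public static void main(String[] args) {
        String garbled = "economist and thus â€¦@en";
        // Re-encode with the wrong charset to recover the original UTF-8
        // bytes, then decode them correctly. Only works if no bytes were
        // lost along the way.
        String repaired = new String(
                garbled.getBytes(Charset.forName("windows-1252")),
                Charset.forName("UTF-8"));
        System.out.println(repaired); // "economist and thus …@en"
    }
}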


Re: How to combine Qparsers in a plugin?

2013-01-21 Thread Jack Krupansky
So? I mean, so what if a Query object cannot be cast to some other random 
object type. The whole point is that a class extending "Query" can be used 
where any "Query" object can be used.


You need to add the code to your Query-derived class to create it using an 
existing query object and pass along all the required methods via 
delegation.
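
A minimal sketch of such a wrapper, assuming the Lucene 4.0 Query API (the
class name is made up); depending on your Lucene version you may need to
delegate more methods, e.g. extractTerms() or boost handling:

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Weight;

// Wraps any Query and never compares equal to another instance, so the
// query result cache can never hand back a stale entry for it.
public class NeverEqualQuery extends Query {
    private final Query wrapped;

    public NeverEqualQuery(Query wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public Weight createWeight(IndexSearcher searcher) throws IOException {
        return wrapped.createWeight(searcher); // delegate scoring
    }

    @Override
    public Query rewrite(IndexReader reader) throws IOException {
        Query rewritten = wrapped.rewrite(reader);
        return rewritten == wrapped ? this : new NeverEqualQuery(rewritten);
    }

    @Override
    public String toString(String field) {
        return wrapped.toString(field);
    }

    @Override
    public boolean equals(Object o) {
        return this == o; // identity only: cache lookups never match
    }

    @Override
    public int hashCode() {
        return System.identityHashCode(this);
    }
}

In parse() you would then return new
NeverEqualQuery(JoinUtil.createJoinQuery(...)).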


-- Jack Krupansky

-Original Message- 
From: denl0

Sent: Monday, January 21, 2013 11:48 AM
To: solr-user@lucene.apache.org
Subject: Re: How to combine Qparsers in a plugin?

Aw Extending doesn't seem to work.

I tried this:

   public abstract class QObject extends org.apache.lucene.search.Query{

   @Override
   public boolean equals(Object o) {
   return false;
   }


   }

But I get an error:
TermsIncludingScoreQuery cannot be cast to
lucenejar.TestParserPlugin$QObject (java.lang.ClassCastException)

The Query class doesn't have an interface.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-combine-Qparsers-in-a-plugin-tp4035011p4035131.html
Sent from the Solr - User mailing list archive at Nabble.com. 



error initializing QueryElevationComponent

2013-01-21 Thread eShard
Hi,
I'm trying to test out the queryelevationcomponent.
elevate.xml is referenced in solrconfig.xml and it is in the conf directory.
I left the defaults.
I added this to the elevate.xml

 
  <doc id="https://opentextdev/cs/llisapi.dll?func=ll&objID=577575&objAction=download" />
 


id is a string setup as the uniquekey

And I get this error:
16:25:48SEVERE  Config  Exception during parsing file:
elevate.xml:org.xml.sax.SAXParseException; systemId: solrres:/elevate.xml;
lineNumber: 28; columnNumber: 77; The reference to entity "objID" must end
with the ';' delimiter.
16:25:48SEVERE  SolrCorejava.lang.NullPointerException
16:25:48SEVERE  CoreContainer   Unable to create core: Lisa
16:25:48SEVERE  CoreContainer   
null:org.apache.solr.common.SolrException:
Error initializing QueryElevationComponent. 

What am I doing wrong?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/error-initializing-QueryElevationComponent-tp4035194.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Enable Logging in the Example App

2013-01-21 Thread Ahmet Arslan
Hi Olson,

java -Djava.util.logging.config.file=etc/logging.properties -jar start.jar
should do the trick. There is information about this in README.txt.


--- On Mon, 1/21/13, O. Olson  wrote:

> From: O. Olson 
> Subject: Enable Logging in the Example App
> To: "solr-user@lucene.apache.org" 
> Date: Monday, January 21, 2013, 6:02 PM
> Hi,
>  
>     I am really
> new to Solr, and I have never used anything similar to it
> before. So please
> pardon my ignorance. I downloaded  Solr
> 4.0 from http://lucene.apache.org/solr/downloads.html and start
> it using the commandline: 
>  
> >java -jar start.jar  
>  
> This generates a number of INFO log messages to the
> console,
> that I would like to better view. 
>  
>     What is the
> best way to send these log messages to a file? I see a logs
> directory, but it
> seems to be empty. I first tried to add the log4j.properties
> in the “etc”
> directory as mentioned in http://wiki.apache.org/solr/SolrLogging.
> I then started solr on the commandline: 
>  
> >java -jar start.jar
> -Dlog4j.configuration=file:etc/log4j.properties
>  
> This does not give me any log files. I would appreciate any
> ideas in this regard i.e. the easiest way to get logging
> into the example app.
>  
> Thank you,
> O. O.
>


Re: Spatial+Dataimport full import results in OutOfMemory for a rectangle defining a line

2013-01-21 Thread Javier Molina
Thanks for your reply, David.

After doing more testing I found that overlapping latitudes or longitudes
were not the issue, as you point out.

The values presented are extreme but they are correct; our solution should
allow a user to define a box on a map, and a box crossing the 180th meridian
is also valid.

I know those are very extreme (and rare) scenarios but technically still
valid ones.

I understand your workaround and it seems feasible, but unfortunately I
don't see any example on
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 on how to index a
multivalued field.

Is there any other documentation that I am missing? If not, could you please
shed some light on the syntax to index two boxes (rectangles) on a field?

With regards to search, will the Intersects function behave as an OR, that
is, match any rectangle in my example on that multivalued field that
intersects the given area?

Thanks,
Javier
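
For illustration, a sketch of David's two-piece workaround with SolrJ,
assuming a multiValued location_rpt field named "location" and the
four-number "minX minY maxX maxY" rectangle syntax (the values split the
offending record at the dateline):

import org.apache.solr.common.SolrInputDocument;

public class SplitDatelineRect {
    public static SolrInputDocument build() {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-with-split-line");
        // Eastern piece: from minX up to the dateline.
        doc.addField("location", "147.3376750002 -42.88601 180.0 -42.88601");
        // Western piece: from the dateline over to maxX.
        doc.addField("location", "-180.0 -42.88601 147.337675 -42.88601");
        return doc;
    }
}

As far as I know, Intersects on a multivalued field matches a document if any
of its indexed shapes intersects the query shape, which is the OR behaviour
asked about above.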





On 22 January 2013 05:43, David Smiley (@MITRE.org) wrote:

> Javier,
>
> Your minX is slightly greater than maxX, which is interpreted as a line
> that
> wraps nearly the entire globe.  Is that what you intended?
>
> If this is what you intended, then you got bitten by this unfixed bug:
> https://issues.apache.org/jira/browse/LUCENE-4550
> As a work-around, you could split that horizontal line into two equal
> pieces
> and index them as separate values for the document.
>
> ~ David
>
>
> Javier Molina wrote
> > Hi,
> >
> > I have been struggling the past days trying to spot what was causing my
> > solr 4.0 to go Out Of Memory when doing a dataimport full-import
> >
> > After troubleshooting I found that the problem is not related with data
> > volume but instead with one particular record in my DB.
> >
> > The offending record has a location value of  147.3376750002
> -42.88601
> > 147.337675 -42.88601
> >
> > As you can see, the min and max latitude have the same value, that is the
> > rectangle is indeed a line.
> >
> > The location field uses the default definition for field type
> > location_rpt,
> > extract from schema.xml shown below.
> >
> >
> >  > class="solr.SpatialRecursivePrefixTreeFieldType"^M
> > geo="true" distErrPct="0.025" maxDistErr="0.09"
> > units="degrees"
> > />
> >
> > We have configured the delta settings for dataimport to take a particular
> > id from our db records, and what it is interesting is that if instead of
> > doing a full-import I issue a delta-import the operation succeed. Note
> > that
> > in both cases I am just importing the same particular record, in the case
> > of full-import I specify start and rows parameters, as seen in the log.
> >
> > My understanding is that defining a rectangle as a line it is still a
> > valid
> > area (so to speak) so the operation should succeed, nevertheless if those
> > values are not a valid there should be a validation in place in order to
> > prevent the OutOfMemoryError.
> >
> >
> > See log below for an example of the run leading to the Out Of Memory
> > Error.
> >
> > Thanks in advance for your feedback.
> >
> > Regards,
> > Javier
> >
> >
> > 21/01/2013 12:00:38 PM org.apache.solr.core.SolrCore execute
> > INFO: [coreDap] webapp=/solr path=/dataimport
> >
> params={optimize=false&clean=true&commit=true&start=1995&verbose=true&command=full-import&rows=1}
> > status=0 QTime=10
> > 21/01/2013 12:00:38 PM org.apache.solr.handler.dataimport.DataImporter
> > doFullImport
> > INFO: Starting Full Import
> > 21/01/2013 12:00:38 PM
> > org.apache.solr.handler.dataimport.SimplePropertiesWriter
> > readIndexerProperties
> > INFO: Read dataimport.properties
> > 21/01/2013 12:00:38 PM org.apache.solr.core.SolrCore execute
> > INFO: [coreDap] webapp=/solr path=/dataimport params={command=status}
> > status=0 QTime=1
> > 21/01/2013 12:00:38 PM org.apache.solr.core.SolrDeletionPolicy onInit
> > INFO: SolrDeletionPolicy.onInit: commits:num=1
> >
> > commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@
> /DMS_dev/r01/solr/coreDap/data/index
> > lockFactory=org.apache.lucene.store.NativeFSLockFactory@616181be;
> > maxCacheMB=48.0
> >
> maxMergeSizeMB=4.0),segFN=segments_mq,generation=818,filenames=[_12y_Lucene40_0.tip,
> > _12w_Lucene40_0.prx, _12w_nrm.cfe, _12z_Lucene40_0.tim,
> > _12z_Lucene40_0.frq, _12z.si, _12y_Lucene40_0.prx, _12z_nrm.cfe,
> > _12w_Lucene40_0.frq, _12w.fdx, segments_mq, _12y_nrm.cfs, _12w.fnm,
> > _12w_Lucene40_0.tip, _12w.si, _12w_Lucene40_0.tim, _12w.fdt,
> _12z_nrm.cfs,
> > _12z.fnm, _12y.fdx, _12y.fnm, _12z.fdx, _12z_Lucene40_0.prx, _12y.fdt,
> > _12z.fdt, _12y_nrm.cfe, _12y.si, _12z_Lucene40_0.tip,
> _12y_Lucene40_0.frq,
> > _12w_nrm.cfs, _12y_Lucene40_0.tim]
> > 21/01/2013 12:00:38 PM org.apache.solr.core.SolrDeletionPolicy
> > updateCommits
> > INFO: newest commit = 818
> > 21/01/2013 12:00:38 PM org.apache.solr.search.SolrIndexSearcher <init>
> > INFO: Opening Searcher@70f4d063 realtime
> > 21/01/2013 12:00:38 PM
> org.apache.solr.handler.dataimport.JdbcDataSource$1
> > call
> > I

SolrException: Error loading class 'org.apache.solr.response.transform.EditorialMarkerFactory'

2013-01-21 Thread eShard
Hi,
This is related to my earlier question regarding the elevationcomponent.
I tried turning this on:
 If you are using the QueryElevationComponent, you may wish to mark
documents that get boosted.  The
  EditorialMarkerFactory will do exactly that: 
 --> 
 

but it fails to load this class.

I'm using solr 4.0 final.
How do I get this to load?

thanks,




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrException-Error-loading-class-org-apache-solr-response-transform-EditorialMarkerFactory-tp4035203.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Delete all Documents in the Example (Solr 4.0)

2013-01-21 Thread O. Olson




- Original Message -
From: Shawn Heisey 
To: solr-user@lucene.apache.org
Cc: 
Sent: Monday, January 21, 2013 12:35
Subject: Re: Delete all Documents in the Example (Solr 4.0)

>On 1/21/2013 11:27 AM, O. Olson wrote:
>> http://localhost:8983/solr/update
>> 
>> and I got a 404 too. I then looked at
>> /example-DIH/solr/solr/conf/solrconfig.xml and it seems to have
>> <requestHandler name="/update" class="solr.UpdateRequestHandler" />.
>> 
>> I am confused why I am getting a 404 if /update has a
>> handler?

>You need to send the request to /solr/corename/update ... if you are using the 
>solr example, most likely the core is named "collection1" so the URL would be 
>/solr/collection1/update.
>
>There is a lot of information out there that has not been updated since before 
>multicore operation became the default in Solr examples.
>
>The example does have defaultCoreName defined, but I still see lots of people 
>that run into problems like this, so I suspect that it isn't always honored.
>
>Thanks,
>Shawn
---

Thank you Shawn for the hint. Can someone tell me how to
figure out the corename?
 
http://localhost:8983/solr/collection1/update
 
did not seem to work for me. I then saw that /example/example-DIH/solr/db
had a conf and a data directory, so I assumed it to be a core. I then tried:
 
http://localhost:8983/solr/db/update?stream.body=<delete><query>*:*</query></delete>
http://localhost:8983/solr/db/update?stream.body=<commit/>
 
which worked for me, i.e. the documents in the index got deleted.
 
Thanks again,
O. O.
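
The same cleanup can also be done from SolrJ. A minimal sketch, assuming
SolrJ 4.x and the "db" core from above:

import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class DeleteAll {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server =
                new HttpSolrServer("http://localhost:8983/solr/db");
        server.deleteByQuery("*:*"); // remove every document in the core
        server.commit();             // make the deletion visible
    }
}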


Re: Enable Logging in the Example App

2013-01-21 Thread O. Olson
Thank you Ahmet. This worked perfectly.
O. O.


- Original Message -
From: Ahmet Arslan 
To: solr-user@lucene.apache.org; O. Olson 
Cc: 
Sent: Monday, January 21, 2013 15:44
Subject: Re: Enable Logging in the Example App

Hi Olson,

java -Djava.util.logging.config.file=etc/logging.properties -jar start.jar
should do the trick. There is an information about this in README.txt


--- On Mon, 1/21/13, O. Olson  wrote:

> From: O. Olson 
> Subject: Enable Logging in the Example App
> To: "solr-user@lucene.apache.org" 
> Date: Monday, January 21, 2013, 6:02 PM
> Hi,
>  
>     I am really
> new to Solr, and I have never used anything similar to it
> before. So please
> pardon my ignorance. I downloaded  Solr
> 4.0 from http://lucene.apache.org/solr/downloads.html and start
> it using the commandline: 
>  
> >java -jar start.jar  
>  
> This generates a number of INFO log messages to the
> console,
> that I would like to better view. 
>  
>     What is the
> best way to send these log messages to a file? I see a logs
> directory, but it
> seems to be empty. I first tried to add the log4j.properties
> in the “etc”
> directory as mentioned in http://wiki.apache.org/solr/SolrLogging.
> I then started solr on the commandline: 
>  
> >java -jar start.jar
> -Dlog4j.configuration=file:etc/log4j.properties
>  
> This does not give me any log files. I would appreciate any
> ideas in this regard i.e. the easiest way to get logging
> into the example app.
>  
> Thank you,
> O. O.
>



Re: Delete all Documents in the Example (Solr 4.0)

2013-01-21 Thread Erick Erickson
Try the admin page (note, this doesn't need a core, .../solr should
take you there). The cores should be listed on the left

Best
Erick

On Mon, Jan 21, 2013 at 6:09 PM, O. Olson  wrote:
>
>
>
>
> - Original Message -
> From: Shawn Heisey 
> To: solr-user@lucene.apache.org
> Cc:
> Sent: Monday, January 21, 2013 12:35
> Subject: Re: Delete all Documents in the Example (Solr 4.0)
>
>>On 1/21/2013 11:27 AM, O. Olson wrote:
>>> http://localhost:8983/solr/update
>>>
>>> and I got a 404 too. I then looked at
>>> /example-DIH/solr/solr/conf/solrconfig.xml and it seems to have
>>> <requestHandler name="/update" class="solr.UpdateRequestHandler" />.
>>>
>>> I am confused why I am getting a 404 if /update has a
>>> handler?
>
>>You need to send the request to /solr/corename/update ... if you are using 
>>the solr example, most likely the core is named "collection1" so the URL 
>>would be /solr/collection1/update.
>>
>>There is a lot of information out there that has not been updated since 
>>before multicore operation became the default in Solr examples.
>>
>>The example does have defaultCoreName defined, but I still see lots of people 
>>that run into problems like this, so I suspect that it isn't always honored.
>>
>>Thanks,
>>Shawn
> ---
>
> Thank you Shawn for the hint. Can someone tell me how to
> figure out the corename?
>
> http://localhost:8983/solr/collection1/update
>
> did not seem to work for me. I then saw that /example/example-DIH/solr/db
> had a conf and data directory, so I assumed it to be core. I then tried
>
> http://localhost:8983/solr/db/update?stream.body=<delete><query>*:*</query></delete>
> http://localhost:8983/solr/db/update?stream.body=<commit/>
>
> which worked for me i.e. the documents in the index got
> deleted.
>
> Thanks again,
> O. O.


Re: Using Solr Spatial in conjunction with HBASE/Hadoop

2013-01-21 Thread oakstream
David,
I appreciate your time.  I'm going to take a crack at the Lucene sharded
index approach and will let you know how I fare.  Thanks again



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-Solr-Spatial-in-conjunction-with-HBASE-Hadoop-tp4034307p4035211.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: java.io.FileNotFoundException: /var/solrdata/coreX/index.20121111165300/_3m.fnm

2013-01-21 Thread Erick Erickson
What have you done to get to this state? This is definitely not
normal, but seems to indicate that you've somehow removed files from
your index externally to Solr.

what version of Solr?

On Mon, Jan 21, 2013 at 12:25 PM, Uomesh  wrote:
> Hi,
>
> I am getting below exception while loading my core. I can see a _3j.fnm file
> in my directory but it looking for _3m.fnm. Could you please help me what is
> causing this issue?
>
> http-8080-14 Jan-21 11:10:00-ERROR [org.apache.solr.core.SolrCore] -
> org.apache.solr.common.SolrException: Error executing default implementation
> of CREATE
> at
> com.solr.handler.admin.CustomCoreAdminHandler.handleCreateAction(CustomCoreAdminHandler.java:87)
> at
> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:115)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at
> org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:318)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:187)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
> at
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
> at 
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.solr.common.SolrException
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:600)
> at org.apache.solr.core.CoreContainer.create(CoreContainer.java:480)
> at
> com.solr.handler.admin.CustomCoreAdminHandler.handleCreateAction(CustomCoreAdminHandler.java:82)
> ... 17 more
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException:
> /var/solr/data/coreX/index.2012165300/_3m.fnm (No such file or
> directory)
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1104)
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:585)
> ... 19 more
> Caused by: java.io.FileNotFoundException:
> /var/solr/data/coreX/index.20121111165300/_3m.fnm (No such file or
> directory)
> at java.io.RandomAccessFile.open(Native Method)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
> at 
> org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:218)
> at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:345)
> at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:71)
> at
> org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:80)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:116)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:94)
> at 
> org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:105)
> at
> org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:27)
> at
> org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:78)
> at
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:709)
> at 
> org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:72)
> at org.apache.lucene.index.IndexReader.open(IndexReader.java:375)
> at
> org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:38)
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1093)
> ... 20 more
>
>
>
>


Problem querying collection in Solr 4.1

2013-01-21 Thread Brett Hoerner
I have a collection in Solr 4.1 RC1, and a simple query like
text:"puppy dog" causes an exception. Oddly enough, I CAN query for
text:puppy or text:"puppy", but adding the space breaks everything.

Schema and config: https://gist.github.com/f49da15e39e5609b75b1

This happens whether I query the whole collection or a single direct core.
I haven't tested whether this would happen outside of SolrCloud.

http://localhost:8984/solr/timeline/select?q=text%3A%22puppy+dog%22&wt=xml

http://localhost:8984/solr/timeline_shard4_replica1/select?q=text%3A%22puppy+dog%22&wt=xml

Jan 22, 2013 12:07:24 AM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers
available to handle this request:[
http://timelinesearch-1d.i.massrel.com:8983/solr/timeline_shard2_replica1,
http://timelinesearch-2d.i.massrel.com:8983/solr/timeline_shard1_replica2,
http://timelinesearch-2d.i.massrel.com:8983/solr/timeline_shard3_replica2,
http://timelinesearch-1d.i.massrel.com:8983/solr/timeline_shard4_replica1,
http://timelinesearch-1d.i.massrel.com:8983/solr/timeline_shard1_replica1,
http://timelinesearch-2d.i.massrel.com:8983/solr/timeline_shard2_replica2,
http://timelinesearch-1d.i.massrel.com:8983/solr/timeline_shard3_replica1,
http://timelinesearch-2d.i.massrel.com:8983/solr/timeline_shard4_replica2]
 at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:302)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
 at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
 at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
 at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
 at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
 at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
 at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
 at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
 at org.eclipse.jetty.server.Server.handle(Server.java:365)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
 at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
 at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
 at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
 at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
 at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.solr.client.solrj.SolrServerException: No live
SolrServers available to handle this request:[
http://timelinesearch-1d.i.massrel.com:8983/solr/timeline_shard2_replica1,
http://timelinesearch-2d.i.massrel.com:8983/solr/timeline_shard1_replica2,
http://timelinesearch-2d.i.massrel.com:8983/solr/timeline_shard3_replica2,
http://timelinesearch-1d.i.massrel.com:8983/solr/timeline_shard4_replica1,
http://timelinesearch-1d.i.massrel.com:8983/solr/timeline_shard1_replica1,
http://timelinesearch-2d.i.massrel.com:8983/solr/timeline_shard2_replica2,
http://timelinesearch-1d.i.massrel.com:8983/solr/timeline_shard3_replica1,
http://timelinesearch-2d.i.massrel.com:8983/solr/timeline_shard4_replica2]
 at
org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:325)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:171)
 at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:135)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.

bf, nested queries and local params

2013-01-21 Thread Anatoli Matuskova
I'm trying to achieve this:

Having this query:
q=table&bf=product(scale({!type=dismax qf=description,color v='with
color'},0,1),price)

And q (set in solrconfig.xml) uses defType=dismax and qf=title,description.

I'm trying to query for "table" and influence the score by taking the product
of the price with the score of the query "with color" on the description and
color fields, also using defType dismax, with that score scaled from 0 to 1.

However, I'm getting this error (HTTP 400):

undefined field: "qf"


So how could I nest a query (with defType=dismax) inside the first parameter
of the scale function?





Re: bf, nested queries and local params

2013-01-21 Thread Jack Krupansky

Wrap the query argument for the scale function with the query function:

q=table&bf=product(scale(query({!type=dismax qf=description,color v='with 
color'}),0,1),price)


-- Jack Krupansky




Re: bf, nested queries and local params

2013-01-21 Thread Anatoli Matuskova
Doing this:

> Wrap the query argument for the scale function with the query function:
>
> q=table&bf=product(scale(query({!type=dismax qf=description,color v='with
> color'}),0,1),price)

I'm still getting the same error. Something might be wrong in how I'm
writing the nested query, but I can't figure out what.





Re: bf, nested queries and local params

2013-01-21 Thread Jack Krupansky
The value for "bf" is a list of function queries, each separated by white 
space, so maybe the space after "dismax" is confusing the bf value parser.


Post your exact test URL.
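
Also, one way to keep literal whitespace out of the bf value entirely (an
untested sketch using parameter dereferencing) would be:

q=table&bf=product(scale(query($qq),0,1),price)&qq={!dismax qf='description color'}with color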

-- Jack Krupansky




Re: bf, nested queries and local params

2013-01-21 Thread Jack Krupansky
I think the space between "with" and "color" may be messing up the bf arg 
parser. Try a single term in the query, just to see if that's it.


-- Jack Krupansky

-Original Message- 
From: Anatoli Matuskova

Sent: Monday, January 21, 2013 8:39 PM
To: solr-user@lucene.apache.org
Subject: Re: bf, nested queries and local params

Getting closer:
q=table&bf=product(scale(query({!v='with color'}),0,1),100)

I was expecting to get the score from query({!v='with color'}) so the scale
would look like (this is just an example in a document):

scale(8.3343252,0,1)

But it's not working like that; I'm getting this error (HTTP 500):



org.apache.lucene.search.TermQuery$TermWeight cannot be cast to
org.apache.lucene.queries.function.valuesource.ScaleFloatFunction$ScaleInfo


java.lang.ClassCastException: org.apache.lucene.search.TermQuery$TermWeight
cannot be cast to
org.apache.lucene.queries.function.valuesource.ScaleFloatFunction$ScaleInfo
at
org.apache.lucene.queries.function.valuesource.ScaleFloatFunction.getValues(ScaleFloatFunction.java:103)
at
org.apache.lucene.queries.function.valuesource.MultiFloatFunction.getValues(MultiFloatFunction.java:65)
at
org.apache.lucene.queries.function.FunctionQuery$AllScorer.<init>(FunctionQuery.java:120)
at
org.apache.lucene.queries.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:95)
at
org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:318)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:589) at
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280) at
org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1536)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1265)
at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:385)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:419)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1555) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
at
org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
at
org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
at
org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
at
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
at java.lang.Thread.run(Thread.java:662)




Any help?






Re: Spatial+Dataimport full import results in OutOfMemory for a rectangle defining a line

2013-01-21 Thread David Smiley (@MITRE.org)
Javier,

I didn't point out anything about overlapping latitudes or longitudes.  I
pointed out that your rectangle is extremely wide.  It's 359.98
degrees wide out of a maximum possibility of 360 even.  That's wide!  Whether
it crosses the dateline or not doesn't trigger the bug; it's triggered by the
width being > 180 degrees, and it is most severe the wider it is.  You probably
wouldn't have noticed this problem if the rect was "only" 300 degrees wide
or even a bit wider.

There's nothing special about indexing a multi-valued field with spatial;
it's the same as any other Solr multi-valued field.  To split rectangles that
have a width > 180, you could write a DIH Transformer or, similarly, a Solr
UpdateRequestProcessor.  Here's an example I just did:

import java.util.Arrays;

import com.spatial4j.core.context.SpatialContext;
import com.spatial4j.core.shape.Rectangle;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.processor.FieldMutatingUpdateProcessorFactory;
import org.apache.solr.update.processor.FieldValueMutatingUpdateProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessor;

public class SplitRectURPFactory extends FieldMutatingUpdateProcessorFactory {

  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req,
                                            SolrQueryResponse rsp,
                                            UpdateRequestProcessor next) {
    return new FieldValueMutatingUpdateProcessor(getSelector(), next) {
      @Override
      protected Object mutateValue(Object src) {
        SpatialContext ctx = SpatialContext.GEO;
        Rectangle rectangle = (Rectangle) ctx.readShape((String) src);
        if (rectangle.getWidth() > 180) {
          double minX = rectangle.getMinX();
          double midX = minX + rectangle.getWidth() / 2;
          if (midX > 180)
            midX -= 360; // wrap past the dateline back into [-180,180]
          double maxX = rectangle.getMaxX();
          // split into two halves that share the midpoint longitude
          Rectangle rect1 = ctx.makeRectangle(minX, midX,
              rectangle.getMinY(), rectangle.getMaxY());
          Rectangle rect2 = ctx.makeRectangle(midX, maxX,
              rectangle.getMinY(), rectangle.getMaxY());
          src = Arrays.asList(rect1, rect2); // two values for the field
        }
        return src;
      }
    };
  }
}
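
To use it, the factory would (hypothetically) be registered in an update
chain in solrconfig.xml; the package and field name here are placeholders:

<updateRequestProcessorChain name="split-rects">
  <processor class="com.example.SplitRectURPFactory">
    <str name="fieldName">geo_rect</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>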

Of course ideally I should just go ahead and fix this bug ;-)
~ David


Javier Molina wrote
> Thanks for your reply David.
> 
> After doing more testing I found that overlapping latitudes or longitudes
> were not the issue, as you pointed out.
> 
> The values presented are extreme, but they are correct; our solution should
> allow a user to define a box on a map, and a box crossing the 180th meridian
> is also valid.
> 
> I know those are very extreme (and rare) scenarios but technically still
> valid ones.
> 
> I understand your workaround and it seems feasible, but unfortunately I don't
> see any example on
> http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 on how to
> index a multivalued field.
> 
> Is there any other documentation that I am missing? If not could you
> please
> shed some light on the syntax to index to boxes (rectangles) on a field?
> 
> With regards to search, will the Intersects operation behave as an OR,
> that is, match any rectangle in my example on that multivalued field that
> intersects with the given area?
> 
> Thanks,
> Javier
> 
> 
> 
> 
> 
> On 22 January 2013 05:43, David Smiley (@MITRE.org) wrote:
> 
>> Javier,
>>
>> Your minX is slightly greater than maxX, which is interpreted as a line
>> that
>> wraps nearly the entire globe.  Is that what you intended?
>>
>> If this is what you intended, then you got bitten by this unfixed bug:
>> https://issues.apache.org/jira/browse/LUCENE-4550
>> As a work-around, you could split that horizontal line into two equal
>> pieces
>> and index them as separate values for the document.
>>
>> ~ David
>>





-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book


how to avoid DataImportHandler from interpreting "tinyint(1) unsigned" value as "Boolean" value?

2013-01-21 Thread nanyang cai
Hi,

DIH is really handy. But I found it interprets "tinyint(1) unsigned" values
as Boolean values. In my case, we have a column 'status tinyint(1)' that
can hold the values 0, 1, 2 or 3.

My configuration is as below:
db-data-config.xml :



schema.xml:

 

But the search results after indexing show:

false

I think the values 1, 2 and 3 are all read as true; this is not what I want.

I've googled, and this is not a new problem. One solution is to change the
column type to 'Tinyint(2)', but this is not allowed for some reason.

I hope I've missed some other way; could you point it out to me? Thank you!
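
One thing I did come across but have not verified is the MySQL Connector/J
connection property tinyInt1isBit; would setting it to false on the DIH
dataSource URL be the right way to avoid the Boolean mapping? Something
like this (host, database and credentials are placeholders):

<dataSource driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb?tinyInt1isBit=false"
            user="dbuser" password="dbpass"/>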


Re: Problem querying collection in Solr 4.1

2013-01-21 Thread Gopal Patwa
One thing I noticed in the solrconfig.xml is that it is set to use the
Lucene 4.0 index format, but you mention you are using 4.1:

  <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
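
If the index really is 4.1, presumably that should be bumped to:

  <luceneMatchVersion>LUCENE_41</luceneMatchVersion>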




Re: Spatial+Dataimport full import results in OutOfMemory for a rectangle defining a line

2013-01-21 Thread Javier Molina
Re: overlapping latitudes/longitudes, I think it was a mixup of sentences.

At the end you pointed out where the problem was.

After doing more testing I see the issue depends not only on the longitudes
but is also affected by the latitudes.

For example, this very wide rectangle will cause an OutOfMemoryError:

-180 3 180 3.016668

while this one, slightly taller, will work fine:

-180 3 180 3.5

But you probably already know that.

Re: multivalued fields, I have never used one before, so I think I will
just jump to the general documentation and also follow the code you posted.

Thanks,
Javier






Re: bf, nested queries and local params

2013-01-21 Thread Jack Krupansky

Seems that way. Sounds like it needs some deeper troubleshooting.

Ah... try this: wrap the query function with a "product" function that 
simply multiplies by 1 to force the conversion:


q=table&bf=product(scale(product(query({!v='color'}),1),0,1),100)

Even if that does work, sounds like a bug.

-- Jack Krupansky

-Original Message- 
From: Anatoli Matuskova

Sent: Monday, January 21, 2013 9:10 PM
To: solr-user@lucene.apache.org
Subject: Re: bf, nested queries and local params

q=table&bf=product(scale(query({!v='color'}),0,1),100)

Gives me back the same exception. Once the query query({!v='color'}) is
executed, is scale() supposed to use the score of the nested query as its
first parameter? That was my assumption, but it seems not to be true:

java.lang.ClassCastException: org.apache.lucene.search.TermQuery$TermWeight
cannot be cast to
org.apache.lucene.queries.function.valuesource.ScaleFloatFunction$ScaleInfo

Seems like it is returning a TermWeight?






AutoComplete with FilterQuery for Full content

2013-01-21 Thread Sujatha Arun
Hi,

I need suggestions on Solr autocomplete for full content with a filter query.

I have currently implemented this as below (a sketch of the field type is
shown after the list):


   1. Solr version 3.6.1
   2. solr.StandardTokenizerFactory
   3. EdgeNGramFilterFactory with maxGramSize="25" minGramSize="1"
   4. Stored the content field
   5. Used the FastVectorHighlighter with a break iterator on WORD to return
      results based on the standard analyzer, with a fragsize of 20, using
      the fq param as required
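
A rough sketch of the field type behind items 2 and 3 (the type name is
mine, not the exact schema):

<fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>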

This seems to provide snippets, but at times they look like junk and are not
really relevant, as they are fragments of sentences with the search term in
them. For example, on searching for "river" the suggestion might be "the
river and ...", which does not really make sense as a suggestion.

So, considering the other options:

   - facets support fq, but cannot be used for full-content tokenized text
     due to performance issues


   1. Can we use a tool that just extracts keywords/phrases from the full
      content, so that these can either be indexed or written to a DB and
      used to serve the autocomplete?
   2. Any other methods?
   3. Are there any open-source tools for keyword extraction? Sematext has a
      commercial tool for this.
   4. Which would be better for autocomplete in terms of speed/performance
      - a DB or the index?

Any pointers?

Regards,
Sujatha


Re: AutoComplete with FilterQuery for Full content

2013-01-21 Thread Jack Krupansky
It's not clear what your question or problem is. Try explaining it in simple 
English first. Autocomplete is fairly simple - no need for the complexity of 
an ngram filter.


Here's an example of a suggester component and request handler based on a 
simple text field:



 
<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">name</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler"
                name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
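
With that in place (host and port assume the stock example setup), a
request such as

http://localhost:8983/solr/suggest?q=riv

should return up to five suggestions drawn from the "name" field.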


-- Jack Krupansky




Re: AutoComplete with FilterQuery for Full content

2013-01-21 Thread Sujatha Arun
Hi Jack,

I need to filter the suggestions based on some other fields, and the
Suggester approach you describe does not allow me to do that.

Hence, at present we have only two options for a suggest implementation
with filters:

1. Facets
2. N-grams

as mentioned on this site:
http://www.searchworkings.org/blog/-/blogs/420845/maximized

What I described in my first message is the n-grams approach.
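
For the facet route, a filtered prefix request might look something like
this (the field and filter names are only placeholders):

http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=content_terms&facet.prefix=riv&fq=type:book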

Regards
Sujatha



How distributed queries works?

2013-01-21 Thread SuoNayi
Dear list,
I want to know the internal mechanism of SolrCloud's distributed queries.
AFAIK, distributed queries were supported before SolrCloud: users can
specify shard URLs in the query parameters. We can distribute data by time
interval in that case; is this what is called horizontal scalability based
on history?
Now SolrCloud goes further, because it can discover the other shards (Solr
instances) via ZooKeeper and distributes data based on a hash and modulo of
the unique key of the doc.
In both cases, the requested Solr instance needs to scatter queries across
the shards and gather the results at the end. This process seems like
MapReduce.
But what happens during the scattering and gathering? I have read the wiki,
but no more details are available. I really hope someone can clarify this
for me and give some links.


Suppose there are 3 shards and 0 replicas in my SolrCloud, and each shard
has 150 million docs. My client queries with q=*:* and outputs the results
page by page. When the page number is very large, say the 400th page, does
Solr need to load all the docs into RAM to calculate score and order?
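
To make the numbers concrete (assuming the default 10 rows per page), the
400th page would be a request like

q=*:*&start=3990&rows=10

Does each of the 3 shards then have to rank its top start+rows = 4000
matches before the requesting instance can merge them?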


Sorry for the newbie question, and thanks for your time.




Thanks
SuoNayi