Re: Boosting documents by categorical preferences

2013-11-20 Thread Amit Nithian
I thought about that, but my concern/question was how. If I use the pow
function then I'm still boosting the bad categories by a small
amount... Alternatively I could multiply by a negative number, but does that
work as expected?

I haven't done much with negative boosting except for the sledgehammer
approach of category exclusion through filters.
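
For concreteness, here's the sort of multiplicative boost I have in mind
(edismax; the category names and factors are made up):

  q=cameras&defType=edismax
  &boost=product(if(termfreq(category,'electronics'),1.8,1),
                 if(termfreq(category,'toys'),0.4,1))

Since the boost is multiplicative, a factor between 0 and 1 would demote a
disliked category without going negative, if I understand it right.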

Thanks
Amit
On Nov 19, 2013 8:51 AM, "Chris Hostetter"  wrote:

> : My approach was something like:
> : 1) Look at the categories that the user has preferred and compute the
> : z-score
> : 2) Pick the top 3 among those
> : 3) Use those to boost search results.
>
> I think that totally makes sense ... the additional bit I was suggesting
> that you consider is that instead of picking the "highest" 3 z-scores,
> pick the z-scores with the greatest absolute value ... that way if someone
> is a very boring person and their "positive interests" are all basically
> exactly the same as the mean for everyone else, but they have some very
> strong "dis-interests", you don't bother boosting on those minuscule
> interests and instead you negatively boost on the things they are
> antagonistic against.
>
>
> -Hoss
>


DataImportHandler on multi core - limiting concurrent runs on more than N cores

2013-11-20 Thread Patrice Monroe Pustavrh
Hi,
I currently run Solr with 10 cores. It works fine for me, until "I" try
to run updates on too many cores at once (each core uses more than enough CPU and
memory, so the machine becomes really slow). I've googled around and tried to
find whether there is an option in Solr to prevent too many simultaneous data
imports at once, and so far I have found nothing. Is there any way to limit (and
possibly enqueue) too many dataimports at once on a multi-core setup?

P.S. I am aware one can run only one dataimport per core, but this is not
the issue here.

Regards
Patrice Monroe Pustavrh



DocValues usage and scenarios?

2013-11-20 Thread Floyd Wu
Hi there,

I don't fully understand what kinds of usage DocValues is meant for. What
are some examples of where it should be used?

When I set docValues=true on a field, do I need to change anything in the XML
that I send to Solr for indexing?

Please point me in the right direction.

Thanks

Floyd

PS: I've googled and read lots of DocValues discussions but I'm still confused.


Re: DocValues usage and scenarios?

2013-11-20 Thread Yago Riveiro
Hi Floyd, 

DocValues are useful for sorting and faceting, for example.

You don't need to change anything in your XML documents; the only thing you need
to do is set docValues=true in your field definition in the schema.

If you don't want to use the default implementation (all loaded in the heap), you
need to add the tag <codecFactory class="solr.SchemaCodecFactory"/> in the
solrconfig.xml and docValuesFormat=true on the fieldType definition.
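
A minimal sketch, with illustrative field and type names (note that in 4.x
docValuesFormat actually takes a format name such as "Disk" rather than a
boolean):

In schema.xml:
<field name="category" type="string" indexed="true" stored="true" docValues="true"/>
<fieldType name="string" class="solr.StrField" docValuesFormat="Disk"/>

In solrconfig.xml (only needed for the non-default, on-disk implementation):
<codecFactory class="solr.SchemaCodecFactory"/>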

-- 
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Wednesday, November 20, 2013 at 9:38 AM, Floyd Wu wrote:

> Hi there,
> 
> I don't fully understand what kinds of usage DocValues is meant for. What
> are some examples of where it should be used?
> 
> When I set docValues=true on a field, do I need to change anything in the
> XML that I send to Solr for indexing?
> 
> Please point me in the right direction.
> 
> Thanks
> 
> Floyd
> 
> PS: I've googled and read lots of DocValues discussions but I'm still
> confused.



Re: DocValues usage and scenarios?

2013-11-20 Thread Floyd Wu
Hi Yago

Thanks for your reply. I once thought that the DocValues feature was one for me
to store some extra values.

May I summarize that DocValues is a feature that "speeds up" sorting and
faceting?

Floyd



2013/11/20 Yago Riveiro 

> Hi Floyd,
>
> DocValues are useful for sorting and faceting, for example.
>
> You don't need to change anything in your XML documents; the only thing you
> need to do is set docValues=true in your field definition in the schema.
>
> If you don't want to use the default implementation (all loaded in the heap),
> you need to add the tag <codecFactory class="solr.SchemaCodecFactory"/> in
> the solrconfig.xml and docValuesFormat=true on the fieldType definition.
>
> --
> Yago Riveiro
> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
>
>
> On Wednesday, November 20, 2013 at 9:38 AM, Floyd Wu wrote:
>
> > Hi there,
> >
> > I don't fully understand what kinds of usage DocValues is meant for. What
> > are some examples of where it should be used?
> >
> > When I set docValues=true on a field, do I need to change anything in the
> > XML that I send to Solr for indexing?
> >
> > Please point me in the right direction.
> >
> > Thanks
> >
> > Floyd
> >
> > PS: I've googled and read lots of DocValues discussions but I'm still
> > confused.
>
>


Re: DocValues usage and scenarios?

2013-11-20 Thread Yago Riveiro
You should understand DocValues as a feature that allows you to do sorting and
faceting without blowing the heap.

They are not necessarily faster than the traditional method; they are more memory
efficient, and in huge indexes that is the main limitation.

This post summarizes the DocValues feature and its main goals:
http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/

-- 
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Wednesday, November 20, 2013 at 10:15 AM, Floyd Wu wrote:

> Hi Yago
> 
> Thanks for your reply. I once thought that the DocValues feature was one for
> me to store some extra values.
> 
> May I summarize that DocValues is a feature that "speeds up" sorting and
> faceting?
> 
> Floyd
> 
> 
> 
2013/11/20 Yago Riveiro <yago.rive...@gmail.com>
> 
> > Hi Floyd,
> > 
> > DocValues are useful for sorting and faceting, for example.
> > 
> > You don't need to change anything in your XML documents; the only thing you
> > need to do is set docValues=true in your field definition in the schema.
> > 
> > If you don't want to use the default implementation (all loaded in the heap),
> > you need to add the tag <codecFactory class="solr.SchemaCodecFactory"/> in
> > the solrconfig.xml and docValuesFormat=true on the fieldType definition.
> > 
> > --
> > Yago Riveiro
> > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > 
> > 
> > On Wednesday, November 20, 2013 at 9:38 AM, Floyd Wu wrote:
> > 
> > > Hi there,
> > > 
> > > I don't fully understand what kinds of usage DocValues is meant for. What
> > > are some examples of where it should be used?
> > > 
> > > When I set docValues=true on a field, do I need to change anything in the
> > > XML that I send to Solr for indexing?
> > > 
> > > Please point me in the right direction.
> > > 
> > > Thanks
> > > 
> > > Floyd
> > > 
> > > PS: I've googled and read lots of DocValues discussions but I'm still
> > > confused.



Re: How to index X™ as &#8482; (HTML decimal entity)

2013-11-20 Thread Uwe Reh
What about having a simple charfilter in the analyzer chain for
indexing *and* searching? E.g.
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="™"
 replacement="&amp;#8482;" />

or
<charFilter class="solr.MappingCharFilterFactory"
 mapping="mapping-specials.txt" />


Uwe

Am 19.11.2013 23:46, schrieb Developer:

I have a data coming in to SOLR as below.

X™ - Black

I need to store the HTML Entity (decimal) equivalent value (i.e. &#8482;)
in SOLR rather than storing the original value.

Is there a way to do this?





Re: DocValues usage and scenarios?

2013-11-20 Thread Floyd Wu
Thanks Yago,

I've read this article
http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/
But I don't understand it well.
I'll try to figure out the missing part. Thanks for helping.

Floyd




2013/11/20 Yago Riveiro 

> You should understand DocValues as a feature that allows you to do
> sorting and faceting without blowing the heap.
>
> They are not necessarily faster than the traditional method; they are more
> memory efficient, and in huge indexes that is the main limitation.
>
> This post summarizes the DocValues feature and its main goals:
> http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/
>
> --
> Yago Riveiro
> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
>
>
> On Wednesday, November 20, 2013 at 10:15 AM, Floyd Wu wrote:
>
> > Hi Yago
> >
> > Thanks for your reply. I once thought that the DocValues feature was one
> > for me to store some extra values.
> >
> > May I summarize that DocValues is a feature that "speeds up" sorting and
> > faceting?
> >
> > Floyd
> >
> >
> >
> > 2013/11/20 Yago Riveiro <yago.rive...@gmail.com>
> >
> > > Hi Floyd,
> > >
> > > DocValues are useful for sorting and faceting, for example.
> > >
> > > You don't need to change anything in your XML documents; the only thing
> > > you need to do is set docValues=true in your field definition in the
> > > schema.
> > >
> > > If you don't want to use the default implementation (all loaded in the
> > > heap), you need to add the tag <codecFactory class="solr.SchemaCodecFactory"/>
> > > in the solrconfig.xml and docValuesFormat=true on the fieldType
> > > definition.
> > >
> > > --
> > > Yago Riveiro
> > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > >
> > >
> > > On Wednesday, November 20, 2013 at 9:38 AM, Floyd Wu wrote:
> > >
> > > > Hi there,
> > > >
> > > > I don't fully understand what kinds of usage DocValues is meant for.
> > > > What are some examples of where it should be used?
> > > >
> > > > When I set docValues=true on a field, do I need to change anything in
> > > > the XML that I send to Solr for indexing?
> > > >
> > > > Please point me in the right direction.
> > > >
> > > > Thanks
> > > >
> > > > Floyd
> > > >
> > > > PS: I've googled and read lots of DocValues discussions but I'm still
> > > > confused.
>
>


Re: Error with Solr 4.4.0, Glassfish, and CentOS 6.2

2013-11-20 Thread Ericvb
Hi,
We had the same issue.
As mentioned, we added
-Djavax.net.ssl.keyStorePassword=changeit
but we also had to add
-Djavax.net.ssl.trustStorePassword=changeit
That did it for us.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-with-Solr-4-4-0-Glassfish-and-CentOS-6-2-tp4089211p4102090.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Shalin Shekhar Mangar
No, it is not supported yet. We can't split to a remote node directly.
The best bet is to trigger a new leader election by unloading the leader
node once all replicas are active.
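
(The unload would go through the CoreAdmin API; a hypothetical example, with
the host and core names made up:

http://host1:8983/solr/admin/cores?action=UNLOAD&core=collection1_shard1_replica1

After the unload, one of the remaining active replicas is elected leader.)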

On Wed, Nov 20, 2013 at 1:32 AM, Otis Gospodnetic
 wrote:
> Hi,
>
> Is it possible to perform a shard split and stream data for the
> new/sub-shards to remote nodes, avoiding persistence of new/sub-shards
> on the local/source node first?
>
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/



-- 
Regards,
Shalin Shekhar Mangar.


Re: DataImportHandler on multi core - limiting concurrent runs on more than N cores

2013-11-20 Thread Shalin Shekhar Mangar
No, there is no synchronisation between data import handlers on
different cores. You will have to implement this sort of queuing logic
on your application's side.
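
A minimal sketch of such queuing logic in Java (the host, core names, and the
one-import-at-a-time policy are all assumptions; the "busy" check just greps
DIH's status response):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

// Runs full-imports one core at a time by polling each DIH's status command.
public class SerialDihRunner {

  static String get(String path) throws Exception {
    URL url = new URL("http://localhost:8983/solr" + path);  // host is an assumption
    StringBuilder sb = new StringBuilder();
    try (BufferedReader in =
        new BufferedReader(new InputStreamReader(url.openStream()))) {
      for (String line; (line = in.readLine()) != null; ) sb.append(line);
    }
    return sb.toString();
  }

  public static void main(String[] args) throws Exception {
    String[] cores = {"core0", "core1", "core2"};  // hypothetical core names
    for (String core : cores) {
      get("/" + core + "/dataimport?command=full-import");  // kick off the import
      // Poll until this core's DIH no longer reports "busy", so that
      // only one import runs at a time.
      while (get("/" + core + "/dataimport?command=status").contains("busy")) {
        Thread.sleep(5000);
      }
    }
  }
}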

On Wed, Nov 20, 2013 at 2:23 PM, Patrice Monroe Pustavrh
 wrote:
> Hi,
> I currently run Solr with 10 cores. It works fine for me, until "I" try
> to run updates on too many cores at once (each core uses more than enough CPU
> and memory, so the machine becomes really slow). I've googled around and tried
> to find whether there is an option in Solr to prevent too many simultaneous
> data imports at once, and so far I have found nothing. Is there any way to
> limit (and possibly enqueue) too many dataimports at once on a multi-core setup?
>
> P.S. I am aware one can run only one dataimport per core, but this is not
> the issue here.
>
> Regards
> Patrice Monroe Pustavrh
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: What is the difference between "attorney:(Roger Miller)" and "attorney:Roger Miller"

2013-11-20 Thread Erick Erickson
&debug=query is your friend!
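
For example (assuming a lowercasing analyzer, OR as the default operator, and
a default search field named "text"), the three forms parse roughly as:

  attorney:(Roger Miller)   ->  attorney:roger attorney:miller
  attorney:"Roger Miller"   ->  PhraseQuery(attorney:"roger miller")
  attorney:Roger Miller     ->  attorney:roger text:miller

which is why the result counts differ.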


On Tue, Nov 19, 2013 at 4:17 PM, Rafał Kuć  wrote:

> Hello!
>
> Terms surrounded by " characters will be treated as a phrase query. So,
> if your default query operator is OR, then attorney:(Roger Miller) will
> match documents with the first or second (or both) terms in the
> attorney field. attorney:"Roger Miller" will match only
> documents that have the phrase Roger Miller in the attorney field.
>
> You may want to look at Lucene query syntax to understand all the
> differences:
> http://lucene.apache.org/core/4_5_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description
>
> --
> Regards,
>  Rafał Kuć
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> > Also, attorney:(Roger Miller) is the same as attorney:"Roger Miller",
> > right? Or is the term "Roger Miller" run against attorney?
>
> > Thanks,
> > -Utkarsh
>
>
> > On Tue, Nov 19, 2013 at 12:42 PM, Rafał Kuć  wrote:
>
> >> Hello!
> >>
> >> In the first one, the two terms 'Roger' and 'Miller' are run against
> >> the attorney field. In the second the 'Roger' term is run against the
> >> attorney field and the 'Miller' term is run against the default search
> >> field.
> >>
> >> --
> >> Regards,
> >>  Rafał Kuć
> >> Performance Monitoring * Log Analytics * Search Analytics
> >> Solr & Elasticsearch Support * http://sematext.com/
> >>
> >>
> >> > We got different results for these two queries. The first one returned
> >> 115
> >> > records and the second returns 179 records.
> >>
> >> > Thanks,
> >>
> >> > Fudong
> >>
> >>
>
>
>


Re: {!cache=false} for regular queries?

2013-11-20 Thread Erick Erickson
But I don't know whether it's worth worrying about. queryResultCache is
pretty small.

I think of it as a map where the key is the text of the query and the value
is an int[queryWindowSize]. So bypassing the cache is probably not going to
make much difference.

YMMV of course.

Best
Erick


On Wed, Nov 20, 2013 at 12:44 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> It should bypass cache for sure
>
> https://github.com/apache/lucene-solr/blob/34a92d090ac4ff5c8382e1439827d678265ede0d/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1263
> On 20.11.2013 7:05, "Otis Gospodnetic" <
> otis.gospodne...@gmail.com>
> wrote:
>
> > Hi,
> >
> > We have the ability to turn off caching for filter queries -
> > http://wiki.apache.org/solr/CommonQueryParameters#Caching_of_filters
> >
> > I didn't try it, but I think one can't turn off caching for regular
> > queries, a la:
> >
> > q={!cache=false}
> >
> > Is there a reason this could not be done?
> >
> > Thanks,
> > Otis
> > --
> > Performance Monitoring * Log Analytics * Search Analytics
> > Solr & Elasticsearch Support * http://sematext.com/
> >
>


Solr is deleting newly created index from index folder

2013-11-20 Thread vishalgupta084
I am running a cron job for indexing and committing documents for Solr search.
Earlier everything was fine, but for some time now it has been deleting indexes
from the index folder. Whenever I update any document or create any new document,
it gets indexed and committed and appears in search, but some hours later when I
search for the same document it has disappeared from the search, and when I
checked the index folder size I noticed that it was reduced to its original size
from before updating the document.
Query is:

 (endtime:[* TO NOW]
AND -endtime:"1970-01-01T01:00:00Z")

 Could anyone please let me know why it is deleting only newly created
indexes and not the old indexes. Old indexes appear in search.

 How can I stop this deletion process?

 Although I checked the deletion policy, in my solrconfig.xml it is
commented out.

 My Solr was running fine in production but now it is creating the above
mentioned issue, so urgent help is required.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-is-deleting-newly-created-index-from-index-folder-tp4102104.html
Sent from the Solr - User mailing list archive at Nabble.com.


Auto optimized of Solr indexing results

2013-11-20 Thread Bayu Widyasanyata
Hi,

After successfully configuring a re-crawling script, I sometimes checked and
found on the Solr Admin that the "Optimized" status of my collection is not
optimized (slash icon).

Hence I did the optimize steps manually.

How can I make the index optimized automatically after crawling?

Should we restart Solr (I use Tomcat) as shown here [1]?

[1] http://wiki.apache.org/nutch/Crawl

Thanks!

-- 
wassalam,
[bayu]


Re: Solr is deleting newly created index from index folder

2013-11-20 Thread Erick Erickson
I cannot imagine that Solr suddenly starts deleting indexes without you
changing anything, although all things are possible. Sanity check:
do you have at least as much free space on your disk as the total size
of your index on disk?

Your query will delete everything from your index with an endtime, except
some very specific dates (i.e. a Unix time of 0).

First thing I'd do is stop the cron job to see if that fixes your problem.
My
bet is that this query is not doing what you expect.

Then submit the query to Solr from a URL or from the admin page, turn
query debugging on and examine the results. Do this with just the query,
not the delete parts, i.e.
q=(endtime:[* TO NOW] AND -endtime:"1970-01-01T01:00:00Z")

BTW, the AND is unnecessary, but I don't think that's relevant anyway...

If that doesn't show the problem, please show us the results of running
the query with debug info so we see the parsed query.

Best,
Erick


On Wed, Nov 20, 2013 at 7:31 AM, vishalgupta084
wrote:

> I am running a cron job for indexing and committing documents for Solr search.
> Earlier everything was fine, but for some time now it has been deleting indexes
> from the index folder. Whenever I update any document or create any new
> document, it gets indexed and committed and appears in search, but some hours
> later when I search for the same document it has disappeared from the search,
> and when I checked the index folder size I noticed that it was reduced to its
> original size from before updating the document.
> Query is:
>
>  (endtime:[* TO NOW]
> AND -endtime:"1970-01-01T01:00:00Z")
>
>  Could anyone please let me know why it is deleting only newly created
> indexes and not the old indexes. Old indexes appear in search.
>
>  How can I stop this deletion process?
>
>  Although I checked the deletion policy, in my solrconfig.xml it is
> commented out.
>
>  My Solr was running fine in production but now it is creating the above
> mentioned issue, so urgent help is required.
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-is-deleting-newly-created-index-from-index-folder-tp4102104.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Auto optimized of Solr indexing results

2013-11-20 Thread Erick Erickson
You probably shouldn't optimize at all. The default TieredMergePolicy
will eventually purge the deleted files' data, which is really what optimize
does. So despite its name, most of the time it's not really worth the
effort.

Take a look at your Solr admin page, the "overview" link for a core.
If the number of deleted docs is a significant percentage of your
numDocs (I typically use 20% or so, but YMMV) then optimize
might be worthwhile. Otherwise, it's a distraction unless and until
you have some evidence that it actually makes a difference.

Best,
Erick


On Wed, Nov 20, 2013 at 7:33 AM, Bayu Widyasanyata
wrote:

> Hi,
>
> After successfully configuring a re-crawling script, I sometimes checked and
> found on the Solr Admin that the "Optimized" status of my collection is not
> optimized (slash icon).
>
> Hence I did the optimize steps manually.
>
> How can I make the index optimized automatically after crawling?
>
> Should we restart Solr (I use Tomcat) as shown here [1]?
>
> [1] http://wiki.apache.org/nutch/Crawl
>
> Thanks!
>
> --
> wassalam,
> [bayu]
>


Re: Solr is deleting newly created index from index folder

2013-11-20 Thread Jack Krupansky
You may be hitting a query parser bug/nuance, that a purely negative 
sub-query needs to have a *:* added so that it is not purely negative.


So, replace:

AND -endtime:"1970-01-01T01:00:00Z"

with

AND (*:* -endtime:"1970-01-01T01:00:00Z")

Or, as Erick mentioned in his reply, you don't really need to AND with a
sub-query, so make it:


-endtime:"1970-01-01T01:00:00Z"

And then it is simply a clause of the Boolean query.


-- Jack Krupansky
-Original Message- 
From: vishalgupta084

Sent: Wednesday, November 20, 2013 7:31 AM
To: solr-user@lucene.apache.org
Subject: Solr is deleting newly created index from index folder

I am running a cron job for indexing and committing documents for Solr search.
Earlier everything was fine, but for some time now it has been deleting indexes
from the index folder. Whenever I update any document or create any new document,
it gets indexed and committed and appears in search, but some hours later when I
search for the same document it has disappeared from the search, and when I
checked the index folder size I noticed that it was reduced to its original size
from before updating the document.
Query is:

(endtime:[* TO NOW]
AND -endtime:"1970-01-01T01:00:00Z")

Could anyone please let me know why it is deleting only newly created
indexes and not the old indexes. Old indexes appear in search.

How can I stop this deletion process?

Although I checked the deletion policy, in my solrconfig.xml it is
commented out.

My Solr was running fine in production but now it is creating the above
mentioned issue, so urgent help is required.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-is-deleting-newly-created-index-from-index-folder-tp4102104.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: How to index X™ as &#8482; (HTML decimal entity)

2013-11-20 Thread Jack Krupansky
Any analysis filtering affects the indexed value only, but the stored value 
would be unchanged from the original input value. An update processor lets 
you modify the original input value that will be stored.
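
For example, something along these lines in solrconfig.xml (an untested
sketch; the chain name and field name are made up):

<updateRequestProcessorChain name="encode-tm">
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">name</str>
    <str name="pattern">™</str>
    <str name="replacement">&amp;#8482;</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>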


-- Jack Krupansky

-Original Message- 
From: Uwe Reh

Sent: Wednesday, November 20, 2013 5:43 AM
To: solr-user@lucene.apache.org
Subject: Re: How to index X™ as &#8482; (HTML decimal entity)

What about having a simple charfilter in the analyzer chain for
indexing *and* searching? E.g.
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="™"
 replacement="&amp;#8482;" />
or
<charFilter class="solr.MappingCharFilterFactory"
 mapping="mapping-specials.txt" />

Uwe

On 19.11.2013 23:46, Developer wrote:

I have a data coming in to SOLR as below.

X™ - Black

I need to store the HTML Entity (decimal) equivalent value (i.e. &#8482;)
in SOLR rather than storing the original value.

Is there a way to do this?





Re: {!cache=false} for regular queries?

2013-11-20 Thread Mikhail Khludnev
Erick,

It's worth mentioning that queries which sort by field can potentially
blow the filter cache:
http://wiki.apache.org/solr/SolrCaching#useFilterForSortedQuery
The hint there might work out.
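
(For reference, the hint is a solrconfig.xml setting in the <query> section:

<useFilterForSortedQuery>true</useFilterForSortedQuery>
)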


On Wed, Nov 20, 2013 at 4:30 PM, Erick Erickson wrote:

> But I don't know whether it's worth worrying about. queryResultCache is
> pretty small.
>
> I think of it as a map where the key is the text of the query and the value
> is an int[queryWindowSize]. So bypassing the cache is probably not going to
> make much difference.
>
> YMMV of course.
>
> Best
> Erick
>
>
> On Wed, Nov 20, 2013 at 12:44 AM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
> > It should bypass cache for sure
> >
> >
> https://github.com/apache/lucene-solr/blob/34a92d090ac4ff5c8382e1439827d678265ede0d/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1263
> > On 20.11.2013 7:05, "Otis Gospodnetic" <
> > otis.gospodne...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > We have the ability to turn off caching for filter queries -
> > > http://wiki.apache.org/solr/CommonQueryParameters#Caching_of_filters
> > >
> > > I didn't try it, but I think one can't turn off caching for regular
> > > queries, a la:
> > >
> > > q={!cache=false}
> > >
> > > Is there a reason this could not be done?
> > >
> > > Thanks,
> > > Otis
> > > --
> > > Performance Monitoring * Log Analytics * Search Analytics
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> >
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics


 


Re: Swapping Cores

2013-11-20 Thread Tirthankar Chatterjee
Hi Shawn,
It just slipped my mind to mention the details of my Solr version. Good point
and thought from your side. Thanks for checking back through my emails.

I am currently using Solr 4.3 but not SolrCloud. We have a technical
documentation site which keeps changing, with some new files and some deleted
files. We use Nutch to crawl for now, hence this question about swapping the
cores at run time when the new index is available.

I will look into the link Otis sent me and will read the second paragraph. Till
then please wait for my questions if I have any :-)
On Nov 20, 2013, at 12:12 PM, Shawn Heisey  wrote:

> On 11/19/2013 10:18 PM, Tirthankar Chatterjee wrote:
>> I have a site that I crawl and host the index. The web site has changes 
>> every month which requires it to re-crawl. Now there is a new SOLR index 
>> that is created. How effectively can I swap the previous one with the new 
>> one with minimal downtime for search. 
>> 
>> We have tried swapping the core but once due to any reason tomcat is 
>> restarted the temp core data is gone after the restart. Is there a way we 
>> dont lose the new index after the swap.
> 
> The reply you received from Otis assumes that you're using SolrCloud.  I
> looked back at previous messages that you have sent to the list, where
> you were using version 3.6, but that was over a year ago, so I don't
> know whether you've upgraded to 4.x yet, and I don't know if you've gone
> with SolrCloud.
> 
> If you are not using SolrCloud, then you can do core swapping, and the
> next paragraph will apply.  If you are using SolrCloud, then you can't;
> you must use collection aliasing.
> 
> Do you have persistent set to true in your solr.xml?  This is required
> for core swapping to work properly through restarts.
> 
> Thanks,
> Shawn
> 






Issues faced after docValues migration

2013-11-20 Thread vicky desai
Hi, 

I am using Solr version 4.3 and am planning to use the docValues feature
introduced in Solr 4.2. Although I see a significant improvement in facet
and group queries, there is a degradation in group.facet and group.ngroups
queries. Has anybody faced a similar issue? Any workarounds?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Issues-faced-after-docValues-migration-tp4102134.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: {!cache=false} for regular queries?

2013-11-20 Thread Erick Erickson
Mikhail:

Interesting point! Thanks! I haven't looked at the implementation enough to
know, but it looks like these are unrelated? You'd have to happen to have a
fq clause you'd already submitted (and cached) for this to happen?

But like I said, I don't know the code. I didn't realize this even happened
BTW, so I'm glad you pointed it out!

Erick


On Wed, Nov 20, 2013 at 8:02 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Erick,
>
> It's worth mentioning that queries which sort by field can potentially
> blow the filter cache:
> http://wiki.apache.org/solr/SolrCaching#useFilterForSortedQuery
> The hint there might work out.
>
>
> On Wed, Nov 20, 2013 at 4:30 PM, Erick Erickson  >wrote:
>
> > But I don't know whether it's worth worrying about. queryResultCache is
> > pretty small.
> >
> > I think of it as a map where the key is the text of the query and the
> > value is an int[queryWindowSize]. So bypassing the cache is probably not
> > going to make much difference.
> >
> > YMMV of course.
> >
> > Best
> > Erick
> >
> >
> > On Wed, Nov 20, 2013 at 12:44 AM, Mikhail Khludnev <
> > mkhlud...@griddynamics.com> wrote:
> >
> > > It should bypass cache for sure
> > >
> > >
> >
> https://github.com/apache/lucene-solr/blob/34a92d090ac4ff5c8382e1439827d678265ede0d/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1263
> > > On 20.11.2013 7:05, "Otis Gospodnetic" <
> > > otis.gospodne...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > We have the ability to turn off caching for filter queries -
> > > > http://wiki.apache.org/solr/CommonQueryParameters#Caching_of_filters
> > > >
> > > > I didn't try it, but I think one can't turn off caching for regular
> > > > queries, a la:
> > > >
> > > > q={!cache=false}
> > > >
> > > > Is there a reason this could not be done?
> > > >
> > > > Thanks,
> > > > Otis
> > > > --
> > > > Performance Monitoring * Log Analytics * Search Analytics
> > > > Solr & Elasticsearch Support * http://sematext.com/
> > > >
> > >
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> 
>  
>


Re: Auto optimized of Solr indexing results

2013-11-20 Thread Bayu Widyasanyata
Thanks Erick.
I will check that on next round.

---
wassalam,
[bayu]

/sent from Android phone/
On Nov 20, 2013 7:45 PM, "Erick Erickson"  wrote:

> You probably shouldn't optimize at all. The default TieredMergePolicy
> will eventually purge the deleted files' data, which is really what
> optimize
> does. So despite its name, most of the time it's not really worth the
> effort.
>
> Take a look at your Solr admin page, the "overview" link for a core.
> If the number of deleted docs is a significant percentage of your
> numDocs (I typically use 20% or so, but YMMV) then optimize
> might be worthwhile. Otherwise, it's a distraction unless and until
> you have some evidence that it actually makes a difference.
>
> Best,
> Erick
>
>
> On Wed, Nov 20, 2013 at 7:33 AM, Bayu Widyasanyata
> wrote:
>
> > Hi,
> >
> > After successfully configuring a re-crawling script, I sometimes checked
> > and found on the Solr Admin that the "Optimized" status of my collection
> > is not optimized (slash icon).
> >
> > Hence I did the optimize steps manually.
> >
> > How can I make the index optimized automatically after crawling?
> >
> > Should we restart Solr (I use Tomcat) as shown here [1]?
> >
> > [1] http://wiki.apache.org/nutch/Crawl
> >
> > Thanks!
> >
> > --
> > wassalam,
> > [bayu]
> >
>


Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Otis Gospodnetic
Do you think this is something that is actually implementable?  If so,
I'll open an issue.

One use-case where this may come in handy is when the disk space is
tight.  If a shard is using > 50% of the disk space on some node X,
you can't really split that shard because the 2 new sub-shards will
not fit on the local disk.  Or is there some trick one could use in
this situation?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, Nov 20, 2013 at 6:48 AM, Shalin Shekhar Mangar
 wrote:
> No, it is not supported yet. We can't split to a remote node directly.
> The best bet is to trigger a new leader election by unloading the leader
> node once all replicas are active.
>
> On Wed, Nov 20, 2013 at 1:32 AM, Otis Gospodnetic
>  wrote:
>> Hi,
>>
>> Is it possible to perform a shard split and stream data for the
>> new/sub-shards to remote nodes, avoiding persistence of new/sub-shards
>> on the local/source node first?
>>
>> Thanks,
>> Otis
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.


Support for Numeric DocValues Updates in Solr?

2013-11-20 Thread Otis Gospodnetic
Hi,

"Numeric DocValues Updates" functionality that came via
https://issues.apache.org/jira/browse/LUCENE-5189 sounds very
valuable, while we wait for full/arbitrary field updates
(https://issues.apache.org/jira/browse/LUCENE-4258).

Would it make sense to add support for Numeric DocValues Updates to Solr?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


How to Configure Highlighting for Solr

2013-11-20 Thread Furkan KAMACI
I have set up my highlighting as follows:

 <str name="hl">true</str>
 <str name="hl.fl">name age address</str>

However I don't want *name* to be highlighted *but* still included inside the response:

"highlighting": {
 Something_myid: {
name: "Something bla bla",
age: "Something age bla bla",
address: "Something age bla bla"
  }
}

*or:*

I want to group them on name field instead of id:

"highlighting": {
 Something bla bla: {
age: "Something age bla bla",
address: "Something age bla bla"
  }
}

*or*

"highlighting": {
 Something bla bla: {
name: "Something bla bla",
age: "Something age bla bla",
address: "Something age bla bla"
  }
}

How can I do *any* of these with Solr 4.5.1?


Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread Otis Gospodnetic
Hi,

When full index replication is happening via SnapPuller, a temporary
"timestamped" index dir is created.

Questions:
1) Under normal circumstances could more than 1 timestamped index
directory ever be present?
2) Should there always be an the .../data/index directory present?

I'm asking because I see the following situation on one SolrCloud node:

$ du -ms /home/solr/data/*
1188367  /home/solr/data/index.20131118152402344
709050   /home/solr/data/index.20131119210950598
1        /home/solr/data/index.properties
1        /home/solr/data/replication.properties
3053     /home/solr/data/tlog

Note:
1) there are 2 timestamped directories
2) there is no data/index directory

According to SnapPuller, the timestamped index dir is a temporary dir
and should be removed after replication. unless maybe some error
case is not being handled correctly and timestamped index dirs are
"leaking".

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


Solr Highlighting Response Type

2013-11-20 Thread Furkan KAMACI
Here is an example from wiki:

Iterator<SolrDocument> iter = queryResponse.getResults().iterator();

while (iter.hasNext()) {
  SolrDocument resultDoc = iter.next();

  String content = (String) resultDoc.getFieldValue("content");
  String id = (String) resultDoc.getFieldValue("id"); // id is the uniqueKey field

  if (queryResponse.getHighlighting().get(id) != null) {
    List<String> highlightSnippets =
        queryResponse.getHighlighting().get(id).get("content");
  }
}

Why does queryResponse.getHighlighting().get(id).get("content") return a
List? Is it because of multivalued fields, or for some other purpose?

Also, why is there a check like this:

 if (queryResponse.getHighlighting().get(id) != null)

Can we configure Solr to return highlights *in the same order* as the response
document list, and can we include all response documents in the highlight list
even when there is no highlighted match in their fields? I want to take the
first response and, if there is a highlighted field in the highlighting list,
use the highlighted match; if not, use the plain version. I know that I can use
the *id* field of a response document to get its highlighted fields, but my ids
are randomized and I cannot use them.


Re: Support for Numeric DocValues Updates in Solr?

2013-11-20 Thread Gopal Patwa
+1 to add this support in Solr


On Wed, Nov 20, 2013 at 7:16 AM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> Hi,
>
> "Numeric DocValues Updates" functionality that came via
> https://issues.apache.org/jira/browse/LUCENE-5189 sounds very
> valuable, while we wait for full/arbitrary field updates
> (https://issues.apache.org/jira/browse/LUCENE-4258).
>
> Would it make sense to add support for Numeric DocValues Updates to Solr?
>
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>


Re: Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread michael.boom
I encountered this problem often when I restarted a Solr instance more than
once before replication was finished.
I would then have multiple timestamped directories and the index directory.
However, the index.properties points to the active index directory.

The moment the replication succeeds, the temp dir is renamed "index"
and the index.properties is gone.

As for the situation where the index directory is missing, I'm not sure. Maybe
this happens when the replica is too old and an old-school replication is done.



-
Thanks,
Michael
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Multiple-data-index-MMDD-dirs-bug-tp4102163p4102168.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Option to enforce a majority quorum approach to accepting updates in SolrCloud?

2013-11-20 Thread Timothy Potter
Hi Otis,

I think these are related problems, but giving the ability to enforce a
majority quorum among the total replica set for a shard is not the
same as hinted handoff in the Cassandra sense. Cassandra's hinted handoff
allows you to say it's ok to send the write somewhere and somehow
it'll make its way to the correct node, eventually. But, from the docs:
"A hinted write does not count towards ConsistencyLevel requirements
of ONE, QUORUM, or ALL." It's useful for environments that need
extreme write-availability.

Today, SolrCloud uses a consistency level of ALL, where ALL is based
on the number of *active* replicas for a shard, which could be as
small as one. What I'm proposing is to give a knob to allow a
SolrCloud user to enforce a consistency level of QUORUM, where QUORUM
is based on the entire replica set (down + active replicas) and not
just active replicas as it is today. However, we'll need a better
vocabulary for this because in my scenario, QUORUM is stronger than
ALL which will confuse even the most seasoned of distributed systems
engineers ;-)

Cheers,
Tim


On Tue, Nov 19, 2013 at 9:25 PM, Otis Gospodnetic
 wrote:
> Btw. isn't the situation Timothy is describing what hinted handoff is all
> about?
>
> http://wiki.apache.org/cassandra/HintedHandoff
> http://www.datastax.com/dev/blog/modern-hinted-handoff
>
> Check this:
> http://www.jroller.com/otis/entry/common_distributed_computing_routines
>
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tue, Nov 19, 2013 at 1:58 PM, Mark Miller  wrote:
>
>> Mostly a lot of other systems already offer these types of things, so they
>> were hard not to think about while building :) Just hard to get back to a
>> lot of those things, even though a lot of them are fairly low hanging
>> fruit. Hardening takes the priority :(
>>
>> - Mark
>>
>> On Nov 19, 2013, at 12:42 PM, Timothy Potter  wrote:
>>
>> > You're thinking is always one-step ahead of me! I'll file the JIRA
>> >
>> > Thanks.
>> > Tim
>> >
>> >
>> > On Tue, Nov 19, 2013 at 10:38 AM, Mark Miller 
>> wrote:
>> >
>> >> Yeah, this is kind of like one of many little features that we have just
>> >> not gotten to yet. I’ve always planned for a param that let’s you say
>> how
>> >> many replicas an update must be verified on before responding success.
>> >> Seems to make sense to fail that type of request early if you notice
>> there
>> >> are not enough replicas up to satisfy the param to begin with.
>> >>
>> >> I don’t think there is a JIRA issue yet, fire away if you want.
>> >>
>> >> - Mark
>> >>
>> >> On Nov 19, 2013, at 12:14 PM, Timothy Potter 
>> wrote:
>> >>
>> >>> I've been thinking about how SolrCloud deals with write-availability
>> >> using
>> >>> in-sync replica sets, in which writes will continue to be accepted so
>> >> long
>> >>> as there is at least one healthy node per shard.
>> >>>
>> >>> For a little background (and to verify my understanding of the process
>> is
>> >>> correct), SolrCloud only considers active/healthy replicas when
>> >>> acknowledging a write. Specifically, when a shard leader accepts an
>> >> update
>> >>> request, it forwards the request to all active/healthy replicas and
>> only
>> >>> considers the write successful if all active/healthy replicas ack the
>> >>> write. Any down / gone replicas are not considered and will sync up
>> with
>> >>> the leader when they come back online using peer sync or snapshot
>> >>> replication. For instance, if a shard has 3 nodes, A, B, C with A being
>> >> the
>> >>> current leader, then writes to the shard will continue to succeed even
>> >> if B
>> >>> & C are down.
>> >>>
>> >>> The issue is that if a shard leader continues to accept updates even if
>> >> it
>> >>> loses all of its replicas, then we have acknowledged updates on only 1
>> >>> node. If that node, call it A, then fails and one of the previous
>> >> replicas,
>> >>> call it B, comes back online before A does, then any writes that A
>> >> accepted
>> >>> while the other replicas were offline are at risk to being lost.
>> >>>
>> >>> SolrCloud does provide a safe-guard mechanism for this problem with the
>> >>> leaderVoteWait setting, which puts any replicas that come back online
>> >>> before node A into a temporary wait state. If A comes back online
>> within
>> >>> the wait period, then all is well as it will become the leader again
>> and
>> >> no
>> >>> writes will be lost. As a side note, sys admins definitely need to be
>> >> made
>> >>> more aware of this situation as when I first encountered it in my
>> >> cluster,
>> >>> I had no idea what it meant.
>> >>>
>> >>> My question is whether we want to consider an approach where SolrCloud
>> >> will
>> >>> not accept writes unless there is a majority of replicas available to
>> >>> accept the write? For my example, under this approach, we wouldn't
>> accept
>> >>> writes if both B & C failed, but would if only C did, leaving A & B on

RE: facet method=enum and uninvertedfield limitations

2013-11-20 Thread Lemke, Michael SZ/HZA-ZSW
On Wednesday, November 20, 2013 7:37 AM, Dmitry Kan wrote:

Thanks for your reply.

>
>Since you are faceting on a text field (is this correct?) you deal with a
>lot of unique values in it.

Yes, this is a text field and we experimented with reducing the index.  As
I said in my original question the stripped down index had 178,000 terms
and it (fc) still didn't work.  Is number of terms the relevant quantity?

>So your best bet is enum method. 

Hm, yes, that works but I have to wait 4 minutes for the answer (with the
original data).  Not good.

>Also if you
>are on solr 4x try building doc values in the index: this suits faceting
>well.

We are on Solr 1.4, so, no.

>
>Otherwise start from your spec once again. Can you use shingles instead?

Possibly, but I don't know shingles.  Although I'd prefer to use our original
index, we are trying to build a specialized index just for this sort of
query but still don't know what to look for.
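
(From a quick look at the wiki, a shingle field would apparently be something
like the following in schema.xml; an untested sketch with made-up names. Is
that what you mean?

 <fieldType name="text_shingles" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
             outputUnigrams="true"/>
   </analyzer>
 </fieldType>
)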

A query like

 
q=word&facet.field=CONTENT&facet=true&facet.limit=10&facet.mincount=1&facet.method=fc&facet.prefix=a&rows=0

would give me the top ten results containing 'word' and something starting
with 'a'.  That's what I want.  An empty facet.prefix should also work.
Eventually, the query will be more complex containing other fields and
filter queries but the basic function should be exactly like this.  How
can we achieve this?

Thanks,
Michael


>On 19 Nov 2013 17:44, "Lemke, Michael SZ/HZA-ZSW" 
>wrote:
>
>> On Friday, November 15, 2013 11:22 AM, Lemke, Michael SZ/HZA-ZSW wrote:
>>
>> Judging from numerous replies this seems to be a tough question.
>> Nevertheless, I'd really appreciate any help as we are stuck.
>> We'd really like to know what in our index causes the facet.method=fc
>> query to fail.
>>
>> Thanks,
>> Michael
>>
>> >On Thu, November 14, 2013 7:26 PM, Yonik Seeley wrote:
>> >>On Thu, Nov 14, 2013 at 12:03 PM, Lemke, Michael  SZ/HZA-ZSW
>> >> wrote:
>> >>> I am running into performance problems with faceted queries.
>> >>> If I do a
>> >>>
>> >>>
>> q=word&facet.field=CONTENT&facet=true&facet.limit=10&facet.mincount=1&facet.method=fc&facet.prefix=a&rows=0
>> >>>
>> >>> I am getting an exception:
>> >>> org.apache.solr.common.SolrException: Too many values for
>> UnInvertedField faceting on field CONTENT
>> >>> at
>> org.apache.solr.request.UnInvertedField.uninvert(UnInvertedField.java:384)
>> >>> at
>> org.apache.solr.request.UnInvertedField.(UnInvertedField.java:178)
>> >>> at
>> org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:839)
>> >>> ...
>> >>>
>> >>> I understand it's got something to do with a 24bit limit somewhere
>> >>> in the code but I don't understand enough of it to be able to construct
>> >>> a specialized index that can be queried with facet.method=enum.
>> >>
>> >>You shouldn't need to do anything differently to try facet.method=enum
>> >>(just replace facet.method=fc with facet.method=enum)
>> >
>> >This is true and facet.method=enum does work indeed.  The problem is
>> >runtime.  In particular queries with an empty facet.prefix= run many
>> >seconds if not minutes.  I initially asked about this here:
>> >
>> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201310.mbox/%3c33ec3398272fbe47b64ee3b3e98f69a761427...@de011521.schaeffler.com%3E
>> >
>> >It was suggested that fc is much faster than enum and I'd like to
>> >test that.  We are still fairly free to design the index such that
>> >it performs well.  But to do that we need to understand what is
>> >killing it.
>> >
>> >>
>> >>You may also want to add the parameter
>> >>facet.enum.cache.minDf=10
>> >>to lower memory usage by only usiing the filter cache for terms that
>> >>match more than 100K docs.
>> >
>> >That helped a little, cut down my particular test from 10 sec to 5 sec.
>> >But still too slow.  Mind you this is for an autosuggest feature.
>> >
>> >Thanks for your reply.
>> >
>> >Michael
>> >
>> >
>>
>>



Re: How to index X™ as &#8482; (HTML decimal entity)

2013-11-20 Thread Walter Underwood
Again, I'd like to know why this is wanted. It sounds like an X-Y problem.
Storing Unicode characters as XML/HTML encoded character references is an
extremely bad idea.

wunder

On Nov 20, 2013, at 5:01 AM, "Jack Krupansky"  wrote:

> Any analysis filtering affects the indexed value only, but the stored value 
> would be unchanged from the original input value. An update processor lets 
> you modify the original input value that will be stored.
> 
> -- Jack Krupansky
> 
> -Original Message- From: Uwe Reh
> Sent: Wednesday, November 20, 2013 5:43 AM
> To: solr-user@lucene.apache.org
Subject: Re: How to index X™ as &#8482; (HTML decimal entity)
> 
> What about having a simple charfilter in the analyzer chain for
> indexing *and* searching? E.g.
> <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="™"
>  replacement="&amp;#8482;" />
> or
> <charFilter class="solr.MappingCharFilterFactory"
>  mapping="mapping-specials.txt" />
> 
> Uwe
> 
> On 19.11.2013 23:46, Developer wrote:
>> I have a data coming in to SOLR as below.
>> 
>> X™ - Black
>> 
>> I need to store the HTML Entity (decimal) equivalent value (i.e. &#8482;)
>> in SOLR rather than storing the original value.
>> 
>> Is there a way to do this?
> 

--
Walter Underwood
wun...@wunderwood.org





Re: Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread Daniel Collins
In our experience (with SolrCloud), if you trigger a full replication (e.g.
for a new replica), you get the "timestamp" directory; it never renames back to
just "index".  Since index.properties gives you the name of the real
directory, we had never considered that a problem/bug.  Why bother with the
rename afterwards? It just seems unnecessary.

So to answer your questions:

1) Not in normal circumstances, but if replication crashes or stops, it
might leave it hanging.
2) No, as long as there is an index.properties file.

Not official answers, but that's our experience.
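
(For reference, index.properties is just a pointer to the live directory; on
one of our nodes it reads something like:

index=index.20131118152402344
)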


On 20 November 2013 15:55, michael.boom  wrote:

> I encountered this problem often when i restarted a solr instance before
> replication was finished more than once.
> I would then have multiple timestamped directories and the index directory.
> However, the index.properties points to the active index directory.
>
> The moment when the replication succeeded the temp dir is renamed "index"
> and the index.properties is gone.
>
> On the situation when the index is missing, not sure about that. Maybe this
> happens when the replica is too old and an old-school replication is done.
>
>
>
> -
> Thanks,
> Michael
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Multiple-data-index-MMDD-dirs-bug-tp4102163p4102168.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Solr Docvalues grouping

2013-11-20 Thread GOYAL, ANKUR
Hi,

I am using Solr 4.5.1. and I am planning to use docValues attribute for a 
string type. The values in that field change only once a day. I would like to 
only group on that field. At the following link :-

http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/

it is mentioned that "For repeated access to the same field, the inverted index 
performs better due to internal Lucene caching". However, that link discusses 
about faceting. So, does the docValues performs slower as compared to inverted 
index when doing grouping also?

-Ankur





Re: Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread Mark Miller
There might be a JIRA issue out there about replication not cleaning up on all 
fails - e.g. on startup or something - kind of rings a bell…if so, it will be 
addressed eventually.

Otherwise, you might have two for a bit just due to multiple searchers being 
around at once for a while or something - but it should not be something that 
lasts a long time.

- Mark

On Nov 20, 2013, at 11:50 AM, Daniel Collins  wrote:

> In our experience (with SolrCloud), if you trigger a full replication (e.g.
> new replica), you get the "timestamp" directory, it never renames back to
> just "index".  Since index.properties gives you the name of the real
> directory, we had never considered that a problem/bug.  Why bother with the
> rename afterwards, it just seems unnecessary?
> 
> So to answer your questions:
> 
> 1) Not in normal circumstances, but if replication crashes or stops, it
> might leave it hanging.
> 2) No, as long as there is an index.properties file.
> 
> Not official answers, but that's our experience.
> 
> 
> On 20 November 2013 15:55, michael.boom  wrote:
> 
>> I encountered this problem often when i restarted a solr instance before
>> replication was finished more than once.
>> I would then have multiple timestamped directories and the index directory.
>> However, the index.properties points to the active index directory.
>> 
>> The moment when the replication succeeded the temp dir is renamed "index"
>> and the index.properties is gone.
>> 
>> On the situation when the index is missing, not sure about that. Maybe this
>> happens when the replica is too old and an old-school replication is done.
>> 
>> 
>> 
>> -
>> Thanks,
>> Michael
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Multiple-data-index-MMDD-dirs-bug-tp4102163p4102168.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>> 



Re: How to index X™ as &#8482; (HTML decimal entity)

2013-11-20 Thread Jack Krupansky
AFAICT, it's not an "extremely bad idea" - using SGML/HTML as a format for
storing text to be rendered. If you disagree, try explaining yourself.

But maybe TM should be encoded as "&trade;". Ditto for other named SGML
entities.


-- Jack Krupansky

-Original Message- 
From: Walter Underwood

Sent: Wednesday, November 20, 2013 11:21 AM
To: solr-user@lucene.apache.org
Subject: Re: How to index X™ as &#8482; (HTML decimal entity)

Again, I'd like to know why this is wanted. It sounds like an X-Y problem.
Storing Unicode characters as XML/HTML encoded character references is an
extremely bad idea.


wunder

On Nov 20, 2013, at 5:01 AM, "Jack Krupansky"  
wrote:


Any analysis filtering affects the indexed value only, but the stored 
value would be unchanged from the original input value. An update 
processor lets you modify the original input value that will be stored.


-- Jack Krupansky

-Original Message- From: Uwe Reh
Sent: Wednesday, November 20, 2013 5:43 AM
To: solr-user@lucene.apache.org
Subject: Re: How to index X™ as &#8482; (HTML decimal entity)

What about having a simple charfilter in the analyzer chain for
indexing *and* searching? E.g.
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="™"
 replacement="&amp;#8482;" />
or
<charFilter class="solr.MappingCharFilterFactory"
 mapping="mapping-specials.txt" />

Uwe

On 19.11.2013 23:46, Developer wrote:

I have a data coming in to SOLR as below.

X™ - Black

I need to store the HTML Entity (decimal) equivalent value (i.e. &#8482;)
in SOLR rather than storing the original value.

Is there a way to do this?




--
Walter Underwood
wun...@wunderwood.org





Suggester - how to return exact match?

2013-11-20 Thread Mirko
Hi,
we implemented a Solr suggester (http://wiki.apache.org/solr/Suggester)
that uses a file based dictionary. We use the results of the suggester to
populate a dropdown field of a search field on a webpage.

Our dictionary (autosuggest.txt) contains:

foo
bar

Our suggester has the following behavior:

We can make a request with the search query "fo" and get a response with
the suggestion "foo". This is great.

However, if we make a request with the query "foo" (an exact match) we get
no suggestions. We would expect that the response returns the suggestion
"foo".

How can we configure the suggester to return also the perfect match as a
suggestion?

This is the config for our search component:


<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">spellCheck</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="sourceLocation">autosuggest.txt</str>
  </lst>
</searchComponent>

Thanks for help!
Mirko


Re: Solr spatial search within the polygon

2013-11-20 Thread Smiley, David W.
Dhanesh,


> I'm pretty sure that the coordinates are in the right position.
> "9.445890,76.540970" is in India, precisely in Kerala state :)


My suspicion was right; you have all of your latitudes and longitudes in
the wrong position.  Your example that I quote you on above is correct
("lat,lon"), but you're not indexing it that way (you're doing "lat
lon").  If you correct your indexing to use "lat,lon" (as you should
do), you will find that the WKT you generate in your queries is also
reversed.  The way you're doing it now prevents you from searching
anywhere outside of -90 to +90 degrees longitude.  At this point I don't think
I can be any more clear.
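
(Concretely, with your numbers: index the point as "9.445890,76.540970", and
put longitude first in the WKT since WKT is "x y" order, e.g.
POLYGON((76.5496015548706 9.471920923238988, ...)).)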

Good luck.

~ David

On 11/20/13 12:50 AM, "Dhanesh Radhakrishnan"  wrote:

>Hi David,
>Thank you for your reply
>This is my current schema. The field type "location_rpt" is a
>SpatialRecursivePrefixTreeFieldType, and the field "location" is of type
>"location_rpt" and is multiValued:
>
><fieldType name="location_rpt"
> class="solr.SpatialRecursivePrefixTreeFieldType"
> spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
> distErrPct="0.025" maxDistErr="0.09" units="degrees" geo="true" />
>
><field name="location" type="location_rpt" stored="true" multiValued="true" />
>
>(...the other fieldType and field definitions were lost in the archive.)
>
>
>Whenever add a document to solr, I'll collect the current latitude and
>longitude of particular business and index in the field "location"
>It's like
>$doc->setField('location', $business['latitude']."
>".$business['longitude']);
>This should looks like "location":["9.445890 76.540970"] in solr
>
>What I'm doing is that in the map view of the search results, there is a
>provision to draw a polygon on the map and fetch the results based on the
>drawing.
>
>http://localhost:8983/solr/poc/select?fl=id,name,locality&wt=json&json.nl=
>map&q=*:*&fq=state:Kerala&fq=location:"IsWithin(POLYGON((9.471920923238988
>76.5496015548706,9.464174399734185 76.53947353363037,9.457232011740006
>76.55457973480225,9.471920923238988 76.5496015548706)))
>distErrPct=0"&debugQuery=true
>
>I'm pretty sure that the coordinates are in the right position.
>"9.445890,76.540970" is in India, precisely in Kerala state :)
>
>It is highly appreciated that you kindly correct me if I'm in wrong way
>
>
>
>
>
>This is response from solr
>
>"responseHeader": {
>"status": 0,
>"QTime": 5
>},
>"response": {
>"numFound": 3,
>"start": 0,
>"docs": [
>{
>"id": "192",
>"name": "50 cents of ideal plot",
>"locality": "Changanassery"
>},
>{
>"id": "189",
>"name": "new independent house for sale",
>"locality": "Changanassery"
>},
>{
>"id": "188",
>"name": "Renovated Resort style home with 21 cent",
>"locality": "Changanassery"
>}
>]
>}
>
>
>
>Here is the debug mode output of the query
>
>
>"debug": {
>"rawquerystring": "*:*",
>"querystring": "*:*",
>"parsedquery": "MatchAllDocsQuery(*:*)",
>"parsedquery_toString": "*:*",
>"explain": {
>"188": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
>queryNorm\n",
>"189": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
>queryNorm\n",
>"192": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
>queryNorm\n"
>},
>"QParser": "LuceneQParser",
>"filter_queries": [
>"state:Kerala",
>"location:\"IsWithin(POLYGON((9.471920923238988
>76.5496015548706,9.464174399734185 76.53947353363037,9.457232011740006
>76.55457973480225,9.471920923238988 76.5496015548706))) distErrPct=0\""
>],
>"parsed_filter_queries": [
>"state:Kerala",
>
>"ConstantScore(org.apache.lucene.spatial.prefix.WithinPrefixTreeFilter@1ed
>6c279
>)"
>],
>"timing": {
>"time": 5,
>"prepare": {
>"time": 1,
>"query": {
>"time": 1
>},
>"facet": {
>"time": 0
>},
>"mlt": {
>"time": 0
>},
>"highlight": {
>"time": 0
>},
>"stats": {
>"time": 0
>},
>"debug": {
>"time": 0
>}
>},
>"process": {
>"time": 4,
>"query": {
>"time": 3
>},
>"facet": {
>"time": 0
>},
>"mlt": {
>"time": 0
>},
>"highlight": {
>"time": 0
>},
>"stats": {
>"time": 0
>},
>"debug": {
>"time": 1
> 
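
Pulling David's advice together: a minimal SolrJ sketch of the fix, indexing
"lat,lon" with a comma and flipping each polygon vertex into WKT's "lon lat"
order. The field name, core URL, and coordinates are taken from this thread;
everything else is illustrative rather than code from either poster.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SpatialOrderFix {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/poc");

        // Index as "lat,lon" (comma-separated): latitude first, then longitude.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "192");
        doc.addField("location", "9.445890,76.540970"); // was "9.445890 76.540970"
        solr.add(doc);
        solr.commit();

        // WKT uses the opposite order ("lon lat"), so every vertex flips too.
        SolrQuery q = new SolrQuery("*:*");
        q.addFilterQuery("location:\"IsWithin(POLYGON(("
            + "76.5496015548706 9.471920923238988,"
            + "76.53947353363037 9.464174399734185,"
            + "76.55457973480225 9.457232011740006,"
            + "76.5496015548706 9.471920923238988))) distErrPct=0\"");
        System.out.println(solr.query(q).getResults().getNumFound());
        solr.shutdown();
    }
}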

Re: Indexing different customer customized field values

2013-11-20 Thread kchellappa
Thanks Otis

We also thought about having multiple fields, but were concerned that having too
many fields would be an issue.  Looking through the archives, I see threads saying
that too many fields are a problem for sorting (we don't expect to sort on
these).





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-different-customer-customized-field-values-tp4102000p4102204.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Shalin Shekhar Mangar
At the Lucene level, I think it would require a directory
implementation which writes to a remote node directly. Otherwise, on
the solr side, we must move the leader itself to another node which
has enough disk space and then split the index.

On Wed, Nov 20, 2013 at 8:37 PM, Otis Gospodnetic
 wrote:
> Do you think this is something that is actually implementable?  If so,
> I'll open an issue.
>
> One use-case where this may come in handy is when the disk space is
> tight.  If a shard is using > 50% of the disk space on some node X,
> you can't really split that shard because the 2 new sub-shards will
> not fit on the local disk.  Or is there some trick one could use in
> this situation?
>
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Wed, Nov 20, 2013 at 6:48 AM, Shalin Shekhar Mangar
>  wrote:
>> No, it is not supported yet. We can't split to a remote node directly.
>> The best bet is to trigger a new leader election by unloading the leader
>> node once all replicas are active.
>>
>> On Wed, Nov 20, 2013 at 1:32 AM, Otis Gospodnetic
>>  wrote:
>>> Hi,
>>>
>>> Is it possible to perform a shard split and stream data for the
>>> new/sub-shards to remote nodes, avoiding persistence of new/sub-shards
>>> on the local/source node first?
>>>
>>> Thanks,
>>> Otis
>>> --
>>> Performance Monitoring * Log Analytics * Search Analytics
>>> Solr & Elasticsearch Support * http://sematext.com/
>>
>>
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.



-- 
Regards,
Shalin Shekhar Mangar.


Re: DocValues uasge and senarios?

2013-11-20 Thread Chris Hostetter


Perhaps this can help you make sense of the advantages...

https://cwiki.apache.org/confluence/display/solr/DocValues



: Date: Wed, 20 Nov 2013 18:45:04 +0800
: From: Floyd Wu 
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re: DocValues uasge and senarios?
: 
: Thanks Yago,
: 
: I've read this article
: http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/
: But I don't understand well.
: I'll try to figure out the missing part. Thanks for helping.
: 
: Floyd
: 
: 
: 
: 
: 2013/11/20 Yago Riveiro 
: 
: > You should understand the DocValues as feature that allow you to do
: > sorting and faceting without blow the heap.
: >
: > They are not necessary faster than the traditional method, they are more
: > memory efficient and in huge indexes this is the main limitation.
: >
: > This post resumes the docvalues feature and the main goals
: > http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/
: >
: > --
: > Yago Riveiro
: > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
: >
: >
: > On Wednesday, November 20, 2013 at 10:15 AM, Floyd Wu wrote:
: >
: > > Hi Yago
: > >
: > > Thanks for you reply. I once thought that DocValues feature is one for me
: > > to store some extra values.
: > >
: > > May I summarized that DocValues is a feature that "speed up" sorting and
: > > faceting?
: > >
: > > Floyd
: > >
: > >
: > >
: > > 2013/11/20 Yago Riveiro  yago.rive...@gmail.com)>
: > >
: > > > Hi Floyd,
: > > >
: > > > DocValues are useful for sorting and faceting per example.
: > > >
: > > > You don't need to change nothing in your xml's, the only thing that you
: > > > need to do is set the docValues=true in your field definition in the
: > schema.
: > > >
: > > > If you don't want use the default implementation (all loaded in the
: > heap),
: > > > you need to add the tag  class="solr.SchemaCodecFactory"/> in
: > > > the solrconfig.xml and the docValuesFormat=true on the fieldType
: > definition.
: > > >
: > > > --
: > > > Yago Riveiro
: > > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
: > > >
: > > >
: > > > On Wednesday, November 20, 2013 at 9:38 AM, Floyd Wu wrote:
: > > >
: > > > > Hi there,
: > > > >
: > > > > I'm not fully understand what kind of usage example that DocValues
: > can be
: > > > > used?
: > > > >
: > > > > When I set field docValues=true, do i need to change anyhting in xml
: > > > that I
: > > > > sent to solr for indexing?
: > > > >
: > > > > Please point me.
: > > > >
: > > > > Thanks
: > > > >
: > > > > Floyd
: > > > >
: > > > > PS: I've googled and read lots of DocValues discussion but confused.
: >
: >
: 

-Hoss
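
For a concrete picture of what enabling it looks like, a hedged schema.xml
fragment along the lines of the pages linked above. The field and type names
are illustrative; the optional codecFactory/docValuesFormat part follows the
Solr 4.x "Disk" format described in the searchhub post quoted in this thread.

<!-- illustrative: a faceting/sorting field backed by docValues -->
<field name="category" type="string" indexed="true" stored="true" docValues="true"/>

<!-- optional, only needed when overriding the per-fieldType docValuesFormat -->
<codecFactory class="solr.SchemaCodecFactory"/>
<fieldType name="string_dv_disk" class="solr.StrField" docValuesFormat="Disk"/>

Queries, sorting, and faceting requests stay exactly the same; only the
underlying data structure used to answer them changes.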


Re: How to Configure Highlighting for Solr

2013-11-20 Thread Stefan Matheis
Solr is using the UniqueKey you defined for your documents; that shouldn't be a
problem, since you can look up the document in the list of documents in the
main response?

And there is actually a ticket which would allow inlining the highlight
response with DocTransformers: https://issues.apache.org/jira/browse/SOLR-3479

-Stefan 


On Wednesday, November 20, 2013 at 4:37 PM, Furkan KAMACI wrote:

> I have setup my highlight as follows:
> 
> <bool name="hl">true</bool>
> <str name="hl.fl">name age address</str>
> 
> However I don't want *name* to be highlighted *but* included inside the response:
> 
> "highlighting": {
> Something_myid: {
> name: "Something bla bla",
> age: "Something age bla bla",
> address: "Something age bla bla"
> }
> }
> 
> *or:*
> 
> I want to group them on name field instead of id:
> 
> "highlighting": {
> Something bla bla: {
> age: "Something age bla bla",
> address: "Something age bla bla"
> }
> }
> 
> *or*
> 
> "highlighting": {
> Something bla bla: {
> name: "Something bla bla",
> age: "Something age bla bla",
> address: "Something age bla bla"
> }
> }
> 
> How can I do *any* of them at Solr 4.5.1? 
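
A hedged workaround for the first variant, pending SOLR-3479: drop name from
hl.fl so it is never highlighted, keep it in fl so it still comes back with
each document, and join the two sections by the unique key on the client.
The handler name here is illustrative.

<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="hl">true</bool>
    <!-- highlight only these fields; "name" comes back via fl instead -->
    <str name="hl.fl">age address</str>
    <str name="fl">id name age address</str>
  </lst>
</requestHandler>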



SolrJ - HttpSolrServer - allow setting custom HTTP headers

2013-11-20 Thread Eugen Paraschiv
Hi - a quick question about a low(ish)-level usecase of SolrJ.
I am trying to set a custom HTTP Header on the request that SolrJ is
sending out to the Solr Server - and as far as I can tell - there isn't a
clear way of doing that.
HttpSolrServer.request creates the HttpGet request and sends it - but
doesn't expose it at any point, nor does it allow setting any custom
headers on it.
Since this is a relatively general case - and I am assuming that
non-trivial applications may make use of various custom headers - would it
make sense to add an overloaded request method here?
One very simple option would be to simply pass in a map.
A more general way would be to allow the client to pass in the method
itself (HttpRequestBase method) - especially since the code that creates
the method is very straightforward.
Should I open a JIRA to track this?
Any help is appreciated.
Cheers,
Eugen.
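
For concreteness, a sketch of what the map-based overload could look like.
This is purely the proposed API shape, nothing SolrJ provides today, and the
body is deliberately left unimplemented.

import java.io.IOException;
import java.util.Map;

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.util.NamedList;

/** Hypothetical API shape only; SolrJ does not offer this today. */
public class HeaderAwareSolrServer extends HttpSolrServer {

    public HeaderAwareSolrServer(String baseUrl) {
        super(baseUrl);
    }

    /** Proposed overload: same semantics as request(), plus per-call headers. */
    public NamedList<Object> request(SolrRequest request, Map<String, String> headers)
            throws SolrServerException, IOException {
        // A real implementation would set each header on the HttpRequestBase
        // before executing it -- exactly the part the current class keeps private.
        throw new UnsupportedOperationException("sketch of the proposed API only");
    }
}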


Extensibility of HttpSolrServer

2013-11-20 Thread Eugen Paraschiv
Hi,
Quick question about the HttpSolrServer implementation - I would like to
extend some of the functionality of this class - but when I extend it - I'm
having issues with how extensible it is.
For example - some of the details are not visible externally - setters
exist for maxRetries and followRedirects but no getters.
It would really help to make this class a bit more extensible - I'm sure it's
usually enough, but when it does need to be extended - it would make sense
to allow that rather than have the client implement an alternative version of it
via copy-paste (which looks like the only option available right now).
Hope this makes sense.
Cheers,
Eugen.


Re: Suggester - how to return exact match?

2013-11-20 Thread Developer
Maybe there is a way to do this but it doesn't make sense to return the same
search query as a suggestion (the search query is not a suggestion, as it might
or might not be present in the index).

AFAIK you can use various lookup algorithms to get the suggestion list, and
they look up terms based on the query value (some algorithms implement
fuzzy logic too), so searching Foo will return FooBar and Foo2 but not foo.

You should fetch the suggestions only if numFound is greater than 0; otherwise
you don't have any suggestions.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Suggester-how-to-return-exact-match-tp4102203p4102259.html
Sent from the Solr - User mailing list archive at Nabble.com.
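
For reference, a hedged Solr 4.x configuration wiring up one such lookup
implementation, the case-sensitive FST lookup, which matches the behavior
described above ("Foo" can suggest "FooBar" and "Foo2" but not "foo").
Component, handler, and field names are illustrative.

<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <!-- one of several pluggable lookups; fuzzy variants also exist -->
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookupFactory</str>
    <str name="field">name</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">5</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>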


Re: Extensibility of HttpSolrServer

2013-11-20 Thread Mark Miller
Feel free to file a JIRA issue with the changes you think make sense.

- Mark

On Nov 20, 2013, at 4:21 PM, Eugen Paraschiv  wrote:

> Hi,
> Quick question about the HttpSolrServer implementation - I would like to
> extend some of the functionality of this class - but when I extend it - I'm
> having issues with how extensible it is.
> For example - some of the details are not visible externally - setters
> exist for maxRetries and followRedirects but no getters.
> It would really help to make this class a bit more extensible - I'm sure it
> usually enough, but when it does need to be extended - it would make sense
> to allow that rather than the client implement alternative version of it
> via copy-paste (which looks like the only option available right now).
> Hope this makes sense.
> Cheers,
> Eugen.



Re: Extensibility of HttpSolrServer

2013-11-20 Thread Eugen Paraschiv
Will do - thanks for the quick feedback.
Eugen.


On Thu, Nov 21, 2013 at 12:06 AM, Mark Miller  wrote:

> Feel free to file a JIRA issue with the changes you think make sense.
>
> - Mark
>
> On Nov 20, 2013, at 4:21 PM, Eugen Paraschiv  wrote:
>
> > Hi,
> > Quick question about the HttpSolrServer implementation - I would like to
> > extend some of the functionality of this class - but when I extend it -
> I'm
> > having issues with how extensible it is.
> > For example - some of the details are not visible externally - setters
> > exist for maxRetries and followRedirects but no getters.
> > It would really help to make this class a bit more extensible - I'm sure
> it
> > usually enough, but when it does need to be extended - it would make
> sense
> > to allow that rather than the client implement alternative version of it
> > via copy-paste (which looks like the only option available right now).
> > Hope this makes sense.
> > Cheers,
> > Eugen.
>
>


Re: Extensibility of HttpSolrServer

2013-11-20 Thread Shawn Heisey

On 11/20/2013 2:21 PM, Eugen Paraschiv wrote:

Quick question about the HttpSolrServer implementation - I would like to
extend some of the functionality of this class - but when I extend it - I'm
having issues with how extensible it is.
For example - some of the details are not visible externally - setters
exist for maxRetries and followRedirects but no getters.
It would really help to make this class a bit more extensible - I'm sure it
usually enough, but when it does need to be extended - it would make sense
to allow that rather than the client implement alternative version of it
via copy-paste (which looks like the only option available right now).


The specific examples that you have given are things that typically get
set when the object is created and never get changed.  If you really need
access to maxRetries or other things that are private but have no
getter, you can implement local fields, pass-through setters, and the
getters you want.  At the URL below, you can see an implementation that
exposes a getter for maxRetries and has no warnings.  This will allow
you to expose internal details that are not available upstream:


http://apaste.info/70Jh

For followRedirects, this is a parameter for the HttpClient.  There is a 
getter for the HttpClient, and with that, you can look up and change 
just about anything you want.


A question for java experts and committers with a lot of experience ... 
are there compelling reasons to keep so many details in HttpSolrServer 
private rather than protected?  Should getters be implemented for all 
private fields?


Thanks,
Shawn
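
In case the paste link goes stale, a rough sketch of the pattern described
above: shadow the value locally through a pass-through setter and expose the
missing getter. The class name is illustrative, and this is a reconstruction,
not the pasted code.

import org.apache.solr.client.solrj.impl.HttpSolrServer;

/** Illustrative subclass that exposes a getter for maxRetries. */
public class InspectableHttpSolrServer extends HttpSolrServer {

    // local shadow of the upstream private field (HttpSolrServer defaults to 0)
    private int maxRetries = 0;

    public InspectableHttpSolrServer(String baseUrl) {
        super(baseUrl);
    }

    @Override
    public void setMaxRetries(int maxRetries) {
        super.setMaxRetries(maxRetries);
        this.maxRetries = maxRetries;
    }

    public int getMaxRetries() {
        return maxRetries;
    }
}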



Re: Extensibility of HttpSolrServer

2013-11-20 Thread Eugen Paraschiv
The reason I needed access to internal details of the class - and it's not
just these 2 fields (I used these just as a quick example) - was that I was
trying to extend the class and overload the request method. As soon as I
tried to do that, I noticed that I really couldn't easily do so - because
many of the fields have no getters and are not protected either (as you
pointed out).
So - it's not specifically about these two particular fields - it's more
about the overall extensibility of the class.
The class is very closed off in terms of its API - yet it stands to reason
that it may need to be extended for some specific usecases (for example I am trying
to allow setting custom HTTP Headers on the GET request before sending it)
- this is mainly why I was asking if it would make sense to try to open it
up a little and make it more extensible.
Hope that makes sense.
Cheers,
Eugen.


On Thu, Nov 21, 2013 at 12:21 AM, Shawn Heisey  wrote:

> On 11/20/2013 2:21 PM, Eugen Paraschiv wrote:
>
>> Quick question about the HttpSolrServer implementation - I would like to
>> extend some of the functionality of this class - but when I extend it -
>> I'm
>> having issues with how extensible it is.
>> For example - some of the details are not visible externally - setters
>> exist for maxRetries and followRedirects but no getters.
>> It would really help to make this class a bit more extensible - I'm sure
>> it
>> usually enough, but when it does need to be extended - it would make sense
>> to allow that rather than the client implement alternative version of it
>> via copy-paste (which looks like the only option available right now).
>>
>
> The specific examples that you have given are things that typically get
> set when the object is created and never get changed.  If you really need
> access to maxRetries or other things that are private but have no getter,
> you can implement local fields, pass-through setters, and the getters you
> want.  At the URL below, you can see an implementation that exposes a getter
> for maxRetries and has no warnings.  This will allow you to expose internal
> details that are not available upstream:
>
> http://apaste.info/70Jh
>
> For followRedirects, this is a parameter for the HttpClient.  There is a
> getter for the HttpClient, and with that, you can look up and change just
> about anything you want.
>
> A question for java experts and committers with a lot of experience ...
> are there compelling reasons to keep so many details in HttpSolrServer
> private rather than protected?  Should getters be implemented for all
> private fields?
>
> Thanks,
> Shawn
>
>


Re: Extensibility of HttpSolrServer

2013-11-20 Thread Chris Hostetter

: The reason I needed access to internal details of the class - and it's not
: just these 2 fields (I used these just as a quick example) - was that I was
: trying to extend the class and overload the request method. As soon as I
: tried to do that, I noticed that I really couldn't easily do so - because
: much of the fields has no getters and were not protected either (as you
: pointed out).

In a lot of cases, this is generally intentional to help reduce the 
surface area of the API.  The less you can do in a subclass, the more 
flexibility there is in modifying the internals of a base class, and the 
less likely it is that changes to the base class will break your subclass.

IIRC the key touch point for clients to customize HttpSolrServer is by 
being able to specify an arbitrary HttpClient that the HttpSolrServer 
uses for all of its communication.  By specifying your own HttpClient, 
you can do just about anything you might want, and I'm hard pressed to 
think of anything we'd want to let clients do that *can't* be done by 
specifying an arbitrary HttpClient.

: that it may be extended for some specific usecases (for example I am trying
: to allow setting custom HTTP Headers on the GET request before sending it)

This is pretty easy at the HttpClient level using RequestWrapper around 
any GET request you are asked to execute.


-Hoss
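
A minimal sketch of that HttpClient-level approach, using a request
interceptor rather than wrapping each request by hand: the interceptor stamps
a header on every request the client sends, and the client is handed to
HttpSolrServer at construction time. The header name is illustrative; this
targets the HttpClient 4.x API that SolrJ 4.x builds on.

import java.io.IOException;

import org.apache.http.HttpException;
import org.apache.http.HttpRequest;
import org.apache.http.HttpRequestInterceptor;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.protocol.HttpContext;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class CustomHeaderSolrClient {

    public static HttpSolrServer create(String baseUrl) {
        DefaultHttpClient httpClient = new DefaultHttpClient();
        // Every request SolrJ sends through this client gets the extra header.
        httpClient.addRequestInterceptor(new HttpRequestInterceptor() {
            @Override
            public void process(HttpRequest request, HttpContext context)
                    throws HttpException, IOException {
                request.setHeader("X-Custom-Feature", "enabled"); // illustrative
            }
        });
        return new HttpSolrServer(baseUrl, httpClient);
    }
}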


Re: Extensibility of HttpSolrServer

2013-11-20 Thread Shawn Heisey

On 11/20/2013 3:28 PM, Eugen Paraschiv wrote:

The reason I needed access to internal details of the class - and it's not
just these 2 fields (I used these just as a quick example) - was that I was
trying to extend the class and overload the request method. As soon as I
tried to do that, I noticed that I really couldn't easily do so - because
much of the fields has no getters and were not protected either (as you
pointed out).
So - it's not specifically about this to particular fields - it's more
about the overall extensibility of the class.
The class is very closed off in terms of the API - so it stands to reason
that it may be extended for some specific usecases (for example I am trying
to allow setting custom HTTP Headers on the GET request before sending it)
- this is mainly why I was asking if it would make sense to try to open it
up a little and make it more extensible.


That makes perfect sense.  As Mark suggested, please file an issue in 
Jira.  We can then figure out on the back end exactly what to do.


What kind of HTTP headers are you wanting to send?  SolrJ should already 
send all the headers that Solr requires.  If there's a compelling 
general use-case, we might want a Jira issue that makes it possible to 
define custom headers for all SolrServer implementations.


Thanks,
Shawn



Re: Extensibility of HttpSolrServer

2013-11-20 Thread Eugen Paraschiv
First - I completely agree with keeping the moving parts to a minimum - but
I do think that's a case by case decision, and in this particular case - it
may just be worth opening up a little.
Then - adding in a custom HttpClient may work - but HttpHeaders are set on
a request (and may differ from one request to the other) - so doing it in a
custom http client may be technically doable, but it's probably going to be
(and look like - from the API POV) a workaround.
As a quick example - I'm using one custom header to activate some
functionality in Solr, and another - to activate another type of
functionality. Ideally, I think the API should be:
solrServer.request(request, headers); // where headers is a final
Map<String, String>
Or if that's too specific:
solrServer.request(request, new HttpGet(...));
Now - I can definitely see how that may or may not be OK with the direction
of the API - which is why I was just thinking of extending the
HttpSolrServer and adding that method for my own version. And that is where
I ran into the problems with extensibility.
I think that - without going into adding overloaded methods - a quick
change would be to carefully open some of the internal details up so that
they can be used by an extending class - either by making them protected,
or by adding in getters. Since setters do exist - I also think getters
won't add too much conceptual load or moving parts to the logic.
Cheers,
Eugen.


On Thu, Nov 21, 2013 at 12:49 AM, Shawn Heisey  wrote:

> On 11/20/2013 3:28 PM, Eugen Paraschiv wrote:
>
>> The reason I needed access to internal details of the class - and it's not
>> just these 2 fields (I used these just as a quick example) - was that I
>> was
>> trying to extend the class and overload the request method. As soon as I
>> tried to do that, I noticed that I really couldn't easily do so - because
>> much of the fields has no getters and were not protected either (as you
>> pointed out).
>> So - it's not specifically about this to particular fields - it's more
>> about the overall extensibility of the class.
>> The class is very closed off in terms of the API - so it stands to reason
>> that it may be extended for some specific usecases (for example I am
>> trying
>> to allow setting custom HTTP Headers on the GET request before sending it)
>> - this is mainly why I was asking if it would make sense to try to open it
>> up a little and make it more extensible.
>>
>
> That makes perfect sense.  As Mark suggested, please file an issue in
> Jira.  We can then figure out on the back end exactly what to do.
>
> What kind of HTTP headers are you wanting to send?  SolrJ should already
> send all the headers that Solr requires.  If there's a compelling general
> use-case, we might want a Jira issue that makes it possible to define
> custom headers for all SolrServer implementations.
>
> Thanks,
> Shawn
>
>


Re: How to Configure Highlighting for Solr

2013-11-20 Thread Furkan KAMACI
I have implemented a search API that interacts with Solr. I don't retrieve
the id field. The id field is a transformed version of the name field, and it
helps make searching the index quicker. It would be nice to declare to Solr
that I have another field that is unique too, and to group the
highlighting on that unique field instead of id.

Thanks;
Furkan KAMACI


On Wednesday, November 20, 2013, Stefan Matheis
wrote:
> Solr is using the UniqueKey you defined for your documents, that
shouldn't be a problem, since you can lookup the document from the list of
documents in the main response?
>
> And there is actually a ticket, which would allow it to inline the
highlight response with DocTransfomers:
https://issues.apache.org/jira/browse/SOLR-3479
>
> -Stefan
>
>
> On Wednesday, November 20, 2013 at 4:37 PM, Furkan KAMACI wrote:
>
>> I have setup my highlight as follows:
>>
>> <bool name="hl">true</bool>
>> <str name="hl.fl">name age address</str>
>>
>> However I don't want *name* be highlighted *but *included inside
response:
>>
>> "highlighting": {
>> Something_myid: {
>> name: "Something bla bla",
>> age: "Something age bla bla",
>> address: "Something age bla bla"
>> }
>> }
>>
>> *or:*
>>
>> I want to group them on name field instead of id:
>>
>> "highlighting": {
>> Something bla bla: {
>> age: "Something age bla bla",
>> address: "Something age bla bla"
>> }
>> }
>>
>> *or*
>>
>> "highlighting": {
>> Something bla bla: {
>> name: "Something bla bla",
>> age: "Something age bla bla",
>> address: "Something age bla bla"
>> }
>> }
>>
>> How can I do *any* of them at Solr 4.5.1?
>
>


csv does not return custom fields (distance)

2013-11-20 Thread GaneshSe
I am using the spatial search feature in Solr (4.0).

When I try to extract CSV (using the wt=csv option) with the edismax
parser, I don't get all the fields specified in the fl parameter in the CSV
output.  Only the schema fields come out in the CSV; the score and
pseudo-fields like distance, as specified below, do not come out in
the csv file.  But I am able to get them with the wt=xml option.

q=+(Name:abcd)&sfield=location&rows=100&defType=edismax&pt=40.721587,-73.886938&q.op=OR&isShard=true&start=0&fl=*,score,dist:geodist()&wt=csv

The above is not the complete query.

I would like to have distance in the CSV output; any help please?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/csv-does-not-return-custom-fields-distance-tp4102313.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to retain the original format of input document in search results in SOLR - Tomcat

2013-11-20 Thread ramesh py
Hi All,



I am new to Apache Solr. Recently I was able to configure Solr with
Tomcat successfully, and it is working fine except for the format of the search
results; i.e., the format of the search results does not match that of the
input document.



I am doing the below things



1.   Indexing the xml file into solr

2.   Format of the xml as below

<doc>
  <field name="F1">some text</field>
  <field name="F2">Title1: descriptions of the title
    Title2 : description of the title2
    Title3 : description of title3</field>
  <field name="F3">some text</field>
</doc>

3.   After indexing, the results are displayed in the format below.



*F1 : *some text

*F2*: Title1: descriptions of the title Title2 : description of the title2
Title3 : description of title3

*F3*: some text



*Expected Result :*



*F1 : *some text

*F2*: Title1: descriptions of the title

  Title2 : description of the title2

  Title3 : description of title3

*F3*: some text





If we look at the F2 field, the format is getting changed; i.e., the input format
of the F2 field is line by line for each subtitle, but in the result it
is displayed as a single line.





I would like the result displayed so that whenever any subtitle occurs in the xml
file for any field, that subtitle appears on its own line in the
results.



Can anyone please help on this. Thanks in advance.





Regards,

Ramesh p.y

-- 
Ramesh P.Y
pyrames...@gmail.com
Mobile No:+91-9176361984


How to configure new path for velocity for SolrCloud?

2013-11-20 Thread John W.Lee
I deployed a solrcloud with three instances of solr, which use conf/velocity by
default.
Now I want to add another velocity configuration folder, conf/new_vel, but I
couldn't make it work when I change the  element of the request
handler in solrconfig.xml.
I have tried conf/new_vel and /var/solr/test/conf/new_vel; neither of these
works.
P.S. test is my collection name.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-configure-new-path-for-velocity-for-SolrCloud-tp4102335.html
Sent from the Solr - User mailing list archive at Nabble.com.