Re: not getting any mails

2014-05-12 Thread Aman Tandon
Thanks Erick.

With Regards
Aman Tandon


On Sun, May 11, 2014 at 9:48 AM, Erick Erickson wrote:

> There was an infrastructure problem, _nobody_ was getting e-mails. I
> think it's fixed now.
>
> But the backlog will take a while to work through.
>
> Erick
>
> On Sat, May 10, 2014 at 5:30 AM, Aman Tandon 
> wrote:
> > Hi,
> >
> > I am not getting any mails from this group. Did my subscription end?
> > Can anybody help?
> >
> > With Regards
> > Aman Tandon
>


spellcheck if docs found below threshold

2014-05-12 Thread Jan Verweij - Reeleez
Hi,

Is there a setting to only include spellcheck if the number of documents
found is below a certain threshold?

Or would we need to rerun the request with the spellcheck parameters based
on the docs found?

Kind regards,

Jan Verweij
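There is no built-in Solr parameter for this, so the second option usually means client-side logic. A minimal sketch of the two-pass approach (the threshold value and parameter set are illustrative assumptions, not anything Jan's setup prescribes):

```python
# Only re-issue the request with spellcheck enabled when the first
# pass found too few documents. THRESHOLD is a hypothetical cut-off.
THRESHOLD = 5

def needs_spellcheck(num_found, threshold=THRESHOLD):
    # Client-side rule: suggestions are only useful for small result sets.
    return num_found < threshold

def with_spellcheck(params):
    # Return a copy of the request parameters with spellcheck turned on.
    retry = dict(params)
    retry.update({"spellcheck": "true", "spellcheck.collate": "true"})
    return retry

base = {"q": "ipod", "rows": "10"}
if needs_spellcheck(2):
    retry = with_spellcheck(base)  # send this as the second request
```

The first response's numFound drives the decision; only when it falls below the threshold does the client pay for the second, spellcheck-enabled round trip.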


SWF content not indexed

2014-05-12 Thread Mauro Gregorio Binetti
Hi guys,
how can I make it possible to index the content of SWF files? I'm using Solr
3.6.0.

Regards,
Mauro


Replica active during warming

2014-05-12 Thread lboutros
Dear All,

We just finished the migration of a cluster from Solr 4.3.1 to Solr 4.6.1.
With Solr 4.3.1 a node was not considered active before the end of the
warming process.

Now, with Solr 4.6.1, a replica is considered active during the warming
process.
This means that if you restart a replica or create a new one, queries will
be sent to that replica and will hang until the end of the warming
process (we do not use cold searchers).

We have quite long warming queries and this is a big issue.
Is there a parameter I am not aware of that controls this behavior?

thanks,

Ludovic.
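For reference, the cold-searcher setting Ludovic refers to lives in solrconfig.xml (a config sketch of the relevant knob, not a fix for the changed active-state behavior):

```xml
<!-- solrconfig.xml: with useColdSearcher=false, requests block until the
     first searcher has finished warming instead of using an unwarmed one. -->
<useColdSearcher>false</useColdSearcher>
```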



-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Replica-active-during-warming-tp4135274.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solrj problem

2014-05-12 Thread blach
any solution please :)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solrj-problem-tp4135030p4135046.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: DataImport using SqlEntityProcessor running Out of Memory

2014-05-12 Thread Shawn Heisey
On 5/9/2014 9:16 AM, O. Olson wrote:
> I have a Data Schema which is Hierarchical i.e. I have an Entity and a number
> of attributes. For a small subset of the Data - about 300 MB, I can do the
> import with 3 GB memory. Now with the entire 4 GB Dataset, I find I cannot
> do the import with 9 GB of memory. 
> I am using the SqlEntityProcessor as below: 
> 
> <dataSource
>  url="jdbc:sqlserver://localhost\MSSQLSERVER;databaseName=SolrDB;user=solrusr;password=solrusr;"/>

Upgrade your JDBC driver to 1.2 or later, or turn on response buffering.
 The following URL has this information.  It's a very long URL, so if
your mail client wraps it, you may not be able to click on it properly:

http://wiki.apache.org/solr/DataImportHandlerFaq#I.27m_using_DataImportHandler_with_MS_SQL_Server_database_with_sqljdbc_driver._DataImportHandler_is_going_out_of_memory._I_tried_adjustng_the_batchSize_values_but_they_don.27t_seem_to_make_any_difference._How_do_I_fix_this.3F
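In short, the FAQ's fix amounts to adding the buffering option to the JDBC URL. A hedged sketch of the resulting dataSource element (the driver class shown is the standard Microsoft one, included here as an assumption):

```xml
<!-- DIH dataSource with adaptive response buffering, so the MS SQL driver
     streams rows instead of holding the whole result set in memory. -->
<dataSource type="JdbcDataSource"
            driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
            url="jdbc:sqlserver://localhost\MSSQLSERVER;databaseName=SolrDB;user=solrusr;password=solrusr;responseBuffering=adaptive;"/>
```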

Thanks,
Shawn



Re: Solrcore.properties variable question.

2014-05-12 Thread Shawn Heisey
On 5/8/2014 2:01 AM, Guido Medina wrote:
> We have a couple of Solr servers acting as master and slave, and each
> server has the same number of cores. We are trying to configure
> solrcore.properties so that a script is able to add cores without
> changing solrcore.properties, using a hack like this:
> 
> enable.master=false
> enable.slave=true
> master_url=http://master_solr:8983/solr/${solr.core.name}
> 
> Our idea is to have solr.core.name be the dynamic variable, but once
> we go to the admin UI, the master URL is not showing the last part. Is
> there a format error or something trivial I'm missing?

This works in solrconfig.xml, but I have never tried it in
core.properties (with the new solr.xml format).

I don't know if the fact that the property doesn't work in a properties
file is a bug or not, but I would advise opening a new issue in Jira.
You need an account on the Apache Jira install.

https://issues.apache.org/jira/browse/SOLR/

You can use the solr.core.name property in the master URL in the replication
handler in solrconfig.xml for sure.  I used to do this a long time ago
when I was using replication.  It proved much more advantageous to
update each index copy independently, so replication is no longer used.
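The working solrconfig.xml form Shawn describes looks roughly like this (a sketch; the poll interval and handler name are assumptions):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- ${solr.core.name} is substituted per core, so one config
         serves every core on the slave -->
    <str name="masterUrl">http://master_solr:8983/solr/${solr.core.name}</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```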

Thanks,
Shawn



Easiest way to install SolrCloud with Tomcat

2014-05-12 Thread Aman Tandon
Hi,

I tried to set up SolrCloud with Jetty, which works fine. But in our
production environment we use Tomcat, so I need to set up SolrCloud
with Tomcat. Please help me with how to set up SolrCloud with
Tomcat on a single machine.

Thanks in advance.

With Regards
Aman Tandon


Re: Website running Solr

2014-05-12 Thread Shawn Heisey
On 5/11/2014 10:55 AM, Olivier Austina wrote:
> Is there a way to know if a website use Solr? Thanks.

Paul's answer is correct.  There is usually no way to know for sure,
unless you ask the website operators.  Secure implementations will not
expose Solr to the outside world.

If you see evidence that faceted search is happening, there is a
reasonable chance that it's Solr ... but even then you cannot be sure,
because other search products have facets too.  A strong example of
faceted search is the "narrow your choices" options in the left column
on Newegg.  I believe they are using Solr, but I do not know that for sure.

http://www.newegg.com/All-Desktop-Hard-Drives/SubCategory/ID-14
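As a toy illustration of the "evidence" idea, one could look for Solr-style parameter names leaking into a site's search URLs. This is a heuristic only, and absence proves nothing; the parameter list below is an assumption about what typically leaks through thin frontends:

```python
from urllib.parse import urlparse, parse_qs

# Parameter names that sometimes leak through frontends built on Solr.
SOLR_HINTS = {"facet", "facet.field", "fq", "qf", "defType", "wt"}

def looks_like_solr(url):
    # Heuristic: any Solr-style parameter in the query string is a weak hint.
    params = set(parse_qs(urlparse(url).query))
    return bool(SOLR_HINTS & params)

print(looks_like_solr("http://example.com/s?q=tv&fq=brand:lg&facet=true"))  # True
print(looks_like_solr("http://example.com/s?q=tv"))                         # False
```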

Thanks,
Shawn



SolrCloud - Highly Reliable / Scalable Resources?

2014-05-12 Thread Darren Lee
Hi everyone,

We have been using Solr Cloud (4.4) for ~6 months now. Functionally it's
excellent, but we have suffered several issues which always seem quite
problematic to resolve.

I was wondering if anyone in the community can recommend good resources / 
reading for setting up a highly scalable / highly reliable cluster. A lot of 
what I see in the solr documentation is aimed at small setups or is quite 
sparse.

Dealing with topics like:

* Capacity planning

* Losing nodes

* Voting panic

* Recovery failure

* Replication factors

* Elasticity / Auto scaling / Scaling recipes

* Exhibitor

* Container configuration, concurrency limits, packet drop tuning

* Increasing capacity without downtime

* Scalable approaches to full indexing of hundreds of millions of documents

* External health check vs CloudSolrServer

* Separate vs local zookeeper

* Benchmarks


Sorry, I know that's a lot to ask heh. We are going to run a project for a 
month or so soon where we re-write all our run books and do deeper testing on 
various failure scenarios and the above but any starting point would be much 
appreciated.

Thanks all,
Darren


Re: ContributorsGroup add request

2014-05-12 Thread Erick Erickson
Done. I think this was done a couple of days ago actually, but the
mailing lists have been a bit messed up.

On Sat, May 10, 2014 at 5:35 PM, Jim Martin  wrote:
> Greetings-
>
>Please add me to the ContributorsGroup; I've got some Solr icons I'd like to
> suggest to the community. Perhaps down the road I can contribute more. I'm
> the team lead at Overstock.Com for search, and Solr is the foundation of
> what we do.
>
>Username: JamesMartin
>
> Thanks,
> -Jim


Re: retrieve all the fields in join

2014-05-12 Thread Erick Erickson
Any time you find yourself trying to use Solr like a DB, stop.

Solr joins are _not_ DB joins, the data from the "from" core is not
returned (I think there are a few special cases where you can make
this happen though).

Try denormalizing your data if at all possible; that's what Solr does
best... searching single records.

Best,
Erick

On Sun, May 11, 2014 at 6:40 PM, Aman Tandon  wrote:
> please help me out here!!
>
> With Regards
> Aman Tandon
>
>
> On Sun, May 11, 2014 at 1:44 PM, Aman Tandon wrote:
>
>> Hi,
>>
> >> Is there a way to retrieve all the fields present in both
> >> cores (core1 and core2)?
>>
>> e.g.
>> core1: {id:111,name: "abc" }
>>
>> core2: {page:17, type: "fiction"}
>>
> >> What I want is that, on querying both cores, I retrieve results
> >> containing all 4 fields: id and name from core1, and page and type from
> >> core2. Is it possible?
>>
>> With Regards
>> Aman Tandon
>>


Re: Easiest way to install SolrCloud with Tomcat

2014-05-12 Thread Erick Erickson
What have you already tried to solve the problem yourself? What are
you having difficulty with?

I suggest you review:

http://wiki.apache.org/solr/UsingMailingLists

Best,
Erick

On Mon, May 12, 2014 at 12:54 AM, Aman Tandon  wrote:
> Hi,
>
> I tried to set up solr cloud with jetty which works fine. But in our
> production environment we uses tomcat so i need to set up the solr cloud
> with the tomcat. So please help me out to how to setup solr cloud with
> tomcat on single machine.
>
> Thanks in advance.
>
> With Regards
> Aman Tandon


Re: retreive all the fields in join

2014-05-12 Thread Aman Tandon
Yeah, I understand, but I got the requirement from top management. The
requirements are:
core1: in this we want to keep the supplier activity points
case 2: we want to boost those records which are present in core1 by the
amount of supplier activity points.

I know we can keep that supplier score in the same core, but that requires
full indexing of the 12M records, while there are only about 1 lakh (100,000)
suppliers, which won't cost much.

With Regards
Aman


On Mon, May 12, 2014 at 7:44 PM, Erick Erickson wrote:

> Any time you find yourself trying to use Solr like a DB, stop.
>
> Solr joins are _not_ DB joins, the data from the "from" core is not
> returned (I think there are a few special cases where you can make
> this happen though).
>
> Try denormalizing your data if at all possible; that's what Solr does
> best... searching single records.
>
> Best,
> Erick
>
> On Sun, May 11, 2014 at 6:40 PM, Aman Tandon 
> wrote:
> > please help me out here!!
> >
> > With Regards
> > Aman Tandon
> >
> >
> > On Sun, May 11, 2014 at 1:44 PM, Aman Tandon  >wrote:
> >
> >> Hi,
> >>
> >> Is there a way possible to retrieve all the fields present in both the
> >> cores(core 1 and core2).
> >>
> >> e.g.
> >> core1: {id:111,name: "abc" }
> >>
> >> core2: {page:17, type: "fiction"}
> >>
> >> I want is that, on querying both the cores I want to retrieve the
> results
> >> containing all the 4 fields, fields id, name from core1 and page, type
> from
> >> core2. Is it possible?
> >>
> >> With Regards
> >> Aman Tandon
> >>
>


solr 4.8 Leader Problem

2014-05-12 Thread adfel70
*Solr &Collection Info:*
Solr 4.8 , 4 shards, 3 replicas per shard, 30-40 million docs per shard. 

Process:
1. Indexing 100-200 docs per second. 
2. Doing pkill -9 java on 2 replicas (not the leader) in shard 3 (while
indexing).
3. Indexing for 10-20 minutes and doing a hard commit.
4. Doing pkill -9 java on the leader and then starting one replica in shard
3 (while indexing).
5. After 20 minutes, starting another replica in shard 3, while indexing (not
the leader from step 1).
6. After 10 minutes, starting the replica that was the leader in step 1.

*Results:*
2. Only the leader is active in shard 3. 
3. Thousands of docs were added to the leader in shard 3. 
4. After starting the replica, its state was down, and after 10 minutes it
became the leader in the cluster state (and still down): no servers hosting
shards for index and search requests.
5. After starting another replica, its state was recovering for 2-3 minutes
and then it became active (not leader in the cluster state).
   Index, commit and search requests are handled by the other replica
(active status, not leader!!!).
   The search results do not include docs that were indexed to the leader
in step 3.
6. Syncing with the active replica.

*Expected:*
5. To stay in down status. 
   Not to handle index, commit and search requests - no servers hosting
shards!
6. Become the leader. 

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-4-8-Leader-Problem-tp4135306.html
Sent from the Solr - User mailing list archive at Nabble.com.


Spell check [or] Did you mean this with Phrase suggestion

2014-05-12 Thread vanitha venkatachalam
Hi,
We need a spell check component that suggests the actual full phrase, not
just words.

Say we have a list of brands: "Nike corporation", "Samsung electronics",

when I search for "tamsong", I would like to get the suggestion "samsung
electronics" (the full phrase), not just "samsung" (a word).
Please help.
-- 
regards,
Vanitha
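One common approach (a sketch, under the assumption that the brands live in a field of their own): build the spellcheck dictionary from a copy of that field whose analyzer keeps each brand as a single token, so suggestions come back as whole phrases rather than individual words:

```xml
<!-- schema.xml: keyword-tokenized copy of the brand field, used only
     as the source for the spellcheck dictionary -->
<fieldType name="phrase_spell" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="brand_spell" type="phrase_spell" indexed="true" stored="false"/>
<copyField source="brand" dest="brand_spell"/>
```

Pointing the spellcheck component's field at brand_spell then makes "samsung electronics" a single dictionary entry.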


URLDataSource : indexing from other Solr servers

2014-05-12 Thread helder.sepulveda
I've been trying to index data from other Solr servers, but the import always
shows:
Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.
Requests: 1, Fetched: 0, Skipped: 0, Processed

My data config looks like this:



Any help will be greatly appreciated



--
View this message in context: 
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Too many documents Exception

2014-05-12 Thread Greg Walters
Looks like you've hit an internal limitation of Lucene, see 
http://lucene.apache.org/core/3_0_3/fileformats.html#Limitations:

When referring to term numbers, Lucene's current implementation uses a 
Java int to hold the term index, which means the maximum number of unique terms 
in any single index segment is ~2.1 billion times the term index interval 
(default 128) = ~274 billion. This is technically not a limitation of the index 
file format, just of Lucene's current implementation.

Similarly, Lucene uses a Java int to refer to document numbers, and the index 
file format uses an Int32 on-disk to store document numbers. This is a 
limitation of both the index file format and the current implementation. 
Eventually these should be replaced with either UInt64 values, or better yet, 
VInt values which have no limit. 

To fix this you might try splitting your collection into multiple collections 
or using SolrCloud with multiple shards as SolrCloud maintains each shard 
internally as a unique Lucene index. If you're unable to reindex to accomplish 
the split or sharding you might be able to use some of Lucene's CLI tools to 
optimize the index purging deleted documents or to split it into multiple 
indexes.

Thanks,
Greg

On May 6, 2014, at 7:54 PM, [Tech Fun]山崎  wrote:

> Hello everybody,
> 
> Solr 4.3.1 (and 4.7.1): Num Docs + Deleted Docs exceeds
> 2147483647 (Integer.MAX_VALUE), and I get:
> Caused by: java.lang.IllegalArgumentException: Too many documents,
> composite IndexReaders cannot exceed 2147483647
> 
> It seems to be trouble similar to this unresolved e-mail:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/browser
> 
> How can I fix this?
> Is this a Solr limitation?
> 
> 
> log.
> 
> ERROR org.apache.solr.core.CoreContainer  – Unable to create core: collection1
> org.apache.solr.common.SolrException: Error opening new searcher
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
>at 
> org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
>at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
>at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
>at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
>at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1438)
>at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1550)
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:796)
>... 13 more
> Caused by: org.apache.solr.common.SolrException: Error opening Reader
>at 
> org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:172)
>at 
> org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:183)
>at 
> org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:179)
>at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1414)
>... 15 more
> Caused by: java.lang.IllegalArgumentException: Too many documents,
> composite IndexReaders cannot exceed 2147483647
>at 
> org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
>at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:368)
>at 
> org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
>at 
> org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
>at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
>at 
> org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
>at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:88)
>at 
> org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
>at 
> org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:169)
>... 18 more
> ERROR org.apache.solr.core.CoreContainer  –
> null:org.apache.solr.common.SolrException: Unable to create core:
> collection1
>at 
> org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
>at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
>at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
>at org.apache.solr.core.CoreContainer$2.call(CoreC

Re: retrieve all the fields in join

2014-05-12 Thread Walter Underwood
Top management has given requirements that force a broken design. They are 
requiring something that is impossible with Solr.

1. Flatten the data. You get one table, no joins.

2. 12M records is not a big Solr index. That should work fine.

3. If the supplier activity points are updated frequently, you could use an 
external file field for those, but they still need to be flattened. 
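A sketch of that third point (field, key, and file names here are hypothetical):

```xml
<!-- schema.xml: per-supplier points kept outside the index so they can
     be refreshed without reindexing the documents themselves -->
<fieldType name="pointsFile" class="solr.ExternalFileField"
           keyField="supplier_id" defVal="0" valType="float"/>
<field name="supplier_points" type="pointsFile"
       indexed="false" stored="false"/>
```

The values typically come from a file such as external_supplier_points.txt in the index data directory, one `supplier_id=points` line per supplier, reloaded on commit.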

wunder

On May 12, 2014, at 7:21 AM, Aman Tandon  wrote:

> Yeah i understand but i got the requirement from the top management,
> requirements are:
> core1:  in this we want to keep the supplier activity points
> case 2: we want to boost those records which are present in core1 by the
> amount of supplier activity points.
> 
> I know we can keep that supplier score in same core but this requires the
> full indexing of 12M records and suppliers are of about 1lacs which won't
> cost much.
> 
> With Regards
> Aman
> 
> 
> On Mon, May 12, 2014 at 7:44 PM, Erick Erickson 
> wrote:
> 
>> Any time you find yourself trying to use Solr like a DB, stop.
>> 
>> Solr joins are _not_ DB joins, the data from the "from" core is not
>> returned (I think there are a few special cases where you can make
>> this happen though).
>> 
>> Try denormalizing your data if at all possible; that's what Solr does
>> best... searching single records.
>> 
>> Best,
>> Erick
>> 
>> On Sun, May 11, 2014 at 6:40 PM, Aman Tandon 
>> wrote:
>>> please help me out here!!
>>> 
>>> With Regards
>>> Aman Tandon
>>> 
>>> 
>>> On Sun, May 11, 2014 at 1:44 PM, Aman Tandon >> wrote:
>>> 
 Hi,
 
 Is there a way possible to retrieve all the fields present in both the
 cores(core 1 and core2).
 
 e.g.
 core1: {id:111,name: "abc" }
 
 core2: {page:17, type: "fiction"}
 
 I want is that, on querying both the cores I want to retrieve the
>> results
 containing all the 4 fields, fields id, name from core1 and page, type
>> from
 core2. Is it possible?
 
 With Regards
 Aman Tandon
 
>> 

--
Walter Underwood
wun...@wunderwood.org





Re: URLDataSource : indexing from other Solr servers

2014-05-12 Thread helder.sepulveda
Here is the data config:





<dataConfig>
  <dataSource type="URLDataSource"/>
  <document>
    <entity url="http://slszip11.as.homes.com/solr/select?q=*:*"
            processor="XPathEntityProcessor"
            forEach="/response/result/doc"
            transformer="DateFormatTransformer">
      ...
    </entity>
  </document>
</dataConfig>









--
View this message in context: 
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135331.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: URLDataSource : indexing from other Solr servers

2014-05-12 Thread helder.sepulveda
I tested calling the URL using curl right on the server, and I get a valid
response and the correct content




--
View this message in context: 
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135333.html
Sent from the Solr - User mailing list archive at Nabble.com.


autowarming queries

2014-05-12 Thread Joshi, Shital
Hi,

How many autowarming queries are supported per collection in Solr 4.4 and
higher? We see only one of our three queries in the log when a new searcher is
created. Shouldn't it print all searcher queries?

Thanks!
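For reference, the warming queries in question are the ones configured on the newSearcher listener in solrconfig.xml; a sketch with three hypothetical queries:

```xml
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">solr</str><str name="sort">price asc</str></lst>
    <lst><str name="q">rocks</str></lst>
    <lst><str name="q">static warming query</str></lst>
  </arr>
</listener>
```

Each entry is sent against the new searcher before it starts serving traffic, and each should appear in the log as an ordinary query.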





RE: autowarming queries

2014-05-12 Thread Joshi, Shital
We added an id (searcher3) to each searcher query, but it never
gets printed in the log file. Does Solr internally massage the searcher queries?

_
From: Joshi, Shital [Tech]
Sent: Monday, May 12, 2014 11:27 AM
To: 'solr-user@lucene.apache.org'
Subject: autowarming queries


Hi,

How many auto warming queries are supported per collection in Solr4.4 and 
higher? We see one out of three queries in log when new searcher is created. 
Shouldn't it print all searcher queries?

Thanks!





Re: retrieve all the fields in join

2014-05-12 Thread Erick Erickson
Well, you might explain to management that this isn't what Solr is
built to do.

12M records is actually quite small by Solr standards. I'd recommend
de-normalizing and demonstrating a working solution that uses Solr
as it's intended rather than how senior management wishes it were.
It'll be a lot easier/faster/more scalable, I'd guess, and with 12M records,
it will very likely fit on commodity hardware.

I'm afraid I can't really help you otherwise.

Best
Erick

On Mon, May 12, 2014 at 7:21 AM, Aman Tandon  wrote:
> Yeah i understand but i got the requirement from the top management,
> requirements are:
> core1:  in this we want to keep the supplier activity points
> case 2: we want to boost those records which are present in core1 by the
> amount of supplier activity points.
>
> I know we can keep that supplier score in same core but this requires the
> full indexing of 12M records and suppliers are of about 1lacs which won't
> cost much.
>
> With Regards
> Aman
>
>
> On Mon, May 12, 2014 at 7:44 PM, Erick Erickson 
> wrote:
>
>> Any time you find yourself trying to use Solr like a DB, stop.
>>
>> Solr joins are _not_ DB joins, the data from the "from" core is not
>> returned (I think there are a few special cases where you can make
>> this happen though).
>>
>> Try denormalizing your data if at all possible; that's what Solr does
>> best... searching single records.
>>
>> Best,
>> Erick
>>
>> On Sun, May 11, 2014 at 6:40 PM, Aman Tandon 
>> wrote:
>> > please help me out here!!
>> >
>> > With Regards
>> > Aman Tandon
>> >
>> >
>> > On Sun, May 11, 2014 at 1:44 PM, Aman Tandon > >wrote:
>> >
>> >> Hi,
>> >>
>> >> Is there a way possible to retrieve all the fields present in both the
>> >> cores(core 1 and core2).
>> >>
>> >> e.g.
>> >> core1: {id:111,name: "abc" }
>> >>
>> >> core2: {page:17, type: "fiction"}
>> >>
>> >> I want is that, on querying both the cores I want to retrieve the
>> results
>> >> containing all the 4 fields, fields id, name from core1 and page, type
>> from
>> >> core2. Is it possible?
>> >>
>> >> With Regards
>> >> Aman Tandon
>> >>
>>


Re: Easiest way to install SolrCloud with Tomcat

2014-05-12 Thread Aman Tandon
Can anybody help me out??

With Regards
Aman Tandon


On Mon, May 12, 2014 at 1:24 PM, Aman Tandon wrote:

> Hi,
>
> I tried to set up solr cloud with jetty which works fine. But in our
> production environment we uses tomcat so i need to set up the solr cloud
> with the tomcat. So please help me out to how to setup solr cloud with
> tomcat on single machine.
>
> Thanks in advance.
>
> With Regards
> Aman Tandon
>


Sorting by custom function query

2014-05-12 Thread Emanuele Filannino
Hi there,


I'm running into some issues developing a custom function query using Solr
3.6.2.

My goal is to be able to implement a custom sorting technique.


I have a field called daily_prices_str; it is a single-valued string field.


Example:




2014-05-01:130 2014-05-02:130 2014-05-03:130 2014-05-04:130 2014-05-05:130
2014-05-06:130 2014-05-07:130 2014-05-08:130 2014-05-09:130 2014-05-10:130
2014-05-11:130 2014-05-12:130 2014-05-13:130 2014-05-14:130 2014-05-15:130
2014-05-16:130 2014-05-17:130 2014-05-18:130 2014-05-19:130 2014-05-20:130
2014-05-21:130 2014-05-22:130 2014-05-23:130 2014-05-24:130 2014-05-25:130
2014-05-26:130 2014-05-27:130 2014-05-28:130 2014-05-29:130 2014-05-30:130
2014-05-31:130 2014-06-01:130 2014-06-02:130 2014-06-03:130 2014-06-04:130
2014-06-05:130 2014-06-06:130 2014-06-07:130 2014-06-08:130 2014-06-09:130
2014-06-10:130 2014-06-11:130 2014-06-12:130 2014-06-13:130 2014-06-14:130
2014-06-15:130 2014-06-16:130 2014-06-17:130 2014-06-18:130 2014-06-19:130
2014-06-20:130 2014-06-21:130 2014-06-22:130 2014-06-23:130 2014-06-24:130
2014-06-25:130 2014-06-26:130 2014-06-27:130 2014-06-28:130 2014-06-29:130
2014-06-30:130 2014-07-01:130 2014-07-02:130 2014-07-03:130 2014-07-04:130
2014-07-05:130 2014-07-06:130 2014-07-07:130 2014-07-08:130 2014-07-09:130
2014-07-10:130 2014-07-11:130 2014-07-12:130 2014-07-13:130 2014-07-14:130
2014-07-15:130 2014-07-16:130 2014-07-17:130 2014-07-18:130 2014-07-19:170
2014-07-20:170 2014-07-21:170 2014-07-22:170 2014-07-23:170 2014-07-24:170
2014-07-25:170 2014-07-26:170 2014-07-27:170 2014-07-28:170 2014-07-29:170
2014-07-30:170 2014-07-31:170 2014-08-01:170 2014-08-02:170 2014-08-03:170
2014-08-04:170 2014-08-05:170 2014-08-06:170 2014-08-07:170 2014-08-08:170
2014-08-09:170 2014-08-10:170 2014-08-11:170 2014-08-12:170 2014-08-13:170
2014-08-14:170 2014-08-15:170 2014-08-16:170 2014-08-17:170 2014-08-18:170
2014-08-19:170 2014-08-20:170 2014-08-21:170 2014-08-22:170 2014-08-23:170
2014-08-24:170 2014-08-25:170 2014-08-26:170 2014-08-27:170 2014-08-28:170
2014-08-29:170 2014-08-30:170




As you can see the structure of the string is date:price.


Basically, I would like to parse the string to get the price for a
particular period and sort by that price.
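Outside Solr, the parsing the custom Java function performs can be sketched in Python like this (aggregating by sum is an assumption; the actual plugin may average or pick one night instead):

```python
from datetime import date

def parse_daily_prices(s):
    # "YYYY-MM-DD:price" tokens -> {date: price}
    prices = {}
    for token in s.split():
        day, price = token.split(":")
        y, m, d = map(int, day.split("-"))
        prices[date(y, m, d)] = float(price)
    return prices

def price_for_period(prices, start, end):
    # Total price over the inclusive date range; this is the value the
    # function query would expose for sorting and range filtering.
    return sum(p for d, p in prices.items() if start <= d <= end)

prices = parse_daily_prices("2014-05-01:130 2014-05-02:130 2014-07-19:170")
print(price_for_period(prices, date(2014, 5, 1), date(2014, 5, 2)))  # 260.0
```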

I’ve already developed the java plugin for the custom function query and
I’m at the point where my code compiles, runs, executes, etc. Solr is happy
with my code.


Example:

price(daily_prices_str,2015-01-01,2015-01-03)


If I run this query I can see the correct price in the score field:


/select?price=price(daily_prices_str,2015-01-01,2015-01-03)&q={!func}$price


One of the problems is that I cannot sort by function result.

If I run this query:


/select?price=price(daily_prices_str,2015-01-01,2015-01-03)&q={!func}$price&sort=$price+asc


I get a 404 saying that "sort param could not be parsed as a query, and is
not a field that exists in the index: $price"

But it works with a workaround:


/select?price=sum(0,price(daily_prices_str,2015-01-01,2015-01-03))&q={!func}$price&sort=$price+asc


The main problem is that I cannot filter by range:


/select?price=sum(0,price(daily_prices_str,2015-1-1,2015-1-3))&q={!frange
l=100 u=400}$price


Maybe I'm going about this totally incorrectly?


What is the usage of solr.NumericPayloadTokenFilterFactory

2014-05-12 Thread ienjreny
Dear all,
Can anybody explain, in simple terms, what the benefits of
solr.NumericPayloadTokenFilterFactory are, and what values are acceptable for
typeMatch?

Thanks in advance
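For context, a hedged analyzer-chain sketch: the filter attaches the given numeric payload to every token whose type matches typeMatch. Token types come from the tokenizer, e.g. "word" from WhitespaceTokenizer, or "<ALPHANUM>"/"<NUM>" from StandardTokenizer:

```xml
<fieldType name="payloads" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- every token of type "word" gets payload 0.75 stored with it -->
    <filter class="solr.NumericPayloadTokenFilterFactory"
            payload="0.75" typeMatch="word"/>
  </analyzer>
</fieldType>
```

The payloads can later be read at scoring time, e.g. by a payload-aware similarity or query.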



--
View this message in context: 
http://lucene.472066.n3.nabble.com/What-is-the-usage-of-solr-NumericPayloadTokenFilterFactory-tp4135326.html
Sent from the Solr - User mailing list archive at Nabble.com.


Falling back to SlowFuzzyQuery

2014-05-12 Thread Brian Panulla
I'm working on upgrading our Solr 3 applications to Solr 4. The last piece
of the puzzle involves the change in how fuzzy matching works in the new
version. I have to rework how a key feature of our application is
implemented to get the same behavior with the new FuzzyQuery as it has in
the old version. I'm hoping get the rest of the system upgraded first and
deal with that separately.

I found a previous discussion indicating that SlowFuzzyQuery from the
sandbox package is the older fuzzy matching implementation:

http://mail-archives.apache.org/mod_mbox/lucene-java-user/201308.mbox/%3C03be01ce98f7$da6c0760$8f441620$@thetaphi.de%3E

How does one re-introduce SlowFuzzyQuery to a Solr service? It wasn't
obvious from the standard configuration that you could directly swap the
classes.

Do I need to implement a custom Query Parser or Query Handler? Or is this
something that can be accomplished through configuration?
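For background while reworking the feature: pre-4.0 FuzzyQuery scored candidate terms by a Levenshtein-based similarity against a minSimilarity threshold, whereas the new FuzzyQuery caps edits at 2. Roughly, as a sketch (the exact old formula should be checked against the SlowFuzzyQuery source; treat this as an assumption):

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def old_style_similarity(a, b):
    # SlowFuzzyQuery-era idea: 1 - distance / length of the shorter term,
    # compared against minSimilarity (e.g. 0.5) rather than "<= 2 edits".
    return 1.0 - edit_distance(a, b) / min(len(a), len(b))

print(edit_distance("kitten", "sitting"))  # 3
```

This is why long terms behave differently after the upgrade: a fixed similarity ratio allows more raw edits on longer terms than the new fixed 2-edit cap.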


Re: SolrCloud - Highly Reliable / Scalable Info

2014-05-12 Thread Aman Tandon
Hi,

Can I ask you a question: are you running Solr with Tomcat or Jetty?

With Regards
Aman Tandon


On Mon, May 12, 2014 at 9:04 PM, Darren Lee  wrote:

> Hi everyone,
>
> We have been using Solr Cloud (4.4) for ~ 6 months now. Functionally its
> excellent but we have suffered several issues which always seem quite
> problematic to resolve.
>
> I was wondering if anyone in the community can recommend good resources /
> reading for setting up a highly scalable / highly reliable cluster. A lot
> of what I see in the solr documentation is aimed at small setups or is
> quite sparse.
>
> Dealing with topics like:
>
> * Capacity planning
>
> * Losing nodes
>
> * Voting panic
>
> * Recovery failure
>
> * Replication factors
>
> * Elasticity / Auto scaling / Scaling recipes
>
> * Exhibitor
>
> * Container configuration, concurrency limits, packet drop tuning
>
> * Increasing capacity without downtime
>
> * Scalable approaches to full indexing hundreds of millions of
> documents
>
> * External health check vs CloudSolrServer
>
> * Separate vs local zookeeper
>
> * Benchmarks
>
>
> Sorry, I know that's a lot to ask heh. We are going to run a project for a
> month or so soon where we re-write all our run books and do deeper testing
> on various failure scenarios and the above but any starting point would be
> much appreciated.
>
> Thanks all,
> Darren
>
>


SolrCloud - Highly Reliable / Scalable Info

2014-05-12 Thread Darren Lee
Hi everyone,

We have been using Solr Cloud (4.4) for ~6 months now. Functionally it's
excellent, but we have suffered several issues which always seem quite
problematic to resolve.

I was wondering if anyone in the community can recommend good resources / 
reading for setting up a highly scalable / highly reliable cluster. A lot of 
what I see in the solr documentation is aimed at small setups or is quite 
sparse.

Dealing with topics like:

* Capacity planning

* Losing nodes

* Voting panic

* Recovery failure

* Replication factors

* Elasticity / Auto scaling / Scaling recipes

* Exhibitor

* Container configuration, concurrency limits, packet drop tuning

* Increasing capacity without downtime

* Scalable approaches to full indexing of hundreds of millions of documents

* External health check vs CloudSolrServer

* Separate vs local zookeeper

* Benchmarks


Sorry, I know that's a lot to ask heh. We are going to run a project for a 
month or so soon where we re-write all our run books and do deeper testing on 
various failure scenarios and the above but any starting point would be much 
appreciated.

Thanks all,
Darren



Re: Indexing PDF in Apache Solr 4.8.0 - Problem.

2014-05-12 Thread Siegfried Goeschl
Hi Vignesh,

can you check your Solr server log? Not all PDF documents on this planet can 
be processed using Tika :-)

Cheers,

Siegfried Goeschl

On 07 May 2014, at 09:40, vignesh  wrote:

> Dear Team,
>  
> I am Vignesh  using the latest version 4.8.0 Apache Solr and am 
> Indexing my PDF but getting an error and have posted that below for your 
> reference. Kindly guide me to solve this error.
>  
> D:\IPCB\solr>java -Durl=http://localhost:8082/solr/ipcb/update/extract 
> -Dparams=
> literal.id=herald060214_001 -Dtype=application/pdf -jar post.jar 
> "D:/IPCB/ipcbpd
> f/herald060214_001.pdf"
> SimplePostTool version 1.5
> Posting files to base url 
> http://localhost:8082/solr/ipcb/update/extract?literal
> .id=herald060214_001 using content-type application/pdf..
> POSTing file herald060214_001.pdf
> SimplePostTool: WARNING: Solr returned an error #500 Internal Server Error
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException
> : Server returned HTTP response code: 500 for URL: 
> http://localhost:8082/solr/ip
> cb/update/extract?literal.id=herald060214_001
> 1 files indexed.
> COMMITting Solr index changes to 
> http://localhost:8082/solr/ipcb/update/extract?
> literal.id=herald060214_001..
> SimplePostTool: WARNING: Solr returned an error #500 Internal Server Error 
> for u
> rl 
> http://localhost:8082/solr/ipcb/update/extract?literal.id=herald060214_001&co
> mmit=true
> Time spent: 0:00:00.062
>  
>  
>  
> Thanks & Regards.
> Vignesh.V
>  
> 
> Ninestars Information Technologies Limited.,
> 72, Greams Road, Thousand Lights, Chennai - 600 006. India.
> Landline : +91 44 2829 4226 / 36 / 56   X: 144
> www.ninestars.in
>  
> 
> STOP Virus, STOP SPAM, SAVE Bandwidth! 
> www.safentrix.com
> 



Re: not getting any mails

2014-05-12 Thread Oliver Schrenk
Something still seems amiss. I unsubscribed from the mailing list and still get 
mails.

On 11 May 2014, at 14:31, Ahmet Arslan  wrote:

> 
> 
> 
> Hi Amon,
> 
> Its not just you. There was a general problem with Apache mailing lists. But 
> it is fixed now. 
> Please see for more info : https://blogs.apache.org/infra/entry/mail_outage
> 
> Ahmet
> 
> 
> On Sunday, May 11, 2014 7:41 AM, Aman Tandon  wrote:
> Hi,
> 
> I am not getting any mails from this group, did my subscription just got
> ended? Is there anybody can help.
> 
> With Regards
> Aman Tandon
> 



Re: Solr + SPDY

2014-05-12 Thread harspras
Hi Vinay,

I have been trying to set up a similar environment with SPDY enabled
for Solr inter-shard communication. Were you able to get it working? I
somehow cannot use SolrCloud with SPDY enabled in Jetty.

Regards,
Harsh Prasad



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-SPDY-tp4097771p4135377.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: retreive all the fields in join

2014-05-12 Thread David Philip
Hi Aman,

I think it is possible.

1. Use fl parameter.
2. Add all the 4 fields in both the schemas[schemas of core 1 and 2].
3. While querying use &fl=id,name,type,page.

It will return all the fields. For a document that has no data for a given
field, the field will come back as an empty string.
Ex:  {id:111, name:"abc", type:"", page:""}
 {page:17, type:"fiction", id:"", name:""}
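As a sketch of the query side (the field names are the ones from this thread; the host, port, and core name are placeholders), building a request that uses the fl parameter could look like:

```python
from urllib.parse import urlencode

# Ask Solr for all four fields at once via the fl parameter; documents
# that lack a field simply return whatever the schema provides for it.
params = {
    "q": "id:111",
    "fl": "id,name,type,page",  # the fl list suggested above
    "wt": "json",
}
query = urlencode(params)
url = "http://localhost:8983/solr/core1/select?" + query  # placeholder host/core
print(url)
```

Fetching that URL (for example with urllib.request) returns the matching documents with only the listed fields populated.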


Thanks







On Mon, May 12, 2014 at 7:10 AM, Aman Tandon wrote:

> please help me out here!!
>
> With Regards
> Aman Tandon
>
>
> On Sun, May 11, 2014 at 1:44 PM, Aman Tandon  >wrote:
>
> > Hi,
> >
> > Is there a way possible to retrieve all the fields present in both the
> > cores(core 1 and core2).
> >
> > e.g.
> > core1: {id:111,name: "abc" }
> >
> > core2: {page:17, type: "fiction"}
> >
> > I want is that, on querying both the cores I want to retrieve the results
> > containing all the 4 fields, fields id, name from core1 and page, type
> from
> > core2. Is it possible?
> >
> > With Regards
> > Aman Tandon
> >
>


Re: autowarming queries

2014-05-12 Thread Erick Erickson
First define an auto-warming query :)...

firstSearcher queries are fired when the server is started

newSearcher queries are fired when a new searcher is opened, i.e. when
a commit (hard, with openSearcher=true, or soft) happens.
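In solrconfig.xml the two listeners look roughly like this (a sketch; the event names and QuerySenderListener class are standard, the queries themselves are placeholders):

```xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">some warming query</str></lst>
  </arr>
</listener>
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">another warming query</str></lst>
  </arr>
</listener>
```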

Let's see your configuration too where you think you're setting up the
queries, maybe you've got an error there.

Best,
Erick

On Mon, May 12, 2014 at 8:27 AM, Joshi, Shital  wrote:
> Hi,
>
> How many auto warming queries are supported per collection in Solr4.4 and 
> higher? We see one out of three queries in log when new searcher is created. 
> Shouldn't it print all searcher queries?
>
> Thanks!
>
>
>


Re: Website running Solr

2014-05-12 Thread Gora Mohanty
On 11 May 2014 23:39, Ahmet Arslan  wrote:
>
> Hi,
>
> Some site owners put themselves here :
>
> https://wiki.apache.org/solr/PublicServers

Thanks for the reminder: I need to add some sites there.
If you got it, flaunt it :-)

>
>
> Besides, I would try *:* match all docs query.

Won't work. Many front-ends, such as the excellent Haystack for
Django, will not expose raw Solr queries by default.

Regards,
Gora


Re: Physical Files v. Reported Index Size

2014-05-12 Thread Greg Walters
See which index directory is actually in use by catting the index.properties 
file, verify via lsof that nothing is using the others, and then you're safe 
to delete them.
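A small sketch of that check (the directory layout and names below are hypothetical; it only reports removal candidates, deleting is still a manual step):

```python
from pathlib import Path
import tempfile

def stale_index_dirs(data_dir):
    """Return index* directories that index.properties no longer points at."""
    data_dir = Path(data_dir)
    props = (data_dir / "index.properties").read_text()
    # index.properties holds a line like "index=index.20140506..."
    active = next(line.split("=", 1)[1].strip()
                  for line in props.splitlines() if line.startswith("index="))
    return sorted(d.name for d in data_dir.iterdir()
                  if d.is_dir() and d.name.startswith("index") and d.name != active)

# Tiny demo against a throwaway layout mimicking a core's data directory.
demo = Path(tempfile.mkdtemp())
(demo / "index.20140506").mkdir()
(demo / "index.20140101").mkdir()
(demo / "index.properties").write_text("index=index.20140506\n")
print(stale_index_dirs(demo))
```

Anything it reports should still be checked with lsof before removal, since a searcher may hold deleted-but-open files.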

Thanks,
Greg

On May 6, 2014, at 10:34 PM, Darrell Burgan  wrote:

> Hello all, I’m trying to reconcile what I’m seeing in the file system for a 
> Solr index versus what it is reporting in the UI. Here’s what I see in the UI 
> for the index:
>  
> https://s3-us-west-2.amazonaws.com/pa-darrell/ui.png
>  
> As shown, the index is 74.85 GB in size. However, here is what I see in the 
> data folder of the file system on that server:
>  
> https://s3-us-west-2.amazonaws.com/pa-darrell/file-system.png
>  
> As shown, it is consuming 109 GB of space. Also note that one of the index 
> folders is 75 GB in size.
>  
> My question is why the difference, and whether I can remove some of these 
> index folders to reclaim file system space? Or is there a Solr command to do 
> it (is it as obvious as “Optimize”)?
>  
> If there is a manual I should RTFM about the file structure, please point me to 
> it. :)
>  
> Thanks!
> Darrell
>  
>  
> 
> Darrell Burgan | Architect, Sr. Principal, PeopleAnswers
> office: 214 445 2172 | mobile: 214 564 4450 | fax: 972 692 5386 | 
> darrell.bur...@infor.com | http://www.infor.com
> CONFIDENTIALITY NOTE: This email (including any attachments) is confidential 
> and may be protected by legal privilege. If you are not the intended 
> recipient, be aware that any disclosure, copying, distribution, or use of the 
> information contained herein is prohibited.  If you have received this 
> message in error, please notify the sender by replying to this message and 
> then delete this message in its entirety. Thank you for your cooperation.
> 
>  



Re: URLDataSource : indexing from other Solr servers

2014-05-12 Thread helder.sepulveda
Just in case the URL is not available from outside my network, here is what
the URL response looks like:




0
1007

*:*





1518 INDIANA CT, IRVING, TX
Central
200600
0.31
230690
170510
No Basement
1518 INDIANA CT
IRVING
US
TX
75060
2.0
4
Dallas-Fort Worth-Arlington
IRVING
Frame
Dallas
38300
2014-03-11T00:00:01Z
-6
2014-03-11T00:00:01Z
-12
2014-03-11T00:00:01Z
-3.0E-4
2013-07-10T00:00:01Z
1550
Brick veneer
1
Slab
4
Central
29
146.7849
38
2010-01-13T00:00:01Z
0
32.79920959472656
32.799209594726562,-96.926918029785156
-96.92691802978516
IRVING
SFH
2348
0
TX
Texas
1518 INDIANA CT
INDIANA CT
INDIANA CT
1518
500018666323
0.0178
3893.0
IRVING 015000
0
2002
75060
2014-04-20T16:28:52.467Z



2600 ASH CRK, MESQUITE, TX
Central
144200
0.28
165830
122570
No Basement
2600 ASH CREEK
MESQUITE
US
TX
75181
2.0
4
Dallas-Fort Worth-Arlington
MESQUITE
Frame
Dallas
100
2014-04-11T00:00:01Z
-1
2014-03-11T00:00:01Z
-1
2014-04-11T00:00:01Z
-3.0E-4
2013-07-10T00:00:01Z
1470
Brick veneer
1
Slab
1
Central
35
153.4116
54
2006-01-20T00:00:01Z
0
32.7484283447266
32.7484283447266,-96.5575180053711
-96.5575180053711
MESQUITE
SFH
2189
0
TX
Texas
2600 ASH CRK
ASH CRK
ASH CRK
2600
500018666324
0.0178
3345.0
MESQUITE 017304
0
1996
75181
2014-04-20T16:28:52.467Z
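For reference, pulling such a response from one Solr server into another with the DataImportHandler is typically done with URLDataSource plus XPathEntityProcessor; a minimal data-config.xml sketch (the URL, field names, and xpaths here are placeholders for whatever the source core actually returns):

```xml
<dataConfig>
  <dataSource type="URLDataSource"/>
  <document>
    <entity name="doc"
            processor="XPathEntityProcessor"
            url="http://source-host:8082/solr/core/select?q=*:*&amp;wt=xml"
            forEach="/response/result/doc">
      <field column="id"      xpath="/response/result/doc/str[@name='id']"/>
      <field column="address" xpath="/response/result/doc/str[@name='address']"/>
    </entity>
  </document>
</dataConfig>
```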









--
View this message in context: 
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135332.html
Sent from the Solr - User mailing list archive at Nabble.com.