Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Ahmet Arslan
Hi,

Your response says wt=json, but your solrconfig excerpt says wt=velocity.
Maybe you are hitting a different request handler?

What happens when you submit your query as q=Warszawa&df=text_index
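For reference, a minimal request that should trigger highlighting on the /select handler might look like this (the core name is a placeholder, and hl.fl must name a stored field):

```
http://localhost:8983/solr/CORE/select?q=Warszawa&df=text_index&hl=true&hl.fl=text&wt=json
```

If highlighting appears with this explicit request but not with your defaults, the defaults are being read from a different handler than the one you are querying.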




On Wednesday, August 5, 2015 8:28 AM, Michał Oleś  wrote:
I installed Solr 5.2.1 and use the DIH example with Tika integration to search
PDF content. Everything works as expected except the highlighting plugin.
When I execute the query I don't even see a highlighting section in the results:

{
  "responseHeader": {
"status": 0,
"QTime": 1,
"params": {
  "indent": "true",
  "q": "text_index:Warszawa",
  "_": "1438704448534",
  "hl.simple.pre": "",
  "hl.simple.post": "",
  "hl.fl": "text",
  "wt": "json",
  "hl": "true"
}
  },
  "response": {
  "numFound": 2,
  "start": 0,
  "docs": [
  {
"size": "698231",
"lastModified": "Tue Aug 04 07:38:07 UTC 2015",
"id": "C:\\Moje\\solr-5.2.1\\pdf\\D2015000105301.pdf",
"text": [
  "\n  \n \n\nDZIENNIK USTAW \nRZECZYPOSPOLITEJ POLSKIEJ
\n\nWarszawa, dnia 29 lipca 2015 r. \n\nPoz. 1053 \n\nRO ZPORZĄDZENIE
\n\nMINISTRA OBRONY NARODOWEJ \n\nz dnia 9 lipca 2015 r. \n\n"
],
"title": [
  "Pozycja 1053 DPA.555.14.2015 JS (word)"
],
"author": "jswiderska"
  },
  {
"size": "747618",
"lastModified": "Tue Aug 04 07:37:02 UTC 2015",
"id": "C:\\Moje\\solr-5.2.1\\pdf\\D2015000109301.pdf",
"text": [
  "\n  \n \n\nDZIENNIK USTAW \nRZECZYPOSPOLITEJ POLSKIEJ
\n\nWarszawa, dnia 3 sierpnia 2015 r. \n\n"
],
"title": [
  "OGŁ - SZCZOTKA 1093"
],
"author": "bzebrowska"
  }
  ]
  }
}

My solrconfig.xml is the default from that example. I tried to add default
values but it didn't change anything:



  explicit

  
  velocity
  browse
  layout

  
  edismax
  *:*
  10
  *,score

  
  on
  1

  
   on
   text
   true
  html
   
   
   3
   200
   text
   750

  

Here is part of schema.xml:





As in the example I use two fields (one for indexing and one for storing the value).
When I ran with debug I found that the highlight component time = 0, so it looks
like the plugin isn't even invoked. Also, in the Solr admin panel under the
"Plugins/Stats" tab, all org.apache.solr.highlight.* classes show 0 requests.


Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread Upayavira
How did you trigger that exception, and can you give the full exception?

Upayavira

On Tue, Aug 4, 2015, at 09:14 PM, wwang525 wrote:
> Hi Upayavira,
> 
> I have physically cleaned up the files under index directory, and
> re-index
> did not fix the problem.
> 
> The following is an example of the field definition:
> 
>  docValues="true" default="0" required="true"/>
> 
> and the following is the definition of tint
> 
>  positionIncrementGap="0"/>
> 
> For some reason, I keep getting the error message:
> 
> 
> Caused by: java.lang.IllegalStateException: Type mismatch: DateDep was
> indexed as NUMERIC
> 
> I am on Solr 4.7. I edited the out-of-box solrconfig.xml for DIH example
> to
> include necessary libraries:
> 
>   
>   
>   
> 
> 
>   
> 
> 
>   
> 
> 
>   
> 
> 
> 
> 
> Not sure if there is something that is missing.
> 
> Thanks
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/TrieIntField-not-working-in-Solr-4-7-tp4220744p4220840.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread Upayavira
Fwiw, it wouldn't surprise me if you can't facet or sort on a trie field
with a precision step above 0.

That feature indexes at multiple precisions to make range queries
efficient. You may need to index the value twice: once with a non-zero
precision step for ranges, and once with a precision step of zero. Both
would be a TrieField; you would use a copyField declaration in your
schema to duplicate the field.
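A sketch of that in schema.xml terms (the type and field names here are illustrative, not taken from your schema):

```xml
<!-- multi-precision type for efficient range queries -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
<!-- single-precision type for faceting/sorting -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>

<field name="DateDep"       type="tint" indexed="true" stored="true"/>
<field name="DateDep_exact" type="int"  indexed="true" stored="false"/>

<!-- duplicate the value into the precisionStep=0 field -->
<copyField source="DateDep" dest="DateDep_exact"/>
```

Range filter queries would then go against DateDep, while facet.field would use DateDep_exact.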

Upayavira

On Wed, Aug 5, 2015, at 09:55 AM, Upayavira wrote:
> How did you trigger that exception, and can you guve the full exception? 
> 
> Upayavira
> 
> On Tue, Aug 4, 2015, at 09:14 PM, wwang525 wrote:
> > Hi Upayavira,
> > 
> > I have physically cleaned up the files under index directory, and
> > re-index
> > did not fix the problem.
> > 
> > The following is an example of the field definition:
> > 
> >  > docValues="true" default="0" required="true"/>
> > 
> > and the following is the definition of tint
> > 
> >  > positionIncrementGap="0"/>
> > 
> > For some reason, I keep getting the error message:
> > 
> > 
> > Caused by: java.lang.IllegalStateException: Type mismatch: DateDep was
> > indexed as NUMERIC
> > 
> > I am on Solr 4.7. I edited the out-of-box solrconfig.xml for DIH example
> > to
> > include necessary libraries:
> > 
> >   
> >   
> >   
> > 
> > 
> >   
> > 
> > 
> >   
> > 
> > 
> >   
> > 
> > 
> > 
> > 
> > Not sure if there is something that is missing.
> > 
> > Thanks
> > 
> > 
> > 
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/TrieIntField-not-working-in-Solr-4-7-tp4220744p4220840.html
> > Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr SolrEntityProcessor - can it take customer parameters?

2015-08-05 Thread Mikhail Khludnev
It should work with placeholder syntax like ${fromDate}. Have you tried?
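As a sketch (the source URL and field name are illustrative, and depending on the version the request parameters may need to be referenced as ${dih.request.fromDate} rather than ${fromDate}):

```xml
<dataConfig>
  <document>
    <!-- pulls documents from another Solr and re-indexes them locally -->
    <entity name="remote"
            processor="SolrEntityProcessor"
            url="http://source-host:8983/solr/core1"
            query="date:[${dih.request.fromDate} TO ${dih.request.toDate}]"/>
  </document>
</dataConfig>
```

Invoked as something like /dataimport?command=full-import&amp;fromDate=2015-01-01T00:00:00Z&amp;toDate=2015-02-01T00:00:00Z.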

On Wed, Aug 5, 2015 at 1:57 AM, sergeyk  wrote:

> I'd like to use SolrEntityProcessor to import some documents from one Solr
> cloud to another Solr cloud.
> The date range is dynamic and can change.
> Is there a way to pass, say, solr/core/data-import?&fromDate=<from
> date>&toDate=<to date>
>
> and then use them in the query for SolrEntityProcessor: q="date:[$fromDate TO
> $toDate]".
> I am still on Solr 4.6.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-SolrEntityProcessor-can-it-take-customer-parameters-tp4220872.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





High CPU DistributedQueue and OverseerAutoReplicaFailoverThread

2015-08-05 Thread Markus Jelsma
Hello - we have a single Solr 5.2.1 node that (for now) contains four single-shard 
collections. Only two collections actually contain data and are queried. 
The machine has some unusual latency that led me to sample the CPU time with 
VisualVM. On that node we see DistributedQueue$LatchWatcher.await() and 
OverseerAutoReplicaFailoverThread() claiming a lot of CPU time; 
DistributedQueue takes most of it. I have set logging to DEBUG but no 
Zookeeper logging is printed, the connections are stable, and updates rarely 
come in. 

Is it because the node runs the overseer? If so, how can I prevent it from 
taking quite a lot of CPU time on a node with four collections where no state 
changes occur?

Any other thoughts?
Markus


Solr spell check not showing any suggestions for other language

2015-08-05 Thread talha
Solr spell check is not showing any suggestions for the other language. I have
indexed multiple languages (English and Bangla) in the same core. It shows
suggestions for a wrongly spelt English word, but for a wrongly spelt
Bangla word it shows "correctlySpelled = false" and no
suggestions.

Please check my configuration for spell check below

solrconfig.xml


  

explicit
10
product_name

on
default
wordbreak
true
5
2
5
true
true
5
3

  
  
spellcheck
  



  text_suggest

  
default
suggest
solr.DirectSolrSpellChecker
internal
0.5
  

  
wordbreak
suggest
solr.WordBreakSolrSpellChecker
true
true
10
5
  



schema.xml


  





  
  




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-spell-check-not-showing-any-suggestions-for-other-language-tp4220950.html


Re: multiple but identical suggestions in autocomplete

2015-08-05 Thread Nutch Solr User
You will need to call this service from the UI the same way you are calling the
suggester component currently (maybe on every key-press event in the text box),
passing the required parameters too.

The service will internally form a Solr suggester query and query Solr. From the
returned response it will keep only the unique suggestions among the top N
and return those to the UI.



-
Nutch Solr User

"The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing."
--
View this message in context: 
http://lucene.472066.n3.nabble.com/multiple-but-identical-suggestions-in-autocomplete-tp4220055p4220953.html


how to extend JavaBinCodec and make it available in solrj api

2015-08-05 Thread Dmitry Kan
Hello,

Solr: 5.2.1
class: org.apache.solr.common.util.JavaBinCodec

I'm working on a custom data structure for the highlighter. The data
structure is ready in JSON and XML formats. I need also JavaBin format. The
data structure is already made serializable by extending the WritableValue
class (methods write and resolve).

To receive the custom format on the client via solrj api, the data
structure needs to be parseable by JavaBinCodec. Is this correct
assumption? Can we introduce the custom data structure consumer on the
solrj api without complete overhaul of the api? Is there plugin framework
such that JavaBinCodec is extended and used for the new data structure?



-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Initializing core takes very long at times

2015-08-05 Thread Robert Krüger
Hi,

for months/years, I have been experiencing occasional very long (30s+)
hangs when programmatically initializing a Solr container in Java. The
application has worked in production with this setup for years without any
problems apart from this.

The code I have is this here:

 public void initContainer(File solrConfig) throws Exception {
logger.debug("initializing solr container with config {}",
solrConfig);
Preconditions.checkNotNull(solrConfig);
Stopwatch stopwatch = Stopwatch.createStarted();
container =
CoreContainer.createAndLoad(solrConfig.getParentFile().getAbsolutePath(),
solrConfig);
containerInitialized = true;
logger.debug("initializing solr container took {}", stopwatch);
if (stopwatch.elapsed(TimeUnit.MILLISECONDS) > 1000) {
logger.warn("initializing solr container took very long
({})", stopwatch);
}
}

So it is obviously the createAndLoad call. I posted about this a long time
ago and people suggested checking for uncommitted soft commits, but now I
have hit these hangs in a test setup where the index is created
from scratch, so that cannot be the problem.

Any ideas anyone?

My config is rather simple. Is there something wrong with my locking
options that might cause this?



  
native
true
64
1000
  

  LUCENE_43

  

   1000
   1
   false

  

  

  

  
  
  

  
  
  

  
solr
  



Thank you in advance,

Robert


Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread wwang525
Hi Upayavira,

I edited the definition of tint to have precisionStep=0 for DateDep
(i.e. departure date). This field is used in a filter query and also in
faceted search.

The following are definitions:




   

The following is the log message after I executed a query:


INFO  - 2015-08-05 08:19:52.090; org.apache.solr.core.SolrCore; [db-mssql]
webapp=/solr path=/select
params={facet=true&group.ngroups=true&sort=Price+asc&facet.mincount=1&facet.limit=800&wt=jason&group.facet=true&rows=30&debugQuery=true&facet.sort=count&q=*:*&group.field=nSoftVoyageCode&facet.field=DateDep&facet.field=HotelCode&facet.field=Collection&facet.field=GatewayCode&facet.field=StarRating&facet.field=MealplanCode&group=true&fq=GatewayCode:(YYZ+OR+BUF+OR+YVO+OR+YYT+OR+YBG+OR+YAM)&fq=DestCode:(FPO+OR+AUA+OR+VRA+OR+CCC+OR+MBJ+OR+CMW+OR+CYO)&fq=DateDep:([20150820+TO+20150920]+OR+[20150720+TO+20150820])&fq=Duration:(2+OR+4+OR+6+OR+12+OR+13)}
hits=584 status=500 QTime=27 
ERROR - 2015-08-05 08:19:52.091; org.apache.solr.common.SolrException;
null:org.apache.solr.common.SolrException: Exception during facet.field:
DateDep
at org.apache.solr.request.SimpleFacets$2.call(SimpleFacets.java:563)
at org.apache.solr.request.SimpleFacets$2.call(SimpleFacets.java:549)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at org.apache.solr.request.SimpleFacets$1.execute(SimpleFacets.java:503)
at
org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:573)
at
org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:260)
at
org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:84)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:222)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Type mismatch: DateDep was
indexed as NUMERIC
at
org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1161)
at
org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1145)
at
org.apache.lucene.search.grouping.term.TermGroupFacetCollector$SV.setNextReader(TermGroupFacetCollector.java:130)
at org.apache.lucene.search.IndexSearcher.

RE: Duplicate Documents

2015-08-05 Thread Tarala, Magesh
I deleted the index and re-indexed, and the duplicates went away. I have not 
identified the root cause, but it looks like updating documents is causing it 
sporadically. I am going to try deleting the document and then updating. 


-Original Message-
From: Tarala, Magesh 
Sent: Monday, August 03, 2015 8:27 AM
To: solr-user@lucene.apache.org
Subject: Duplicate Documents

I'm using Solr 4.10.2 with the "id" field as the unique key - it is passed in 
with the document when ingesting documents into Solr. When querying I get 
duplicate documents with different "_version_" values. Out of approx. 25K unique 
documents ingested into Solr, I see approx. 300 duplicates.

It is a 3 node solr cloud with one shard and 2 replicas.
I'm also using nested documents.

Thanks in advance for any insights.

--Magesh



Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread wwang525
Hi Upayavira,

A bit more explanation on DateDep.

This value in the database is a varchar(8) with the format 20150803. I mapped
it to a SortableIntField before, and it worked with the filter query and
faceted search.

After I changed it to TrieIntField and tried re-indexing many times, it
gave me the same error message that I uploaded in the last post.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/TrieIntField-not-working-in-Solr-4-7-tp4220744p4220983.html


Embedded Solr now deprecated?

2015-08-05 Thread Robert Krüger
Hi,

I tried to upgrade my application from Solr 4 to 5 and just now realized
that embedded use of Solr seems to be on the way out. Is that correct, or is
there just a new API to use for that?

Thanks in advance,

Robert


Re: Initializing core takes very long at times

2015-08-05 Thread Robert Krüger
OK, now that I had a reproducible setup I could debug where it hangs:

public SystemInfoHandler(CoreContainer cc) {
super();
this.cc = cc;
init();
  }

  private void init() {
try {
  InetAddress addr = InetAddress.getLocalHost();
  hostname = addr.getCanonicalHostName();
 this is where it hangs
} catch (UnknownHostException e) {
  //default to null
}
  }


So it depends on my current network setup even for the embedded case. Any
idea how I can stop Solr from making that call? InetAddress.getLocalHost()
in this case returns some local VPN address, and the reverse lookup
therefore times out after 30 seconds. This actually happens twice, once when
initializing the container and again when initializing the core, so in my case
a minute per restart. Looking at the code, I don't see how I can work
around this other than patching Solr, which I am trying to avoid like hell.

On Wed, Aug 5, 2015 at 1:54 PM, Robert Krüger  wrote:

> Hi,
>
> for months/years, I have been experiencing occasional very long (30s+)
> hangs when programmatically initializing a solr container in Java. The
> application has worked for years in production with this setup without any
> problems apart from this.
>
> The code I have is this here:
>
>  public void initContainer(File solrConfig) throws Exception {
> logger.debug("initializing solr container with config {}",
> solrConfig);
> Preconditions.checkNotNull(solrConfig);
> Stopwatch stopwatch = Stopwatch.createStarted();
> container =
> CoreContainer.createAndLoad(solrConfig.getParentFile().getAbsolutePath(),
> solrConfig);
> containerInitialized = true;
> logger.debug("initializing solr container took {}", stopwatch);
> if (stopwatch.elapsed(TimeUnit.MILLISECONDS) > 1000) {
> logger.warn("initializing solr container took very long
> ({})", stopwatch);
> }
> }
>
> So it is obviously the createAndLoad-Call. I posted about this a long time
> ago and people suggested checking for uncommitted soft commits but now I
> realized that I had these hangs in a test setup, where the index is created
> from scratch, so that cannot be the problem.
>
> Any ideas anyone?
>
> My config is rather simple. Is there something wring with my locking
> options that might cause this?
>
> 
>
>   
> native
> true
> 64
> 1000
>   
>
>   LUCENE_43
>
>   
> 
>1000
>1
>false
> 
>   
>
>   
>  multipartUploadLimitInKB="2048" />
>   
>
>default="true" />
>   
>class="org.apache.solr.handler.admin.AdminHandlers" />
>
>autowarmCount="256"/>
>autowarmCount="256"/>
>autowarmCount="0"/>
>
>   
> solr
>   
>
> 
>
> Thank you in advance,
>
> Robert
>



-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: Embedded Solr now deprecated?

2015-08-05 Thread Erick Erickson
Where did you see that? Maybe I missed something yet
again. This is unrelated to whether we ship a WAR if that's
being conflated here.

I rather doubt that embedded is "on its way out", although
my memory isn't what it used to be.

For starters, MapReduceIndexerTool uses it, so it gets
regular exercise from that, and anything removing it would
require some kind of replacement.

How are you using it that you care? Wondering what
alternatives exist...

Best,
Erick


On Wed, Aug 5, 2015 at 9:09 AM, Robert Krüger  wrote:
> Hi,
>
> I tried to upgrade my application from solr 4 to 5 and just now realized
> that embedded use of solr seems to be on the way out. Is that correct or is
> there a just new API to use for that?
>
> Thanks in advance,
>
> Robert


Re: Solr SolrEntityProcessor - can it take customer parameters?

2015-08-05 Thread Shawn Heisey
On 8/4/2015 4:57 PM, sergeyk wrote:
> I's like to use SolrEntityProcessor for import some documents from one solr
> cloud to another solr cloud.
> The date range is dynamic and can change.
> Is there a way to pass, say solr/core/data-import?&fromDate= date>&toDate=

You can use syntax like ${dih.request.fromDate} in your dataimport
config to substitute parameters sent on the URL.  Here is an example
from my own dih-config.xml file:

  query="
SELECT * FROM ${dih.request.dataView}
WHERE (
  (
did > ${dih.request.minDid}
AND did <= ${dih.request.maxDid}
  )
  ${dih.request.extraWhere}
) AND (crc32(did) % ${dih.request.numShards})
  IN (${dih.request.modVal})
"

Thanks,
Shawn



RE: Solr spell check not showing any suggestions for other language

2015-08-05 Thread Dyer, James
Talha,

Possibly the English-specific analysis in your "text_suggest" field is 
interfering: solr.EnglishPossessiveFilterFactory?

Another guess is you're receiving more than 5 results and 
"maxResultsForSuggest" is set to 5.
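(For reference, that setting would be a handler default like the one below; with it, suggestions are only returned when the query matches at most that many documents, so raising or removing it is one thing to try:)

```xml
<str name="spellcheck.maxResultsForSuggest">5</str>
```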

But I'm not sure.  Maybe someone can help with more information from you?

Can you provide a few document examples that have Bangla text, then the full 
query request with a misspelled Bangla word (from the document examples you 
provide), then the full spellcheck response, and the total # of documents 
returned ? 

James Dyer
Ingram Content Group

-Original Message-
From: talha [mailto:talh...@gmail.com] 
Sent: Wednesday, August 05, 2015 5:20 AM
To: solr-user@lucene.apache.org
Subject: Solr spell check not showing any suggestions for other language

Solr spell check is not showing any suggestions for other language.I have
indexed mutli-languages (english and bangla) in same core.It's showing
suggestions for wrongly spelt english word but in case of wrongly spelt
bangla word it showing "correctlySpelled = false" but not showing any
suggestions for it.

Please check my configuration for spell check below

solrconfig.xml


  

explicit
10
product_name

on
default
wordbreak
true
5
2
5
true
true
5
3

  
  
spellcheck
  



  text_suggest

  
default
suggest
solr.DirectSolrSpellChecker
internal
0.5
  

  
wordbreak
suggest
solr.WordBreakSolrSpellChecker
true
true
10
5
  



schema.xml


  





  
  




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-spell-check-not-showing-any-suggestions-for-other-language-tp4220950.html



Re: Embedded Solr now deprecated?

2015-08-05 Thread Shawn Heisey
On 8/5/2015 7:09 AM, Robert Krüger wrote:
> I tried to upgrade my application from solr 4 to 5 and just now realized
> that embedded use of solr seems to be on the way out. Is that correct or is
> there a just new API to use for that?

Building on Erick's reply:

I doubt that the embedded server is going away, and I do not recall
seeing *anything* marking the entire class deprecated.  The class still
receives attention from devs -- this feature was released with 5.1.0:

https://issues.apache.org/jira/browse/SOLR-7307

That said, we have discouraged users from deploying it in production for
quite some time, even though it continues to exist and receive developer
attention.  Some of the reasons that I think users should avoid the
embedded server:  It doesn't support SolrCloud, you cannot make it
fault-tolerant (redundant), and troubleshooting is harder because you
cannot connect to it from outside of the source code where it is embedded.

Deploying Solr as a network service offers much more capability than you
can get when you embed it in your application.  Chances are that you can
easily replace EmbeddedSolrServer with one of the SolrClient classes and
use a separate Solr deployment from your application.

Thanks,
Shawn



Re: Embedded Solr now deprecated?

2015-08-05 Thread Alexandre Rafalovitch
I thought the embedded server was good for a scenario where you want to
quickly build a core with lots of documents locally, and then move
the core into production and swap it in. So you minimize the network
traffic.

Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 5 August 2015 at 10:54, Shawn Heisey  wrote:
> On 8/5/2015 7:09 AM, Robert Krüger wrote:
>> I tried to upgrade my application from solr 4 to 5 and just now realized
>> that embedded use of solr seems to be on the way out. Is that correct or is
>> there a just new API to use for that?
>
> Building on Erick's reply:
>
> I doubt that the embedded server is going away, and I do not recall
> seeing *anything* marking the entire class deprecated.  The class still
> receives attention from devs -- this feature was released with 5.1.0:
>
> https://issues.apache.org/jira/browse/SOLR-7307
>
> That said, we have discouraged users from deploying it in production for
> quite some time, even though it continues to exist and receive developer
> attention.  Some of the reasons that I think users should avoid the
> embedded server:  It doesn't support SolrCloud, you cannot make it
> fault-tolerant (redundant), and troubleshooting is harder because you
> cannot connect to it from outside of the source code where it is embedded.
>
> Deploying Solr as a network service offers much more capability than you
> can get when you embed it in your application.  Chances are that you can
> easily replace EmbeddedSolrServer with one of the SolrClient classes and
> use a separate Solr deployment from your application.
>
> Thanks,
> Shawn
>


Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-05 Thread Shawn Heisey
On 8/5/2015 5:38 AM, Dmitry Kan wrote:
> Solr: 5.2.1
> class: org.apache.solr.common.util.JavaBinCodec
>
> I'm working on a custom data structure for the highlighter. The data
> structure is ready in JSON and XML formats. I need also JavaBin format. The
> data structure is already made serializable by extending the WritableValue
> class (methods write and resolve).
>
> To receive the custom format on the client via solrj api, the data
> structure needs to be parseable by JavaBinCodec. Is this correct
> assumption? Can we introduce the custom data structure consumer on the
> solrj api without complete overhaul of the api? Is there plugin framework
> such that JavaBinCodec is extended and used for the new data structure?

The JavaBinCodec class lives in the solr/solrj/src/java directory.  It
is already part of SolrJ.  The class and the other classes defined
inside it are public and not final, so you should be able to extend it
and override things as required with no problem in a program that
includes SolrJ as well as a custom plugin for Solr.  There are a handful
of private fields in the class ... if you need your code to deal with
any of those, open an issue and make your case.  If it is compelling,
perhaps some of them can be changed to protected.

Is there a problem that's not solved by extending JavaBinCodec?

Thanks,
Shawn



SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Suma Shivaprasad
Hello,

Is /clusterstate.json in Zookeeper updated with the collection state if a
collection is created with the server running in SolrCloud mode, without
creating a core through CoreAdmin or providing a core.properties? I find
that there is a "state.json" present under /collections/<collection name>
which reflects the collection status; however, I don't find
/clusterstate.json updated. Is this expected behaviour?

Thanks
Suma


RE: Solr spell check not showing any suggestions for other language

2015-08-05 Thread talha
Dear James

Thank you for your reply.

I tested the analyzer without "solr.EnglishPossessiveFilterFactory" but still no
luck. I also updated the analyzer; please find it below.


  



  



With the above configuration for "text_suggest" I got the following results.

For the correct Bangla word সহজ, the Solr response is 
(Note: I set rows to 0 to skip results.)


  0
  2
  
সহজ
true
0
xml
1438787238383
  




  
true
  



For an incorrect Bangla word, সহগ, where I just changed the last letter, the Solr
response is


  0
  7
  
সহগ
true
0
xml
1438787208052
  




  
false
  






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-spell-check-not-showing-any-suggestions-for-other-language-tp4220950p4221033.html


Re: Initializing core takes very long at times

2015-08-05 Thread Shawn Heisey
On 8/5/2015 7:56 AM, Robert Krüger wrote:
> OK, now that I had a reproducible setup I could debug where it hangs:
>
> public SystemInfoHandler(CoreContainer cc) {
> super();
> this.cc = cc;
> init();
>   }
>
>   private void init() {
> try {
>   InetAddress addr = InetAddress.getLocalHost();
>   hostname = addr.getCanonicalHostName();
>  this is where it hangs
> } catch (UnknownHostException e) {
>   //default to null
> }
>   }
>
>
> so it depends on my current network setup even for the embedded case. any
> idea how I can stop solr from making that call? InetAddress.getLocalHost()
> in this case returns some local vpn address and thus the reverse lookup
> times out after 30 seconds. This actually happens twice, once when
> initializing the container, again when initializing the core, so in my case
> a minute per restart and looking at the code, I don't see how I can work
> around this other than patching solr, which I am trying to avoid like hell.

Because almost all users are using Solr in a mode where it requires the
network, that code cannot be eliminated from Solr.

It is critical that your machine's local network is set up completely
right when you are running applications that are (normally)
network-aware.  Generally that means having relevant entries for all
interfaces in your hosts file and making sure that the DNS resolver code
included with the operating system is not buggy.

If you're dealing with a VPN or something else where the address is
acquired from elsewhere, then you need to make sure that the machine has
at least two DNS servers configured, that one of them is working, and
that the forward and reverse DNS on those servers are completely set up
for that interface's IP address.  Bugs in the DNS resolver code can
complicate this.
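For example, on Linux or OS X a hosts-file entry that maps the machine's own hostname (the names here are placeholders) usually lets getCanonicalHostName() resolve immediately instead of waiting on DNS:

```
# /etc/hosts
127.0.0.1   localhost   mymachine.local   mymachine
```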

Thanks,
Shawn



Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Mugeesh Husain
@Upayavira

Thanks, these things are most useful for my understanding.
I am thinking I will create an XML or CSV file from my requirement using Java,
then index it via HTTP POST or bin/post.

I am not using DIH because I didn't find any link or idea on how to split the
data and add it to Solr one by one (as I mentioned in my requirement).

Tell me: indexing XML files or CSV files, which one is the better way?

With CSV I noticed that it didn't parse the data into the correct fields, so
how do we ensure that the data is correctly stored in Solr?

Or is XML the correct way to parse it?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-Apache-Solr-Handle-TeraByte-Large-Data-tp3656484p4221051.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Upayavira
If you are using Java, you will likely find SolrJ the best way - it uses
serialised Java objects to communicate with Solr - you don't need to
worry about that. Just use code similar to that earlier in the thread.
No XML, no CSV, just simple java code.

Upayavira


On Wed, Aug 5, 2015, at 04:50 PM, Mugeesh Husain wrote:
> @Upayavira
> 
> Thanks these thing are most useful for my understanding 
> I have thing about i will create XML or CVS file from my requirement
> using
> java
> Then Index it via HTTP post or  bin/post 
> 
> I am not using DIH because i did't get any of  link or idea how to split
> data and add to solr one by one.(As i mention onmy requirement) 
> 
> tell me Indexing XML file or CVS files which one is a better way ?
> 
> with csv i noticed that it didn't parse the data into the correct fields.
> So
> how do we ensure that the data is correctly stored in Solr ?
> 
> Or XML is a correct way to parse it
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Can-Apache-Solr-Handle-TeraByte-Large-Data-tp3656484p4221051.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Mugeesh Husain
The filesystem has about 40 million documents, so the loop will iterate 40
million times. Can SolrJ handle a 40M-iteration loop? (Before indexing I have
to split values out of each filename and do some processing, then index to Solr.)

Should it index continuously for all 40M documents, or do I have to sleep at
some interval in between?

Will it take about the same time as HTTP POST or bin/post?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-Apache-Solr-Handle-TeraByte-Large-Data-tp3656484p4221060.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Michał Oleś
Thank you for the answer. When I execute the query using q=Warszawa&df=text_index
instead of q=text_index:Warszawa, nothing changes. If I remove wt=json from the
query I get the response in XML, but also without highlighting results.


Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Upayavira
Post your docs in sets of 1000. Create a:

 List<SolrInputDocument> docs

Then add 1000 docs to it, then client.add(docs);

Repeat until your 40m are indexed.
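
The loop described above can be sketched as follows. SolrClient,
SolrInputDocument and client.add() come from SolrJ, but they are stubbed out
here (marked in comments) so the skeleton runs stand-alone; the document
values and counts are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Batching skeleton: collect documents into a list and flush every 1000.
 * In real code each item is a SolrJ SolrInputDocument built from the
 * filename, flush() calls client.add(docs), and a final client.commit()
 * follows the loop. Those calls are stubbed so this compiles without SolrJ.
 */
public class BatchIndexer {
    static final int BATCH_SIZE = 1000;
    static int batchesSent = 0;
    static int docsSent = 0;

    static void flush(List<String> docs) {
        // real code: client.add(docs);
        batchesSent++;
        docsSent += docs.size();
        docs.clear();
    }

    public static void main(String[] args) {
        List<String> batch = new ArrayList<>();
        int totalDocs = 2500; // stand-in for the 40M files on disk
        for (int i = 0; i < totalDocs; i++) {
            batch.add("doc-" + i); // real code: build a SolrInputDocument here
            if (batch.size() == BATCH_SIZE) {
                flush(batch);
            }
        }
        if (!batch.isEmpty()) {
            flush(batch); // don't lose the final partial batch
        }
        // real code: client.commit();
        System.out.println(batchesSent + " batches, " + docsSent + " docs");
    }
}
```

No sleeping between batches is needed; the client call blocks until Solr has
accepted each batch.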

Upayavira

On Wed, Aug 5, 2015, at 05:07 PM, Mugeesh Husain wrote:
> filesystem are about 40 millions of document it will iterate 40 times how
> may
> solrJ could not handle 40m times loops(before  indexing i have to split
> values from filename and make some operation then index to Solr)
> 
> Is it will continuous indexing using 40m times or i have to sleep in
> between
> some interaval.
> 
> Does it will take same time in compare of HTTP or  bin/post ?
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Can-Apache-Solr-Handle-TeraByte-Large-Data-tp3656484p4221060.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Mugeesh Husain
@Mikhail   Using the data import handler, if I define my baseDir as
D:/work/folder, will it also work for sub-folders, and sub-folders of
sub-folders, etc.?
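
For what it's worth, FileListEntityProcessor has a recursive attribute that
makes baseDir descend into sub-folders at any depth. A minimal data-config
sketch (the file pattern and field names here are hypothetical, not from the
thread):

```xml
<dataConfig>
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- recursive="true" makes baseDir cover sub-folders at any depth -->
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="D:/work/folder" fileName=".*\.pdf"
            recursive="true" rootEntity="false" dataSource="null">
      <entity name="tika" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text">
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```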



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-Apache-Solr-Handle-TeraByte-Large-Data-tp3656484p4221063.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Initializing core takes very long at times

2015-08-05 Thread Robert Krüger
I am shipping solr as a local search engine with our software, so I have no
way of controlling that environment. Many other software packages (rdbmss,
nosql engines etc.) work well in such a setup (as does solr except this
problem). The problem is that in this case (AFAICS) the host cannot be
overridden in any way (by config or system property or whatever), because
that handler is coded as it is. It is in no way a natural limitation of the
type of software or my use case. But I understand that this is probably not
frequently a problem for people, because by far most solr use is classic
server-based use.

I may suggest a patch on the devel mailing list.


On Wed, Aug 5, 2015 at 5:42 PM, Shawn Heisey  wrote:

> On 8/5/2015 7:56 AM, Robert Krüger wrote:
> > OK, now that I had a reproducible setup I could debug where it hangs:
> >
> > public SystemInfoHandler(CoreContainer cc) {
> > super();
> > this.cc = cc;
> > init();
> >   }
> >
> >   private void init() {
> > try {
> >   InetAddress addr = InetAddress.getLocalHost();
> >   hostname = addr.getCanonicalHostName();
> >  this is where it hangs
> > } catch (UnknownHostException e) {
> >   //default to null
> > }
> >   }
> >
> >
> > so it depends on my current network setup even for the embedded case. any
> > idea how I can stop solr from making that call?
> InetAddress.getLocalHost()
> > in this case returns some local vpn address and thus the reverse lookup
> > times out after 30 seconds. This actually happens twice, once when
> > initializing the container, again when initializing the core, so in my
> case
> > a minute per restart and looking at the code, I don't see how I can work
> > around this other than patching solr, which I am trying to avoid like
> hell.
>
> Because almost all users are using Solr in a mode where it requires the
> network, that code cannot be eliminated from Solr.
>
> It is critical that your machine's local network is set up completely
> right when you are running applications that are (normally)
> network-aware.  Generally that means having relevant entries for all
> interfaces in your hosts file and making sure that the DNS resolver code
> included with the operating system is not buggy.
>
> If you're dealing with a VPN or something else where the address is
> acquired from elsewhere, then you need to make sure that the machine has
> at least two DNS servers configured, that one of them is working, and
> that the forward and reverse DNS on those servers are completely set up
> for that interface's IP address.  Bugs in the DNS resolver code can
> complicate this.
>
> Thanks,
> Shawn
>
>


-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Mugeesh Husain
Thank you, Upayavira.

I think I have done all these things using SolrJ, which was useful before
starting development of the project.
I hope I will not run into any issues using SolrJ, and I have gotten a lot
out of using it.

Thanks
Mugeesh Husain



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-Apache-Solr-Handle-TeraByte-Large-Data-tp3656484p4221066.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Initializing core takes very long at times

2015-08-05 Thread Erick Erickson
All patches welcome!

On Wed, Aug 5, 2015 at 12:40 PM, Robert Krüger  wrote:
> I am shipping solr as a local search engine with our software, so I have no
> way of controlling that environment. Many other software packages (rdbmss,
> nosql engines etc.) work well in such a setup (as does solr except this
> problem). The problem is that in this case (AFAICS) the host cannot be
> overridden in any way (by config or system property or whatever), because
> that handler is coded as it is. It is in no way a natural limitation of the
> type of software or my use case. But I understand that this is probably not
> frequently a problem for people, because by far most solr use is classic
> server-based use.
>
> I may suggest a patch on the devel mailing list.
>
>
> On Wed, Aug 5, 2015 at 5:42 PM, Shawn Heisey  wrote:
>
>> On 8/5/2015 7:56 AM, Robert Krüger wrote:
>> > OK, now that I had a reproducible setup I could debug where it hangs:
>> >
>> > public SystemInfoHandler(CoreContainer cc) {
>> > super();
>> > this.cc = cc;
>> > init();
>> >   }
>> >
>> >   private void init() {
>> > try {
>> >   InetAddress addr = InetAddress.getLocalHost();
>> >   hostname = addr.getCanonicalHostName();
>> >  this is where it hangs
>> > } catch (UnknownHostException e) {
>> >   //default to null
>> > }
>> >   }
>> >
>> >
>> > so it depends on my current network setup even for the embedded case. any
>> > idea how I can stop solr from making that call?
>> InetAddress.getLocalHost()
>> > in this case returns some local vpn address and thus the reverse lookup
>> > times out after 30 seconds. This actually happens twice, once when
>> > initializing the container, again when initializing the core, so in my
>> case
>> > a minute per restart and looking at the code, I don't see how I can work
>> > around this other than patching solr, which I am trying to avoid like
>> hell.
>>
>> Because almost all users are using Solr in a mode where it requires the
>> network, that code cannot be eliminated from Solr.
>>
>> It is critical that your machine's local network is set up completely
>> right when you are running applications that are (normally)
>> network-aware.  Generally that means having relevant entries for all
>> interfaces in your hosts file and making sure that the DNS resolver code
>> included with the operating system is not buggy.
>>
>> If you're dealing with a VPN or something else where the address is
>> acquired from elsewhere, then you need to make sure that the machine has
>> at least two DNS servers configured, that one of them is working, and
>> that the forward and reverse DNS on those servers are completely set up
>> for that interface's IP address.  Bugs in the DNS resolver code can
>> complicate this.
>>
>> Thanks,
>> Shawn
>>
>>
>
>
> --
> Robert Krüger
> Managing Partner
> Lesspain GmbH & Co. KG
>
> www.lesspain-software.com


Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Erick Erickson
Yes. The older-style ZK entity was all-in-one
in /clusterstate.json. Recently we've moved
to a per-collection state.json instead, to avoid
the "thundering herd" problem. In that state,
/clusterstate.json is completely ignored and,
as you see, not updated.

Hmmm, might be worth raising a JIRA to remove
empty clusterstate.json to avoid confusion

Best,
Erick

On Wed, Aug 5, 2015 at 11:22 AM, Suma Shivaprasad
 wrote:
> Hello,
>
> Is /clusterstate.json in Zookeeper updated with collection state if a
> collection is created with server running in Solr Cloud mode without
> creating a core through coreAdmin or providing a core.properties . I find
> that there is a "state.json" present under /collections/
> which reflects the collection status . However dont find the
> /clusterstate.json updated. Is this an expected behaviour ?
>
> Thanks
> Suma


Re: Embedded Solr now deprecated?

2015-08-05 Thread Robert Krüger
I just saw lots of deprecation warnings in my current code and a method
that was removed, which is why I asked.

Regarding the use case, I am embedding it with a desktop application just
as others use java-based no-sql or rdbms engines and that makes sense
architecturally in my case and is just simpler than deploying a separate
little tomcat instance. API-wise, I know it is the same and it would be
doable to do it that way. The embedded option is just the logical and
simpler choice in terms of delivery, packaging, installation, automated
testing in this case. The network option just doesn't add anything here
apart from overhead (probably negligible in our case) and complexity.

So our use is in production but in a desktop way, not what people normally
think about when they hear "production use".

Thanks everyone for the quick feedback! I am very relieved to hear it is
not on its way out and I will look at the api changes more closely and try
to get our application running on 5.2.1.

Best regards,

Robert

On Wed, Aug 5, 2015 at 4:34 PM, Erick Erickson 
wrote:

> Where did you see that? Maybe I missed something yet
> again. This is unrelated to whether we ship a WAR if that's
> being conflated here.
>
> I rather doubt that embedded is "on its way out", although
> my memory isn't what it used to be.
>
> For starters, MapReduceIndexerTool uses it, so it gets
> regular exercise from that, and anything removing it would
> require some kind of replacement.
>
> How are you using it that you care? Wondering what
> alternatives exist...
>
> Best,
> Erick
>
>
> On Wed, Aug 5, 2015 at 9:09 AM, Robert Krüger  wrote:
> > Hi,
> >
> > I tried to upgrade my application from solr 4 to 5 and just now realized
> > that embedded use of solr seems to be on the way out. Is that correct or
> is
> > there a just new API to use for that?
> >
> > Thanks in advance,
> >
> > Robert
>



-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Suma Shivaprasad
Thanks for your response. What version of solr is this change effective
from?
Will raise a jira.

Thanks
Suma

On Wed, Aug 5, 2015 at 10:37 PM, Erick Erickson 
wrote:

> Yes. The older-style ZK entity was all-in-one
> in /clusterstate.json. Recently we've moved
> to a per-collection state.json instead, to avoid
> the "thundering herd" problem. In that state,
> /clusterstate.json is completely ignored and,
> as you see, not updated.
>
> Hmmm, might be worth raising a JIRA to remove
> empty clusterstate.json to avoid confusion
>
> Best,
> Erick
>
> On Wed, Aug 5, 2015 at 11:22 AM, Suma Shivaprasad
>  wrote:
> > Hello,
> >
> > Is /clusterstate.json in Zookeeper updated with collection state if a
> > collection is created with server running in Solr Cloud mode without
> > creating a core through coreAdmin or providing a core.properties . I find
> > that there is a "state.json" present under /collections/
> > which reflects the collection status . However dont find the
> > /clusterstate.json updated. Is this an expected behaviour ?
> >
> > Thanks
> > Suma
>


Re: Initializing core takes very long at times

2015-08-05 Thread Alexandre Rafalovitch
I wonder if that's also something that could be resolved by having a
custom Network level handler, on a pure Java level.

I seem to vaguely recall it was possible.

Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 5 August 2015 at 12:57, Erick Erickson  wrote:
> All patches welcome!
>
> On Wed, Aug 5, 2015 at 12:40 PM, Robert Krüger  wrote:
>> I am shipping solr as a local search engine with our software, so I have no
>> way of controlling that environment. Many other software packages (rdbmss,
>> nosql engines etc.) work well in such a setup (as does solr except this
>> problem). The problem is that in this case (AFAICS) the host cannot be
>> overridden in any way (by config or system property or whatever), because
>> that handler is coded as it is. It is in no way a natural limitation of the
>> type of software or my use case. But I understand that this is probably not
>> frequently a problem for people, because by far most solr use is classic
>> server-based use.
>>
>> I may suggest a patch on the devel mailing list.
>>
>>
>> On Wed, Aug 5, 2015 at 5:42 PM, Shawn Heisey  wrote:
>>
>>> On 8/5/2015 7:56 AM, Robert Krüger wrote:
>>> > OK, now that I had a reproducible setup I could debug where it hangs:
>>> >
>>> > public SystemInfoHandler(CoreContainer cc) {
>>> > super();
>>> > this.cc = cc;
>>> > init();
>>> >   }
>>> >
>>> >   private void init() {
>>> > try {
>>> >   InetAddress addr = InetAddress.getLocalHost();
>>> >   hostname = addr.getCanonicalHostName();
>>> >  this is where it hangs
>>> > } catch (UnknownHostException e) {
>>> >   //default to null
>>> > }
>>> >   }
>>> >
>>> >
>>> > so it depends on my current network setup even for the embedded case. any
>>> > idea how I can stop solr from making that call?
>>> InetAddress.getLocalHost()
>>> > in this case returns some local vpn address and thus the reverse lookup
>>> > times out after 30 seconds. This actually happens twice, once when
>>> > initializing the container, again when initializing the core, so in my
>>> case
>>> > a minute per restart and looking at the code, I don't see how I can work
>>> > around this other than patching solr, which I am trying to avoid like
>>> hell.
>>>
>>> Because almost all users are using Solr in a mode where it requires the
>>> network, that code cannot be eliminated from Solr.
>>>
>>> It is critical that your machine's local network is set up completely
>>> right when you are running applications that are (normally)
>>> network-aware.  Generally that means having relevant entries for all
>>> interfaces in your hosts file and making sure that the DNS resolver code
>>> included with the operating system is not buggy.
>>>
>>> If you're dealing with a VPN or something else where the address is
>>> acquired from elsewhere, then you need to make sure that the machine has
>>> at least two DNS servers configured, that one of them is working, and
>>> that the forward and reverse DNS on those servers are completely set up
>>> for that interface's IP address.  Bugs in the DNS resolver code can
>>> complicate this.
>>>
>>> Thanks,
>>> Shawn
>>>
>>>
>>
>>
>> --
>> Robert Krüger
>> Managing Partner
>> Lesspain GmbH & Co. KG
>>
>> www.lesspain-software.com


Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Shawn Heisey
On 8/5/2015 9:22 AM, Suma Shivaprasad wrote:
> Hello,
>
> Is /clusterstate.json in Zookeeper updated with collection state if a
> collection is created with server running in Solr Cloud mode without
> creating a core through coreAdmin or providing a core.properties . I find
> that there is a "state.json" present under /collections/
> which reflects the collection status . However dont find the
> /clusterstate.json updated. Is this an expected behaviour ?

Yes, that is expected behavior.  A separate clusterstate for each
collection was one of the big new features in Solr 5.0:

https://issues.apache.org/jira/browse/SOLR-5473

Thanks,
Shawn



Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Suma Shivaprasad
Thanks Shawn.

Does this mean that in client code, wherever we are using the API

"ZkStateReader.getClusterState.getCollections" to get status, it should be
changed to "CollectionsAdminRequest.ClusterStatus" for each collection, or
will that API continue to work?

Thanks
Suma

On Wed, Aug 5, 2015 at 11:00 PM, Shawn Heisey  wrote:

> On 8/5/2015 9:22 AM, Suma Shivaprasad wrote:
> > Hello,
> >
> > Is /clusterstate.json in Zookeeper updated with collection state if a
> > collection is created with server running in Solr Cloud mode without
> > creating a core through coreAdmin or providing a core.properties . I find
> > that there is a "state.json" present under /collections/
> > which reflects the collection status . However dont find the
> > /clusterstate.json updated. Is this an expected behaviour ?
>
> Yes, that is expected behavior.  A separate clusterstate for each
> collection was one of the big new features in Solr 5.0:
>
> https://issues.apache.org/jira/browse/SOLR-5473
>
> Thanks,
> Shawn
>
>


Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Ahmet Arslan
Hi,

bq: I don't even see highlighting section in results

I mean, it is possible that you are hitting a request/search handler that does 
not have highlighting component registered. This is possible when you 
explicitly register components (query, facet, highlighting etc). 

Let's first make sure it is in the components. When you add debug=true to your 
URL, do you see any info about the highlighting component?



On Wednesday, August 5, 2015 7:12 PM, Michał Oleś  wrote:
Thank you for the answer. When I execute the query using q=Warszawa&df=text_index
instead of q=text_index:Warszawa, nothing changes. If I remove wt=json from the
query I get the response in XML, but also without highlighting results.


RE: Solr spell check not showing any suggestions for other language

2015-08-05 Thread Dyer, James
Talha,

Can you try putting your queried keyword in "spellcheck.q" ?

James Dyer
Ingram Content Group


-Original Message-
From: talha [mailto:talh...@gmail.com] 
Sent: Wednesday, August 05, 2015 10:13 AM
To: solr-user@lucene.apache.org
Subject: RE: Solr spell check not showing any suggestions for other language

Dear James

Thank you for your reply.

I tested the analyzer without “solr.EnglishPossessiveFilterFactory” but still no
luck. I also updated the analyzer; please find it below.


  



  



With the above configuration for “text_sugggest” I got the following results.

For a correct Bangla word, সহজ, the Solr response is 
Note: I set rows to 0 to skip results


  0
  2
  
সহজ
true
0
xml
1438787238383
  




  
true
  



For an incorrect Bangla word, সহগ, where I just changed the last letter, the
Solr response is


  0
  7
  
সহগ
true
0
xml
1438787208052
  




  
false
  






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-spell-check-not-showing-any-suggestions-for-other-language-tp4220950p4221033.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Initializing core takes very long at times

2015-08-05 Thread Robert Krüger
I just posted on lucene-dev. I think just replacing getCanonicalHostName by
getHostName might do the Job. At least that's exactly what Logback does for
this purpose:

http://logback.qos.ch/xref/ch/qos/logback/core/util/ContextUtil.html
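
The substitution being proposed can be sketched as follows. Note that
getHostName() skips the reverse-DNS lookup that getCanonicalHostName()
performs (the call that hangs for ~30 seconds behind the VPN); the
"localhost" fallback here is an assumption for this sketch, whereas Solr's
SystemInfoHandler leaves the field null on failure:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/**
 * Demonstrates resolving the local hostname without a reverse-DNS lookup:
 * getHostName() returns the name the InetAddress was created with, while
 * getCanonicalHostName() would query DNS in reverse and can block.
 */
public class HostNameDemo {

    static String localHostName() {
        try {
            InetAddress addr = InetAddress.getLocalHost();
            // getCanonicalHostName() would trigger a reverse lookup here;
            // getHostName() avoids it.
            return addr.getHostName();
        } catch (UnknownHostException e) {
            return "localhost"; // fallback chosen for this sketch only
        }
    }

    public static void main(String[] args) {
        System.out.println(localHostName());
    }
}
```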

On Wed, Aug 5, 2015 at 6:57 PM, Erick Erickson 
wrote:

> All patches welcome!
>
> On Wed, Aug 5, 2015 at 12:40 PM, Robert Krüger 
> wrote:
> > I am shipping solr as a local search engine with our software, so I have
> no
> > way of controlling that environment. Many other software packages
> (rdbmss,
> > nosql engines etc.) work well in such a setup (as does solr except this
> > problem). The problem is that in this case (AFAICS) the host cannot be
> > overridden in any way (by config or system property or whatever), because
> > that handler is coded as it is. It is in no way a natural limitation of
> the
> > type of software or my use case. But I understand that this is probably
> not
> > frequently a problem for people, because by far most solr use is classic
> > server-based use.
> >
> > I may suggest a patch on the devel mailing list.
> >
> >
> > On Wed, Aug 5, 2015 at 5:42 PM, Shawn Heisey 
> wrote:
> >
> >> On 8/5/2015 7:56 AM, Robert Krüger wrote:
> >> > OK, now that I had a reproducible setup I could debug where it hangs:
> >> >
> >> > public SystemInfoHandler(CoreContainer cc) {
> >> > super();
> >> > this.cc = cc;
> >> > init();
> >> >   }
> >> >
> >> >   private void init() {
> >> > try {
> >> >   InetAddress addr = InetAddress.getLocalHost();
> >> >   hostname = addr.getCanonicalHostName();
> >> >  this is where it hangs
> >> > } catch (UnknownHostException e) {
> >> >   //default to null
> >> > }
> >> >   }
> >> >
> >> >
> >> > so it depends on my current network setup even for the embedded case.
> any
> >> > idea how I can stop solr from making that call?
> >> InetAddress.getLocalHost()
> >> > in this case returns some local vpn address and thus the reverse
> lookup
> >> > times out after 30 seconds. This actually happens twice, once when
> >> > initializing the container, again when initializing the core, so in my
> >> case
> >> > a minute per restart and looking at the code, I don't see how I can
> work
> >> > around this other than patching solr, which I am trying to avoid like
> >> hell.
> >>
> >> Because almost all users are using Solr in a mode where it requires the
> >> network, that code cannot be eliminated from Solr.
> >>
> >> It is critical that your machine's local network is set up completely
> >> right when you are running applications that are (normally)
> >> network-aware.  Generally that means having relevant entries for all
> >> interfaces in your hosts file and making sure that the DNS resolver code
> >> included with the operating system is not buggy.
> >>
> >> If you're dealing with a VPN or something else where the address is
> >> acquired from elsewhere, then you need to make sure that the machine has
> >> at least two DNS servers configured, that one of them is working, and
> >> that the forward and reverse DNS on those servers are completely set up
> >> for that interface's IP address.  Bugs in the DNS resolver code can
> >> complicate this.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
> >
> >
> > --
> > Robert Krüger
> > Managing Partner
> > Lesspain GmbH & Co. KG
> >
> > www.lesspain-software.com
>



-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread wwang525
Hi All,

It looks like a numeric field cannot be used for faceting if
docValues="true".

The following article seemed to indicate an issue in this scenario:

https://issues.apache.org/jira/browse/SOLR-7495

"Unexpected docvalues type NUMERIC when grouping by a int facet"





--
View this message in context: 
http://lucene.472066.n3.nabble.com/TrieIntField-not-working-in-Solr-4-7-tp4220744p4221133.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Michał Oleś
Hi,
I checked and for me config looks alright but if you can take a look it
will be great.

Here is whole solrconfig.xml:
http://pastebin.com/7YfVZA90

and here is full schema.xml:
http://pastebin.com/LgeAvtFf

and query result with enabled debug:
http://pastebin.com/i74Wyep3


Re: Embedded Solr now deprecated?

2015-08-05 Thread Erick Erickson
Hmmm, you may want to investigate the new "no-war" solution. Solr runs
as a service with start/stop scripts. Currently it uses an underlying
Jetty container, but that will (probably) eventually change and is
pretty much considered an "implementation detail". Not quite sure
whether it'd be easier or harder for you though.

Best,
Erick

On Wed, Aug 5, 2015 at 1:08 PM, Robert Krüger  wrote:
> I just saw lots of deprecation warnings in my current code and a method
> that was removed, which is why I asked.
>
> Regarding the use case, I am embedding it with a desktop application just
> as others use java-based no-sql or rdbms engines and that makes sense
> architecturally in my case and is just simpler than deploying a separate
> little tomcat instance. API-wise, I know it is the same and it would be
> doable to do it that way. The embedded option is just the logical and
> simpler choice in terms of delivery, packaging, installation, automated
> testing in this case. The network option just doesn't add anything here
> apart from overhead (probably negligible in our case) and complexity.
>
> So our use is in production but in a desktop way, not what people normally
> think about when they hear "production use".
>
> Thanks everyone for the quick feedback! I am very relieved to hear it is
> not on its way out and I will look at the api changes more closely and try
> to get our application running on 5.2.1.
>
> Best regards,
>
> Robert
>
> On Wed, Aug 5, 2015 at 4:34 PM, Erick Erickson 
> wrote:
>
>> Where did you see that? Maybe I missed something yet
>> again. This is unrelated to whether we ship a WAR if that's
>> being conflated here.
>>
>> I rather doubt that embedded is "on its way out", although
>> my memory isn't what it used to be.
>>
>> For starters, MapReduceIndexerTool uses it, so it gets
>> regular exercise from that, and anything removing it would
>> require some kind of replacement.
>>
>> How are you using it that you care? Wondering what
>> alternatives exist...
>>
>> Best,
>> Erick
>>
>>
>> On Wed, Aug 5, 2015 at 9:09 AM, Robert Krüger  wrote:
>> > Hi,
>> >
>> > I tried to upgrade my application from solr 4 to 5 and just now realized
>> > that embedded use of solr seems to be on the way out. Is that correct or
>> is
>> > there a just new API to use for that?
>> >
>> > Thanks in advance,
>> >
>> > Robert
>>
>
>
>
> --
> Robert Krüger
> Managing Partner
> Lesspain GmbH & Co. KG
>
> www.lesspain-software.com


Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Erick Erickson
The API shouldn't be changing, did you run into any errors?

Don't bother to raise a JIRA, I checked through IM and the
/clusterstate.json is still being used as a watch point for things
like creating collections, and there are other JIRAs afoot that will
take care of this at the right time.

Best,
Erick

On Wed, Aug 5, 2015 at 1:54 PM, Suma Shivaprasad
 wrote:
> Thanks Shawn.
>
> Does this mean in client code, wherever we are using the API
>
> "ZkStateReader.getClusterState.getCollections" to get status should be
> changed to
> "CollectionsAdminRequest.ClusterStatus" for each collection or will that
> API continue to work ?
>
> Thanks
> Suma
>
> On Wed, Aug 5, 2015 at 11:00 PM, Shawn Heisey  wrote:
>
>> On 8/5/2015 9:22 AM, Suma Shivaprasad wrote:
>> > Hello,
>> >
>> > Is /clusterstate.json in Zookeeper updated with collection state if a
>> > collection is created with server running in Solr Cloud mode without
>> > creating a core through coreAdmin or providing a core.properties . I find
>> > that there is a "state.json" present under /collections/
>> > which reflects the collection status . However dont find the
>> > /clusterstate.json updated. Is this an expected behaviour ?
>>
>> Yes, that is expected behavior.  A separate clusterstate for each
>> collection was one of the big new features in Solr 5.0:
>>
>> https://issues.apache.org/jira/browse/SOLR-5473
>>
>> Thanks,
>> Shawn
>>
>>


Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Ahmet Arslan
Hi,

I couldn't find anything suspicious. It used to be allowed to highlight on an 
indexed="false" field as long as a tokenizer is defined on it: 
https://cwiki.apache.org/confluence/display/solr/Field+Properties+by+Use+Case

Maybe that has changed. Can you try to highlight on a field that is both
indexed and stored?
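
A quick way to test that is to declare one field that is both indexed and
stored, and point hl.fl at it. A minimal schema sketch (the field and type
names here are hypothetical):

```xml
<!-- schema.xml: a single field that is both indexed and stored,
     so it can be matched, highlighted, and returned directly -->
<field name="text_all" type="text_general" indexed="true" stored="true"/>
```

Then query with something like q=text_all:Warszawa&hl=true&hl.fl=text_all and
check whether the highlighting section appears.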

Ahmet


On Wednesday, August 5, 2015 10:41 PM, Michał Oleś  
wrote:
Hi,
I checked and for me config looks alright but if you can take a look it
will be great.

Here is whole solrconfig.xml:
http://pastebin.com/7YfVZA90

and here is full schema.xml:
http://pastebin.com/LgeAvtFf

and query result with enabled debug:
http://pastebin.com/i74Wyep3


RE: Embedded Solr now deprecated?

2015-08-05 Thread Ken Krugler
Hi Shawn,

We have a different use case than the ones you covered in your response to 
Robert (below), which I wanted to call out.

We currently use the embedded server when building indexes as part of a Hadoop 
workflow. The results get copied to a production analytics server and swapped 
in on a daily basis.

Writing to multiple embedded servers (one per reduce task) gives us maximum 
performance, and has proven to be a very reliable method for the daily rebuild 
of pre-aggregations we need for our analytics use case.

Regards,

-- Ken

PS - I'm also currently looking at using embedded Solr as a state storage 
engine for Samza.

> From: Shawn Heisey
> Sent: August 5, 2015 7:54:07am PDT
> To: solr-user@lucene.apache.org
> Subject: Re: Embedded Solr now deprecated?
> 
> On 8/5/2015 7:09 AM, Robert Krüger wrote:
>> I tried to upgrade my application from solr 4 to 5 and just now realized
>> that embedded use of solr seems to be on the way out. Is that correct or is
>> there a just new API to use for that?
> 
> Building on Erick's reply:
> 
> I doubt that the embedded server is going away, and I do not recall
> seeing *anything* marking the entire class deprecated.  The class still
> receives attention from devs -- this feature was released with 5.1.0:
> 
> https://issues.apache.org/jira/browse/SOLR-7307
> 
> That said, we have discouraged users from deploying it in production for
> quite some time, even though it continues to exist and receive developer
> attention.  Some of the reasons that I think users should avoid the
> embedded server:  It doesn't support SolrCloud, you cannot make it
> fault-tolerant (redundant), and troubleshooting is harder because you
> cannot connect to it from outside of the source code where it is embedded.
> 
> Deploying Solr as a network service offers much more capability than you
> can get when you embed it in your application.  Chances are that you can
> easily replace EmbeddedSolrServer with one of the SolrClient classes and
> use a separate Solr deployment from your application.
> 
> Thanks,
> Shawn
> 


--
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr