Re: solrj returning no results but curl can get them

2015-01-30 Thread Dmitry Kan
Hi,

Some sanity checking: does the solr server base url in the code match the
one you use with curl?

What if you curl against http://myserver/myapp/ ?

On Fri, Jan 30, 2015 at 5:58 AM, S L  wrote:

> I'm stumped. I've got some solrj 3.6.1 code that works fine against three
> of my request handlers but not the fourth. The very odd thing is that I
> have no trouble retrieving results with curl against all of the request
> handlers.
>
> My solrj code sets some parameters:
>
> ModifiableSolrParams params = new ModifiableSolrParams();
>
> params.set("fl","*,score");
> params.set("rows","500");
> params.set("qt","/"+product);
> params.set("hl", "on");
> params.set("hl.fl", "title snippet");
> params.set("hl.fragsize",50);
> params.set("hl.simple.pre","<span class=\"hlt\">");
> params.set("hl.simple.post","</span>");
>
> queryString = "(" + queryString + s[s.length-1] + ")";
>
> I have various request handlers that key off of the product value. I'll
> call the one that doesn't work "myproduct".
>
> I send the parameter string to catalina.out for debugging:
>
> System.out.println(params.toString());
>
> I get this:
>
>
>
> fl=*%2Cscore&rows=500&qt=%2Fmyproduct&hl=on&hl.fl=title+snippet&hl.fragsize=50
> &hl.simple.pre=%3Cspan+class%3D%22hlt%22%3E
>
>
> &hl.simple.post=%3C%2Fspan%3E&q=title%3A%28brain%29+OR+snippet%3A%28brain%29
>
> I get no results when I let the solrj code do the search although the code
> works fine with the other three products.
>
> To convince myself that there is nothing wrong with the data I decode the
> parameter string and run this command:
>
> curl "http://myserver/myapp/myproduct\
>
>
> fl=*,score&rows=500&qt=/myproduct&hl=on&hl.fl=title+snippet&hl.fragsize=50\
>&hl.simple.pre=<span class="hlt">&hl.simple.post=</span>\
>&q=title:brain%20OR%20snippet:brain"
>
> It runs just fine.
>
> How can I debug this?
>
> Thanks very much.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/solrj-returning-no-results-but-curl-can-get-them-tp4183053.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

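As a further sanity check, you can decode the logged parameter string and diff it against your curl URL. A minimal JDK-only sketch (the class name is mine):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeParams {

    // Decode a SolrJ-logged, URL-encoded parameter string back into the
    // plain form used on the curl command line.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String logged = "fl=*%2Cscore&rows=500&qt=%2Fmyproduct&hl=on";
        System.out.println(decode(logged));
        // prints: fl=*,score&rows=500&qt=/myproduct&hl=on
    }
}
```

Note that URLDecoder also turns '+' into a space, so a parameter like hl.fl=title+snippet decodes to a space-separated field list.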


-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Re: WordDelimiterFilterFactory and position increment.

2015-01-30 Thread Modassar Ather
Hi,

An insight into the behavior of WordDelimiterFilter would be very helpful.
Please share your inputs.

Thanks,
Modassar

On Thu, Jan 22, 2015 at 2:54 PM, Modassar Ather 
wrote:

> Hi,
>
> I am using WordDelimiterFilter while indexing. Parser used is edismax.
> Phrase search is failing for terms like "3d image".
>
> On the analysis page it shows the following four tokens for *3d* and their
> positions.
>
> *token  position*
> 3d   1
> 3 1
> 3d   1
> d 2
>
> image 3
>
> Here the token d is at position 2, which per my understanding causes the
> phrase search "3d image" to fail.
> "3d image"~1 works fine. The same behavior is present for "wi-fi device" and
> a few other queries starting with a token that is tokenized as shown above
> in the table.
>
> Kindly help me understand this behavior and let me know how the phrase
> search is possible in such cases without slop.
>
> Thanks,
> Modassar
>
>
>

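For intuition, the position arithmetic in the table quoted above can be sketched in a few lines (a toy illustration only, not Lucene's actual sloppy-phrase scorer):

```java
public class PhrasePositions {

    // An exact phrase (slop 0) requires the second term to sit at the
    // position immediately after the first; each skipped position costs
    // one unit of slop.
    static boolean phraseMatches(int posFirst, int posSecond, int slop) {
        return (posSecond - posFirst - 1) <= slop;
    }

    public static void main(String[] args) {
        // Positions from the analysis page: "3d" at 1, and because the
        // split token "d" occupies position 2, "image" lands at 3.
        System.out.println(phraseMatches(1, 3, 0)); // false -> "3d image" misses
        System.out.println(phraseMatches(1, 3, 1)); // true  -> "3d image"~1 hits
    }
}
```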

Re: WordDelimiterFilterFactory and position increment.

2015-01-30 Thread Dmitry Kan
Hi,

Do you use WordDelimiterFilter on query side as well?

On Fri, Jan 30, 2015 at 12:51 PM, Modassar Ather 
wrote:

> Hi,
>
> An insight in the behavior of WordDelimiterFilter will be very helpful.
> Please share your inputs.
>
> Thanks,
> Modassar
>
> On Thu, Jan 22, 2015 at 2:54 PM, Modassar Ather 
> wrote:
>
> > Hi,
> >
> > I am using WordDelimiterFilter while indexing. Parser used is edismax.
> > Phrase search is failing for terms like "3d image".
> >
> > On the analysis page it shows following four tokens for *3d* and there
> > positions.
> >
> > *token  position*
> > 3d   1
> > 3 1
> > 3d   1
> > d 2
> >
> > image 3
> >
> > Here the token d is at position 2 which per my understanding causes the
> > phrase search "3d image" fail.
> > "3d image"~1 works fine. Same behavior is present for "wi-fi device" and
> > other few queries starting with token which is tokenized as shown above
> in
> > the table.
> >
> > Kindly help me understand the behavior and let me know how the phrase
> > search is possible in such cases without the slop.
> >
> > Thanks,
> > Modassar
> >
> >
> >
>



-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Re: solrj returning no results but curl can get them

2015-01-30 Thread S L
It was pilot error. I just reviewed my servlet and noticed a parameter in
web.xml that was looking to find data for the new product in the production
index which doesn't have that data yet while my curl command was running
against the staging index. I rebuilt the servlet with the fixed parameter
and life is now good.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solrj-returning-no-results-but-curl-can-get-them-tp4183053p4183119.html
Sent from the Solr - User mailing list archive at Nabble.com.


Removing a stored field from solrcloud 4.4

2015-01-30 Thread Nishanth S
Hello,

I have a field which is indexed and stored in the Solr schema (SolrCloud
4.4). This field is relatively huge, and I plan to only index the field
and not store it. Is there a need to re-index the documents once this
change is made?

Thanks,
Nishanth


Calling custom request handler with data import

2015-01-30 Thread vineet yadav
Hi,
I am using the data import handler to import data from MySQL, and I want to
identify named entities in it. I am using the following example
(http://www.searchbox.com/named-entity-recognition-ner-in-solr/), where
Stanford NER is used to identify named entities. I am using the following
request handler

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import.xml</str>
  </lst>
</requestHandler>

for importing data from MySQL, and

<updateRequestProcessorChain name="mychain">
  <processor class="...">  <!-- the NER update processor from the example -->
    <lst name="fields">
      <str>content</str>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">mychain</str>
  </lst>
</requestHandler>
for identifying named entities. The NER request handler identifies named
entities in the content field and stores the extracted entities in Solr
fields.

The NER request handler was working when I used Nutch with Solr, but when I
import data from MySQL, the NER request handler is not invoked, so entities
are not stored in Solr for the imported documents. Can anybody tell me how
to call a custom request handler from the data import handler?

Alternatively, if I can invoke the NER request handler externally so that it
indexes person, organization and location entities for imported documents,
that is also fine. Any suggestions are welcome.

Thanks
Vineet Yadav


Re: AW: AW: AW: CoreContainer#createAndLoad, existing cores not loaded

2015-01-30 Thread Shawn Heisey
On 1/29/2015 11:37 PM, Clemens Wyss DEV wrote:
>> The recommendation these days is to NOT use the embedded server
> We would love to, as it is clear that this is not the "Solr-way" to go. The 
> reason for us building upon EmbeddedSolrServer is, we have more than 
> 150sites, each with ist own index (core). If we'd go client server then we 
> could no easily update the solr server(s) without also updating all clients 
> (i.e. the 150 sites) at same time. And having a dedicated Solr server for 
> every client/site is not really an option, is it?
> 
> Or can for example a 4.10.3 client "talk" to a Solr 5/6 Server? Also when 
> updating the Solr server, doesn't that also require a re-index of all data as 
> the Luncene-storage format might have changed?

Cross-version compatibility between SolrJ and Solr is very high, as long
as you're not running SolrCloud.  SolrCloud is *incredibly* awesome, but
it's not for everyone.

Without SolrCloud, the communication is http only, using very stable
APIs that have been around since pretty much the beginning of Solr.  In
the 1.x and 3.x days, there were occasional code tweaks required for
cross-version compatibility, but the API has been extremely stable since
early 4.x -- for a couple of years now.

SolrCloud is much more recent and far more complex, so problems or
deficiencies are sometimes found with the API.  Fixing those bugs
sometimes requires changes that are incompatible with other versions of
the Java client.  The SolrJ java client is an integral part of Solr
itself, so SolrCloud functionality in the client is tightly coupled to
specifics in the API that are undergoing rapid change from version to
version.

I don't think that SolrCloud is even possible with the embedded server,
because it requires HTTP for inter-server communication.  The embedded
server doesn't listen for HTTP.

Thanks,
Shawn



RE: Does DocValues improve Grouping performance ?

2015-01-30 Thread Cario, Elaine
Hi Shamik,

We use DocValues for grouping, and although I have nothing to compare it to (we 
started with DocValues), we are also seeing similar poor results as you: easily 
60% overhead compared to non-group queries.  Looking around for some solution, 
no quick fix is presenting itself unfortunately.  CollapsingQParserPlugin also 
is too limited for our needs.

-Original Message-
From: Shamik Bandopadhyay [mailto:sham...@gmail.com] 
Sent: Thursday, January 15, 2015 6:02 PM
To: solr-user@lucene.apache.org
Subject: Does DocValues improve Grouping performance ?

Hi,

   Does use of DocValues provide any performance improvement for Grouping ?
I've looked into a blog post which mentions improving Grouping performance through
DocValues.

https://lucidworks.com/blog/fun-with-docvalues-in-solr-4-2/

Right now, Group by queries (which I sadly can't avoid) have become a huge
bottleneck. They have an overhead of 60-70% compared to the same query sans
group by. Unfortunately, I'm not able to use CollapsingQParserPlugin as it
doesn't have support similar to the "group.facet" feature.

My understanding of DocValues is that it's intended for faceting and sorting.
Just wondering if anyone has tried DocValues for Grouping and seen any
improvements?

-Thanks,
Shamik


Replication in solrloud

2015-01-30 Thread solr2020
Hi,


We have 4 servers in SolrCloud with one shard. 2 of the servers are not in
sync with the other two. We would like to force replication manually to keep
all the servers in sync. Do we have a command to force replication (other
than a Solr restart)?


Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Replication-in-solrloud-tp4183103.html
Sent from the Solr - User mailing list archive at Nabble.com.


New UI for SOLR-based projects

2015-01-30 Thread Roman Chyla
Hi everybody,

There exists a new open-source implementation of a search interface for
SOLR. It is written in Javascript (using Backbone), currently in version
v1.0.19 - but new features are constantly coming. Rather than describing it
in words, please see it in action for yourself at http://ui.adslabs.org -
I'd recommend exploring facets, the query form, and visualizations.

The code lives at: http://github.com/adsabs/bumblebee

Best,

  Roman


Hit Highlighting and More Like This

2015-01-30 Thread Tim Hearn
Hi all,

I'm fairly new to Solr.  It seems like it should be possible to enable the
hit highlighting feature and more like this feature at the same time, with
the key words from the MLT query being the terms highlighted.  Is this
possible?  I am trying right now to do this, but I am not having any
snippets returned to me.

Thanks!


Re: Removing a stored field from solrcloud 4.4

2015-01-30 Thread Erick Erickson
Yes and no. Solr should continue to work fine, just all new documents won't
have the stored field to return to the clients. As you re-index docs,
subsequent merges will purge the stored data _for the docs you've
re-indexed_.

But I would re-index just to get my system in a consistent state.
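For reference, the change being discussed is just flipping the stored flag on the field definition; a sketch with assumed field and type names:

```xml
<!-- before: the large field is both indexed and stored -->
<field name="big_field" type="text_general" indexed="true" stored="true"/>

<!-- after: index only; re-index so already-stored data is purged on merge -->
<field name="big_field" type="text_general" indexed="true" stored="false"/>
```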

Best
Erick

On Fri, Jan 30, 2015 at 9:40 AM, Nishanth S  wrote:

> Hello,
>
> I have a field which is indexed  and stored  in the solr schema( 4.4.solr
> cloud).This field is relatively huge and I plan to  only index the field
> and not to store.Is there a  need to re-index the  documents once this
> change is made?.
>
> Thanks,
> Nishanth
>


RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-01-30 Thread Dyer, James
You need to decrease this to at least 2 because the length of "go" is <3.

<int name="minBreakLength">3</int>
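In context, the relevant spellchecker entry with the value lowered would look something like this (the field name here is an assumption):

```xml
<lst name="spellchecker">
  <str name="name">wordbreak</str>
  <str name="classname">solr.WordBreakSolrSpellChecker</str>
  <str name="field">descrizione</str>
  <str name="combineWords">true</str>
  <int name="minBreakLength">2</int>
</lst>
```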

James Dyer
Ingram Content Group


-Original Message-
From: fabio.bozzo [mailto:f.bo...@3-w.it] 
Sent: Wednesday, January 28, 2015 4:55 PM
To: solr-user@lucene.apache.org
Subject: RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

I tried increasing my alternativeTermCount to 5 and enabling extended results.
I also added an fq filter parameter to clarify what I mean:

*Querying for "go pro" is good:*

{
  "responseHeader": {
"status": 0,
"QTime": 2,
"params": {
  "q": "go pro",
  "indent": "true",
  "fq": "marchio:\"GO PRO\"",
  "rows": "1",
  "wt": "json",
  "spellcheck.extendedResults": "true",
  "_": "1422485581792"
}
  },
  "response": {
"numFound": 27,
"start": 0,
"docs": [
  {
"codice_produttore_s": "DK00150020",
"codice_s": "5.BAT.27407",
"id": "27407",
"marchio": "GO PRO",
"barcode_interno_s": "185323000958",
"prezzo_acquisto_d": 16.12,
"data_aggiornamento_dt": "2012-06-21T00:00:00Z",
"descrizione": "BATTERIA GO PRO HERO ",
"prezzo_vendita_d": 39.9,
"categoria": "Batterie",
"_version_": 1491583424191791000
  },

 

]
  },
  "spellcheck": {
"suggestions": [
  "go pro",
  {
"numFound": 1,
"startOffset": 0,
"endOffset": 6,
"origFreq": 433,
"suggestion": [
  {
"word": "gopro",
"freq": 2
  }
]
  },
  "correctlySpelled",
  false,
  "collation",
  [
"collationQuery",
"gopro",
"hits",
3,
"misspellingsAndCorrections",
[
  "go pro",
  "gopro"
]
  ]
]
  }
}

While querying for "gopro" is not:

{
  "responseHeader": {
"status": 0,
"QTime": 6,
"params": {
  "q": "gopro",
  "indent": "true",
  "fq": "marchio:\"GO PRO\"",
  "rows": "1",
  "wt": "json",
  "spellcheck.extendedResults": "true",
  "_": "1422485629480"
}
  },
  "response": {
"numFound": 3,
"start": 0,
"docs": [
  {
"codice_produttore_s": "DK0030010",
"codice_s": "5.VID.39163",
"id": "38814",
"marchio": "GO PRO",
"barcode_interno_s": "818279012477",
"prezzo_acquisto_d": 150.84,
"data_aggiornamento_dt": "2014-12-24T00:00:00Z",
"descrizione": "VIDEOCAMERA GO-PRO HERO 3 WHITE NUOVO SLIM",
"prezzo_vendita_d": 219,
"categoria": "Fotografia",
"_version_": 1491583425479442400
  },

]
  },
  "spellcheck": {
"suggestions": [
  "gopro",
  {
"numFound": 1,
"startOffset": 0,
"endOffset": 5,
"origFreq": 2,
"suggestion": [
  {
"word": "giro",
"freq": 6
  }
]
  },
  "correctlySpelled",
  false
]
  }
}

---

I'd like "go pro" as a suggestion for "gopro" too.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Suggesting-broken-words-with-solr-WordBreakSolrSpellChecker-tp4182172p4182735.html
Sent from the Solr - User mailing list archive at Nabble.com.




Re: Does DocValues improve Grouping performance ?

2015-01-30 Thread Joel Bernstein
A few questions so we can better understand the scale of grouping you're
trying to accomplish:

How many distinct groups do you typically have in a search result?

How many distinct groups are there in the field you are grouping on?

How many results are you trying to group in a query?

Joel Bernstein
Search Engineer at Heliosearch

On Fri, Jan 30, 2015 at 4:10 PM, Cario, Elaine <
elaine.ca...@wolterskluwer.com> wrote:

> Hi Shamik,
>
> We use DocValues for grouping, and although I have nothing to compare it
> to (we started with DocValues), we are also seeing similar poor results as
> you: easily 60% overhead compared to non-group queries.  Looking around for
> some solution, no quick fix is presenting itself unfortunately.
> CollapsingQParserPlugin also is too limited for our needs.
>
> -Original Message-
> From: Shamik Bandopadhyay [mailto:sham...@gmail.com]
> Sent: Thursday, January 15, 2015 6:02 PM
> To: solr-user@lucene.apache.org
> Subject: Does DocValues improve Grouping performance ?
>
> Hi,
>
>Does use of DocValues provide any performance improvement for Grouping ?
> I' looked into the blog which mentions improving Grouping performance
> through DocValues.
>
> https://lucidworks.com/blog/fun-with-docvalues-in-solr-4-2/
>
> Right now, Group by queries (which I can't sadly avoid) has become a huge
> bottleneck. It has an overhead of 60-70% compared to the same query san
> group by. Unfortunately, I'm not able to be CollapsingQParserPlugin as it
> doesn't have a support similar to "group.facet" feature.
>
> My understanding on DocValues is that it's intended for faceting and
> sorting. Just wondering if anyone have tried DocValues for Grouping and saw
> any improvements ?
>
> -Thanks,
> Shamik
>


timestamp field and atomic updates

2015-01-30 Thread Bill Au
I have a timestamp field in my schema to track when each doc was indexed:



Recently, we switched over to using atomic updates instead of re-indexing
when we need to update a doc in the index. It looks to me like the timestamp
field is not updated during an atomic update. I have also looked into
TimestampUpdateProcessorFactory, and it looks to me like that won't help in
my case.

Is there anything within Solr that I can use to update the timestamp during
atomic update, or do I have to explicitly include the timestamp field as
part of the atomic update?
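The explicit option would look roughly like this as a JSON atomic update (the doc id and extra field are hypothetical):

```json
[
  {
    "id": "doc1",
    "title":     { "set": "updated title" },
    "timestamp": { "set": "2015-01-30T12:00:00Z" }
  }
]
```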

Bill


Re: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-01-30 Thread fabio.bozzo
Nice! It works indeed!
Sorry I didn't notice that before.

But what if I want the same for the iPhone?
I mean suggesting "I phone" for users who searched "iphone". A minBreakLength
of 1 is just too small, isn't it?

Il sabato 31 gennaio 2015, Dyer, James-2 [via Lucene] <
ml-node+s472066n4183176...@n3.nabble.com> ha scritto:

> You need to decrease this to at least 2 because the length of "go" is <3.
>
> 3
>
> James Dyer
> Ingram Content Group
>
>
> -Original Message-
> From: fabio.bozzo [mailto:[hidden email]
> ]
> Sent: Wednesday, January 28, 2015 4:55 PM
> To: [hidden email] 
> Subject: RE: Suggesting broken words with solr.WordBreakSolrSpellChecker
>
> I tried increasing my alternativeTermCount to 5 and enable extended
> results.
> I also added a filter fq parameter to clarify what I mean:
>
> *Querying for "go pro" is good:*
>
> {
>   "responseHeader": {
> "status": 0,
> "QTime": 2,
> "params": {
>   "q": "go pro",
>   "indent": "true",
>   "fq": "marchio:\"GO PRO\"",
>   "rows": "1",
>   "wt": "json",
>   "spellcheck.extendedResults": "true",
>   "_": "1422485581792"
> }
>   },
>   "response": {
> "numFound": 27,
> "start": 0,
> "docs": [
>   {
> "codice_produttore_s": "DK00150020",
> "codice_s": "5.BAT.27407",
> "id": "27407",
> "marchio": "GO PRO",
> "barcode_interno_s": "185323000958",
> "prezzo_acquisto_d": 16.12,
> "data_aggiornamento_dt": "2012-06-21T00:00:00Z",
> "descrizione": "BATTERIA GO PRO HERO ",
> "prezzo_vendita_d": 39.9,
> "categoria": "Batterie",
> "_version_": 1491583424191791000
>   },
>
>  
>
> ]
>   },
>   "spellcheck": {
> "suggestions": [
>   "go pro",
>   {
> "numFound": 1,
> "startOffset": 0,
> "endOffset": 6,
> "origFreq": 433,
> "suggestion": [
>   {
> "word": "gopro",
> "freq": 2
>   }
> ]
>   },
>   "correctlySpelled",
>   false,
>   "collation",
>   [
> "collationQuery",
> "gopro",
> "hits",
> 3,
> "misspellingsAndCorrections",
> [
>   "go pro",
>   "gopro"
> ]
>   ]
> ]
>   }
> }
>
> While querying for "gopro" is not:
>
> {
>   "responseHeader": {
> "status": 0,
> "QTime": 6,
> "params": {
>   "q": "gopro",
>   "indent": "true",
>   "fq": "marchio:\"GO PRO\"",
>   "rows": "1",
>   "wt": "json",
>   "spellcheck.extendedResults": "true",
>   "_": "1422485629480"
> }
>   },
>   "response": {
> "numFound": 3,
> "start": 0,
> "docs": [
>   {
> "codice_produttore_s": "DK0030010",
> "codice_s": "5.VID.39163",
> "id": "38814",
> "marchio": "GO PRO",
> "barcode_interno_s": "818279012477",
> "prezzo_acquisto_d": 150.84,
> "data_aggiornamento_dt": "2014-12-24T00:00:00Z",
> "descrizione": "VIDEOCAMERA GO-PRO HERO 3 WHITE NUOVO SLIM",
> "prezzo_vendita_d": 219,
> "categoria": "Fotografia",
> "_version_": 1491583425479442400
>   },
> 
> ]
>   },
>   "spellcheck": {
> "suggestions": [
>   "gopro",
>   {
> "numFound": 1,
> "startOffset": 0,
> "endOffset": 5,
> "origFreq": 2,
> "suggestion": [
>   {
> "word": "giro",
> "freq": 6
>   }
> ]
>   },
>   "correctlySpelled",
>   false
> ]
>   }
> }
>
> ---
>
> I'd like "go pro" as a suggestion for "gopro" too.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Suggesting-broken-words-with-solr-WordBreakSolrSpellChecker-tp4182172p4182735.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>
>


-- 
Fabio Bozzo
SW Engineer

3W s.r.l.
Via Luisetti,7
13900-Biella ( BI )
Tel. 015.8

AW: AW: AW: CoreContainer#createAndLoad, existing cores not loaded

2015-01-30 Thread Clemens Wyss DEV
I looked into the sources of CoreAdminHandler#handleCreateAction
...
  SolrCore core = coreContainer.create(dcore);
  
  // only write out the descriptor if the core is successfully created
  coreContainer.getCoresLocator().create(coreContainer, dcore);
...

I was missing the "coreContainer.getCoresLocator().create(coreContainer,
dcore);" call. When doing both calls:
a) core.properties is created
AND
b) the cores are loaded upon container startup ;)
:-)

-Ursprüngliche Nachricht-
Von: Clemens Wyss DEV [mailto:clemens...@mysign.ch] 
Gesendet: Freitag, 30. Januar 2015 07:38
An: solr-user@lucene.apache.org
Betreff: AW: AW: AW: CoreContainer#createAndLoad, existing cores not loaded

> The recommendation these days is to NOT use the embedded server
We would love to, as it is clear that this is not the "Solr way" to go. The
reason for us building upon EmbeddedSolrServer is that we have more than 150
sites, each with its own index (core). If we went client-server, then we could
not easily update the Solr server(s) without also updating all clients (i.e.
the 150 sites) at the same time. And having a dedicated Solr server for every
client/site is not really an option, is it?

Or can, for example, a 4.10.3 client "talk" to a Solr 5/6 server? Also, when
updating the Solr server, doesn't that also require a re-index of all data, as
the Lucene storage format might have changed?

-Ursprüngliche Nachricht-
Von: Shawn Heisey [mailto:apa...@elyograg.org]
Gesendet: Donnerstag, 29. Januar 2015 20:30
An: solr-user@lucene.apache.org
Betreff: Re: AW: AW: CoreContainer#createAndLoad, existing cores not loaded

On 1/29/2015 10:15 AM, Clemens Wyss DEV wrote:
>> to put your solr home inside the extracted WAR
> We are NOT using war's
> 
>> coreRootDirectory
> I don't have this property in my sorl.xml
> 
>> If there will only be core.properties files in that cores directory
> Again, I see no core.properties file. I am creating my cores through 
> CoreContainer.createCore( CordeDescriptor). The folder(s) are created 
> but  no core.properties file

I am pretty clueless when it comes to the embedded server, but if you are 
creating the cores in the java code every time you create the container, I bet 
what I'm telling you doesn't apply at all.  The solr.xml file may not even be 
used.

The recommendation these days is to NOT use the embedded server.  There are too 
many limitations and it doesn't receive as much user testing as the webapp.  
Start Solr as a separate process and access it over http.
The overhead of http on a LAN is minimal, and over localhost it's almost 
nothing.

To do that, you would just need to change your code to use one of the client 
objects.  That would probably be HttpSolrServer, which is renamed to 
HttpSolrClient in 5.0.  They share the same parent object as 
EmbeddedSolrServer.  Most of the relevant methods used come from the parent 
class, so you would need very few code changes.

Thanks,
Shawn



Re: solrj returning no results but curl can get them

2015-01-30 Thread S L
Hi Dmitry,

I do have a question mark in my search. I see that I dropped that
accidentally when I was copying/pasting/formatting the details. 

My curl command is curl "http://myserver/myapp/myproduct?fl=*,.";

And, it works fine whether I have .../myproduct/?fl=*, or if I leave out
the / before ?fl=*.

The curl command works perfectly with any of the four request handlers so I
believe the data to be correct and my solrj code works perfectly with three
out of four of the request handlers so I believe the code to be correct as
well.

Thanks.

Sol



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solrj-returning-no-results-but-curl-can-get-them-tp4183053p4183116.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: New UI for SOLR-based projects

2015-01-30 Thread Shawn Heisey
On 1/30/2015 1:07 PM, Roman Chyla wrote:
> There exists a new open-source implementation of a search interface for
> SOLR. It is written in Javascript (using Backbone), currently in version
> v1.0.19 - but new features are constantly coming. Rather than describing it
> in words, please see it in action for yourself at http://ui.adslabs.org -
> I'd recommend exploring facets, the query form, and visualizations.
> 
> The code lives at: http://github.com/adsabs/bumblebee

I have no wish to trivialize the work you've done.  I haven't looked
into the code, but a high-level glance at the documentation suggests
that you've put a lot of work into it.

I do however have a strong caveat for your users.  I'm the guy holding
the big sign that says "the end is near" to anyone who will listen!

By itself, this is an awesome tool for prototyping, but without some
additional expertise and work, there are severe security implications.

If this gets used for a public Internet facing service, the Solr server
must be accessible from the end user's machine, which might mean that it
must be available to the entire Internet.

If the Solr server is not sitting behind some kind of intelligent proxy
that can detect and deny attempts to access certain parts of the Solr
API, then Solr will be wide open to attack.  A knowledgeable user that
has unfiltered access to a Solr server will be able to completely delete
the index, change any piece of information in the index, or send denial
of service queries that will make it unable to respond to legitimate
traffic.

Setting up such a proxy is not a trivial task.  I know that some people
have done it, but so far I have not seen anyone share those
configurations.  Even with such a proxy, it might still be possible to
easily send denial of service queries.
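As a rough illustration only (paths and backend address are assumptions, and this is far from a complete lockdown), such a proxy rule might look like:

```nginx
# Pass only read-side select requests through to Solr;
# everything else (updates, admin, etc.) is refused.
location ~ ^/solr/[^/]+/select {
    proxy_pass http://127.0.0.1:8983;
}
location /solr/ {
    deny all;
}
```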

I cannot find any information in your README or the documentation links
that mentions any of these concerns.  I suspect that many who
incorporate this client into their websites will be unaware that their
setup may be insecure, or how to protect it.

Thanks,
Shawn



Re: New UI for SOLR-based projects

2015-01-30 Thread Roman Chyla
I gather from your comment that I should update the README, because there
could be people inclined to use the Bumblebee development server in
production: beware, those who enter through this gate! :-)

Your point that so far you haven't seen anybody share their middle layer
can be addressed by pointing to the following projects:

https://github.com/adsabs/solr-service
https://github.com/adsabs/adsws

These are also open source; we use them in production, and they have OAuth,
microservices, REST, and rate limits. We know it is not perfect, but what
is? ;-) Pull requests welcome!

Thanks,

Roman
On 30 Jan 2015 21:51, "Shawn Heisey"  wrote:

> On 1/30/2015 1:07 PM, Roman Chyla wrote:
> > There exists a new open-source implementation of a search interface for
> > SOLR. It is written in Javascript (using Backbone), currently in version
> > v1.0.19 - but new features are constantly coming. Rather than describing
> it
> > in words, please see it in action for yourself at http://ui.adslabs.org
> -
> > I'd recommend exploring facets, the query form, and visualizations.
> >
> > The code lives at: http://github.com/adsabs/bumblebee
>
> I have no wish to trivialize the work you've done.  I haven't looked
> into the code, but a high-level glance at the documentation suggests
> that you've put a lot of work into it.
>
> I do however have a strong caveat for your users.  I'm the guy holding
> the big sign that says "the end is near" to anyone who will listen!
>
> By itself, this is an awesome tool for prototyping, but without some
> additional expertise and work, there are severe security implications.
>
> If this gets used for a public Internet facing service, the Solr server
> must be accessible from the end user's machine, which might mean that it
> must be available to the entire Internet.
>
> If the Solr server is not sitting behind some kind of intelligent proxy
> that can detect and deny aattempts to access certain parts of the Solr
> API, then Solr will be wide open to attack.  A knowledgeable user that
> has unfiltered access to a Solr server will be able to completely delete
> the index, change any piece of information in the index, or send denial
> of service queries that will make it unable to respond to legitimate
> traffic.
>
> Setting up such a proxy is not a trivial task.  I know that some people
> have done it, but so far I have not seen anyone share those
> configurations.  Even with such a proxy, it might still be possible to
> easily send denial of service queries.
>
> I cannot find any information in your README or the documentation links
> that mentions any of these concerns.  I suspect that many who
> incorporate this client into their websites will be unaware that their
> setup may be insecure, or how to protect it.
>
> Thanks,
> Shawn
>
>


Re: Calling custom request handler with data import

2015-01-30 Thread Dan Davis
The Data Import Handler isn't pushing data through the /update request
handler.  However, the Data Import Handler can be extended with transformers.
Two such transformers are the TemplateTransformer and the ScriptTransformer.
It may be possible to get a script function to load your custom Java code.
You could also just write a StanfordNerTransformer.
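A ScriptTransformer hook in data-config.xml might be wired up roughly like this (the function name, field names, and query are all hypothetical):

```xml
<dataConfig>
  <script><![CDATA[
      function tagEntities(row) {
          // invoke your custom NER code here and add its output to the row;
          // 'content' is the source field name from this thread
          var text = row.get('content');
          // row.put('person_ss', ...);
          return row;
      }
  ]]></script>
  <document>
    <entity name="doc" transformer="script:tagEntities"
            query="select id, content from documents"/>
  </document>
</dataConfig>
```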

Hope this helps,

Dan

On Fri, Jan 30, 2015 at 9:07 AM, vineet yadav 
wrote:

> Hi,
> I am using data import handler to import data from mysql, and I want to
> identify name entities from it. So I am using following example(
> http://www.searchbox.com/named-entity-recognition-ner-in-solr/). where I
> am
> using stanford ner to identify name entities. I am using following
> requesthandler
>
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
> 
>  data-import.xml
>  
> 
>
> for importing data from mysql and
>
> 
>   
>
>  
>content
>  
>
>
>
>  
>  
>
>  mychain
>
>   
>
> for identifying name entities.NER request handler identifies name entities
> from content field, but store extracted entities in solr fields.
>
> NER request handler was working when I am using nutch with solr. But When I
> am importing data from mysql, ner request handler is not invoked. So
> entities are not stored in solr for imported documents. Can anybody tell me
> how to call custom request handler in data import handler.
>
> Otherwise if I can invoke ner request handler externally, so that it can
> index person, organization and location in solr for imported document. It
> is also fine. Any suggestion are welcome.
>
> Thanks
> Vineet Yadav
>


Re: Calling custom request handler with data import

2015-01-30 Thread Dan Davis
You know, another thing you can do is just write some Java/Perl/whatever
to pull data out of your database and push it to Solr. Not as convenient
for development, perhaps, but it has more legs in the long run: the Data
Import Handler does not easily multi-thread.
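A skeleton of the kind of hand-rolled, multi-threaded loader this suggests (the batch source and the Solr push are stubbed out; a real loader would page through the database and build SolrInputDocuments for SolrJ's add()):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Skeleton of a hand-rolled multi-threaded loader: worker threads each
// take a batch of rows and push it to Solr. The Solr call is stubbed.
public class ParallelLoaderSketch {
    static final AtomicInteger indexed = new AtomicInteger();

    // Placeholder: a real loader would build SolrInputDocuments from the
    // batch and hand them to a shared SolrJ client here.
    static void indexBatch(List<String> batch) {
        indexed.addAndGet(batch.size());
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Placeholder batches; a real loader would page through the DB.
        for (int b = 0; b < 10; b++) {
            List<String> batch = new ArrayList<>();
            for (int i = 0; i < 100; i++) batch.add("row-" + b + "-" + i);
            pool.submit(() -> indexBatch(batch));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println(indexed.get());  // prints 1000
    }
}
```

The point of the pool is that several batches are in flight at once, which the Data Import Handler does not give you out of the box.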

On Sat, Jan 31, 2015 at 12:34 AM, Dan Davis  wrote:

> The Data Import Handler isn't pushing data into the /update request
> handler. However, the Data Import Handler can be extended with
> transformers; two such transformers are the TemplateTransformer and the
> ScriptTransformer. It may be possible to get a script function to load
> your custom Java code. You could also just write a
> StanfordNerTransformer.
>
> Hope this helps,
>
> Dan
>
> On Fri, Jan 30, 2015 at 9:07 AM, vineet yadav wrote:
>
>> Hi,
>> I am using the Data Import Handler to import data from MySQL, and I want
>> to identify named entities in it, so I am following this example:
>> http://www.searchbox.com/named-entity-recognition-ner-in-solr/, which
>> uses Stanford NER to identify named entities. I am using the following
>> request handler
>>
>> <requestHandler name="/dataimport"
>>   class="org.apache.solr.handler.dataimport.DataImportHandler">
>>   <lst name="defaults">
>>     <str name="config">data-import.xml</str>
>>   </lst>
>> </requestHandler>
>>
>> for importing data from MySQL, and
>>
>> [solrconfig.xml snippet stripped by the mail archive: an update processor
>> chain named "mychain" that runs NER over the "content" field, wired into
>> the update handler's defaults as its update chain]
>>
>> for identifying named entities. The NER request handler identifies named
>> entities in the content field and stores the extracted entities in Solr
>> fields.
>>
>> The NER request handler worked when I was using Nutch with Solr, but when
>> I import data from MySQL the NER request handler is not invoked, so
>> entities are not stored in Solr for the imported documents. Can anybody
>> tell me how to call a custom request handler from the Data Import
>> Handler?
>>
>> Alternatively, if I can invoke the NER request handler externally so that
>> it indexes person, organization and location for the imported documents,
>> that is also fine. Any suggestions are welcome.
>>
>> Thanks
>> Vineet Yadav
>>
>
>


role of the wiki and cwiki

2015-01-30 Thread Dan Davis
I've been thinking of https://wiki.apache.org/solr/ as the "Old Wiki" and
https://cwiki.apache.org/confluence/display/solr as the "New Wiki".

I guess that's the wrong way to think about it - Confluence is being used
for the "Solr Reference Guide", and MoinMoin is being used as a wiki.

Is this the correct understanding?


Re: New UI for SOLR-based projects

2015-01-30 Thread Lukáš Vlček
Nice work Roman!

Lukas

On Sat, Jan 31, 2015 at 4:36 AM, Roman Chyla  wrote:

> I gather from your comment that I should update readme, because there could
> be people who would be inclined to use bumblebee development server in
> production: Beware those who enter through this gate! :-)
>
> Your point, that so far you haven't seen anybody share their middle layer
> can be addressed by pointing to the following projects:
>
> https://github.com/adsabs/solr-service
> https://github.com/adsabs/adsws
>
> These are also open source; we use them in production, and they have
> OAuth, microservices, REST, and rate limits. We know it is not perfect,
> but what is? ;-) Pull requests welcome!
>
> Thanks,
>
> Roman
> On 30 Jan 2015 21:51, "Shawn Heisey"  wrote:
>
> > On 1/30/2015 1:07 PM, Roman Chyla wrote:
> > > There exists a new open-source implementation of a search interface for
> > > SOLR. It is written in Javascript (using Backbone), currently in
> version
> > > v1.0.19 - but new features are constantly coming. Rather than
> describing
> > it
> > > in words, please see it in action for yourself at
> http://ui.adslabs.org
> > -
> > > I'd recommend exploring facets, the query form, and visualizations.
> > >
> > > The code lives at: http://github.com/adsabs/bumblebee
> >
> > I have no wish to trivialize the work you've done.  I haven't looked
> > into the code, but a high-level glance at the documentation suggests
> > that you've put a lot of work into it.
> >
> > I do however have a strong caveat for your users.  I'm the guy holding
> > the big sign that says "the end is near" to anyone who will listen!
> >
> > By itself, this is an awesome tool for prototyping, but without some
> > additional expertise and work, there are severe security implications.
> >
> > If this gets used for a public Internet facing service, the Solr server
> > must be accessible from the end user's machine, which might mean that it
> > must be available to the entire Internet.
> >
> > If the Solr server is not sitting behind some kind of intelligent proxy
> > that can detect and deny attempts to access certain parts of the Solr
> > API, then Solr will be wide open to attack.  A knowledgeable user that
> > has unfiltered access to a Solr server will be able to completely delete
> > the index, change any piece of information in the index, or send denial
> > of service queries that will make it unable to respond to legitimate
> > traffic.
> >
> > Setting up such a proxy is not a trivial task.  I know that some people
> > have done it, but so far I have not seen anyone share those
> > configurations.  Even with such a proxy, it might still be possible to
> > easily send denial of service queries.
> >
> > I cannot find any information in your README or the documentation links
> > that mentions any of these concerns.  I suspect that many who
> > incorporate this client into their websites will be unaware that their
> > setup may be insecure, or how to protect it.
> >
> > Thanks,
> > Shawn
> >
> >
>
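The front line of the proxy Shawn describes can be small. A hedged nginx sketch (the port and core name are assumptions; it whitelists only /select and refuses everything else at the proxy):

```nginx
# Pass only read-side search traffic through to Solr; refuse everything
# else (/update, /admin/*, replication handlers, ...) at the proxy.
location /solr/mycore/select {
    proxy_pass http://127.0.0.1:8983;
}
location /solr/ {
    return 403;
}
```

Even then, on older Solr versions a /select request can re-route itself to another handler via the qt parameter, and a single huge-rows or deep-paging query can still act as denial of service, so this is a floor rather than a fix.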


Re: role of the wiki and cwiki

2015-01-30 Thread Anshum Gupta
Hi Dan,

I would say that the wiki is old and dated, and that gap is only growing.
I would highly recommend that everyone use the Reference Guide instead of
the wiki, unless there's something they can't find there. If you can't
find something in the guide, it'd be good to comment on Confluence about
the missing content; better still, contribute :-).

Now, about the Reference Guide: the link you've shared above is always
the next, unreleased version of the guide. Right now, for example, all
the content there is for 5.0. The best way to use the Reference Guide is
to download the guide for the version you're using.


On Fri, Jan 30, 2015 at 9:59 PM, Dan Davis  wrote:

> I've been thinking of https://wiki.apache.org/solr/ as the "Old Wiki" and
> https://cwiki.apache.org/confluence/display/solr as the "New Wiki".
>
> I guess that's the wrong way to think about it - Confluence is being used
> for the "Solr Reference Guide", and MoinMoin is being used as a wiki.
>
> Is this the correct understanding?
>



-- 
Anshum Gupta
http://about.me/anshumgupta


Re: role of the wiki and cwiki

2015-01-30 Thread Shawn Heisey
On 1/30/2015 10:59 PM, Dan Davis wrote:
> I've been thinking of https://wiki.apache.org/solr/ as the "Old Wiki" and
> https://cwiki.apache.org/confluence/display/solr as the "New Wiki".
> 
> I guess that's the wrong way to think about it - Confluence is being used
> for the "Solr Reference Guide", and MoinMoin is being used as a wiki.
> 
> Is this the correct understanding?

Yes, your understanding is correct.

Because the Solr Reference Guide is released as official documentation
in PDF form shortly after each new minor Solr version, only committers
have the ability to edit the confluence wiki.  Anyone can comment on it,
so we do have a feedback mechanism.

Anyone can edit the MoinMoin wiki, after they ask for edit rights and
provide their username for the Solr portion of that wiki.  Asking for
edit permission is typically done via this mailing list or the IRC channel.

Because they have different potential authors, the two systems now serve
different purposes.

There are still some pages on the MoinMoin wiki that contain
documentation that should be in the reference guide, but isn't.

The MoinMoin wiki is still useful, as a place where users can collect
information that is useful to others, but doesn't qualify as official
documentation, or perhaps simply hasn't been verified.  I believe this
means that a lot of information which has been migrated into the
reference guide will eventually be removed from MoinMoin.

Thanks,
Shawn