Re: shard splitting

2013-06-10 Thread Shalin Shekhar Mangar
Hi Ming, Yes, that's exactly what I meant. Referring to your last email about SolrEntityProcessor -- If you're trying to migrate from a 3.x installation to SolrCloud, then I think that you should create a SolrCloud installation with numShards=1 and copy over your previous (3.x) index. Then you can

Re: reg: efficient querying using solr

2013-06-10 Thread gururaj kosuru
How can one calculate an ideal max shard size for a solr core instance if I am running a cloud with multiple systems of 4GB? Thanks On 11 June 2013 11:18, Walter Underwood wrote: > An index does not need to fit into the heap. But a 4GB machine is almost > certainly too small to run Solr with 4

Re: reg: efficient querying using solr

2013-06-10 Thread Walter Underwood
An index does not need to fit into the heap. But a 4GB machine is almost certainly too small to run Solr with 40 million documents. wunder On Jun 10, 2013, at 10:36 PM, gururaj kosuru wrote: > Hi Walter, > thanks for replying. Do you mean that it is necessary for > the index to

Re: reg: efficient querying using solr

2013-06-10 Thread gururaj kosuru
Hi Walter, thanks for replying. Do you mean that it is necessary for the index to fit into the heap? if so, will a heap size that is greater than the actual RAM size slow down the queries? Thanks, Gururaj On 11 June 2013 10:36, Walter Underwood wrote: > 2GB is a rather small h

Re: shard splitting

2013-06-10 Thread Mingfeng Yang
Hi Shalin, Do you mean that we can do 1->2, 2->4, 4->8 to get 8 shards eventually? After splitting, if we want to set up a solrcloud with all 8 shards, how shall we allocate the shards then? Thanks, Ming- On Mon, Jun 10, 2013 at 9:55 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote:

Re: reg: efficient querying using solr

2013-06-10 Thread Walter Underwood
2GB is a rather small heap. Our production systems run with 8GB and smaller indexes than that. Our dev and test systems run with 6GB heaps. wunder On Jun 10, 2013, at 9:52 PM, gururaj kosuru wrote: > Hello, > I have recently started using solr 3.4 and have a standalone > system deployed

Re: shard splitting

2013-06-10 Thread Shalin Shekhar Mangar
No, it is hard coded to split into two shards only. You can call it recursively on a sub shard to split into more pieces. Please note that some serious bugs were found in that command which will be fixed in the next (4.3.1) release of Solr. On Tue, Jun 11, 2013 at 9:43 AM, Mingfeng Yang wrote: >
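
A rough sketch of what the recursive approach above implies, assuming every existing shard is split once per round (1->2, 2->4, 4->8). The helper name split_plan is hypothetical and not part of Solr; it only counts rounds and SPLITSHARD calls, it does not talk to a server.

```python
def split_plan(target_shards):
    """Return (rounds, calls) needed to grow from 1 shard to target_shards
    when each call splits exactly one shard into two."""
    if target_shards < 1 or target_shards & (target_shards - 1):
        raise ValueError("target_shards must be a power of two")
    rounds = 0
    calls = 0
    shards = 1
    while shards < target_shards:
        calls += shards  # one split call per shard that exists in this round
        shards *= 2
        rounds += 1
    return rounds, calls


rounds, calls = split_plan(8)
print(rounds, calls)
```

Under these assumptions, reaching 8 shards takes three rounds and seven individual split calls.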

reg: efficient querying using solr

2013-06-10 Thread gururaj kosuru
Hello, I have recently started using solr 3.4 and have a standalone system deployed that has 40,000,000 data rows with 3 indexed fields totalling around 9 GB. I have given a Heap size of 2GB and I run the instance on Tomcat on an i7 system with 4 GB RAM. My queries involve searching among

shard splitting

2013-06-10 Thread Mingfeng Yang
From the solr wiki, I saw this command ( http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=&shard=shardId) which splits one index into 2 shards. However, is there some way to split into more shards? Thanks, Ming-

Re: Field Names

2013-06-10 Thread Jack Krupansky
One idea: DON'T DO IT! Seriously, if you find yourself trying to "play games" with field names, it says that you probably have a data model that is grossly out of line with the strengths (and weaknesses) of Solr. Dynamic fields are fine - when used in moderation, but not when pushed to e

Re: Lucene/Solr Filesystem tunings

2013-06-10 Thread Ryan Zezeski
Just to add to the pile...use the Deadline or NOOP I/O scheduler. -Z On Sat, Jun 8, 2013 at 4:40 PM, Mark Miller wrote: > Turning swappiness down to 0 can have some decent performance impact. > > - http://en.wikipedia.org/wiki/Swappiness > > In the past, I've seen better performance with ext3

Re: Field Names

2013-06-10 Thread Gora Mohanty
On 11 June 2013 07:24, PeriS wrote: > > I was wondering if there was a way to define field names that are more or less > dynamic in nature but follow a regular expression pattern. I know you can > have an asterisk either as a prefix or a suffix but not both or somewhere in the > middle of a name. >

Field Names

2013-06-10 Thread PeriS
I was wondering if there was a way to define field names that are more or less dynamic in nature but follow a regular expression pattern. I know you can have an asterisk either as a prefix or a suffix but not both or somewhere in the middle of a name. Goal: to define a field that takes up the form l

Re: index merge question

2013-06-10 Thread Jamie Johnson
Thanks Mark. My question is stemming from the new Cloudera Search stuff. My concern is that if, while rebuilding the index, someone updates a doc, that update could be lost from a solr perspective. I guess what would need to happen to ensure the correct information was indexed would be to record th

Re: external zookeeper with SolrCloud

2013-06-10 Thread Mark Miller
This might be https://issues.apache.org/jira/browse/SOLR-4899 - Mark On Jun 10, 2013, at 5:59 PM, "Joshi, Shital" wrote: > Hi, > > > > We're setting up 5 shard SolrCloud with external zoo keeper. When we bring up > Solr nodes while the zookeeper instance is not up and running, we see this

Hoss commented on http://cwiki.apache.org/SOLR/test-page-1.html

2013-06-10 Thread no-reply
Hello, Hoss has commented on http://cwiki.apache.org/SOLR/test-page-1.html. You can find the comment here: http://cwiki.apache.org/SOLR/test-page-1.html#comment_1359 Please note that if the comment contains a hyperlink, it must be approved before it is shown on the site. Below is the reply that w

Re: Query-node+shard stickiness?

2013-06-10 Thread Otis Gospodnetic
Actually, it doesn't really have to be messy. Couldn't one have a custom handler that knows how to compute a query hash and map it to a specific node? When the same query comes in again, the same computation will be done and the same node will be selected to execute the query. No need for any nodes
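
A minimal sketch of the query-hash idea above, assuming a fixed list of nodes known to the caller. The function name pick_node and the host names are hypothetical; this is not an existing Solr handler, only an illustration of mapping the same query to the same node.

```python
import hashlib


def pick_node(query, nodes):
    """Deterministically map a query string to one node in the list."""
    # Normalize only for routing purposes; the query sent to Solr is unchanged.
    normalized = " ".join(query.lower().split())
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


nodes = ["solr1:8983", "solr2:8983", "solr3:8983"]  # hypothetical hosts
print(pick_node("title:solr", nodes) == pick_node("title:solr", nodes))
```

Note that plain modulo hashing remaps most queries whenever the node list changes; a consistent-hashing scheme would be one way to soften that.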

external zookeeper with SolrCloud

2013-06-10 Thread Joshi, Shital
Hi, We're setting up 5 shard SolrCloud with external zoo keeper. When we bring up Solr nodes while the zookeeper instance is not up and running, we see this error in Solr logs. java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

RE: Solr developer IRC channel

2013-06-10 Thread Vaillancourt, Tim
I agree with Yonik. It is great to see an IRC for Solr! Tim -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Monday, June 10, 2013 2:46 PM To: solr-user@lucene.apache.org Subject: Re: Solr developer IRC channel On Mon, Jun 10, 2013 at

Re: Dataless nodes in SolrCloud?

2013-06-10 Thread Tim Vaillancourt
To answer Otis' question of whether or not this would be useful, the trouble is, I don't know! :) It very well could be useful for my use case. Is there any way to determine the impact of result merging (time spent? Etc?) aside from just 'trying it'? Cheers, Tim On 10 June 2013 14:48, Otis Gos

Re: Dataless nodes in SolrCloud?

2013-06-10 Thread Otis Gospodnetic
I think it would be useful. I know people using ElasticSearch use it relatively often. > Is aggregation expensive enough to warrant a separate box? I think it can get expensive if X in rows=X is highish. We've seen this reported here on the Solr ML before So to make sorting/merging of N re

Re: Solr developer IRC channel

2013-06-10 Thread Yonik Seeley
On Mon, Jun 10, 2013 at 5:32 PM, Otis Gospodnetic wrote: > Mucho good! +1 > Why unlogged though? Just curious. Personal preference give it a more informal / slightly more private feel. Some people don't want casual watercooler chat recorded & publicized forever. -Yonik http://lucidworks.com

Re: Query-node+shard stickiness?

2013-06-10 Thread Otis Gospodnetic
Yeah, that sounds complique and messy. Just the other day I was looking at performance metrics for a customer using a master-slave setup and this sort of query->slave mapping behind the load balancer. After switching from such a setup to a round-robin setup their performance noticeably suffered... O

Re: Most common query

2013-06-10 Thread Otis Gospodnetic
Reply to an ancient email... On Thu, Feb 14, 2013 at 7:49 AM, Ahmet Arslan wrote: > Hi, > > If I am not mistaken I saw some open jira to collect queries and calculate > popular searches etc. > > Some commercial solutions exist: > > http://sematext.com/search-analytics/index.html The above is ac

Re: How to Reach LukeRequestHandler From Solrj?

2013-06-10 Thread bbarani
Try the below code.. query.setQueryType("/admin/luke"); QueryResponse rsp = server.query( query,METHOD.GET ); System.out.println(rsp.getResponse()); -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-Reach-LukeRequestHandl

Re: Solr developer IRC channel

2013-06-10 Thread Otis Gospodnetic
Mucho good! +1 Why unlogged though? Just curious. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Mon, Jun 10, 2013 at 1:52 PM, Yonik Seeley wrote: > FYI, I've created a #solr-dev IRC channel for those who contribute to > Solr's development. > > There used to be more of a "comm

Re: Curious why Solr Jetty URL has a # sign?

2013-06-10 Thread O. Olson
Thank you Alex for the explanation. I was not aware of single page application design. After a bit of google, it seems to be more popular than I expected. O. O. Alexandre Rafalovitch wrote > The # part is JavaScript URL. It is not seen by the server. It is part > of a standard single-page-applic

Re: How to check does index needs optimize or not?

2013-06-10 Thread Otis Gospodnetic
Here is one way to tell if the index is optimized. Look at this graph for example: https://apps.sematext.com/spm/s/Dxn6SHjSLB See the purple line labeled "delta"? If it's not 0 it means your index has deletions. This index has over 100K deleted docs that have not been expunged. That's because
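
For illustration only, and assuming the "delta" in the message above corresponds to the difference between Max Doc and Num Docs as shown on the Solr admin page, the number of deleted-but-not-expunged documents can be sketched like this (deleted_docs is a hypothetical helper name):

```python
def deleted_docs(max_doc, num_docs):
    """Deleted documents still held in the index: maxDoc minus numDocs."""
    if num_docs > max_doc:
        raise ValueError("num_docs cannot exceed max_doc")
    return max_doc - num_docs


print(deleted_docs(42, 42))
```

A result of 0 means the index currently holds no deletions.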

Re: SOLR 4.3.0 synonym filter - parse error - SOLR 4.3.0

2013-06-10 Thread bbarani
Thanks a lot for your response Jack. I figured out that issue, this file is currently generated by a perl program and seems like a bug in that program. Thanks anyways -- View this message in context: http://lucene.472066.n3.nabble.com/Re-SOLR-4-3-0-synonym-filter-parse-error-SOLR-4-3-0-tp40

Re: SOLR 4.3.0 synonym filter - parse error - SOLR 4.3.0

2013-06-10 Thread Jack Krupansky
You just have to look further down the stack trace "cause" chain to find it: Caused by: java.text.ParseException: Invalid synonym rule at line 10 at org.apache.lucene.analysis.synonym.SolrSynonymParser.add(SolrSynonymParser.java:72) at org.apache.lucene.analysis.synonym.FSTSynonym

Re: Curious why Solr Jetty URL has a # sign?

2013-06-10 Thread Alexandre Rafalovitch
The # part is JavaScript URL. It is not seen by the server. It is part of a standard single-page-application design approach. So, it is not visible to Jetty rules, etc. If you don't have a problem here, I would suggest just taking this part on faith and continue to other parts of Solr Regards

Re: Curious why Solr Jetty URL has a # sign?

2013-06-10 Thread O. Olson
Thank you Chris. No, I do not have an XY Problem. I am new to Solr, Jetty and related technology and was playing. I did not like the /#/ in the URL and felt that it had no purpose. So, if I understand this correctly is Solr using the # as a JQuery hook to decide which view to show? Am I co

Re: Solr 4.3 - Schema Parsing Failed: Invalid field property: compressed

2013-06-10 Thread Shalin Shekhar Mangar
That is because starting with 4.3, Solr started throwing errors if the schema had an illegal field parameter. See https://issues.apache.org/jira/browse/SOLR-4641 On Tue, Jun 11, 2013 at 12:05 AM, Uomesh wrote: > I am upgrading from Solr 4.2 to 4.3. Till 4.2 i was not seeing any error. > > Than

Re: Configuring lucene to suggest the indexed string for all the searches of the substring of the indexed string

2013-06-10 Thread Prathik Puthran
Our dictionary has very few words. So it is more of a feature to the user than a nuisance. Thanks, Prathik On Mon, Jun 10, 2013 at 10:52 PM, Walter Underwood wrote: > Why do you think that is useful? That will give terrible search results. > > Here are the first twenty words in /usr/share/d

Re: Hoss commented on https://people.apache.org/~hossman/comments.html

2013-06-10 Thread Chris Hostetter
FYI: this is a test as part of SOLR-4889 to get commenting enabled on the (hopefully soon to launch) new Solr Reference Guide wiki. the comments.apache.org system is currently setup to notify solr-user when new comments are posted. (For now i'm testing on people.apache.org because it's a bi

Re: How to check does index needs optimize or not?

2013-06-10 Thread Shawn Heisey
On 6/10/2013 12:31 PM, Cosimo Streppone wrote: On 10/6/2013 19:15, Shawn Heisey wrote: I really liked the LHC page. :) Michael is correct here. If you look through that JIRA, you'll see that there are still very valid reasons for doing an optimize, but the age-old reason of "improving performa

SOLR 4.3.0 synonym filter - parse error - SOLR 4.3.0

2013-06-10 Thread bbarani
For some reason I am getting the below error when parsing synonyms using synonyms file. Synonyms File: http://www.pastebin.ca/2395108 The server encountered an internal error ({msg=SolrCore 'solr' is not available due to init failure: java.io.IOException: Error parsing synonyms file:,trace=org.a

Hoss commented on https://people.apache.org/~hossman/comments.html

2013-06-10 Thread no-reply
Hello, Hoss has commented on https://people.apache.org/~hossman/comments.html. You can find the comment here: https://people.apache.org/~hossman/comments.html#comment_1351 Please note that if the comment contains a hyperlink, it must be approved before it is shown on the site. Below is the reply

Re: Solr 4.3 - Schema Parsing Failed: Invalid field property: compressed

2013-06-10 Thread Uomesh
I am upgrading from Solr 4.2 to 4.3. Till 4.2 i was not seeing any error. Thanks, Umesh On Mon, Jun 10, 2013 at 2:41 AM, André Widhani [via Lucene] < ml-node+s472066n4069276...@n3.nabble.com> wrote: > From what version are you upgrading? The compressed attribute is > unsupported since the 3.x r

Re: does solr support query time only stopwords?

2013-06-10 Thread jchen2000
Thanks to you all and finally it seems that I figured out a workaround. Yes I used edismax, but my test query was very simple, it only queries one field and uses only one stopword. So i see no chance it would hit another field (but datastax might have done something we don't know). &debug didn't

Re: How to check does index needs optimize or not?

2013-06-10 Thread Cosimo Streppone
On 10/6/2013 19:15, Shawn Heisey wrote: On 6/10/2013 10:18 AM, Michael Della Bitta wrote: Hi all, first post to this really useful list. My experience with Solr (4.0) started just a few months ago. I had no prior exposure to Solr 3.x. That was my flip way of saying that "Optimize" is a high

Re: SolrEntityProcessor gets slower and slower

2013-06-10 Thread Shalin Shekhar Mangar
SolrEntityProcessor is fine for small amounts of data but not useful for such a large index. The problem is that deep paging in search results is expensive. As the "start" value for a query increases so does the cost of the query. You are much better off just re-indexing the data. On Mon, Jun 10,
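
To make the deep-paging cost above concrete, here is a simplified model. It assumes each request with a given start and rows has to collect start + rows entries before returning the last rows of them; real costs depend on the index and on shard count. The function name docs_collected is hypothetical.

```python
def docs_collected(total_docs, rows):
    """Total entries collected while paging through total_docs in pages of
    `rows`, assuming each request collects start + rows entries."""
    total = 0
    start = 0
    while start < total_docs:
        total += min(start + rows, total_docs)
        start += rows
    return total


# Five pages of 2000 over 10,000 docs collect 2000 + 4000 + ... + 10000 entries.
print(docs_collected(10_000, 2_000))
```

In this model the work grows roughly with the square of the number of pages, which is why the import keeps slowing down as "start" increases.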

Solr developer IRC channel

2013-06-10 Thread Yonik Seeley
FYI, I've created a #solr-dev IRC channel for those who contribute to Solr's development. There used to be more of a "community" feel on some of the IRC channels that's since been lost, so I'm trying to get some of that back with a smaller subset of people interested in developing Solr. The channe

SolrEntityProcessor gets slower and slower

2013-06-10 Thread Mingfeng Yang
I am trying to migrate 100M documents from a solr index (v3.6) to a solrcloud index (v4.1, 4 shards) by using SolrEntityProcessor. My data-config.xml is like http://10.64.35.117:8995/solr/" query="*:*" rows="2000" fl= "author_class,authorlink,author_location_text,author_text,author,category,date,

Re: Curious why Solr Jetty URL has a # sign?

2013-06-10 Thread Chris Hostetter
: This may be a dumb question but I am curious why the sample Solr Jetty : results in a URL with a # sign e.g. http://localhost:8983/solr/#/~logging ? You're looking at the Solr UI which is a single page javascript/AJAX based system that uses url fragments (after the hash) to record state

RE: EmbeddedSolrServer reference

2013-06-10 Thread Alex Sarco
Michael, thank you for your answer. You mean using HttpCommonsSolrServer? I thought of that, but I don't see the point of going through the network when I'm running in the same JVM/box as the main Solr server. I still would like a solution to my issue, since so far EmbeddedSolrServer works fine

Re: EmbeddedSolrServer reference

2013-06-10 Thread Michael Della Bitta
Hi Alex, Why not just use two webapps and not use EmbeddedSolrServer, but do all your indexing as requests from your application to the Solr context next door? One advantage of doing it this way is that EmbeddedSolrServer has been deemphasized by the Solr team, so you might not get the maintenanc

Chinese to Pinyin transliteration : homophone matching

2013-06-10 Thread Catala, Francois
Hi, I've been looking for ways to do homophone matching in Solr for CJK languages. I am digging into Chinese for a start. My inputs are words made of simplified characters, and I need to match words that use different characters, but are pronounced the same way. My conclusion is that I need to

Curious why Solr Jetty URL has a # sign?

2013-06-10 Thread O. Olson
Hi, This may be a dumb question but I am curious why the sample Solr Jetty results in a URL with a # sign e.g. http://localhost:8983/solr/#/~logging ? Is there any way to get rid of it, so I could have something like: http://localhost:8983/solr/~logging ? Thank you, O. O. -- View th

Re: Configuring lucene to suggest the indexed string for all the searches of the substring of the indexed string

2013-06-10 Thread Walter Underwood
Why do you think that is useful? That will give terrible search results. Here are the first twenty words in /usr/share/dict/words that contain the substring "cat". abacate abdicate abdication abdicative abdicator aberuncator abjudicate abjudication acacatechin acacatechol acatalectic acatalepsi

EmbeddedSolrServer reference

2013-06-10 Thread Alex Sarco
Hi, I'm running Solr 4.3 embedded in Tomcat, so there's a Solr server starting when Tomcat starts. In the same webapp, I also have a process to recreate the Lucene index when Solr starts. To do this, I have a singleton instance of EmbeddedSolrServer provided by Spring. This same instance is als

Re: Solrj Stats encoding problem

2013-06-10 Thread ethereal
Yeah, that's right, I just set all the params in "q" param. Stupid mistake. Thanks, Chris. -- View this message in context: http://lucene.472066.n3.nabble.com/Solrj-Stats-encoding-problem-tp4068429p4069431.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to check does index needs optimize or not?

2013-06-10 Thread Shawn Heisey
On 6/10/2013 10:18 AM, Michael Della Bitta wrote: Hi Furkan, That was my flip way of saying that "Optimize" is a highly optional procedure that should not be undertaken under ordinary circumstances. Solr has no way of detecting whether this is necessary because in the vast majority of cases, i

Re: How to ignore folder collection1 when running single instance of SOLR?

2013-06-10 Thread bbarani
Not sure if this is the right way, I just moved solr.xml outside of the solr directory and made changes to solr.xml to make it point to the solr directory and it seems to work fine as before. Can someone confirm if this is the right way to configure when running a single instance of solr? --

Re: What to do with CloudSolrServer if Internal Ips are different at my SolrCloud?

2013-06-10 Thread Michael Della Bitta
No, it's a Solr instance config. Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions | g+: plus.google.com/appinions w: ap

Re: How to check does index needs optimize or not?

2013-06-10 Thread Michael Della Bitta
Hi Furkan, That was my flip way of saying that "Optimize" is a highly optional procedure that should not be undertaken under ordinary circumstances. Solr has no way of detecting whether this is necessary because in the vast majority of cases, it's not. If you do a search of this list, you'll fin

How to ignore folder collection1 when running single instance of SOLR?

2013-06-10 Thread bbarani
I am in process of migrating SOLR 3.x to 4.3.0. I am trying to figure out a way to run single instance of SOLR without modifying the directory structure. Is it mandatory to have a folder named collection1 in order for the new SOLR server to work? I see that by default it always searches the confi

Re: How to check does index needs optimize or not?

2013-06-10 Thread Furkan KAMACI
How Solr admin page understands it? 2013/6/10 Michael Della Bitta > I'm pretty sure you can just check this URL: > http://hasthelargehadroncolliderdestroyedtheworldyet.com/ > > ;) > > Michael Della Bitta > > Applications Developer > > o: +1 646 532 3062 | c: +1 917 477 7906 > > appinions inc. >

Re: What to do with CloudSolrServer if Internal Ips are different at my SolrCloud?

2013-06-10 Thread Furkan KAMACI
Can I do it with Solrj? 2013/6/10 Michael Della Bitta > You need to specify the public interface IP or hostname in the host > parameter in solr.xml: > > http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params > > Michael Della Bitta > > Applications Developer > > o: +1 646 532 3062 | c:

Re: Solr indexing slows down

2013-06-10 Thread Michael Della Bitta
Sorry, with the "paging through the results outside of Solr," I meant writing a test to see how long it takes to get through all the results in a test harness that doesn't use Solr. I agree with Shawn that you might need to do some JVM tuning to get things going quicker. You might want to try to m

Re: Facet count for "others" after facet.limit

2013-06-10 Thread Raheel Hasan
Yea, I just thought about the calculation from [total results - all facet results]... But I wish there was a simple "Others" option as well ... Thanks anyway for your help. On Mon, Jun 10, 2013 at 8:20 PM, Jack Krupansky wrote: > Not directly for a field facet. Range and date facets do have

Re: Facet count for "others" after facet.limit

2013-06-10 Thread Jack Krupansky
Not directly for a field facet. Range and date facets do have the concept of "other" to give you more details, but field facet doesn't have that. But, you can calculate that number easily - it is numFound minus the sum of the facet counts for the field, minus "missing". Still, I agree that it
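
The calculation described above can be sketched as follows. This assumes a single-valued field (for a multi-valued field one document can count toward several facet values, so the subtraction no longer holds), and the helper name others_count and the sample numbers are made up for illustration.

```python
def others_count(num_found, facet_counts, missing=0):
    """numFound minus the sum of the returned facet counts, minus 'missing'."""
    return num_found - sum(facet_counts.values()) - missing


shown = {"books": 40, "music": 30, "games": 10}  # hypothetical top facet values
print(others_count(100, shown, missing=5))
```

Here facet_counts holds only the values actually returned under facet.limit, so the result is the count that an "Others" bucket would carry.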

Re: Solr indexing slows down

2013-06-10 Thread Walter Underwood
8 million documents in two hours is over 1000/sec. That is a pretty fast indexing rate. It may be hard to go faster than that. wunder On Jun 10, 2013, at 7:12 AM, Shawn Heisey wrote: > On 6/10/2013 2:32 AM, Sebastian Steinfeld wrote: >> Hi Shawn, >> >> thank you for your answer. >> >> I am us
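
The arithmetic behind the rate quoted above, as a small sketch (docs_per_second is just an illustrative helper name):

```python
def docs_per_second(docs, hours):
    """Average indexing rate in documents per second."""
    return docs / (hours * 3600)


# 8,000,000 documents over 2 hours (7,200 seconds)
print(round(docs_per_second(8_000_000, 2)))
```

Two hours is 7,200 seconds, so 8 million documents works out to a little over 1,100 documents per second.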

Re: What to do with CloudSolrServer if Internal Ips are different at my SolrCloud?

2013-06-10 Thread Michael Della Bitta
You need to specify the public interface IP or hostname in the host parameter in solr.xml: http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Marketing” 1

Re: How to check does index needs optimize or not?

2013-06-10 Thread Michael Della Bitta
I'm pretty sure you can just check this URL: http://hasthelargehadroncolliderdestroyedtheworldyet.com/ ;) Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appin

Re: Solr 4.3.0 Cloud Issue indexing pdf documents

2013-06-10 Thread Michael Della Bitta
Glad that helped. I'm going to go buy a lottery ticket now! :) Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions | g+: p

Facet count for "others" after facet.limit

2013-06-10 Thread Raheel Hasan
Hi, Is there anyway to use facet such that the results shows "Others" (or any default value) and show all the others? For example: on category_code count 6 1 false This will show top 6 different products counts divided into the categories. However, there are say 20 different categories and I wa

Re: Adding pdf/word file using JSON/XML

2013-06-10 Thread Jack Krupansky
Sorry, but you are STILL not being clear! Are you asking if you can pass Solr parameters as XML fields? No. Are you asking if the file name and path can be indexed as metadata? To some degree: curl "http://localhost:8983/solr/update/extract?literal.id=doc-1\ &commit=true&uprefix=attr_" -F "He

Re: Dataless nodes in SolrCloud?

2013-06-10 Thread Shawn Heisey
On 6/10/2013 3:32 AM, Shalin Shekhar Mangar wrote: > No, there's no such notion in SolrCloud. Each node that is part of a > collection/shard is a replica and will handle indexing/querying. Even > though you can send a request to a node containing a different collection, > the request would just be

Re: AW: Solr indexing slows down

2013-06-10 Thread Shawn Heisey
On 6/10/2013 2:32 AM, Sebastian Steinfeld wrote: > Hi Shawn, > > thank you for your answer. > > I am using Oracle. This is the configuration I am using: > - > name="local" > driver="oracle.jdbc.driver.OracleDriver" > url="jdbc:oracle:thin:@localhost:1521:XE" > user="" > password=

Re: Adding pdf/word file using JSON/XML

2013-06-10 Thread Gora Mohanty
On 10 June 2013 18:53, Roland Everaert wrote: > Sorry if it was not clear. > > What I would like is to know how to construct an XML/JSON request that > provide any necessary information (supposedly the full path on disk) to > solr to retrieve and index a pdf/ms word document. > > So, an XML reques

How to check does index needs optimize or not?

2013-06-10 Thread Furkan KAMACI
On the admin page an optimize button appears if needed. Is it related to the "current" label? I mean, does current being true mean there is no need to optimize, and current being false mean the index needs to be optimized? If not, how can I check whether it needs an optimize or not from Solrj with CloudSolrServer?

How to Get Cloud Statistics and Why It is Permitted to Use CloudSolrServer and LukeRequest?

2013-06-10 Thread Furkan KAMACI
I have two shards. One of them has 46 documents other one has 42. My default core name is collection1. When I select a node from first shard I see that: Last Modified:about a minute ago Num Docs:42 Max Doc:42 Deleted Docs:0 Version:27 Segment Count:1 When I select a node from second shard I see

Re: Adding pdf/word file using JSON/XML

2013-06-10 Thread Roland Everaert
Sorry if it was not clear. What I would like is to know how to construct an XML/JSON request that provide any necessary information (supposedly the full path on disk) to solr to retrieve and index a pdf/ms word document. So, an XML request could look like this: doc10 BLAH /path/to/file.pdf

Re: translating a character code to an ordinal?

2013-06-10 Thread geeky2
i will try it out and let you know - -- View this message in context: http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4069339.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Adding pdf/word file using JSON/XML

2013-06-10 Thread Gora Mohanty
On 10 June 2013 17:47, Roland Everaert wrote: > Hi, > > Based on the wiki, below is an example of how I am currently adding a pdf > file with an extra field called name: > curl " > http://localhost:8080/solr/update/extract?literal.id=doc10&literal.name=BLAH&defaultField=text" > --data-binary @/pa

Re: translating a character code to an ordinal?

2013-06-10 Thread Erick Erickson
Hmmm, that may be a wrinkle. I'm actually not sure what'll happen if the _raw_ thing you copy to the int field is not an int (or whatever). You spoke of character code translation, so it may blow up. In which case I'd consider a custom update processor that read the source field, performed whatever

Re: translating a character code to an ordinal?

2013-06-10 Thread geeky2
i will try it. i guess i made a "poor" assumption that you would not get predictable results when copying a code like "mycode" to an int field where the desired end result in the int field is say, "1". i was worried that some sort of ascii conversion or "wrap around" would happen in the int

Re: solr facet query on multiple search term

2013-06-10 Thread Erick Erickson
There's nothing like that built in that I know of, the closest in concept is "pivot faceting" but that doesn't work in this case. Best Erick On Mon, Jun 10, 2013 at 2:13 AM, vrparekh wrote: > Thanks Erick, > > yes example url i provided is bit confusing, sorry for that. > > Actual requirement is

Adding pdf/word file using JSON/XML

2013-06-10 Thread Roland Everaert
Hi, Based on the wiki, below is an example of how I am currently adding a pdf file with an extra field called name: curl "http://localhost:8080/solr/update/extract?literal.id=doc10&literal.name=BLAH&defaultField=text" --data-binary @/path/to/file.pdf -H "Content-Type: application/pdf" Is it pos

Re: does solr support query time only stopwords?

2013-06-10 Thread Erick Erickson
My _guess_ is that you're perhaps using edismax or similar and getting matches from fields you don't expect on terms that are not stopwords. Try adding &debug=query and seeing what the parsed query actually is. And, of course, I have no idea what Datastax is doing. And, you have to at least r

Re: Query-node+shard stickiness?

2013-06-10 Thread Erick Erickson
Nothing I've seen. It would get really tricky though. Each node in the cluster would have to have a copy of all queries received by _any_ node which would result in all queries being sent to all nodes along with an indication of what node that query was actually supposed to be serviced by. And now

Re: Custom Data Clustering

2013-06-10 Thread Raheel Hasan
I wonder how to do that. Shouldn't this already be part of Solr? Also, I read on the Internet that it is possible to use Mahout and Solr for this purpose, so how to achieve that? On Sun, Jun 9, 2013 at 7:57 AM, Otis Gospodnetic wrote: > Hello, > > This sounds like a custom SearchCom

Re: translating a character code to an ordinal?

2013-06-10 Thread Erick Erickson
You can use copyField. All it does is send the raw data to the second field, the fact that they're different types is irrelevant. Why not just give it a try? Erick On Fri, Jun 7, 2013 at 8:08 PM, geeky2 wrote: > hello jack, > > thank you for the code ;) > > what "book" are you referring to? AF

What is directory and userdata at LukeRequest?

2013-06-10 Thread Furkan KAMACI
I have these lines of code: CloudSolrServer solrServer = SolrCloudServerFactory.getCloudSolrServer(); NamedList namedList = solrServer.request(new LukeRequest()); NamedList index = (NamedList) namedList.get("index"); System.out.println(index.get("directory")); System.out.println(index.get("userData

Admin Page Segment Count is Different than LukeRequest's Segment Count?

2013-06-10 Thread Furkan KAMACI
I have these lines of code: CloudSolrServer solrServer = SolrCloudServerFactory.getCloudSolrServer(); NamedList namedList = solrServer.request(new LukeRequest()); NamedList index = (NamedList) namedList.get("index"); System.out.println(index.get("segmentCount")); It prints 5 to standard output. Howev

What to do with CloudSolrServer if Internal Ips are different at my SolrCloud?

2013-06-10 Thread Furkan KAMACI
I want to use CloudSolrServer via Solrj at my application. However I get that error: org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://10.236.**.***:8983/solr/collection1, http://10.240.**.**:8983/solr/collection1 ... I think that probl

Re: Solr 4.3.0 Cloud Issue indexing pdf documents

2013-06-10 Thread Mark Wilson
Hi Michael Thanks very much for that, it did indeed solve the problem. I had it setup on my internal servers, as I have a separate script for tomcat startup, but forgot all about it on the Amazon Cloud servers. For info I added CATALINA_OPTS="-Djava.awt.headless=true" export CATALINA_OPTS to

Re: Dataless nodes in SolrCloud?

2013-06-10 Thread Shalin Shekhar Mangar
No, there's no such notion in SolrCloud. Each node that is part of a collection/shard is a replica and will handle indexing/querying. Even though you can send a request to a node containing a different collection, the request would just be forwarded to the right node and will be executed there. Th

Re: LotsOfCores feature

2013-06-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
Aleksey, It was a less than ideal situation, because we did not have a choice. We had external systems/scripts to manage this. A new custom implementation is being built on SolrCloud which would have taken care of most of those issues. SolrReplication is hidden once you move to cloud. But it wi

AW: Solr indexing slows down

2013-06-10 Thread Sebastian Steinfeld
Hi Shawn, thank you for your answer. I am using Oracle. This is the configuration I am using: - There are 12GB free memory on the server I hope this is enough. I will test the import with 4GB vm memory. Do you know if the "autocommit" inside solrconfig.xml configuration wo

How to Reach LukeRequestHandler From Solrj?

2013-06-10 Thread Furkan KAMACI
I want to get statistics from Solr via Solrj. I think that I should reach LukeRequestHandler (*if it is not, you can explain the proper way*.) I use Solr 4.2.1 and CloudSolrServer to reach Solr via Solrj. How can I do that? This URL's response has exactly what I want: :8983/solr/collection1/admin/

AW: Solr 4.3 - Schema Parsing Failed: Invalid field property: compressed

2013-06-10 Thread André Widhani
From what version are you upgrading? The compressed attribute is unsupported since the 3.x releases. The change log (CHANGES.txt) has a section "Upgrading from Solr 1.4" in the notes for Solr 3.1: "Field compression is no longer supported. Fields that were formerly compressed will be uncompr

Re: LIMIT on number of OR in fq

2013-06-10 Thread Raymond Wiker
A better option would be to use POST instead of GET. On Mon, Jun 10, 2013 at 8:50 AM, Aloke Ghoshal wrote: > True, the container's request header size limit must be the reason then. > Try: > > http://serverfault.com/questions/136249/how-do-we-increase-the-maximum-allowed-http-get-query-length-i

Re: Get Statistics With CloudSolrServer?

2013-06-10 Thread Furkan KAMACI
I think that it is related to LukeRequest 2013/6/10 Mark Miller > > On Jun 9, 2013, at 7:52 PM, Furkan KAMACI wrote: > > > There is a statistics section at admin page and gives information as > like: > > > > Last Modified, Num Docs, Max Doc and etc. How can I get such kind of > > information us

AW: Solr indexing slows down

2013-06-10 Thread Sebastian Steinfeld
Hi Michael, the database I am using is Oracle. That's right, I am selecting from a view. What do you mean by selecting from outside of solr? I thought the batchsize will do the pagination? The load of the database server is not increasing during the import. It seems that the database is doing n