Re: using HttpSolrServer with PoolingHttpClientConnectionManager

2017-03-01 Thread Renee Sun
Thank you Shawn! this is very helpful. Renee -- View this message in context: http://lucene.472066.n3.nabble.com/using-HttpSolrServer-with-PoolingHttpClientConnectionManager-tp4322905p4322972.html Sent from the Solr - User mailing list archive at Nabble.com.

using HttpSolrServer with PoolingHttpClientConnectionManager

2017-03-01 Thread Renee Sun
first of all I apologize for the length of this message ... there are few questions I would appreciate your help please: 1. originally I wanted to use solrj in my application layer (webapp deployed with tomcat), to query the solr server(s) with multi-cores, non-cloud setup. Since I need send back

is there a way to match related multivalued fields of different types

2017-02-08 Thread Renee Sun
Hi - I have a schema looks like: (text_nost and text_st are just defined field type without/with stopwords... irrelevant to the issues here) these 3 fields are parallel in means of their values. I want to be able to match these values and be able to search something like : give me all attach

Re: project related configsets need to be deployed in both data and solr install folders ?

2017-02-01 Thread Renee Sun
thanks for your time!

Re: project related configsets need to be deployed in both data and solr install folders ?

2017-02-01 Thread Renee Sun
Hi Chris, since I have been playing with this install, and I am not certain if I have unknowingly messed some other settings. I want to avoid put in a false Jira wasting your time. I wiped out everything on my solr box and did a fresh install of solr 6.4.0, made sure my config file set are place

Re: project related configsets need to be deployed in both data and solr install folders ?

2017-01-31 Thread Renee Sun
Thanks Erick! I looked at solr twiki though if configSetBaseDir is not set, the default should be SOLR_HOME/configsets: configSetBaseDir The directory under which configsets for solr cores can be found. Defaults to SOLR_HOME/configsets and I do have my solr started with : -Dsolr.solr.

project related configsets need to be deployed in both data and solr install folders ?

2017-01-30 Thread Renee Sun
Hi - We use separate solr install and data folders with a shared schema/config (configsets) in multi-cores setup, it seems the configsets need to be deployed in both places (we are running solr 6.4.0)? for example, solr is installed in /opt/solr, thus there is folder: /opt/solr/server/solr/con

Re: solr 5 leaving tomcat, will I be the only one fearing about this?

2016-10-10 Thread Renee Sun
Thanks John... yes that was the first idea came to our mind, but it will require doubling our servers (in replica data centers as well etc), definitely we can't afford the cost. We have thought of first establishing a small pool of 'hot' servers and use them to take incoming new index data using u

Re: solr 5 leaving tomcat, will I be the only one fearing about this?

2016-10-10 Thread Renee Sun
Shawn and Ari, the 3rd party jars are exactly just one of the concerns I have. We had more than just a multi-lingual integration, we have to integrate with many other 3rd party tools. We basically deploy all those jars into an 'external' lib extension path in production, then for each 3rd party too

Re: solr 5 leaving tomcat, will I be the only one fearing about this?

2016-10-07 Thread Renee Sun
I just read through the following link Shawn shared in his reply: https://wiki.apache.org/solr/WhyNoWar While the following statement is true: "Supporting a single set of binary bits is FAR easier than worrying about what kind of customized environment the user has chosen for their deployment

Re: solr 5 leaving tomcat, will I be the only one fearing about this?

2016-10-07 Thread Renee Sun
Thanks everyone, I think this is very helpful... I will post more specific questions once we start to get more familiar with solr 6.

Re: solr 5 leaving tomcat, will I be the only one fearing about this?

2016-10-07 Thread Renee Sun
Thanks ... but that is an extremely simplified situation. We are not just looking for Solr as a new tool to start using it. In our production, we have cloud based big data indexing using Solr for many years. We have developed lots business related logic/component deployed as webapps working seaml

solr 5 leaving tomcat, will I be the only one fearing about this?

2016-10-06 Thread Renee Sun
need some general advice please... our infra is built with multiple webapps with tomcat ... the scale layer is achieved on top of those webapps which work hand-in-hand with solr admin APIs / shard queries / commit or optimize / core management etc etc. While I have not had a chance to actually p

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
thanks Yonik... I bet with solr 3.5 we do not have JSON Facet API support yet ...

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
Also Yonik, out of curiosity... when I run stats on a large msg set (such as 200 million msgs), it tends to use a lot of memory, this should be expected, correct? if I were able to use !sum=true to only get sum, a clever algorithm should be able to tell if only the sum is required, it will avoid memory

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
now I think with solr 3.5 (that we are using), !sum=true (overwrite default) probably is not supported yet :-( thanks Renee

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
I did try single quote with backslash of the bang. also tried disable history chars... did not work for me. unfortunately, we are using solr 3.5, probably does not support json format?

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
thanks! but it is silly that I can't seem to escape the {!sum=true} properly to make it work in my curl :-( time curl -d 'q=*:*&rows=0&shards=solrhostname:8080/solr/413-1,anothersolrhost:8080/solr/413-2&stats=true&stats.field={!sum=true}myfieldname' http://localhost:8080/solr/413-1/select/? | xmll
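One way to sidestep shell quoting altogether is to percent-encode the parameters before they reach curl or an HTTP client. The following is only a minimal sketch of that idea using the JDK's `URLEncoder` (Java 10 or later is assumed for the `Charset` overload); the class and method names are made up for illustration, and it says nothing about whether a given Solr version actually honors `{!sum=true}`, which the rest of this thread suggests Solr 3.5 does not.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of SolrJ or any Solr distribution.
final class StatsSumQuery {

    /** Percent-encodes every key and value and joins the pairs with '&'. */
    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Mirrors the non-shard parameters of the curl command above for one field. */
    static Map<String, String> sumOnlyParams(String field) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "*:*");
        params.put("rows", "0");
        params.put("stats", "true");
        // The braces, bang and equals sign are the characters that need escaping.
        params.put("stats.field", "{!sum=true}" + field);
        return params;
    }

    public static void main(String[] args) {
        // Prints the encoded body that could be passed to curl -d or sent as a POST body.
        System.out.println(toQueryString(sumOnlyParams("myfieldname")));
    }
}
```

The encoded string contains no braces or bangs, so it can be pasted into a shell command without history expansion or quoting getting in the way.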

how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
Hi - I have been using stats to get the sum of an int field like: &stats=true&stats.field=my_field_name&rows=0 It works fine, but when the index has hundreds of millions of messages across sharded indices, it takes a long time. I noticed the 'stats' give out more information than I needed (just sum), I

Re: any easy way to find out when a core's index physical file has been last updated?

2015-09-04 Thread Renee Sun
Thanks a lot Shawn, for the details, it is very helpful !

Re: any easy way to find out when a core's index physical file has been last updated?

2015-09-04 Thread Renee Sun
Shawn, thanks so much, and this user forum is so helpful! I will start use autocommit with confidence it will greatly help reducing the false commit requests (a lot) from processes in our system. Regarding the solr version, it is actually a big problem we have to resolve sooner or later. When we

Re: any easy way to find out when a core's index physical file has been last updated?

2015-09-03 Thread Renee Sun
unfortunately we are still using solr 3.5 with lucene 2.9.3 :-( If we upgrade to solr 4.x it will require upgrade of lucene away from 2.x.x which will need re-index of all our data. With current measures, it might take about 8-9 for the data we have to be re-indexed, a big concern. so to understan

Re: any easy way to find out when a core's index physical file has been last updated?

2015-09-03 Thread Renee Sun
thank you! I will look into that. Also I came across autosoftcommit, it seems to be useful... we are still using solr 3.5, I hope autosoftcommit is included in solr 3.5...

Re: any easy way to find out when a core's index physical file has been last updated?

2015-09-03 Thread Renee Sun
Walter, thanks! I will do some tests using auto commit, I guess if there is requirement for console UI to make documents searchable in 10 minutes, we will need to use the autocommit with maxTime instead of maxDoc. I wonder if in case we need to do a 'force commit', the autocommit will not get in

Re: any easy way to find out when a core's index physical file has been last updated?

2015-09-03 Thread Renee Sun
this make sense now. Thanks! why I got on this idea is: In our system we have large customer base and lots of cores, each customer may have multiple cores. there are also a lot of processes running in our system processing the data for these customers, and once a while, they would ask a center p

Re: any easy way to find out when a core's index physical file has been last updated?

2015-09-03 Thread Renee Sun
[core]/index is a folder holding index files. But index files in that folder are not just being deleted or added, they are also being updated. on Linux file system, the folder's timestamp will only be updated if the files in it are being added or deleted, NOT updated. So if I check the index folde
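Since the directory's own timestamp does not move when an existing file is rewritten, one workaround is to look at the files themselves. Here is a rough sketch of that, using only `java.nio.file` (Java 11 or later assumed); the class name is hypothetical, and it is a filesystem-level check rather than anything Solr provides.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper; the index directory path is supplied by the caller.
final class IndexLastModified {

    /** Returns the newest last-modified time among regular files directly in dir, if any. */
    static Optional<FileTime> newestFileTime(Path dir) throws IOException {
        FileTime newest = null;
        try (Stream<Path> files = Files.list(dir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                try {
                    if (!Files.isRegularFile(p)) {
                        continue;
                    }
                    FileTime t = Files.getLastModifiedTime(p);
                    if (newest == null || t.compareTo(newest) > 0) {
                        newest = t;
                    }
                } catch (NoSuchFileException e) {
                    // A file can disappear while indexing or merging is in progress; skip it.
                }
            }
        }
        return Optional.ofNullable(newest);
    }

    public static void main(String[] args) throws IOException {
        Path indexDir = Path.of(args.length > 0 ? args[0] : ".");
        System.out.println(newestFileTime(indexDir).map(FileTime::toString).orElse("no files"));
    }
}
```

Taking the maximum over the files covers additions and in-place writes; deletions alone would still only show up on the directory's timestamp.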

Re: any easy way to find out when a core's index physical file has been last updated?

2015-09-03 Thread Renee Sun
hum... at beginning I also assumed segment index files will only be deleted or added, but not modified. But I did a test with heavy indexing on going, and observed the index file in [core]/index with a latest updated timestamp keep growing for about 7 minutes... not sure if the new write caused an

any easy way to find out when a core's index physical file has been last updated?

2015-09-03 Thread Renee Sun
I need to figure out when the last index activity on a core was. I can't use [corename]/index timestamp, because it only reflects the file deletion or addition, not file update. I am curious if there is any solr core admin RESTful api sort of thing I can use to get last modified timestamp on physica

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
thanks Shawn... on the other side, I have just created a thin layer webapp I deploy it with solr/tomcat. this webapp provides RESTful api allow all kind of clients in our system to call and request a commit on the certain core on that solr server. I put in with the idea to have a centre/final pla

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
Hi Erick... as Shawn pointed out... I am not using solrcloud, I am using a more complicated sharding scheme, home grown... thanks for your response :-) Renee

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
Hi Shawn, I think we have similar structure where we use frontier/back instead of hot/cold :-) so yes we will probably have to do the same. since we have large customers and some of them may have tera bytes data and end up with hundreds of cold cores the blind delete broadcasting to all of th

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
Shawn, thanks for the reply. I have a sharded index. When I re-index a document (vs new index, which is different process), I need to delete the old one first to avoid dup. We all know that if there is only one core, the newly added document will replace the old one, but with multiple core indexes

is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
I run this curl trying to delete some messages : curl 'http://localhost:8080/solr/mycore/update?commit=true&stream.body=abacd' | xmllint --format - or curl 'http://localhost:8080/solr/mycore/update?commit=true&stream.body=myfield:mycriteria' | xmllint --format - the results I got is like: %

Re: partial optimize does not reduce the segment number to maxNumSegments

2011-04-15 Thread Renee Sun
sorry I should elaborate that earlier... in our production environment, we have multiple cores and the ingest continuously all day long; we only do optimize periodically, and optimize once a day in mid night. So sometimes we could see 'too many open files' error. To prevent it from happening, in

Re: partial optimize does not reduce the segment number to maxNumSegments

2011-04-15 Thread Renee Sun
yeah, I can figure out the segment number by going to stat page of solr... but my question was how to figure out exact total number of files in 'index' folder for each core. Like I mentioned in previous message, I currently have 8 files per segment (.prx .tii etc), but it seems this might change i

Re: partial optimize does not reduce the segment number to maxNumSegments

2011-04-15 Thread Renee Sun
thanks! It seems the file count in index directory is the segment# * 8 in my dev environment... I see there are .fnm .frq .fdt .fdx .nrm .prx .tii .tis (8) file extensions, and each has as many as segment# files. Is it always safe to calculate the file counts using segment number multiply by 8?

Re: partial optimize does not reduce the segment number to maxNumSegments

2011-04-12 Thread Renee Sun
ok I dug more into this and realize the file extensions can vary depending on schema, right? for instance we dont have *.tvx, *.tvd, *.tvf (not using term vector)... and I suspect the file extensions may change with future lucene releases? now it seems we can't just count the file using any formul
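If the per-segment file count cannot be trusted as a fixed multiplier, counting what is actually on disk avoids the formula entirely. The sketch below groups the files in a directory by extension using `java.nio.file` only (Java 11 or later assumed); the class name is invented for illustration and nothing here depends on Lucene's file naming beyond "name, dot, extension".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper; works on any directory, not only a Solr index folder.
final class IndexFileCounter {

    /** Counts regular files directly in dir, keyed by extension (e.g. ".prx"). */
    static Map<String, Integer> countByExtension(Path dir) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (Stream<Path> files = Files.list(dir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                if (!Files.isRegularFile(p)) {
                    continue;
                }
                String name = p.getFileName().toString();
                int dot = name.lastIndexOf('.');
                // Files without a dot are grouped under "(none)".
                String ext = dot >= 0 ? name.substring(dot) : "(none)";
                counts.merge(ext, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Total number of files across all extensions. */
    static int total(Map<String, Integer> counts) {
        return counts.values().stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> counts = countByExtension(Path.of(args.length > 0 ? args[0] : "."));
        System.out.println(counts + " total=" + total(counts));
    }
}
```

The total is the number to compare against the open-file limit, and the per-extension breakdown shows whether the "8 files per segment" observation holds for a particular core.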

Re: partial optimize does not reduce the segment number to maxNumSegments

2011-04-12 Thread Renee Sun
Hi Hoss, thanks for your response... you are right I got a typo in my question, but I did use maxSegments, and here is the exactly url I used: curl 'http://localhost:8080/solr/97/update?optimize=true&maxSegments=10&waitFlush=true' I used jconsole and du -sk to monitor each partial optimize, and

partial optimize does not reduce the segment number to maxNumSegments

2011-03-15 Thread Renee Sun
I have a core with 120+ segment files and I tried partial optimize specify maxNumSegments=10, after the optimize the segment files reduced to 64 files; I did the same optimize again, it reduced to 30 something; this keeps going and eventually it drops to teen number. I was expecting seeing the o

Re: Upgrade to Solr 1.4, very slow at start up when loading all cores

2010-10-13 Thread Renee Sun
just update on this issue... we turned off the new/first searchers (upgrade to Solr 1.4.1), and ran benchmark tests, there is no noticeable performance impact on the queries we perform comparing with Solr 1.3 benchmark tests WITH new/first searchers. Also the memory usage reduced by 5.5 GB after

Re: using HTTPClient sending solr ping request wont timeout as specified

2010-10-13 Thread Renee Sun
Ken, looks like we posted at same time :-) thanks very much! Renee

RE: using HTTPClient sending solr ping request wont timeout as specified

2010-10-13 Thread Renee Sun
thanks Michael, I got it resolved last night... you are right, it is more like a HttpClient issue after I tried another link unrelated to solr. If anyone is interested, here is the working code: HttpClientParams httpClientParams = new HttpClientParams(); httpClientParams.setSoTim
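The working code above is cut off in this archive, so it is not reproduced or completed here. For comparison, the same idea (a bounded wait on both connect and read) can be sketched with the JDK's own `HttpURLConnection` instead of Commons HttpClient. This is a different API from the one used in the thread; the class name, the default ping URL and the 2000 ms value are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper using java.net only; not the HttpClient code from the thread.
final class PingWithTimeout {

    /** Creates a GET connection with connect and read timeouts set; does not connect yet. */
    static HttpURLConnection prepare(String url, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(timeoutMillis); // limit on establishing the TCP connection
        conn.setReadTimeout(timeoutMillis);    // limit on waiting for response data
        return conn;
    }

    /** Returns true only if the server answers 200 within the timeout. */
    static boolean ping(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn = prepare(url, timeoutMillis);
            try {
                return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // Covers SocketTimeoutException as well as refused connections.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed URL; pass the real ping URL as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/solr/mycore/admin/ping";
        System.out.println(ping(url, 2000));
    }
}
```

Setting both timeouts matters: a connect timeout alone does not help when the server accepts the connection and then never responds.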

using HTTPClient sending solr ping request wont timeout as specified

2010-10-12 Thread Renee Sun
I am using the following code to send out solr request from a webapp. please notice the timeout setting: HttpClient client = new HttpClient(); HttpMethod method = new GetMethod(solrReq); method.getParams().setParameter(HttpConnectionParams.SO_TIMEOUT,

Re: using HTTPClient sending solr ping request wont timeout as specified

2010-10-12 Thread Renee Sun
I also added the following timeout for the connection, still not working: client.getParams().setSoTimeout(httpClientPingTimeout); client.getParams().setConnectionManagerTimeout(httpClientPingTimeout);

Re: Upgrade to Solr 1.4, very slow at start up when loading all cores

2010-10-05 Thread Renee Sun
Hi Yonik, I tried the fix suggested in your comments (using "solr.TrieDateField" ), and it loaded up 130 cores in 1 minute, 1.3GB memory (a little more than 1GB when turning off static warm cache, and much less than 6.5GB when use 'solr.DateField'). Will this have any impact on first query or per

Re: Upgrade to Solr 1.4, very slow at start up when loading all cores

2010-10-01 Thread Renee Sun
http://lucene.472066.n3.nabble.com/file/n1617135/solrconfig.xml solrconfig.xml Hi Yonik, I have uploaded our solrconfig.xml file for your reference. we also tried 1.4.1, for same index data, it took about 30-55 minutes to load up all 130 cores, it did not help at all. There is no query running

Re: Upgrade to Solr 1.4, very slow at start up when loading all cores

2010-10-01 Thread Renee Sun
Hi Yonik, I attached the solrconfig.xml to you in previous post, and we do have firstSearch and newSearch hook ups. I commented them out, all 130 cores loaded up in 1 minute, same as in solr 1.3. total memory took about 1GB. Whereas in 1.3, with hook ups, it took about 6.5GB for same amount of

Re: Upgrade to Solr 1.4, very slow at start up when loading all cores

2010-09-30 Thread Renee Sun
Hi Yonik, thanks for your reply. I entered a bug for this at : https://issues.apache.org/jira/browse/SOLR-2138 to answer your questions here: - do you have any warming queries configured? > no, all autowarmingcount are set to 0 for all caches - do the cores have documents already, and i

Upgrade to Solr 1.4, very slow at start up when loading all cores

2010-09-30 Thread Renee Sun
Hi - I posted this problem but no response, I guess I need to post this in the Solr-User forum. Hopefully you will help me on this. We were running Solr 1.3 for long time, with 130 cores. Just upgrade to Solr 1.4, then when we start the Solr, it took about 45 minutes. The catalina.log shows Solr