optimization suggestions
Stats: default config for 4.3.1 on a high-memory AWS instance using Jetty. Two collections, each with fewer than 700k docs per collection. We seem to hit some performance lags when doing large commits.

Our front-end service allows customers to import data, which is stored in Mongo and then indexed in Solr. We index all of that data and do one big commit at the end rather than doing commits for each record along the way. Would it be better to use something like autoSoftCommit and just commit each record as it comes in? Or is the problem more about disk IO? Are there some other "low hanging fruit" things we should consider? The Solr dashboard shows that there is still plenty of free memory during these imports, so it isn't running out of memory and falling back to disk.

Thanks!
Eric
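One middle ground between a single giant end-of-import commit and committing every record is to batch the adds and let commitWithin (or autoSoftCommit in solrconfig.xml) fold them into periodic commits on Solr's own schedule. A rough sketch of the batching side, assuming JSON updates to /update; the batch size and interval here are made-up numbers, not recommendations:

```python
import json
from itertools import islice

def batched_updates(docs, batch_size=500, commit_within_ms=10000):
    """Yield (url_params, json_payload) pairs for Solr /update requests.

    Each batch carries a commitWithin deadline, so Solr can fold
    several batches into one commit instead of one huge commit at
    the very end of the import.
    """
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        params = {"commitWithin": str(commit_within_ms), "wt": "json"}
        yield params, json.dumps(batch)

# Example: 1200 docs become three batches of at most 500.
docs = [{"id": str(i), "name_s": f"record {i}"} for i in range(1200)]
batches = list(batched_updates(docs))
print(len(batches))  # 3
```

Each (params, payload) pair would then be POSTed to the collection's /update handler; the point is that no explicit commit call is needed at the end.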
queries including time zone
Can anybody provide any insight about using the tz param? It doesn't seem to be affecting date math and /day rounding. What format does the tz value need to be in? We're not finding any documentation on this. Sample query we're using:

path=/select params={tz=America/Chicago&sort=id+desc&start=0&q=application_id:51b30ed9bc571bd96773f09c+AND+object_key:object_26+AND+values_field_215_date:[*+TO+NOW/DAY%2B1DAY]&wt=json&rows=25}

Thanks!
Eric
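For what it's worth, the ref guide documents the parameter as TZ (uppercase), taking a TZ database identifier such as America/Chicago, and it only changes the timezone used for date-math rounding (NOW/DAY and friends), not how dates are parsed or returned. A rough sketch of my understanding of the rounding semantics in Python (an illustration, not Solr's actual code):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def round_down_to_day(now_utc, tz_name):
    """Mimic Solr's NOW/DAY rounding under a given TZ param:
    truncate to local midnight in tz_name, then express in UTC."""
    local = now_utc.astimezone(ZoneInfo(tz_name))
    local_midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return local_midnight.astimezone(timezone.utc)

# 03:30 UTC on Nov 15 is still 9:30 PM Nov 14 in Chicago (CST, UTC-6).
now = datetime(2013, 11, 15, 3, 30, tzinfo=timezone.utc)
print(round_down_to_day(now, "UTC"))              # 2013-11-15 00:00:00+00:00
print(round_down_to_day(now, "America/Chicago"))  # 2013-11-14 06:00:00+00:00
```

So with TZ=America/Chicago, NOW/DAY resolves to 06:00 UTC rather than 00:00 UTC, which shifts what [* TO NOW/DAY+1DAY] matches.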
Re: queries including time zone
We're still not seeing the proper result. I've included a gist of the query and its debug output. This was run on a clean index running 4.4.0 with just one document. That document has a date of 11/15/2013; in the TZ included in the query it is still the 14th, yet I still get that document returned. Hoping someone can help.

https://gist.github.com/anonymous/7478773

On Nov 14, 2013, at 3:06 PM, Chris Hostetter wrote:
>
> I've beefed up the ref guide page on dates to include more info about all
> of this...
>
> https://cwiki.apache.org/confluence/display/solr/Working+with+Dates
>
> -Hoss
Re: queries including time zone
Anybody have any additional suggestions for this TZ issue we're having? I've included the query params below; the full debug output is in the gist link if you want to see it. As mentioned, the test Solr installation has one document with a date set to the 15th, and when this query was run during the evening of the 14th we still got that one document in the response. Are there any other potential culprits here? E.g. DateField vs TrieDateField?

"params": {
  "debugQuery": "true",
  "indent": "true",
  "q": "values_field_66_date:[* TO NOW/DAY+1DAY]",
  "TZ:'America/Los_Angeles'": "",
  "_": "1384487341231",
  "wt": "json",
  "rows": "25"
}

https://gist.github.com/anonymous/7478773

Thanks,
Eric

On Nov 14, 2013, at 10:58 PM, Eric Katherman wrote:
> We're still not seeing the proper result. I've included a gist of the
> query and its debug result. This was run on a clean index running 4.4.0 with
> just one document. That document has a date of 11/15/2013; in the TZ
> included in the query it is still the 14th, yet I still get that document
> returned. Hoping someone can help.
>
> https://gist.github.com/anonymous/7478773
>
>
> On Nov 14, 2013, at 3:06 PM, Chris Hostetter wrote:
>
>>
>> I've beefed up the ref guide page on dates to include more info about all
>> of this...
>>
>> https://cwiki.apache.org/confluence/display/solr/Working+with+Dates
>>
>>
>> -Hoss
>
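One thing worth double-checking in the params dump above: the key appears as TZ:'America/Los_Angeles' with an empty value, which suggests the parameter may have been sent with a colon instead of an equals sign, so Solr would never see a TZ param at all. A small stdlib-only Python sketch of the difference:

```python
from urllib.parse import urlencode, parse_qs

# Wrong: with a colon instead of '=', the whole token becomes a
# parameter *name* with an empty value, exactly as in the debug dump.
wrong = "TZ:'America/Los_Angeles'"
print(parse_qs(wrong, keep_blank_values=True))
# {"TZ:'America/Los_Angeles'": ['']}

# Right: TZ is the key, the zone ID is its value.
params = {
    "q": "values_field_66_date:[* TO NOW/DAY+1DAY]",
    "TZ": "America/Los_Angeles",
    "wt": "json",
    "rows": 25,
}
query = urlencode(params)
print("TZ=America%2FLos_Angeles" in query)  # True
```

If the client really is sending `TZ:'America/Los_Angeles'` rather than `TZ=America/Los_Angeles`, that alone would explain the date math ignoring the timezone.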
Expunging Deletes
I'm running into memory issues and wondering if I should be using expungeDeletes on commits. The server in question at the moment has 450k documents in the collection and represents 15GB on disk. There are also 700k+ "Deleted Docs" and I'm guessing that is part of the disk space consumption, but I am not having any luck getting that cleared out. I noticed expungeDeletes=false in some of the log output related to commit but didn't try setting it to true yet. Will this clear those deleted documents and recover that space? Or should something else already be managing that but maybe isn't configured correctly?

Our data is user-specific data; each customer has their own database structure, so it varies with each user. They also add/remove data fairly frequently in many cases. To compare, another collection of the same data type has 1M documents and about 120k deleted docs, but disk space is only 6.3GB.

Hoping someone can share some advice about how to manage this.

Thanks,
Eric
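For reference, a sketch of how a commit with expungeDeletes might be issued against the update handler (the base URL and collection name here are made up; note that the merging it triggers can be nearly as expensive as an optimize, so it is better run off-peak):

```python
from urllib.parse import urlencode

def commit_url(base, collection, expunge=False):
    """Build a Solr /update URL that commits, optionally asking the
    merge to drop deleted docs from the affected segments. Expensive:
    it can rewrite large parts of the index."""
    params = {"commit": "true"}
    if expunge:
        params["expungeDeletes"] = "true"
    return f"{base}/{collection}/update?{urlencode(params)}"

url = commit_url("http://localhost:8983/solr", "mycollection", expunge=True)
print(url)
# http://localhost:8983/solr/mycollection/update?commit=true&expungeDeletes=true
```

A GET or POST to that URL performs the commit; without expunge=True the URL is an ordinary hard commit.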
Re: Expunging Deletes
Thanks for replying! Is there anything I could be doing to keep the collection from reaching 14GB with 700k deleted docs in the first place, rather than running out of memory when it finally tries to remove them? Maybe just scheduled off-peak optimize calls with expungeDeletes? Or is there some other config option I could be using to help manage that a little better?

Thanks!
Eric

On Sep 29, 2014, at 9:35 AM, Shalin Shekhar Mangar wrote:
> Yes, expungeDeletes=true will remove all deleted docs from the disk but it
> also requires merging all segments that have any deleted docs which, in
> your case, could mean a re-write of the entire index. So it'd be an
> expensive operation. Usually deletes are removed in the normal course of
> indexing as segments are merged together.
>
> On Sat, Sep 27, 2014 at 8:42 PM, Eric Katherman wrote:
>
>> I'm running into memory issues and wondering if I should be using
>> expungeDeletes on commits. The server in question at the moment has 450k
>> documents in the collection and represents 15GB on disk. There are also
>> 700k+ "Deleted Docs" and I'm guessing that is part of the disk space
>> consumption but I am not having any luck getting that cleared out. I
>> noticed the expungeDeletes=false in some of the log output related to
>> commit but didn't try setting it to true yet. Will this clear those deleted
>> documents and recover that space? Or should something else already be
>> managing that but maybe isn't configured correctly?
>>
>> Our data is user specific data, each customer has their own database
>> structure so it varies with each user. They also add/remove data fairly
>> frequently in many cases. To compare another collection of the same data
>> type, there are 1M documents and about 120k deleted docs but disk space is
>> only 6.3GB.
>>
>> Hoping someone can share some advice about how to manage this.
>>
>> Thanks,
>> Eric
>>
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
Sorting & Joins
Is it possible to join documents and use a field from the "from" documents to sort the results? For example, I need to search "employees" and sort on different fields of the "company" each employee is joined to. What would that query look like? We've looked at various resources but haven't found any concise examples that work. Thanks, Eric
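As far as I know, the standard {!join} parser only returns documents from the "to" side and does not expose "from"-side field values, so sorting employees by company fields can't be expressed in the query itself. The usual workaround is to denormalize the needed sort fields onto the employee documents at index time. A minimal sketch, with all field names hypothetical:

```python
def denormalize(employee, company, sort_fields=("name_s", "size_i")):
    """Copy the company fields we need to sort on onto each employee
    doc at index time, prefixed to avoid field-name clashes."""
    doc = dict(employee)
    for f in sort_fields:
        if f in company:
            doc[f"company_{f}"] = company[f]
    return doc

emp = {"id": "e1", "company_id_s": "c1", "name_s": "Ada"}
co = {"id": "c1", "name_s": "Acme", "size_i": 500}
print(denormalize(emp, co)["company_name_s"])  # Acme
```

With company_name_s copied onto every employee document, a plain employee query can use sort=company_name_s asc with no join at all; the cost is re-indexing affected employees when a company changes.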
Re: http://localhost:8983/solr/#/~cores
unsubscribe On Tue, Oct 29, 2019 at 4:34 AM UMA MAHESWAR wrote: > hi all, > SolrCore Initialization Failures > {{core}}: {{error}} > Please check your logs for more information > > > {{exception.msg}} > > here my log file > 2019-10-29 06:03:30.995 INFO (main) [ ] o.e.j.u.log Logging initialized > @1449ms to org.eclipse.jetty.util.log.Slf4jLog > 2019-10-29 06:03:31.302 INFO (main) [ ] o.e.j.s.Server > jetty-9.4.10.v20180503; built: 2018-05-03T15:56:21.710Z; git: > daa59876e6f384329b122929e70a80934569428c; jvm 1.8.0_144-b01 > 2019-10-29 06:03:31.346 INFO (main) [ ] o.e.j.d.p.ScanningAppProvider > Deployment monitor > [file:///C:/Users/uma.maheswar/Downloads/solr740/server/contexts/] at > interval 0 > 2019-10-29 06:03:32.206 INFO (main) [ ] > o.e.j.w.StandardDescriptorProcessor NO JSP Support for /solr, did not find > org.apache.jasper.servlet.JspServlet > 2019-10-29 06:03:32.215 INFO (main) [ ] o.e.j.s.session > DefaultSessionIdManager workerName=node0 > 2019-10-29 06:03:32.232 INFO (main) [ ] o.e.j.s.session No > SessionScavenger set, using defaults > 2019-10-29 06:03:32.233 INFO (main) [ ] o.e.j.s.session node0 Scavenging > every 66ms > 2019-10-29 06:03:32.245 INFO (main) [ ] c.e.s.SolrSecurityFilter > SolrSecurityFilter --> Filter successfully initialized. 
> 2019-10-29 06:03:32.247 INFO (main) [ ] c.e.s.ValidateAuthorization > Validate Authorization --> validation Filter file solrsecurity.properties > successfully initialized > 2019-10-29 06:03:32.296 INFO (main) [ ] > o.a.s.u.c.SSLCredentialProviderFactory Processing SSL Credential Provider > chain: env;sysprop > 2019-10-29 06:03:32.326 INFO (main) [ ] o.a.s.s.SolrDispatchFilter Using > logger factory org.apache.logging.slf4j.Log4jLoggerFactory > 2019-10-29 06:03:32.331 INFO (main) [ ] o.a.s.s.SolrDispatchFilter > ___ > _ Welcome to Apache Solr™ version 7.4.0 > 2019-10-29 06:03:32.332 INFO (main) [ ] o.a.s.s.SolrDispatchFilter / __| > ___| |_ _ Starting in standalone mode on port 8983 > 2019-10-29 06:03:32.332 INFO (main) [ ] o.a.s.s.SolrDispatchFilter \__ > \/ > _ \ | '_| Install dir: C:\Users\uma.maheswar\Downloads\solr740 > 2019-10-29 06:03:32.332 INFO (main) [ ] o.a.s.s.SolrDispatchFilter > |___/\___/_|_|Start time: 2019-10-29T06:03:32.332Z > 2019-10-29 06:03:32.358 INFO (main) [ ] o.a.s.c.SolrResourceLoader Using > system property solr.solr.home: > C:\Users\uma.maheswar\Downloads\solr740\server\solr > 2019-10-29 06:03:32.362 INFO (main) [ ] o.a.s.c.SolrXmlConfig Loading > container configuration from > C:\Users\uma.maheswar\Downloads\solr740\server\solr\solr.xml > 2019-10-29 06:03:32.419 INFO (main) [ ] o.a.s.c.SolrXmlConfig MBean > server found: com.sun.jmx.mbeanserver.JmxMBeanServer@551bdc27, but no JMX > reporters were configured - adding default JMX reporter. > 2019-10-29 06:03:32.861 INFO (main) [ ] o.a.s.c.SolrResourceLoader > [null] > Added 1 libs to classloader, from paths: > [/C:/Users/uma.maheswar/Downloads/solr740/server/solr/lib] > 2019-10-29 06:03:33.786 INFO (main) [ ] > o.a.s.c.TransientSolrCoreCacheDefault Allocating transient cache for > 2147483647 transient cores > 2019-10-29 06:03:33.789 INFO (main) [ ] o.a.s.h.a.MetricsHistoryHandler > No .system collection, keeping metrics history in memory. 
> 2019-10-29 06:03:33.867 INFO (main) [ ] o.a.s.m.r.SolrJmxReporter JMX > monitoring for 'solr.node' (registry 'solr.node') enabled at server: > com.sun.jmx.mbeanserver.JmxMBeanServer@551bdc27 > 2019-10-29 06:03:33.867 INFO (main) [ ] o.a.s.m.r.SolrJmxReporter JMX > monitoring for 'solr.jvm' (registry 'solr.jvm') enabled at server: > com.sun.jmx.mbeanserver.JmxMBeanServer@551bdc27 > 2019-10-29 06:03:33.874 INFO (main) [ ] o.a.s.m.r.SolrJmxReporter JMX > monitoring for 'solr.jetty' (registry 'solr.jetty') enabled at server: > com.sun.jmx.mbeanserver.JmxMBeanServer@551bdc27 > 2019-10-29 06:03:33.900 INFO (main) [ ] o.a.s.c.CorePropertiesLocator > Found 1 core definitions underneath > C:\Users\uma.maheswar\Downloads\solr740\server\solr > 2019-10-29 06:03:33.901 INFO (main) [ ] o.a.s.c.CorePropertiesLocator > Cores are: [ebmpapst_AEM] > 2019-10-29 06:03:33.908 INFO (coreLoadExecutor-9-thread-1) [ > x:ebmpapst_AEM] o.a.s.c.SolrResourceLoader [null] Added 1 libs to > classloader, from paths: > [/C:/Users/uma.maheswar/Downloads/solr740/server/solr/ebmpapst_AEM/lib] > 2019-10-29 06:03:34.075 INFO (coreLoadExecutor-9-thread-1) [ > x:ebmpapst_AEM] o.a.s.c.SolrResourceLoader [ebmpapst_AEM] Added 73 libs to > classloader, from paths: > [/C:/Users/uma.maheswar/Downloads/solr740/contrib/clustering/lib, > /C:/Users/uma.maheswar/Downloads/solr740/contrib/extraction/lib, > /C:/Users/uma.maheswar/Downloads/solr740/contrib/langid/lib, > /C:/Users/uma.maheswar/Downloads/solr740/contrib/velocity/lib, > /C:/Users/uma.maheswar/Downloads/solr740/dist, > /C:/Users/uma.maheswar/Downloads/solr740/server/solr/ebmpapst_AEM/lib] > 2019-10-29 06:03:34.162 INFO (coreLoadExe