Re: indexing a xml file
FATAL: Solr returned an error #400 ERROR:unknown field 'name'

This issue is due to a mismatch between the field definitions in Solr (schema.xml) and the fields used in the code that adds documents. Make sure both sides use the same field names and data types.

--
View this message in context: http://lucene.472066.n3.nabble.com/indexing-a-xml-file-tp3364392p3990231.html
Sent from the Solr - User mailing list archive at Nabble.com.
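As a rough illustration of the check described above, the sketch below compares a document's field names against the fields a schema declares before anything is sent to Solr. The set of declared fields, the `unknown_fields` helper and the sample document are all hypothetical; a real schema.xml may also declare dynamic fields, which this does not handle.

```python
# Hypothetical set of field names declared in schema.xml (assumption for illustration).
DECLARED_FIELDS = {"id", "title", "price"}


def unknown_fields(doc, declared=DECLARED_FIELDS):
    """Return the document's field names that the schema does not declare, sorted."""
    return sorted(name for name in doc if name not in declared)


# A sample document carrying a 'name' field that the hypothetical schema lacks.
doc = {"id": "SP2514N", "name": "example value", "price": 92.0}
print(unknown_fields(doc))
```

Any name reported here would need either a matching field declaration in the schema or a rename in the indexing code.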
Re: delete by query don't work
In order to clear all the indexed data, please try this code:

    private void Btn_Delete_Click(object sender, EventArgs e)
    {
        var solrUrl = this.textBoxSolrUrl.Text;
        indexer.FixtureSetup(solrUrl);
        indexer.Delete();
        MessageBox.Show("Delete of files is completed");
    }

    public void Delete()
    {
        // The generic type argument was lost in the archive. MyDocument is a
        // placeholder for whatever class your index is mapped to.
        var solr = ServiceLocator.Current.GetInstance<ISolrOperations<MyDocument>>();
        // Use the match-all query here. A by-field query on "id" would most
        // likely treat "*:*" as a literal id value and match nothing.
        solr.Delete(new SolrQuery("*:*"));
        solr.Commit();
    }

Use this code to delete an individual document:

    solr.Delete(new SolrQueryByField("id", "SP2514N"));

Here the document with id="SP2514N" will be removed from the indexed data.

--
View this message in context: http://lucene.472066.n3.nabble.com/delete-by-query-don-t-work-tp3990077p3990243.html
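For comparison, the same two deletions can be expressed as JSON bodies for Solr's update handler. This is a sketch and not part of the original post: it only builds the payloads and does not send them, and the helper names are made up for illustration. The bodies would be POSTed to the collection's /update endpoint and followed by a commit.

```python
import json


def delete_all_payload():
    """JSON body that deletes every document via the match-all query."""
    return json.dumps({"delete": {"query": "*:*"}})


def delete_by_id_payload(doc_id):
    """JSON body that deletes the single document with the given id."""
    return json.dumps({"delete": {"id": doc_id}})


print(delete_all_payload())
print(delete_by_id_payload("SP2514N"))
```

Keeping the match-all query and the single-id delete as separate helpers mirrors the distinction in the post: one clears the index, the other removes one document.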
Solr cluster tuning
We are currently using Solr Cloud version 7.4 with the SolrJ API to fetch data from collections. We recently deployed our code to production and noticed that response times are higher when the number of incoming requests is low.

But strangely, if we bombard the system with more and more requests, we get much better response times.

My suspicion is that the client is closing connections sooner when requests arrive slowly and later when they arrive quickly.

We tried tuning by passing a custom HTTPClient to SolrJ and also by updating the HttpShardHandlerFactory settings. For example, we set:

maxThreadIdleTime = 6
socketTimeOut = 18

We are wondering what other tuning we can do to make this perform the same irrespective of the number of requests.

Thanks!
Vidhya
Re: Solr cluster tuning
Thank you Erick and Daniel for your prompt responses. We were trying a few things (moving to G1GC, and optimizing by throwing away some fields that need not be indexed and stored), hence the late response.

First of all, I thought I would give an overview of the environment. We have a four-node Solr Cloud cluster. We have 2 indexes, which are spread across 4 shards and have 2 replicas. We have a total of 30GB on each of the nodes (all dedicated to running Solr Cloud alone), of which 15GB is allocated to the JVM and the rest is left for the OS to manage. All the indexes together take up just 1.4GB on disk. We are running version 7.4 with a dedicated ZooKeeper cluster.

Something of concern I see in the Solr Admin is the use of that memory:
[image: image.png]

This is what I see by running top:
[image: image.png]

Is there a general calculation for how much to leave for OS caching for an index of 2GB?

To answer Erick's question: no, we are not indexing at the same time. In fact, we have stopped indexing just to test the theory and don't see any improvements. I don't think I need to worry about autocommit then, right?

Daniel, we did try what you mentioned here (that is, warm up the cache and then do a slow and a fast test), and we still see the slow test yielding slower results.

Any thoughts, anyone? Much appreciate your responses.

thanks
Vidhya

On Wed, Oct 24, 2018 at 6:40 PM Erick Erickson wrote:

> To add to Daniel's comments: Are you indexing at the same time? Say your autocommit time is 10 seconds. For the sake of argument let's say it takes 15 queries to warm your searcher. Let's further say that the average time for those 15 queries is 500ms each and once the searcher is warmed the average time drops to 100ms. You'll have an average close to 100ms.
>
> OTOH, if you only fire 15 queries over that 10 seconds, the average would be 500ms.
>
> My guess is your autowarm counts for filterCache and queryResult cache are the default 0 and if you set them to, say, 20 each much of your problem would disappear. Ditto if you stopped indexing. Both point to the searchers having to pull data into memory from disk and/or rebuild caches.
>
> Best,
> Erick
>
> On Wed, Oct 24, 2018 at 1:37 PM Davis, Daniel (NIH/NLM) [C] wrote:
> >
> > Usually, responses are due to I/O waits getting the data off of the disk. So, to me, this seems more likely because as you bombard the server with queries, you bring more and more of the data needed to answer the query into memory.
> >
> > To verify this, I'd bombard your server with queries to warm it up, and then repeat your test with the queries coming in slowly or quickly.
> >
> > If it still holds up, then there is something other than Solr going on with that server and taking memory from Solr, or your index is somewhat too big for your server. Linux likes to overcommit memory - try setting vm swappiness to something low, like 10, rather than the default 60. Look for anything on the server with Solr that may be competing with it for I/O resources, and causing its pages to swap out.
> >
> > Also, look at the size of your index data.
> >
> > This is general advice on dealing with inverted indexes - some of the Solr engineers on this list may have some very specific ideas, such as merging activity or other background tasks running when the query load is lighter. I wouldn't know how to check for these things, but would think they wouldn't affect query response time that badly.
> >
> > -Original Message-
> > From: Vidhya Kailash
> > Sent: Wednesday, October 24, 2018 4:22 PM
> > To: solr-user@lucene.apache.org
> > Subject: Solr cluster tuning
> >
> > We are currently using Solr Cloud Version 7.4 with SolrJ api to fetch data from collections. We recently deployed our code to production and noticed that response time is more if the number of incoming requests are less.
> >
> > But strangely, if we bombard the system with more and more requests we get much better response time.
> >
> > My suspicion is client is closing the connections sooner in case of slower requests and slower in case of faster requests.
> >
> > We tried tuning by passing custom HTTPClient to SolrJ and also by updating HttpShardHandlerFactory settings. For example we made -
> > maxThreadIdleTime = 6 socketTimeOut = 18
> >
> > Wondering what other tuning we can do to make this perform the same irrespective of the number of requests.
> >
> > Thanks!
> >
> > Vidhya

--
Vidhya Kailash
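Erick's back-of-the-envelope numbers in the quoted reply can be worked through directly. The sketch below assumes, as he does for the sake of argument, that the first 15 queries after a new searcher opens take 500ms each and every later query in the same 10-second autocommit window takes 100ms. The function name and the 1,000-query heavy-load figure are made up for illustration.

```python
def average_latency_ms(queries_per_window, queries_to_warm=15, cold_ms=500, warm_ms=100):
    """Average latency over one autocommit window under Erick's assumptions."""
    cold = min(queries_per_window, queries_to_warm)  # queries that hit the unwarmed searcher
    warm = queries_per_window - cold                 # queries served after warming
    return (cold * cold_ms + warm * warm_ms) / queries_per_window


print(average_latency_ms(15))    # light load: every query in the window is a slow one
print(average_latency_ms(1000))  # heavy load: the 15 slow queries are amortized
```

With only 15 queries in the window the average stays at 500ms, while at 1,000 queries it drops to 106ms, which is the pattern described in the original question.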
Unable to get Solr Graph Traversal working
I am unable to get even simple graph traversal expressions like the one below to work in my environment (versions 7.4 and 7.5). They simply yield no results, even though I know the data exists.

curl --data-urlencode 'expr=gatherNodes(rec_coll, walk="35d40c4b9d6ddfsdf45cbb0fe4aesd75->USER_ID", gather="ITEM_ID")' http://localhost:8983/solr/rec_coll/graph

Can someone help?

thanks
Vidhya
Solr custom UpdateRequestProcessor error
Any idea why I am getting this error in spite of the following?

I have the customupdateprocessor jar in the contrib/customupdate/lib directory.

I have the solrconfig.xml with the lib directives to this jar as well as solr-core.jar, and I see those jars being loaded on startup in the logs:

2018-11-08 01:04:17.929 INFO (coreLoadExecutor-9-thread-3) [ x:reviews] o.a.s.c.SolrResourceLoader [reviews] Added 58 libs to classloader, from paths: [/.../solr-7.5.0/contrib/clustering/lib, .../solr-7.5.0/contrib/extraction/lib, .../solr-7.5.0/contrib/hotelreviews/lib, .../solr-7.5.0/contrib/langid/lib, .../solr-7.5.0/contrib/velocity/lib, .../solr-7.5.0/dist]

In spite of these, I get the following exception:

Caused by: java.lang.NoClassDefFoundError: org/apache/solr/update/processor/UpdateRequestProcessorFactory$RunAlways
    at java.lang.ClassLoader.defineClass1(Native Method) ~[?:1.8.0_161]
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763) ~[?:1.8.0_161]
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) ~[?:1.8.0_161]
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467) ~[?:1.8.0_161]
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73) ~[?:1.8.0_161]
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368) ~[?:1.8.0_161]
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362) ~[?:1.8.0_161]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_161]
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361) ~[?:1.8.0_161]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_161]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_161]
    at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:565) ~[jetty-webapp-9.4.11.v20180605.jar:9.4.11.v20180605]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:411) ~[?:1.8.0_161]
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814) ~[?:1.8.0_161]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:411) ~[?:1.8.0_161]
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814) ~[?:1.8.0_161]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_161]
    at java.lang.Class.forName0(Native Method) ~[?:1.8.0_161]
    at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_161]
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:541) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:488) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:792) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:848) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2810) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.update.processor.UpdateRequestProcessorChain.init(UpdateRequestProcessorChain.java:130) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:850) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2785) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2779) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.loadUpdateProcessorChains(SolrCore.java:1430) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:970) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:869) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
    ... 7 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.update.processor.UpdateRequestProcessorFactory$RunAlways
    at java.net.UR
Re: Unable to get Solr Graph Traversal working
Thanks Joel. Running with the /stream handler did reveal some issues, and after fixing them the gatherNodes expression is working!

I am trying out the recommendations sample from the Solr website for my use case, and now I am stuck at the next step: I am unable to get the top 3 of those nodes:

curl --data-urlencode 'expr=top(n="30", sort="count(*) desc", nodes(rec_coll, search(rec_coll, q="35d40c4b9d6ddfsdf45cbb0fe4aesd75->USER_ID", fl="ITEM_ID", sort="ITEM_ID desc", qt="/export"), walk="ITEM_ID->ITEM_ID", gather="USER_ID", fl="USER_ID", maxDocFreq="1", count(*)))' http://localhost:8983/solr/rec_coll/graph

Again, I appreciate any help.

Vidhya

On Thu, Nov 8, 2018 at 1:23 PM Joel Bernstein wrote:

> The basic syntax looks ok. Try it first on the /stream handler to rule out any issues that might be related to /graph handler. Can you provide the logs from one of the shards in the rec_coll collection that are generated by this request? The logs will show the query that is actually being run on
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Wed, Nov 7, 2018 at 1:22 PM Vidhya Kailash wrote:
>
> > I am unable to get even simple graph traversal expressions like the one below to work in my environment (7.4 and 7.5 versions). They simply yield no results, even though I know the data exists.
> >
> > curl --data-urlencode 'expr=gatherNodes(rec_coll, walk="35d40c4b9d6ddfsdf45cbb0fe4aesd75->USER_ID", gather="ITEM_ID")' http://localhost:8983/solr/rec_coll/graph
> >
> > Can someone help?
> >
> > thanks
> > Vidhya

--
Vidhya Kailash
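Joel's suggestion to try the expression on the /stream handler first amounts to sending the same expr parameter to a different endpoint. The sketch below only builds the URL and the form-encoded body for such a request; it does not contact Solr. The `build_request` helper is hypothetical, and the host and collection are taken from the commands in this thread.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr/rec_coll"  # host and collection from the thread


def build_request(expr, handler="stream"):
    """Return (url, body) for POSTing a streaming expression to the given handler."""
    url = f"{BASE_URL}/{handler}"
    # Percent-encode the expression as a form field (spaces become '+' here).
    body = urlencode({"expr": expr})
    return url, body


expr = 'gatherNodes(rec_coll, walk="35d40c4b9d6ddfsdf45cbb0fe4aesd75->USER_ID", gather="ITEM_ID")'
url, body = build_request(expr)
print(url)
print(body)
```

Passing handler="graph" gives the original endpoint, so the same expression can be compared on both handlers without retyping it.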
Matrix Factorization possible with Streams?
Hi,

I am wondering if anyone has attempted matrix factorization with Streams in Solr. If so, any pointers would be appreciated.

thanks
Vidhya