Re: indexing a xml file

2012-06-19 Thread vidhya
FATAL: Solr returned an error #400 ERROR: unknown field 'name'


This error is caused by a mismatch between the fields defined in Solr's
schema (schema.xml) and the fields used in your indexing code when adding
documents. Make sure the field names and types match on both sides.
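For illustration, a minimal schema.xml entry that would satisfy the "unknown field 'name'" error (the field type here is just an assumption; pick whatever fits your data):

```xml
<!-- schema.xml: declare every field the indexing code sends.
     text_general is an assumed type, not taken from the thread. -->
<field name="name" type="text_general" indexed="true" stored="true" />
```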

--
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-a-xml-file-tp3364392p3990231.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: delete by query don't work

2012-06-19 Thread vidhya
To clear all the indexed data, try code like this:

private void Btn_Delete_Click(object sender, EventArgs e)
{
    var solrUrl = this.textBoxSolrUrl.Text;
    indexer.FixtureSetup(solrUrl);
    indexer.Delete();

    MessageBox.Show("Delete of files is completed");
}

public void Delete()
{
    // The generic type was lost in the archive; ISolrOperations<Product> is
    // a placeholder -- substitute your own mapped document class.
    var solr = ServiceLocator.Current.GetInstance<ISolrOperations<Product>>();

    // Note: SolrQueryByField escapes special characters, so
    // new SolrQueryByField("id", "*:*") would NOT match all documents.
    // Use the match-all query instead:
    solr.Delete(SolrQuery.All);

    solr.Commit();
}

 Use this code to delete an individual document:

 solr.Delete(new SolrQueryByField("id", "SP2514N"));

Here the document with id "SP2514N" will be removed from the index.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/delete-by-query-don-t-work-tp3990077p3990243.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr cluster tuning

2018-10-24 Thread Vidhya Kailash
We are currently using Solr Cloud version 7.4 with the SolrJ API to fetch
data from collections. We recently deployed our code to production and
noticed that response times are higher when the number of incoming requests
is low.

But strangely, if we bombard the system with more and more requests, we get
much better response times.

My suspicion is that the client is closing the connections sooner for the
slower requests and later for the faster ones.

We tried tuning by passing a custom HttpClient to SolrJ and also by updating
the HttpShardHandlerFactory settings. For example we set:
maxThreadIdleTime = 6
socketTimeout = 18

Wondering what other tuning we can do to make this perform the same
irrespective of the number of requests.
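For reference, those two settings live on the shard handler in solrconfig.xml; a sketch with the values quoted above (the handler placement and surrounding config are assumptions, and the units are milliseconds per the reference guide):

```xml
<!-- solrconfig.xml: shard handler used for distributed requests.
     Values are the ones from this thread, reproduced as quoted. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <int name="maxThreadIdleTime">6</int>
    <int name="socketTimeout">18</int>
  </shardHandlerFactory>
</requestHandler>
```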

Thanks!

Vidhya


Re: Solr cluster tuning

2018-11-01 Thread Vidhya Kailash
Thank you Erick and Daniel for your prompt responses. We were trying a few
things (moving to G1GC, and slimming the schema by dropping some fields that
did not need to be indexed or stored), hence the late response.

First of all, an overview of the environment: we have a four-node Solr Cloud
cluster with 2 indexes, each spread across 4 shards with 2 replicas. Each
node has 30GB of RAM, dedicated to running Solr Cloud alone; 15GB is
allocated to the JVM and the rest is left for the OS to manage. All the
indexes together take up just 1.4GB on disk. We are running version 7.4 with
a dedicated ZooKeeper cluster.

Something of concern I see in the Solr Admin UI is the memory usage
[screenshot not preserved in the archive]. This is what I see by running
top: [screenshot not preserved in the archive].

Is there a general rule for how much memory to leave for OS caching with an
index of about 2GB?

To answer Erick's question: no, we are not indexing at the same time. In
fact we stopped indexing just to test that theory and don't see any
improvement. I don't think I need to worry about autocommit then, right?

Daniel, we did try what you mentioned (warm up the cache, then run a slow
test and a fast test) and we still see the slow test yielding slower
results.


Any thoughts anyone? Much appreciate your responses


thanks
Vidhya


On Wed, Oct 24, 2018 at 6:40 PM Erick Erickson 
wrote:

> To add to Daniel's comments: Are you indexing at the same time? Say
> your autocommit time is 10 seconds. For the sake of argument let's say
> it takes 15 queries to warm your searcher. Let's further say that the
> average time for those 15 queries is 500ms each and once the searcher
> is warmed the average time drops to 100ms. You'll have an average
> close to 100ms.
>
> OTOH, if you only fire 15 queries over that 10 seconds, the average
> would be 500ms.
>
> My guess is your autowarm counts for filterCache and queryResult cache
> are the default 0 and if you set them to, say, 20 each much of your
> problem would disappear.  Ditto if you stopped indexing. Both point to
> the searchers having to pull data into memory from disk and/or rebuild
> caches.
>
> Best,
> Erick
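Erick's autowarm suggestion corresponds to cache settings like these in solrconfig.xml (the sizes are illustrative; only autowarmCount="20" comes from his example):

```xml
<!-- solrconfig.xml: a non-zero autowarmCount replays recent cache entries
     against a new searcher before it starts serving queries. -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="20"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="20"/>
```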
> On Wed, Oct 24, 2018 at 1:37 PM Davis, Daniel (NIH/NLM) [C]
>  wrote:
> >
> > Usually, slow responses are due to I/O waits getting the data off of the
> disk. So, to me, this seems likely: as you bombard the server with
> queries, you pull more and more of the data needed to answer them into
> memory.
> >
> > To verify this, I'd bombard your server with queries to warm it up, and
> then repeat your test with the queries coming in slowly or quickly.
> >
> > If it still holds up, then there is something other than Solr going on
> with that server, and taking memory from Solr or your index is somewhat too
> big for your server.  Linux likes to overcommit memory - try setting vm
> swappiness to something low, like 10, rather than the default 60.   Look
> for anything on the server with Solr that may be competing with it for I/O
> resources, and causing its pages to swap out.
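Daniel's swappiness suggestion can be persisted like this (the file path is the conventional one; apply without a reboot via `sysctl -p`):

```conf
# /etc/sysctl.conf -- prefer keeping Solr's pages resident instead of swapping.
vm.swappiness = 10
```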
> >
> > Also, look at the size of your index data.
> >
> > This is general advice for dealing with inverted indexes - some of the
> Solr engineers on this list may have more specific ideas, such as merge
> activity or other background tasks running when the query load is
> lighter. I wouldn't know how to check for these things, but would not
> expect them to affect query response time that badly.
> >


-- 
Vidhya Kailash


Unable to get Solr Graph Traversal working

2018-11-07 Thread Vidhya Kailash
I am unable to get even simple graph traversal expressions like the one
below to work in my environment (7.4 and 7.5 versions). They simply yield
no results, even though I know the data exists.
curl --data-urlencode 'expr=gatherNodes(rec_coll,
                            walk="35d40c4b9d6ddfsdf45cbb0fe4aesd75->USER_ID",
                            gather="ITEM_ID")' \
    http://localhost:8983/solr/rec_coll/graph

Can someone help?

thanks
Vidhya


Solr custom UpdateRequestProcessor error

2018-11-08 Thread Vidhya Kailash
Any idea why I am getting this error, in spite of the following:

- I have the custom UpdateRequestProcessor jar in the
  contrib/customupdate/lib directory
- solrconfig.xml has lib directives pointing to this jar as well as to
  solr-core.jar

and I see those jars being loaded on startup in the logs:

2018-11-08 01:04:17.929 INFO  (coreLoadExecutor-9-thread-3) [   x:reviews]
o.a.s.c.SolrResourceLoader [reviews] Added 58 libs to classloader, from
paths: [/.../solr-7.5.0/contrib/clustering/lib,
.../solr-7.5.0/contrib/extraction/lib,
.../solr-7.5.0/contrib/hotelreviews/lib, .../solr-7.5.0/contrib/langid/lib,
.../solr-7.5.0/contrib/velocity/lib, .../solr-7.5.0/dist]


In spite of this, I get the following exception:


Caused by: java.lang.NoClassDefFoundError: org/apache/solr/update/processor/UpdateRequestProcessorFactory$RunAlways
        at java.lang.ClassLoader.defineClass1(Native Method) ~[?:1.8.0_161]
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763) ~[?:1.8.0_161]
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) ~[?:1.8.0_161]
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:467) ~[?:1.8.0_161]
        at java.net.URLClassLoader.access$100(URLClassLoader.java:73) ~[?:1.8.0_161]
        at java.net.URLClassLoader$1.run(URLClassLoader.java:368) ~[?:1.8.0_161]
        at java.net.URLClassLoader$1.run(URLClassLoader.java:362) ~[?:1.8.0_161]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_161]
        at java.net.URLClassLoader.findClass(URLClassLoader.java:361) ~[?:1.8.0_161]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_161]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_161]
        at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:565) ~[jetty-webapp-9.4.11.v20180605.jar:9.4.11.v20180605]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:411) ~[?:1.8.0_161]
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814) ~[?:1.8.0_161]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:411) ~[?:1.8.0_161]
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814) ~[?:1.8.0_161]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_161]
        at java.lang.Class.forName0(Native Method) ~[?:1.8.0_161]
        at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_161]
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:541) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:488) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:792) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:848) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2810) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.update.processor.UpdateRequestProcessorChain.init(UpdateRequestProcessorChain.java:130) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:850) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2785) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2779) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.loadUpdateProcessorChains(SolrCore.java:1430) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:970) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:869) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
        ... 7 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.update.processor.UpdateRequestProcessorFactory$RunAlways
        at java.net.UR
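For what it's worth, a typical lib setup for a custom update processor looks like the sketch below (the path is illustrative). One thing worth checking: solr-core classes are already on Solr's own classpath, and loading a second copy of solr-core.jar through a plugin lib directory is a common cause of NoClassDefFoundError for Solr's own inner classes such as UpdateRequestProcessorFactory$RunAlways, because the classes end up defined by two different classloaders.

```xml
<!-- solrconfig.xml: load only the plugin jar and dependencies that are
     genuinely missing; do NOT add solr-core.jar here -- it is already
     on the webapp classpath. Path below is an assumption. -->
<lib dir="${solr.install.dir:../../../..}/contrib/customupdate/lib" regex=".*\.jar" />
```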

Re: Unable to get Solr Graph Traversal working

2018-11-08 Thread Vidhya Kailash
Thanks Joel. Running with the /stream handler did reveal some issues, and
after fixing them the gatherNodes expression is working! I am trying out the
recommendation example from the Solr website for my use case, and now I am
stuck at the next step, which is getting the top 3 of those nodes:

curl --data-urlencode 'expr=top(n="30",
        sort="count(*) desc",
        nodes(rec_coll,
              search(rec_coll,
                     q="35d40c4b9d6ddfsdf45cbb0fe4aesd75->USER_ID",
                     fl="ITEM_ID",
                     sort="ITEM_ID desc", qt="/export"),
              walk="ITEM_ID->ITEM_ID",
              gather="USER_ID", fl="USER_ID",
              maxDocFreq="1",
              count(*)))' \
    http://localhost:8983/solr/rec_coll/graph


Again appreciate any help

Vidhya


On Thu, Nov 8, 2018 at 1:23 PM Joel Bernstein  wrote:

> The basic syntax looks ok. Try it first on the /stream handler to rule out
> any issues that might be related to /graph handler. Can you provide the
> logs from one of the shards in the rec_coll collection that are generated
> by this request? The logs will show the query that is actually being run on
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>


-- 
Vidhya Kailash


Matrix Factorization possible with Streams?

2019-01-30 Thread Vidhya Kailash
Hi
I am wondering if anyone has attempted matrix factorization with streaming
expressions in Solr. If so, any pointers would be appreciated.

thanks
Vidhya
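For context, the computation being asked about — factoring a sparse user-item matrix into low-rank user and item vectors — can be sketched in plain Python with stochastic gradient descent. Everything below (names, toy data, hyperparameters) is illustrative and is not a Solr or streaming-expression API:

```python
import random

def factorize(ratings, k=2, steps=3000, lr=0.01, reg=0.02, seed=42):
    """Toy SGD matrix factorization -- illustrative only, not a Solr API.

    ratings: dict mapping (user, item) -> observed rating.
    Returns dicts of k-dimensional user and item latent vectors.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    # Small random initialization for both factor matrices.
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(k)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(k)] for i in items}
    for _ in range(steps):
        for (u, i), r in ratings.items():
            # Error on this observed cell under the current factors.
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Dot product of latent vectors estimates any cell, observed or not."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

# Toy user-item interactions; the missing cells are what you'd rank
# per user to produce recommendations.
ratings = {
    ("u1", "i1"): 5, ("u1", "i2"): 3,
    ("u2", "i1"): 4, ("u2", "i3"): 1,
    ("u3", "i2"): 4, ("u3", "i3"): 1,
}
P, Q = factorize(ratings)
```

After fitting, `predict(P, Q, "u1", "i3")` estimates the unobserved cell; ranking those estimates per user is the recommendation step.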