On 3/1/2017 4:41 PM, Phil Scadden wrote:
> Using Solr 6.4.1 on windows. Installed and trial POST on my directories 
> worked okay. However, now trying to create an index from code running on 
> Tomcat on the same machine as the Solr server, with my own schema. Indexing of PDFs
> is very slow. Investigating that, I find my Tomcat output full of wire logging:
> 12:07:06,758 DEBUG wire:72 -  >> "[0xe3][0xe0][0xc7]L[0xf5][0xea][\r]?
> 12:07:06,763 DEBUG wire:72 -  >> "f[0x81][0xb0]b[0xca][0xfa][0xb7][0x1f]n[0xff][0x0][0xa8][0xd0][0x16][0xbb]*[0xfe][0x95][0x98]-[0xbd][0xb7]-
> 12:07:06,805 DEBUG wire:72 -  >> "X[0x3][0xd4][0xcf]OOS:[0x94][0x8b][0xe2][0x89][0x8f][0xc9]rClVQ[0x85][0x1e][0x82]T[0x9e][0xe7]N[0xf4][0xfa]-
> This can't be helping. My code is:
>
>       try {
>             ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
>             File f = new File(filename);
>             ContentStreamBase.FileStream cs = new ContentStreamBase.FileStream(f);
>             up.addContentStream(cs);
>             up.setParam("literal.id", f.getPath());
>             up.setParam("literal.location", idString);
>             up.setParam("literal.access", access.toString());
>             up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
>             solr.request(up);
>
> All of the logging is generated by the last line. I don’t have any httpclient.wire
> lines in my log4j.properties (I presume these are from httpclient.wire). What
> do I do to turn this off?

Part of the problem is that you've got your logging set to at least
DEBUG.  You shouldn't do that unless you're actually debugging
something, and even then I would strongly recommend that you NOT set the
global level to DEBUG, but only set the specific classes you are
troubleshooting.  SolrJ depends on many pieces of software, all of which
have their own logging.  DEBUG logging tends to be VERY verbose.

I searched the codebase and did not find the specific logging that you
have mentioned, so it's probably being logged by one of SolrJ's
dependencies, or possibly Tomcat, not SolrJ itself.  Be aware that
running Solr itself inside Tomcat is an unsupported configuration.  You
haven't said that this is the case, so I am not going to assume that
it is.

https://wiki.apache.org/solr/WhyNoWar
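
On the question of turning the output off: a logger that prints as
"wire" matches Apache HttpClient 4.x, the HTTP library that SolrJ 6.x
depends on.  Its wire logger is named org.apache.http.wire (the
httpclient.wire name belongs to the older 3.x line).  Assuming that is
the source, and assuming the application logs through log4j 1.x as the
mention of log4j.properties suggests, a fragment along these lines
should quiet it.  The levels shown are only a suggestion:

```properties
# Assumption: the "wire" lines come from Apache HttpClient 4.x.
# Raise only the HttpClient loggers; leave the rest of the file as it is.
log4j.logger.org.apache.http=INFO
log4j.logger.org.apache.http.wire=WARN
log4j.logger.org.apache.http.headers=WARN
```

If the root logger is itself set to DEBUG, raising it to INFO or WARN
would also cut the output from every other library.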

On the Solr server side, the 6.4.x versions have a bug that causes
extremely high CPU usage and very slow operation.  This will be fixed in
6.4.2, which will hopefully be out soon.  There is currently no ETA for
this release.

https://issues.apache.org/jira/browse/SOLR-10130

Another side issue:  Using the extracting handler to index rich
documents is discouraged.  Tika (which the extracting handler uses
internally) is pretty amazing software, but it has a habit of crashing or
consuming all the heap memory when it encounters a document that it
doesn't know how to properly handle.  It is best to run Tika in your
external program and send its output to Solr, so that if there's a
problem, it won't affect your search capability.
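
A minimal sketch of that arrangement, not a definitive implementation:
the client extracts the text itself and sends plain fields to Solr, so
a document that fails extraction is skipped and reported instead of
reaching the server.  The class name, the two interfaces, and the
"content" field name are hypothetical; the id, location, and access
fields mirror the literal.* parameters in the quoted code.  To keep the
sketch self-contained, the Tika and SolrJ calls are only indicated in
comments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: extract text on the client, then hand plain fields to an indexer. */
class ClientSideExtraction {

    /**
     * Hypothetical hook for the extraction step. With Tika on the classpath
     * this could be: f -> new org.apache.tika.Tika().parseToString(new java.io.File(f))
     */
    interface Extractor {
        String extract(String filename) throws Exception;
    }

    /**
     * Hypothetical hook for the indexing step. With SolrJ this could copy the
     * fields into a SolrInputDocument and call solr.add(doc).
     */
    interface Indexer {
        void index(Map<String, String> fields) throws Exception;
    }

    /** Extracts and indexes each file; returns the files whose extraction failed. */
    static List<String> indexAll(List<String> filenames, String location, String access,
                                 Extractor extractor, Indexer indexer) {
        List<String> failed = new ArrayList<>();
        for (String filename : filenames) {
            String text;
            try {
                text = extractor.extract(filename);
            } catch (Exception e) {
                // A document the extractor cannot handle is recorded and skipped;
                // nothing is sent to Solr for it.
                failed.add(filename);
                continue;
            }
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("id", filename);
            fields.put("location", location);
            fields.put("access", access);
            fields.put("content", text); // "content" is an assumed field name
            try {
                indexer.index(fields);
            } catch (Exception e) {
                // Indexing problems are not extraction problems, so surface them.
                throw new IllegalStateException("indexing failed for " + filename, e);
            }
        }
        return failed;
    }
}
```

Catching exceptions only covers documents that Tika rejects cleanly.
If extraction exhausts the heap, the client program can still die, but
the Solr server keeps running, which is the point of moving Tika out.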

Thanks,
Shawn
