I had a similar problem and was able to fix it in Solr by manually
buffering the response to a StringWriter before sending it to Tomcat.
Essentially, Tomcat's buffer will only hold so much, and once it is full
the write blocks (which is why it always hangs at a constant number of
documents). However, a better solution (still to be implemented) is to use
more intelligent code on the client that reads the response at the same
time as it sends input. That is not too difficult to do, though it is best
done with two threads (i.e. fire off a thread to read the response before
you send any data). Seeing as the HttpClient code probably does this
already, I'll most likely end up using that.
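
A minimal sketch of the two-thread idea, assuming plain input/output streams such as the ones a java.net.Socket hands out. The class name ConcurrentExchange and the method sendAndReceive are made up for illustration; this is not Solr or HttpClient code. The reader is started before any data is written, so the response keeps getting drained while the request is still being sent.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentExchange {

    /**
     * Writes body to out while a second thread drains in.
     * Assumption: the peer signals the end of its response by closing
     * the stream (e.g. an HTTP exchange using "Connection: close").
     */
    public static String sendAndReceive(final InputStream in, OutputStream out, byte[] body)
            throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Start reading BEFORE sending anything, so the response
            // never piles up unread on the other side.
            Future<byte[]> response = pool.submit(() -> {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                return buf.toByteArray();
            });

            // Send the request on the calling thread.
            out.write(body);
            out.flush();

            // Wait for the reader thread to reach end of stream.
            return new String(response.get(), StandardCharsets.UTF_8);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a socket this would be called as sendAndReceive(socket.getInputStream(), socket.getOutputStream(), requestBytes), where requestBytes already holds the full request. A real client would also want a timeout on response.get() so a stuck server cannot hang it forever.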

On 7/31/06, sangraal aiken <[EMAIL PROTECTED]> wrote:
Those are some great ideas Chris... I'm going to try some of them out. I'll
post the results when I get a chance to do more testing. Thanks.

At this point I can work around the problem by ignoring Solr's response but
this is obviously not ideal. I would feel better knowing what is causing the
issue as well.

-Sangraal

On 7/29/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
>
> : Sure, the method that does all the work updating Solr is the doUpdate(String
> : s) method in the GanjaUpdate class I'm pasting below. It's hanging when I
> : try to read the response... the last output I receive in my log is Got
> : Reader...
>
> I don't have the means to try out this code right now ... but i can't see
> any obvious problems with it (there may be somewhere that you are opening
> a stream or reader and not closing it, but i didn't see one) ... i notice
> you are running this client on the same machine as Solr (hence the
> localhost URLs) did you by any chance try running the client on a separate
> machine to see if the number of updates before it hangs changes?
>
> my money is still on a filehandle resource limit somewhere ... if you are
> running on a system that has "lsof" (on some Unix/Linux installations you
> need sudo/su root permissions to run it) you can use "lsof -p ####" to
> look up what files/network connections are open for a given process. You
> can try running that on both the client pid and the Solr server pid once
> it's hung -- You'll probably see a lot of Jar files in use for both, but
> if you see more than a few XML files open by the client, or more than
> 1 TCP connection open by either the client or the server, there's your
> culprit.
>
> I'm not sure what Windows equivalent of lsof may exist.
>
> Wait ... i just had another thought....
>
> You are using InputStreamReader to deal with the InputStreams of your
> remote XML files -- but you aren't specifying a charset, so it's using
> your system default which may be different from the charset of the
> original XML files you are pulling from the URL -- which (i *think*)
> means that your InputStreamReader may in some cases fail to read all of
> the bytes of the stream, which might leave some dangling filehandles (i'm
> just guessing on that part ... i'm not actually sure what happens in that
> case).
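
For reference, a minimal sketch of reading a stream with an explicit charset instead of the platform default. The class name CharsetRead and the method readAll are made up for illustration and are not part of the GanjaUpdate code. As far as I know, InputStreamReader substitutes a replacement character for bytes it cannot decode rather than stopping early, so a mismatch is more likely to garble text than to leave bytes unread, but naming the charset takes the system default out of the picture either way.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class CharsetRead {

    /** Reads the whole stream as text, decoding with the given charset. */
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader (and the wrapped stream)
        // even if a read fails part way through.
        try (BufferedReader rd = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = rd.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

In the posted code this would correspond to something like readAll(conn.getInputStream(), StandardCharsets.UTF_8), assuming the remote XML really is UTF-8. Reading by character buffer rather than by line also keeps the original line endings intact.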
>
> What if you simplify your code (for the purposes of testing) and just put
> the post-transform version ganja-full.xml in a big ass String variable in
> your java app and just call GanjaUpdate.doUpdate(bigAssString) over and
> over again ... does that cause the same problem?
>
>
> :
> : ----------
> :
> : package com.iceninetech.solr.update;
> :
> : import com.iceninetech.xml.XMLTransformer;
> :
> : import java.io.*;
> : import java.net.HttpURLConnection;
> : import java.net.URL;
> : import java.util.logging.Logger;
> :
> : public class GanjaUpdate {
> :
> :   private String updateSite = "";
> :   private String XSL_URL = "http://localhost:8080/xsl/ganja.xsl";
> :
> :   private static final File xmlStorageDir = new File("/source/solr/xml-dls/");
> :
> :   final Logger log = Logger.getLogger(GanjaUpdate.class.getName());
> :
> :   public GanjaUpdate(String siteName) {
> :     this.updateSite = siteName;
> :     log.info("GanjaUpdate is primed and ready to update " + siteName);
> :   }
> :
> :   public void update() {
> :     StringWriter sw = new StringWriter();
> :
> :     try {
> :       // transform gawkerInput XML to SOLR update XML
> :       XMLTransformer transform = new XMLTransformer();
> :       log.info("About to transform ganjaInput XML to Solr Update XML");
> :       transform.transform(getXML(), sw, getXSL());
> :       log.info("Completed ganjaInput/SolrUpdate XML transform");
> :
> :       // Write transformed XML to Disk.
> :       File transformedXML = new File(xmlStorageDir, updateSite+".sml");
> :       FileWriter fw = new FileWriter(transformedXML);
> :       fw.write(sw.toString());
> :       fw.close();
> :
> :       // post to Solr
> :       log.info("About to update Solr for site " + updateSite);
> :       String result = this.doUpdate(sw.toString());
> :       log.info("Solr says: " + result);
> :       sw.close();
> :     } catch (Exception e) {
> :       e.printStackTrace();
> :     }
> :   }
> :
> :   public File getXML() {
> :     String XML_URL = "http://localhost:8080/" + updateSite + "/ganja-full.xml";
> :
> :     // check for file
> :     File localXML = new File(xmlStorageDir, updateSite + ".xml");
> :
> :     try {
> :       if (localXML.createNewFile() && localXML.canWrite()) {
> :         // open connection
> :         log.info("Downloading: " + XML_URL);
> :         URL url = new URL(XML_URL);
> :         HttpURLConnection conn = (HttpURLConnection) url.openConnection();
> :         conn.setRequestMethod("GET");
> :
> :         // Read response to File
> :         log.info("Storing XML to File" + localXML.getCanonicalPath());
> :         FileOutputStream fos = new FileOutputStream(new File(xmlStorageDir,
> :             updateSite + ".xml"));
> :
> :         BufferedReader rd = new BufferedReader(new InputStreamReader(
> :             conn.getInputStream()));
> :         String line;
> :         while ((line = rd.readLine()) != null) {
> :           line = line + '\n'; // add break after each line. It preserves formatting.
> :           fos.write(line.getBytes("UTF8"));
> :         }
> :
> :         // close connections
> :         rd.close();
> :         fos.close();
> :         conn.disconnect();
> :         log.info("Got the XML... File saved.");
> :       }
> :     } catch (Exception e) {
> :       e.printStackTrace();
> :     }
> :
> :     return localXML;
> :   }
> :
> :   public File getXSL() {
> :     StringBuffer retVal = new StringBuffer();
> :
> :     // check for file
> :     File localXSL = new File(xmlStorageDir, "ganja.xsl");
> :
> :     try {
> :       if (localXSL.createNewFile() && localXSL.canWrite()) {
> :         // open connection
> :         log.info("Downloading: " + XSL_URL);
> :         URL url = new URL(XSL_URL);
> :         HttpURLConnection conn = (HttpURLConnection) url.openConnection();
> :         conn.setRequestMethod("GET");
> :         // Read response
> :         BufferedReader rd = new BufferedReader(new InputStreamReader(
> :             conn.getInputStream()));
> :         String line;
> :         while ((line = rd.readLine()) != null) {
> :           line = line + '\n';
> :           retVal.append(line);
> :         }
> :         // close connections
> :         rd.close();
> :         conn.disconnect();
> :
> :         log.info("Got the XSLT.");
> :
> :         // output file
> :         log.info("Storing XSL to File" + localXSL.getCanonicalPath());
> :         FileOutputStream fos = new FileOutputStream(new File(xmlStorageDir,
> :             "ganja.xsl"));
> :         fos.write(retVal.toString().getBytes());
> :         fos.close();
> :         log.info("File saved.");
> :       }
> :     } catch (Exception e) {
> :       e.printStackTrace();
> :     }
> :     return localXSL;
> :   }
> :
> :   private String doUpdate(String sw) {
> :     StringBuffer updateResult = new StringBuffer();
> :     try {
> :       // open connection
> :       log.info("Connecting to and preparing to post to SolrUpdate servlet.");
> :       URL url = new URL("http://localhost:8080/update");
> :       HttpURLConnection conn = (HttpURLConnection) url.openConnection();
> :       conn.setRequestMethod("POST");
> :       conn.setRequestProperty("Content-Type", "application/octet-stream");
> :       conn.setDoOutput(true);
> :       conn.setDoInput(true);
> :       conn.setUseCaches(false);
> :
> :       // Write to server
> :       log.info("About to post to SolrUpdate servlet.");
> :       DataOutputStream output = new DataOutputStream(conn.getOutputStream());
> :       output.writeBytes(sw);
> :       output.flush();
> :       output.close();
> :       log.info("Finished posting to SolrUpdate servlet.");
> :
> :       // Read response
> :       log.info("Ready to read response.");
> :       BufferedReader rd = new BufferedReader(new InputStreamReader(
> :           conn.getInputStream()));
> :       log.info("Got reader....");
> :       String line;
> :       while ((line = rd.readLine()) != null) {
> :         log.info("Writing to result...");
> :         updateResult.append(line);
> :       }
> :       rd.close();
> :
> :       // close connections
> :       conn.disconnect();
> :
> :       log.info("Done updating Solr for site" + updateSite);
> :     } catch (Exception e) {
> :       e.printStackTrace();
> :     }
> :
> :     return updateResult.toString();
> :   }
> : }
> :
> :
> : On 7/28/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> : >
> : >
> : > : I'm sure... it seems like solr is having trouble writing to a tomcat
> : > : response that's been inactive for a bit. It's only 30 seconds though, so
> : > : I'm not entirely sure why that would happen.
> : >
> : > but didn't you say you don't have this problem when you use curl -- just
> : > your java client code?
> : >
> : > Did you try Yonik's python test client? or the java client in Jira?
> : >
> : > looking over the java client code you sent, it's not clear if you are
> : > reading the response back, or closing the connections ... can you post a
> : > more complete sample app that exhibits the problem for you?
> : >
> : >
> : >
> : > -Hoss
> : >
> : >
> :
>
>
>
> -Hoss
>
>