On 29/01/2010, Ken Krugler <[email protected]> wrote:
>
> On Jan 28, 2010, at 10:09pm, amoldavsky wrote:
>
> >
> > Hi Oleg,
> > Thank you for the quick reply.
> >
> > So if there is a possibility that not the whole buffer is filled how can I
> > ensure or force HttpClient to fill the whole buffer? Should I maybe avoid
> > Stream Readers altogether?
> >
>
> If bufferSize is X, and the server document you're fetching has Y bytes,
> then what do you mean by "force HttpClient to fill the whole buffer"?
>
> At a minimum, you'd want
>
> int bytesRead = chunkedIns.read(tmp);
> if (bytesRead != -1) {
>     return new String(tmp, 0, bytesRead);
> }
>
> But that also uses the platform default encoding for the character set,
> which often won't be correct.
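
To make Ken's two points concrete, here is a minimal sketch that decodes only the bytes actually read and names the charset explicitly. The class name ChunkReader and the method readChunk are made up for illustration, and it assumes Java 7 or later for StandardCharsets. The right charset for a given response is also an assumption; in practice it should come from the Content-Type header.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper, not part of HttpClient.
final class ChunkReader {

    /**
     * Reads up to bufferSize bytes and decodes only the bytes that were
     * actually read. Returns null once the stream is exhausted.
     */
    static String readChunk(InputStream in, int bufferSize, Charset charset)
            throws IOException {
        byte[] tmp = new byte[bufferSize];
        // read() may return fewer bytes than the buffer can hold.
        int bytesRead = in.read(tmp);
        if (bytesRead == -1) {
            return null;
        }
        // Decode only tmp[0..bytesRead), so the unused tail of the buffer
        // (still zero-filled) never ends up in the result.
        return new String(tmp, 0, bytesRead, charset);
    }
}
```

A call would look like readChunk(chunkedIns, 2048, StandardCharsets.UTF_8). Note that with a multi-byte encoding a character can still be split across two chunks; wrapping the stream in an InputStreamReader avoids that.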
However, if the user just wants to create a file with the contents of
the response, then surely there is no need to mess with encodings?
Just write the bytes to a file output stream without any conversion.

> -- Ken
>
> >
> >
> > olegk wrote:
> > >
> > > On Wed, 2010-01-27 at 20:24 -0800, amoldavsky wrote:
> > > > Hi
> > > >
> > > > I have coded a simple file downloader using HttpClient 4.0.
> > > > It works fine but there is something wrong with the String encoding or the
> > > > buffer stream. The problem is that there are long sequences of "NULL" (ANSI
> > > > code 00) throughout the final file, like this:
> > > > http://old.nabble.com/file/p27350930/httpclient_error01.jpg
> > > > http://old.nabble.com/file/p27350930/httpclient_error02.jpg
> > > >
> > > > Here is the main code:
> > > >
> > > > public String getChunk(String url, int bufferSize) throws HTTPClientException
> > > > {
> > > >     if(!chunkedStarted)
> > > >     {
> > > >         chunkedIns = getInputStream(url);
> > > >         chunkedStarted = true;
> > > >     }
> > > >
> > > >     byte[] tmp = new byte[bufferSize];
> > > >     try
> > > >     {
> > > >         if(chunkedIns.read(tmp) != -1)
> > > >         {
> > >
> > > What makes you think that the entire buffer will be filled with data?
> > >
> > > Oleg
> > >
> > > >             return new String(tmp);
> > > >         }
> > > >         else
> > > >         {
> > > >             finish();
> > > >             return null;
> > > >         }
> > > >     }
> > > >     catch(IOException e)
> > > >     {
> > > >         HTTPClientException e2 = new HTTPClientException(e.getMessage());
> > > >         e2.setStackTrace(e.getStackTrace());
> > > >         throw e2;
> > > >     }
> > > > }
> > > >
> > > > public void finish()
> > > > {
> > > >     // do some cleaning
> > > > }
> > > >
> > > > private InputStream getInputStream(String url) throws HTTPClientException
> > > > {
> > > >     InputStream instream = null;
> > > >
> > > >     httpClient = new DefaultHttpClient();
> > > >     httpClient.getParams().setParameter("http.useragent", AGENT_NAME);
> > > >
> > > >     HttpGet httpGet = new HttpGet(url);
> > > >     HttpResponse response = null;
> > > >
> > > >     try
> > > >     {
> > > >         response = httpClient.execute(httpGet);
> > > >         HttpEntity entity = response.getEntity();
> > > >
> > > >         if(entity != null)
> > > >         {
> > > >             instream = entity.getContent();
> > > >         }
> > > >     }
> > > >     catch(ClientProtocolException e)
> > > >     {
> > > >         HTTPClientException e2 = new HTTPClientException(e.getMessage());
> > > >         e2.setStackTrace(e.getStackTrace());
> > > >         throw e2;
> > > >     }
> > > >     catch(IOException e)
> > > >     {
> > > >         HTTPClientException e2 = new HTTPClientException(e.getMessage());
> > > >         e2.setStackTrace(e.getStackTrace());
> > > >         throw e2;
> > > >     }
> > > >
> > > >     return instream;
> > > > }
> > > >
> > > > getChunk and getInputStream can basically be one method but I just have the
> > > > need to split them for internal convenience, that does not change the
> > > > functionality as a whole.
> > > >
> > > > It seems like either the conversion from bytes to string is a problem:
> > > > return new String(tmp);
> > > >
> > > > or that the buffer is not getting filled to the end. The latter could not be
> > > > possible because the files are ~30MB each and the buffer size is 2KB.
> > > >
> > > > I have attached the file, it's a CSV (shortened to ~6KB), note the long
> > > > white space between some of the URLs, if you just remove it, the URL makes
> > > > sense.
> > > > http://old.nabble.com/file/p27350930/datafeed.csv
> > > >
> > > > Where can this white space (null) come from??
> > > >
> > > > thanks!
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
> > >
> >
> > --
> > View this message in context:
> > http://old.nabble.com/HttpClient-4.0-encoding-madness-tp27350930p27366928.html
> > Sent from the HttpClient-User mailing list archive at Nabble.com.
> >
>
> --------------------------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g
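
For the plain "save the response to a file" case mentioned above, a rough sketch could look like the following. RawDownload and copyToFile are hypothetical names, not HttpClient API, and the code assumes Java 7 or later for try-with-resources. The input stream would be whatever entity.getContent() returned.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of HttpClient.
final class RawDownload {

    /**
     * Copies the stream to the given file byte for byte, with no String
     * or charset conversion. Returns the number of bytes written.
     */
    static long copyToFile(InputStream in, String path, int bufferSize)
            throws IOException {
        byte[] buf = new byte[bufferSize];
        long total = 0;
        try (OutputStream out = new FileOutputStream(path)) {
            int n;
            // Each read() may fill only part of the buffer, so write
            // exactly the n bytes that were read.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

Since no bytes are ever decoded, the encoding question does not come up, and a partially filled buffer cannot put NUL bytes into the output. The caller is still responsible for closing the input stream.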
