I figured instead of trying to index content, I'd simply issue a query via
SolrJ. This seems related to my problem below. I create a
CommonsHttpSolrServer instance in the manner already described and in a new
method:

    @Override
    public List<String> getNodeIdsForProductId(final String productId,
            final String partnerId) {

        final List<String> nodes = new ArrayList<String>();
        final CommonsHttpSolrServer solrServer =
                (CommonsHttpSolrServer) getSolrServer(partnerId);
        final SolrQuery query = new SolrQuery();
        query.setQuery("productId:" + productId);
        query.addField("nodeId");
        try {
            final QueryResponse response = solrServer.query(query);
            final SolrDocumentList docs = response.getResults();
            log.info(String.format(
                    "getNodeIdsForProductId - got %d nodes for productId: %s",
                    docs.getNumFound(), productId));
            for (SolrDocument doc : docs) {
                log.info(doc);
            }
        } catch (SolrServerException ex) {
            final String msg = String.format(
                    "Unable to query Solr server %s, for query: %s",
                    solrServer.getBaseURL(), query);
            log.error(msg);
            throw new ServiceException(msg, ex);
        }
        return nodes;
    }

When issuing the query I get:
2011-02-14 13:13:28 INFO solr.SolrProductIndexService - getSolrServer - Solr
url: http://localhost:8080/solr/partner-tmo
2011-02-14 13:13:28 INFO solr.SolrProductIndexService - getSolrServer -
construct server for url: http://localhost:8080/solr/partner-tmo
2011-02-14 13:13:28 ERROR solr.SolrProductIndexService - Unable to query Solr
server http://localhost:8080/solr/partner-tmo, for query:
q=productId%3Aproduct4&fl=nodeId
...
Caused by: org.apache.solr.client.solrj.SolrServerException:
org.apache.commons.httpclient.NoHttpResponseException: The server localhost
failed to respond
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:484)
...
Caused by: org.apache.commons.httpclient.NoHttpResponseException: The server
localhost failed to respond
at
org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1976)
If I run this through the proxy again, I can see the request being made as:
GET
/solr/partner-tmo/select?q=productId%3Aproduct4&fl=nodeId&wt=xml&version=2.2
HTTP/1.1
User-Agent: Solr[org.apache.solr.client.solrj.impl.CommonsHttpSolrServer] 1.0
Host: localhost:8080
And I get no response from Solr. If instead I use this URL in Firefox:
http://localhost:8080/solr/partner-tmo/select?q=productId%3Aproduct4&fl=nodeId&wt=xml&version=2.2
I get search results. What is it about SolrJ that is just not working out?
What basic thing am I missing? Using Firefox here, or curl below, I can talk to
Solr (running in Tomcat 6) just fine. But when going via SolrJ, I cannot update
or query. All of this stuff is running on a single system. I guess I'll try a
simpler app/unit test to see what happens...
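
One way to build that simpler test is to take SolrJ out of the picture and issue the same GET with nothing but the JDK. A minimal sketch follows. The class name PlainSolrGet and its method names are made up for illustration; the core URL, query and field are the ones from the log output above. It is a diagnostic aid under those assumptions, not a fix.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical diagnostic class: sends the select request without SolrJ.
public class PlainSolrGet {

    // Builds the same select URL that SolrJ was seen sending through the proxy.
    public static String selectUrl(String coreUrl, String query, String field) {
        try {
            return coreUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&fl=" + URLEncoder.encode(field, "UTF-8")
                    + "&wt=xml&version=2.2";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Issues the GET and returns "<status> <length> bytes", which is enough
    // to tell a real XML response from an empty or one-byte body.
    public static String fetchSummary(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            int status = conn.getResponseCode();
            InputStream in = status < 400 ? conn.getInputStream() : conn.getErrorStream();
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            if (in != null) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    body.write(buf, 0, n);
                }
                in.close();
            }
            return status + " " + body.size() + " bytes";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String url = selectUrl("http://localhost:8080/solr/partner-tmo",
                "productId:product4", "nodeId");
        System.out.println(url);
        // Only contact the server when explicitly asked to.
        if (args.length > 0 && args[0].equals("--fetch")) {
            System.out.println(fetchSummary(url));
        }
    }
}
```

Running it with --fetch from the same JVM and classpath as the application should show whether the failure is specific to SolrJ or affects any HTTP client in that process.
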
This is really a big problem for me. Any suggestions are greatly appreciated.
Thanks,
Jeff
On Feb 13, 2011, at 9:15 PM, Jeff Schmidt wrote:
> Hello again:
>
> Back to the javabin issue:
>
> On Feb 12, 2011, at 6:07 PM, Lance Norskog wrote:
>
>> --- But I'm unable to get SolrJ to work due to the 'javabin' version
>> mismatch. I'm using the 1.4.1 version of SolrJ, but I always get an
>> HTTP response code of 200, but the return entity is simply a null
>> byte, which does not match the version number of 1 defined in Solr
>> common. ---
>>
>> I've never seen this problem. At this point you are better off
>> starting with 3.x instead of chasing this problem down.
>
> I'm now using the latest branch_3x build of Solr and SolrJ. I've seen this
> message mentioned in other places:
>
> Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 0) or
> the data in not in 'javabin' format at
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99) at
> org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
>
>
> One was told to make sure the version of Solr and SolrJ are compatible, and
> that the schema is valid. Unlike 1.4, I see 3.1 actually outputs the expected
> and received version numbers, which is helpful. You can see the invalid
> version of 0 is indicated which is the zero byte I receive in response.
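
To make the version check concrete: the javabin parser reads the first byte of the response body as a format version and rejects anything else. The sketch below imitates that check with plain JDK code. JavaBinVersionCheck and describe are illustrative names, not part of SolrJ, and the message is only modelled on the exception text quoted above.

```java
// Hypothetical stand-in for the version check; not SolrJ code.
public class JavaBinVersionCheck {

    // Version byte that the quoted 3.x exception says it expected.
    static final int EXPECTED_VERSION = 2;

    // Returns null when the first byte matches the expected version,
    // otherwise a message in the style of the quoted exception.
    public static String describe(byte[] body) {
        if (body == null || body.length == 0) {
            return "Empty response body";
        }
        int version = body[0] & 0xFF; // treat the byte as unsigned
        if (version == EXPECTED_VERSION) {
            return null;
        }
        return "Invalid version (expected " + EXPECTED_VERSION
                + ", but " + version + ")";
    }

    public static void main(String[] args) {
        // The single 0x00 byte seen in the proxy.
        System.out.println(describe(new byte[] {0x00}));
    }
}
```

A body consisting of one null byte fails this check regardless of which Solr and SolrJ versions are paired, which suggests the problem is the response itself rather than a version mismatch.
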
>
> I have Solr running within Tomcat by following the wiki. I have the
> conf/Catalina/localhost/solr.xml file set as:
>
> <?xml version="1.0" encoding="utf-8"?>
> <Context
>     docBase="/usr/local/ingenuity/isec/solr/apache-solr-3.1-SNAPSHOT.war"
>     debug="0" crossContext="true">
>   <Environment name="solr/home" type="java.lang.String"
>       value="/Users/jas/535Consulting/Clients/Ingenuity/ProfServices/svn/trunk/ing/isec/src/main/solr/multicore"
>       override="true"/>
> </Context>
>
> With that, I'm able to use my browser to index some content (DIH, curl etc.)
> and issue queries, so it seems Solr is running okay in tomcat
> (apache-tomcat-6.0.30). To index some Products, I have this simple method:
>
>     @Override
>     public void addProducts(final Collection<Product> products,
>             final String indexName) {
>
>         log.info(String.format(
>                 "addProducts - indexing %d products to Solr core: %s",
>                 products.size(), indexName));
>
>         Assert.notNull(indexName);
>
>         final Collection<SolrInputDocument> docs =
>                 new ArrayList<SolrInputDocument>();
>         for (Product product : products) {
>             final SolrInputDocument doc = createDocumentForProduct(product);
>             docs.add(doc);
>             log.info("addProduct: document to index: " + doc);
>         }
>
>         final SolrServer solrServer = getSolrServer(indexName);
>         try {
>             solrServer.add(docs);
>             solrServer.commit(commitWaitFlush, commitWaitSearcher);
>         } catch (Exception ex) {
>             final String msg = String.format(
>                     "Unable to add and commit %d documents to core: %s",
>                     products.size(), indexName);
>             log.error(msg);
>             throw new ServiceException(msg, ex);
>         }
>     }
>
> And I have:
>
>     protected SolrServer getSolrServer(final String indexName) {
>
>         final String url = solrServerBaseUrl + indexName;
>         log.info("getSolrServer - construct server for url: " + url);
>         try {
>             final CommonsHttpSolrServer solrServer =
>                     new CommonsHttpSolrServer(solrServerBaseUrl + indexName);
>             //solrServer.setParser(new BinaryResponseParser());
>             //solrServer.setParser(new XMLResponseParser());
>             solrServer.setRequestWriter(new BinaryRequestWriter());
>             return solrServer;
>         } catch (Exception ex) {
>             final String msg = String.format(
>                     "Unable to create Solr server for url: %s", url);
>             log.error(msg);
>             throw new ServiceException(msg, ex);
>         }
>     }
>
> Note that this is code for prototyping. :)
>
> As you can see, in getSolrServer() I'm trying various settings.
> http://wiki.apache.org/solr/Solrj is tagged Solr 1.4, but I'm assuming it's
> at least very similar in 3.1. For the core in question, solrconfig.xml does
> have:
>
> <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
> <requestHandler name="/update/javabin"
> class="solr.BinaryUpdateRequestHandler" />
>
> I can see in the Solr log:
>
> Feb 13, 2011 3:35:14 PM org.apache.solr.core.RequestHandlers
> initHandlersFromConfig
> INFO: created /update/javabin: solr.BinaryUpdateRequestHandler
>
> Running this through the burp proxy to try to see what's going on, I can see
> my application making the following request to Solr via SolrJ:
>
> ------------------------------------------
> POST /solr/partner-tmo/update/javabin?wt=javabin&version=2 HTTP/1.1
> User-Agent: Solr[org.apache.solr.client.solrj.impl.CommonsHttpSolrServer] 1.0
> Host: localhost:8090
> Content-Type: application/octet-stream
> Content-Length: 543
>
> Äà¶msÀà'delByIdà&delByQà$docs…Áà%boostÃà$name"idà#val-ING:afa|08520åÃæ&nodeIdç'ING:afaåÃæ)productIdç%08520åÃæ0nodeSourceIdTypeç"EGåÃæ,nodeSourceIdç#672åÃæ+productTypeç≠(chemical%shRNAåÃæ-description_tç?
> This is the description for product
> 08520åÃæ'brand_sç%FLUKAåÃæ%sku_sç(A980-852å…ÁåÃæ"idç-ING:afa|08530åÃæ&nodeIdç'ING:afaåÃæ)productIdç%08530åÃæ0nodeSourceIdTypeç"EGåÃæ,nodeSourceIdç#672åÃæ+productTypeç™(chemicalåÃæ-description_tç?
> This is the description for product
> 08530åÃæ'brand_sç%FLUKAåÃæ%sku_sç(A980-853å
> ------------------------------------------
>
> That looks pretty binary. In response I see:
>
> ------------------------------------------
> HTTP/1.0 200 OK
> Content-type: application/octet-stream
> Content-length: 1
>
>
> ------------------------------------------
>
> Looking at the hex view, I can see the one byte of data is 0x00.
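
The same inspection can be done from the client side, without the proxy's hex view, by logging the raw response bytes as hex. This is a small JDK-only sketch; HexView and toHex are names invented here for illustration.

```java
// Hypothetical helper for logging a response body as hex.
public class HexView {

    // Formats bytes as space-separated, two-digit lowercase hex.
    public static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The one-byte body from the response above; prints 00
        System.out.println(toHex(new byte[] {0x00}));
    }
}
```
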
>
> My other approach was to go the XML route. So, to do this, I comment out the
> setting of the request writer and go with the default, which the wiki says is
> XML. Running this through the proxy I see:
>
> ------------------------------------------
> POST /solr/partner-tmo/update?wt=javabin&version=2 HTTP/1.1
> User-Agent: Solr[org.apache.solr.client.solrj.impl.CommonsHttpSolrServer] 1.0
> Host: localhost:8090
> Content-Type: text/xml; charset=utf-8
> Content-Length: 856
>
> <add><doc boost="1.0"><field name="id">ING:afa|08520</field><field
> name="nodeId">ING:afa</field><field name="productId">08520</field><field
> name="nodeSourceIdType">EG</field><field
> name="nodeSourceId">672</field><field
> name="productType">chemical</field><field
> name="productType">shRNA</field><field name="description_t">This is the
> description for product 08520</field><field
> name="brand_s">FLUKA</field><field name="sku_s">A980-852</field></doc><doc
> boost="1.0"><field name="id">ING:afa|08530</field><field
> name="nodeId">ING:afa</field><field name="productId">08530</field><field
> name="nodeSourceIdType">EG</field><field
> name="nodeSourceId">672</field><field
> name="productType">chemical</field><field name="description_t">This is the
> description for product 08530</field><field
> name="brand_s">FLUKA</field><field name="sku_s">A980-853</field></doc></add>
> ------------------------------------------
>
> I can see it has specified the proper update handler in the URI and XML is
> being uploaded. In response though, I get the exact same one as when going
> binary, including the application/octet-stream content type. I end up with
> the same javabin version mismatch stacktrace as well, even though I'm trying
> to talk XML. But, the presence of wt=javabin&version=2 is not very
> encouraging when going the default XML route.
>
> Just for grins, I added:
>
> solrServer.setParser(new XMLResponseParser());
>
> And now the exception is:
>
> Caused by:
> com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character (NULL,
> unicode 0) encountered: not valid in any content| at [row,col
> {unknown-source}]: [1,1]
> at
> com.ctc.wstx.sr.StreamScanner.constructNullCharException(StreamScanner.java:640)
>
> So, apparently javabin is out of the way, and now the zero byte returned is
> mucking up the XML parser. According to the proxy, the request is similar to
> the previous one, but:
>
> POST /solr/partner-tmo/update?wt=xml&version=2.2 HTTP/1.1
>
> So, great we are going the XML route, but the response is the same HTTP 200
> and a zero byte...
>
> So, I'm not sure what's going on. I can say though that I am not seeing any
> log activity in solr.2011-*.log once Solr has started and I attempt to issue
> these requests. Maybe it's tomcat? But, if I go directly to Solr outside of
> SolrJ and add some content to that same core, I do see log activity, and get
> a valid response:
>
> [imac:solr/input-data/tmo-products] jas% curl --header "Content-type:
> text/xml; charset=utf-8" --request POST -w "\nhttp code: %{http_code}\n" -d
> @nodes-products.xml "http://localhost:8080/solr/partner-tmo/update"
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">0</int><int
> name="QTime">11</int></lst>
> </response>
>
> http code: 200
> [imac:solr/input-data/tmo-products] jas%
>
> Much better than a single null byte.
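
For a closer comparison with SolrJ, the curl request can be reproduced from inside a JVM using only the JDK. The sketch below is one way to do that, under stated assumptions: PlainSolrPost and its methods are illustrative names, the core URL is passed on the command line, and the payload is a bare `<commit/>` so that nothing is added to the index.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical diagnostic class: posts XML to the update handler without SolrJ.
public class PlainSolrPost {

    // Appends the XML update handler path, tolerating a trailing slash.
    public static String updateUrl(String coreUrl) {
        String base = coreUrl.endsWith("/")
                ? coreUrl.substring(0, coreUrl.length() - 1)
                : coreUrl;
        return base + "/update";
    }

    // Posts the XML with the same Content-Type header as the curl command
    // and returns the status code followed by the response body.
    public static String post(String url, String xml) throws IOException {
        byte[] payload = xml.getBytes("UTF-8");
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        try {
            OutputStream out = conn.getOutputStream();
            out.write(payload);
            out.close();
            int status = conn.getResponseCode();
            InputStream in = status < 400 ? conn.getInputStream() : conn.getErrorStream();
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            if (in != null) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    body.write(buf, 0, n);
                }
                in.close();
            }
            return "http code: " + status + "\n" + body.toString("UTF-8");
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PlainSolrPost <core url>");
            return;
        }
        System.out.println(post(updateUrl(args[0]), "<commit/>"));
    }
}
```

If this returns the XML responseHeader while SolrJ still gets a null byte, the difference lies in how the SolrJ request is built or routed; if it also fails, the JVM's network path (for example the proxy on port 8090) is the more likely suspect.
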
>
> Any idea of what is going on? Apologies for the long email, but I'm trying
> to provide all the details. This problem must reside in something I'm doing.
> I'm sure others are using SolrJ successfully.
>
> Thanks!
>
> Jeff
>
>>
>> On Sat, Feb 12, 2011 at 1:37 PM, Jeff Schmidt <[email protected]> wrote:
>>> Hello:
>>>
>>> I'm working on incorporating Solr into a SaaS based life sciences semantic
>>> search project. This will be released in about six months. I'm trying to
>>> determine which version of Solr makes the most sense. When going to the
>>> Solr download page, there are 1.3.0, 1.4.0, and 1.4.1. I've been using
>>> 1.4.1 while going through some examples in my Packt book ("Solr 1.4
>>> Enterprise Search Server").
>>>
>>> But, I also see that Solr 3.1 and 4.0 are in the works. According to:
>>>
>>>
>>> https://issues.apache.org/jira/browse/#selectedTab=com.atlassian.jira.plugin.system.project%3Aroadmap-panel
>>>
>>> there is a high degree of progress on both of those releases; including a
>>> slew of bug fixes, new features, performance enhancements etc. Should I be
>>> making use of one of the newer versions? The hierarchical faceting seems
>>> like it could be quite useful. Are there any guesses on when either 3.1 or
>>> 4.0 will be officially released?
>>>
>>> So far, 1.4.1 has been good. But I'm unable to get SolrJ to work due to the
>>> 'javabin' version mismatch. I'm using the 1.4.1 version of SolrJ, but I
>>> always get an HTTP response code of 200, but the return entity is simply a
>>> null byte, which does not match the version number of 1 defined in Solr
>>> common. Anyway, I can follow up on that issue if 1.4.1 is still the most
>>> appropriate version to use these days. Otherwise, I'll try again with
>>> whatever version you suggest.
>>>
>>> Thanks a lot!
>>>
>>> Jeff
>>> --
>>> Jeff Schmidt
>>> 535 Consulting
>>> [email protected]
>>> (650) 423-1068
>>>
>>
>>
>>
>> --
>> Lance Norskog
>> [email protected]
>
> --
> Jeff Schmidt
> 535 Consulting
> [email protected]
> (650) 423-1068
> http://www.535consulting.com
>
--
Jeff Schmidt
535 Consulting
[email protected]
(650) 423-1068
http://www.535consulting.com