Hi,
I am new to Solr and have been exploring it for the last few days.
I am using Solr Cell with Tika for parsing, indexing and searching,
and I post the rich text documents via SolrJ.
My actual requirement is to index web pages by URL (e.g.
http://www.apache.org) instead of local documents (PDF, DOC and DOCX).

For example, instead of
req.addFile(new File("docs/mailing_lists.html"));
I would like to write something like
req.url(new urlconnection("http://www.apache.org"));
Is there anything like that in SolrJ?

At the moment I am using curl for testing, and it works fine:

curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true" \
  -F "stream.url=http://wiki.apache.org/solr/SolrConfigXml"
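
To make clear what that curl call sends, here is a minimal sketch in plain Java (java.net only, no SolrJ) that builds the same /update/extract request URL, passing the page as the stream.url parameter. The class and method names are hypothetical and only for illustration. It also assumes, as the curl call does, that Solr is configured to accept remote streams, and that stream.url may be passed in the query string rather than as a form field.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ: it only assembles the request URL.
class ExtractUrlSketch {

    // Builds the /update/extract URL with the same parameters as the curl call,
    // naming the page to index in the stream.url request parameter.
    static String buildExtractUrl(String solrBase, String docId, String pageUrl) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("literal.id", docId);
        params.put("uprefix", "attr_");
        params.put("fmap.content", "attr_content");
        params.put("commit", "true");
        params.put("stream.url", pageUrl);

        StringBuilder sb = new StringBuilder(solrBase).append("/update/extract");
        char sep = '?';
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                // URL-encode keys and values so the page URL survives as one parameter.
                sb.append(sep)
                  .append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
                sep = '&';
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildExtractUrl("http://localhost:8983/solr", "doc1",
                "http://wiki.apache.org/solr/SolrConfigXml"));
    }
}
```

Opening the resulting URL with any HTTP client should be roughly equivalent to the curl call. It does not answer whether SolrJ itself has a direct way to do this.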

However, I need to do this with something other than curl.
The code below works fine for indexing and searching a local document, but
I want to post URLs instead.

Here is my code:

                String url = "http://localhost:8983/solr";
                SolrServer server = new CommonsHttpSolrServer(url);
                ContentStreamUpdateRequest req = new ContentStreamUpdateRequest(
                                "/update/extract");
                req.addFile(new File("docs/mailing_lists.html"));
                req.setParam("literal.id", "index1");
                req.setParam("uprefix", "attr_");
                req.setParam("fmap.content", "attr_content");
                req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
                NamedList result = server.request(req);
                assertNotNull("Couldn't upload mailing_lists.html", result);
                QueryResponse rsp = server.query(new SolrQuery("*:*"));
                Assert.assertEquals(1, rsp.getResults().getNumFound());

Any suggestions or answers would be appreciated.

