Index Update for XPathEntityProcessor
Hello,

First, apologies for cross-posting; I initially posted this on Stack Overflow [1] and just realised I might have better luck posting the question here.

I am relatively new to Solr and currently working with the XPathEntityProcessor in DIH. I have a dataset that will be periodically updated with new XML files, and I would like to find out which approach would be best suited, seeing that delta-import is only supported in SqlEntityProcessor [2].

[1] http://stackoverflow.com/q/13914465/664424
[2] http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command-1

--
Phiri
http://lightonphiri.org
Re: XsltUpdateRequestHandler - HTTP ERROR 404
> Are you using 4.0? In 4.0 it has been merged into a single

Sorry, forgot to specify the Solr version -- I am working with 4.0.

> As I understand it, if you post XML to http://localhost:8983/solr/update and
> provide a tr parameter, it will do
> the same thing as the XsltUpdateRequestHandler did.

Still no luck getting it to work; I am getting the same 404 error. A bit more context: I am working with the example directory setup, and so basically started my Solr server with the command below. Could it be that I don't have the stylesheet in the right place?

java -Dsolr.solr.home="./example-DIH/solr/" -jar start.jar

From what I could see before I dumped my stylesheet in the w5 core, the db core had a similar setup...

./solr/collection1/conf/xslt
./example-DIH/solr/db/conf/xslt
./example-DIH/solr/w5/conf/xslt

Lighton Phiri
http://lightonphiri.org
Re: XsltUpdateRequestHandler - HTTP ERROR 404
I finally understand what you meant and called UpdateRequestHandler as shown below.

curl "http://localhost:8983/solr/w5/update?commit=true&tr=x.xsl"

However, I am now getting a '500 - Unable to initialize Templates' error. Part of the error stack is below. I am not very familiar with Solr errors; I turned on debugging but still can't figure out what I am doing wrong. I know the stylesheet is fine because I can apply it to candidate XML files with xsltproc and it works.

I searched the mailing list archive and came up with this [1].

500 (QTime 30): Unable to initialize Templates 'x.xsl'

java.io.IOException: Unable to initialize Templates 'x.xsl'
    at org.apache.solr.util.xslt.TransformerProvider.getTemplates(TransformerProvider.java:117)
    at org.apache.solr.util.xslt.TransformerProvider.getTransformer(TransformerProvider.java:77)
    at org.apache.solr.handler.loader.XMLLoader.getTransformer(XMLLoader.java:180)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:116)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    :
    :

[1] http://lucene.472066.n3.nabble.com/getTransformer-error-td3047726.html

Lighton Phiri
http://lightonphiri.org
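For anyone scripting the same call instead of using curl, here is a minimal Python sketch of the request described above. It is only an illustration under stated assumptions: the function names build_update_url and post_xml are made up for this example, and the text/xml content type is an assumption about what the update handler expects rather than something confirmed in this thread.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_update_url(base_url, core, stylesheet, commit=True):
    """Build the /update URL with the tr (stylesheet) and commit parameters.

    base_url, core and stylesheet are placeholders; the values used in this
    thread were http://localhost:8983/solr, w5 and x.xsl.
    """
    params = {"commit": "true" if commit else "false", "tr": stylesheet}
    return f"{base_url.rstrip('/')}/{core}/update?{urlencode(params)}"


def post_xml(url, xml_path, timeout=30):
    """POST one XML file to the update URL and return (status, body).

    Assumption: the handler accepts a text/xml body. urlopen raises
    urllib.error.HTTPError for 4xx/5xx responses such as the 404 and 500
    seen in this thread, so callers may want to catch that.
    """
    with open(xml_path, "rb") as handle:
        body = handle.read()
    request = Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
    with urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

Calling build_update_url("http://localhost:8983/solr", "w5", "x.xsl") yields the same URL as the curl command; post_xml would then be called once per XML file.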
Re: XsltUpdateRequestHandler - HTTP ERROR 404
Finally got this to work -- thank you, Upayavira.

Turns out the issue was the result of a syntax error in my stylesheet -- I had <xsl:value> instead of <xsl:value-of> :( I didn't notice the xsltproc compilation errors before because it processed the valid styles and printed the errors at the beginning of the output.

compilation error: file example-DIH/solr/w5/conf/xslt/x.xsl line 86 element value
xsltStylePreCompute: unknown xsl:value
compilation error: file example-DIH/solr/w5/conf/xslt/x.xsl line 90 element value
xsltStylePreCompute: unknown xsl:value

Lighton Phiri
http://lightonphiri.org
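Since the xsltproc errors were easy to miss in the output, a small pre-flight check could catch an unknown xsl: element before the stylesheet is handed to Solr. The sketch below is a rough illustration, not a replacement for a real XSLT compiler: it assumes an XSLT 1.0 stylesheet, only checks element names in the XSLT namespace, and the function name unknown_xslt_elements is invented for this example.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# Element names defined by XSLT 1.0 (assumption: the stylesheet targets 1.0).
XSLT_1_0_ELEMENTS = {
    "apply-imports", "apply-templates", "attribute", "attribute-set",
    "call-template", "choose", "comment", "copy", "copy-of",
    "decimal-format", "element", "fallback", "for-each", "if", "import",
    "include", "key", "message", "namespace-alias", "number", "otherwise",
    "output", "param", "preserve-space", "processing-instruction", "sort",
    "strip-space", "stylesheet", "template", "text", "transform",
    "value-of", "variable", "when", "with-param",
}


def unknown_xslt_elements(stylesheet_source):
    """Return local names of xsl: elements that XSLT 1.0 does not define.

    stylesheet_source is the stylesheet as str or bytes. Names are returned
    in document order, with repeats, so two bad elements give two entries.
    """
    root = ET.fromstring(stylesheet_source)
    prefix = "{" + XSLT_NS + "}"
    unknown = []
    for element in root.iter():
        tag = element.tag
        if isinstance(tag, str) and tag.startswith(prefix):
            local_name = tag[len(prefix):]
            if local_name not in XSLT_1_0_ELEMENTS:
                unknown.append(local_name)
    return unknown
```

Run against a stylesheet containing the mistake reported by xsltproc, this would return "value" once for each offending element; an empty list only means the element names are known, not that the stylesheet is otherwise valid.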
Re: Index Update for XPathEntityProcessor
Just in case someone comes across this, I've decided to use UpdateRequestHandler [1] and it seems to be doing everything I want.

[1] http://wiki.apache.org/solr/UpdateRequestHandler

Lighton Phiri
http://lightonphiri.org
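The thread does not say how the new XML files are picked up on each run. One possible approach, sketched here purely as an assumption, is to remember the time of the last successful run and post only files modified since then. The helper names and the idea of a small state file are hypothetical, not something described in the messages above.

```python
import os
from pathlib import Path


def new_xml_files(directory, last_run):
    """Return .xml files under directory modified after last_run.

    last_run is a timestamp in epoch seconds. Results are ordered oldest
    first so files are indexed in the order they arrived.
    """
    candidates = [
        path
        for path in Path(directory).rglob("*.xml")
        if path.is_file() and path.stat().st_mtime > last_run
    ]
    return sorted(candidates, key=lambda path: (path.stat().st_mtime, str(path)))


def read_last_run(state_file):
    """Read the stored timestamp, or 0.0 if there is no usable state yet."""
    try:
        return float(Path(state_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return 0.0


def write_last_run(state_file, timestamp):
    """Store the timestamp of the run that just finished."""
    Path(state_file).write_text(repr(float(timestamp)))
```

A periodic job would call read_last_run, post each file returned by new_xml_files to the update handler, and call write_last_run only after every post has succeeded. Modification times are a coarse signal, so files that are rewritten without changes would be re-indexed.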
Re: Top Terms Using Luke
I suppose this will do; I just figured there'd be a built-in way of excluding stopwords. Thank you.

On 15 January 2013 22:08, Shawn Heisey wrote:
> To get an idea for which non-stopwords are dominant in your index, just ask
> for more top terms, instead of just the top ten or top twenty. If you are
> using a program to parse the information, have your program remove the terms
> that you don't want to include, then trim the list to the proper size.

Lighton Phiri
http://lightonphiri.org
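The client-side filtering Shawn describes can be sketched in a few lines of Python. This assumes the top terms have already been fetched and parsed into (term, count) pairs; the function name and the sample data in the usage note are illustrative only.

```python
def top_terms_without_stopwords(term_counts, stopwords, limit=10):
    """Drop stopwords from (term, count) pairs and keep the top `limit`.

    term_counts should contain more entries than `limit`, since some of
    them will be removed. Comparison with stopwords is case-insensitive.
    """
    blocked = {word.lower() for word in stopwords}
    kept = [(term, count) for term, count in term_counts if term.lower() not in blocked]
    # Highest counts first; ties keep their original relative order.
    kept.sort(key=lambda pair: -pair[1])
    return kept[:limit]
```

For example, given the pairs ("the", 900), ("solr", 400), ("and", 850), ("index", 300), ("xml", 120), the stopwords "the" and "and", and a limit of 2, the function keeps ("solr", 400) and ("index", 300).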