Re: Indexing XML
Benoit,

Are you familiar with the VuFind project (http://www.vufind.org)? Take a look at the PHP code in the import folder to see how the indexing works (there's an XSL transformation that then updates the index).

I've also written some initial code that uses embedded Solr to do this indexing directly from MARC format files, including holding the entire MARCXML record in the index. You can contact me off-list if you have questions...

Wayne

Walter Underwood wrote:
> Solr is not an XML engine (or a MARC engine). It uses XML as an input format
> for fielded data. It does not index or search arbitrary XML. You need to
> convert your XML into Solr's format.
>
> I would recommend expressing MARC in a Solr schema, then working on the
> input XML. The input XML depends on the schema.
>
> If you need an XML engine, I'd recommend MarkLogic (commercial), a very
> good product.
>
> wunder
>
> On 10/5/07 12:44 AM, "PAUWELS Benoit" <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I wish to index well formed xml documents as they are.
>>
>> I have a database filled with MARCXML records. An example of these looks like
>> this:
>>
>> ns0:schemaLocation="http://www.loc.gov/MARC21/slim
>> http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
>> xmlns="http://www.loc.gov/MARC21/slim"
>> xmlns:ns0="http://www.w3.org/2001/XMLSchema-instance">
>> 0nam 22 a 4500
>> 00050
>> 20050826220257.0
>> 000710s1998xx r 000 0 dut d
>> Univ
>> van Wetten, J. W.
>> De positie van vrouwen in de
>> asielprocedure
>> /
>> J.W. van Wetten, N. Dijkhof, F.
>> Heide.
>>
>> The idea is to create Lucene indexes on specific MARC fields and store the
>> complete MARC record in Lucene 'as is'. In the presentation layer of my
>> application I would then have this complete MARC record at hand, and as such
>> have full flexibility on which MARC fields to display. So I want to create
>> the following record through XSLT and feed this to SOLR.
>>
>> De positie van vrouwen in de asielprocedure
>> van Wetten, J. W.
>> ...
>> ns0:schemaLocation="http://www.loc.gov/MARC21/slim
>> http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
>> xmlns="http://www.loc.gov/MARC21/slim"
>> xmlns:ns0="http://www.w3.org/2001/XMLSchema-instance">
>> 0nam 22 a 4500
>> 00050
>> 20050826220257.0
>> 000710s1998xx r 000 0 dut d
>> UGent
>> van Wetten, J. W.
>> De positie van vrouwen in de
>> asielprocedure
>> /
>> J.W. van Wetten, N. Dijkhof, F.
>> Heide.
>>
>> I have the following in my schema.xml:
>>
>> termVectors="true"/>
>> termVectors="true"/>
>>
>> SOLR has of course a problem with the XML in the 'originalRecord' field.
>>
>> Is there a solution to this? Has anyone done this before?
>>
>> Thanks a lot.
>>
>> Benoit.
>>
>> =
>> PAUWELS Benoit
>> Université Libre de Bruxelles - Libraries
>> Head of Automation
>> Av. F.D. Roosevelt 50, CP 180
>> 1050 BRUSSELS
>> Belgium
>> Tel: + 32 2 650 23 91
>> Fax: + 32 2 650 23 91
>> =

--
/**
 * Wayne Graham
 * Earl Gregg Swem Library
 * PO Box 8794
 * Williamsburg, VA 23188
 * 757.221.3112
 * http://swem.wm.edu/blogs/waynegraham/
 */
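One way to approach the 'originalRecord' problem raised in this thread is to pass the MARCXML record to Solr as escaped text instead of nested markup, so the update message stays well-formed. The sketch below is only an illustration under stated assumptions: the field names `id`, `title` and `author` are hypothetical (only `originalRecord` appears in the original message), the sample record is a simplified stand-in, and the schema is assumed to declare `originalRecord` as a stored string-like field.

```python
import xml.etree.ElementTree as ET


def build_solr_add(doc_id, title, author, marcxml):
    """Build a Solr <add><doc> update message as a string.

    The raw MARCXML is assigned as element *text*, so ElementTree
    entity-escapes it (&lt;record ...&gt;) instead of nesting it as markup.
    Field names other than 'originalRecord' are hypothetical.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    fields = (
        ("id", doc_id),
        ("title", title),
        ("author", author),
        ("originalRecord", marcxml),
    )
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


# Simplified stand-in for a full MARCXML record (assumption for illustration).
record = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    "<leader>0nam 22 a 4500</leader>"
    "</record>"
)

payload = build_solr_add(
    "00050",
    "De positie van vrouwen in de asielprocedure",
    "van Wetten, J. W.",
    record,
)
print(payload)
```

When the stored field is read back, an XML parser un-escapes the text, so the presentation layer receives the original record string and can parse it as MARCXML itself.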
Tomcat JNDI Settings
I'm attempting to set up multiple instances of Solr using JNDI (taken from http://wiki.apache.org/solr/SolrTomcat). I created a new solr.xml context file in $CATALINA_HOME/conf/Catalina/localhost, with crossContext="true" and the Solr home value set to "/var/lib/tomcat5/solr/home" (override="true").

The box is running Tomcat 5.5.23, so the solr.war file is out of the webapps path, and the solr/home has the config folder from the examples directory. When I restart Tomcat, I get an error:

SEVERE: Exception starting filter SolrRequestFilter class
java.lang.NoClassDefFoundError: Could not initialize class
org.apache.solr.core.SolrConfig
...

Tomcat does create the folders in the webapps folder (and I've created multiple context files), but every single one of them throws the same error. Obviously I'm missing something, I just can't figure out what. Any thoughts?

TIA,
Wayne
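For readers following along, the per-instance context files described above can be generated with a small script. This is a sketch, not the poster's actual configuration: the fragment layout follows the general shape documented on the SolrTomcat wiki page, the war location `/opt/solr/apache-solr-1.2.0.war` is a hypothetical path, and `context_fragment` and `write_contexts` are illustrative helper names.

```python
from pathlib import Path
from xml.sax.saxutils import quoteattr


def context_fragment(war_path, solr_home):
    """Return a Tomcat context fragment that binds solr/home via JNDI."""
    return (
        f'<Context docBase={quoteattr(war_path)} debug="0" crossContext="true">\n'
        f'  <Environment name="solr/home" type="java.lang.String"\n'
        f'               value={quoteattr(solr_home)} override="true" />\n'
        "</Context>\n"
    )


def write_contexts(conf_dir, war_path, homes):
    """Write one <name>.xml context file per Solr instance.

    `homes` maps a context name (e.g. "solr1") to its Solr home directory.
    Returns the list of files written.
    """
    conf_dir = Path(conf_dir)
    conf_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, home in homes.items():
        target = conf_dir / f"{name}.xml"
        target.write_text(context_fragment(war_path, home), encoding="utf-8")
        written.append(target)
    return written


# Hypothetical war location; the thread only says it sits outside webapps.
print(context_fragment("/opt/solr/apache-solr-1.2.0.war",
                       "/var/lib/tomcat5/solr/home"))
```

Each instance gets its own context file and its own Solr home, while all of them can share the same war file.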
Re: Tomcat JNDI Settings
Hi Chris,

Thanks for getting back to me. The folder /var/lib/tomcat5/solr/home exists, as does /var/lib/tomcat5/solr/home/conf/solrconfig.xml. It's basically a copy of the files from the examples folder at this point.

I normally put war files in /var/lib/tomcat5/webapps, so I have the apache-solr-1.2.0.war file outside of the webapps folder.

Are there any special permissions these files need? I have them owned by the tomcat user.

Wayne

Chris Hostetter wrote:
> : : crossContext="true" >
> : : value="/var/lib/tomcat5/solr/home" override="true" />
> :
>
> : SEVERE: Exception starting filter SolrRequestFilter class
> : java.lang.NoClassDefFoundError: Could not initialize class
> : org.apache.solr.core.SolrConfig
>
> this may be a variant of SOLR-337 ... are you sure
> /var/lib/tomcat5/solr/home exists? does it contain a ./conf directory?
> does the conf directory contain a solrconfig.xml file?
>
> https://issues.apache.org/jira/browse/SOLR-337
>
> Second suggestion: is /var/lib/tomcat5/ the directory where you normally
> put war files for tomcat? i recall people saying that with tomcat, if you
> want to use a context file then the war *must* not be in the normal webapps
> directory...
>
> http://wiki.apache.org/solr/SolrTomcat#head-7036378fa48b79c0797cc8230a8aa0965412fb2e
>
> "For Tomcat 5.5 and later, the war file must be stored outside of the
> webapps directory for this to work. Otherwise, this Context element is
> ignored."
>
>
> -Hoss
Re: Tomcat JNDI Settings
Hi Hoss,

I just wanted to follow up to the list on this one... I could never get the JNDI settings to work with Tomcat. I went to Jetty and everything is working quite nicely.

Wayne

Chris Hostetter wrote:
> : Thanks for getting back to me. The folder /var/lib/tomcat5/solr/home
> : exists as does /var/lib/tomcat5/solr/home/conf/solrconfig.xml. It's
> : basically a copy of the files from examples folder at this point.
> :
> : I put war files in /var/lib/tomcat5/webapps, so I have the
> : apache-solr-1.2.0.war file outside of the webapps folder.
> :
> : Are there any special permissions these files need? I have them owned by
> : the tomcat user.
>
> that should be fine ... is /var/lib/tomcat5/solr/home/ writable by the
> tomcat user so it can make the ./data and ./data/index directories?
>
> are you sure there aren't any other errors in the logs above the one you
> mentioned already?
>
>
> -Hoss
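The checks Hoss suggests in this thread (Solr home exists, it contains conf/solrconfig.xml, it is writable, and the war lives outside webapps) can be sketched as a small pre-flight script. This is an illustration only: `check_solr_setup` is a hypothetical helper, the war path in the example call is an assumption, and the writability test applies to whichever user runs the script, so it would need to be run as the tomcat user to be meaningful.

```python
import os
from pathlib import Path


def check_solr_setup(solr_home, war_path, webapps_dir):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    home = Path(solr_home)

    if not home.is_dir():
        problems.append(f"Solr home {home} does not exist")
    else:
        if not (home / "conf" / "solrconfig.xml").is_file():
            problems.append(f"{home}/conf/solrconfig.xml is missing")
        # Solr needs to create ./data and ./data/index under the home directory.
        # os.access() tests the *current* user, so run this as the tomcat user.
        if not os.access(home, os.W_OK):
            problems.append(f"{home} is not writable by the current user")

    # With a context file, the war must not sit under the webapps directory.
    war = Path(war_path).resolve()
    webapps = Path(webapps_dir).resolve()
    if webapps in war.parents:
        problems.append(f"{war} is inside {webapps}; move it outside webapps")

    return problems


# The war location below is hypothetical; the thread does not state it.
for problem in check_solr_setup(
    "/var/lib/tomcat5/solr/home",
    "/var/lib/tomcat5/apache-solr-1.2.0.war",
    "/var/lib/tomcat5/webapps",
):
    print("WARNING:", problem)
```

A clean result does not rule out other causes, which is why the question about earlier errors in the Tomcat logs is still worth checking.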