Re: getting DIH to read my XML files: solved
Shalin, thanks for the pointer. The following data-config.xml worked. The trick was realising that EVERY entity tag needs its own dataSource; I had been assuming it was implicit for certain processors. The whole thing is confusing in that there are both the dataSource element(s), which are to all intents and purposes required, and an optional dataSource attribute on the entity element. If the entity's dataSource attribute is missing, does it default to one of the defined ones? Unless you are using FileListEntityProcessor, where you have to explicitly state you are not using a dataSource.

As a newbie, my lesson learnt is to name every dataSource element I define and to reference named dataSources from every entity element I add, except for FileListEntityProcessor, where it has to be set to null.

Regards Fergus.
--
===
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021
Unix/Mac/Intranets             Analyst Programmer
===
Can't get HTMLStripTransformer's stripHTML to work in DIH.
Hello all,

I have the following DIH data-config.xml file. Adding HTMLStripTransformer and the associated stripHTML on the para tag seems to have broken things. I am using a nightly build from 12-jan-2009.

The /record/sect1/para contains HTML sub-tags which need to be discarded. Is my use of stripHTML correct?

--
===
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021
Unix/Mac/Intranets             Analyst Programmer
===
Re: getting DIH to read my XML files: solved
Hi Fergus,

The idea here is that if you do not give a name to your data source, then it is the 'default' data source and gets used automatically by all entities. If you decide to give a name to the data source, then it should be specified for each entity. Even when you have multiple data sources, you can decide to give a name to only one of them. The un-named one will be used for all entities which do not have a dataSource attribute.

Hope that helps.

On Mon, Jan 19, 2009 at 4:11 PM, Fergus McMenemie wrote:
> Shalin, thanks for the pointer.
>
> The following data-config.xml worked. The trick was realising
> that EVERY entity tag needs to have its own datasource, I guess
> I had been assuming that it was implicit for certain processors.
>
> The whole thing is confusing in that there is both the dataSource
> element(s), which is to all intents and purposes required, and an
> optional dataSource attribute of the entity element. If the entity
> dataSource attribute is missing it defaults to one of the defined
> ones??? Unless you are using FileListEntityProcessor where you have
> to explicitly state you are not using a dataSource.
>
> As a newbie I think my lesson learnt is to name every dataSource
> element I define and to reference named dataSources from every
> entity element I add, except for FileListEntityProcessor where
> it has to be set to null.
>
>   processor="FileListEntityProcessor"
>   dataSource="null"
>   fileName=".*xml"
>   newerThan="'NOW-1000DAYS'"
>   recursive="true"
>   rootEntity="false"
>   baseDir="/Volumes/spare/ts/j/groups">
>     processor="XPathEntityProcessor"
>     dataSource="myfilereader"
>     url="${jcurrent.fileAbsolutePath}"
>     stream="false"
>     forEach="/record"
>     transformer="DateFormatTransformer">
>       xpath="/record/metadata/subject[@qualifier='fullTitle']"/>
>       xpath="/record/metadata/subject[@qualifier='publication']"/>
>       xpath="/record/metadata/subject[@qualifier='pubAbbrev']"/>
>       xpath="/record/metadata/date[@qualifier='pubDate']"/>
>
> Regards Fergus.
> --
> ===
> Fergus McMenemie               Email:fer...@twig.me.uk
> Techmore Ltd                   Phone:(UK) 07721 376021
> Unix/Mac/Intranets             Analyst Programmer
> ===

--
Regards, Shalin Shekhar Mangar.
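A minimal sketch of the convention described above: one un-named default data source plus one named one, with FileListEntityProcessor opting out via dataSource="null". The entity names, paths and the field mapping here are made up for illustration.

    <dataConfig>
      <!-- Un-named: acts as the default for entities with no dataSource attribute -->
      <dataSource type="FileDataSource" encoding="UTF-8"/>
      <!-- Named: used only by entities that reference it explicitly -->
      <dataSource name="myfilereader" type="FileDataSource" encoding="UTF-8"/>
      <document>
        <!-- FileListEntityProcessor walks the filesystem itself, so it must
             opt out of any data source with dataSource="null" -->
        <entity name="files" processor="FileListEntityProcessor"
                dataSource="null" baseDir="/some/dir" fileName=".*xml"
                recursive="true" rootEntity="false">
          <!-- The nested entity reads each listed file via the named source -->
          <entity name="doc" processor="XPathEntityProcessor"
                  dataSource="myfilereader"
                  url="${files.fileAbsolutePath}" forEach="/record">
            <field column="title" xpath="/record/title"/>
          </entity>
        </entity>
      </document>
    </dataConfig>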
Re: Can't get HTMLStripTransformer's stripHTML to work in DIH.
This looks fine. Can you post the stack trace?

On Mon, Jan 19, 2009 at 4:14 PM, Fergus McMenemie wrote:
> Hello all,
>
> I have the following DIH data-config.xml file. Adding
> HTMLStripTransformer and the associated stripHTML on the
> para tag seems to have broken things. I am using a nightly
> build from 12-jan-2009.
>
> The /record/sect1/para contains HTML sub-tags which need
> to be discarded. Is my use of stripHTML correct?
>
>   processor="FileListEntityProcessor"
>   fileName=".*xml"
>   newerThan="'NOW-1000DAYS'"
>   recursive="true"
>   rootEntity="false"
>   dataSource="null"
>   baseDir="/Volumes/spare/ts/jxml/data/news/groups">
>     dataSource="myfilereader"
>     processor="XPathEntityProcessor"
>     url="${jcurrent.fileAbsolutePath}"
>     stream="false"
>     forEach="/record"
>     transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,HTMLStripTransformer">
>       template="${jcurrent.fileAbsolutePath}" />
>       replaceWith="$1" sourceColName="fileAbsePath"/>
>       stripHTML="true" />
>       xpath="/record/metadata/subject[@qualifier='fullTitle']" />
>       xpath="/record/metadata/subject[@qualifier='publication']" />
>       xpath="/record/metadata/date[@qualifier='pubDate']" dateTimeFormat="MMdd" />
>
> --
> ===
> Fergus McMenemie               Email:fer...@twig.me.uk
> Techmore Ltd                   Phone:(UK) 07721 376021
> Unix/Mac/Intranets             Analyst Programmer
> ===

--
Regards, Shalin Shekhar Mangar.
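For reference, the shape being discussed is roughly this: HTMLStripTransformer declared in the entity's transformer list, and the column flagged with stripHTML="true". Entity and column names here are guesses.

    <entity name="doc" processor="XPathEntityProcessor"
            dataSource="myfilereader" url="${files.fileAbsolutePath}"
            forEach="/record"
            transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,HTMLStripTransformer">
      <!-- stripHTML="true" asks HTMLStripTransformer to strip markup
           from this column's value after extraction -->
      <field column="para" xpath="/record/sect1/para" stripHTML="true"/>
    </entity>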
Re: Can't get HTMLStripTransformer's stripHTML to work in DIH.
>This looks fine. Can you post the stack trace?
>
Yep, here is the juicy bit. Let me know if you need more.

Jan 19, 2009 11:08:03 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 2390 ms
Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrCore execute
INFO: [janesdocs] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=12
Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO: Read dataimport.properties
Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO: Starting Full Import
Jan 19, 2009 11:14:06 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [janesdocs] REMOVING ALL DOCUMENTS FROM INDEX
Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=2
    commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_1,version=1232363283058,generation=1,filenames=[segments_1]
    commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_2,version=1232363283059,generation=2,filenames=[segments_2]
Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: last commit = 1232363283059
Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.EntityProcessorBase applyTransformer
WARNING: transformer threw error
java.lang.NullPointerException
    at java.io.StringReader.<init>(StringReader.java:33)
    at org.apache.solr.handler.dataimport.HTMLStripTransformer.stripHTML(HTMLStripTransformer.java:71)
    at org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54)
    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
    at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
    at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DocBuilder buildDocument
SEVERE: Exception while processing: janescurrent document : null
org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:203)
    at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
    at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
Caused by: java.lang.NullPointerException
    at java.io.StringReader.<init>(StringReader.java:33)
    at org.apache.solr.handler.dataimport.HTMLStripTransformer.stripHTML(HTMLStripTransformer.java:71)
    at org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54)
    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
    ... 9 more
Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:203)
    at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.
Re: Can't get HTMLStripTransformer's stripHTML to work in DIH.
Hmmm,

Just to clarify: I retested using the nightly as of today, 18-jan-2009. The problem is still there, and this traceback is from that nightly.

>>This looks fine. Can you post the stack trace?
>>
>Yep, here is the juicy bit. Let me know if you need more.
>
>[stack trace snipped: identical to the one above, a
>java.lang.NullPointerException from java.io.StringReader.<init> via
>HTMLStripTransformer.stripHTML, ending in "SEVERE: Full Import failed"]
MoreLikeThis/These and Queries
Hi, I'm using Solr's MoreLikeThese functionality for a rudimentary related products system, but we need it to only return related products that match certain criteria. For example, we don't want to see any related products that are discontinued. I'm having difficulty figuring out if there's a way to filter/query on the MLT result. Regards, Andrew Ingram
Re: MoreLikeThis/These and Queries
Hi,

Have you tried to add a filter directly to the /solr/mlt?q= request? Try adding "&fq=available:yes", and see if you can limit the MoreLikeThis documents to documents that have "yes" in the "available" field. I have had some success with this approach.

/Clas, Frisim.com

On Mon, Jan 19, 2009 at 2:49 PM, Andrew Ingram wrote:
> Hi,
>
> I'm using Solr's MoreLikeThese functionality for a rudimentary related
> products system, but we need it to only return related products that match
> certain criteria. For example, we don't want to see any related products
> that are discontinued. I'm having difficulty figuring out if there's a way
> to filter/query on the MLT result.
>
> Regards,
> Andrew Ingram
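A concrete form of that request, assuming the MoreLikeThisHandler is mapped at /mlt (host, document id and field names are made up):

    http://localhost:8983/solr/mlt?q=id:1234&mlt.fl=title&fq=available:yes&rows=5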
Re: MoreLikeThis/These and Queries
I think the problem might be that I'm using the standard handler with the mlt=true parameter. The MLT handler doesn't seem to be mentioned in my config file; do you know how I can enable it?

Regards,
Andrew Ingram

Clas Rydergren wrote:
> Hi,
>
> Have you tried to add a filter directly to the /solr/mlt?q= request?
> Try adding "&fq=available:yes", and see if you can limit the
> MoreLikeThis documents to documents that have "yes" in the
> "available" field. I have had some success with this approach.
>
> /Clas, Frisim.com
Re: Can't get HTMLStripTransformer's stripHTML to work in DIH.
Ah, it needs a null check for multi-valued fields. I've committed a fix to trunk. The next nightly build should have it. You can check out and build from the trunk if you need this immediately.

On Mon, Jan 19, 2009 at 7:02 PM, Fergus McMenemie wrote:
> Hmmm,
>
> Just to clarify: I retested using the nightly as of today, 18-jan-2009.
> The problem is still there, and this traceback is from that nightly.
>
> [stack trace snipped: same NullPointerException from
> java.io.StringReader.<init> via HTMLStripTransformer.stripHTML]
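The committed patch itself isn't quoted here, but the failure mode suggests a guard of roughly this shape in the transformer's row handling. This is a sketch only, with stripHTML standing in for the transformer's actual stripping routine:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    // Sketch: for a multi-valued column the row value is a List whose
    // entries can be null; skip those instead of handing null to
    // new StringReader(...), which is what triggered the NPE above.
    static void stripColumn(Map<String, Object> row, String column) {
        Object value = row.get(column);
        if (value instanceof List) {
            List<Object> stripped = new ArrayList<Object>();
            for (Object o : (List<?>) value) {
                stripped.add(o == null ? null : stripHTML(o.toString()));
            }
            row.put(column, stripped);
        } else if (value != null) {
            row.put(column, stripHTML(value.toString()));
        }
    }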
Re: WebLogic 10 Compatibility Issue - StackOverflowError
I am running into the same problem. I found that I can successfully run the "ping" request, for example: http://localhost:7101/solr/admin/ping — but all other requests to the Admin page fail.

KSY wrote:
>
> I hit a major roadblock while trying to get Solr 1.3 running on WebLogic 10.0.
>
> A similar message was posted before
> (http://www.nabble.com/Solr-1.3-stack-overflow-when-accessing-solr-admin-page-td20157873.html)
> but it seems like it hasn't been resolved yet, so I'm re-posting here.
>
> I am sure I configured everything correctly because it's working fine on Resin.
>
> Has anyone successfully run Solr 1.3 on WebLogic 10.0 or higher? Thanks.
>
> SUMMARY:
>
> When accessing the /solr/admin page, a StackOverflowError occurs due to an
> infinite recursion in SolrDispatchFilter.
>
> ENVIRONMENT SETTING:
>
> Solr 1.3.0
> WebLogic 10.0
> JRockit JVM 1.5
>
> ERROR MESSAGE:
>
> SEVERE: javax.servlet.ServletException: java.lang.StackOverflowError
>     at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:276)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
>     at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
>     at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
>     at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
>     at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
>     at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
>     at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
>     at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
>     at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
>     at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
Re: MoreLikeThis/These and Queries
Hi,

Even if you use the /solr/select version of MLT, I guess that just adding the fq parameter may work.

If you would like to add the MLT request handler, add a solr.MoreLikeThisHandler entry as a request handler plugin in your solrconfig.xml file (see the sketch after this message). If you are running on Tomcat you will have to make adjustments to your web.xml file.

/Clas, Frisim.com

On Mon, Jan 19, 2009 at 3:15 PM, Andrew Ingram wrote:
> I think the problem might be that I'm using the standard handler with the
> mlt=true parameter. The MLT handler doesn't seem to be mentioned in my
> config file; do you know how I can enable it?
>
> Regards,
> Andrew Ingram
>
> Clas Rydergren wrote:
>>
>> Hi,
>>
>> Have you tried to add a filter directly to the /solr/mlt?q= request?
>> Try adding "&fq=available:yes", and see if you can limit the
>> MoreLikeThis documents to documents that have "yes" in the
>> "available" field. I have had some success with this approach.
>>
>> /Clas, Frisim.com
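A sketch of such a registration, rebuilt around the surviving fragments "title,data" and "1"; the exact parameter names are a guess (mlt.fl lists the fields to compute similarity on, mlt.mintf is the minimum term frequency):

    <requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
      <lst name="defaults">
        <str name="mlt.fl">title,data</str>
        <int name="mlt.mintf">1</int>
      </lst>
    </requestHandler>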
Re: MoreLikeThis/These and Queries
Thanks, I'll try this.

I tried using the /select version, and the problem was that fq applies only to the original query rather than to the MLT results, which are effectively separate queries.

Clas Rydergren wrote:
> Hi,
>
> Even if you use the /solr/select version of MLT, I guess that just
> adding the fq parameter may work.
>
> If you would like to add the MLT request handler, add a
> solr.MoreLikeThisHandler entry as a request handler plugin in your
> solrconfig.xml file. If you are running on Tomcat you will have to
> make adjustments to your web.xml file.
>
> /Clas, Frisim.com
Re: MoreLikeThis/These and Queries
Ah, I see. I have, more or less, only used the MoreLikeThis (single document) version, and the fq filter is then applied to the (only) query that is made. Sorry.

/Clas

On Mon, Jan 19, 2009 at 4:42 PM, Andrew Ingram wrote:
> Thanks, I'll try this.
>
> I tried using the /select version, and the problem was that fq applies only
> to the original query rather than to the MLT results, which are effectively
> separate queries.
>
> Clas Rydergren wrote:
>>
>> Hi,
>>
>> Even if you use the /solr/select version of MLT, I guess that just
>> adding the fq parameter may work.
>>
>> If you would like to add the MLT request handler, add a
>> solr.MoreLikeThisHandler entry as a request handler plugin in your
>> solrconfig.xml file. If you are running on Tomcat you will have to
>> make adjustments to your web.xml file.
>>
>> /Clas, Frisim.com
Embedded Solr updates not showing until restart
Hi,

We're evaluating Solr for use in a web application. I've got the web application configured to use an embedded instance of Solr for queries (set up as a slave), and a remote instance for writes (set up as a master). The replication scripts are running fine and the embedded slave does appear to be getting the updates, but queries run against the embedded slave don't show the updates until I restart the web application. We're using SolrJ as our interface to Solr.

Can anyone provide any insight into why updates don't show up until after a webapp restart?

Thanks,
Erik
Re: Word Delimiter struggles
Thank you Shalin, I'm in the process of implementing your suggestion, and it works marvelously. I had to upgrade to Solr 1.3, and had to hack up acts_as_solr to function correctly.

Is there a way to receive a search for a given field and have Solr know to automatically check the two fields? I suppose not. I'm trying to avoid having to manipulate user input too much, so I'm hoping to be able to have a user search for title:phpGroupWare and have it search the two fields automatically. Right now, in implementing your solution, I take their title search and convert it to (titlew:(phpGroupWare) OR titlec:(phpGroupWare)) and it works marvelously, but of course it would be easier if I could just let it go as is. (titlew being the wdf_wordparts type and titlec being wdf_catenatewords.)

Thank you kindly; we've grown to depend strongly on Solr for OSVDB.org and datalossdb.org -- it is a fantastic tool.

Dave

On Sat, Jan 17, 2009 at 5:08 AM, Shalin Shekhar Mangar wrote:
> Hi Dave,
>
> A quick experimentation found the following fieldtypes to be successful
> with your queries. Add one as a copyField to the other and search on both:
>
>   generateWordParts="1" generateNumberParts="1" catenateWords="0"
>   catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
>   preserveOriginal="1"/>
>
>   generateWordParts="0" generateNumberParts="0" catenateWords="1"
>   catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
>   preserveOriginal="0"/>
>
> I added the following test to TestWordDelimiterFilter.java:
>
> public void testDave() {
>   assertU(adoc("id", "191", "wdf_preserve", "phpGroupWare"));
>   assertU(commit());
>
>   assertQ("preserving original word",
>       req("wdf_preserve:PHPGroupWare"), "//result[@numFound=1]");
>   assertQ("preserving original word",
>       req("wdf_wordparts:phpGroupWare wdf_catenatewords:phpGroupWare"),
>       "//result[@numFound=1]");
>   assertQ("preserving original word",
>       req("wdf_wordparts:PHPGroupware wdf_catenatewords:PHPGroupware"),
>       "//result[@numFound=1]");
>   assertQ("preserving original word",
>       req("wdf_wordparts:phpGroupware wdf_catenatewords:phpGroupware"),
>       "//result[@numFound=1]");
>   assertQ("preserving original word",
>       req("wdf_wordparts:phpgroupware wdf_catenatewords:phpgroupware"),
>       "//result[@numFound=1]");
>   assertQ("preserving original word",
>       req("wdf_wordparts:(php groupware) wdf_catenatewords:(php groupware)"),
>       "//result[@numFound=1]");
>   assertQ("preserving original word",
>       req("wdf_wordparts:(php group ware) wdf_catenatewords:(php group ware)"),
>       "//result[@numFound=1]");
>   assertQ("preserving original word",
>       req("wdf_wordparts:(PHPGroup ware) wdf_catenatewords:(PHPGroup ware)"),
>       "//result[@numFound=1]");
> }
>
> I'll let someone else comment if there is an easier way to do this
> (without two fields).
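Given the attribute lists quoted above and the field names used in the test, the two fieldtypes were presumably WordDelimiterFilterFactory analyzers of roughly this shape; the tokenizer and the trailing lowercase filter are assumptions:

    <fieldType name="wdf_wordparts" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Split on case changes and keep the original token too -->
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="0" catenateNumbers="0" catenateAll="0"
                splitOnCaseChange="1" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

    <fieldType name="wdf_catenatewords" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Glue the split parts back together into one token -->
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="0" generateNumberParts="0"
                catenateWords="1" catenateNumbers="1" catenateAll="1"
                splitOnCaseChange="1" preserveOriginal="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>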
advice on minimal solr/jetty
Hi everyone, I'd like to see how much I can reduce the startup time of jetty/solr. Right now I have it at about 3s - that's fast, but I'd like to see how close to zero I can get it. I've minimized my schema and solrconfig down to what I use (my solr needs are pretty vanilla). Now I'm looking at all the solr plugins that get loaded at startup that I don't use and wondering whether getting rid of those would help. But before jumping off into jar manipulation I figured I'd pose this question to the group - what would you do? -Steve
OOME diagnosis - possible to disable caching?
Hi all, I have 20 indices, each ~10GB in size, being searched by a single Solr slave instance (using the multicore features in a slightly old 1.2 dev build) I'm getting unpredictable, but inevitable, OutOfMemoryError from the slave, and I have no more physical memory to throw at the problem (HotSpot 1.6 with Xmx=4000m). At this point, I'd like to see how much memory Lucene is eating by disabling all supplemental Solr caches. Which solrconfig settings do I need to be paying attention to here? filterCache, queryResultCache and documentCache? I'm not faceting, sorting, highlighting or anything like that (all in an effort to get more docs in a searchable index!) Thanks! James
Re: OOME diagnosis - possible to disable caching?
On 19-Jan-09, at 2:44 PM, James Brady wrote:
> Hi all,
> I have 20 indices, each ~10GB in size, being searched by a single Solr
> slave instance (using the multicore features in a slightly old 1.2 dev
> build). I'm getting unpredictable, but inevitable, OutOfMemoryErrors from
> the slave, and I have no more physical memory to throw at the problem
> (HotSpot 1.6 with Xmx=4000m).
>
> At this point, I'd like to see how much memory Lucene is eating by
> disabling all supplemental Solr caches. Which solrconfig settings do I
> need to be paying attention to here? filterCache, queryResultCache and
> documentCache?

That should be it. Note that it is inadvisable to reduce the documentCache size to less than 100 or so, as Solr assumes it can cache the documents for one query.

A heap dump should help point you to the relevant problems. Also, it should be easy to get a rough estimate of the filterCache memory usage by looking at its size.

-Mike
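For reference, these are plain entries in the <query> section of solrconfig.xml. A sketch that effectively turns Solr-level caching off while keeping documentCache above the floor mentioned above (sizes are illustrative; deleting the filterCache and queryResultCache elements outright is another option):

    <query>
      <!-- Minimal caches: near-zero filter/queryResult caching,
           documentCache kept at ~100+ entries as advised above -->
      <filterCache      class="solr.LRUCache" size="1"   initialSize="1"   autowarmCount="0"/>
      <queryResultCache class="solr.LRUCache" size="1"   initialSize="1"   autowarmCount="0"/>
      <documentCache    class="solr.LRUCache" size="128" initialSize="128" autowarmCount="0"/>
    </query>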
Re: Using Threading while Indexing.
Your 3 instances are trying to acquire the physical lock on the index. If you want multi-threaded indexing, I would suggest the HTTP interface, as Solr will manage the request queue for you and index as many docs as your threads can send (resources permitting, obviously).

2009/1/19 Sagar Khetkade
>
> Hi,
>
> I was trying to index three sets of documents, each having 2000 articles,
> using three threads of embedded Solr server. But while indexing I get the
> exception "org.apache.lucene.store.LockObtainFailedException: Lock obtain
> timed out: SingleInstanceLock: write.lock". I know that this issue
> persists with Lucene; is it the same with Solr?
>
> Thanks and Regards,
> Sagar Khetkade.

--
Alexander Ramos Jardim
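A sketch of that pattern with the SolrJ of the same era: one shared HTTP server object and several indexing threads. The URL, field names and document counts are made up:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class ParallelIndexer {
        public static void main(String[] args) throws Exception {
            // One handle over HTTP; Solr queues and serializes the actual
            // index writes, so client threads never touch write.lock.
            final SolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
            Thread[] workers = new Thread[3];
            for (int t = 0; t < workers.length; t++) {
                final int id = t;
                workers[t] = new Thread(new Runnable() {
                    public void run() {
                        try {
                            for (int i = 0; i < 2000; i++) {
                                SolrInputDocument doc = new SolrInputDocument();
                                doc.addField("id", id + "-" + i);
                                doc.addField("text", "article body " + i);
                                server.add(doc);
                            }
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                });
                workers[t].start();
            }
            for (Thread w : workers) w.join();
            server.commit();
        }
    }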
Re: Embedded Solr updates not showing until restart
Do they show up if you use non-embedded? That is, if you hit that slave over HTTP from your browser, are the changes showing up?

On Jan 19, 2009, at 11:18 AM, edre...@ha wrote:
> Hi,
>
> We're evaluating Solr for use in a web application. I've got the web
> application configured to use an embedded instance of Solr for queries
> (set up as a slave), and a remote instance for writes (set up as a
> master). The replication scripts are running fine and the embedded slave
> does appear to be getting the updates, but queries run against the
> embedded slave don't show the updates until I restart the web
> application. We're using SolrJ as our interface to Solr.
>
> Can anyone provide any insight into why updates don't show up until
> after a webapp restart?
>
> Thanks,
> Erik

--
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
Re: Querying Solr Index for date fields
We want boosting of results for ?q=dateField:[NOW-45DAYS TO NOW] and ?q=dateField:[NOW TO NOW+45DAYS]. The below-mentioned fq tag gives me an error:

dateField:[NOW-45DAYS TO NOW]^1.0 dateField:[NOW TO NOW+45DAYS]^1.0

Please suggest the syntax for fields defined in the fq tag.

Thanks,
Prerna

Erik Hatcher wrote:
>
> It doesn't really make sense to use a date field in a dismax qf
> parameter. Use an fq parameter instead, to filter results by a date
> field.
>
> Dismax is aimed at end users' textual queries, not for field selection
> or more refined typed queries like date or numeric ranges.
>
> Erik
>
> On Jan 16, 2009, at 9:23 AM, prerna07 wrote:
>
>> We also make queries on date ranges; they work when you use the NOW
>> function. Try using:
>> ?q=dateField:[* TO NOW]
>> ?q=dateField:[NOW-45DAYS TO NOW]
>> ?q=dateField:[NOW TO NOW+45DAYS]
>>
>> Issue: The current issue I am facing is with the dismaxrequesthandler
>> for a date field. As soon as I add dateField to the dismaxrequest tag,
>> dismax for other string / text attributes stops working. My search
>> query is ?q=SearchString, and the error I get is
>> "The request sent by the client was syntactically incorrect (Invalid
>> Date String:'searchTerm')."
>>
>> Please suggest how I can use a date field in the qf of dismaxrequest.
>>
>> Thanks,
>> Prerna
>>
>> Akshay-8 wrote:
>>>
>>> You will have to URL encode the string correctly and supply the date
>>> in the format Solr expects. Please check this:
>>> http://wiki.apache.org/solr/SolrQuerySyntax
>>>
>>> On Fri, Jan 9, 2009 at 12:21 PM, Rayudu wrote:
>>>> Hi All,
>>>> I have a field which is solr.DateField in my schema file. If I want
>>>> to get the docs. for a given date, e.g. get all the docs. whose date
>>>> value is 2009-01-09, then how can I query my index? As solr's date
>>>> format is yyyy-mm-ddThh:mm:ss, if I give the date as
>>>> 2009-01-09T00:00:00Z it is throwing an exception "solr.SolrException:
>>>> HTTP code=400, reason=Invalid Date String:'2009-01-09T00'". If I give
>>>> the date as 2009-01-09 it is throwing an exception,
>>>> solr.SolrException: HTTP code=400, reason=Invalid Date
>>>> String:'2009-01-09'
>>>> Thanks,
>>>> Rayudu.
>>>
>>> --
>>> Regards,
>>> Akshay Ukey.
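One note that may help here: fq is a filter and ignores boost factors, so boosts placed there have no effect even when the syntax parses. With dismax, boost clauses usually go in the bq parameter instead. A sketch (handler name and boost values illustrative; URL-encode the brackets and spaces in practice):

    q=searchString&qt=dismax&bq=dateField:[NOW-45DAYS TO NOW]^2.0&bq=dateField:[NOW TO NOW+45DAYS]^2.0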
how can solr search against a group of fields
Good day, gentlemen. In my search engine I have 4 groups of text:
1) user address
2) user description
3) ...
4) ...
I want to give users the ability to search all of them, with the option of selecting a conjunction of some of them. Conjunction means that a user should be able to search fields 1) and 2), fields 1 and 3, and so on. I understand how to let them search everywhere: that can be achieved with copyField. But how can a user search for a bunch of terms across different groups? Now I'm using syntax like
+(default_field:WORD default_field:WORD2 default_field:WORD3)
If I want to let them search by 2 of the 4 fields, should I repeat the construction (as written out below)? i.e.
(field1:WORD field1:WORD2 field1:WORD3) (field2:WORD field2:WORD2 field2:WORD3)
Is there any way to specify field1,field2:TERM?
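Written out, the repeated construction described above looks like this for two of the fields; for what it's worth, the dismax handler's qf parameter is the usual shorthand for spreading one set of terms over several fields (field names as in the question):

    (field1:WORD field1:WORD2 field1:WORD3) (field2:WORD field2:WORD2 field2:WORD3)

    q=WORD WORD2 WORD3&qt=dismax&qf=field1 field2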
Searching for 'A*' is not returning the same result as 'a*'
Hi,

I am using the following analyser for indexing and querying. I search using the Solr admin console. When I search for institutionName:a*, I get 93 matching records. But when I search for institutionName:A*, I DO NOT get any matching records.

I did field Analysis for a* and A* with the analyzer configuration:

For a* -- http://www.nabble.com/file/p21557926/a-analysis.gif
For A* -- http://www.nabble.com/file/p21557926/A1-analysis.gif

As per my understanding, the analyzer is working fine in both cases. I am not able to understand why the query is not returning any results for A*. I feel that I am missing something; can anyone help me with that?

Regards,
Manu
Re: Help with Solr 1.3 lockups?
Java 1.5 has thread-locking bugs. Switching to Java 1.6 may cure this problem.

On Thu, Jan 15, 2009 at 10:57 AM, Jerome L Quinn wrote:
>
> Hi, all.
>
> I'm running solr 1.3 inside Tomcat 6.0.18. I'm running a modified query
> parser, tokenizer, highlighter, and have a CustomScoreQuery for dates.
>
> After some amount of time, I see solr stop responding to update requests.
> When crawling through the logs, I see the following pattern:
>
> Jan 12, 2009 7:27:42 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
> Jan 12, 2009 7:28:11 PM org.apache.solr.common.SolrException log
> SEVERE: Error during auto-warming of
> key:org.apache.solr.search.QueryResultKey@ce0f92b9
> :java.lang.OutOfMemoryError
>     at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
>     at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:167)
>     at org.apache.lucene.index.SegmentMergeInfo.next(SegmentMergeInfo.java:66)
>     at org.apache.lucene.index.MultiSegmentReader$MultiTermEnum.next(MultiSegmentReader.java:492)
>     at org.apache.lucene.search.FieldCacheImpl$7.createValue(FieldCacheImpl.java:267)
>     at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
>     at org.apache.lucene.search.FieldCacheImpl.getInts(FieldCacheImpl.java:245)
>     at org.apache.solr.search.function.IntFieldSource.getValues(IntFieldSource.java:50)
>     at org.apache.solr.search.function.SimpleFloatFunction.getValues(SimpleFloatFunction.java:41)
>     at org.apache.solr.search.function.BoostedQuery$CustomScorer.<init>(BoostedQuery.java:111)
>     at org.apache.solr.search.function.BoostedQuery$CustomScorer.<init>(BoostedQuery.java:97)
>     at org.apache.solr.search.function.BoostedQuery$BoostedWeight.scorer(BoostedQuery.java:88)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:126)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:105)
>     at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:966)
>     at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838)
>     at org.apache.solr.search.SolrIndexSearcher.access$000(SolrIndexSearcher.java:56)
>     at org.apache.solr.search.SolrIndexSearcher$2.regenerateItem(SolrIndexSearcher.java:260)
>     at org.apache.solr.search.LRUCache.warm(LRUCache.java:194)
>     at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1518)
>     at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1018)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:735)
>
> Jan 12, 2009 7:28:11 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
> SEVERE: Socket accept failed
> Throwable occurred: java.lang.OutOfMemoryError
>     at java.net.PlainSocketImpl.socketAccept(Native Method)
>     at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:414)
>     at java.net.ServerSocket.implAccept(ServerSocket.java:464)
>     at java.net.ServerSocket.accept(ServerSocket.java:432)
>     at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:61)
>     at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:310)
>     at java.lang.Thread.run(Thread.java:735)
>
> << Java dumps core and heap at this point >>
>
> Jan 12, 2009 7:28:21 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain
> timed out: SingleInstanceLock: write.lock
>     at org.apache.lucene.store.Lock.obtain(Lock.java:85)
>     at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1140)
>     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:938)
>     at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:116)
>     at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:122)
>     at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:167)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:221)
>     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
>     at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:196)
>     at org.apache.solr.handler.XmlUpdateRequest