Re: Aliases for fields

2009-08-18 Thread Fergus McMenemie
tored="true" multiValued="false" termVectors="false" >> alias="source.date"/> >> >> is there any jira issue related? >> >> Thx >> >> -- >> Lici >> -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===

Netbeans and Solr : Whac-A-Mole

2009-09-07 Thread Fergus McMenemie
. PS: I am a total netbeans newbie.

Re: Netbeans and Solr : Whac-A-Mole

2009-09-07 Thread Fergus McMenemie
But when you want to run testcases... you are doing that from the command line? Are you only using the IDE as an editor? >Regards >Rajan > >On Mon, Sep 7, 2009 at 3:26 PM, Fergus McMenemie wrote: > >> Hello all, >> >> I would appreciate help from somebody who h

Re: Netbeans and Solr : Whac-A-Mole

2009-09-07 Thread Fergus McMenemie
nd play with XPathRecordReader.java other than ant -Dtestcase=TestXPathRecordReader test Which takes 8secs to run here? I am not using XpathRecordReader outside of DIH, but looking to see how I would add support for xpaths such as //a. Fergus. > >On Mon, Sep 7, 2009 at 3:26 PM, Fergus McMenemie w

Re: Netbeans and Solr : Whac-A-Mole

2009-09-07 Thread Fergus McMenemie
>On Mon, Sep 7, 2009 at 5:58 PM, Fergus McMenemie wrote: > >> >This testcase is quite independent of anything in Solr. It is a >> >standalone utility and the only dependency is stax. >> >disclaimer (I run these testcases from Intellij and command line) >>

Re: Specifying multiple documents in DataImportHandler dataConfig

2009-09-08 Thread Fergus McMenemie
. >> >> If I remove the 2 document elements and wrap both entity sets in just >> one document tag, then both sets get indexed, which seemingly achieves >> my goal. This just doesn't make sense from my understanding of how DIH >> works. My 2 content types are indeed separate so

RE: Extract info from parent node during data import

2009-09-10 Thread Fergus McMenemie
/20090817070752.xml" >> > processor="XPathEntityProcessor" forEach="/document/category/item" >> > transformer="DateFormatTransformer" stream="true" dataSource="dataSource"> >> >> > commonField="true" /> >> > >> > This is how I have specified my schema >> > >> > > > required="true" /> >> > >> > id >> > id >> >> -- >> - >> Noble Paul | Principal Engineer| AOL | http://aol.com

Re: Extract info from parent node during data import

2009-09-11 Thread Fergus McMenemie
> >> > >>> >> > - category: Category 2; id: 4; author: Author 4 >>> >> > >>> >> > Any ideas on how I can get to a parent node from within a child during >>> >> > data import? If it can't be done, what do you suggest would be the best >>> >> > way so I can keep using the DataImportHandler... would XSLT be a good >>> >> > idea to 'flatten out' the structure a bit? >>> >> > >>> >> > Thanks >>> >> > >>> >> > This is what my XML document looks like: >>> >> > >>> >> > Category 1 >>> >> > 1 >>> >> > Author 1 >>> >> > 2 >>> >> > Author 2 >>> >> > Category 2 >>> >> > 3 >>> >> > Author 3 >>> >> > 4 >>> >> > Author 4 >>> >> > >>> >> > And this is what my dataConfig looks like: >>> >> > >>> >> > url="http://localhost:9080/data/20090817070752.xml" >>> >> > processor="XPathEntityProcessor" forEach="/document/category/item" >>> >> > transformer="DateFormatTransformer" stream="true" >>> >> > dataSource="dataSource"> >>> >> > commonField="true" /> >>> >> > >>> >> > This is how I have specified my schema >>> >> > >>> >> > required="true" /> >>> >> > >>> >> > id >>> >> > id >>> >> >
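The XSLT-style "flatten out" idea raised in this message can be sketched outside DIH in a few lines of Python. This is only an illustration under stated assumptions: the element names `name`, `id` and `author` are hypothetical (the archive stripped the original tags), and only the `/document/category/item` nesting comes from the thread's `forEach` expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: tag names <name>, <id>, <author> are assumed.
SAMPLE = """
<document>
  <category>
    <name>Category 1</name>
    <item><id>1</id><author>Author 1</author></item>
    <item><id>2</id><author>Author 2</author></item>
  </category>
  <category>
    <name>Category 2</name>
    <item><id>3</id><author>Author 3</author></item>
    <item><id>4</id><author>Author 4</author></item>
  </category>
</document>
"""


def flatten(xml_text):
    """Return one flat record per item, carrying the parent category name."""
    root = ET.fromstring(xml_text)
    rows = []
    for category in root.findall("category"):
        # Read the parent-level value once, then copy it onto every child.
        category_name = category.findtext("name", default="").strip()
        for item in category.findall("item"):
            rows.append({
                "category": category_name,
                "id": item.findtext("id", default="").strip(),
                "author": item.findtext("author", default="").strip(),
            })
    return rows


if __name__ == "__main__":
    for row in flatten(SAMPLE):
        print("category: {category}; id: {id}; author: {author}".format(**row))
```

Each flattened record could then be posted to Solr or written out as a simpler XML file for DIH to read.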

Re: [DIH] Multiple repeat XPath stmts

2009-09-13 Thread Fergus McMenemie
maths to the transformers and I think we will have a turing complete language:-) fergus. >Thanks, >Grant

Re: FileListEntityProcessor and LineEntityProcessor

2009-09-16 Thread Fergus McMenemie
fferedReader.readLine(Unknown Source) >at >org.apache.solr.handler.dataimport.LineEntityProcessor.nextRow(LineEn >tityProcessor.java:109) >... 8 more > > > >Note that my input files have 53812 lines, which is the same as the document >number that I'm choking on

Re: Extract info from parent node during data import (redirect:)

2009-09-17 Thread Fergus McMenemie
am the only one who knows >it. I would love to have more eyes on that. > >>I would like to open a JIRA for improving XPathRecordReader. >Please go ahead. You can paste the contents of this mail in the list . >There may be others with similar ideas > >Noble.

Number of terms in a SOLR field

2009-09-29 Thread Fergus McMenemie
Hi all, I am attempting to test some changes I made to my DIH based indexing process. The changes only affect the way I describe my fields in data-config.xml, there should be no changes to the way the data is indexed or stored. As a QA check I was wanting to compare the results from indexing the

Re: Number of terms in a SOLR field

2009-09-30 Thread Fergus McMenemie
>Fergus McMenemie wrote: >> Hi all, >> >> I am attempting to test some changes I made to my DIH based >> indexing process. The changes only affect the way I >> describe my fields in data-config.xml, there should be no >> changes to the way the data is in

Re: Number of terms in a SOLR field

2009-09-30 Thread Fergus McMenemie
>Fergus McMenemie wrote: >>> Fergus McMenemie wrote: >>>> Hi all, >>>> >>>> I am attempting to test some changes I made to my DIH based >>>> indexing process. The changes only affect the way I >>>> describe my fields in

Re: Query filters/analyzers

2009-10-02 Thread Fergus McMenemie
>On Thu, Oct 1, 2009 at 7:59 PM, Claudio Martella > wrote: > >> >> About the copyField issue in general: as it copies the content to the >> other field, what is the sense to define analyzers for the destination >> field? The source is already analyzed so i guess that the RESULT of the >> analysis i

Re: Error when indexing XML files

2009-10-13 Thread Fergus McMenemie
>Hi, > >I am trying to index XML files using SolrJ. The original XML file contains >nested elements. For example, the following is the snippet of the XML file. > > >  SOMETHING >  SOME_OTHER_THING >  > >I have added the elements "name" and "facility" in Schema.xml file to make >these e

Re: Using DIH's special commands....Help needed

2009-10-15 Thread Fergus McMenemie
d to delete these rows using >> DIH? In other words, where/how do I specify this? >> >> >The $deleteDocByQuery is for deleting Solr documents by a Solr query and not >DB rows. > >-- >Regards, >Shalin Shekhar Mangar.

Re: Error when indexing XML files

2009-10-15 Thread Fergus McMenemie
Hi, Please find the schema file attached. Please let me know what I am doing wrong. Regards Chaitali --- On Wed, 10/14/09, Fergus McMenemie wrote: From: Fergus McMenemie Subject: Re: Error when indexing XML files To: solr-user@lucene.apache.org Date: Wednesday, October 14, 2009, 2:25 AM

Re: Error when indexing XML files

2009-10-15 Thread Fergus McMenemie
>Hi, > >Please find the schema file attached. Please let me know what I am doing wrong. > >Regards >Chaitali > >--- On Wed, 10/14/09, Fergus McMenemie wrote: > > >From: Fergus McMenemie >Subject: Re: Error when indexing XML files >To: solr-user@lucene.

Re: Question about DIH execution order

2009-11-02 Thread Fergus McMenemie
; name="id"/> >> > >> > >> > >> > >> > >> > >> >> keep the field as follows >> > column="TmpCourseId" name="CourseId" >> template="Course:${Course.CourseId}" name="id"/> >> >> >> >> >> -- >> - >> Noble Paul | Principal Engineer| AOL | http://aol.com >>

Trying to run solr-1.3.0 under tomcat 5.5.20 on OS X 10.5.5

2008-11-05 Thread Fergus McMenemie
t(Bootstrap.java:294) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:432) So I guess the solrconfig.xml is being seen! Any help gratefully accepted!

Re: Large Data Set Suggestions

2008-11-05 Thread Fergus McMenemie
//earth-info.nga.mil/gns/html/namefiles.htm. It has 6.6M documents. It is actually a CSV file, but it is trivial to convert to XML.

Trying to run solr-1.3.0 under tomcat 5.5.20 on OS X 10.5.5 (works with 1.2.0)

2008-11-06 Thread Fergus McMenemie
to tomcat under OS X tiger do not work. The only change is the version of solr. Any ideas? At 12:25 + 6/11/08, Fergus McMenemie wrote: >Hello all, > >I downloaded everything and set it up as per the instructions, and while it >does run under jetty, I can not get it to start und

Newbe! Trying to run solr-1.3.0 under tomcat. Please help

2008-11-14 Thread Fergus McMenemie
(leopard and tiger) all fail as follows. I also tried cutting and pasting the instructions from:- http://wiki.apache.org/solr/SolrTomcat Here is what I see on the browser. When I try to access http://localhost:8080/solr At 14:26 + 14/11/08, Fergus McMenemie wrote: >HTTP Sta

Re: Newbe! Trying to run solr-1.3.0 under tomcat. Solved!

2008-11-15 Thread Fergus McMenemie
"true" > >> >> >> >> And Solr started up just fine and its admin, etc. worked as expected. >> >> Oh, and on Mac OS X (of course!), version 10.5.5. >> >> Erik >> >> On Nov 14, 2008, at 12:17 PM, Fergus McMenemie wrote:

Upgrade from 1.2 to 1.3 gives 3x slowdown

2008-11-19 Thread Fergus McMenemie

Re: Upgrade from 1.2 to 1.3 gives 3x slowdown

2008-11-20 Thread Fergus McMenemie
pecting a >> speed up. I saw the bit about increasing ramBufferSizeMB and set >> it to 64MB; it had no effect.

Re: [VOTE] Community Logo Preferences

2008-11-24 Thread Fergus McMenemie
https://issues.apache.org/jira/secure/attachment/12394263/apache_solr_a_blue.jpg

Re: Upgrade from 1.2 to 1.3 gives 3x slowdown + script!

2008-11-26 Thread Fergus McMenemie
ina/localhost/solr cp apache-solr-nightly/example/webapps/solr.war /usr/local/tomcat/webapps rm solr # rm the symbolic link ln -s solrnightly solr rm -r solr/data /usr/local/tomcat/bin/startup.sh sleep 10 # give solr time to launch and setup echo "Starting indexing at " `date` " wit

Re: Upgrade from 1.2 to 1.3 gives 3x slowdown + script!

2008-12-01 Thread Fergus McMenemie
>Can you produce an MD5 hash of the WAR file or something, such that I >can know I have the exact bits. Better yet, perhaps you can put those >files up somewhere where they can be downloaded. > >Thanks, >Grant > >On Nov 26, 2008, at 10:54 AM, Fergus McMenemie wrote: > >

Re: Upgrade from 1.2 to 1.3 gives 3x slowdown + script!

2008-12-11 Thread Fergus McMenemie
down. > >-Yonik > >On Wed, Nov 26, 2008 at 10:54 AM, Fergus McMenemie <[EMAIL PROTECTED]> wrote: >> Hello Grant, >> >> Not much good with Java profilers (yet!) so I thought I >> would send a script! >> >> Details... details! Having decided to produce

correct use of copyFields in schema.xml

2008-12-17 Thread Fergus McMenemie
. IMHO, being able to nest copyFields inside fields makes for more self-documenting code! Regards Fergus

getting DIH to read my XML files

2009-01-13 Thread Fergus McMenemie
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:378) Anybody able to point out what I have done wrong? Regards Fergus.

Re: getting DIH to read my XML files

2009-01-13 Thread Fergus McMenemie
... 10 more > >On Tue, Jan 13, 2009 at 9:28 PM, Fergus McMenemie wrote: > >> Hello, >> >> I am trying to use DIH with FileListEntityProcessor to walk the >> disk and read XML documents. I have a dataConfig.xml as follows:- >> >> >

DIH XPathEntityProcessor fails with docs containing

2009-01-16 Thread Fergus McMenemie
emoving the DOCTYPE directive fixes everything. I know that use of DOCTYPE is out of fashion, and it does not exist in our newer documents, however there are lots of older XML docs about! Regards Fergus.

Re: Is it just me or multicore default is broken? Can't ping

2009-01-16 Thread Fergus McMenemie
>>> All looks smooth without errors on startup. >>>> Also can can open admin at >>>> >>>> http://localhost:8983/solr/core1/admin/ >>>> >>>> >>>> But then trying to ping >>>> http://localhost:8983/solr/core1/

Re: getting DIH to read my XML files: solved

2009-01-19 Thread Fergus McMenemie
Regards Fergus.

Cant get HTMLStripTransformer's stripHTML to work in DIH.

2009-01-19 Thread Fergus McMenemie
stripHTML correct?

Re: Cant get HTMLStripTransformer's stripHTML to work in DIH.

2009-01-19 Thread Fergus McMenemie
start rollback Jan 19, 2009 11:14:06 AM org.apache.solr.update.DirectUpdateHandler2 rollback INFO: end_rollback >On Mon, Jan 19, 2009 at 4:14 PM, Fergus McMenemie wrote: > >> Hello all, >> >> I have the following DIH data-config.xml file. Adding >> HTMLStripTransformer

Re: Cant get HTMLStripTransformer's stripHTML to work in DIH.

2009-01-19 Thread Fergus McMenemie
71) > at > org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54) > at > org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187) > ... 9 more >Jan 19, 2009 11:14:06 AM org.

Re: Cant get HTMLStripTransformer's stripHTML to work in DIH.

2009-01-21 Thread Fergus McMenemie
an checkout and build >from the trunk if need this immediately. > >On Mon, Jan 19, 2009 at 7:02 PM, Fergus McMenemie wrote: > >> Hmmm, >> >> Just to clarify I retested the thing using the nightly as of today >> 18-jan-2009. The problem is still there and this

Re: DIH XPathEntityProcessor fails with docs containing

2009-01-21 Thread Fergus McMenemie
t; http://www.w3.org/1999/xlink"; xlink:href="" > urname="metadata" xlink:type="simple"> > http://purl.org/dc/elements/1.1/"; > qualifier="pdate">20080131 > >The DTD does exist at the specified location. Removing the DOCTYPE directive >fixes everything. I know that use of DOCTYPE is out of fashion, and it does >not exist in our newer documents, however there are lots of older XML docs >about!

Re: Cant get HTMLStripTransformer's stripHTML to work in DIH.

2009-01-21 Thread Fergus McMenemie
he cause? Well spotted. I had made a mess of sanitizing the config file I sent to you. I will in future make sure the stuff I am messing with matches what I send to the list. However there is no typo in the underlying file; at least not on that line:-) > > >On Wed, Jan 21,

Re: DIH XPathEntityProcessor fails with docs containing

2009-01-23 Thread Fergus McMenemie
Seems to work fine on this morning's 23-jan-2009 nightly. Thanks very much. >On Wed, Jan 21, 2009 at 6:05 PM, Fergus McMenemie wrote: > >> >> After looking at http://issues.apache.org/jira/browse/SOLR-964, >> where >> it seems this issue has been addressed

Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-24 Thread Fergus McMenemie
> >I could work around the problem by creating SOLR fields like >"home_address_street" and "office_address_street" and do some xpath >mapping. However I don't want to do it as we can have multiple >'other' addresses. Also I have other fields whose type is not easily >distinguished like address. > >As I mentioned being new to SOLR I might have completely goofed on a >way to set it up - much appreciate any direction on it. I am using >SOLR 1.3 > >Regards, >Guna

DIH FileListEntityProcessor recursion and fileName clash

2009-02-01 Thread Fergus McMenemie
iles(aFile, files); + return; +} long sz = aFile.length(); Date lastModified = new Date(aFile.lastModified()); if (biggerThan != -1 && sz <= biggerThan)

DIH using values from solrconfig.xml inside data-config.xml

2009-02-02 Thread Fergus McMenemie
rds Fergus.

Re: DIH FileListEntityProcessor recursion and fileName clash

2009-02-02 Thread Fergus McMenemie
d -- " + fList); Assert.assertEquals(3, fList.size()); } Regards Fergus. >On Mon, Feb 2, 2009 at 2:36 AM, Fergus McMenemie wrote: > >> Hello >> >> I have been trying to find out why DIH in FileListEntityProcessor >> mode did not appear to be recursing into

Re: DIH using values from solrconfig.xml inside data-config.xml

2009-02-02 Thread Fergus McMenemie
upported. > > dateTimeFormat="MMdd" /> > > >On Mon, Feb 2, 2009 at 9:24 AM, Noble Paul < >noble.p...@gmail.com> wrote: > >> this patch must help >> >> On Mon, Feb 2, 2009 at 10:49 PM, Shalin Shekhar Mangar >> wrote: >>

Re: DIH using values from solrconfig.xml inside data-config.xml

2009-02-04 Thread Fergus McMenemie
up. I suspect that the addition of //para would cover many of the use cases, and what was left could be covered by a preceding XSLT transform.

Re: DIH, assigning multiple xpaths to the same solr field: solved

2009-02-04 Thread Fergus McMenemie
Thanks Shalin, Using the following appears to work properly! Regards Fergus >On Wed, Feb 4, 2009 at 1:35 AM, Fergus McMenemie wrote: > >> > dataSource="myfilereader" >> processor="XPathEntityProcessor" >> url

DIH fails to import after svn update

2009-02-11 Thread Fergus McMenemie
olrIndexSearcher Regards to all.

"ant dist" of a nightly download fails

2009-02-11 Thread Fergus McMenemie
st" is fine. Removing the javascript contrib directory allows the "ant dist" to complete and I have a usable war file. However, I suspect this may not represent best practice, although "ant test" is still fine. What does removing this contrib function lose me? I was wondering if it went with the DIH ScriptTransformer? Regards Fergus.

Re: DIH fails to import after svn update

2009-02-11 Thread Fergus McMenemie
Thanks, That fixed it. >On Wed, Feb 11, 2009 at 4:19 PM, Fergus McMenemie wrote: > > >> java.lang.NoSuchFieldError: docCount >>at >> org.apache.solr.handler.dataimport.SolrWriter.getDocCount(SolrWr

Is this DIH entity forEach expression OK?

2009-02-12 Thread Fergus McMenemie
. Is it OK to have an xpath expression within forEach which is a child of another of the forEach xpath expressions? Or... is there a better way of doing this? Regards

Re: Is this DIH entity forEach expression OK? ... yes

2009-02-13 Thread Fergus McMenemie
mediaBlock/mediaObject/@vurl" /> > xpath="/record/mediaBlock/caption" /> > >Is it OK to have an xpath expression within forEach which is a child >of another of the forEach xpath expressions? > Yes. It works fine, duplicate "uniqueKey"s were making it a

Problem using DIH templatetransformer to create uniqueKey

2009-02-13 Thread Fergus McMenemie
sort this but was wondering if there was a good reason for this behavior. Regards.

Re: Problem using DIH templatetransformer to create uniqueKey

2009-02-13 Thread Fergus McMenemie
/mediaBlock/mediaObject/@vurl" /> > >The trouble is that vurl is only defined as a child of "/record/mediaBlock" >so my attempt to create id, the uniqueKey fails for the parent document >"/record" > >I am hacking around with "TemplateTransformer.j

Re: Problem using DIH templatetransformer to create uniqueKey

2009-02-13 Thread Fergus McMenemie
/> >> > xpath="/record/mediaBlock/mediaObject/@vurl" /> >> >>The trouble is that vurl is only defined as a child of "/record/mediaBlock" >>so my attempt to create id, the uniqueKey fails for the parent document >>"/record" >> >>I am hacking around with "TemplateTransformer.java" to sort this but was >>wondering if there was a good reason for this behavior. >> -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===

Re: Problem using DIH templatetransformer to create uniqueKey

2009-02-13 Thread Fergus McMenemie
g files does though. > > Erik > > >On Feb 13, 2009, at 8:17 AM, Fergus McMenemie wrote: > >> Paul, >> >> Following up your usenet suggestion: >> >> > ignoreMissingVariables="true"/> >> >> and to add more to what I was thinking... >

DIH transformers

2009-02-16 Thread Fergus McMenemie
Regards Fergus.

Re: DIH transformers - sect 2

2009-02-17 Thread Fergus McMenemie
>On Mon, Feb 16, 2009 at 3:22 PM, Fergus McMenemie wrote: >> >> 2) Having used TemplateTransformer to assign a value to an >> entity column that column cannot be used in other >> TemplateTransformer operations. In my project I am >> attempting to r

Re: DIH transformers - sect 2 - SOLR-1033

2009-02-21 Thread Fergus McMenemie
I have created SOLR-1033 in JIRA to address this issue. At 13:32 + 21/2/09, Fergus McMenemie wrote: >>On Mon, Feb 16, 2009 at 3:22 PM, Fergus McMenemie wrote: >>> >>> 2) Having used TemplateTransformer to assign a value to an >>> entity column th

passing parameters into the XSLTResponseWriter: particularly hostname

2009-02-27 Thread Fergus McMenemie

Re: passing parameters into the XSLTResponseWriter: particularly hostname

2009-03-09 Thread Fergus McMenemie
want into the XML doc itself where your stylesheet >has access to it. > > >-Hoss Doh! of course. Thanks.

a new DIH manifestEnityProcessor

2009-03-09 Thread Fergus McMenemie
? Suggestions for a different name? Suggestions on how to do the delete bitty from within an entity? Regards Fergus.

Re: a new DIH manifestEnityProcessor

2009-03-09 Thread Fergus McMenemie
or crawlers where we had to. Fergus > >--Noble > >On Mon, Mar 9, 2009 at 8:30 PM, Fergus McMenemie wrote: >> Hello, >> >> I have almost finished a new DIH EntityProcessor which >> I am calling the manifestEnityProcessor. It is designed >> around the idea tha

Re: a new DIH manifestEnityProcessor

2009-03-09 Thread Fergus McMenemie
> >On Mon, Mar 9, 2009 at 10:44 PM, Fergus McMenemie wrote: >>>manifest processing has a very limited usecase. Why can't it be >>>processed using a PlainTextEntityProcessor and write a Tranformer to >>>read lines using regex? >>> >> Ehmmm Ok. T

Re: DIH with a list of changed documents?

2009-03-09 Thread Fergus McMenemie
manifestEnityProcessor; see if you can find the thread titled "a new DIH manifestEnityProcessor". Is your list of changed documents a list of additions and updates only, or does it contain deletes as well? Fergus.

Re: DIH with a list of changed documents?

2009-03-09 Thread Fergus McMenemie
>Le 09-mars-09 à 22:29, Fergus McMenemie a écrit : >>> how would I implement entity-processor if I were able to get the list >>> of recently changed documents of our sites? >> >> H, this sounds like a job for my manifestEnityProcessor >> see if you can

Re: a new DIH manifestEnityProcessor SOLR-1060 on jira

2009-03-10 Thread Fergus McMenemie
ements are, >> 1) read a file line by line >> 2) filter out lines (include or exclude) based on a regex >> 3) extract parts (named parts) from the line using another regex >> >> Noble >> >> >> On Tue, Mar 10, 2009 at 1:50 AM, Fergus McMenemie >>
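The three requirements Noble lists (read line by line, include or exclude lines by regex, pull named parts out with a second regex) can be sketched generically. The following Python is only an illustration of that pipeline, not the DIH implementation; the function name, the parameter names and the sample manifest format are all hypothetical.

```python
import re


def parse_manifest(lines, include=None, exclude=None, extract=None):
    """Filter lines by regex, then extract named groups from the survivors."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    extract_re = re.compile(extract) if extract else None

    rows = []
    for raw in lines:                      # 1) read line by line
        line = raw.rstrip("\n")
        # 2) keep only lines that pass the include/exclude filters
        if include_re and not include_re.search(line):
            continue
        if exclude_re and exclude_re.search(line):
            continue
        # 3) extract the named parts, or keep the whole line if no pattern given
        if extract_re:
            match = extract_re.search(line)
            if match is None:
                continue
            rows.append(match.groupdict())
        else:
            rows.append({"line": line})
    return rows


if __name__ == "__main__":
    # Hypothetical manifest: an action word followed by a path.
    manifest = ["# comment", "add /docs/a.xml", "delete /docs/b.xml", ""]
    for row in parse_manifest(
        manifest,
        exclude=r"^\s*(#|$)",
        extract=r"^(?P<action>\w+)\s+(?P<path>\S+)$",
    ):
        print(row)
```

Using named groups keeps the mapping from regex to output field explicit, which is roughly what "named parts" suggests in the message.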

Re: Problem using DIH templatetransformer to create uniqueKey: solved

2009-03-12 Thread Fergus McMenemie
> >> Erik >> >> >>On Feb 13, 2009, at 8:17 AM, Fergus McMenemie wrote: >> >>> Paul, >>> >>> Following up your usenet suggestion: >>> >>> >> ignoreMissingVariables="true"/> >>> >

DIH use of the ?command=full-import entity= command option

2009-03-12 Thread Fergus McMenemie

Re: DIH use of the ?command=full-import entity= command option

2009-03-12 Thread Fergus McMenemie
:17 AM, Fergus McMenemie wrote: > >> Hello, >> >> Can anybody describe the intended purpose, or provide a >> few examples, of how the DIH entity= command option works. >> >> Am I supposed to build a data-conf.xml file which contains >> many different al

Problem encoding ':' char in a solr query

2009-03-18 Thread Fergus McMenemie
t; ": "" at line 1, column 21. Was expecting one of: ... ... ... "+" ... "-" ... "(" ... "*" ... "^" ... ... ... ... ... ... "[" ... "{" ... ... My encoding did not work! Help! -- =
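The resolution of this thread is not visible in the snippet. A common approach, sketched here as an assumption, is to backslash-escape characters that the Lucene query parser treats as syntax inside the field value, and only then URL-encode the whole query for the HTTP request. The helper name and the sample field and value are hypothetical.

```python
import re
from urllib.parse import quote

# Single characters given special meaning by the classic Lucene query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\&|])')


def escape_term(value):
    """Backslash-escape query-syntax characters inside a field value."""
    return _SPECIAL.sub(r"\\\1", value)


if __name__ == "__main__":
    # Hypothetical field name and value containing ':' characters.
    q = "id:" + escape_term("urn:isbn:123")
    print(q)                  # the query string handed to the parser
    print(quote(q, safe=""))  # the same string, URL-encoded for the q= parameter
```

The two steps are separate: URL encoding gets the characters through HTTP intact, but the query parser still sees an unescaped `:` unless it was backslash-escaped first.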

Re: DIH - read datasource param values from property file or configure JNDI datasource

2009-03-19 Thread Fergus McMenemie
>I am looking for a implementation of DIH feature: It also takes in a >properties file for the data source configuration >(http://issues.apache.org/jira/browse/SOLR-469) > >I want to externalize the data source parameters like driver, url, user and >password to property file outside the solr. My

Re: Scheduling DIH

2009-03-26 Thread fergus mcmenemie
H, my tuppence worth! IMHO I do not think this should be built into solr. Doing it properly leads to all kinds of nasty platform dependent issues... will we then want to add notification features on success/failure? via email? Ideally, all the scheduled activities on a system should be ce

Clarifying use of

2009-03-27 Thread fergus mcmenemie
Hello, Due to limitations with the way my content is organised and with DIH, I have to add "-imgCaption:[* TO *]" to some of my queries. I discovered the name="appends" functionality tucked away inside solrconfig.xml. This looks a very useful feature, and I created a new requestHandler to deal with my pr

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-30 Thread Fergus McMenemie
(ThreadPoolExecutor.java:907) >>> at java.lang.Thread.run(Thread.java:637) >>> >>> I see two things in CHANGES.txt that might apply, but I'm not sure: >>> 1. I think commons-csv was upgraded >>> 2. The CSV loader stuff was refactored to s

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-30 Thread Fergus McMenemie
. I think commons-csv was upgraded >>>> 2. The CSV loader stuff was refactored to share common code >>>> >>>> I'm still investigating. >>>> >>>> -Grant >>> >>> -- >>> Grant Ingersoll >&

Re: DIH; Hardcode field value/replacement based on source column

2009-03-31 Thread Fergus McMenemie
ents. >> >> Any idea why this DIH instruction would see constant value appear twice?? >>

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-31 Thread Fergus McMenemie
m -rf solr/data" before tomcat is launched. So I do not understand how the above helps. UNLESS there are duplicate gaz entries. >In the meantime, I'm trying to see if I can pinpoint down a specific >change and see if there is anything that might help it perform better. >

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-01 Thread Fergus McMenemie
2m49.997s

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Fergus McMenemie
>On Apr 1, 2009, at 9:39 AM, Fergus McMenemie wrote: > >> Grant, >> >> Redoing the work with your patch applied does not seem to > >> >> make a difference! Is this the expected result? > >No, I didn't expect Solr 1095 to fix the problem. Overwrit

Problem using ExtractingRequestHandler with tomcat

2009-04-02 Thread Fergus McMenemie
.create(RequestHandlers.java:154) > at > org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:163) Any ideas?

Re: Problem using ExtractingRequestHandler with tomcat

2009-04-02 Thread Fergus McMenemie
>On Apr 2, 2009, at 4:26 AM, Fergus McMenemie wrote: >> I cant get ExtractingRequestHandler to work with tomcat. Using the >> latest version from svn and then a "make clean dist" and copying the >> war file to a clean tomcat does not work. > >make?! :) Oops!

Using ExtractingRequestHandler to index a large PDF

2009-04-02 Thread Fergus McMenemie
th it. Fergus...

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Fergus McMenemie
s for all the help. Fergus.

Re: Additive filter queries

2009-04-03 Thread Fergus McMenemie
Engineer, Zappos.com >jnewb...@zappos.com - 702-943-7562 Ditto! As best I understand, you somehow need to arrange for each different combination of colour, size and width to be indexed as a separate Solr document.
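To make the "one document per combination" suggestion concrete, here is a minimal sketch of how such documents might be generated before posting them to Solr. The field names (`product_id`, `colour`, `size`, `width`), the id scheme and the sample values are hypothetical and not taken from the thread.

```python
def variant_docs(product_id, variants):
    """Build one flat document per available (colour, size, width) combination."""
    docs = []
    for colour, size, width in variants:
        docs.append({
            # A composite key keeps every variant unique in the index.
            "id": "{}-{}-{}-{}".format(product_id, colour, size, width),
            "product_id": product_id,
            "colour": colour,
            "size": size,
            "width": width,
        })
    return docs


if __name__ == "__main__":
    in_stock = [("red", "9", "M"), ("blue", "10", "W")]
    for doc in variant_docs("shoe42", in_stock):
        print(doc)
```

With this layout, filter queries on colour, size and width all apply to the same document, so a match means that exact combination exists; results can then be grouped back to the product by `product_id` on the client side.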

Re: DIH API for specifying a either specific or all configurations imported

2009-04-06 Thread Fergus McMenemie
t&entity=jc See the docs at:- http://wiki.apache.org/solr/DataImportHandler#head-1582242c1bfc1f3e89f4025bf2055791848acefb Fergus.
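As a small illustration of the request shape discussed here, the sketch below assembles a DIH command URL with an optional `entity` parameter. The `command` and `entity` parameter names come from the thread subjects; the base URL and handler path are assumptions and will differ per installation.

```python
from urllib.parse import urlencode


def dih_url(base, command="full-import", entity=None):
    """Build a DataImportHandler request URL, optionally limited to one entity."""
    params = [("command", command)]
    if entity:
        params.append(("entity", entity))
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed handler location; adjust host, port and path to the local setup.
    print(dih_url("http://localhost:8983/solr/dataimport", entity="jc"))
```

Omitting `entity` yields a plain `command=full-import` request.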

Re: Using ExtractingRequestHandler to index a large PDF ~solved

2009-04-06 Thread Fergus McMenemie
atalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105) > >Although the PDF is big, it contains very little text; it is a map. > > "java -jar solr/lib/tika-0.3.jar -g" appears to have no bother with it. > >Fergus...

Re: Searching on mulit-core Solr

2009-04-06 Thread Fergus McMenemie
>> they need to have the same solr.xml (for multicore etc). We don't want >> to replicate the indexes also (we got very light search traffic, but >> very high indexing traffic) so they need to use the same index. >> >> >> Thanks, >> -vivek >>

Re: How could I avoid reindexing same files?

2009-04-07 Thread Fergus McMenemie
> >> > In case all my files are in one folder which is scanned frequently, is >> > there a Solr feature of checking and skipping a file if it has already >> > been indexed >> > and not changed since? >> > >> > >> > Th

Re: How could I avoid reindexing same files?

2009-04-07 Thread Fergus McMenemie
also dump all checksums and pathnames from Solr if/when you wanted to validate your folder structure and/or indexes. >Regards, >Veselin K > >On Tue, Apr 07, 2009 at 09:01:31AM +0100, Fergus McMenemie wrote: >> Veselin, >> >> Well, as far as solr is concerned, there is tw

Re: DIH; Hardcode field value/replacement based on source column

2009-04-08 Thread Fergus McMenemie
ves the same. Although attempting to use /*/ fails. Another lesson learnt! #! /usr/local/bin/perl use strict; my($s)="cat mat rat hat"; my($c)=0; print " a-match", ++$c, "='$1'\n" while( $s =~ m/(at)/g ); $c=0; print " b-match", ++$c, "=

Re: How could I avoid reindexing same files?

2009-04-08 Thread Fergus McMenemie
>Hi Fergus, > >On Tue, Apr 07, 2009 at 05:06:23PM +0100, Fergus McMenemie wrote: >> >Thank you much Fergus, >> > >> >I was considering implementing a database which would hold a path name >> >and an MD5 sum of each file. >> Snap. That is close
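The path-name-plus-MD5 bookkeeping discussed in this thread could look roughly like the sketch below. It is an illustration only: the `seen` dictionary stands in for whatever store is actually used (a database, or checksums held in the index), and the function names are hypothetical.

```python
import hashlib
import os


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_index(folder, seen):
    """Return paths under `folder` whose checksum is new or has changed.

    `seen` maps path -> last known checksum and is updated in place.
    """
    changed = []
    for root, _dirs, names in os.walk(folder):
        for name in sorted(names):
            path = os.path.join(root, name)
            checksum = md5_of(path)
            if seen.get(path) != checksum:
                seen[path] = checksum
                changed.append(path)
    return changed
```

Only the paths returned by `files_to_index` would then be sent for indexing; a second scan of an unchanged folder returns an empty list.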

Re: Searching on multi-core Solr

2009-04-09 Thread Fergus McMenemie
andardHostValve.invoke(StandardHostValve.java:128) >>        at >> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) >>        at >> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) >>        at >> org.apache.cat

Re: Using ExtractingRequestHandler to index a large PDF ~solved

2009-04-14 Thread Fergus McMenemie
>On Apr 6, 2009, at 10:16 AM, Fergus McMenemie wrote: > >> Hmmm, >> >> Not sure how this all hangs together. But editing my solrconfig.xml >> as follows >> sorted the problem:- >> >>> multipartUploadLimitInKB="2048" /> >

Re: indexing txt file

2009-04-15 Thread Fergus McMenemie
What about my xml file, and >txt file? > >Thank you, >Alex > > >On Tue, Apr 14, 2009 at 12:37 AM, Alejandro Gonzalez < >alejandrogonzalezd...@gmail.com> wrote: > >> You should construct the XML containing the fields defined in your >> schema.xml and give them the values from the text files. For example, if you >> have a schema defining two fields "title" and "text" you should construct >> an XML with a field "title" and its value and another called "text" >> containing the body of your doc. Then you can post it to the Solr you have >> deployed, make a commit, and it's done. It's possible to construct an XML >> defining more than just one doc >> >> >> >> >> "doc1 title" >> "doc1 text" >> >> . >> . >> . >> >> "docn title" >> "docn text" >> >> >> >> >> >> 2009/4/14 Noble Paul >> >> > what is the content of your text file? >> > Solr does not directly index files >> > --Noble >> > >> > On Tue, Apr 14, 2009 at 3:54 AM, Alex Vu wrote: >> > > Hi all, >> > > >> > > Currently I wrote an xml file and schema.xml file. What is the next >> step >> > to >> > > index a txt file? Where should I put my txt file I want to index? >> > > >> > > thank you, >> > > Alex V. >> > > >> > >> > >> > >> > -- >> > --Noble Paul >> > >> -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===
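The XML described in the quoted reply lost its markup in the archive. A hedged sketch of how such a multi-document update message might be built is shown below, using Solr's XML update format (`<add>`, `<doc>`, `<field name="...">`). The field names "title" and "text" follow the example in the reply; the helper name `build_add_xml` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Build a Solr XML <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # ElementTree escapes special characters in the text for us.
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


docs = [
    {"title": "doc1 title", "text": "doc1 text"},
    {"title": "docn title", "text": "docn text"},
]
payload = build_add_xml(docs)
print(payload)
```

The resulting string would be posted to the update handler of the deployed Solr instance, followed by a commit; the exact URL depends on the installation and is not given in the thread.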

Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-15 Thread Fergus McMenemie
>On Apr 2, 2009, at 9:23 AM, Fergus McMenemie wrote: > >> Grant, >> >> >> >>> I should note, however, that the speed difference you are seeing may >>> not be as pronounced as it appears. If I recall during ApacheCon, I >>> commented
