Re: DIH Http input bug - problem with two-level RSS walker

2008-11-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
an off-by-one bug of some sort in the DIH code. > > Thanks, > > Lance > > -Original Message- > From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] > Sent: Saturday, November 01, 2008 7:44 PM > To: solr-user@lucene.apache.org > Subject: Re: DIH Http input

RE: DIH Http input bug - problem with two-level RSS walker

2008-11-03 Thread Lance Norskog
solr-user@lucene.apache.org Subject: Re: DIH Http input bug - problem with two-level RSS walker If you wish to create 1 doc per inner entity, then set rootEntity="false" for the entity outer. The exception is because the url is wrong. On Sat, Nov 1, 2008 at 10:30 AM, Lance Norskog <[EMAIL

Re: DIH Http input bug - problem with two-level RSS walker

2008-11-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
It may be fine to provide that, but what other benefit can you get which you can't get from writing a simple DataSource in Java? Script is just a convenience, right? --Noble On Mon, Nov 3, 2008 at 11:41 AM, Jon Baer <[EMAIL PROTECTED]> wrote: > On a side note ... it would be nice if your data sou

Re: DIH Http input bug - problem with two-level RSS walker

2008-11-02 Thread Jon Baer
On a side note ... it would be nice if your data source could also be the result of a script (instead of trying to hack around it w/ JdbcDataSource) ... Something similar to what ScriptTransformer does ... (http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bd

Re: DIH Http input bug - problem with two-level RSS walker

2008-11-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Lance, I guess I got your problem. So you wish to create docs for both entities (as suggested by Jon Baer). So the best solution would be to create two root entities. The first one should be the outer, and write a transformer to store all the urls into the db. The JdbcDataSource can do inserts/up

Re: DIH Http input bug - problem with two-level RSS walker

2008-11-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Lance, Do a full import w/o debug and let us know if my suggestion worked (rootEntity="false"). If it didn't, I can suggest something else (writing a Transformer). On Sun, Nov 2, 2008 at 8:13 AM, Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]> wrote: > If you wish to create 1 doc per inner

Re: DIH Http input bug - problem with two-level RSS walker

2008-11-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
> > > > > - Jon > > On Nov 1, 2008, at 3:26 PM, Norskog, Lance wrote: > >> The inner entity drills down and gets more detail about each item in the >> outer loop. It creates one document. >> >> -Original Message- >

Re: DIH Http input bug - problem with two-level RSS walker

2008-11-01 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you wish to create 1 doc per inner entity, then set rootEntity="false" for the entity outer. The exception is because the url is wrong. On Sat, Nov 1, 2008 at 10:30 AM, Lance Norskog <[EMAIL PROTECTED]> wrote: > I wrote a nested HttpDataSource RSS poller. The outer loop reads an rss feed > which c
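
For readers following the thread, here is a rough sketch of what a two-level DIH config with rootEntity="false" on the outer entity might look like. It is not Lance's actual config, which is not shown in this listing. The feed URL, the column names (feedUrl, title, link), and the XPath expressions are assumptions made up for illustration, and the field names would have to exist in your schema.

```xml
<dataConfig>
  <!-- HttpDataSource is the data source type named in the thread -->
  <dataSource type="HttpDataSource" />
  <document>
    <!-- Outer feed (hypothetical URL): each item links to another feed.
         rootEntity="false" means rows of this entity do not become
         documents themselves; documents come from the entity below it. -->
    <entity name="outer"
            processor="XPathEntityProcessor"
            url="http://example.com/feeds.rss"
            forEach="/rss/channel/item"
            rootEntity="false">
      <!-- feedUrl is an assumed column name holding the inner feed link -->
      <field column="feedUrl" xpath="/rss/channel/item/link" />

      <!-- Inner feed: fetched once per outer row via ${outer.feedUrl};
           each item here is intended to become one Solr document -->
      <entity name="inner"
              processor="XPathEntityProcessor"
              url="${outer.feedUrl}"
              forEach="/rss/channel/item">
        <field column="title" xpath="/rss/channel/item/title" />
        <field column="link" xpath="/rss/channel/item/link" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

If the inner url does not resolve to a valid address for some outer row, the fetch for that row fails, which matches the remark above that the exception comes from a wrong url.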

Re: DIH Http input bug - problem with two-level RSS walker

2008-11-01 Thread Jon Baer
he outer loop. It creates one document. -Original Message- From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED] Sent: Friday, October 31, 2008 10:24 PM To: solr-user@lucene.apache.org Subject: Re: DIH Http input bug - problem with two-level RSS walker On Sat, Nov 1, 2008 at 10:30 AM, Lance

RE: DIH Http input bug - problem with two-level RSS walker

2008-11-01 Thread Norskog, Lance
- problem with two-level RSS walker On Sat, Nov 1, 2008 at 10:30 AM, Lance Norskog <[EMAIL PROTECTED]> wrote: > I wrote a nested HttpDataSource RSS poller. The outer loop reads an > rss feed which contains N links to other rss feeds. The nested loop > then reads each one of

Re: DIH Http input bug - problem with two-level RSS walker

2008-10-31 Thread Shalin Shekhar Mangar
On Sat, Nov 1, 2008 at 10:30 AM, Lance Norskog <[EMAIL PROTECTED]> wrote: > I wrote a nested HttpDataSource RSS poller. The outer loop reads an rss > feed > which contains N links to other rss feeds. The nested loop then reads each > one of those to create documents. (Yes, this is an obnoxious thi