DIH caching URLDataSource/XPath entity (not root)

2016-12-19 Thread Chantal Ackermann
XML files with mappings loaded via different DIH entities? I’d appreciate any samples or hints. Or maybe someone is able to spot the error in the following configuration? (The custom DataSource is a subclass of URLDataSource and handles Basic Auth as well as decompression.)

Re: Error when using URLDataSource to index RSS items

2014-06-07 Thread Alexandre Rafalovitch
"/rss/channel/item"> > >xpath="/rss/channel/item/title" /> > > > > > > But I am facing the following error > > Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag > ; expected . > > Can anybody help? > > >

Error when using URLDataSource to index RSS items

2014-06-06 Thread ienjreny
e following error Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag ; expected . Can anybody help? -- View this message in context: http://lucene.472066.n3.nabble.com/Error-when-using-URLDataSource-to-index-RSS-items-tp4140548.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: URLDataSource : indexing from other Solr servers

2014-05-16 Thread helder.sepulveda
I will try the SolrEntityProcessor, but I'm still interested to know why it will not work with the XPathEntityProcessor -- View this message in context: http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135730.html Sent from the Solr -
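For readers comparing the two processors, here is a minimal sketch of what the SolrEntityProcessor variant of this import might look like. It assumes the remote server is reachable at the base URL quoted elsewhere in this thread; the entity name and the rows value are illustrative assumptions and do not come from the original posts.

```xml
<dataConfig>
  <document>
    <!-- SolrEntityProcessor queries another Solr instance directly,
         so no forEach/xpath mapping of the response XML is needed. -->
    <!-- name="remote" and rows="100" are hypothetical values.
         url is the Solr base URL, not the full /select URL. -->
    <entity name="remote"
            processor="SolrEntityProcessor"
            url="http://slszip11.as.homes.com/solr"
            query="*:*"
            rows="100"/>
  </document>
</dataConfig>
```

Note that this processor can only copy fields that are stored on the source server.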

Re: URLDataSource : indexing from other Solr servers

2014-05-14 Thread helder.sepulveda
.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135567.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: URLDataSource : indexing from other Solr servers

2014-05-14 Thread Gora Mohanty
On 12 May 2014 21:41, helder.sepulveda wrote: > > I've been trying to index data from other solr servers but the import always > shows: > Indexing completed. Added/Updated: 0 documents. Deleted 0 documents. > Requests: 1, Fetched: 0, Skipped: 0, Processed > > My data config looks like this: Nothing

Re: URLDataSource : indexing from other Solr servers

2014-05-13 Thread Shawn Heisey
On 5/12/2014 10:11 AM, helder.sepulveda wrote: > I've been trying to index data from other solr servers but the import always > shows: > Indexing completed. Added/Updated: 0 documents. Deleted 0 documents. > Requests: 1, Fetched: 0, Skipped: 0, Processed I'm wondering why you're using the XPathEntity

Re: URLDataSource : indexing from other Solr servers

2014-05-13 Thread Gora Mohanty
On 12 May 2014 22:52, helder.sepulveda wrote: > Here is the data config: > > > > > > url="http://slszip11.as.homes.com/solr/select?q=*:*"; > processor="XPathEntityProcessor" > forEach="/response/result/doc" > trans

Re: URLDataSource : indexing from other Solr servers

2014-05-12 Thread helder.sepulveda
MESQUITE 017304 0 1996 75181 2014-04-20T16:28:52.467Z -- View this message in context: http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135332.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: URLDataSource : indexing from other Solr servers

2014-05-12 Thread helder.sepulveda
I tested calling the URL using curl right on the server, and I get a valid response and the correct content -- View this message in context: http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135333.html Sent from the Solr - User mailing list archive

Re: URLDataSource : indexing from other Solr servers

2014-05-12 Thread helder.sepulveda
er="DateFormatTransformer"> -- View this message in context: http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135331.html Sent from the Solr - User mailing list archive at Nabble.com.

URLDataSource : indexing from other Solr servers

2014-05-12 Thread helder.sepulveda
context: http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: URLDataSource : Issue assigning single xpath field name to two solr fields

2014-02-24 Thread Shalin Shekhar Mangar
The XPathEntityProcessor supports only one field mapping per xpath so using copyField is the only way. On Mon, Feb 24, 2014 at 2:45 PM, manju16832003 wrote: > I'm not sure if I would be missing any configuration params here, however > when I tried to assign an xpath field from URLData
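The copyField approach mentioned above could be sketched in schema.xml roughly as follows. The two field names are the ones from the original question in this thread; the field types and the indexed/stored flags are assumptions chosen only for illustration.

```xml
<!-- schema.xml: DIH maps the xpath to profile_display only -->
<!-- type and indexed/stored values below are assumptions -->
<field name="profile_display" type="string" indexed="false" stored="true"/>
<field name="profile_indexed" type="text_general" indexed="true" stored="false"/>

<!-- Solr copies the incoming value into the second field at index time -->
<copyField source="profile_display" dest="profile_indexed"/>
```

With this in place, data-config.xml needs a single field mapping for the xpath, pointing at profile_display.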

Re: URLDataSource : Issue assigning single xpath field name to two solr fields

2014-02-24 Thread Gora Mohanty
On 24 February 2014 14:45, manju16832003 wrote: > I'm not sure if I would be missing any configuration params here, however > when I tried to assign an xpath field from URLDataSource (XML end point) to > two fields defined in schema.xml. > > Here is my scenario,

URLDataSource : Issue assigning single xpath field name to two solr fields

2014-02-24 Thread manju16832003
I'm not sure if I would be missing any configuration params here, however when I tried to assign an xpath field from URLDataSource (XML end point) to two fields defined in schema.xml. Here is my scenario, I have two fields *profile_display* and *profile_indexed* My assignment in DataImpotHa

Re: How to set http proxy for solr in order to make URLDataSource type source work well

2013-12-17 Thread xie kidd
> java -Dhttp.proxyHost=proxyhostURL -Dhttp.proxyPort=proxyPortNumber -jar >> start.jar >> >> >> >> >> >> On Tuesday, December 17, 2013 11:08 AM, xie kidd >> wrote: >> Hi >> >> I install a solr instance after firewall, and try the URLD

Re: How to set http proxy for solr in order to make URLDataSource type source work well

2013-12-17 Thread xie kidd
Dhttp.proxyPort=proxyPortNumber -jar > start.jar > > > > > > On Tuesday, December 17, 2013 11:08 AM, xie kidd > wrote: > Hi > > I install a solr instance after firewall, and try the URLDataSource Example > as: http://wiki.apache.org/solr/DataImportHandler > > Error happ

Re: How to set http proxy for solr in order to make URLDataSource type source work well

2013-12-17 Thread Ahmet Arslan
Hi, Once I used something like below : java -Dhttp.proxyHost=proxyhostURL -Dhttp.proxyPort=proxyPortNumber -jar start.jar On Tuesday, December 17, 2013 11:08 AM, xie kidd wrote: Hi I install a solr instance after firewall, and try the URLDataSource Example as: http://wiki.apache.org/solr

How to set http proxy for solr in order to make URLDataSource type source work well

2013-12-17 Thread xie kidd
Hi, I installed a Solr instance behind a firewall and tried the URLDataSource example (http://wiki.apache.org/solr/DataImportHandler). An error occurred; judging from the log, I think it was caused by the HTTP proxy. So my question is: how should I set the HTTP proxy for Solr, or for URLDataSource type

Re: DIH - URLDataSource import size

2013-10-25 Thread Chris Hostetter
: I have an issue that is only coming on live environment. The DIH : with URLDataSource is not working when the file size imported is large : (i.e. 100kb above - which is not so large). If its large, it returns : nothing (as seen in the Debug section of DataImport at Solr Admin). are you sure

Re: DIH - URLDataSource import size

2013-10-23 Thread Shalin Shekhar Mangar
the logs of your production environment? On Wed, Oct 23, 2013 at 1:10 PM, Raheel Hasan wrote: > anyone? > > > On Tue, Oct 22, 2013 at 9:50 PM, Raheel Hasan >wrote: > > > Hi, > > > > I have an issue that is only coming on live environment. The DIH > > with

Re: DIH - URLDataSource import size

2013-10-23 Thread Raheel Hasan
anyone? On Tue, Oct 22, 2013 at 9:50 PM, Raheel Hasan wrote: > Hi, > > I have an issue that is only coming on live environment. The DIH > with URLDataSource is not working when the file size imported is large > (i.e. 100kb above - which is not so large). If its large, it return

DIH - URLDataSource import size

2013-10-22 Thread Raheel Hasan
Hi, I have an issue that occurs only in the live environment. DIH with URLDataSource does not work when the imported file is large (i.e. above 100 KB, which is not so large). If it's large, it returns nothing (as seen in the Debug section of DataImport in the Solr Admin). However, when

URLDataSource & PlainTextEntityProcessor not working

2013-09-10 Thread Raheel Hasan
Hi, I am trying to load data (as plain text) from a URL. For this I am using URLDataSource & PlainTextEntityProcessor. However, the following is not working. I checked the access logs of my web server, and the URL is not even getting called: http://localhost/update_1/test

URLDatasource Authentication

2013-07-23 Thread Kalyan Kuram
Hi, I am trying to access XML files which are stored in our CMS. How do I pass a username/password to DIH so I can get all the XML files? It's throwing an exception: java.io.IOException: Server returned HTTP response code: 401 for URL: http://admin:admin...@cms1.zinio.com.com//articles/100850443.xml Is ther

Re: More debugging DIH - URLDataSource (solved)

2012-08-28 Thread Carrie Coy
Thank you for these suggestions. The real problem was incorrect syntax for the primary key column in data-config.xml. Once I corrected that, the data loaded fine. wrong: Right: On 08/25/2012 08:52 PM, Lance Norskog wrote: About XPaths: the XPath engine does a limited range of xpat

Re: More debugging DIH - URLDataSource

2012-08-25 Thread Lance Norskog
12) (W4537) > 2388 > > > PRODUCT: OPAQUE PONY > BEADS 6X9MM (BAG OF 850) (BE9000) > 1313 > > > > > > My DIH: > > | >type="URLDataSource" > encoding="UTF-8" > connectionTimeou

More debugging DIH - URLDataSource

2012-08-24 Thread Carrie Coy
I'm trying to write a DIH to incorporate page view metrics from an XML feed into our index. The DIH makes a single request, and updates 0 documents. I set log level to "finest" for the entire dataimport section, but I still can't tell what's wrong. I suspect the XPath. http://localhost:80

Re: URLDataSource delta import

2011-12-21 Thread Alessandro Benedetti
Any News? I'm also interested in this topic :) 2011/12/12 Brian Lamb > Hi all, > > According to > > http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource > a > delta-import is not "currently" implemented for URLDataSource. I sa

URLDataSource delta import

2011-12-12 Thread Brian Lamb
Hi all, According to http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource a delta-import is not "currently" implemented for URLDataSource. I say "currently" because I've noticed that such documentation is out of date in many places. I want

Re: DIH URLDataSource paginating with $nextUrl & $hasMore

2011-09-16 Thread B B
BTW - ignore the xpath typo, it should read: On Fri, Sep 16, 2011 at 12:59 PM, B B wrote: > Has anyone successfully setup DIH URLDataSource to paginate imports > using  $nextUrl & $hasMore ?: > > http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xm

DIH URLDataSource paginating with $nextUrl & $hasMore

2011-09-16 Thread B B
Has anyone successfully set up DIH URLDataSource to paginate imports using $nextUrl & $hasMore? http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1 It's not working for me: the data imports on the first page, but no subsequent calls to the data source are

RE: how to deal with URLDatasource which needs authorization?

2011-08-29 Thread Jaeger, Jay - DOT
So, the question then seems to be: is there a way to place credentials in the URLDataSource. There doesn't seem to be an explicit user ID or password ( http://wiki.apache.org/solr/DataImportHandler#Configuration_of_URLDataSource_or_HttpDataSource ) but perhaps you can include them i

RE: how to deal with URLDatasource which needs authorization?

2011-08-25 Thread deniz
- Smart, but doesn't work... If he worked, he would do it... -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-deal-with-URLDatasource-which-needs-authorization-tp3280515p3285579.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: how to deal with URLDatasource which needs authorization?

2011-08-24 Thread Jaeger, Jay - DOT
[mailto:denizdurmu...@gmail.com] Sent: Wednesday, August 24, 2011 4:38 AM To: solr-user@lucene.apache.org Subject: how to deal with URLDatasource which needs authorization? hi all i am trying to index a page which basically returns an xml file. But i dont want it to be accessible for anyone else... the page

how to deal with URLDatasource which needs authorization?

2011-08-24 Thread deniz
from here, but I don't want anyone else to access it. So what should I do to add authorization information to Solr, in order to let it index the data? - Smart, but doesn't work... If he worked, he would do it... -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-deal-with-URLDatasource-which

Re: DIH URLDataSource and useSolrAddSchema=true

2010-11-15 Thread Dario Rigolin
On Monday, November 15, 2010 11:18:47 am Lance Norskog wrote: > This is more complex than you need. The Solr update command can accept > streamed data, with the stream.url and stream.file options. You can just > use solr/update with stream.url=http://your.machine/your.php.script and > it will read
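The stream.url suggestion quoted above depends on remote streaming being enabled. In Solr releases of that era this was switched on in solrconfig.xml; a minimal sketch follows, where the upload limit value is only an illustrative assumption.

```xml
<!-- solrconfig.xml: allow /update to pull its content from stream.url or stream.file -->
<requestDispatcher handleSelect="true">
  <!-- multipartUploadLimitInKB value is an assumption, not from the thread -->
  <requestParsers enableRemoteStreaming="true"
                  multipartUploadLimitInKB="2048000"/>
</requestDispatcher>
```

Enabling remote streaming lets any client that can reach the update handler make Solr fetch arbitrary URLs, so it is usually combined with network-level access restrictions.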

Re: DIH URLDataSource and useSolrAddSchema=true

2010-11-15 Thread Lance Norskog
t and I need to make solr able to get all changed documents into the index. Looking at DIH I decided to use URLDataSource and useSolrAddSchema=true pointing to my application url: getchangeddocstoindex.php. But my PHP page could stream hundreds of megabytes (maybe a couple of gigabytes!). Anybody kn

DIH URLDataSource and useSolrAddSchema=true

2010-11-15 Thread Dario Rigolin
I'm looking to index data in Solr using a PHP page feeding the index. In my application I have all docs already "converted" to a solr/add XML document and I need to make Solr able to get all changed documents into the index. Looking at DIH I decided to use URLDataSource and u

URLDataSource

2010-06-26 Thread Jason Chaffee
I would like to use the URLDataSource to make RESTful calls to get content and only re-index when content changes. This means using HTTP headers to make a request and using the response headers to determine when to make the request. For example, Request Headers: Accept: application/xml if

Re: Using XSLT with DIH for a URLDataSource

2010-02-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
6:24 AM, Roland Villemoes >>> wrote: >>>> You're right! >>>> >>>> I was as simple (stupid!) as that, >>>> >>>> Thanks a lot.... (for your time .. very appreciated) >>>> >>>> Roland >>>> >>

Re: Using XSLT with DIH for a URLDataSource

2010-02-25 Thread Lance Norskog
-Original message- >>> From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On behalf of Noble >>> Paul നോബിള്‍ नोब्ळ् >>> Sent: 22 February 2010 14:01 >>> To: solr-user@lucene.apache.org >>> Subject: Re: Using XSL

Re: Using XSLT with DIH for a URLDataSource

2010-02-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
> >> -Original message- >> From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On behalf of Noble >> Paul നോബിള്‍ नोब्ळ् >> Sent: 22 February 2010 14:01 >> To: solr-user@lucene.apache.org >> Subject: Re: Using XSLT with DIH for a URLDataSource >> >

Re: Using XSLT with DIH for a URLDataSource

2010-02-24 Thread Lance Norskog
> -Original message- > From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On behalf of Noble > Paul നോബിള്‍ नोब्ळ् > Sent: 22 February 2010 14:01 > To: solr-user@lucene.apache.org > Subject: Re: Using XSLT with DIH for a URLDataSource > > The xslt file looks

SV: Using XSLT with DIH for a URLDataSource

2010-02-22 Thread Roland Villemoes
-user@lucene.apache.org Subject: Re: Using XSLT with DIH for a URLDataSource The xslt file looks fine. Is the location of the file correct? On Mon, Feb 22, 2010 at 2:57 PM, Roland Villemoes wrote: > > Hi > > (thanks a lot) > > Yes, The full stacktrace is this:

Re: Using XSLT with DIH for a URLDataSource

2010-02-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
> > And my test.xslt (cut down to almost nothing just to move further and see > that XSLT was working): > > > xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> > > > > > > > > > > > >

SV: Using XSLT with DIH for a URLDataSource

2010-02-22 Thread Roland Villemoes
): -Original message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: 22 February 2010 10:08 To: solr-user@lucene.apache.org Subject: Re: Using XSLT with DIH for a URLDataSource On Mon, Feb 22, 2010 at 1:18 PM, Roland Villemoes wrote: > Hi, > >

Re: Using XSLT with DIH for a URLDataSource

2010-02-22 Thread Shalin Shekhar Mangar
On Mon, Feb 22, 2010 at 1:18 PM, Roland Villemoes wrote: > Hi, > > I have to load data for Solr from a UrlDataSource supplying me with a XML > feed. > > In the simple case where I just do simple XSLT select this works just fine. > Just as shown on the wiki (http:/

Using XSLT with DIH for a URLDataSource

2010-02-21 Thread Roland Villemoes
Hi, I have to load data for Solr from a UrlDataSource supplying me with an XML feed. In the simple case where I just do a simple XSLT select this works just fine, just as shown on the wiki (http://wiki.apache.org/solr/DataImportHandler). But I need to do some manipulation of the XML feed first, So

Re: [DIH] URLDataSource and fetching a link

2009-10-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
http://feeds1.nytimes.com/nyt/rss/Sports " processor="XPathEntityProcessor" forEach="/rss/channel | /rss/channel/item" dataSource="rss" transformer="RegexTransformer,DateFormatTransformer">

Re: [DIH] URLDataSource and fetching a link

2009-10-20 Thread Grant Ingersoll
Finally getting back to this... On Sep 17, 2009, at 12:28 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: 2009/9/17 Noble Paul നോബിള്‍ नोब्ळ् : it is possible to have a sub entity which has XPathEntityProcessor which can use the link as the url This may not be a good solution. But you can use the

Re: [DIH] URLDataSource and fetching a link

2009-09-17 Thread Grant Ingersoll
On Sep 16, 2009, at 9:13 PM, Walter Underwood wrote: I would use the RSS feed (hopefully in Atom format) as a source of links, then use a regular web spider to fetch the content. I seriously doubt that DIH is up to the task of general fetching from the Wild Wild Web. That is a dirty and di

Re: [DIH] URLDataSource and fetching a link

2009-09-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
2009/9/17 Noble Paul നോബിള്‍ नोब्ळ् : > it is possible to have a sub entity which has XPathEntityProcessor > which can use the link as the url This may not be a good solution. But you can use the $hasMore and $nextUrl options of XPathEntityProcessor to recursively loop if there are more links >

Re: [DIH] URLDataSource and fetching a link

2009-09-16 Thread Walter Underwood
I would use the RSS feed (hopefully in Atom format) as a source of links, then use a regular web spider to fetch the content. I seriously doubt that DIH is up to the task of general fetching from the Wild Wild Web. That is a dirty and difficult job and DIH is designed for cooperating data s

Re: [DIH] URLDataSource and fetching a link

2009-09-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
it is possible to have a sub entity which has XPathEntityProcessor which can use the link as the url On Thu, Sep 17, 2009 at 8:57 AM, Grant Ingersoll wrote: > Many RSS feeds contain a <link> to some full article.  How can I have the > DIH get the RSS feed and then have it go and fetch the content at th

[DIH] URLDataSource and fetching a link

2009-09-16 Thread Grant Ingersoll
Many RSS feeds contain a <link> to some full article. How can I have the DIH get the RSS feed and then have it go and fetch the content at the link? Thanks, Grant

Re: DIH: URLDataSource and incremental indexing

2009-07-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
09 at 7:57 PM, Erik Hatcher wrote: > I'm exploring other ways of getting data into Solr via DataImportHandler > than through a relational database, particularly the URLDataSource. > > I see the special commands for deleting by id and query as well as the > $hasMore/$nextUrl

DIH: URLDataSource and incremental indexing

2009-07-09 Thread Erik Hatcher
I'm exploring other ways of getting data into Solr via DataImportHandler than through a relational database, particularly the URLDataSource. I see the special commands for deleting by id and query as well as the $hasMore/$nextUrl techniques, but I'm unclear on exactly how one