XML files with mappings loaded via
different DIH entities? I'd appreciate any samples or hints.
Or maybe someone is able to spot the error in the following configuration?
(The custom DataSource is a subclass of URLDataSource and handles Basic Auth as
well as decompression.)
"/rss/channel/item">
xpath="/rss/channel/item/title" />
> But I am facing the following error
>
> Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag
> ; expected .
>
> Can anybody help?
But I am facing the following error
Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag
; expected .
Can anybody help?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Error-when-using-URLDataSource-to-index-RSS-items-tp4140548.html
Sent from the Solr - User mailing list archive at Nabble.com.
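The data-config.xml itself was stripped by the archive. As a point of comparison, a minimal configuration for this kind of RSS import might look like the sketch below; the feed URL, column names, and timeout values are placeholders, and the custom DataSource subclass mentioned above would replace the stock type attribute:

```xml
<dataConfig>
  <!-- swap in the custom URLDataSource subclass for Basic Auth / decompression -->
  <dataSource type="URLDataSource" encoding="UTF-8"
              connectionTimeout="5000" readTimeout="10000"/>
  <document>
    <entity name="item"
            url="http://example.com/feed.rss"
            processor="XPathEntityProcessor"
            forEach="/rss/channel/item"
            transformer="DateFormatTransformer">
      <field column="title" xpath="/rss/channel/item/title"/>
      <field column="link"  xpath="/rss/channel/item/link"/>
      <field column="date"  xpath="/rss/channel/item/pubDate"
             dateTimeFormat="EEE, dd MMM yyyy HH:mm:ss z"/>
    </entity>
  </document>
</dataConfig>
```

The "Unexpected close tag" error above usually points at mismatched or unclosed elements in exactly this file, so checking it for well-formedness is a reasonable first step.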
I will try with the SolrEntityProcessor,
but I'm still interested to know why it will not work with the
XPathEntityProcessor.
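For indexing from another Solr server, a minimal SolrEntityProcessor entity might look like this (host, core name, and row count are placeholders); unlike XPathEntityProcessor it queries the remote Solr instance directly rather than parsing its XML response:

```xml
<dataConfig>
  <document>
    <entity name="remote"
            processor="SolrEntityProcessor"
            url="http://remotehost:8983/solr/collection1"
            query="*:*"
            rows="500"/>
  </document>
</dataConfig>
```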
--
View this message in context:
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135730.html
Sent from the Solr - User mailing list archive at Nabble.com.
--
View this message in context:
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135567.html
Sent from the Solr - User mailing list archive at Nabble.com.
On 12 May 2014 21:41, helder.sepulveda wrote:
>
> I've been trying to index data from other Solr servers but the import always
> shows:
> Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.
> Requests: 1, Fetched: 0, Skipped: 0, Processed
>
> My data config looks like this:
Nothing
On 5/12/2014 10:11 AM, helder.sepulveda wrote:
> I've been trying to index data from other Solr servers but the import always
> shows:
> Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.
> Requests: 1, Fetched: 0, Skipped: 0, Processed
I'm wondering why you're using the XPathEntity
On 12 May 2014 22:52, helder.sepulveda wrote:
> Here is the data config:
> url="http://slszip11.as.homes.com/solr/select?q=*:*"
> processor="XPathEntityProcessor"
> forEach="/response/result/doc"
> trans
MESQUITE 017304
0
1996
75181
2014-04-20T16:28:52.467Z
--
View this message in context:
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135332.html
Sent from the Solr - User mailing list archive at Nabble.com.
I tested calling the URL using curl right on the server, and I get a valid
response and the correct content
--
View this message in context:
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135333.html
Sent from the Solr - User mailing list archive
transformer="DateFormatTransformer">
--
View this message in context:
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135331.html
Sent from the Solr - User mailing list archive at Nabble.com.
--
View this message in context:
http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321.html
Sent from the Solr - User mailing list archive at Nabble.com.
The XPathEntityProcessor supports only one field mapping per xpath, so
using copyField is the only way.
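Applied to the profile_display/profile_indexed scenario below, that could look like the following schema.xml sketch (the field types here are assumptions): DIH maps the xpath to one field, and copyField duplicates it into the second at index time:

```xml
<!-- DIH maps the xpath to profile_display only -->
<field name="profile_display" type="string" indexed="false" stored="true"/>
<field name="profile_indexed" type="text_general" indexed="true" stored="false"/>
<!-- copyField duplicates the value into the second field at index time -->
<copyField source="profile_display" dest="profile_indexed"/>
```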
On Mon, Feb 24, 2014 at 2:45 PM, manju16832003 wrote:
> I'm not sure if I would be missing any configuration params here, however
> when I tried to assign an xpath field from URLData
On 24 February 2014 14:45, manju16832003 wrote:
> I'm not sure if I would be missing any configuration params here, however
> when I tried to assign an xpath field from URLDataSource (XML end point) to
> two fields defined in schema.xml.
>
> Here is my scenario,
I'm not sure if I'm missing any configuration params here; however, it does not
work when I try to assign an xpath field from URLDataSource (an XML endpoint) to
two fields defined in schema.xml.
Here is my scenario,
I have two fields
*profile_display* and *profile_indexed*
My assignment in DataImportHandler
Hi,
Once I used something like below:
java -Dhttp.proxyHost=proxyhostURL -Dhttp.proxyPort=proxyPortNumber -jar
start.jar
On Tuesday, December 17, 2013 11:08 AM, xie kidd wrote:
Hi
I install a solr instance after firewall, and try the URLDataSource Example
as: http://wiki.apache.org/solr
Hi
I install a solr instance after firewall, and try the URLDataSource Example
as: http://wiki.apache.org/solr/DataImportHandler
Error happened, from log, i think this issue was caused by the http proxy.
So my question is, how should i set the http proxy for Solr? or for
URLDataSource type
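For reference, the standard JVM networking system properties cover this case; they apply to all outgoing HTTP connections the JVM makes, including URLDataSource fetches. A sketch (host, port, and exclusion list are placeholders):

```shell
java -Dhttp.proxyHost=proxy.example.com \
     -Dhttp.proxyPort=8080 \
     -Dhttp.nonProxyHosts="localhost|127.0.0.1" \
     -jar start.jar
```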
: I have an issue that is only coming on live environment. The DIH
: with URLDataSource is not working when the file size imported is large
: (i.e. 100kb above - which is not so large). If its large, it returns
: nothing (as seen in the Debug section of DataImport at Solr Admin).
Are you sure? Did you check the logs of your production environment?
On Wed, Oct 23, 2013 at 1:10 PM, Raheel Hasan wrote:
> anyone?
>
>
> On Tue, Oct 22, 2013 at 9:50 PM, Raheel Hasan wrote:
>
> > Hi,
> >
> > I have an issue that is only coming on live environment. The DIH
> > with
anyone?
On Tue, Oct 22, 2013 at 9:50 PM, Raheel Hasan wrote:
> Hi,
>
> I have an issue that is only coming on live environment. The DIH
> with URLDataSource is not working when the file size imported is large
> (i.e. 100kb above - which is not so large). If its large, it return
Hi,
I have an issue that is only coming on live environment. The DIH
with URLDataSource is not working when the file size imported is large
(i.e. 100kb above - which is not so large). If its large, it returns
nothing (as seen in the Debug section of DataImport at Solr Admin).
However, when
Hi,
I am trying to load data (as plain text) from a URL. For this I am
using URLDataSource & PlainTextEntityProcessor. However, it is
not working. I checked the access logs of my web server; the URL is
not even getting called:
http://localhost/update_1/test
Hi
I am trying to access XML files which are stored in our CMS. How do I pass
a username/password to DIH so I can get all the XML files?
It's throwing an exception:
java.io.IOException: Server returned HTTP response code: 401 for URL:
http://admin:admin...@cms1.zinio.com.com//articles/100850443.xml
Is ther
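The stock URLDataSource has no username/password settings. One workaround (a sketch, not a supported option) is to subclass URLDataSource and set the Authorization header on the connection yourself; the header value is just "Basic " plus the Base64 of user:password, which plain JDK classes can produce:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuth {
    // Builds the value of the HTTP Authorization header for Basic Auth:
    // "Basic " + base64("user:password")
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // a subclass would call conn.setRequestProperty("Authorization", ...)
        // before reading the stream
        System.out.println(basicAuthHeader("admin", "admin")); // Basic YWRtaW46YWRtaW4=
    }
}
```

Embedding credentials in the URL itself (user:pass@host) is not supported by java.net.URLConnection, which is why the 401 above persists.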
Thank you for these suggestions. The real problem was incorrect syntax
for the primary key column in data-config.xml. Once I corrected that,
the data loaded fine.
Wrong:
Right:
On 08/25/2012 08:52 PM, Lance Norskog wrote:
About XPaths: the XPath engine does a limited range of xpat
12) (W4537)
> 2388
>
> PRODUCT: OPAQUE PONY
> BEADS 6X9MM (BAG OF 850) (BE9000)
> 1313
>
> My DIH:
>
> type="URLDataSource"
> encoding="UTF-8"
> connectionTimeou
I'm trying to write a DIH to incorporate page view metrics from an XML
feed into our index. The DIH makes a single request, and updates 0
documents. I set log level to "finest" for the entire dataimport
section, but I still can't tell what's wrong. I suspect the XPath.
http://localhost:80
Any News?
I'm also interested in this topic :)
2011/12/12 Brian Lamb
> Hi all,
>
> According to
>
> http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource
> a
> delta-import is not "currently" implemented for URLDataSource. I sa
Hi all,
According to
http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource
a
delta-import is not "currently" implemented for URLDataSource. I say
"currently" because I've noticed that such documentation is out of date in
many places. I want
BTW - ignore the xpath typo, it should read:
On Fri, Sep 16, 2011 at 12:59 PM, B B wrote:
> Has anyone successfully setup DIH URLDataSource to paginate imports
> using $nextUrl & $hasMore ?:
>
> http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xm
Has anyone successfully setup DIH URLDataSource to paginate imports
using $nextUrl & $hasMore ?:
http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1
It's not working for me, the data imports on the first page, but no
subsequent calls to the data source are
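For reference, $hasMore and $nextUrl are special row fields that a transformer sets; one way to wire that up is a ScriptTransformer. A sketch (the feed layout, the nextPage element, and all URLs are hypothetical):

```xml
<dataConfig>
  <script><![CDATA[
    // if the current page advertises a next page, tell DIH to keep fetching
    function paginate(row) {
      var next = row.get('nextPage');
      if (next != null) {
        row.put('$hasMore', 'true');
        row.put('$nextUrl', next);
      }
      return row;
    }
  ]]></script>
  <dataSource type="URLDataSource"/>
  <document>
    <entity name="paged"
            processor="XPathEntityProcessor"
            url="http://example.com/feed?page=1"
            forEach="/feed/entry"
            transformer="script:paginate">
      <field column="nextPage" xpath="/feed/entry/nextPage"/>
    </entity>
  </document>
</dataConfig>
```

If the transformer never puts $hasMore on a row, DIH stops after the first request, which matches the single-page behaviour described above.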
So, the question then seems to be: is there a way to place credentials in the
URLDataSource.
There doesn't seem to be an explicit user ID or password (
http://wiki.apache.org/solr/DataImportHandler#Configuration_of_URLDataSource_or_HttpDataSource
) but perhaps you can include them i
-
Smart, but it doesn't work... If it worked, it would get it done...
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-deal-with-URLDatasource-which-needs-authorization-tp3280515p3285579.html
Sent from the Solr - User mailing list archive at Nabble.com.
[mailto:denizdurmu...@gmail.com]
Sent: Wednesday, August 24, 2011 4:38 AM
To: solr-user@lucene.apache.org
Subject: how to deal with URLDatasource which needs authorization?
hi all
I am trying to index a page which basically returns an XML file, but I don't
want it to be accessible to anyone else... I can index the page
from here, but I don't want anyone else to
access it.
So what should I do to add authorization information to Solr, in order to let it
index the data?
-
Smart, but it doesn't work... If it worked, it would get it done...
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-deal-with-URLDatasource-which
On Monday, November 15, 2010 11:18:47 am Lance Norskog wrote:
> This is more complex than you need. The Solr update command can accept
> streamed data, with the stream.url and stream.file options. You can just
> use solr/update with stream.url=http://your.machine/your.php.script and
> it will read
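A sketch of the stream.url approach described above (hostname and script URL are placeholders; remote streaming must be enabled via enableRemoteStreaming in solrconfig.xml):

```shell
# Solr fetches the <add> XML from the given URL itself
curl "http://localhost:8983/solr/update?commit=true&stream.url=http://your.machine/your.php.script"
```

Because Solr streams the response rather than buffering it, this also sidesteps the memory concern with multi-gigabyte payloads raised below.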
t and I need to make Solr able to get all changed documents into the
index. Looking at DIH I decided to use URLDataSource and useSolrAddSchema=true
pointing to my application url: getchangeddocstoindex.php.
But my PHP page could stream hundreds of megabytes (maybe a couple of gigs!).
Anybody kn
I'm looking to index data in Solr using a PHP page feeding the index.
In my application I have all docs already "converted" to a solr/add XML
document and I need to make Solr able to get all changed documents into the
index. Looking at DIH I decided to use URLDataSource and u
I would like to use the URLDataSource to make RESTful calls to get content and only
re-index when content changes. This means using HTTP headers to make a request
and using the response headers to determine when to make the request. For
example,
Request Headers:
Accept: application/xml
if
6:24 AM, Roland Villemoes
>>> wrote:
>>>> You're right!
>>>>
>>>> It was as simple (stupid!) as that,
>>>>
>>>> Thanks a lot.... (for your time .. very appreciated)
>>>>
>>>> Roland
>>>>
> -Original message-
> From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On behalf of Noble
> Paul നോബിള് नोब्ळ्
> Sent: 22 February 2010 14:01
> To: solr-user@lucene.apache.org
> Subject: Re: Using XSLT with DIH for a URLDataSource
>
The XSLT file looks fine. Is the location of the file correct?
On Mon, Feb 22, 2010 at 2:57 PM, Roland Villemoes
wrote:
>
> Hi
>
> (thanks a lot)
>
> Yes, The full stacktrace is this:
> And my test.xslt (cut down to almost nothing just to move further and see
> that XSLT was working):
> xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
-Original message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: 22 February 2010 10:08
To: solr-user@lucene.apache.org
Subject: Re: Using XSLT with DIH for a URLDataSource
On Mon, Feb 22, 2010 at 1:18 PM, Roland Villemoes
wrote:
> Hi,
>
On Mon, Feb 22, 2010 at 1:18 PM, Roland Villemoes
wrote:
> Hi,
>
> I have to load data for Solr from a UrlDataSource supplying me with a XML
> feed.
>
> In the simple case where I just do simple XSLT select this works just fine.
> Just as shown on the wiki (http:/
Hi,
I have to load data for Solr from a UrlDataSource supplying me with a XML feed.
In the simple case where I just do simple XSLT select this works just fine.
Just as shown on the wiki (http://wiki.apache.org/solr/DataImportHandler)
But I need to do some manipulation of the XML feed first, So
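XPathEntityProcessor can apply a stylesheet before field extraction via its xsl attribute; combined with useSolrAddSchema, the XSLT output can be a ready-made <add><doc> document. A sketch (feed URL and stylesheet path are placeholders):

```xml
<dataConfig>
  <dataSource type="URLDataSource"/>
  <document>
    <entity name="feed"
            processor="XPathEntityProcessor"
            url="http://example.com/feed.xml"
            xsl="xslt/transform.xsl"
            useSolrAddSchema="true"/>
  </document>
</dataConfig>
```

With useSolrAddSchema="true" no field/xpath mappings are needed; the stylesheet is responsible for producing the final document structure.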
url="http://feeds1.nytimes.com/nyt/rss/Sports"
processor="XPathEntityProcessor"
forEach="/rss/channel | /rss/channel/item"
dataSource="rss"
transformer="RegexTransformer,DateFormatTransformer">
Finally getting back to this...
On Sep 17, 2009, at 12:28 AM, Noble Paul നോബിള്
नोब्ळ् wrote:
2009/9/17 Noble Paul നോബിള് नोब्ळ्
:
it is possible to have a sub entity which has XPathEntityProcessor
which can use the link as the url
This may not be a good solution.
But you can use the
On Sep 16, 2009, at 9:13 PM, Walter Underwood wrote:
I would use the RSS feed (hopefully in Atom format) as a source of
links, then use a regular web spider to fetch the content.
I seriously doubt that DIH is up to the task of general fetching
from the Wild Wild Web. That is a dirty and di
2009/9/17 Noble Paul നോബിള് नोब्ळ् :
> it is possible to have a sub entity which has XPathEntityProcessor
> which can use the link as the url
This may not be a good solution.
But you can use the $hasMore and $nextUrl options of
XPathEntityProcessor to recursively loop if there are more links
>
I would use the RSS feed (hopefully in Atom format) as a source of
links, then use a regular web spider to fetch the content.
I seriously doubt that DIH is up to the task of general fetching from
the Wild Wild Web. That is a dirty and difficult job and DIH is
designed for cooperating data s
it is possible to have a sub entity which has XPathEntityProcessor
which can use the link as the url
On Thu, Sep 17, 2009 at 8:57 AM, Grant Ingersoll wrote:
> Many RSS feeds contain a link to some full article. How can I have the
> DIH get the RSS feed and then have it go and fetch the content at th
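A sketch of that sub-entity arrangement (feed URL and field names are placeholders): the outer entity walks the RSS items, and the inner entity fetches each item's link, here with PlainTextEntityProcessor dumping the page into a single column:

```xml
<entity name="item"
        processor="XPathEntityProcessor"
        url="http://example.com/feed.rss"
        forEach="/rss/channel/item">
  <field column="link" xpath="/rss/channel/item/link"/>
  <!-- fetch the page behind each item's link -->
  <entity name="page"
          processor="PlainTextEntityProcessor"
          url="${item.link}">
    <field column="plainText" name="content"/>
  </entity>
</entity>
```

The ${item.link} variable resolves per row, so one HTTP request is made for each item in the feed.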
Many RSS feeds contain a link to some full article. How can I have
the DIH get the RSS feed and then have it go and fetch the content at
the link?
Thanks,
Grant
09 at 7:57 PM, Erik Hatcher wrote:
> I'm exploring other ways of getting data into Solr via DataImportHandler
> than through a relational database, particularly the URLDataSource.
>
> I see the special commands for deleting by id and query as well as the
> $hasMore/$nextUrl
I'm exploring other ways of getting data into Solr via
DataImportHandler than through a relational database, particularly the
URLDataSource.
I see the special commands for deleting by id and query as well as the
$hasMore/$nextUrl techniques, but I'm unclear on exactly how one