Re: Solr response error 403 when I try to index medium.com articles

2016-03-30 Thread Jeferson dos Anjos
look into running a more > > robust crawler (nutch, droids, Lucidworks Fusion, etc...) that has more > > features and debugging options (notably: rate limiting) and use that code > > to fetch the content, then push it to Solr. > > > > > > : Date: Tue, 29 Mar

Re: Solr response error 403 when I try to index medium.com articles

2016-03-30 Thread Jack Krupansky
, Lucidworks Fusion, etc...) that has more > features and debugging options (notably: rate limiting) and use that code > to fetch the content, then push it to Solr. > > > : Date: Tue, 29 Mar 2016 20:54:52 -0300 > : From: Jeferson dos Anjos > : Reply-To: solr-user@lucene.apache

Re: Solr response error 403 when I try to index medium.com articles

2016-03-30 Thread Chris Hostetter
, etc...) that has more features and debugging options (notably: rate limiting) and use that code to fetch the content, then push it to Solr. : Date: Tue, 29 Mar 2016 20:54:52 -0300 : From: Jeferson dos Anjos : Reply-To: solr-user@lucene.apache.org : To: solr-user@lucene.apache.org : Subject:
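
The "fetch the content yourself, then push it to Solr" approach suggested here can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not a replacement for a real crawler. The User-Agent string, the core name `mycore`, the Solr URL, and the `content` field are all assumptions made for the example and need to match your own setup. Nothing in this thread confirms that a different User-Agent is enough to avoid the 403.

```python
import json
import time
import urllib.request

# Hypothetical values for illustration only: adjust to your environment.
USER_AGENT = "Mozilla/5.0 (compatible; example-fetcher/0.1)"
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update?commit=true"


def build_fetch_request(url, user_agent=USER_AGENT):
    """Build a GET request that sends an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def build_solr_request(docs, solr_url=SOLR_UPDATE_URL):
    """Build a POST request carrying a JSON array of documents for Solr."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        solr_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_pages(urls, delay_seconds=2.0):
    """Fetch each URL with a pause between requests, then push all to Solr."""
    docs = []
    for url in urls:
        with urllib.request.urlopen(build_fetch_request(url), timeout=30) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            html = resp.read().decode(charset, errors="replace")
        # "content" is an assumed field name; it must exist in your schema.
        docs.append({"id": url, "content": html})
        time.sleep(delay_seconds)  # crude rate limiting between fetches
    with urllib.request.urlopen(build_solr_request(docs), timeout=30) as resp:
        return resp.status
```

Calling `index_pages(["https://medium.com/..."])` would fetch each page and send the raw HTML to Solr in one update request. In practice you would likely extract the article text before indexing, and a dedicated crawler gives you proper rate limiting and debugging output.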

Re: Solr response error 403 when I try to index medium.com articles

2016-03-30 Thread Jeferson dos Anjos
Jack, thanks for the reply. I'm not having trouble with other sites over https. What logic do you suggest I change? I did not quite understand. 2016-03-29 21:01 GMT-03:00 Jack Krupansky : > Medium switches from http to https, so you would need the logic for dealing > with https security handshakes. > >

Solr response error 403 when I try to index medium.com articles

2016-03-29 Thread Jeferson dos Anjos
I'm trying to index some pages from Medium, but I get a 403 error. I believe it is because Medium does not accept the Solr user-agent. Has anyone ever experienced this? Do you know how to change it? I appreciate any help. 500 94 Server returned HTTP response code: 403 for URL: https://medium.com

Re: Solr response error 403 when I try to index medium.com articles

2016-03-29 Thread Jack Krupansky
Medium switches from http to https, so you would need the logic for dealing with https security handshakes. -- Jack Krupansky On Tue, Mar 29, 2016 at 7:54 PM, Jeferson dos Anjos < jefersonan...@packdocs.com> wrote: > I'm trying to index some pages of the medium. But I get error 403. I > believe
