Please don't top post.

On 25 November 2010 15:38, Ron Piggott <ron.pigg...@actsministries.org> wrote:
>
> Is "User Agent" supposed to have a hyphen ("-")?  Ron
>
>
>
> The Verse of the Day
> “Encouragement from God’s Word”
> http://www.TheVerseOfTheDay.info
> -----Original Message-----
> From: Richard Quadling
> Sent: Thursday, November 25, 2010 9:16 AM
> To: Deva
> Cc: Shreyas Agasthya ; Ron Piggott ; php-general@lists.php.net ;
> a...@ashleysheridan.co.uk
> Subject: Re: [PHP] Fw: Spoofing user_agent
>
> On 25 November 2010 11:32, Deva <devendra...@gmail.com> wrote:
>>
>> Use curl
>> http://php.net/manual/en/book.curl.php
>>
>>
>> On Thu, Nov 25, 2010 at 4:41 PM, Shreyas Agasthya
>> <shreya...@gmail.com> wrote:
>>
>>> I feel you should use the 4th method here, as you are not trying to
>>> read the file but the header-level (layer 7) information of the HTTP
>>> protocol.
>>>
>>> http://php.net/manual/en/function.file-get-contents.php
>>>
>>>
>>> --Shreyas
>>>
>>> On Thu, Nov 25, 2010 at 4:11 PM, Ron Piggott
>>> <ron.pigg...@actsministries.org> wrote:
>>>
>>> > Will the header be passed when using file_get_contents, or should I be
>>> > using another command, and if so, which one?  Ron
>>> >
>>> > <?php
>>> >
>>> >     header('User Agent: RonBot (http://www.example.com)');
>>> >     $url = "http://www.example.com";
>>> >
>>> >         $input = file_get_contents($url);
>>> >
>>> >
>>> >
>>> > From: Shreyas Agasthya <shreya...@gmail.com>
>>> > Sent: Thursday, November 25, 2010 4:21 AM
>>> > To: Ron Piggott <ron.pigg...@actsministries.org>
>>> > Cc: php-general@lists.php.net ; a...@ashleysheridan.co.uk
>>> > Subject: Re: [PHP] Fw: Spoofing user_agent
>>> >
>>> > The standard HTTP request header is: User Agent (without the
>>> > underscore).
>>> >
>>> > --Shreyas
>>> >
>>> > On Thu, Nov 25, 2010 at 2:36 PM, Ron Piggott
>>> > <ron.pigg...@actsministries.org> wrote:
>>> >
>>> >>
>>> >> Is this what you are telling me to do:
>>> >>
>>> >> header('user_agent: RonBot (http://www.theverseoftheday.info)');
>>> >>
>>> >> Ron
>>> >>
>>> >> From: a...@ashleysheridan.co.uk
>>> >> Sent: Thursday, November 25, 2010 3:34 AM
>>> >> To: Ron Piggott ; php-general@lists.php.net
>>> >> Subject: Re: [PHP] Fw: Spoofing user_agent
>>> >>
>>> >> You need to set it in the request headers you send. Putting it in the
>>> >> script you're using as a spider with ini_set won't do anything because
>>> >> the target site doesn't know anything about it.
>>> >>
>>> >> Thanks,
>>> >> Ash
>>> >> http://www.ashleysheridan.co.uk
>>> >>
>>> >> ----- Reply message -----
>>> >> From: "Ron Piggott" <ron.pigg...@actsministries.org>
>>> >> Date: Thu, Nov 25, 2010 08:25
>>> >> Subject: [PHP] Fw: Spoofing user_agent
>>> >> To: <php-general@lists.php.net>
>>> >>
>>> >> I have written a script to generate a sitemap of my web site.  It
>>> >> crawls all of the site's web pages (about 30,000).
>>> >>
>>> >> I need help to spoof the user_agent variable so the stats program
>>> >> running in the background (“AWSTATS”) will treat the crawl as a bot,
>>> >> not browsing usage.
>>> >>
>>> >> The sitemap generator is a cron job.  I tried the syntax:
>>> >> ini_set('user_agent', 'RonBot (http://www.theverseoftheday.info)');
>>> >>
>>> >> This didn’t work.  The browsing was attributed to the dedicated IP
>>> >> address.
>>> >>
>>> >> How do I get AWSTATS to count this crawl like the other entries under
>>> >> the “Robots/Spiders visitors” heading, for example:
>>> >> Unknown robot (identified by 'bot*')
>>> >>
>>> >> I don’t mean any ill will by changing this setting.  Thanks for the
>>> >> help.
>>> >>
>>> >> Ron
>>> >>
>>> >>
>>> >
>>> >
>>> > --
>>> > Regards,
>>> > Shreyas Agasthya
>>> >
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Shreyas Agasthya
>>>
>>
>>
>>
>> --
>> :DJ
>>
>
> There is no point in using header(). It sets a response header that your
> script sends to its own client; it does not add a request header to the
> requests that file_get_contents() makes to the remote server.
>
> I use stream contexts.
>
> $s_Contents = file_get_contents(
>  $s_URL,
>  False,
>  stream_context_create(
>   array(
>     'http' => array(
>       'method' => 'GET',
>       'header' => "User-Agent: RonBot (http://www.example.com)\r\n"
>     ),
>   )
>  )
> );
>
> You can supply cookies, or anything else, with the request. Make sure
> you add a \r\n to each of the headers and just concatenate them.
>
> If you are doing this in a loop, then I'd recommend creating a default
> stream context and then the request would just be ...
>
> $s_Contents = file_get_contents($s_URL);
>
> As the default stream context would be applied.
>
> I had to use a default stream context to route all http requests
> through an NTLM authentication proxy server because PHP doesn't deal
> with NTLM authentication.
>
> See my user notes on
> http://docs.php.net/manual/en/function.stream-context-get-default.php.
> Don't bother with the link at the bottom of the user note- it's not
> live.
>
> Richard.
>
> --
> Richard Quadling
> Twitter : EE : Zend
> @RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY
>

Yes, the header name is hyphenated: User-Agent.

http://en.wikipedia.org/wiki/User_agent "... the identity is
transmitted via the User-Agent request header, ... "
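
For the crawl loop, the default stream context approach from the quoted message could look roughly like the sketch below. It assumes PHP 5.3 or later (for stream_context_set_default()). The helper name build_header_string() and the extra Accept header are hypothetical, added only to illustrate joining several headers with \r\n.

```php
<?php
// Hypothetical helper (not part of PHP): join header name/value pairs into
// the single CRLF-terminated string used by the http wrapper's 'header' option.
function build_header_string(array $headers)
{
    $out = '';
    foreach ($headers as $name => $value) {
        $out .= $name . ': ' . $value . "\r\n";
    }
    return $out;
}

// The Accept header is an illustrative assumption; only User-Agent comes
// from the thread.
$header = build_header_string(array(
    'User-Agent' => 'RonBot (http://www.example.com)',
    'Accept'     => 'text/html',
));

// Install these options as the default context for the rest of the script.
stream_context_set_default(array(
    'http' => array(
        'method' => 'GET',
        'header' => $header,
    ),
));

// Inspect what the default context now holds (no network access needed).
$options = stream_context_get_options(stream_context_get_default());
echo $options['http']['header'];

// Inside the crawl loop, a plain call should then pick up the default context:
// $s_Contents = file_get_contents($s_URL);
```

Setting the context once keeps the loop body down to a single file_get_contents() call. To confirm what the target server actually receives, check its access log for the RonBot string.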



-- 
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
