Re: may be a bug in string.rstrip

2007-11-22 Thread Tyler Reguly
Wow... it took someone else to point this out to me. Kind of feel like an
idiot for responding :)

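rstrip doesn't take a suffix string... it treats its argument as a set of
characters and strips any of them from the right end, one at a time, until
it reaches a character that isn't in the set.

That's why the e was removed from 'exe.torrent'... there's an e in
'.torrent', so the e just before the dot was stripped too. Then it hit the
x, which isn't in the set, so it stopped... that's why the other e was left.
'120.exe' only looked right because the 0 isn't one of '.', 'e' or 'x'.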
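
To make the character-set behaviour concrete, here is a minimal sketch. The
helper name strip_suffix is hypothetical (it is not from this thread or the
standard library); it only relies on endswith and slicing, so it should work
on old and new Python versions alike.

```python
def strip_suffix(text, suffix):
    """Remove suffix from the end of text only if text really ends with it.

    Hypothetical helper for illustration; not part of the standard library.
    """
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


# rstrip treats its argument as a set of characters, not as a suffix.
chars = '.torrent'
print('exe.torrent'.rstrip(chars))         # ex
print('120.exe'.rstrip('.exe'))            # 120
print(strip_suffix('exe.torrent', chars))  # exe
```

On Python 3.9 and later, str.removesuffix does the same job as the helper
above, so the helper is only needed on older interpreters.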

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: may be a bug in string.rstrip

2007-11-22 Thread Tyler Reguly
Interesting... I tried this on three machines (Windows/Python 2.4.3,
FC4/Python 2.4.3, Ubuntu/Python 2.5.1) and I saw the same thing on each...
It's apparently not a three-character issue but rather related to specific
characters (e, n, o, r, t). A further test revealed that this affects one
additional character... the . (period/dot) character.

Here's what I did to test it (the first batch of output below is from a run
with the prefix 'abcdefg' instead of 'ab'):

for i in range(97,123):
 x = 'ab' + chr(i) + '.torrent'
 y = x.rstrip('.torrent')
 print ('Before: %s (%d)\tAfter: %s (%d)' % (x, len(x), y, len(y) ) )

Before: abcdefga.torrent (16)   After: abcdefga (8)
Before: abcdefgb.torrent (16)   After: abcdefgb (8)
Before: abcdefgc.torrent (16)   After: abcdefgc (8)
Before: abcdefgd.torrent (16)   After: abcdefgd (8)
Before: abcdefge.torrent (16)   After: abcdefg (7)
Before: abcdefgf.torrent (16)   After: abcdefgf (8)
Before: abcdefgg.torrent (16)   After: abcdefgg (8)
Before: abcdefgh.torrent (16)   After: abcdefgh (8)
Before: abcdefgi.torrent (16)   After: abcdefgi (8)
Before: abcdefgj.torrent (16)   After: abcdefgj (8)
Before: abcdefgk.torrent (16)   After: abcdefgk (8)
Before: abcdefgl.torrent (16)   After: abcdefgl (8)
Before: abcdefgm.torrent (16)   After: abcdefgm (8)
Before: abcdefgn.torrent (16)   After: abcdefg (7)
Before: abcdefgo.torrent (16)   After: abcdefg (7)
Before: abcdefgp.torrent (16)   After: abcdefgp (8)
Before: abcdefgq.torrent (16)   After: abcdefgq (8)
Before: abcdefgr.torrent (16)   After: abcdefg (7)
Before: abcdefgs.torrent (16)   After: abcdefgs (8)
Before: abcdefgt.torrent (16)   After: abcdefg (7)
Before: abcdefgu.torrent (16)   After: abcdefgu (8)
Before: abcdefgv.torrent (16)   After: abcdefgv (8)
Before: abcdefgw.torrent (16)   After: abcdefgw (8)
Before: abcdefgx.torrent (16)   After: abcdefgx (8)
Before: abcdefgy.torrent (16)   After: abcdefgy (8)
Before: abcdefgz.torrent (16)   After: abcdefgz (8)



Before: aba.torrent (11)   After: aba (3)
Before: abb.torrent (11)   After: abb (3)
Before: abc.torrent (11)   After: abc (3)
Before: abd.torrent (11)   After: abd (3)
Before: abe.torrent (11)   After: ab (2)
Before: abf.torrent (11)   After: abf (3)
Before: abg.torrent (11)   After: abg (3)
Before: abh.torrent (11)   After: abh (3)
Before: abi.torrent (11)   After: abi (3)
Before: abj.torrent (11)   After: abj (3)
Before: abk.torrent (11)   After: abk (3)
Before: abl.torrent (11)   After: abl (3)
Before: abm.torrent (11)   After: abm (3)
Before: abn.torrent (11)   After: ab (2)
Before: abo.torrent (11)   After: ab (2)
Before: abp.torrent (11)   After: abp (3)
Before: abq.torrent (11)   After: abq (3)
Before: abr.torrent (11)   After: ab (2)
Before: abs.torrent (11)   After: abs (3)
Before: abt.torrent (11)   After: ab (2)
Before: abu.torrent (11)   After: abu (3)
Before: abv.torrent (11)   After: abv (3)
Before: abw.torrent (11)   After: abw (3)
Before: abx.torrent (11)   After: abx (3)
Before: aby.torrent (11)   After: aby (3)
Before: abz.torrent (11)   After: abz (3)


On 11/22/07, kyo guan <[EMAIL PROTECTED]> wrote:
>
> Hi :
>
> Please look at this code:
>
> >>> 'exe.torrent'.rstrip('.torrent')
> 'ex'    <-- it should be 'exe', why?
>
> but this gives the right answer:
>
> >>> '120.exe'.rstrip('.exe')
> '120'   <-- this is the right value.
>
> There seems to be a bug in rstrip; lstrip doesn't have this problem.
>
>
>
> Kyo.

Re: Outbound HTML Authentication

2007-11-29 Thread Tyler Reguly
Hello,

You should probably read the HTTP RFC if you're going to write a screen
scraper... but either way.

401 tells you that auth is required. There are several types of
"server-based auth" (different from form-based auth)... They include:

   - Basic
   - Digest
   - NTLM (or Negotiate)
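
As a rough sketch of how a script might tell which of these it is facing:
the server names the scheme at the start of each WWW-Authenticate header it
sends with the 401. The function name auth_schemes and the sample header
values are made up for illustration, and the code assumes one challenge per
header value (a single header carrying several comma-separated challenges
is not handled).

```python
def auth_schemes(www_authenticate_values):
    """Return the lower-cased scheme name of each challenge.

    Assumes one challenge per header value, e.g. 'Basic realm="x"'.
    """
    schemes = []
    for value in www_authenticate_values:
        parts = value.strip().split(None, 1)
        if parts:
            schemes.append(parts[0].lower())
    return schemes


# Example header values a server might send with a 401 (illustrative only).
headers = ['Negotiate', 'NTLM', 'Basic realm="intranet"']
print(auth_schemes(headers))  # ['negotiate', 'ntlm', 'basic']
```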

Basic is easy to implement... Digest is slightly more complex... NTLM
requires that you have an understanding of how NTLM works in general.
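
To show why Basic is the easy one, here is a minimal sketch of building the
header by hand. The function name basic_auth_header is hypothetical; the
user name and password are the usual documentation example, not real
credentials.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an Authorization header for Basic auth."""
    credentials = ('%s:%s' % (username, password)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    return 'Basic ' + token


print(basic_auth_header('Aladdin', 'open sesame'))
# Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

The credentials are only base64-encoded, not encrypted, so this is best kept
to HTTPS connections. The standard library also ships ready-made handlers
for Basic and Digest (HTTPBasicAuthHandler and HTTPDigestAuthHandler in
urllib2, or urllib.request on Python 3), which may be simpler than setting
the header yourself.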

There are a couple things you can do...

   1. Find a public implementation of NTLM in Python (I don't believe one
   exists... but if it does, I'd love it if someone could point it out)
   2. Use the NTLM Authentication Proxy Server (
   http://www.geocities.com/rozmanov/ntlm/ )
   3. Follow Ronald Tschalär's write-up on NTLM over HTTP and implement
   it yourself ( http://www.innovation.ch/personal/ronald/ntlm.html )

I actually did this recently for a project that I'm working on... and looked
fairly deeply at Ronald's write-up... It is fairly decent... and I may
actually implement it at some point in the future as a released Python
module... for now, though, you'll have to do it yourself.

--
Tyler Reguly
http://www.computerdefense.org



On 11/29/07, Mudcat <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I was trying to do a simple web scraping tool, but the network they
> use at work does some type of internal authentication before it lets
> the request out of the network. As a result I'm getting the '401 -
> Authentication Error' from the application.
>
> I know when I use a web browser or other application that it uses the
> information from my Windows AD to validate my user before it accesses
> a website. I'm constantly getting asked to enter in this info before I
> use Firefox, and I assume that IE picks it up automatically.
>
> However I'm not sure how to tell the request that I'm building in my
> python script to either use the info in my AD account or enter in my
> user/pass automatically.
>
> Anyone know how to do this?
>
> Thanks