Re: [Tutor] Problem with multiple input

2009-07-29 Thread Peter Anderson

Alan,

Thanks heaps for the quick feedback. I think you are right about 
multiple inputs on the one line. Looks elegant but is hard work.


Regards,
Peter
--
*Peter Anderson*
There is nothing more difficult to take in hand, more perilous to 
conduct, or more uncertain in its success, than to take the lead in the 
introduction of a new order of things—Niccolo Machiavelli, /The Prince/, 
ch. 6

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


[Tutor] curve fitting

2009-07-29 Thread Bala subramanian
Friends,

I wish to do some curve fitting with python by defining my own equations.
Could someone please give some guidance or examples on doing the same.

Thanks,
Bala


Re: [Tutor] searching for an ip and subnets in a dir of csv's

2009-07-29 Thread Wayne
On Tue, Jul 28, 2009 at 9:36 PM, Nick Burgess wrote:

> Good evening List,
>
> I am trying to have this script search for an IP or nearest subnet
> match in a dir of csv's. It works with an absolute match,  It will be
> receiving a whole IP address, so if there is no absolute match no data
> is returned, however if it is listed somewhere in a subnet I want to
> know.   I need it to search and loop through the 32 bit, then the 24
> bit, 16, 8.  I am stumped at how to cut of the numbers on the right
> with the period as the delimiter. I have looked at strip, split need a
> clue what else to try.
>
> Thanks!
>
>
>
>        args.append(arg.strip("\n"))
> args = list(set(args))
> for arg in args:
>    for f in files:
>        pattern = re.compile(sys.argv[1])   # <-- I am thinking loop 4 times and do something different here
>        ff = csv.reader(open (f, 'rb'), delimiter=' ', quotechar='|')
>        for row in ff:
>            if any(pattern.search(cell) for cell in row):
>                print f
>                print ', '.join(row)


It's often helpful to provide example data and solutions so we know exactly
what you're looking for. If we understand quickly, chances are we'll reply
just as quickly.

If you have the IPs

192.168.1.1
192.168.1.10
192.168.2.2
192.169.1.1

And you were looking for 192.168.1.2, do you want it to return nothing? Or
both 192.168.1.1 and 192.168.1.10? Or only 192.168.1.1 as it's the closest
match?

Also, I don't know if I'd do re.compile on the raw sys.argv data, but
perhaps filter it - just compiling is error prone and possibly a security
hole. Even if you're doing it just for yourself, I'd still try to make it as
break-proof as possible.

HTH,
Wayne


Re: [Tutor] concerning help() function

2009-07-29 Thread Wayne
On Wed, Jul 29, 2009 at 1:29 AM, David  wrote:

> Dear Tutors,
>
> whenever I make use of the help() function, I have a good chance of
> getting an error. I have not yet understood this tool very well.
>
> Take the modules operator and random as examples. The former is
> built-in, the latter not.
> Do I wish to see the help files, I have to use a different syntax:
>
> help(random)
> help('operator')
>
> I figured this by trial and error, and I am keen to find out when the
> help() function is to be supplied with a string, and when no '' is
> required. It certainly does not seem intuitive to me!


>>> operator
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'operator' is not defined
>>> help('operator')

>>> random
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'random' is not defined
>>> help(random)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'random' is not defined
>>> help('random')
(produces the helpfile)

This goes back to Python's objects. If a name is bound to an object in
Python, you can pass it to help() directly; otherwise you need to pass a
string.

>>> operator = 'foo'
>>> help(operator)
 Welcome to ASCII Assassin!
 (1) Start Game
 (2) Instructions

:>

I guess that's an easter egg... so I'm off to play ASCII Assassin!

HTH,
Wayne

-- 
To be considered stupid and to be told so is more painful than being called
gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness,
every vice, has found its defenders, its rhetoric, its ennoblement and
exaltation, but stupidity hasn’t. - Primo Levi


Re: [Tutor] searching for an ip and subnets in a dir of csv's

2009-07-29 Thread Nick Burgess
> And you were looking for 192.168.1.2, do you want it to return nothing? Or
> both 192.168.1.1 and 192.168.1.10? Or only 192.168.1.1 as it's the closest
> match?

I would like it to return both, all possible matches.

The data looks something like this in the CSV's,

Server foo.bar.org  10.2.2.2  such&such org
Apache farm subnet  10.2.3.0/24  so&so corp.

the format is random

The script will be run from a third party tool so only one argument
can be passed to it, which will be an entire IP address.  If within the
CSV's there is no 32 bit match there could be a subnet that might
match, that's why I need it to loop over the dots.  If there is a 32
bit and a subnet match both will be returned, which is desirable.

I figured out a way to cut off the ends from the dots using the following code,

string.rsplit('.',1)[:1];

looping over this and incrementing the first digit cuts off the octets
from the IP correctly.  My problem now is that the rsplit returns a
list and I need a string.  I am stuck on how to use the rsplit.
Thanks.
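
On the list-versus-string problem described above: slicing with [:1] returns a one-element list, while indexing with [0] returns the string itself. Below is a minimal sketch in Python 3 syntax; the helper name octet_prefixes and the sample address are made up for illustration and are not part of the original script.

```python
def octet_prefixes(ip):
    """Return the address followed by its progressively shorter dotted prefixes."""
    prefixes = [ip]
    while "." in prefixes[-1]:
        # rsplit returns a list; [0] picks the string left of the last dot
        prefixes.append(prefixes[-1].rsplit(".", 1)[0])
    return prefixes


# [:1] keeps the list wrapper, [0] unwraps it
as_list = "10.2.3.4".rsplit(".", 1)[:1]
as_string = "10.2.3.4".rsplit(".", 1)[0]

print(as_list)
print(as_string)
print(octet_prefixes("10.2.3.4"))
```

Each prefix in the returned list is a plain string, so it can be passed straight to a search.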


ipAdres = re.compile(sys.argv[1])
print ipAdres.pattern
print ipAdres.pattern.rsplit('.',1)[:1]
for arg in args:
    for f in files:
        ff = csv.reader(open (f, 'rb'), delimiter=' ', quotechar='|')
        for row in ff:
            if any(ipAdres.search(cell) for cell in row):
                print f
                print ', '.join(row)













Re: [Tutor] searching for an ip and subnets in a dir of csv's

2009-07-29 Thread Wayne
On Wed, Jul 29, 2009 at 7:43 AM, Nick Burgess wrote:

>  And you were looking for 192.168.1.2, do you want it to return nothing? Or
> both 192.168.1.1 and 192.168.1.10? Or only 192.168.1.1 as it's the closest
> match?
>
> I would like it to return both, all possible matches.
>
> The data looks something like this in the CSV's,
>
> Server foo.bar.org  10.2.2.2such&such org
> Apache farm subnet  10.2.3.0/24so&so corp.
>
> the format is random
>
> The script will be ran from a third party tool so only one argument
> can be passed to it which will be an entire IP address.  If within the
> CSV's there is no 32 bit match there could be a subnet that might
> match, thats why I need it to loop over the dots.  If there is a 32
> bit and a subnet match both will be returned which is desirable .


http://www.velocityreviews.com/forums/t356058-regular-expressions-and-matches.html

That will give you the regex for matching any IP addresses. If you use the
findall method
http://docs.python.org/library/re.html

then it will give you a list of IPs (as strings).
If you packed each of these IPs into a dictionary you could use this to get
the key:

for ip in iplist:
    key = ip[:ip.rfind('.')]  # Gets everything but the last octet
    if mydict.get(key):
        mydict[key].append(ip)
    else:
        mydict[key] = [ip]

Then you just have to do something like
mydict.get(sys.argv[1][:sys.argv[1].rfind('.')])

Which will return None or the list of IPs.

Although now that I'm thinking a little more about it, that's probably
excessive for your needs - you can just do this:

matches = []
# Keep the trailing dot so that '192.168.1.' does not also match 192.168.10.x
ipnet = sys.argv[1][:sys.argv[1].rfind('.') + 1]

for ip in iplist:
    if ip.startswith(ipnet):
        matches.append(ip)

and that should give you a list of all IPs within that same subnet.
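
Putting the findall and startswith ideas together might look like the sketch below (Python 3 syntax). The pattern IPV4 is a deliberately loose stand-in, not the regex from the linked page, and the function name and sample text are assumptions for illustration only.

```python
import re

# Loose IPv4 matcher: four dot-separated groups of 1-3 digits (no range check).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def same_slash24(text, target):
    """Return every address in text that shares the first three octets of target."""
    prefix = target[:target.rfind(".") + 1]  # e.g. '10.2.3.' including the dot
    return [ip for ip in IPV4.findall(text) if ip.startswith(prefix)]


sample = (
    "Server foo.bar.org 10.2.2.2 such&such org\n"
    "Apache farm subnet 10.2.3.0/24 so&so corp.\n"
    "Other box 10.2.30.9 somewhere\n"
)
print(same_slash24(sample, "10.2.3.7"))
```

Keeping the trailing dot in the prefix is what stops 10.2.30.9 from being reported for a 10.2.3.x lookup.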

HTH,
Wayne


Re: [Tutor] curve fitting

2009-07-29 Thread Eike Welk
On Wednesday 29 July 2009, Bala subramanian wrote:
> Friends,
>
> I wish to do some curve fitting with python by defining my own
> equations. Could someone please give some guidance or examples on
> doing the same.

You can use the Numpy/Scipy libraries for that. I think they have 
examples for curve fitting on their website. But unfortunately the 
website is down (or my Internet is broken).

http://www.scipy.org/

There are also special mailing lists for Numpy/Scipy/Matplotlib users, 
to which you should subscribe. This one would be good for your 
question:
http://projects.scipy.org/mailman/listinfo/scipy-user


If I understand you right, you have a formula with some parameters. 
Now you are searching for parameter values so that the formula really 
goes through the data points. This is a task for optimization 
functions.
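
To make the idea concrete, here is a small pure-Python sketch of fitting one parameter by minimizing the sum of squared errors. The model exp(-k*x), the synthetic data, and the golden-section search are all assumptions chosen for illustration; SciPy's scipy.optimize module provides far more capable routines for real work.

```python
import math


def sum_squared_error(k, xs, ys):
    """Sum of squared differences between the model exp(-k*x) and the data."""
    return sum((math.exp(-k * x) - y) ** 2 for x, y in zip(xs, ys))


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a one-parameter unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


# Synthetic, noise-free data generated with k = 1.3
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(-1.3 * x) for x in xs]

k_fit = golden_section_minimize(lambda k: sum_squared_error(k, xs, ys), 0.0, 5.0)
print(round(k_fit, 4))
```

The same pattern (write an error function of the parameters, hand it to a minimizer) carries over to models with several parameters.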

Kind regards,
Eike.


Re: [Tutor] curve fitting

2009-07-29 Thread Skipper Seabold
On Wed, Jul 29, 2009 at 9:42 AM, Eike Welk wrote:
> On Wednesday 29 July 2009, Bala subramanian wrote:
>> Friends,
>>
>> I wish to do some curve fitting with python by defining my own
>> equations. Could someone please give some guidance or examples on
>> doing the same.
>

What kind of curve fitting exactly?  Linear equations?  Can you
provide an example?

> You can use the Numpy/Scipy libraries for that. I think they have
> examples for curve fitting on their website. But unfortunately the
> website is down (or my Internet is broken).
>
> http://www.scipy.org/
>

There were some problems with the site yesterday as well.  Scipy would
be a good place to start.  I am in the home stretch of completing a
google summer of code project to integrate some mostly linear
statistical models (different flavors of least squares fitting,
generalized linear models, robust statistics, generalized additive
models, etc.) into SciPy, which might also be of interest to you
depending on your needs (some info here
).  There are also some "cookbook"
examples from the scipy page that show some recipes for doing some
basic curve fitting (OLS, interpolation) with the available tools.

> There are also special mailing lists for Numpy/Scipy/Matplotlib users,
> to which you should subscribe. This one would be good for your
> question:
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
>
> If I understand you right, you have a formula with some parameters.
> Now you are searching for parameter values so that the formula really
> goes through the data points. This is a task for optimization
> functions.
>
> Kind regards,
> Eike.


Re: [Tutor] searching for an ip and subnets in a dir of csv's

2009-07-29 Thread Martin A. Brown
Hello,

 : The script will be ran from a third party tool so only one 
 : argument can be passed to it which will be an entire IP address.  
 : If within the CSV's there is no 32 bit match there could be a 
 : subnet that might match, thats why I need it to loop over the 
 : dots.  If there is a 32 bit and a subnet match both will be 
 : returned which is desirable .
 :
 : I figured out a way to cut of the ends from the dots using the 
 : following code,
 : 
 : string.rsplit('.',1)[:1];

You, very likely, know something about your data that I do not, 
however, I couldn't help but notice that your choice of octet 
boundaries is not really quite sufficient when trying to find the 
network in which a given host would live.  The /24, /16 and /8 
boundaries are used in classful routing (class C, class B and class 
A respectively), which has largely been supplanted by CIDR.

Classless InterDomain Routing (CIDR) allows for netmasks which do 
not fall on the octet boundaries.  Consider the IP 66.102.1.104.  
Here are some of the network addresses (prefixes) that could contain 
that host:

  66.102.1.104/32
  66.102.1.104/31
  66.102.1.104/30
  66.102.1.104/29
  66.102.1.96/28
  66.102.1.96/27
  66.102.1.64/26
  66.102.1.0/25
  66.102.1.0/24
  66.102.0.0/23
  66.102.0.0/22

If you found any of these in your input data, that would indicate 
that the host 66.102.1.104 would be inside that network.  With that 
said, here's a little function (my bad reimplementation of something 
probably in an IP handling library like ipaddr [0]) that will return 
a list of prefixes in string form which you can search for in your 
input text.

Remember when searching for IPs in their string form that you should 
anchor (at absolute least) at the beginning of the string.  In 
short, without the anchoring, you'll get three results, where only 
one should actually match.  (Imagine looking for '4.17.112.0/24' 
without an anchor at the front.  You'd also match '24.17.112.0/24' 
and '174.17.112.0/24'.)
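
A minimal sketch of the anchoring point, in Python 3 syntax. The helper name anchored_pattern is hypothetical, and the lookaround approach is just one way to anchor; it is not taken from the original message.

```python
import re


def anchored_pattern(prefix):
    """Compile a pattern matching prefix only when no digit or dot precedes it
    and no digit follows it."""
    return re.compile(r"(?<![\d.])" + re.escape(prefix) + r"(?!\d)")


candidates = ["4.17.112.0/24", "24.17.112.0/24", "174.17.112.0/24"]

unanchored = [c for c in candidates if "4.17.112.0/24" in c]
anchored = [c for c in candidates if anchored_pattern("4.17.112.0/24").search(c)]

print(unanchored)
print(anchored)
```

re.escape matters here as well: without it the dots in the prefix would match any character.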

 : looping over this and incrementing the first digit cuts of the 
 : octets from the IP correctly.  My problem now is that the rsplit 
 : returns a list and I need a string.  I am stuck on how to use the 
 : rsplit. Thanks.
 : 
 : 
 : ipAdres = re.compile(sys.argv[1])
 : print ipAdres.pattern
 : print ipAdres.pattern.rsplit('.',1)[:1]
 : for arg in args:
 :     for f in files:
 :         ff = csv.reader(open (f, 'rb'), delimiter=' ', quotechar='|')
 :         for row in ff:
 :             if any(ipAdres.search(cell) for cell in row):
 :                 print f
 :                 print ', '.join(row)

Good luck in your text searching!

-Martin

 [0] An IP address handling library that is much richer than the 
 hastily-written snippet below:
   http://code.google.com/p/ipaddr-py/


  #! /usr/bin/env python
  #
  # Print every network prefix, from /32 down to /8, that contains each
  # IP address given on the command line.

  import sys
  import struct
  import socket

  maxmasklen = 32

  def list_networks( ip, minmasklen=8, defaultmasklen=32 ):
    if ip is None:
      raise ValueError( "Supply an IP to this function" )
    mask = defaultmasklen
    # dotted quad -> 1-tuple holding the address as a 32-bit integer
    ip = struct.unpack('>L',socket.inet_aton( ip ))
    networks = list()
    while mask >= minmasklen:
      bits  = maxmasklen - mask
      # clear the low-order host bits for this mask length
      ip    = ( ( ( ip[0] >> bits ) << bits ), )
      net   = socket.inet_ntoa( struct.pack( '>L', ip[0] ) ) + '/' + str( mask )
      networks.append( net )
      mask -= 1
    return networks

  if __name__ == '__main__':
    for ip in sys.argv[1:]:
      networks = list_networks( ip )
      for network in networks:
        print( network )
      print( '' )

  # -- end of file
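
For readers on Python 3.3 or later, the standard library's ipaddress module can produce the same list. The sketch below is an alternative under that assumption, not the code from the message; the helper name contains is made up for illustration.

```python
import ipaddress


def list_networks(ip, minmasklen=8):
    """Return every prefix from /32 down to /minmasklen that contains ip."""
    networks = []
    for masklen in range(32, minmasklen - 1, -1):
        # strict=False masks off the host bits instead of raising ValueError
        net = ipaddress.ip_network("%s/%d" % (ip, masklen), strict=False)
        networks.append(str(net))
    return networks


def contains(prefix, ip):
    """True if the host address ip lies inside the network given by prefix."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(prefix)


for network in list_networks("66.102.1.104", minmasklen=22):
    print(network)
```

If the input data can be parsed into prefixes, a membership test such as contains("66.102.0.0/22", "66.102.1.104") avoids string matching altogether.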


-- 
Martin A. Brown
http://linux-ip.net/


Re: [Tutor] concerning help() function

2009-07-29 Thread Dave Angel

David wrote:

Dear Tutors,

whenever I make use of the help() function, I have a good chance of
getting an error. I have not yet understood this tool very well.

Take the modules operator and random as examples. The former is
built-in, the latter not.
Do I wish to see the help files, I have to use a different syntax:

help(random)
help('operator')

I figured this by trial and error, and I am keen to find out when the
help() function is to be supplied with a string, and when no '' is
required. It certainly does not seem intuitive to me!

Many thanks,

David


  
As far as I know, the only difference between the two forms is that with 
the quotes, you can get help on a module that has *not* been imported, 
while if a module has been imported, you can leave off the quotes.


For those two modules, you need the quotes till you import them.  I 
don't see any difference between them, at least in Python 2.6.2.


For things that are really built-in, such as the function open(), you 
can get help without importing anything, since it's already visible.  So 
you can say help(open), or help(property).
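
A small sketch of the difference, in Python 3 syntax. It uses pydoc.render_doc, which returns the help text as a string instead of starting the pager, so the result can be inspected; the variable names are made up for illustration.

```python
import pydoc

# A string is looked up (and imported) by name, so no prior import is needed.
doc_from_name = pydoc.render_doc("random")

# A bare name must already be bound, so the module has to be imported first.
import operator
doc_from_object = pydoc.render_doc(operator)

# Real built-ins such as open are always visible.
doc_builtin = pydoc.render_doc(open)

print(doc_from_name.splitlines()[0])
print(doc_from_object.splitlines()[0])
```

Calling help(random) before "import random" fails with NameError for the same reason any other use of an unbound name would.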



DaveA


[Tutor] Web crawling!

2009-07-29 Thread Raj Medhekar
Does anyone know a good webcrawler that could be used in tandem with the 
Beautiful soup parser to parse out specific elements from news sites like BBC 
and CNN? Thanks!

-Raj




Re: [Tutor] curve fitting

2009-07-29 Thread Eike Welk
On Wednesday 29 July 2009, Eike Welk wrote:
> You can use the Numpy/Scipy libraries for that. I think they have
> examples for curve fitting on their website. 

This page contains examples:
http://www.scipy.org/Cookbook/FittingData

-
Eike.


Re: [Tutor] curve fitting

2009-07-29 Thread Skipper Seabold
On Wed, Jul 29, 2009 at 10:27 AM, Bala
subramanian wrote:
> I have to do the following:
>
> Eq 1) F = A * FB + (1-A) * FO    (1)
>
> Eq 2) A = 1/2ao [ (ao + x + bo) - { (ao + x + bo)^2 - 4 ao bo }^0.5 ]    (2)
>
> KNOWN: F, FB, FO, ao,
> UNKNOWN:  bo
>
> I have to fit the data first to Eq 1, find A and then fit it to Eq 2 to find
> bo. I know python programming but am very new to this kind of analysis. I
> would greatly appreciate if you could provide me some guidance on how to do
> the same.
>
> Thanks,
> Bala
>

It's good practice to reply-all to a mailing list and not to top post
(replying at the top of an email), so it's easier to read and follow
the discussion.

What is your fitting criterion?  I.e., are you trying to minimize the
sum of squared errors (least squares), etc.?  There are different
fitting criteria depending on what kind of data you have and where
your noise is expected to come from.

It looks like you could do this with a least squares fit if you want,
putting linear constraints on the coefficients (which should be a part
of the scipy.models soon), or you could rearrange the equations to get
your unknown in only one place.

Eq 1 would then be F = FO + A * (FB - FO)

You can do similarly for equation 2 I think but didn't look closely.
I'm still not sure what everything is in those equations.  If you ask
over on the scipy-user list and include the equations and a data
example, you will almost certainly get some more help.
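
As a quick check of the rearrangement: F = A * FB + (1 - A) * FO expands to F = FO + A * (FB - FO), which can be solved for A directly. The sketch below uses made-up numbers purely to confirm the algebra; the function name is hypothetical.

```python
def fraction_from_signal(F, FB, FO):
    """Solve F = FO + A * (FB - FO) for A."""
    return (F - FO) / (FB - FO)


# Made-up values, used only to check that both forms of Eq 1 agree
FB, FO = 10.0, 2.0
A_true = 0.25
F = A_true * FB + (1 - A_true) * FO

A_recovered = fraction_from_signal(F, FB, FO)
print(A_recovered)
```

With measured F values, the A computed this way would then be the data to fit against Eq 2.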

Cheers,

Skipper



[Tutor] Fwd: Re: curve fitting

2009-07-29 Thread Eike Welk
Forwarding Skipper's other message, which must have somehow been lost:
--  Forwarded Message  --

Subject: Re: [Tutor] curve fitting
Date: Wednesday 29 July 2009
From: Skipper Seabold 
To: Eike Welk 

On Wed, Jul 29, 2009 at 9:42 AM, Eike Welk wrote:
> On Wednesday 29 July 2009, Bala subramanian wrote:
>> Friends,
>>
>> I wish to do some curve fitting with python by defining my own
>> equations. Could someone please give some guidance or examples on
>> doing the same.
>

What kind of curve fitting exactly?  Linear equations?  Can you
provide an example?

> You can use the Numpy/Scipy libraries for that. I think they have
> examples for curve fitting on their website. But unfortunately the
> website is down (or my Internet is broken).
>
> http://www.scipy.org/
>

There were some problems with the site yesterday as well.  Scipy would
be a good place to start.  I am in the home stretch of completing a
google summer of code project to integrate some mostly linear
statistical models (different flavors of least squares fitting,
generalized linear models, robust statistics, generalized additive
models, etc.) into SciPy, which might also be of interest to you
depending on your needs (some info here
).  There are also some "cookbook"
examples from the scipy page that show some recipes for doing some
basic curve fitting (OLS, interpolation) with the available tools.

> There are also special mailing lists for Numpy/Scipy/Matplotlib users,
> to which you should subscribe. This one would be good for your
> question:
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
>
> If I understand you right, you have a formula with some parameters.
> Now you are searching for parameter values so that the formula really
> goes through the data points. This is a task for optimization
> functions.
>
> Kind regards,
> Eike.

---


[Tutor] (no subject)

2009-07-29 Thread Chris Castillo
# Module demonstrates use of lists and set theory principles

def Unite(set1, set2):  # evaluate 2 lists, join both into 1 new list
    newList = []
    for item in set1:
        newList.append(item)
    for item in set2:
        newList.append(item)
    newList.sort()
    return newList

def Intersect(set1, set2):  # evaluate 2 lists, check for commonalities, output commonalities to 1 new list
    newList = []
    for item in set1:
        if item in set1 and item in set2:
            newList.append(item)
    newList.sort()
    return newList

def Negate(set1, set2): # evaluate 2 lists, return negation of 1st list
    newList = []
    for item in set1:
        if item in set2:
            set1.remove(item)
    newList = set1
    return newList


could this be done in a more elegant fashion?


Re: [Tutor] Web crawling!

2009-07-29 Thread Arun Tomar
Hi!
Raj.

On Wed, Jul 29, 2009 at 9:29 PM, Raj Medhekar wrote:

> Does anyone know a good webcrawler that could be used in tandem with the
> Beautiful soup parser to parse out specific elements from news sites like
> BBC and CNN? Thanks!
> -Raj
>

I didn't find any good webcrawler that met my client's needs, so I wrote one
for them, but it's specific to their needs only. I can't disclose any more
details about it.

In short, I'm using my app to crawl the specific sites, then parse each page
with Beautiful Soup and extract all the links on that page, then visit the
links and search for the keywords on those pages. If a keyword occurs more
often than the specified limit, the link is considered useful and is stored
in the database; otherwise it is skipped.


-- 
Regards,
Arun Tomar
blog: http://linuxguy.in
website: http://www.solutionenterprises.co.in


Re: [Tutor] Web crawling!

2009-07-29 Thread vince spicer
On Wed, Jul 29, 2009 at 9:59 AM, Raj Medhekar wrote:

> Does anyone know a good webcrawler that could be used in tandem with the
> Beautiful soup parser to parse out specific elements from news sites like
> BBC and CNN? Thanks!
> -Raj
>
>

I have used httplib2 http://code.google.com/p/httplib2/ to crawl sites(with
auth/cookies) and lxml (html xpath) to parse out links.

but you could use builtin urllib2 to request pages if no auth/cookie support
is required, here is a simple example

import urllib2
from lxml import html

page = urllib2.urlopen("http://this.page.com")
data = html.fromstring(page.read())

all_links = data.xpath("//a[@href]") # all links on the page that have an href

for link in all_links:
    print link.attrib["href"]
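
For comparison, the link-extraction step can also be sketched with only the Python 3 standard library (where urllib2 became urllib.request). The class name LinkCollector and the sample HTML are made up for illustration; no network access is involved.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors without an href attribute
                self.links.append(href)


def extract_links(html_text):
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.links


sample_html = (
    '<html><body><a href="/news">News</a> <a name="top">no link</a> '
    '<a href="http://example.com/sport">Sport</a></body></html>'
)
print(extract_links(sample_html))
```

In a crawler, html_text would come from the fetched page, and each extracted link would be queued for the next visit.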


[Tutor] Issues with regex escaping on \{

2009-07-29 Thread gpo

My regex is being run in both Python v2.6 and v3.1
For this example, I'll give one line.  These lines will be read out of log
files.  I'm trying to get the GUID for the User ID to query a database with
it, so I'd like a sub match.  Here is the code
-
import re
line = ('>Checking Privilege for UserId: '
        '{88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: '
        '{71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0')
# Sub match is one or more characters between the first set of squigglies
# immediately following 'UserID: '
pUserID=re.compile('UserID: \{(.+)\}',re.I)

#the output is:
(re.search(pUserID,line)).group(1)
'88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: {71AD2527-8494-4654-968D-FE61E9A6A9DF'
---
Why isn't the match terminating after it finds the first \}  ?
-- 
View this message in context: 
http://www.nabble.com/Issues-with-regex-escaping-on-%5C%7B-tp24724060p24724060.html
Sent from the Python - tutor mailing list archive at Nabble.com.



Re: [Tutor] Parsing Bible verses

2009-07-29 Thread tearsfornations

hey, i recently was working on some custom bible software and created an
sqlite3 database of a few different bibles, one of which is the American
Standard Version (ASV) which you can download here

http://tearsfornations.hostcell.net/bibledoth/asv.bible.sqlite3



here is the create tables

CREATE TABLE bible_info
(
   description VARCHAR(55),
   abbreviation VARCHAR(7),
   comments TEXT,
   font VARCHAR(15),
   format VARCHAR(10),
   strongs BOOLEAN
);

CREATE TABLE [bible_verses] (
[b] int(4)  NULL,
[c] int(4)  NULL,
[v] int(4)  NULL,
[t] text  NULL,
PRIMARY KEY ([b],[c],[v])
);
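
A minimal sketch of querying such a table from Python with the sqlite3 module. It builds a throwaway in-memory database with the same bible_verses columns rather than opening the downloaded file, and it assumes that b, c, v and t stand for book, chapter, verse and text; the inserted row is placeholder text, not real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the downloaded .sqlite3 file
conn.execute(
    "CREATE TABLE bible_verses ("
    " b int(4), c int(4), v int(4), t text,"
    " PRIMARY KEY (b, c, v))"
)
conn.execute(
    "INSERT INTO bible_verses VALUES (?, ?, ?, ?)",
    (1, 1, 1, "placeholder text for book 1, chapter 1, verse 1"),
)


def get_verse(conn, book, chapter, verse):
    """Return the text of one verse, or None if the reference is not found."""
    row = conn.execute(
        "SELECT t FROM bible_verses WHERE b = ? AND c = ? AND v = ?",
        (book, chapter, verse),
    ).fetchone()
    return row[0] if row else None


print(get_verse(conn, 1, 1, 1))
```

Because (b, c, v) is the primary key, a lookup by reference like this one uses the index and stays fast for read-only use.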





John Fouhy wrote:
> 
> 2009/5/26 Eduardo Vieira :
>> Now, a little farther on the topic of a Bible database. I'm not sure
>> how I should proceed. I don't really have the db file I need, I will
>> have to generate it somehow, from a bible software, because the
>> version I want is for Portuguese. I have found a bible in sql, a bible
>> in MS Access to give me some ideas on how to structure my database.
>> But my question is do I really need a sql database for my need, since
>> I will be only reading from it? Never adding or updating. One like
>> sqlite. Would a pickled dictionary from Bible_reference to verse be
>> faster? Should I work with anydbm?
> 
> If you don't want to use a database, you should probably use the shelve
> module.
> 
> However, there is really no particular reason not to use a relational
> database.  It seems a solution fairly well suited to your problem.
> Python has a database supplied these days: sqlite3.  I suggest using
> that, rather than MS Access.



Re: [Tutor] Issues with regex escaping on \{

2009-07-29 Thread vince spicer
On Wed, Jul 29, 2009 at 11:35 AM, gpo  wrote:

>
> My regex is being run in both Python v2.6 and v3.1
> For this example, I'll give one line.  This lines will be read out of log
> files.  I'm trying to get the GUID for the User ID to query a database with
> it, so I'd like a sub match.  Here is the code
> -
> import re
> line = '>Checking Privilege for UserId:
> {88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId:
> {71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0'
> pUserID=re.compile('UserID: \{(.+)\}',re.I)  #Sub match is one or more
> characters between the first set of squigglies immediately following
> 'UserID: '
>
> #the output is:
> (re.search(pUserID,line)).group(1)
> '88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId:
> {71AD2527-8494-4654-968D-FE61E9A6A9DF'
> ---
> Why isn't the match terminating after it finds the first \}  ?



Your grouping (.+) appears to be greedy; you can make it non-greedy with a
question mark.

EX:

pUserID=re.compile(r'UserID:\s+\{(.+?)\}',re.I)
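
The difference can be seen side by side in the sketch below, which reuses the log line from the question. The third pattern, built on a negated character class, is an additional option not mentioned in the thread; all variable names are illustrative.

```python
import re

line = ('>Checking Privilege for UserId: '
        '{88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: '
        '{71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0')

greedy = re.compile(r'UserID: \{(.+)\}', re.I)          # runs to the LAST }
lazy = re.compile(r'UserID:\s+\{(.+?)\}', re.I)         # stops at the FIRST }
negated = re.compile(r'UserID:\s+\{([^}]+)\}', re.I)    # cannot cross a } at all

greedy_match = greedy.search(line).group(1)
lazy_match = lazy.search(line).group(1)
negated_match = negated.search(line).group(1)

print(greedy_match)
print(lazy_match)
print(negated_match)
```

Raw strings (the r prefix) keep backslashes such as \s and \{ from being interpreted by Python before the regex engine sees them.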

Vince


Re: [Tutor] manipulating lists

2009-07-29 Thread Dave Angel

(You omitted a title, so I made one up.  Hope it's okay)

Chris Castillo wrote:

# Module demonstrates use of lists and set theory principles

def Unite(set1, set2):  # evaluate 2 lists, join both into 1 new list
    newList = []
    for item in set1:
        newList.append(item)
    for item in set2:
        newList.append(item)
    newList.sort()
    return newList

def Intersect(set1, set2):  # evaluate 2 lists, check for commonalities, output commonalities to 1 new list
    newList = []
    for item in set1:
        if item in set1 and item in set2:
            newList.append(item)
    newList.sort()
    return newList

def Negate(set1, set2): # evaluate 2 lists, return negation of 1st list
    newList = []
    for item in set1:
        if item in set2:
            set1.remove(item)
    newList = set1
    return newList


could this be done in a more elegant fashion?

  
Note:  don't ever use tabs in source code.  Use spaces to indent - 
preferably 4 each.  If you accidentally mix spaces and tabs, you could 
have very mysterious bugs.


Just by inspection - untested code:

#Use docstring.  Use extend rather than append in loop.  Use slice 
#notation to make a copy of the items in the list


def Unite(set1, set2):
    """ evaluate 2 lists, join both into 1 new list """
    newList = set1[:]
    newList.extend(set2)
    return sorted(newList)

def Intersect(set1, set2):
    """ evaluate 2 lists, check for commonalities,
    output commonalities to 1 new list """
    newList = []
    for item in set1:
        if item in set2:
            newList.append(item)
    return sorted(newList)

#This one was buggy;  it modified set1 in place, and returned another
#reference to it
def Negate(set1, set2):
    """ evaluate 2 lists, return items that are in first list, but not in second """
    newList = set1[:]
    for item in set2:
        #remove() raises ValueError if the item is absent, so check first;
        #the while loop also clears duplicates
        while item in newList:
            newList.remove(item)
    return newList #Question:  did you mean to sort this one too?
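
A side note on why the original Negate misbehaves: besides changing the caller's list, removing items from a list while looping over it makes the loop skip elements. A small sketch follows; negate_original is just a renamed copy of the posted function, and the sample data is invented.

```python
def negate_original(set1, set2):
    """Copy of the posted Negate, kept buggy on purpose for the demo."""
    for item in set1:
        if item in set2:
            set1.remove(item)   # shifts later items left while we iterate
    return set1

data = [1, 2, 3, 4]
result = negate_original(data, [1, 2])

print(result)          # 2 survives: it slid into slot 0 after 1 was removed
print(result is data)  # the caller's list was modified and handed back
```

Working on a copy, or building a fresh list, avoids both problems.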

Now, could these be improved further?  Absolutely.  Using list comprehensions 
or similar techniques, some of these loops could be one-liners.
But I'm guessing you're a beginner in Python, and it's more important that you 
get used to writing readable, robust code than clever code.

One thing I do recommend, even at this stage, is to study set().
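
As a rough idea of what that looks like, here is a short sketch using the builtin set type. The sample lists are invented, and sorted() is only there to get a predictable list back, since sets themselves have no order.

```python
set1 = [3, 1, 4, 1, 5]
set2 = [5, 9, 2, 6, 5, 3]

# Converting to set drops duplicates and enables the set operators.
a, b = set(set1), set(set2)

union = sorted(a | b)          # items in either list
intersection = sorted(a & b)   # items in both lists
difference = sorted(a - b)     # items in the first list but not the second

print(union, intersection, difference)
```

The same operations are also available as methods: a.union(b), a.intersection(b) and a.difference(b).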

DaveA




Re: [Tutor] Issues with regex escaping on \{

2009-07-29 Thread Bill Campbell
On Wed, Jul 29, 2009, vince spicer wrote:
>On Wed, Jul 29, 2009 at 11:35 AM, gpo  wrote:
>
>>
>> My regex is being run in both Python v2.6 and v3.1
>> For this example, I'll give one line.  This lines will be read out of log
>> files.  I'm trying to get the GUID for the User ID to query a database with
>> it, so I'd like a sub match.  Here is the code
>> -
>> import re
>> line = '>Checking Privilege for UserId:
>> {88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId:
>> {71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0'
>> pUserID=re.compile('UserID: \{(.+)\}',re.I)  #Sub match is one or more
>> characters between the first set of squigglies immediately following
>> 'UserID: '
>>
>> #the output is:
>> (re.search(pUserID,line)).group(1)
>> '88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId:
>> {71AD2527-8494-4654-968D-FE61E9A6A9DF'
>> ---
>> Why isn't the match terminating after it finds the first \}  ?
...
>your grouping (.+) appears to be greedy, you can make it non-greedy with a
>question mark

As a general rule, it's a Good Idea(tm) to write regular
expressions using the raw quote syntax.  Instead of:

re.compile('UserID: \{(.+)\}',...)

Use:

re.compile(r'UserID: \{(.+)\}',...)

The alternative is to backwhack any special characters with an
appropriate number of ``\'' characters, whatever that may be.
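
A small sketch of why this matters, using \b because it means something different to Python strings than to the regex engine. The sample text and variable names are made up for illustration.

```python
import re

plain = '\bcat\b'   # Python turns each \b into a backspace character
raw = r'\bcat\b'    # raw string: the backslashes reach the regex engine

print(len(plain), len(raw))   # the raw version keeps the two backslashes

text = 'a cat sat'
plain_hit = re.search(plain, text)   # looks for backspace + cat + backspace
raw_hit = re.search(raw, text)       # looks for the whole word "cat"

print(plain_hit)
print(raw_hit.group(0))
```

Sequences such as \{ happen to survive in a plain string because Python does not recognise them as escapes, but recent Python versions warn about that, so the raw form is the safer habit.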

Bill
-- 
INTERNET:   b...@celestial.com  Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
Voice:  (206) 236-1676  Mercer Island, WA 98040-0820
Fax:(206) 232-9186  Skype: jwccsllc (206) 855-5792

Government is a broker in pillage, and every election is a sort of advance
auction in stolen goods. -- H.L. Mencken 


Re: [Tutor] concerning help() function

2009-07-29 Thread Alan Gauld


"David"  wrote 


> whenever I make use of the help() function, I have a good chance of
> getting an error. I have not yet understood this tool very well.


You need to import the module to make the name visible.


> help(random)
> help('operator')
>
> I figured this by trial and error,


error mainly.

What you are seeing is the help for the string 'operator' - which 
is the same as the help for any other string - the builtin string 
methods. Compare it with



help('')


HTH,


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] concerning help() function

2009-07-29 Thread Alan Gauld

> help('operator')
>
> I figured this by trial and error, and I am keen to find out when the


Oops, always try before posting! And don't assume...

I just did and you are right, it does give help on an unimported module!
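
For anyone wanting to check this without the interactive pager, here is a small sketch. It uses pydoc.render_doc, which returns the help text as a string; that substitution is mine, not something from the thread, and the variable names are invented.

```python
import pydoc

# A string naming a module is looked up by pydoc itself: no import needed.
text = pydoc.render_doc('operator')
first_line = text.splitlines()[0]
print(first_line)

# A bare name must already exist in your namespace, otherwise Python
# raises NameError before the help machinery is even called.
try:
    pydoc.render_doc(random)
except NameError as exc:
    name_error = str(exc)
    print(name_error)

# After the import the bare name works too.
import random
random_doc = pydoc.render_doc(random)
print(random_doc.splitlines()[0])
```

So help('operator') and help(operator) both work, but the second form needs the import first.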

Sorry, how embarrassing! :-)

Alan G





Re: [Tutor] (no subject)

2009-07-29 Thread Alan Gauld


"Chris Castillo"  wrote 


# Module demonstrates use of lists and set theory principles

could this be done in a more elegant fashion?


Yes, use the Python set type.

Alan G



Re: [Tutor] Issues with regex escaping on \{

2009-07-29 Thread wesley chun
in addition to the good advice from vince (watch out for greediness
regardless of what you're looking for) and bill (use raw strings...
regexes are one of their primary use cases!), another thing that may
help with the greediness issue are the character sets you're using
inside to match with.

for example, if you know the string inside the curly braces, e.g.,
{88F96ED2-D471-DE11-95B6-0050569E7C88}, is made up of numbers, letters and
dashes, you can use "[\w-]+" which means one or more "word" characters
(letters, digits or underscore) plus the dash. the use of ".+" is just
*asking* for greedy gobbling up of characters like "}" because "." means
match any single character (other than a newline, by default).
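
a quick sketch of that idea, assuming the log line from the original post. the second pattern, using the negated class [^}], is an extra variation that was not mentioned above; all variable names are invented.

```python
import re

line = ('>Checking Privilege for UserId: '
        '{88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: '
        '{71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0')

# [\w-]+ cannot match '}', so even a greedy + stops at the first brace.
word_chars = re.compile(r'UserID:\s*\{([\w-]+)\}', re.I)

# Variation (my assumption): accept anything except the closing brace.
not_brace = re.compile(r'UserID:\s*\{([^}]+)\}', re.I)

guid_a = word_chars.search(line).group(1)
guid_b = not_brace.search(line).group(1)

print(guid_a)
print(guid_b)
```

either way the character set itself keeps the match inside the first pair of braces, with no need for the non-greedy "?".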

hope this helps!
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2007,2001
"Python Fundamentals", Prentice Hall, (c)2009
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
python training and technical consulting
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com


[Tutor] implementing set operations [was Re: (no subject)]

2009-07-29 Thread wesley chun
> could this be done in a more elegant fashion?

in addition to alan's obvious solution, if you wanted to roll your
own, you have a good start. my comments below.


> def Unite(set1, set2):          # evaluate 2 lists, join both into 1 new list
>        newList = []
>        for item in set1:
>                newList.append(item)
>        for item in set2:
>                newList.append(item)
>        newList.sort()
>        return newList

- as a style point, use lowercase for all function and variable names.
Capping the first letter is recommended only for class names.

- changing it from "unite" to "union" would make it slightly more accurate

- why the sort()? sets, are by default, unordered data structures...
think  of drawing its elements inside the circles in Venn diagrams.

- if you do end up using Python sets, use the set() factory function
in 2.x, i.e., newSet = set(set1) | set(set2) (sets are combined with
"|" or .union(), not "+"); or set literals in Python 3.x, i.e.,
mySet = {1, 5, -4, 42}, if you have the exact elements


> def Intersect(set1, set2):              # evaluate 2 lists, check for
> commonalities, output commonalities to 1 new list
>        newList = []
>        for item in set1:
>                if item in set1 and item in set2:
>                        newList.append(item)
>        newList.sort()
>        return newList

- the "item in set1 and " in the if-statement is redundant. blow it away

- i think you can build it using a listcomp


> def Negate(set1, set2):         # evaluate 2 lists, return negation of 1st 
> list
>        newList = []
>        for item in set1:
>                if item in set2:
>                        set1.remove(item)
>        newList = set1
>        return newList

- you probably don't want to call set1.remove(). lists are immutable,
and you would've change the contents of set1. it's best if you made
newList a copy of set1, i.e., newList = list(set1) or newList =
set1[:], and *then* did your for-loop

- better yet, use a list comp with an if to build newList to avoid all
of the shenanigans i just described
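
a minimal sketch of the listcomp versions, using lowercase names as suggested above. the name "difference" and the sample lists are my own choices, not from the original post.

```python
def intersect(list1, list2):
    """Return the items of list1 that also appear in list2."""
    return [item for item in list1 if item in list2]

def difference(list1, list2):
    """Return the items of list1 that do not appear in list2."""
    return [item for item in list1 if item not in list2]

first = [1, 2, 3, 4]
second = [3, 4, 5]

common = intersect(first, second)
only_first = difference(first, second)

print(common, only_first)
print(first)   # the input list is left untouched
```

each comprehension builds a brand new list, so nothing is removed from the caller's data.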

hope this helps!
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2007,2001
"Python Fundamentals", Prentice Hall, (c)2009
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
python training and technical consulting
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com


Re: [Tutor] implementing set operations [was Re: (no subject)]

2009-07-29 Thread wesley chun
> - you probably don't want to call set1.remove(). lists are immutable,
> and you would've change the contents of set1.

sorry, make that "mutable." sets, dicts, and lists are standard Python
types that are mutable.

-wesley


[Tutor] Currency conversion module in python

2009-07-29 Thread Amit Sethi
Hi, does anybody know of any currency conversion module in Python?

-- 
A-M-I-T S|S