[Tutor] How to adjust a text file...

2007-09-27 Thread GTXY20
Hi,

I have a CSV file as follows:

ID Products
1 a b c d
1 a e
2 a b c
2 a
3 b c
3 a
4 d
5 a d

I am trying to write a script that will take the CSV file and output another
text file as follows:

ID   Products
1    a
1    b
1    c
1    d
1    a
1    e

and so on for all of the IDs. Essentially I need one output row per product
for each ID - currently the products are separated by a space. I am
thinking I need a for loop that splits on the space as a delimiter...
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] How to adjust a text file...

2007-09-28 Thread GTXY20
Thanks...

I was able to use the following to get what I needed done:


inp = open('input.txt', 'r')
out = open('output.txt', 'w')

for line in inp:
    Fields = line.split(",")
    ID = Fields[0]
    ProductMess = Fields[1]
    Product = ProductMess.split()
    # write one row per product, not the whole list
    for item in Product:
        out.write('%s\t%s\n' % (ID, item))

inp.close()
out.close()

Now my next challenge is to link a current table to this file and replace
values in the Product area based on the value - sort of like a global
replace. Any hints as to where Python might have some sort of lookup table
functionality?

M.


On 9/27/07, Kent Johnson <[EMAIL PROTECTED]> wrote:
>
> GTXY20 wrote:
> > Hi,
> >
> > I have a CSV file as follows:
> >
> > ID Products
> > 1 a b c d
> > 1 a e
> > 2 a b c
> > 2 a
> > 3 b c
> > 3 a
> > 4 d
> > 5 a d
> >
> > I am trying to write a script that will take the CSV file and output
> > another text file as follows:
> >
> > ID   Products
> > 1    a
> > 1    b
> > 1    c
> > 1    d
> > 1    a
> > 1    e
> >
> > etc.. for all of the ID's essentially I need to create a single instance
> > for products for each ID - currently the products are separated by a
> > space. I am thinking I need a for loop that will search on the space as
> > a delimiter...
>
> I should probably be teaching you to fish but tonight I have extra fish
> :-)
>
> If the products are single words then this is very simple. Something like
>
> inp = open('input.txt')
> out = open('output.txt', 'w')
>
> # Headers
> inp.next()
> out.write('ID\tProducts\n')
>
> for line in inp:
>   fields = line.split()
>   prodId = fields[0]
>   products = fields[1:]
>   for product in products:
>     out.write('%s\t%s\n' % (prodId, product))
>
> inp.close()
> out.close()
>
>
> If the product text is more complex then you might want to use the csv
> module to help read and write the file.
>
> BTW in Python 3 you can write
>   prodId, *products = line.split()
>
> http://www.python.org/dev/peps/pep-3132/
>
> Kent
>
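
Kent's closing suggestion to use the csv module could look roughly like the sketch below, written in Python 3 syntax. It assumes the two-column layout from the poster's own script (an ID, a comma, then space-separated products) and a header row; the function name `explode_rows` and the in-memory sample data are made up for illustration.

```python
import csv
import io

def explode_rows(infile, outfile):
    """Write one tab-separated (ID, product) row per product.

    Assumes each input row is 'ID,<space-separated products>' after a header.
    """
    reader = csv.reader(infile)
    writer = csv.writer(outfile, delimiter='\t', lineterminator='\n')
    next(reader, None)                  # skip the input header row
    writer.writerow(['ID', 'Products'])
    for row in reader:
        if len(row) < 2:                # skip blank or malformed rows
            continue
        prod_id, products = row[0], row[1].split()
        for product in products:
            writer.writerow([prod_id, product])

# In-memory stand-ins for input.txt and output.txt
sample = io.StringIO("ID,Products\n1,a b c d\n1,a e\n2,a b c\n")
result = io.StringIO()
explode_rows(sample, result)
print(result.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by `open('input.txt', newline='')` and `open('output.txt', 'w', newline='')`.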


[Tutor] Dictionary - count values where values are stored as a list

2007-09-30 Thread GTXY20
Hello,

Any way to display the count of the values in a dictionary where the values
are stored as a list? here is my dictionary:

 {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4':
['a', 'c']}

I would like to display count as follows and I would not know all the value
types in the values list:

Value QTY
a   4
b   3
c   4

Also is there any way to display the count of the values list combinations? So
here again is my dictionary:
{'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4':
['a', 'c']}


And I would like to display as follows

QTY Value List Combination
3  a,b,c
1  a,c

Once again all help is much appreciated.

M.


Re: [Tutor] Dictionary - count values where values are stored as a list

2007-10-01 Thread GTXY20
This works perfectly.

However I will be dealing with an import of a very large dictionary - if I
call the commands at command line this seems to be very taxing on the CPU
and memory and will take a long time.

I was thinking of creating each as a function whereby Python would just
write to a file instead of being called within a Python shell. Do you think
that this would speed up the process?

All in total I will probably be looking at about 2 million dictionary keys
with assorted value quantities.

M.


On 10/1/07, Kent Johnson <[EMAIL PROTECTED]> wrote:
>
> GTXY20 wrote:
> > Hello,
> >
> > Any way to display the count of the values in a dictionary where the
> > values are stored as a list? here is my dictionary:
> >
> > {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4':
> > ['a', 'c']}
> >
> > I would like to display count as follows and I would not know all the
> > value types in the values list:
> >
> > Value QTY
> > a   4
> > b   3
> > c   4
>
> You need two nested loops - one to loop over the dictionary values and
> one to loop over the individual lists. collections.defaultdict is handy
> for accumulating the counts but you could use a regular dict also:
>
> In [4]: d={'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4': ['a', 'c']}
> In [5]: from collections import defaultdict
> In [6]: counts=defaultdict(int)
> In [7]: for lst in d.values():
>    ...:     for item in lst:
>    ...:         counts[item] += 1
>    ...:
> In [8]: counts
> Out[8]: defaultdict(<type 'int'>, {'a': 4, 'c': 4, 'b': 3})
>
> In [10]: for k, v in sorted(counts.items()):
>    ....:     print k,v
>    ....:
>    ....:
> a 4
> b 3
> c 4
>
>
> > Also is there anyway to display the count of the values list
> > combinations so here again is my dictionary:
> >
> > {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4':
> > ['a', 'c']}
> >
> >
> > And I would like to display as follows
> >
> > QTY Value List Combination
> > 3  a,b,c
> > 1  a,c
>
> Again you can use a defaultdict to accumulate counts. You can't use a
> mutable object (such as a list) as a dict key so you have to convert it
> to a tuple:
>
> In [11]: c2=defaultdict(int)
> In [13]: for v in d.values():
>    ....:     c2[tuple(v)] += 1
>    ....:
> In [14]: c2
> Out[14]: defaultdict(<type 'int'>, {('a', 'b', 'c'): 3, ('a', 'c'): 1})
>
> Printing in order of count requires switching the order of the (key,
> value) pairs:
>
> In [15]: for count, items in sorted( ((v, k) for k, v in c2.items()), reverse=True):
>    ....:     print count, ', '.join(items)
>    ....:
> 3 a, b, c
> 1 a, c
>
> or using a sort key:
> In [16]: from operator import itemgetter
> In [17]: for items, count in sorted(c2.items(), key=itemgetter(1), reverse=True):
>    ....:     print count, ', '.join(items)
>    ....:
> 3 a, b, c
> 1 a, c
>
> Kent
>
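
For readers on Python 2.7 or Python 3, the two tallies in Kent's reply can also be sketched with `collections.Counter`, which was added to the standard library after this thread was written. This is an alternative to the defaultdict approach above, not a correction of it; the sample dictionary is the one from the question.

```python
from collections import Counter

d = {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'],
     '2': ['a', 'b', 'c'], '4': ['a', 'c']}

# Count every item across all of the value lists
item_counts = Counter(item for lst in d.values() for item in lst)

# Count each distinct combination; lists are unhashable, so use tuples as keys
combo_counts = Counter(tuple(lst) for lst in d.values())

print('Value\tQTY')
for value, qty in sorted(item_counts.items()):
    print('%s\t%d' % (value, qty))

print('QTY\tValue List Combination')
for combo, qty in combo_counts.most_common():   # highest count first
    print('%d\t%s' % (qty, ','.join(combo)))
```

`most_common()` takes care of the "order by count" step that the IPython session does with `sorted(..., reverse=True)`.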


Re: [Tutor] Dictionary - count values where values are stored as a list

2007-10-01 Thread GTXY20
Thanks again I have worked that issue out.

However I have the following function and it is throwing this error:

FEXpython_v2.py", line 32, in UnitHolderDistributionqty
count[item]+=1
KeyError: 3

This is the function:

def Distributionqty(dictionary):
    holder=list()
    held=list()
    distqtydic={}
    count={}
    for key in sorted(dictionary.keys()):
        holder.append(key)
        held.append(len(dictionary[key]))
    for (key, value) in map(None, holder, held):
        distqtydic[key]=value
    for item in distqtydic.values():
        count[item]+=1
    for k,v in sorted(count.items()):
        fdist=k
        qty=v
        print fdist,qty

Not sure...

M.



On 10/1/07, Kent Johnson <[EMAIL PROTECTED]> wrote:
>
> GTXY20 wrote:
> >
> > This works perfectly.
> >
> > However I will be dealing with an import of a very large dictionary - if
> > I call the commands at command line this seems to be very taxing on the
> > CPU and memory and will take a long time.
> >
> > I was thinking of creating each as a function whereby Python would just
> > write to a file instead of being called within a Python shell. Do you think
> > that this would speed up the process?
>
> I don't understand what you are suggesting.
>
> Both of your requirements just need the values of the dict. If the dict
> is being created from a file, you could probably build the count dicts
> on the fly as you read the values without ever creating the dict with
> all the items in it.
>
> Kent
>
> >
> > All in total I will probably be looking at about 2 million dictionary
> > keys with assorted value quantities.
>
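
Kent's idea of building the count dicts on the fly, without first holding every key in one big dictionary, might be sketched as below in Python 3. The input layout (one whitespace-separated line per ID, `ID item item ...`) is an assumption made for the example, as are the function name `count_while_reading` and the sample data.

```python
import io
from collections import Counter

def count_while_reading(lines):
    """Accumulate both tallies line by line, never storing the full dict.

    Assumes one line per ID in the form 'ID item item ...'.
    """
    item_counts = Counter()
    combo_counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:       # skip blank lines and IDs with no items
            continue
        items = fields[1:]
        item_counts.update(items)           # one tick per item
        combo_counts[tuple(items)] += 1     # one tick per combination
    return item_counts, combo_counts

# io.StringIO stands in for an open file object
sample = io.StringIO("1 a b c\n2 a b c\n3 a b c\n4 a c\n")
item_counts, combo_counts = count_while_reading(sample)
print(item_counts, combo_counts)
```

Memory use then grows with the number of distinct items and combinations rather than with the number of IDs. If an ID's items were spread over several lines, the combination count would need the lines grouped by ID first.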


Re: [Tutor] Dictionary - count values where values are stored as a list

2007-10-01 Thread GTXY20
Thanks so much I changed to the following and this worked:

def HolderDistributionqty(dictionary):
    from collections import defaultdict
    count=defaultdict(int)
    for item in dictionary.values():
        count[len(item)]+=1
    for k,v in sorted(count.items()):
        fdist=k
        qty=v
        print fdist,qty

M.
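
The KeyError in the earlier version comes from `count[item]+=1` on a plain dict: `+=` has to read the current value before it can add to it, and a plain dict has no value for a key it has not seen. A small Python 3 sketch of the failure and two ways around it follows; the sample data and variable names are illustrative only.

```python
from collections import defaultdict

data = {'1': ['a', 'b', 'c'], '2': ['a', 'c'], '3': ['a', 'b', 'c']}

# Plain dict: the first += on a new key fails, because += reads before writing
plain = {}
try:
    plain[3] += 1
except KeyError as err:
    print('KeyError:', err)

# Option 1: dict.get supplies 0 when the key is missing
by_get = {}
for items in data.values():
    by_get[len(items)] = by_get.get(len(items), 0) + 1

# Option 2: defaultdict(int) creates missing keys with the value 0
by_default = defaultdict(int)
for items in data.values():
    by_default[len(items)] += 1

print(by_get, dict(by_default))
```

The working function above takes the second route.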



[Tutor] Update values stored as a list in a dictionary with values from another dictionary

2007-10-01 Thread GTXY20
Hello all,

Let's say I have the following dictionary:

{1:(a,b,c), 2:(a,c), 3:(b,c), 4:(a,d)}

I also have another dictionary for new value association:

{a:1, b:2, c:3}

How should I approach if I want to modify the first dictionary to read:

{1:(1,2,3), 2:(1,3), 3:(2,3), 4:(1,d)}

There is the potential to have a value in the first dictionary that will not
have an update key in the second dictionary hence in the above dictionary
for key=4 I still have d listed as a value.

As always all help is appreciated.

M.


Re: [Tutor] Update values stored as a list in a dictionary with values from another dictionary

2007-10-02 Thread GTXY20
I seem to be encountering a problem and I think it is because I actually
have my data as follows:

data = {1:[a,b,c], 2:[a,c], 3:[b,c], 4:[a,d]}

not as previously mentioned:

data = {1:(a,b,c), 2:(a,c), 3:(b,c), 4:(a,d)}

So the values are actually stored as a list.

I am trying to adjust so that data ends up being:

{1:[1,2,3], 2:[1,3], 3:[2,3], 4:[1,d]}

right now I am getting:

{1:[[1],[2],[3]], 2:[[1],[3]], 3:[[2],[3]], 4:[[1],d]}

which is problematic for other things I am trying to do - it is indicating
that the values are not hashable.





On 10/2/07, John Fouhy <[EMAIL PROTECTED]> wrote:
>
> On 02/10/2007, GTXY20 <[EMAIL PROTECTED]> wrote:
> > Hello all,
> >
> > Let's say I have the following dictionary:
> >
> > {1:(a,b,c), 2:(a,c), 3:(b,c), 4:(a,d)}
> >
> > I also have another dictionary for new value association:
> >
> > {a:1, b:2, c:3}
> >
> > How should I approach if I want to modify the first dictionary to read:
> >
> >  {1:(1,2,3), 2:(1,3), 3:(2,3), 4:(1,d)}
> >
> > There is the potential to have a value in the first dictionary that will not
> > have an update key in the second dictionary hence in the above dictionary
> > for key=4 I still have d listed as a value.
>
> You could use the map function...
>
> Let's say we have something like:
>
> transDict = { 'a':1, 'b':2, 'c':3 }
>
> We could define a function that mirrors this:
>
> def transFn(c):
>     try:
>         return transDict[c]
>     except KeyError:
>         return c
>
> Then if you have your data:
>
> data = { 1:('a','b','c'), 2:('a','c'), 3:('b','c'), 4:('a','d')}
>
> You can translate it as:
>
> for key in data.keys():
>     data[key] = map(transFn, data[key])
>
> HTH!
>
> --
> John.
>
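
John's approach relies on Python 2's `map` returning a list. Under Python 3, `map` returns a lazy iterator, so a comprehension is usually the simpler route. The sketch below does the same translation with `dict.get`, which returns the original item when no translation exists; the name `trans_dict` is only a renaming of John's `transDict`.

```python
trans_dict = {'a': 1, 'b': 2, 'c': 3}
data = {1: ('a', 'b', 'c'), 2: ('a', 'c'), 3: ('b', 'c'), 4: ('a', 'd')}

# dict.get(key, default) falls back to the item itself when it has no entry.
# Building a new dict avoids assigning into `data` while iterating over it.
translated = {key: tuple(trans_dict.get(item, item) for item in items)
              for key, items in data.items()}
print(translated)
```

Swapping `tuple(...)` for a list comprehension gives list values instead, which matches the data layout described in the follow-up message.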


Re: [Tutor] Update values stored as a list in a dictionary with values from another dictionary

2007-10-02 Thread GTXY20
Sorry - solved my own problem - it was the way I was creating my dictionary
and assigning the value as a list.

I will post my final working code shortly.

M.



Re: [Tutor] Update values stored as a list in a dictionary with values from another dictionary

2007-10-02 Thread GTXY20
This seemed to work:

def transFn(c):
    transfile = open('translate.txt', 'r')
    records = transfile.read()
    transfile.close()
    lines = records.split()
    transDict = {}
    for line in lines:
        key, value = line.split(',')
        transDict[key] = value
    try:
        return transDict[c]
    except KeyError:
        return c

for key in data.keys():
    data[key] = map(transFn, data[key])



Re: [Tutor] Update values stored as a list in a dictionary with values from another dictionary

2007-10-02 Thread GTXY20
I adjusted the code as follows, so that if I do not need to translate a
dictionary I simply do not call the function transFn:

def transFn(translatefile):
    transfile = open(translatefile, 'r')
    records = transfile.read()
    transfile.close()
    lines = records.split()
    transDict = {}
    for line in lines:
        key, value = line.split(',')
        transDict[key] = value

    for key, value in Data.items():
        Data[key] = [ transDict.get(i, i) for i in value ]

At this point can anyone recommend any Python modules that help with
reporting and graphing?

On 10/2/07, Kent Johnson <[EMAIL PROTECTED]> wrote:
>
> GTXY20 wrote:
> >
> > This seemed to work:
> >
> > def transFn(c):
> >     transfile = open('translate.txt', 'r')
> >     records = transfile.read()
> >     transfile.close()
> >     lines = records.split()
> >     transDict = {}
> >     for line in lines:
> >         key, value = line.split(',')
> >         transDict[key] = value
> >     try:
> >         return transDict[c]
> >     except KeyError:
> >         return c
>
> Yikes! This is re-reading translate.txt every time transFn() is called,
> i.e. once for every key in data!
>
> Kent
>
> >
> > for key in data.keys():
> >     data[key] = map(transFn, data[key])
>
>
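
One way to address Kent's objection is to split the work in two: read the translation file once into a dict, then reuse that dict for every lookup. A Python 3 sketch follows; the function names `load_translations` and `translate` are invented for the example, and the `old,new` line format is taken from the poster's code.

```python
import io

def load_translations(lines):
    """Build the lookup dict once from 'old,new' lines."""
    trans_dict = {}
    for line in lines:
        line = line.strip()
        if not line:              # ignore blank lines
            continue
        old, new = line.split(',', 1)
        trans_dict[old] = new
    return trans_dict

def translate(data, trans_dict):
    """Return a new dict with each list item translated where possible."""
    return {key: [trans_dict.get(item, item) for item in items]
            for key, items in data.items()}

# io.StringIO stands in for open('translate.txt')
trans_dict = load_translations(io.StringIO("a,1\nb,2\nc,3\n"))
data = {'1': ['a', 'b', 'c'], '4': ['a', 'd']}
result = translate(data, trans_dict)
print(result)
```

The file is now read once per run instead of once per item, and the translation step can be tested without touching the disk.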


Re: [Tutor] Update values stored as a list in a dictionary with values from another dictionary

2007-10-02 Thread GTXY20
Here's an interesting question:

Can I use the transFn function to remove items in the value list?

Can this be done by simply assigning the current value a value of null
in the translate file?

M.





Re: [Tutor] Update values stored as a list in a dictionary with values from another dictionary

2007-10-02 Thread GTXY20
I have the transFn function as follows:

def transFn(translatefile):
    transfile = open(translatefile, 'r')
    records = transfile.read()
    transfile.close()
    lines = records.split()
    transDict = {}
    for line in lines:
        key, value = line.split(',')
        transDict[key] = value

    for key, value in data.items():
        data[key] = [ x for x in (transDict.get(i, i) for i in value) if x is not None]

my original data is:

data = {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'],
'4': ['a', 'c']}

my transDict is:

transDict = {'a': '1', 'b': '2'}

However when I run transFn my data is:

data = {'1': ['1', '2', 'c'], '3': ['1', '2', 'c'], '2': ['1', '2', 'c'],
'4': ['1', 'c']}

I was expecting:

{'1': ['1', '2'], '3': ['1', '2'], '2': ['1', '2'], '4': ['1']}

I will see if I can work with a remove command??

M.

On 10/2/07, Kent Johnson <[EMAIL PROTECTED]> wrote:
>
> GTXY20 wrote:
> > Here's an interesting question:
> >
> > Can I use the transFn function to remove items in the value list.
> >
> > Can this be done by simple assigning the current value a value of null
> > in the translate file?
>
> No, that will make the translated value be None (I guess that is what
> you mean by null). You could then filter for these, for example
>
>  for key, value in Data.items():
>      Data[key] = [ transDict.get(i, i) for i in value if transDict.get(i, i) is not None]
>
> If you want to avoid double-fetching from transDict (if Data is huge
> this might matter) then you could write out the loop or possibly use
> something like
>    Data[key] = [ x for x in (transDict.get(i, i) for i in value) if x is not None]
>
> which makes an intermediate generator and filters that.
>
> Kent
>
>
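
The result reported in this message follows from how `transDict.get(i, i)` behaves: 'c' has no entry in transDict, so the lookup falls back to 'c' itself, and the `is not None` filter never fires. Values read from a text file are strings in any case, so they are never None. Two possible ways to get the expected output are sketched below in Python 3. The empty-string removal marker in the second option is an assumed convention, not something from the thread.

```python
data = {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'],
        '2': ['a', 'b', 'c'], '4': ['a', 'c']}
trans_dict = {'a': '1', 'b': '2'}

# Option 1: keep only items that have a translation; everything else is dropped
only_mapped = {key: [trans_dict[i] for i in items if i in trans_dict]
               for key, items in data.items()}

# Option 2: an empty translation (a line such as "c," in the file; this
# convention is an assumption) marks an item for removal, while items with
# no entry at all pass through unchanged
trans_with_marker = {'a': '1', 'b': '2', 'c': ''}
translated = {key: [trans_with_marker.get(i, i) for i in items]
              for key, items in data.items()}
marker_removed = {key: [x for x in items if x != '']
                  for key, items in translated.items()}

print(only_mapped)
print(marker_removed)
```

Option 1 changes the meaning of a missing entry from "leave as is" to "remove", so it only fits if every item worth keeping appears in the translation file.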


[Tutor] Review and criticism of python project

2008-01-03 Thread GTXY20
Hello all,

Is there a forum or group where I can upload my python project for review?

I am new at Python and at this point my program is doing what it needs to I
just can't help but feeling I have some errors or improper coding going on
inside.

Any advice is very much appreciated.

Thanks.

GTXY20


Re: [Tutor] Review and criticism of python project

2008-01-04 Thread GTXY20
There are no errors per se - the script is doing what it needs to. I guess I
just want to check it for compliance - for some reason I think it is a mess
and should be much cleaner.

I am only concerned with one particular area of the complete project - it is
229 lines in total - would this be too much to post? I do not have a website
to post code to - just don't want to post too much for the group and annoy
anyone.

Thanks for your comments and let me know.

GTXY20

On Jan 3, 2008 6:08 PM, wesley chun <[EMAIL PROTECTED]> wrote:

> > Is there a forum or group where I can upload my python project for review?
> >
> > I am new at Python and at this point my program is doing what it needs to I
> > just can't help but feeling I have some errors or improper coding going on
> > inside.
>
>
> if it's not too huge, feel free to post it here for feedback!  let us
> know a bit about what it does, your intentions and/or perhaps how you
> built it, and any specific areas of concern. also post any error msgs
> if applicable.
>
> cheers,
> -- wesley
>


Re: [Tutor] Review and criticism of python project

2008-01-04 Thread GTXY20
Hi there.

What this section does is take a comma-separated data file and import it -
there is a unique ID in the first field and a code in the second field that
corresponds to a certain section of information. What I need from this is
for the process to roll up all section holdings against each unique ID
without duplicates, report on section combinations, and give overall section
counts. In addition I need the ability to assign a value for page count to
these sections and the ability to upload a translation file just in case a
section is identified by multiple values that need to be normalized to a
single unique value.

Sorry for the lengthy code response - all comments are appreciated - as
mentioned I am quite new with Python - it is doing what I need it to do but
I think that it is a mess and needs to be cleaned up a little.

Thanks for any comments.

GTXY20

import sys
import os
class __analysis:
    def __init__(self):
        print '***Analysis Tool***'
        datafile=raw_input('data file name:')
        self.datafile=datafile
        self.parsefile()

    # script to import unitID section data and section page count reference and
    # create a sorted dictionary
    # where in uhdata{} key=unitID and value=unitID section holdings
    # where in pgcnt{} key=Section and value=page count

    def parsefile(self):
        try:
            uhdatafile = open(self.datafile, 'r')
            records = uhdatafile.read()
            uhdatafile.close()
            lines = records.split()
            self.uhdata={}
            for line in lines:
                uh, tf = line.split(',')
                if uh in self.uhdata:
                    f=self.uhdata[uh]
                    if tf not in f:
                        f.append(tf)
                else:
                    self.uhdata[uh]=[tf]

            for uh, Sections in self.uhdata.items():
                Sections.sort()
        except IOError:
            print 'file not found check file name'
            analysis()

        ftranslateok=raw_input('would you like to translate section codes? (y/n):')
        if ftranslateok == 'y':
            self.transFn()
        else:
            pass
        pgcountok=raw_input('would you like to assign section page counts? (y/n):')
        if pgcountok == 'y':
            self.setPageCounts()
        else:
            missingpgcounts={}
            fmissingpgcounts=[]
            for x in self.uhdata:
                for f in self.uhdata[x]:
                    if f not in fmissingpgcounts:
                        fmissingpgcounts.append(f)
            for x in fmissingpgcounts:
                missingpgcounts[x]=0
            self.pgcounts = missingpgcounts
        fdistmodel=raw_input('would you like to define max section distribution cut off? (y/n):')
        if fdistmodel == 'y':
            self.fdistmax=raw_input('what is the max distributions before a full book?:')
            self.fdistmax=int(self.fdistmax)
            self.Sectiondistmax()
        else:
            self.fdistmax=10
            self.Sectiondistmax()
        sys.exit(1)

    # function to determine number of uniqueID for each section
    def Sectionqty(self):
        Sectionqtyoutfile = open('Sectionqty.txt', 'w+')
        Sectionqtyoutfile.write ('Section\tQTY\n')
        from collections import defaultdict
        fcounts=defaultdict(int)
        flst=[]
        flst2=[]
        if self.fdistmax == 10:
            for v in self.uhdata.values():
                for item in v:
                    fcounts[item]+=1

            for k,v in sorted(fcounts.items()):
                Section=k
                fqty=v
                Sectionqtyoutfile.write ('%s\t%s\n' % (Section, fqty))

        else:
            for k,v in self.uhdata.items():
                if len(v)<=self.fdistmax:
                    flst.append(self.uhdata[k])
            for i in flst:
                for x in i:
                    flst2.append(x)
            for Sections in flst2:
                fcounts[Sections]+=1
            for k,v in sorted(fcounts.items()):
                Section= k
                fqty= v
                Sectionqtyoutfile.write ('%s\t%s\n' % (Section, fqty))

        Sectionqtyoutfile.close()
        self.SectionCombqty()

    # function to determine number of uniqueID section combinations and
    # associated section page counts
    def SectionCombqty(self):
        SectionCombqtyoutfile = open('SectionCombqty.txt', 'w+')
        SectionCombqtyoutfile.write('Combination Qty\tNumber of Sections\tCombination\tCombinationPageCount\tTotalPages\n')
        fullbook = 'Full Book'
        fgreater=[]
        fcheck=0
        from collections import defaultdict
        fcomb=defaultdict(int)
        for uh in self.uhdata.keys():
            fcom

Re: [Tutor] Review and criticism of python project

2008-01-04 Thread GTXY20
Thanks for the feedback - I will go through your comments line by line,
adjust and test, and will re-post when complete.

GTXY20

On Jan 4, 2008 9:29 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:

> Tiger12506 wrote:
>
> > Ouch. Usually in OOP, one never puts any user interaction into a class.
>
> That seems a bit strongly put to me. Generally user interaction should
> be separated from functional classes but you might have a class to help
> with command line interaction (e.g. the cmd module in the std lib) and
> GUI frameworks are usually built with classes.
>
>
> >>            uh, tf = line.split(',')
> >>            if uh in self.uhdata:
> >>                f=self.uhdata[uh]
> >>                if tf not in f:
> >>                    f.append(tf)
> >>            else:
> >>                self.uhdata[uh]=[tf]
> >
> > This can read
> >
> > try:
> >   f = self.uhdata[uh]
> > except KeyError:
> >   self.uhdata[uh] = []
> > finally:
> >   self.uhdata[uh].append(tf)
>
> These are not equivalent - the original code avoids duplicates in
> self.uhdata[uh].
>
> Possibly self.uhdata[uh] could be a set instead of a list. Then this
> could be written very nicely using defaultdict:
>
> from collections import defaultdict
>   ...
>   self.uhdata = defaultdict(set)
>   ...
>   self.uhdata[uh].add(tf)
>
> >>            for uh, Sections in self.uhdata.items():
> >>                Sections.sort()
> >
> > This will not work. In the documentation, it says that dictionary objects
> > return a *copy* of the (key,value) pairs.
> > You are sorting those *copies*.
>
> No, the original code is fine. The docs say that dict.items() returns "a
> copy of a's list of (key, value) pairs". This is a little misleading,
> perhaps. A dict does not actually contain a list of key, value pairs, it
> is implemented with a hash table. dict.items() returns a new list
> containing the key, value pairs.
>
> But the keys and values in the new list are references to the same keys
> and values in the dict. So mutating the values in the returned list, via
> sort(), will also mutate the values in the dict.
>
>
> >>            missingpgcounts={}
> >>            fmissingpgcounts=[]
> >>            for x in self.uhdata:
> >>                for f in self.uhdata[x]:
> >>                    if f not in fmissingpgcounts:
> >>                        fmissingpgcounts.append(f)
> >
> > fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in
> > fmissingpgcounts]
> > Ahhh... This looks like a set.
> > fmissingpgcounts = set(self.uhdata.itervalues())
>
> Again, this is not quite the same thing. The original code builds a set
> of the contents of the values. You are building a set from the values
> (which are already lists). You still need one loop:
> fmissingpgcounts = set()
> for v in self.uhdata.itervalues():
>   fmissingpgcounts.update(v)
>
> > self.pgcounts = dict((x,0) for x in fmissingpgcounts)
>
> or, self.pgcounts = dict.fromkeys(fmissingpgcounts, 0)
>
> That's all I have energy for...
> Kent
>
>
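
The points made in the reply above can be checked with a small sketch. The data is invented for illustration (the real self.uhdata is not shown in the thread), a plain uhdata variable stands in for the attribute, and values() is used in place of itervalues() so the code also runs on Python 3.

```python
from collections import defaultdict

# Hypothetical stand-in for self.uhdata; keys and values are invented.
uhdata = {"uh1": ["c", "a", "b"], "uh2": ["b", "a"]}

# items() hands back references to the same list objects the dict holds,
# so sorting a value in place changes what the dict sees.
for uh, sections in uhdata.items():
    sections.sort()

# One loop is still needed to collect the *contents* of every value.
all_sections = set()
for v in uhdata.values():
    all_sections.update(v)

# dict.fromkeys gives every collected item a starting count of 0.
pgcounts = dict.fromkeys(all_sections, 0)

# defaultdict(set) avoids duplicates without an explicit membership test.
by_uh = defaultdict(set)
for uh, tf in [("uh1", "a"), ("uh1", "a"), ("uh2", "b")]:
    by_uh[uh].add(tf)
```

After the first loop the lists stored in uhdata are themselves sorted, which is why the original code was fine.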


[Tutor] PHP & Python suggestions....

2008-02-03 Thread GTXY20
Hi all,

First off let me say how helpful and informative this mailing list is. It is
very much appreciated.

Anyway, I am looking for some suggestions for reading up on how to call
Python from PHP scripts, specifically from a PHP web application: PHP will
call a Python script with some arguments, and Python will run on the server
and return the results within another PHP page.

Once again thanks.

GTXY20.


[Tutor] C# defaultdict question

2008-04-07 Thread GTXY20
Hello all,

Sadly I need to convert a great python application into C# .Net. I have been
pretty successful so far but I was wondering if anyone knew of something
similar to a python defaultdict(int) in C#. In python I am doing:

from collections import defaultdict
g = {}  # the value in each key/value pair is a tuple of values
f = defaultdict(int)

In order to get totals of particular values within each tuple within the
complete dictionary g I do:

for values in g.values():
    for item in values:
        f[item] += 1

I can accomplish this in C# by converting the values (which are stored as a
List) from a C# dictionary to  an Array then looping through the Array but
was wondering if anyone had a different take or thoughts how to do this in
C# as easily as it is done in Python.
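
For comparison, here is a sketch of the same tally written against a plain dict, without defaultdict. The sample data is invented. This form may be easier to port, since a "look up with a fallback" step exists in most dictionary types (in C#, Dictionary.TryGetValue plays a similar role, though that is a suggestion and not something taken from this thread).

```python
# Hypothetical sample data: each value is a tuple of items, as described.
g = {"x": ("a", "b"), "y": ("a", "c", "a")}

totals = {}
for values in g.values():
    for item in values:
        # Look up the current count, falling back to 0 for unseen items.
        totals[item] = totals.get(item, 0) + 1
```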

Thanks.

G.


[Tutor] Nested dictionary with defaultdict

2008-04-15 Thread GTXY20
Hi tutors,

I currently have a dictionary like the following:

{'1': ['220', '220', '220'], '2': ['220', '238', '238', '238', '238'], '3':
['220', '238'], '4': ['220', '220'], '5': ['220', '220', '238'], '6':
['238', '238'], '7': ['220']}

I am trying to create a dictionary that would list the current key and a
count of the iterations of values within the value list like so:

{'1': {'220' : 3}, '2': {'220' : 1}, 2: {238 : 4}, '3': {'220' : 1}, 3: {
'238' : 1}, '4': {220 : 2}, '5': {'220: 2}, '5': {238 : 1}, '6': {'238' :
2}, '7': {'220' : 1}}

Now I am pretty sure that I need to loop through the first dictionary and
create a defaultdict from the values for each key in that dictionary but at
this point I am having difficulty coming up with the loop.

I am looking for a starting point or any suggestions.

Many thanks in advance.

GTXY20


Re: [Tutor] Nested dictionary with defaultdict

2008-04-15 Thread GTXY20
Hi Kent,

Yes, I think so. I think I am almost there with this:

from collections import defaultdict
d = {'1': ['220', '220', '220'], '2': ['220', '238', '238', '238', '238'],
'3': ['220', '238'], '4': ['220', '220'], '5': ['220', '220', '238'], '6':
['238', '238'], '7': ['220']}

for f, b in d.items():
    h = defaultdict(int)
    for j in b:
        h[j]+=1
    print ('%s, %s' % (f, h))

However, I am not exactly happy with the printed output. As soon as I
complete it I will repost what I come up with.

Thanks so much.

M.

On Tue, Apr 15, 2008 at 10:17 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:

> GTXY20 wrote:
>
> >
> > Hi tutors,
> >
> > I currently have a dictionary like the following:
> >
> > {'1': ['220', '220', '220'], '2': ['220', '238', '238', '238', '238'],
> > '3': ['220', '238'], '4': ['220', '220'], '5': ['220', '220', '238'], '6':
> > ['238', '238'], '7': ['220']}
> >
> > I am trying to create a dictionary that would list the current key and a
> > count of the iterations of values within the value list like so:
> >
> > {'1': {'220' : 3}, '2': {'220' : 1}, 2: {238 : 4}, '3': {'220' : 1}, 3:
> > { '238' : 1}, '4': {220 : 2}, '5': {'220: 2}, '5': {238 : 1}, '6': {'238' :
> > 2}, '7': {'220' : 1}}
> >
>
> ?? Do you really want keys of '2' and 2? How can you have two keys '5'? I
> guess maybe you want
> {'1': {'220': 3}, '2': {'220': 1, '238': 4}, '3': {'220': 1, '238': 1},
> '4': {'220': 2}, '5': {'220': 2, '238': 1}, '6': {'238': 2}, '7': {'220': 1}}
>
>
> > Now I am pretty sure that I need to loop through the first dictionary and
> > create a defaultdict from the values for each key in that dictionary but at
> > this point I am having difficulty coming up with the loop.
> >
> > I am looking for a starting point or any suggestions.
> >
>
> Do you know how to turn
> ['220', '238', '238', '238', '238']
> into
> {'220' : 1, '238' : 4}
> ?
>
> If so, then put that code in a loop over the key, value pairs of the dict.
>
> Kent
>


Re: [Tutor] Nested dictionary with defaultdict

2008-04-15 Thread GTXY20
Thanks John and Kent for the guidance.

The following ends up working perfectly for me - instead of printing to the
console I will just write this to a text file. I will also wrap it in a
function.
from collections import defaultdict
d = {'1': ['220', '220', '220'], '2': ['220', '238', '238', '238', '238'],
'3': ['220', '238'], '4': ['220', '220'], '5': ['220', '220', '238'], '6':
['238', '238'], '7': ['220']}

for f, b in d.items():
    h = defaultdict(int)
    for j in b:
        h[j]+=1
    for k,v in sorted (h.items()):
        print ('%s, (%s:%s)' % (f, k, v))
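
Wrapped in a function, the loop above might look like the sketch below. The name count_values is made up here, and the function returns the nested dictionary instead of printing it, which is an assumption about what would be most useful.

```python
from collections import defaultdict

def count_values(d):
    """Return {key: {value: count}} for a dict whose values are lists."""
    result = {}
    for key, values in d.items():
        counts = defaultdict(int)
        for value in values:
            counts[value] += 1
        result[key] = dict(counts)  # plain dict for cleaner printing
    return result

d = {'1': ['220', '220', '220'], '2': ['220', '238', '238', '238', '238']}
nested = count_values(d)
```

Here nested holds one inner dictionary per key, so nested['2']['238'] gives the count of '238' under key '2'.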

M.


On Tue, Apr 15, 2008 at 10:19 PM, John Fouhy <[EMAIL PROTECTED]> wrote:

> On 16/04/2008, GTXY20 <[EMAIL PROTECTED]> wrote:
> > I currently have a dictionary like the following:
> >
> > {'1': ['220', '220', '220'], '2': ['220', '238', '238', '238', '238'],
> > '3': ['220', '238'], '4': ['220', '220'], '5': ['220', '220', '238'],
> > '6': ['238', '238'], '7': ['220']}
> >
> > I am trying to create a dictionary that would list the current key and a
> > count of the iterations of values within the value list like so:
> >
> > {'1': {'220' : 3}, '2': {'220' : 1}, 2: {238 : 4}, '3': {'220' : 1}, 3:
> > { '238' : 1}, '4': {220 : 2}, '5': {'220: 2}, '5': {238 : 1}, '6': {'238' :
> > 2}, '7': {'220' : 1}}
> [...]
> > I am looking for a starting point or any suggestions.
>
> Can you write a function that will take a list and return a dictionary
> with the counts of elements in the list?
>
> i.e. something like:
>
> >>> def countValues(valueList):
> ...  # your code goes here
> ...
> >>> countValues(['220', '238', '238', '238', '238'])
> {'238': 4, '220': 1}
>
> P.S.  Your sample output is not a valid dictionary...
>
> --
> John.
>


[Tutor] Iterate through dictionary values and remove item

2008-05-14 Thread GTXY20
Hello all,

I have dictionary like the following:

d={(1,'23A'):['a','b','c','d'], (1,'24A'):['b','c','d'], (2,'23A'):['a','b'],
(2,'24A'):['a','b','c','d']}

I would like to iterate through the dictionary such that if it finds
the value 'a' in the values of the key that it would remove the value
'b' from the values list. In addition if it finds 'a' in the values
list I would like it to take the first item from the key tuple k[0]
and then look through the dictionary for that k[0] value and then
remove 'b' from its value list even if 'a' is not present in that
list. So at the end I would like my dictionary to end with:

d={(1,'23A'):['a','c','d'], (1,'24A'):['c','d'], (2,'23A'):['a'],
(2,'24A'):['a','c','d']}

Initially I did the following:

for k,v in d.items():
    u=k[0]
    b=k[1]
    if 'a' in v:
        for k,v in d.items():
            if k[0] == u:
                for values in v:
                    if values == 'b':
                        v.remove(values)

However this will be a very big dictionary and it ended up running for
a very long time - in fact I had to kill the process.

Can anyone suggest an alternate method to do this?

Thanks in advance for any suggestions.

G.


Re: [Tutor] Iterate through dictionary values and remove item

2008-05-15 Thread GTXY20
Hi Kent and Bob,

Bob, sorry about the previous code snippet (it was late) - I had previously
been trying to accomplish this with the following:

for k,v in d.items():
    u=k[0]
    b=k[1]
    if 'a' in v:
        for k,v in d.items():
            if k[0] == u:
                for vals in v:
                    if vals == 'b':
                        v.remove('b')

this as mentioned was not performing very well at all.

Kent, I incorporated your suggestion:

toRemove=set(k[0] for k, v in d.iteritems() if 'a' in v)
for k,v in d.iteritems():
    if k[0] in toRemove:
        try:
            v.remove('b')
        except ValueError:
            pass

and the speed and result improved very dramatically. I suspect that I need
to get a better handle on the difference between items() and iteritems() and
what situations would call for them respectively.

Having said that, Kent, I am not 100 percent sure of what you meant when you
mention a two-level dict. Can you give me a very brief example?

Thank you so much for your feedback.

G.

On Thu, May 15, 2008 at 7:57 AM, Kent Johnson <[EMAIL PROTECTED]> wrote:

>  On Thu, May 15, 2008 at 2:40 AM, GTXY20 <[EMAIL PROTECTED]> wrote:
> > Hello all,
> >
> > I have dictionary like the following:
> >
> > d={(1,'23A'):['a','b','c','d'], (1,'24A'):['b','c','d'], (2,'23A'):['a','b'],
> > (2,'24A'):['a','b','c','d']}
> >
> > I would like to iterate through the dictionary such that if it finds
> > the value 'a' in the values of the key that it would remove the value
> > 'b' from the values list. In addition if it finds 'a' in the values
> > list I would like it to take the first item from the key tuple k[0]
> > and then look through the dictionary for that k[0] value and then
> > remove 'b' from its value list even if 'a' is not present in that
> > list. So at the end I would like my dictionary to end with:
> >
> > d={(1,'23A'):['a','c','d'], (1,'24A'):['c','d'], (2,'23A'):['a'],
> > (2,'24A'):['a','c','d']}
> >
> > Initially I did the following:
> >
> > for k,v in d.items():
> >     u=k[0]
> >     b=k[1]
> >     if 'a' in v:
> >         for k,v in d.items():
> >             if k[0] == u:
> >                 for values in v:
> >                     if values == 'b':
> >                         v.remove(values)
> >
> > However this will be a very big dictionary and it ended up running for
> > a very long time - in fact I had to kill the process.
>
> The main problem is the nested loops. Every time you find an 'a' you
> search the whole dict for items starting with matching keys.  There
> are some smaller problems as well - you could use d.iteritems()
> instead of d.items() to avoid creating intermediate lists, and there
> are better ways to remove 'b' from v.
>
> The first thing I would do is reduce this to two passes over the dict
> - the first pass can find the keys, the second to delete 'b' from the
> list. For example,
> toRemove=set(k[0] for k, v in d.iteritems() if 'a' in v)
> for k,v in d.iteritems():
>     if k[0] in toRemove:
>         try:
>             v.remove('b')
>         except ValueError:
>             pass
>
> If this is too slow, consider splitting up your keys - make a
> two-level dict so you can find all the keys starting with a particular
> value by a dict lookup instead of search.
>
> Kent
>
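
One way to read the two-level dict idea is sketched below: the first element of each key tuple becomes an outer key, and the second element an inner key, so all entries sharing k[0] are found with a single lookup. The sample data follows the thread, with the labels written as strings; the name two_level is made up for illustration, and items() is used so the code also runs on Python 3.

```python
from collections import defaultdict

# Sample data from the thread, with labels quoted so it is valid Python.
d = {
    (1, '23A'): ['a', 'b', 'c', 'd'],
    (1, '24A'): ['b', 'c', 'd'],
    (2, '23A'): ['a', 'b'],
    (2, '24A'): ['a', 'b', 'c', 'd'],
}

# Outer key is k[0], inner key is k[1]; the lists are shared with d.
two_level = defaultdict(dict)
for (first, second), values in d.items():
    two_level[first][second] = values

# For each group, one check for 'a', then one pass to drop 'b'.
for first, inner in two_level.items():
    if any('a' in values for values in inner.values()):
        for values in inner.values():
            if 'b' in values:
                values.remove('b')
```

Because the inner dictionaries hold the same list objects as d, removing 'b' here also updates d. Building the structure as two levels from the start would avoid keeping both.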


[Tutor] Randomize SSN value in field

2008-05-22 Thread GTXY20
Hello all,

I will be dealing with an address list where I might have the following:

Name SSN
John 1
John 1
Jane 2
Jill 3

What I need to do is parse the address list and then create a unique random
unidentifiable value for the SSN field like so:

Name SSNrandomvalue
John 1a1b1c1d1
John 1a1b1c1d1
Jane 2a2b2c2d2
Jill 3a3b3c3d3

The unique random value does not have to follow this convention but it needs
to be unique so that I can relate it back to the original SSN when needed.
As opposed to using the random module I was thinking that it would be better
to use either sha or md5. Just curious as to thoughts on the correct
approach.

Thank you in advance.

G.


Re: [Tutor] Randomize SSN value in field

2008-05-22 Thread GTXY20
Thanks all;

Basically I will be given an address list of about 50 million address lines
- this will boil down to approximately 15 million unique people in the list
using SSN as the reference for the primary key. I was concerned that by
using random I would eventually have duplicate values for different SSN's.
After assigning the unique reference for the SSN I need to then forward the
address file (complete 50 million records) to a larger group for analysis
and assign the unique reference in place of the SSN.

I will take a look at the various options I have with sha and md5 along with
the information regarding cryptography to see what I can come up with.

Alternatively I guess I could parse the address list and build a dictionary
where the key is the SSN and the value starts at 1 and is incremented as I
add additional SSN keys to the dictionary. I would hold onto this dictionary
for reference as information is fed back to me.
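
A minimal sketch of that sequence-number idea follows. The function name assign_ids and the (name, ssn) row layout are assumptions made for illustration; the real file format is not given in the thread.

```python
def assign_ids(rows):
    """Map each distinct SSN to a sequence number, in order of first appearance."""
    ssn_to_id = {}
    out = []
    for name, ssn in rows:
        if ssn not in ssn_to_id:
            # The next unused number is one more than the count so far.
            ssn_to_id[ssn] = len(ssn_to_id) + 1
        out.append((name, ssn_to_id[ssn]))
    return out, ssn_to_id

rows = [("John", "1"), ("John", "1"), ("Jane", "2"), ("Jill", "3")]
masked, xref = assign_ids(rows)
```

The xref dictionary is the cross-reference to keep private; masked is what would be forwarded for analysis.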

With respect to a potentially large dictionary object can you suggest
efficient ways of handling memory when working with large dictionary
objects?

As always your help is much appreciated.

G.

On Thu, May 22, 2008 at 1:39 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:

> On Thu, May 22, 2008 at 12:14 PM, GTXY20 <[EMAIL PROTECTED]> wrote:
> > Hello all,
> >
> > I will be dealing with an address list where I might have the following:
> >
> > Name SSN
> > John 1
> > John 1
> > Jane 2
> > Jill 3
> >
> > What I need to do is parse the address list and then create a unique
> > random unidentifiable value for the SSN field
>
> > The unique random value does not have to follow this convention but it
> > needs to be unique so that I can relate it back to the original SSN when
> > needed. As opposed to using the random module I was thinking that it
> > would be better to use either sha or md5. Just curious as to thoughts on
> > the correct approach.
>
> How are you relating back to the SSN? Are you keeping a
> cross-reference? If so, you might just assign sequence numbers for the
> unidentifiable value. If you want the key itself to be convertible
> back to the SSN (which wouldn't work with random values) you will need
> some cryptography. If you want a unique key that won't collide with
> other keys then sha or md5 is a better bet than random.
>
> Kent
>
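
The hash-based option mentioned above could be sketched as follows. This goes a little beyond the thread: it uses a keyed hash (the standard hmac module with SHA-256) instead of plain sha or md5, on the assumption that the tokens should not be recoverable by someone who simply hashes every possible SSN. SECRET_KEY and pseudonym are hypothetical names, and the key shown is a placeholder.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be generated once and kept private.
SECRET_KEY = b"replace-with-a-private-random-key"

def pseudonym(ssn):
    """Return a stable hex token for an SSN using a keyed SHA-256 hash."""
    return hmac.new(SECRET_KEY, ssn.encode("utf-8"), hashlib.sha256).hexdigest()

# Cross-reference kept by the data owner so tokens can be mapped back.
xref = {}
for name, ssn in [("John", "1"), ("John", "1"), ("Jane", "2")]:
    xref[pseudonym(ssn)] = ssn
```

The same SSN always yields the same token, so repeated rows for one person stay linked, and the xref dictionary is what relates a token back to the original SSN.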


[Tutor] Replace sequence in list - looking for direction

2008-06-16 Thread GTXY20
Hello all,

Thanks in advance for any thoughts, direction, and guidance.

Let's say I have a list like the following:

a = ['a1','a2','a3','a4','a5','a6']

and then I have dictionary like the following:

b = {'a1,a2,a3':'super'}

I need some direction and thoughts on how I might search list a for the
sequence given by the key in dict b and replace the occurrence with
the value from dict b. At the end I would like to have list a represented
as:

a = ['super', 'a4', 'a5', 'a6']
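
One possible starting point is sketched below. It assumes each key in b is a comma-separated run of items that must appear consecutively in a; the function name replace_sequences is made up for illustration.

```python
def replace_sequences(items, mapping):
    """Replace each run in items that matches a comma-joined key of mapping."""
    # Turn 'a1,a2,a3' into ['a1', 'a2', 'a3'] once, up front.
    patterns = [(key.split(','), value) for key, value in mapping.items()]
    result = []
    i = 0
    while i < len(items):
        for pattern, value in patterns:
            if items[i:i + len(pattern)] == pattern:
                result.append(value)
                i += len(pattern)
                break
        else:
            # No pattern starts here; keep the item and move on.
            result.append(items[i])
            i += 1
    return result

a = ['a1', 'a2', 'a3', 'a4', 'a5', 'a6']
b = {'a1,a2,a3': 'super'}
a = replace_sequences(a, b)
```

If several keys could match at the same position, the first one found wins; sorting the patterns by length would be one way to prefer longer matches.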

Thanks.

G.