Re: [Tutor] Reading CSV files in Pandas

2013-10-21 Thread Mark Lawrence

On 21/10/2013 04:05, Sivaram Neelakantan wrote:


you could try the following newsgroup or mailing list for more
specialised help.

  gmane.org:gmane.comp.python.pydata

  sivaram
  --


Thanks for this, it explains why I couldn't find pandas there :)

--
Python is the second best programming language in the world.
But the best has yet to be invented.  Christian Tismer

Mark Lawrence

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] string list in alphabetical!

2013-10-21 Thread Sammy Cornet
Thank you for your help, Steven! I intend to correct it. But I would also like 
to know whether I wrote the program correctly to get the output that I'm looking for.



On Oct 20, 2013, at 19:22, "Sammy Cornet"  wrote:

> Hello!
> 
> I'm using Python 2.7.5 and I'm trying to write a program that works with a 
> file named unsorted_fruits.txt, which contains a list of 26 fruits, each one 
> with a name that begins with a different letter of the alphabet. My program's 
> goal is to read in the fruits from unsorted_fruits.txt and write them out in 
> alphabetical order to a file named sorted_fruits.txt, using a while loop.
> 
> For this assignment you must incorporate the use of a list, for loop and / or 
> while loop.
> 
> so here is what I have on my script:
> 
> 
> 
> infile = open('Desktop/unsorted_fruits.docx' ,"r")
> 
> outfile = open('Desktop/sorted_fruits.docx', 'w')
> 
> 
> 
> def find():
> 
> index = 0
> 
> while index < 26:
> 
> list < 26
> 
> list = ["a", "b", "c", "e", "f", "g", "h", "i", "j", "k", "l", "m", 
> "n" "o", "p", "q", "r", "s", "t", "u", "v", "w", "z", "y", "z"]
> 
> if index[0] == list[0]:
> 
> infile = list + 1
> 
> print infile
> 
> index += 1
> 
> infile.close() 
> 
> outfile.close() 
> 
> 
> And here is my output:
> 
> Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit (Intel)] on 
> win32
> Type "copyright", "credits" or "license()" for more information.
> >>>  RESTART 
> >>> 
> 
> Traceback (most recent call last):
>   File "F:/7.real.py", line 1, in <module>
> infile = open('Desktop/unsorted_fruits.docx' ,"r")
> IOError: [Errno 2] No such file or directory: 'Desktop/unsorted_fruits.docx'
> >>> 
> 
> 
> I'm not sure if I wrote the program correctly, but I can't get my file, which 
> is on my desktop, into the program. Can someone help me please?
> 
> 
> 
> 
> 


Re: [Tutor] string list in alphabetical!

2013-10-21 Thread Alan Gauld

On 21/10/13 01:16, Sammy Cornet wrote:


so here is what I have on my script:


OK some comments below...


infile = open('Desktop/unsorted_fruits.docx' ,"r")
outfile = open('Desktop/sorted_fruits.docx', 'w')


You probably want to use txt files.
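
A side note on the IOError in the original post: a relative path such as 'Desktop/unsorted_fruits.docx' is resolved against the current working directory, which was probably not the home folder when the script ran from F:/. The following is only a minimal sketch, assuming the Desktop folder sits directly under the user's home directory; desktop_path is a made-up helper name, not something from the original script.

```python
import os

def desktop_path(filename):
    """Return a path to `filename` inside the user's Desktop folder.

    Assumption: the Desktop folder sits directly under the home directory.
    """
    home = os.path.expanduser("~")
    return os.path.join(home, "Desktop", filename)

infile_name = desktop_path("unsorted_fruits.txt")
print(infile_name)
# False here means open(infile_name) would fail with "No such file or directory".
print(os.path.exists(infile_name))
```

Printing the path before opening it is a cheap way to check where Python is actually looking.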


def find():
 index = 0
 while index < 26:
 list < 26


You haven't created list yet so you can't compare it to anything. But 
even if you had this line does nothing useful.



 list = ["a", "b", "c", "e", "f", "g", "h", "i", "j", "k", "l",
"m", "n" "o", "p", "q", "r", "s", "t", "u", "v", "w", "z", "y", "z"]


OK, now you've created list... Although it's a bad choice of name because 
it shadows the built-in list() function in Python. So you can't now convert 
things to lists.
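
To make the shadowing point concrete, here is a small sketch. The function name shadowed and the sample strings are made up for illustration.

```python
letters = "abc"
print(list(letters))        # the built-in list() converts the string to a list

def shadowed():
    list = ["a", "b", "c"]  # this local name now hides the built-in
    try:
        return list("xyz")  # this calls the list object, not the built-in
    except TypeError as err:
        return "TypeError: %s" % err

print(shadowed())
```

Picking another name, such as alphabet, avoids the problem entirely.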



 if index[0] == list[0]:


This will fail since index is a number and list is a list. You can't 
index a number. I'm not sure what you thought you were comparing?



 infile = list + 1


I don't know what you think this does, but what it does in practice is 
throw away your open file and try to add 1 to your list, which is an 
error. You can't add lists and numbers.



 print infile



 index += 1

 infile.close()
 outfile.close()


Since you haven't read anything from infile or written anything to 
outfile this doesn't achieve much.


I think you need to sit down with a pen and paper and work out how you 
would solve this problem then convert that to Python. As it is you have 
a long way to go. I also think you may be making the exercise much 
harder than it should be. Take a look at the documentation for the 
sort() method of lists, that should help.
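
As a hint only, and not a solution to the assignment, this is roughly how sort() and sorted() behave on a small list. The fruit names are made-up sample data.

```python
fruits = ["cherry", "apple", "banana"]   # made-up sample data
fruits.sort()                            # sorts the list in place and returns None
print(fruits)

# sorted() leaves the original list alone and returns a new, ordered list.
original = ["cherry", "apple", "banana"]
ordered = sorted(original)
print(original)
print(ordered)
```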


HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos



Re: [Tutor] string list in alphabetical!

2013-10-21 Thread Steven D'Aprano
On Sun, Oct 20, 2013 at 09:15:05PM -0500, Sammy Cornet wrote:
> Thank you for help Steven! I intend to correct it. But also I would 
> like to know if I wrote the correctly in order to the output that I'm 
> looking for?

I don't know, I didn't study your code in that much detail.

Why don't you fix the problems you already know about, then try running 
it, and see if it works as you expect?

That is the normal process of programming: 

1) write some code 
2) fix the bugs until it will run
3) test if it works correctly
4) repeat until done


-- 
Steven


Re: [Tutor] string list in alphabetical!

2013-10-21 Thread Lukas Nemec

On 10/21/2013 01:16 PM, Steven D'Aprano wrote:

On Sun, Oct 20, 2013 at 09:15:05PM -0500, Sammy Cornet wrote:

Thank you for help Steven! I intend to correct it. But also I would
like to know if I wrote the correctly in order to the output that I'm
looking for?

I don't know, I didn't study your code in that much detail.

Why don't you fix the problems you already know about, then try running
it, and see if it works as you expect?

That is the normal process of programming:

1) write some code
2) fix the bugs until it will run
3) test if it works correctly
4) repeat until done



I'd like to upgrade that process :D ...

1) think about your problem
2) if there are some helpful libraries that can make it way easier, use them
3) write some code
4) fix the bugs until it'll run
5) write unittests
6) test if it works correctly and if unittests pass
7) repeat until done
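
For steps 5 and 6, a unittest can be very small. The sketch below assumes a hypothetical helper called sort_names; both the helper and the test data are invented for illustration.

```python
import unittest

def sort_names(names):
    """Hypothetical helper: return the names in alphabetical order."""
    return sorted(names)

class SortNamesTest(unittest.TestCase):
    def test_sorts_alphabetically(self):
        self.assertEqual(sort_names(["pear", "apple", "fig"]),
                         ["apple", "fig", "pear"])

    def test_does_not_modify_input(self):
        names = ["b", "a"]
        sort_names(names)
        self.assertEqual(names, ["b", "a"])

# Run the tests programmatically so the script keeps going afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SortNamesTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

In a real project you would more likely put the tests in their own file and run them with python -m unittest.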

Bye :)



Re: [Tutor] string list in alphabetical!

2013-10-21 Thread Sammy Cornet
I appreciate your help and advice concerning my challenge. In fact, I'm 
confused because I was sent a lot more lessons than I usually have 
for each week. I will try it the way you told me. Thank you for all of your emails.




[Tutor] Tutor] string list in alphabetical!

2013-10-21 Thread Siva Cn
Hi Sammy,

Try this; it may help you!

--
def sort_file1_to_file2(file1, file2):
    """."""
    input_content = []
    with open(file1, 'r') as fp:
        input_content = fp.read()

    input_content = input_content.splitlines()

    _dict = {ele[0].lower(): ele for ele in input_content}

    out_content = "\n".join([_dict[chr(idx)]
                             for idx in range(97, 123)
                             if chr(idx) in _dict])

    with open(file2, 'w') as fp:
        fp.write(out_content)

sort_file1_to_file2('file1.txt', 'file2.txt')


-- Regards --
Siva Cn
Python Developer
http://www.cnsiva.com


Re: [Tutor] Reading CSV files in Pandas

2013-10-21 Thread Danny Yoo
On Sat, Oct 19, 2013 at 7:29 AM, Manish Tripathi 
wrote:
>
> I am trying to import a CSV file in pandas but it throws an error. The
> format of the data when opened in Notepad++ is as follows, with the first
> row being column names:
>
> "End Customer Organization ID,End Customer Organization Name,End Customer
Top Parent Organization ID,End Customer Top Parent Organization
Name,Reseller Top Parent ID,Reseller Top Parent Name,Business,Rev Sum
Division,Rev Sum Category,Product Family,Version,Pricing Level,Summary
Pricing Level,Detail Pricing Level,MS Sales Amount,MS Sales Licenses,Fiscal
Year,Sales Date"
> "11027676,Baroda Western Uttar Pradesh Gramin
Bankgfhgfnjgfnmjmhgmghmghmghmnghnmghnmhgnmghnghngh,4078446,Bank Of
Barodadfhhgfjyjtkyukujkyujkuhykluiluilui;iooi';po'fserwefvegwegf,1809012,""Hcl
Infosystems Ltd - Partnerdghftrutyhb
frhywer5y5tyu6ui7iukluyj,lgjmfgnhfrgweffw"",Server &
CALsdgrgrfgtrhytrnhjdgthjtyjkukmhjmghmbhmgfngdfbndfhtgh,SQL Server &
CALdfhtrhtrgbhrghrye5y45y45yu56juhydsgfaefwe,SQL
CALdhdfthtrutrjurhjethfdehrerfgwerweqeadfawrqwerwegtrhyjuytjhyj,SQL
CALdtrye45y3t434tjkabcjkasdhfhasdjkcbaksmjcbfuigkjasbcjkasbkdfhiwh,2005,Openfkvgjesropiguwe90fujklascnioawfy98eyfuiasdbcvjkxsbhg,Open
Lklbjdfoigueroigbjvwioergyuiowerhgosdhvgfoisdhyguiserhguisrh,""Open
Stddfm,vdnoghioerivnsdflierohgushdfovhsiodghuiohdbvgsjdhgouiwerho"",125.85,1,FY07,12/28/2006"
> "12835756,Uttam Strips Pvt Ltd,12835756,Uttam Strips Pvt
Ltd,12565538,Redington C/O Fortis Financial Services Ltd,MBS,Dynamics
ERP,Dynamics NAV,Dynamics NAV Business Essentials,Non-specific,Other,MBS
SA,MBS New Customer Enhanc. Def,0,0,FY09,9/15/2008"
> "12233135,Bhagwan Singh Tondon,12233135,Bhagwan Singh Tondon,2652941,H B
S Systems Pvt Ltd,Server & CAL,SQL Server & CAL,SQL CAL,SQL
CAL,Non-specific,Open,Open L&SA,Deferred Open L&SA - New,0,0,FY09,9/15/2008"
> "11602305,Maya Academy Of Advanced Cinematics,9750934,Maya Entertainment
Ltd,336146,Embee Software Pvt Ltd,Server & CAL,Windows Server & CAL,Windows
Server HPC,Windows Compute Cluster Server,Non-specific,Open,Open V/MYO -
Rec,OLV Perpet L&SA Recur-Def,0,0,FY09,9/25/2008"
> "13336009,Remiel Softech Solution Pvt Ltd,13336009,Remiel Softech
Solution Pvt Ltd,13335482,Redington C/O Remiel Softech Solutions Pvt
Ltd,MBS,Dynamics ERP,Dynamics NAV,Dynamics NAV Business
Essentials,Non-specific,Other,MBS SA,MBS New Customer Enhanc.
Def,0,0,FY09,12/23/2008"
> "7872800,Science Application International Corporation,2839760,GOVERNMENT
OF KARNATAKA,10237455,Cubic Computing P.L,Server & CAL,SQL Server & CAL,SQL
Server Standard,SQL Server Standard Edition,Non-specific,Open,Open
SA/UA,Deferred Open SA - Renewal,0,0,FY09,1/15/2009"
> "13096361,Pratham Software Pvt Ltd,13096361,Pratham Software Pvt
Ltd,10133086,Krap Computer,Information Worker,Office,Office Standard /
Basic,Office Standard,2007,Open,Open L,Open Std,7132.44,28,FY09,9/24/2008"
> "12192276,Texmo Precision Castings,12192276,Texmo Precision
Castings,4059430,Quadra Systems. - Partner,Server & CAL,Windows Server &
CAL,Windows Standard Server,Windows Server Standard,Non-specific,Open,Open
L&SA,Deferred Open L&SA - New,0,0,FY09,11/15/2008"
>
> Kindly note that the same file, when double-clicked in CSV format,
> opens in Excel with comma-separated values BUT with NO quotation marks
> around each line as shown in Notepad++.
>
> I have used encoding as UTF-8 which gives the following error:

Questions:

* Where is this data coming from?
* Who or what is generating this file?
* Is it being automatically generated, or is someone manually typing in the
file's content?


Knowing the answers to these questions may help to isolate what the actual
problem is.

The source of this input, if they are a good, responsible party, should be
saying up front how to interpret its bytes.  Otherwise you are being put
into a position of having to guess the proper interpretation.  Guessing can
be fun sometimes, I suppose, but I personally don't like doing it unless I
have no choice.


Re: [Tutor] Reading CSV files in Pandas

2013-10-21 Thread Danny Yoo
>
> * Where is this data coming from?
> * Who or what is generating this file?


Just to be more specific about this: I have a very strong suspicion that
whatever is generating the input that you're trying to read is doing
something ad-hoc with regard to the CSV file format.  Knowing what generated
the file, whether it be Excel or some custom script, is very helpful in
diagnosing where the problem originates.


Your suspicion about the quotes around entire rows:

> Does it have to do with the "" marks present before each line in the data?

sounds reasonable.  I expect quotes around individual fields, but not
around entire rows.  Such a feature sounds anomalous because it doesn't fit
the description of known CSV formats:

http://en.wikipedia.org/wiki/Comma-separated_values
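
If the file really does wrap every row in one pair of quotes and doubles the quotes inside, one possible workaround is to parse each row twice with the standard csv module. This is only a sketch under that assumption, written for Python 3; the sample text and the name unwrap_rows are made up and are not taken from the real file.

```python
import csv
import io

# Made-up sample in the same shape as the quoted data: every row is wrapped
# in one pair of quotes, and quotes inside the row are doubled.
raw = (
    '"ID,Name,Amount"\n'
    '"1,""Acme, Ltd"",125.85"\n'
    '"2,Plain Name,0"\n'
)

def unwrap_rows(text):
    """Parse text whose rows are each quoted as a single CSV field."""
    rows = []
    for outer in csv.reader(io.StringIO(text)):
        if not outer:
            continue  # skip blank lines
        # The first pass yields one field holding the real row; parse it again.
        inner = next(csv.reader([outer[0]]))
        rows.append(inner)
    return rows

for row in unwrap_rows(raw):
    print(row)
```

The second pass is what turns the doubled quotes back into ordinary field quoting, so a field containing a comma survives as one value.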


Re: [Tutor] Tutor] string list in alphabetical!

2013-10-21 Thread bob gailer

On 10/21/2013 12:16 PM, Siva Cn wrote:

Hi Sammy,

Try this; it may help you!

--
def sort_file1_to_file2(file1, file2):
    """."""
    input_content = []
    with open(file1, 'r') as fp:
        input_content = fp.read()

    input_content = input_content.splitlines()

    _dict = {ele[0].lower(): ele for ele in input_content}

    out_content = "\n".join([_dict[chr(idx)]
                             for idx in range(97, 123)
                             if chr(idx) in _dict])

    with open(file2, 'w') as fp:
        fp.write(out_content)

sort_file1_to_file2('file1.txt', 'file2.txt')

I am surprised to see this program. It seems unnecessarily complex and 
somewhat hard to read, especially for a rank beginner (the OP). It also 
has an unnecessary statement (input_content = []).

IMHO it is more customary and a lot simpler to process the lines in a 
file thusly:


  for line in open(file1, 'r'):
      process the line
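
A runnable version of that loop might look like the sketch below. It writes its own small temporary file so that it is self-contained; the file contents and the variable names are made up.

```python
import os
import tempfile

# Write a small made-up file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "file1.txt")
with open(path, "w") as fp:
    fp.write("pear\napple\nfig\n")

names = []
with open(path, "r") as fp:
    for line in fp:                  # the file object yields one line at a time
        names.append(line.strip())   # strip() drops the trailing newline

print(names)
```

Using with here closes the file automatically once the loop is finished.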

--
Bob Gailer
919-636-4239
Chapel Hill NC



Re: [Tutor] string list in alphabetical!

2013-10-21 Thread Albert-Jan Roskam


On Mon, 10/21/13, Lukas Nemec  wrote:

 Subject: Re: [Tutor] string list in alphabetical!
 To: tutor@python.org
 Date: Monday, October 21, 2013, 1:21 PM
 
 On 10/21/2013 01:16 PM, Steven D'Aprano wrote:
 > On Sun, Oct 20, 2013 at 09:15:05PM -0500, Sammy Cornet wrote:
 >> Thank you for help Steven! I intend to correct it. But also I would
 >> like to know if I wrote the correctly in order to the output that I'm
 >> looking for?
 > I don't know, I didn't study your code in that much detail.
 >
 > Why don't you fix the problems you already know about, then try running
 > it, and see if it works as you expect?
 >
 > That is the normal process of programming:
 >
 > 1) write some code
 > 2) fix the bugs until it will run
 > 3) test if it works correctly
 > 4) repeat until done
 >
 >
 I'd like to upgrade that process :D ...
 
 1) think about your problem
 2) if there are some helpful libraries that can make it way easier, use them
 3) write some code
 4) fix the bugs until it'll run
 5) write unittests
 6) test if it works correctly and if unittests pass
 7) repeat until done
 
 
step 5 might also precede step 3




[Tutor] How to add a new column to a hierarchical dataframe grouped by groupby

2013-10-21 Thread Boris Vladimir Comi
The following script tries to calculate the resultant average direction 
and magnitude of the wind. My monthly dataframe has the following columns:

data

   Fecha       Hora   DirViento  MagViento  Temperatura  Humedad  PreciAcu
0  2011/07/01  00:00  318        6.6        21.22        100      1.7
1  2011/07/01  00:15  342        5.5        21.20        100      1.7
2  2011/07/01  00:30  329        6.6        21.15        100      4.8
3  2011/07/01  00:45  279        7.5        21.11        100      4.2
4  2011/07/01  01:00  318        6.0        21.16        100      2.5

The first thing I do is convert the DirViento column to radians:

# assumes pi and around come from numpy, e.g. from numpy import pi, around
dir_rad=[]
for i in range(0, len(data['DirViento'])):
    dir_rad.append(data['DirViento'][i]*(pi/180.0))
data['DirViento']=around(dir_rad,1)

Now I compute the wind components u and v and add them to data:

# assumes sin and cos also come from numpy
Uviento=[]
Vviento=[]
for i in range(0,len(data['MagViento'])):
    Uviento.append(data['MagViento'][i]*sin(data['DirViento'][i]))
    Vviento.append(data['MagViento'][i]*cos(data['DirViento'][i]))
data['u']=around(Uviento,1)
data['v']=around(Vviento,1)


data
Data columns:
Fecha          51  non-null values
Hora           51  non-null values
DirViento      51  non-null values
MagViento      51  non-null values
Temperatura    51  non-null values
Humedad        51  non-null values
PreciAcu       51  non-null values
u              51  non-null values
v              51  non-null values
dtypes: float64(6), int64(2), object(2)

Now I index the dataframe and group it by date:

# inplace=True modifies data and returns None, so the result is not assigned
data.set_index(['Fecha','Hora'], inplace=True)

grouped = data.groupby(level=0)

data['u']

Fecha       Hora
2011/07/01  00:00     -4.4
            00:15     -1.7
            00:30     -3.4
            00:45     -7.4
            01:00     -4.0
2011/07/02  00:00     -4.5
            00:15     -4.2
            00:30     -7.6
            00:45     -3.8
            01:00     -2.0
2011/07/03  00:00     -6.3
            00:15    -13.7
            00:30     -0.3
            00:45     -2.5
            01:00     -2.7

Now I get the resultant wind direction for each day:

 grouped.apply(lambda x: (scipy.arctan2(mean(x['u']), mean(x['v']))) / (pi/180.0))

 Fecha
 2011/07/01   -55.495677
 2011/07/02   -39.176537
 2011/07/03   -51.416339

I need to apply the following conditions to the result obtained:

for i in grouped.apply(lambda x: (scipy.arctan2(mean(x['u']), mean(x['v']))) / (pi/180.0)):
    if i < 180:
        i = i + 180
    elif i > 180:
        i = i - 180
    print i

124.504323033
140.823463279
128.5836605
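
For anyone following the arithmetic, here is the same calculation in plain Python without pandas. It is only a sketch: the five (direction, magnitude) pairs are the sample rows as I read them from the table at the top of this post, the function name resultant_direction is made up, and unlike the script above it does not round the angle in radians to one decimal first.

```python
import math

# (direction in degrees, magnitude) for the five sample rows, as read
# from the table in the post; the real dataframe has 51 rows.
samples = [(318, 6.6), (342, 5.5), (329, 6.6), (279, 7.5), (318, 6.0)]

def resultant_direction(rows):
    """Mean-vector direction in degrees, using the post's u/v convention."""
    u = [mag * math.sin(math.radians(deg)) for deg, mag in rows]
    v = [mag * math.cos(math.radians(deg)) for deg, mag in rows]
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    angle = math.degrees(math.atan2(mean_u, mean_v))   # between -180 and 180
    # Same adjustment as the loop in the post: shift by 180 degrees.
    if angle < 180:
        angle += 180
    elif angle > 180:
        angle -= 180
    return angle

print(round(resultant_direction(samples), 1))
```

Because atan2 never returns more than 180 degrees, the second branch can never run, which may be worth double-checking against the intended convention.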

How can I add the previous result, as DirRes, to the following aggregation?

stat_cea = grouped.agg({'DirRes':np.mean,'Temperatura':np.mean,'Humedad':np.mean,'PreciAcu':np.sum})



stat_cea
Fecha       DirRes  Humedad    PreciAcu  Temperatura

2011/07/01          100.00     30.4      21.367059
2011/07/02          99.823529  18.0      21.841765
2011/07/03          99.823529  4.0       21.347059
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Reading CSV files in Pandas

2013-10-21 Thread Danny Yoo
On Mon, Oct 21, 2013 at 11:57 AM, Manish Tripathi wrote:

> It's pipeline data so must have been generated through Siebel and sent as
> excel csv.
>
>
I am assuming that you are talking about "Siebel Analytics", some kind of
analysis software from Oracle:

http://en.wikipedia.org/wiki/Siebel_Systems

That would be fine, except that knowing it comes out of Siebel is no
guarantee that the output you're consuming is well-formed Excel CSV.  For
example, I see things like this:

http://spendolini.blogspot.com/2006/04/custom-export-to-csv.html

where the generated output is "ad-hoc".



---

Hmmm... but let's assume for the moment that your data is ok.  Could the
problem be in pandas?  Let's follow this line of logic, and see where it
takes us.

Given the structure of the error you're seeing, I have to assume that
pandas is trying to decode the bytes, and runs into an issue, though the
exact position where it's running into an error is in question.  In fact,
looking at:

https://github.com/pydata/pandas/blob/master/pandas/io/parsers.py#L1357

for example, the library appears to be trying to decode line-by-line under
certain situations.  If it runs into an error, it will report an offset
into a particular line.

Wow.  That can be very bad, if I'm reading that right.  It does not give
that offset from the perspective of the whole file.  But it's worse because
it's unsound.  The code _should_ be doing the decoding from the perspective
of the whole file, not at the level of single lines.  It needs to be using
codecs.open(), and let codecs.open() handle the details of
byte->unicode-string decoding.  Otherwise, by that time, it's way too late:
we've just taken an interpretation of the bytes that's potentially invalid.
 Example: if we're working with UTF-16, and we got into this code path,
it'd be really bad.
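
To illustrate why decoding line by line can go wrong with a multi-byte encoding, here is a small sketch. It is not the pandas code, just a made-up string encoded as UTF-16, decoded once as a whole and once after splitting the raw bytes on newlines.

```python
text = u"a,b\nc,d\n"
data = text.encode("utf-16-le")      # every character becomes two bytes

# Decoding the whole byte string at once round-trips cleanly.
whole = data.decode("utf-16-le")

# Splitting the *bytes* on b"\n" first cuts two-byte characters in half.
pieces = data.split(b"\n")
results = []
for piece in pieces:
    try:
        results.append(piece.decode("utf-16-le"))
    except UnicodeDecodeError as err:
        results.append("error: %s" % err.reason)

print(whole == text)
print(results)
```

Only the first piece survives intact; the later pieces start in the middle of a character, so they either fail or decode to the wrong text.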


It's hard to tell whether or not we're taking that code path.  I'm
following the definition of read_csv from:

https://github.com/pydata/pandas/blob/master/pandas/io/parsers.py#L409

to:

   https://github.com/pydata/pandas/blob/master/pandas/io/parsers.py#L282

to:

https://github.com/pydata/pandas/blob/master/pandas/io/parsers.py#L184

to:

https://github.com/pydata/pandas/blob/master/pandas/io/common.py#L100



Ok, at that point, they appear to try to decode the entire file.  Somewhat
good so far.  Though, technically, pandas should be using codecs.open():


http://docs.python.org/2/howto/unicode.html#reading-and-writing-unicode-data

and because they aren't, they appear to suck the entire file into memory
with StringIO.  Yikes.


Now the pandas library must make sure _not_ to decode() again, because
decoding is not an idempotent operation.

As a concrete example:

##
>>> 'foobar'.decode('utf-16')
u'\u6f66\u626f\u7261'
>>> 'foobar'.decode('utf-16').decode('utf-16')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/encodings/utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2:
ordinal not in range(128)
##

This is reminiscent of the kind of error you're encountering, though I'm
not sure if this is the same situation.



Unfortunately, I'm running out of time to analyze this further.  If you
could upload your data file somewhere, someone else here may have time to
investigate the error you're seeing in more detail.  From reading the
Pandas code, I'm discouraged by the code quality: I do think that there's a
potential of a bug in the library.  The code is a heck of a lot more
complicated than I think it needs to be.


Re: [Tutor] How to add a new column to a hierarchical dataframe grouped by groupby

2013-10-21 Thread Mark Lawrence

On 21/10/2013 20:38, Boris Vladimir Comi wrote:

This is the Python tutor mailing list so it is essentially aimed at 
beginners to programming and/or Python, so please ask your pandas 
specific question on the google group here 
https://groups.google.com/forum/#!forum/pydata or the equivalent mailing 
list at gmane.comp.python.pydata.


--
Python is the second best programming language in the world.
But the best has yet to be invented.  Christian Tismer

Mark Lawrence



Re: [Tutor] Reading CSV files in Pandas

2013-10-21 Thread Mark Lawrence

On 21/10/2013 22:42, Danny Yoo wrote:




This question has now been placed on the correct forum here 
http://article.gmane.org/gmane.comp.python.pydata/2294 so I see little 
sense in us attempting to follow it up.


--
Python is the second best programming language in the world.
But the best has yet to be invented.  Christian Tismer

Mark Lawrence



Re: [Tutor] Reading CSV files in Pandas

2013-10-21 Thread Manish Tripathi
It's pipeline data so must have been generated through Siebel and sent as
excel csv.


On Mon, Oct 21, 2013 at 11:32 PM, Danny Yoo  wrote:

> >
> > * Where is this data coming from?
> > * Who or what is generating this file?
>
>
> Just to be more specific about this: I have a very strong suspicion that
> whatever is generating the input that you're trying to read is doing
> something ad-hoc with regards to CSV file format.  Knowing what generated
> the file, whether it be Excel, or some custom script, is very helpful in
> diagnosing where the problem's originating from.
>
>
> Your suspicion about the quotes around entire rows:
>
> > Does it have to do with the "" marks present before each line in the
> data?
>
> sounds reasonable.  I expect quotes around individual fields, but not
> around entire rows.  Such a feature sounds anomalous because it doesn't fit
> the description of known CSV formats:
>
> http://en.wikipedia.org/wiki/Comma-separated_values
>
>


Re: [Tutor] Tutor] string list in alphabetical!

2013-10-21 Thread Alan Gauld

On 21/10/13 17:16, Siva Cn wrote:

Hi Sammy,

Try this; it may help you!


Siva,
the list policy is not to provide full solutions for homework-type 
questions. It's better to provide some hints and let the OP figure it 
out for him/herself.

That having been said there are a few issues with the code below...


--
def sort_file1_to_file2(file1, file2):
 """."""


A docstring with a dot is pretty useless, unless there's
some Python magic going on that I'm unaware of


 input_content = []


This is not needed since you initialise it below


 with open(file1, 'r') as fp:
 input_content = fp.read()
 input_content = input_content.splitlines()


It would be easier to use readlines to get each line in
a list. And given we know there are only 26 lines memory
usage is not an issue.


 _dict = {ele[0].lower(): ele for ele in input_content}

 out_content = "\n".join([_dict[chr(idx)]
  for idx in range(97, 123)
  if chr(idx) in _dict])


And this is way more complicated than need be. We know the lines start 
with unique letters of the alphabet, so we can just use the built-in 
sort() method or sorted() function to get


out_content = "".join(sorted(input_content))  # readlines() keeps the trailing \n

If the first letters are mixed case, that would need a slight tweak to 
account for it, but there is no suggestion from the OP that that is an 
issue.
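
For completeness, the "slight tweak" for mixed case could be a key function. This is a sketch with made-up sample lines, shaped the way readlines() would return them.

```python
# Made-up sample, as readlines() would return it (newlines kept).
lines = ["banana\n", "apple\n", "Cherry\n"]

# Default ordering compares code points, so uppercase sorts before lowercase.
print(sorted(lines))

# key=str.lower compares lowercased copies but keeps the original strings.
out_content = "".join(sorted(lines, key=str.lower))
print(out_content)
```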


 with open(file2, 'w') as fp:
 fp.write(out_content)


HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos



Re: [Tutor] string list in alphabetical!

2013-10-21 Thread Steven D'Aprano
On Mon, Oct 21, 2013 at 01:21:59PM +0200, Lukas Nemec wrote:
> On 10/21/2013 01:16 PM, Steven D'Aprano wrote:

> >That is the normal process of programming:
> >
> >1) write some code
> >2) fix the bugs until it will run
> >3) test if it works correctly
> >4) repeat until done
> >
> >
> I'd like to upgrade that process :D ...
> 
> 1) think about your problem
> 2) if there are some helpful libraries that can make it way easier, use them
> 3) write some code
> 4) fix the bugs until it'll run
> 5) write unittests
> 6) test if it works correctly and if unittests pass
> 7) repeat until done


Heh, very true! But the most important step is step 1, think about your 
problem. The biggest mistake in programming is to start writing code 
without thinking the problem through first. Instead, a good start is to 
think about how you would solve the problem if you were doing it by 
hand.


-- 
Steven


[Tutor] [OT] Programming practice was: Re: string list in alphabetical!

2013-10-21 Thread Alan Gauld

On 22/10/13 00:07, Steven D'Aprano wrote:


I'd like to upgrade that process :D ...

1) think about your problem
2) if there are some helpful libraries that can make it way easier, use them
3) write some code
4) fix the bugs until it'll run
5) write unittests
6) test if it works correctly and if unittests pass
7) repeat until done



Heh, very true! But the most important step is step 1,


I agree. I've recently started coaching the son of a friend in computing 
for his new school (he is effectively a year behind his new classmates). 
They use VB6 but at a level I can cope with! :-)

The interesting thing however is that the schools have not taught
any kind of approach to problem solving, they just give a homework 
assignment and expect them to produce code. My friend wants to
dive into Visual Studio to start work immediately. I've been
trying to get him to adopt a workflow where he writes on paper
an informal "use case" description of the solution and if
necessary a pseudo code design. But it's been a real challenge
to get him to slow down and understand exactly what he is being
asked to do before diving into code. (Some of that is just
natural youthful impatience, but mostly it's lack of instruction
in an alternative! :-)

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos



Re: [Tutor] [OT] Programming practice was: Re: string list in alphabetical!

2013-10-21 Thread Danny Yoo
>
>
> I agree. I've recently started coaching the son of a friend in computing
> for his new school (he is effectively a year behind
> his new classmates). They use VB6 but at a level I can cope with! :-)
>
> The interesting thing however is that the schools have not taught
> any kind of approach to problem solving, they just give a homework
> assignment and expect them to produce code.



I feel like we've had this conversation a long, long time ago.  :P

This is the sort of thing that we should expect out of math classes.
Polya's "How to Solve It" gives an approach that I wish I had seen during
my own grade schooling. I ended up being exposed to the book from a
recommendation that said something to the effect of: "If you want to be a
good programmer, read this!"


Another place where I'm seeing the act of problem solving being explicitly
taught is in "How to Design Programs":

http://www.ccs.neu.edu/home/matthias/HtDP2e/

(Note: I have worked with the authors of this book.)

They use the term "Design Recipe", which is a similar shape to Steven's
approach.  Though I'd say to Steven: move the "write unittests" part from
point 5 up to right after point 1.  Then we'll be in more agreement.  :P