[Tutor] Reading CSV files in Pandas

2013-10-19 Thread Manish Tripathi
I am trying to import a CSV file into Pandas, but it throws an error. The
format of the data when opened in Notepad++ is as follows, with the first
row being the column names:

"End Customer Organization ID,End Customer Organization Name,End
Customer Top Parent Organization ID,End Customer Top Parent
Organization Name,Reseller Top Parent ID,Reseller Top Parent
Name,Business,Rev Sum Division,Rev Sum Category,Product
Family,Version,Pricing Level,Summary Pricing Level,Detail Pricing
Level,MS Sales Amount,MS Sales Licenses,Fiscal Year,Sales
Date""11027676,Baroda Western Uttar Pradesh Gramin
Bankgfhgfnjgfnmjmhgmghmghmghmnghnmghnmhgnmghnghngh,4078446,Bank Of
Barodadfhhgfjyjtkyukujkyujkuhykluiluilui;iooi';po'fserwefvegwegf,1809012,""Hcl
Infosystems Ltd - Partnerdghftrutyhb
frhywer5y5tyu6ui7iukluyj,lgjmfgnhfrgweffw"",Server &
CALsdgrgrfgtrhytrnhjdgthjtyjkukmhjmghmbhmgfngdfbndfhtgh,SQL Server &
CALdfhtrhtrgbhrghrye5y45y45yu56juhydsgfaefwe,SQL
CALdhdfthtrutrjurhjethfdehrerfgwerweqeadfawrqwerwegtrhyjuytjhyj,SQL
CALdtrye45y3t434tjkabcjkasdhfhasdjkcbaksmjcbfuigkjasbcjkasbkdfhiwh,2005,Openfkvgjesropiguwe90fujklascnioawfy98eyfuiasdbcvjkxsbhg,Open
Lklbjdfoigueroigbjvwioergyuiowerhgosdhvgfoisdhyguiserhguisrh,""Open
Stddfm,vdnoghioerivnsdflierohgushdfovhsiodghuiohdbvgsjdhgouiwerho"",125.85,1,FY07,12/28/2006""12835756,Uttam
Strips Pvt Ltd,12835756,Uttam Strips Pvt Ltd,12565538,Redington C/O
Fortis Financial Services Ltd,MBS,Dynamics ERP,Dynamics NAV,Dynamics
NAV Business Essentials,Non-specific,Other,MBS SA,MBS New Customer
Enhanc. Def,0,0,FY09,9/15/2008""12233135,Bhagwan Singh
Tondon,12233135,Bhagwan Singh Tondon,2652941,H B S Systems Pvt
Ltd,Server & CAL,SQL Server & CAL,SQL CAL,SQL
CAL,Non-specific,Open,Open L&SA,Deferred Open L&SA -
New,0,0,FY09,9/15/2008""11602305,Maya Academy Of Advanced
Cinematics,9750934,Maya Entertainment Ltd,336146,Embee Software Pvt
Ltd,Server & CAL,Windows Server & CAL,Windows Server HPC,Windows
Compute Cluster Server,Non-specific,Open,Open V/MYO - Rec,OLV Perpet
L&SA Recur-Def,0,0,FY09,9/25/2008""13336009,Remiel Softech Solution
Pvt Ltd,13336009,Remiel Softech Solution Pvt Ltd,13335482,Redington
C/O Remiel Softech Solutions Pvt Ltd,MBS,Dynamics ERP,Dynamics
NAV,Dynamics NAV Business Essentials,Non-specific,Other,MBS SA,MBS New
Customer Enhanc. Def,0,0,FY09,12/23/2008""7872800,Science Application
International Corporation,2839760,GOVERNMENT OF
KARNATAKA,10237455,Cubic Computing P.L,Server & CAL,SQL Server &
CAL,SQL Server Standard,SQL Server Standard
Edition,Non-specific,Open,Open SA/UA,Deferred Open SA -
Renewal,0,0,FY09,1/15/2009""13096361,Pratham Software Pvt
Ltd,13096361,Pratham Software Pvt Ltd,10133086,Krap
Computer,Information Worker,Office,Office Standard / Basic,Office
Standard,2007,Open,Open L,Open
Std,7132.44,28,FY09,9/24/2008""12192276,Texmo Precision
Castings,12192276,Texmo Precision Castings,4059430,Quadra Systems. -
Partner,Server & CAL,Windows Server & CAL,Windows Standard
Server,Windows Server Standard,Non-specific,Open,Open L&SA,Deferred
Open L&SA - New,0,0,FY09,11/15/2008"

*Kindly note that the same .csv file, when double-clicked, opens in Excel
with comma-separated values BUT with NO quotation marks around each line,
unlike what Notepad++ shows.*

I first used UTF-8 as the encoding, which gives the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x91 in position
13: invalid start byte
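
For reference, a small sketch of how byte 0x91 behaves under the three encodings tried in this message. The sample bytes are made up for illustration and are not taken from the actual file; the assumption is that the file contains cp1252 "smart quotes", which is a common source of this error.

```python
# Byte 0x91 cannot start a UTF-8 sequence, but cp1252 maps it to a
# left single quotation mark (U+2018) and 0x92 to the right one (U+2019).
raw = b"\x91Open\x92"  # hypothetical bytes, not from the real file

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print("utf-8 failed:", exc.reason)

as_cp1252 = raw.decode("cp1252")  # curly quotes around the word
as_latin1 = raw.decode("latin1")  # never fails, but gives C1 control characters

print(repr(as_cp1252))
print(repr(as_latin1))
```

If the assumption holds, cp1252 is the more faithful choice of the two encodings that "work": latin1 decodes any byte without error, so its success says little about the file's real encoding.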

I then tried encoding='cp1252', and after that latin1:

import pandas as pd

df = pd.read_csv(filename, encoding='cp1252')

or

df = pd.read_csv(filename, encoding='latin1')

With both encodings there was no error, but the data was imported as one
single column rather than as separate columns.

Does this have to do with the "" marks present around each line in the data?
I had a similar CSV file with comma-separated values that did not have
double quotation marks on each line, and it was imported correctly with
both cp1252 and latin1, but not with UTF-8, even though the file was saved
in UTF-8 format in Notepad++. In this case UTF-8 again doesn't work, and the
other two encodings import the data as a single column.
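
One plausible reading of the sample above is that every record is wrapped in a single pair of quotes, with the quotes inside it doubled, so a CSV parser sees each line as one quoted field. Under that assumption, a minimal sketch using the standard csv module parses the data twice: once to unwrap each line, and once to split it into columns. The two-line sample and its column names are hypothetical stand-ins for the real file.

```python
import csv
import io

# Hypothetical sample in the same shape as the file above: each record is
# wrapped in one pair of quotes, and inner quotes are doubled.
sample = (
    '"ID,Name,Reseller,Amount"\n'
    '"1,Acme,""Hcl Ltd - Partner,north"",125.85"\n'
)

# First pass: the csv module sees exactly one quoted field per line.
outer = [row[0] for row in csv.reader(io.StringIO(sample)) if row]

# Second pass: parse each unwrapped line as ordinary CSV.
rows = list(csv.reader(outer))

header, records = rows[0], rows[1:]
print(header)
print(records)
```

With the real file, open(filename, encoding='cp1252', newline='') would take the place of io.StringIO(sample), and pd.DataFrame(records, columns=header) would build the frame. All values are strings at that point, so numeric columns would still need converting.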

Please advise.

Thanks
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Reading CSV files in Pandas

2013-10-19 Thread Alan Gauld

On 19/10/13 15:29, Manish Tripathi wrote:

I am trying to import a csv file in Pandas but it throws an error.


This is the second time I've seen pandas mentioned recently. I really
must go and look it up to find out what it is...


Meanwhile, can you clarify what you mean by importing the csv file?
Are you using a Pandas facility to do this?
Or are you using Python code? And if the latter, are you using the csv
module (strongly recommended for any csv work)?




df=pd.read_csv(filename,encoding='cp1252')


I'm assuming from this that it is some kind of Pandas feature?
Have you tried asking on a Pandas mailing list/forum?
This group is targeted at the standard library and core
language, so it will be a matter of luck if you find
anyone who uses pandas and can help.

OK, I found the Pandas page; it's for data analysis/modelling.
It has a Stack Overflow link for asking questions, so if you
don't get help here you should try that next.

I'm assuming you have read the docs on csv input here:

http://pandas.pydata.org/pandas-docs/stable/io.html#io-read-csv-table

There are way too many options for me to read through
so I'll leave it to those who know!

HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos



Re: [Tutor] Reading CSV files in Pandas

2013-10-19 Thread Mark Lawrence

On 19/10/2013 23:40, Alan Gauld wrote:


This is the second time I've seen pandas mentioned recently. I really
must go and look it up to find out what it is...


Just started out myself at http://pandas.pydata.org/ and at first glance 
it seems quite awesome.


--
Roses are red,
Violets are blue,
Most poems rhyme,
But this one doesn't.

Mark Lawrence



Re: [Tutor] Reading CSV files in Pandas

2013-10-19 Thread Mark Lawrence

On 19/10/2013 15:29, Manish Tripathi wrote:

You are far more likely to get a response to the identical question that
you've already asked on Stack Overflow than you are to get one here.


Mark Lawrence
