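I don't think pandas allows blank (NA) values in integer columns, which would explain the "Integer column has NA values" exception at the bottom of your traceback. You might get better results asking on the pandas list, though -- see http://pandas.pydata.org/community.html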
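
A minimal sketch of what I mean, using a made-up three-row CSV (the column names and values are hypothetical, not taken from the Jeffco file). Asking for a plain integer dtype on a column with a blank cell fails, while a float dtype accepts the blank as NaN. The nullable "Int64" dtype shown at the end is an assumption about your environment: it only exists in pandas releases newer than the one in this traceback.

```python
import io

import pandas as pd

# Hypothetical sample data: the second row has a blank "bedrooms" value.
csv_text = "parcel_id,bedrooms\n1,3\n2,\n3,4\n"

# A plain NumPy integer dtype cannot hold a missing value, so this raises.
try:
    pd.read_csv(io.StringIO(csv_text), dtype={"bedrooms": "int64"})
    int_failed = False
except ValueError as exc:
    int_failed = True
    print("int64 failed:", exc)

# A float dtype works: the blank cell is stored as NaN.
as_float = pd.read_csv(io.StringIO(csv_text), dtype={"bedrooms": "float64"})
print(as_float["bedrooms"].isna().sum(), "missing value(s) read as NaN")

# Newer pandas only (assumption): the nullable integer extension dtype
# keeps integer values and stores the blank as <NA>.
as_nullable = pd.read_csv(io.StringIO(csv_text), dtype={"bedrooms": "Int64"})
print(as_nullable["bedrooms"].dtype)
```

If the column really has to be integer, the usual alternative is to read it as float and fill or drop the blanks before casting.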
-n

On May 13, 2015 2:17 PM, "Vincent Davis" <[email protected]> wrote:
> I have a large (~400mb) csv file I am trying to open in Pandas. When I
> don't specify the dtype and open it with the following command, it appears
> to work.
>
> df = pd.io.parsers.read_csv(CSVFILECLEAN2013, quotechar='"',
>                             low_memory=False, na_values='')
>
> If I try to specify the dtype for each field I get an error but no hint as
> to where I should look. I have "cleaned" the csv by checking that all
> values that should be an int or a float are either blank or can be cast as
> a float or an int. I guess my question is, can I get a more useful error
> message or is there a hint as to where the problem is that I am not seeing.
>
> Exception                                 Traceback (most recent call last)
> <ipython-input-2-8715d8cbaa54> in <module>()
>       3 import load_data
>       4 import numpy as np
> ----> 5 df2 = load_data.load('jeffco_2013')
>
> /Users/vmd/GitHub/Jeffco-Properties/tools/load_data.py in load(data)
>      47 def load(data):
>      48     files = dict(jeffco_2013 =
> '/Users/vmd/GitHub/Jeffco-Properties/Data/JeffersonCo/Datasets/2013_clean_Jeffco_ATSDTA_ATSP600.csv')
> ---> 49     return pd.io.parsers.read_csv(files[data], quotechar='"',
> low_memory=False, na_values='', dtype=DATASHAPE)
>
> /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py
> in parser_f(filepath_or_buffer, sep, dialect, compression, doublequote,
> escapechar, quotechar, quoting, skipinitialspace, lineterminator, header,
> index_col, names, prefix, skiprows, skipfooter, skip_footer, na_values,
> na_fvalues, true_values, false_values, delimiter, converters, dtype,
> usecols, engine, delim_whitespace, as_recarray, na_filter, compact_ints,
> use_unsigned, low_memory, buffer_lines, warn_bad_lines, error_bad_lines,
> keep_default_na, thousands, comment, decimal, parse_dates, keep_date_col,
> dayfirst, date_parser, memory_map, float_precision, nrows, iterator,
> chunksize, verbose, encoding, squeeze, mangle_dupe_cols, tupleize_cols,
> infer_datetime_format, skip_blank_lines)
>     468                     skip_blank_lines=skip_blank_lines)
>     469
> --> 470         return _read(filepath_or_buffer, kwds)
>     471
>     472     parser_f.__name__ = name
>
> /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py
> in _read(filepath_or_buffer, kwds)
>     254         return parser
>     255
> --> 256     return parser.read()
>     257
>     258 _parser_defaults = {
>
> /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py
> in read(self, nrows)
>     713                 raise ValueError('skip_footer not supported for
> iteration')
>     714
> --> 715         ret = self._engine.read(nrows)
>     716
>     717         if self.options.get('as_recarray'):
>
> /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py
> in read(self, nrows)
>    1162
>    1163         try:
> -> 1164             data = self._reader.read(nrows)
>    1165         except StopIteration:
>    1166             if nrows is None:
>
> pandas/parser.pyx in pandas.parser.TextReader.read (pandas/parser.c:7426)()
>
> pandas/parser.pyx in pandas.parser.TextReader._read_rows
> (pandas/parser.c:8484)()
>
> pandas/parser.pyx in pandas.parser.TextReader._convert_column_data
> (pandas/parser.c:9795)()
>
> pandas/parser.pyx in pandas.parser.TextReader._convert_tokens
> (pandas/parser.c:10403)()
>
> pandas/parser.pyx in pandas.parser.TextReader._convert_with_dtype
> (pandas/parser.c:11257)()
>
> Exception: Integer column has NA values
>
> Vincent Davis
> 720-301-3003
>
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
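
On the "where should I look" part of the question: the exception in the quoted traceback does not name the column, so one way to narrow it down is to load the file without dtype (which reportedly works) and then check which of the columns you intended to be integer contain missing values. The sketch below assumes DATASHAPE is a dict mapping column names to dtypes; the `wanted` dict, the sample data, and the helper name `integer_columns_with_na` are all hypothetical stand-ins.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real CSV file and the DATASHAPE mapping.
csv_text = (
    "parcel_id,bedrooms,sale_price\n"
    "1,3,250000.0\n"
    "2,,\n"
    "3,4,310000.5\n"
)
wanted = {"parcel_id": np.int64, "bedrooms": np.int64, "sale_price": np.float64}

# Load without dtype so pandas infers types and the read succeeds.
df = pd.read_csv(io.StringIO(csv_text), na_values="")


def integer_columns_with_na(frame, dtypes):
    """Return {column: number of missing values} for integer-typed targets."""
    bad = {}
    for name, dtype in dtypes.items():
        is_integer = np.issubdtype(np.dtype(dtype), np.integer)
        if is_integer and frame[name].isna().any():
            bad[name] = int(frame[name].isna().sum())
    return bad


print(integer_columns_with_na(df, wanted))  # {'bedrooms': 1}
```

Any column this reports would need a float dtype, or its blanks filled in, before an integer dtype can be requested for it.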
