Re: ValueError: arrays must all be same length
On 02.10.20 at 14:34, Shaozhong SHI wrote:
> Hello,
>
> I got a json response from an API and tried to use pandas to put data into
> a dataframe.
>
> However, I kept getting this ValueError: arrays must all be same length.
>
> Can anyone help?
>
> The following is the json text.

What do you expect the dataframe to look like? Dataframes are 2D tables;
JSON is a tree.

Christian
--
https://mail.python.org/mailman/listinfo/python-list
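The tree-versus-table point can be illustrated with a small sketch. The record below is invented for illustration (it is not the API response from the thread), and `flatten` is a hypothetical helper name. Nested objects are flattened into dotted column names so that one record becomes one row; lists are left alone, because how a list should map onto rows is exactly the question being asked.

```python
import json


def flatten(node, prefix=""):
    """Flatten nested dicts into a single-level dict with dotted keys.

    Lists are kept as-is: deciding how a list maps onto rows is a
    modelling choice that this helper does not try to guess.
    """
    flat = {}
    for key, value in node.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Made-up record, NOT the response from the original post.
text = '{"name": "Example", "address": {"city": "Leeds", "postcode": "LS1"}, "reports": [1, 2]}'
row = flatten(json.loads(text))
print(row)
```

Whether the remaining list becomes one row per item or a separate table depends on what the resulting dataframe is meant to look like.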
Re: python if and same instruction line not working
On Tuesday, September 29, 2020 at 5:28:22 PM UTC+2, MRAB wrote:
> On 2020-09-29 15:42, pascal z via Python-list wrote:
> > I need to change the script commented out to the one not commented out. Why?
> >
> > # for x in sorted (fr, key=str.lower):
> > #     tmpstr = x.rpartition(';')[2]
> > #     if x != csv_contents and tmpstr == "folder\n":
> > #         csv_contentsB += x
> > #     elif x != csv_contents and tmpstr == "files\n":
> > #         csv_contentsC += x
> >
> > for x in sorted (fr, key=str.lower):
> >     if x != csv_contents:
> >         tmpstr = x.rpartition(';')[2]
> >         if tmpstr == "folder\n":
> >             csv_contentsB += x
> >         elif tmpstr == "file\n":
> >             csv_contentsC += x
> >
> You haven't defined what you mean by "not working" for any test values
> to try, but I notice that the commented code has "files\n" whereas the
> uncommented code has "file\n".
Very good point, that should be what caused the issue.
By the way, it seems it's OK to check for \n as the end of line; it will work on
Windows, Linux and Mac platforms even though Windows uses \r\n.
Re: python show folder files and not subfolder files
On Thursday, September 24, 2020 at 4:37:07 PM UTC+2, Terry Reedy wrote:
> On 9/23/2020 7:24 PM, pascal z via Python-list wrote:
> > Please advise if the following is ok (i don't think it is)
> >
> > #!/usr/bin/env python3
> > # -*- coding: utf-8 -*-
> >
> > import os
> >
> > csv_contents = ""
> > output_file = '/home/user/Documents/csv/output3csv.csv'
> > Lpath = '/home/user/Documents/'
> >
> > csv_contents = "FOLDER PATH;Size in Byte;Size in Kb;Size in Mb;Size in Gb\n"
> >
> > d_size = 0
> > for root, dirs, files in os.walk(Lpath, topdown=False):
> >     for i in files:
> >         d_size += os.path.getsize(root + os.sep + i)
> >     csv_contents += "%s ;%.2f ;%.2f ;%.2f ;%.2f \n" % (root,
> >         d_size, d_size/1024, d_size/1048576, d_size/1073741824)
> >
> > counter = Lpath.count(os.path.sep)
> > if counter < 5:
> >     for f in os.listdir(Lpath):
> >         path = os.path.join(Lpath, f)
> >         f_size = 0
> >         f_size = os.path.getsize(path)
> >         csv_contents += "%s ;%.2f ;%.2f ;%.2f ;%.2f \n" % (
> >             path, f_size, f_size/1024, f_size/1048576, f_size/1073741824)
> >
> > fp = open(output_file, "w")
> > fp.write(csv_contents)
> > fp.close()
> >
> Read
> https://docs.python.org/3/faq/programming.html#what-is-the-most-efficient-way-to-concatenate-many-strings-together
> --
> Terry Jan Reedy

Thanks for this tip. I do think it's better to use lists than to
concatenate into a string variable. However, writing a list to a csv file
is not easy. If the strings stored in the list contain commas and single
quotes (as song titles do), the whole csv gets messed up from the first
such string onwards.

Example with arr as a list:

    import csv
    import io

    (...)
    csv_contents = "%s;%s;%s;%.2f;%.2f;%.2f;%.2f;%s" % (vfolder_path, vfile_name, vfolder_path_full, 0.00, 0.00, 0.00,0.00, "folder")
    arr.append([csv_contents])

    b = io.BytesIO()
    with open(CSV_FILE,'w', newline ='\n') as f:
        #write = csv.writer(f,delimiter=';')
        #write = csv.writer(f,quotechar='\'', quoting=csv.QUOTE_NONNUMERIC,delimiter=',')
        write = csv.writer(f,b)
        for row in arr:
            write.writerows(row)
    (...)

String samples:

    ;'Forgotten Dreams' Mix.mp3;'Awakening of a Dream' Ambient Mix.mp3;Best of Trip-Hop & Downtempo & Hip-Hop Instrumental.mp3;2-Hour _ Space Trance.mp3

For the titles above, the easiest way for me to write the csv is

    (...)
    csv_contents += "%s;%s;%s;%.2f;%.2f;%.2f;%.2f;%s" % (vfolder_path, vfile_name, vfolder_path_full, 0.00, 0.00, 0.00,0.00, "folder")

    with open(CSV_FILE,'w') as f:
        f.write(csv_contents)

csv_contents can be very large and it seems to work. It can concatenate
30k items and it's OK. Also, with the above I get the expected result:
each of the 8 columns holds its corresponding data. This is not always
the case with csv writerows. If it meets a character it can't process,
everything from there on goes into a single cell, and the split on
semicolons doesn't work anymore.

I am not allowed to change the names of the files (they could be used
later somewhere else, causing side effects...).
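For reference, the FAQ entry Terry links suggests collecting the pieces in a list and joining them once at the end. A minimal sketch of that idea follows; `format_row` is a hypothetical helper and the paths and sizes are made up, standing in for what `os.walk()` and `os.path.getsize()` would return.

```python
def format_row(path, size_bytes):
    """Format one semicolon-separated line: path, then size in B, KB, MB, GB."""
    return "%s;%.2f;%.2f;%.2f;%.2f\n" % (
        path,
        size_bytes,
        size_bytes / 1024,
        size_bytes / 1048576,
        size_bytes / 1073741824,
    )


# Made-up (path, size) pairs standing in for the os.walk() results.
entries = [("/home/user/Documents/a", 2048), ("/home/user/Documents/b", 1048576)]

lines = ["FOLDER PATH;Size in Byte;Size in Kb;Size in Mb;Size in Gb\n"]
for path, size in entries:
    lines.append(format_row(path, size))

csv_contents = "".join(lines)  # a single join instead of repeated +=
print(csv_contents, end="")
```

This only addresses the concatenation cost. It does nothing about escaping, so a path containing a semicolon would still break the columns.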
Re: ValueError: arrays must all be same length
On Fri, Oct 2, 2020 at 11:00 AM Shaozhong SHI wrote:
> Hello,
>
> I got a json response from an API and tried to use pandas to put data into
> a dataframe.
>
> However, I kept getting this ValueError: arrays must all be same length.
>
> Can anyone help?
>
> The following is the json text. Regards, Shao
>
> (snip json_text)
> import pandas as pd
>
> import json
>
> j = json.JSONDecoder().decode(req.text) ###req.json
>
> df = pd.DataFrame.from_dict(j)
>
I copied json_text into a Jupyter notebook and got the same error trying to
convert this into a pandas DataFrame. When I tried to copy this into a
string, I got an error, but without enclosing the paste in quotes, I got
the dictionary.
dir(json_text)
['__class__',
'__contains__',
'__delattr__',
'__delitem__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getitem__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__iter__',
'__le__',
'__len__',
'__lt__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__reversed__',
'__setattr__',
'__setitem__',
'__sizeof__',
'__str__',
'__subclasshook__',
'clear',
'copy',
'fromkeys',
'get',
'items',
'keys',
'pop',
'popitem',
'setdefault',
'update',
'values']
pd.DataFrame(json_text)
---
ValueError Traceback (most recent call last)
in
> 1 pd.DataFrame(json_text)
D:\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data,
index, columns, dtype, copy)
433 )
434 elif isinstance(data, dict):
--> 435 mgr = init_dict(data, index, columns, dtype=dtype)
436 elif isinstance(data, ma.MaskedArray):
437 import numpy.ma.mrecords as mrecords
D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
init_dict(data, index, columns, dtype)
252 arr if not is_datetime64tz_dtype(arr) else arr.copy()
for arr in arrays
253 ]
--> 254 return arrays_to_mgr(arrays, data_names, index, columns,
dtype=dtype)
255
256
D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
arrays_to_mgr(arrays, arr_names, index, columns, dtype)
62 # figure out the index, if necessary
63 if index is None:
---> 64 index = extract_index(arrays)
65 else:
66 index = ensure_index(index)
D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
extract_index(data)
363 lengths = list(set(raw_lengths))
364 if len(lengths) > 1:
--> 365 raise ValueError("arrays must all be same length")
366
367 if have_dicts:
ValueError: arrays must all be same length
I got a different error trying json.loads(str(json_text)):
---
JSONDecodeError Traceback (most recent call last)
in
> 1 json.loads(str(json_text))
D:\anaconda3\lib\json\__init__.py in loads(s, cls, object_hook,
parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
355 parse_int is None and parse_float is None and
356 parse_constant is None and object_pairs_hook is None
and not kw):
--> 357 return _default_decoder.decode(s)
358 if cls is None:
359 cls = JSONDecoder
D:\anaconda3\lib\json\decoder.py in decode(self, s, _w)
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):
D:\anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
351 """
352 try:
--> 353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
355 raise JSONDecodeError("Expecting value", s, err.value)
from None
JSONDecodeError: Expecting property name enclosed in double quotes: line 1
column 2 (char 1)
I think the solution is to fix the arrays so that the lengths match.
for k in json_text.keys():
    if isinstance(json_text[k], list):
        print(k, len(json_text[k]))
relationships 0
locationTypes 0
regulatedActivities 2
gacServiceTypes 1
inspectionCategories 1
specialisms 4
inspectionAreas 0
historicRatings 4
reports 5
HTH.
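One way to make the lengths match is to pad the shorter lists, sketched below under stated assumptions: `pad_lists` is a hypothetical helper, the record is made up (it only mimics the mixed lengths printed above), and padding with None is just one possible choice.

```python
def pad_lists(record, fill=None):
    """Return a copy of record whose list values all have the same length."""
    longest = max(
        (len(v) for v in record.values() if isinstance(v, list)),
        default=0,
    )
    padded = {}
    for key, value in record.items():
        if isinstance(value, list):
            # Extend shorter lists with the fill value up to the longest length.
            padded[key] = value + [fill] * (longest - len(value))
        else:
            padded[key] = value  # scalars are left untouched
    return padded


# Made-up data with the same shape problem as the lengths printed above.
record = {
    "name": "Example",
    "relationships": [],
    "specialisms": ["a", "b", "c", "d"],
    "reports": [1, 2, 3, 4, 5],
}
padded = pad_lists(record)
print({k: len(v) for k, v in padded.items() if isinstance(v, list)})
```

The padded dictionary should then be acceptable to pd.DataFrame, which as far as I know repeats scalar values across rows when at least one column is a list. Whether rows built this way mean anything is a separate question, since the lists are unrelated to each other.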
Re: ValueError: arrays must all be same length
On Sun, Oct 4, 2020 at 8:39 AM Tim Williams wrote:
>
> On Fri, Oct 2, 2020 at 11:00 AM Shaozhong SHI wrote:
>
>> Hello,
>>
>> I got a json response from an API and tried to use pandas to put data into
>> a dataframe.
>>
>> However, I kept getting this ValueError: arrays must all be same length.
>>
>> Can anyone help?
>>
>> The following is the json text. Regards, Shao
>>
>> (snip json_text)
>
>> import pandas as pd
>>
>> import json
>>
>> j = json.JSONDecoder().decode(req.text) ###req.json
>>
>> df = pd.DataFrame.from_dict(j)
>>
>
> I copied json_text into a Jupyter notebook and got the same error trying
> to convert this into a pandas DataFrame. When I tried to copy this into a
> string, I got an error, but without enclosing the paste in quotes, I got
> the dictionary.
>
> (delete long response output)
>
> for k in json_text.keys():
>     if isinstance(json_text[k], list):
>         print(k, len(json_text[k]))
>
> relationships 0
> locationTypes 0
> regulatedActivities 2
> gacServiceTypes 1
> inspectionCategories 1
> specialisms 4
> inspectionAreas 0
> historicRatings 4
> reports 5
>
> HTH.
>

This may also be more of a pandas issue.

json.loads(json.dumps(json_text)) has a successful round-trip.
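The difference between the failing json.loads(str(...)) call and the successful json.dumps round-trip can be reproduced with a small made-up dictionary (an assumption standing in for the pasted response). str() produces the Python repr, which is not JSON, while json.dumps() produces real JSON.

```python
import json

# Made-up dictionary standing in for the pasted response.
data = {"name": "Example", "reports": [1, 2], "active": True, "notes": None}

# str() gives the Python repr: single quotes, True, None. That is not JSON.
as_repr = str(data)
try:
    json.loads(as_repr)
    repr_parsed = True
except json.JSONDecodeError as err:
    repr_parsed = False
    print("str() output is not JSON:", err)

# json.dumps() emits double quotes, true and null, so the round trip works.
round_trip = json.loads(json.dumps(data))
print(round_trip == data)
```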
Re: python if and same instruction line not working
On 2020-10-04 10:35, pascal z via Python-list wrote:
> On Tuesday, September 29, 2020 at 5:28:22 PM UTC+2, MRAB wrote:
>> On 2020-09-29 15:42, pascal z via Python-list wrote:
>> > I need to change the script commented out to the one not commented out. Why?
>> >
>> > # for x in sorted (fr, key=str.lower):
>> > #     tmpstr = x.rpartition(';')[2]
>> > #     if x != csv_contents and tmpstr == "folder\n":
>> > #         csv_contentsB += x
>> > #     elif x != csv_contents and tmpstr == "files\n":
>> > #         csv_contentsC += x
>> >
>> > for x in sorted (fr, key=str.lower):
>> >     if x != csv_contents:
>> >         tmpstr = x.rpartition(';')[2]
>> >         if tmpstr == "folder\n":
>> >             csv_contentsB += x
>> >         elif tmpstr == "file\n":
>> >             csv_contentsC += x
>> >
>> You haven't defined what you mean by "not working" for any test values
>> to try, but I notice that the commented code has "files\n" whereas the
>> uncommented code has "file\n".
> Very good point, that should be what caused the issue.
> By the way, it seems it's OK to check for \n as the end of line; it will work on
> Windows, Linux and Mac platforms even though Windows uses \r\n.

By default, when the 'open' function opens a file in text mode, it uses
"universal newlines mode", so the rest of the program doesn't have to
worry about the differences in line endings. It's explained in the
documentation about the 'open' function.
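A minimal sketch of that behaviour, using a temporary file with made-up contents: the file is written with Windows line endings in binary mode, then read back in text mode with and without newline translation.

```python
import os
import tempfile

# Write Windows-style line endings in binary mode so nothing is translated.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"a;folder\r\nb;file\r\n")
    path = tmp.name

try:
    # Text mode with the default newline=None: "\r\n" is read as "\n".
    with open(path, "r") as f:
        text_lines = f.readlines()

    # newline="" turns the translation off, so "\r\n" is visible again.
    with open(path, "r", newline="") as f:
        raw_lines = f.readlines()
finally:
    os.remove(path)

print(text_lines)
print(raw_lines)
```

With the default mode, a comparison such as tmpstr == "folder\n" matches on any platform. It would stop matching if the file were opened with newline="" or in binary mode.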
Re: python show folder files and not subfolder files
On 04Oct2020 02:56, pascal z wrote:
>On Thursday, September 24, 2020 at 4:37:07 PM UTC+2, Terry Reedy wrote:
>> Read
>> https://docs.python.org/3/faq/programming.html#what-is-the-most-efficient-way-to-concatenate-many-strings-together
>
>Thanks for this tip. I do think it's better to use lists than
>concatenate into string variable. However, writing a list to a csv file
>is not something easy. If strings stored into the list have commas and
>single quotes (like song title's), it messes up the whole csv when it
>first meets this.
[...]
>[...]
>csv_contents = "%s;%s;%s;%.2f;%.2f;%.2f;%.2f;%s" % (vfolder_path,
>vfile_name, vfolder_path_full, 0.00, 0.00, 0.00,0.00, "folder")
>arr.append([csv_contents])
>[...]

Is there a reason you're not using the csv module to write and read CSV
files? It knows how to correctly escape values in a number of common
dialects (the default dialect works well).

By composing CSV files with %-formatting (or with any crude string
formatting) you get the exact syntax issue you're describing. Faced with
user supplied data, these issues become "injection attacks", as
exemplified by this XKCD comic:

  https://xkcd.com/327/
  https://www.explainxkcd.com/wiki/index.php/Little_Bobby_Tables

The correct approach here is to have a general and _correct_ formatter
for the values, and to not assemble things with simplistic approaches
like %-formatting.

With databases the standard approach for assembling SQL is to provide
template SQL with the values as arguments, and have the db-specific
driver construct SQL for you.

And with CSV files the same applies: import the csv module and use
csv.writer() to generate the CSV data; you just hand the writer an array
of values (strings, floats, whatever) and it takes care of using the
correct syntax in the file.

Cheers,
Cameron Simpson
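A minimal sketch of what handing the writer an array of values might look like. The column names and sizes are made up, and the second title has a semicolon added on purpose to show the quoting; an io.StringIO buffer stands in for the real output file. Note that in the quoted snippet each row was a single pre-formatted string, and csv.writer treats a string passed as a row as a sequence of characters, which would explain the scrambled cells.

```python
import csv
import io

# First title is from the thread; the second is altered to contain a
# semicolon. The size and kind columns are made up for illustration.
rows = [
    ["'Forgotten Dreams' Mix.mp3", 0.0, "file"],
    ["Best of Trip-Hop & Downtempo; Instrumental.mp3", 0.0, "file"],
]

buffer = io.StringIO()  # stands in for open(CSV_FILE, "w", newline="")
writer = csv.writer(buffer, delimiter=";")
writer.writerow(["name", "size", "kind"])
writer.writerows(rows)  # a list of rows; each row is a list of separate values

# Reading back with the same delimiter recovers the original fields.
buffer.seek(0)
read_back = list(csv.reader(buffer, delimiter=";"))
print(read_back[2])
```

The writer quotes the field containing the delimiter by itself, so the file names do not need to be changed. Values come back as strings when read, so numeric columns need converting by the reader.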
Collecting more feedback about contributing to Python
Have you tried contributing to the development of Python itself, or
considered doing so? I'd like to hear your thoughts and experiences!

I'm collecting such information to guide work during the upcoming
core-dev sprint on making contribution easier and friendlier.

You can reach out publicly or privately. I'll keep private stories to
myself, only mentioning specific relevant points from them without
mentioning who sent them. Public stories will be added to the dedicated
repo, which already includes many such stories that I have previously
collected.

https://github.com/taleinat/python-contribution-feedback

For more info on the core dev sprint:
https://python-core-sprint-2020.readthedocs.io/index.html

- Tal Einat
Re: Problem
On 03.10.2020 at 17:25, Dennis Lee Bieber wrote:
> On Fri, 2 Oct 2020 21:47:38 +0200, Hexamorph declaimed
> the following:
>
>>
>> - Add a folder named "Python.org " (or similar) to the
>> desktop with shortcuts to Python, IDLE and the CHM.
>>
>> - Add a checkbox (default enabled) like "Start the IDLE Python
>> Editor/Shell" at the end of the installation procedure.
>>
>
> Which may only result in reinforcing the idea that one runs the
> installer to run Python.

Perhaps not, if the installer says that Python is already installed and
accessible via the Start menu and desktop icons. At some point they will
probably notice this hint.

There are multiple reasons why beginners don't find out how to start
Python. Some expect a GUI application like an editor in the Start menu,
some expect desktop icons, some don't realize that the installer is
just that. They all need a different change.

However, I'm out of this discussion now. With the exception of Terry
changing IDLE's menu entry, this has been a very unproductive
discussion. People were only highlighting possible drawbacks and
remaining problems (including the usual nitpicking on edge cases)
without coming up with ideas of their own.

If you want to lessen the amount of those initial newcomer questions,
reconsider what was proposed:

- Rename the MSI as suggested by Eryk Sun.

- Add a folder named "Python.org " (or similar) to the
  desktop with shortcuts to Python, IDLE and the CHM.

- Add a checkbox (default enabled) like "Start the IDLE Python
  Editor/Shell" at the end of the installation procedure.

- Perhaps, if possible, add a button like "Start the IDLE Python
  Editor/Shell" to the Repair/Modify/Remove Dialog.

So long and happy hacking! :-)
