Re: ValueError: arrays must all be same length

2020-10-04 Thread Christian Gollwitzer

On 02.10.20 at 14:34, Shaozhong SHI wrote:

> Hello,
>
> I got a json response from an API and tried to use pandas to put data into
> a dataframe.
>
> However, I kept getting this ValueError: arrays must all be same length.
>
> Can anyone help?
>
> The following is the json text.


What do you expect the dataframe to look like? DataFrames are 2D tables, 
whereas JSON is a tree.
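
To illustrate the mismatch between a tree and a table, here is a minimal sketch that flattens nested dictionaries into a single row of dotted column names. The record and the flatten() helper are hypothetical and are not the API response from this thread; pandas also offers json_normalize for this kind of flattening.

```python
# Hypothetical nested record; not the API response from the thread.
record = {
    "name": "Example Clinic",
    "address": {"city": "Leeds", "postcode": "LS1"},
    "reports": [{"id": 1}, {"id": 2}],
}


def flatten(node, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots.

    Lists are kept as single cell values, because there is no one
    obvious way to spread them over rows.
    """
    flat = {}
    for key, value in node.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


row = flatten(record)
print(row)
```

The nested lists are the part that still needs a decision: one row per list item, or one cell holding the whole list.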


Christian

--
https://mail.python.org/mailman/listinfo/python-list


Re: python if and same instruction line not working

2020-10-04 Thread pascal z via Python-list
On Tuesday, September 29, 2020 at 5:28:22 PM UTC+2, MRAB wrote:
> On 2020-09-29 15:42, pascal z via Python-list wrote:
> > I need to change the script commented out to the one not commented out. Why?
> > 
> >  # for x in sorted (fr, key=str.lower):
> >  #     tmpstr = x.rpartition(';')[2]
> >  #     if x != csv_contents and tmpstr == "folder\n":
> >  #         csv_contentsB += x
> >  #     elif x != csv_contents and tmpstr == "files\n":
> >  #         csv_contentsC += x
> > 
> >  for x in sorted (fr, key=str.lower):
> >      if x != csv_contents:
> >          tmpstr = x.rpartition(';')[2]
> >          if tmpstr == "folder\n":
> >              csv_contentsB += x
> >          elif tmpstr == "file\n":
> >              csv_contentsC += x
> > 
> You haven't defined what you mean by "not working" for any test values 
> to try, but I notice that the commented code has "files\n" whereas the 
> uncommented code has "file\n".

Very good point, that must be what caused the issue.

By the way, it seems it's OK to check for \n as the end of line; it works on 
Windows, Linux and Mac, even though Windows uses \r\n.


Re: python show folder files and not subfolder files

2020-10-04 Thread pascal z via Python-list
On Thursday, September 24, 2020 at 4:37:07 PM UTC+2, Terry Reedy wrote:
> On 9/23/2020 7:24 PM, pascal z via Python-list wrote:
> > Please advise if the following is ok (i don't think it is)
> > 
> > #!/usr/bin/env python3
> > # -*- coding: utf-8 -*-
> > 
> > import os
> > 
> > csv_contents = ""
> > output_file = '/home/user/Documents/csv/output3csv.csv'
> > Lpath = '/home/user/Documents/'
> > 
> > csv_contents = "FOLDER PATH;Size in Byte;Size in Kb;Size in Mb;Size in Gb\n"
> > 
> > d_size = 0
> > for root, dirs, files in os.walk(Lpath, topdown=False):
> >     for i in files:
> >         d_size += os.path.getsize(root + os.sep + i)
> >     csv_contents += "%s   ;%.2f   ;%.2f   ;%.2f   ;%.2f  \n" % (
> >         root, d_size, d_size/1024, d_size/1048576, d_size/1073741824)
> > 
> >     counter = Lpath.count(os.path.sep)
> >     if counter < 5:
> >         for f in os.listdir(Lpath):
> >             path = os.path.join(Lpath, f)
> >             f_size = 0
> >             f_size = os.path.getsize(path)
> >             csv_contents += "%s   ;%.2f   ;%.2f   ;%.2f   ;%.2f  \n" % (
> >                 path, f_size, f_size/1024, f_size/1048576, f_size/1073741824)
> > 
> > fp = open(output_file, "w")
> > fp.write(csv_contents)
> > fp.close()
> 
> 
> Read
> https://docs.python.org/3/faq/programming.html#what-is-the-most-efficient-way-to-concatenate-many-strings-together
> -- 
> Terry Jan Reedy

Thanks for this tip. I do think it's better to use lists than to concatenate 
into a string variable. However, writing a list to a CSV file is not easy. 
If the strings stored in the list contain commas and single quotes (as song 
titles do), the first such string messes up the whole CSV. Example with arr 
as the list:


import csv
import io

(...)

csv_contents = "%s;%s;%s;%.2f;%.2f;%.2f;%.2f;%s" % (vfolder_path, vfile_name, 
vfolder_path_full, 0.00, 0.00, 0.00,0.00, "folder")
arr.append([csv_contents])

b = io.BytesIO()
with open(CSV_FILE,'w', newline ='\n') as f:
    #write = csv.writer(f,delimiter=';')
    #write = csv.writer(f,quotechar='\'', quoting=csv.QUOTE_NONNUMERIC,delimiter=',')
    write = csv.writer(f,b)
    for row in arr:
        write.writerows(row)

(...)

string samples: ;'Forgotten Dreams' Mix.mp3;'Awakening of a Dream' Ambient 
Mix.mp3;Best of Trip-Hop & Downtempo & Hip-Hop Instrumental.mp3;2-Hour _ Space 
Trance.mp3

For the titles above, the easiest way for me to write the CSV is:


(...)
csv_contents += "%s;%s;%s;%.2f;%.2f;%.2f;%.2f;%s" % (vfolder_path, vfile_name, 
vfolder_path_full, 0.00, 0.00, 0.00,0.00, "folder")

with open(CSV_FILE,'w') as f:
    f.write(csv_contents)


csv_contents can be very large and it still seems to work: concatenating 30k 
items is fine. With the above, I also get the expected result, with each of 
the 8 fields holding the corresponding data. This is not always the case with 
csv writerows: if it meets a character it can't process, everything from there 
on goes into a single cell, and splitting on the semicolon doesn't work anymore.
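
For reference, the list-based approach from the FAQ linked above could look roughly like the sketch below. The folder names and sizes are hypothetical placeholders standing in for the os.walk() results, and the header is shortened. It only addresses the concatenation question, not the quoting of awkward titles.

```python
# Hypothetical folder names and sizes, standing in for the os.walk() results.
entries = [("/home/user/Documents", 2048), ("/home/user/Music", 1048576)]

# Collect each line in a list instead of growing one string with +=.
lines = ["FOLDER PATH;Size in Byte;Size in Kb\n"]
for path, size in entries:
    lines.append("%s;%.2f;%.2f\n" % (path, size, size / 1024))

# Join once at the end; this is the idiom the FAQ recommends.
csv_contents = "".join(lines)
print(csv_contents, end="")
```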

I am not allowed to change the name of the files (it could be used later 
somewhere else, making side effects...).


Re: ValueError: arrays must all be same length

2020-10-04 Thread Tim Williams
On Fri, Oct 2, 2020 at 11:00 AM Shaozhong SHI wrote:

> Hello,
>
> I got a json response from an API and tried to use pandas to put data into
> a dataframe.
>
> However, I kept getting this ValueError: arrays must all be same length.
>
> Can anyone help?
>
> The following is the json text.  Regards, Shao
>
> (snip json_text)


> import pandas as pd
>
> import json
>
> j = json.JSONDecoder().decode(req.text)  ###req.json
>
> df = pd.DataFrame.from_dict(j)
>

I copied json_text into a Jupyter notebook and got the same error trying to
convert it into a pandas DataFrame. When I pasted it in as a quoted
string, I got an error, but without enclosing the paste in quotes, I got
the dictionary.

dir(json_text)
['__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']

pd.DataFrame(json_text)

---

ValueError                                Traceback (most recent call last)
 in 
----> 1 pd.DataFrame(json_text)

D:\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data,
index, columns, dtype, copy)
433 )
434 elif isinstance(data, dict):
--> 435 mgr = init_dict(data, index, columns, dtype=dtype)
436 elif isinstance(data, ma.MaskedArray):
437 import numpy.ma.mrecords as mrecords

D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
init_dict(data, index, columns, dtype)
252 arr if not is_datetime64tz_dtype(arr) else arr.copy()
for arr in arrays
253 ]
--> 254 return arrays_to_mgr(arrays, data_names, index, columns,
dtype=dtype)
255
256

D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
arrays_to_mgr(arrays, arr_names, index, columns, dtype)
 62 # figure out the index, if necessary
 63 if index is None:
---> 64 index = extract_index(arrays)
 65 else:
 66 index = ensure_index(index)

D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
extract_index(data)
363 lengths = list(set(raw_lengths))
364 if len(lengths) > 1:
--> 365 raise ValueError("arrays must all be same length")
366
367 if have_dicts:

ValueError: arrays must all be same length


I got a different error trying json.loads(str(json_text)):
---
JSONDecodeError   Traceback (most recent call last)
 in 
----> 1 json.loads(str(json_text))

D:\anaconda3\lib\json\__init__.py in loads(s, cls, object_hook,
parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
355 parse_int is None and parse_float is None and
356 parse_constant is None and object_pairs_hook is None
and not kw):
--> 357 return _default_decoder.decode(s)
358 if cls is None:
359 cls = JSONDecoder

D:\anaconda3\lib\json\decoder.py in decode(self, s, _w)
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):

D:\anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
351 """
352 try:
--> 353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
355 raise JSONDecodeError("Expecting value", s, err.value)
from None

JSONDecodeError: Expecting property name enclosed in double quotes: line 1
column 2 (char 1)
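
The JSONDecodeError is consistent with str() producing the Python repr of the dictionary, which uses single quotes, rather than JSON text. A small sketch with a hypothetical stand-in dictionary (not the real response) shows the difference:

```python
import json

# Hypothetical stand-in for the dictionary in the thread.
json_text = {"name": "Example Clinic", "reports": []}

as_repr = str(json_text)         # Python repr: single-quoted keys
as_json = json.dumps(json_text)  # JSON text: double-quoted keys

try:
    json.loads(as_repr)
    repr_parsed = True
except json.JSONDecodeError:
    # The repr is not valid JSON, so the decoder rejects it.
    repr_parsed = False

round_trip = json.loads(as_json)
print(as_repr)
print(as_json)
print(repr_parsed, round_trip == json_text)
```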

I think the solution is to fix the arrays so that the lengths match.

for k in json_text.keys():
    if isinstance(json_text[k], list):
        print(k, len(json_text[k]))

relationships 0
locationTypes 0
regulatedActivities 2
gacServiceTypes 1
inspectionCategories 1
specialisms 4
inspectionAreas 0
historicRatings 4
reports 5
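
One way to make the lengths match, sketched with the standard library only, is to pad the shorter lists with None. The column names below come from the output above, but the values are hypothetical placeholders, and whether padding is meaningful depends on what the lists actually hold.

```python
from itertools import zip_longest

# Hypothetical columns: the names come from the output above,
# the values are placeholders, not the real API data.
columns = {
    "regulatedActivities": ["a1", "a2"],
    "gacServiceTypes": ["g1"],
    "specialisms": ["s1", "s2", "s3", "s4"],
}

# zip_longest pads the shorter lists with None, giving equal-length rows.
rows = list(zip_longest(*columns.values(), fillvalue=None))
padded = {name: [row[i] for row in rows] for i, name in enumerate(columns)}

for name, values in padded.items():
    print(name, len(values), values)
```

A dictionary of equal-length lists like padded is a shape that the DataFrame constructor accepts.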

HTH.



Re: ValueError: arrays must all be same length

2020-10-04 Thread Tim Williams
On Sun, Oct 4, 2020 at 8:39 AM Tim Williams  wrote:

>
>
> On Fri, Oct 2, 2020 at 11:00 AM Shaozhong SHI 
> wrote:
>
>> Hello,
>>
>> I got a json response from an API and tried to use pandas to put data into
>> a dataframe.
>>
>> However, I kept getting this ValueError: arrays must all be same length.
>>
>> Can anyone help?
>>
>> The following is the json text.  Regards, Shao
>>
>> (snip json_text)
>
>
>> import pandas as pd
>>
>> import json
>>
>> j = json.JSONDecoder().decode(req.text)  ###req.json
>>
>> df = pd.DataFrame.from_dict(j)
>>
>
> I copied json_text into a Jupyter notebook and got the same error trying
> to convert it into a pandas DataFrame. When I pasted it in as a quoted
> string, I got an error, but without enclosing the paste in quotes, I got
> the dictionary.
>
>
(delete long response output)


> for k in json_text.keys():
>     if isinstance(json_text[k], list):
>         print(k, len(json_text[k]))
>
> relationships 0
> locationTypes 0
> regulatedActivities 2
> gacServiceTypes 1
> inspectionCategories 1
> specialisms 4
> inspectionAreas 0
> historicRatings 4
> reports 5
>
> HTH.
>
>
This may also be more of a pandas issue.

json.loads(json.dumps(json_text))

has a successful round-trip




Re: python if and same instruction line not working

2020-10-04 Thread MRAB

On 2020-10-04 10:35, pascal z via Python-list wrote:

> On Tuesday, September 29, 2020 at 5:28:22 PM UTC+2, MRAB wrote:
>> On 2020-09-29 15:42, pascal z via Python-list wrote:
>>> I need to change the script commented out to the one not commented out. Why?
>>>
>>>  # for x in sorted (fr, key=str.lower):
>>>  #     tmpstr = x.rpartition(';')[2]
>>>  #     if x != csv_contents and tmpstr == "folder\n":
>>>  #         csv_contentsB += x
>>>  #     elif x != csv_contents and tmpstr == "files\n":
>>>  #         csv_contentsC += x
>>>
>>>  for x in sorted (fr, key=str.lower):
>>>      if x != csv_contents:
>>>          tmpstr = x.rpartition(';')[2]
>>>          if tmpstr == "folder\n":
>>>              csv_contentsB += x
>>>          elif tmpstr == "file\n":
>>>              csv_contentsC += x
>>>
>> You haven't defined what you mean by "not working" for any test values
>> to try, but I notice that the commented code has "files\n" whereas the
>> uncommented code has "file\n".
>
> Very good point, that must be what caused the issue.
>
> By the way, it seems it's OK to check for \n as the end of line; it works on
> Windows, Linux and Mac, even though Windows uses \r\n.

By default, when the 'open' function opens a file in text mode, it uses 
"universal newlines mode", so the rest of the program doesn't have to 
worry about the differences in line endings. It's explained in the 
documentation about the 'open' function.
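
A minimal sketch of that behaviour, using a temporary file and hypothetical sample rows: the file is written with Windows-style \r\n endings as raw bytes, then read back in text mode with the default newline setting.

```python
import os
import tempfile

# Write Windows-style line endings as raw bytes (hypothetical sample rows).
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(b"music;folder\r\nsong.mp3;file\r\n")

# Text mode with the default newline=None translates \r\n to \n on read.
with open(path, "r") as f:
    lines = f.readlines()
os.remove(path)

# The last field of each line therefore ends in a plain \n.
kinds = [line.rpartition(";")[2] for line in lines]
print(kinds)
```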



Re: python show folder files and not subfolder files

2020-10-04 Thread Cameron Simpson
On 04Oct2020 02:56, pascal z  wrote:
>On Thursday, September 24, 2020 at 4:37:07 PM UTC+2, Terry Reedy wrote:
>> Read
>> https://docs.python.org/3/faq/programming.html#what-is-the-most-efficient-way-to-concatenate-many-strings-together
>
>Thanks for this tip. I do think it's better to use lists than 
>concatenate into string variable. However, writing a list to a csv file 
>is not something easy. If strings stored into the list have commas and 
>single quotes (like song title's), it messes up the whole csv when it 
>first meets this. [...]
>[...]
>csv_contents = "%s;%s;%s;%.2f;%.2f;%.2f;%.2f;%s" % (vfolder_path, 
>vfile_name, vfolder_path_full, 0.00, 0.00, 0.00,0.00, "folder")
>arr.append([csv_contents])
>[...]

Is there a reason you're not using the csv module to write and read CSV 
files? It knows how to correctly escape values in a number of common 
dialects (the default dialect works well).

By composing CSV files with %-formatting (or with any crude string 
formatting) you get the exact syntax issue you're describing. Faced with 
user supplied data, these issues become "injection attacks", as 
exemplified by this XKCD comic:

https://xkcd.com/327/
https://www.explainxkcd.com/wiki/index.php/Little_Bobby_Tables

The correct approach here is to have a general and _correct_ formatter 
for the values, and to not assemble things with simplistic approaches 
like %-formatting.

With databases the standard approach for assembling SQL is to provide 
template SQL with the values as arguments, and have the db-specific 
driver construct the SQL for you. With CSV files the same applies: 
import the csv module and use csv.writer() to generate the CSV data; you 
just hand the writer an array of values (strings, floats, whatever) and 
it takes care of using the correct syntax in the file.
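
As a rough sketch of what that looks like, assuming a semicolon delimiter like the one in the original script: the rows below are hypothetical, and an in-memory buffer stands in for a real file opened with open(path, "w", newline="").

```python
import csv
import io

# Hypothetical rows; the first title echoes the samples quoted above.
rows = [
    ["/music", "'Forgotten Dreams' Mix.mp3", 0.0, "file"],
    ["/music", "Trip-Hop; Downtempo, \"Best of\".mp3", 0.0, "file"],
]

# StringIO stands in for a file opened with open(path, "w", newline="").
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";")
writer.writerows(rows)  # one list per row, one item per field

# Reading it back recovers every field, embedded delimiters included.
buffer.seek(0)
parsed = list(csv.reader(buffer, delimiter=";"))
print(buffer.getvalue())
print(parsed)
```

Note that each row is a list of separate values, not one preformatted string; the writer adds quoting only where a field needs it, and the reader returns every field as a string.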

Cheers,
Cameron Simpson 


Collecting more feedback about contributing to Python

2020-10-04 Thread Tal Einat
Have you tried contributing to the development of Python itself, or have
considered doing so? I'd like to hear your thoughts and experiences! I'm
collecting such information to guide work during the upcoming core-dev
sprint on making contribution easier and friendlier.

You can reach out publicly or privately. I'll keep private stories to
myself, only mentioning specific relevant points from them without
mentioning who sent them.

Public stories will be added to the dedicated repo, which already includes
many such stories which I have previously collected.
https://github.com/taleinat/python-contribution-feedback

For more info on the core dev sprint:
https://python-core-sprint-2020.readthedocs.io/index.html

- Tal Einat


Re: Problem

2020-10-04 Thread Mirko via Python-list
On 03.10.2020 at 17:25, Dennis Lee Bieber wrote:
> On Fri, 2 Oct 2020 21:47:38 +0200, Hexamorph  declaimed
> the following:
> 
> 
>>
>> - Add a folder named "Python.org " (or similar) to the
>> desktop with shortcuts to Python, IDLE and the CHM.
>>
>> - Add a checkbox (default enabled) like "Start the IDLE Python
>> Editor/Shell" at the end of the installation procedure.
>>
> 
>   Which may only result in reinforcing the idea that one runs the
> installer to run Python.


Perhaps not if the installer says that Python is already installed
and accessible via the Start menu and desktop icons. At some point they
will probably notice this hint. There are multiple reasons why
beginners don't find out how to start Python. Some expect a
GUI application like an editor in the Start menu, some expect desktop
icons, some don't realize that the installer is just that. They all
need a different change.

However, I'm out of this discussion now. With the exception of Terry
changing IDLE's menu entry, this has been a very unproductive
discussion. People were only highlighting possible drawbacks and
remaining problems (including the usual nitpicking on edge cases)
without coming up with ideas of their own.

If you want to lessen the amount of those initial newcomer
questions, reconsider what was proposed:

- Rename the MSI as suggested by Eryk Sun.
- Add a folder named "Python.org " (or similar) to the
desktop with shortcuts to Python, IDLE and the CHM.
- Add a checkbox (default enabled) like "Start the IDLE Python
Editor/Shell" at the end of the installation procedure.
- Perhaps, if possible, add a button like "Start the IDLE Python
Editor/Shell" to the Repair/Modify/Remove Dialog.

So long and happy hacking! :-)