[issue34393] json.dumps - allow compression
New submission from liad:

The list of arguments of json.dump() can be seen here:
https://docs.python.org/2/library/json.html

Notice that there is no way to request compression. For example, pandas allows you to do:

    df.to_csv(path_or_buf=file_name, index=False, encoding='utf-8',
              compression='gzip', quoting=QUOTE_NONNUMERIC)

I want to be able to compress when I do:

    with open('products.json', 'w') as outfile:
        json.dump(data, outfile, sort_keys=True)

Please add the ability to compress using json.dump().

--
messages: 323475
nosy: liad100
priority: normal
severity: normal
status: open
title: json.dumps - allow compression
type: enhancement
versions: Python 2.7

___
Python tracker <https://bugs.python.org/issue34393>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
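For the local-file case in the submission above, one possible workaround is to hand json.dump() a gzip file opened in text mode. This is a minimal sketch assuming Python 3; the payload and the temporary path are placeholders, not taken from the report.

```python
import gzip
import json
import os
import tempfile

data = {"sku": 1, "name": "widget"}  # placeholder payload
path = os.path.join(tempfile.mkdtemp(), "products.json.gz")  # placeholder path

# In text mode ("wt"), gzip.open returns a file object that accepts str,
# which is what json.dump writes.
with gzip.open(path, "wt", encoding="utf-8") as outfile:
    json.dump(data, outfile, sort_keys=True)

# Read it back the same way to confirm the round trip.
with gzip.open(path, "rt", encoding="utf-8") as infile:
    restored = json.load(infile)
```

This keeps json.dump() unchanged and leaves the choice of compressor to the caller.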
[issue34393] json.dumps - allow compression
liad added the comment:

The gzip module may work for saving a file locally, but consider this example, which uploads JSON to Google Storage:

    import datalab.storage as storage
    storage.Bucket('mybucket').item(path).write_to(json.dumps(response), 'application/json')

Your suggestion won't work here unless I save the file locally and only then upload it. That's a bit of a problem when your files are 100 GB+.

I still think json.dump() should support compression.
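When the target is an upload call rather than a local file, the string can be compressed in memory without touching disk. The sketch below assumes Python 3 and a placeholder `response` value. Whether a particular storage client accepts bytes, and which content type or encoding metadata it expects, is an assumption to verify against that client's documentation.

```python
import gzip
import json

response = {"items": [1, 2, 3], "status": "ok"}  # placeholder payload

# Serialize to str, encode to bytes, then gzip the bytes in memory.
payload = gzip.compress(json.dumps(response).encode("utf-8"))

# `payload` is a bytes object that could be passed to an upload call.
# Round trip to show that nothing is lost:
restored = json.loads(gzip.decompress(payload).decode("utf-8"))
```

This still holds the whole serialized document in memory, so on its own it does not address very large outputs.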
[issue34393] json.dumps - allow compression
liad added the comment:

True, there are endless varieties of compression, just as there are endless varieties of file formats. Still, there are some built-ins, like conversion from string to JSON. For example, you don't support conversion of JSON to ORC files. The same argument could have been raised here: how would we choose which conversions to support? Still, a choice has been made and some basic conversion behavior is supported. You are claiming that it's all or nothing, which I don't think is the right approach.

Many are now moving their storage to cloud platforms. The storage is as it sounds: storage. It doesn't offer any programming service; what you stream is what you will have. Streaming huge files without compression = bleeding money for no reason. Saving the files to disk, compressing them, and then uploading them might be very slow. Also, the idea is to have a machine with big memory and low storage; if you have to save huge files locally, you'll also need big storage, which costs more money.

Regarding Google, there is a pending request for those who choose to use the GoogleCloudPlatform package, but not everyone uses that:
https://github.com/GoogleCloudPlatform/google-cloud-python/issues/5791

Not to mention that there are dozens of other service providers. So even if Google supports it, this doesn't give an answer for other storage service providers.

I still claim that this is a basic, legitimate request and can be handled by the json.dump() function. gzip is fine. It is also supported by the pandas extension and is well known.
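The "stream without saving to disk" scenario described above can be approximated with the standard library alone by feeding the encoder's chunks through a gzip-format compressor. This is a rough sketch assuming Python 3; `iter_gzip_json` is a hypothetical helper name, and how the resulting chunks are sent to a storage service depends on that service's client.

```python
import json
import zlib


def iter_gzip_json(obj, **encoder_kwargs):
    """Yield gzip-compressed byte chunks of the JSON encoding of obj."""
    # wbits=31 (16 + 15) makes zlib emit a gzip header and trailer.
    compressor = zlib.compressobj(wbits=31)
    encoder = json.JSONEncoder(**encoder_kwargs)
    for text_chunk in encoder.iterencode(obj):
        compressed = compressor.compress(text_chunk.encode("utf-8"))
        if compressed:  # the compressor may buffer and return b""
            yield compressed
    # Flush whatever the compressor is still holding, plus the trailer.
    yield compressor.flush()


data = {"b": [1, 2, 3], "a": "x"}  # placeholder payload
blob = b"".join(iter_gzip_json(data, sort_keys=True))
restored = json.loads(zlib.decompress(blob, 31).decode("utf-8"))
```

Only the compressed output is produced incrementally here; the Python object being encoded still has to fit in memory.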
[issue34393] json.dumps - allow compression
liad added the comment:

I'm sure I will find a workaround. I posted it for others who will face the same issue as me. There are many who use cloud storage, but not many work with PB-size files. This is likely to change in the near future as more and more companies start to process huge amounts of data.

I'm not sure what you mean by designing an API. I think you are scaling it up needlessly. It simply adds an optional parameter which will trigger gzip compression. That's it. Nothing sophisticated. Something like:

    json.dump(data, outfile, sort_keys=True, compression='gzip')

compression - Optional. A string representing the compression to use in the output. Allowed values are 'gzip'.
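The parameter proposed above does not exist in the json module, but similar behavior can be sketched in user code. The wrapper below is purely illustrative: `dump_json` and its `compression` argument are hypothetical names, and it assumes Python 3 and a binary file object as the destination.

```python
import gzip
import io
import json


def dump_json(obj, fp, compression=None, **kwargs):
    """Hypothetical helper (not part of json): write obj as JSON to the
    binary file object fp, optionally gzip-compressed."""
    if compression is None:
        fp.write(json.dumps(obj, **kwargs).encode("utf-8"))
    elif compression == "gzip":
        # gzip.open accepts an existing file object; closing the wrapper
        # finishes the gzip stream but leaves fp itself open.
        with gzip.open(fp, "wt", encoding="utf-8") as text:
            json.dump(obj, text, **kwargs)
    else:
        raise ValueError("unsupported compression: %r" % (compression,))


buf = io.BytesIO()
dump_json({"a": 1}, buf, compression="gzip", sort_keys=True)
restored = json.loads(gzip.decompress(buf.getvalue()).decode("utf-8"))
```

Keeping this outside the json module means other compressors can be added in the same place without changing the standard library API.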