How to import Apache parquet files?

2019-11-05 Thread Softwarelimits
Hi, I need to come and ask here, I did not find enough information so I
hope I am just having a bad day or somebody is censoring my search results
for fun... :)

I would like to import (lots of) Apache parquet files to a PostgreSQL 11
cluster - yes, I believe it should be done with the Python pyarrow module,
but before digging into the possible traps I would like to ask here if
there is some common, well understood and documented tool that may be
helpful with that process?

It seems that the COPY command can import binary data, but I am not able to
allocate enough resources to understand how to implement a parquet file
import with that.
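
A minimal sketch of one way to combine the two ideas above, offered as an illustration rather than an established tool: pyarrow reads the parquet file one row group at a time, each chunk is serialized to CSV in memory, and COPY in its CSV (text) form loads it, which avoids implementing the binary COPY format. Assumptions not taken from this thread: the psycopg2 driver supplies the connection, the target table already exists with matching unqualified column names, all columns are flat scalar types, and the helper names are hypothetical.

```python
import csv
import io


def rows_to_csv_buffer(columns, rows):
    """Serialize rows to an in-memory CSV file with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    # None is written as an empty field, which COPY ... CSV reads as NULL.
    # Caveat: empty strings are also written as empty fields.
    writer.writerows(rows)
    buf.seek(0)
    return buf


def copy_statement(table, columns):
    """Build a COPY ... FROM STDIN statement with quoted identifiers."""
    def quote(name):
        return '"{}"'.format(name.replace('"', '""'))

    cols = ", ".join(quote(c) for c in columns)
    return "COPY {} ({}) FROM STDIN WITH (FORMAT csv, HEADER true)".format(
        quote(table), cols
    )


def import_parquet(path, conn, table):
    """Load one parquet file into an existing table, one row group at a time.

    conn is assumed to be a psycopg2 connection (assumption: psycopg2 is the
    driver in use; nothing in the thread specifies one).
    """
    import pyarrow.parquet as pq  # third-party, imported lazily

    pf = pq.ParquetFile(path)
    with conn.cursor() as cur:
        for i in range(pf.num_row_groups):
            chunk = pf.read_row_group(i)
            columns = chunk.schema.names
            data = chunk.to_pydict()
            # Turn the column-oriented dict into row tuples.
            rows = zip(*(data[c] for c in columns))
            cur.copy_expert(
                copy_statement(table, columns),
                rows_to_csv_buffer(columns, rows),
            )
    conn.commit()
```

Calling import_parquet("data.parquet", conn, "measurements") once per file would cover the "lots of files" case; the file and table names here are placeholders. Nested or binary parquet columns would need extra conversion before this approach works.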

I would really like to follow a person with much more knowledge than me about
either PostgreSQL or the Apache parquet format instead of inventing a bad
wheel.

Any hints very welcome,
thank you very much for your attention!
John


Re: How to import Apache parquet files?

2019-11-05 Thread Softwarelimits
Hi Imre, thanks for the quick response - yes, I found that, but I was not
sure if it is already production ready - also I would like to use the data
with the timescale extension, that is why I need a full import.

Have a nice day!

On Tue, Nov 5, 2019 at 4:09 PM Imre Samu  wrote:

> >I would like to import (lots of) Apache parquet files to a PostgreSQL 11
> cluster
>
> imho: You have to check and test the Parquet FDW (Parquet foreign data wrapper)
> - https://github.com/adjust/parquet_fdw
>
> Imre
>
>
>
>
> Softwarelimits wrote (Tue, Nov 5, 2019, 15:57):
>
>> Hi, I need to come and ask here, I did not find enough information so I
>> hope I am just having a bad day or somebody is censoring my search results
>> for fun... :)
>>
>> I would like to import (lots of) Apache parquet files to a PostgreSQL 11
>> cluster - yes, I believe it should be done with the Python pyarrow module,
>> but before digging into the possible traps I would like to ask here if
>> there is some common, well understood and documented tool that may be
>> helpful with that process?
>>
>> It seems that the COPY command can import binary data, but I am not able
>> to allocate enough resources to understand how to implement a parquet file
>> import with that.
>>
>> I would really like to follow a person with much more knowledge than me
>> about either PostgreSQL or the Apache parquet format instead of inventing a bad
>> wheel.
>>
>> Any hints very welcome,
>> thank you very much for your attention!
>> John
>>
>