Convert lists and dicts to str in parquet #206

Merged 1 commit on Sep 19, 2024
7 changes: 6 additions & 1 deletion plaidcloud/utilities/query.py
@@ -5,6 +5,7 @@
import uuid
import unicodecsv as csv

+import pyarrow as pa
import pandas as pd
import numpy as np
import requests
@@ -21,7 +22,7 @@
from plaidcloud.utilities.remote.dimension import Dimensions

__author__ = 'Paul Morel'
-__copyright__ = 'Copyright 2010-2021, Tartan Solutions, Inc'
+__copyright__ = 'Copyright 2010-2024, Tartan Solutions, Inc'
__credits__ = ['Paul Morel']
__license__ = 'Apache 2.0'
__maintainer__ = 'Paul Morel'
@@ -496,6 +497,10 @@ def bulk_insert_dataframe(self, table_object, df, append=False, chunk_size=50000
load_type='parquet',
)
 if data_load:
+    schema = pa.Schema.from_pandas(df)
+    for col in schema:
+        if isinstance(col.type, (pa.ListType, pa.StructType)):
+            df[col.name] = df[col.name].map(str)
     with tempfile.NamedTemporaryFile(mode='wb+') as pq_file:
         df.to_parquet(pq_file)
         # upload the file
5 changes: 3 additions & 2 deletions requirements.txt
@@ -6,12 +6,13 @@ ipython
numpy
pandas
plaidcloud-rpc@git+https://github.com/PlaidCloud/[email protected]#egg=plaidcloud-rpc
-pyyaml
+pyarrow
+PyYAML
requests
setuptools
orjson
openpyxl
-sqlalchemy~=1.4
+SQLAlchemy~=1.4
sqlalchemy-greenplum
sqlalchemy-hana
texttable