Releases · aws/aws-sdk-pandas

03 Mar 16:59

2.5.0

ac5407b

AWS Data Wrangler 2.5.0

Caveats

⚠️ For platforms without PyArrow 3 support (e.g. MWAA, EMR, Glue PySpark Job):

➡️ pip install pyarrow==2 awswrangler

Documentation

New HTML tutorials #551
Use bump2version for changing version numbers #573
Mishandling of wildcard characters in read_parquet #564

Enhancements

Support for ExpectedBucketOwner #562

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @impredicative, @adarsh-chauhan, @Malkard.

P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload it and run!


04 Feb 13:24

igorborgest

2.4.0-docs

a1ec7d3

AWS Data Wrangler 2.4.0 (Docs updated)

Caveats

⚠️ For platforms without PyArrow 3 support (e.g. EMR, Glue PySpark Job):

➡️ pip install pyarrow==2 awswrangler

Documentation

Update to include PyArrow 3 caveats for EMR and Glue PySpark Job. #546 #547

New Functionalities

Redshift COPY now supports the new SUPER type (i.e. SERIALIZETOJSON) #514
S3 Upload/download files #506
Include dataset BUCKETING for s3 datasets writing #443
Enable Merge Upsert for existing Glue Tables on Primary Keys #503
Support Requester Pays S3 Buckets #430
Add botocore Config to wr.config #535
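As a rough illustration of what dataset bucketing (#443 above) means conceptually: rows are assigned to a fixed number of buckets by hashing a key column, so that equal keys always land in the same bucket. The sketch below is plain Python and is not the library's implementation; the helper names `bucket_index` and `split_into_buckets` are hypothetical, and `zlib.crc32` is only a stand-in for whatever hash the library actually uses.

```python
import zlib
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def bucket_index(value: Any, num_buckets: int) -> int:
    """Map a key to a bucket number in [0, num_buckets).

    ASSUMPTION: crc32 of the string form is a stand-in hash chosen because
    it is deterministic across runs; it is not the library's hash function.
    """
    digest = zlib.crc32(str(value).encode("utf-8"))
    return digest % num_buckets


def split_into_buckets(
    rows: Iterable[Mapping[str, Any]], column: str, num_buckets: int
) -> Dict[int, List[Mapping[str, Any]]]:
    """Group rows by the bucket of their value in `column`."""
    buckets: Dict[int, List[Mapping[str, Any]]] = defaultdict(list)
    for row in rows:
        buckets[bucket_index(row[column], num_buckets)].append(row)
    return dict(buckets)


if __name__ == "__main__":
    sample = [{"id": i, "value": i * 2} for i in range(10)]
    for bucket, members in sorted(split_into_buckets(sample, "id", 3).items()):
        print(bucket, [m["id"] for m in members])
```

In a bucketed dataset each bucket would typically be written to its own file, which lets a query engine skip files that cannot contain a given key.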

Enhancements

Pandas 1.2.1 support #525
Numpy 1.20.0 support
Apache Arrow 3.0.0 support #531
Python 3.9 support #454

Bug Fix

Return DataFrame with unique index for Athena CTAS queries #527
Remove unnecessary schema inference. #524

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @danielwo, @jiteshsoni, @igorborgest, @njdanielsen, @eric-valente, @gvermillion, @zseder, @gdbassett, @orenmazor, @senorkrabs, @Natalie-Caruana, @dragonH, @nikwerhypoport, @hwangji.

P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload it and run!


03 Feb 23:26

igorborgest

2.4.0

4b9f270

AWS Data Wrangler 2.4.0

New Functionalities

Redshift COPY now supports the new SUPER type (i.e. SERIALIZETOJSON) #514
S3 Upload/download files #506
Include dataset BUCKETING for s3 datasets writing #443
Enable Merge Upsert for existing Glue Tables on Primary Keys #503
Support Requester Pays S3 Buckets #430
Add botocore Config to wr.config #535

Enhancements

Pandas 1.2.1 support #525
Numpy 1.20.0 support
Apache Arrow 3.0.0 support #531
Python 3.9 support #454

Bug Fix

Return DataFrame with unique index for Athena CTAS queries #527
Remove unnecessary schema inference. #524

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @danielwo, @jiteshsoni, @igorborgest, @njdanielsen, @eric-valente, @gvermillion, @zseder, @gdbassett, @orenmazor, @senorkrabs, @Natalie-Caruana.

P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload it and run!


10 Jan 14:36

igorborgest

2.3.0

13e71a0

AWS Data Wrangler 2.3.0

New Functionalities

DynamoDB support #448
SQLServer support (Driver must be installed separately) #356
Excel files support #419 #509
Amazon S3 Access Point support #393
Amazon Chime initial support #494
Write compressed CSV and JSON files on S3 #308 #359 #412

Enhancements

Add query parameters for Athena #432
Add metadata caching for Athena #461
Add suffix filters for s3.read_parquet_table() #495

Bug Fix

Fix keep_files behavior for failed Redshift COPY executions #505

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @danielwo, @jiteshsoni, @gvermillion, @rodalarcon, @imanebosch, @dwbelliston, @tochandrashekhar, @kylepierce, @njdanielsen, @jasadams, @gtossou, @JasonSanchez, @kokes, @hanan-vian, @igorborgest.

P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload it and run!


23 Dec 00:05

igorborgest

2.2.0

c241095

AWS Data Wrangler 2.2.0

New Functionalities

Add aws_access_key_id, aws_secret_access_key, aws_session_token and boto3_session for Redshift copy/unload #484

Bug Fix

Remove dtype print statement #487

Thanks

We thank the following contributors/users for their work on this release:

@danielwo, @thetimbecker, @njdanielsen, @igorborgest.

P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!


21 Dec 11:11

igorborgest

2.1.0

92ae19d

AWS Data Wrangler 2.1.0

New Functionalities

Add secretmanager module and support for database connections #402

import awswrangler as wr

con = wr.redshift.connect(secret_id="my-secret", dbname="my-db")
df = wr.redshift.read_sql_query("SELECT ...", con=con)
con.close()

Bug Fix

Fix connection attributes quoting for wr.*.connect() #481
Fix parquet table append for nested struct columns #480

Thanks

We thank the following contributors/users for their work on this release:

@danielwo, @nmduarteus, @nivf33, @kinghuang, @igorborgest.

P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!


11 Dec 10:58

igorborgest

2.0.1

2a3afb0

AWS Data Wrangler 2.0.1

New Functionalities

New wr.timestream.create_database() function
New wr.timestream.create_table() function
New wr.timestream.delete_database() function
New wr.timestream.delete_table() function
New ignore_empty argument to ignore 0 bytes files for:

Enhancements

Automatically rollback in case of failed queries for:

Thanks

We thank the following contributors/users for their work on this release:

@danielwo, @igorborgest.

P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!


07 Dec 12:27

igorborgest

2.0.0

7d68e78

AWS Data Wrangler 2.0.0

Breaking changes

sqlalchemy and psycopg2 dependencies replaced by redshift_connector and pg8000
All wr.db.* functions were distributed into wr.redshift.*, wr.postgresql.* and wr.mysql.* (Tutorial)
Redshift COPY and UNLOAD functions were refactored into wr.redshift.* (Tutorial)
wr.catalog.get_engine() was replaced by wr.redshift.connect(), wr.postgresql.connect(), wr.mysql.connect() (Tutorial)

New Functionalities

Amazon Timestream support (Tutorial)

Enhancements

General performance improved for S3 I/O by removing eventual consistency guardrails (Reference)
Add retry with decorrelated jitter for Athena and Glue Catalog calls to overcome throttling in high concurrency scenarios.
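
For readers unfamiliar with the term, "decorrelated jitter" is a backoff strategy in which each wait is drawn at random between a base delay and three times the previous wait, capped at a maximum. The following is a minimal sketch of that general technique, not the library's internal code; the function name `retry_with_decorrelated_jitter` and all of its parameters and defaults are assumptions made for illustration.

```python
import random
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_decorrelated_jitter(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    base: float = 0.5,        # hypothetical default: minimum wait in seconds
    cap: float = 10.0,        # hypothetical default: maximum wait in seconds
    max_attempts: int = 5,    # hypothetical default: total calls before giving up
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Call `func`, retrying on `retry_on` with decorrelated-jitter waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    delay = base
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Next wait depends on the previous wait, not on the attempt count.
            delay = min(cap, rng.uniform(base, delay * 3))
            sleep(delay)
    raise AssertionError("unreachable")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky() -> str:
        # Simulates a throttled service that succeeds on the third call.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("throttled")
        return "ok"

    print(retry_with_decorrelated_jitter(flaky, retry_on=(ConnectionError,), base=0.01, cap=0.1))
```

Because each client draws its own random waits, concurrent callers that were throttled at the same moment tend not to retry in lockstep, which is the point of adding jitter in high-concurrency scenarios.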

Docs

Updates regarding all new functionalities
Add Amazon Timestream tutorial
Add Amazon Timestream tutorial 2

AWS re:Invent related news

Thanks

We thank the following contributors/users for their work on this release:

@Brooke-white, @danielwo, @sapientderek, @pmleveque, @igorborgest.

P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!


26 Nov 01:43

igorborgest

1.10.1

ba6e9f5

AWS Data Wrangler 1.10.1

New Functionalities

catalog.add_column() #451
catalog.delete_column() #451

Enhancements

Deterministic result for s3.read_parquet_metadata() #449
~30% faster package import time #460

Bug Fix

Fix Athena read with ctas_approach=False and chunksize=True #458
Fix overwriting for not enforced configs #450

Docs

Small fixes #462 #458 #446

Thanks

We thank the following contributors/users for their work on this release:

@tuannguyen0901, @bryanyang0528, @czagoni, @jesusch, @danielwo, @DonghanYang, @eric-valente, @igorborgest.

P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!


31 Oct 20:31

igorborgest

1.10.0

fa1a439

AWS Data Wrangler 1.10.0

New Functionalities

Add configurable Endpoint URL for AWS services #418
Add global environment configuration for Athena workgroups #437

Enhancements

Support for Apache Arrow 2.0.0 #436
Allow Decimal to float casting for wr.db.read_sql_query() #431
Allow unsafe conversions for wr.db.read_sql_query() #427

Bug Fix

QuickSight functions now allow usernames with "/" #434
Fix duplicated carriage return for wr.s3.to_csv() running on Windows platform.

Thanks

We thank the following contributors/users for their work on this release:

@martinSpears-ECS, @imanebosch, @Eric-He-98, @brombach, @Thomas-Hirsch, @vuchetichbalint, @igorborgest.

P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!

