Releases: aws/aws-sdk-pandas

AWS Data Wrangler 2.5.0

03 Mar 16:59

Caveats

⚠️ For platforms without PyArrow 3 support (e.g. MWAA, EMR, Glue PySpark Job):

➡️ pip install pyarrow==2 awswrangler

Documentation

Enhancements

  • Support for ExpectedBucketOwner #562

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @impredicative, @adarsh-chauhan, @Malkard.


P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload it and run!

AWS Data Wrangler 2.4.0 (Docs updated)

04 Feb 13:24

Caveats

⚠️ For platforms without PyArrow 3 support (e.g. EMR, Glue PySpark Job):

➡️ pip install pyarrow==2 awswrangler

Documentation

  • Update to include PyArrow 3 caveats for EMR and Glue PySpark Job. #546 #547

New Functionalities

  • Redshift COPY now supports the new SUPER type (i.e. SERIALIZETOJSON) #514
  • Upload and download files to and from S3 #506
  • Add BUCKETING support when writing S3 datasets #443
  • Enable Merge Upsert for existing Glue Tables on Primary Keys #503
  • Support Requester Pays S3 Buckets #430
  • Add botocore Config to wr.config #535

Enhancements

  • Pandas 1.2.1 support #525
  • Numpy 1.20.0 support
  • Apache Arrow 3.0.0 support #531
  • Python 3.9 support #454

Bug Fix

  • Return DataFrame with unique index for Athena CTAS queries #527
  • Remove unnecessary schema inference. #524

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @danielwo, @jiteshsoni, @igorborgest, @njdanielsen, @eric-valente, @gvermillion, @zseder, @gdbassett, @orenmazor, @senorkrabs, @Natalie-Caruana, @dragonH, @nikwerhypoport, @hwangji.


P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload it and run!

AWS Data Wrangler 2.4.0

03 Feb 23:26

New Functionalities

  • Redshift COPY now supports the new SUPER type (i.e. SERIALIZETOJSON) #514
  • Upload and download files to and from S3 #506
  • Add BUCKETING support when writing S3 datasets #443
  • Enable Merge Upsert for existing Glue Tables on Primary Keys #503
  • Support Requester Pays S3 Buckets #430
  • Add botocore Config to wr.config #535

Enhancements

  • Pandas 1.2.1 support #525
  • Numpy 1.20.0 support
  • Apache Arrow 3.0.0 support #531
  • Python 3.9 support #454

Bug Fix

  • Return DataFrame with unique index for Athena CTAS queries #527
  • Remove unnecessary schema inference. #524

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @danielwo, @jiteshsoni, @igorborgest, @njdanielsen, @eric-valente, @gvermillion, @zseder, @gdbassett, @orenmazor, @senorkrabs, @Natalie-Caruana.


P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload it and run!

AWS Data Wrangler 2.3.0

10 Jan 14:36

New Functionalities

  • DynamoDB support #448
  • SQL Server support (the driver must be installed separately) #356
  • Excel files support #419 #509
  • Amazon S3 Access Point support #393
  • Amazon Chime initial support #494
  • Write compressed CSV and JSON files on S3 #308 #359 #412

Enhancements

  • Add query parameters for Athena #432
  • Add metadata caching for Athena #461
  • Add suffix filters for s3.read_parquet_table() #495

Bug Fix

  • Fix keep_files behavior for failed Redshift COPY executions #505

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @danielwo, @jiteshsoni, @gvermillion, @rodalarcon, @imanebosch, @dwbelliston, @tochandrashekhar, @kylepierce, @njdanielsen, @jasadams, @gtossou, @JasonSanchez, @kokes, @hanan-vian, @igorborgest.


P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload it and run!

AWS Data Wrangler 2.2.0

23 Dec 00:05

New Functionalities

  • Add aws_access_key_id, aws_secret_access_key, aws_session_token and boto3_session for Redshift copy/unload #484

Bug Fix

  • Remove dtype print statement #487

Thanks

We thank the following contributors/users for their work on this release:

@danielwo, @thetimbecker, @njdanielsen, @igorborgest.


P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!

AWS Data Wrangler 2.1.0

21 Dec 11:11

New Functionalities

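  • Add secretsmanager module and support for database connections #402
import awswrangler as wr

# Open a Redshift connection using credentials stored in AWS Secrets Manager.
con = wr.redshift.connect(secret_id="my-secret", dbname="my-db")
df = wr.redshift.read_sql_query("SELECT ...", con=con)
con.close()  # Release the connection once the query result is loaded.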

Bug Fix

  • Fix connection attributes quoting for wr.*.connect() #481
  • Fix parquet table append for nested struct columns #480

Thanks

We thank the following contributors/users for their work on this release:

@danielwo, @nmduarteus, @nivf33, @kinghuang, @igorborgest.


P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!

AWS Data Wrangler 2.0.1

11 Dec 10:58

New Functionalities

Enhancements

Thanks

We thank the following contributors/users for their work on this release:

@danielwo, @igorborgest.


P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!

AWS Data Wrangler 2.0.0

07 Dec 12:27

Breaking changes

New Functionalities

Enhancements

  • Improved general S3 I/O performance by removing the eventual-consistency guardrails (Reference)
  • Add retry with decorrelated jitter for Athena and Glue Catalog calls to overcome throttling in high concurrency scenarios.
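
For readers unfamiliar with the technique named in the last item, the following is a minimal sketch of retrying with decorrelated jitter. It is not the library's implementation: the function names, the default values for base, cap and max_attempts, and the injectable sleep parameter are all assumptions made for illustration. Each delay is drawn uniformly between the base delay and three times the previous delay, then capped.

```python
import random
import time
from typing import Callable, Iterator, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def decorrelated_jitter(
    base: float, cap: float, seed: Optional[int] = None
) -> Iterator[float]:
    """Yield an endless stream of delays in seconds.

    Each delay is uniform in [base, 3 * previous delay], limited to `cap`.
    (Hypothetical helper, not part of awswrangler.)
    """
    rng = random.Random(seed)
    delay = base
    while True:
        delay = min(cap, rng.uniform(base, delay * 3))
        yield delay


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base: float = 0.5,
    cap: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on the given exceptions with jittered delays.

    (Hypothetical helper; the defaults are illustrative assumptions.)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delays = decorrelated_jitter(base, cap)
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            sleep(next(delays))  # Wait a randomized interval before retrying.
    raise RuntimeError("unreachable")
```

In a throttling scenario, func would wrap a single service call and retry_on would list the throttling exception types. Because every caller draws its own random delay, concurrent callers tend not to retry at the same instant.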

Docs

AWS re:Invent related news

Thanks

We thank the following contributors/users for their work on this release:

@Brooke-white, @danielwo, @sapientderek, @pmleveque, @igorborgest.


P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!

AWS Data Wrangler 1.10.1

26 Nov 01:43

New Functionalities

  • catalog.add_column() #451
  • catalog.delete_column() #451

Enhancements

  • Deterministic result for s3.read_parquet_metadata() #449
  • ~30% faster package import time #460

Bug Fix

  • Fix Athena read with ctas_approach=False and chunksize=True #458
  • Fix overwriting of non-enforced configs #450

Docs

Thanks

We thank the following contributors/users for their work on this release:

@tuannguyen0901, @bryanyang0528, @czagoni, @jesusch, @danielwo, @DonghanYang, @eric-valente, @igorborgest.


P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!

AWS Data Wrangler 1.10.0

31 Oct 20:31

New Functionalities

  • Add configurable Endpoint URL for AWS services #418
  • Add global environment configuration for Athena workgroups #437

Enhancements

  • Support for Apache Arrow 2.0.0 #436
  • Allow Decimal to float casting for wr.db.read_sql_query() #431
  • Allow unsafe conversions for wr.db.read_sql_query() #427

Bug Fix

  • QuickSight functions now allow usernames with "/" #434
  • Fix duplicated carriage returns in wr.s3.to_csv() output when running on Windows.

Thanks

We thank the following contributors/users for their work on this release:

@martinSpears-ECS, @imanebosch, @Eric-He-98, @brombach, @Thomas-Hirsch, @vuchetichbalint, @igorborgest.


P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!