transformHeader Called Multiple Times in Papa Parse v5.5.1 #1083

Open
sambit04126 opened this issue Jan 16, 2025 · 0 comments

Bug Report: transformHeader Called Multiple Times in Papa Parse v5.5.1

  1. Title
    transformHeader Function Invoked for Data Rows in Papa Parse v5.5.1

  2. Description
    In Papa Parse version 5.5.1, the transformHeader function is being called for each data row's fields instead of exclusively for the header row. This results in unexpected transformations of data fields, leading to incorrect parsed output.

  3. Expected Behavior
    transformHeader should be invoked only once for each header field in the first row of the CSV.
    Data rows should be processed without invoking transformHeader.
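The expectation above can be turned into a check. The following is a minimal sketch, not part of Papa Parse: `recordCalls` and `unexpectedCalls` are hypothetical helper names introduced here for illustration. The wrapper records every value the header-mapping function receives, so any value outside the known header names stands out. The calls at the bottom are simulated by hand; in a real run the wrapped function would be passed as the `transformHeader` option.

```javascript
// Hypothetical helper (not part of Papa Parse): wraps a header-mapping
// function and records every value it is called with.
function recordCalls(fn) {
    const seen = [];
    const wrapped = function (value, ...rest) {
        seen.push(value);
        // Forward any extra arguments unchanged.
        return fn(value, ...rest);
    };
    wrapped.seen = seen;
    return wrapped;
}

// Returns the recorded values that are not in the list of expected headers.
function unexpectedCalls(wrapped, expectedHeaders) {
    const allowed = new Set(expectedHeaders);
    return wrapped.seen.filter((value) => !allowed.has(value));
}

// Simulated calls standing in for what the parser would do.
const recorded = recordCalls((h) => h.trim().toLowerCase());
['Name', 'Age', 'City', 'John Doe'].forEach((v) => recorded(v));

// Anything listed here was passed to the header function but is not a header.
console.log(unexpectedCalls(recorded, ['Name', 'Age', 'City']));
```

The logs in this report show the header row being passed to the function more than once, the second time already transformed, so the list of expected headers may need to include the transformed names as well.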

  4. Actual Behavior
    transformHeader is invoked for each field in data rows, so the data values themselves are transformed.
    In the reproduction below, a lowercasing transformHeader turns 'John Doe' into 'john doe' in the parsed rows.

  5. Steps to Reproduce
    Set up the environment:

Install Papa Parse version 5.5.1:

npm install papaparse@5.5.1

Create a file named index.js and paste the following code:

const Papa = require('papaparse');

const csvContent = `Name,Age,City
John Doe,30,New York
Jane Smith,25,Los Angeles
,abc`;

function mapHeader(header) {
    const transformed = header.trim().toLowerCase();
    console.log(`Transforming header: "${header}" -> "${transformed}"`);
    return transformed;
}

function transformData(data) {
    console.log(`Transforming data: "${data}"`);
    return data;
}

let currentChunk = [];
const chunkSize = 2;

function processChunk(chunk) {
    console.log(chunk);
}

Papa.parse(csvContent, {
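    header: true,                 // treat the first row as field names
    dynamicTyping: true,          // convert numeric strings to numbers
    skipEmptyLines: true,
    transformHeader: mapHeader,   // expected to run on header fields only
    transform: transformData,     // runs on every data value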
    step: function (results, parser) {
        console.log('Parsed Row:', results.data);
        currentChunk.push(results.data);

        if (currentChunk.length === chunkSize) {
            processChunk(currentChunk);
            currentChunk = []; 
        }
    },
    complete: function () {
        console.log('Parsing Complete.');
        if (currentChunk.length > 0) {
            processChunk(currentChunk);
        } else {
            console.log('No remaining data to process.');
        }
    },
    error: function (error) {
        console.error('Parsing Error:', error.message);
    }
});

Run the script:

node index.js

Observe the Output:

With Papa Parse v5.5.1:

Transforming header: "Name" -> "name"
Transforming header: "Age" -> "age"
Transforming header: "City" -> "city"
Transforming header: "name" -> "name"
Transforming header: "age" -> "age"
Transforming header: "city" -> "city"
Transforming header: "John Doe" -> "john doe"
Transforming header: "30" -> "30"
Transforming header: "New York" -> "new york"
Transforming data: "john doe"
Transforming data: "30"
Transforming data: "new york"
Parsed Row: { name: 'john doe', age: 30, city: 'new york' }
Transforming header: "Jane Smith" -> "jane smith"
Transforming header: "25" -> "25"
Transforming header: "Los Angeles" -> "los angeles"
Transforming data: "jane smith"
Transforming data: "25"
Transforming data: "los angeles"
Parsed Row: { name: 'jane smith', age: 25, city: 'los angeles' }
[ { name: 'john doe', age: 30, city: 'new york' },
  { name: 'jane smith', age: 25, city: 'los angeles' } ]
Transforming header: "" -> ""
Transforming header: "abc" -> "abc"
Transforming data: ""
Transforming data: "abc"
Parsed Row: { name: null, age: 'abc' }
Parsing Complete.
[ { name: null, age: 'abc' } ]

With Papa Parse v5.4.1:

Transforming header: "Name" -> "name"
Transforming header: "Age" -> "age"
Transforming header: "City" -> "city"
Transforming header: "Name" -> "name"
Transforming header: "Age" -> "age"
Transforming header: "City" -> "city"
Transforming data: "John Doe"
Transforming data: "30"
Transforming data: "New York"
Parsed Row: { name: 'John Doe', age: 30, city: 'New York' }
Transforming data: "Jane Smith"
Transforming data: "25"
Transforming data: "Los Angeles"
Parsed Row: { name: 'Jane Smith', age: 25, city: 'Los Angeles' }
[ { name: 'John Doe', age: 30, city: 'New York' },
  { name: 'Jane Smith', age: 25, city: 'Los Angeles' } ]
Transforming data: ""
Transforming data: "abc"
Parsed Row: { name: null, age: 'abc' }
Parsing Complete.
[ { name: null, age: 'abc' } ]

As the two outputs show, transformHeader in Papa Parse version 5.5.1 is invoked for the fields of data rows, which version 5.4.1 does not do. With the lowercasing mapHeader above, parsed values such as 'John Doe' come out as 'john doe'. This regression affects data integrity during CSV parsing.
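One possible way to sidestep the behavior is to not use transformHeader at all and map the header row by hand. The sketch below assumes the CSV has already been parsed with `header: false`, so that the rows arrive as arrays with the header row first. `rowsToObjects` is a hypothetical helper introduced here, not part of Papa Parse, and the sample rows are typed in by hand to mirror the reproduction data.

```javascript
// Hypothetical workaround helper (not part of Papa Parse).
// `rows` is an array of arrays whose first entry is the header row.
function rowsToObjects(rows, mapHeader) {
    if (rows.length === 0) return [];
    const [headerRow, ...dataRows] = rows;
    // The header mapping is applied to the first row only.
    const fields = headerRow.map((h) => mapHeader(String(h)));
    return dataRows.map((row) => {
        const obj = {};
        row.forEach((value, i) => {
            // Values beyond the known fields are ignored in this sketch.
            if (i < fields.length) obj[fields[i]] = value;
        });
        return obj;
    });
}

// Hand-written rows mirroring the CSV in the reproduction.
const rows = [
    ['Name', 'Age', 'City'],
    ['John Doe', 30, 'New York'],
    ['Jane Smith', 25, 'Los Angeles'],
];
console.log(rowsToObjects(rows, (h) => h.trim().toLowerCase()));
```

In a real script, `rows` would presumably come from something like `Papa.parse(csvContent, { header: false, dynamicTyping: true, skipEmptyLines: true }).data`. This sketch does not reproduce everything header mode does: short rows simply get fewer keys and extra values are dropped, so it is a stopgap rather than a drop-in replacement.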
