Although PapaParse has a delimiter-guessing mechanism, the method is far from accurate. Research on this subject is ongoing: libraries like CleverCSV implement robust dialect-sniffing strategies backed by scientific research.
However, it is difficult to separate the delimiter-guessing mechanism from the rest of the Python CleverCSV library. Recent research has proposed a new universal method that can be integrated into any CSV parser and that, as its research paper demonstrates, is far more reliable than CleverCSV.
The proposal is to implement the Table Uniformity Method by porting the existing code; it is already implemented in Python and is being considered for implementation in Rust. That way, the wonderful PapaParse project would have a state-of-the-art delimiter-guessing strategy, significantly improving automation in CSV processing.
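To make the idea of scoring candidate delimiters by how regular the resulting table is more concrete, here is a minimal sketch. It is not the Table Uniformity Method and it is not PapaParse's current heuristic; it is a simplified row-consistency score, and the names `countFields` and `guessDelimiter`, the candidate list, and the 50-line sample size are all assumptions made for illustration.

```typescript
// Illustrative only: a naive consistency score, NOT the Table Uniformity Method
// and NOT PapaParse's internal logic. All names here are hypothetical.

// Count the fields a delimiter would produce on one line,
// ignoring delimiters that appear inside double-quoted fields.
function countFields(line: string, delimiter: string): number {
  let fields = 1;
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (ch === '"') {
      // An escaped quote ("") toggles twice, so the state is preserved.
      inQuotes = !inQuotes;
    } else if (ch === delimiter && !inQuotes) {
      fields++;
    }
  }
  return fields;
}

// Pick the candidate whose most common field count covers the most lines,
// weighted by that field count. Returns null when no candidate splits the rows.
function guessDelimiter(
  text: string,
  candidates: string[] = [",", ";", "\t", "|"]
): string | null {
  // Simplification: lines are split naively, so quoted newlines are not handled.
  const lines = text.split(/\r?\n/).filter((l) => l.length > 0).slice(0, 50);
  let best: string | null = null;
  let bestScore = 0;

  for (const d of candidates) {
    // Frequency of each field count across the sampled lines.
    const freq: { [count: string]: number } = {};
    for (const line of lines) {
      const key = String(countFields(line, d));
      freq[key] = (freq[key] || 0) + 1;
    }

    // Find the most frequent field count (ties go to the larger count).
    let modeCount = 0;
    let modeFreq = 0;
    for (const key of Object.keys(freq)) {
      const c = Number(key);
      const f = freq[key];
      if (f > modeFreq || (f === modeFreq && c > modeCount)) {
        modeCount = c;
        modeFreq = f;
      }
    }

    // A delimiter that leaves the typical row as a single field is not useful.
    if (modeCount < 2) continue;

    const score = (modeFreq / lines.length) * modeCount;
    if (score > bestScore) {
      bestScore = score;
      best = d;
    }
  }
  return best;
}
```

A ported implementation would replace the scoring step with the method described in the paper; the surrounding shape (try each candidate, score the resulting table, keep the best) is the part this sketch is meant to show.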