New widget : cleansing data #6640
Comments
Have you checked out the widgets Impute, Unique and Preprocess?
Hello @wvdvegte, this time, yes. I also looked at the preprocessing widget for text. But it's not exactly the same thing, since here you can also clean strings, choose which fields the cleaning applies to, etc. The idea is rather to get better data quality. Best regards, Simon
There's also a lot you can do using the Formula widget and any Python code that fits into a one-line variable assignment, e.g. removing (leading, trailing or all) spaces, case modifications and many other things, as long as it doesn't require external libraries or multiple lines of code. For inexperienced programmers like me, AI chatbots can be used very effectively to generate such code.
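To illustrate the kind of one-line expressions meant here, below is a minimal sketch in plain Python. The variable `name` and its value are hypothetical stand-ins for a single string field; how a field is referenced inside the Formula widget itself is not shown and may differ.

```python
# Hypothetical string value standing in for one field of a data row.
name = "  Ada  Lovelace "

# Each assignment is a single expression using only built-in str methods.
stripped = name.strip()              # remove leading and trailing spaces
no_leading = name.lstrip()           # remove leading spaces only
no_trailing = name.rstrip()          # remove trailing spaces only
no_spaces = name.replace(" ", "")    # remove all spaces
collapsed = " ".join(name.split())   # collapse runs of whitespace to one space
upper = name.strip().upper()         # case modification: upper case
lower = name.strip().lower()         # case modification: lower case
```

None of these need an import, which is what makes them fit into a single assignment.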
@simonaubertbd My first impression is your task can be achieved with some combination of existing widgets. Admittedly, for some specifics, you would indeed need Python Script, particularly for text handling.
If nothing else, such a widget is more text-specific than general Orange. I need to be convinced of its general applicability first. At the moment, it seems specific to your own workflow.
What's your use case?
More than once, the data I work on has contained null values or unwanted spaces at the end of fields, and sometimes a whole field is empty or there are totally empty rows in the middle of the data set. There aren't a lot of cases, but it has happened so many times that a widget to automate the cleanup would be great.
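As a rough illustration of the cleanup described above, here is a minimal sketch in plain Python. It assumes rows are dictionaries keyed by field name; the function name `clean_rows` and the sample data are hypothetical and are not part of Orange or of the proposed widget.

```python
def clean_rows(rows):
    """Strip string fields, then drop fully empty rows and fully empty fields."""

    def normalize(value):
        # Trim surrounding whitespace; treat blank strings as missing (None).
        if isinstance(value, str):
            value = value.strip()
            return value if value else None
        return value

    cleaned = [{key: normalize(val) for key, val in row.items()} for row in rows]

    # Drop rows in which every field is missing.
    cleaned = [row for row in cleaned
               if any(val is not None for val in row.values())]

    # Collect field names in first-seen order, then keep only those
    # that have at least one non-missing value.
    fields = list(dict.fromkeys(key for row in cleaned for key in row))
    keep = [f for f in fields
            if any(row.get(f) is not None for row in cleaned)]

    return [{f: row.get(f) for f in keep} for row in cleaned]


# Hypothetical sample: trailing spaces, an empty row, and an empty field.
sample = [
    {"name": "Ann  ", "city": " Paris", "note": ""},
    {"name": None, "city": "", "note": None},
    {"name": "Bob", "city": None, "note": "   "},
]
result = clean_rows(sample)
```

Remaining null values (such as Bob's missing city) are left in place here, since how to fill them depends on the data.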
What's your proposed solution?
A widget with the main cleaning operations. Something like this:
Are there any alternative solutions?
Using several widgets to process the data in every workflow concerned.