Scanner support for zip archives containing multiple files #165
Labels: enhancement, needs investigation, scanners
Had the following exchange in the Redpanda Community Slack, where we (@mihaitodor, realistically) realized that the existing scanners don't support zip archives containing multiple files.
In case it helps, just to break down a use case: I'm grabbing a data dump a few times per day via SFTP that has a bunch of CSV files. It's pretty sizeable, so opening everything in memory (as @mihaitodor pointed out I can already do with the `unarchive` processor) is pretty expensive (memory-wise, I mean).

This is my first time using Benthos/Redpanda Connect, but I'm assuming that this hypothetical scanner will need to either add some metadata to each line/message to indicate which file it's coming from or otherwise allow me to batch them in a way that I can tell which is which for subsequent processing.
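
To make the idea concrete, here is a rough sketch of the behaviour I have in mind, written in Python with the standard `zipfile` module purely for illustration. It is not Redpanda Connect code or its scanner API, and `scan_zip_lines` is a hypothetical name. It reads one archive member at a time, decompressing lazily, and tags every line with the file it came from. One caveat worth noting: the zip central directory sits at the end of the archive, so the input has to be seekable (a local temp file or a seekable remote handle), not a pure forward-only stream.

```python
import io
import zipfile
from typing import BinaryIO, Iterator, Tuple


def scan_zip_lines(archive: BinaryIO) -> Iterator[Tuple[str, int, str]]:
    """Yield (member_name, line_number, line) for each line of each file.

    Hypothetical helper: only illustrates per-file, line-by-line scanning
    with the source file name attached as metadata.
    """
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no data
            # ZipFile.open returns a stream that decompresses as it is read,
            # so the whole member is never loaded into memory at once.
            with zf.open(info) as member:
                text = io.TextIOWrapper(member, encoding="utf-8")
                for line_number, line in enumerate(text, start=1):
                    yield info.filename, line_number, line.rstrip("\n")


if __name__ == "__main__":
    # Build a small in-memory archive with two CSV files to scan.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.csv", "id,name\n1,foo\n")
        zf.writestr("b.csv", "id,name\n2,bar\n")
    buf.seek(0)
    for name, number, line in scan_zip_lines(buf):
        print(name, number, line)
```

With the file name attached to every line, downstream processing could group or batch messages per source file, which is the part I care about for the CSV dump.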
If there's any way I can be of assistance, please let me know.