Traffic Parser

Parser for *.csv.gz traffic files. Requires Node.js 12 or later.

Example usage:

$ git clone https://github.com/bbar/traffic-parser.git traffic-parser
$ cd traffic-parser
$ yarn install # (or npm install)
$ node parse.js \
    --sources="/some/path/a.csv.gz /some/path/b.csv.gz /some/path/c.csv.gz /some/path/d.csv.gz" \
    --destination="/some/path/data/parsed/intervals" \
    --weekdays="0,1,2,3,4,5,6" \
    --batch=625 \
    --sourceInterval=5 \
    --targetInterval=60

Argument        Required  Type    Default        Description
sources         yes       String  (none)         Space-separated list of *.csv.gz files to parse
destination     yes       String  (none)         Directory where parsed files are placed
weekdays        no        String  0,1,2,3,4,5,6  Comma-separated weekdays to parse (0=Sunday)
batch           no        Int     625            Max lines written to any file at once
sourceInterval  no        Int     5              Interval, in minutes, between traffic samples in the *.csv.gz file
targetInterval  no        Int     5              Desired interval, in minutes, between traffic samples in the parsed files
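
As a rough illustration of how sourceInterval and targetInterval relate: with the values from the example usage (5 and 60), every 12 consecutive source samples fall into one target sample. The sketch below shows only that grouping. groupSamples is a hypothetical helper, not a function from parse.js, and it assumes targetInterval is a whole multiple of sourceInterval. How parse.js actually combines the samples within each group is not covered here.

```javascript
// Hypothetical helper (not part of parse.js): splits an ordered list of
// samples taken every `sourceInterval` minutes into groups that each span
// `targetInterval` minutes.
function groupSamples(samples, sourceInterval, targetInterval) {
  if (!Number.isInteger(sourceInterval) || sourceInterval <= 0 ||
      !Number.isInteger(targetInterval) || targetInterval <= 0) {
    throw new RangeError('intervals must be positive integers');
  }
  // Assumption: the target interval is a whole multiple of the source interval.
  if (targetInterval % sourceInterval !== 0) {
    throw new RangeError('targetInterval must be a multiple of sourceInterval');
  }

  const perGroup = targetInterval / sourceInterval; // e.g. 60 / 5 = 12
  const groups = [];
  for (let i = 0; i < samples.length; i += perGroup) {
    // The final group may be shorter if the samples do not divide evenly.
    groups.push(samples.slice(i, i + perGroup));
  }
  return groups;
}
```

With both intervals left at their default of 5, each group holds a single sample, so the grouping leaves the data unchanged.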

A quick note about the batch argument: a batch size of 625 means the code parses 625 lines from a *.csv.gz file, writes them to disk, parses the next 625 lines, writes those to disk, and so on until the file is done. If you set a batch size of 30000 or so and haven't raised Node's heap limit with --max-old-space-size, V8 will likely abort with an out-of-memory error. Surprisingly (to me, anyway), a larger batch size doesn't mean better performance. I started at 20000 and kept cutting it in half until I saw the best performance, which was around 625. So that's the default.
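
The batching described above can be sketched roughly as follows. This is a minimal illustration and not the code in parse.js: processInBatches, gzLines, and the flush callback are hypothetical names. It uses only Node's built-in fs, zlib, and readline modules, and it holds at most batchSize lines in memory at any time.

```javascript
const fs = require('fs');
const zlib = require('zlib');
const readline = require('readline');

// Hypothetical helper: consumes lines one at a time and hands them to
// `flush` in groups of at most `batchSize`. Returns the number of flushes.
async function processInBatches(lines, batchSize, flush) {
  let batch = [];
  let flushes = 0;
  for await (const line of lines) {
    batch.push(line);
    if (batch.length >= batchSize) {
      await flush(batch); // e.g. parse the lines and write them to disk
      flushes += 1;
      batch = [];         // start a fresh batch so the old one can be freed
    }
  }
  // Flush whatever is left once the input is exhausted.
  if (batch.length > 0) {
    await flush(batch);
    flushes += 1;
  }
  return flushes;
}

// Hypothetical helper: streams a gzipped file as an async iterable of lines,
// decompressing on the fly instead of loading the whole file.
function gzLines(path) {
  return readline.createInterface({
    input: fs.createReadStream(path).pipe(zlib.createGunzip()),
    crlfDelay: Infinity,
  });
}
```

A call might look like `await processInBatches(gzLines('/some/path/a.csv.gz'), 625, writeBatch)`, where writeBatch is whatever function writes a batch to the destination directory.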
