In addition to live ingest, continuous ingest supports bulk ingest. A MapReduce job that generates rfiles using the table's splits can be run in a loop like the following to continually bulk import data.
```bash
# create the ci table if necessary
./bin/cingest createtable

for i in $(seq 1 10); do
  # run map reduce job to generate data for bulk import
  ./bin/cingest bulk /tmp/bt/$i
  # ask accumulo to import generated data
  echo -e "table ci\nimportdirectory /tmp/bt/$i/files true" | accumulo shell -u root -p secret
done
./bin/cingest verify
```
Another way to use this in testing is to generate a lot of data and then bulk import it all at once, as follows.
```bash
for i in $(seq 1 10); do
  ./bin/cingest bulk /tmp/bt/$i
done

(
  echo "table ci"
  for i in $(seq 1 10); do
    echo "importdirectory /tmp/bt/$i/files true"
  done
) | accumulo shell -u root -p secret
./bin/cingest verify
```
Bulk ingest can be run concurrently with live ingest into the same table, and it can also be run while the agitator is running; a sketch of such a combined run follows.
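For example, a combined stress test might run the live ingest client in the background while the bulk loop imports rfiles into the same table. This is a minimal sketch: it assumes the live ingest client is started with `./bin/cingest ingest`, and the agitator lines are hypothetical placeholders, since the exact agitator start/stop invocation depends on your accumulo-testing checkout.

```bash
# start live ingest in the background so it writes to the ci table
# while the bulk loop below imports rfiles into the same table
./bin/cingest ingest &
LIVE_PID=$!

# (placeholder: start the agitator here using your checkout's script)
# ./bin/agitator start

for i in $(seq 1 10); do
  ./bin/cingest bulk /tmp/bt/$i
  echo -e "table ci\nimportdirectory /tmp/bt/$i/files true" | accumulo shell -u root -p secret
done

# stop live ingest (and the agitator, if started), then verify the table
kill $LIVE_PID
# ./bin/agitator stop
./bin/cingest verify
```

Because live ingest and bulk import both write linked-list entries to the same table, a single verify run at the end checks the data from both paths together.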