In addition to live ingest, continuous ingest supports bulk ingest. A MapReduce job that generates rfiles using the table's splits can be run in a loop like the following to continually bulk import data.
```bash
# create the ci table if necessary
./bin/cingest createtable

for i in $(seq 1 10); do
  # run map reduce job to generate data for bulk import
  ./bin/cingest bulk /tmp/bt/$i
  # ask accumulo to import generated data
  echo -e "table ci\nimportdirectory /tmp/bt/$i/files true" | accumulo shell -u root -p secret
done
./bin/cingest verify
```
Another way to use this in testing is to generate a lot of data and then bulk import it all at once, as follows.
```bash
for i in $(seq 1 10); do
  ./bin/cingest bulk /tmp/bt/$i
done

(
  echo "table ci"
  for i in $(seq 1 10); do
    echo "importdirectory /tmp/bt/$i/files true"
  done
) | accumulo shell -u root -p secret
./bin/cingest verify
```
Bulk ingest can be run concurrently with live ingest into the same table, and it can also be run while the agitator is running; a sketch of such a combined run follows.
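For example, a combined stress test might run the live ingest client in the background while the bulk loop imports rfiles into the same table. This is a minimal sketch: it assumes the live ingest client is started with `./bin/cingest ingest`, and the agitator lines are hypothetical placeholders, since the exact agitator start/stop invocation depends on your accumulo-testing checkout.

```bash
# start live ingest in the background so it writes to the ci table
# while the bulk loop below imports rfiles into the same table
./bin/cingest ingest &
LIVE_PID=$!

# (placeholder: start the agitator here using your checkout's script)
# ./bin/agitator start

for i in $(seq 1 10); do
  ./bin/cingest bulk /tmp/bt/$i
  echo -e "table ci\nimportdirectory /tmp/bt/$i/files true" | accumulo shell -u root -p secret
done

# stop live ingest (and the agitator, if started), then verify the table
kill $LIVE_PID
# ./bin/agitator stop
./bin/cingest verify
```

Because live ingest and bulk import both write linked-list entries to the same table, a single verify run at the end checks the data from both paths together.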