The bulk load script accepts text from standard input, representing Elasticsearch documents. It then calls indexing code that is shared with the regular indexing functionality, even though the argument type is different.
This makes the code difficult to work on, because the payload passed through the indexing code can be either a raw string or an array of hashes. This complexity affects all of the indexing code, for example:
    def bulk_payload(document_hashes_or_payload)
      if document_hashes_or_payload.is_a?(Array)
        index_items_from_document_hashes(document_hashes_or_payload)
      else
        index_items_from_raw_string(document_hashes_or_payload)
      end
    end
Why
There are two separate code paths that essentially do the same thing, and if you make any change to this code you have to be very careful to change both of them in the same way, and test both of them.
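One way to collapse the two paths is to parse the raw string into document hashes at the boundary, so the shared indexing code only ever sees an array. The following is a minimal sketch, not the project's implementation. It assumes the text on standard input is newline-delimited JSON in the Elasticsearch bulk format (an action line followed by a document line), and that a document hash carries `_id` and `_type` alongside its fields. The helper name `document_hashes_from_raw_string` is hypothetical.

```ruby
require "json"

# Hypothetical helper: converts raw bulk text into an array of document hashes.
# Assumption: the input alternates between an action line such as
# {"index": {"_id": ..., "_type": ...}} and a document line holding the fields.
def document_hashes_from_raw_string(raw)
  # Ignore blank lines so trailing newlines do not break the pairing.
  lines = raw.each_line.map(&:strip).reject(&:empty?)

  lines.each_slice(2).map do |action_line, document_line|
    raise ArgumentError, "action line without a document line" if document_line.nil?

    metadata = JSON.parse(action_line).fetch("index")
    document = JSON.parse(document_line)

    # Assumed hash shape: fields plus "_id" and "_type"; nil values are dropped.
    document.merge("_id" => metadata["_id"], "_type" => metadata["_type"]).compact
  end
end
```

With a helper like this, the bulk load script could call `index_items_from_document_hashes(document_hashes_from_raw_string($stdin.read))`, and `bulk_payload` would no longer need the `is_a?(Array)` branch. For very large inputs the script would probably want to parse and index in batches instead of reading everything at once.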
Moved from https://trello.com/c/8zhBPuQT/12-make-bulk-loader-work-with-arrays-instead-of-strings.
What
Every night a job runs to rebuild the search index with new popularity data.
https://github.com/alphagov/search-analytics/blob/master/nightly-run.sh