[bug] intermittent 409 Client Error message appears during the final create_stage_index (harvest dag) & publish_collection (publish dag) tasks; re-running the step in Airflow runs in success #1095

Open
gamontoya opened this issue Aug 9, 2024 · 1 comment · May be fixed by #1119
Labels
bug Something isn't working technical debt

Comments

@gamontoya
Collaborator

gamontoya commented Aug 9, 2024

Example error message:

Page to the first attempt for the log:

==

Some notes:
Gabriela initiated batches of 100 collections from the Registry; most ran successfully, but some failed at the final step.
Re-running the final task from Airflow succeeded.
This behavior applies to both the Harvest (to -stage) DAG and the Publish (to -prod) DAG.

Another note:
Gabriela initiated 3 collections from the Registry to publish them to -prod, and also hit this error. So the error does not seem to be caused only by large batch sizes. (Re-running the final task in Airflow works fine, though.)

@gamontoya gamontoya added the bug Something isn't working label Aug 9, 2024
@gamontoya changed the title from "[bug] UCSD re-harvesting issue at create_stage_index task instance (32 collections)" to "[bug] UCSD re-harvesting issue at create_stage_index task instance" Aug 12, 2024
@christinklez changed the title from "[bug] UCSD re-harvesting issue at create_stage_index task instance" to "[bug] intermittent 409 Client Error message appears during the final create_stage_index (harvest dag) & publish_collection (publish dag) tasks; re-running the step runs in success" Aug 13, 2024
@christinklez changed the title from "[bug] intermittent 409 Client Error message appears during the final create_stage_index (harvest dag) & publish_collection (publish dag) tasks; re-running the step runs in success" to "[bug] intermittent 409 Client Error message appears during the final create_stage_index (harvest dag) & publish_collection (publish dag) tasks; re-running the step in Airflow runs in success" Aug 13, 2024
@amywieliczka
Collaborator

This is an issue with OpenSearch not indexing documents quite fast enough, but as long as the counts are the same (here, 16 records indexed and 16 version conflicts), it is harmless.
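One way to make the "counts are the same" check concrete is sketched below. This is a minimal illustration, not the Rikolti code: the function name `conflicts_are_harmless` and the `expected_count` parameter are hypothetical, and it assumes the `_delete_by_query` response body has the shape shown in the log output in this comment (`version_conflicts` plus a `failures` list). Treating "counts are the same" as "number of version conflicts equals number of records just indexed" is also an assumption.

```python
def conflicts_are_harmless(body: dict, expected_count: int) -> bool:
    """Hypothetical helper: decide whether a _delete_by_query response body
    contains nothing but version conflicts, and exactly as many of them as
    records were just indexed (expected_count)."""
    failures = body.get("failures", [])
    # Every reported failure must be a 409 version conflict, nothing else.
    only_version_conflicts = all(
        failure.get("status") == 409
        and failure.get("cause", {}).get("type")
        == "version_conflict_engine_exception"
        for failure in failures
    )
    # The failure list, the reported conflict count, and the number of
    # records just indexed must all agree.
    return (
        only_version_conflicts
        and len(failures) == body.get("version_conflicts") == expected_count
    )


# Trimmed-down stand-in for the response logged in this comment.
sample_failure = {
    "id": "ark:/20775/bb05926072",
    "index": "rikolti-stg-2024-07-11-t15_35_50",
    "status": 409,
    "cause": {"type": "version_conflict_engine_exception"},
}
sample_body = {
    "deleted": 0,
    "total": 16,
    "version_conflicts": 16,
    "failures": [sample_failure] * 16,
}
print(conflicts_are_harmless(sample_body, expected_count=16))
```

A check like this would let a task log the conflict and continue instead of failing, while still failing on any other kind of error.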

Adding some more logging output here for clarification:

[2024-10-01, 20:55:44 UTC] {{logging_mixin.py:150}} INFO - 
----------------------------------------
Indexed 16 records to index `rikolti-stg-2024-07-11-t15_35_50` from page `28306/vernacular_metadata_2024-09-30T23:17:42/mapped_metadata_2024-09-30T23:17:54/with_content_urls_2024-09-30T23:18:16/data/0.jsonl`
     all 16 records had is_shown_by field removed
     all 16 records had thumbnail_source field removed
     all 16 records had thumbnail.from-cache field removed
     all 16 records had media_source field removed
     all 16 records had is_shown_at field removed
     all 16 records had item_count field removed
     all 16 records had thumbnail.component_content_harvest_metadata field removed

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[2024-10-01, 20:55:44 UTC] {{logging_mixin.py:150}} INFO - 
----------------------------------------
> Deleting 16 outdated record(s) from collection 28306 in `rikolti-stg-2024-07-11-t15_35_50` index.
 records: outdated versions
      16: 28306/vernacular_metadata_2024-08-06T00:17:50/mapped_metadata_2024-08-06T00:18:02/with_content_urls_2024-08-06T00:18:23
New indexed documents have version: 28306/vernacular_metadata_2024-09-30T23:17:42/mapped_metadata_2024-09-30T23:17:54/with_content_urls_2024-09-30T23:18:16
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[2024-10-01, 20:55:44 UTC] {{logging_mixin.py:150}} INFO - ERROR @ delete_by_query from /usr/local/airflow/dags/rikolti/record_indexer/index_collection.py
[2024-10-01, 20:55:44 UTC] {{logging_mixin.py:150}} INFO - {'batches': 1,
 'deleted': 0,
 'failures': [{'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb05926072]: version conflict, '
                                   'required seqNo [2180119], primary term '
                                   '[1]. current document has seqNo [2613171] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb05926072',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb18554166]: version conflict, '
                                   'required seqNo [2180120], primary term '
                                   '[1]. current document has seqNo [2613172] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb18554166',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb3493655f]: version conflict, '
                                   'required seqNo [2180121], primary term '
                                   '[1]. current document has seqNo [2613173] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb3493655f',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb3869088g]: version conflict, '
                                   'required seqNo [2180122], primary term '
                                   '[1]. current document has seqNo [2613174] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb3869088g',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb47905951]: version conflict, '
                                   'required seqNo [2180123], primary term '
                                   '[1]. current document has seqNo [2613175] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb47905951',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb54731950]: version conflict, '
                                   'required seqNo [2180124], primary term '
                                   '[1]. current document has seqNo [2613176] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb54731950',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb6633617w]: version conflict, '
                                   'required seqNo [2180125], primary term '
                                   '[1]. current document has seqNo [2613177] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb6633617w',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb67360079]: version conflict, '
                                   'required seqNo [2180126], primary term '
                                   '[1]. current document has seqNo [2613178] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb67360079',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb7282085z]: version conflict, '
                                   'required seqNo [2180127], primary term '
                                   '[1]. current document has seqNo [2613179] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb7282085z',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb7452736r]: version conflict, '
                                   'required seqNo [2180128], primary term '
                                   '[1]. current document has seqNo [2613180] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb7452736r',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb79646854]: version conflict, '
                                   'required seqNo [2180129], primary term '
                                   '[1]. current document has seqNo [2613181] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb79646854',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb82718564]: version conflict, '
                                   'required seqNo [2180130], primary term '
                                   '[1]. current document has seqNo [2613182] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb82718564',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb85107671]: version conflict, '
                                   'required seqNo [2180131], primary term '
                                   '[1]. current document has seqNo [2613183] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb85107671',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb8817938x]: version conflict, '
                                   'required seqNo [2180132], primary term '
                                   '[1]. current document has seqNo [2613184] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb8817938x',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb89544588]: version conflict, '
                                   'required seqNo [2180133], primary term '
                                   '[1]. current document has seqNo [2613185] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb89544588',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb9773574k]: version conflict, '
                                   'required seqNo [2180134], primary term '
                                   '[1]. current document has seqNo [2613186] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb9773574k',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409}],
 'noops': 0,
 'requests_per_second': -1.0,
 'retries': {'bulk': 0, 'search': 0},
 'throttled_millis': 0,
 'throttled_until_millis': 0,
 'timed_out': False,
 'took': 117,
 'total': 16,
 'version_conflicts': 16}
[2024-10-01, 20:55:44 UTC] {{taskinstance.py:1824}} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/decorators/base.py", line 220, in execute
    return_value = super().execute(context)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 181, in execute
    return_value = self.execute_callable()
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 198, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/airflow/dags/rikolti/dags/shared_tasks/indexing_tasks.py", line 123, in stage_collection_task
    index_collection_task("rikolti-stg", collection, version_pages, context)
  File "/usr/local/airflow/dags/rikolti/dags/shared_tasks/indexing_tasks.py", line 23, in index_collection_task
    raise e
  File "/usr/local/airflow/dags/rikolti/dags/shared_tasks/indexing_tasks.py", line 20, in index_collection_task
    index_collection(alias, collection_id, version_pages)
  File "/usr/local/airflow/dags/rikolti/record_indexer/index_collection.py", line 31, in index_collection
    delete_collection_records_from_index(collection_id, index, version_path)
  File "/usr/local/airflow/dags/rikolti/record_indexer/index_collection.py", line 164, in delete_collection_records_from_index
    r = delete_by_query(index, data)
  File "/usr/local/airflow/dags/rikolti/record_indexer/index_collection.py", line 95, in delete_by_query
    r.raise_for_status()
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 409 Client Error: Conflict for url: https://search-rikolti-2-xxbcriyfw5iqysaj7p3fhhscae.us-west-2.es.amazonaws.com/rikolti-stg-2024-07-11-t15_35_50/_delete_by_query
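
The traceback ends at `r.raise_for_status()` inside `delete_by_query`, which turns the 409 into a task failure. Two standard OpenSearch options that could avoid the race are sketched below: refreshing the index before deleting (`POST /<index>/_refresh`) so the search sees the newly indexed documents, and passing `conflicts=proceed` to `_delete_by_query` so version conflicts are counted rather than aborting the request. This is a hedged sketch and not necessarily what the linked pull request does. The helper names, the `endpoint` argument, and the `session` object (anything with a `requests.Session`-style `post` method that already carries authentication) are hypothetical.

```python
def refresh_url(endpoint: str, index: str) -> str:
    """URL of the index refresh API for the given index."""
    return f"{endpoint.rstrip('/')}/{index}/_refresh"


def delete_by_query_url(endpoint: str, index: str) -> str:
    """URL of the delete-by-query API for the given index."""
    return f"{endpoint.rstrip('/')}/{index}/_delete_by_query"


def refresh_then_delete(session, endpoint: str, index: str, query: dict) -> dict:
    """Refresh the index, then delete matching documents.

    `session` is assumed to behave like requests.Session: post() returns an
    object with raise_for_status() and json().
    """
    # Make recently indexed documents visible to search before deleting.
    refresh = session.post(refresh_url(endpoint, index))
    refresh.raise_for_status()

    # conflicts=proceed asks OpenSearch to count version conflicts and keep
    # going instead of aborting the whole delete-by-query request.
    response = session.post(
        delete_by_query_url(endpoint, index),
        params={"conflicts": "proceed"},
        json=query,
    )
    response.raise_for_status()
    return response.json()
```

With `conflicts=proceed`, the caller would still want to inspect `version_conflicts` in the returned body, since conflicts are then reported as a count rather than raised as an error.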

@amywieliczka amywieliczka linked a pull request Oct 1, 2024 that will close this issue
3 participants