sweepers not run(ning) against geo-prod #124

Open
jordanpadams opened this issue May 9, 2024 · 16 comments

@jordanpadams
Member

jordanpadams commented May 9, 2024

Checked for duplicates

No - I haven't checked

πŸ› Describe the bug

When I ran a members query on a collection that should have members, it returned no results.

πŸ•΅οΈ Expected behavior

I expected the query to return the collection's member products.

📜 To Reproduce

curl --GET https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data::1.1/members | json_pp
{
   "data" : [],
   "summary" : {
      "hits" : 0,
      "limit" : 100,
      "properties" : [],
      "q" : "",
      "search_after" : [],
      "sort" : [],
      "took" : 25
   }
}
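
A minimal sketch of the same check in Python, for anyone who wants to script it. The base URL and LIDVID come from the curl command above; the helper names `members_url` and `hit_count` are hypothetical, and the sample body is the reported response trimmed to the fields used here. No network call is made.

```python
import json

# Base URL copied from the curl command in this issue.
API_BASE = "https://pds.nasa.gov/api/search/1"


def members_url(lidvid: str) -> str:
    """Build the members endpoint URL for a collection LIDVID (hypothetical helper)."""
    return f"{API_BASE}/products/{lidvid}/members"


def hit_count(response_text: str) -> int:
    """Return summary.hits from a response body shaped like the one above."""
    payload = json.loads(response_text)
    return int(payload["summary"]["hits"])


# Response body as reported above, trimmed to the fields used here.
sample = '{"data": [], "summary": {"hits": 0, "limit": 100}}'

url = members_url("urn:nasa:pds:msl_gt_diagenesis_supplement:data::1.1")
if hit_count(sample) == 0:
    print(f"no members returned for {url}")
```

A collection with registered member products would be expected to report a non-zero `hits` value here.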

🖥 Environment Info

Chrome / MacOSx

📚 Version of Software Used

Latest deployed

🩺 Test Data / Additional context

No response

🦄 Related requirements

This is blocking:

βš™οΈ Engineering Details

No response

🎉 Integration & Test

No response

@jordanpadams
Member Author

jordanpadams commented May 9, 2024

@alexdunnjpl @tloubrieu-jpl any idea why this query is not working? We noticed this because several attempts to run deep-archive have failed or produced incorrect data products. This has already blocked us on several occasions, and we thought we had fixed those, so I'm not sure what happened.

@alexdunnjpl
Contributor

alexdunnjpl commented May 9, 2024

@jordanpadams do you have an example product (full url to document preferred), which should be appearing in this query?

@jordanpadams
Member Author

@alexdunnjpl here is an opensearch query with the associated data products:
https://search-geo-prod-6iz6lwiw6luyffpsq52ndsrtbu.us-west-2.es.amazonaws.com/_dashboards/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15y,to:now))&_a=(columns:!(lid,'ops:Tracking_Meta%2Fops:archive_status'),filters:!(),index:'04de9280-9067-11ed-aa4d-b9457fec4322',interval:auto,query:(language:kuery,query:'lid:urn%5C:nasa%5C:pds%5C:msl_gt_diagenesis_supplement%5C:data*'),sort:!())

Time | ops:Tracking_Meta/ops:archive_status | _id
Mar 19, 2024 @ 07:51:22.348 | archived |
Mar 19, 2024 @ 07:51:22.270 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:target_classification::1.0
Mar 19, 2024 @ 07:51:22.170 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:nodule_rich_bedrock::1.0
Mar 19, 2024 @ 07:51:22.070 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:nodules::1.0
Mar 19, 2024 @ 07:51:21.985 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:local_rmsep_sigma20_win50_n20::1.0
Mar 19, 2024 @ 07:51:21.947 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:dark_strata::1.0
Mar 19, 2024 @ 07:51:21.862 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:cements::1.0
Mar 19, 2024 @ 07:51:19.577 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data::1.1
Nov 2, 2022 @ 10:55:10.771 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data::1.0

https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:cements::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:nodule_rich_bedrock::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:nodules::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:dark_strata::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:local_rmsep_sigma20_win50_n20::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:target_classification::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:veins::1.0

@alexdunnjpl
Contributor

@jordanpadams taking https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:cements::1.0 as an example, there is no sweepers metadata present in the document.

Has sweepers been running on whichever OpenSearch node hosts the relevant product documents?

@jordanpadams
Member Author

@alexdunnjpl I have no idea...

@alexdunnjpl
Contributor

Plan to run local sweepers against GEO. Currently blocked by GEO node getting hammered by MCP migration.

alexdunnjpl transferred this issue from NASA-PDS/registry-api May 9, 2024
alexdunnjpl changed the title from "members query not working for collections" to "sweepers not run(ning) against geo-prod" May 9, 2024
@alexdunnjpl
Contributor

possibly due to ReadErrors being encountered

@sjoshi-jpl is there any record available of if/when the geo-prod sweepers jobs started failing?

@alexdunnjpl
Contributor

Initial assumption about load was incorrect - there is a block of very-large documents in GEO, resulting in some requests taking more than an order of magnitude longer than others, specifically in repairkit (which does not pull document subsets)

Currently resolving by dropping repairkit page size to 500 and increasing timeout to 180sec.

At 500/pg, maximum observed request time was 1m42s
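
For illustration only, a rough sketch of what paging at 500 documents per request with a 180-second timeout could look like. This is not the repairkit implementation: `fetch_page`, `iter_documents`, and `fake_fetch` are hypothetical names, and the in-memory fetcher stands in for whatever client call is really made. The 1m42s maximum observed above is 102 seconds, which sits under the 180-second timeout.

```python
from typing import Callable, Iterator, List, Optional

PAGE_SIZE = 500        # page size mentioned in the comment above
TIMEOUT_SECONDS = 180  # request timeout mentioned in the comment above

# fetch_page(size, timeout, search_after) -> list of hits. Hypothetical
# stand-in for the real client call, which is not shown in this issue.
FetchPage = Callable[[int, int, Optional[str]], List[dict]]


def iter_documents(fetch_page: FetchPage,
                   page_size: int = PAGE_SIZE,
                   timeout: int = TIMEOUT_SECONDS) -> Iterator[dict]:
    """Yield every document, requesting at most page_size documents per call."""
    search_after: Optional[str] = None
    while True:
        hits = fetch_page(page_size, timeout, search_after)
        if not hits:
            return
        yield from hits
        if len(hits) < page_size:
            return  # a short page means there is nothing left to fetch
        search_after = hits[-1]["_id"]  # resume after the last document seen


def fake_fetch(size: int, timeout: int, search_after: Optional[str]) -> List[dict]:
    """In-memory fetcher over 1200 made-up document ids (assumption, for demo)."""
    ids = [f"doc-{i:04d}" for i in range(1200)]
    start = 0 if search_after is None else ids.index(search_after) + 1
    return [{"_id": doc_id} for doc_id in ids[start:start + size]]


docs = list(iter_documents(fake_fetch))
print(len(docs))
```

A smaller page size bounds how many very large documents can land in a single response, which is the trade-off described above: more requests, but each one finishes well inside the timeout.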

@alexdunnjpl
Contributor

Note to self - solved but not yet implemented, pending discussion with @tloubrieu-jpl

@jordanpadams
Member Author

@alexdunnjpl @tloubrieu-jpl where are we at with this? It looks like this may be resolved?

@alexdunnjpl
Contributor

@jordanpadams I need to loop back to it with @tloubrieu-jpl to decide on how we want to tweak the timeout parameters to resolve the issue.

I've run the sweepers locally to resolve the state of geo having no ancestry metadata and there's a good chance that sweepers are now running against GEO (because the massive docs are dealt with and no longer fetched by ancestry sweeper), but the root cause remains outstanding.

If it's important to close this out let me know - should be a quick thing, I've just been laser-focused on the migration and have been ignoring everything else.

@jordanpadams
Member Author

@alexdunnjpl 👍 all good. just checking.

@tloubrieu-jpl
Member

This is going to be worked on after the migration to MCP is completed.

@tloubrieu-jpl
Member

@alexdunnjpl @sjoshi-jpl we said we would work on this after the migration to MCP. Where are we with the sweepers running on the nodes? Thanks.

@alexdunnjpl
Contributor

alexdunnjpl commented Nov 11, 2024

@tloubrieu-jpl this is probably a perfunctory close at this point once it's able to be retested - I'll defer to Sagar on status but it'll become clear once the sweeper is running on GEO.

@tloubrieu-jpl
Member

blocked by #147
