As a user, I want to search for ESA data sets and context products #31

Open
jordanpadams opened this issue Sep 15, 2024 · 3 comments

@jordanpadams
Member

Checked for duplicates

Yes - I've already checked

πŸ§‘β€πŸ”¬ User Persona(s)

Search Engineer

💪 Motivation

...so that I can search for ESA data in our keyword search interface

📖 Additional Details

We need a script to:

  • query the API to get all ESA context products, bundles, and collections
  • download all the files
  • load them into Solr
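
A minimal sketch of the download step, for illustration only. It assumes each product record returned by the API is a dict with a `properties` mapping, and that file URLs are listed under an `ops:Data_File_Info.ops:file_ref` key; both are assumptions, not confirmed in this issue. The helper names (`collect_file_refs`, `local_path_for`, `download_all`) are hypothetical. Loading into the index is left out here.

```python
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import urlretrieve

# Assumed name of the property that holds file URLs in each product record.
FILE_REF_KEY = "ops:Data_File_Info.ops:file_ref"


def collect_file_refs(products):
    """Return every file URL found in the product records, in order, without duplicates."""
    seen, urls = set(), []
    for product in products:
        for url in product.get("properties", {}).get(FILE_REF_KEY, []):
            if url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


def local_path_for(url, root):
    """Map a file URL to a path under root that mirrors the host and URL path."""
    parts = urlsplit(url)
    # Drop empty, "." and ".." segments so a URL cannot escape the download root.
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]
    return Path(root, parts.netloc, *segments)


def download_all(urls, root):
    """Download each URL into the directory tree under root (performs network I/O)."""
    for url in urls:
        target = local_path_for(url, root)
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, target)
```

Mirroring the host and path under one root keeps the downloaded tree easy to map back to the original URLs later.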

Acceptance Criteria

Given an index containing all ESA data loaded in the OpenSearch registry
When I perform a keyword search query for "bepicolombo"
Then I expect the context product for the BepiColombo investigation and the associated bundles/collections to be returned

βš™οΈ Engineering Details

No response

🎉 I&T

No response

@jordanpadams jordanpadams added the B15.1, p.should-have, and requirement labels Sep 15, 2024
@jordanpadams jordanpadams self-assigned this Sep 15, 2024
@jordanpadams jordanpadams changed the title from "As a user, I want to populate the Solr index with ESA products" to "As a user, I want to search for ESA data sets and context products" Sep 28, 2024
@jordanpadams jordanpadams transferred this issue from NASA-PDS/registry-legacy-solr Sep 28, 2024
@nutjob4life
Member

Hi @jordanpadams, a couple of questions:

Can this same query work?

(
    (
        product_class eq "Product_Context" or
        product_class eq "Product_Bundle" or
        product_class eq "Product_Collection"
    )
    and ops:Harvest_Info.ops:node_name like "PSA"
)
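
For illustration, the filter above could be assembled and URL-encoded roughly like this. This is only a sketch: the base URL and the `q` and `limit` parameter names are assumptions about the search API deployment, and `build_query` and `build_url` are hypothetical helpers.

```python
from urllib.parse import urlencode

# Assumed search API endpoint; adjust for the actual deployment.
API_BASE = "https://pds.nasa.gov/api/search/1/products"

PRODUCT_CLASSES = ("Product_Context", "Product_Bundle", "Product_Collection")


def build_query(node_name="PSA", classes=PRODUCT_CLASSES):
    """Build the filter expression shown above for the given node name."""
    class_clause = " or ".join(f'product_class eq "{c}"' for c in classes)
    return f'(({class_clause}) and ops:Harvest_Info.ops:node_name like "{node_name}")'


def build_url(query, limit=100, base=API_BASE):
    """Return a request URL with the query and page size URL-encoded."""
    return f"{base}?{urlencode({'q': query, 'limit': limit})}"


if __name__ == "__main__":
    print(build_url(build_query()))
```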

Secondly, is the idea like registry-legacy-solr#135, but this time downloading all the ops:Data_File_Info.ops:file_refs into a directory tree? Or is there also a pysolr step?

@jordanpadams
Member Author

@nutjob4life this is basically a follow-on to NASA-PDS/registry-legacy-solr#135 where we actually load the data to support the interface. No implementation needed.

@nutjob4life
Member

nutjob4life commented Oct 21, 2024

Oh! So just run harvest -c harvest.cfg?

The harvest.cfg generated by the script in NASA-PDS/registry-legacy-solr#135 looks like this:

<?xml version='1.0' encoding='UTF-8'?>
<harvest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://github.com/NASA-PDS/harvest/blob/main/src/main/resources/conf/configuration.xsd">
  <registry auth="/path/to/auth/file">app://localhost.xml</registry>
  <load>
    <directories>
      <path>/Users/kelly/Documents/Clients/JPL/PDS/Development/nasa-pds/operations/download</path>
    </directories>
  </load>
  <fileInfo processDataFiles="true" storeLabels="true">
    <fileRef replacePrefix="/Users/kelly/Documents/Clients/JPL/PDS/Development/nasa-pds/operations/download" with="https://url/to/archive"/>
  </fileInfo>
  <autoGenFields/>
</harvest>

which I'm certain is wrong, specifically the <registry auth=… and the with= attribute in the <fileRef>. What really should be there? Does ESA PSA have a node URL? (Sorry, I'm a total "harvest" newb.)
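
One way to keep those unknowns explicit is to generate the file from parameters instead of hard-coding them. A rough sketch, assuming the element layout shown above is what harvest expects; `build_harvest_config` is a hypothetical helper, the schema location attribute is omitted for brevity, and the argument values in the usage line are just the placeholders from the config above, not real settings.

```python
import xml.etree.ElementTree as ET


def build_harvest_config(download_dir, registry_url, auth_file, archive_url):
    """Return harvest.cfg content as a string, mirroring the layout shown above."""
    root = ET.Element("harvest")

    # <registry auth="...">registry location</registry>
    registry = ET.SubElement(root, "registry", {"auth": auth_file})
    registry.text = registry_url

    # <load><directories><path>download directory</path></directories></load>
    directories = ET.SubElement(ET.SubElement(root, "load"), "directories")
    ET.SubElement(directories, "path").text = download_dir

    # <fileInfo> rewrites the local download prefix to the public archive URL.
    file_info = ET.SubElement(
        root, "fileInfo", {"processDataFiles": "true", "storeLabels": "true"}
    )
    ET.SubElement(
        file_info, "fileRef", {"replacePrefix": download_dir, "with": archive_url}
    )

    ET.SubElement(root, "autoGenFields")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values copied from the config above; the real ones are still unknown.
    print(build_harvest_config(
        "/path/to/download", "app://localhost.xml",
        "/path/to/auth/file", "https://url/to/archive",
    ))
```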

Status: ToDo