2425-W-004 IWTS Artist data ETL #1

Open
3 of 5 tasks
Tracked by #9
saumier opened this issue Jul 30, 2024 · 15 comments

@saumier
Member

saumier commented Jul 30, 2024

Work Order https://docs.google.com/document/d/1-agbLAyTtHIt2W-fTWsEWe--q6g_HRGnmAJd1yiaOws/edit#heading=h.phzh70cof3ne

Tasks

  1. saumier
@saumier
Member Author

saumier commented Aug 15, 2024

Waiting for Frederic to share the IWTS API details so we can evaluate the work.

@SkipTay

SkipTay commented Aug 22, 2024

Hi there. I have built a prototype for the API.

Here are the key notes. As of right now, I am waiting on Ryan and Stephen to implement the new fields in the database.

I can provide the URL for API access and the command-line access, but I am not sure whether this should be done here or by email. If this is a private project, please confirm that it is secure and I can provide these items here.

Here are my current notes from the file.
/*

  • Skip created API for the CAPACOA Arts Data Graph Project.
  • Created June 13, 2024.
  • This file will be used to allow arts data to connect and get performer info for Open Data.
  • Files added for this API to work:
    • Added a line in the .htaccess file to keep the environment variable hidden.
    • Added api_requests.log to check requests. Periodically, we should check to ensure only the arts data graph is connecting.
  • Key functionality:
    • Force HTTPS
    • Rate limiting: May need to check the logs once the data is accessed to ensure our settings are within the Arts Data requirements.
    • Includes the database connection file.
    • Created log function as per above.
    • Validate the API key, connect to the database, return the data in JSON format.
  • Currently Testing:
    • Current query only grabs a specific performerid that is one of Skip's applications for testing.
    • Not interacting with any other performer data currently.
  • To Do:
    • Update query once the field names in the performer table are updated (performerwebsite_2, performerwebsite_3, performerwebsite_4, performerwebsite_5, performer_type, performer_description, consent).
    • Determine the OpenData access process. If the requests will be coming from a specific IP or domain, we can limit access to those specific connections.
      */
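
For readers following along, the request flow listed in the notes above (force HTTPS, validate the API key, rate limit, return JSON) could be sketched roughly as below. This is a minimal illustration in Python, not the IWTS code: the key value, the limits, the status codes, and every function name are assumptions made for the example. Only the "Invalid API key" message is taken from this thread.

```python
import hmac
import time
from collections import defaultdict, deque

# All names and limits below are illustrative assumptions, not the IWTS values.
EXPECTED_API_KEY = "example-key"   # in practice, read from an environment variable
MAX_REQUESTS = 5                   # allowed requests per window, per client
WINDOW_SECONDS = 60.0

_recent = defaultdict(deque)       # client id -> timestamps of accepted requests


def is_valid_key(received):
    """Compare keys in constant time so timing does not leak key contents."""
    return hmac.compare_digest(received.encode(), EXPECTED_API_KEY.encode())


def within_rate_limit(client, now=None):
    """Sliding-window limiter: record the request if the client is under the limit."""
    now = time.monotonic() if now is None else now
    stamps = _recent[client]
    # Drop timestamps that have aged out of the window.
    while stamps and now - stamps[0] >= WINDOW_SECONDS:
        stamps.popleft()
    if len(stamps) >= MAX_REQUESTS:
        return False
    stamps.append(now)
    return True


def handle_request(client, scheme, api_key, now=None):
    """Return (HTTP status, JSON-serialisable body) for one request."""
    if scheme != "https":
        return 403, {"error": "HTTPS required."}
    if not is_valid_key(api_key):
        return 401, {"error": "Unauthorized access. Invalid API key."}
    if not within_rate_limit(client, now):
        return 429, {"error": "Rate limit exceeded."}
    # Placeholder payload; the real API would query the performer table here.
    return 200, [{"performerid": 4, "performer_name": "Skip Taylor"}]
```

The `now` parameter exists only so the limiter can be exercised with fixed timestamps; a real deployment would also need to persist the counters and write to the request log mentioned in the notes.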

@saumier saumier assigned saumier and unassigned fjjulien Aug 23, 2024
@saumier
Member Author

saumier commented Aug 27, 2024

Need help from Skip to call API.

See email:

Hi Skip,

Thanks for sending me access to your API in development.

I tried a call from my local PC but I get the following error (I replaced the key you sent me with stars)

https://iwanttoshowcase.ca/apiiwts.php?api_key=**********
—> {"error": "Unauthorized access. Invalid API key.", "received_api_key": "****" }

I also get the same error message with curl:

curl -H "X-API-Key: ***" https://iwanttoshowcase.ca/apiiwts.php
—> {"error":"Unauthorized access. Invalid API key.","received_api_key":""}

How do you suggest I proceed?

Regarding restricting access: I don’t have a permanent IP address.

Doc from Github: "For scripted calling of IP addresses we are using Github workflows hosted in Azure and subsequently have the same IP address ranges as the Azure datacenters. Since there are so many IP address ranges for GitHub-hosted runners, we do not recommend that you use these as allowlists for your internal resources."

From what I can find, the domain github.com should allow you to restrict access. Perhaps we can test this after I figure out how to call your API from my local PC.

Regards,
Gregory
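
The two call styles tried in the email above (key in the `api_key` query parameter, key in an `X-API-Key` header) can be reproduced from Python with the standard library, which makes it easier to compare exactly what is sent. The empty `received_api_key` in the curl response suggests the header value did not reach the script, although the thread does not confirm the cause. The sketch below only builds the requests; the function names are made up for this example.

```python
import urllib.parse
import urllib.request

API_URL = "https://iwanttoshowcase.ca/apiiwts.php"


def request_with_header(api_key):
    """Build a GET request that carries the key in an X-API-Key header."""
    return urllib.request.Request(API_URL, headers={"X-API-Key": api_key})


def request_with_query(api_key):
    """Build a GET request that carries the key in the api_key query parameter."""
    query = urllib.parse.urlencode({"api_key": api_key})
    return urllib.request.Request(f"{API_URL}?{query}")
```

Passing either object to `urllib.request.urlopen` would send it; that step needs a real key and network access, so it is left out here. Note that `urlencode` percent-encodes the key, which matters if the key contains characters such as `&` or `+`.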

@saumier saumier assigned SkipTay and unassigned saumier Aug 27, 2024
@SkipTay

SkipTay commented Aug 27, 2024 via email

@saumier
Member Author

saumier commented Aug 27, 2024

@SkipTay Data received from API

[
  {
    "performerid": 4,
    "performer_name": "Skip Taylor",
    "performer_country": "Canada",
    "performer_website1": "http://www.osac.ca/",
    "performance_category": "Music",
    "performance_subgenre1": "Cabaret",
    "performer_pronouns": "He/Him"
  }
]

@SkipTay Is this a complete set of properties or are you planning to add/remove properties? I am wondering when I should begin working on mapping these to schema.org and/or wikidata.org?

@SkipTay

SkipTay commented Aug 27, 2024

That is correct. Currently the API only sends one test artist. I am waiting for Ryan to complete his work and add the additional fields. Once the fields are added, I will update the API to send all opt-in artist profiles and all the requested fields.

@SkipTay

SkipTay commented Sep 6, 2024

Hi Gregory,

The new fields were added to the IWTS production tables, and I have updated the API to include them. I have sent the output to @fjjulien to confirm it is as he expected, but if you want to review it now, you can do so using the same URL provided earlier.

Currently it is still just the one test performer, but the field and JSON structure should be there.

@fjjulien

fjjulien commented Sep 6, 2024

I reviewed the output against the IWTS Open Data Model document. Everything is good except:

  1. For performer_country, the stored value should be an ISO 3166-1 alpha-2 code. There are detailed instructions and links in the document on how it should be implemented. According to a comment by Ryan in the doc, this should have been implemented and older values should have been replaced by Steven.
  2. For the performer_type, for interoperability purposes, the stored values should preferably be as per the Schema labels: Person, PerformingGroup, Organization. This said, this mapping can easily be done in Artsdata. Either way is fine. It's not as big a deal as the country two-letter codes.

Keep up the good work!
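
A quick way to spot the two issues listed above in an API response is a small check like the following. It is a sketch under stated assumptions: it only tests that `performer_country` has the shape of an alpha-2 code (two uppercase letters) and does not verify that the code is actually assigned in ISO 3166-1, and the function name is invented for this example.

```python
import re

# Schema labels named in the review comment above.
SCHEMA_TYPES = {"Person", "PerformingGroup", "Organization"}
# Shape check only: two uppercase ASCII letters.
ALPHA2_SHAPE = re.compile(r"[A-Z]{2}")


def check_record(record):
    """Return a list of human-readable problems found in one performer record."""
    problems = []
    country = str(record.get("performer_country") or "")
    if not ALPHA2_SHAPE.fullmatch(country):
        problems.append(f"performer_country {country!r} is not a two-letter code")
    ptype = str(record.get("performer_type") or "")
    if ptype not in SCHEMA_TYPES:
        problems.append(f"performer_type {ptype!r} is not a Schema label")
    return problems
```

For the test record shown earlier in this thread, the country value "Canada" would be reported, while "CA" would pass.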

@SkipTay

SkipTay commented Sep 6, 2024

I haven’t officially heard from Ryan that the work is complete. I suspect they may run a query to replace the existing country codes this weekend. I should have known the schema bit. Again, I’ll wait until Ryan's work is complete before coding that. Stephen may already be coding those values for performer_type. I simply put a value in that field to be sure it was returned.

@SkipTay

SkipTay commented Sep 24, 2024

Hi all.

The fields are all working as expected now, and I believe the output is in the format you expect. I have sent Frederic an image of the output. Gregory, could you call the API and confirm the output suits your needs? The 3 profiles currently included are test profiles, so they should not be loaded into OpenData, but they are marked with consent. Once you approve the output, I will remove consent from these 3 profiles and we will have to wait for official consent from real profiles. I have attached a JPG of the JSON output for reference.

[Image: API JSON output]

@fjjulien

As I wrote over email, the output looks as agreed upon in the open data model. 👍

@saumier
Member Author

saumier commented Oct 8, 2024

@fjjulien @SkipTay I'll review the API output with my team and complete the WO estimate.

@saumier saumier assigned dev-aravind and unassigned SkipTay Oct 8, 2024
@saumier
Member Author

saumier commented Oct 8, 2024

@dev Please take a look at the output from the IWTS API and estimate an ETL into Artsdata. Here is a reference doc https://docs.google.com/document/d/14SHgIWuItkp8lnmeculEctfyLc0YV6CVuZRX4e8h0q0/edit but please estimate only the first step which is to load the data as-is using the same property names as the API with prefix https://iwanttoshowcase.ca/vocabulary#
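
To make the "load as-is" step concrete, here is a rough Python sketch that turns one record from the API into N-Triples, using the API property names under the https://iwanttoshowcase.ca/vocabulary# prefix. The subject URI pattern (`SUBJECT_BASE`) is a hypothetical placeholder, since the thread does not define how performer URIs are minted, and the function names are invented for illustration. It is not the Artsdata pipeline.

```python
import json

VOCAB = "https://iwanttoshowcase.ca/vocabulary#"
# Assumption: hypothetical subject URI pattern; the thread does not define one.
SUBJECT_BASE = "https://iwanttoshowcase.ca/performer/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def literal(value):
    """Serialise a Python value as an N-Triples literal."""
    if isinstance(value, int) and not isinstance(value, bool):
        return f'"{value}"^^<{XSD_INTEGER}>'
    # json.dumps adds the surrounding quotes and escapes quotes, backslashes
    # and control characters in a form N-Triples also accepts.
    return json.dumps(str(value), ensure_ascii=False)


def to_ntriples(record):
    """Emit one triple per non-empty property, keeping the API property names."""
    subject = f"<{SUBJECT_BASE}{record['performerid']}>"
    lines = []
    for key, value in record.items():
        if value is None or value == "":
            continue  # skip empty fields rather than emit empty literals
        lines.append(f"{subject} <{VOCAB}{key}> {literal(value)} .")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {"performerid": 4, "performer_name": "Skip Taylor"}
    print(to_ntriples(sample))
```

Every value is loaded as a plain literal here, including website URLs; deciding which properties should become IRIs or be mapped to schema.org belongs to the later mapping steps.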

@SkipTay

SkipTay commented Oct 9, 2024

Once we reconcile, the API will also include the Open Data Unique Identifier and the Wikidata unique identifier. But that will still take a bit of development on our side.

@fjjulien

Note: I already mapped IWTS's Performance Category vocabulary to Artsdata and Wikidata. The mapping is available in this spreadsheet.

@saumier saumier transferred this issue from culturecreates/artsdata-orion Nov 5, 2024
@saumier saumier added the project label Nov 5, 2024
4 participants