Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review Perseids implementation #163

Open
zfletch opened this issue Oct 2, 2019 · 4 comments
Open

Review Perseids implementation #163

zfletch opened this issue Oct 2, 2019 · 4 comments

Comments

@zfletch
Copy link

zfletch commented Oct 2, 2019

I've finished writing a Perseids implementation of the DTS API. The base URI is https://dts.perseids.org/ and the source code is available at https://github.com/perseids-project/dts-api/.

I believe that I've implemented the spec correctly, but I'm sure there are still bugs and other issues. If you find any problems or have feedback, please let me know in a comment or create and issue at https://github.com/perseids-project/dts-api/issues/.

Thanks!

@balmas balmas self-assigned this Oct 11, 2019
@balmas
Copy link
Contributor

balmas commented Oct 11, 2019

per 11-Oct-2019 technical committee meeting @balmas will review, others welcome to as well. Initial feedback from @PonteIneptique is that it looks good.

@balmas
Copy link
Contributor

balmas commented Oct 16, 2019

On the whole, I think this looks great! The only things I noticed might be more issues with the dts documentation than the implementation itself.

  1. I wonder if we want to make some recommendations for ids of the main collections endpoint. This implementation reports 'default' which seems not likely to be a very unique identifier. Maybe the recommendation should be for this just to be the full url of the entrypoint if there isn't otherwise a unique identifier for the collections as a whole?

  2. Another question on best practices -- in this implementation sometimes properties use html-escaped versions of ids and paths and sometimes they don't. For example, the values of @id and dts-passage in navigation response are html escaped:

"@id": "\/navigation?groupBy=1&id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2&level=1",

"dts:passage": "\/documents?id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2{&ref}{&start}{&end}",

But elsewhere they are not.

  1. In the documents endpoint, the link header for the collection itself uses "collection" in the documentation we have "collections". I think actually "collection" singular does make more sense.

  2. This is just a note about the data itself -- I would have expected a link rel="up" to the parent book for a request for passage 1.2 of the illiad (/documents/?id=urn:cts:greekLit:tlg0012.tlg001.perseus-grc2&ref=1.2) but it wasn't there.

  3. And another data note: I got a different link rel="up" when I queried a range of passages in book 1 (this gave me the last passage of the text, 24.802) than I did when I queried a single passage in Book 1 (this gave me the last passage of the book, 1.611):

curl -I "https://dts.perseids.org/documents?id=urn:cts:greekLit:tlg0012.tlg001.perseus-grc2&ref=1.1"
...
Link: </documents?id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2&ref=1.2>; rel="next", </documents?id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2&ref=1.1>; rel="first", </documents?id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2&ref=1.611>; rel="last", </navigation?id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2>; rel="contents", </collections?id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2>; rel="collection"

vs

curl -I "https://dts.perseids.org/documents?id=urn:cts:greekLit:tlg0012.tlg001.perseus-grc2&start=1.1&end=1.3"
...
Link: </documents?end=1.6&id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2&start=1.4>; rel="next", </documents?end=1.3&id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2&start=1.1>; rel="first", </documents?end=24.804&id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2&start=24.802>; rel="last", </navigation?id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2>; rel="contents", </collections?id=urn%3Acts%3AgreekLit%3Atlg0012.tlg001.perseus-grc2>; rel="collection"

Otherwise, I think this is great and that we should add it to our list of known implementations.

(@zfletch , just a question about the implementation though ... Thibault and I were a little sad to see the ingest of data from XML into a SQL database ... just curious about your design choice here and whether you considered other approaches, maybe a Ruby implementation of the Capitains models?)

@zfletch
Copy link
Author

zfletch commented Oct 18, 2019

Thanks so much for the detailed review! I've fixed some of the issues you pointed out and I have followup questions and comments for other ones:

  1. I wonder if we want to make some recommendations for ids of the main collections endpoint. ... Maybe the recommendation should be for this just to be the full url of the entrypoint if there isn't otherwise a unique identifier for the collections as a whole?

I agree and I think a recommendation would be useful. I wasn't sure what to use as the ID while developing, so I ended up copying default from other implementations.

  1. ... in this implementation sometimes properties use html-escaped versions of ids and paths and sometimes they don't.

I think my implementation uses escaped IDs when they're part of a path and unescaped IDs when they're just a normal string. So, urn:cts:greekLit:tlg0012.tlg001.perseus-grc2 is not escaped in the collections endpoint but it is escaped in the navigation endpoint because the ID is /navigation?.... I can't think of a better way of doing this, it seems wrong to escape a regular string and not escaping a path could cause problems.

  1. In the documents endpoint, the link header for the collection itself uses "collection" in the documentation we have "collections". I think actually "collection" singular does make more sense.

I'm actually pretty confused about which one is correct to use here. The documentation says collections in the table near the beginning, but the examples all use collection. Does the table take precedence over the examples in the documentation?

(The reason this implementation used collection is because I wrote this implementation by copying and pasting each example into a test file then writing code to make the test pass.)

  1. I would have expected a link rel="up" to the parent book for a request for passage 1.2 of the illiad (/documents/?id=urn:cts:greekLit:tlg0012.tlg001.perseus-grc2&ref=1.2) but it wasn't there.

Thanks for pointing this out! I've added up in this PR. It looks like I missed it for the same reason I used collection: none of the examples use up.

  1. And another data note: I got a different link [rel="last"] when I queried a range of passages in book 1 (this gave me the last passage of the text, 24.802) than I did when I queried a single passage in Book 1 (this gave me the last passage of the book, 1.611):

Thanks for noticing this. There is a pretty major bug in the link functionality. I've (hopefully) fixed it in this PR.

I also saw this issue about totalItems and realized that I was providing the wrong number of totalItems when nav=parents. I have a PR for that here. I do think that the documentation is confusing in this case because it says, "totalItems is the number of children contained by the object."

... just curious about your design choice here and whether you considered other approaches, maybe a Ruby implementation of the Capitains models?

I had two reasons for doing it this way. The primary reason is that this is the technology stack I'm most familiar with. I thought it would be easier for me to develop, maintain, and deal with hosting using it. The secondary reason is that I wanted to design it fairly differently from Capitains—not because there's any problem with Capitains—but because I thought I would be able to provide more useful feedback if I implemented it differently. I was assuming that some other future implementations would use a SQL database and so, writing my own with SQL, I could find particular parts of the spec that were difficult to handle with SQL and offer suggestions.

@balmas
Copy link
Contributor

balmas commented Oct 21, 2019

Thanks for your response @zfletch . I have entered issues for the points about the spec needing clarification (for @id value of collections endpoint and best practices for escaping @id values). Have also made a ref back to this item in #141 to make sure the totalItems documentation being updated takes your confusion into account.

Thanks for making the fixes to the implementation and for your explanation on your design approach with SQL. It's definitely good to have multiple perspectives and example implementations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants