(Draft) ScienceDirect: Object Retrieval API #353

nils-herrmann · 2024-09-03T12:27:55Z

Research for the implementation of the ScienceDirect: Object Retrieval API. The API description states: "These interfaces represent retrieval of objects associated with a full text article. This resource can also return reference details for an individual object or an entire full text article. The reference metadata response will contain links to the associated Full-Text article."

nils-herrmann · 2024-09-03T13:13:24Z

The Object Retrieval API returns different things depending on the query:

Object references metadata of a document
A specific object
A thumbnail image
A regular-sized image
A high resolution image

I think the best choice is to expose the object reference metadata of a document as a property and to provide functions for retrieving specific objects. Here is an example:

All the object references metadata of a document are retrieved at initialisation

objects = ObjectRetrieval('S0360131524001623', refresh=True)
objects.object_references

[{'url': 'https://api.elsevier.com/content/object/eid/1-s2.0-S0360131524001623-gr1.jpg?httpAccept=%2A%2F%2A',
  'eid': '1-s2.0-S0360131524001623-gr1.jpg',
  'ref': 'gr1',
  'filename': 'gr1.jpg',
  'mimetype': 'image/jpeg',
  'size': '135574',
  'height': '346',
  'width': '491',
  'type': 'IMAGE-DOWNSAMPLED'},
 {'url': 'https://api.elsevier.com/content/object/eid/1-s2.0-S0360131524001623-gr2.jpg?httpAccept=%2A%2F%2A',
  'eid': '1-s2.0-S0360131524001623-gr2.jpg',
  'ref': 'gr2',
  'filename': 'gr2.jpg',
  'mimetype': 'image/jpeg',
  'size': '129935',
  'height': '365',
  'width': '624',
  'type': 'IMAGE-DOWNSAMPLED'}
  ...]

Specific objects can be queried with a function

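from PIL import Image  # Pillow
from IPython.display import display  # notebook display helper
gr1 = objects.get_specific_object('gr1')
image = Image.open(gr1)  # assumes gr1 is a file path or file-like object
display(image)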
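For illustration, here is a minimal sketch of how a specific entry could be looked up by its 'ref' in the metadata list shown above. The helper name `find_reference` and the conversion of 'size', 'height' and 'width' to integers are assumptions made for this example; they are not part of the draft or of pybliometrics.

```python
from typing import List, Optional

# Two entries copied from the metadata output above ('url' omitted for brevity).
SAMPLE_REFERENCES = [
    {"eid": "1-s2.0-S0360131524001623-gr1.jpg", "ref": "gr1",
     "filename": "gr1.jpg", "mimetype": "image/jpeg", "size": "135574",
     "height": "346", "width": "491", "type": "IMAGE-DOWNSAMPLED"},
    {"eid": "1-s2.0-S0360131524001623-gr2.jpg", "ref": "gr2",
     "filename": "gr2.jpg", "mimetype": "image/jpeg", "size": "129935",
     "height": "365", "width": "624", "type": "IMAGE-DOWNSAMPLED"},
]

# The API returns these values as strings; converting them is our choice.
NUMERIC_FIELDS = ("size", "height", "width")


def find_reference(references: List[dict], ref: str) -> Optional[dict]:
    """Return a copy of the first entry whose 'ref' matches, or None.

    Hypothetical helper: numeric fields are converted to int in the copy,
    the input list is left untouched.
    """
    for entry in references:
        if entry.get("ref") == ref:
            converted = dict(entry)
            for field in NUMERIC_FIELDS:
                if field in converted:
                    converted[field] = int(converted[field])
            return converted
    return None


gr2 = find_reference(SAMPLE_REFERENCES, "gr2")
print(gr2["filename"], gr2["width"], gr2["height"])
```

A lookup like this could sit behind a function such as get_specific_object, so that an unknown 'ref' can be rejected before any request is sent.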

It seems pretty cool that the library would also let users access more than text (images, videos, Excel sheets, Word documents). What do you think, @Michael-E-Rose?

Michael-E-Rose · 2024-09-08T10:33:40Z

We had such a case before, with the SerialSearch() and the SerialTitle() classes which both access the Serial Title API. That API has a search part and a retrieval part, so we created two classes. I'm thinking about the same for this class. However, we might put the three image retrieval classes together and handle the quality of the image via view. Could you please check whether the return values of the three image access points are (almost) the same?

nils-herrmann · 2024-09-09T15:28:34Z

Implementing two classes (1 for metadata and 1 for the objects) is a good idea.

Regarding the images:

There are 3 image views (STANDARD, THUMBNAIL, HIGH)
STANDARD is always available
THUMBNAIL and HIGH are not always available
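Since only STANDARD is always available, a single image class that takes a view could fall back to STANDARD when the requested view is missing. The following is a rough sketch under assumptions: `fetch_with_fallback` and the `fetch` callable (returning bytes, or None when a view is unavailable) are hypothetical names used only for illustration, and the real API may signal a missing view differently.

```python
from typing import Callable, Optional, Tuple

# Per the findings above, STANDARD is the only view that is always available.
FALLBACK_VIEW = "STANDARD"


def fetch_with_fallback(fetch: Callable[[str], Optional[bytes]],
                        view: str = "STANDARD") -> Tuple[str, bytes]:
    """Try the requested view first, then fall back to STANDARD.

    `fetch` is a hypothetical callable that returns the image bytes for a
    view, or None if that view is not available for the object.
    Returns the view that was actually used together with its content.
    """
    view = view.upper()
    content = fetch(view)
    if content is not None:
        return view, content
    if view != FALLBACK_VIEW:
        content = fetch(FALLBACK_VIEW)
        if content is not None:
            return FALLBACK_VIEW, content
    raise LookupError(f"No image available for view {view!r}")


def fake_fetch(view: str) -> Optional[bytes]:
    """Stand-in for a real request: only the STANDARD view exists."""
    available = {"STANDARD": b"standard-bytes"}
    return available.get(view)


used_view, data = fetch_with_fallback(fake_fetch, "HIGH")
print(used_view, len(data))
```

Returning the view that was actually used lets the caller see when a fallback happened rather than silently receiving a lower-quality image.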

nils-herrmann · 2024-09-09T17:18:15Z

(Update with answers) For the sake of documentation: I'm having some trouble retrieving .svg objects. The attached file illustrates the problem with two questions:

Why do I get a 404 when retrieving .svg objects?

This is an error on Elsevier's side. The solution is to query with a view (id/{id}/ref/{ref}/{view}).

Why does the metadata return the MIME type 'image/svg+xml' although it cannot be found in the documentation?

SVG is extensible, conformant "image/svg+xml" processors must expect that content received is well-formed XML, but it cannot be guaranteed that the content is valid to a particular DTD or Schema or that the processor will recognize all of the elements and attributes in the document.
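The view-based path from the first answer could be assembled roughly as follows. This is only a sketch: the base URL is inferred from the 'url' values in the metadata output earlier in this thread, while the helper name `build_object_url`, the 'pii' identifier type and the lowercase view value are assumptions that should be checked against the Elsevier documentation.

```python
from typing import Optional
from urllib.parse import quote

# Assumption: base path inferred from the 'url' fields in the metadata output.
BASE_URL = "https://api.elsevier.com/content/object"


def build_object_url(identifier: str, ref: str,
                     view: Optional[str] = None,
                     id_type: str = "pii") -> str:
    """Build a URL following the pattern {id_type}/{id}/ref/{ref}/{view}.

    Hypothetical helper. `id_type` stands in for the leading 'id' part of
    the pattern quoted above; the view segment is appended only when given
    and is passed through unchanged.
    """
    parts = [BASE_URL, id_type, quote(identifier, safe=""),
             "ref", quote(ref, safe="")]
    if view:
        parts.append(view)
    return "/".join(parts)


# Querying with an explicit view, as suggested for .svg objects.
print(build_object_url("S0360131524001623", "gr1", view="high"))
```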

Michael-E-Rose · 2024-10-07T16:38:22Z

Scopus doesn't maintain its documentation well. There are things in the API but not in the documentation, and vice versa. So, don't investigate the issue too much; pybliometrics can go without svg if needed.

nils-herrmann · 2024-10-16T12:20:23Z

I contacted Elsevier's Data Support Team and they clarified the issue. The answers are documented above.

nils-herrmann · 2024-10-17T09:14:30Z

This draft resulted in two issues: #355 and #360

nils-herrmann added the Effort: High label Sep 3, 2024

nils-herrmann self-assigned this Sep 3, 2024

nils-herrmann added a commit to nils-herrmann/pybliometrics that referenced this issue Sep 5, 2024

pybliometrics-dev#353 First implementation of the ObjectRetrieval API

57ee8c5

nils-herrmann mentioned this issue Sep 10, 2024

ScienceDirect: ObjectMetadata API #355

Open

nils-herrmann added a commit to nils-herrmann/pybliometrics that referenced this issue Sep 10, 2024

pybliometrics-dev#353 Second draft with ObjectRetrieval API

14ec92a

nils-herrmann changed the title ~~ScienceDirect: Object Retrieval API~~ (Documentation) ScienceDirect: Object Retrieval API Sep 11, 2024

nils-herrmann changed the title ~~(Documentation) ScienceDirect: Object Retrieval API~~ (Draft) ScienceDirect: Object Retrieval API Sep 11, 2024

nils-herrmann closed this as completed Oct 17, 2024
