Implemented autocomplete endpoint and added documentation #33

Merged
merged 8 commits into from
Mar 25, 2024
44 changes: 41 additions & 3 deletions README.md
@@ -23,7 +23,7 @@ The following features of OpenAlex are currently supported by PyAlex:
- [x] Select fields
- [x] Sample
- [x] Pagination
- [ ] [Autocomplete endpoint](https://docs.openalex.org/how-to-use-the-api/get-lists-of-entities/autocomplete-entities)
- [x] Autocomplete endpoint
- [x] N-grams
- [x] Authentication

@@ -45,10 +45,10 @@ pip install pyalex

## Getting started

PyAlex offers support for all [Entity Objects](https://docs.openalex.org/api-entities/entities-overview): [Works](https://docs.openalex.org/api-entities/works), [Authors](https://docs.openalex.org/api-entities/authors), [Sources](https://docs.openalex.org/api-entities/sourcese), [Institutions](https://docs.openalex.org/api-entities/institutions), [Concepts](https://docs.openalex.org/api-entities/concepts), [Publishers](https://docs.openalex.org/api-entities/publishers), and [Funders](https://docs.openalex.org/api-entities/funders).
PyAlex offers support for all [Entity Objects](https://docs.openalex.org/api-entities/entities-overview): [Works](https://docs.openalex.org/api-entities/works), [Authors](https://docs.openalex.org/api-entities/authors), [Sources](https://docs.openalex.org/api-entities/sourcese), [Institutions](https://docs.openalex.org/api-entities/institutions), [Concepts](https://docs.openalex.org/api-entities/concepts), [Publishers](https://docs.openalex.org/api-entities/publishers), [Funders](https://docs.openalex.org/api-entities/funders), and [Autocomplete](https://docs.openalex.org/how-to-use-the-api/get-lists-of-entities/autocomplete-entities).

```python
from pyalex import Works, Authors, Sources, Institutions, Concepts, Publishers, Funders
from pyalex import Works, Authors, Sources, Institutions, Concepts, Publishers, Funders, autocomplete
```

### The polite pool
@@ -63,6 +63,18 @@ import pyalex
pyalex.config.email = "[email protected]"
```

### Max retries

By default, PyAlex raises an error at the first failure when querying the OpenAlex API. Set `max_retries` to a number higher than 0 to let PyAlex retry when an error occurs. `retry_backoff_factor` controls the delay between two retries, and `retry_http_codes` lists the HTTP error codes that should trigger a retry.

```python
from pyalex import config

config.max_retries = 0
config.retry_backoff_factor = 0.1
config.retry_http_codes = [429, 500, 503]
```
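As a rough illustration of how these three settings could interact, the sketch below computes retry delays under an exponential backoff schedule. The schedule itself and the helper names `backoff_delays` and `should_retry` are assumptions made for illustration; the actual delays depend on the HTTP retry mechanism PyAlex relies on.

```python
def backoff_delays(max_retries, backoff_factor):
    """Hypothetical helper: delay in seconds before each retry.

    Assumes a plain exponential schedule (factor * 2**attempt); the real
    schedule may differ.
    """
    return [backoff_factor * (2 ** attempt) for attempt in range(max_retries)]


def should_retry(status_code, attempt, max_retries, retry_http_codes):
    """Hypothetical helper: retry only listed status codes, up to max_retries."""
    return attempt < max_retries and status_code in retry_http_codes


# With max_retries = 0 (the value shown above) nothing is ever retried.
print(backoff_delays(0, 0.1))
# With three retries, the delay doubles after each failed attempt.
print(backoff_delays(3, 0.1))
```

Under these assumptions, a 404 response would never be retried with the codes listed above, because it is not in `retry_http_codes`.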

### Get single entity

Get a single Work, Author, Source, Institution, Concept, Publisher or Funder from OpenAlex by the
@@ -305,6 +317,32 @@ for page in pager:
```


### Autocomplete

OpenAlex reference: [Autocomplete entities](https://docs.openalex.org/how-to-use-the-api/get-lists-of-entities/autocomplete-entities).

Autocomplete a string:
```python
from pyalex import autocomplete

autocomplete("stockholm resilience centre")
```

Autocomplete a string to get a specific type of entity:
```python
from pyalex import Institutions

Institutions().autocomplete("stockholm resilience centre")
```

You can also combine filters with autocomplete:
```python
from pyalex import Works

r = Works().filter(publication_year=2023).autocomplete("planetary boundaries")
```


### Get N-grams

OpenAlex reference: [Get N-grams](https://docs.openalex.org/api-entities/works/get-n-grams).
2 changes: 2 additions & 0 deletions pyalex/__init__.py
@@ -23,6 +23,7 @@
from pyalex.api import Venues
from pyalex.api import Work
from pyalex.api import Works
from pyalex.api import autocomplete
from pyalex.api import config
from pyalex.api import invert_abstract

@@ -45,6 +46,7 @@
    "Concept",
    "People",
    "Journals",
    "autocomplete",
    "config",
    "invert_abstract",
]
30 changes: 28 additions & 2 deletions pyalex/api.py
@@ -198,7 +198,11 @@ def _get_multi_items(self, record_list):
         return self.filter(openalex_id="|".join(record_list)).get()

     def _full_collection_name(self):
-        return config.openalex_url + "/" + self.__class__.__name__.lower()
+        if self.params is not None and 'q' in self.params.keys():
+            base_url = config.openalex_url + "/autocomplete/"
+            return base_url + self.__class__.__name__.lower()
+        else:
+            return config.openalex_url + "/" + self.__class__.__name__.lower()

def __getattr__(self, key):
if key == "groupby":
@@ -286,7 +290,6 @@ def get(self, return_meta=False, page=None, per_page=None, cursor=None):
        self._add_params("per-page", per_page)
        self._add_params("page", page)
        self._add_params("cursor", cursor)

        return self._get_from_url(self.url, return_meta=return_meta)

    def paginate(self, method="cursor", page=1, per_page=None, cursor="*", n_max=10000):
@@ -343,6 +346,11 @@ def select(self, s):
        self._add_params("select", s)
        return self

    def autocomplete(self, s, **kwargs):
        """ autocomplete the string s, for a specific type of entity """
        self._add_params("q", s)
        return self.get(**kwargs)
Comment on lines +349 to +352
Owner


My suggestion would be to implement it like this:

Suggested change
-    def autocomplete(self, s, **kwargs):
-        """ autocomplete the string s, for a specific type of entity """
-        self._add_params("q", s)
-        return self.get(**kwargs)
+    def autocomplete(self, q, return_meta=False):
+        """Autocomplete query q for entity"""
+        self._add_params("q", q)
+        # manipulate URL for autocomplete
+        url_split = urlsplit(self.url)
+        new_url = urlunsplit(
+            (
+                url_split.scheme,
+                url_split.netloc,
+                f"/autocomplete{url_split.path}",
+                url_split.query,
+                url_split.fragment,
+            )
+        )
+        return self._get_from_url(
+            new_url, return_meta=return_meta, resource_class=Autocomplete
+        )

It's not the nicest option, but it keeps the autocomplete functionality separate from everything else, so we don't need the changes above. Since I don't think OpenAlex chose the most intuitive implementation here, we might benefit from keeping it separate from the rest of the class.

I changed _get_from_url to accept an extra argument.
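
The URL manipulation in the suggestion can be tried in isolation with the standard library. The sketch below is only an illustration: `to_autocomplete_url` is a hypothetical helper name, and the example URL assumes the API is served from `https://api.openalex.org`.

```python
from urllib.parse import urlsplit, urlunsplit


def to_autocomplete_url(url):
    """Hypothetical helper: prefix the URL path with /autocomplete.

    Scheme, host, query string and fragment are left untouched.
    """
    parts = urlsplit(url)
    return urlunsplit(
        (
            parts.scheme,
            parts.netloc,
            f"/autocomplete{parts.path}",
            parts.query,
            parts.fragment,
        )
    )


# Assumed example URL for a filtered Works query with a q parameter.
url = "https://api.openalex.org/works?filter=publication_year:2023&q=planetary+boundaries"
print(to_autocomplete_url(url))
# https://api.openalex.org/autocomplete/works?filter=publication_year:2023&q=planetary+boundaries
```

Because only the path component changes, any filters already present in the query string carry over to the autocomplete request unchanged.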



# The API

@@ -421,6 +429,20 @@ class Funders(BaseOpenAlex):
    resource_class = Funder


class Autocomplete(OpenAlexEntity):
    pass


class autocompletes(BaseOpenAlex):
    """ Class to autocomplete without being based on the type of entity """
    resource_class = Autocomplete

    def __getitem__(self, key):
        return self._get_from_url(
            config.openalex_url + "/autocomplete" + "?q=" + key, return_meta=False
        )


def Venue(*args, **kwargs):  # deprecated
    # warn about deprecation
    warnings.warn(
@@ -443,6 +465,10 @@ def Venues(*args, **kwargs): # deprecated
    return Sources(*args, **kwargs)


def autocomplete(s):
    """ autocomplete with any type of entity """
    return autocompletes()[s]

# aliases
People = Authors
Journals = Sources
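
`autocompletes.__getitem__` above builds its URL by concatenating the raw query string, so a query containing characters such as `&` or `#` would change the meaning of the URL. A minimal sketch of percent-encoding the query first, using the standard library's `urlencode`, could look like this; `build_autocomplete_url` and the default base URL are assumptions for illustration and not part of this PR.

```python
from urllib.parse import urlencode

# Assumed base URL; in PyAlex this would come from config.openalex_url.
OPENALEX_URL = "https://api.openalex.org"


def build_autocomplete_url(query, base_url=OPENALEX_URL):
    """Hypothetical helper: build an entity-agnostic autocomplete URL.

    urlencode percent-encodes reserved characters and turns spaces into '+'.
    """
    return f"{base_url}/autocomplete?{urlencode({'q': query})}"


print(build_autocomplete_url("stockholm resilience"))
# https://api.openalex.org/autocomplete?q=stockholm+resilience
```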
13 changes: 13 additions & 0 deletions tests/test_pyalex.py
@@ -14,6 +14,7 @@
from pyalex import Sources
from pyalex import Work
from pyalex import Works
from pyalex import autocomplete
from pyalex.api import QueryError


@@ -301,3 +302,15 @@ def test_auth():
    pyalex.config.api_key = None

    assert len(w_no_auth) == len(w_auth)


def test_autocomplete_works():
    w = Works().filter(publication_year=2023).autocomplete("planetary boundaries")

    assert len(w) > 5


def test_autocomplete():
    a = autocomplete("stockholm resilience")

    assert len(a) > 5