Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce @field(disabled=True) #120

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

BurnzZ
Copy link
Member

@BurnzZ BurnzZ commented Jan 30, 2023

Similar to #63 but this requires the field to be controlled with @field(disabled=True). It also requires that the disabled field be available in the item class.

class ArticlePage(WebPage[Article]):

    @field
    def url(self):
        ...

    # other standard fields
    # ...

    @field(disabled=True)
    def articleBodyRaw(self):
        ...

page = ArticlePage(response)
item = await page.to_item()
# ArticlePage(url="https://example.com", ..., articleBodyRaw=None)

This enables some fields to not be included when .to_item() is called. Related to #118 (the API would change with respect to this), the fields set with @field(disabled=True) can be selected later on.

I also think that this PR and #118 should be released together. Otherwise, disabled fields can't really be utilized at all.

To be decided:

  • should we go with the disabled terminology? or perhaps something else.

TODO:

@BurnzZ BurnzZ marked this pull request as ready for review January 30, 2023 08:50
@BurnzZ BurnzZ changed the title introduce @field(disable=True) introduce @field(disabled=True) Jan 30, 2023
@@ -16,6 +16,10 @@
_FIELDS_INFO_ATTRIBUTE_WRITE = "_web_poet_fields_info_temp"


def _fields_template():
return {"enabled": {}, "disabled": {}}
Copy link
Member Author

@BurnzZ BurnzZ Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative approach is to declare something like:

_FIELDS_INFO_ATTRIBUTE_READ = "_web_poet_fields_info"
_FIELDS_INFO_ATTRIBUTE_WRITE = "_web_poet_fields_info_temp"
_FIELDS_INFO_ATTRIBUTE_READ_DISABLED = "_web_poet_fields_info_disabled"
_FIELDS_INFO_ATTRIBUTE_WRITE_DISABLED = "_web_poet_fields_info_temp_disabled"

But I think it could be unwieldly when more field-flags are introduced later on.

We could also simply use the same arrangement as before but then get_fields_dict() would need to loop over all FieldInfo instances and check which are disabled or not.

def y(self) -> int:
return 2

@field(disabled=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe enabled would be a better name.

I would prefer something that suggests how the field can still be used as a property but is not included in the to_item output, but I cannot think of a good name (export? itemize? itemizable? to_item? add_to_item? include_in_item?).

Copy link
Member Author

@BurnzZ BurnzZ Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial implementation actually used enabled, but it seems get_fields_dict(include_disabled=True) sounds much better.

Hmmm, to_item and include_in_item does sound better in terms of what it does. Currently, either enabled or disabled isn't that clear about what it does.

+1 on something like @field(to_item=False).

Alongside with this, perhaps renaming to get_fields_dict(all_fields=True) would work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially thinking about something like "exclude_by_default=True" or "excluded_by_default=True", or "extract_by_default=False"; alternatives like "to_item=False" also look fine.

A possible issue with to_item=False is that you may think such field doesn't have to be defined in the item, but it does; I think it's still an error if item doesn't have it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially thinking about something like "exclude_by_default=True" or "excluded_by_default=True", or "extract_by_default=False"

I think these have the same issue with include and exclude where it's not clear what's happening, i.e., where it's being excluded from.

A possible issue with to_item=False is that you may think such field doesn't have to be defined in the item, but it does; I think it's still an error if item doesn't have it.

You're right. What do you think about having something like omit_in_to_item=True or to_item_omit=True?

@codecov
Copy link

codecov bot commented Jan 30, 2023

Codecov Report

Merging #120 (e76d42e) into master (fad17a4) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #120      +/-   ##
==========================================
+ Coverage   98.81%   98.83%   +0.01%     
==========================================
  Files          28       28              
  Lines        1183     1199      +16     
==========================================
+ Hits         1169     1185      +16     
  Misses         14       14              
Impacted Files Coverage Δ
web_poet/fields.py 98.93% <100.00%> (+0.21%) ⬆️

def get_fields_dict(cls_or_instance) -> Dict[str, FieldInfo]:
def get_fields_dict(
cls_or_instance, include_disabled: bool = False
) -> Dict[str, FieldInfo]:
"""Return a dictionary with information about the fields defined
for the class: keys are field names, and values are
:class:`web_poet.fields.FieldInfo` instances.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: update this docstring when disabled has been renamed to a better one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants