ConfluenceLoader page_ids and label parameters not working #28179
Example Code
from langchain_community.document_loaders import ConfluenceLoader

loader = ConfluenceLoader(
    url="https://learnitall.atlassian.net/wiki", username="me", api_key=c,
    space_key="space", include_attachments=False, limit=50,
    label="dbr"
)
# OR
loader = ConfluenceLoader(
    url="https://learnitall.atlassian.net/wiki", username="me", api_key=c,
    space_key="space", include_attachments=False, limit=50,
    page_ids=["page"]
)
documents = loader.load()
documents

Description
I am trying to load only the pages with a specific label, e.g. "dbr". Instead, all the other documents in the space are loaded as well, and the pages that carry the label are duplicated. I tried page_ids and saw the same behaviour.

System Info
System Information
Package Information
Replies: 1 comment 2 replies
@avfranco-br maybe it's a version issue? I didn't find any problems with the code you provided, but perhaps the pages don't have the correct label? You could try implementing a CustomConfluenceLoader that overrides the _lazy_load method with your own filter:
from typing import Any, Iterator
from langchain_core.documents import Document

class CustomConfluenceLoader(ConfluenceLoader):
    def _lazy_load(self, **kwargs: Any) -> Iterator[Document]:
        # your logic here; as written, this passes every document through unchanged
        yield from super()._lazy_load(**kwargs)
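One possible filter for the overridden method is to drop pages that have already been yielded. The sketch below is library-free so it can be run on its own: Doc is a hypothetical stand-in for LangChain's Document, dedupe_by_id is a made-up helper name, and it assumes each document carries its page identifier under metadata["id"], which is worth checking against what your loader actually returns.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Iterator, Set


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def dedupe_by_id(docs: Iterable[Doc], key: str = "id") -> Iterator[Doc]:
    """Yield each document once, keyed on metadata[key] (assumed to be present)."""
    seen: Set[Any] = set()
    for doc in docs:
        doc_id = doc.metadata.get(key)
        if doc_id is not None and doc_id in seen:
            continue  # this page was already yielded, so skip the repeat
        seen.add(doc_id)
        yield doc  # documents without an id are always passed through
```

Inside the custom loader, the same idea could be applied as `yield from dedupe_by_id(super()._lazy_load(**kwargs))`, since the helper only relies on a `metadata` mapping.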
Hi @feijoes, removing space_key from the initialisation fixed the issue. Now only the entries that match page_ids or label are returned. Thanks again for your reply.
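The fix suggests that the loader treats space_key, label, and page_ids as independent sources whose results are concatenated, rather than as filters that narrow one another. The toy model below only illustrates that reading; it is not the library's code, and PAGES and load_pages are hypothetical names invented for the example.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

# Hypothetical wiki content: page id -> (space key, labels on the page)
PAGES: Dict[str, Tuple[str, Set[str]]] = {
    "1": ("space", {"dbr"}),
    "2": ("space", set()),
    "3": ("space", {"dbr"}),
}


def load_pages(
    space_key: Optional[str] = None,
    label: Optional[str] = None,
    page_ids: Optional[Sequence[str]] = None,
) -> List[str]:
    """Toy model: every selector adds its own matches; nothing is intersected."""
    result: List[str] = []
    if space_key:
        result += [pid for pid, (space, _) in PAGES.items() if space == space_key]
    if label:
        result += [pid for pid, (_, labels) in PAGES.items() if label in labels]
    if page_ids:
        result += [pid for pid in page_ids if pid in PAGES]
    return result


# With space_key set, the whole space is returned and labelled pages appear twice.
with_space = load_pages(space_key="space", label="dbr")
# Without space_key, only the labelled pages come back.
label_only = load_pages(label="dbr")
```

Under this model, combining space_key with label reproduces both symptoms from the question (unrelated pages plus duplicates), while passing label or page_ids alone returns only the matching pages.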