Add Source Link to the sources #2036

Open
wants to merge 3 commits into
base: main
19 changes: 13 additions & 6 deletions private_gpt/ui/ui.py
Collaborator

In general, this will take more work than you might expect: storing files in a local folder may not be trivial. In any case, please update the PR, ping me, and we'll check it out :)

@@ -55,9 +55,11 @@ class Source(BaseModel):
     file: str
     page: str
     text: str
+    pdf_prefix: str = "C:/UsedFilesFolder/"
Collaborator

You cannot assume that this folder exists. You will need to store the files in a folder as an additional step on top of the current process.
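
One way to address this, sketched under assumptions: copy each ingested file into a folder the application controls, and build links from that folder instead of a hardcoded path. The helper name `store_source_file` and the `sources_dir` parameter are hypothetical; neither is part of this PR or of the project.

```python
import shutil
from pathlib import Path


def store_source_file(src: Path, sources_dir: Path) -> Path:
    """Copy an ingested file into sources_dir and return the stored path.

    Hypothetical helper: the name and the folder layout are assumptions.
    """
    # Create the target folder on first use.
    sources_dir.mkdir(parents=True, exist_ok=True)
    dest = sources_dir / src.name
    # Copies content and metadata; a file with the same name is overwritten.
    shutil.copy2(src, dest)
    return dest.resolve()
```

The returned absolute path could then be used to build the link, so nothing depends on where the user originally kept the file.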


     class Config:
         frozen = True
+    USE_HYPERLINKS_FOR_SOURCES: bool = True

     @staticmethod
     def curate_sources(sources: list[Chunk]) -> list["Source"]:
@@ -76,7 +78,11 @@ def curate_sources(sources: list[Chunk]) -> list["Source"]:
         ) # Unique sources only

         return curated_sources


+    def to_hyperlink(self) -> str:
+        encoded_file = self.file.replace(" ", "%20")
+        file_path = f"{self.pdf_prefix}{encoded_file}#page={self.page}"
Collaborator

The #page fragment will only work when the file is a paginated document that can be indexed by page (e.g. a PDF).
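
A minimal sketch of how the link could be built with that in mind: append the page fragment only for PDFs, and let `pathlib` handle the percent-encoding instead of replacing spaces by hand. The function name `source_link` is hypothetical, and treating only the `.pdf` extension as paginated is a simplifying assumption.

```python
import html
from pathlib import Path


def source_link(file_path: Path, display_name: str, page: str) -> str:
    """Return an HTML anchor for a source file.

    Hypothetical helper; the ".pdf" check is a simplifying assumption.
    file_path must be absolute, otherwise as_uri() raises ValueError.
    """
    # as_uri() percent-encodes spaces and other unsafe characters.
    uri = file_path.as_uri()
    if file_path.suffix.lower() == ".pdf":
        # Only add the page fragment where a viewer can act on it.
        uri += f"#page={page}"
    label = html.escape(f"{display_name} (page {page})")
    return f'<a href="{html.escape(uri)}" target="_blank">{label}</a>'
```

Escaping the label and the URI also avoids broken markup when a file name contains characters such as `<` or `&`.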

+        return f'<a href="file:///{file_path}" target="_blank">{self.file} (page {self.page})</a>'

 @singleton
 class PrivateGptUi:
@@ -123,10 +129,10 @@ def yield_deltas(completion_gen: CompletionGen) -> Iterable[str]:
                 used_files = set()
                 for index, source in enumerate(cur_sources, start=1):
                     if f"{source.file}-{source.page}" not in used_files:
-                        sources_text = (
-                            sources_text
-                            + f"{index}. {source.file} (page {source.page}) \n\n"
-                        )
+                        if settings().USE_HYPERLINKS_FOR_SOURCES:
Collaborator

There is already a UI settings object. The best approach is to add a new field to the UI settings and expose it as a boolean property.
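
As a sketch of that suggestion, the flag could live on the UI settings and be read where the sources list is rendered. The class below is a dependency-free stand-in for the project's real settings object, which is not shown in this diff; the field name `use_hyperlinks_for_sources` and the function `format_source` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UISettings:
    """Stand-in for the project's UI settings (assumed shape)."""

    # Hypothetical new field; off by default so existing behavior is unchanged.
    use_hyperlinks_for_sources: bool = False


def format_source(index: int, text: str, link: str, ui: UISettings) -> str:
    """Render one numbered source entry, as a link only when the flag is on."""
    body = link if ui.use_hyperlinks_for_sources else text
    return f"{index}. {body} \n\n"
```

Keeping the flag in the settings object means it can be toggled from configuration, and the `Source` model stays free of UI concerns.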

+                            sources_text += f"{index}. {source.to_hyperlink()} \n\n"
+                        else:
+                            sources_text += f"{index}. {source.to_text()} \n\n"
                         used_files.add(f"{source.file}-{source.page}")
                 sources_text += "<hr>\n\n"
                 full_response += sources_text
@@ -289,7 +295,8 @@ def _list_ingested_files(self) -> list[list[str]]:
                 "file_name", "[FILE NAME MISSING]"
             )
             files.add(file_name)
-        return [[row] for row in files]
+        sorted_files = sorted(files) # Sort the files alphabetically
+        return [[row] for row in sorted_files] # Use sorted files

     def _upload_file(self, files: list[str]) -> None:
         logger.debug("Loading count=%s files", len(files))
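
A note on the `sorted(files)` change in `_list_ingested_files`: Python's default string ordering compares code points, so names starting with an uppercase letter sort before all lowercase ones. If a case-insensitive listing is what is wanted, a key function can be passed, as in this minimal sketch. The helper name `sort_file_names` and the file names are made up for illustration.

```python
def sort_file_names(files: set[str]) -> list[list[str]]:
    """Return one-element rows sorted case-insensitively.

    Ties between names that differ only in case are broken by the exact name,
    so the result is deterministic.
    """
    ordered = sorted(files, key=lambda name: (name.casefold(), name))
    return [[name] for name in ordered]


names = {"beta.pdf", "Alpha.txt", "alpha.pdf", "Zeta.md"}
print(sorted(names))           # default ordering: uppercase names come first
print(sort_file_names(names))  # case-insensitive ordering
```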