Upgrade to React 18 and Implement React-Data-Table #243

Open
wants to merge 71 commits into
base: main
71 commits
21f0624
Refactor label selector so it doesn't depend on the AnnotationStore.
JSv4 Sep 22, 2024
b98f767
Added some db tweaks for multi-format file support. Added test to che…
JSv4 Sep 29, 2024
79a5c57
Added different pipelines for different file types.
JSv4 Sep 29, 2024
93d6b31
Adding tests.
JSv4 Sep 29, 2024
731c0e0
Added tests for mimetype-routed doc parsing pipeline.
JSv4 Sep 29, 2024
7126d2b
Refactoring to support multiple file types in annotator / viewer.
JSv4 Sep 29, 2024
e92d7c7
Wiring up TxtAnnotator.
JSv4 Sep 30, 2024
e0a176b
Got display of txt files to work. Need to clean up a few more things.
JSv4 Oct 2, 2024
4a9744f
Improved component styling.
JSv4 Oct 3, 2024
e4f1a1e
Continued improvements to the labelset edit modal to provide for a b…
JSv4 Oct 3, 2024
712f5de
Fixed issue with label selector popup.
JSv4 Oct 5, 2024
ec9b9e3
Wired up a bunch of features in TxtAnnotator... annotations displaying.
JSv4 Oct 5, 2024
69836c8
Tying TxtAnnotator to global state.
JSv4 Oct 5, 2024
42bc3d5
Avoided useContext in TxtAnnotator and instead passing function as prop.
JSv4 Oct 5, 2024
55e0e2d
Checkpoint.
JSv4 Oct 6, 2024
b76364e
Got proper display of ALL annotations and search results working in <…
JSv4 Oct 6, 2024
d0ece73
Got scroll search results into view working. Was working off wrong in…
JSv4 Oct 6, 2024
63730f5
Scrolling to txt annotation from sidebar works. Need to get it to wor…
JSv4 Oct 6, 2024
c882fe9
Jump to annotation working for BOTH span-based and token-based annota…
JSv4 Oct 7, 2024
44d72c7
Fixed zIndex issue with single doc analyzer select modal. Added a str…
JSv4 Oct 8, 2024
6927f93
Solved rendering issue - not loading proper text file.
JSv4 Oct 10, 2024
c705258
Jump to annotation working for both txt-based and pdf-based formats.
JSv4 Oct 12, 2024
bbf1e4e
Improving rendering.
JSv4 Oct 13, 2024
5c203a8
Sorted label rendering drift. Other minor oddities persist.
JSv4 Oct 13, 2024
aed5f76
Displaying approval / rejection feedback. Still some oddness with dup…
JSv4 Oct 13, 2024
c581a7d
Tidying up some things with search handling in text view mode.
JSv4 Oct 13, 2024
7fb88f8
Preparing to add more obvious connectors between labels and annotations.
JSv4 Oct 13, 2024
547c22c
Initial implementation of d3 force-directed layout for labels.
JSv4 Oct 13, 2024
4a9178a
Text Annotator is looking great! Minor cleanup and workflow issues re…
JSv4 Oct 14, 2024
d351376
Fixed rendering of text file thumbnail.
JSv4 Oct 14, 2024
13c092f
Added a dropzone to DocumentCards so you can just drag files in now.
JSv4 Oct 14, 2024
5eed9e0
Checkpoint for adding file dropzones.
JSv4 Oct 14, 2024
74bb10f
File appears now on upload.
JSv4 Oct 14, 2024
7adf4eb
Took out remainder of old PAWLS preprocessor code.
JSv4 Oct 14, 2024
7bb640e
Remove unused test.
JSv4 Oct 14, 2024
01a2d68
Removed more unused code. Ran linter.
JSv4 Oct 14, 2024
03194e5
Bump version tag
JSv4 Oct 14, 2024
9703686
Synced search values for <ActionBar/> and <SidebarSearchWidget/>
JSv4 Oct 14, 2024
ce80316
Fixed one test.
JSv4 Oct 16, 2024
c6d9335
Updated another test.
JSv4 Oct 16, 2024
ea3ee0e
Restored test_document_uploads.py
JSv4 Oct 16, 2024
d3048a8
Resolved plaintext checker issues.
JSv4 Oct 16, 2024
f60a6d2
Resolved errors in test_text_thumbnails.
JSv4 Oct 16, 2024
767de22
Ran linter.
JSv4 Oct 16, 2024
36161fd
Fixed query test fixtures
JSv4 Oct 17, 2024
cb289bb
Ran linter.
JSv4 Oct 17, 2024
40bd8ac
Removed unused test
JSv4 Oct 18, 2024
9c430a1
Restored upload test for xlsx and pptx.
JSv4 Oct 18, 2024
0a56f0a
Improve decorator coverage.
JSv4 Oct 18, 2024
49a03e5
Test fallback error handling in analysis decorator.
JSv4 Oct 18, 2024
90ecb97
Add test for txt ingestion pipeline and structural parser. Fixed bug …
JSv4 Oct 18, 2024
bcef4d8
Ran linter.
JSv4 Oct 19, 2024
765c657
Added a test to check that txt thumbnail extractor works.
JSv4 Oct 19, 2024
3759311
Covered the pdf thumbnail function to ensure returned image is square.
JSv4 Oct 19, 2024
2c0441a
Updated thumbnail regex test to be more verbose... need to see in Git…
JSv4 Oct 23, 2024
d57655b
Loosened up regex test to still effectively test the file name withou…
JSv4 Oct 23, 2024
c73291b
Lots of small changes to upgrade various packages to react>=18 compat…
JSv4 Oct 23, 2024
1578248
Improved data grid.
JSv4 Oct 23, 2024
2925be8
Data display in grid now working.
JSv4 Oct 24, 2024
f547855
Empty working again.
JSv4 Oct 26, 2024
3139e5a
Got cell loading wheel working!
JSv4 Oct 26, 2024
41f04a9
Restored approve / reject functionality in local state.
JSv4 Oct 28, 2024
07ddde5
Improved a lot of styling in datagrid. Add selection column. Add edi…
JSv4 Oct 31, 2024
29edd77
AWESOME custom schema editor.
JSv4 Oct 31, 2024
15a281e
Checkpoint.
JSv4 Nov 2, 2024
7c612f8
Added select email.
JSv4 Nov 2, 2024
c342147
Added edit and delete buttons for cols.
JSv4 Nov 2, 2024
0d15945
Cleaned up some parsing issues, added some utility calculated fields …
JSv4 Nov 3, 2024
f416c27
Have JSON extraction and editing working nicely.
JSv4 Nov 3, 2024
ebeeb74
Fixed feedback button and signals in extract grid.
JSv4 Nov 3, 2024
0bccb8a
Set default corrected_data property on DataCell to None.
JSv4 Nov 3, 2024
Binary file modified .ipython/profile_default/history.sqlite
Binary file not shown.
14 changes: 14 additions & 0 deletions config/graphql/graphene_types.py
@@ -120,6 +120,7 @@ class LabelTypeEnum(graphene.Enum):
DOC_TYPE_LABEL = "DOC_TYPE_LABEL"
TOKEN_LABEL = "TOKEN_LABEL"
METADATA_LABEL = "METADATA_LABEL"
SPAN_LABEL = "SPAN_LABEL"


class AnnotationSummaryType(graphene.ObjectType):
@@ -412,7 +413,11 @@ def resolve_full_annotation_list(self, info, document_id=None):
results = self.annotations.all()
if document_id is not None:
document_pk = from_global_id(document_id)[1]
logger.info(
f"Resolve full annotations for analysis {self.id} with doc {document_pk}"
)
results = results.filter(document_id=document_pk)

return results

class Meta:
@@ -429,13 +434,22 @@ class Meta:


class FieldsetType(AnnotatePermissionsForReadMixin, DjangoObjectType):
in_use = graphene.Boolean(
description="True if the fieldset is used in any extract that has started."
)
full_column_list = graphene.List(ColumnType)

class Meta:
model = Fieldset
interfaces = [relay.Node]
connection_class = CountableConnection

def resolve_in_use(self, info) -> bool:
"""
Returns True if the fieldset is used in any extract that has started.
"""
return self.extracts.filter(started__isnull=False).exists()

def resolve_full_column_list(self, info):
return self.columns.all()
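
Review note (not part of the diff): the `resolve_in_use` resolver above boils down to "does any related extract have a non-null `started` timestamp". A minimal plain-Python sketch of that rule follows. `ExtractStub` and `FieldsetStub` are hypothetical stand-ins for the Django models, used only so the snippet runs without a database.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ExtractStub:
    """Hypothetical stand-in for the Extract model; only `started` matters here."""
    started: Optional[datetime] = None


@dataclass
class FieldsetStub:
    """Hypothetical stand-in for the Fieldset model with its related extracts."""
    extracts: List[ExtractStub] = field(default_factory=list)

    def in_use(self) -> bool:
        # Mirrors extracts.filter(started__isnull=False).exists():
        # true as soon as one extract has a start timestamp.
        return any(extract.started is not None for extract in self.extracts)


idle = FieldsetStub(extracts=[ExtractStub(), ExtractStub()])
busy = FieldsetStub(extracts=[ExtractStub(), ExtractStub(started=datetime(2024, 11, 3))])
```

Under these assumptions `idle.in_use()` is false and `busy.in_use()` is true; a fieldset with no extracts at all is also reported as not in use.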

200 changes: 169 additions & 31 deletions config/graphql/mutations.py
@@ -74,9 +74,10 @@
make_corpus_public_task,
)
from opencontractserver.types.dicts import OpenContractsAnnotatedDocumentImportType
from opencontractserver.types.enums import ExportType, PermissionTypes
from opencontractserver.types.enums import ExportType, LabelType, PermissionTypes
from opencontractserver.users.models import UserExport
from opencontractserver.utils.etl import is_dict_instance_of_typed_dict
from opencontractserver.utils.files import is_plaintext_content
from opencontractserver.utils.permissioning import (
set_permissions_for_obj_to_user,
user_has_permission_for_obj,
@@ -814,6 +815,10 @@ class Arguments:
description="If provided, successfully uploaded document will "
"be uploaded to corpus with specified id",
)
add_to_extract_id = graphene.ID(
required=False,
description="If provided, successfully uploaded document will be added to extract with specified id",
)
make_public = graphene.Boolean(
required=True,
description="If True, document is immediately public. "
@@ -835,7 +840,14 @@ def mutate(
custom_meta,
make_public,
add_to_corpus_id=None,
add_to_extract_id=None,
):
if add_to_corpus_id is not None and add_to_extract_id is not None:
return UploadDocument(
message="Cannot simultaneously add document to both corpus and extract",
ok=False,
document=None,
)

ok = False
document = None
@@ -860,36 +872,75 @@ def mutate(
# Check file type
kind = filetype.guess(file_bytes)
if kind is None:
return UploadDocument(
message="Unable to determine file type", ok=False, document=None
)

if kind.mime not in settings.ALLOWED_DOCUMENT_MIMETYPES:
if is_plaintext_content(file_bytes):
kind = "application/txt"
else:
return UploadDocument(
message="Unable to determine file type", ok=False, document=None
)
else:
kind = kind.mime

if kind not in settings.ALLOWED_DOCUMENT_MIMETYPES:
return UploadDocument(
message=f"Unallowed filetype: {kind.mime}", ok=False, document=None
message=f"Unallowed filetype: {kind}", ok=False, document=None
)

user = info.context.user
pdf_file = ContentFile(file_bytes, name=filename)
document = Document(
creator=user,
title=title,
description=description,
custom_meta=custom_meta,
pdf_file=pdf_file,
backend_lock=True,
is_public=make_public,
)
document.save()

if kind in [
"application/pdf",
"application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"application/vnd.openxmlformats-officedocument.presentationml.presentation",
"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
]:
pdf_file = ContentFile(file_bytes, name=filename)
document = Document(
creator=user,
title=title,
description=description,
custom_meta=custom_meta,
pdf_file=pdf_file,
backend_lock=True,
is_public=make_public,
file_type=kind, # Store filetype
)
document.save()
elif kind in ["application/txt"]:
txt_extract_file = ContentFile(file_bytes, name=filename)
document = Document(
creator=user,
title=title,
description=description,
custom_meta=custom_meta,
txt_extract_file=txt_extract_file,
backend_lock=True,
is_public=make_public,
file_type=kind,
)
document.save()

set_permissions_for_obj_to_user(user, document, [PermissionTypes.CRUD])

# If add_to_corpus_id is not None, link uploaded document to corpus
# Handle linking to corpus or extract
if add_to_corpus_id is not None:
try:
corpus = Corpus.objects.get(id=from_global_id(add_to_corpus_id)[1])
transaction.on_commit(lambda: corpus.documents.add(document))
except Exception as e:
message = f"Adding to corpus failed due to error: {e}"
elif add_to_extract_id is not None:
try:
extract = Extract.objects.get(
Q(pk=from_global_id(add_to_extract_id)[1])
& (Q(creator=user) | Q(is_public=True))
)
if extract.finished is not None:
raise ValueError("Cannot add document to a finished extract")
transaction.on_commit(lambda: extract.documents.add(document))
except Exception as e:
message = f"Adding to extract failed due to error: {e}"

ok = True
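
Review note (not part of the diff): the upload hunk above resolves a MIME string in three steps: binary sniffing first, then a plaintext fallback, then the allow-list check. The sketch below shows that control flow using only the standard library. `sniff_mime` is a hypothetical stand-in for `filetype.guess`, and `looks_like_plaintext` is a hypothetical stand-in for `is_plaintext_content`, whose real implementation is not shown in this diff; the heuristic used here is an assumption.

```python
from typing import Optional

# Trimmed copy of the allow-list, for illustration only.
ALLOWED_MIMETYPES = ["application/pdf", "application/txt"]


def sniff_mime(data: bytes) -> Optional[str]:
    """Hypothetical stand-in for binary sniffing; recognizes only the PDF magic bytes."""
    return "application/pdf" if data.startswith(b"%PDF-") else None


def looks_like_plaintext(data: bytes) -> bool:
    """Assumed heuristic: non-empty, no NUL bytes, and valid UTF-8."""
    if not data or b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def resolve_kind(data: bytes) -> Optional[str]:
    """Return the MIME string to store, or None if the upload should be rejected."""
    kind = sniff_mime(data)
    if kind is None and looks_like_plaintext(data):
        # Same non-standard label the diff uses for text uploads.
        kind = "application/txt"
    if kind not in ALLOWED_MIMETYPES:
        return None
    return kind
```

Unlike the mutation, this sketch collapses "unable to determine file type" and "unallowed filetype" into a single `None` result; the real code returns a distinct message for each case.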

@@ -1063,13 +1114,24 @@ class Arguments:
required=True,
description="Id of the label that is applied via this annotation.",
)
annotation_type = graphene.Argument(
graphene.Enum.from_enum(LabelType), required=True
)

ok = graphene.Boolean()
annotation = graphene.Field(AnnotationType)

@login_required
def mutate(
root, info, json, page, raw_text, corpus_id, document_id, annotation_label_id
root,
info,
json,
page,
raw_text,
corpus_id,
document_id,
annotation_label_id,
annotation_type,
):
corpus_pk = from_global_id(corpus_id)[1]
document_pk = from_global_id(document_id)[1]
@@ -1085,6 +1147,7 @@ def mutate(
annotation_label_id=label_pk,
creator=user,
json=json,
annotation_type=annotation_type.value,
)
annotation.save()
set_permissions_for_obj_to_user(user, annotation, [PermissionTypes.CRUD])
@@ -1924,20 +1987,95 @@ def mutate(
return CreateExtract(ok=True, msg="SUCCESS!", obj=extract)


class UpdateExtractMutation(DRFMutation):
class IOSettings:
lookup_field = "id"
pk_fields = ["corpus", "fieldset", "creator"]
serializer = ExtractSerializer
model = Extract
graphene_model = ExtractType
class UpdateExtractMutation(graphene.Mutation):
"""
Mutation to update an existing Extract object.

Supports updating the name (title), corpus, fieldset, and error fields.
Ensures proper permission checks are applied.
"""
class Arguments:
id = graphene.String(required=True)
title = graphene.String(required=False)
description = graphene.String(required=False)
icon = graphene.String(required=False)
label_set = graphene.String(required=False)
id = graphene.ID(required=True, description="ID of the Extract to update.")
title = graphene.String(required=False, description="New title for the Extract.")
corpus_id = graphene.ID(required=False, description="ID of the Corpus to associate with the Extract.")
fieldset_id = graphene.ID(required=False, description="ID of the Fieldset to associate with the Extract.")
error = graphene.String(required=False, description="Error message to update on the Extract.")
# The Extract model does not have 'description', 'icon', or 'label_set' fields.
# If these fields are added to the model, they can be included here.

ok = graphene.Boolean()
message = graphene.String()
obj = graphene.Field(ExtractType)

@staticmethod
@login_required
def mutate(root, info, id, title=None, corpus_id=None, fieldset_id=None, error=None):
print(f"UpdateExtractMutation.mutate called with: id={id}, title={title}, corpus_id={corpus_id}, fieldset_id={fieldset_id}, error={error}")
user = info.context.user

try:
extract_pk = from_global_id(id)[1]
extract = Extract.objects.get(pk=extract_pk)
except Extract.DoesNotExist:
return UpdateExtractMutation(ok=False, message="Extract not found.", obj=None)

# Check if the user has permission to update the Extract object
if not user_has_permission_for_obj(
user_val=user,
instance=extract,
permission=PermissionTypes.UPDATE,
include_group_permissions=True,
):
return UpdateExtractMutation(ok=False, message="You don't have permission to update this extract.", obj=None)

# Update fields
if title is not None:
extract.name = title

if error is not None:
extract.error = error

if corpus_id is not None:
corpus_pk = from_global_id(corpus_id)[1]
try:
corpus = Corpus.objects.get(pk=corpus_pk)
# Check permission
if not user_has_permission_for_obj(
user_val=user,
instance=corpus,
permission=PermissionTypes.READ,
include_group_permissions=True,
):
return UpdateExtractMutation(ok=False, message="You don't have permission to use this corpus.", obj=None)
extract.corpus = corpus
except Corpus.DoesNotExist:
return UpdateExtractMutation(ok=False, message="Corpus not found.", obj=None)

if fieldset_id is not None:
fieldset_pk = from_global_id(fieldset_id)[1]
print(f"Attempting to update extract {extract.id} with fieldset_id {fieldset_id} (pk: {fieldset_pk})")
try:
fieldset = Fieldset.objects.get(pk=fieldset_pk)
print(f"Found fieldset {fieldset.id} for update")
# Check permission
if not user_has_permission_for_obj(
user_val=user,
instance=fieldset,
permission=PermissionTypes.READ,
include_group_permissions=True,
):
print(f"User {user.id} denied permission to use fieldset {fieldset.id}")
return UpdateExtractMutation(ok=False, message="You don't have permission to use this fieldset.", obj=None)
print(f"Updating extract {extract.id} fieldset to {fieldset.id}")
extract.fieldset = fieldset
except Fieldset.DoesNotExist:
print(f"Fieldset with pk {fieldset_pk} not found")
return UpdateExtractMutation(ok=False, message="Fieldset not found.", obj=None)

extract.save()
extract.refresh_from_db()

return UpdateExtractMutation(ok=True, message="Extract updated successfully.", obj=extract)


class AddDocumentsToExtract(DRFMutation):
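
Review note (not part of the diff): the mutations above repeatedly call `from_global_id(...)[1]` to turn a Relay global ID into a primary key. Relay-style global IDs are conventionally the base64 encoding of `TypeName:id`, so index `[1]` is the id part. The helpers below are simplified, hypothetical stand-ins written with the standard library to show the round trip; they are not the library's implementation.

```python
import base64
from typing import Tuple


def encode_global_id(type_name: str, pk) -> str:
    """Hypothetical helper: pack a type name and primary key into one opaque string."""
    return base64.b64encode(f"{type_name}:{pk}".encode("utf-8")).decode("ascii")


def decode_global_id(global_id: str) -> Tuple[str, str]:
    """Hypothetical helper: split an opaque ID back into (type name, primary key)."""
    decoded = base64.b64decode(global_id).decode("utf-8")
    # Split on the first colon only, so ids containing ':' survive intact.
    type_name, _, pk = decoded.partition(":")
    return type_name, pk


gid = encode_global_id("ExtractType", 42)
type_name, pk = decode_global_id(gid)
```

Note that the primary key comes back as a string, which is why lookups such as `Extract.objects.get(pk=extract_pk)` rely on the ORM to coerce it.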
10 changes: 7 additions & 3 deletions config/graphql/queries.py
@@ -200,18 +200,22 @@ def resolve_annotations(

# Filter by annotation_label__label_type
logger.info(
f"Queryset county before filtering by annotation_label__label_type: {queryset.count()}"
f"Queryset count before filtering by annotation_label__label_type: {queryset.count()}"
)
label_type = kwargs.get("annotation_label__label_type")
if label_type:
logger.info(f"Filtering by annotation_label__label_type: {label_type}")
queryset = queryset.filter(annotation_label__label_type=label_type)
logger.info(f"Queryset count after filtering by label type: {queryset.count()}")

logger.info(f"QFilter value for analysis_isnull: {analysis_isnull}")
logger.info(f"Q Filter value for analysis_isnull: {analysis_isnull}")
# Filter by analysis
if analysis_isnull is not None:
logger.info(f"Filtering by analysis_isnull: {queryset.count()}")
logger.info(
f"QS count before filtering by analysis is null: {queryset.count()}"
)
queryset = queryset.filter(analysis__isnull=analysis_isnull)
logger.info(f"Filtered by analysis_isnull: {queryset.count()}")

# Filter by document_id
document_id = kwargs.get("document_id")
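
Review note (not part of the diff): the log statements in this hunk interpolate `queryset.count()` inside f-strings. An f-string is evaluated before `logger.info` is called, so `count()` runs even when INFO logging is disabled, and on a Django queryset that typically means an extra COUNT query per statement. The sketch below demonstrates the effect with a hypothetical `CountingQuerySet` stand-in and shows one possible guard.

```python
import logging

logger = logging.getLogger("queries_sketch")
logger.setLevel(logging.WARNING)  # INFO messages are dropped by this logger


class CountingQuerySet:
    """Hypothetical stand-in that records how often count() is called."""

    def __init__(self, rows):
        self.rows = rows
        self.count_calls = 0

    def count(self) -> int:
        self.count_calls += 1
        return len(self.rows)


qs = CountingQuerySet([1, 2, 3])

# Unguarded: the f-string calls count() even though the message is discarded.
logger.info(f"Queryset count: {qs.count()}")
calls_after_fstring = qs.count_calls

# Guarded: count() is skipped entirely when INFO is not enabled.
if logger.isEnabledFor(logging.INFO):
    logger.info(f"Queryset count: {qs.count()}")
calls_after_guard = qs.count_calls
```

Whether the extra queries matter depends on how hot `resolve_annotations` is; the guard is one option, and lowering these messages to DEBUG is another.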
8 changes: 7 additions & 1 deletion config/settings/base.py
@@ -133,7 +133,13 @@

# UPLOAD CONTROLS
# ------------------------------------------------------------------------------
ALLOWED_DOCUMENT_MIMETYPES = ["application/pdf"]
ALLOWED_DOCUMENT_MIMETYPES = [
"application/pdf",
"application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"application/vnd.openxmlformats-officedocument.presentationml.presentation",
"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
"application/txt",
]

# AUTHENTICATION
# ------------------------------------------------------------------------------