[Bug]: TypeError: 'Token' object is not subscriptable #3473
Comments
@alanakbik Please provide guidance, this is urgent.
Hello @kdk2612, the training script looks good, but I don't see where the printouts in the stacktrace are coming from ("This is the SPAN ..."). Did you modify other parts of the code? Are you calling get_labels() somewhere near this printout? If you want only the span annotations for NER, you should call get_labels('ner') instead, as otherwise it will also iterate over the token-level annotations. For us to be able to help, we'd need a runnable script (including dataset and embeddings) that throws the error. But from the printouts, I assume the error is thrown in custom code outside the library.
Yes, I added the print statements in get_gold_labels(); other than that I have not made any changes. The error happens because some lines in my data have the format "[PAD] X", i.e. the token followed by its label. My assumption is that the label X is expected to carry a prefix such as "S-" or "B-".
So the label is only "X"? Have you tried replacing the label with "B-X"? Could you paste a part of the column corpus in plain text?
Unfortunately I can't share the data, but here is what it looks like: Token O Token O
Ok, can you try replacing "X" with "B-X"? It should work then.
Yes, I used "S-" instead of "B-" |
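For anyone hitting the same error, the fix discussed above (rewriting a bare label such as "X" to "S-X" or "B-X") can be applied to the column file before training. The following is only a rough sketch using the standard library, not part of Flair: the helper name prefix_bare_labels is hypothetical, and it assumes a whitespace-separated column format with the label in the last column and "O" as the outside tag.

```python
import re

# Prefixes used by BIO/BIOES tagging schemes.
_PREFIX = re.compile(r"^[BIES]-")


def prefix_bare_labels(lines, prefix="S-", outside="O"):
    """Yield column-format lines with bare labels rewritten as prefix + label.

    Hypothetical helper: assumes the label is the last whitespace-separated
    column and that `outside` marks tokens without an entity.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        parts = stripped.split()
        # Blank lines separate sentences; keep them (and one-column lines) as-is.
        if len(parts) < 2:
            yield stripped
            continue
        label = parts[-1]
        # Only touch labels that are neither "O" nor already prefixed.
        if label != outside and not _PREFIX.match(label):
            parts[-1] = prefix + label
        yield " ".join(parts)


sample = ["Token O", "[PAD] X", "", "Berlin S-LOC"]
fixed = list(prefix_bare_labels(sample))
print(fixed)
```

Here "[PAD] X" becomes "[PAD] S-X", while "O" labels and labels that already carry a prefix are left unchanged. Pass prefix="B-" to get the "B-X" variant suggested above.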
Describe the bug
I get an error when trying to train an NER model on a custom dataset. This was working back in Dec 2023: I trained a model using the same data and Flair version 0.13.1, but I am not sure what has changed since then.
The data contains padded sentences, e.g. "[PAD] X" appears in the dataset if the size of the sentence is greater than 512 tokens.
I printed every span that the model reads, and for some spans I get back a Token object instead. I am not sure what is going on.
To Reproduce
Screenshots
No response
Additional Context
No response
Environment
python 3.10
flair 0.13.1