
inconsistent results using fuzzywuzzy library #75

Open
kd10041 opened this issue Jun 18, 2024 · 2 comments

Comments

@kd10041

kd10041 commented Jun 18, 2024

I use this library in my workflow to match a set of string pairs, but the output varies whenever I run the script. How is this possible? Shouldn't the Levenshtein distance between two strings always be the same (given the same order of the strings)?
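To make the expectation concrete, here is a minimal sketch of the textbook dynamic-programming Levenshtein distance. It is not this library's implementation (the function name `levenshtein` and the unit edit costs are assumptions for illustration). It only shows that the distance is a pure function of its two inputs, so identical strings should give identical results on every run.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion and substitution.

    Illustrative sketch only; not the library's own implementation.
    """
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # No randomness or external state is involved, so repeated calls agree.
    results = {levenshtein("kitten", "sitting") for _ in range(5)}
    print(results)
```

If the scores still differ between runs, the inputs reaching the scorer are the first thing worth checking, since the computation itself has no source of variation.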

@maxbachmann
Contributor

Yes, the similarities should stay the same. Can you provide an example where this occurs?

@kd10041
Author

kd10041 commented Jun 19, 2024

Currently I don't have any concrete examples, but I am using this library to match about 800 pairs of strings. When I run the same script with the same data files on my PC and on my colleague's PC, the output varies!
For context: I am trying to match a smaller string (all in English) against a larger string (taken from HTML pages, so it contains garbage and languages other than English). It is given that the smaller string is contained in the larger string (you can think of the smaller string as the answer and the larger string as the text it is drawn from). I want to know how many of these ~800 pairs match.
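One way to narrow this down to a concrete example is to confirm that both machines really feed the same text to the scorer. The sketch below uses only the standard library. The names `fingerprint_pairs` and `environment_report` are hypothetical helpers, not part of this library. It rests on an assumption the thread has not confirmed: that the difference might come from how the HTML-derived text is decoded on each machine rather than from the distance computation.

```python
import hashlib
import locale
import platform
import sys


def fingerprint_pairs(pairs):
    """Return a SHA-256 hex digest over the exact content of every pair.

    `pairs` is an iterable of (short_text, long_text) tuples, already loaded
    the same way the matching script loads them. Hypothetical helper.
    """
    digest = hashlib.sha256()
    for short_text, long_text in pairs:
        for text in (short_text, long_text):
            data = text.encode("utf-8", errors="surrogatepass")
            # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()


def environment_report():
    """Collect settings that commonly differ between two machines."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # Default used by open() when no encoding= argument is passed.
        "default_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    sample = [("answer", "<p>the answer is here</p>")]  # placeholder data
    print(environment_report())
    print(fingerprint_pairs(sample))
```

If the fingerprints differ between the two PCs, the loaded strings are not identical and the varying scores are expected. If they match, printing the score for each pair on both machines and diffing the two outputs should yield the specific pair the maintainer asked for.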
