-
-
Notifications
You must be signed in to change notification settings - Fork 880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Person name + instance id is not a unique constraint #5045
Comments
Lemmy should never fetch anything from an HTTP URL in production mode, this is checked here. And here is the relevant fetch code used by Lemmy 0.19.3 which correctly calls the check. It seems like this was caused by a misconfiguration on kbin.earth:
Can you find out if there are any users from other instances with the same problem? Or ask the admin of kbin.earth if there was such a misconfiguration in the past? At the moment it looks correct. |
I talked to jwr1 on Matrix and there were some config issues a while back related to HTTP/HTTPS and certificates, and during the troubleshooting/mitigation attempts there may have been responses like that indeed. With this query I can actually find quite a few more affected users, showing a total of 210 rows: with duplicate_users as (
select lower(name) as name, domain
from person
join instance on instance.id = person.instance_id
group by lower(name), domain
having count(*) > 1
)
select person.id, actor_id, person.name, instance.domain
from person
join instance on instance.id = person.instance_id
join duplicate_users on lower(person.name) = duplicate_users.name and instance.domain = duplicate_users.domain
order by domain asc, name asc, person.id asc I can see the following types of duplicates in our DB:
|
Most of these should be fixed by the url equality check I linked above. The remaining ones all seem to be caused by instances or users being deleted and then recreated. Though its also possible that a platform would allow multiple local users with the same name as long as they have different urls. Can you point out where the assumption is made that (name, instance_id) is unique? That seems to be the real problem, but I cant think of any such case. |
The prime example would be lemmy referencing profiles of remote users as name@domain. |
True, that is a simplification to improve usability. Lemmy itself has duplicate usernames when seen from other platforms, as you can have both actors /u/test and /c/test with the same name. Activitypub only specifies that the id url is unique (section 3.1), but the The only solution I can think of would be to reference remote users by the actor_id, eg |
If this isn't possible in newer versions anyway, maybe instead of adding a constraint, we could do a migration that uses your query above to delete the duplicates (we've had to do migrations to do that before). It could potentially be dangerous adding (person name + instance_id) uniqueness, since as @Nutomic said, person name doesn't have to be unique, only the actor_id. |
A database constraint is not an ideal solution for federation because there can legitimately be multiple accounts with the same name and with their own post histories (eg instance deleted and recreated). It would then be impossible for these accounts to federate with Lemmy. However this is an uncommon case and we may decide not to support it. So then we would add the unique(name, instance_id) constraint and rename duplicate accounts. This is essentially the same issue as #5052 which other Fediverse platforms experience with Lemmy, as Lemmy can have user and community with the same name. So to be consistent we should resolve that in the same way, by prohibiting creation of users and communities with the same name. |
Requirements
Summary
Various places assume that the combination of
instance_id
(typically used as instance domain, but on DB level this would be the instance ID field) and username is a unique set, similar to the actor ID, but this is not actually enforced and can lead to duplicateperson
s in the DB.Steps to Reproduce
Unknown at this point, was likely due to incorrect configuration on the remote side generating HTTP URLs instead of HTTPS URLs for actor IDs temporarily.
Technical Details
0.19.3 does not have a unique constraint for the combination of
instance_id
+name
on the person table.As far as I can tell this is the same in 0.19.5, as the list of indexes is the same for both versions:
Comments by the user are still showing up on lemmy.world, such as https://lemmy.world/comment/12501998 and also when looking at the post, but they are not associated with the
person
that is returned when looking at https://lemmy.world/u/[email protected].See also https://lemmy.world/api/v3/user?person_id=8299094 and https://lemmy.world/api/v3/user?person_id=9532845.
Version
0.19.3
Lemmy Instance URL
lemmy.world
The text was updated successfully, but these errors were encountered: