-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate and normalize emails #107
base: main
Are you sure you want to change the base?
Validate and normalize emails #107
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Patching up the db by hand at the moment.
The CI/CD fails with:
ImportError while importing test module
'/__w/ansari-backend/ansari-backend/tests/test_main_api.py'.
37
<https://github.com/ansari-project/ansari-backend/actions/runs/13068473537/job/36495072687?pr=107#step:9:38>Hint:
make sure your test modules/packages have valid Python names.
38
<https://github.com/ansari-project/ansari-backend/actions/runs/13068473537/job/36495072687?pr=107#step:9:39>Traceback:
39
<https://github.com/ansari-project/ansari-backend/actions/runs/13068473537/job/36495072687?pr=107#step:9:40>/usr/local/lib/python3.10/site-packages/pydantic/networks.py:925:
in import_email_validator
40
<https://github.com/ansari-project/ansari-backend/actions/runs/13068473537/job/36495072687?pr=107#step:9:41>
import email_validator
41
<https://github.com/ansari-project/ansari-backend/actions/runs/13068473537/job/36495072687?pr=107#step:9:42>E
ModuleNotFoundError: No module named 'email_validator'
Do I need to update the version of python we are using?
…On Thu, Jan 30, 2025 at 11:42 PM Amr Elsayed ***@***.***> wrote:
This PR accompanies the frontend PR that was recently merged.
The frontend version disables capitalization of emails during data input
and this PR enforces lowercase normalization across the database layer.
One thing that will need to be done before pushing this update to
production is to lowercase all user emails and eliminate possible email
duplicates (due to different casing).
If there are no duplicates in the database, an update query as simple as
this should do the trick:
update public.users set email = trim(lower(email))
However, if there are duplicates, this query will fail with duplicate key
value violation as the email field has a unique constraint.
A group by query can be used to identify duplicate emails:
select trim(lower(email)) as normalized_email from public.users group by 1
having count(*) > 1
Duplicate accounts will then need to be either merged or deleted, not sure
what you have in mind.
I have also switched the data type for emails from str to EmailStr
(Pydantic) to add email validation at the API level.
Finally, there is a seperate commit that fixes some typos.
------------------------------
You can view, comment on, or merge this pull request online at:
#107
Commit Summary
- ad65ace
<ad65ace>
Validate and normalize emails
- 35f965a
<35f965a>
Fix typos
File Changes
(2 files <https://github.com/ansari-project/ansari-backend/pull/107/files>
)
- *M* src/ansari/ansari_db.py
<https://github.com/ansari-project/ansari-backend/pull/107/files#diff-8879b6438777b2b40bb6df8de6a07a69330f1faaa34ec921818d93c0592f1328>
(71)
- *M* src/ansari/app/main_api.py
<https://github.com/ansari-project/ansari-backend/pull/107/files#diff-2992d55f9ca1d6058339740b3f9569b9ebdd3ea6511471c7fbee960147add022>
(8)
Patch Links:
- https://github.com/ansari-project/ansari-backend/pull/107.patch
- https://github.com/ansari-project/ansari-backend/pull/107.diff
—
Reply to this email directly, view it on GitHub
<#107>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAUXGUAX7DBYXQNM5NSDWCL2NMSNJAVCNFSM6AAAAABWG7K3V6VHI2DSMVQWIX3LMV43ASLTON2WKOZSHAZDENRYG42TOMY>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
I see, apologies the email-validator package is missing. When I forked repo, I changed the fastapi plain package to the standard package (i.e. fastapi to fastapi[standard]) which includes the fastapi-cli (useful in dev mode), I wasn't sure why you weren't using the standard dependecies by default, so I didn't push the fastapi depedency update but I forgot that the email-validator is part of the standard dependencies. I have reset my virtual environment and added the email-validator package on its own. However I hit another issue related litellm that has just been reported related to a regression affecting python 3.13. The current workaround is to avoid using the latest litellm release which I have excluded, check the latest commit for details. |
This PR accompanies the frontend PR that was recently merged.
The frontend version disables capitalization of emails during data input and this PR enforces lowercase normalization across the database layer.
One thing that will need to be done before pushing this update to production is to lowercase all user emails and eliminate possible email duplicates (due to different casing).
If there are no duplicates in the database, an update query as simple as this should do the trick:
update public.users set email = trim(lower(email))
However, if there are duplicates, this query will fail with duplicate key value violation as the email field has a unique constraint.
A group by query can be used to identify duplicate emails:
select trim(lower(email)) as normalized_email from public.users group by 1 having count(*) > 1
Duplicate accounts will then need to be either merged or deleted, not sure what you have in mind.
I have also switched the data type for emails from str to EmailStr (Pydantic) to add email validation at the API level.
Finally, there is a seperate commit that fixes some typos.