Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

false positive migrations detection dependent on encoding #647

Open
ivanbelenky opened this issue Aug 10, 2024 · 5 comments
Open

false positive migrations detection dependent on encoding #647

ivanbelenky opened this issue Aug 10, 2024 · 5 comments

Comments

@ivanbelenky
Copy link

Disclaimer: I lack knowledge on redis internals, and on the design criteria of the object mapping library.

Issue:
The migrator when running through the model_registry performs a hash calculation for the current schema

conn = cls.db()
try:
    schema = cls.redisearch_schema()
except NotImplementedError:
    log.info("Skipping migrations for %s", name)
    continue
current_hash = hashlib.sha1(schema.encode("utf-8")).hexdigest()

to then compare it to the one it gets from the db

stored_hash = conn.get(hash_key)
schema_out_of_date = current_hash != stored_hash

a connection without decoded_responses=True (and maybe with different charset) will create false positive migrations, because of distinct types, i.e. bytes and str for stored_hash and current_hash respectively.

I guess that a default encoding schema for bytes responses would suffice, or a user warning.

@slorello89
Copy link
Member

Can you provide a reproduction of a case where a model in Redis displays a false positive change?

@ivanbelenky
Copy link
Author

ivanbelenky commented Oct 31, 2024

Sure thing @slorello89, forgot about that

Having the following model, define a database with undecoded responses

import redis
from redis_om import Field, Migrator, HashModel

class User(HashModel):
    first_name: str
    last_name: str = Field(index=True)

    class Meta:
        database = redis.from_url("redis://0.0.0.0:6379", decode_responses=False)

and then just try to migrate that model

m = Migrator()

# should migrate
assert len(m.migrations) == 0
m.detect_migrations()
assert len(m.migrations) > 0

# run, 
m.run()

# reset migrations because migrator does not 
m.migrations = []
# already migrated, should not have migrations after detection
m.detect_migrations()
assert len(m.migrations) == 0 # raises

(for true positive migration checks, you can easily verify that the above code does not raise, if you change to decode_responses=True, there will be no error. Remember of course to wipe out whatever was written into the db)

@ivanbelenky
Copy link
Author

@slorello89 any update with regards to this?

@slorello89
Copy link
Member

@ivanbelenky - it's pretty clear to me that redis om python seems to depend on the decode_resposnes flag being True. As we are trying to default it to true here

The hash digests produced by the indexes assume that the string it gets back from redis is decoded into UTF-8, and when it isn't, it doesn't compare correctly.

Do you need decode_resposnes set to False? If so, why?

@ivanbelenky
Copy link
Author

ivanbelenky commented Dec 2, 2024

Thanks @slorello89 for your answer.

In my particular case it comes to reutilizing Redis objects defined in django project-wide settings, this connections are set with decode_responses=False.

"Pretty clear" + "seems to be" ~ strikes me as an oxymoronic statement.

  • I am unsure about other parts of this project where dependance on this flag is critical. Can you provide an example and or guidance on how to find one?
  • if a strict dependence on decoded response exists, this should be enforced and/or warned appropriately. False positive migrations is happening undercover otherwise. Would you agree?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants