Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization for multi-valued LDAP attributes such as groups uniqueMember #284

Closed
wants to merge 2 commits into from

Conversation

intmgroupe
Copy link

See #255.
This pull request replaces #259 as per the discussion in its thread.

The goal is to make FORCE operations on large datasets faster by doing a few add and delete instead of one constantly huge replace when it is beneficial to do so.
The testing on this is nowhere near substantial, nor are any test cases updated to reflect the change in behaviour.

These changes were developed by INTM on behalf of EDF, who noticed the slowdown in real use, and the fix is succesfully working for them.

Please let us know if something is wrong or doesn't fit the guidelines adopted by the LSC project, so we can make any required modification before this can be merged.

Use smaller ADD and DELETE modifications instead of a huge REPLACE, at worst equivalent
@coudot coudot added this to the 2.2 milestone May 14, 2024
@davidcoutadeur
Copy link
Contributor

Giving a look to this PR.

Copy link
Contributor

@davidcoutadeur davidcoutadeur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this PR, it is quite elegant in my opinion.

I have only 2 general comments:

  • it would be really nice to have a corresponding test. Maybe we could grab the LSC log showing that we are in the condition: (missingValues.size() + extraValues.size()) >= toSetAttrValues.size()?
  • as it is a behavior change for everyone, I think we must document this (in main documentation + in release notes)

// check if there are any extra values to be removed
Set<Object> extraValues = SetUtils.findMissingNeedles(toSetAttrValues, dstAttrValues);

if((missingValues.size() + extraValues.size()) >= toSetAttrValues.size()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure at which point one replace is more efficient than 1 add + 1 delete.

Maybe it can be a configuration parameter? The condition would be:

(missingValues.size() + extraValues.size()) >= max_value_modifications

It would make the LSC modifications more predictable.

However I understand that depending on the entry size has also its benefits. For example, with the previous condition, if we had an entry with 1000 values, and max_value_modifications = 100, then 50 adds and 51 adds would lead to a big replace, which may be less efficient than adding 50 values + removing 51 values.

Without further efficiency analysis, I think we can keep the initial proposition in this PR.

@davidcoutadeur
Copy link
Contributor

I like this PR, it is quite elegant in my opinion.

I have only 2 general comments:

* it would be really nice to have a corresponding test. Maybe we could grab the LSC log showing that we are in the condition: `(missingValues.size() + extraValues.size()) >= toSetAttrValues.size()`?

* as it is a behavior change for everyone, I think we must document this (in main documentation + in release notes)

I have just written the corresponding unit test demonstrating the correct selection of 1 big replace / 1 add + 1 delete.

See this PR: #285

@davidcoutadeur
Copy link
Contributor

I like this PR, it is quite elegant in my opinion.

I have only 2 general comments:

* it would be really nice to have a corresponding test. Maybe we could grab the LSC log showing that we are in the condition: `(missingValues.size() + extraValues.size()) >= toSetAttrValues.size()`?

* as it is a behavior change for everyone, I think we must document this (in main documentation + in release notes)

I have also written the documentation and upgrade notes in corresponding project, see: lsc-project/documentation#5

@davidcoutadeur
Copy link
Contributor

For me, everything is ready for merging. @rouazana and @soisik, I don't know if you have any more remarks?

@davidcoutadeur
Copy link
Contributor

This PR is replaced by #285

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimization for multi-valued LDAP attributes such as groups uniqueMember
4 participants