Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type check first parameter for Json::stringToObject() #63

Draft
wants to merge 4 commits into
base: 2.0-dev
Choose a base branch
from

Conversation

nibra
Copy link
Contributor

@nibra nibra commented Jun 19, 2022

Pull Request for Issue joomla/joomla-cms#37783

Replaces PR #61

Summary of Changes

If the first parameter in a call to Json::stringToObject() is not a string, an E_USER_DEPRECATED error is triggered. and the value cast to string.
The next major version should typehint the parameter and remove the check.

Testing Instructions

Call the method with null as the first parameter. A log entry should be written to the deprecation log, if enabled. The return value should be an empty stdClass object.

Documentation Changes Required

No. The change actually just enforces the documented behaviour.

@mbabker
Copy link
Contributor

mbabker commented Jul 6, 2022

This entire pull request is unnecessary and implicitly changes the API contract.

You are basically saying that because of the lack of scalar typehints, the API contract is now stringToObject(mixed $data, array $options): mixed (contradictory to the documented API) because the type information does not exist in a runtime interface. It is up to the developer to ensure they are passing through correctly typed data, be it through their own automated testing or static analysis tools, therefore an explicit runtime deprecation notice for a type that was never explicitly supported to begin with is unwarranted.

You are unilaterally deciding that an untyped parameter is now explicitly treated as mixed. Do you intend to introduce runtime type checks and deprecation notices for every parameter without a runtime typehint?

@nibra
Copy link
Contributor Author

nibra commented Jul 7, 2022

You are basically saying that because of the lack of scalar typehints, the API contract is now stringToObject(mixed $data, array $options): mixed (contradictory to the documented API)

The API contract is factually that way, because it works that way (in some versions of PHP). People do whatever PHP accepts, not what's written in the documentation. Even the CMS passes null as $data and people complain, when running on PHP 8, where PHP functions used inside the method emit warnings or exceptions if not feeded with a string.

One could decide to repair the data by silently cast them to string (as PHP did before version 8), but that's not a suitable strategy for a framework. The way to go is heading towards typehinting everything, but we must give people time to repair their usage of the framework code. Hence the deprecation warning - trying to find a balance between strictness and pragmatism.

You are unilaterally deciding that an untyped parameter is now explicitly treated as mixed.

That's not me, it's PHP. Without this patch, users get a
Deprecated: trim(): Passing null to parameter #1 ($string) of type string is deprecated in [...]/registry/src/Format/Json.php on line 54
message, lacking any hint on how to fix that without analysing the framework code. Emitting our own deprecation warning with a clear message is way better.

Do you intend to introduce runtime type checks and deprecation notices for every parameter without a runtime typehint?

We might need to, because with version 3 we want to get as close as possible to complete typehinting of parameters and return values.

@mbabker
Copy link
Contributor

mbabker commented Jul 7, 2022

The API contract is factually that way, because it works that way (in some versions of PHP). People do whatever PHP accepts, not what's written in the documentation.

Then there's no point in documenting the API and you might as well nuke all of the @param and @return values in doc blocks. PHP 5.6 and earlier had absolutely no way to typehint scalar values at runtime, and even with the improvements to the type system throughout PHP 7 and 8, there are still signatures that cannot be implemented as runtime behaviors. When typehints are not implemented in method signatures, the documentation is the API contract. You quite enjoy Laravel, a framework that is liberally duck typed and the stance of the maintainers is that they'll let the engine spit out warnings and errors when people pass the wrong types of data through.

PHP's type coercion still exists. But, there are some major differences in engine level C code and userland PHP code, namely how null is handled. Pretty much everything in the engine supports coercion from null (passing null to a string typehinted argument will be coerced to an empty string), however this same behavior does not exist in userland (passing null to a string typehinted argument emits a TypeError); https://3v4l.org/ZDvTd as a demonstration. PHP 8.1 specifically deprecates passing null to engine functions not declared nullable with intent to remove this coercion in PHP 9, bringing feature parity between the engine and userland.

Nobody else in the PHP world considers an untyped API explicitly supporting a mixed value when a typehint is omitted, the doc block (when present) is the contract and the only response needed when folks file bug reports trying to use the API in an unsupported manner is to tell them they're Doing It Wrong™ and to update their code. Yes, that is painful to hear, but it is the right response.

One could decide to repair the data by silently cast them to string (as PHP did before version 8), but that's not a suitable strategy for a framework.

Anyone relying on type coercion as an explicit feature is waiting to be bitten by a bug in production code. A framework performing unexpected type coercion of a provided value is just as bad as the engine silently coercing values IMO. The typecast added in this patch has Bad Idea written all over it IMO (not to mention that it's basically an incomplete patch, you aren't handling types that don't convert to string very well (like arrays or resources), and float-to-string conversion runs the risk of data loss depending on the PHP configuration).

You are unilaterally deciding that an untyped parameter is now explicitly treated as mixed.

That's not me, it's PHP. Without this patch, users get a Deprecated: trim(): Passing null to parameter #1 ($string) of type string is deprecated in [...]/registry/src/Format/Json.php on line 54 message, lacking any hint on how to fix that without analysing the framework code. Emitting our own deprecation warning with a clear message is way better.

Users will only see this message if they are using the API incorrectly. And, your message truthfully does not provide a more actionable message than the PHP deprecation message; the only difference is one is an engine deprecation from a core function called by this class and one is a user deprecation emitted from this class (literally one frame higher up the stack than if the PHP deprecation were to be emitted).

Either way, that does not change the fact that you are explicitly changing the API to support a mixed type and creating a runtime behavior that contradicts the documented expectations. Because you are explicitly widening the supported types of the $data parameter from string to mixed, I would expect that any of the 10 primitive types from https://www.php.net/manual/en/language.types.intro.php would be injectable to this method and they be correctly handled. Truth of the matter is that only the string type, or things that (mostly) can be coerced into a string can be correctly handled by this method. Implicitly, this means that integers, booleans, null, and Stringable objects have behaviors that are well-defined enough that string coercion won't fatally break things but there will be potentially unexpected data types later on; the other primitives either have data loss through type manipulation (floats are notorious for this) or can't reasonably be coerced into a string.

Just let the engine spit out the relevant warning or error if someone is doing it wrong, it's usually going to have a better explanation of the issue (even if it means someone might have to step through a couple of frames of third-party code to understand why it was emitted) compared to an overzealous type cast to avoid an engine deprecation notice because the API is duck typed and runs a real risk of bad data manipulation.

@nibra nibra marked this pull request as draft July 7, 2022 11:43
@nibra
Copy link
Contributor Author

nibra commented Jul 7, 2022

Got your point. I tend to agree with your view. But then we have to make that clearer in the documentation. I will bring this up for discussion at the next production meeting. Moved this into draft state.

Thank you for your detailed explanation!

@Llewellynvdm
Copy link

Thank you @mbabker for sharing your thoughts on the importance of API documentation and type hints in PHP. I understand that you believe relying too heavily on type coercion or duck typing can be risky and lead to unexpected behavior in production code. You suggest that sticking to documented API contracts and relying on PHP engine's error messages is a safer approach.

I appreciate your insights and would like to know more about the best approach for adding type hints to the code. How can we make this change in a gradual manner and ensure that other developers who may be using the code are not caught off guard?

Overall, I think your points are well taken, and I look forward to hearing more about how we can implement these ideas in practice.

@mbabker
Copy link
Contributor

mbabker commented Feb 22, 2023

My stance is not much different than how the Laravel ecosystem deals with their duck typed APIs. The doc blocks are the contracts, anyone not respecting them is already Doing It Wrong™. Will runtime checks make it more obvious they are doing so? Absolutely. Is type coercion after said runtime check an appropriate solution? Absolutely not.

IMO, it is not a library author's responsibility to add runtime type checks (or coerce the input, which can result in data loss) inside a duck-typed function. If the intent is to add types to the signatures with the next major release, just do it, and document the change as part of the upgrade guide.

Now, here is the kicker from my experience in the (broken) Joomla development ecosystem. The support, or lack thereof, for null is rarely documented. There are a lot of places where null is legally passed through and handled but the value is not a documented acceptable type (in fact, here's a prime example from Registry::loadArray(). That same lack of documentation around nulls is why platforms with a long string of technical debt are being bitten hard by the PHP 8.1 deprecation of allowing null to non-nullable parameters. At some point, effort must be put in to review the entirety of the API to document where null is a legal type/value and where it isn't. For all other types, PHP's error system or coercion of types will be enough for the majority of use cases; either data will be coerced into something the function can use (whether the output is expected is up to the caller) or PHP will error out because you gave it something it can't process.

Long and short, you don't need runtime checks/deprecations in duck-typed APIs when planning to change to runtime declared types AS LONG AS the runtime signature matches the implied signature from the doc blocks; the doc block already serves as the API contract and no behavioral change for proper use of the API will be introduced by moving that contract from documentation to runtime. You do need runtime checks/deprecations in either duck-typed or fully-typed APIs when intending to narrow the supported types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants