You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Where CST is i.MX Code Signing Tool, which is used to sign images for secure boot on NXP SOC:s.
When stress testing our signing I saw that the PKCS11 sessions are not correctly released, which after a short while under load causes errors due to lack of free sessions. With the yubihsm-connector the maximum number of consecutive sessions is 16 and the default timeout for inactive sessions is 30 seconds. When running more frequent than the timeouts can cope with signing operations fail.
I have been investigating the not released sessions, since we have a setup where signing operations can come in bursts that could potentially be hurt by the described problem. This could cause random build failures.
The results from my investigations are some smaller issues in the CST tool, but also one or two in libp11. They are described below.
I will make a PR for the first one.
Reference counting in p11_key.c pkcs11_object_free()
In the following code in p11_key.c the decrease of obj->refcnt happens before the following handling
of the obj->evp_key.
void pkcs11_object_free(PKCS11_OBJECT_private *obj)
{
if(!obj)
return;
if (pkcs11_atomic_add(&obj->refcnt, -1, &obj->lock) != 0)
return;
if (obj->evp_key) {
/* When the EVP object is reference count goes to zero,
* it will call this function again. */
EVP_PKEY *pkey = obj->evp_key;
obj->evp_key = NULL;
EVP_PKEY_free(pkey);
return;
}
pkcs11_slot_unref(obj->slot);
X509_free(obj->x509);
OPENSSL_free(obj->label);
pthread_mutex_destroy(&obj->lock);
OPENSSL_free(obj);
}
In the if (obj->evp_key) section there is a comment that states that the pkcs11_object_free() will
get a new call as a result of the EVP_PKEY_free(pkey) call. At the end of that scope it returns.
When the second call comes, as stated in the comment, the obj->refcnt will be decreased once more.
This means that it will likely go negative and not 0 as tested in the if-clause. This means that it will return
and the five clean-up lines at the end of the function will never be run.
I will make a PR for the above.
Workaround in eng_front.c blocks cleanup
In the function engine_destroy() in eng_front.c there is a turned off call to ctx_finish(),
with a comment as below.
/* Destroy the context allocated with ctx_new() */
static int engine_destroy(ENGINE *engine)
{
ENGINE_CTX *ctx;
int rv = 1;
ctx = get_ctx(engine);
if (!ctx)
return 0;
/* ENGINE_remove() invokes our engine_destroy() function with
* CRYPTO_LOCK_ENGINE / global_engine_lock acquired.
* Any attempt to re-acquire the lock either by directly
* invoking OpenSSL functions, or indirectly via PKCS#11 modules
* that use OpenSSL engines, causes a deadlock. */
/* Our workaround is to skip ctx_finish() entirely, as a memory
* leak is better than a deadlock. */
#if 0
rv &= ctx_finish(ctx);
#endif
rv &= ctx_destroy(ctx);
ENGINE_set_ex_data(engine, pkcs11_idx, NULL);
ERR_unload_ENG_strings();
return rv;
}
Without the turned off call to ctx_finish() I can't see that the cleanup is run at all. I understand the comment about the deadlock issue, but the result of the decision is not only a memory leak but a leak of PKCS#11 sessions and possibly other resources.
Would it be possible to handle this in a better way? To handle the locks in a more detailed way on calls to ENGINE_remove()?
The text was updated successfully, but these errors were encountered:
We use libp11 in a chain like the following:
CST - OpenSSL - libpkcs11.so - yubihsm_pkcs11.so - YubiHSM 2
Where CST is i.MX Code Signing Tool, which is used to sign images for secure boot on NXP SOC:s.
When stress testing our signing I saw that the PKCS11 sessions are not correctly released, which after a short while under load causes errors due to lack of free sessions. With the
yubihsm-connector
the maximum number of consecutive sessions is 16 and the default timeout for inactive sessions is 30 seconds. When running more frequent than the timeouts can cope with signing operations fail.I have been investigating the not released sessions, since we have a setup where signing operations can come in bursts that could potentially be hurt by the described problem. This could cause random build failures.
The results from my investigations are some smaller issues in the CST tool, but also one or two in libp11. They are described below.
I will make a PR for the first one.
Reference counting in p11_key.c
pkcs11_object_free()
In the following code in
p11_key.c
the decrease ofobj->refcnt
happens before the following handlingof the
obj->evp_key
.In the
if (obj->evp_key)
section there is a comment that states that thepkcs11_object_free()
willget a new call as a result of the
EVP_PKEY_free(pkey)
call. At the end of that scope it returns.When the second call comes, as stated in the comment, the
obj->refcnt
will be decreased once more.This means that it will likely go negative and not
0
as tested in the if-clause. This means that it will returnand the five clean-up lines at the end of the function will never be run.
I will make a PR for the above.
Workaround in
eng_front.c
blocks cleanupIn the function
engine_destroy()
ineng_front.c
there is a turned off call toctx_finish()
,with a comment as below.
Without the turned off call to
ctx_finish()
I can't see that the cleanup is run at all. I understand the comment about the deadlock issue, but the result of the decision is not only a memory leak but a leak of PKCS#11 sessions and possibly other resources.Would it be possible to handle this in a better way? To handle the locks in a more detailed way on calls to
ENGINE_remove()
?The text was updated successfully, but these errors were encountered: