e4c_store_mem: examine O(n) key lookup algorithm and examine if we need to a hash map if size is massive #13

odeke-em · 2020-01-07T07:57:17Z

Just noticed while auditing the code

Lines 84 to 104 in 601fd4d

    
           int e4c_getindex(e4storage *store, const char *topic) 
        
           { 
        
               int i; 
        
               uint8_t hash[E4_TOPICHASH_LEN]; 
        
               /* hash the topic */ 
        
               if (e4c_derive_topichash(hash, E4_TOPICHASH_LEN, topic) != 0) { 
        
                   return E4_ERROR_PERSISTENCE_ERROR; 
        
               } 
        
               /* look for it */ 
        
               for (i = 0; i < store->topiccount; i++) 
        
               {  
        
                   if (memcmp(store->topics[i].topic, hash, E4_TOPICHASH_LEN) == 0) 
        
                   { 
        
                       break; 
        
                   } 
        
               } 
        
               if (i >= store->topiccount) return E4_ERROR_TOPICKEY_MISSING; 
        
               return i; 
        
           }

but we don't have an explanation on the number of keys that can be stored in e4store. What's going to be the usual size? How often will it be called? Would be nice to have an idea about this and add a comment if the number of keys will be very small but if lots we need to audit this code and profile it as it could definitely become an attack vector to issue worst case algorithmic complexity attacks.

odeke-em · 2020-01-07T08:09:07Z

We need to also examine the code for removing a device too

libe4/src/e4c_pk_store_file.c

Lines 443 to 466 in 601fd4d

    
           int e4c_remove_device(e4storage* store, const uint8_t* id) 
        
           { 
        
               int i, j; 
        
               device_key *devicekeys = store->devices; 
        
               for (i = 0; i < store->devicecount; i++) 
        
               { 
        
                   if (memcmp(devicekeys[i].id, id, E4_ID_LEN) == 0) 
        
                   { 
        
                       /* remove this item and move list up */ 
        
                       for (j = i + 1; j < store->topiccount; j++) 
        
                       { 
        
                           memcpy(&devicekeys[j - 1], &devicekeys[j], sizeof(device_key)); 
        
                       } 
        
                       ZERO(devicekeys[store->devicecount]); 
        
                       store->devicecount--; 
        
                       return e4c_sync(store); 
        
                   } 
        
               } 
        
               return E4_ERROR_DEVICEPK_MISSING; 
        
           }

diagprov · 2020-02-06T17:31:08Z

I did not respond to this at the time, but I will now do so.

Firstly, this is unlikely to be a problem in embedded contexts. Encrypted EEPROM chips range from 3kb to 32kb of encrypted data, for example: https://www.digikey.com/en/articles/techzone/2019/jan/use-external-encrypted-eeprom--secure-data-embedded-systems. Given each topic key is 32 bytes, plus a 16 byte ID, plus our own 16 byte ID + key we can store precisely 682 such keys on such a device (32kb), provided no other data need be stored (like lookaside keys). This figure decreases in the event that we use public key cryptography - in this situation you can store around 300 keys of each type as we need to store longer secret keys plus C2 keys. This all assumes no key roll support at all.

In an embedded context using secure eeprom, therefore, the extra metadata required to maintain a hashmap is really not worth the storage space. It is unfortunately not magic and we must deal with hash collisions too. Note that on such a device we would not, in fact, use these structs as is: everything would be read from EEPROM as needed and we would not duplicate everything into RAM (these are just toy implementations).

HOWEVER it looks like we may be using libe4 in non-embedded contexts (by this, I mean desktops and mobile type environments, which may multithread and may dynamically allocate). Therefore, and I'll ping @veorq as well, is it worth implementing a storage specifically for these use cases inside libe4 itself?

My original intention, even with bindings of any kind, was that the calling / consuming code would actually provide the storage. In an embedded context, this is because I've no idea necessarily how to talk to EEPROM. It is very much not standard: you need to know how it is connected to your chip. In the case of mobile e.g. iOS/Android, since we store sensitive information (keys) the calling application may want to use those platform's secure storage APIs. Those APIs are as follows:

If a new storage is implemented in C I would envisage the following constraints:

A suitable run time lookup algorithm e.g. a hashmap or tree so we're not doing O(n) everything.
Enforced use of C11+. CONF=C89 make would fail with this storage.
Thread safety / reentrant protect/unprotect functions.

Also pinging @daeMOn63 as he might be interested in this.

diagprov · 2020-02-06T17:33:39Z

I should add that there is a cost to implementing and maintaining such a storage. And yes, I did mean C11 and above, not C++11. C++ is supported by virtue of the extern C interfaces.

diagprov self-assigned this Jan 7, 2020

diagprov added A-Storage Related to key storage infrastructure P-LOW Low priority issue labels Jan 7, 2020

diagprov modified the milestone: vNext Feb 6, 2020

diagprov added the O-Linux Linux label Feb 6, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

e4c_store_mem: examine O(n) key lookup algorithm and examine if we need to a hash map if size is massive #13

e4c_store_mem: examine O(n) key lookup algorithm and examine if we need to a hash map if size is massive #13

odeke-em commented Jan 7, 2020

odeke-em commented Jan 7, 2020

diagprov commented Feb 6, 2020

diagprov commented Feb 6, 2020

e4c_store_mem: examine O(n) key lookup algorithm and examine if we need to a hash map if size is massive #13

e4c_store_mem: examine O(n) key lookup algorithm and examine if we need to a hash map if size is massive #13

Comments

odeke-em commented Jan 7, 2020

odeke-em commented Jan 7, 2020

diagprov commented Feb 6, 2020

diagprov commented Feb 6, 2020