Skip to content
This repository has been archived by the owner on Dec 2, 2024. It is now read-only.

bytewise key-encoding on sub-level causes not found on other level #64

Closed
MeirionHughes opened this issue Jun 15, 2019 · 20 comments
Closed
Assignees
Labels
bug Something isn't working

Comments

@MeirionHughes
Copy link
Member

MeirionHughes commented Jun 15, 2019

I'm getting odd behavior when using bytewise encoding on a sublevel. The intention is have bytewise encode a number to lex-sortable key, and then have that key appended on the end of the sub-level. It works if I write one value, but the moment I write a second, I can't read keys from other nested subs. Any idea where I'm going wrong? Docs says a sub-level must encode to a buffer and bytewise should be doing just that.

var sub = require('subleveldown')
var memdown = require('memdown')
var levelup = require('levelup');
var encoding = require('encoding-down');
var bytewise = require('bytewise');
var msgpack = require('msgpack-lite');
var {streamToRx} = require('rxjs-stream');

var db = levelup(encoding(memdown()));

var test1= sub(db, "logs", {valueEncoding: "json" });
var test2 = sub(db, "data");
var nested1 = sub(test2, '1234', { keyEncoding: bytewise, valueEncoding: msgpack })

async function main(){
  await test1.put("1234", "FOO");

  console.log("Got: " + await test1.get("1234"));
  console.log("put one..")
  await nested1.put(10,  10);

  console.log("Got: " + await test1.get("1234"));

  await dumpKeys(db);

  await nested1.put(20,  20);
  console.log("put another..")

  await dumpKeys(db);

  console.log(await test1.get("1234"));

  await dumpKeys(db);
}

async function dumpKeys(db){
  console.log("DUMP:")
  await streamToRx(db.createKeyStream()).forEach(x=>console.log(" " + x.toString()));
}

main().catch(console.log);

Console output:

Got: FOO
put one..
Got: FOO
DUMP:
 !logs!1234
 !data!!1234!B@$
put another..
DUMP:
 !logs!1234
 !data!!1234!B@$
 !data!!1234!B@4
NotFoundError: Key not found in database [1234]
    at D:\Code\geo\node_modules\levelup\lib\levelup.js:160:15
    at D:\Code\geo\node_modules\encoding-down\index.js:50:21
    at Immediate.callNext (D:\Code\geo\node_modules\memdown\memdown.js:162:7)
    at runCallback (timers.js:694:18)
    at tryOnImmediate (timers.js:665:5)
    at processImmediate (timers.js:647:5)

deps:

    "bytewise": "^1.1.0",
    "encoding-down": "^6.0.2",
    "leveldown": "^5.1.0",
    "levelup": "^4.0.2",
    "msgpack-lite": "^0.1.26",
    "rxjs": "^6.5.1",
    "rxjs-stream": "^3.0.2",
    "subleveldown": "^4.0.0",
@MeirionHughes MeirionHughes changed the title Batch put in nested sub causes bytewise key-encoding on sub-level causes not found on other level Jun 15, 2019
@MeirionHughes
Copy link
Member Author

No issue if I switch to charwise - so I suspect the mix of string + buffer and its screwing with the ! delimiter checking.

@vweevers
Copy link
Member

You're likely hitting a known issue with memdown, when mixing buffers and strings in the same db, see Level/level-ttl#68 (comment) (different context, same underlying issue). Might be fixable by reverting Level/codec#23.

@vweevers
Copy link
Member

To confirm it's that, could you try replacing memdown with leveldown?

@vweevers vweevers added bug Something isn't working question Support or open question(s) labels Jun 16, 2019
@MeirionHughes
Copy link
Member Author

Yes - leveldown works.

@vweevers
Copy link
Member

Great. So, reverting Level/codec#23 should fix this, the level-ttl issue and a multileveldown issue (Level/community#71). I can't think of any reasons not to revert it. Feel up to making a PR?

@MeirionHughes
Copy link
Member Author

So, you want to revert #23, which was a revert itself?

@vweevers
Copy link
Member

Yes :)

@MeirionHughes
Copy link
Member Author

okaydoky I'll test if that fixes my unit-test locally then do the pr.

@MeirionHughes
Copy link
Member Author

okay I've cherry-picked the original commit and resolved the linting errors: https://github.com/MeirionHughes/codec/commit/9f6686a390b96200a93103e6672d96569a388a2a

doesn't seem to fix it unfortunately.

I yarn link and yarn link level-codec locally, and added a console.log during the export just to make sure it was being used.

@MeirionHughes
Copy link
Member Author

I'll see if I can make a basic failing test on here

MeirionHughes added a commit that referenced this issue Jun 16, 2019
@MeirionHughes
Copy link
Member Author

Okay I've managed to get it to trigger the race condition. At least for me and its consistent: https://github.com/Level/subleveldown/compare/tests/buffer-race-issue

it very odd because you can change subdb(db, 'logs') to subdb(db, 'AAAA') and it works...

@vweevers
Copy link
Member

{ keyEncoding: 'buffer' } should be { keyEncoding: 'binary' }

@MeirionHughes
Copy link
Member Author

MeirionHughes commented Jun 16, 2019

fixed - still failing. I've also made sure that my local version of level-codec with your suggested change (Level/codec@master...MeirionHughes:master) is defiantly being loaded.

... anyway I can use charwise and keep all my keys as strings for now.

@vweevers vweevers removed the question Support or open question(s) label Jun 16, 2019
@vweevers vweevers self-assigned this Jun 16, 2019
@vweevers
Copy link
Member

I had a closer look. Thanks for providing a clean subleveldown test! The problem is that memdown, when comparing two keys, and one key is a buffer while the other key is a string, does not compare them bytewise. Essentially it compares buf[index] to str[index].

@vweevers
Copy link
Member

We have a few options:

  1. Support comparing buffers to strings in memdown (bringing its behavior closer to leveldown) (but it's not compatible with IDB-style sort order, which sorts by type first and then content - i.e. string data is never equal to binary data) (bytewise is also IDB-style but sorts string and binary the other way around...)
  2. Have memdown sort by type, then content
  3. Have subleveldown always use buffers internally

I hate encodings right now :)

@vweevers
Copy link
Member

vweevers commented Jun 16, 2019

I think I prefer 2. Because although it is not the same behavior as leveldown, it is the least surprising and it's fairly reasonable to expect a consumer to be consistent in their usage of key/value types - i.e. don't mix buffers and strings in the same (sub)db.

@ralphtheninja WDYT?

Edit: I forgot I already wrote code for 2! To deal with other surprising cases, all arising from the fact that memdown does not compare by type. For example it considers 'a' and 0 to be equal (because JavaScript attempts to convert the left- and right-hand sides of greater-than and less-than operators to numbers, unless both sides are strings. In the case of 'a' and 0, 'a' becomes NaN, which is never greater or less than a number.)

@MeirionHughes
Copy link
Member Author

MeirionHughes commented Jun 16, 2019

Thanks for investigating, on a Sunday no-less! - Perhaps no.1 via an option so that it behaves just like leveldown? Non-breaking, but allows memdown to behave like leveldown during unit tests. I guess if you're code for no.2 already solves it then that is the way to go.

@vweevers
Copy link
Member

Perhaps no.1 via an option so that it behaves just like leveldown?

Although it's easy to add a comparator option to memdown to make it sort however you want, IMO the default behavior should cover common use cases like these, and not require configuration.

Somewhere this week, I'll try to clean up and push the code that I have. It will make memdown sort the same as IDB and level-js, and almost the same as bytewise.

Side note: are you using memdown purely to speed up tests? You might like level-test which defaults to leveldown in node, but is quite fast because it uses temporary directories without cleaning them up (leaving that to the OS).

@vweevers
Copy link
Member

Opened a PR for level-codec (Level/codec#51, thanks!); memdown discussion is to be continued in Level/memdown#186.

Keeping this open though, as a reminder to test everything together once ready.

@vweevers
Copy link
Member

Fixed in [email protected].

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants