
Account tests #528 (Draft)

goodboy wants to merge 116 commits into base: `basic_buy_bot`

Conversation

@goodboy (Contributor) commented Jul 5, 2023

More or less starting the final push to get `piker.accounting` and friends to a place where we can never get wrong position calcs, implement said calcs using `polars.DataFrame`s, do offline ledger processing for all accounts, and be in a place to start implementing the position-metrics curves display in the chart UI as mentioned in #515.


Tasks:


Test list:

  • "offline" ledger test set:

    • `binance.paper` input ledgers (of various sorts) where we
      verify `Account.pps: dict[str, Position]` property outputs
      (see the test sketch after this list):
      • long to zero
      • short to zero
      • long to short (via single clear) and back to zero
    • use an `ib.algopaper` actual flex and API ledger and verify
      at least a subset of the pps.
  • symcache tests which verify creation for all backends and
    reloads when requested?
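As a rough illustration of what one "offline" case might look like, here's a minimal pytest sketch; the `open_test_account` factory fixture and the ledger fixture filename are hypothetical stand-ins for whatever harness the suite ends up with:

```python
import pytest


def test_binance_paper_long_to_zero(open_test_account) -> None:
    # `open_test_account` is a *hypothetical* factory fixture which
    # loads a canned `binance.paper` input ledger and delivers the
    # fully processed `Account`.
    acnt = open_test_account('binance.paper', 'long_to_zero.toml')
    pos = acnt.pps['btcusdt.spot.binance']

    # a long entry followed by a full exit must flatten the position..
    assert pos.cumsize == 0.0

    # ..while both clearing events remain readable from the table.
    assert len(pos._events) == 2
```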

No point having duplicate data when we already stash the `expiry` on the
mkt info type and can just read it (and cast to `datetime` obj).

Further this fixes a regression caused by converting `._clears` to
a list, by adding a `._events: dict[str, Transaction]` which prevents
double-entering transactions by checking the events table for an
existing id.. Further, add a sanity check that all events are popped
(for now) after serializing the clearing table for the toml account
file.

In the longer run, ideally we don't have the separate `._clears` and
`._events` sequences and instead choose a better data structure
(a sorted, unique set of mkt events), maybe a specially-purposed
`polars.DataFrame` (which we kinda need eventually anyway)?
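For reference, the double-entry guard amounts to something like the following sketch (method shape approximate, not the actual impl):

```python
def add_clear(self, t: Transaction) -> bool:
    # the `._events` table is keyed by txn id so a re-processed
    # ledger entry can never be double-entered..
    if t.tid in self._events:
        return False

    self._events[t.tid] = t
    return True
```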
In an effort to properly support fiat pairs (aka forex) as well as more
generally insert a fully-qualified `MktPair` in for the
`Transaction.sym`. Note that there's a bit of special handling for API
`Contract`s-as-dict records vs. flex-report-from-xml equivalents.
Also finally adds full `FeedInit` and `MktPair` support for this backend
by handling:
- all "currency" fields for each `Contract` by constructing
  an `Asset` and setting the `MktPair.src` with `.atype='fiat'`.
- always render the `MktPair.src` name in the `.fqme` for fiat pairs
  (aka forex) but never for other instruments.
We're probably going to move to implementing all accounting using
`polars.DataFrame` and friends and thus this rejig preps for a much more
"stateless" implementation of our `Position` type and its internal
pos-accounting metrics: `ppu` and `cumsize`.

Summary:
- wrt to `._pos.Position`:
  - rename `.size`/`.accum_size` to `.cumsize` to be more in line
    with `polars.DataFrame.cumsum()`.
  - make `Position.expiry` delegate to the underlying `.mkt: MktPair`
    handling (hopefully) all edge cases..
  - change over to a new `._events: dict[str, Transaction]` in prep
    for #510 (and friends) and enforce a new `Transaction.etype: str`
    which is by default `clear`.
  - add `.iter_by_type()` which iterates, filters and sorts the
    entries in `._events` from above.
  - add `Position.clearsdict()` which returns the dict-ified and
    datetime-sorted table which can more-or-less be stored in the
    toml account file.
  - add `.minimized_clears()` a new (and close) version of the old
    method which always grabs at least one clear before
    a position-side-polarity-change.
  - mask-drop `.ensure_state()` since there are no more `.size`/`.price`
    state vars (per se) as we always re-calc the ppu and cumsize from
    the clears records on every read (see the sketch after this list).
  - `.add_clear()` no longer does bisect-insorting since all sorting is
    done on position property *reads*.
  - move the PPU (price per unit) calculator to a new `.accounting.calcs`
    as well as add in the `iter_by_dt()` clearing transaction sorted
    iterator.
    - also make some fixes to this to handle both lists of `Transaction`
      as well as `dict`s as before.

- start rename of `PpTable` -> `Account` and make a note about adding
  a `.balances` table.
- always `float()` the transaction size/price values since it seems if
  they get processed as `tomlkit.Integer` there's some suuper weird
  double negative on read-then-write to the clears table?
  - something like `cumsize = -1` -> `cumsize = --1` !?!?
- make `load_pps_from_ledger()` work again but now includes some very
  very first draft `polars` df processing from a transaction ledger.
  - use this from the `accounting.cli.disect` subcmd which is also in
    *super early draft* mode ;)
- obviously as mentioned in the `Position` section, add the new `.calcs`
  module with a `.ppu()` calculator func B)
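To make the "stateless" read-side concrete, a sketch (assuming the names above; not the actual property bodies) of how `.cumsize` can be re-calced on every access:

```python
class Position:
    ...
    @property
    def cumsize(self) -> float:
        # no cached `.size` state var; always re-sum the (datetime
        # sorted) clear events on read.
        return sum(
            t.size
            for t in iter_by_dt(self._events.values())
            if t.etype == 'clear'
        )
```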
New mod is `.data._symcache` and it needs backend clients to declare
`Client.get_assets()` and `.get_mkt_pairs()` to generate the cache files
which now go in the config dir under `_cache/`.
For starters rename the cache type to `SymbologyCache` and fill out its
interface to include an (async) `.reload()` which can be used to populate
the in-mem asset-table sets such that any tractor-runtime task can
actually directly call it. Use a symcache file name schema of
`_cache/<backend>.symcache.toml`.

Dirtier deatz:
- make `.open_symcache()` a `@cm` such that it can be used from sync code
  and will actually call `trio.run()` in the case where it needs to do a
  full (re)load (see the sketch below); also don't write on exit, only on
  reloads.
- add `.get_symcache()` a simple non-ctx-mngr reader which again can
  mostly be called willy-nilly from sync code without the full runtime
  being up (but likely will only work if symcache files already exist
  for the backend).
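A minimal sketch of the sync-usable `@cm` pattern; the `.fp`/`.load()`/`.write_config()` names here are guesses, not the actual interface:

```python
from contextlib import contextmanager

import trio


@contextmanager
def open_symcache(mod, reload: bool = False):
    cache = SymbologyCache(mod)
    if reload or not cache.fp.exists():
        # no runtime up? spin a throwaway one just for the (re)load
        # and only write the toml file in this case, never on exit.
        trio.run(cache.reload)
        cache.write_config()
    else:
        # fast path: just read the existing symcache file.
        cache.load()

    yield cache
```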
Previously we weren't necessarily serializing mkt pairs (for IPC msging)
entirely, bc the assets `.src`/`.dst` were being sent just by their
str-names. This now properly supports fully serializing `Asset`s as
`dict`-msgs such that `MktPair.to_dict()` output can be transmitted over
`tractor.MsgStream`s and deserialized entirely back to struct form on
the receiver end.

Deats:
- implement `Asset.to_dict()` and `.from_msg()` (round-trip sketched below)
- adjust `MktPair.to_dict()` and `.from_msg()` to use these methods.
  - drop all the hacky "if .src/.dst is str" handling.
- add better `MktPair.from_fqme()` input handling for expiry and venue;
  ensure that either can be extracted from the passed fqme *and*, if so,
  they are also popped from any duplicates passed in `**kwargs`.
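The round-trip this enables, sketched with guessed-at field values:

```python
from decimal import Decimal

# field names here approximate the real `Asset` struct def.
asset = Asset(name='usd', atype='fiat', tx_tick=Decimal('0.01'))

msg: dict = asset.to_dict()           # fully serializable dict-msg
assert Asset.from_msg(msg) == asset   # lossless struct reconstruction
```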
As part of loading the cache we can now fill the asset sub-tables:
`.mktmaps` and `.assets` with their deserialized struct instances!
In theory this might be possible for the backend-defined `Pair` structs
as well but we probably need to figure out an endpoint to offer
the conversion?

Also, add a `SymbologyCache.search()` which allows sync code to scan the
existing (known via cache) symbol set just like how async code can use the
(much slower) `open_symbol_search()` ctx endpoint 💥
Since we now fully support interchange-as-dict-msg, use the msg codec
API and drop manual `Asset` unpacking. Also, wrap `get_symcache()` in
a `pdbp` crash handler block for now B)
Turns out we don't really need `Transaction.sym: MktPair` directly for
most "txn processing", AND when we do it's usually for some
`Account`-ing related calcs; which means we can instead just rely on the
new `SymbologyCache` lookup to get it when needed. So, basically just
get rid of it and rely instead on the `.fqme` as the god-key for getting
`MktPair` info (from the cache).

Further, extend the `TransactionLedger` to contain much more info on the
pertaining backend:
- `.mod` mapping to the (pkg) py mod.
- `.filepath` pointing to the actual ledger TOML file.
- `_symcache` for doing any needed asset or mkt lookup as mentioned
  above.
- rename `.iter_trans()` -> `.iter_txns()` and allow passing in
  a symcache or using the init-provided one (see the sketch after
  this list).
  - rename `.to_trans()` similarly.
- delegate paper account txn processing to the `.clearing._paper_engine`
  mod's `norm_trade()` (and expect this similarly from other backends!)
- use new `SymbologyCache.search()` to find the best but
  un-fully-qualified fqme for a given `txdict` being processed when
  writing a config (aka always try to expand to the most verbose `.fqme`
  possible).
- add a `rewrite: bool` control to `open_trade_ledger()`.
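Roughly, the fqme-as-god-key pattern inside the (renamed) txn iterator looks something like this sketch; the `.data` table name and the `norm_trade()` call shape are assumptions:

```python
def iter_txns(self, symcache=None):
    cache = symcache or self._symcache
    for tid, txdict in self.data.items():
        # no more `Transaction.sym`; normalize the raw entry then
        # resolve its mkt purely via the god-key from the cache.
        txn = self.mod.norm_trade(tid, txdict)
        mkt = cache.mktmaps[txn.fqme]
        yield txn, mkt
```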
Rename `open_pps()` -> `open_account()` for obvious reasons as well as
expect a bit tighter integration with `SymbologyCache` and consequently
`LedgerTransaction` in order to drop `Transaction.sym: MktPair`
dependence when compiling / allocating new `Position`s from a ledger.

Also we drop a bunch of prior attrs and do some cleaning,
- drop `Position.first_clear_dt`; we no longer sort during insert.
- `._clears` is now replaced by the `._events` table.
- drop the now masked `.ensure_state()` method (eventually moved to
  `.calc` submod for maybe-later-use).
- drop `.sym=` from all remaining txns init calls.
- clean out the `Position.add_clear()` method and only add the provided
  txn directly to the `._events` table.

Improve some `Account` docs and interface:
- fill out the main type descr.
- add the backend broker modules as `Account.mod` allowing to drop
  `.brokername` as input and instead wrap as a `@property`.
- make `.update_from_trans()` now a new `.update_from_ledger()` and
  expect either of a `TransactionLedger` (user-dict) or a dict of txns;
  in the latter case if we have not been also passed a symcache as input
  then runtime error since the symcache is necessary to allocate
  positions.
  - also, delegate to `TransactionLedger.iter_txns()` instead of
    a manual datetime sorted iter-loop.
  - drop all the clears-datetime don't-insert-if-earlier-than-first
    logic.
- rename `.to_toml()` -> `.prep_toml()`.
- drop old `PpTable` alias.
- rename `load_pps_from_ledger()` -> `load_account_from_ledger()` and
  make it only deliver the account instance and also move out all the
  `polars.DataFrame` related stuff (to `.calc`).

And tweak some account clears table formatting,
- store datetimes as TOML native equivs.
- drop `be_price` fixing.
- obvsly drop `.ensure_state()` call to pps.
To isolate it from the ledger/account mods and bc it is actually for
doing (eventual) position calcs / anal, might as well put it in this
mod. Add in the old-masked `ensure_state()` method content in case we
want to use it later for testing. Also tighten up the parser loading
inside `dyn_parse_to_dt()`.
Drop all the old `polars` (groupby + agg related) mangling to get a df
per fqme by delegating to the new routine and add in the `.cumsum()`ing
(per frame) as a first start on computing pps using dfs instead of
python dicts + loops as in `ppu()`.
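A hypothetical shape of that per-fqme frame processing: one running `cumsize` per market via a window expression (the commit's `.cumsum()` is the older spelling of polars' `.cum_sum()`); the column names are assumptions, not the actual ledger schema:

```python
import polars as pl

ldf = pl.DataFrame({
    'fqme': ['btcusdt.spot.binance'] * 3,
    'size': [2.0, 1.0, -3.0],
    'price': [10.0, 11.0, 12.0],
})

# one running position size per market via a window expression.
ldf = ldf.with_columns(
    pl.col('size').cum_sum().over('fqme').alias('cumsize')
)
# the last row per fqme now carries the net position size (0.0 here).
```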
So you can do a `Struct1 - Struct2` and we dump a little diff `list`
of tuples for anal on the REPL B)

Prolly can be broken out into its own micro-patch?
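A plausible shape for it (piker's `Struct` wraps `msgspec.Struct` so `__struct_fields__` is available; the body below is a sketch, not the actual impl):

```python
def __sub__(self, other) -> list[tuple[str, object, object]]:
    # pairwise-compare fields, collecting only the mismatches as
    # `(field, ours, theirs)` tuples for eyeballing on the REPL.
    return [
        (name, ours, theirs)
        for name in self.__struct_fields__
        if (ours := getattr(self, name)) != (theirs := getattr(other, name))
    ]
```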
Add `bs_src/dst_asset: str` properties which provide for unique keying
into futures vs. spot venues by offering a `.venue: str` property which,
for non-spot, normally delivers an expiry suffix (eg. '.PERP') and for
spot just delivers the bare chain-token key.

This enables keying multiple venues with the same mkt pairs easily in
a global flat key->pair table needed as part of supporting a symcache.
Meaning we add the `Client.get_assets()` and `.get_mkt_pairs()` methods.
Also implement `.exch_info()` to take in an `expiry: str` to detect
whether to look up a derivative venue instead of spot.

In support of all this we now explicitly key all assets (via
`._cache_pairs()` during the populate of the `._venue2assets`
sub-tables) with their `.bs_dst_asset: str` value to ensure, for ex.,
a spot `BTCUSDT` has a distinct value from any futures contracts with
the same `Pair.symbol: str` value!

Also, ensure we always create a `brokers.toml` (from template) if DNE
and binance is the user's first used backend XD
Instead of constructing them (previously manually) in `.get_mkt_info()` ep,
just call `.get_assets()` and do key lookups for assets to hand directly
to the `.src/dst` of `MktPair`.

Refine fqme input parsing to match:
- adjust parsing logic to only use `unpack_fqme()` on the input fqme
  token.
- set `.mkt_mode: str` to the derivs venue when an expiry token is
  detected in the fqme.
- pass the parsed `expiry: str` to `Client.exch_info()` to ensure
  a deriv venue (table) is used for pair lookup.
- skip any "DEFI" venue or other unknown asset type cases (since binance
  doesn't seem to define some assets anywhere?).

Also, just use the `Client._pairs` unified table for search input since
the first call to `.exch_info()` won't necessarily contain the most
up-to-date state whereas `._pairs` always will.
Took a little while to get right using declarative style but it's
finally workin and seems (mostly) correct B)

Computes the ppu (price per unit) using the PnL since last
net-zero-cumsize (aka the pnl from open to close) and uses it to calc
the pnl-per-exit trade (using the ppu).

Next up: bep (break even price), both per position and maybe since
ledger start, or from an arbitrary ref point?
Since it appears impossible to compute the recurrence relations for PPU
(at least sanely) without using embedded `polars.List` elements, this
instead just implements price-per-unit and break-even-price calcs
doing a plain-ol-for-loop imperative approach with logic branching.

I burned wayy too much time trying to implement this in some kinda
`polars` DF-native way without luck, so hopefully someone smarter can
come in and make it work at some point xD

Resolves a related bullet in #515
Also fix a bug since we always need to reset `cum_pos_pnl`
after an `exit_to_zero` case.
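The loop boils down to something like this toy version (side-flips via a single clear are elided for brevity; names are illustrative, not the actual impl's):

```python
def ppu(clears: list[tuple[float, float]]) -> float:
    # each clear is a (size, price) pair; +size = buy, -size = sell.
    cumsize = ppu = cum_pos_pnl = 0.0
    for size, price in clears:
        if cumsize == 0 or cumsize * size > 0:
            # entry: weighted-average the clear price into the ppu.
            new_size = cumsize + size
            ppu = (ppu * cumsize + price * size) / new_size
            cumsize = new_size
        else:
            # exit: realize pnl against the current ppu.
            side = 1 if cumsize > 0 else -1
            cum_pos_pnl += (price - ppu) * abs(size) * side
            cumsize += size
            if cumsize == 0:
                # the `exit_to_zero` case: always reset accumulators.
                ppu = cum_pos_pnl = 0.0

    return ppu
```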
In order to attempt giving the user a realistic prediction for a BEP per
txn, we need to model what the (worst case) anticipated exit txn costs
will be for the equivalent, paired entries. For now we use a simple
"symmetric cost prediction" model where we assume the exit costs will
simply be the same as the entry txn costs: on every entry we apply 2x
the entry txn cost; on exit txns we then unroll these predictions by
keeping a cumulative sum of the cost-per-unit and reversing the charges
by applying that mean to the current exit txn's size. Once unrolled, we
apply the actual exit txn cost received from the broker-provider.
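In toy form, the model is roughly this (a sketch under the assumptions above, not the actual engine code):

```python
class SymmetricCostModel:
    def __init__(self) -> None:
        self.cumsize: float = 0.0
        self.pred_costs: float = 0.0  # predicted exit costs outstanding

    def charge(self, size: float, cost: float) -> float:
        if self.cumsize == 0 or self.cumsize * size > 0:
            # entry: book the actual cost plus an equal predicted
            # exit cost (the "symmetric" assumption) -> 2x charge.
            self.pred_costs += cost
            self.cumsize += size
            return 2 * cost

        # exit: unroll the prediction pro-rata using the mean
        # cost-per-unit, then apply only the actual provider cost.
        cpu = self.pred_costs / abs(self.cumsize)
        unrolled = cpu * abs(size)
        self.pred_costs -= unrolled
        self.cumsize += size
        return cost - unrolled
```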
Since it's depended on by `.data` stuff as well as pretty much
everything else, makes more sense to expose it as a top level module
(and maybe eventually as a subpkg as we add to it).
If a backend declares a top level `get_cost()` (provisional name)
we call it in the paper engine to try and simulate costs according to
the provider's own schedule. For now only `binance` has support (via the
ep def) but ideally we can fill these in incrementally as users start
forward testing on multiple cexes.
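The paper-engine side of the hook might be as simple as the following (the signature is a guess, per the provisional ep name):

```python
# inside the paper engine's fill simulation:
cost: float = 0.0
get_cost = getattr(brokermod, 'get_cost', None)
if get_cost:
    # simulate fees per the provider's own schedule.
    cost = get_cost(size=size, price=price)
```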
Finally this is a reason to use our new `OrderDialogs` abstraction; on
order submission errors IB doesn't really pass back anything other than
the `orderId` and the reason, so we have to conduct our own lookup for
a message to relay to the EMS..

So, for every EMS msg we send, add it to the dialog tracker and then use
the `flows: OrderDialogs` for lookup in the case where we need to relay
said error. Also, include sending a `canceled` status such that the
order won't get stuck as a stale entry in the `emsd`'s own dialog table.
For now we just filter out unrelated errors from the stream
since there's always going to be stuff to do with live/history data
queries..
Turns out we were expecting/processing `Status(resp='error')` msgs not
`BrokerdError` (I guess bc the latter was only really being used in
initial `brokerd` msg responses and not for relay of actual provider
clearing engine failures?) and the case block match / logic wasn't
really correct. So this changes a few things:

- always do reverse `oid` lookups from `reqid`s if possible in error msg
  handling case.
- add a new `Error` client-dialog msg (derived from `Status`) which we
  now relay when `brokerd` sends a `BrokerdError` and no prior `Status`
  can be found (when it is we still fill in appropriate fields from the
  backend-error and just send back the last status msg like before).
- try hard to look up the original `Order.symbol: str` for client
  broadcasting trying first using any `Status.req` and failing over to
  embedded `.brokerd_msg` field lookups.
- drop the `Status.name = 'error'` from literal def.
This is a tricky edge case we weren't handling prior; an example is
submitting a limit order with a price tick precision which mismatches
that supported (probably bc IB reported the wrong one..) and IB responds
immediately with an error event (via a special code..) but doesn't
include any `Trade` object(s) nor details beyond the `reqid`. So, we
have to do a little reverse EMS order lookup on our own and ideally
indicate to the requester which order failed and *why*.

To enable this we,
- create a `flows: OrderDialogs` instance and pass it to most order/event relay
  tasks, particularly ensuring we update it ASAP in `handle_order_requests()`
  such that any successful submit has an `Ack` recorded in the flow (see the
  sketch after this list).
- on such errors lookup the `.symbol` / `Order` from the `flow` and
  respond back to the EMS with as many details as possible about the
  prior msg history.
- always explicitly relay `error` events which don't fall into the
  sensible filtered set and wrap in
  a `BrokerdError.broker_details['flow']: dict` snapshot for the EMS.
- in `symbols.get_mkt_info()` support adhoc lookup for `MktPair` inputs
  and when defined we re-construct with those inputs; in this case we do
  this for a first mkt: `'vtgn.nasdaq'`..
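A sketch of that reverse lookup + relay, assuming an in-scope `ems_stream` plus a hypothetical `reqids2oids` reverse map; the exact `BrokerdError` fields are approximations:

```python
async def relay_ib_error(
    reqid: int,
    reason: str,
    flows: OrderDialogs,
) -> None:
    # recover the EMS dialog from the only thing IB handed us: `reqid`.
    oid = reqids2oids.get(reqid)  # hypothetical reverse reqid -> oid map
    msgs = flows.get(oid) if oid else None

    await ems_stream.send(BrokerdError(
        oid=oid,
        reqid=reqid,
        reason=reason,
        # snapshot of the prior msg history for the requester.
        broker_details={'flow': msgs or {}},
    ))
```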
Started rejigging the example code linked below to use more modern
`asyncio` APIs:
https://github.com/matt-kimball/mtr-packet-python/blob/master/examples/trace-concurrent.py

Relates to #330