
differences between paper and validation #2

Open
nkchem09 opened this issue Apr 13, 2022 · 20 comments

@nkchem09

Great work sharing these methods. I followed the validation steps in the order listed in the README; the calculated results are below:

[image: reproduced validation results]

They are very different from the results in the paper:

[image: results from the paper]

I am wondering what the reason is.
Thank you for your help.

@replacementAI

I believe I read somewhere that they ran the model 5 times and averaged the results.
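If so, a minimal sketch of aggregating per-seed results, assuming one results.json per run containing a "sharpe" field (the directory layout and key name are assumptions, not the repo's documented output format):

```python
import json
from pathlib import Path
from statistics import mean, stdev

# Hypothetical layout: one results.json per seed directory, each holding a
# "sharpe" field. Adjust the glob and key to the repo's actual output format.
sharpes = [
    json.loads((d / "results.json").read_text())["sharpe"]
    for d in sorted(Path("results").glob("seed_*"))
]

print(f"mean Sharpe over {len(sharpes)} runs: "
      f"{mean(sharpes):.2f} (sd {stdev(sharpes):.2f})")
```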

@q-learning-trader

How long did it take you to build all the features @nkchem09 ?
On my computer it is very slow.
Have you tried a 5% target volatility instead of 15%?

@nkchem09
Author

How long did it take you to build all the features @nkchem09 ? On my computer it is very slow. Have you tried a 5% target volatility instead of 15%?

It was done just as the code is provided, with default parameters. The machine has an i9 9900K, 64 GB DDR4, and a 2080 Ti, and it takes a long time to get the result too.

@MickyDowns

I ran the model using Kieran's data (5x per @replacementAI and as specified in the paper) and can confirm the performance. I've also run it on day, day-part, hour and five-minute equity prices. It finds signal w/ this relatively noisy data, though (a) the signal declines as the time series frequency increases, and (b) the importance of CPD breaks down. Regarding CPD, it takes a while to run, w/ limited upside from using a GPU. That's either due to a configuration issue (getting the gpflow libs right is challenging) or the sequential nature of the problem. At 3.7 GHz it will do a ticker in 60-ish minutes per CPU on the default data set. As you move to higher-frequency data, the time required blows up as the value breaks down.
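Since CPD is independent per ticker, one way to cut the wall-clock time is to fan the per-ticker jobs out across CPU cores. A minimal sketch that shells out to a per-ticker entry point (the module path, arguments, and ticker list here are illustrative, not necessarily the repo's exact interface):

```python
import subprocess
from concurrent.futures import ProcessPoolExecutor

TICKERS = ["ES", "CL", "GC", "EC"]  # illustrative subset of the default universe

def run_cpd(ticker: str) -> None:
    # Hypothetical wrapper: shell out to a per-ticker CPD script.
    # Substitute the module path and arguments the repo actually exposes.
    subprocess.run(["python", "-m", "examples.cpd_quandl", ticker], check=True)

if __name__ == "__main__":
    # One worker per core; each ticker's (sequential) CPD runs in its own process.
    with ProcessPoolExecutor() as pool:
        list(pool.map(run_cpd, TICKERS))
```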

@replacementAI

That's good to hear @MickyDowns. Could you share the results on the performance of the different time candles?

@MickyDowns

Hey @replacementAI, let me see about providing my version of @kieranjwood's framework after I'm done testing it. It's table-driven, so you can run data at different frequencies relatively easily. I think that will be more useful than providing numbers which depend on my separate feature engineering. However, as you'd expect, the performance of these predictors degrades on higher-frequency data. Fortunately, new types of predictors become available as you move from bar, to bid/ask, to LOB data.

@makovez

makovez commented Mar 3, 2023

Hi, I also get different results.

I have tried running the TFT model without the change point module, using the command "python.exe -m examples.run_dmn_experiment TFT".

It then runs experiments by default on:

  • 2016-2017 - here the position is always -1
  • 2017-2018 - the position varies but is above 0.9 most of the time
  • 2019-2020 - the position is always over 0.9
  • 2020-2021 - the position varies but is above 0.9 most of the time

For each experiment, always-long beats the TFT model. Below is the comparison between the cumulative sums of captured returns and returns for the 2017-2018 and 2020-2021 periods; the returns series should match always going fully long.

[image: cumulative returns, 2017-2018]

[image: cumulative returns, 2020-2021]
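For reference, a minimal sketch of the comparison plotted above, assuming the experiment writes a CSV with time, returns, and captured_returns columns (the path and column names are assumptions based on this thread, not the repo's exact schema):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Path and column names assumed, not the repo's documented output.
df = pd.read_csv("captured_returns_sv.csv", parse_dates=["time"])

# Sum across tickers per day, then compare the model against always-long.
daily = df.groupby("time")[["returns", "captured_returns"]].sum()
daily.cumsum().plot()
plt.ylabel("cumulative return")
plt.legend(["always long (returns)", "TFT (captured_returns)"])
plt.show()
```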

@makovez

makovez commented Mar 12, 2023

Is this normal? Does anyone have any idea what the issue is?

@aicheung

aicheung commented Mar 12, 2023 via email

@makovez

makovez commented Mar 12, 2023

I haven't made any changes to the code though, I just ran the commands shown in the README. So I also used the same dataset. Then I created the features and ran this command:

python.exe -m examples.run_dmn_experiment TFT

I haven't changed anything inside the code.

@aicheung

aicheung commented Mar 12, 2023 via email

@makovez

makovez commented Mar 12, 2023

No I haven't, because it takes a lot of time. But even without CPD, the paper shows some results without the module, and they are not comparable to what I am getting. Most of the time I get positions over 0.9 or under -0.9.

@aicheung

aicheung commented Mar 12, 2023 via email

@makovez

makovez commented Mar 12, 2023

A Sharpe ratio of 1-2 over what period?

@makovez

makovez commented Mar 12, 2023

For example, for 2020-2021 I also get a Sharpe ratio of 1.77 looking at result.json, but it is still worse than the baseline, meaning it is doing worse than the always-fully-long position. Looking back at the captured_returns file for 2020-2021 with TFT without the CPD module, the position size varies, and most positions are not over 0.9 or under -0.9. But for the other periods, what I said previously still holds.

[image: positions, 2019-2020]

For example, this is for 2019-2020; as you can see, most of the time positions are either over 0.9 or under -0.9. I don't think this is a good sign.
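A quick way to quantify that saturation, assuming a captured_returns CSV with a position column (the path and column name are assumptions):

```python
import pandas as pd

df = pd.read_csv("captured_returns_sv.csv")  # path and "position" column assumed

# Fraction of rows where the model is near a fully long/short position.
saturated = (df["position"].abs() > 0.9).mean()
print(f"{saturated:.1%} of rows have |position| > 0.9")
print(df["position"].describe())
```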

@aicheung

aicheung commented Mar 12, 2023 via email

@makovez

makovez commented Mar 13, 2023

[image: cumulative returns comparison]

I understand what you are saying, but I still think that if the strategy does this, it is losing 1/4 of the potential gain, and that is not good. The strategy should be at least as good as buy-and-hold returns in a bull market, while performing best in a bear market.

If it loses the 1/4 of potential gain that could have been made by just buying and holding, but then avoids losses when there is a bear market, overall it would still be about the same as buy-and-hold. Do you agree with me?
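As a rough worked illustration of that trade-off (numbers invented): suppose buy-and-hold gains 40% in a bull year while the strategy captures only 30% (losing a quarter of the upside), and in the following bear year buy-and-hold loses 20% while the strategy stays flat. Compounded, buy-and-hold ends at 1.40 × 0.80 = 1.12 while the strategy ends at 1.30 × 1.00 = 1.30, so avoiding the drawdown can more than offset the foregone gain.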

@aicheung

aicheung commented Mar 13, 2023 via email

@danbo6

danbo6 commented Mar 5, 2024

I ran the model using Kieran's data (5x per @replacementAI and as specified in the paper) and can confirm the performance. I've also run it on day, day-part, hour and five-minute equity prices. It finds signal w/ this relatively noisy data, though (a) the signal declines as the time series frequency increases, and (b) the importance of CPD breaks down. Regarding CPD, it takes a while to run, w/ limited upside from using a GPU. That's either due to a configuration issue (getting the gpflow libs right is challenging) or the sequential nature of the problem. At 3.7 GHz it will do a ticker in 60-ish minutes per CPU on the default data set. As you move to higher-frequency data, the time required blows up as the value breaks down.

Hi @MickyDowns, it's nice to see that you can confirm the performance. I have a question about how 'captured_returns' is calculated in the code; please see https://github.com/kieranjwood/trading-momentum-transformer/blob/master/mom_trans/deep_momentum_network.py#L484, where 'returns' is a volatility-scaled return. Should we use the real daily return? I understand the training objective is to optimize the Sharpe ratio, which is calculated using volatility-scaled returns, but when we want to get the actual return, I think we need to use the actual daily return.

Hi @aicheung, do you have any thoughts on my comments above? Thanks!
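For what it's worth, a minimal sketch of the two computations being contrasted, assuming a DataFrame with position, daily_return, and an ex-ante volatility estimate daily_vol (the path and column names are assumptions, not the repo's actual schema):

```python
import pandas as pd

VOL_TARGET = 0.15  # 15% target volatility, per the paper

df = pd.read_csv("captured_returns_sv.csv")  # path and columns assumed

# Paper-style captured return: volatility scaling is part of the position
# sizing, i.e. X_t * (sigma_tgt / sigma_t) * r_{t+1}.
df["scaled_captured"] = df["position"] * (VOL_TARGET / df["daily_vol"]) * df["daily_return"]

# "Raw" captured return, using the unscaled daily return: X_t * r_{t+1}.
df["raw_captured"] = df["position"] * df["daily_return"]

print(df[["scaled_captured", "raw_captured"]].sum())
```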

@danbo6

danbo6 commented Mar 9, 2024

Great work sharing these methods. I followed the validation steps in the order listed in the README; the calculated results are below: [image] They are very different from the results in the paper: [image]

I am wondering what the reason is. Thank you for your help.

Same here. Did you find out the reason? You can add me @danzb0 on Telegram to have more discussions :)
