Add a SQL support documentation #522
base: master
Conversation
Codecov Report

```
@@           Coverage Diff           @@
##           master     #522   +/-   ##
=======================================
  Coverage   47.81%   47.81%
=======================================
  Files         125      125
  Lines       12376    12376
=======================================
  Hits         5918     5918
  Misses       5959     5959
  Partials      499      499
```
@@ -284,6 +284,9 @@ marketstore connect --url <address>

and run commands through the sql session.

### SQL Support
Could you add the following to README.ja.md? 🙏

> ### SQL Support
> [SQL support](./docs/sql-support/sql-support.md) を参照してください。

(The Japanese line means "Please see [SQL support](./docs/sql-support/sql-support.md).")
```
SELECT [ ALL | DISTINCT ] select_expr [, ...]
  [ FROM data_directory [, ...] ]
```
[imo] `data_directory` -> `bucket_name` might be better, because normal marketstore users (= SQL users) don't want to care where the data resides inside marketstore; they just specify a bucket name when they use SQL.
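To back this point up: the on-disk location is derivable from the bucket name anyway, so exposing it buys the SQL user nothing. A non-authoritative sketch, assuming the illustrative convention that a bucket name of the form `Symbol/Timeframe/AttributeGroup` maps to a sub-directory of that shape under `rootDirectory` (the exact layout is an implementation detail):

```python
from pathlib import PurePosixPath

def bucket_to_path(root_directory: str, bucket_name: str) -> str:
    """Map a bucket name like 'gdax_BTC-USD/1D/OHLCV' to its sub-directory
    under rootDirectory (illustrative layout, not marketstore source code)."""
    symbol, timeframe, attribute_group = bucket_name.split("/")
    return str(PurePosixPath(root_directory) / symbol / timeframe / attribute_group)

print(bucket_to_path("/project/data", "gdax_BTC-USD/1D/OHLCV"))
# -> /project/data/gdax_BTC-USD/1D/OHLCV
```

Since this mapping is mechanical, documenting `bucket_name` alone keeps the SQL docs independent of the storage layout.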
## SELECT Statements

```
SELECT [ ALL | DISTINCT ] select_expr [, ...]
```
Hmm, I'm not sure whether `[ ALL | DISTINCT ]` is actually supported or not...
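For reference while this is being checked: in standard SQL, `ALL` (the default) keeps every result row and `DISTINCT` drops duplicates. A plain-Python sketch of the `DISTINCT` semantics in question, with made-up rows (not marketstore code):

```python
def distinct(rows):
    """Keep the first occurrence of each row, as SELECT DISTINCT would."""
    seen, out = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out

rows = [("AAPL", 1), ("MSFT", 2), ("AAPL", 1)]
print(distinct(rows))  # -> [('AAPL', 1), ('MSFT', 2)]
```

If marketstore does not implement this, the `[ ALL | DISTINCT ]` clause should probably be dropped from the documented grammar rather than left as standard-SQL boilerplate.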
#### Where `data_directory` is a sub-directory pointing to the `rootDirectory` of your configuration file
I think an example SQL statement for each piece of syntax would be helpful for readers:

```sql
-- select all columns from a bucket
» SELECT * FROM `gdax_BTC-USD/1D/OHLCV`
-- select some columns from a bucket
» SELECT Epoch, Open, High FROM `gdax_BTC-USD/1D/OHLCV`
-- WHERE ... BETWEEN
» SELECT * FROM `gdax_BTC-USD/1D/OHLCV` WHERE Epoch BETWEEN '2021-09-10-12:30' AND '2021-10-26-08:40';
-- LIMIT
» SELECT * FROM `gdax_BTC-USD/1D/OHLCV` LIMIT 3;
```
```sql
SELECT * FROM `gdax_BTC-USD/1D/OHLCV`; -- escape dashes by wrapping the bucket name with backticks
```

#### Aggregate functions supported are the following:
Some explanation about the aggregate functions would be helpful! Please refer to this design doc:
https://github.com/alpacahq/marketstore/blob/6eb74814a8feb4141e164d57b240dec4e530491b/docs/design/executor_design.txt
#### Aggregate functions supported are the following:
- COUNT
Some explanation and example usage for each aggregate function might be helpful for readers. I wrote examples for each aggfunc here just now!

##### CANDLECANDLER
CandleCandler converts the time range of candle data.

```sql
-- converts 1-Min candle data to 4-hour candle data
» SELECT candlecandler('4H', Open, High, Low, Close) FROM `TEST/1Min/OHLCV`;
```

##### TICKCANDLER
TickCandler converts tick data into candlestick data and returns it.

```sql
-- converts tick data to 2-hour candlestick data
» SELECT tickcandler('2H', Ask) FROM `TEST/1Min/Tick`;
```

##### MAX
MAX retrieves only the maximum value of the specified column.

```sql
» SELECT MAX(Ask) FROM `TEST/1Min/Tick`;
```

##### MIN
MIN retrieves only the minimum value of the specified column.

```sql
» SELECT MIN(Bid) FROM `TEST/1Min/Tick`;
```

##### GAP
Gap returns a record only when the time difference between adjacent records exceeds the specified time range.

```sql
» SELECT Gap('10Sec') FROM `TEST/1Min/Tick`;
```

##### COUNT
Count returns the number of records that match the condition.

```sql
» SELECT COUNT(*) FROM `TEST/1Min/Tick` WHERE Epoch BETWEEN '2017-01-01-00:30:00' AND '2017-01-01-02:30:00';
```

For more specific usages in code, please see our integration tests:
- https://github.com/alpacahq/marketstore/blob/654151cb4e028a300f4f74c6d0fe9132531491da/tests/integ/tests/test_aggcandler.py
- https://github.com/alpacahq/marketstore/blob/654151cb4e028a300f4f74c6d0fe9132531491da/tests/integ/tests/test_basic_aggfunc.py
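The intended semantics of these functions can also be cross-checked against pandas equivalents. This is only a sketch with made-up 1-minute data (the `TEST/...` bucket and column names are taken from the examples above); it mimics what `candlecandler`, `MAX`/`MIN`, `Gap`, and `COUNT` are described as doing, not marketstore's implementation:

```python
import pandas as pd

# Made-up 1-minute bars standing in for `TEST/1Min/OHLCV`.
idx = pd.date_range("2017-01-01 00:00", periods=8, freq="1min")
df = pd.DataFrame({"Open": range(8), "High": range(1, 9),
                   "Low": range(8), "Close": range(1, 9)}, index=idx)

# candlecandler('4H', Open, High, Low, Close): re-bucket candles into 4-hour bars.
candles = df.resample("4h").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last"})

# MAX(High) / MIN(Low): plain column aggregates.
max_high, min_low = df["High"].max(), df["Low"].min()

# Gap('10Sec'): flag records whose distance to the previous record exceeds 10s.
gaps = df.index.to_series().diff() > pd.Timedelta("10s")

# COUNT(*) ... WHERE Epoch BETWEEN a AND b: a filtered row count
# (string slicing on a DatetimeIndex is inclusive on both ends).
count = len(df.loc["2017-01-01 00:00":"2017-01-01 00:03"])

print(int(max_high), int(min_low), int(gaps.sum()), count)  # -> 8 0 7 4
```

The `"4h"` resample rule plays the role of the `'4H'` argument to `candlecandler`; writing the mapping out this way might also be a useful table for the doc.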
@@ -0,0 +1,14 @@
## INSERT Statements
Please update the doc for INSERT as well according to my review comments above 🙇
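Alongside the SQL `INSERT` syntax, it might help the doc to show how the same rows are written through pymarketstore, since that is the path many users take today. A sketch under assumptions: the bucket `TEST/1Min/Tick` is hypothetical, the server URL is the usual default, and the `write` call is left commented out because it needs a live marketstore instance:

```python
import numpy as np
# import pymarketstore as pymkts  # requires a running marketstore server

# Rows for a hypothetical `TEST/1Min/Tick` bucket: Epoch is a Unix
# timestamp in seconds; field names and dtypes must match the bucket schema.
records = np.array(
    [(1483228800, 112.10, 112.12),
     (1483228860, 112.11, 112.13)],
    dtype=[("Epoch", "i8"), ("Ask", "f4"), ("Bid", "f4")],
)

# With a live server, the write itself would be roughly:
# pymkts.Client("http://localhost:5993/rpc").write(records, "TEST/1Min/Tick")
print(records.shape, int(records["Epoch"][0]))  # -> (2,) 1483228800
```

Showing the structured-array shape next to the `INSERT` examples would make it clear that both paths feed the same schema.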
@@ -0,0 +1,10 @@
# SQL Support
[imo] Readers of this document probably want to know what SQL support is and how to use it, so how about adding some basic explanation and usage first? The technologies used can be mentioned later.
[Usage]

- CLI

```shell
$ marketstore connect --url localhost:5993
{"level":"info","timestamp":"2021-10-27T09:04:16.614+0900","msg":"Running single threaded"}
Connected to remote instance at: http://localhost:5993
Type `\help` to see command options
» SQL goes here
```

- pymarketstore

```python
>>> import numpy as np, pandas as pd, pymarketstore as pymkts
>>> cli = pymkts.Client()
>>> reply = cli.sql("SELECT * FROM `USDJPY/1Min/OHLCV` WHERE Epoch Between '2018-01-01' AND '2018-01-02';")
>>> reply.first().df()
                                 Open        High         Low       Close  Volume
Epoch
2018-01-01 16:00:00+00:00  112.579002  112.589996  112.579002  112.585999      61
2018-01-01 16:01:00+00:00  112.586998  112.597000  112.585999  112.592003     109
2018-01-01 16:02:00+00:00  112.593002  112.597000  112.588997  112.589996      38
2018-01-01 16:03:00+00:00  112.592003  112.642998  112.589996  112.637001     124
2018-01-01 16:04:00+00:00  112.638000  112.657997  112.637001  112.653000     136
...                               ...         ...         ...         ...     ...
2018-01-01 23:55:00+00:00  112.666000  112.667999  112.660004  112.667999      32
2018-01-01 23:56:00+00:00  112.668999  112.681999  112.667000  112.672997      46
2018-01-01 23:57:00+00:00  112.675003  112.678001  112.663002  112.671997      60
2018-01-01 23:58:00+00:00  112.668999  112.677002  112.664001  112.667000      49
2018-01-01 23:59:00+00:00  112.666000  112.667000  112.657997  112.663002      48

[464 rows x 5 columns]
```