
universe SP500 load data error from 2016-01-03 to 2017-01-03: AttributeError: 'Index' object has no attribute 'normalize' #190

Open
arisliang opened this issue Mar 17, 2021 · 3 comments


@arisliang

When loading data for universe SP 500 from 2016-01-03 to 2017-01-03, it raises 'Index' object has no attribute 'normalize':

```
C:\Users\arisl\anaconda3\envs\zipline-trader\python.exe C:/src/lycn/conda-environments/zipline-trader/src/zipline-trader/zipline/data/bundles/alpaca_api.py
C:/src/lycn/conda-environments/zipline-trader/src/zipline-trader/zipline/data/bundles/alpaca_api.py:276: UserWarning: Overwriting bundle with name 'alpaca_api'
  def api_to_bundle(interval=['1m']):
C:/src/lycn/conda-environments/zipline-trader/src/zipline-trader/zipline/data/bundles/alpaca_api.py:355: UserWarning: Overwriting bundle with name 'alpaca_api'
  end_session=end_date
Traceback (most recent call last):
  File "C:/src/lycn/conda-environments/zipline-trader/src/zipline-trader/zipline/data/bundles/alpaca_api.py", line 363, in
    show_progress=True,
  File "C:\src\lycn\conda-environments\zipline-trader\src\zipline-trader\zipline\data\bundles\core.py", line 513, in ingest
    pth.data_path([name, timestr], environ=environ),
  File "C:/src/lycn/conda-environments/zipline-trader/src/zipline-trader/zipline/data/bundles/alpaca_api.py", line 306, in ingest
    daily_bar_writer.write(daily_data_generator(), assets=assets_to_sids.values(), show_progress=True)
  File "C:\src\lycn\conda-environments\zipline-trader\src\zipline-trader\zipline\data\psql_daily_bars.py", line 617, in write
    return self._write_internal(it, assets)
  File "C:\src\lycn\conda-environments\zipline-trader\src\zipline-trader\zipline\data\psql_daily_bars.py", line 665, in _write_internal
    for asset_id, table in iterator:
  File "C:\src\lycn\conda-environments\zipline-trader\src\zipline-trader\zipline\data\psql_daily_bars.py", line 658, in iterator
    for asset_id, table in iterator:
  File "C:\Users\arisl\anaconda3\envs\zipline-trader\lib\site-packages\click_termui_impl.py", line 315, in generator
    for rv in self.iter:
  File "C:\src\lycn\conda-environments\zipline-trader\src\zipline-trader\zipline\data\psql_daily_bars.py", line 609, in
    for sid, df in data
  File "C:\src\lycn\conda-environments\zipline-trader\src\zipline-trader\zipline\data\psql_daily_bars.py", line 716, in _write_to_postgres
    result = self._format_df_columns_and_index(data, sid)
  File "C:\src\lycn\conda-environments\zipline-trader\src\zipline-trader\zipline\data\psql_daily_bars.py", line 806, in _format_df_columns_and_index
    data.index = data.index.normalize()
AttributeError: 'Index' object has no attribute 'normalize'
```

@shlomiku
Owner

Hi,
you should check your pandas version.

a pandas DatetimeIndex has a normalize method.

do you know what symbol it happened with? maybe the data is bad for that asset

@charlienewey

I am experiencing this issue too. It's happening while trying to load the last 4 years' worth of S&P 500 data.

@shlomikushchi the AttributeError is raised on an Index, not a DatetimeIndex. I am wondering if there is some point in the code where a DatetimeIndex should be assigned but isn't.
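The distinction can be reproduced in isolation (a minimal sketch with made-up dates, not the project's actual data):

```python
import pandas as pd

# A DatetimeIndex supports .normalize(), which floors each timestamp to midnight.
dt_idx = pd.DatetimeIndex(["2016-01-04 09:30", "2016-01-05 16:00"])
print(dt_idx.normalize())

# A plain Index holding the same strings has no .normalize() method,
# which is exactly the AttributeError shown in the traceback.
plain_idx = pd.Index(["2016-01-04 09:30", "2016-01-05 16:00"])
print(hasattr(plain_idx, "normalize"))  # False
```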

@fimmugit

fimmugit commented Nov 27, 2021

I also encountered this problem when ingesting ALL stocks. For a smaller set of stocks, e.g. the SP500, it runs fine and takes a couple of minutes.

It turns out that the problem happens when the data index is of type pandas.core.indexes.base.Index. So to solve the problem, I changed line 806 in zipline-trader/zipline/data/psql_daily_bars.py:

from

```python
data.index = data.index.normalize()
```

to

```python
if isinstance(data.index, pd.core.indexes.base.Index):
    data.index = pd.to_datetime(data.index, utc=True)
data.index = data.index.normalize()
```

It seems to work.

Please note that ingesting ALL stocks takes a long time (about 1:30") even for one year's worth of data. I am not sure whether more of the time goes to downloading data from Alpaca (which I suspect is likely, due to Alpaca's rate limit) or to populating the postgres database. I subscribe to other data providers and may try downloading data as text files and populating postgres from them; in my previous experience, downloading from those providers takes a reasonable amount of time.

One other thing worth considering is pre-scanning the universe with some criteria to reduce the number of stocks to be ingested.
