HTTP Error 429 when fetching tweets #262
Comments
I have the same problem with this and all other non-API tweet scrapers at the moment. You can collect about 14,000 tweets before hitting the request limit.
Same problem here. Do you happen to know how long it takes for that number to reset? @rbkhb
Haven't figured that out, no.
I have the same problem and can confirm the 14,000-tweet limit. I was able to retry after a couple of minutes (5 or less); I still need to check the exact time.
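Since a few minutes of waiting appears to reset the limit, a simple reactive workaround is to retry the same query after a pause. Below is a minimal sketch; the helper name, the five-minute delay, and the broad exception handling are my assumptions (some versions of GetOldTweets3 print the HTTP error and then call sys.exit(), which is why SystemExit is caught as well):

```python
import time

import GetOldTweets3 as got


def get_tweets_with_retry(criteria, retry_delay=300, max_retries=5):
    """Run a query and retry after a pause if the rate limit (HTTP 429) is hit."""
    for attempt in range(max_retries):
        try:
            return got.manager.TweetManager.getTweets(criteria)
        except (Exception, SystemExit) as err:
            # GetOldTweets3 may exit on HTTP errors, so SystemExit is handled too
            print(f'Query failed ({err}); sleeping {retry_delay}s before retry {attempt + 1}')
            time.sleep(retry_delay)
    raise RuntimeError('Still rate limited after all retries')


criteria = (got.manager.TweetCriteria()
            .setQuerySearch('Coronavirus')
            .setSince('2020-02-29')
            .setUntil('2020-03-01')
            .setMaxTweets(1000))
tweets = get_tweets_with_retry(criteria)
```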
I found a solution. It's not ideal, but it works; maybe you can help me make it better:

```python
import datetime
import json
import time

import GetOldTweets3 as got

count = 10000    # maximum number of tweets per day (pick whatever you need)
tweet_list = []  # accumulates tweets across all days

# Date to start from
date_start = datetime.datetime(2020, 2, 29)
date_until = datetime.datetime(2020, 3, 1)
start_string = date_start.strftime("%Y-%m-%d")
until_string = date_until.strftime("%Y-%m-%d")

for i in range(29):
    # Create a custom search term and define the number of tweets
    tweetCriteria = got.manager.TweetCriteria().setQuerySearch(
        'Coronavirus').setSince(start_string).setUntil(until_string).setLang('it').setMaxTweets(count)
    # Call getTweets and save the result in tweets
    print('--- Starting query... ---')
    tweets = got.manager.TweetManager.getTweets(tweetCriteria)
    print('--- Adding to list... ---')
    # (originally a separate add_to_list() helper; here the tweet text is stored directly)
    tweet_list.extend(tweet.text for tweet in tweets)
    print('--- Writing JSON... ---')
    # Save the list to a JSON file
    with open('./JSON/saver_output.json', 'w') as f:
        json.dump(tweet_list, f)
    print('--- Going to sleep... ---\n\n')
    time.sleep(60 * 5)
    # Move both dates forward by one day after each pass
    date_start += datetime.timedelta(days=1)
    date_until += datetime.timedelta(days=1)
    # Convert dates to strings
    start_string = date_start.strftime("%Y-%m-%d")
    until_string = date_until.strftime("%Y-%m-%d")
```

Doing it like this I was able to retrieve almost 120k tweets overnight without any hiccups. I know the code could be much shorter, but I wrote it just before going to bed.
Hi, I think I found a solution to get more than 14,000 tweets per day with a small change in the package itself. You only have to add a sleep after 14,000 tweets. In combination with a loop over the dates and proxy rotation, this works very well for me.
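For reference, here is a caller-side sketch of that idea: looping over days and rotating proxies between queries. The proxy addresses are placeholders, the pause length is a guess, and it assumes the installed GetOldTweets3 version's TweetManager.getTweets accepts a proxy keyword argument (a host:port string):

```python
import itertools
import time

import GetOldTweets3 as got

# Placeholder proxies -- replace with your own provider's host:port entries
PROXIES = ['203.0.113.10:8080', '203.0.113.11:8080', '203.0.113.12:8080']
proxy_cycle = itertools.cycle(PROXIES)

# One (since, until) pair per day of February 2020
days = [('2020-02-%02d' % d, '2020-02-%02d' % (d + 1)) for d in range(1, 29)]

all_tweets = []
for since, until in days:
    criteria = (got.manager.TweetCriteria()
                .setQuerySearch('Coronavirus')
                .setSince(since)
                .setUntil(until)
                .setMaxTweets(14000))
    # Assumption: getTweets accepts a proxy= keyword in this version of the package
    tweets = got.manager.TweetManager.getTweets(criteria, proxy=next(proxy_cycle))
    all_tweets.extend(tweets)
    time.sleep(60)  # short pause between days; tune to taste
```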
Hey @p-dre, that's a nice solution. However, I've encountered another problem: what if a given search query exceeds the 14k limit on a single day?
Can you please share how you use proxies and which proxy provider you use?
@erno98 If you look inside the package you will find a loop over the batches. I added a sleep after 14,000 tweets.
Hi, could you share your code with me? I really want to know how to set up a sleep after 14,000 tweets. I have just started programming, many thanks!
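A similar effect can be had without editing the package, by counting tweets as they arrive and pausing every ~14,000. This is only a sketch of that approach, not @p-dre's actual patch: it assumes the installed GetOldTweets3 exposes the receiveBuffer/bufferLength keyword arguments on TweetManager.getTweets (which pass batches of tweets to a callback), and the threshold and sleep length are guesses taken from this thread:

```python
import time

import GetOldTweets3 as got

LIMIT = 14000   # approximate number of tweets before the 429 kicks in
PAUSE = 60 * 5  # sleep five minutes once the limit is reached

collected = []
seen_since_pause = 0


def buffer_callback(batch):
    """Called for every batch of tweets; pauses once ~14,000 have arrived."""
    global seen_since_pause
    collected.extend(batch)
    seen_since_pause += len(batch)
    if seen_since_pause >= LIMIT:
        print('Hit ~%d tweets, sleeping %d seconds...' % (LIMIT, PAUSE))
        time.sleep(PAUSE)
        seen_since_pause = 0


criteria = (got.manager.TweetCriteria()
            .setQuerySearch('Coronavirus')
            .setSince('2020-02-01')
            .setUntil('2020-03-01')
            .setMaxTweets(100000))
# Assumption: receiveBuffer/bufferLength exist in this version of the package
got.manager.TweetManager.getTweets(criteria, receiveBuffer=buffer_callback, bufferLength=100)
```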
Hi, |
I have a long list of keywords (around 700). I want to fetch all of them since February, without any other criteria. I immediately get hit with "An error occured during an HTTP request: HTTP Error 429: Too Many Requests", yet when I open the given link in a browser, everything works fine.
I tried fetching one-day periods only (for example 01-02-2020 to 02-02-2020, etc.), but that still fails with the same error. Any ideas how to solve it? I tried sleeping the script after the error, but even an hour of waiting doesn't seem to help.
After some waiting, the script runs for around 10% of the tweets and then gets the error again.
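One way to combine the advice from this thread for a keyword list like that is to query one keyword and one day at a time, and back off whenever the 429 shows up. Everything here is a sketch: the keywords.txt file, the per-day tweet cap, the six-minute back-off, and the broad except clause (which also catches the SystemExit some versions of the package raise after printing the HTTP error) are all assumptions:

```python
import datetime
import json
import time

import GetOldTweets3 as got

with open('keywords.txt') as f:  # hypothetical file with one keyword per line
    keywords = [line.strip() for line in f if line.strip()]

start = datetime.date(2020, 2, 1)
days = [start + datetime.timedelta(days=i) for i in range(30)]

results = {}
for kw in keywords:
    for day in days:
        criteria = (got.manager.TweetCriteria()
                    .setQuerySearch(kw)
                    .setSince(day.strftime('%Y-%m-%d'))
                    .setUntil((day + datetime.timedelta(days=1)).strftime('%Y-%m-%d'))
                    .setMaxTweets(5000))
        while True:
            try:
                tweets = got.manager.TweetManager.getTweets(criteria)
                break
            except (Exception, SystemExit) as err:
                # Back off, then retry the same keyword/day when rate limited
                print('Error (%s) for %r on %s, sleeping 6 minutes...' % (err, kw, day))
                time.sleep(6 * 60)
        results.setdefault(kw, []).extend(t.text for t in tweets)
        with open('results.json', 'w') as out:  # checkpoint after every keyword/day
            json.dump(results, out)
```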