Loading output from twitter-intact-stream failed #7
Comments
It looks like Note, that not all the information |
Thanks for your reply. Could you please be a bit more specific about how to prepare the input for birdspotter? As I understand your answer, birdspotter cannot use the output of twitter-intact-stream directly and it has to go through twarc first, is that right? If possible, could you provide some examples?
On closer inspection; it looks like |
It looks like the
Normally, The work around at the moment would be to filter out lines that look like the above and then feed the result into I think it would be better if I'll leave this open till that is implemented. |
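The filtering workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from birdspotter or twitter-intact-stream: the thread does not show what the offending lines look like, so the sketch simply drops any line that is not a JSON object with a `text` or `full_text` key, which are the two keys that `process_tweet` reads in the traceback. The function names `keep_tweet_lines` and `filter_file` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def keep_tweet_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that parse as JSON objects carrying tweet text."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        # Assumption: a usable tweet has 'text' or 'full_text'; anything else
        # (for example a rate-limit notice) is dropped.
        if isinstance(record, dict) and ("text" in record or "full_text" in record):
            yield line


def filter_file(src_path: str, dst_path: str) -> int:
    """Copy tweet lines from src_path to dst_path and return how many were kept."""
    kept = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in keep_tweet_lines(src):
            dst.write(line + "\n")
            kept += 1
    return kept
```

Calling `filter_file` on the uncompressed crawler output (with input and output paths of your choosing) would write a cleaned `.jsonl` file, which can then be loaded with birdspotter in the same way as the original file.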
Rate limit messages are normal when using the search API, they give the number of lost tweets. You would expect them when using other Twitter API tools, so would be good if |
db93307 in the |
Hi, I used the crawler from twitter-intact-stream to collect tweets. I then uncompressed the output file, added the extension .jsonl, and loaded it with birdspotter. The following error occurred:
Extracting raw tweets: 6186it [00:03, 1872.15it/s]
Traceback (most recent call last):
File "", line 1, in
File "/home/tam/anaconda3/lib/python3.8/site-packages/birdspotter/BirdSpotter.py", line 56, in __init__
self.extractTweets(path, tweetLimit = tweetLimit, embeddings=embeddings)
File "/home/tam/anaconda3/lib/python3.8/site-packages/birdspotter/BirdSpotter.py", line 241, in extractTweets
for temp_user, temp_tweet, temp_content, temp_description, temp_cascade in itertools.chain(*map(self.process_tweet, tqdm(raw_tweets, desc="Extracting raw tweets"))):
File "/home/tam/anaconda3/lib/python3.8/site-packages/birdspotter/BirdSpotter.py", line 142, in process_tweet
temp_text = (j['text'] if 'text' in j.keys() else j['full_text'])
KeyError: 'full_text'