Suggestion: Comment limiting + Don't discard on fail. #35
Perhaps dump the comments every x comments by default, and have a parameter to customize that. For example, generate a numbered file for each dump.
This could also trivially be tied to a simple pause mechanic between dumps to throttle scraper use. Regarding more metadata, that would be best mentioned in its own ticket, but it occurs to me that it would be incredibly useful if it were made easy to scrape multiple times and essentially generate an "edit history" of the comments. I suppose that's best left for another tool, though.
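A minimal sketch of that chunked-dump-with-pause idea, assuming a callback-per-comment hook; `onComment`, `flushComments`, and the 500-comment/5-second defaults are invented for illustration and are not part of youtube-comment-scraper:

```js
const fs = require('fs');

const CHUNK_SIZE = 500;  // dump every N comments; imagined as a CLI parameter
const PAUSE_MS = 5000;   // pause between dumps to throttle scraper use

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

let buffer = [];
let chunkIndex = 0;

// Write buffered comments to a numbered file and clear the buffer, so a
// later crash cannot discard what has already been captured.
function flushComments() {
  chunkIndex += 1;
  fs.writeFileSync(`comments-${chunkIndex}.json`, JSON.stringify(buffer, null, 2));
  buffer = [];
}

// Called once per captured comment by whatever drives the scrape.
async function onComment(comment) {
  buffer.push(comment);
  if (buffer.length >= CHUNK_SIZE) {
    flushComments();
    await sleep(PAUSE_MS); // the simple pause mechanic between dumps
  }
}
```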
Is there any way yet to limit the number of captured comments?
Is there any chance of getting a follow-up on this?
I've downloaded large comment sets without seeing this problem myself. Perhaps it's me (my net connection?) or YouTube has improved/changed in some way. I can help with testing if anyone can give an example or two (even with inconsistent failures).
I was trying to get the comments for this video https://www.youtube.com/watch?v=koPmuEyP3a0 in order to extract information for a research project. After realizing on multiple occasions that I was not able to run the scraper from beginning to end, I tried to gather as many comments as possible using `--stream`, but the results are very unstructured. The scraper crashes with an unknown error. Thank you for your help!
I have it running on that URL and so far it hasn't crashed. I wonder if there is a limitation with the brute-force nature of the youtube-comment-scraper code running with Node's default memory limits. I'll let the process continue and report back later. For reference, my command is:
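Roughly the following, with the video ID from this thread; the exact form is an assumption rather than a verbatim record:

```sh
# Assumed form: positional video ID, output redirected to a file.
youtube-comment-scraper koPmuEyP3a0 > comments.json
```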
Maybe my choice of setup is a factor.
Wouldn't you know it, my hunch was right. It did end up crashing, and there is a mention of memory garbage collection. I don't know that this will help the youtube-comment-scraper author directly, but it does give me a hint as to how I might fumble around on my own. Possibly relevant crash output:
I experimented with some parameters, to no avail:
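Presumably along these lines, raising V8's old-space heap limit, since a mark-sweep mention in a Node crash usually points at heap exhaustion. The 8192 MB value and the `$(which ...)` wrapper are illustrative assumptions, not the commenter's actual invocation:

```sh
# Run the CLI's entry script under node with a larger V8 heap (value in MB, illustrative).
node --max-old-space-size=8192 "$(which youtube-comment-scraper)" koPmuEyP3a0
```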
Thank you for your effort! I was getting similar results while increasing `max-old-space-size`.
I checked the Node.js issue tracker for "mark-sweep" and there might be some related items in there; I don't know. I do want to see this issue worked around, but at this point I'm in way over my head. Hopefully @philbot9 will understand things better.
Using the `--stream` option appears to be the workaround for now. If there are further issues regarding the unknown error, they are best tracked in a separate ticket.
I have created Investigate "unknown error" on koPmuEyP3a0 for continued efforts on that video.
@ftomasin I was able to use another download tool for your video of interest: https://github.com/egbertbouman/youtube-comment-downloader
The resulting file is here: https://mega.nz/file/PVQVmSwD#sjIg_cPIBBZHeb6_FOOCVyOrJGvncm5B5fQql5kyfz4
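For anyone following along, that alternative tool is a Python CLI; a typical invocation (flags as documented in that project's README, so double-check with `--help`) looks like:

```sh
# Download all comments for the video in this thread to a JSON file.
youtube-comment-downloader --youtubeid koPmuEyP3a0 --output koPmuEyP3a0.json
```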
Because videos with more than 20,000 comments tend to overwhelm the online comment scraper, leading to a failure, I have two suggestions:
1. Comment limiting: allow the user to cap the number of comments captured.
2. Don't discard on fail: when a capture fails partway through, keep the comments collected so far instead of throwing them away.
_____
Don't forget: the CSV and JSON files should contain the total number of comments the video has.
This will allow the user to see the total comment count for a video even from a comment file that does not contain every comment (due to manual limiting or a failed capture).
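One possible shape for that metadata in the JSON output; the field names and counts are purely illustrative, not the scraper's actual schema:

```json
{
  "videoId": "koPmuEyP3a0",
  "totalCommentCount": 23741,
  "capturedCommentCount": 20000,
  "comments": []
}
```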
More metadata: