-
-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add possibility to reanalyze pending comments #374
base: master
Are you sure you want to change the base?
Conversation
I am not sure what happens when a few thousand comments are pending, like in #281 😬 Do we need some form of queueing for such cases? |
I testet the feature with 29.500 pending comments (thanks @2ndkauboy for the SQL export), and on my local test environment it processed close to 6.000 before a 504 error. So, it needs some sort of background processing and run the comments in batches. |
I added Action Scheduler (https://github.com/woocommerce/action-scheduler) for the background processing. Currently, batches of 500 comments are reanalyzed, before the next action is created for the next batch. That takes some time when we have so many pending comments like in my testcase, but using a low value here makes sure it should run on most servers (I did not test how long the script runs with We could also add a filter to let the user raise or lower the value. Edit: I added Action Scheduler via Composer. It is not included in the Composer autoloader file, that is the reason why I direclty load the I added that part of |
The actionscheduler can cause "redeclare" issues if it is required by more than one plugin. Maybe we should rewrite the namespace using php-scoper (see https://github.com/pfefferle/wordpress-webmention/blob/4.X/composer.json#L40) or check if the class/lib is already loaded. |
I thought that, too, but Action Scheduler already handles that: https://github.com/pluginkollektiv/antispam-bee/blob/281-bulk-analyzing/vendor/woocommerce/action-scheduler/action-scheduler.php Off topic: |
|
The latest commit adds an additional checkbox to the settings page. Currently when more than 20 comments are pending, I thought that would be a good count because it is the default count of one comments page, but of course we can increase or decrease that, or even include a filter. My first thought was to include the option as a button instead of the checkbox, but that would require JavaScript to send the button action as an AJAX request or to add a additional form element somewhere on the settings page. And if the user clicks the button without JS/AJAX, he would loose other modifications he made during that session. |
PS: for now I left the reanalyze button on the edit-comments.php page, and it is displayed as soon as comments are pending. |
|
Switched it to WP cron and it seems to work fine (running the first test currently with over 10.000 pending comments). When it finishes without issues, I will implement the rerun until no more comments are detected as spam. |
…ions Switch to GitHub Actions for tests
This PR is ready for review and testing again :) Reanalyzing reruns until no new comments are marked as spam during a run. @pfefferle I removed Action Scheduler and replaced it with the core cron functions, that’s much better, you are right. |
|
I have a friends site with 100k+ spam comments and 20k+ pending comments. >700 MB comments table ... Great for testing! So after some problems importing the DB I got it running in my local MAMP install.
The first thing that got me confused is why is this button on the published comments page too and not just on the pending comments page. It will just process the pending comments, so I think it could be confusing to show it on the other pages. Then I clicked the button and it grays out showing "Reanalyzing in progress…". Now I don't have any more visual hint what is happening. Does it work? After minutes it changed from 21.927 to 21.925. Not sure if this is the outcome or just a very slow process ... After 5 minutes more I am on 21.923. This is either tremendously slow on my local machine or we don't find much spam with the rest of the methods (look at the statistics below). Then I looked at the settings screen: I was wondering if the checkbox in the settings is clear. At the moment it shows "Reanalyzing in progress…", the checkbox is grayed out and I am not sure what happens if the checkbox is active. Can I start the re-analyzing here (if it is not started yet)? Fun fact: Our honeypot is by faaaaar the best method to find spam. Look at this data: Spam: 109.282109.228x Honeypot Unfortunately they missed to activate the local spam db setting ... Which leads to my next question: Does the re-analyze setting works with the active settings? So in my case I should have activated the local spam db setting before I've started the re-analyzing, or not? Which brings me to my final question: How can I stop the re-analyzing and re-start it with a different setting? Do I need to wait until it's done? What happens then? A notice? Just the button from the beginning? |
Hi @Zodiac1978, thanks for testing and your feedback!
Yes, you are right, that might be confusing. I will check if I can modify it to display only on the pending comments screen.
It reanalyzes 500 comments per batch before creating a new cron event, that needs to be triggered by a page load. When the button changes its state to »Reanalyzing in progress…« it is working, maybe I could try to add the count of reanalyzed comments there 🤔 I am thinking about if there is a way to directly trigger the cron event from ASB via a HTTP request after creating the event, so that there does not need to be a page load from someone, I will check that out.
Yes, you can start the reanalyzation there by checking the checkbox and saving the settings. After reanalyzation is finished, the checkbox is unchecked again. Reason for why I added it as a checkbox there was:
Yes, it works with the active settings, but you can change the settings during reanalyzation, and it should take effect from the next reanalyze batch (the next 500 comments).
Hm, currently there is no way to cancel it. Might be a good idea to add a button for that. Thinking about where to add that button in the comment list, I guess the best way would be to just remove the button from the comment list view and just have it in the settings, where we have more space. When reanalyzation is finished, nothing happens except the button state changes.
|
Final result was: 21.896 (without local spam db, just bbcode and regex. |
My last changes:
I think it is not possible to trigger the cron from the plugin, because it is a cron event itself. So, there need to be requests by users to make the reanalyzation continue after a batch finished. |
|
With the latest version I don't see the button anywhere and therefore can't see, start or stop anything. The I tried to start the re-analyzing with the checkbox. Needed some trial & error to understand, that I need to save with checked checkbox to start it. It is not very intuitive UX I think. |
Yes, I removed the button from the comments list because there is no place for the additional information there. And yes, I guess you are right that the checkbox control is not the best option there. Would it be good UX when we use a second submit button for the form there (not sure if it is possible, just thinking), so when the user clicks the button, all made changes are saved and the reanalyzation starts? Or would it be better to have a button that just starts reanalyzation without any other side effects. |
This. I think a new section above or below the settings would be best. It is not a setting, but an action item and therefore needs to be separated, I think. Without any visual representation it is really hard to tell if there something in the works and what. Maybe we could use the WooCommerce DB background upgrade method here. Displaying a notice that there is something running/has successfully ended. |
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
I started work on #281 with a button that allows the user to reanalyze all pending comments.
For now, the button lives in the row with the filter and bulk actions:
The button label is a little bit long, but I am not sure if
Reanalyze
or something like that would be clear enough.