Thank you @SHaines for taking the time to remove the offending bots/users.
Out of interest, I logged out and had a look at the registration form. Are you able to use a more complex puzzle for the captcha box instead of just a tick box?
The strangest part was seeing the captcha box already ticked when the registration page opened. If it’s doing that for everyone, that could explain why so many bots are getting through 🤷‍♂️
There are a lot of things that go on in the background of the captcha to determine if you're human.
Copy pasting from elsewhere:
The captcha JavaScript code is obfuscated behind some very clever Google processes. Furthermore, the success/failure/trust score is all worked out on a Google backend server, making it totally unknowable. All the captcha does is collect information and send it to Google.
The captcha gives you a token. At first, that token is not trusted by Google. You then click on the captcha, and a bunch of information about your browser/history/session/clicking/etc. is sent to Google to process. If it trusts you, that token becomes trusted and can be used when you submit the form. For example: you enter a username + password and you get token 112. You click submit on the registration form, and the website submits 112 to Google to check whether it is trusted. If it is, the website creates an account for you with your username + password; if it isn't, it doesn't.
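To make the website's side of that flow concrete, here is a rough sketch in Python. It assumes the form uses Google's reCAPTCHA and its documented `siteverify` endpoint, which takes `secret` and `response` form fields and returns JSON with a `success` flag. The names `register`, `http_post_form` and `accounts` are made up for illustration; this is not the forum's actual code.

```python
import json
import urllib.parse
import urllib.request

# Google's reCAPTCHA verification endpoint (assumed to be the captcha in use).
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def http_post_form(url, fields):
    """POST form-encoded fields and return the decoded JSON reply."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def register(username, password, captcha_token, secret_key, accounts,
             post=http_post_form):
    """Create the account only if Google reports the token as trusted.

    `accounts` is a hypothetical in-memory store; `post` is injectable so the
    flow can be exercised without contacting Google.
    """
    reply = post(VERIFY_URL, {"secret": secret_key, "response": captcha_token})
    if not reply.get("success", False):
        return False  # token not trusted: no account is created
    accounts[username] = password  # placeholder: a real site would hash this
    return True
```

The point is that the website never judges the click itself. It only forwards the token and acts on Google's yes/no answer.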
Broken down by the information provided to Google, I would say that the captcha has three main checks:
1) Who are you: What is your browsing history, captcha success/failure history, etc. (this is gathered from the Google cookies).
2) How legit is your environment (browser): This is the meat of the process. It sends info about what plugins are installed, your user agent, how your browser renders items, whether its rendering of a canvas element matches how that browser is expected to render it, etc.
3) How did you click the button: This is the execution time, the number of mouse/keyboard/touch actions made in the captcha iframe, and mouse movement/entry point/etc. within the iframe.
It takes all that info and gives it to some black box to process. We know there are minimum and maximum times within which you must complete it, we know that some browsers, plugins, etc. are automatically considered untrustworthy, and we know that the more history you have, the more trustworthy you are.
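Purely as an illustration of how those three known rules could combine, here is a toy sketch in Python. Everything in it is an assumption: the thresholds, the user-agent markers, the `Signals` fields and the scoring formula are invented, and the real logic on Google's side is not public.

```python
from dataclasses import dataclass

# All values below are invented for illustration only.
MIN_SECONDS = 0.5      # hypothetical: faster than this looks scripted
MAX_SECONDS = 120.0    # hypothetical: slower than this looks stale
UNTRUSTED_MARKERS = ("HeadlessChrome", "PhantomJS")  # illustrative examples


@dataclass
class Signals:
    seconds_to_click: float   # time between captcha load and the click
    user_agent: str           # browser's reported user agent
    prior_successes: int      # past captcha passes tied to the cookie


def trust_score(s: Signals) -> float:
    """Return a toy score in [0, 1]; 0.0 means rejected outright."""
    # Rule 1: the click must fall inside the allowed time window.
    if not (MIN_SECONDS <= s.seconds_to_click <= MAX_SECONDS):
        return 0.0
    # Rule 2: some environments are treated as untrustworthy automatically.
    if any(marker in s.user_agent for marker in UNTRUSTED_MARKERS):
        return 0.0
    # Rule 3: more history means more trust, capped at 1.0.
    return min(1.0, 0.3 + 0.1 * s.prior_successes)
```

A visitor with a long cookie history would score higher here than a fresh one, while anything that clicks instantly or reports a headless browser gets zero regardless of history.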
It is widely believed that some fancy machine-learning algorithms are in use in the Google backend, so that if the same bots keep using the same algorithm to generate a mouse path and click behaviour, it will start trusting them less and less.