Hey Google's Association Service bot, thank you for the 400,000+ requests for the assetlinks.json file over the last 9 hours, but we truly meant it when we said 404 - File Not Found. KThxBye. #abuse
GoogleAssociationService bot was kind enough to ask 1,000,000+ times yesterday for the same file from 4,000+ Google IP addresses. The answer was the same: 404 - File Not Found. Unlike their other bots, the User-Agent does not provide a support link.
@osm_tech The only solution I can see for all this shit is the IDP.
(And, because search-engines are so clueless about the history of the 'net: https://www.catb.org/jargon/html/I/Internet-Death-Penalty.html)
@mikro2nd
As much as I dislike Google, a lot of people & browsers still seem to be using them as a search engine...
Applying IDP to Google IP ranges would mean that nobody would be able to find #OpenStreetMap on Google Search, and would instead probably get some scammers as the first result. I don't think that is an ideal outcome.
@osm_tech
@osm_tech personally, I'd block all the #GAFAMs by their entire #ASN|s!
Fuck the crawlers; #Blackholing of their #DDoS attacks is the only feasible option!
Also send an #AbuseReport every time they try that shite, both to them and to every provider between you and them...
@osm_tech what if you explicitly ban that path in your robots.txt file?
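For reference, such a rule might look like the sketch below. The full path `/.well-known/assetlinks.json` (the usual Digital Asset Links location) and the `example.org` host are assumptions, since the thread only names the file. Nothing in the thread establishes whether GoogleAssociationService honours robots.txt at all. Python's standard `urllib.robotparser` is used here only to check that the rule parses as intended; `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. The path is an assumption: the thread
# only mentions "assetlinks.json", not where it is being requested.
ROBOTS_TXT = """\
User-agent: *
Disallow: /.well-known/assetlinks.json
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A well-behaved crawler would then skip the file, while everything else stays crawlable. A bot that ignores robots.txt is unaffected.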
@osm_tech it’s becoming clear that we need to be able to block all crawlers somehow
@osm_tech Tortious interference?
@tessarakt Likely just sloppy coding on their part. @Firefishy 's suspicion is that they queue checks, but then retry "failed" requests back into the infinitely growing queue.
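To illustrate why that suspicion would explain the volume, here is a toy model, not a description of Google's actual code. It assumes every check gets a 404 and is put straight back on the queue while new checks keep arriving; `simulate` and its parameters are made up for the illustration.

```python
from collections import deque


def simulate(ticks: int, new_per_tick: int, capacity_per_tick: int) -> list[int]:
    """Toy model: every check fails (404) and is re-queued for a retry."""
    queue = deque()
    sizes = []
    for _ in range(ticks):
        # Newly scheduled checks arrive on every tick.
        queue.extend(["check"] * new_per_tick)
        # Process as many queued checks as capacity allows.
        for _ in range(min(capacity_per_tick, len(queue))):
            job = queue.popleft()  # perform the request ...
            queue.append(job)      # ... it "fails" with 404, so retry it later
        sizes.append(len(queue))
    return sizes
```

Under these assumptions nothing ever leaves the queue, so its size only grows and the request rate stays pinned at whatever the fetchers can handle.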
@osm_tech just give the poor guy a redirect.. to a big asset.. hosted on google's servers :)
@gurkan If their bot is very hungry we could serve it our awesome weekly 148GB planet.osm file.
@osm_tech periodically crawl the logs for access to the problematic file, then rate limit or blackhole route the source address (https://nxdomain.no/~peter/forcing_the_password_gropers_through_a_smaller_hole.html should give you a general idea; it is OpenBSD-specific, but should be doable on other platforms).
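A minimal sketch of the log-crawling half of that idea, assuming the server writes common/combined-format access logs. The `offenders` helper, the regular expression, and the threshold are illustrative assumptions, and the rate-limit or blackhole step is left out because it is platform-specific.

```python
import re
from collections import Counter
from typing import Iterable

# Start of a common/combined-format log line:
# client address, two fields, [timestamp], "METHOD path PROTOCOL"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*"')


def offenders(lines: Iterable[str], path: str, threshold: int) -> list[str]:
    """Return client addresses that requested path at least threshold times."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Ignore any query string when comparing the requested path.
        if match and match.group(2).split("?", 1)[0] == path:
            hits[match.group(1)] += 1
    return sorted(ip for ip, count in hits.items() if count >= threshold)
```

Run periodically over the recent log, the returned addresses could be fed to whatever rate-limiting or blackhole mechanism the platform offers.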
Or of course the classic "redirect to www.nsa.gov" which is likely easier to implement.
@osm_tech Just a brainfart: have a "when you click this link your IP address will be blacklisted" link on the 404 page. Can anyone implement this, please?
@osm_tech have you tried responding with 301 moved permanently pointing at a nice big file on Google's network?