OpenStreetMap Ops Team

Hey Google's Association Service bot, thank you for the 400,000+ requests for the assetlinks.json file over the last 9 hours, but we truly meant it when we said 404 - File Not Found. KThxBye.

The GoogleAssociationService bot was kind enough to ask 1,000,000+ times yesterday for the same file, from 4,000+ Google IP addresses. The answer was the same: 404 - File Not Found. Unlike their other bots, this one's User-Agent does not include a support link.

@osm_tech The only solution I can see for all this shit is the IDP (Internet Death Penalty).

(And, because search-engines are so clueless about the history of the 'net: catb.org/jargon/html/I/Interne)

@mikro2nd @osm_tech
As much as I dislike Google, a lot of people & browsers still seem to use it as their search engine...

Applying the IDP to Google's IP ranges would mean that nobody could find #OpenStreetMap via Google search, and would probably get some scammers as the first result instead. I don't think that is an ideal outcome.

@mnalis @mikro2nd For now we've hardcoded a 429 - Too Many Requests response matching on UA + URL.
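For illustration, "matching on UA + URL" could look something like the sketch below. This is a standalone Python WSGI app, not the actual OSM configuration (which is presumably a web-server rule); the User-Agent substring and the path are assumptions.

```python
# Sketch: answer 429 when a specific bot requests a specific path.
# BAD_UA and BAD_PATH are assumptions for illustration only.
from wsgiref.simple_server import make_server

BAD_UA = "GoogleAssociationService"        # assumed User-Agent substring
BAD_PATH = "/.well-known/assetlinks.json"  # standard assetlinks location

def app(environ, start_response):
    ua = environ.get("HTTP_USER_AGENT", "")
    if environ.get("PATH_INFO") == BAD_PATH and BAD_UA in ua:
        # 429 with a Retry-After hint; a well-behaved client should back off.
        start_response("429 Too Many Requests", [("Retry-After", "86400")])
        return [b"Too Many Requests\n"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not Found\n"]

if __name__ == "__main__":
    make_server("", 8080, app).serve_forever()
```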

@osm_tech @mikro2nd
It would be interesting to know whether it helped... They should have stopped on the 404 too, so there is obviously buggy code involved :(

Returning a fake 200 with some dummy answer might be the next thing to try if the 429 doesn't work, as it should at least exercise a different code path.

@mnalis @mikro2nd Briefly tested 200 responses; they didn't seem to have any impact. The 429 responses are short-circuited early in the stack and have minimal load impact now. Still ~12 req/second.

@osm_tech Personally, I'd block all the #GAFAMs by their entire #ASN|s!

  • Fuck the crawlers; #Blackholing their #DDoS attacks is the only feasible option!

  • Also send an #AbuseReport every time they try that shite, both to them and to every provider between you and them...

@osm_tech what if you explicitly ban that path in your robots.txt file?
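If the path in question is the standard Digital Asset Links location, that robots.txt addition would presumably look like the two lines below; note, though, that a verification bot like this may not consult robots.txt at all, so this is a hopeful sketch rather than a known fix.

```
User-agent: *
Disallow: /.well-known/assetlinks.json
```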

@osm_tech it’s becoming clear that we need to be able to block all crawlers somehow

@tessarakt Likely just sloppy coding on their part. @Firefishy's suspicion is that they queue checks, but then retry "failed" requests back into an infinitely growing queue.

@osm_tech just give the poor guy a redirect.. to a big asset.. hosted on Google's servers :)

@gurkan If their bot is very hungry we could serve it our awesome weekly 148GB planet.osm file. 😈😜

@osm_tech @gurkan fighting fire with fire seems not entirely unreasonable

@osm_tech Periodically crawl the logs for access to the problematic file, then rate-limit or blackhole-route the source addresses (nxdomain.no/~peter/forcing_the should give you the general idea; it's OpenBSD-specific, but should be doable on other platforms; see the sketch below).

Or, of course, the classic "redirect to www.nsa.gov", which is likely easier to implement.

nxdomain.no: Forcing the password gropers through a smaller hole with OpenBSD's PF queues
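A rough Linux equivalent of that log-crawling recipe, sketched in Python: scan the access log for clients hammering the file, and blackhole-route repeat offenders with iproute2 ("ip route add blackhole") in place of OpenBSD's PF. The log path, log format, and threshold are assumptions.

```python
# Sketch: blackhole-route IPs that hammer assetlinks.json.
# Assumes a combined-format access log and Linux iproute2; run as root.
import re
import subprocess
from collections import Counter

LOG = "/var/log/nginx/access.log"           # assumed log location
TARGET = "/.well-known/assetlinks.json"     # assumed request path
THRESHOLD = 1000                            # requests before blackholing

hits = Counter()
line_re = re.compile(r'^(\S+) .* "(?:GET|HEAD) (\S+)')

with open(LOG) as f:
    for line in f:
        m = line_re.match(line)
        if m and m.group(2).startswith(TARGET):
            hits[m.group(1)] += 1

for ip, count in hits.items():
    if count >= THRESHOLD:
        # check=False: ignore failures such as an already-existing route.
        subprocess.run(["ip", "route", "add", "blackhole", ip], check=False)
```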

@osm_tech Just a brainfart: on the 404 page, have a "when you click this link, your IP address will be blacklisted" link. Can anyone implement this, please?

@osm_tech have you tried responding with a 301 Moved Permanently pointing at a nice big file on Google's network?
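That redirect gag (also suggested a few posts up) would be a one-line rule in most web servers; as a toy standalone Python version, with the redirect target purely illustrative:

```python
# Toy sketch of the "301 the bot at a big file elsewhere" joke.
# The Location URL below is purely illustrative.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Redirect(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/.well-known/assetlinks.json":
            self.send_response(301)
            self.send_header("Location", "https://example.com/very-large-file.bin")
            self.end_headers()
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("", 8080), Redirect).serve_forever()
```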