en.osm.town is one of the many independent Mastodon servers you can use to participate in the fediverse.
An independent community of OpenStreetMap people on the Fediverse/Mastodon. Funding graciously provided by the OpenStreetMap Foundation.

Server stats: 269 active users

#ai

522 posts · 446 participants · 51 posts today

AI bots strain Wikimedia as bandwidth surges 50%

Automated #AI bots seeking training data threaten Wikipedia project stability, foundation says.

arstechnica.com/information-te

Making the situation more difficult, many AI-focused crawlers do not play by established rules. Some ignore robots.txt directives. Others spoof browser user agents to disguise themselves as human visitors. Some even rotate through residential IP addresses to avoid blocking, tactics that have become common enough to force individual developers like Xe Iaso to adopt drastic protective measures for their code repositories.

This leaves Wikimedia’s Site Reliability team in a perpetual state of defense. Every hour spent rate-limiting bots or mitigating traffic surges is time not spent supporting Wikimedia’s contributors, users, or technical improvements. And it’s not just content platforms under strain. Developer infrastructure, like Wikimedia’s code review tools and bug trackers, is also frequently hit by scrapers, further diverting attention and resources.
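For contrast, the "established rules" mentioned above start with robots.txt: a well-behaved crawler fetches it first and skips anything it disallows. A minimal sketch using only Python's standard library; the bot name is a made-up placeholder:

```python
import urllib.robotparser
import urllib.request

USER_AGENT = "ExampleResearchBot/1.0"  # hypothetical, honestly-labeled bot

# Fetch and parse the site's robots.txt before crawling anything.
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://en.wikipedia.org/robots.txt")
rp.read()

url = "https://en.wikipedia.org/wiki/OpenStreetMap"
if rp.can_fetch(USER_AGENT, url):
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req) as resp:
        page = resp.read()
    print(f"fetched {len(page)} bytes")
else:
    print("disallowed by robots.txt, skipping:", url)
```

The crawlers described in the article skip exactly this check, or defeat it by lying in the User-Agent header.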

Ars Technica · AI bots strain Wikimedia as bandwidth surges 50%, by Benj Edwards

How can one engage in algorithmic sabotage to poison "AI" scrapers looking for images when one is running a static website? Thanks to @pengfold, I've implemented a quick and easy way for my own blog:

tzovar.as/algorithmic-sabotage

Also thanks to @rostro & @asrg for the pointers and discussion!
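The linked post has the details; as one illustration of the general idea (not necessarily the exact method used there), a hypothetical build step could tile-shuffle every image with Pillow, so scrapers fetching the raw files get a scrambled mess, while the page can restore the original for human visitors since the permutation is derived from a known seed:

```python
# Hypothetical build step: scramble images by shuffling fixed-size tiles.
# Pillow (pip install Pillow) assumed; filenames and tile size are made up.
import random
from PIL import Image

def shuffle_tiles(src, dst, tile=64, seed=42):
    img = Image.open(src).convert("RGB")
    w, h = img.size
    # Split the image into a grid of (at most) tile x tile boxes.
    boxes = [(x, y, min(x + tile, w), min(y + tile, h))
             for y in range(0, h, tile) for x in range(0, w, tile)]
    tiles = [img.crop(b) for b in boxes]
    # Fixed seed: the permutation is deterministic, hence invertible later.
    random.Random(seed).shuffle(tiles)
    out = Image.new("RGB", (w, h))
    for box, t in zip(boxes, tiles):
        w_box, h_box = box[2] - box[0], box[3] - box[1]
        out.paste(t.resize((w_box, h_box)), box[:2])
    out.save(dst)

shuffle_tiles("photo.jpg", "photo-scrambled.jpg")
```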

Image: what used to be an image is, after some sabotage, all a wobbly mess.
Bastian Greshake Tzovaras · Algorithmic sabotage for static sites II: Images

#Microsoft used its #AI-powered #SecurityCopilot to discover 20 previously unknown vulnerabilities in the #GRUB2, #UBoot, and #Barebox #opensource #bootloaders.
GRUB2 (GRand Unified Bootloader) is the default boot loader for most #Linux distributions, including Ubuntu, while U-Boot and Barebox are commonly used in embedded and #IoT devices.
bleepingcomputer.com/news/secu #ITSec


First "production test" successful 💪 ... after band-aid "deployment" (IOW, scp binaries to the prod jail).

#swad integrates with #nginx exactly as I planned it. And #PAM authentication using a child process running as root also just works (while the main process dropped privileges). 🥳
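That pattern, forking a small root-owned helper before the main process drops privileges, looks roughly like the following. This is a minimal Python sketch of the idea, not swad's actual C code, and check_password stands in for the real PAM conversation:

```python
# Minimal sketch of the privilege-separation pattern described above,
# not swad's actual code. Assumes the process starts as root.
import os
import socket

def check_password(user, password):
    # Stand-in for the real PAM conversation; this is the part that
    # needs root (e.g. pam_unix reading /etc/shadow).
    return False  # placeholder: deny everything

parent_sock, child_sock = socket.socketpair()

if os.fork() == 0:
    # Child: keeps root and answers authentication queries.
    parent_sock.close()
    while True:
        data = child_sock.recv(4096)
        if not data:
            break
        user, _, password = data.decode().partition("\0")
        child_sock.send(b"1" if check_password(user, password) else b"0")
    os._exit(0)

# Parent (main process): drop privileges, keep only the socket to the child.
child_sock.close()
os.setgid(65534)  # 65534 is "nobody" on many systems
os.setuid(65534)  # group first, then user, or the setgid call would fail

def authenticate(user, password):
    parent_sock.send(f"{user}\0{password}".encode())
    return parent_sock.recv(1) == b"1"
```

A compromise of the unprivileged main process then exposes only a yes/no authentication oracle, never root itself.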

So, I guess I can say goodbye to #AI #bots hammering my poor DSL connection just to download poudriere build logs.

Still a lot to do for #swad: make it nicer. So many ideas. The best start would probably be to implement more credential-checking modules besides PAM.

Oh wow, the National Novel Writing Month (NaNoWriMo) annual challenge for writers that started as a Yahoo! mailing list in 1999 is shutting down.

"NaNoWriMo lost significant community support when it took a stand in favor of the use of artificial intelligence in creative writing. [...] Around the same time, the nonprofit was also lambasted for inconsistent moderation on its all-ages forums, which created an unsafe environment for teenage writers, community members claimed."

techcrunch.com/2025/04/01/nano

TechCrunch · NaNoWriMo shut down after AI, content moderation scandals

🙌 Join us for the 7th #KITeGG summer school on 15-16 May at the HfG Schwäbisch Gmünd: Re Shape 2025 – Forum for Artificial Intelligence in Art and Design

We’ll explore AI’s evolving role in creativity, interaction, and society; question dominant AI narratives; rethink design beyond human-centered approaches; and imagine regenerative futures where technology coexists with the more-than-human world.

👀 reshapeforum.hfg-gmuend.de

Register (free):
🎫 eventbrite.de/e/re-shape-2025-

#Google #AI researchers were formerly like university researchers in this respect: They published their research when it was ready and without regard to corporate interests. For example, see the landmark 2017 paper introducing the transformer technology now in use by all major #LLM tools, including those from Google rivals.
arxiv.org/abs/1706.03762

More here.
en.wikipedia.org/wiki/Attentio

But that's changing. Google's AI researchers may now only publish their findings after an embargo and corporate approval.
arstechnica.com/ai/2025/04/dee

“'I cannot imagine us putting out the transformer papers for general use now,' said one current researcher… The new review processes [have] contributed to some departures. 'If you can’t publish, it’s a career killer if you’re a researcher,' said a former researcher.”

arXiv.org · Attention Is All You Need: the 2017 paper proposing the Transformer, an architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
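The core operation that paper introduces is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) · V; a minimal NumPy sketch of just that formula:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per Vaswani et al. 2017."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # each output row is a weighted sum of value rows

# Toy example: 3 query positions attending over 4 key/value positions.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```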