#tesseract

screenshot and OCR on the CLI with #incus and #tesseract

$ incus query /1.0/instances/buildd5dd6a3115ef/console?type=vga | tesseract - -

Output from a Windows 11 install from an ISO:
---
Estimating resolution as 161
C

Installing 42%
Please keep your computer on.

Your computer may restart a few times.

I’m still annoyed with the state of #OCR in #Linux (or #FLOSS in general). Not that the need for OCR’ing hasn’t diminished over the years, as more and more publications are already in electronic form, but every once in a while a need arises. #Tesseract’s quality is #abysmal (and not in Joey’s sense). #ABBYYFineReader used to be the best in #Windows, and once upon a time they provided a #CLI-usable OCR engine for Linux too, but not any more. #atkjuttuja #computers

#OpenSource programs I need:

1. Some sort of Apple-tags-like variant for the open source world (the best at this point is the file manager from #elementaryos, but it only supports tagging with 8 colours, no #). It would also be nice to look at automation like the Mac's #hazel or #defaultfolderx.

2. A #preview replacement: a reader for PDF and other files with most of the pro features and some sort of working #ocr (possibly a GUI for #tesseract?), preferably for #Linux and #android (the best I found was #pdfsambasic).

#DeepStash tries to be an alternative to algorithmic #SocialMedia by providing small bites of information from books, articles, and/or quotes. It has the ability to bookmark content, though it is limited for free accounts.

Alternatively, one can just screenshot the specific chunks of data and potentially build one's own filtered timeline by combining the data from the screenshots in #Anki. This can be done either directly with the images, or by quickly extracting the text via #tesseract and some #python to generate a #csv file that can be imported into Anki.
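The screenshot-to-Anki idea could look roughly like the sketch below. It is not the poster's script: the function names are hypothetical, the two-column layout (text, source file) is just one possible choice, and it assumes the third-party `pytesseract` and `Pillow` packages plus the `tesseract` binary are installed.

```python
import csv
import io
from typing import Callable, Iterable


def ocr_with_tesseract(image_path: str) -> str:
    """Run Tesseract on one image (assumes pytesseract and Pillow are installed)."""
    import pytesseract  # third-party wrapper around the tesseract binary
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def normalise(text: str) -> str:
    """Collapse line breaks and repeated spaces so each note is one line."""
    return " ".join(text.split())


def build_anki_csv(
    image_paths: Iterable[str],
    ocr: Callable[[str], str] = ocr_with_tesseract,
) -> str:
    """Return CSV text with one row per screenshot: extracted text, source file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for path in image_paths:
        text = normalise(ocr(path))
        if text:  # skip screenshots where OCR found nothing
            writer.writerow([text, path])
    return buffer.getvalue()
```

Writing the result to a file, for example `open("notes.csv", "w", newline="").write(build_anki_csv(paths))`, gives something Anki's text import can read, with the first column mapped to the note's front field.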

I treated my #homelab some #paperless today. 📝 🧾

Paperless NGX is a really amazing piece of software that you can throw documents at and it will #OCR them with #tesseract and allow you to search, tag and organize them.

It works very smoothly and it's currently processing all the 20k documents (incl. old versions) from our family :gitannex: #gitAnnex documents repo - that'll take a while 😅

The :nixos: #NixOS module is also fantastically easy to use as always. 🚀

My Album Of The Year is: "Lingua Ignota Pt. 1" by Persefone.

I chose the album as my AOTY because I was eagerly awaiting the release and not a single song on the EP disappoints! Every single one picks me up and combines the familiar Persefone sound without being stingy with refreshing new elements. I attended their live concert shortly after the releases and couldn't be happier! Great band, awesome show, nice crowd! Can't wait for Pt. 2! 😍

There were 3 more candidates for my AOTY. In order of preference:

TesseracT - War of Being
DVNE - Voidkind
VOLA - Friend of a Phantom

 
A Beginner's Guide to the #FourthDimension

Combining simple images with multi-dimensional #animation, this 6-min video illustrates basic concepts of objects in a #4thdimension. If you find this interesting, you can search for other animated videos of 4 #dimensions and higher; some are many times more complex than this one.

🔗 youtube.com/watch?v=j-ixGKZlLV 01 Jul 2016

My #tesseract #python project for automating enrichment of acta data on the #venezuela election is... maybe actually done? I might try and tweak some more bits -- namely, see if it's worth going character by character for the GUIDs and sha256 hashes -- but, for the most part, the labeling script I've got whipped up is knocking out every acta I throw at it at this point (except the absolute worst ones). It grabs the datetime, voting center info, geo info, qr code info, members of the mesa (names and citizen IDs), basically everything.

FINALLY getting some decent success and accuracy rates with #tesseract after an absurd amount of pre-processing. If you want high accuracy, you reeeeally have to baby it. But, on the other hand, stuff like this was still science fiction just 20 or so years ago, so... maybe I shouldn't be too harsh.

Now, to clean up this monstrosity of a "script" that's now clocking in at 1197 lines.
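The posts above do not say which pre-processing steps were used. One common step before handing a scan to Tesseract is binarisation, so here is a minimal pure-Python sketch of Otsu thresholding as an illustration only. The function names are hypothetical and the input is assumed to be a flat list of 8-bit grayscale values.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Pick the gray level that best separates dark and light pixels (Otsu's method)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0         # sum of gray levels of those pixels
    best_t = 0
    best_var = -1.0
    for t in range(256):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor.
        var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var = var
            best_t = t
    return best_t


def binarise(pixels: Sequence[int]) -> List[int]:
    """Map every pixel to pure black (0) or pure white (255)."""
    t = otsu_threshold(pixels)
    return [255 if p > t else 0 for p in pixels]
```

In practice this would more likely be done with an imaging library on the whole page, often together with deskewing and upscaling; the point here is only to show what one such step does.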