New blog post:
A simple, async page cache built on top of io_uring
#Glowdust now comes with a single-threaded, request-based page cache. This lets me serve pages to multiple transactions, asynchronously, without spending time optimizing a thread-safe version.
Which means I can start actually implementing the stuff I wrote about last week:
https://radiki.dev/posts/glowdust-tx-implementation-1/
TL;DR Transaction state is kept in the page cache and it's basically just another file. This opens up a lot of interesting behavior to experiment with.
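For the curious, here is a rough sketch of what a single-threaded, request-based page cache looks like (hypothetical names and types, not the actual Glowdust code): the cache owns the page table on one thread, and transactions talk to it over a channel, so no locking is needed.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

const PAGE_SIZE: usize = 4096;

// Hypothetical request type: callers ask the cache thread for a page
// and get the bytes back on a reply channel.
enum CacheRequest {
    FaultPage {
        file_id: u64,
        page_no: u64,
        reply: Sender<Vec<u8>>,
    },
}

fn run_cache(rx: Receiver<CacheRequest>) {
    // Single-threaded page table: (file, page) -> in-memory page.
    let mut pages: HashMap<(u64, u64), Vec<u8>> = HashMap::new();

    for req in rx {
        match req {
            CacheRequest::FaultPage { file_id, page_no, reply } => {
                // The real thing would issue an io_uring read here;
                // this sketch just materializes a zeroed page on first use.
                let page = pages
                    .entry((file_id, page_no))
                    .or_insert_with(|| vec![0u8; PAGE_SIZE]);
                let _ = reply.send(page.clone());
            }
        }
    }
}

fn main() {
    let (tx, rx) = channel();
    let cache = thread::spawn(move || run_cache(rx));

    // A "transaction" faulting page 0 of file 1.
    let (reply_tx, reply_rx) = channel();
    tx.send(CacheRequest::FaultPage { file_id: 1, page_no: 0, reply: reply_tx })
        .unwrap();
    let page = reply_rx.recv().unwrap();
    assert_eq!(page.len(), PAGE_SIZE);

    drop(tx); // closing the request channel shuts the cache thread down
    cache.join().unwrap();
}
```

Because transaction state lives in files managed by this same cache, "spill tx state to disk" and "fault a data page" end up being the same kind of request.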
And just like that, #Glowdust now has a page cache that can actually fault pages into memory.
From some other place in memory.
And that's all it can do. Just page faults.
But hey, progress.
2,500 words, a Princess Bride meme and my surprise at finishing it.
That's about 86% of what you need to know about my new blog post.
The remaining 14% is describing a new architecture for database transaction subsystems. In #Rust, obviously.
Do let me know what you think - especially about the backpressure design.
Because transaction state is
1) stored in the page cache and so
2) can be moved freely between threads,
transactions can be served from a priority queue by a fixed number of workers.
Thus, #Glowdust has, by design, the ability to automatically balance CPU and memory use.
Memory pressure pushes back via I/O to persist overflowing tx state and serve page faults.
CPU pressure reduces the time each worker spends per tx, which makes tx state accumulate in memory instead and releases the CPU.
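To make that concrete, here's a toy sketch of the scheduling half (names and thresholds are invented for illustration, not Glowdust code): a fixed pool of workers pulls transactions off a shared priority queue, and once a worker's in-memory tx state grows past a budget, the overflow path would go through I/O, which is where the memory-side backpressure comes from.

```rust
use std::collections::BinaryHeap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical unit of work: a transaction with a scheduling priority.
#[derive(Eq, PartialEq, Ord, PartialOrd)]
struct Tx {
    priority: u32,
    id: u64,
}

// Made-up budget for in-memory transaction state per worker.
const MEMORY_BUDGET_BYTES: usize = 1 << 20;

fn main() {
    let queue = Arc::new(Mutex::new(BinaryHeap::new()));

    // Enqueue some transactions; the highest priority pops first.
    {
        let mut q = queue.lock().unwrap();
        for id in 0..8u64 {
            q.push(Tx { priority: (id % 3) as u32, id });
        }
    }

    // Fixed number of workers: CPU use is capped by the pool size.
    let workers: Vec<_> = (0..2)
        .map(|w| {
            let queue = Arc::clone(&queue);
            thread::spawn(move || {
                let mut state_bytes = 0usize;
                loop {
                    let tx = queue.lock().unwrap().pop();
                    let Some(tx) = tx else { break };

                    // "Run" the transaction; pretend it grows some tx state.
                    state_bytes += 4096;
                    println!("worker {w}: ran tx {} (prio {})", tx.id, tx.priority);

                    if state_bytes > MEMORY_BUDGET_BYTES {
                        // Memory pressure pushes back here: spill tx state
                        // to the page cache / disk via I/O, which slows this
                        // worker down and frees memory.
                        state_bytes = 0;
                    }
                }
            })
        })
        .collect();

    for w in workers {
        w.join().unwrap();
    }
}
```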
Update: Initial transaction support has landed in #Glowdust
https://codeberg.org/glowdust/glowdust/commit/37a0b94f9846611c8b5f0433711537b678c07594
Pretty large update, and I still have some work to clean up and commit locally. But it's coming together.
The independence of this code from the language syntax is deliberate. It's all in the direction of having the frontend be pluggable so you can roll your own DBMS, language and all.
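In trait terms, a pluggable frontend could look something like this (hypothetical types, just to illustrate the direction, not the actual Glowdust interface): the engine only ever sees compiled programs, so any language that can lower to them can be bolted on.

```rust
// Hypothetical instruction set the execution engine understands;
// the real Glowdust bytecode will look different.
enum Op {
    LoadConst(i64),
    Return,
}

struct Program {
    ops: Vec<Op>,
}

// A frontend is anything that turns source text in *some* language
// into a Program. Swapping implementations swaps the query language.
trait Frontend {
    fn compile(&self, source: &str) -> Result<Program, String>;
}

// A toy frontend whose whole "language" is a single integer literal.
struct ToyFrontend;

impl Frontend for ToyFrontend {
    fn compile(&self, source: &str) -> Result<Program, String> {
        let value: i64 = source.trim().parse().map_err(|e| format!("{e}"))?;
        Ok(Program {
            ops: vec![Op::LoadConst(value), Op::Return],
        })
    }
}

fn main() {
    let frontend: Box<dyn Frontend> = Box::new(ToyFrontend);
    let program = frontend.compile("42").unwrap();
    println!("compiled {} ops", program.ops.len());
}
```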
Pro tip: If you want to step through a thread, you can pass it one end of a sync_channel(1) and use send/recv calls to block and unblock it.
You can do this with multiple threads and create a state machine for controlling synchronization and reproducing or checking for race conditions.
I am currently doing this with #Glowdust and it works pretty well for writing tests for transaction isolation.
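A minimal version of the trick, with nothing Glowdust-specific in it: the thread under test blocks on recv() at the points you care about, and the test releases it one step at a time with send().

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

fn main() {
    // Bounded channel with capacity 1: recv() blocks the worker until
    // the controller sends a token, and send() blocks the controller
    // once one token is already queued, so neither side runs far ahead.
    let (step_tx, step_rx) = sync_channel::<()>(1);

    let worker = thread::spawn(move || {
        step_rx.recv().unwrap(); // checkpoint 1: wait to be released
        println!("worker: step 1 done");

        step_rx.recv().unwrap(); // checkpoint 2: wait again
        println!("worker: step 2 done");
    });

    // The test drives the worker step by step, and can assert on
    // shared state between the sends.
    step_tx.send(()).unwrap();
    step_tx.send(()).unwrap();

    worker.join().unwrap();
}
```

With one of these per thread you can force a specific interleaving, which turns a race condition into a repeatable test.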
So #Glowdust commits stuff from different threads now, NBD.
About a month ago this seemed impossible. I still can't really believe that I got this far.
#Glowdust update: Multithreaded transaction execution works, with tx state storage in the page cache.
What remains is commit(), i.e. making per-tx state visible to other transactions.
I mean, there are a million steps after that, but for now, only commit() remains.
TheRegister on the 50th anniversary of #SQL
https://www.theregister.com/2024/05/31/fifty_years_of_sql/
50 years is a lot, but for SQL I'll make an exception. It's a fantastically well-thought-out system.
But as one of its creators says, it's not enough:
https://www.theregister.com/2024/05/10/sql_cocreator_nosql/
We need to progress beyond it and take advantage of modern computing platforms.
Have you looked at #Glowdust, perchance?
Starting the implementation of multithreaded access to the #Glowdust store.
Thoughts and prayers can be sent as replies to this toot.
First pass of performance work done on #Glowdust's key memtables - it no longer deserializes entries that are not in use.
The point of optimization work this early was to get tests running fast enough, so I can run the entire test suite as often as I like during development.
Next up, start bringing transactions into the LS store.
(and you can read all about the store layout in my latest post: https://radiki.dev/posts/glowdust-lsm-tree-1/)
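The gist of the "don't deserialize unused entries" change, as a hedged sketch (hypothetical types, using serde with the derive feature plus bincode 1.x purely for illustration): keep the raw bytes around and only decode an entry the first time something actually reads it.

```rust
use serde::{Deserialize, Serialize};

// Hypothetical memtable entry payload.
#[derive(Serialize, Deserialize, Debug)]
struct Entry {
    key: String,
    value: i64,
}

// Keep the serialized bytes; decode lazily on first access.
struct LazyEntry {
    raw: Vec<u8>,
    decoded: Option<Entry>,
}

impl LazyEntry {
    fn new(raw: Vec<u8>) -> Self {
        LazyEntry { raw, decoded: None }
    }

    fn get(&mut self) -> &Entry {
        if self.decoded.is_none() {
            // Pay the serde cost only for entries that are actually used.
            self.decoded = Some(bincode::deserialize(&self.raw).unwrap());
        }
        self.decoded.as_ref().unwrap()
    }
}

fn main() {
    let raw = bincode::serialize(&Entry { key: "a".into(), value: 1 }).unwrap();
    let mut entry = LazyEntry::new(raw);
    println!("{:?}", entry.get()); // deserialized here, and only here
}
```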
As the sole developer of #Glowdust, I need to choose whether I'll support platforms without #io_uring, or drop uring support altogether.
The small experiment with CSV import showed me that I can't maintain two I/O stacks.
Plus, I started Glowdust in part because I wanted to play around with uring.
Hmm. Let's be honest - no one but me cares about Glowdust. And I only run #Linux.
And that's how, in the space of a single toot, I decided to do only io_uring.
Thank you for following along.
Log structured storage for #Glowdust is getting faster, but it's still slow.
Still some algorithmic improvements remaining, but I think sooner or later I'll have to deal with the cost of serializing/deserializing with serde.
The advantage of using a log structured store for #Glowdust is that transaction state can be maintained without memory allocations - in fact, at the moment, the only allocation is the stack for the bytecode interpreter.
The disadvantage is that I don't yet know how to code a LS store and it's slow AF.
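The allocation-free part, roughly, as a sketch (not the actual Glowdust layout): tx state entries get appended into a preallocated page buffer, so writing one is a copy into memory that already exists rather than a fresh allocation.

```rust
const PAGE_SIZE: usize = 4096;

// A preallocated, append-only page for transaction state.
// Writing into it never allocates; an entry either fits, or the page
// is full and would be flushed / a new page faulted in from the cache.
struct LogPage {
    buf: [u8; PAGE_SIZE],
    used: usize,
}

impl LogPage {
    fn new() -> Self {
        LogPage { buf: [0u8; PAGE_SIZE], used: 0 }
    }

    // Append a serialized entry; returns false if it doesn't fit.
    fn append(&mut self, entry: &[u8]) -> bool {
        if self.used + entry.len() > PAGE_SIZE {
            return false;
        }
        self.buf[self.used..self.used + entry.len()].copy_from_slice(entry);
        self.used += entry.len();
        true
    }
}

fn main() {
    let mut page = LogPage::new();
    assert!(page.append(b"tx 1: set f(x) = 7"));
    assert!(page.append(b"tx 1: commit"));
    println!("page holds {} bytes of tx state", page.used);
}
```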
How often do you get to design a database transaction API?
For me, that's twice now.
Here are my notes on the second time - transaction API design for #Glowdust
Huh, it looks like tomorrow will be the day when #Glowdust's Log Structured native store will have support for multiple pages.
You know, for the fancy pants use cases that need more than 64k of data.