<coolcoder613>
I'm taking what you might call the easy route with my OS (I'm mostly using libraries written by other people, for example for ATA, for my allocator, to run wasm...). My OS is written in Rust, and I'm currently having a lot of trouble getting FAT working on top of my ATA driver.
<coolcoder613>
The ATA library wants something with Read, Write, and Seek implemented, and I've done that. The problem is I don't know how to implement IOBase, which Read, Write and Seek depend on.
<Mutabah>
wait, why does the ATA library want a file interface?
<Mutabah>
Or does it _expose a file-like interface?
<Ermine>
Idk what IOBase means in the context of the library you're using
<Mutabah>
what library are you using?
<Ermine>
so you should refer to the library docs and code
<Ermine>
also, imo it's not "easy route", it's "taking shortcuts which will bring much more trouble long term"
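For reference, a minimal sketch of the usual shape of this problem: a byte-addressed Read/Write/Seek cursor layered over a sector-based device, with a base trait that only names the shared error type. Everything here is an assumption made for illustration. The traits `IoBase`, `Read`, `Write`, `Seek` and the `SeekFrom` enum are local stand-ins, not the API of whichever crate coolcoder613 is using, and `BlockDevice`, `RamDisk`, `ByteStream` and `DiskError` are hypothetical names. In a real kernel the `RamDisk` would be replaced by the ATA driver.

```rust
// Stand-in traits written for this sketch; they are NOT copied from any
// crate. A real FAT library's traits may differ in names and signatures.
pub trait IoBase { type Error; }
pub trait Read: IoBase {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error>;
}
pub trait Write: IoBase {
    fn write(&mut self, buf: &[u8]) -> Result<usize, Self::Error>;
    fn flush(&mut self) -> Result<(), Self::Error>;
}
pub enum SeekFrom { Start(u64), Current(i64), End(i64) }
pub trait Seek: IoBase {
    fn seek(&mut self, pos: SeekFrom) -> Result<u64, Self::Error>;
}

const SECTOR: usize = 512;

#[derive(Debug, PartialEq)]
pub enum DiskError { OutOfRange }

// Hypothetical sector-level interface, standing in for an ATA driver.
pub trait BlockDevice {
    fn sector_count(&self) -> u64;
    fn read_sector(&mut self, lba: u64, buf: &mut [u8; SECTOR]);
    fn write_sector(&mut self, lba: u64, buf: &[u8; SECTOR]);
}

// RAM-backed device so the sketch runs on a host machine.
pub struct RamDisk(Vec<[u8; SECTOR]>);
impl RamDisk {
    pub fn new(sectors: usize) -> Self { RamDisk(vec![[0; SECTOR]; sectors]) }
}
impl BlockDevice for RamDisk {
    fn sector_count(&self) -> u64 { self.0.len() as u64 }
    fn read_sector(&mut self, lba: u64, buf: &mut [u8; SECTOR]) { *buf = self.0[lba as usize]; }
    fn write_sector(&mut self, lba: u64, buf: &[u8; SECTOR]) { self.0[lba as usize] = *buf; }
}

// Byte-addressed cursor over a sector device.
pub struct ByteStream<D: BlockDevice> { dev: D, pos: u64 }
impl<D: BlockDevice> ByteStream<D> {
    pub fn new(dev: D) -> Self { ByteStream { dev, pos: 0 } }
    fn size(&self) -> u64 { self.dev.sector_count() * SECTOR as u64 }
}

// The base trait only fixes the error type shared by the other three.
impl<D: BlockDevice> IoBase for ByteStream<D> { type Error = DiskError; }

impl<D: BlockDevice> Read for ByteStream<D> {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, DiskError> {
        let (size, mut done) = (self.size(), 0);
        while done < buf.len() && self.pos < size {
            let (lba, off) = (self.pos / SECTOR as u64, (self.pos % SECTOR as u64) as usize);
            let n = (SECTOR - off).min(buf.len() - done);
            let mut sector = [0u8; SECTOR];
            self.dev.read_sector(lba, &mut sector);
            buf[done..done + n].copy_from_slice(&sector[off..off + n]);
            done += n;
            self.pos += n as u64;
        }
        Ok(done)
    }
}

impl<D: BlockDevice> Write for ByteStream<D> {
    fn write(&mut self, buf: &[u8]) -> Result<usize, DiskError> {
        let (size, mut done) = (self.size(), 0);
        while done < buf.len() && self.pos < size {
            let (lba, off) = (self.pos / SECTOR as u64, (self.pos % SECTOR as u64) as usize);
            let n = (SECTOR - off).min(buf.len() - done);
            let mut sector = [0u8; SECTOR];
            self.dev.read_sector(lba, &mut sector); // read-modify-write for partial sectors
            sector[off..off + n].copy_from_slice(&buf[done..done + n]);
            self.dev.write_sector(lba, &sector);
            done += n;
            self.pos += n as u64;
        }
        Ok(done)
    }
    fn flush(&mut self) -> Result<(), DiskError> { Ok(()) } // nothing is buffered here
}

impl<D: BlockDevice> Seek for ByteStream<D> {
    fn seek(&mut self, pos: SeekFrom) -> Result<u64, DiskError> {
        let size = self.size() as i64;
        let target = match pos {
            SeekFrom::Start(n) => n as i64,
            SeekFrom::Current(d) => self.pos as i64 + d,
            SeekFrom::End(d) => size + d,
        };
        if target < 0 || target > size { return Err(DiskError::OutOfRange); }
        self.pos = target as u64;
        Ok(self.pos)
    }
}

fn main() {
    let mut s = ByteStream::new(RamDisk::new(4));
    s.seek(SeekFrom::Start(510)).unwrap();
    s.write(b"FAT!").unwrap(); // straddles sectors 0 and 1
    s.seek(SeekFrom::Start(510)).unwrap();
    let mut out = [0u8; 4];
    s.read(&mut out).unwrap();
    assert_eq!(&out, b"FAT!");
}
```

As Ermine says, the authoritative definition of the base trait is in the library's own docs and source; the sketch only shows why such a trait tends to be tiny.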
<heat>
problem 4 is a nonissue, problem 2 is a nonissue
<the_oz>
a lot of this is BECAUSE paging to/from memory is possible...
<the_oz>
where would you go if you need resident memory? callstack?
<the_oz>
highly active ring buffer?
<dostoyevsky2>
https://www.youtube.com/watch?v=aoewwZwVmv4&t=1638 <- about 4: he's arguing that the kernel uses "expensive" mutexes and that if you were to manage memory directly in the DB there are cheaper mutexes...
<dinkelhacker>
Lucretia: in gdb do: "define hook-stop" and follow the instructions
<dinkelhacker>
You can define commands that will be executed when the
<dinkelhacker>
inferior stops
<heat>
dostoyevsky2, LOL
<heat>
hilarious
<dostoyevsky2>
heat: And then he adds at the end: "Even if Linus disagrees"?
<the_oz>
if he's gonna go that route why bother with the kernel at all, DBMS bare metal
<the_oz>
no paging issues when you're supreme overlord
<dostoyevsky2>
heat: I mean #3 is nasty, having a SIGBUS instead of a proper error message, but if a write() fails in the DB all you get is a better error message and then the db crashes, I've never seen a DB recover from that
<dostoyevsky2>
nikolar: OS can flush dirty pages at any time, so if you are in a transaction that might be bad
<nikolar>
right, but you can force a flush, no?
<nikolar>
you've got basically the same issue with write anyway
<the_oz>
flush is better than refusal
<dostoyevsky2>
nikolar: So you could argue that the DB knows better what pages might be reused in the future than the OS... but not sure about that
voidah has joined #osdev
<heat>
yeah. mmap might make it easier to violate those constraints, but it's fundamentally the same thing
<nikolar>
i mean sure, but you get basically the same end situation
<heat>
oh, general wisdom among kernel folks is that userspace has _no_ fucking clue what it's doing
<nikolar>
just with somewhat less code with mmap
<heat>
it turns out to be correct
<nikolar>
kek
<nikolar>
well the only userspace software i'd expect to have some clue are databases
<nikolar>
they are basically specialized filesystems
<heat>
meanwhile this guy says mmap is slow while mongodb is WEBSCALE and eats other dbs alive using mmap
<nikolar>
lel
<nikolar>
it's mongo though, didn't it have some major corruption bugs
<heat>
afaik mongo gives 0 fucks about data safety
<heat>
inb4 SNAPSHOTS
<dostoyevsky2>
the_oz: https://i.ibb.co/2cDT3JF/yb.png <- yellow brick does its own memory allocation, thread scheduling, device drivers, and network protocols
<nikolar>
heat: what if your snapshots get corrupted
<heat>
they would not, zfs wouldn't do that to you
<heat>
now, btrfs on raid...
<nikolar>
zfs doesn't corrupt stuff
<nikolar>
you're thinking of btrfs
<xal_>
there's a talk by one of the LMDB developers that makes a good argument for when mmap makes sense in a DBMS (unfortunately the audio is completely messed up, I had to download and play it in mono): https://www.youtube.com/watch?v=tEa5sAh-kVk
<nikolar>
literally the first time i've tried btrfs, it got corrupted
<nikolar>
(wasn't raid)
<kof673>
> they are basically specialized filesystems # i suspect there is some 1960s paper where these new filesystem things are like specialized databases
<nikolar>
kof673: i guess you could look at either as if it were the other one
<the_oz>
I've seen this before
<the_oz>
fibre channel
<nikolar>
kek should've guessed that it was postgresql
<the_oz>
I have much the same architecture serving up disk blocks
xal_ is now known as xal
<dostoyevsky2>
xal: thanks for that link
<Mondenkind>
'does he realize write() doesn't write to disk?' O_DIRECT, spdk
<heat>
well then you're comparing two different things
<heat>
correctness and performance wise
<nikolar>
> Try to minimize cache effects of the I/O to and from this file.
<nikolar>
no guarantees
<heat>
they're being humble, O_DIRECT is guaranteed to work AFAIK
<heat>
mmap is analogous to some fucked up lseek + buffered write, there's no O_DIRECT variant
<heat>
which, fun fact: lots of prod databases do buffered writes by default
<heat>
postgres does buffered writes, sqlite does buffered writes. not sure what the situation is with mariadb/mysql
<nikolar>
probably the same
<the_oz>
something something which trans engine something
<heat>
there are also these two syscalls named fdatasync and msync which can literally just sync a file range - which is exactly what you want in db terms
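A minimal sketch of the buffered-write-then-sync pattern under discussion, using only the Rust standard library. The function name `durable_append` and the file name are made up for illustration. One caveat on the message above: `File::sync_data` (documented as mapping to `fdatasync` on Linux) flushes the whole file rather than a range; range-limited syncing is what `msync` and Linux's `sync_file_range` offer, and std exposes neither, so this sketch syncs the whole file.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read, Write};
use std::path::Path;

/// Append a record and return only once the OS reports the data flushed.
/// (Hypothetical helper; a real DB would batch records and keep the file open.)
fn durable_append(path: &Path, record: &[u8]) -> io::Result<()> {
    let mut f = OpenOptions::new().create(true).append(true).open(path)?;
    f.write_all(record)?; // buffered write: lands in the OS page cache
    f.sync_data()?;       // ask the OS to push the file's data to the device
    Ok(())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("osdev_wal_sketch.log");
    let _ = std::fs::remove_file(&path); // start clean, ignore "not found"
    durable_append(&path, b"BEGIN;")?;
    durable_append(&path, b"COMMIT;")?;

    let mut contents = String::new();
    std::fs::File::open(&path)?.read_to_string(&mut contents)?;
    assert_eq!(contents, "BEGIN;COMMIT;");
    std::fs::remove_file(&path)
}
```

Whether the data has truly reached stable storage after the sync still depends on the drive honoring the flush.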
<xal>
i would really love to see fsync split into two different syscalls: one which acts as a write barrier and another which blocks until all pending writes are durably on disk. iirc OSX has this, but very few applications are aware that ordinary fsync there is effectively a no-op as far as durability goes
voidah has quit [Remote host closed the connection]
<xal>
what difference does it make if my data is lost in a crash and it's in the disk's buffers rather than the OS buffers?
<heat>
right. that's PEBKACAP (problem exists between kernel and chair of apple programmer)
<xal>
when I was investigating this I (re)discovered that while sqlite goes out of their way to use F_FULLFSYNC on osx, apple's shipped version in /usr/lib patches that out and turns it into a write barrier only
<kof673>
i did find a post where the apples person just says disks (and/or controllers/whatever) lie, so that was the reasoning
<nikolar>
they lie somewhat, depending on the model etc
<nikolar>
doesn't mean the os should lie too
<kof673>
sure. i didn't save the link, they were saying this is why you should buy apple-approved drives instead of external things, because then they can mandate they do not lie/etc.
<kof673>
in theory anyways
<nikolar>
lol
<nikolar>
sure they will
xal has quit [Quit: bye]
xal has joined #osdev
<the_oz>
that seems rather naive
<the_oz>
assuming apple hw engineers aren't jaded as fuck
<heat>
i like the part where the arrows go all over the fucking place
<heat>
it's really nice
<nikolar>
lol
<nikolar>
so the whole thing?
<heat>
yea
<nikolar>
is there a tool that generates these graphs
<nikolar>
that's kind of cooo
<nikolar>
*cool
<heat>
nah this has to be manual
<heat>
someone had to manually do this
<nikolar>
yeah cursed
<geist>
that is pretty neat. there have been times i've sat down to trace some sort of data flow in something and ended up building some sort of text version of that
<geist>
but usually more of an outline form, tracing through function calls, but it always ends up being a mess
<nikolar>
just put it through dot then geist :P
<geist>
re: apple engineers and not flushing things. yes they are jaded as fuck
<geist>
or at least back in the mid 2000s i was briefly on the FS team for a few months
<geist>
and it was a real problem: lots of new usb drive enclosures were hitting the market and a sizable number of them would ignore the flush command
<geist>
and thus not really deal with proper journalling
<geist>
i think it's mostly a testament to the fact that some of the FS folks (dbg in particular) took it *extremely* seriously that lack of flush meant you really couldn't guarantee that the journal works
<heat>
i vaguely recall linux really never enabled the write cache
<heat>
but this may not be true, i dont remember honestly
obrien has quit [Remote host closed the connection]
<geist>
yah, windows also still kinda to this day defaults to 'optimize for quick removal' and keeping the write cache disabled
<nikolar>
doesn't linux detect when a thumbstick is plugged in, and disables caching
<nikolar>
so you can just yank it out
<heat>
oh i think it disabled the write cache for everything
foudfou_ has joined #osdev
<heat>
like, hard drives too
<geist>
dunno, the real key is the larger enclosures that have a proper HD inside them that does have some write cache on the disk
<nikolar>
so, writing directly to the hardware?
<nikolar>
no buffering and such
<heat>
no
<geist>
but if the USB chipset there doesn't pass a USB level flush through to the internal drive...
<heat>
it has its own page cache, obviously
<heat>
we're talking about the disk's write cache itself
<geist>
right
foudfou has quit [Ping timeout: 260 seconds]
<geist>
thumb drives i suspect have not a tremendous amount of write cache, though it all gets blurred with really high end ones, SD cards, NVME on usb enclosure, etc
<geist>
but at least in the 2000s none of that really existed that much, except enclosures with a spinning disk
<geist>
but spinning disks had by then long had at least some amount of write cache
<geist>
OTOH i think nowadays most things are pretty good at honoring flush commands, if nothing else because most stuff is just standardized chips slapped in a box, and if the underlying commodity chip does it, then it's mostly all good
<nikolar>
i don't see why it wouldn't honestly
<nikolar>
there's plenty of transistors to spare for that i imagine
<geist>
nowadays yeah
<geist>
though i think more complex things like usb -> nvme is a lot of work. probably a microcontroller there running something to do that translation
<geist>
and you know how most real code in the world is, especially to support selling some chip...
voidah has quit [Ping timeout: 245 seconds]
<nikolar>
but passing through a flush signal seems simple enough
<nikolar>
probably the simplest operation the code needs to handle
<heat>
yea but you need to, uhh, take that into consideration
<nikolar>
kek
<heat>
and it's not actually trivial if you're just testing the thing for functionality
<heat>
if you forget read or write the thing just doesn't boot
<nikolar>
yea
<heat>
if you forget flush you can corrupt stuff very rarely if the machine goes down and you're unlucky
<zid`>
I wish I'd paid more attention in French lessons. I can't remember any French idioms.
<zid`>
Oh well, that's life. Whatever will be, will be.
<nikolar>
lol
<the_oz>
siesta lay vida
vdamewood has quit [Quit: Life beckons]
kilic_ has quit [Quit: Leaving]
n_shp is now known as nshp
hwpplayer1 has joined #osdev
hwpplayer1 has quit [Remote host closed the connection]
cow321 has quit [Ping timeout: 252 seconds]
cow321 has joined #osdev
vdamewood has joined #osdev
eschaton_ is now known as eschaton
beto has quit [Ping timeout: 260 seconds]
Dead_Bush_Sanpai has quit [Ping timeout: 252 seconds]
Dead_Bush_Sanpai has joined #osdev
Left_Turn has quit [Read error: Connection reset by peer]
beto has joined #osdev
levitating has joined #osdev
<Ermine>
yay outdated documentation
<heat>
lol what are you looking at
<nikolar>
linux kernel
<Ermine>
drm gem
<nikolar>
kek
heat_ has joined #osdev
<heat_>
oh are you volunteering to port drm to onyx? sgtm
<heat_>
ok heat@
heat has quit [Read error: Connection reset by peer]
<Ermine>
it's quite possible if I grok this thing
levitating has quit [Read error: Connection reset by peer]
heat_ has quit [Read error: Connection reset by peer]
heat_ has joined #osdev
LittleFox has quit [Quit: ZNC 1.8.2+deb3.1+deb12u1 - https://znc.in]