<chiselfuse>
i block SIGINT in thread_function() and yet when i do ^C its handler gets executed instead of main's handler
spareproject has joined #osdev
MrCryo has quit [Ping timeout: 276 seconds]
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
<goliath>
chiselfuse, signal handlers are _per process_, not per thread. When the handler is invoked, the kernel picks a thread to run the handler on.
<goliath>
Blocking a signal in a thread simply means "don't pick this one"
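[Editor's sketch, not from the channel] A minimal C illustration of goliath's point, assuming a POSIX threads system. One handler is installed for the whole process; a worker thread blocks SIGINT with pthread_sigmask(), so the kernel has to run the handler on the main thread instead. The names demo_sigint_routing, worker and on_sigint are hypothetical, and the thread-local flag is only there so the handler can record where it ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t handled = 0;
static volatile sig_atomic_t handled_on_worker = 0;
static _Thread_local int is_worker = 0;   /* 1 only inside the worker thread */
static pthread_barrier_t barrier;

/* The single, process-wide SIGINT handler. */
static void on_sigint(int signo) {
    (void)signo;
    handled = 1;
    handled_on_worker = is_worker;
}

static void *worker(void *arg) {
    sigset_t set;
    (void)arg;
    is_worker = 1;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    /* Blocking here only means "do not pick this thread". */
    pthread_sigmask(SIG_BLOCK, &set, NULL);
    pthread_barrier_wait(&barrier);   /* tell main the mask is in place */
    pthread_barrier_wait(&barrier);   /* wait until main has sent SIGINT */
    return NULL;
}

/* Returns 1 if the handler ran on the main thread, 0 if not, -1 on setup error. */
int demo_sigint_routing(void) {
    struct sigaction sa, old;
    pthread_t tid;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, &old) != 0) return -1;
    if (pthread_barrier_init(&barrier, NULL, 2) != 0) return -1;
    if (pthread_create(&tid, NULL, worker, NULL) != 0) return -1;

    pthread_barrier_wait(&barrier);   /* worker has blocked SIGINT */
    kill(getpid(), SIGINT);           /* process-directed, like ^C */
    pthread_barrier_wait(&barrier);
    pthread_join(tid, NULL);

    pthread_barrier_destroy(&barrier);
    sigaction(SIGINT, &old, NULL);    /* restore the previous disposition */
    return handled && !handled_on_worker;
}
```

Build with -pthread. If the main thread blocked SIGINT as well, the signal would simply stay pending until some thread unblocked it.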
<zid>
per thread signals would be amazing though
<zid>
no idea how you'd determine which thread got which signal, but hey
<nikolar>
Unmask the signal only on one thread ¯\_(ツ)_/¯
<zid>
I mean as the OS
<zid>
if they're per thread and you hit ctrl-c
<zid>
which thread gets the message
<nikolar>
all of them
<nikolar>
Random
<nikolar>
I don't know
<zid>
Thank you for that useful insight nik :p
<nikolar>
You're welcome zid <3
<nikolar>
Now I wonder if you could implement signal handling by spawning a temporary thread that would run the handler
CryptoDavid has joined #osdev
osdev199 has joined #osdev
heat has joined #osdev
bauen1 has joined #osdev
bauen1 has quit [Ping timeout: 260 seconds]
<heat>
nikolar, iirc thats how windows does it?
<heat>
at least for asynchronous signals (i.e. not synchronous ones like SIGFPE, SIGSEGV)
bauen1 has joined #osdev
edr has joined #osdev
<nikolar>
Oh really
<nikolar>
Interesting
<heat>
"SIGINT is not supported for any Win32 application. When a CTRL+C interrupt occurs, Win32 operating systems generate a new thread to specifically handle that interrupt. This can cause a single-thread application, such as one in UNIX, to become multithreaded and cause unexpected behavior"
<nikolar>
Heh
<Ermine>
horrid
<heat>
*based
<heat>
FTFY
<zid>
WM_QUIT
<zid>
or riot
osdev199 has quit [Quit: Leaving]
bslsk05 has quit [Ping timeout: 265 seconds]
nadja has quit [Remote host closed the connection]
V has quit [Remote host closed the connection]
gruetzkopf has quit [Remote host closed the connection]
nadja has joined #osdev
V has joined #osdev
gruetzkopf has joined #osdev
<chiselfuse>
ahhh i just realized it's per process, makes sense
<zid>
I like the idea of ctrl-ding each thread individually in some program though to see how fucked up I can make it
<zid>
"oh, the audio just got stuck, and now the menus don't work, and my save game is no longer updating"
Burgundy has quit [Ping timeout: 264 seconds]
<sortie>
heat: You are kidding me.
<sortie>
Toy OS.
<Ermine>
paradoxically enough, it works
<heat>
there are two types of systems in this world
<heat>
the ones that can't ever be vulnerable to the openssh syslog CVE
<heat>
and the UNIX ones
<sortie>
heat: Oh hey what's that syslog CVE?
<heat>
CVE-2024-6387
<heat>
signal programming is brilliant. the kernel people missed interrupts in userspace so much that they added interrupts
<heat>
but disabling interrupts in userspace is actually expensive because ofc
<sortie>
heat: https://www.armosec.io/blog/cve-2024-6387-regresshion-rce-vulnerability-openssh/ ← Reading this. I did see this CVE and didn't pay much attention cus I was on vacation. It does say syslog which I don't do as such in Sortix. Although my syslog does write to stderr, so my syslog still isn't async signal safe.
<heat>
ultimately its hard to exploit but shows a common problem with signal programming
<sortie>
Mostly the root problem is people thinking signal handlers are safe places to do stuff which they aren't
<zid>
signals are evil
<zid>
they make me have to care about EAGAIN
<sortie>
EINTR you mean
<zid>
probably
<sortie>
Only if you don't die on the signals, and don't turn on resuming system calls
<zid>
oh you can do the latter?
<sortie>
(except a few special cases)
<sortie>
Yep
<zid>
I'm not much of a posix programmer
<sortie>
It's the useful way to do signals
<zid>
I honestly know way more about winapi
<sortie>
In my programs that expect signals and handle them, I mask the signals at all times, and have ppoll unmask them so the signal handler runs during ppoll at a safe point. In a single threaded program, that means doing whatever there is safe.
<zid>
Okay keep your secrets as to how to turn on resuming syscalls :p
<sortie>
But in these cases, I still only set a volatile sig_atomic_t signal_happened = 1; and return to wake up the ppoll system call (which I don't resume) so it fails with EINTR and I can dispatch the signal straight from the main loop safely
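[Editor's sketch, not from the channel] A rough C version of the pattern sortie describes, assuming a single-threaded program on a system that provides ppoll() (glibc needs _GNU_SOURCE for it). The signal stays blocked except while ppoll() sleeps, the handler only sets a flag, and SA_RESTART is deliberately not set so ppoll() fails with EINTR. The function names setup_masked_signal and wait_once are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t signal_happened = 0;

/* The handler does nothing except record that the signal arrived. */
static void on_signal(int signo) {
    (void)signo;
    signal_happened = 1;
}

/* Install the handler (no SA_RESTART) and keep signo blocked from now on.
 * *unmasked receives the mask to hand to ppoll(), with signo unblocked. */
int setup_masked_signal(int signo, sigset_t *unmasked) {
    struct sigaction sa;
    sigset_t block;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                       /* ppoll() should fail with EINTR */
    if (sigaction(signo, &sa, NULL) != 0) return -1;

    sigemptyset(&block);
    sigaddset(&block, signo);
    if (sigprocmask(SIG_BLOCK, &block, unmasked) != 0) return -1;
    sigdelset(unmasked, signo);            /* make sure ppoll() lets it in */
    return 0;
}

/* One main-loop iteration: 1 = woken by the signal, 0 = fd event or timeout,
 * -1 = some other error. The signal can only be delivered inside ppoll(). */
int wait_once(struct pollfd *fds, nfds_t nfds,
              const sigset_t *unmasked, const struct timespec *timeout) {
    int r = ppoll(fds, nfds, timeout, unmasked);
    if (r < 0 && errno == EINTR && signal_happened) {
        signal_happened = 0;               /* dispatch from the main loop here */
        return 1;
    }
    return r < 0 ? -1 : 0;
}
```

Because the handler can only run while the program is parked in ppoll(), the real work happens in ordinary main-loop code after wait_once() returns 1, where async-signal-safety is no longer a concern.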
<sortie>
zid: I mean it's completely standard signal programming. sigaction(2) .sa_flags = SA_RESTART.
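[Editor's sketch, not from the channel] What setting SA_RESTART through sigaction(2) looks like in C. The wrapper names install_handler and handler_restarts, and the do-nothing handler, are hypothetical; only sigaction() and the SA_RESTART flag come from the discussion.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static void noop_handler(int signo) {
    (void)signo;
}

/* Install handler for signo; with restart != 0, interrupted system calls
 * are resumed instead of failing with EINTR (a few calls are exceptions). */
int install_handler(int signo, void (*handler)(int), int restart) {
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = restart ? SA_RESTART : 0;
    return sigaction(signo, &sa, NULL);
}

/* Query the current disposition: 1 if SA_RESTART is set, 0 if not, -1 on error. */
int handler_restarts(int signo) {
    struct sigaction cur;

    if (sigaction(signo, NULL, &cur) != 0) return -1;
    return (cur.sa_flags & SA_RESTART) != 0;
}
```

Calling handler_restarts() after signal(3) is one way to see what a given libc does with the flag, since that behaviour differs between implementations.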
<zid>
ah thanks
<heat>
isn't it the default too?
<sortie>
signal(3) does not do this, it's essentially deprecated.
<zid>
I think the last time I dicked with signals was the old interface
<sortie>
heat: SA_RESTART is not 0, so no
<zid>
yea signal()
<sortie>
Hmm musl does set SA_RESTART in signal but I'm not spotting the POSIX text that says to do it
<zid>
I'm not sure I was ever actually *interrupted*, but the posix text does not in fact say signal should do it.. so I guess you just have to assume it might
<heat>
"In the GNU C Library, establishing a handler with signal sets all the flags to zero except for SA_RESTART, whose value depends on the settings you have made with siginterrupt"
<zid>
so my socket app I wrote that handled signals ended up wrapping accept in while != EINTR and stuff
<zid>
so that it didn't think the listen socket had disconnected accidentally
<zid>
if the syscall failed instead of succeeded
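[Editor's sketch, not from the channel] The kind of wrapper zid describes, in C: retry accept() while it fails with EINTR, so an interrupted call is not mistaken for a dead listening socket. The name accept_retry is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Returns a connected fd, or -1 with errno set to a real error.
 * EINTR is never reported to the caller; the call is simply retried. */
int accept_retry(int listen_fd) {
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0 || errno != EINTR)
            return fd;
        /* Interrupted by a signal handler: nothing is wrong, try again. */
    }
}
```

With SA_RESTART set on the handlers involved, the loop would normally never iterate more than once, but it stays correct either way.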
<sortie>
"Otherwise, the program shall resume execution at the point it was interrupted." is all POSIX says, whatever that means
<heat>
oh this is also interesting
<sortie>
I set SA_RESTART in Sortix signal(3).
<heat>
"The siginterrupt() function changes the restart behavior when a system call is interrupted by the signal sig. If the flag argument is false (0), then system calls will be restarted if interrupted by
<heat>
the specified signal sig. This is the default behavior in Linux."
<heat>
linux signal defaults to SA_RESTART
<sortie>
siginterrupt was removed in POSIX 2024.
<sortie>
Sortix never had it
<heat>
posix? what's posix? never heard of it
<FreeFull>
Too bad you can't implement posix without implementing unix signals
<sortie>
I recommend only using signal(3) if the values you supply are SIG_IGN or SIG_DFL, and you don't have any uses of sigaction (inconsistent but valid to mix them). Otherwise just use sigaction everywhere.
<sortie>
FreeFull: Honestly they do get a bit better to implement once you drop all the legacy stuff they removed in POSIX 2024 and honestly they're not too bad. Just a bit to learn and easily misunderstood.
<FreeFull>
Still a pain to deal with for the userspace
<heat>
i use signal a lot
<sortie>
To a large extent, user-space should stop caring so much about signals. No, I don't want your crash handler. It's a bad idea. Just crash.
<FreeFull>
But yeah, you can't just not implement them, since then a bunch of software won't work
<heat>
sigaction is too verbose for my purposes
<FreeFull>
I find it funny that even "Hey, the terminal has resized" is a signal
<zid>
The only valid use of a segfault handler I've seen was that for some reason, some 3rd party .so would crash the first time you ran it, but be fine if you just immediately restarted it
<zid>
so they just.. ignored the first crash
<FreeFull>
I think some VMs like the JVM rely on being able to handle segfaults too
xenos1984 has quit [Read error: Connection reset by peer]
<heat>
newer GCs don't need to handle segfaults i believe
<heat>
there's a fun madvise call to that effect
<sortie>
heat: Oh very interesting
guideX has joined #osdev
<sortie>
heat: Yeah sigaction can technically fail but also it will never fail on good inputs unless you got a shit impl
<sortie>
It's one of the few system calls, like fstat, that I never check the error on cus they won't fail
<heat>
oh, not madvise, mremap
<heat>
fstat can definitely fail
<sortie>
NFS is the only case where I think fstat might be able to fail
<zid>
ESHITIMPL
<heat>
dude, spurious -EIOs, spurious -ENOMEMs (in case your kernel sucks and can't handle an OOM)
guideX has quit [Quit: Leaving]
bslsk05 has joined #osdev
<sortie>
EIO I'm not quite concerned about, a quality kernel should already have that information cached in vfs for the lifetime of the open file handle
<sortie>
That's what Sortix does. You open a file, the stat information is populated there and won't fail.
<heat>
that doesnt reflect reality
<sortie>
(I think. User-space filesystems may roundtrip to user-space to get the information.)
<sortie>
ENOMEM can't happen in fstat on Sortix, the drivers have all allocated the information ahead of time
<sortie>
I also never check the error value of close(2). There's nothing reasonable that could be done there and the error should have been reported earlier.
<sortie>
(write(2) should have failed, or if you're really paranoid, you can call fsync(2) to be sure)
<heat>
oh, here's a fun one: -EIO reporting on write is completely fucked on unix
<heat>
google completely gave up on reporting that shit back to userspace, they got a daemon that reports and logs this stuff
<heat>
i *think* fsync usually reports EIOs, write does not in *any* case
<heat>
it's not like sockets where they can return old errors
<sortie>
Yeah. Agreed.
<sortie>
(idk anything about G's situation here)
<sortie>
But write(2) succeeding to me, means a commitment that the data has been accepted by the filesystem and scheduled for writing and all other errors have been checked
eddof13 has quit [Quit: eddof13]
<sortie>
If the underlying harddisk starts failing after that though, while the writing happens in the background, there's nothing really that can be done about that. A subsequent write could give EIO potentially. Any message on close(2) is a bad idea.
<sortie>
A fsync(2) really is the only place where user-space can meaningfully receive these errors, by being willing to wait for the background write to happen
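[Editor's sketch, not from the channel] A small C example of the approach sortie outlines: treat write() success as "accepted", and call fsync() when the program really needs to learn about write-back errors. The helper names write_durably and save_file are hypothetical, and how reliably a given kernel or filesystem reports errors through fsync() is outside what this sketch can show.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Write all of buf, then wait for it to reach stable storage.
 * Returns 0 on success, -1 with errno set on failure. */
int write_durably(int fd, const void *buf, size_t len) {
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;  /* interrupted before writing */
            return -1;
        }
        p += n;                            /* handle short writes */
        len -= (size_t)n;
    }
    /* This is where deferred I/O errors are expected to surface. */
    return fsync(fd) == 0 ? 0 : -1;
}

/* Create or truncate path and store buf in it durably. */
int save_file(const char *path, const void *buf, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    int rc;

    if (fd < 0) return -1;
    rc = write_durably(fd, buf, len);
    close(fd);  /* result ignored, following the discussion: errors were
                   already reported by write() or fsync() */
    return rc;
}
```

Programs like cp would skip the fsync() step and rely on background write-back; this shape suits the "database thing or operating system upgrade" case.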
<Ermine>
Iirc I've got EIO on write once
valerius_ is now known as valerius
<sortie>
Yeah it's happened to me a bunch too when my system was hosed when my harddisk was having trouble
<Ermine>
But it might be a hallucination on the second thought
mjg has quit [Ping timeout: 252 seconds]
<sortie>
Most programs should not fsync out because the background writeout is an important optimization. It's up to the sysadmin to decide how willing they are to accept filesystem corruption and what kind and trade off performance and such
goliath has quit [Ping timeout: 252 seconds]
eddof13 has joined #osdev
amj has quit [Ping timeout: 252 seconds]
<sortie>
But if you're doing a database thing or an operating system upgrade or what not and really want to be sure the writeout went through, fsync is a good first choice.
<Ermine>
There's also sync
<sortie>
Or if you're looking to unmount
<Ermine>
unmount does sync doesn't it
<sortie>
sync also works but it also covers everything, which is a thing you may care about, but often it's also just a single file you care about
<GeDaMo>
Are there file systems which support transactions? "all of these writes work or none do"
<sortie>
unmount does require the processes to have stopped using the device, so fsync does help processes know it's done, as an extra safety, before shutting down
xenos1984 has joined #osdev
<nikolar>
GeDaMo: you can do something like that in filesystems with snapshotting
<nikolar>
Create a snapshot, do whatever, sync, delete the snapshot
<sortie>
GeDaMo: Other people definitely know more than me. I imagine you can turn on journaling for data for extra slowness and reliability, or advanced snapshot features, and what not. There's lots of filesystems one can choose
<nikolar>
And on failure, if there's a snapshot, just rollback
<Ermine>
GeDaMo: ntfs has transactions
<sortie>
It's one of those reasonable sysadmin choices, like whether you only want metadata to be safe, or also file contents
<heat>
sortie, sysadmin knows jack shit about fs corruption
<heat>
programs need to fsync what they care about
<heat>
or at best fsync under a config option
<nikolar>
^
<sortie>
Sysadmins need to know their shit, it's the job
<Ermine>
yeah, i don't know why did they deprecate transactions
<kof673>
sysadmin does not necessarily make the financial decisions :D
<sortie>
Depending on what they're running and of what importance
<nikolar>
Underused I guess, Ermine
<sortie>
Programs like cp and sort don't need to use fsync. It's overkill.
<heat>
but install does
<Ermine>
I approved buying those hp servers, is this a financial decision
<sortie>
But if you got any custom programs working on important data, yeah, you want fsync.
<nikolar>
wat
<kof673>
no, i mean total budget
<kof673>
or "use what we have, good luck"
<heat>
you're now responsible for mapping the BIOS and executing it in userspace, good luck Ermine
<zid>
don't cp your important data nikolar
<nikolar>
I'll rsync it
<kof673>
lol @ heat
<nikolar>
And snapshot first
<Ermine>
heat: yes, I *deeply* regret it
<Ermine>
now it's one of those cringe thoughts that doesn't let me sleep
<sortie>
It's a reasonable business decision to balance the need for reliability and cost and should be advised by technical leadership
<Ermine>
truly horrendous shit
<sortie>
If you're running a service that can easily be e.g. reimaged or automatically recover from backups, it can be totally fine to work with servers that can fail at any time
<heat>
caring about firmware and their awful hacks is a it's joever kind of business
goliath has joined #osdev
<sortie>
I probably should not share an office with heat, I imagine we'd have the best discussions but we'd probably just spend all day getting into fights over disagreements
<heat>
maybe
<sortie>
mremap(2) MREMAP_DONTUNMAP is interesting
amj has joined #osdev
<sortie>
userfaultfd(2) seems like peak Linux
<heat>
userfaultfd is the thing they love to use to make GC fast
<sortie>
Note: glibc provides no wrapper for userfaultfd(), necessitating the use of syscall(2).
<sortie>
No, that is peak Linux.
<Ermine>
what it is even
<heat>
userfaultfd basically lets you handle page faults in userspace
<heat>
but *all* page faults in a given region, not just SIGSEGVs
<puck>
i wanna play with userfaultfd
<Ermine>
boehm should use it
<puck>
technically you could implement userfaultfd without its existence, actually
<puck>
it's just a bit more terrifying
<heat>
how?
<puck>
..either fuse, or ublk, then mmap that in
<heat>
lol
<puck>
<del>now if you do this linus torvalds will come to your house and stare at you while you sleep</del>
<Ermine>
it does some shit to provoke SIGSEGV which is handled by boehm
<Ermine>
it's UB
<puck>
hm, does boehm segfault on purpose?
<sortie>
I actually once worked with the Boehm guy
<heat>
i wish linus torvalds would stare at me while i sleep 😔
<Ermine>
puck: yes
<kof673>
who is staring at linus? mjg? j/k
<puck>
oh, virtual mprotect, that's funny
CryptoDavid has quit [Quit: Connection closed for inactivity]
<puck>
kof673: linus torvalds awakes in the middle of the night. cold chills run down his spine, he flips the light switch. Linus Torvalds is staring at him from the foot of his bed.
<sortie>
I occasionally really want to drag Torvalds in front of my livelocked Linux desktop like a misbehaving dog being shamed and be like 'Explain yourself why is it not working'
<Ermine>
heat: 'linus torvalds is watching you' poster would be suitable
<puck>
sortie: you know what's fun? userfaultfd can catch page faults happening in the kernel
valerius has quit [Killed (NickServ (GHOST command used by theophilus!~corvus@user/theophilus))]
<Ermine>
sortie: I think he's the least suitable person for that answer
<sortie>
puck: it what
<heat>
yeah ofc
<puck>
sortie: like. kernelspace reads from userspace at times
<heat>
userfaultfd and fuse mapping faults get handled specially to avoid DOS
valerius_ has joined #osdev
<sortie>
Ermine: I tend to think that if I keep doing it everytime my Linux livelocks on IO due to OOM conditions, there will be accepted patchsets to improve things
<sortie>
And/or distros fix their configs
<sortie>
The flight home from Aarhus everytime will be so annoying
<Ermine>
oh, 12309-like?
<heat>
i havent had a good livelock in a good while
<Ermine>
I've got it while trying to unrar leaked nvidia sources
<sortie>
To trigger a livelock, just do basics like using Firefox for long enough with enough tabs and boom
<heat>
sir, i use chrome
<sortie>
Don't try to help me here. This is a can of worms.
<heat>
i have an SSD, swap, zswap and mglru
<sortie>
I honestly *do not* want to tinker with my debian install. It's there to get out of my way.
<heat>
works well
<heat>
i mean yeah you're using debian chances are you're still on 2.6
<sortie>
All of my energy being annoyed with Linux goes straight into Sortix.
<sortie>
Debian testing, it's quite up to date tbh
<Ermine>
Oh, I've one day had a lock with mglru on
<Ermine>
When I was trying to build rust package on my 1Gb vps
<sortie>
Hmm I am on Firefox 115 esr
<Ermine>
Firefox never caused such issue for me
<Ermine>
Btw my linux annoyances energy should go into Onyx
<nikolar>
Firefox did cause thrashing for me when it took up too much ram
<heat>
Ermine, SEND PATCHEN
<heat>
it is after all an excellent system for building software
<Ermine>
wifi
<Ermine>
Ot
<Ermine>
It's on the list of baits I've took
<heat>
ot?
<Ermine>
not ot, wifi
<heat>
wifi is terrible :(
<Ermine>
'Ot' is "I can't type"
<heat>
signals in general are terrible. fuck sound, fuck wifi
<Ermine>
I need to prove that relevant linux subsystem is crapper
<heat>
openbsd wifi OPTIMAL
<Ermine>
why all of the sudden
<nikolar>
Can you steal a wifi stack heat
<Ermine>
now that onyx kernel is GPL, everything can be stolen
<kof673>
i didn't know people still used rar (not a criticism)
<Ermine>
people still use winrar
<heat>
nikolar, if i ever wanted/needed wifi i would most definitely take someone else's
<heat>
i am NOT writing sound or wifi stacks
<nikolar>
From where though
<nikolar>
Like how hard would it be to take that kind of code and adapt it
<heat>
linux, openbsd for wifi, linux, freebsd for sound
<heat>
it would be hard but probably significantly easier than writing your own
<heat>
and par for the course, every self-respecting UNIX has a linux compat layer for DRM to run on
<nikolar>
Yeah makes sense
mjg has joined #osdev
<heat>
Ermine, duuuuude i could fucking maple tree now
<heat>
MAPLE TREEE
<Ermine>
Yeaaaaah
asarandi has quit [Quit: WeeChat 4.2.2]
gareppa has joined #osdev
Burgundy has joined #osdev
gareppa has quit [Quit: WeeChat 4.1.1]
Arthuria has joined #osdev
goliath has quit [Quit: SIGSEGV]
Arthuria has quit [Ping timeout: 252 seconds]
<Ermine>
I guess I should start with linux wifi docs
Arthuria has joined #osdev
eddof13 has quit [Quit: eddof13]
Arthuria has quit [Ping timeout: 248 seconds]
eddof13 has joined #osdev
nuno has left #osdev [Leaving]
antranigv_ has joined #osdev
antranigv has quit [Ping timeout: 252 seconds]
<Ermine>
heat: do you like openbsds wifi stack unironically?
asarandi has joined #osdev
<heat>
i don't know, i know it's solid and like... the best bit out of openbsd
<heat>
no idea if it's technically okay or pure garbage, and tbqh i can't judge wifi stacks
eddof13 has quit [Quit: eddof13]
melonai has quit [Quit: Ping timeout (120 seconds)]
melonai has joined #osdev
eddof13 has joined #osdev
bauen1 has quit [Ping timeout: 260 seconds]
<Ermine>
Seems like wifi would require in-kernel crypto...
<heat>
oh right that fucking sucks
<heat>
on one hand it's probably not too hard to deal with
<heat>
otoh, lots of code
<Ermine>
I think embedding bearssl would be fairly easy
<heat>
bearssl is unmaintained
<heat>
everyone's on mbedtls now
<Ermine>
yes, that's the sad part
foudfou has quit [Remote host closed the connection]
<heat>
or GNUTLS
<heat>
why sad?
foudfou has joined #osdev
<Ermine>
it's permissive
<Ermine>
permissively licensed*
<Ermine>
If it was maintained and complete, one could write openssl compat level and throw openssl away
<bslsk05>
pubs.opengroup.org: The Open Group Base Specifications Issue 8
<Ermine>
openssl is an annoyance
<sortie>
Update your bookmarks. I like how you can tell if you're reading the old issue 7 specification if the font is green, since Issue 8 from 2024 has a new turquoise color.
<Ermine>
yay
<heat>
Ermine, openssl is nice
<Ermine>
There was a discussion in Alpine to switch away from openssl
<heat>
to what??
<sortie>
closedssl
<Ermine>
to nothing
<heat>
libressl lmfao, boringssl isn't ABI stable
<Ermine>
every other proposed alternative had its cons
<heat>
bearssl and mbedtls have different goals and aren't API compatible
<vin>
Surprising that there is no functionality that supports this
bauen1 has joined #osdev
<vin>
Since I expect the cost of collapsing base pages to huge pages must be the same as splitting them, both will involve page table updates and TLB shootdowns.
gog has joined #osdev
renzei has joined #osdev
renzei has quit [Quit: leaving]
<Ermine>
Woo, zoom doesn't immediately crash in wayland mode anymore
<bslsk05>
twitter: <Dexerto> CrowdStrike, the company that caused the biggest computer crash in history, is offering a $10 Uber Eats gift card as an apology to its clients.   Via TechCrunch https://pbs.twimg.com/media/GTQ_ChgXAAA2MAd.jpg
<netbsduser>
about huge pages
<netbsduser>
i thought a bit about them and adopted a buddy allocator with a view to the possibility of having transparent huge pages at some point
<netbsduser>
but the more i think of it, the more difficult i think it to actually implement them in a manner even remotely sane
<heat>
why?
Arthuria has quit [Ping timeout: 260 seconds]
<netbsduser>
linux sees a lot of attention paid to them but even on linux they are not particularly popular
<mjg>
they are literally transparently allocated
<heat>
sure they are
<mjg>
for years now
<netbsduser>
mjg: and people love to switch them off
<heat>
they're a make-or-break for many workloads
<heat>
it tends to be on the workloads that are already carefully curated with hugetlbfs support and that shit, that thp sucks
<netbsduser>
one thing that bothers me a bit is their interaction with paging dynamics
<netbsduser>
they are large but accessing one byte of them implies the whole thing should stay in
<heat>
not quite
<heat>
they don't exactly need to or should work like that
<heat>
e.g you write to a file thp .data, do you want to Cow the whole 2MB page? probably not
<kof673>
and evil^H^H^H^Hexperienced person would say the $10 offer is to say "sorry, you already settled, can't sue" lol
<kof673>
*an
<netbsduser>
some might argue that the right solution then is to track prospective groupings for upgrade to large page, to check whether a good number of the individual small pages are frequently accessed, and then if they are over some reasonable period, then assume they will remain frequently accessed and upgrade
<heat>
nobody got time for that
<netbsduser>
but as denning says the working set varies over time
<netbsduser>
as a bit of a counterpoint to concerns about thp replacement, it says a lot that in the first days of 4k pages you might only have about 1000 of them in your system
eddof13 has quit [Quit: eddof13]
<netbsduser>
and nowadays a lot of systems have a lot more than 1000 2mib pages' worth of ram
<chiselfuse>
i don't understand what rseq is. i tried searching online but all i got is "used for critical sections that are restartable"
<chiselfuse>
when preempted
<netbsduser>
restartable sequences
<netbsduser>
suppose you have to do something with some data that's replicated per core
<heat>
AnonHugePages: 550912 kB
<heat>
ShmemHugePages: 647168 kB
<heat>
FileHugePages: 129024 kB
<chiselfuse>
can you explain what "replicated per core" means?
<netbsduser>
you can't be having the CPU your thread is running on change out from under you before you've finished your work
<netbsduser>
but to stop that costs dearly
<chiselfuse>
netbsduser: so say i have a bunch of instructions that i want to execute. why would i care whether the cpu context switches to another process and comes back in the middle of executing them (assuming that's what you mean by "change out from under you before you've finished your work")
<netbsduser>
i mean the thread is moved to another cpu
<netbsduser>
so you can instead arrange to have a restartable sequence, meaning the work is carried on and if it's interrupted halfway through, it's noted by the kernel, and it restarts your sequence
linear_cannon has quit [Read error: Connection reset by peer]
<chiselfuse>
why would it matter if it's moved to another cpu? don't the register values and other information get loaded in the other cpu so the thread can continue executing there without noticing?
linear_cannon has joined #osdev
eddof13 has joined #osdev
<netbsduser>
because you want to access and work with data relevant to the cpu you're currently running on
<zid>
L1 cares
<zid>
L1 gets lonely
<nikolar>
:(
<heat>
yeah like
<heat>
say you have a cache of stuff per-cpu
<chiselfuse>
zid: so you're saying that it would still work if i just switched to the other core but that it cost a lot because i now would have different contents in L1?
<heat>
you'd use rseq to make sure you either access that cache or restart (using your new cpu's data)
<heat>
note: this also applies if you get preempted, because then some other thread might've run on that cpu and touched the cache
<zid>
yes
<zid>
nikolar is it monday yet
<zid>
(and if you move more than 1 core away, your L2 is probably empty too)
<chiselfuse>
can you give me one detailed example? a process can be executing a lot of instructions that read and write to memory and therefore if it changes all of a sudden to a different core, L1 would be different there and cause it to slow down. now with rseq, i can tell the kernel to have a certain sequence of userspace instructions either start and end fully without switching cores. and in case it gets
<chiselfuse>
preempted, to restart that sequence of instructions so the L1 cache is rebuilt on the new core? but that 1) doesn't make much difference since if it just carries on executing on the different core it will have to rebuild the cache in the same way. 2) how is restarting a sequence of partially executed instructions safe? i could've already incremented a value in memory and now i'll increment it again
<zid>
they don't partially execute ever
<zid>
either they executed or they didn't
<chiselfuse>
how are they "restartable" then?
<zid>
the cpu has buffers and register renaming and all sorts of tricks to *speculatively* do work
<netbsduser>
chiselfuse: if they don't execute to completion they are restarted
<zid>
and then it copies the results to the 'real' registers afterwards when it confirms they were correct
frkazoid333 has quit [Ping timeout: 276 seconds]
<netbsduser>
you might analogise it to an optimistic approach some database engines use
<netbsduser>
they also have "restartable sequences", transactions that, if stuff is modified while it's being applied, they roll back and retry until it is committed
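[Editor's sketch, not from the channel] The optimistic "try, and redo the work if something changed underneath you" shape from netbsduser's database analogy, shown with a C11 compare-and-exchange loop. This illustrates the analogy only; it is not how rseq is implemented, since rseq relies on the kernel noticing preemption or migration. The name optimistic_add is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Add delta to *counter without taking a lock. The new value is computed
 * from a snapshot and committed only if nobody changed the counter in the
 * meantime; otherwise the computation is restarted from the fresh value.
 * If retries is non-NULL it counts how many restarts were needed. */
long optimistic_add(_Atomic long *counter, long delta, unsigned *retries) {
    long seen = atomic_load(counter);

    for (;;) {
        long wanted = seen + delta;        /* work done on the snapshot */
        if (atomic_compare_exchange_weak(counter, &seen, wanted))
            return wanted;                 /* committed */
        /* Commit failed: seen now holds the current value, start over. */
        if (retries != NULL)
            ++*retries;
    }
}
```

Nothing becomes visible to other threads until the final compare-and-exchange succeeds, which is why redoing the computation does not double-apply the increment chiselfuse worries about.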
<heat>
chiselfuse, this has *nothing* to do with the cpu cache
<heat>
like, L1 should not even be in your vocabulary rn
<netbsduser>
what i don't actually know is whether any "normal" software is currently using these
<heat>
a simple example is a common optimization in malloc where you grab a bunch of memory at once (per thread or per cpu) and progressively take from it, locally, without locking
<netbsduser>
as in not database engines and other things with kernel-tier shared state and kernel-tier need to scale
<heat>
tcmalloc is using it, glibc ptmalloc uses it too i believe
<chiselfuse>
zid: okay, so the cpu will try to speculate which instructions will get executed in the future and execute them internally to later copy results into real registers/memory. where is the part that can get restarted in all this?
<zid>
almost none of it
<zid>
I'm saying that you can't be in a corrupt state
<zid>
either it's retired or it hasn't
<zid>
because future stuff is only done speculatively
<zid>
it retires in-order
<zid>
even if it was computed out of order
<chiselfuse>
heat: i don't understand your simple example
<heat>
ok so locking inter-cpu is bad and stuff, right?
<chiselfuse>
i don't understand what that means
<heat>
because you're serializing and that'll kill your perf
<heat>
dude
<heat>
you know what a lock is right?
<heat>
mutex?
<chiselfuse>
yep
<heat>
ok, and doing that is bad and slow, right?
<chiselfuse>
by locking inter-cpu, you mean that i stop every other process from using the cpu i'm using? can you give an example?
<heat>
ignore the inter-cpu bit
<heat>
locking is bad
<heat>
yes?
<chiselfuse>
what if i'm in a critical section, i need to execute without getting preempted
<chiselfuse>
why is it bad?
<heat>
because it's slow
frkzoid has joined #osdev
<heat>
if you have 8 threads all hammering a lock, you wont scale
<chiselfuse>
i guess that's a downside if my critical section can't use 100% of the cpu?
<heat>
no critical sections here
<heat>
so the basic observation is: if i keep data thread local, i won't need to lock and therefore it scales well
<heat>
so they did this for malloc. now malloc asks for more memory than it needs, and keeps it in the per-thread data
<heat>
and carefully hands it out
<heat>
the second observation is that keeping this thread local is sometimes too expensive if you have too many threads but not enough CPUs (this only applies to userspace, really). that's what rseq was built to solve, you can now keep data in per-cpu stuff and get notified if you were preempted in the middle of an rseq sequence
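[Editor's sketch, not from the channel] A toy C version of heat's first observation: keep a small cache of objects in thread-local storage so the common path needs no lock. It refills in batches from the shared allocator, which stands in for "asks for more memory than it needs". All names and sizes are hypothetical, and this is the per-thread variant only; a per-CPU cache would need rseq (or similar) to cope with preemption and migration in the middle of an operation.

```c
#include <assert.h>
#include <stdlib.h>

#define OBJ_SIZE    64   /* size of every object handed out */
#define CACHE_SLOTS 16   /* capacity of the per-thread cache */
#define BATCH        8   /* objects fetched per refill */

static _Thread_local void *cache[CACHE_SLOTS];
static _Thread_local int cached = 0;

/* Fast path: pop from this thread's cache, no locking needed.
 * Slow path: refill a whole batch from malloc(), the shared allocator. */
void *obj_alloc(void) {
    if (cached == 0) {
        while (cached < BATCH) {
            void *p = malloc(OBJ_SIZE);
            if (p == NULL) break;
            cache[cached++] = p;
        }
        if (cached == 0) return NULL;      /* out of memory */
    }
    return cache[--cached];
}

/* Return an object to this thread's cache, or to malloc if the cache is full. */
void obj_free(void *p) {
    if (p == NULL) return;
    if (cached < CACHE_SLOTS)
        cache[cached++] = p;
    else
        free(p);
}

/* Number of objects currently parked in the calling thread's cache. */
int obj_cached(void) {
    return cached;
}
```

The cost heat mentions is visible here: every thread holds on to up to CACHE_SLOTS idle objects, which adds up when there are far more threads than CPUs.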
MrCryo has quit [Remote host closed the connection]
eddof13 has quit [Quit: eddof13]
memset has quit [Remote host closed the connection]
memset has joined #osdev
Maja has quit [Quit: No Ping reply in 180 seconds.]
Maja has joined #osdev
Matt|home has joined #osdev
<Matt|home>
hi.
<kof673>
aloha
<Matt|home>
very quick question, just want a very rough estimate i don't care about the details i'll look it up myself: on a scale of 1 to 10 or whatever, how difficult is it writing an OS installer for a modern x86 system? obv UEFI install i mean. 10 being like, idk nethack blindfolded
<heat>
an installer is just a normal program
<Matt|home>
i kinda wanna design my own installer but have it look actually legit good. windows installer has best graphics obv, but it's not remotely verbose or elegant enough
<Matt|home>
it's too simplistic
GeDaMo has quit [Quit: 0wt 0f v0w3ls.]
<nikolar>
Depends on what os, for arch you basically do pacstrap and edit some config files
<nikolar>
For your own os, you'll have to figure out the details yourself
<Matt|home>
yeh
<nikolar>
But presumably at first, just copying a few files
* Matt|home
is tempted to use something like a 16k 10-minute video during the install.. nfi how many gigs XD
memset has quit [Remote host closed the connection]
<heat>
PACSTRAP!
memset has joined #osdev
<heat>
nikolar, do yall also have arch-chroot
<heat>
or did you rename it artix-chroot
<nikolar>
Artix
<heat>
lmfao
<nikolar>
And basestrap
<heat>
ew
<heat>
still pacman right?
<nikolar>
I am not sure why pacstrap was renamed
<nikolar>
As far as I know, pacman is completely unmodified
eddof13 has joined #osdev
eddof13 has quit [Client Quit]
<nikolar>
heat do you have a pacman port or something
<heat>
no, why
<nikolar>
Just curious
<nikolar>
No package managers then
eddof13 has joined #osdev
eddof13 has quit [Client Quit]
<heat>
yeah for now i have cp and tar and make install
<heat>
yeehaw
<heat>
i was thinking more towards rpm and not pacman
<nikolar>
Really
<nikolar>
Why's that
<heat>
pacman has some... weird choices
<nikolar>
Like what
<heat>
like data unsafety by default
<heat>
and they use the filesystem as the database
<heat>
like???
<nikolar>
Filesystem as the database?
<heat>
BASED choice on a btree fs, weird choice on anything else
<heat>
yeah the repo data for pacman is laid out using directories and files
<nikolar>
Are you sure, I feel like that's not true
<heat>
100%
<heat>
see /var/lib/pacman/local/
Turn_Left has quit [Ping timeout: 260 seconds]
<nikolar>
Eh oh well
<nikolar>
Works fine :P
eddof13 has joined #osdev
<heat>
the sync'd repos i guess have some file format, but i dunno what that is
<nikolar>
It's a tar of something
<heat>
yeah works fine-ish but it's all kind of weird data consistency wise
<nikolar>
Can't remember what it looks like exactly
eddof13 has quit [Client Quit]
<nikolar>
I've poked about the pacman dbs but I foegor
<nikolar>
Forgor
<heat>
💀
<nikolar>
Indeed
Matt|home has quit [Quit: Client closed]
<heat>
oh yeah it's a weird tar
<heat>
yikes its pretty much the same format as the local directory
<nikolar>
Guess that makes sense
renzei has joined #osdev
<kof673>
was it nikolar that said phoenix was a good name? make a phoenix pkg manager to carry away the repo elephant
<nikolar>
Lel
<heat>
just found a terrible memory leak in my slab allocator
<heat>
there was one edge case where i just forgor 💀 to actually free the object
<renzei>
hey gang and such - i tried setting up my PIT timer (ras syndrome i know lol) but it just triggers once instead of being a timer and it is ! not making any sense :(
<nikolar>
Just configure it the same every time it fires, you're welcome :P
<heat>
how are you configuring it
hwpplayer1 has joined #osdev
<renzei>
uhh, using asm outb to send 0x36 to 0x43, then the 1193182/100 in two parts to 0x40
<renzei>
i just followed a youtube tutorial tbh "^^ but i do understand what it's supposed to be doing
<renzei>
am i allowed to send github links here cause i got the source on there
<bslsk05>
renzei-z/operating-system - Just making a silly OS for fun :D (0 forks/0 stargazers)
<renzei>
i got osinted :(
<heat>
BANNED FOR SENDING LINKS
<heat>
GOODBYE
<renzei>
nooooooo !!!!!!
<renzei>
the timer code is in kernel/arch/i386/timer.c
<renzei>
i
<nikolar>
YOU'VE BEEN PWNED!1!1!
<renzei>
i also had a weird bug where a div by zero isr actually triggered the invalid opcode (isr6) for some reason - which may be related? i might have set up my isr/irqs wrong
<heat>
are you setting command mode 3?
<renzei>
i got l33ted and pwned... truly a sad day
<renzei>
command mode 3 as in square wave?
<heat>
yes
<renzei>
yeah i believe so
<renzei>
0x36
<heat>
yeah i'm sending 0x36 too
<renzei>
it *seems* to all be right but then it just.. doesn't work?
<renzei>
and then i also get the weird thing with my isrs where it fires the wrong one for the actual error/kernel panic it should be
<heat>
your PIC remapping code might be wrong
<heat>
also sounds like you're not EOI'ing the PIC?
<heat>
s/PIC/PIT/
<bslsk05>
<heat*> also sounds like you're not EOI'ing the PIT?
<renzei>
do you mean sending 0x20 to 0x20 to tell it that the interrupt is finished?
<zid>
that's the PIC
<renzei>
oh right i didn't see the sneaky sed in there
<zid>
The devices that are triggering interrupts will also need to be told to shut up
<heat>
yes 0x20 to 0x20
<zid>
else they'll just immediately retrigger the PIC after you reset it
<renzei>
yeah i send 0x20 to 0x20 in kernel/arch/i386/idt.c:171
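[editor's sketch] The "0x20 to 0x20" above is the 8259 PIC's end-of-interrupt command. Below is a minimal sketch of which command ports should receive it, assuming the conventional PC layout (master at 0x20, slave at 0xA0). `pic_eoi_ports` is a hypothetical helper made up for illustration, not code from renzei's repo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    PIC1_COMMAND = 0x20, /* master PIC command port */
    PIC2_COMMAND = 0xA0, /* slave PIC command port */
    PIC_EOI      = 0x20  /* non-specific end-of-interrupt command byte */
};

/* Fill `ports` with the command ports that should receive PIC_EOI for the
 * given IRQ line (0-15) and return how many entries were written.
 * IRQs 8-15 arrive through the slave, so both chips need the EOI;
 * IRQ0 (the PIT) only needs the master. */
size_t pic_eoi_ports(uint8_t irq, uint16_t ports[2])
{
    size_t n = 0;

    assert(irq < 16);
    if (irq >= 8)
        ports[n++] = PIC2_COMMAND;
    ports[n++] = PIC1_COMMAND;
    return n;
}
```

In a kernel the IRQ handler would then do `outb(port, PIC_EOI)` for each returned port before returning.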
<heat>
the ez way to figure this out: open qemu's monitor, do info pic
<zid>
thinking you never bothered to handle the interrupt at all
<renzei>
so wait am i supposed to be sending something to the PIC or the PIT
<renzei>
cause i already think that i'm handling EOI for PIC
<heat>
PIC
<zid>
does the PIT do level or edge?
<renzei>
heat i'm already EOIing the PIC at the end of the irq handler
<renzei>
wym level or edge? as in the kind of wave?
<zid>
I mean, in electrical terms, yes?
<heat>
do info pic
<renzei>
where do i do info pic - in gdb?
<zid>
It's whether the device signals the interrupt by giving you a blip, or by just wedging itself high until told to stop
<zid>
in the latter case, you need to.. tell it to stop
<heat>
qemu monitro
<heat>
s/monitro/monitor/
<bslsk05>
<heat*> qemu monitor
<zid>
else it'll just hold it high forever and keep retriggering the PIC
<zid>
never really messed with the PIT much, maybe I should look it up
<renzei>
stupid question - how do i access the qemu monitoer
<renzei>
monitor*
<zid>
-monitor stdio
<renzei>
that makes sense
<renzei>
which bit of the info pic am i supposed to be looking at here
<zid>
yea pit is all edge triggered
<heat>
pastebin it
<renzei>
one second
<zid>
so it should just be a case of telling the PIC that you're done, so that you get future interrupts, and the PIT can be left alone
<renzei>
after i started the os i got everything zero except pic1 irq_base=28 and elcr=0c, and pic0 irr=03 and irq_base=20
<zid>
single step through your PIC ACK outb's and make sure they update info pic in a way that makes sense
<renzei>
alrighty - gimme a min
<renzei>
single step through the PIC initialisation or the resetting after an irq?
<zid>
the latter
<renzei>
gotcha
<zid>
IRR should tell you what's pending, if it never clears then you have a problem, if it clears but you get another interrupt.. then the pit just sent you another interrupt
<zid>
but you should get some instructions inbetween, unless you set a really silly mode on the PIT, I hope!
<renzei>
okay so - pic goes to irr=01, isr=01 when the timer fires, and then irr stays 01, and isr goes back to 00 when it hits the end of the irq handler
<renzei>
but then the timer never fires again
<zid>
sounds like you set the PIT to a oneshot mode, or a really really long period then?
<heat>
run qemu with -d int and post the output
<renzei>
hmm - i mean, the PIT seems to be set to the right mode given the resources i was using, and the period doesn't seem to affect it, since i've tried changing the freq i send to the PIT, as well as leaving it running for like an hour
<zid>
what mode are you using?
<renzei>
for the PIT?
<zid>
yea
<renzei>
wait - i quit gdb and it ticked again, but that's all i did wtf
<renzei>
@zid i'm sending 0x36 to 0x43 for the PIT
<renzei>
which is mode 3 as far as i can tell
<renzei>
(square wave)
<zid>
oh are we twitter now
<nikolar>
@zid apparently
<renzei>
this is bullying.. i am a new irc user okay !!
<zid>
@nikolar don't forget to renew your subscription to my onlyfans
<renzei>
if you try running the os locally does it also not tick there? cause it could be an issue with my qemu not managing to set up the right ports and stuff idk
<zid>
and just to super check, you sent 0x36 to 0x43, and not 0x43 to 0x36? :P
<renzei>
yeah, 16bit binary, lh, and channel 0
<renzei>
yes, 0x43 is the port, and 0x36 is the value sent there
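[editor's sketch] To double check what 0x36 means, the command byte can be split into its fields. This is a minimal sketch assuming the usual 8253/8254 command byte layout; the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of the byte written to the PIT mode/command port (0x43). */
typedef struct {
    uint8_t channel; /* bits 7-6: counter select (0 = channel 0) */
    uint8_t access;  /* bits 5-4: 0 = latch, 1 = lobyte, 2 = hibyte, 3 = lobyte then hibyte */
    uint8_t mode;    /* bits 3-1: operating mode (3 = square wave) */
    uint8_t bcd;     /* bit 0: 0 = 16-bit binary, 1 = BCD */
} pit_command;

pit_command pit_decode_command(uint8_t cmd)
{
    pit_command c;

    c.channel = (uint8_t)((cmd >> 6) & 0x3);
    c.access  = (uint8_t)((cmd >> 4) & 0x3);
    c.mode    = (uint8_t)((cmd >> 1) & 0x7);
    c.bcd     = (uint8_t)(cmd & 0x1);
    return c;
}
```

For 0x36 this gives channel 0, lobyte/hibyte access, mode 3, binary, which matches what renzei describes.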
<zid>
and what reload bytes?
<renzei>
where i have void outb(uint16_t port, uint8_t val) { asm volatile("outb %1, %0" : : "dN" (port), "a" (val)); }
<renzei>
reload bytes? is that the stuff sent to 0x40?
<zid>
yea
<zid>
the way these timers work is that they count, hit 0xFFFF or 0x0 depending which way they go, then load their reload value
<zid>
and then it starts counting again, from there
<renzei>
i have divisor which is the Hz/freq (1193182 / freq) and i send (divisor & 0xFF) and then (divisor >> 8) & 0xFF as the high bytes
<renzei>
byte*
<zid>
so if you set a really small value it goes 3 2 1 0 (interrupt) (reload) 3 2 1 0 (interrupt reload) and is fast, or a big value and it's slow
<renzei>
yeah i tried sending like.. 20 and it didn't trigger again
<zid>
whether this one counts up or down or does strange things to the value idk
<renzei>
i still don't know how to reference people here so @heat here is the -d int that i completely missed:
<renzei>
anyway the os starts after selecting in grub after the spam of servicing hardware INT=0x08
<nikolar>
No, zid, you suck
<renzei>
um akshaully i atted him
<zid>
Then you get a vector=20 once, right at the bottom, which is your IRQ0 and thus the timer, which looks fine
<renzei>
yeah, and then no more output
<renzei>
i waited like 30 seconds-ish
<renzei>
i'm so confused because everything seems right and like it should work.. and then it doesn't!!
<zid>
what reload values do you actually send?
<renzei>
lowkey no idea because i haven't added printing numbers to my printf implementation yet "^^
<renzei>
lemme uh.. gdb it one sec
<zid>
breakpoint th- yea
<nikolar>
Lel
<renzei>
divisor is 11931, so i send outb(0x40, 11931 & 0xFF) and outb(0x40, (11931 >> 8) & 0xFF)
<zid>
why are you making me do math
<renzei>
so.. 155 then 46 (decimal)
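[editor's sketch] The arithmetic above can be checked without a debugger. This is a minimal sketch assuming the 1193182 Hz base frequency quoted in the chat; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_BASE_HZ 1193182u

/* Reload value for a target interrupt frequency. Below 19 Hz the quotient
 * no longer fits in the PIT's 16-bit counter, so this sketch rejects it. */
uint16_t pit_divisor(uint32_t hz)
{
    assert(hz >= 19 && hz <= PIT_BASE_HZ);
    return (uint16_t)(PIT_BASE_HZ / hz);
}

/* The two bytes are sent to port 0x40 low byte first, then high byte. */
uint8_t pit_divisor_lo(uint16_t divisor) { return (uint8_t)(divisor & 0xFF); }
uint8_t pit_divisor_hi(uint16_t divisor) { return (uint8_t)((divisor >> 8) & 0xFF); }
```

For 100 Hz the divisor is 11931 (0x2E9B), so the low byte is 0x9B (155) and the high byte is 0x2E (46).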
<zid>
just look at what's in AL
<zid>
also it introduces mistakes
<zid>
there's an out dx, al you can stepi to from those breakpoints surely
<renzei>
yes there is - which i'm doing rn :p
vdamewood has joined #osdev
goliath has quit [Quit: SIGSEGV]
<zid>
There's some assumption somewhere we've gotten wrong
<zid>
that the outb code works, that the PIT is not made of cheese, etc
<zid>
so it's fine-toothed comb time to try find what
<zid>
either that or we bust out the big guns
<renzei>
0x36 first to the 0x43, then 0x9b (-101??) then 0x2e (46) to 0x40
<renzei>
so what i assumed to be 155 is actually apparently -101 according to gdb
<renzei>
but that might just be an error in displaying hex as decimal
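[editor's sketch] The -101 is the same byte as 155, just shown as a signed 8-bit value. A small sketch of that reinterpretation follows; `as_signed_byte` is a made-up name.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of an unsigned byte as a signed one, which is what a
 * debugger does when it prints an 8-bit register with a signed format. */
int8_t as_signed_byte(uint8_t raw)
{
    int8_t s;

    memcpy(&s, &raw, sizeof s); /* copy the bit pattern unchanged */
    return s;
}
```

0x9B has its top bit set, so it reads as 155 - 256 = -101 when signed. Printing with a hex or unsigned format (for example gdb's `p/x`) sidesteps the confusion.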
<zid>
That seems big
<zid>
0x9b is a lot
<renzei>
i mean, the value i'm trying to send is on the order of 10000 which is a lot
<zid>
oh it's actually 0x2e right, which is sort of a lot?
<zid>
0x2e.9b time units
<renzei>
yeah i believe so
<renzei>
no idea if big or little ending tbh
<renzei>
s/ending/endian
<zid>
hardcode it to outb(0x40, 0); outb(0x40, 10); for now?
<zid>
fuck math
<zid>
we don't need mathematics where we're going
<renzei>
i did that.. alas, no ticking
<renzei>
i think it's an issue in the isr/irqs themselves, as there's also that issue of the wrong isr being called when an exception/panic happens
<renzei>
which *could* be related
<zid>
okay now make main a program that reads the latch in a loop and copies the data to the serial port tx and -serial file:log.txt in qemu? :P
<zid>
(slightly bigger gun)
darkstardevx has joined #osdev
<renzei>
the latch?
<renzei>
you're gonna have to explain a little more in detail what you mean (this is my first time trying to os dev)
<zid>
When the latch command has been sent, the current count is copied into an internal "latch register" which can then be read via the data port corresponding to the selected channel (I/O ports 0x40 to 0x42)
Left_Turn has joined #osdev
<zid>
You send it a command and it copies the current state to a latch
<zid>
that's what 0x40 reads
<renzei>
so what am i putting in kernel_main? reading 0x40-0x42 and outputting them to a serial port?
<renzei>
(sorry if i'm slow - i'm also a little drunk :p)
<zid>
write to 0x40, read from 0x40, write it to.. 0x3F8?
<zid>
if memory serves
<zid>
you'll wanna outb "meow" or something first to make sure it works
<renzei>
okay i gotta implement inb real quick
<zid>
then add the 'reading from the latch' part on top
<renzei>
so basically - while (1) { uint8_t res = inb(0x40); outb(0x3F8, res); }
<renzei>
?
<zid>
you need to write to 0x40 first
<zid>
to get it to latch the deets
<renzei>
yee, i mean after calling timer_init
<zid>
no
<zid>
in the loop
<renzei>
isn't the loop after timer_init
<zid>
You want a new value every time, not the old stale one
<renzei>
okay i see
<renzei>
what am i sending to 0x40 every loop iteration?
<zid>
writing to it tells it "copy the info to the latch" reading is "read the latch"
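[editor's sketch] One caveat on the ports: on the 8253/8254 the latch command is written to the mode/command port 0x43 (access bits 00), and only the read goes to the channel's data port 0x40. Writing to 0x40 would load a reload byte instead. Below is a minimal sketch of a latched read of channel 0, assuming lobyte/hibyte access as set up by 0x36. The `port_io` indirection and the `fake_*` backend are hypothetical, added only so the logic can be tried in userspace; in a kernel the real outb/inb would be plugged in.

```c
#include <assert.h>
#include <stdint.h>

enum {
    PIT_CH0_DATA  = 0x40, /* channel 0 data port */
    PIT_COMMAND   = 0x43, /* mode/command port */
    PIT_LATCH_CH0 = 0x00  /* channel 0, access bits 00 = latch count */
};

typedef struct {
    void    (*outb)(uint16_t port, uint8_t val);
    uint8_t (*inb)(uint16_t port);
} port_io;

/* Latch channel 0's current count, then read it back low byte first. */
uint16_t pit_read_count(const port_io *io)
{
    assert(io && io->outb && io->inb);
    io->outb(PIT_COMMAND, PIT_LATCH_CH0);
    uint8_t lo = io->inb(PIT_CH0_DATA);
    uint8_t hi = io->inb(PIT_CH0_DATA);
    return (uint16_t)(lo | ((uint16_t)hi << 8));
}

/* Fake backend (illustration only): pretends to be a PIT whose counter
 * currently holds fake_count. */
uint16_t fake_count;
uint16_t fake_latch;
uint16_t fake_last_port;
int      fake_reads;

void fake_outb(uint16_t port, uint8_t val)
{
    fake_last_port = port;
    if (port == PIT_COMMAND && (val & 0x30) == 0) {
        fake_latch = fake_count; /* freeze the current count */
        fake_reads = 0;
    }
}

uint8_t fake_inb(uint16_t port)
{
    if (port != PIT_CH0_DATA)
        return 0xFF;
    return (fake_reads++ == 0) ? (uint8_t)(fake_latch & 0xFF)
                               : (uint8_t)(fake_latch >> 8);
}
```

Calling `pit_read_count` inside the loop, rather than once before it, is what yields a fresh value on every iteration.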
vdamewood has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
Turn_Left has quit [Read error: Connection reset by peer]
torresjrjr has quit [Ping timeout: 265 seconds]
torresjrjr has joined #osdev
spareproject has quit [Remote host closed the connection]
Matt|home has joined #osdev
Matt|home has quit [Client Quit]
Matt|home has joined #osdev
<heat>
nice maple tree is working
<nikolar>
That was fast
<heat>
the api itself is pretty well designed and easy to use
<heat>
as for maple tree deps itself, i already have most of them
<heat>
linux _is_ an onyx clone after all
<nikolar>
Did you already have the maple tree implemented
<nikolar>
I thought you were writing it right now
<heat>
oh i'm not writing a maple tree, i took linux's
<heat>
the maple tree is hugely complex and awful lol
<heat>
almost 8KLOC
<nikolar>
Oh you just took the whole thing
<nikolar>
Lame
<heat>
fair
<nikolar>
Lol
<heat>
i have little interest in spending ages writing one, i really just want to use it
<heat>
i got to look at a horrible SLAB api too
<Matt|home>
hi..
<heat>
kmem_cache_alloc_bulk takes a size_t, it's supposed to *always* return that argument, and returns 0 on error
<heat>
the best bit is that the return type is an int, not a size_t
<nikolar>
Lel
<nikolar>
Why
<heat>
i dunno, i tried searching on lore but couldn't figure it out
<nikolar>
Maybe just an oversight
<nikolar>
Who needs to allocate more than 2gb at once anyway
<heat>
the current impl does not check for arg > INT_MAX, so if you ever get linux to allocate more than INT_MAX there... you get it to error out erroneously
<nikolar>
Heh nice
<nikolar>
Someone should report that
<heat>
hmm
<nikolar>
What
<heat>
int < size_t, do they promote to size_t or do they demote size_t to int?
<heat>
i think they demote the size_t?
<nikolar>
What do you mean by int < size_t
<nikolar>
A normal comparison?
<nikolar>
int should be promoted to size_t iirc
Burgundy has quit [Ping timeout: 260 seconds]
xenos1984 has quit [Read error: Connection reset by peer]
<heat>
oh, the loop will never end then
<nikolar>
Heh nice
<nikolar>
Yeah, no bueno
cultpony_ has joined #osdev
cultpony has quit [Ping timeout: 246 seconds]
cultpony_ is now known as cultpony
<nikolar>
Btw a rule of thumb, C never demotes types to smaller sizes
<nikolar>
Implicitly
<nikolar>
In binary ops
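[editor's sketch] A small illustration of the `int < size_t` comparison, assuming a common platform where size_t is at least as wide as int, so the usual arithmetic conversions turn the int into a size_t. The function names are made up. The cast in `naive_less` is written out only to keep the sketch warning-free; a plain `i < n` converts the same way implicitly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same result as writing `i < n` directly: the int is converted to size_t,
 * so a negative value wraps to a huge unsigned number. */
bool naive_less(int i, size_t n)
{
    return (size_t)i < n;
}

/* Handle the negative case first, then compare as unsigned. */
bool safe_less(int i, size_t n)
{
    return i < 0 || (size_t)i < n;
}
```

With these, `naive_less(-1, 1)` is false even though -1 is mathematically less than 1, while `safe_less(-1, 1)` is true.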
antranigv_ is now known as antranigv
<heat>
i have realized it not only will never end, but the int will underflow
<heat>
fuck me
<nikolar>
Yes, the int will underflow
<nikolar>
Report that
<heat>
i'll send a patchen
<nikolar>
Do indeed send patchen
<nikolar>
Though probably not a bad idea to report first and see the feedback