<heat>
you *basically* can think of device interrupts as some device asserting the INT# pin on the CPU
<heat>
CPU sees INT# is asserted and starts dispatching the interrupt (if it can. if interrupts are disabled, it'll do it as soon as they're enabled)
<heat>
dispatching the interrupt usually involves switching/saving a couple of registers and jumping to an address the kernel already set
<adder>
Would this be ISR address?
<heat>
yes
<heat>
then the kernel usually needs to save the rest of the registers, yadda yadda, jumps to actual C, the C actually handles the interrupt and does whatever
<heat>
then it undoes everything and the CPU starts executing right where you were
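A rough sketch in C of the flow heat describes (riscv-flavored; every name here is invented for illustration, not taken from Onyx):

```c
/* Frame the low-level asm stub fills in on entry; layout is arch-specific. */
struct trap_frame {
    unsigned long gpr[31]; /* x1-x31, saved by the asm stub */
    unsigned long epc;     /* PC to resume at */
    unsigned long cause;   /* IRQ number or fault code */
};

void handle_irq(struct trap_frame *tf);       /* ack controller, run driver */
void handle_exception(struct trap_frame *tf);
int  cause_is_interrupt(unsigned long cause);

/* The asm stub's address is what the kernel programs into the trap vector
 * at boot (stvec on riscv, an IDT entry on x86). The stub saves every
 * register into a trap_frame and calls here: */
void trap_dispatch(struct trap_frame *tf)
{
    if (cause_is_interrupt(tf->cause))
        handle_irq(tf);
    else
        handle_exception(tf);
    /* on return, the stub restores all registers from *tf and executes
     * sret/iretq, so the CPU resumes exactly where it was */
}
```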
<adder>
Yeah.
<adder>
Thanks, heat, that's helpful. I'll add that to my note.
<bslsk05>
github.com: Onyx/kernel/arch/riscv64/interrupts.S at master · heatd/Onyx · GitHub
<clever>
i had to implement interrupts from scratch before, and i was lazy/paranoid, and just always save all registers upon entering the ISR
<clever>
even though i know that gcc saves/restores most on the stack
<heat>
clever, you should save all the regs
<clever>
ah, so i'm not just paranoid
<clever>
i was thinking you can get away with just saving the clobbered regs
<clever>
and the normal function prelude would save/restore the rest
<heat>
adder, see the link above, it's my asm code for riscv interrupt handling, it should help
<heat>
clever, technically you are, but having a well defined trap stack frame is really useful IMO
<heat>
if nothing else, for debugging
<clever>
yeah, i use the same save-all-regs code for both normal interrupts, and fatal interrupts
<clever>
so the asm doesn't have to care if something is fatal or not, just save everything
<clever>
C can then decide if it's fatal, and print them all
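A sketch of the arrangement clever describes, reusing the trap_frame above: one save-all entry path for every trap, with C deciding afterwards whether it is fatal (cause_is_fatal, printk, and panic stand in for whatever the kernel provides):

```c
void trap_common(struct trap_frame *tf) /* same entry for normal and fatal */
{
    if (cause_is_fatal(tf->cause)) {
        /* the asm saved everything, so the panic path can dump the
         * complete register state */
        for (int i = 0; i < 31; i++)
            printk("x%d = %016lx\n", i + 1, tf->gpr[i]);
        panic("fatal trap: cause=%lx epc=%lx", tf->cause, tf->epc);
    }
    trap_dispatch(tf); /* ordinary interrupt/exception path */
}
```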
<gog>
does anybody know how to make qemu use the right keymap
<gog>
-k does nothing
<gog>
LANG=is_IS.UTF-8 too
<heat>
wdym
<gog>
idk, it's probably because my locale is mixed
<heat>
i mean, what exactly are you talking about with keymap
<gog>
but the keymap is wrong. say i press Æ on my keyboard
<adder>
heat: Can you confirm: "When the processor receives an interrupt, it responds by initiating a "trap," which is a kind of forced function call into a specific part of the operating system called the interrupt handler or trap handler."
<gog>
i get : in qemu
<gog>
qemu thinks i have an ANSI keymap
<heat>
but is "qemu" the "qemu monitor" or "the emulated OS"
<gog>
the emulated os
<heat>
you have to switch it in the emulated OS
<gog>
:'(
<heat>
adder, the terminology is a bit loose but sounds good
<zid>
does qemu even give a shit about 'keymaps'
<zid>
or is it just forwarding scancodes
<gog>
forwarding scancodes
<zid>
it'd be weird if it were decrypting characters BACK to scancodes
<gog>
because the location of the keys does correspond correctly
<zid>
so that it could provide them back to ps/2 that way
<gog>
to ANSI
<zid>
Like, it'd be an interesting mode I guess, to allow you to seamlessly use the wrong keyboard for your client's locale
<heat>
honestly i don't know. can you get the scancode out of Xorg keypresses (etc)?
<gog>
maybe i'll spend the obscene amount to get a new top cover for my laptop with the ANSI keyboard
<heat>
wait. obviously, you always get the scancode
<zid>
yea, it's delivered as scancodes isn't it, xmodmap or whatever converts it
<heat>
yeah
<zid>
xkbcomp now apparently?
<zid>
Anyway, I feel like killing another industry
<heat>
i'd like to interject for a moment
<heat>
i want to tell the world i love my cat to bits
<zid>
can we kill the interjection industry gog, we're millennials we have the power
<adder>
heat: Your code makes sense now, thanks.
<heat>
i don't know what my cat has to do with my code, but np
<adder>
(well, as a whole)
<gog>
no, i like interjections, it puts medicine in my muscle or beneath my skin or into my bloodstream
<zid>
I need more
<heat>
gog are you into doping
<zid>
steel
<heat>
or are you natty?
<zid>
gog is super doped sadly
<gog>
yes
<gog>
it's the only way to get the bod i got
<zid>
also there's something wrong with my nuclear
<zid>
pumping speed zero, low input fluid, wut
<zid>
WHY
<gog>
what
<zid>
oh ffs one random pipe join is too far away
<gog>
are you using the right ratio of offshore pump o
<zid>
It shows I sort of need to add another couple of columns though
<zid>
if one water pipe being lazy can knock it out
<heat>
adder, if you want to follow the interrupt path, it's interrupts.S -> traps.cpp -> plic.cpp (returns to traps.cpp) -> irq.cpp -> whatever driver -> irq.cpp -> plic.cpp and then unwinds
<heat>
hopefully you can follow the function calls on your way there, it's pretty simple
<heat>
everything you mentioned about processes and "RUNNING" is somewhat of a sideshow
<adder>
heat: Yeah, although I'm still going breadth-first, but I'm pretty sure I'll be heavily referencing onyx on my way.
<heat>
great idea, i should write a book
<heat>
Victim: How zid terrorised me and my family for 8 years
<heat>
then something about a kernel idk
<zid>
adder why are you asking heat how interrupts work, did you finish your hello world?
<adder>
zid: I'm trying to get additional context as what I'm reading is unclear to me.
<zid>
Reading is fun, but if it just means you end up asking questions imo you've read waay too far ahead
<adder>
No. I'm pretty sure this will lead to code.
<zid>
I'm fully expecting you to get stuck 20 steps prior
<zid>
then 17 steps prior, etc
<adder>
:)
<gog>
hi
<adder>
Hello, gog.
<heat>
zid's talking about a "Hello, world" but if you want to do gog then that's fine too
<adder>
Do in gog? I'm not a killer.
<heat>
actually, i just said do
<heat>
don't rizz up gog like that
<adder>
I'm a male, heat.
<zid>
heat why did you suggest he suddenly have sex with gog
<Mutabah>
wtf is going on here today?
<heat>
i referenced K&R
<zid>
no, you meant to
<zid>
'do gog' means something else in english
<zid>
you meant do a "hello, gog" instead
<heat>
do means exactly what you want it to mean
<gog>
i don't consent
<zid>
heat is a rapist confirmed
<heat>
unless "do the dishes" now means "fuck the dishes"
<zid>
They're just stuck waiting for some guys from the other side of the world to come over and remove a tree
<gog>
hahaha yeah
<Mutabah>
your spider-bots have a bug or two :)
<gog>
i think factorio 2.0 is going to fix that
<zid>
thank fuck
<gog>
my #1 fix is track alignment
<gog>
can't wait for that
<zid>
yea not having to do tracks on evens will be good
<gog>
and elevated rail
<zid>
handy for not having to place signals, but so far for me it wouldn't be THAT useful
<gog>
yeh
<zid>
Is there an easy way to set up like, resource outposts for bots
<zid>
I guess circuits would do it fairly easily
<zid>
active provider chest, with an arm feeding it from a requester on a circuit
<geist>
hello
<adder>
Hi, geist.
* geist
waves
<geist>
my weekend has officially begun. taking the next two days off
<geist>
yay
<adder>
My whole life is one long weekend.
<geist>
SGTM!
<heat>
ok geist@
<zid>
geist: I recommend chicken madras and a pint of bitter
<geist>
mmm not bad
<heat>
are there many disk controllers (etc) out there that have individual disks sharing a queue?
<heat>
the only one i can think of is IDE... AHCI, NVMe, virtio-blk are more sensible and have at least a queue per disk/port
<clever>
heat: a few months back, i figured out why 1 disk in my 3 disk zfs array was always under-performing, the bloody bios had it in IDE emulation mode
<heat>
:D
<clever>
but only 1 sata port, the others all ran in sata mode, lol
<clever>
and i have to wonder, was that one controller acting as both ide and sata? how was it managing the multiple command flows?
<geist>
well the ports on an ahci controller are pretty logically separate, with their own queues and whatnot
<geist>
so it doesn't seem too difficult to just have it not show up on AHCI port as being there, but then having it respond on IDE legacy bits
<zid>
yea they're just little multicontroller things that just go drives <-> controller <-> interfaces
<zid>
and it can map various things to various things
<geist>
AHCI actually has a fair amount of low level ATA offload built into it
<geist>
so it's not entirely stupid either
<mjg>
:)
<mjg>
fuck legacy stuff
<mjg>
embrace new stuff
<mjg>
like UEFI
<mjg>
wait
<Mondenkind>
well you know what comes after embrace new stuff
<kof123>
+moonchild the "fuck legacy stuff" is the father of the "embrace new stuff" the "embrace new stuff" is the father of the "fuck legacy stuff"
<kof123>
the simurgh is with you lol
<Mondenkind>
who's moonchild
<Mondenkind>
never heard of it
<Mondenkind>
anyway embrace new stuff is clearly the precursor to extend new stuff...
<kazinsal>
ironically microsoft seems to be more on the "embrace -> extend -> throw it on github" train these days more than anything else
<kazinsal>
the traditional triple-E is now the domain of hardware vendors that are running out of hardware tricks to sell in the era of on-the-fly-programmable ASICs etc
<gog>
windoze
<gog>
i'm using windows rn
<gog>
it's terrible
<kazinsal>
I believe the kids these days call that a skill issue
<gog>
:'(
* kazinsal
pets gog
<nikolar>
Poor gog, getting bullied on the internet
<kazinsal>
as with most catgirls, you just need to make up for it by giving her headpats and catnip and the occasional "good girl"
<zid>
That microblow winsucks guy really gets ragged on
<zid>
if he kills himself are we culpable?
<Cindy>
pypy doesn't have int.to_bytes implemented
<Cindy>
ffs, i had to use struct.pack as a hack
<nikolapdp>
what are you writing CIndy
<nikolapdp>
*Cindy
<Cindy>
python script that generates opcode functions (instruction + size + EAs) from a template
<Cindy>
for a m68k emulator
<nikolapdp>
opcode functions as in functions that emulate a single opcode?
<Cindy>
no
<nikolapdp>
what then
<Cindy>
for each size and EAs per instruction, there'll be a function that has the same code from the template, but with stuff like M68K_EA_OP replaced with the macro specific to that EA and size
<Cindy>
like M68K_EA_OP -> M68K_EA_DN_8, if ea is dn and size is 8
<Cindy>
the template code has something like M68K_OPCODE(ORI, 8) {\n M68K_EA_TYPE dst = M68K_EA_OP(0);\n }\n
<Cindy>
for example
<Cindy>
the 0 is the EA index in the instruction, there are instructions like MOVE that have multiple EAs
<Cindy>
nikolapdp: am i making sense?
<Cindy>
it is a group of opcodes that fall under one function
<Cindy>
the rest of the opcode config doesn't matter, just EA and size
<Cindy>
you see how small it is compared to the output C code?
<nikolapdp>
yeah definitely
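To make the template idea concrete, here is roughly what one generated instance could look like in C. The M68K_* macro naming follows Cindy's description; the struct, the macro bodies, and the ORI example are invented (condition-code updates omitted):

```c
#include <stdint.h>

struct m68k { uint32_t d[8]; /* data registers (sketch) */ };

/* hypothetical per-EA/size accessors the generator substitutes wherever
 * the template says M68K_EA_OP (here: EA = Dn, size = 8): */
#define M68K_EA_DN_8_GET(cpu, n)    ((uint8_t)(cpu)->d[n])
#define M68K_EA_DN_8_SET(cpu, n, v) \
    ((cpu)->d[n] = ((cpu)->d[n] & ~0xffu) | (uint8_t)(v))

/* one generated instance: ORI.B #imm,Dn (immediate fetch elided) */
static void m68k_op_ori_8_dn(struct m68k *cpu, uint16_t opcode, uint8_t imm)
{
    unsigned n = opcode & 7;                /* EA register field */
    uint8_t dst = M68K_EA_DN_8_GET(cpu, n); /* was M68K_EA_OP(0) */
    M68K_EA_DN_8_SET(cpu, n, dst | imm);
}
```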
<Cindy>
it'll also generate a lookup table too
<Cindy>
a 65535 x <target pointer size> byte table
<Cindy>
well 65536*
<nikolapdp>
that's a large table
<Cindy>
in a 64-bit machine, it'll be 524KB
<Cindy>
well machine with 64-bit pointers
<Cindy>
in a machine with 32-bit pointers, it'll be 262KB
<nikolapdp>
yeah at least fits in l2
<Cindy>
this is the same technique other M68K emulators do
<Cindy>
like musashi
<nikolapdp>
interesting
<Cindy>
they generate a huge lookup table
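The dispatch that goes with such a table is then a single indexed indirect call per instruction; a sketch (the fetch helper and how the table gets filled are assumed):

```c
typedef void (*m68k_op_fn)(struct m68k *cpu, uint16_t opcode);

/* one pointer per 16-bit opcode pattern: 65536 * 8 bytes = 524 KB with
 * 64-bit pointers, 262 KB with 32-bit ones */
static m68k_op_fn m68k_optable[65536];

uint16_t m68k_fetch16(struct m68k *cpu); /* hypothetical fetch helper */

static void m68k_step(struct m68k *cpu)
{
    uint16_t opcode = m68k_fetch16(cpu);
    m68k_optable[opcode](cpu, opcode); /* one indirect call per instruction */
}
```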
<nikolapdp>
do you intend on adding a jit like qemu
<Cindy>
dynarec?
<Cindy>
yes
<Cindy>
but interpreter is just there for environments that don't like dynamic code
<Cindy>
for security and safety reasons
<Cindy>
like MISRA C
<nikolapdp>
or architecthres you don't have a jit for yet
<Cindy>
yes
<Cindy>
actually i wanna tell you something nikolapdp
<nikolapdp>
go ahead Cindy
<Cindy>
should i have the pointers point to a struct containing 2 functions
<Cindy>
one, disassemble
<Cindy>
the other, execute
<nikolapdp>
that should probably be a second table if you ask me
<nikolapdp>
you're already under a lot of cache pressure because of how big it is
<Cindy>
i'll probably just make it a function
<Cindy>
not a table
<nikolapdp>
same idea though
<nikolapdp>
split it
<Cindy>
no
<Cindy>
i meant make the disassemble thing a function, not a part of any table
<Cindy>
just one function
<Cindy>
if that works i guess
<nikolapdp>
yeah
<Cindy>
disassemble function is not executed as often as the execution function for an instruction
<Cindy>
so it doesn't need a massive table
<Cindy>
or its place in a cache
<Cindy>
hell, it's only executed if the user decides to disassemble the program
<nikolapdp>
yeah exactly
<Cindy>
nikolapdp: can a 524KB table really fit in a CPU cache?
<nikolapdp>
it can fit into l2
<nikolapdp>
in theory at least
<nortti>
yeah looks like newish processors can have 1 MiB of per-core L2 cache
<nikolapdp>
still not great
<nikolapdp>
wonder if it could be compressed in any way
<nortti>
looks like my CPU has 256 KiB per core of L2
<Cindy>
my CPU has 256 KiB too
<Cindy>
per core of L2
<nortti>
what model is that? I'm rocking an i3-2370M
<Cindy>
i5-2540M
<nortti>
mm, so similar age too, then
<nikolapdp>
mine has 512KiB per core
<nikolapdp>
zen 2
<Cindy>
wonder if i could split the table into 2
<Cindy>
match first 4 bits
<Cindy>
i mean
<nikolapdp>
i can never remember how zen 2 shares cache
<Cindy>
the table will match first 4 bits, and then the pointer will go to opcodes that have the same 4 bits
<Cindy>
which is 32KB * 16
<Cindy>
in a machine with 64-bit pointers
<Cindy>
and the first table will be 128 bytes
<Cindy>
because 16 * 8 bytes (64-bit pointer)
<Cindy>
nikolapdp: is that more room for the cache?
<nikolapdp>
what do you mean by > the pointer will go to opcodes that have the same 4 bits
<Cindy>
16 * <pointer sizes> bytes table of pointers that matches the first 4 bits -> each pointer -> table of opcode functions of the opcodes that have the same first 4 bits
<Cindy>
some entries in the 16-entry pointer table will be NULL, because they aren't taken in m68k
<Cindy>
so it'll be less than 524KB
<nikolapdp>
so you have another level of tables for the rest of the opcode?
<Cindy>
yes
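A sketch of that two-level split, continuing the table sketch above (sub-table shape assumed; NULL marks prefixes m68k never uses):

```c
/* first level: 16 pointers indexed by the opcode's top 4 bits */
static m68k_op_fn *m68k_optable_l1[16]; /* NULL where no opcodes exist */

void m68k_illegal(struct m68k *cpu, uint16_t opcode);

static void m68k_step2(struct m68k *cpu)
{
    uint16_t opcode = m68k_fetch16(cpu);
    m68k_op_fn *sub = m68k_optable_l1[opcode >> 12];
    if (!sub)
        m68k_illegal(cpu, opcode);         /* whole prefix is a gap */
    else
        sub[opcode & 0x0fff](cpu, opcode); /* 4096-entry sub-table, 32 KB each */
}
```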
<nikolapdp>
sounds like it should help
<nikolapdp>
though another memory indirection isn't ideal lol
<Cindy>
i dunno
<nortti>
if the full table would have gaps anyways, I don't think you'd save much by doing the two-level thing
<nortti>
you only get the parts of the table that are accessed in the cache (with cacheline granularity, so two or four entries per, if I remember cacheline sizes for x86 right)
<Cindy>
there are gaps of illegal opcodes
<nortti>
ah, no, 64 byte cache lines. so 8 entries
<nikolapdp>
you're right nortti
<nikolapdp>
guess you can always benchmark
<Cindy>
huh
<Cindy>
nikolapdp: i wonder how you can compress a lookup table
<nikolapdp>
that really depends on what it looks like
<bslsk05>
github.com: glibc/sysdeps/riscv/rv64 at master · bminor/glibc · GitHub
<geist>
i have been comparing my memcpy with glibc, and the one that is at least currently shipping in ubuntu 23.10 in glibc is clearly C derived
<geist>
they have some sort of fancy macro based C fallback memcpy that does a fairly good job
<heat>
macro based C faithfully describes 90% of GNU software
<geist>
yeah you should see the llvm-libc version...
<geist>
it's insane
<heat>
template based C++
<geist>
yerp!
<nikolapdp>
gross
<heat>
last i checked, their memcpy used some special llvm intrinsic they introduced to copy mem
<heat>
so if the variable vector size is complicated, why did they go with that
<heat>
if it's hard to pull off, that is
<geist>
oh speaking of moving lots of data, there's a new arm v8.7 feature that loads/stores 64 bytes at a time
<geist>
FEAT_LS64 or something
<nikolapdp>
geist where does it load into
<heat>
is it SIMD or just "load/store cache line"?
<geist>
load/store cache line into 8 sucessive registers
<heat>
heh
<nikolapdp>
interesting
<geist>
that feels like an ARM 'hold my beer' kinda thing
<geist>
like oh you like to move data? here we go
<geist>
though i guess technically arm32 was pretty close with ldm/stm and it has technically been regressed on arm64
<heat>
it reminds me of those wacky 32-bit ARM push N registers instructions you had
<heat>
yeah!
<geist>
this is a bit more limited, *must* be 64 byte aligned, so clearly it's a cache line load/store
<geist>
and probably 64bytes is pretty much officially the cache line from now on out
<geist>
though i guess it doesn't really have to line up
<heat>
weren't cache lines 128 byte on much of arm64 hardware?
<geist>
i only know of one arm machine that had 128, cavium thunderx1
<geist>
all of the arm produced arm cores have been 64
<heat>
it would kind of have to map for the full effect, no? a full cache line store could just invalidate the cache line on all other cores, and not have to read it back?
<geist>
yah
<geist>
this combined with the 'zero the cache line' instruction that pretty much all arches have now and you've got the full suite
<geist>
looking at that memcpy.S, that's already hecka thead customizes
<geist>
they have tons of extra instructions, like ldd/stdd (load store double word) that is highly nonstandard for riscv anyway
<geist>
so basically thats their own file on their own fork of glibc, so they can do what they want
<heat>
yeah, i suspect the rv SIMD isn't upstream yet
<geist>
also traditionally thead had their own vector bits, though i think they're going to switch to rvv 1.0
<heat>
thead has their own everything
<heat>
which is fun because you get new features before the spec bikesheds itself into acceptance, but horrible because G L E X T E N S I O N H E L L
<Ermine>
Is there vulkan extension hell or not yet?
<nikolapdp>
i haven't heard of it yet
<Ermine>
i didn't either
<Cindy>
i've heard of something new
<Cindy>
it's called computed goto
<Cindy>
and python interpreter uses this instead of a jump table or switch or some shit
<Cindy>
and it is much faster
<nikolapdp>
it's one less indirection
<nikolapdp>
also you can skip the function prologue and epilogue
<Cindy>
yes
<Cindy>
i still got the huge ass fucking goto table
<nikolapdp>
heh
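Computed goto relies on the GCC/Clang labels-as-values extension (&&label); a minimal self-contained sketch of the dispatch pattern:

```c
#include <stdint.h>

/* Each handler jumps straight to the next opcode's label through the
 * table, skipping the switch bounds check and call/return overhead. */
int run(const uint8_t *code)
{
    static void *dispatch[] = { &&op_halt, &&op_inc, &&op_dec };
    int acc = 0;

#define NEXT() goto *dispatch[*code++]
    NEXT();

op_inc:  acc++; NEXT();
op_dec:  acc--; NEXT();
op_halt: return acc;
#undef NEXT
}
```

e.g. run((const uint8_t[]){1, 1, 2, 0}) returns 1.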
<kof123>
not seeing a date but: > GO TO (Computed) (FORTRAN 77 Language Reference)
<zid>
Then we copy paste the entire base as one chunk
<zid>
and put down 40 of it
<gorgonical>
Let me hear some interesting kernel stack protection proposals
<zid>
my proposal is don't
<gorgonical>
Other than just protector pages on either side of the stack
<zid>
Just write correct code
<nikolar>
T R A I N S
<zid>
fwiw linux uses a gcc plugin that can measure stack depth
<zid>
so that you don't blow the stack
<zid>
and vlas are banned
<nikolar>
zid: can't go wrong by writing correct code
<gorgonical>
We didn't have a vla problem, we had a poll problem
<gorgonical>
nikolar: exactly, why didnt I think of that
<zid>
so you had a vla problem
<zid>
if you don'tknow how big your 'poll related information struct' is
<gorgonical>
No we do
<zid>
but you just neglected to notice it was fucking massive?
<zid>
that stack plugin would have been useful then
<gorgonical>
The problem basically was: userspace calls poll(), poll allocates shit on the stack. We go into the impl where the kernel calls Linux asynchronously. Without competing tasks, there's a reasonably high chance that the waiting process that called poll is woken up to handle the interrupt. In that case, the stack depth involved in handling the interrupt and the consequent calls was too much for the stack and corrupting
<gorgonical>
task state
<zid>
oh -fstack-usage is now mainline?
<gorgonical>
Obviously it's very easy to fix once I recognize the problem
<gorgonical>
But the schedule() dependent nature of the problem made it not so easy to recognize
<zid>
gcc 4.6, new as hell
<zid>
Isn't having an interrupt stack normal btw
<zid>
to stop issues like that
<zid>
and.. it being just easier
<zid>
Just toss an empty page into tss.rsp0
<gorgonical>
That's probably the best solution
<zid>
actually how do you *not* do that, because you'd be overwriting everything if you didn't
<zid>
oh do r0 -> r0 interrupts
<zid>
not stack switch
<gorgonical>
I don't understand what you mean by overwriting stuff
<zid>
I syscall, I load rsp with a kernel stack at 0xBEEF0000, I can't write that *also* into tss
<zid>
because the interrupt would smash it
<zid>
i.e at the time of the interrupt rsp is 0xBEEEFFF0
<heat>
gog, uni lipa
<gog>
NO
<heat>
tri lipa?
<gog>
yes
<gorgonical>
Here we're already in the syscall. userspace stack is stored somewhere else, kernel stack is sp. We are sleeping the process in kernel mode. It wakes up for an interrupt and uses its stack to handle the interrupt, but we're near the bottom
<gog>
but her name is albanian and it means "love"
<heat>
my name is albanian and also means love
<zid>
gorgonical: That requires the interrupt to go to the *extant* stack
<zid>
not switching
<gorgonical>
Yes
<zid>
> oh do r0 -> r0 interrupts not switch
<gorgonical>
I don't think so. Unless I'm missing something an interrupt just jumps to a specific PC
<zid>
but it loads tss.rsp0 if you're coming from ring3
<zid>
so that it has a stack to use
<gorgonical>
I don't even think arm does that, actually
<zid>
oh, arm
<gorgonical>
Yeah you have to stack switch and all that yourself
<zid>
arm has yea, a bunch of dedicated shadow regs and shit for all the kernel side right?
<gorgonical>
yep
<gorgonical>
So part of the interrupt/exception handler is to stack switch and stash all userspace context first
<zid>
for amd64 I'm not sure you *could* have written this bug, because I *think* tss.rsp0 is loaded even in r0 -> r0, but I'd have to have heat check for me
<heat>
not true
<zid>
it's skipped in r0->r0 then?
<heat>
you need to switch manually, or have an interrupt stack in the tss (and set it correctly for all IDT entries)
<zid>
That sounds like it is not skipped
<zid>
and is therefore true
<heat>
rsp0 is not an interrupt stack
<zid>
ah
<zid>
I mean, that's what it gets used for in practice, r3->r0 transitions load it, which includes interrupts
<heat>
right, but there's this interrupt stack system in the x86_64 TSS that lets you have specific IDT entries that switch to a given stack
<zid>
so you need to set IST to 1 or such in every IDT
<zid>
then fill out IST1
<heat>
yeah
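A sketch of the mechanism being discussed, on x86_64 (TSS layout per the Intel/AMD manuals; the helpers are invented):

```c
#include <stdint.h>

struct tss64 {
    uint32_t reserved0;
    uint64_t rsp[3];   /* rsp0-2: loaded on entry from a lower ring */
    uint64_t reserved1;
    uint64_t ist[7];   /* ist1-7: unconditional switch stacks, per IDT entry */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iomap_base;
} __attribute__((packed));

/* each IDT entry has a 3-bit IST field: 0 = keep the current stack on a
 * ring0->ring0 entry, n = always load tss.ist[n-1] */
void idt_set_ist(int vector, int ist_index); /* hypothetical helper */

void setup_df_stack(struct tss64 *tss, void *df_stack_top)
{
    tss->ist[0] = (uint64_t)df_stack_top; /* IST1 */
    idt_set_ist(8 /* #DF */, 1);          /* double fault gets a fresh stack */
}
```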
<heat>
fwiw, i don't know what linux does here, but i do know they have a separate IRQ stack
<zid>
It just seems sensible imo
<heat>
i use the kernel stack
<zid>
stops you having to double up your stack length in case you do a gorg
<heat>
of the running thread
<zid>
and stops you faulting trying to take an irq
<zid>
which sounds.. icky
<zid>
Isn't that an instant triple fault if your stack ever gets full, heat
<heat>
my double fault handler has a separate stack
<zid>
interrupt -> push of frame hits guard page -> double fault -> push of frame hits guard page still -> triple fault
<zid>
oh
<zid>
IST1'd it?
<heat>
yes
<zid>
okay so you sort of have a mixture of both
<zid>
what do you do then though
<zid>
drop the IRQ, or crash the process?
<heat>
it's just a kernel crash
<zid>
smh lazy
<zid>
seems better to just set that IST1 for everything, and you'd get fewer kernel crashes overall for the same effect
<gorgonical>
zid: using an interrupt stack only moves the problem, too
<heat>
in reality you just need to be really careful with your stack usage
<heat>
like, i know my irqs are (if i did my job correctly) not going to use much stack
<heat>
so as long as i'm not pushing up against the stack limit, i'm fine
<gorgonical>
I suppose it's more efficient to have to use 3 pages for each cpu than per task for the stack
<zid>
It's just weird to have to consider your *actual* stack limit as
<zid>
n pages - sizeof_maximum_irq_frame
<zid>
rather than just n pages
<heat>
why? it also applies to, say, signal handling
<heat>
(unless you sigaltstack'd)
<zid>
I don't h ave stack limits in userspace, and I don't do signals in kernel space
<zid>
so I have never had to consider that
<heat>
in my design, the big thing i have to consider for every function is: can this be called in a deep callchain?
<zid>
I just minimize the useage, regardless
<zid>
usage
<heat>
and also: will i call a lot of crap? will i call *deep*?
<heat>
i also minimize it but, you know, there are places where you can abuse the stack a bit, and places where it's pretty much a bad idea
<zid>
Disallow locals, task struct access only
<heat>
a typical poll optimization is to keep a bunch of state on the stack (for the sys_poll() handler), to avoid heap allocations. and this works because you're pretty much the first thing to be called, and the call chain will be pretty shallow
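The pattern heat means, in kernel-style pseudo-C (constants and helpers invented; Linux's sys_poll does something similar in spirit):

```c
#define N_STACK_FDS 30 /* common case stays on the (still shallow) stack */

long sys_poll(struct pollfd *ufds, unsigned int nfds, int timeout)
{
    struct pollfd stack_fds[N_STACK_FDS];
    struct pollfd *fds = stack_fds;

    if (nfds > N_STACK_FDS) {
        fds = kmalloc(nfds * sizeof(*fds)); /* rare case pays for the heap */
        if (!fds)
            return -ENOMEM;
    }

    /* ... copy in from ufds, wait, copy revents back out ... */

    if (fds != stack_fds)
        kfree(fds);
    return 0;
}
```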
<zid>
every function must be (void), you use a hashset to find your parameters by __FUNC__
<gorgonical>
heat: this poll optimization is what got me actually
<zid>
the trick was 'keep the depth low afterwards' and you took a big irq on top :P
<zid>
so you failed to implement it properly
<heat>
what if every task had a "char buffer[BUFFER_SIZE]; char *pos = buffer;", and you incremented pos every time you needed some space
<gorgonical>
The problem being that my implementation of my async cross-kernel channel is not super-duper efficient and I was loading the stack after that
<heat>
you know, kind of like, say, a stack
<gorgonical>
And then the irq was the straw
<zid>
heat: I'd like to see a benchmark of a full software stack like that
<heat>
btw
<heat>
-Wframe-larger-than=<limit>
<gorgonical>
But that's only going to warn me if I'm being stack-intensive, right?
<gorgonical>
In theory I need to set that to like stacksize - irq_stackframe_size
<zid>
You need to do limit = max_depth - max_irq_depth
<heat>
you need to set it to something you think is reasonable
<zid>
which is what I said earlier, that's odd, I'd rather just have two stacks
<gorgonical>
zid: is it even a speed hit on an architecture that doesn't load the sp for you?
<gorgonical>
Just uses a few extra pages, right?
<heat>
my stacks are 16KB, and I set the stack limit to 1280
<zid>
My stacks are 4kB and I use about 40 bytes of it :P
<zid>
my kernel is ADVANCED
<heat>
so each function is virtually stopped from allocating ~ 1/16 of the stack
<gorgonical>
heat: you're saying you warn if you have used 1.2KB of your 16KB stack?
<heat>
on a single function, yes
<gorgonical>
oh
<gorgonical>
oh
<gorgonical>
quite reasonable then
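With heat's numbers, the warning fires per function frame, e.g.:

```c
/* compiled with: gcc -Wframe-larger-than=1280 -c stack.c */
void use(char *p);

void fine(void)    { char buf[512];  use(buf); } /* silent */
void too_big(void) { char buf[2048]; use(buf); }
/* warning: the frame size of ... bytes is larger than 1280 bytes
 * [-Wframe-larger-than=] */
```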
<zid>
that option doesn't do what it sounds like it does then
<gorgonical>
agreed. I thought it was a full-depth analysis
<zid>
I don't care about single function useage, only max depth
<heat>
"Wframe", it's in the name
<zid>
the single function will trip the max depth for all its children regardless if I set it small
<zid>
right, I care if the *child*'s frame, is now at a bad depth
<zid>
the parent can be at as bad a depth as it likes, if it isn't going to add more
<heat>
you can't actually measure depth so easily
<zid>
linux uses a plugin, last I heard
<zid>
it knows the cflow info and stack depth per func
<zid>
and just warns if it's beeg
<gorgonical>
Without type ranging you can't actually do it statically can you?
<heat>
you can do it at runtime, if you do something like kstack_top - sp
<zid>
type.. ranging?
<heat>
on interrupts
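heat's runtime check, sketched (all names invented): at interrupt entry, compare the interrupted stack pointer against the thread's stack top.

```c
/* stacks grow down, so depth = stack top - interrupted sp */
void irq_stack_check(unsigned long interrupted_sp)
{
    unsigned long top  = current_thread()->kstack_top;
    unsigned long used = top - interrupted_sp;

    if (used > KSTACK_SIZE - KSTACK_RED_ZONE) /* leave room for the IRQ itself */
        panic("kernel stack nearly exhausted: %lu bytes used", used);
}
```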
<gorgonical>
Unless you generate all possible callgraphs, even ones that are impossible
<heat>
yes
<zid>
without VLAs it's constant
<zid>
and recursion
<zid>
and yea, you need call flow analysis for *perfect* results
<zid>
but I don't care about perfect as long as I can annotate
<gorgonical>
zid: I just mean like typedef ranged_int int(0:10); and then the compiler knows ranged_int can't have value 11
<gorgonical>
dependent types
<zid>
oh for cfi?
<gorgonical>
yes
<zid>
yea I don't care
<gorgonical>
Because otherwise you're forced to address impossible callgraphs
<zid>
once, as long as you're allowed to annotate
<heat>
virtual functions would be a PITA, struct *_ops would be a PITA
<zid>
virtual functions? Have you been smoking crack again
<gorgonical>
yeah annotation is just the same thing but only for the analyzer
<heat>
IRQs and preemption and all that would throw a huge wrench in stack depth static analysis
<gorgonical>
indeed
<zid>
-fanalyzer is trying for this btw
<zid>
unfortunately it needs to solve the halting problem
<zid>
so we'll see how far it gets
<gorgonical>
this all starts to sound an awful lot like the discussion around rtoses
<gorgonical>
enough analysis and guarantees eventually turns all kernels into rtoses
<zid>
which is why I said I don't care if it can do cfa or not, it will *always* have edge cases
<zid>
so I need to be able to annotate
<gorgonical>
do you do cfa now?
<zid>
using my excellent human insight of "this is impossible"
<zid>
you were saying you needed it
<zid>
I'm saying I don't need it
<zid>
not that either of us have it
<gorgonical>
oh
<gorgonical>
annotation would solve that problem too
<gorgonical>
yes
<heat>
C++ and virtual functions would actually help, basically take all types with vtables, then see their subclasses, then calculate the maximum stack usage for each vtable entry
<heat>
whereas with the C struct something_ops idiom, it's a lot harder to track these down
<gorgonical>
Yeah it only would even be possible if you only do static assignment of ops
<gorgonical>
If you do runtime assignment of ops it's gonna be really hard without an integration to the analyzer
<netbsduser`>
i was able to cut down on stack use by moving to a layered device model with iterative dispatch
<netbsduser`>
that is instead of a call stack of e.g. fs -> partition -> disk, the call stack is iop_continue() -> fs, then iop_continue() -> partition, etc
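A sketch of that iterative dispatch (iop_continue is netbsduser's name; everything else here is invented): each layer handles the packet and returns, so the call stack never nests one frame per layer.

```c
void iop_continue(struct iop *iop)
{
    while (iop->layer >= 0) {
        struct device *dev = iop->stack[iop->layer]; /* fs, partition, disk */
        switch (dev->ops->dispatch(dev, iop)) {
        case IOP_PENDING:
            return;       /* completion re-enters iop_continue later */
        case IOP_DONE:
            iop->layer--; /* hand the packet to the next layer down */
            break;
        }
    }
    iop_complete(iop); /* past the last layer: the whole I/O is done */
}
```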
<heat>
hi dave cutler
<heat>
for what its worth, my IO shouldn't really use much stack
<netbsduser`>
they don't do that in windows at least
<netbsduser`>
on windows they call recursively down the driver stack; there is either an automatic kernel stack expansion or an automatic transition to a worker thread with a big stack if your chain gets too deep, i forgot which
<heat>
how much stack are you using on your IO?
<heat>
my IO should use around 0x68 (ext2_readpage) + 0x18 (sb_read_bio) + 0x8 (bio_submit_req_wait) + 0x28 + 0x8 (nvme io queue stuff) plus pushes around the time the nvme code rings the sq doorbell
<heat>
freebsd removed $FreeBSD$ and the old SCCS stuff
<heat>
freebsd is officially ANTI TRADITION
<zid>
Is that a prayer to the svn gods to not delete all their files
<AmyMalik>
:D
* AmyMalik
checks heat into sccs
<heat>
@(#)heat8.5 (Berkeley) 15/02/24
<nikolar>
Did you build your own kernel or something
<gog>
what kind of loser
<nikolapdp>
gog you got bullied this morning
<gog>
when
<nikolapdp>
i am kidding
<Mondenkind>
i wish someone would bully me🥺
<nikolapdp>
Mondekind you suck
<nikolapdp>
better?
<nikolapdp>
*Mondenkind ^
<zid>
I am feeling very vulnerable right now if any goth girls want to take advantage of me
<Mondenkind>
no i want gog to do it
<nikolapdp>
kek
<gog>
uhhh
<heat>
brb i'm going to do the dishes
<nikolapdp>
who does dishes at this time of the day
<Mondenkind>
what's wrong with doing dishes at midnight