klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
tacco has quit []
<geist> you could mask it at the GIC
<geist> but yeah i think if IRQs are disabled in EL2 and you bounce to EL1 where they're reenabled, if the GIC is still asserting it it may still fire?
<geist> what you can't do is just configure the GIC to deliver an IRQ to a particular EL. that would be lovely, but IIRC the cpu still logically has a single /IRQ and /FIQ line
<gorgonical> I don't *think* the implementation relies on GIC3 behavior, so it must be something about ARM interrupts, I think
<gorgonical> I see
<gorgonical> So the logic for how interrupts get delivered is going to rely on the ARM chip's handling logic
<geist> i think so yeah. there's some switcharoo you need to do when entering the guests and back to the dom0, etc
<gorgonical> So then the hcr_el2.imo configuration is probably so that when running in EL1 the interrupt gets delivered straight in
<geist> yah, which may be the case when you're in say a dom0 in a type 2 style
<geist> but then when you switch to a guest EL1 with no hard IRQs routed to it, you program the HCR_EL2 to route all IRQS to EL2 first
<gorgonical> Rather than the chip jumping up to el2 and using that vbar. So in the case where we are already in el2 and the interrupt fires, you are thinking that by not eoi-ing the interrupt and switching back to el1, the interrupt might re-fire
<geist> yah i bet so. which EL1 are you switching to? a guest or the dom0 (if you have one)?
<gorgonical> So this is a type-1 hypervisor. It's hafnium, where hafnium boots up and starts the "primary VM" according to ARM's FF-A spec
<geist> got it
<geist> so in that case is it that *all* irqs are routed to EL2?
<gorgonical> And the primary VM is responsible for providing scheduling for the hypervisor, whose only job really is to trap instructions and do context/world switches
<gorgonical> It seems actually that when running in secondary VMs the IRQs do get bounced into EL2, but when running the primary VM they are not bounced into el2
<geist> seems like in that case EL2 should probably claim all irqs for itself and then just manually bounce ones it wants to pass through by faking out an IRQ
<geist> ah, so that seems like the difference in programming of the hcr_el2 when running the dom0 vs guests
<geist> that sounds like a very loose interpretation of type 1 really... the fact that there's a single guest that has more privileges than the others and also takes IRQs really smells like a type 2
<geist> if type 1 and 2 even mean anything anymore, frankly
<gorgonical> so what do you mean by "faking out" an IRQ? because AFAIK the hypervisor just directly switches back into the primary VM when a "general" IRQ happens, no configuring something in the vmcs like in intel
<geist> faking out as in branching to the EL1s vbar as if an irq was delivered
<geist> setting FAR, etc etc
<gorgonical> I guess it could be doing that. It definitely could interrogate the sysregs to find out
<geist> but honestly i have not walked through how this is supposed to work. we have a vcpu thing in fuchsia but i didn't write it
<geist> i have a general knowledge of the cpu features, but haven't worked through the details of exactly what is reconfigured in EL2 when switching around
<gorgonical> But the hafnium code doesn't suggest it's doing that
<gorgonical> Hold on
<bslsk05> ​hafnium.googlesource.com: src/arch/aarch64/hypervisor/handler.c - hafnium - Git at Google
trufas has quit [Ping timeout: 265 seconds]
trufas has joined #osdev
<bslsk05> ​hafnium.googlesource.com: src/arch/aarch64/hypervisor/exceptions.S - hafnium - Git at Google
<gorgonical> And basically that function in irq_lower ends up here: https://hafnium.googlesource.com/hafnium/+/refs/heads/master/src/api.c#74
<bslsk05> ​hafnium.googlesource.com: src/api.c - hafnium - Git at Google
* geist nods
<gorgonical> I know you said this is a little out of your purview, but I'm doing my diligence to make sure I haven't overlooked something
<geist> yah
<geist> dunno, a bit busy right now to try to grok this super complicated topic
<geist> but i think you're on the right track
<gorgonical> My heuristic conclusion is basically that I think the interrupt *has* to re-fire if it's active and you enter a different (lower or higher? Maybe only lower?) EL
<geist> unless the HCR_EL2 always redirects it to EL2?
<geist> but it could be reprogrammed in all of that sequence
<geist> which would make sense. you're switching to dom0, want the IRQ to fire, so you turn off the 'send everything to EL2' then enter EL1 in which case it instantly fires
<geist> because the IRQ is still level asserted
divine has joined #osdev
<gorgonical> I guess what I don't understand is the logic around how you can receive the irq and just keep going
<geist> that's what i'd assume happens here
<geist> what do you mean?
<gorgonical> It seems strange to me that you can "ignore" the IRQ after getting it and simply choose to switch ELs
<geist> why? that seems totally what you want to do
<gorgonical> Oh sure, that is exactly what we want and quite useful. Just a very unfamiliar idea
<geist> as in, EL2's job is not to deal with IRQs (except maybe a virtual timer) but to route the cpu between EL1 contexts that can
<gorgonical> I don't think any sort of mechanism like this exists in intel. Once the interrupt fires it's gone and you have to figure out which one fires
<geist> think of EL2 as the microcode in intel x86 that's running during vmcall/vmexit, etc
<geist> all of that magic black box is exposed here, but you can implement something that more or less does the sameish thing
<gorgonical> I'll say it's a little cleaner in that it's the same mechanism. IRQ fires but don't want to handle it? Ignore it, switch to a different EL, and hope the handlers there deal with it
<geist> yah
<gorgonical> As opposed to intel: an interrupt fired in the guest so it exited, but the interrupt status is in the vmcs and so if you want to deal with it you have to interrogate that and mutate state that way, separate of your own idt
<geist> and it's even possible they won't. EL1 dom0 may have entered EL2 originally as part of some hypercall, with irqs disabled (at EL1)
<geist> so you bounce back, it runs some more in EL1 until it unmasks its irqs and then boom it fires
<geist> that would be the reason you wouldn't want to 'fake an irq'
<gorgonical> That is a very good point
<geist> you possibly could, but given that the IRQs delivery mechanism in arm is cheap there's probably little reason to try to short circuit that
vdamewood has joined #osdev
vinleod has joined #osdev
vdamewood has quit [Ping timeout: 240 seconds]
freakazoid333 has quit [Read error: Connection reset by peer]
ElectronApps has joined #osdev
vinleod is now known as vdamewood
_mrlemke_ has quit [Read error: Connection reset by peer]
_mrlemke_ has joined #osdev
CryptoDavid has quit [Quit: Connection closed for inactivity]
isaacwoods has quit [Quit: WeeChat 3.2]
heat has joined #osdev
gog has quit [Ping timeout: 246 seconds]
iorem has joined #osdev
aquijoule__ has joined #osdev
aquijoule_ has quit [Ping timeout: 246 seconds]
_mrlemke_ has quit [Quit: Konversation terminated!]
Brnocrist has quit [Ping timeout: 272 seconds]
Brnocrist has joined #osdev
heat has quit [Ping timeout: 268 seconds]
sts-q has joined #osdev
<gorgonical> Utterly incomprehensible. The Hafnium docs say "interrupts are owned by the primary" and so that gets switched to if one happens while running a secondary (e.g. timer), but it's not via the virtual interrupt system and I honestly have no idea how
ElectronApps has quit [Read error: Connection reset by peer]
ElectronApps has joined #osdev
<gorgonical> Wait! From a programmer's guide for the generic timer, I think I have figured it out!
<gorgonical> So once the interrupt gets delivered, it's now masking that same interrupt from being received again, as one expects so you can actually handle it. However, doing a world switch back to the primary VM requires an eret, which will clear the interrupt status, and because the timer is *level-sensitive*, it will immediately re-deliver the interrupt after the eret executes. At that point, the target
<gorgonical> exception level registers will be in effect and that's how you defer the work
iorem has quit [Quit: Connection closed]
Izem has joined #osdev
srjek|home has quit [Ping timeout: 252 seconds]
nyah has quit [Ping timeout: 240 seconds]
froggey has quit [Ping timeout: 265 seconds]
froggey has joined #osdev
ElectronApps has quit [Ping timeout: 240 seconds]
ElectronApps has joined #osdev
<doug16k> gorgonical, figuring out arm docs is more impressive than implementing code :P
sprock has quit [Ping timeout: 258 seconds]
<geist> gorgonical: right
<geist> the interrupt masking bit is banked per level, so as you drop to the lower level it just fires (if the lower level has IRQs enabled)
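[Editor's note] The re-fire behaviour worked out above can be sketched as a toy model. This is a deliberate simplification and not taken from Hafnium or the Arm ARM text: it only covers physical IRQs at EL1 and EL2, and ignores FIQ, EL3, HCR_EL2.TGE and VHE. The names `CpuState` and `irq_taken_to` are hypothetical. The idea is that a level-sensitive line stays asserted until the device is serviced, so whether the IRQ is taken depends only on the current EL, the routing bit, and the mask in effect at that moment.

```cpp
#include <cassert>
#include <optional>

// Toy model only: two exception levels, physical IRQ, no FIQ/EL3/TGE/VHE.
enum class EL { EL1 = 1, EL2 = 2 };

struct CpuState {
    EL   current;   // exception level currently executing
    bool pstate_i;  // PSTATE.I: true means IRQs are masked at the current EL
    bool hcr_imo;   // HCR_EL2.IMO: true routes physical IRQs to EL2
};

// Returns the EL the IRQ would be taken to right now, or nullopt if it
// stays pending. `line_asserted` models a level-sensitive source (such as
// the generic timer) that keeps signalling until it is serviced.
std::optional<EL> irq_taken_to(const CpuState& cpu, bool line_asserted) {
    if (!line_asserted) return std::nullopt;

    const EL target = cpu.hcr_imo ? EL::EL2 : EL::EL1;

    // An IRQ routed to a lower EL than the current one is not taken.
    if (target < cpu.current) return std::nullopt;

    // An IRQ routed to a higher EL is taken regardless of PSTATE.I.
    if (target > cpu.current) return target;

    // Same EL: PSTATE.I decides.
    if (cpu.pstate_i) return std::nullopt;
    return target;
}
```

In this model, a secondary VM at EL1 with IMO set traps to EL2; the EL2 handler (entered with PSTATE.I set) does not re-take the IRQ; and once the hypervisor clears IMO and erets to a primary VM at EL1 with IRQs unmasked, the still-asserted line is taken at EL1. If the primary had IRQs masked when it entered EL2, the IRQ simply stays pending until it unmasks them.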
silverballz is now known as silverwhitefish
<doug16k> linear probe hash table with interned hashed string key makes it instantaneous compared to map<string, ...>. this code was mostly just an elaborate series of strcmp calls before :P
<doug16k> just linear search the whole thing would be fast enough, but na, I have to go and just jump straight to the right one most of the time :P
<doug16k> combined with constexpr trick to compile-time pre-hash string literals
<doug16k> so it just knows the hash already :P
<doug16k> I want to see how fast I can get map<string abuse to run
<doug16k> kind of intrusive hash table, where your key needs to have a .hash member
<doug16k> that lets you just tell it the key and saves tons of hashing execution time
<doug16k> er, just tell it the hash
<doug16k> so far the actual memcmp of the key is pointless, haven't seen a hash collision yet
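[Editor's note] A minimal sketch of the kind of table doug16k describes, under stated assumptions: this is not his code, the hash function (64-bit FNV-1a) is my choice, and `HashedKey` and `ProbeTable` are hypothetical names. Keys carry their own hash, which is computed at compile time for string literals, and lookups linear-probe from `hash & (N - 1)`. The key text is assumed to outlive the table (for example, interned strings or literals), and deletion is left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <utility>

// 64-bit FNV-1a; constexpr so literals can be hashed at compile time (C++17).
constexpr std::uint64_t fnv1a(std::string_view s) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (char c : s) {
        h ^= static_cast<unsigned char>(c);
        h *= 1099511628211ull;                  // FNV prime
    }
    return h;
}

// A key that already knows its hash (the ".hash member" idea).
struct HashedKey {
    std::string_view text;
    std::uint64_t hash;
    constexpr HashedKey(std::string_view t) : text(t), hash(fnv1a(t)) {}
};

// Fixed-capacity open-addressing table with linear probing.
template <typename V, std::size_t N>
class ProbeTable {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");
    struct Slot {
        bool used = false;
        std::uint64_t hash = 0;
        std::string_view text;
        V value{};
    };
    std::array<Slot, N> slots_{};

public:
    // Inserts or overwrites; returns false only when the table is full.
    bool insert(const HashedKey& k, V v) {
        for (std::size_t i = 0; i < N; ++i) {
            Slot& s = slots_[(k.hash + i) & (N - 1)];
            if (!s.used) {
                s = Slot{true, k.hash, k.text, std::move(v)};
                return true;
            }
            // Compare hashes first; the text compare guards against collisions.
            if (s.hash == k.hash && s.text == k.text) {
                s.value = std::move(v);
                return true;
            }
        }
        return false;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const V* find(const HashedKey& k) const {
        for (std::size_t i = 0; i < N; ++i) {
            const Slot& s = slots_[(k.hash + i) & (N - 1)];
            if (!s.used) return nullptr;
            if (s.hash == k.hash && s.text == k.text) return &s.value;
        }
        return nullptr;
    }
};
```

A caller would write `constexpr HashedKey k{"timer"};` once and reuse `k`, so no hashing happens at run time. The sketch keeps the final text comparison even though it rarely fires, because dropping it turns a hash collision into a wrong answer.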
<clever> that reminds me
<clever> internally, the haskell json library, uses hashmaps for its objects
<clever> and i have heard of an attack, where a user generates json, with keys that intentionally collide in the hashmap
<clever> and that makes the hashmap performance take a nosedive, handling keys in the same bucket
<sahibatko> Is it just my impression or is the June 2021 revision of Intel manuals quite a lot refactored compared to the one from 2016? I mean, not just added stuff regarding PML5, but changed some "naming" too - is there perhaps a "highlighted changes" version available?
<doug16k> I have seen versions that are just the diff
<bslsk05> ​software.intel.com: Intel® 64 and IA-32 Architectures Software Developer's Manual...
<vdamewood> Does this mean I need to order new hardcopies from Lulu?
<doug16k> shows the changes indexed by section
<doug16k> changes highlighted in that
<sahibatko> doug16k: that seems to be it, thanks for the link
Izem has left #osdev [Good Bye]
<Arsen> I have an arm server that has an empty dtb and theoretically exposes all configuration over ACPI, but it appears that, with linux, if you set arm-smmu.disable_bypass=1 linux prevents the drive controller being written to (unknown stream id 0x800)
<Arsen> my guess this is meant to come from the IORT ACPI table, but I'm not sure how ARM IOMMU works, so I figured asking here would help
<Arsen> if it does come from there, then it's probably a firmware bug, upon inspecting that table with acpidump and iasl I found nothing that'd lead me to believe the stream id is referenced anywhere, but I also lack knowledge of ACPI :P
<Arsen> also I know this isn't strictly osdev but I'm decently sure there's no more appropriate place to ask
<geist> yah i guess something has to tell it which smmu ids map to what
<geist> surprised it has no DTB. guess it was generally designed to boot windows
<Arsen> the code agrees with my guess, and yours too then
<Arsen> well, it has a dtb, it's just perfectly empty :P indicating that all info should be coming from ACPI
<Arsen> but if they didn't expose this info in ACPI I should complain to the fw vendor, right?
<geist> the only arm server i've worked with (gigabyte based board, ThunderX2 cavium) has i think a fairly complete DTB and ACPI
<Arsen> that sounds like a smarter approach than what lenovo took here
<gorgonical> ive always wanted to work on one of the thunderx2's
<geist> yah all in all it's a nice machine
<gorgonical> I like the subversion of the regular expectation that arm == sbcs and thus low power
<gorgonical> Really gets my goat when people derisively say arm can't be fast because reason xyz and intel/amd will reign forever
<geist> well, the apple M1 proved them wrong
<geist> but indeed
<geist> as much as i'm not particularly happy that it's a vertical thing like apple, i am pleased that they actually produced a competitive microarchitecture not based on x86
<Arsen> this server is quite the opposite, it was meant to serve as a hypervisor before it stopped booting because this got merged: https://patchwork.kernel.org/project/linux-arm-kernel/patch/20190301192017.39770-1-dianders@chromium.org/
<bslsk05> ​patchwork.kernel.org: [v2] iommu/arm-smmu: Break insecure users by disabling bypass by default - Patchwork
<Arsen> (the opposite of a low power sbc)
<gorgonical> and not to mention that fujitsu is currently in the business of building supercomputers with arm
gioyik has quit [Quit: WeeChat 3.1]
<Arsen> but yeah this looks like a fun fw bug, does that sound right?
nyah has joined #osdev
ElectronApps has quit [Read error: Connection reset by peer]
ElectronApps has joined #osdev
gareppa has joined #osdev
gareppa has quit [Remote host closed the connection]
sprock has joined #osdev
sprock has quit [Ping timeout: 272 seconds]
sortie has quit [Remote host closed the connection]
sortie has joined #osdev
ZipCPU has quit [Ping timeout: 268 seconds]
zoey has quit [Ping timeout: 246 seconds]
GeDaMo has joined #osdev
z_is_stimky_ has quit [Read error: Connection reset by peer]
z_is_stimky has joined #osdev
dormito has quit [Ping timeout: 252 seconds]
dennis95 has joined #osdev
ZetItUp has joined #osdev
dormito has joined #osdev
ZipCPU has joined #osdev
ElectronApps has quit [Remote host closed the connection]
ElectronApps has joined #osdev
ElectronApps has quit [Remote host closed the connection]
ElectronApps has joined #osdev
isaacwoods has joined #osdev
gog has joined #osdev
gareppa has joined #osdev
gareppa has quit [Remote host closed the connection]
CryptoDavid has joined #osdev
ElectronApps has quit [Read error: Connection reset by peer]
ElectronApps has joined #osdev
ahalaney has joined #osdev
iorem has joined #osdev
Brnocrist has quit [Ping timeout: 265 seconds]
ElectronApps has quit [Remote host closed the connection]
srjek|home has joined #osdev
freakazoid333 has joined #osdev
Raito_Bezarius has quit [Ping timeout: 240 seconds]
Raito_Bezarius has joined #osdev
mahmutov has joined #osdev
gioyik has joined #osdev
mahmutov has quit [Ping timeout: 268 seconds]
iorem has quit [Quit: Connection closed]
smarton has quit [Changing host]
smarton has joined #osdev
srjek|home has quit [Ping timeout: 246 seconds]
silverwhitefish has quit [Quit: One for all, all for One (2 Corinthians 5)]
silverwhitefish has joined #osdev
silverwhitefish has quit [Client Quit]
silverwhitefish has joined #osdev
tenshi has joined #osdev
theseb has joined #osdev
<theseb> newbie locking question....Imagine you tried to implement a locking mechanism so that only one process had write permission on a file.....How avoid race conditions for that?....e.g. Imagine 10 processes simultaneously check to see the file is not locked....then they will ALL grab write permission at the same time! How avoid that?
mahmutov has joined #osdev
<tenshi> test-and-set the lock
CryptoDavid has quit [Quit: Connection closed for inactivity]
sts-q has quit [Remote host closed the connection]
Skyz has joined #osdev
<theseb> tenshi: k, thanks
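[Editor's note] A minimal sketch of what "test-and-set the lock" means, using C++'s `std::atomic_flag`. The class name `TasLock` is hypothetical. The point is that checking and taking the lock are one indivisible operation: `test_and_set` sets the flag and returns its previous value, so of any number of simultaneous callers exactly one sees "was clear" and wins. This sketch covers threads sharing memory; for a lock file shared between processes, the analogous atomic check-and-take is `open()` with `O_CREAT | O_EXCL`.

```cpp
#include <atomic>
#include <cassert>

class TasLock {
    std::atomic_flag held_ = ATOMIC_FLAG_INIT;  // starts clear (unlocked)

public:
    // Atomically set the flag and report whether this caller was the one
    // that changed it from clear to set. There is no window between the
    // "check" and the "grab" for another caller to slip into.
    bool try_lock() noexcept {
        return !held_.test_and_set(std::memory_order_acquire);
    }

    // Spin until the lock is acquired.
    void lock() noexcept {
        while (!try_lock()) {
            // busy-wait; a real kernel would pause or yield here
        }
    }

    void unlock() noexcept {
        held_.clear(std::memory_order_release);
    }
};
```

The broken version from the question is `if (!locked) locked = true;` as two separate steps, which lets several callers pass the check before any of them performs the store.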
theseb has quit [Quit: Leaving]
zoey has joined #osdev
tacco has joined #osdev
freakazoid333 has quit [Read error: Connection reset by peer]
dennis95 has quit [Quit: Leaving]
sprock has joined #osdev
drewlander has quit [Quit: ZNC 1.7.2+deb3 - https://znc.in]
drewlander has joined #osdev
ephemer0l has joined #osdev
Skyz has quit [Quit: Client closed]
nly has joined #osdev
Skyz has joined #osdev
nly has quit [Quit: Client closed]
nly has joined #osdev
<geist> neat. got my floppy drive controller emulator thingy in
<geist> gotek floppy emulator
<mjg> :)
<mjg> emulate atari tape recorder instead
<gog> you own systems with a floppy connector?
<mjg> probably nobody does, but their bioses likely still have code for it
<mjg> still, afair qeme likes the fdd
<mjg> qemu
<geist> yes and also these are useful for replacing the floppy drive in old industrial equipment
<geist> and i have a roland keyboard with a floppy drive
<gog> ah i see
<geist> but also figured it might be fun to load it up with a bunch of disk images and then put it on the 486 or whatnot
<bslsk05> ​www.gotekemulator.com: SFR1M44-U100 GoTek 3.5Inch 1.44MB USB SSD Floppy Driver Emulator - GoTek USB Floppy Emulator Manufacturer Factory
<geist> if this works i might order a black one to replace on the roland, since its floppy drive hasn't worked in forever
Skyz has quit [Quit: Client closed]
Skyz has joined #osdev
gog has quit [Ping timeout: 258 seconds]
mahmutov has quit [Ping timeout: 252 seconds]
mahmutov has joined #osdev
GeDaMo has quit [Quit: Leaving.]
tenshi has quit [Quit: WeeChat 3.2]
nly has quit [Quit: Client closed]
CryptoDavid has joined #osdev
dormito has quit [Ping timeout: 240 seconds]
MiningMarsh has quit [Ping timeout: 246 seconds]
MiningMarsh has joined #osdev
scaleww has joined #osdev
Skyz has quit [Quit: Client closed]
srjek|home has joined #osdev
Skyz has joined #osdev
dormito has joined #osdev
<sortie> OK so I have a VM of my OS in the cloud that has frozen and become unresponsive, but I have attached a qemu monitor over VNC
<geist> yah i was wondering what was up
<geist> it hasn't been online in a while
<sortie> 11a42e:f4 hlt
<sortie> 11a42f:eb fd jmp 0x11a42e ← %rip
<sortie> So it's midnight and I'm tired but I do want to poke a bit at this lest any unattended upgrades reboot the host Linux
<sortie> Anyone good at reading registers or knowing qemu able to spot, or tell me how to tell, whether this VM has triple faulted or something?
<sortie> Or if this is indeed a deadlock
<sortie> I have something of a feeling that it's correctly sleeping on hlt, but the interrupt worker thread has stalled somehow, so packets aren't being received and delivered to user-space, so the system is idle
<sortie> Keyboard input doesn't work on the VM, I imagine for the same reason
<geist> i assume you're running it with KVM?
<sortie> KVM yes
<geist> part of the issue i see there is all the registers are reading zero
<geist> which, iirc, isn't entirely impossible with KVM sometimes
<geist> there have been points in time in the past where i've gotten incomplete or incorrect register readings
<geist> if it's just idling, is that your idle loop?
<sortie> This would be the kernel idle thread btw, so it doesn't have floating point state
<sortie> Yes, rip is my hlt loop
<geist> yah sadly that's how it looks a lot. a stuck but otherwise functioning kernel many times just appears to be in the idle loop
<geist> could have missed a timer, etc
<sortie> I assume RFL is the rflags and the 0x200 bit is set so interrupts are on
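[Editor's note] The check sortie does by eye can be written down directly: on x86, the interrupt enable flag (IF) is bit 9 of RFLAGS, which is the 0x200 mask mentioned above. The helper name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// IF (interrupt enable flag) is bit 9 of RFLAGS, i.e. mask 0x200.
constexpr std::uint64_t kRflagsIF = std::uint64_t{1} << 9;

// True if the given RFLAGS value has maskable interrupts enabled.
constexpr bool interrupts_enabled(std::uint64_t rflags) {
    return (rflags & kRflagsIF) != 0;
}

static_assert(kRflagsIF == 0x200, "IF is bit 9");
```

For example, an `RFL=00000246` reading in the monitor's register dump has IF set, while `00000046` does not.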
<geist> oh i see the regs being zero, the main ones are off the screen
<geist> yah so maybe poke at the interrupt controller state?
<sortie> Anyone know how to, uh, paginate stuff in the qemu monitor?
Skyz has quit [Quit: Client closed]
<geist> that seems to be a problem. i was actually going to look into that in a bit. when running under gnome i remember you have to make sure you have libvde (i think libvde) for it to link in some vt100/terminal support to the consoles
<geist> i have no idea if that also applies to VNC. was going to test that theory
<geist> but side note, you telling me that you can use ctrl-alt-2, etc has changed *everything* for my whole VM solution at home
<sortie> :D
<geist> i had resigned myself to the idea that you just can't get to it in my setup
<geist> but now it's all great
<sortie> I'm happy to help :)
<geist> libvte?
<sortie> I just use a qemu from my distro
<geist> anyway i'll give that a go in a bit
Skyz has joined #osdev
<geist> otherwise yeah you can't paginate or scrollback in it, which is a bummer
<geist> same with serial0 or so
<sortie> Mind you that I have a live VM here that reproduced the crash and it takes ~5 days to reproduce it
<geist> but like i said when building it manually and using a direct ui if it has vte it can let you scroll back in serial at least
<geist> sounds like you need more vms
<geist> these sort of hung system bugs are tough. generally the best solution is to start building into it your own watchdog schemes
<sortie> I believe this also happens on my sortix.sortix.org VM although that could be a different and older issue
<geist> since a stuck cpu is hard to catch
<sortie> In any case, just knowing this is a huge hint!
<geist> we have had to add increasingly complex solutions for zircon as well
<geist> though mostly in an SMP situation, cross checking each cpu from each other cpu
<froggey> ctrl+page up/down may or may not scroll the monitor
<sortie> Good suggestion, froggey, alas, that doesn't work nor any other similar thing I can think of
Skyz has quit [Client Quit]
mahmutov has quit [Ping timeout: 255 seconds]
<sortie> Hmm it could be a deadlock in the interrupt handling
<geist> froggey: yah that's what i was thinking about with the libvte stuff
<geist> ie, you don't get it unless qemu is compiled a particular way
<geist> at least with gnome, i dunno if it applies to VNC or windows or mac uis
<geist> but may only be gnome specific, since libvte seems to be gnomeish
<sortie> I can probably fix the problem with a thorough code review for deadlocks and things stalling the interrupt worker thread
<froggey> mm, yeah. it's really hit-or-miss, which is annoying
<sortie> Is there any way to check if it triple faulted, just to rule that out?
<geist> sortie: while that's a good thing, it's the stuff you missed that you didn't think to check
<geist> some subtle race with the interrupt controller, something isn't acked, or some rare case where no timer is set where it should be, etc
<geist> once you get stable enough all the fun bugs turn into these sort of things
<sortie> Also a great suggestion
<geist> so one thing to do for example is maybe run with a no-hlt mode but then have a monotonic timer running at some rate like 1Hz
<geist> and in your idle loop check that it is incrementing
<sortie> Hopefully the VM will stay up tomorrow and Saturday cuz' I'll be busy but then it's VACAY
<sortie> Then I can debug this properly
<geist> timer irq bumps a counter, etc
<geist> this is a case where SMP actually helps, because you can have different cpus cross check each other
<geist> like have each cpu run a 1Hz timer that bumps a counter, and in the timer check that the other cpus haven't stopped, etc
<sortie> Timer interrupts should still fire
<geist> should and are are different things
<geist> that's the point, it's an assert that stuff is still ticking
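[Editor's note] A minimal sketch of the cross-checking heartbeat geist describes, under stated assumptions: it is not Zircon's or Sortix's code, `Heartbeats`, `tick` and `kMaxCpus` are hypothetical names, and a real kernel would call `tick` from each CPU's periodic timer interrupt and allow some slack for timers that are not phase-aligned. Each CPU bumps its own counter and then checks that every peer's counter has moved since the last time it looked.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxCpus = 4;  // hypothetical CPU count

struct Heartbeats {
    // One monotonically increasing counter per CPU, bumped by that CPU only.
    std::array<std::atomic<std::uint64_t>, kMaxCpus> beats{};
    // seen[a][b]: the value of b's counter the last time a checked it.
    std::array<std::array<std::uint64_t, kMaxCpus>, kMaxCpus> seen{};

    // Intended to run from `cpu`'s periodic timer IRQ (for example at 1 Hz).
    // Returns the index of the first peer whose counter has not advanced
    // since this CPU's previous tick, or -1 if every peer is still ticking.
    int tick(std::size_t cpu) {
        assert(cpu < kMaxCpus);
        beats[cpu].fetch_add(1, std::memory_order_relaxed);

        int stalled = -1;
        for (std::size_t other = 0; other < kMaxCpus; ++other) {
            if (other == cpu) continue;
            const std::uint64_t now =
                beats[other].load(std::memory_order_relaxed);
            if (now == seen[cpu][other] && stalled < 0)
                stalled = static_cast<int>(other);
            seen[cpu][other] = now;
        }
        return stalled;
    }
};
```

On a single CPU the same idea reduces to the idle loop remembering the last counter value and complaining if the timer interrupt has not changed it. Either way the value of the check is that "timer interrupts should still fire" becomes something the system asserts rather than assumes.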
<sortie> They don't use the interrupt worker thread, so I can likely see if the current time is still incrementing if I can find the memory location
<geist> right
<sortie> If that is the case, then I know the interrupt controller is still working correctly
<geist> also hardware watchdogs are useful here
<geist> dunno if qemu x86 has one, but if you enable it then you also set the cpu to pet the dog every second or so
<sortie> I did want to support the qemu watchdog for reliability
<sortie> Although I would want to delegate that to user-space, so an occasional self-test of availability must pass, or the vm gets destroyed
<geist> more layers is good
<geist> i'd suggest having layers that cross check each other
<sortie> Well thanks for the great conversation
<geist> yay this is the fun stuff
<sortie> It's after midnight so now imma sleep but good pointers
<sortie> The vnc monitor gave me a really good hint I can pursue and you gave me some good tests to run to learn more
<geist> yah *usually* what happens is your cpu gets in a loop with interrupts disabled, and then HW watchdog catches it
<geist> in your case it's in the idle loop with ints enabled
<geist> but, perhaps the IRQ controller is not acked so it's stuck
<geist> or, also possible, a critical timer was not set
<geist> or, also possible, it's simply a high level deadlock
<sortie> Somehow I doubt the IRQ controller stuff being wrong since I haven't touched that stuff for many years and it's been stable
<geist> and the kernel is running fine, there's just no activity
<sortie> I will of course verify
<geist> yah but have you stress tested this way for days? this is where you start seeing subtle races you didn't know about
<sortie> Indeed
<geist> 1/10 million times sort of thing that only shows up after days of ticking
<geist> anyway. to bed you
<sortie> :)
<sortie> Always a pleasure talking
<klange> My endless adventure in bouncing between projects and never doing what's on the task list in front of me continues as I sketch out some widget toolkit thoughts once again...
gog has joined #osdev
ahalaney has quit [Quit: Leaving]
Robbe has quit [Remote host closed the connection]
Robbe has joined #osdev
Brnocrist has joined #osdev
freakazoid333 has joined #osdev
iorem has joined #osdev
Skyz has joined #osdev
<Skyz> What's the difference between a syscall and system api?