klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
dequbed has quit [Ping timeout: 250 seconds]
gwizon has quit [Ping timeout: 240 seconds]
biblio has joined #osdev
nyah has quit [Ping timeout: 260 seconds]
biblio has quit [Quit: Leaving]
<geist> have you tried to bring up the secondary cores yet? SMP on arm may uncover some bad memory barriering you have
<klange> Not yet. I want to get the basic task switching up first and then I'll play with that. Does QEMU even emulate the memory model completely enough to see those sorts of things?
<klange> I should hop over to my Mac at some point for the hvf
ElectronApps has joined #osdev
<geist> yah was gonna say then there's that. QEMU wont really emulate it properly, but... it does tend to run each emulated cpu as a separate thread and kinda yolos their write order except for explicit barriers
<geist> so to a certain extent you do get some amount of read/write ordering issues
<geist> not the full monty, but some of it
<geist> compared to an in order, single threaded, round robin emulation of a bunch of cores that qemu has traditionally done with x86
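The ordering problems geist describes usually surface when one core publishes data and another core sees the "ready" flag before the data itself. A minimal C11 sketch of the usual fix, pairing a release store with an acquire load; the `mailbox`, `publish`, and `try_consume` names are hypothetical and not taken from any kernel discussed here:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-slot mailbox shared between two cores. */
struct mailbox {
    uint64_t payload;   /* plain data, written before the flag */
    atomic_bool ready;  /* publication flag */
};

/* Producer: write the payload, then publish with release ordering so the
 * payload store cannot become visible after the flag store. */
static void publish(struct mailbox *m, uint64_t value) {
    m->payload = value;
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

/* Consumer: if the acquire load reads true, the payload written before
 * the matching release store is guaranteed to be visible. */
static bool try_consume(struct mailbox *m, uint64_t *out) {
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return false;
    *out = m->payload;
    atomic_store_explicit(&m->ready, false, memory_order_relaxed);
    return true;
}
```

Code that uses plain stores in the same order may appear to work on x86 because of its stronger hardware ordering, but nothing guarantees that, and on ARM (or under multithreaded emulation) the reordering can actually be observed.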
<zid> strong memory model for life
<geist> i've been told that qemu can multithread TCG emulation on x86 nowadays, but i dont see how it can do it
<geist> but anyway aside from memory model in *general* SMP is cleaner on ARM. you dont have to shoot cross cpu IPIs for TLB syncs, booting secondary cores is nice (with PSCI), sending IPIs is pretty easy with both GICv2 and GICv3, etc
<geist> but i gotta say impressed you grokked the ARM64 page tables that quickly. they're a real firehose of info, and the ARM manual does a terrible job of easing you into it
mahmutov has joined #osdev
<klange> I went with the 4K 4-level model so outside of the control bits most of this was copy-paste of my x86-64 memory management.
freakazoid333 has quit [Ping timeout: 240 seconds]
<klange> though yeah the ARM ARM was horrendous, both in its descriptions and ARM's web version is just absolutely one of the worst websites it has been my displeasure to navigate
<klange> I click an anchor link and it reloads the whole page because oops the tree nav sent me to a slightly different version and the anchor links hardcode the full path
<geist> i think i have a bit of Stockholm syndrome with it, and kinda prefer it being overly explicit about everything
<geist> but yes, the web version is a waste of time. get the pdfs and work with that
<geist> funny you complain about the arm docs page, it used to be *way* worse than that. it actually improved greatly in the last few years. they replaced the old thing with something new
<klange> I think I've seen the old one.
<geist> the old site used to look like a slow Sharepoint thing. the new one is a slightly faster Sharepoint looking thing
<klange> I did look at this stuff years ago.
<geist> i cant not think of the UHF firehose guy every time
freakazoid343 has joined #osdev
<klange> One thing I've enjoyed so far about ARM is how all the system registers have names that the assembler just knows.
<clever> that seems to only be true on aarch64
<geist> yesss! that was a big thing that arm really fixed there
<clever> it makes the asm far far simpler to read
<geist> there's still a way to encode a raw register for vendor extensions, etc, but yah
<clever> i was going to say, why cant binutils just add the same idea to arm32?
<geist> and the whole exception level thing i think is overall well thought out and for the most part pretty regular, also with register suffixes and whatnot
<clever> but what about co-processor register reuse?
<geist> in what context?
<clever> if a given register does X on armv6 and Y on armv7
<geist> yolo
<clever> when you disassemble, which name do you print?
<geist> yah. well same thing comes from using the ABI names vs raw names in objdump
<clever> ignoring that, you could just add this feature to arm32
<geist> `-mreg-names=raw` or whatnot
<clever> translate all mcr and mrc opcodes
<geist> sure. i think the reason it got added to arm64 is arm simply mandated it
<clever> yep
<geist> and then probably implemented it in at least binutils, so other compilers had to follow suit if they wanted to be compatible
<clever> and arm doesnt want to go back and edit arm32
<geist> whereas the old one if you added it to binutils but not clang or msvc after the fact no one would pick it up
<geist> well, it wouldn't help. now you have billions of lines of code using the old stuff, why bother trying to fix that? people aren't going to rewrite their code
<clever> that too
<geist> it was bad enough that somewhere in the 2000s they did an arm32/thumb2 tweak to their asm syntax to try to make a new unified version
<geist> that would assemble either way
<geist> i already forgot most of what was in the earlier version, but it tweaked some amount of this or that
<geist> that being said most of the control registers on cortex-m class hardware (where active arm32 development is still going on) *do* have name based mnemonics, so i guess they at least fixed the problem there
<klange> ugh one of my libs has '/home/klange/Projects/workspace/toaru-aarch64/util/local/lib/gcc/aarch64-unknown-toaru/10.3.0/../../../../aarch64-unknown-toaru/lib/libgcc_s.so.1' as a NEEDED...
<geist> i actually dont know what those accessors assemble to
<klange> how the hell did that happen...
<geist> oh yeah was gonna mention you can probably totally get away with no libgcc in the kernel on arm64 as well
<geist> same as x86-64
<geist> though i'm sure you already figured that out
<klange> I have not, but I'll look into it; this is userspace, though, and I have that lib, where it should be, and I wonder if I screwed up my binutils build again...
<geist> oh and stay away from `long double`. an important ABI difference between arm64 and x86-64 is long double is defined as 128 bit
<geist> which arm has no implementation of, so it falls back to software
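A quick way to see which `long double` format a toolchain uses is to look at `LDBL_MANT_DIG` from `<float.h>`. This is only a small sketch; the helper names are made up for illustration:

```c
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

/* True when long double is IEEE binary128 (113-bit significand), which is
 * what the AArch64 procedure call standard specifies. The x86-64 SysV ABI
 * uses the 80-bit x87 format instead (64-bit significand). */
static bool long_double_is_binary128(void) {
    return LDBL_MANT_DIG == 113;
}

/* Storage size of long double on the current target. */
static size_t long_double_size(void) {
    return sizeof(long double);
}
```

When the first helper returns true on a core without binary128 hardware, every `long double` operation becomes a call into a software float routine, which is the libgcc dependency geist is warning about.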
<clever> i also ran into confusion, when i was trying to figure out why my code was using softfloat via libgcc
<clever> turns out, it was doubles in my code, hardfloat is float (32bit) only
<clever> not the arm softfloat, but full software fpu
<geist> yah with 32bit only floats, doubles are indeed an issue
<geist> there's some gcc switch i think for warning of double promotion
<clever> if i changed my code to use float, then it properly compiled to fpu opcodes
<geist> happens in cortex-m4 for example which is single precision float only
<clever> ah, wasnt promotion for me, it was just using double and float interchangeably
<klange> whyy is this getting this from there, I don't understand
<geist> gotcha
<clever> not thinking about the side-effects
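A small sketch of the promotion case geist mentions, where a `float` silently becomes `double` because of an unsuffixed constant. The function names are hypothetical. GCC's `-Wdouble-promotion` warns about the first form, and `-fsingle-precision-constant` is another GCC option that treats unsuffixed constants as single precision:

```c
#include <stddef.h>

/* 0.5 is a double constant, so x is promoted and the multiply happens in
 * double; on a single-precision-only FPU that means a software routine. */
static float halve_promoted(float x) {
    return x * 0.5;
}

/* 0.5f keeps the whole expression in float. */
static float halve_single(float x) {
    return x * 0.5f;
}

/* The type of the intermediate expression shows the difference. */
static size_t promoted_expr_size(float x) { return sizeof(x * 0.5); }
static size_t single_expr_size(float x)   { return sizeof(x * 0.5f); }
```

Neither option helps with clever's actual situation, where the variables themselves were declared `double`; that has to be fixed in the declarations.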
<klange> I have a sysroot set properly
<klange> oh ffs it's almost definitely because it's referencing a libgcc_s.so that's actually a link script that asks for, literally, "libgcc_s.so.1"
<klange> which then forces path resolution, which further catches the first -L with the internal library path from gcc
Burgundy has quit [Ping timeout: 245 seconds]
<klange> And it has a comment saying that some functions are only in the static version, so I imagine if I remove that / make it a typical symlink it'll barf on some missing function until I further add -lgcc to the link spec...
<klange> Which means... where is my patchelf...
xenos1984 has quit [Read error: Connection reset by peer]
sdfgsdfg has quit [Quit: ayo yoyo ayo yoyo hololo, hololo.]
<klange> I will have to figure that out at some point, maybe swapping around -L ordering or removing them from the gcc path so it realizes they are in a sysroot-relative path...
fwg has quit [Quit: .oO( zzZzZzz ...]
sdfgsdfg has joined #osdev
<klange> okay, more fun quirks to deal with, but I've got a demo app doing dlopens, loading JPGs, and doing framebuffer stuff https://klange.dev/s/Screenshot%20from%202022-01-29%2010-23-19.png
fwg has joined #osdev
xenos1984 has joined #osdev
<geist> woot!
<klange> < geist> oh yeah was gonna mention you can probably totally get away with no libgcc in the kernel on arm64 as well ← looks like the atomics I'm trying to use for spin locks are still yielding libgcc references
<klange> Presumably I can figure out how those are implemented and just patch them in myself...
<geist> oh there's a switch for that
<geist> i think gcc has some sort of 'automatically pick v8.0 vs v8.1 atomics via a libgcc call' feature
<geist> that was added very recently
<geist> you can disable that in which case it inlines whatever atomics you have
<klange> -mno-outline-atomics maybe?
<geist> that one
<klange> \o/ it links
<geist> the v8.1 atomics are nice, which you can just choose to use if you want, but then you'll only run on newer cores
<geist> TCG with `-cpu max` will pick it up
<geist> i should look into that, curious how that's implemented. or if LK is already using it and i just didn't notice
<geist> off hand what was one of the referenced symbols?
<klange> __aarch64_swp4_acq was the main one
[itchyjunk] has joined #osdev
<klange> But I also dabble with the atomic bit manips for process status flags which were yielding things like __aarch64_ldclr4_acq_rel
<geist> cool
<klange> maybe those are going to save me from the memory model, I know they just shit out regular ands and ors on x86 (maybe `lock`-prefixed?)
<geist> ah sure enough, already emitting them and i just didn't notice
<klange> like how I littered everything with spin locks long before I had SMP and it mostly worked except the handful of locks that were badly ordered
fwg has quit [Quit: so long and thanks for all the fish.]
<klange> locks too might help, those sync intrinsics are supposed to also force memory barriers where needed?
<geist> yep
<geist> arm atomics have c++ style acquire/release semantics built in
<geist> none/either/both
<geist> whatever your kernel mutex thing does should have a spinlock or an atomic in there anyway
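A minimal C11 sketch of the two patterns discussed above: a spin lock built on an acquire exchange with a release store to unlock, and an acquire/release bit clear on a status word. All type and function names are hypothetical. With GCC targeting baseline ARMv8.0 and outline atomics enabled, operations like these are the likely source of helper calls such as `__aarch64_swp4_acq` and `__aarch64_ldclr4_acq_rel`; with `-mno-outline-atomics` they should be inlined instead.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { atomic_uint locked; } spinlock_t;  /* hypothetical type */

/* Swap in 1 with acquire ordering; an old value of 0 means we own it. */
static bool spin_try_lock(spinlock_t *l) {
    return atomic_exchange_explicit(&l->locked, 1u,
                                    memory_order_acquire) == 0u;
}

static void spin_lock(spinlock_t *l) {
    while (!spin_try_lock(l)) {
        /* Wait on plain loads until the lock looks free, then retry. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0u) {
        }
    }
}

/* Release ordering makes writes from the critical section visible to the
 * next core that acquires the lock. */
static void spin_unlock(spinlock_t *l) {
    atomic_store_explicit(&l->locked, 0u, memory_order_release);
}

/* Clear bits in a status word and return the previous value. */
static uint32_t flags_clear(_Atomic uint32_t *flags, uint32_t mask) {
    return atomic_fetch_and_explicit(flags, ~mask, memory_order_acq_rel);
}
```

Because the ordering is part of each atomic operation, no separate barrier instruction is needed around the lock and unlock paths in this sketch.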
<bslsk05> ​IRCCloud pastebin | Raw link: https://irccloud.com/pastebin/raw/L6LuZiK3
srjek has quit [Ping timeout: 240 seconds]
<geist> so that's an implementation of it. seems to read from some global to decide to use ldadd or ldxr/stxr
<geist> but.. what sets that global? i wonder if i'm supposed to set it up somehow
<geist> though the default polarity is if it's zero to use the legacy one (via the cbz instruction)
scaramanga has quit [Ping timeout: 240 seconds]
<geist> well there you go: ffff000000186a60 g O .bss0000000000000001 .hidden __aarch64_have_lse_atomics
<geist> guess something is supposed to set that
<geist> sounds easy enough
<klange> there's probably an initializer being baked in from it?
<klange> do you call initializers? that feels like a C++y thing to do
<geist> i do, but i suspect it's not built into libgcc because there's no platform abstract way of reading if the atomics are available or not
<geist> unlike x86 the cpuid like equivalent is not available in EL0
<geist> you have to get this sort of feature info out of the kernel
<geist> yah something like this https://gcc.gnu.org/pipermail/gcc-cvs/2020-June/300635.html which seems to be reading it from linux's aux array
<bslsk05> ​gcc.gnu.org: [gcc/devel/ranger] [AArch64] Use __getauxval instead of getauxval in LSE detection code in libgcc
<klange> I had just typed and then deleted 'maybe it expects it in an auxv or something, surely it's documented somewhere'
<klange> and then thought I'd go look first, but there ya go
<geist> yah. we have some sort of equivalent syscall in fuchsia too. it's a pain but i kinda get it. it's not EL0s job to interrogate the cpu. EL0 gets to know what EL1 deems necessary to tell it
<geist> vs the security nightmare that is cpuid
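On Linux, the feature information geist mentions reaches user space through the auxiliary vector. A hedged sketch of that check, assuming a Linux libc that provides `getauxval`; the `have_lse_atomics` function and the `SKETCH_HWCAP_ATOMICS` macro are illustrative names, and the bit value mirrors `HWCAP_ATOMICS` from the Linux arm64 headers:

```c
#include <stdbool.h>

#if defined(__linux__)
#include <sys/auxv.h>  /* getauxval, AT_HWCAP */
#endif

/* Bit 8 of AT_HWCAP is HWCAP_ATOMICS on Linux arm64; defined locally so
 * the sketch builds without <asm/hwcap.h>. */
#define SKETCH_HWCAP_ATOMICS (1UL << 8)

static bool have_lse_atomics(void) {
#if defined(__linux__) && defined(__aarch64__)
    /* The kernel read the CPU ID registers at EL1 and summarised them. */
    return (getauxval(AT_HWCAP) & SKETCH_HWCAP_ATOMICS) != 0;
#else
    /* Other targets: no LSE, or no portable way to ask from user space. */
    return false;
#endif
}
```

A hobby OS with no auxv would need its own equivalent (a syscall or a startup-provided value) to decide what to store in `__aarch64_have_lse_atomics`, or it can sidestep the question with `-mno-outline-atomics`.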
pieguy128 has quit [Ping timeout: 240 seconds]
<geist> riscv takes that further: supervisor mode (EL1 equivalent) doesn't get to read any of this, including what the current cpu # is. only machine mode (EL3 equivalent) gets to read that infos
<geist> supervisor has to get it from EL3 using software
<klange> Anyway, I should take a break. Userspace framebuffer access is working, linker is working in most situations though I'll need to do more work for TLS relocations (sure hope I did those regs right)...
<geist> means riscv has very few registers it needs to trap and virtualize in some sort of virtualization scheme
<geist> yah tpidrro, etc
<geist> good work
<clever> geist: but also, you cant run supervisor without an EL3 to answer those questions
<geist> on riscv yeah. by definition machine mode *must* be implemented on riscv
<geist> on arm it's the opposite, EL2 and EL3 may be omitted
<klange> oh right, I had a question for you; what's the least shitty way I can use the EL1 one of those for kernel "per core" stuff? I banged it into an inline mrs call and that's working but I'd like to convince it to do the mrs fewer times
<geist> well, there's tpidr_el1 which is handy to store the current thread pointer
<geist> x18 is ABI reserved as being 'whatever the platform wants to do with it', so that's also a nice place to put the per cpu register *or* the current thread register
<geist> with the caveat that it must be assumed to be trashed in an exception/irq coming out of EL0
<geist> so you always have to end up using tpidr_el1 to recover state one way or another
<geist> -ffixed-x18 will make double super sure the compiler doesn't fiddle with it
<klange> Hm, maybe using x18 for the actual access is better and then I'll just make sure that gets restored from TPIDR as part of the various transitions
<geist> that's basically what we do in zircon up until we added shadow call stacks and whatnot and it got complicated
<geist> now we burn both x18 and x20, but same idea, we just have another thing to track (shadow call stack)
<geist> i think my idea is tpidr_el1 is good for the current thread, since you always have to move it to a regular register and it is thus preemption sensitive. but current thread by definition cant change across preemption so it's safe
<geist> and then x18 you can actually one instruction access something from it if its the current cpu. basically like you probably do against gs:
<geist> ie, `ldr x0, [x18, #8]` sort of stuff
<geist> but then based on how you use it it being preemption safe may or may not matter
<klange> Yeah, that's the ideal. Current setup is kinda assuming TPIDR can change any time it needs to be accessed which would be true if I did kernel preemption, but instead it should only be true when calling into a function that can yield.
<klange> If it's just "use this register whenever" that's going to be better.
<geist> yah
<klange> When I get to actually spinning up more cores I'll look at forcing the register, thanks :)
<klange> I was so happy with how "static struct ProcessorLocal __seg_gs * const this_core = 0;" managed to produce very good code.
<geist> yah
<geist> you can do something with the global asm thing in gcc
<geist> something like `extern uint64_t asm("x18") curr_cpu;` or whatnot
pieguy128 has joined #osdev
<geist> i forget the precise order, but it does work
<geist> caveat i think clang is buggy with it. we had to stop using it in zircon cause we're mostly clang based
<klange> `static struct ProcessorLocal * asm("x18") this_cpu;` might work? I'll give it a poke...
<geist> yah something like that
<geist> though dereferencing things directly off that you can't really guarantee with a single instruction if that's important
<geist> so i think for zircon we defined some accessor inline asm that absolutely uses the right instruction
<geist> for things like get_current_cpu_num() or whatnot
<klange> asm after the name, let's see what it produces
<geist> and i think you need the -ffixed thing or the ABI says it can be used as another temporary (after x16 and x17)
<geist> or you bake it into your triple
<geist> x19 is the first callee saved one, which is why x18 is the victim for this sort of thing
<geist> but if it doesn't work consistently, easy enough to just write some inline asm accessor that returns x18 directly. what's nice about using the variable is it can generally avoid an extra mov. inline asm tends to generate code that moves from x18 to something else, and then dereference the other thing
fwg has joined #osdev
<klange> `register struct ProcessorLocal * this_core asm("x18");`
<geist> oh it's re...yeah you found it
<klange> and then asm volatile ("mrs x18, TPIDR_EL1"); as needed to restore it
<geist> then you're going to anchor current thread off that too?
<klange> current thread for that is just a member of the struct, which I know I'm taking hits on but it's really straightforward
<geist> no that sounds fine.
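Putting the pieces of this exchange together, a sketch of the x18 scheme might look like the following. The struct fields and the `SKETCH_KERNEL_BUILD` macro are hypothetical; the kernel branch assumes GCC, `-ffixed-x18`, and code running at EL1, while the fallback branch exists only so the idea can be exercised on a hosted machine:

```c
#include <stddef.h>

/* Hypothetical per-core block; field names are illustrative only. */
struct ProcessorLocal {
    int   cpu_id;
    void *current_thread;
};

#if defined(__aarch64__) && defined(SKETCH_KERNEL_BUILD)
/* Kernel build: pin the pointer to x18 so a field access can be a single
 * load such as `ldr x0, [x18, #8]`. */
register struct ProcessorLocal *this_core __asm__("x18");

/* x18 may be trashed while EL0 runs; reload it from TPIDR_EL1 on entry. */
static inline void restore_this_core(void) {
    __asm__ volatile("mrs x18, TPIDR_EL1");
}
#else
/* Hosted stand-in: one variable plays TPIDR_EL1, the other plays x18. */
static struct ProcessorLocal *fake_tpidr_el1;
static struct ProcessorLocal *this_core;

static inline void restore_this_core(void) {
    this_core = fake_tpidr_el1;
}
#endif

/* Current thread is just a member of the per-core struct. */
static inline void *current_thread(void) {
    return this_core->current_thread;
}
```

In a real kernel the restore would sit in the exception entry path before any C code touches `this_core`, and TPIDR_EL1 would be written once per core during bring-up.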
<klange> for userspace thread stuff it's just the standard ABI thing which is "shove the weird pointer backreference thing in TPIDR_EL0"?
<geist> side note: some msr registers are fast to access and have no interlocks, and IIRC the TPIDRs are considered in the fast class
<geist> yah. there's also TPIDRRO_EL0 but i *think* that's unused by everything
<geist> it's explicitly read only by user space
<geist> could put like the current cpu # in it or something
<geist> (like the top of rdtscp)
<klange> this looks much cleaner than what I was getting with the mrs but I think that's because gcc was not handling that as nicely since it was literally an inlined asm snippet
CryptoDavid has quit [Quit: Connection closed for inactivity]
<geist> but like i said caveat is clang is iirc a bit buggy with it. or maybe it was that it worked fine for x18 but we wanted to use another reg and then it didn't deal with it right
<geist> it was because the shadow call stack feature of clang is hard coded to simply use x18 as a second stack pointer, so you have to move to something else for this
<bslsk05> ​fuchsia.googlesource.com: zircon/kernel/arch/arm64/include/arch/current_thread.h - fuchsia - Git at Google
<geist> ah yes. but the reason we can do that is we're compiling with -mcmodel=kernel, which i dont think gcc understands
<geist> otherwise it'd try to use the _el0 version
<geist> but yeah if you can use the builtin it is smarter about reusing multiple calls to it, and whatnot, since i think it understands that the thread pointer can never change
<klange> anyway, this is progressing quite nicely and I think I should take a break for lunch
<geist> https://fuchsia.googlesource.com/fuchsia/+/a3e235192ed4f3254a00eacd5015466599f9e97c/zircon/kernel/arch/arm64/include/arch/arm64/mp.h#64 is the older use of x18 for the current cpu. it's x20 now and we had to stop using the register variable
<bslsk05> ​fuchsia.googlesource.com: zircon/kernel/arch/arm64/include/arch/arm64/mp.h - fuchsia - Git at Google
<geist> this must be an older revision
<geist> kk, will stop talking to you
<geist> well, i mean keeping you from eating that is :)
<klange> probably, I had found it because I was specifically searching for 'tpidr_el1' after reading about it and came across that file from google results
<klange> that git hash is from nov 2020
<bslsk05> ​fuchsia.googlesource.com: zircon/kernel/arch/arm64/include/arch/current_thread.h - fuchsia - Git at Google
<geist> yah for the thread one
<geist> the current cpu is defined in mp.h
<bslsk05> ​fuchsia.googlesource.com: zircon/kernel/arch/arm64/include/arch/arm64/mp.h - fuchsia - Git at Google
<geist> i dont think you need that, all you really really need as an anchor that survives syscalls and whatnot is TPIDR
<geist> but we do both because
<geist> also fun thing: when you enter EL0 and youre using the usual SP-banking-scheme you also have that to survive an exception
<geist> can do some fun trickery by embedding things in SP
<geist> SP_EL1 that is, when EL0 is using SP_EL0
<geist> also a lovely scheme that most sane architectures have: banked SPs per mode. avoids all of the TSS like shenanigans
<klange> If I can get my kernel context switching (which is really just setjmp/longjmp) working I should be able to get to a full GUI soon, and then I'll clean up this branch and push it so I can more easily point to what I'm doing when I ask questions about how to do it better.
<geist> 👍
mahmutov has quit [Ping timeout: 256 seconds]
<gog> klange a month ago: "i'm done working on osdev" klange today: "i'm porting to aarch64 and it's going pretty good"
<gog> :D
<klange> i am endlessly fueled by spite
<gog> same, mood
<klange> think i'm not qualified, do ya? well i'll f***in' show ya
<gog> spite of the self is what got me where i am today :P
<gog> not so much with osdev, just generally
<kingoffrance> does that mean it is like <3-letter> agency, or <insert substance here> ....you never really quit osdev ? once you know enough what is possible, how can you not jump in?
<zid> what else can we spite
<zid> I wanna spite something
<gog> hm
kingoffrance has quit [Ping timeout: 250 seconds]
scaramanga has joined #osdev
zaquest has quit [Remote host closed the connection]
zaquest has joined #osdev
kingoffrance has joined #osdev
gog has quit [Quit: byee]
<klange> resolved the libgcc_s.so.1 link problem with `-l:`; NEEDED gets the clean version, toolchain still finds the right one
pretty_dumm_guy has quit [Quit: WeeChat 3.4]
[itchyjunk] has quit [Ping timeout: 256 seconds]
gorgonical has quit [Remote host closed the connection]
sdfgsdfg has quit [Quit: ayo yoyo ayo yoyo hololo, hololo.]
gorgonical has joined #osdev
sdfgsdfg has joined #osdev
warlock has quit [Quit: leaving]
warlock has joined #osdev
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
mahmutov has joined #osdev
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
<geist> okay so added code to read the mmio and io aperture out of the FDT. not too bad
<geist> basically there's a list of type, base/base/size 64bit values
<geist> the double base is because it can list an io range as 'start of io port, mmio where it's mapped, length'
<geist> so that's nice for cases like ARM
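A sketch of decoding such a list, assuming it is the `ranges` property of a PCI host bridge node with 3 child address cells, 2 parent address cells, and 2 size cells (7 big-endian 32-bit cells per entry). Those cell counts are an assumption; a real parser should read `#address-cells` and `#size-cells` from the tree. The struct and function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one `ranges` entry. */
struct pci_range {
    uint32_t type;      /* 1 = I/O, 2 = 32-bit MMIO, 3 = 64-bit MMIO */
    uint64_t pci_base;  /* address on the PCI side (e.g. first I/O port) */
    uint64_t cpu_base;  /* where the CPU sees it in physical memory */
    uint64_t size;
};

/* FDT cells are big-endian 32-bit words. */
static uint32_t be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static uint64_t be64(const uint8_t *p) {
    return ((uint64_t)be32(p) << 32) | be32(p + 4);
}

/* Decode up to `max` entries of 28 bytes each; returns how many were
 * decoded. A trailing partial entry is ignored. */
static size_t parse_pci_ranges(const uint8_t *prop, size_t len,
                               struct pci_range *out, size_t max) {
    size_t n = 0;
    for (size_t off = 0; off + 28 <= len && n < max; off += 28, n++) {
        const uint8_t *e = prop + off;
        out[n].type     = (be32(e) >> 24) & 0x3;  /* space code bits */
        out[n].pci_base = be64(e + 4);
        out[n].cpu_base = be64(e + 12);
        out[n].size     = be64(e + 20);
    }
    return n;
}
```

For an I/O entry, `pci_base` is the first port number and `cpu_base` is the MMIO address where that port window is mapped, which is the "double base" described above.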
<zid> what's an FDT
<zid> firmware device table?
<zid> flimsy duck trap?
<geist> yes
<geist> flattened device tree
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
gorgonical has quit [Ping timeout: 240 seconds]
gorgonical has joined #osdev
the_lanetly_052 has joined #osdev
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
<geist> that page_directory isn't page aligned, i guess that's a pointer to it?
<geist> oh also. SP *must* be 16 byte aligned at all times
<klange> it's a struct with a refcount attached yeah
<geist> or at least there's a SCTLR bit that enforces it because the ABI mandates it
<geist> if you set the bit it instantly faults the moment its ever not aligned
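A tiny sketch of keeping a stack pointer 16-byte aligned when setting up a new thread or entering userspace; the helper names are hypothetical. The enforcement geist mentions is controlled by the SA and SA0 bits of SCTLR_EL1 (for EL1 and EL0 respectively):

```c
#include <stdbool.h>
#include <stdint.h>

/* Round a prospective stack top down to a 16-byte boundary. Stacks grow
 * down, so rounding down stays inside the allocation. */
static uintptr_t align_stack_top(uintptr_t top) {
    return top & ~(uintptr_t)0xF;
}

/* Check that a value is suitable to load into SP. */
static bool sp_is_aligned(uintptr_t sp) {
    return (sp & 0xF) == 0;
}
```

Applying `align_stack_top` after pushing argv, envp, or auxv data onto a new user stack is one way to avoid an alignment fault on the first stack access.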
<klange> not sure what's happening with the stack pointer, it's probably off somewhere going into userspace
<geist> also fun fact: the address space goes all the way up to 0x0000.ffff.ffff.ffff
<geist> so you got a whole nother bit there
<geist> ( in case you're using a #define from x86 there with the 0x7f... address
<geist> )
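A sketch of the user address limit difference, assuming 48-bit virtual addresses on AArch64 and 4-level paging on x86-64. The macro and function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Top of the lower (TTBR0) half with 48-bit virtual addresses. */
#define USER_ADDR_MAX_AARCH64 UINT64_C(0x0000ffffffffffff)
/* Top of the lower canonical half on x86-64 with 4-level paging. */
#define USER_ADDR_MAX_X86_64  UINT64_C(0x00007fffffffffff)

/* Validate a user buffer [base, base + len) against a limit. Rejects
 * empty ranges, wraparound, and anything ending past max. */
static bool is_user_range(uint64_t base, uint64_t len, uint64_t max) {
    if (len == 0 || base > max)
        return false;
    return len - 1 <= max - base;
}
```

A pointer such as `0x0000800000000000` is a valid user address under the AArch64 limit but would be rejected by a check carried over unchanged from x86-64.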
ravan has joined #osdev
dormito has quit [Ping timeout: 240 seconds]
rustyy has quit [Remote host closed the connection]
rustyy has joined #osdev
<klange> and it did not crash, it's paused on a wfi that will never awaken without any hardware to irq
ThinkT510 has quit [Quit: WeeChat 3.4]
ThinkT510 has joined #osdev
ElectronApps has quit [Remote host closed the connection]
dormito has joined #osdev
gorgonical has quit [Ping timeout: 256 seconds]
mahmutov has quit [Ping timeout: 240 seconds]
<klange> and if i tick the clock during the idle thread, I can get a panel https://klange.dev/s/Screenshot%20from%202022-01-29%2018-09-27.png
<klange> I'm hitting my stack guards (surprised those are working) in the kernel, which I thought would happen... I'm still wrapping my head around these stack registers
vdamewood has joined #osdev
C-Man has quit [Ping timeout: 250 seconds]
<geist> oh yeah? you mean the split SP_EL1 and SP_EL0?
ElectronApps has joined #osdev
<geist> generally they're just separate, unless you fiddle with the SPSel register and use the alternate mode (where SP = SP_EL0 in EL1) but that's only if you have some specific reason to do it
<geist> and iirc on every exception entry the SPSel flips automatically such that SP = SP_ELx
<klange> oh actually I hit the stack protector because I failed to properly unset it as a stack protector when freeing it
<klange> haha
<klange> omg the clock is ticking and i have a tutorial
<Ermine> Yay
<klange> so if i understand my own code right I think this will only be ticking the clock when there's nothing else to do, so it's likely a cooperatively-tasked toaru :)
<moon-child> isn't that 'tickless'? Or is tickless something else?
GeDaMo has joined #osdev
heat has joined #osdev
<heat> morning
<klange> I think tickless means the clock source isn't regular, but on-demand? But you generally still have _a_ viable pre-emptive source somewhere.
<moon-child> hmmm
<heat> tickless means that the scheduler doesn't tick regularly
<heat> only ticks when the next schedule-out needs to happen
<heat> dynticks ticks regularly but stops ticking when the CPU is fully idle
<klange> Yeah, so what I've got right now is classic cooperative scheduling.
<moon-child> heat: right, ok, that makes more sense
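The distinction heat draws can be sketched as two ways of choosing the next timer deadline. This is an illustrative simplification with made-up names, using `NO_DEADLINE` to mean "nothing scheduled":

```c
#include <stdint.h>

#define NO_DEADLINE UINT64_MAX

/* Periodic tick: always fire one period from now, needed or not. */
static uint64_t next_timer_periodic(uint64_t now, uint64_t period) {
    return now + period;
}

/* Tickless: fire at the earlier of the running task's slice end and the
 * earliest sleeper wake-up. If both are NO_DEADLINE, the result is
 * NO_DEADLINE and the timer can stay off. */
static uint64_t next_timer_tickless(uint64_t slice_end,
                                    uint64_t next_wakeup) {
    return slice_end < next_wakeup ? slice_end : next_wakeup;
}
```

A purely cooperative scheduler like the one klange describes is a third case: no deadline is ever used to force a task off the CPU, and the clock only advances when the idle path runs.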
<klange> I have done nothing around interrupts yet except set up the one vector that was necessary for SVC calls, and I think my implementation is garbage.
<klange> whats a gic, sounds icky
<klange> Anyway, this is... honestly farther than I thought I'd get this weekend and it's only Saturday night
<heat> klange, the gic is the pic for arm i think
<klange> heat: that was one of those joke things
<heat> hahahahahaha
<heat> haha
<heat> haha
<heat> ha
<heat> :)
<klange> it was at best two has of funny
<klange> i wonder if I can start a terminal
<klange> oh right that requires threads and I still have clone disabled https://klange.dev/s/Screenshot%20from%202022-01-29%2019-07-24.png
<heat> honestly really impressive you have it going this quickly
<heat> my riscv64 port has been frozen for months
<heat> now I'm thinking that maybe arm64 would be cooler :/ I actually have hardware for that
<klange> that is why I opted for it
<heat> you can grab an arm64 board for 15 euro, but a riscv64 is like 100+ :/
<klange> okay I can't start a terminal quite yet but here's `uname -snr` and `uname -vmo` in a dialog box https://klange.dev/s/Screenshot%20from%202022-01-29%2019-15-17.png
<sortie> klange, boom!
<sortie> Wow you're making fast progress :)
<geist> klange: you wired up some of the gic or you wouldn't have a timer working right?
<klange> I am WFE'ing on the periodic event source from the system timer, so no
<geist> oh i see. and yeah the synchronous exceptions like svc come in via a different vector instead of IRQ/FIQ
<geist> by default the virt machine will do gicv2 which is pretty straightforward to get working so no biggie
<geist> gicv3 is a teensy bit more complicated, but not too much
<klange> i'm just messing around at this point, but i disabled the input thread for the terminal and got one running on startup to spit out my `sysinfo` command and go to a shell https://klange.dev/s/Screenshot%20from%202022-01-29%2019-33-50.png
<heat> geist: something that's kind of related and has been discussed in edk2: for the archs where the linker really doesn't let you link with user-space libgcc/compiler-rt (like riscv64), why not compile it as part of the build?
<klange> pay no mind to the lack of info or line feed on the CPU entry... it calls a kuroko script and that's bailing on TLS stuff at the moment
<heat> i think that theoretically works and you get all your intrinsics
<geist> heat: compile libgcc.a as part of it?
<geist> if you want i guess. but then you have to recreate that part of the build system
<heat> something like it, yeah
<heat> compiler-rt is liberally licensed and looks pretty simple
<klange> libgcc's build is super gnarly, compiler-rt looks like it's simpler
<geist> yah my general issue with compiler-rt is it's generally less standaloney
<geist> though i guess if you're pulling in just what you need that might solve that problem
<heat> for a bare metal app you'd generally only need what, big int operations?
<heat> maybe soft fp if you're doing something more involved with lk or something
<geist> depends a bit on the arch, but yeah
<geist> maybe float
<geist> and assist for atomics potentially
<heat> the builtins part of compiler-rt seems to be a really simple set up
<heat> might be a viable option
<heat> i've yet to tackle that issue in my kernel
<geist> yah probably so. iirc it generally includes more than just simple builtins which is why i think it's a bit harder to build as a pure standalone
<geist> but if you get your hands dirty and customize it it probably is fine
<geist> i just haven't done that
<heat> yup
<heat> i remember seeing some undefined references to atomics stuff when building my kernel for rv64
<heat> because I tend to use the compiler's atomics and those are probably builtins in rv64
<geist> yah fun one i literally just discovered: using __builtin_ffs() on rv seems to bottom out in a call to ffs() that you must provide
<geist> oddly __builtin_clz() has a libgcc thing to implement it though
<bslsk05> ​IRCCloud pastebin | Raw link: https://irccloud.com/pastebin/raw/VprcunD1
<geist> looks like it's using some sort of table, __clz_tab
<geist> probably nibble or byte at a time
<heat> gcc or clang?
<geist> gcc
<geist> there's not even instructions for endian swaps:
<bslsk05> ​IRCCloud pastebin | Raw link: https://irccloud.com/pastebin/raw/a1toHIrx
<heat> oh wow I think this is just a table with 0xff entries
<heat> as stupid as they come
<geist> yah pretty fast though, you can easily do a popcnt or clz or whatnot with a table
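A sketch of a byte-at-a-time table lookup in the spirit of what geist describes. This is not libgcc's actual `__clz_tab` code; the table here is built at startup and all names are hypothetical:

```c
#include <stdint.h>

/* bitlen_tab[i] = number of significant bits in byte i (0 for i == 0). */
static uint8_t bitlen_tab[256];

/* Each entry is one more than the entry for the value shifted right. */
static void bitlen_tab_init(void) {
    for (int i = 1; i < 256; i++)
        bitlen_tab[i] = (uint8_t)(bitlen_tab[i >> 1] + 1);
}

/* Count leading zeros of a 32-bit value: find the highest non-zero byte,
 * then look up its bit length. Returns 32 for x == 0 (unlike
 * __builtin_clz, whose result is undefined for 0). */
static int clz32_table(uint32_t x) {
    int shift;
    if (x >= (1u << 24))      shift = 24;
    else if (x >= (1u << 16)) shift = 16;
    else if (x >= (1u << 8))  shift = 8;
    else                      shift = 0;
    return 32 - (shift + bitlen_tab[x >> shift]);
}
```

`bitlen_tab_init()` must run once before the first lookup; a constant table baked into the binary would avoid that step at the cost of 256 bytes of read-only data.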
g1n has joined #osdev
<g1n> hello
<g1n> forgot to join this channel after restart of irc client lol
<g1n> so i am starting os rewrite from scratch to understand it more
<g1n> last time i tried i had a lot of code that i didn't understand lol
<sortie> :)
<geist> grats getting back into it
<g1n> :)
<g1n> idk how long i will try, but i hope i will at least find how to print hello world using C, totally on my own lol
<geist> cool! yeah that's a pretty good way to learn
<g1n> :)
<heat> congrats welcome to the biggest time sink of your life
<kazinsal> truth
<GeDaMo> OSdev or IRC? :P
<g1n> lol
<g1n> i was already in osdev for several months, but that code is bad so
<bslsk05> ​codeberg.org: GRU/orion: Simple OS to learn C - orion - Codeberg.org
<g1n> if anyone would be interested in other our projects check https://gruos.org/
<bslsk05> ​gruos.org: GRU
<heat> oh my
<heat> james molloy code
<heat> this is a clasic
<heat> classic*
* kazinsal screams
<kazinsal> FORBIDdEN
<heat> yeah this is a mix-and-match of lots of tutorials
<heat> good job in realising this is totally broken and bad
<zid> indian outsource the OS
<g1n> lol
vdamewood has quit [Read error: Connection reset by peer]
dormito has quit [Ping timeout: 250 seconds]
vdamewood has joined #osdev
elastic_dog has quit [Ping timeout: 240 seconds]
elastic_dog has joined #osdev
Burgundy has joined #osdev
bgs has joined #osdev
sdfgsdfg has quit [Quit: ayo yoyo ayo yoyo hololo, hololo.]
bgs has quit [Read error: Connection reset by peer]
bgs has joined #osdev
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
gog has joined #osdev
<g1n> hi gog
<gog> hey g1n
<g1n> whats up gog?
<gog> just waking up. at 1 in the afternoon lol
<gog> i didn't go to sleep until 6am tho
<g1n> oh
mahmutov has joined #osdev
<vdamewood> It's almost 6AM here and I haven't slept all night.
<gog> damn
<vdamewood> I woke up around 10 PM
<gog> i woke up at 10am yesterday and walked a total of 15km
<gog> needless to say i was pretty exhausted but i still ended up staying up until morning lol
<vdamewood> My (real life) kitty demands attention.
<gog> :o kitty
dormito has joined #osdev
<vdamewood> kitty go prrrr
<heat> cat go
<gog> noooo you can't just lay around all day and sleep
srjek has joined #osdev
<g1n> lmao
C-Man has joined #osdev
pretty_dumm_guy has joined #osdev
<heat> oh fuck off
<heat> linker script issues again
C-Man has quit [Ping timeout: 245 seconds]
* vdamewood fucks the linker scripts
* gog observes in horror
<vdamewood> *unsettled Tom*
<heat> if anyone doesn't deserve to be fucked, it's linker scripts
nyah has joined #osdev
* vdamewood unfucks the linker scripts.
tenshi has joined #osdev
<heat> de-fucks
<gog> git unfuck
<heat> woohooo it boots
<heat> geist, what's the difference between loading the kernel with OpenSBI and loading the kernel with -bios none?
<heat> in riscv64 qemu of course, machine virt
dude12312414 has joined #osdev
<Mutabah> The mode for one
<Mutabah> with OpenSBI you start in supervisor mode, with `-bios` you start in machine mode (iirc)
<heat> and I want supervisor mode I assume
<Mutabah> well, you can always enter it yourself from machine mode
<heat> yea
<heat> i'm not finding the riscv manual
<heat> i assume it's not the isa one
<Mutabah> Which manual?
<Mutabah> (what are you looking for?)
<heat> well, I'm looking for the intel manuals in riscv ;)
<Mutabah> The ISA specs on github are what I used
<Mutabah> they're VERY different to the intel ones (in design)
<heat> GeDaMo, yes exactly, thanks
<Mutabah> Read the text, it has important information
<GeDaMo> I don't know if that's the latest version
<heat> ah wait I was reading the unprivileged one
<heat> that makes sense
the_lanetly_052_ has joined #osdev
the_lanetly_052 has quit [Ping timeout: 240 seconds]
ElectronApps has quit [Remote host closed the connection]
dennis95 has joined #osdev
<heat> excuse my noob riscv asm but why is this faulting: https://gist.github.com/heatd/199cc2f7441a85dab05adf82b0b902b2
<bslsk05> ​gist.github.com: riscv_boot.S · GitHub
<heat> oh wait I know why this is faulting
<heat> rubber duck debugging at its finest
<heat> i forgot to identity map
unmanbearpig has quit [Quit: WeeChat 3.3]
the_lanetly_052_ has quit [Ping timeout: 256 seconds]
<heat> https://gist.github.com/heatd/59fc0131bdbcd2e299f296add28557ef welp I fixed everything I noticed was broken but it's still broken
<bslsk05> ​gist.github.com: paging_stupid_rv64.S · GitHub
<heat> info mem gives me no mappings even though the in memory array looks fine
<heat> top page table's entries only have V=1, they point to 2nd level page tables which have 0xf set (RWX + V = 1)
<vdamewood> Rubber ducky, you're the one. You make dev'ing lots of fun.
<gog> vdamewood: did you go to bed yet
<vdamewood> gog: No! Sleep is for the week, and this is the weekend!
<gog> fair
<vdamewood> Serious answer. No. I'll probably be up for about 6 more hours.
<vdamewood> Right now, I have to figure out how my web app is getting its database populated, so I can stop it and restore from a backup instead.
<vdamewood> Not as fun as writing a bootloader.
<gog> ooof
<vdamewood> Found it!
<gog> \o/
<vdamewood> Now I just need to figure out how to 'restore' the backup I made from production.
simpl_e has quit [Read error: Connection reset by peer]
<heat> ok got riscv mmu bringup
<heat> the sky is the limit now
<zid> I'm giving up on C, you can't make it optimize bank select mmio registers
<EtherNet> zid: that sounds complicated
<geist> heat: yay grats!
<zid> qbasic only from now I'm afraid
<geist> figured it out while i was asleep
<geist> the page tables in riscv are almost comically simple
<bslsk05> ​gist.github.com: gist:c43521f2a97e84861d995a960b6fe4af · GitHub
<geist> since you're running in supervisor mode you'll also want to find the SBI spec docs
<geist> zid: probably a little more straightforward to just have an inline function that sets the bank then writes the reg
bxh7 has joined #osdev
<zid> geist: sure, but it still can't optimize it
<geist> could even store in a global the last bank that was selected and only switch if
<zid> It was just a quick hacked up example
<geist> what would optimize look like in this case?
<zid> not writing 0 to the bank select twice
<geist> ah yeah. probably would need to track it as i said
<zid> yea but that probably fails as often as it works
<geist> if it's all inlined and the 'current bank select' variable is local it probably would optimize that
<geist> what do you mean fails?
<zid> mispredicts, as it were
<zid> fails to elide the check on the static
<geist> oh sure, but the check on the static should still be much faster than an extraneous write to hardware
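A rough sketch of the "remember the last bank" idea, assuming a hypothetical device with one bank-select register and one data window. Plain variables stand in for the MMIO registers so it runs on a host, and `bank_writes` exists only to make the saved write visible.

```c
#include <stdint.h>

/* Hypothetical device: one bank-select register and one data window.
 * Ordinary variables stand in for the real MMIO registers here. */
static volatile uint8_t bank_select_reg;
static volatile uint8_t data_reg;
static unsigned bank_writes;   /* counts bank-register writes, for illustration */

/* Last bank written to hardware; -1 means "unknown, must write". */
static int current_bank = -1;

static inline void select_bank(uint8_t bank) {
    if (current_bank != bank) {      /* cheap check on a variable in RAM ... */
        bank_select_reg = bank;      /* ... skips a redundant hardware write */
        current_bank = bank;
        bank_writes++;
    }
}

/* Select the bank if needed, then write the data register. */
static inline void banked_write(uint8_t bank, uint8_t value) {
    select_bank(bank);
    data_reg = value;
}
```

Two back-to-back `banked_write(0, ...)` calls touch the bank register once. Whether the check beats the extra write depends on how slow the device is compared with RAM, which is the point zid raises next.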
<geist> doubleplus so if it's virtualized hardware
<zid> Actually I think on this dumb device I am thinking of, it might actually be faster to ignore the static entirely
<geist> okay
<zid> ram and mmio are both slow :D
sdfgsdfg has joined #osdev
<geist> yah if it's an AVR or something might not be a difference
<geist> any big fast machine with a complicated bus though mmio writes are probably slower
<zid> This bugs me an unreasonable amount considering it's like.. 8 cycles of wastage :P
<geist> yah, i'm failing to get bothered by it
<geist> as a side note the more and more i interact with hardware at work the more i'm realizing the whole volatile pointer, let the compiler do the right thing, is really not the best strategy
<zid> sounds like you're coming over to my side ;)
<geist> i'm grudgingly coming around to the idea that you really have to funnel all mmio (and io port) accesses through helper functions
<geist> there's frequently some caveat or odd thing that the compiler does that causes a problem, whereas what you really want is for mmio accesses to be exactly consistent and use particular instructions
<geist> you can make the functions inline, etc so that it's not an outcall at least
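One possible shape for those helpers, as a sketch only. The names `mmio_read32`, `mmio_write32` and `mmio_set_bits32` are illustrative, and the bodies use a plain volatile access so the example stays portable. A real kernel might replace each body with inline asm that emits one specific load or store instruction, which is the variant discussed just below.

```c
#include <stdint.h>

/* Every device access goes through these helpers, so access width and
 * instruction choice are decided in exactly one place. */
static inline void mmio_write32(volatile void *addr, uint32_t val) {
    *(volatile uint32_t *)addr = val;
}

static inline uint32_t mmio_read32(const volatile void *addr) {
    return *(const volatile uint32_t *)addr;
}

/* Read-modify-write built only from the two primitives above. */
static inline void mmio_set_bits32(volatile void *addr, uint32_t bits) {
    mmio_write32(addr, mmio_read32(addr) | bits);
}
```

Because the helpers are `static inline`, a call site normally compiles down to the single access with no out-of-line call.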
<moon-child> you mean
<moon-child> inline assembly?
<moon-child> D:
<geist> yep.
<geist> i mean i know that's what Big OSes do, but i was always loath to go that route, but i'm starting to see the advantages of it
<geist> or at least the lack of major disadvantages of it
<gog> inline assembly is cool and good
<gog> don't @ me
<geist> especially as compilers get more and more clever
<zid> I prefer not to avoid the compiler being clever
<geist> oh i dont hate inline asm, i just disliked the idea that you had to funnel all mmio through some routines
<moon-child> I bet you can't tell a perf difference between going through (non-inlined) standalone asm routines
<geist> probably not
<heat> geist, the page tables in riscv are simple but also weird
<geist> heat: oh yeah?
<heat> like, why do addresses start at bit 10
<gog> inline asm in an inline naked function
<geist> what did you find strange?
<geist> oooooh yep. bit 10. yep.
<geist> i think that's because they can address 34 bits of physical in 32bit mode without needing 3 levels maybe?
<heat> and the separation between the valid bit and RWX was new to me since that's not a thing in x86
<geist> 10 bit shift messed me up at least once
<heat> also no caching bits it seems
<geist> yah iirc V is its own thing and RWX=0 has a special meaning, like page-table-pointer?
<geist> it is neat the permissions are simply RWX though
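For reference, a small sketch of the Sv39 entry layout under discussion: V is bit 0, R/W/X are bits 1 to 3, and the physical page number starts at bit 10. The helper names are invented here; the bit positions follow the privileged spec.

```c
#include <stdint.h>

#define PTE_V (1ull << 0)   /* valid */
#define PTE_R (1ull << 1)   /* readable */
#define PTE_W (1ull << 2)   /* writable */
#define PTE_X (1ull << 3)   /* executable */

#define PAGE_SHIFT    12    /* 4 KiB pages */
#define PTE_PPN_SHIFT 10    /* PPN field starts at bit 10, not bit 12 */
#define PTE_PPN_MASK  ((1ull << 44) - 1)   /* Sv39 PPN is 44 bits wide */

/* Build an entry from a page-aligned physical address and flag bits. */
static inline uint64_t pte_make(uint64_t pa, uint64_t flags) {
    return ((pa >> PAGE_SHIFT) << PTE_PPN_SHIFT) | flags;
}

/* Recover the physical address an entry points at. */
static inline uint64_t pte_to_pa(uint64_t pte) {
    return ((pte >> PTE_PPN_SHIFT) & PTE_PPN_MASK) << PAGE_SHIFT;
}

/* V=1 with R=W=X=0 means "pointer to the next-level table". */
static inline int pte_is_table(uint64_t pte) {
    return (pte & PTE_V) && !(pte & (PTE_R | PTE_W | PTE_X));
}
```

This matches the earlier description: upper-level entries carry only V, and leaves carry 0xf (RWX plus V).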
<zid> I can't wait for risc-v chips to start needing things like mtrrs, cache control bits, blah blah blah
<zid> and in 20 years it looks like x86
<geist> and it's subtle but there *is* a smap like thing that exists over in some supervisor control register
<geist> there's a bit somewhere thats smap like, making it such that supervisor code can't natively access user page if some condition is met
<heat> yeah I saw that
<heat> they also say you can never execute user pages in supervisor mode
<geist> yah that's SMEP basically
<heat> forced ;)
<geist> but as zid was saying, the lack of any cache bits is kinda interesting. i think the general vibe is all the existing riscv hardware uses physical ranges to decide caching, a-la x86
<heat> geist, what irq controller does this have?
<geist> so it doesn't really make its way up to the page tables. and i think that's easier to do if youre a simple cpu
<geist> it has a thing called a PLIC
<geist> which there's a spec for, but PLICs are not as tightly tied to the arch, like GIC is, but it's the semi standard, and basically the only thing sifive uses
<geist> keep in mind the virt risc machine in qemu was implemented by sifive, so it basically is a sifive design. pretty close to one of their SOCs
<heat> ah yes, found the spec
<heat> i really need libfdt ported
<geist> also when you read the SBI spec you'll notice it takes over some amount of low level hardware from you. pretty odd when coming from other architectures. looks almost Xen like
<geist> ie, timers: SBI's problem. inter-processor-interrupts? SBI. flushing TLBs? SBI
<geist> there's even a built in SBI console port you can just shove bytes out of
<geist> also of course booting secondary cpus is in SBI
<geist> so you'll pretty quickly need to write the SBI calling routine
<heat> also why did they choose to call CPUs harts
<heat> it's so weird
<geist> HART = hardware thread
<geist> they basically decided to not pretend they are cpus in the presence of virtualization and/or SMT
<geist> but yeah there are some different philosophical choices in how they name and do things that take a bit of getting used to
<geist> or at least make you reevaluate certain assumptions you may have had up until now
<geist> the biggest one that caused me trouble: the boot cpu is not necessarily HART 0
<geist> and for the most part HART ids are arbitrary, so you need some sort of mapping of HART -> your local cpu # if you want them to be 0 based and linear
<geist> when dealing with SBI the HART id is basically just a handle to the cpu, so it can be numbered however it wants. it usually *is* 1:1 mapped to hardware, but it doesn't have to be. much in the same way it wouldn't be if you were being virtualized
<geist> and since you can't read the current hart id in supervisor mode out of a register (no `shartid` reg), it's like that scene in Hudsucker Proxy
<geist> you are given a hart id, it will not be repeated
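A sketch of one way to keep that mapping, assuming a small fixed-size table filled in as harts come up. `hart_register`, `hart_to_cpu` and `MAX_CPUS` are hypothetical names. The boot hart registers first, so it becomes logical cpu 0 whatever its hart id is.

```c
#include <stdint.h>

#define MAX_CPUS 8
#define CPU_NONE (-1)

/* cpu_to_hart[n] is the opaque hart id behind logical cpu n. */
static uint64_t cpu_to_hart[MAX_CPUS];
static int num_cpus;

/* Look up the logical cpu number for a hart id, or CPU_NONE if unknown. */
static int hart_to_cpu(uint64_t hartid) {
    for (int i = 0; i < num_cpus; i++)
        if (cpu_to_hart[i] == hartid)
            return i;
    return CPU_NONE;
}

/* Record a hart and hand back a dense, 0-based cpu number.
 * Registering the same hart twice returns the existing number. */
static int hart_register(uint64_t hartid) {
    int cpu = hart_to_cpu(hartid);
    if (cpu != CPU_NONE)
        return cpu;
    if (num_cpus == MAX_CPUS)
        return CPU_NONE;
    cpu_to_hart[num_cpus] = hartid;
    return num_cpus++;
}
```

Since supervisor mode cannot read the id back later, the entry code has to stash the value it was handed (for example in a per-cpu structure) before doing anything else.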
<heat> :D
<heat> also am I supposed to run on top of opensbi?
<geist> yes
<heat> i've seen a few booting examples for linux where they pass a -kernel and do -bios none
<geist> you can do that too. but then you need to provide whatever openSBI does for you
<geist> LK can run in 3 modes: 32 and 64 no SBI, 64 w/SBI
<geist> the first 2 start you in machine mode and you get no paging, but then it just runs in machine mode
<geist> the last one boots in supervisor because SBI hands it off to you, and then you get paging and whatnot
<geist> you could boot in machine mode and then drop to supervisor yourself and maybe leave a stub behind
<geist> but since some things aren't available in supervisor mode you need to reimplement some part of SBI or a SBI like thing
<geist> for example. hardware timers only exist at machine mode
<geist> (i think they're fixing that in a future spec so that you get banked timers like ARM)
<geist> AFAICT all 'big OS' are assumed to be running under an implementation of SBI. may be built into the kernel perhaps but seems like it makes sense. it's like PSCI on ARM but a bit more responsibility
<geist> side note i found it a bit funny that machine mode owns the timers (timer IRQS fire there and then get reflected down to supervisor mode via code) yet supervisor mode owns the PLIC
<geist> but, it works because hardware timers are built into the cpu and come in via a dedicated vector and dont go through the IRQ controller
<geist> so the machine mode code can trap and delegate it, but let hardware IRQs pass directly to supervisor mode
<geist> this is different from x86 and ARM where even if the timers are built into the core, they still route through some sort of IRQ handler (local apic or GIC) like a regular external irq
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
gog has quit [Quit: byee]
gog has joined #osdev
sdfgsdfg has quit [Quit: ayo yoyo ayo yoyo hololo, hololo.]
GeDaMo has quit [Remote host closed the connection]
<vdamewood> gog: okay, I think i'm going to go to sleep soon.
<gog> lol
<gog> yes
<gog> you've worked hard enough today
<vdamewood> But muh database@
<vdamewood> database!
<vdamewood> Well, laters.
<heat> geist, and SBI remains as a sort of resident firmware?
<heat> this whole design is a bit unorthodox to me honestly
<geist> yep
<geist> it is definitely a little strange if coming from the x86 world
<geist> SBI wasn't so odd to me because ARM has PSCI which is similar, though has less responsibility
<clever> what about SMM?
<geist> SMM being transparent i dont think is the same sort of thing at all
<clever> ive heard that some parts of acpi are just calls into SMM?
<geist> probably, but again since it's transparent, it's not really the same notion IMO
<clever> yeah
<gog> iirc that's true but it depends on the particulars of the motherboard
<clever> still there, but behind the curtain, and you dont rely on it as much
<geist> ACPI kinda is, but most of the heavy lifting there is on your side
<geist> it's really sort of closer to a classic DOS/BIOS split really. as in you write your OS to an interface that's supposed to be there
<clever> yeah
<geist> but protected mode x86 has moved away from that for such a long time it's strange to see stuff like that reappear
<geist> even if it's a pretty good idea
<heat> what's the advantage?
dennis95 has quit [Quit: Leaving]
<clever> heat: i think geist mentioned how doing more things in software lets you make the cpu simpler, no need to conditionally trap things like the core# register
<geist> right. basically in a virtualization like mode there's almost nothing in riscv that has to be trapped-and-emulated
<geist> like current cpu number, etc
<clever> it was already a hypercall
<geist> or even hw timers, since they dont exist in supervisor mode
<geist> exactly
<heat> why would you need to trap a core# register?
<geist> in virtualization the cpu # doesnt correspond to the real one
<clever> heat: to emulate a single-core machine, but run it on core-2
<heat> why would you not expose a hw timer to supervisor mode(literally the kernel)
<clever> and you dont want the OS confused as to why you only have core-2 and core0/1 are missing
<geist> because of what i just said that's more stuff you have to trap and emulate
<heat> yes but that's always software, the cpu doesn't get more complex because of it does it?
<geist> *or* at least more hardware you have to be virtualization-aware (like it is in ARM in a fairly complicated way)
<geist> sure it does. because at some point the supervisor mode kernel writes to what it thinks is a hw reg
<geist> but it really isn't because the hypervisor has to trap it and emulate
<geist> so either the hw has to have virt aware 'fake out the guest OS with this fake hardware' (APIC-v, ARM's nested timers, etc) or you simply punt on it (riscv)
<clever> and then the hypervisor has to do a more expensive save/restore, because there is no abi saying what would be clobbered
<geist> but really more importantly whenever there was a decision to do something that involves less hw, riscv generally chose that option
<geist> the complexity of the core is really really low
<geist> so since the SBI exists even when running on real hardware, it means when under a hypervisor the hypervisor simply has to provide the same SW interface and then bob's everyone's uncle
<clever> does it even have a seperate SBI vs hypervisor vs supervisor mode at that point?
<clever> i can see how i might run the hypervisor in supervisor mode
<geist> yah the hypervisor extensions to riscv i haven't really fully grokked yet
<clever> and when you context switch from "hypervisor" to guest, you just tell the hw to re-direct all SBI calls to a given kernel, not the real SBI
<heat> yo i just stole your strtoul dont sue me pls kthxbye
<geist> i think early versions of it was much more aligned with ARM: they were going to have an intermediate level between supervisor and machine (literally their run level 2)
<geist> but i think they switched to a more x86 style scheme
<geist> but since there's very little hardware state to save that's not just regular registers i think the hw machinery to implement it should be much lower than x86
<geist> with the VHE features on arm v8.1 even ARM kinda went more in that direction too and somewhat 'nerfed' EL2
gorgonical has joined #osdev
<heat> geist, you think its hard to go from riscv64 to arm64?
<geist> well, i think arm64 is probably an order of magnitude more complicated
<geist> but riscv is also extremely simple
<heat> how many x86s is arm64
<geist> also depends on how far you go in. simple first level approximations of getting things working aren't too bad on arm64. its just subtle
<geist> oh about the same, just different
<geist> you worry about a slightly different set of things with arm64
<geist> less backwards compatibility and more ergonomic uses of what you have (less stupid shenanigans you have to jump through than x86)
<geist> but somewhat more flexibility and weak memory model stuff
<geist> and the ARM manual is very difficult to understand when you first get started
<heat> i remember I felt a bit over my head when looking at arm64 stuff for the first time
<geist> the entire riscv manual fits into like one section of one chapter of the ARM ARM
<heat> i've also struggled a bit with riscv64 but I'm kind of getting into it
<geist> i did too, the riscv docs are not laid out well
<geist> they read in a funny way that's hard to put my finger on. it's mitigated greatly by just being much shorter
<heat> the issue is that I don't really have real hardware for this and I'm not about to shell out 300 euro for like the only riscv64 out there
<geist> so you can reasonably expect to just read it end to end a few times
<heat> geist, the layout is definitely odd and I've never touched a RISC architecture before so the instructions definitely look funny
<geist> yah to be fair i was mostly griping about the privileged spec
<geist> which seems a bit wonky
<geist> the ISA manual is alright. the ISA itself is pretty consistent and i'm pretty fine with
<geist> there was some weirdness i didn't like initially, but now that i've seen enough codegen i think i get it
<heat> yeah but I'm not the greatest assembly programmer out there and writing riscv asm feels a lot more complicated IMO
<heat> with the lack of fancy stuff x86 instructions can do for you
<geist> yah that's risc
<geist> you get used to it. it's like building things out of smaller legos
<heat> i ended up writing my early paging code in C
<geist> but also if you're writing a lot of asm go read the asm riscv manual i think. it's a separate short doc
<geist> but it covers all the pseudoinstructions you should be aware of
<geist> which greatly help
<geist> la, call, etc. things that you wont find in the ISA manual because they're not real
<heat> pseudo instructions aren't real, they can't help you
<heat> me: haha pseudo instructions go beep boop
C-Man has joined #osdev
<geist> that takes a bit of getting used to vs other ISAs that dont lean on pseudo instructions as much
<geist> mips did this too, but in general the assembler has more authority to do things than you expect
<graphitemaster> blue pseudo shoes
<geist> it has the ability to 'relax' a lot of expressions and substitute compressed instructions at will
<geist> and a lot of the pseudo instructions are built around that notion
SikkiLadho has joined #osdev
<heat> how effective is that if you can only have a signed 12-bit offset from the global pointer?
<geist> yeah, i think the GP is a bit weak to be honest
Osm10 has joined #osdev
<heat> a page of range seems really small
<geist> but it has a fairly powerful PC relative load instruction
<geist> well, PC relative calculation instruction at least. similar to arm's ADRP
<zid> lea or riot
<SikkiLadho> I'm trying to implement memmove as I need it in a freestanding environment. I came across this implementation here: https://www.clc-wiki.net/wiki/memmove I get a reference error for __np_anyptrlt() when I use it in freestanding environment with aarch64-linux-gnu-gcc. How do I implement __np_anyptrlt()?
<bslsk05> ​www.clc-wiki.net: C standard library:string.h:memmove - clc-wiki
<zid> could you not just.. write it?
<geist> yeah rip out the np_anyptrlt
<geist> i have no idea what that's doing, but you dont need it
<zid> It's doing p1 < p2
<geist> oh i see what it's doing yeah
<geist> because memmove
<SikkiLadho> I read a warning to not remove it here: https://stackoverflow.com/questions/20062776/use-np-anyptrlt-in-memmove
<bslsk05> ​stackoverflow.com: c - Use __np_anyptrlt in memmove? - Stack Overflow
<zid> It's technically required to do that check but if the pointers aren't actually to the same object it isn't defined
<bslsk05> ​github.com: gcc/memmove.c at master · gcc-mirror/gcc · GitHub
<gog> just use this
<zid> So you need to rely on IDB
<geist> thumbs up
<SikkiLadho> Thank you!
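For a freestanding build, a byte-at-a-time memmove can be as small as the sketch below. It is named `my_memmove` here only so it does not clash with a hosted libc; in a kernel it would be called `memmove`. The comparison goes through `uintptr_t`, which is one common stand-in for `__np_anyptrlt`: the result is implementation-defined rather than undefined, and follows address order on flat-address targets such as aarch64.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n bytes from src to dest, handling overlapping regions. */
void *my_memmove(void *dest, const void *src, size_t n) {
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* dest is below src: copy forwards so the source is read
         * before it can be overwritten. */
        while (n--)
            *d++ = *s++;
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* dest is above src: copy backwards from the end. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dest;
}
```

This is the simple version, not a tuned one. Production libc implementations copy in word-sized or larger chunks.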
<zid> Not sure why you couldn't just have written that yesterday though?
<zid> copying an array is not.. beyond your technical ability I hope?
<geist> now now lets not try to passive aggressively diss people's tech ability
<zid> I'm not passive aggressively doing anything
<geist> actually it's not so passive
<zid> or aggressive
<zid> literally just asking
* geist smiles
<geist> uh huh.
<geist> aint my first rodeo
<heat> tip: if you want a high quality implementation you're better off looking for actual C standard libraries, not a wiki ;)
<gog> i'll diss my own tech ability tyvm
<zid> nobody's dissed shit
<heat> __np_anyptrlt sounds like the C programming equivalent of FUD, I don't see how an optimiser will screw you over a pointer comparison
<gog> you don't have to i'll do it for you
<geist> well, if you can't tell that that's a diss then that explains a lot
<zid> heat: It's to obfuscate out the IDB
<zid> sorry I meant abstract *
<heat> IDB?
<zid> implementation defined behavior
<heat> this is UB not implementation defined
<heat> if there even is a difference that is
<heat> BUT if the compiler really wants to it can find any UB it wants through LTO
<zid> The implementation is free to define things that aren't actually defined by the C standard, and in this case it basically has to so it can capably implement memmove
<zid> and how it does that is.. up to the implementation
<SikkiLadho> Thank you.
<heat> geist, do you know off the top of your head how virt does serial?
<geist> riscv virt that is?
<heat> yea
<heat> virtio?
<bslsk05> ​github.com: lk/uart.c at master · littlekernel/lk · GitHub
<geist> basically a simply 16550
<geist> simple
<bslsk05> ​github.com: lk/virt.h at master · littlekernel/lk · GitHub
<heat> hmm
<heat> I was looking around info qtree but I couldn't find anything serial
<heat> and I obviously couldn't look at the FDT because I don't have printf up ;P
<heat> much thanks
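A minimal polled transmit routine for a 16550-style UART, as a sketch. The register offsets (THR at 0, LSR at 5) and the THR-empty bit (0x20) are standard for the 16550 with byte-wide registers. The base address is passed in, since it has to come from the platform (0x10000000 on QEMU's riscv virt machine at the time of writing, but that is best checked against the virt.c table or the FDT).

```c
#include <stdint.h>

#define UART_THR      0      /* transmit holding register (write) */
#define UART_LSR      5      /* line status register (read) */
#define UART_LSR_THRE 0x20   /* transmit holding register empty */

/* Busy-wait until the UART can accept a byte, then send it.
 * `base` points at the UART's byte-wide register block. */
static void uart_putc(volatile uint8_t *base, char c) {
    while (!(base[UART_LSR] & UART_LSR_THRE))
        ;
    base[UART_THR] = (uint8_t)c;
}
```

With paging enabled, `base` has to be a virtual address that maps the device, so this only works once the UART page is mapped.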
simpl_e has joined #osdev
<geist> yah usually i just start by cribbing that from the qemu source
<geist> usually there's a big table at the start of the virt.c file for that particular machine
<heat> geist, how would I go about making a generic ARM64/RV64 image? that doesn't assume any platform
<geist> aside from using UEFI and building a file system that has both, i dont think you really do
<geist> there's not an easy way to merge ELF binaries for example
<heat> i mean arm64 or rv64
<heat> not both
<geist> oh oh
<heat> in the sense that you generally specify what platform you're building for when building a kernel for those
<geist> well basically you need to boot and read the FDT to figure out what drivers to run
<geist> the hard part is you can't rely on a particular load address
<geist> so generally you need to make your mmu setup code somewhat more intelligent to dynamically map the kernel whereever it happens to sit physically to where you need it to be virtually
<geist> and then you can detect the actual physical aperture from FDT
<heat> do bootloaders just put you where they want you to? without looking at the elf file
<geist> but really that mmu setup is the hardest part. i did it for arm64 in zircon, but never backported the changes to LK, which is still mostly 'compile for precisely this platform' style
<geist> no. they usually dont
<geist> but even if they did, if you're a generic image you dont know where you want to go
<geist> because you dont know what system you're on and where physical memory is mapped
<geist> so you basically have to deal with 'bootloader put me somewhere so map where i am to -2GB virtually (or something like that)'
<heat> yah but you still need to link for some phys addr
<geist> no you dont
<geist> that's the point. you *cant*
<geist> since there's no physical address to link for
<heat> but I must? the program headers need to be filled
<geist> any reliance on 'bootloader reads my PA out of the ELF file' you have to remove, because that now doesn't work
<geist> oh sure, but you can just leave it in the default PA = VA
<geist> you're deep in bootloader specific territory here, but in general in that sort of environment you have to do whatever bootloader shenanigans you need to get loaded. which *generally* involves flattening your kernel to a flat binary and letting the loader put you where it wants
<geist> (vs an ELF file where you specify where it goes)
<heat> even with the linux boot protocol thing?
<geist> even with that
<geist> linux already has to deal with it for precisely the same reason
<heat> yuck
<geist> it's not so bad, you just remove an assumption that physical location is known at compile time
<geist> once you're in virtual mode it's all good
<geist> your initial mmu page table code just gets a bit more complicated is all
<heat> thankfully it's in C so no issue :)
<geist> what i ended up writing for zircon is a routine that basically handles mapping physical run X to Y to virtual address Z
<heat> i like it better this way, although it's a bit more sensitive
<geist> so the startup code figures out the footprint of the kernel physically via pointer shenanigans (and _start and _end symbols) and then just says 'map that to -2GB virtually'
<heat> i just did a early_paging_setup()
<geist> then bounces itself up there, with a huge identity map of say 0-64G or something left at the bottom
<geist> so the kernel can then grub around and find the FDT/ramdisk/etc, get that mapped into the kernel, then turn off the identity map
<geist> or set up another physical mapping at the base of the kernel (that's what I do) so it can stop using the identity map
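The arithmetic behind "map wherever I was loaded to -2GB" can be sketched as below. This is not the zircon routine; `plan_kernel_mapping`, `struct kernel_mapping` and `KERNEL_VIRT` are names made up for illustration. In a real kernel the two physical addresses would be the runtime, PC-relative addresses of the `_start` and `_end` symbols.

```c
#include <stdint.h>

#define PAGE_SIZE   4096ull
#define KERNEL_VIRT 0xffffffff80000000ull   /* -2GB: assumed link-time virtual base */

struct kernel_mapping {
    uint64_t phys_base;    /* page-aligned physical start of the kernel */
    uint64_t virt_base;    /* where that page must appear virtually */
    uint64_t num_pages;    /* pages needed to cover the whole footprint */
    uint64_t virt_offset;  /* add to a physical address to get the virtual one */
};

/* Work out what to map, given where the loader happened to put the kernel. */
static struct kernel_mapping plan_kernel_mapping(uint64_t phys_start,
                                                 uint64_t phys_end) {
    struct kernel_mapping m;
    uint64_t end = (phys_end + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);

    m.phys_base = phys_start & ~(PAGE_SIZE - 1);
    m.virt_base = KERNEL_VIRT;
    m.num_pages = (end - m.phys_base) / PAGE_SIZE;
    /* Unsigned wraparound is intentional: phys + offset lands at KERNEL_VIRT. */
    m.virt_offset = KERNEL_VIRT - m.phys_base;
    return m;
}
```

The early MMU code would then install `num_pages` mappings starting at `virt_base`, plus the temporary identity map, before jumping to the high address.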
<geist> but basically that's the trickiest part of the whole thing, then you're mostly into dynamic detection of drivers, which is mostly a software problem
<geist> since you now have to build dynamic dispatch of things like interrupt controllers and whatnot instead of relying on direct calls. no biggie, but some amount of typing and design
<heat> my uneducated self would assume you end up doing what you already do for x86 PC and ACPI
dude12312414 has joined #osdev
<geist> x86 you can cheat if you have your own loader
<geist> well for the physical location of the kernel that is
<geist> for the rest of the driver loading thing? ACPI yeah
<heat> sure, I mean the drivers
<geist> which is of course an order of magnitude more complex than FDT because of bytecode and that bs
<heat> if you want to do things properly you can't assume any of the platform devices (RTC, PS2, etc) are there
<geist> but think of ACPI as basically the driver loading instructions for windows
<geist> right, those are PNP resources in ACPI
<heat> yeah windows doesn't do dtb
<geist> note UEFI does not imply no dtb. you just have to find a handle to the dtb inside the UEFI environment
<heat> I think the raspberry pi EDK2 code ends up translating the device tree into AML since UEFI requires ACPI(?)
<geist> (on ARM that is, DTB doesn't exist at all on any x86 platform i know of)
<heat> that's how windows boots on it at least
<heat> geist, I read about some x86 tablets that do have device trees
<geist> yah. basically if windows has ever booted on it, someone must have arranged for ACPI to exist, since windows simply wont
<heat> is there a good reason for it or just stubbornness?
<geist> for x86? dunno!
<heat> for arm
<geist> can you rephrase the question?
<heat> why does windows always demand ACPI, even in architectures where ACPI is super rare (like arm)
<heat> maybe just legacy?
<geist> because MSFT
<geist> it's my understanding that their driver model is 100% built around it
<geist> it's simply the instructions to their internal layer as to what to load where. the whole binding rules that windows drivers use is highly ACPI centric
gorgonical has quit [Ping timeout: 240 seconds]
sonny has joined #osdev
<geist> all that aside it's fully specced out. the ACPI spec has sections for ARM and i think riscv in the newest section. it uses a subset of x86 stuff since there's a lot less legacy stuff to describe
<geist> but it's of course specced out probably because MSFT added it
<gog> because they don't wanna rewrite hal :p
<geist> same way that UEFI is also specced out for arm and riscv, including extending the PE binary format for those arches
<geist> right
<gog> because literally the whole kernel depends on it
<geist> exactly. the only way they could do it is to have some sort of up front translation of FDT -> ACPI internally, i guess
<geist> what i dont fully understand is how much linux depends on FDT or ACPI when both are present
<geist> looking at dmesg it seems to look at both, so maybe it's a case-by-case basis for which one is considered more truth
<heat> from the linux source I've read, every device belongs to a bus
<heat> PCI bus, acpi bus, probably device tree bus
pretty_d1 has joined #osdev
pretty_d1 has quit [Client Quit]
<geist> that's basically what we do in zircon too
<heat> if you have an acpi bus and a device tree bus I bet you get two drivers binding to a single physical device
<geist> there's an ACPI bus driver that instantiates an instance and then starts publishing device nodes as it walks the tree
<heat> each driver can only bind to a type of bus
<geist> and then the driver manager matches devices against it. one of the devices is 'The PCI Bus' which then starts a PCI bus driver, repeat recursively
pretty_d1 has joined #osdev
pretty_d1 has quit [Client Quit]
pretty_d1 has joined #osdev
pretty_d1 has quit [Client Quit]
pretty_dumm_guy has quit [Ping timeout: 256 seconds]
Osm10 has quit [Quit: Client closed]
sdfgsdfg has joined #osdev
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
<heat> i have an idea
<gog> what is your idea heat
<heat> i'll straight up copy my x86's mmu.cpp and change it for riscv
<heat> i think most of it should work lol
<heat> it's 1400 lines but most of the logic should fit
<geist> sure
SikkiLadho has quit [Quit: Connection closed]
<geist> i rewrote mine from scratch but i was mostly trying some clever idea i thought i had (at the time) in C++
<geist> it kinda worked, but i think it's a bit unwieldy to keep doing
<heat> what did you try?
<geist> the idea was to build a single page table walker routine that walks down the tree and makes callbacks at various points to request what to do next
<geist> and then have multiple wrapper routines set up a series of lambdas to do various operations (map/unmap/protect/query/etc)
<geist> so kinda works, but i dont know if it can go much further without getting too hard to drok
<geist> grok
<geist> what i dont like is having to reimplement that walking routine over and over again, even within the same arch
<bslsk05> ​github.com: lk/mmu.cpp at master · littlekernel/lk · GitHub
<geist> https://github.com/littlekernel/lk/blob/master/arch/riscv/mmu.cpp#L414 is where the mapper routine for example sets up a callback and then uses the walker to do it
<bslsk05> ​github.com: lk/mmu.cpp at master · littlekernel/lk · GitHub
heat has quit [Read error: Connection reset by peer]
<geist> it has a enum of possible return values that it hands back to the walker
heat_ has joined #osdev
<geist> anyway, since the walker is inlined it actually generates pretty good code
<geist> so that at least worked out pretty well. the lambda itself isn't a separate function and everything is implemented in a single loop
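A minimal sketch of the single-walker-plus-callbacks idea, assuming a toy 3-level table kept in ordinary heap memory. It is not the LK code linked above; `Walk`, `walk`, `map_page`, and `query_page` are hypothetical names, and the level and index widths are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Toy geometry (assumption): 3 levels, 2 index bits per level, 4 KiB pages.
constexpr int kLevels = 3;
constexpr int kBitsPerLevel = 2;
constexpr int kPageShift = 12;
constexpr size_t kEntries = size_t{1} << kBitsPerLevel;

struct Table;
struct Entry {
    std::unique_ptr<Table> next;  // child table, used on non-leaf levels
    uint64_t paddr = 0;           // physical address, used on the leaf level
    bool valid = false;
};
struct Table { std::array<Entry, kEntries> e; };

// What a callback hands back to the walker after looking at one entry.
enum class Walk { Descend, Done, Fail };

// The one walker: a single loop from the top level down, asking the
// callback what to do at each entry.
template <typename F>
inline bool walk(Table& root, uint64_t vaddr, F&& cb) {
    Table* t = &root;
    for (int level = kLevels - 1; level >= 0; --level) {
        size_t idx = (vaddr >> (kPageShift + level * kBitsPerLevel)) & (kEntries - 1);
        Entry& e = t->e[idx];
        switch (cb(level, e)) {
            case Walk::Done: return true;
            case Walk::Fail: return false;
            case Walk::Descend:
                assert(level > 0 && e.next);
                t = e.next.get();
                break;
        }
    }
    return false;  // only reached if a callback descends past the leaf level
}

// Wrapper 1: map a page, allocating intermediate tables on the way down.
bool map_page(Table& root, uint64_t vaddr, uint64_t paddr) {
    return walk(root, vaddr, [&](int level, Entry& e) {
        if (level == 0) {
            if (e.valid) return Walk::Fail;  // already mapped
            e.valid = true;
            e.paddr = paddr;
            return Walk::Done;
        }
        if (!e.valid) {
            e.next = std::make_unique<Table>();
            e.valid = true;
        }
        return Walk::Descend;
    });
}

// Wrapper 2: look up a mapping without modifying anything.
bool query_page(Table& root, uint64_t vaddr, uint64_t* paddr_out) {
    return walk(root, vaddr, [&](int level, Entry& e) {
        if (!e.valid) return Walk::Fail;
        if (level > 0) return Walk::Descend;
        *paddr_out = e.paddr;
        return Walk::Done;
    });
}
```

Since `walk` is a template and the lambdas are passed by value, a compiler is free to inline each callback into the loop, which is the effect described above. Unmap and protect would be further wrappers of the same shape.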
<heat_> not bad
<heat_> i've implemented a walker for unmapping ranges
<geist> anyway it was a thought i had and wanted to try
<heat_> it's neat but the logic is a bit tricky sometimes
<geist> yah
<heat_> not something I want to re-implement from scratch again
<geist> a strategy that works fairly well is to simply recurse each level
<geist> but then you have this state you have to hand down and back to let you continue to stop or whatnot
<heat_> stupid question: how do I get a memory map?
<geist> it's in the FDT
<geist> there's a 'memory@' node for every run i think
<bslsk05> ​github.com: lk/fdtwalk.c at master · littlekernel/lk · GitHub
<heat_> neat
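A small sketch of decoding the `reg` property of one memory node into base/size ranges, assuming the caller has already located the property bytes and knows the parent's `#address-cells` and `#size-cells`. `parse_memory_reg` and `MemRange` are hypothetical names, not part of libfdt or LK. It writes into a caller-supplied array, so it needs no allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct MemRange {
    uint64_t base;
    uint64_t size;
};

// FDT property data is stored as big-endian 32-bit cells; combine
// `cells` of them (1 or 2) into one integer.
static uint64_t read_cells(const uint8_t* p, int cells) {
    uint64_t v = 0;
    for (int i = 0; i < cells * 4; ++i) {
        v = (v << 8) | p[i];
    }
    return v;
}

// Decode a raw `reg` blob into at most `max` ranges; returns how many
// ranges were written to `out`.
size_t parse_memory_reg(const uint8_t* reg, size_t len,
                        int address_cells, int size_cells,
                        MemRange* out, size_t max) {
    assert(address_cells >= 1 && address_cells <= 2);
    assert(size_cells >= 1 && size_cells <= 2);
    const size_t stride = static_cast<size_t>(address_cells + size_cells) * 4;
    size_t n = 0;
    for (size_t off = 0; off + stride <= len && n < max; off += stride) {
        out[n].base = read_cells(reg + off, address_cells);
        out[n].size = read_cells(reg + off + address_cells * 4, size_cells);
        ++n;
    }
    return n;
}
```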
<heat_> i was trying to set up a tty for logging and I noticed I'm lacking *literally the concept of memory management*
<geist> yah you can also if you want use the SBI console to output stuff like that
<heat_> hence malloc(sizeof(struct tty)) doesn't work
<geist> basically you can just make a SBI call from basically instruction 0
<heat_> how is that better than serial?
<geist> you dont need to map it
<geist> but yeah, no difference
<geist> just good for very very early logging
<geist> but yeah i was misreading what you were saying there
<heat_> i have serial I just don't have the printk() -> platform_serial_write() connection since the printk's backend is a tty
<geist> was thinking you were in the 'i haven't mapped the serial port yet' phase, which is also a problem
<geist> yah
<heat_> and my tty code uses malloc
<geist> but if you truly dynamically discover the serial port, etc you get into the part where the kernel is unable to log anything until you find the port, etc
<geist> so having a backup console is kinda nice
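One way to sketch the backup-console idea: logging goes through a function pointer that starts out at an early, allocation-free sink and is switched once the real port has been discovered. All names are hypothetical, and both sinks are plain static buffers so the sketch runs in user space. In a kernel, `early_putc` would stand in for something like an SBI console call and `serial_putc` for the discovered UART driver.

```cpp
#include <cstddef>

using putc_fn = void (*)(char);

// Static capture buffers standing in for real output devices (assumption).
struct Sink {
    char buf[256];
    size_t len;
};
static Sink early_sink;
static Sink serial_sink;

static void sink_put(Sink& s, char c) {
    if (s.len < sizeof s.buf) {
        s.buf[s.len++] = c;
    }
}

// Usable from the very start: no malloc, no tty, no mapping required.
static void early_putc(char c) { sink_put(early_sink, c); }

// The "real" console, available only after the port has been found.
static void serial_putc(char c) { sink_put(serial_sink, c); }

// Current output path; defaults to the backup console.
static putc_fn console_putc = early_putc;

// Called by the driver once the serial port is discovered.
void console_register(putc_fn fn) { console_putc = fn; }

// Minimal logging entry point that needs no dynamic memory.
void kputs(const char* s) {
    while (*s) {
        console_putc(*s++);
    }
}
```

The tty layer can then be attached later, after the allocator is up, without losing early messages.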
<heat_> yea I temporarily stole your putc code real quick for some output but i still need to connect it up
<heat_> i have a uart8250 driver but it assumes IO ports right now
<geist> yah
<geist> yah i have a long standing todo item to try to unify the probably 4 or 5 different copies of the 8250 driver in the LK tree, but they're annoying
<heat_> I totally need a way to get transparent MMIO/IO ports write()/read()
<geist> yah it's worse because mmio based 8250s many times map the registers differently
<heat_> i know linux just has a void * with some flags set if it's an io address I think
<geist> for example the real one has the whole 2 or 3 bank thing but lots of times mmio versions of them just flatten the registers out
<heat_> geist, hmm really?
<geist> so you have to have an abstraction there too
immibis has joined #osdev
<geist> not a biggie, but basically means if you have a truly cross platform 8250 driver you need to configure it with some amount of switches
<geist> not that it needs to be fast, so the runtime overhead is almost certainly negligible
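A hedged sketch of what such runtime switches could look like: one accessor struct that hides port I/O versus MMIO and carries a register shift for MMIO parts that space the registers out. Register stride is only one of the layout differences a real driver would need to handle, and it is used here as an assumption for illustration. `Uart8250Regs`, `port_in8`, and `port_out8` are hypothetical, and the port space is faked with an array so the example runs in user space.

```cpp
#include <cstddef>
#include <cstdint>

// Fake I/O port space (assumption); a real x86 kernel would use in/out instructions.
static uint8_t fake_ports[0x400];
static uint8_t port_in8(uint16_t port) { return fake_ports[port]; }
static void port_out8(uint16_t port, uint8_t v) { fake_ports[port] = v; }

enum class RegSpace { PortIo, Mmio };

struct Uart8250Regs {
    RegSpace space;
    uintptr_t base;      // port number, or mapped virtual address for MMIO
    unsigned reg_shift;  // 0 = byte-packed, 2 = one register per 32-bit word

    uint8_t read(unsigned reg) const {
        if (space == RegSpace::PortIo) {
            return port_in8(static_cast<uint16_t>(base + reg));
        }
        uintptr_t addr = base + (static_cast<uintptr_t>(reg) << reg_shift);
        return *reinterpret_cast<volatile uint8_t*>(addr);
    }

    void write(unsigned reg, uint8_t v) const {
        if (space == RegSpace::PortIo) {
            port_out8(static_cast<uint16_t>(base + reg), v);
            return;
        }
        uintptr_t addr = base + (static_cast<uintptr_t>(reg) << reg_shift);
        *reinterpret_cast<volatile uint8_t*>(addr) = v;
    }
};

// Standard 8250 register offsets and bits used by the driver body.
constexpr unsigned kRegThr = 0;         // transmit holding register
constexpr unsigned kRegLsr = 5;         // line status register
constexpr uint8_t kLsrThrEmpty = 0x20;  // transmitter holding register empty

// Driver code written once against the accessor, regardless of bus.
bool uart_try_putc(const Uart8250Regs& regs, char c) {
    if (!(regs.read(kRegLsr) & kLsrThrEmpty)) {
        return false;  // transmitter busy
    }
    regs.write(kRegThr, static_cast<uint8_t>(c));
    return true;
}
```

The branch on `space` costs one compare per register access, which is small next to the cost of the device access itself.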
sonny has quit [Ping timeout: 256 seconds]
<geist> i think for zircon we just have a pc centric io port based one and another mmio based 8250. keeps it from getting too messy
<heat_> maybe there's an argument for simplicity?
<heat_> linux's uart driver is really freaking messy for example
<geist> yah
<geist> that's probably what i'll do if i ever get around to it
<geist> combine the mmio ones into a generic driver but leave the pc one alone
sonny has joined #osdev
<heat_> and you're never going to get something excessively complex from a uart, even if you wire everything up with the tty termios interfaces and whatnot
heat_ is now known as heat
<geist> right, though you'd be surprised. uarts are surprisingly tricky to get exactly right
<heat> how so?
<geist> just what i've seen over the years. lots of broken implementations. uarts being written to from difficult kernel contexts, variable amounts of buffering, virtual uarts that stream data at unreal rates, etc
<clever> one of my usb uart chips, will deadlock if you send it a malformed byte, such as power-cycling the source at the wrong time
<geist> you can actually get into trouble any number of ways with it
<clever> it wont recover until you close and re-open the tty device
Oli has joined #osdev
gorgonical has joined #osdev
<sortie> https://p.ahti.space/6ebf76.html ← I've been working on my ports system, I made proper meta data for my ports, along with regular expressions to scrape the upstream release sites, and here's the output of my tool scanning for new releases
<bslsk05> ​p.ahti.space <no title>
<sortie> I'm making a lightweight BSD style ports system that will make it much cheaper and easier to maintain my ports and this script will make it much easier to stay up to date