<bslsk05>
ljrk.codeberg.page: Dissecting the UNIX v6 Allocator
<gorgonical>
It's osdev because I'm running it on qemu as a bare kernel
gareppa has joined #osdev
Terlisimo has quit [Quit: Connection reset by beer]
<mrvn>
heat: would be nice if the page would actually include the 60 lines of code as a block so one could read it
<mrvn>
some of those explanations are quite confusing
sebonirc has quit [Ping timeout: 268 seconds]
Terlisimo has joined #osdev
<mrvn>
ahh, found the link to the source
<geist>
gorgonical: oh how's that going? Finding that writing lots of riscv asm is pretty painful or painless?
sebonirc has joined #osdev
<heat>
one man's pain is another's countless hours of enjoyment
<gorgonical>
I would say pretty painless. We'll see how that changes when it comes to debugging this whole system once it's runnable. But writing asm is actually pretty good. I like it best of arm, x86, and risc-v
<geist>
Yah that’s why I was curious what they were thinking about it
<geist>
Cool
<geist>
I find it about one notch simpler than I’d like (more of an arm64 level of risc is my home) but i could probably get used to it
<gorgonical>
I really, really enjoy that the assembler knows mnemonics for the registers, so arg registers are a0-7 or also x10-17
<gorgonical>
And I guess I could have done that with #defines also but the assembler just knowing them out of the box is nice. Cognitive load stuff
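[note] A minimal sketch of the naming mentioned above: in the standard RISC-V calling convention the argument registers a0-a7 are aliases for x10-x17. The table and the helper name `abi_to_x` are hypothetical, made up for illustration, and are not part of any assembler.

```python
# Hypothetical lookup table: RISC-V argument-register ABI names -> x-register names.
# Covers only a0-a7 (x10-x17), the range mentioned in the chat.
ARG_REGS = {f"a{i}": f"x{10 + i}" for i in range(8)}


def abi_to_x(name: str) -> str:
    """Return the x-register name for an argument-register ABI name."""
    return ARG_REGS[name]


if __name__ == "__main__":
    for abi_name, x_name in ARG_REGS.items():
        print(f"{abi_name} = {x_name}")
```

This is roughly what a set of #defines would have to spell out by hand; the assembler just ships with the equivalent table.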
<geist>
Yah, and the mandatory built in pseudoinstructions help a bunch
<heat>
i like arm64's .req
<gorgonical>
Maybe if I spent more time I would prefer arm64 because of the loading stuff. I will say stack manip in risc-v is a little verbose
<geist>
Yeah it’s generally stuff like that that gets a bit annoying. Or lack of multi reg load/stores or pre/post increment/decrement
<gorgonical>
lw a0, (sp); lw a0, 8(sp); addi sp, sp, -16 is a little much
<geist>
Feels a bit verbose to always have to compute the address first
<geist>
Will be interesting to see what instruction fusing the first real high performance implementation comes up with
<gorgonical>
Meanwhile arm64 has ldp and the post-index instructions
<heat>
well, it's still a sifive core, but on intel 4
<gorgonical>
I attended a talk by a sifive guy who said they feel confident the newer cores are much more performant
<heat>
with ddr5 and pcie 5 which is siiiiiiiiiiiiiiick
<gorgonical>
krste asanovic was the guy. couldn't remember his last name
<geist>
Yah. I’m waiting for apple to play the long game and switch over to riscv with one of their cores
<geist>
I forget where I was reading it but there was a good argument for not using the compressed riscv instructions if building a really really high end implementation
Piraty has quit [Quit: No Ping reply in 180 seconds.]
<geist>
For more subtle reasons than just the more complex decoder
<gorgonical>
I mean they are for packing more instructions onto an embedded device, right?
<gorgonical>
oh
Piraty has joined #osdev
<geist>
Yah. My limited experience with it is riscv + compressed instructions approaches thumb2 in terms of density, which approaches x86
<geist>
From just generally compiling things and looking at the size of the text segment for equivalent things, etc
<gorgonical>
something I'm interested to see is how risc-v ends up on accelerators. There's a lot of work with various types. Vector units, systolic matrix stuff, etc
epony has joined #osdev
leitao has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<heat>
...what's the point of being super RISC if you're fusing everything anyway?
srjek has quit [Quit: Leaving]
<geist>
Reminds me of some article I was reading the other day among other things comparing ARM and x86 and they were using highly x86 centric instruction streams
<mrvn>
gorgonical: you mean things my assembler on m68k in the 90s had?
<geist>
Notably, noticing that ARM implementations don’t seem to have a fast path for xor reg, reg
<mrvn>
just less and worse
<heat>
SHAME
<geist>
Like ‘so lame they don’t fold out that and do zero register fast path’
<geist>
Which of course is missing the point. ARM says ‘this is how you zero a register, and it’s mov reg, #0’
<mrvn>
geist: do you mean xor r0, r0, r0?
<geist>
XOR reg, reg doesn’t have any advantage on ARM and thus there’s no fast path for it
<geist>
mrvn: yes, i sometimes write the two register form when it’s implied the latter two would be the same register
<mrvn>
xor being faster on x86 is legacy cruft
<geist>
Exactly. It encodes smaller, so it’s the fastest way to do it, etc
DonRichie has quit [Quit: bye]
DonRichie has joined #osdev
<mrvn>
it saved a byte so people started using it and then because everyone used it they made it faster too
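[note] A small sketch of the "saved a byte" point, comparing the machine-code bytes of the two zeroing idioms. The byte sequences are the standard encodings (opcode 31 /r for xor, B8 plus an immediate for mov); the dictionary names and the `saved_bytes` helper are hypothetical and only there for illustration.

```python
# Encodings of two ways to zero the accumulator, in 16-bit and 32-bit code.
# 31 C0 decodes as "xor ax, ax" with a 16-bit operand size and as
# "xor eax, eax" with a 32-bit operand size.
ENCODINGS_16BIT = {
    "xor": bytes([0x31, 0xC0]),                    # xor ax, ax
    "mov": bytes([0xB8, 0x00, 0x00]),              # mov ax, 0
}
ENCODINGS_32BIT = {
    "xor": bytes([0x31, 0xC0]),                    # xor eax, eax
    "mov": bytes([0xB8, 0x00, 0x00, 0x00, 0x00]),  # mov eax, 0
}


def saved_bytes(encodings: dict) -> int:
    """Bytes saved by the xor idiom relative to mov with a zero immediate."""
    return len(encodings["mov"]) - len(encodings["xor"])


if __name__ == "__main__":
    print("16-bit code saves", saved_bytes(ENCODINGS_16BIT), "byte(s)")
    print("32-bit code saves", saved_bytes(ENCODINGS_32BIT), "byte(s)")
```

In 16-bit code the difference is one byte; with a 32-bit immediate it grows to three.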
<GeDaMo>
It would have used fewer clocks on the original 8086 too
<mrvn>
GeDaMo: fewer bus cycles
<mrvn>
nowadays it should be slower as it would require the ALU if that weren't optimized out
<mrvn>
and add register register dependencies
<mrvn>
I wonder, is "sub reg, reg" optimized too?
<heat>
the gang goes looking in the optimization manual
<GeDaMo>
Apparently it is on some microarchitectures
<heat>
it is
<heat>
in everything remotely modern
<heat>
particularly "In processors based on Intel Core microarchitecture"
<heat>
xor, sub, xorps/pd, pxor, subps/pd, psubb/w/d/q and avx equivalents
<heat>
seems like p4 also had this
<heat>
oh, here's a nasty detail about nops
<heat>
"The other NOPs have no special hardware support. Their input and output registers are interpreted by the hardware."
<heat>
other NOPs meaning sizeof(nop) > 1
dequbed has joined #osdev
Vercas6 has quit [Quit: Ping timeout (120 seconds)]
<mrvn>
heat: so I can create a register dependency with a NOP?