klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
MiningMarsh has quit [Ping timeout: 260 seconds]
nyah has quit [Quit: leaving]
gorgonical has joined #osdev
<gorgonical> I just need to ask one question that's also a vent: why is it so fucking hard to conclusively ascertain if a certain exception in arm64 will change exception levels?
<gorgonical> I am debugging non-responsiveness after turning on the MMU in secure state, and my hunch is that the page tables are mis-mapped somehow and the ifetch is failing.
<gorgonical> But even trying to determine where the exception will end up has not been trivial. I am taking it in secure EL1 and I *think* the exception is targeting secure EL1 also. But trying to find a table, description, whatever of the exceptions has been extremely frustrating.
<gorgonical> For some reason the best reference for exactly how the exceptions work is the pseudocode allllll the way at the end of the manual
MiningMarsh has joined #osdev
nick64 has quit [Quit: Connection closed for inactivity]
small has joined #osdev
<small> heat:
<gog> small
<small> lol
<hmmmm> lol did you ever fix your 4-star pointer
gog has quit [Ping timeout: 256 seconds]
Matt|home has quit [Remote host closed the connection]
Burgundy has quit [Ping timeout: 256 seconds]
bgs has quit [Remote host closed the connection]
<small> who
small has quit [Ping timeout: 256 seconds]
spikeheron has quit [Quit: WeeChat 3.7.1]
joe9 has quit [Quit: leaving]
small has joined #osdev
epony has joined #osdev
spikeheron has joined #osdev
heat has quit [Ping timeout: 256 seconds]
fedorafansuper has quit [Quit: Textual IRC Client: www.textualapp.com]
terrorjack has quit [Quit: The Lounge - https://thelounge.chat]
invalidopcode has quit [Remote host closed the connection]
invalidopcode has joined #osdev
terrorjack has joined #osdev
fedorafan has joined #osdev
srjek has quit [Ping timeout: 256 seconds]
Vercas has joined #osdev
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
genpaku has quit [Remote host closed the connection]
genpaku has joined #osdev
QuietlyConfident is now known as LivewareProblem
bradd has quit [Ping timeout: 260 seconds]
bradd has joined #osdev
gildasio has quit [Ping timeout: 255 seconds]
gildasio has joined #osdev
vdamewood has quit [Remote host closed the connection]
vdamewood has joined #osdev
vdamewood has quit [Read error: Connection reset by peer]
vdamewood has joined #osdev
bgs has joined #osdev
gxt has quit [Ping timeout: 255 seconds]
gxt has joined #osdev
invalidopcode has quit [Remote host closed the connection]
invalidopcode has joined #osdev
lockna has joined #osdev
small has quit [Ping timeout: 256 seconds]
elastic_dog has quit [Killed (cadmium.libera.chat (Nickname regained by services))]
elastic_dog has joined #osdev
heat has joined #osdev
lockna has quit [Quit: lockna]
lockna has joined #osdev
small_ has joined #osdev
<dinkelhacker> gorgonical: Do you have access to a debugger and can you get the value in the respective ESR register? You could decode that at https://esr.arm64.dev/ which would probably give you a clue what's going on.
<bslsk05> ​esr.arm64.dev: AArch64 ESR decoder
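For readers without the web decoder at hand, the top-level split of an ESR_ELx value is simple enough to do by hand. The following is a minimal sketch in C, assuming the architectural field layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]); the helper names are made up for illustration and only a handful of exception classes are listed.

```c
#include <stdint.h>

/* ESR_ELx layout: EC = bits [31:26], IL = bit 25, ISS = bits [24:0]. */
static inline unsigned esr_ec(uint64_t esr)  { return (unsigned)((esr >> 26) & 0x3f); }
static inline unsigned esr_il(uint64_t esr)  { return (unsigned)((esr >> 25) & 0x1); }
static inline uint32_t esr_iss(uint64_t esr) { return (uint32_t)(esr & 0x1ffffff); }

/* A few exception classes that matter during MMU bring-up. Not exhaustive. */
static inline const char *esr_ec_name(unsigned ec)
{
    switch (ec) {
    case 0x15: return "SVC from AArch64";
    case 0x20: return "instruction abort, lower EL";
    case 0x21: return "instruction abort, same EL";
    case 0x24: return "data abort, lower EL";
    case 0x25: return "data abort, same EL";
    default:   return "other (see the ESR_ELx description)";
    }
}
```

An EC of 0x21 read from ESR_EL1 would match the "ifetch faults right after enabling the MMU" hunch, since it means the abort was taken without a change of exception level.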
* zid hides his dinkel
small_ has quit [Ping timeout: 272 seconds]
small_ has joined #osdev
sikkiladho has joined #osdev
small_ has quit [Read error: Connection reset by peer]
small_ has joined #osdev
fedorafan has quit [Quit: Textual IRC Client: www.textualapp.com]
fedorafan has joined #osdev
kof123 has quit [Ping timeout: 268 seconds]
micttyl has joined #osdev
lockna has quit [Quit: lockna]
danilogondolfo has joined #osdev
Turn_Left has joined #osdev
Left_Turn has quit [Ping timeout: 255 seconds]
GeDaMo has joined #osdev
kof123 has joined #osdev
gorgonical has quit [Remote host closed the connection]
gog has joined #osdev
fedorafan has quit [Ping timeout: 252 seconds]
lockna has joined #osdev
fedorafan has joined #osdev
outfox_ has quit [Ping timeout: 255 seconds]
lockna has quit [Quit: lockna]
lockna has joined #osdev
lockna has quit [Quit: lockna]
lockna has joined #osdev
Left_Turn has joined #osdev
Turn_Left has quit [Ping timeout: 260 seconds]
bauen1 has quit [Ping timeout: 272 seconds]
fedorafan has quit [Ping timeout: 248 seconds]
fedorafan has joined #osdev
zaquest has quit [Remote host closed the connection]
zaquest has joined #osdev
<epony> vdamewood, eat shit
<epony> retard moron ;-)
<sham1> That's some good shit
<epony> buy an arm sdk board
<ddevault> tfw not getting interrupts from serial
<epony> is the csr / rts set?
<ddevault> pl011
<ddevault> set bit 8 (TX enable) and bit 0 (UART enable) of UARTCR
<heat> does your pl011 drv work in qemu?
<ddevault> set UARTIFLS to 0, UARTIMSC to 0
<dinkelhacker> Question about heap allocation: I'm pretty early in my osdev journey and currently my kmalloc function just returns a whole 4k page. That's fine because up until now I only allocate some kernel structs and don't free anything. I've been thinking about implementing something more sophisticated and been wondering how others go about that topic. Does everybody just go down the slab allocator/buddy
<dinkelhacker> system route ?
<ddevault> enable IRQ 1 (intno 33) in GIC by writing 0b10 to GICD_ISENABLER1 and ICPENDR1
<ddevault> heat: everything but interrupts, yeah
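The "intno 33 means bit 1 of GICD_ISENABLER1" arithmetic above generalises: each GICD_ISENABLER<n> register covers 32 interrupt IDs. A minimal sketch in C, assuming a GICv2/GICv3-style distributor where GICD_ISENABLER<n> sits at offset 0x100 + 4*n; the function names and the `gicd` base pointer are hypothetical.

```c
#include <stdint.h>

#define GICD_ISENABLER 0x100u  /* GICD_ISENABLER<n> is at this offset + 4*n */

/* Byte offset, within the distributor, of the ISENABLER register covering intid. */
static inline uint32_t gicd_isenabler_offset(uint32_t intid)
{
    return GICD_ISENABLER + 4u * (intid / 32u);
}

/* Bit within that register that belongs to intid. */
static inline uint32_t gicd_irq_bit(uint32_t intid)
{
    return 1u << (intid % 32u);
}

/* gicd is the (hypothetical) mapped base of the distributor.
 * ISENABLER is write-1-to-set, so a plain store is enough. */
static inline void gicd_enable_irq(volatile uint32_t *gicd, uint32_t intid)
{
    gicd[gicd_isenabler_offset(intid) / 4u] = gicd_irq_bit(intid);
}
```

For INTID 33 this yields offset 0x104 and the value 0b10, which matches what ddevault describes writing.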
<heat> ddevault, -trace away?
<heat> dinkelhacker, ok so memory allocation is pretty nuanced
<ddevault> anyone got a pl011 driver lying around I can study?
<zid> qemu source imo
<ddevault> ideally someone who uses named constants -_-
<zid> just write it as a client 'against' qemu is easiest
<ddevault> yeah I guess
<zid> That's how I fixed all the bugs in my e1000 when I wrote it
<zid> just made qemu tell me what the actual flow path is
<ddevault> that's how I wrote my PC serial driver too <_<
<ddevault> anyway I was hoping this would just werk and it just doesn't, so I'll get back to more important things and return to it later
<heat> dinkelhacker, I'll describe the linux system first. linux has a page allocator (buddy allocator as you may know) that gets you 2^n pages, with 2^n page alignment. the kernel keeps all the memory/a good chunk of memory consistently mapped. slab works on top of that, gets pages from the buddy allocator, divides them into chunks on top of the linear mapping of memory
<zid> my serial driver for PC infact, just werked
<zid> it is however, totally braindead on PC
<ddevault> "mess with my kernel while long-running process on more important task runs in the background" has become "long-running process finished and now you're blowing off the more important task to debug your kernel"
<heat> then if you for some reason want larger sizes that don't need to be contiguous, or in an mmap-like code path, it keeps (kept, changed a release ago) a red-black tree of virtual memory allocation areas. allocation is simply exploring the gaps between nodes and seeing if there's enough space, then doing page allocation and mapping using the MMU
<heat> this is more or less how things work. some systems have the kernel heap allocator also on top of MMU (like the BSDs, afaik Windows, etc)
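The "2^n pages, with 2^n page alignment" property is what makes buddy bookkeeping cheap: the buddy of a block is found by flipping one bit of its page frame number. A minimal sketch in C of just that arithmetic, assuming 4 KiB pages; it is not Linux's implementation and the function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u  /* assumption: 4 KiB pages */

/* Smallest order n such that 2^n pages cover `bytes`. */
unsigned buddy_order_for(size_t bytes)
{
    size_t pages = (bytes + ((size_t)1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
    unsigned order = 0;
    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* Page frame number of the buddy of the order-n block starting at pfn.
 * Works because an order-n block is always aligned to 2^n pages. */
uintptr_t buddy_of(uintptr_t pfn, unsigned order)
{
    return pfn ^ ((uintptr_t)1 << order);
}

/* Start of the order-(n+1) block formed when a block and its buddy merge. */
uintptr_t buddy_merged(uintptr_t pfn, unsigned order)
{
    return pfn & ~((uintptr_t)1 << order);
}
```

A real allocator adds per-order free lists on top: allocation splits a larger block until the requested order is reached, and freeing merges with the buddy whenever the buddy is also free.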
<zid> buddy allocate your bytes, ez
<ddevault> wait, you guys aren't writing microkernels? for shame /s
<heat> as for your slab question, I think slab has more or less been empirically proven as the best allocator for a kernel
<zid> ddevault: I may not shave or get a haircut that often, but I'm not a *total* hippie
<ddevault> what I'm curious to see resources on is virtual memory allocation strategies
<ddevault> i.e. address space management
<klange> my windowing system lives in a userspace process, that's micro enough for me
<zid> your bytey allocator is just a virtual memory allocator
<heat> mainly because it lowers lock contention and slab allocators tend to want to hang onto objects, which the kernel can totally deal with under memory pressure by simply force freeing them itself
<heat> ddevault, wdym allocation strategies?
<ddevault> where to place memory mappings, when to reap page tables, etc
<ddevault> mapping pages larger than 4K
<heat> last I checked, linux did not reap page tables? I think they also just do a first-fit on address space gaps
<zid> I'm not sure what more advanced technique you need than "big slab of shit, divide it into bytes and hand them out"
<sham1> Slabs of shit
<sham1> Brilliant
<zid> like, what could it add, other than complexity
lockna has quit [Quit: lockna]
<ddevault> well, if you never reap page tables it gets much simpler
<zid> userspace mallocs also have the exact same constraints to deal with and also pick that approach
<heat> "get slab of shit, divide it and hand them out" is a very generic idea
<heat> what matters is how you get slabs of shit, how large they are, how you divide them and then how you hand them out
<zid> The point is that you don't "manage" it at all, you specifically don't divide them, or hand them out
<zid> You let the person fiddling the bytes do that, you do nothing except provide the initial slab of shit
<zid> see: brk
<heat> who is "you"?
<sham1> As for why no microkernels, the process of actually making sure that the kernel is actually micro is certainly a weird one
<heat> brk is not the greatest of ideas
<zid> the person doing the "virtual memory allocation strategy"
<klange> to determine if a kernel is micro you need to get out the kernel calipers
<dinkelhacker> heat: Thanks for your explanation. I think for starters I'll try to keep things simple and just use contiguous memory (although I'm actually already using virtual memory). So how does the management of the slabs actually work? Does kmalloc just get one and keep track of the start addresses and the number of bytes it gave out for each request? To make it even simpler: how would you handle allocations
<dinkelhacker> of a couple of bytes within a single page? Like you still have all the fragmentation problems, right?
<heat> brk is objectively a bad idea. see: internal fragmentation
<klange> and they're locked in tanenbaum's basement
<dinkelhacker> zid: also in kernel space?
lockna has joined #osdev
<dinkelhacker> heat: No I haven't thanks for the pointer. I'll check that.
<klange> I've been using essentially the same malloc implementation since the very start; wrote it before I even got into the OS stuff
<sham1> klange: ah yes, the international standard microkernel. But yeah, bootstrapping one from boot seems odd, at least if one thinks about it in as "absolute" terms as I do (can't have initramfs in kernel because filesystem doesn't belong, but then have to solve problem of loading initial stuff)
<heat> i highly recommend you do. the slab allocator essentially works with slab caches (which are just pools of objects T), which have lists of slabs, which are essentially just a few pages of memory with N objects inside them
<heat> allocation is basically going to one of the partial slabs (slabs with allocations in them, but not entirely allocated) and grabbing one. if you don't have a partial slab, you allocate a new slab
<heat> kmalloc/malloc is traditionally implemented by keeping N slab caches for a bunch of bucket sizes and allocating from them
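The cache / slab / partial-list structure heat describes can be sketched in a few dozen lines. This is a minimal single-threaded sketch in C under loud assumptions: one page per slab, the slab header stored at the start of the page, `aligned_alloc` standing in for the kernel's page allocator, and every name here (`slab_cache`, `slab_alloc`, `page_alloc`, ...) invented for illustration. Empty slabs are never handed back, and there is no locking.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SLAB_BYTES 4096u

struct slab {
    struct slab *next;      /* link in the cache's partial list */
    void *free;             /* singly linked list of free objects */
    unsigned in_use;
    unsigned capacity;
};

struct slab_cache {
    size_t obj_size;        /* assumed: multiple of sizeof(void *), well under a page */
    struct slab *partial;   /* slabs with at least one free object */
};

/* Stand-in for the page allocator; a kernel would ask its buddy allocator. */
static void *page_alloc(void) { return aligned_alloc(SLAB_BYTES, SLAB_BYTES); }

static struct slab *slab_new(struct slab_cache *c)
{
    struct slab *s = page_alloc();
    if (!s)
        return NULL;
    s->next = NULL;
    s->free = NULL;
    s->in_use = 0;
    s->capacity = (unsigned)((SLAB_BYTES - sizeof *s) / c->obj_size);
    if (s->capacity == 0) {             /* object does not fit in one slab */
        free(s);
        return NULL;
    }
    char *p = (char *)(s + 1);
    for (unsigned i = 0; i < s->capacity; i++, p += c->obj_size) {
        *(void **)p = s->free;          /* thread each object onto the free list */
        s->free = p;
    }
    return s;
}

void *slab_alloc(struct slab_cache *c)
{
    struct slab *s = c->partial;
    if (!s) {                           /* no partial slab: make a new one */
        s = slab_new(c);
        if (!s)
            return NULL;
        c->partial = s;
    }
    void *obj = s->free;
    s->free = *(void **)obj;
    if (++s->in_use == s->capacity) {   /* now full: drop it from the partial list */
        c->partial = s->next;
        s->next = NULL;
    }
    return obj;
}

void slab_free(struct slab_cache *c, void *obj)
{
    /* Slabs are SLAB_BYTES-aligned, so the header is found by masking. */
    struct slab *s = (struct slab *)((uintptr_t)obj & ~(uintptr_t)(SLAB_BYTES - 1));
    int was_full = (s->in_use == s->capacity);
    *(void **)obj = s->free;
    s->free = obj;
    s->in_use--;
    if (was_full) {                     /* full -> partial again */
        s->next = c->partial;
        c->partial = s;
    }
}
```

A kmalloc in this style would keep one `slab_cache` per bucket size (say 16, 32, 64, ... bytes) and round each request up to the nearest bucket, which is where the fragmentation dinkelhacker asks about gets bounded.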
<klange> sham1: Back in the day, toaru32 was excessively modular, and had those problems - had to load dozens of modules just to get a working FS; with Misaka I just abandoned all of that and embedded the basics (tmpfs, tarfs) right in the core kernel.
<zid> The correct amount of modulation is approximately what linux does
<heat> klange, there's no such thing as excessively modular, see EFI
<zid> if it wasn't best, why would linux do it, QED
<heat> how many modules do you want? yes.
<heat> also reminder that EFI firmware loads EVERY driver even if you don't have hardware for it or a need for it
<heat> really high quality stuff
<sham1> Well the problem wouldn't really end with that either. Another sore point I think about is why should there be a program loader in a microkernel
<sham1> Of course the answer is to solve the bootstrapping problem, but that's just it, it keeps rearing its ugly head up
<gog> all modules matter
<sham1> Of course, my thinking might be closer to a nanokernel than anything else, but it's still something I've considered to be a large problem with this stuff. I like μ-kernels but I don't know how they could be made micro enough
<zid> the reductive version of a microkernel is objectively shit
<zid> so it's just a question of how much are you going to compromise to make it not shit
<sham1> I'm an academic. What is this “compromise” you speak of?
Burgundy has joined #osdev
<zid> You should increase the iron in your diet then
<zid> oh, academic
<sham1> Yeah, not anemic, thankfully
<sham1> But yeah, all these problems can be solved with enough engineering effort, obviously, but the complexity is always there
<dinkelhacker> heat: Cool. Thx again for the useful information. Once I've digested that I'll return with more questions :D
<heat> np
bgs has quit [Remote host closed the connection]
<heat> ddevault, in your efi stub, why do you pad your section names with 3 \0?
<ddevault> padding?
<ddevault> cargo-culted from linux's EFI stub?
<heat> i don't know
<heat> i looked at yours in hope of easier understanding
<ddevault> it aligns the next field on 8 so I'm gonna guess padding
lockna has quit [Quit: lockna]
fedorafan has quit [Ping timeout: 252 seconds]
<ddevault> FYI if you're trying to learn from my EFI stub
<ddevault> it exists just to get the bootloader online
<ddevault> after this the kernel is loaded and this executable is abandoned in memory to be reclaimed by the allocator at some point
<ddevault> so the priority for gooditude here is low
fedorafan has joined #osdev
small_ has quit [Quit: Konversation terminated!]
Brnocrist has quit [Ping timeout: 260 seconds]
Brnocrist has joined #osdev
bauen1 has joined #osdev
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
mahk has quit [Ping timeout: 268 seconds]
fedorafan has quit [Ping timeout: 256 seconds]
<heat> ddevault, halp
fedorafan has joined #osdev
<heat> error: symbol '_text_start' can not be undefined in a subtraction expression
<heat> .long _ro_end - _text_start
<heat> but you seem to do the same thing?
<ddevault> see linker script
<ddevault> I define _etext et al there
<heat> do you need __pecoff_data_size because you can't have 2 undefined syms in an expr?
<clever> heat: i forget the details, but i heard something about there also being start/end symbols that are auto-generated, separate from what the linker script adds
<ddevault> maybe?
<ddevault> I do
<ddevault> most of the measurements in the linker script
<ddevault> I recall having similar issues before I did so
<heat> I guess that's what I need
<ddevault> to minimize suffering I recommend lifting my linker script and header wholesale and moving on with your life
<ddevault> I offer you an MIT license for it
<heat> no need
<heat> I can't use your linker script anyway lol
<ddevault> rip
<heat> thanks anyway
<ddevault> I might also be open to some effort to generalize this into a bootloader which is reusable outside of helios
<ddevault> to just have an EFI bootloader which loads an ELF file and moves on with our lives
<heat> my problem is that I'm doing like _text_end - _text_start, both of which are undefined syms that get defined somewhere else
<heat> where what I really need is something like CONST - und or local_sym - und
<ddevault> what's your ultimate goal here?
<heat> where you can't, you do things like .long __pecoff_data_size and do __pecoff_data_size = ...; in the linker script
<heat> EFI boot my kernel directly
<ddevault> gotcha
<ddevault> might have more success looking at linux than at helios, then
<ddevault> linux loads the whole kernel as PE/COFF, mine just uses it as an intermediate step
<heat> I think i'm relatively near
<ddevault> though this __pecoff_data_size business is in linux, too
<heat> (if you forget the compat issues I'll get anyway lol)
<ddevault> also, stupid question
<ddevault> are you using -fPIC
<ddevault> and which arch
<heat> i'm doing this on x86_64 for now, no fPIC
<ddevault> would recommend PIC
<heat> I don't need that
<ddevault> EFI binaries are loaded wherever the firmware feels like it
<zid> pic is for sillies
<heat> my boot bits are physical-address-relocatable
<ddevault> so you'll end up writing relocation code in PIC assembly
<zid> like virtual memory doesn't exist
<ddevault> you have some MMU usage constraints in this environment, zid
<ddevault> but yeah, I don't like PIC generally
<zid> pic is something you add later for kaslr after you're super advanced
<ddevault> another part of why I don't want my EFI stub to be my actual kernel
<heat> ddevault, also fwiw there's a flag to say "I have no relocs, load me only where I want to"
<ddevault> oh? interesting
<heat> IMAGE_FILE_RELOCS_STRIPPED
<heat> This indicates that the file does not contain base relocations and must therefore be loaded at its preferred base address. If the base address is not available, the loader reports an error.
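The flag heat quotes lives in the Characteristics field of the COFF file header that follows the "PE\0\0" signature. A minimal sketch in C of testing it, assuming the standard 20-byte header layout and the value 0x0001 for IMAGE_FILE_RELOCS_STRIPPED from the PE/COFF specification; the struct and function names are made up, and whether a given firmware honours the flag is a separate question.

```c
#include <stdint.h>

#define IMAGE_FILE_RELOCS_STRIPPED 0x0001u

/* COFF file header as it appears right after the "PE\0\0" signature. */
struct coff_file_header {
    uint16_t machine;
    uint16_t number_of_sections;
    uint32_t time_date_stamp;
    uint32_t pointer_to_symbol_table;
    uint32_t number_of_symbols;
    uint16_t size_of_optional_header;
    uint16_t characteristics;
};

/* Nonzero if the image carries no base relocations and so can only run
 * at its preferred base address. */
static inline int pe_must_load_at_preferred_base(const struct coff_file_header *h)
{
    return (h->characteristics & IMAGE_FILE_RELOCS_STRIPPED) != 0;
}
```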
<ddevault> introduces a dependency on your platform's physical memory layout, though
<heat> it does
<heat> such is life
<ddevault> out of curiosity, why is EFI support in scope for your project?
<zid> heat loves efi
<heat> why is it not?
<zid> massive boner for it
<ddevault> I mean what does EFI offer you that BIOS does not
<ddevault> just want to better understand your goals
<heat> I just want to boot in EFI
<ddevault> for me it offers a standard boot environment with a filesystem implementation to load stuff from
<heat> GRUB already supports it, I already support the runtime services side of the deal, I just want the EFI stub too
<heat> like the cool kids
<ddevault> ah
<heat> fwiw I want to ditch EFI as quickly as possible
<heat> fuck that shit
<ddevault> ExitBootServices is called before kmain in my system :)
<heat> half of EFI is broken as shit and vendor code is at best very very questionable
<zid> I just don't see what you would need to do before your OS boots
<heat> particularly runtime services
<ddevault> load modules from the filesystem, zid
<ddevault> and obtain a memory map
<zid> either stay inside the efi shell forever, or ignore it as fast as possible
<ddevault> what am I looking at?
<netbsduser> rust
<zid> wow, heat is mean
<heat> minimal efi implementation for qemu arm64 virt
<heat> if you wanna ditch ovmf as fast as possible
<ddevault> neat
<heat> (which you should)
<ddevault> but I use ovmf on real hardware so I want a uniform environment
<ddevault> also I don't care
<heat> fair
<ddevault> I fucking hate serial consoles
<ddevault> why does it suck so much
<ddevault> also thanks for making me open my helios directory and start looking at my problems again
<heat> i like em
<heat> no problem
<heat> happy to help
<dinkelhacker> jesus I haven't thought about all that BIOS/stuff at all^^.. since I started on the raspberry pi 4 it was just "put your image here we will load it to this address"
<ddevault> problem is when you want to target ARM, not raspberry pis
<dinkelhacker> what do you mean by "target ARM"?
<dinkelhacker> Like you want to target a generic arm based platform
<dinkelhacker> ?
<ddevault> yes
<ddevault> and not a specific SoC
<ddevault> why the hell doesn't transmit work on my serial cable
<dinkelhacker> True story... in retrospect targeting the pi might have been an arrow to the knee..
<ddevault> soon enough I'm going to end up plugging buttons into the GPIO pins to control slides
<zid> I hope you're allowed a backup presentation
<zid> for when your pi crashes out
<ddevault> I'm not actually allowed any presentations, I was rejected and I'm still waiting to hear back about spare slots :(
<kaichiuchi> hm…
<kaichiuchi> I might raise a gcc bug
<ddevault> but the backup plan is to switch to laptop, easy enough
<ddevault> bah!
<ddevault> same issue on a different serial cable
<kaichiuchi> with Od/O0 on msvc/clang, where appropriate parameters are passed in registers
<kaichiuchi> with gcc you MUST use the register specifier to do that
<ddevault> RX and RX enable bits (and UART enable) are set to 1 in UARTCR
<ddevault> FIFOs enabled in UARTLCR_H
<ddevault> RX errors cleared in UARTECR
<ddevault> and... the RX flag never pops up in UARTFR when typing into the console
<ddevault> also GNU screen still really really really sucks
<ddevault> TX and RX enable bits*
<kaichiuchi> interesting
<kaichiuchi> clang 3.0.0 generates what i’d expect
<kaichiuchi> 4.1.2 does not
<heat> who cares about clang 3.0.0
<zid> That's 10 worse than gcc
<kaichiuchi> heat: this happens on trunk too
* ddevault types mindlessly into a console, watching nothing happen
<ddevault> I've reproduced this one bizarre serial problem on two rpis, three serial cables, and minicom and screen on the host
<ddevault> flow control. gdi
<heat> ddevault, actually this was pretty painless
<heat> I had an issue with the PE header alignment
<heat> the rest Just Works
<ddevault> ah
<ddevault> nice
<sham1> Alignment, it's dreadful
<ddevault> whyyyy doesn't my serial work
<ddevault> it worked exactly once
<sham1> It's also being dreadful
Matt|home has joined #osdev
<kaichiuchi> ok I think I’m going to file a bug and embarrass myself
<heat> good
<ddevault> ...huh
<heat> it's what you deserve
<ddevault> ah, I see
<ddevault> I spammed my keyboard into the console out of frustration and keys started showing up after a moment
<ddevault> turns out I was waiting for the FIFO to be full, not to be not-empty
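The bug ddevault describes comes down to which UARTFR bit the receive path checks. A minimal sketch in C of a non-blocking read, assuming the PL011 register layout from the TRM (UARTDR at offset 0x00, UARTFR at 0x18, RXFE as bit 4, RXFF as bit 6); `base` is a hypothetical pointer to the mapped UART and the function names are invented.

```c
#include <stdint.h>

/* PL011 register offsets and flag bits. */
#define UARTDR       0x00u
#define UARTFR       0x18u
#define UARTFR_RXFE  (1u << 4)   /* receive FIFO empty */
#define UARTFR_RXFF  (1u << 6)   /* receive FIFO full: the wrong bit to wait on */

static inline uint32_t reg_read(volatile uint32_t *base, uint32_t off)
{
    return base[off / 4u];
}

/* Returns the next received byte, or -1 if nothing is waiting. */
int pl011_try_getc(volatile uint32_t *base)
{
    if (reg_read(base, UARTFR) & UARTFR_RXFE)
        return -1;                       /* FIFO empty: nothing to read */
    return (int)(reg_read(base, UARTDR) & 0xffu);
}
```

Waiting for RXFF instead only returns once the whole FIFO has filled, which is why mashing the keyboard eventually produced output.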
<kaichiuchi> heat: no it isn’t
<kaichiuchi> :(
<kaichiuchi> I would probably wager this is violating the ABI
<sham1> Stop violating the ABI without consent
<sham1> In fact, just stop violating the ABI
<kaichiuchi> “oh baby ABI let me spill all of my parameters to your stack”
<sham1> <.<
<kaichiuchi> :(
<ddevault> mission accomplished
<sham1> Now cube it!
<GeDaMo> That's not squared, it's rectangulared! :P
<sham1> I could have also said to now raise it to a 3/2s power, but that's not quite as fun as cubing, so now it's a factor of 6
<sham1> And by factor I mean power
<ddevault> finished an ambitious demo with time to spare
<ddevault> and... still no slot to present it in
<ddevault> womp womp.
dude12312414 has joined #osdev
srjek has joined #osdev
SGautam has joined #osdev
[itchyjunk] has joined #osdev
epony has quit [Remote host closed the connection]
Vercas9 has joined #osdev
Vercas has quit [Quit: Ping timeout (120 seconds)]
Vercas9 is now known as Vercas
dude12312414 has quit [Ping timeout: 255 seconds]
dude12312414 has joined #osdev
Arsen has quit [Changing host]
Arsen has joined #osdev
Burgundy has quit [Remote host closed the connection]
micttyl has quit [Quit: leaving]
LivewareProblem is now known as aoei
aoei is now known as LivewareProblem
gog has quit [Quit: Konversation terminated!]
fedorafan has quit [Ping timeout: 256 seconds]
fedorafan has joined #osdev
<heat> hey guys
<heat> i found a new typo in the efi spec
<heat> what do I get
kori has left #osdev [WeeChat 3.6]
<Ermine> heat you rock!
<heat> i am not a rock Ermine
<Ermine> that's why there's no 'are'
SGautam has quit [Quit: Connection closed for inactivity]
nvmd has joined #osdev
nvmd has quit [Max SendQ exceeded]
nvmd has joined #osdev
nvmd has quit [Max SendQ exceeded]
nvmd has joined #osdev
nvmd has quit [Max SendQ exceeded]
tosemusername has joined #osdev
nvmd has joined #osdev
nvmd has quit [Client Quit]
xenos1984 has quit [Ping timeout: 256 seconds]
xenos1984 has joined #osdev
SGautam has joined #osdev
dequbed has quit [Quit: bye!]
gareppa has joined #osdev
gareppa has quit [Remote host closed the connection]
dequbed has joined #osdev
nyah has joined #osdev
<zid> heat: To bask in the afterglow of knowing that EFI is bad, and then finally proving it without a shadow of a doubt
gog has joined #osdev
<zid> Your sense of taste is unrivalled and your foresight perfect
<heat> pog
<heat> zid, did you know there was sandy bridge fw that gave you a bad memory map that told you some intel gfx allocated range was free
<heat> so you tried to use it and it went boom
<zid> you mean a bad bios
<zid> plenty of those around
<zid> I don't have GMA though so I am immune
<heat> there are also these ranges of memory you're only supposed to use before booting the OS but it used them anyway when calling into the fw to set the virtual addressing layout
<heat> now there's drama in arm64 where there are devices that *need* you to set the virtual address map and devices where you must not set it
<heat> this is so fucking broken
<heat> it's depressingly hilarious
<heat> it has been around for 20 fucking years
<zid> have they considered having a BIOS
<heat> noooooOOOOOooooOOOoooooOOOoooooo
<zid> They should, it's great
<heat> - person booted on EFI right now
<zid> You can write bootloaders on top of it, and the bios handles all the grotty details
Burgundy has joined #osdev
fedorafan has quit [Ping timeout: 246 seconds]
<heat> ddevault, btw your stub is kind of wrong
<zid> only kind of? That's good for EFI
<heat> ddevault, .word .Lpe_header - .L_head <-- should be .short in arm64, with a .align 4 right after it since the PE header needs to be 4-byte aligned
<heat> your solution should work right now but it's really non-obvious
fedorafan has joined #osdev
<gog> hi
mahk has joined #osdev
<ddevault> heat: busy atm, would you mind emailing me the details so I can follow up later?
<zid> so it fails to warn on 16bit truncates like it should, basically?
<kaichiuchi> hi
<heat> ddevault, just take a note, what I said is all there is to it
<heat> instead of .short (16 bit) + align 4 (2 bytes padding) you're doing a non-obvious .word (32-bit in arm64, 16-bit in x86_64)
<ddevault> aight
<ddevault> thanks
<zid> spooky size changing
<heat> tbf arm is the correct behavior here
<heat> word should be native word
<heat> not "native word in 1985"
<zid> yea but 4 people got to not update their .s files!
<kaichiuchi> heat: something here reminds me of you at work
<heat> is it a clogged toilet
bgs has joined #osdev
<geist> also re: align in assembly, be careful it’s not a power of two
<geist> if using gas at least, i’ve found it to be safer to use .balign, which is always bytewise
<zid> I fucked up an align once
<geist> whereas depending on arch, .align is sometimes byte, sometimes power of 2
<zid> I made nasm run out of memory and crash
<zid> I asked for align 2^32 I think
<geist> yeah that’ll do it
<zid> gas's directive syntax is like "What if we were like C and had IDB and UB and stuff? That'd be fun"
<dh`> have you ever looked at the gas sources?
<zid> no I'm not an idiot dh` what do you take me for
<zid> "have you ever put your finger into a blender?"
<heat> i use llvm as
<geist> i give it a lot of slack, since gas can historically please no one. if it does whatever the prevailing arch’s syntax says, folks bitch that it’s inconsistent between arches, if it tries to invent its own syntax, they bitch that it differs from $vendors syntax
<heat> it's nice
<geist> so it just tries to do what it can
<zid> "You know when you stick your head into an oven?"
srjek has quit [Read error: Connection reset by peer]
<kaichiuchi> heat: the new guy looks like BAZINGA
<geist> the fundamental issue is there’s no one consistent syntax between all arches, but gas tries to thread a particular needle
<heat> llvm binutils have made great significant progress such as, erm
<heat> errors with descriptions
srjek has joined #osdev
<zid> heat: and that thing where they erm
<heat> and colors!
<zid> and that other thing!
<geist> yah a lot of that was because fuchsia. llvm as and binutils stuff was pretty bad 5 or 6 years ago, and we had a mandate to at least match gas/objdump/etc for what fuchsia needed
[itchyjunk] has quit [Remote host closed the connection]
<geist> lots of linker script improvements, etc
<gog> hi
<heat> geist, they should've forced at&t on every arch
[itchyjunk] has joined #osdev
<heat> geist, also a fair amount of that AFAIK has also been clang-built-linux
<zid> I also think that
<heat> and the android teams
<geist> yep
<geist> but i think we were doing that somewhat before the clang-builds-linux push
<geist> but yeah, lots of stuff coming out of the google toolchain teams
<geist> also iirc the linux stuff still using binutils. freebsd did too, clang the compiler + binutils linkers/etc
<geist> though freebsd may have fully switched
<heat> you can build the linux kernel with llvm only
<heat> LLVM=1 make ...
<geist> word
<heat> it Just Works for the most part
<heat> funnily enough glibc can't be built with llvm
<heat> (and neither can gcc if you're cross-compiling it)
<geist> yah i’m going on a few year old data
<heat> srsly, the gcc thing is kind of annoying. it compiles well until it tries to yank some multilib data out of $CC
<geist> guess there’s no exact combination of an OS/distro that wants to use clang/llvm + glibc
lockna has joined #osdev
<heat> sure there is, CrOS
<zid> I'd be cross too if I had to use clang + glibc
<zid> ba-dum tish
<geist> oh that reminds me to look into crosvm. been meaning to, just keep forgetting to
outfox has joined #osdev
<geist> and now i know a bit more rust i may actually be able to read its source
<geist> my guess is it’ll be underwhelming.
<geist> ie, it works, but isn’t very configurable, since it’s designed to do one thing
mahk has quit [Ping timeout: 246 seconds]
mahk has joined #osdev
gog has quit [Quit: byee]
<ddevault> heat: pushed fix, thanks
fedorafan has quit [Ping timeout: 256 seconds]
fedorafan has joined #osdev
srjek has quit [Ping timeout: 272 seconds]
Starfoxxes has quit [Ping timeout: 260 seconds]
SGautam has quit [Quit: Connection closed for inactivity]
Starfoxxes has joined #osdev
xenos1984 has quit [Ping timeout: 260 seconds]
GeDaMo has quit [Quit: That's it, you people have stood in my way long enough! I'm going to clown college!]
lockna has quit [Quit: lockna]
xenos1984 has joined #osdev
<Bitweasil> Oh ffs. https://github.com/signup
<bslsk05> ​github.com: Join GitHub · GitHub
<Bitweasil> They've ruinsed it.
<Bitweasil> Like, "Literally unusably slow trying to render that starfield on my machines."
Vercas has quit [Ping timeout: 255 seconds]
gildasio has quit [Ping timeout: 255 seconds]
gildasio has joined #osdev
fedorafan has quit [Ping timeout: 256 seconds]
fedorafan has joined #osdev
bauen1 has quit [Ping timeout: 256 seconds]
knusbaum has quit [Quit: ZNC 1.8.2 - https://znc.in]
vdamewood has quit [Remote host closed the connection]
vdamewood has joined #osdev
knusbaum has joined #osdev
<ddevault> sign up for sourcehut instead :)
spikeheron has quit [Quit: WeeChat 3.7.1]
spikeheron has joined #osdev
bgs has quit [Remote host closed the connection]
Vercas has joined #osdev
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
<vdamewood> Mutabah klange air sortie geist: Can I talk to one of you?
* Mutabah is away (Sleep)
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
<kaichiuchi> i have a question
<heat> geist, it's quite possible it's more versatile these days
<kaichiuchi> the system V ABI provides two registers used for returning values
<heat> there was a patch floating around edk2 for OVMF on crosvm
<kaichiuchi> now, of course, in C you can only return one value
<kaichiuchi> am I right in assuming something like `int f(int *retval) { *retval = 1; return 666; }` will store 1 and 666 in those two registers?
<heat> no
<kaichiuchi> okay, I didn't think so
<bslsk05> ​godbolt.org: Compiler Explorer
<kaichiuchi> jesus christ
<kaichiuchi> that was like... 10 fucking seconds
<kaichiuchi> maybe less
<kaichiuchi> maybe I should do that
<heat> if you do e.g two u32 values, they get packed onto rax
<kaichiuchi> right, that's what I'd expect
<heat> if you do 4 u32, they get packed onto rdx:rax
<kaichiuchi> so is it really just to return 128-bit stuff?
<heat> hm? the example I gave you returns a struct in rdx:rax
<heat> not 128-bit stuff
Vercas has quit [Ping timeout: 255 seconds]
<kaichiuchi> oh okay okay
<kaichiuchi> I see
<heat> kaichiuchi, in general returning works by packing values onto rax, then rdx:rax, then on SIMD registers if SIMD is enabled, then they fall back to doing a stack allocation and passing a pointer
<kaichiuchi> gotcha
<heat> which is why returning complex structures has very few disadvantages
<heat> at the end of the day struct S {unsigned int a[500];}; struct S foo(); and void foo(struct S*); will have identical codegen
danilogondolfo has quit [Remote host closed the connection]
<heat> in fact in C++ you probably want to favour struct S so it directly constructs S
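A minimal C sketch of the return-value cases discussed above, assuming an x86-64 System V target. The register assignments in the comments describe what the ABI classification is expected to produce and can be checked on Compiler Explorer; the struct and function names (`pair`, `quad`, `big`, `make_*`, `fill_big`) are illustrative and not from the conversation, while `f` is kaichiuchi's example.

```c
#include <assert.h>
#include <stdint.h>

/* 8 bytes: expected to come back packed in rax. */
struct pair { uint32_t a, b; };

/* 16 bytes: expected to come back in rdx:rax. */
struct quad { uint32_t a, b, c, d; };

/* 2000 bytes: too large for registers, so the caller is expected to
 * pass a hidden pointer to storage for the result. */
struct big { unsigned int a[500]; };

static_assert(sizeof(struct pair) == 8, "pair should fit one 8-byte register");
static_assert(sizeof(struct quad) == 16, "quad should fit two 8-byte registers");

struct pair make_pair(void) { return (struct pair){1, 2}; }
struct quad make_quad(void) { return (struct quad){1, 2, 3, 4}; }

/* Return by value: the compiler writes through the hidden pointer. */
struct big make_big(void)
{
    struct big s = {{0}};
    s.a[0] = 42;
    return s;
}

/* Explicit out-pointer: the same work spelled out by hand. */
void fill_big(struct big *out)
{
    *out = (struct big){{0}};
    out->a[0] = 42;
}

/* kaichiuchi's example: 666 is the return value (eax), while 1 is
 * stored to memory through the pointer, not to a second register. */
int f(int *retval)
{
    *retval = 1;
    return 666;
}
```

Comparing the assembly for `make_big` and `fill_big` is a quick way to see how close the two forms end up.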
srjek has joined #osdev
Vercas has joined #osdev
dude12312414 has joined #osdev
Turn_Left has joined #osdev
Left_Turn has quit [Ping timeout: 248 seconds]
nshp has quit [Quit: ZNC 1.8.2 - https://znc.in]
<bslsk05> ​lore.kernel.org: [PATCH 00/41] Per-VMA locks
<heat> mjg, cpu_relax() in the middle of cmpxchg just Feels Right(tm)
<mjg> >
<mjg> If you are right, feel free to go and remove every cpu_relax() under the
<mjg> kernel/locking directory.
<mjg> wow
<mjg> that's one hot take
<heat> hilf is the "you're an obnoxious cunt" guy from al viro
<mjg> :d
<mjg> i don't remember that comment but viro is the guy to say something like that
<mjg> oh someone is trying to scale the maple syrup
<bslsk05> ​lore.kernel.org: Re: [syzbot] WARNING in do_mkdirat - Al Viro
<heat> mjg, could you explain why pause is so detrimental to perf?
<heat> I don't get the "well, you grab the exclusive bus anyway"
<heat> don't you eventually lose it by just executing more instructions?
Burgundy has quit [Ping timeout: 268 seconds]
<Bitweasil> Isn't "pause" the "I'm in a spin loop, you can seriously go hyperthread something else now" hint?
heat has quit [Remote host closed the connection]
heat has joined #osdev
<heat> Bitweasil, yes
heat has quit [Remote host closed the connection]
<Bitweasil> *nods* I'd expect it to hurt performance in hot loops, for sure.
heat has joined #osdev
<Bitweasil> That's kind of the point. :)
<heat> note to self: do not click links in hexchat under the possibility of crashing the whole client
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
<heat> per our friend here: To my understanding on said architecture failed cmpxchg still grants you
<heat> exclusive access to the cacheline, making immediate retry preferable
<heat> when trying to inc/dec unless a certain value is found.
<mjg> heat: because the E owner fails to take advantage of it
<mjg> first, MESI aside, consider a case where n cpus tried at literally the same time
<mjg> all failed but 1
<mjg> now you have n - 1 cpus chilling on vacation for the duration of the pause instruction
<mjg> with nobody even trying
<heat> but for how long do you keep the excl cache line is my question?
<heat> https://godbolt.org/z/cjMbvxYdh for instance in this simple cmpxchg atomic inc
<bslsk05> ​godbolt.org: Compiler Explorer
<mjg> until someone else takes it, for example by also executing the op
<mjg> or reading from the cacheline
<heat> really?
<heat> do you just not lose it on the next cycle or whatever?
<mjg> you lose it soon(tm) when faced with numerous cpus fucking here
<mjg> which is why it is crucial to retry asap
<heat> hmm ok I see
<mjg> and again
<mjg> everyone but 1 failing == everyone but 1 just chillin'
<heat> and this doesn't work for e.g spinlocks because you're not guaranteed to make progress in a spinlock loop, in a timely manner
<heat> right?
<mjg> that's weirdly stated in this context
<mjg> the key with spinlocks is that you are ultimately waiting for someone else to set a specific value
<mjg> and you have no idea how long that's gonna take
<mjg> so the thing to do is to wait in the least disruptive manner
<heat> whereas a cmpxchg inc is just a tight loop in 4 instructions where you either make progress or try again very quickly (and have a good chance of succeeding), a spinlock can block for many more cycles and shit
<mjg> no such consideration when flipping the counter
<heat> so forward progress may be unlikely, so pause is a good idea
<heat> yeah exactly
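The "inc unless a certain value is found" loop can be sketched in C11 atomics roughly as follows. This is an illustration under assumptions, not code from the kernel patch or from Onyx: the helper name `inc_unless` and its signature are hypothetical. The point it shows is that a failed compare-exchange already hands back the current value, so the loop retries immediately with no pause in between.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: increment *ctr unless it currently equals
 * `forbidden`. Returns true if the increment happened. */
bool inc_unless(atomic_uint *ctr, unsigned forbidden)
{
    assert(ctr);
    unsigned old = atomic_load_explicit(ctr, memory_order_relaxed);
    do {
        if (old == forbidden)
            return false;
        /* On failure the compare-exchange stores the value it saw
         * into `old`, so we loop straight back and try again.
         * No cpu_relax() here: nobody else has to act for us to
         * make progress, we only need to win the next attempt. */
    } while (!atomic_compare_exchange_weak_explicit(
                 ctr, &old, old + 1,
                 memory_order_acquire, memory_order_relaxed));
    return true;
}
```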
<mjg> well if you feel this way remove all pauses from onyx!
<heat> what the hell does pause even do?
<heat> besides the generic intel sdm meaning
<heat> hyperthread yield? is that it?
<Bitweasil> Far as I know, it's just a hint to the scheduler that you ought go run the other hyperthread harder for a while.
<mjg> it fucks off on the thread
<mjg> but crucially, any ht considerations aside, fucks off from the cacheline
Matt|home has quit [Quit: Leaving]
<mjg> so whoever is blocking your progress has easier time completing what they need to do
<mjg> without you constantly stealing it
<heat> ah so that drops the E?
<mjg> pause does not drop E to my knowledge
<mjg> i'm saying when you pause you are not doing memory accesses
<heat> so what does "fuck off from the cacheline" mean?
<heat> ah ok
<mjg> so someone else has easier time doing the needful
<mjg> if you want some fun patch onyx to never pause
<heat> and bench?
<mjg> just literally cmpxchg in a tight loop when trying to lock
<mjg> you will see perf going down the shitter
<heat> fwiw I think my spinlocks just use xchg
<heat> or some primitive I have did. no idea why
<moon-child> xchg is fine
<mjg> if (__atomic_compare_exchange_n(&lock->lock, &expected_val, what_to_insert, false,
<mjg> __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
<mjg> you pessimal motherfucker!
<mjg> break;
<heat> ... does not seem to. but I swear I had seen a xchg somewhere
<mjg> while (__atomic_load_n(&lock->lock, __ATOMIC_RELAXED) != 0)
<mjg> cpu_relax();
<mjg> i think intel demo spinlocks are xchg
<moon-child> mjg: by the by, thoughts on hle?
<mjg> moon-child: does not work?
<mjg> funny you bring it up, it was recently flamed
nyah has quit [Quit: leaving]
<moon-child> :\
<moon-child> why
<mjg> i mean there is hw bugs which make it unusable so convo stops there
<moon-child> oh sure
<moon-child> I meant notionally
<heat> mjg, I cannot actually xchg because that only works if the value you want to swap with is the same as the value that is there if the spinlock is locked
<heat> as in if your spinlock can only be 0 or 1
<mjg> whack that pause and dup1_threads -t 8
<mjg> you will see a massive drop
<mjg> assuming you got spinlocks to protect the fd table
<heat> yessir yes indeed
<mjg> moon-child: i don't have a strong opinion
<heat> mjg, what's pessimal in my code btw?
<mjg> you instantly re-read
<mjg> you should do { cpu_relax(); } while ....
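A sketch of the lock loop with mjg's suggestion applied, written with C11 atomics instead of the `__atomic` builtins. It is an assumption-laden illustration rather than Onyx's actual code: `struct spinlock`, `spin_lock`, `spin_unlock`, and this `cpu_relax` definition are stand-ins. After a failed compare-exchange, the waiter relaxes first and only then re-reads the lock word, so it does not touch the cacheline again right away.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumed definition: `pause` on x86, a no-op elsewhere. */
#if defined(__x86_64__) || defined(__i386__)
#define cpu_relax() __builtin_ia32_pause()
#else
#define cpu_relax() ((void)0)
#endif

struct spinlock { atomic_uint lock; };   /* 0 = free, nonzero = held */

void spin_lock(struct spinlock *l)
{
    for (;;) {
        unsigned expected = 0;
        if (atomic_compare_exchange_strong_explicit(
                &l->lock, &expected, 1,
                memory_order_acquire, memory_order_relaxed))
            break;

        /* Someone else holds the lock and we cannot know for how
         * long. Back off before the first re-read, then keep
         * polling with plain loads until the lock looks free. */
        do {
            cpu_relax();
        } while (atomic_load_explicit(&l->lock,
                                      memory_order_relaxed) != 0);
    }
}

void spin_unlock(struct spinlock *l)
{
    assert(atomic_load_explicit(&l->lock, memory_order_relaxed) != 0);
    atomic_store_explicit(&l->lock, 0, memory_order_release);
}
```

The only change from the pasted loop is `do { cpu_relax(); } while (...)` in place of `while (...) cpu_relax();`, which moves the first relax ahead of the first re-read.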