rnicholl1 has quit [Quit: My laptop has gone to sleep.]
elastic_dog is now known as Guest497
elastic_dog has joined #osdev
<geist>
yay
<heat>
overcomplicating things for the sake of performance is always fun
<heat>
believe me!
<bnchs>
so i'm thinking of a data structure
<heat>
wait, what's this "memory map" for?
<bnchs>
heat: memory map for an emulated CPU
<bnchs>
i'm literally trying to design one in a drawing
<heat>
you could totally have an rb tree here
<heat>
I think, I don't know how fragmented your thing is
<bnchs>
heat: i read my OS's documentation, it says fragmentation is an issue in their implementation
<bnchs>
sooo, let's just say, i'll implement that
<bnchs>
and call it a feature
Coldberg has joined #osdev
C-Man has quit [Read error: Connection reset by peer]
<bnchs>
also thanks for the idea, i was thinking of something similar to that, but i forgot the name
rnicholl1 has joined #osdev
<heat>
bnchs, implementation of what?
<bnchs>
heat: implementation of their memory allocation system
<heat>
but you're not supposed to allocate your emulator's memory chunk by chunk
<heat>
you get a big chunk of memory thru mmap or something, and it Just Works
<heat>
in all honesty since using an actual tree probably sucks, you may want some other scheme
<heat>
the stupid simple one is to have from 0 to N as DRAM, and then from N to infinity and beyond as MMIO
<heat>
this is how x86 PCs work
<heat>
except those weird fucking bits in the legacy space and the stuff right under 4GB, but those are just warts in x86 (who would've guessed, warts in x86)
<netbsduser>
the virtio spec is a bit of a joke
<netbsduser>
its virtio-fs section basically says "lmao we have not described the protocol, here is a link to a linux header file"
<heat>
that's also a problem with the virtio-gpu stuff
<heat>
particularly the 3D bits
<netbsduser>
the header alone is so far from adequate, at the least they could say what replies you get for different opcodes - i eventually gave up on one of them because there was no figuring it out (i would have to traipse through the linux kernel to figure it out)
<netbsduser>
it turned out i made a mistake with one of them and was getting junk replies, which i finally fixed (and now gcc runs: https://i.imgur.com/8kSp4LH.png)
<heat>
nice
<heat>
did you check one of the BSDs?
<heat>
fun fact: they check linux as docs all the time
<netbsduser>
yeah, i used a mixture of free and openbsd
<heat>
there's a good handful of stuff on the ext2/3/4 stuff that would be impossible to figure out unless you purposefully checked the linux code
<netbsduser>
virtio-fs follows the fuse wire protocol exactly while openbsd rolled their own
<zid`>
ext_two_II_2_also_too
<netbsduser>
heat: oh, it doesn't mount it
<netbsduser>
it just found that gpt partition
<heat>
yes its just
<heat>
the name
<heat>
ext_extended_2_two_second
<netbsduser>
ext4 is also poorly documented, the article on it on some linux wiki or other was clearly given up very quickly
<heat>
it's not
<gog>
hih
<heat>
there's a full spec over at kernel.org
<zid`>
goggles
<gog>
ziddles
<netbsduser>
>"NOTE: This is a work in progress, based on notes that the author (djwong) made while picking apart a filesystem by hand"
<zid`>
gog, may I turn you into paste and spread you on toast
<gog>
yeah
<heat>
netbsduser, ok but it's entirely correct
<heat>
and decently complete
<gog>
what are you going to season me with
<netbsduser>
heat: that's good to hear
<netbsduser>
if i ever try ext4 at least i will have something to look at
<heat>
i should complete/clarify some missing bits
<heat>
but basically the big difference between ext4 and ext2 are just the extent stuff
<bnchs>
heat: mmap in a 32-bit environment
<heat>
and those are basically just a basic btree of {logical start, physical start, length}
<bnchs>
now lemme get this straight, the address has to be within 0x0 to 0x00FFFFFF
<bnchs>
also i'm not allocating my emulator's memory chunk by chunk
<heat>
very, very standard stuff. ofc they don't really describe it as such, which is IMO one of the docs' shortcomings
<heat>
the only thing I have not really looked at in depth is just the htree stuff. there are also some new features for uninit stuff (inode tables, bitmaps, and extents), mainly for virtualization purposes, etc. so your driver needs to handle those if writing
<heat>
you also only need the htree stuff if writing, because it's backwards compatible
<heat>
erm, the uninitialized extent stuff is not really for virtualization/DISCARD purposes but rather for block allocation
<heat>
oh and RE journal, yeah you're kind of screwed there AFAIK
frkazoid333 has quit [Ping timeout: 240 seconds]
<bnchs>
like this is a 32-bit processor
<heat>
if you're not allocating chunk by chunk, why do you need an rb tree?
<heat>
why is it fragmented?
<bnchs>
i need a rb tree for the memory map
<bnchs>
and the memory allocation system is separate from it
<heat>
you do not
<bnchs>
what do i need?
<heat>
i told you what you need
<heat>
a basic way to separate mmio accesses from memory accesses, memory gets all mapped straight up using mmap
<bnchs>
again, how can mmap help me here
<bnchs>
huh
<heat>
user does ./myemu -m 1G
<heat>
you mmap a 1GB chunk of memory, contiguous memory
<bnchs>
how can mmap help make memory mappings that's.. basically like MMIO
<zid`>
If only the cpu had a memory map built into its hw that you could use
<bnchs>
when a program tries to read from it, it does a page fault, but only the OS can handle the page fault
<heat>
why do you want to handle the page fault?
<clever>
bnchs: if the OS cant handle the fault, it runs the SEGV handler, which is free to fake the answer and resume execution
<heat>
(also, technically no, userfaultfd, but that's besides the point)
<bnchs>
for virtualized memory-mapped input output
<heat>
m8
<heat>
can u read?
<bnchs>
just like you said
<heat>
<heat> i told you what you need
<heat>
<heat> a basic way to separate mmio accesses from memory accesses, memory gets all mapped straight up using mmap
<clever>
mmio can just be not mapped, trap the fault, then emulate the access in software
<heat>
then on mmio you obviously need a separate way (rb tree or something should be fine, not perf critical)
<bnchs>
clever: yes that's what i mean
<heat>
but what matters here is that the stupidly common operation of reading and writing to DRAM Just Works
<bnchs>
also this is overkill and possibly not portable for something that doesn't even use x86
<heat>
what is?
<bnchs>
mmap method
<heat>
how is this overkill, whats your idea? how is this not portable?
<clever>
the mmap method can be used on any arch
<heat>
it has been portable since fucking 4.1BSD or something
<bnchs>
this program is not meant to only run in Linux
<heat>
windows has VirtualAlloc
<heat>
every UNIX used in 2023 as a desktop OS has mmap
<klange>
i swear toaru will have it soon
<klange>
at least for this usecase; file mappings tbd
<zid`>
when is unix getting mmapEx
<zid`>
like windows has MapViewOfFileEx and VirtualAllocEx
<heat>
sorry, you mean mmap2
<heat>
this is unix m8
<klange>
mmap2 is already a thing
<heat>
is it?
<klange>
(In Linux, and has been since 2.3!)
<heat>
I know linux has mmap_pgoff or whatever they call it
<zid`>
mmap2 is no good, I want an ex
<zid`>
bonus plus: plus alpha
<klange>
It's actually _probably_ the syscall your libc `mmap` is calling, too!
<heat>
oh, mmap_pgoff is mmap2
<heat>
cool
<bnchs>
heat: the CPU emulator has memory read/write functions, which is how it accesses the memory
<moon-child>
there's futex2 at least. And lseek64 or so
<heat>
damn.
<moon-child>
.oO( if mmap2 is so good how come there's no mmap2 2? )
<heat>
we went through 63 lseeks before this last version
<heat>
these linux people don't know how to build APIs do they
<zid`>
google says mmap2 is bytes/4096, so that you can do 2^44 not 2^32 in 32bit
<clever>
mmap64 just takes a 64bit byte count instead
<heat>
bnchs, yes. it accesses what memory exactly?
<clever>
but both of those, are hacks to allow a 32bit userland to access larger files
<heat>
what's your idea here? malloc(1GB)?
<clever>
a 64bit userland, just always has a 64bit byte offset
<bnchs>
heat: the emulated OS's memory (which executes a function), and the executable itself can request memory
<heat>
I guess that also works but is wasteful and will also use mmap
<heat>
ok so the emulated OS's memory is very vague
rnicholl1 has quit [Quit: My laptop has gone to sleep.]
<heat>
requesting memory does start to get into memory ballooning territory or something, so i'm out
<bnchs>
when the executable accesses the emulated OS' memory, the emulator runs a function to give it a version of the OS memory that is compatible with the executable
<bnchs>
(endian differences)
<klange>
_what_
<heat>
... so this isn't a normal emulator?
<bnchs>
no
<heat>
facepalm.gif
<heat>
<bnchs> heat: memory map for an emulated CPU
<heat>
for the record
<bnchs>
yes
<bnchs>
the CPU is the only emulated part
<bnchs>
the rest is implemented as a compatibility layer
<heat>
are you reinventing qemu-user
<bnchs>
this is literally not running linux
<zid`>
heat can you emulate a version of heat who doesn't speak english for a couple of hours so I can concentrate
<clever>
i think we have to go back to step 1, what is the host cpu? what is the guest cpu?
<bnchs>
the executable is NOT for linux
<heat>
why do you think this is for linux
<clever>
linux is not a cpu
<bnchs>
because qemu-user assumes that it is
<heat>
could you take what I say a little less literally?
<bnchs>
alright
<heat>
it's like, the whole fucking idea of qemu-user. make thing run on other thing, but userspace
<heat>
and in this case, if this is userspace, why do you need mmio?
<bnchs>
heat: to try to give the executable a translated version of the OS memory structure
<bnchs>
for endian and pointer size differences
<clever>
bnchs: what kind of executable? what guest cpu?
<heat>
wha
<bnchs>
clever: it's m68k
<bnchs>
also i'm just responding to heat's suggestion to using mmap
<clever>
bnchs: what OS are these executables normally talking to?
<bnchs>
a emulated OS-9 environment
<bnchs>
kinda like wine
<clever>
and i assume the host cpu is never going to be m68k?
<bnchs>
no
<clever>
wine is not an emulator
<clever>
its right in the name
<bnchs>
yes i mean the design of the OS layer is kinda like wine
<bnchs>
not the emulated CPU
<clever>
so you basically have 2 choices
<clever>
1: emulate the cpu fully in software, just fetch an opcode, decode it, execute it, all ram access can just go into a big old byte-array, all mmio thru a function, convert endian as needed
<bnchs>
congrats, you just stated the obvious, and what i already did
<clever>
2: JIT the m68k into host asm, let blocks of it run natively, and call pre-written functions when doing mmio
<bslsk05>
michalsc/Emu68 - M68K emulation for AArch64/AArch32 (32 forks/245 stargazers/MPL-2.0)
<clever>
it turns the m68k asm directly into aarch64 asm, it avoids the endian problem by just running in big-endian aarch64 mode
<clever>
in this case, it runs the original amiga os under the emulation
<clever>
and the crazy part, is that it maps the host peripherals directly into the guest, so drivers compiled to m68k, can interface with host peripherals (after doing a byte-swap)
<bnchs>
this is not ummm
<bnchs>
related to the original question
<heat>
you do not need mmio
<heat>
period
<clever>
yeah, ive not seen any need for it
Arthuria has joined #osdev
<clever>
just break out of the emulator upon any syscalls, and translate the syscall args
<clever>
then handle the syscall however you want
<zid`>
a lot of emulators for be on le just.. run the memory in the wrong endian
<zid`>
and fix it up later
<zid`>
if they type pun then it secretly emits the wrong instructions for it
<moon-child>
meh
<zid`>
makes writing cheat codes pretty weird
<moon-child>
le cpus have free byteswaps
<zid`>
le cpus have free puns
<moon-child>
I would keep the memory in be
<heat>
how do you hide it?
<bnchs>
clever: now, tell me this, if a device filesystem driver requires accessing the system global variables (which the OS gives it by writing it in a CPU register)
<bnchs>
it's going to access it without a syscall
<zid`>
you can only detect it if you pun
<zid`>
be needs +2 and le needs +0 or whatever if you try to dword -> short
<heat>
bnchs, who called drivers into this?
<heat>
what's a system global variable?
<bnchs>
heat: this is literally a part of the emulator
<moon-child>
zid`: sure, but you're not really going to need to do much (if any) of that
<zid`>
that's why you keep it LE
<moon-child>
since you're going to be operating on behalf of the emulated code
<moon-child>
so keep it be
<zid`>
because it's rare compared to "write it to memory" or "read from memory"
<moon-child>
that way you never have to do any fixups for the emulated code
<zid`>
the BE program has no idea its memory is all in the wrong order
<zid`>
and it runs at full speed
<moon-child>
you don't even have to try to remember
<bnchs>
heat: system global variables are basically the kernel variables, which the filesystem driver can read and also write to
<clever>
zid`: you're assuming a BE program never does 8bit access to a 32bit int
<bnchs>
it's important for some stuff like trying to change the behavior of the kernel
<moon-child>
if you keep the memory in le, then you have to do extra bookkeeping so you can do the fixup when the program does differently-sized accesses
<moon-child>
what if the emulated program does a memcpy?
<clever>
yep
<zid`>
clever: my assuming? no. I specifically said you need to do fixups for punning.
<zid`>
Several times
<zid`>
repeatedly
<clever>
ah, skimming
<zid`>
moon-child: memcpy using byte writes?
<bnchs>
trying to say "i don't need mmio" while the program can access the OS's own memory at any time
<zid`>
that's an edge-edge case :P
<moon-child>
zid`: memcpy using any size writes
<zid`>
irl riscy BE memcpy uses dwords
<moon-child>
say memcpy uses 8-byte writes, and you've got a 4-byte int in memory
<zid`>
the same size you're tracking memory at
<moon-child>
whatever
Coldberg has quit [Ping timeout: 255 seconds]
<moon-child>
or memcpy does misaligned accesses, assuming the architecture allows it
<zid`>
like, I didn't just make this up
<zid`>
lots of actual real life emulators do it
<zid`>
that's why I started with "lots of emulators .."
<bnchs>
now did you all get confused by that?
gog has quit [Ping timeout: 265 seconds]
heat has quit [Ping timeout: 248 seconds]
[itchyjunk] has quit [Read error: Connection reset by peer]
jtbx has joined #osdev
jtbx has quit [Quit: jtbx]
jtbx has joined #osdev
pmaz has quit [Ping timeout: 248 seconds]
tiggster has quit [Ping timeout: 240 seconds]
vdamewood has joined #osdev
Arthuria has quit [Remote host closed the connection]
xenos1984 has quit [Quit: Leaving.]
vdamewood has quit [Quit: Life beckons]
jtbx has quit [Quit: jtbx]
jtbx has joined #osdev
bliminse has quit [Ping timeout: 240 seconds]
bliminse has joined #osdev
slidercrank has joined #osdev
xenos1984 has joined #osdev
<mrvn>
You can negate all your addresses and then your le memory will look be.
<mrvn>
So the emulator translates any pointer access as "high - addr".
<moon-child>
thanks, I hate it
<moon-child>
also does that work? I'm not sure if it does
<moon-child>
say I do a 4-byte access, and then another 4 byte access which overlaps 2 bytes of the other access, I don't think that gives the right results
<mrvn>
moon-child: a 4-byte access specifies the end of the variable so you access "high - addr - 3" and then it works or something.
<mrvn>
s/3/4/
<mrvn>
Do all AArch64 support BE mode?
bgs has joined #osdev
<moon-child>
''NOTE: If the processor is executing code from the same memory area that is being used for the paging structures, the setting of these flags may or may not result in an immediate change to the executing code stream.'
<moon-child>
good to know
<sakasama>
mrvn: That's elegant enough, though then your memory accesses tend to involve decreasing addresses, which doesn't sound good for implicit prefetch.
GeDaMo has joined #osdev
<moon-child>
sakasama: prefetcher handles descending accesses just fine
<sakasama>
Write-combining too. :/
jtbx has quit [Quit: jtbx]
<moon-child>
hrm, I assume that's also handled with aplomb, but don't actually know
<sakasama>
moon-child: On architectures that support BE mode, I'd expect that, but not when emulating a BE arch on LE-only hardware.
<moon-child>
prefetching is uarchitectural
<moon-child>
and implementations of le archs handle it fine
<clever>
mrvn: i think all aarch64 cores support BE mode, ive looked into it before, and at least with qemu and a linux guest, you must execute linux in LE mode, it will switch itself to BE, and faults if you try to run a BE kernel in BE mode initially
<sakasama>
moon-child: Hmm... it seems Intel handles this case smoothly, but you get fewer prefetch streams for decreasing addresses so it's not quite symmetrical. The details appear to be model dependent.
<moon-child>
interesting
<moon-child>
is it that there are some ascending-only streams and some descending-only? Or some bidirectional and some ascending-only?
<sakasama>
The documentation I've found sucks, so I can't tell. :/
<bslsk05>
community.intel.com: Hardware prefetch and shared multi-core resources on Xeon - Intel Communities
<sakasama>
The last response especially.
gog has joined #osdev
<moon-child>
'I haven't encountered the number of prefetch streams as being as significant a limiting factor as the number of fill buffers (10 per physical core, ever since Woodcrest).' soo true
bliminse has quit [Quit: leaving]
<sakasama>
Okay, I should stop. I can imagine myself wasting far too much time reading about this. I may as well watch The Hobbit on repeat.
<moon-child>
lol
<moon-child>
here is a question: for an sasos, do 1gb pages make sense? They technically let you fit more address space into your tlb, but depend on locality; I think the tlb only has space for 4 1g pages
bliminse has joined #osdev
<kazinsal>
sitting here cackling like a madman because I've spent the past half hour debugging why something wouldn't work only to realize it was because I had an off-by-two in my cdecl math
<moon-child>
I think riscv will let you cover basically all your physical memory (assuming remotely sane quantities of physical memory) with one massive page. Alas, not x86
<sakasama>
My mind has purged most of my memories of x86 page tables. Now I am merely a depraved compiler developer.
<moon-child>
I've tried, but I can't; I must play for all the teams
gog has quit [Ping timeout: 276 seconds]
<sakasama>
I sometimes have a terrifying impulse to create my own mixed ternary architecture and intermittently hack in it during lapses of sanity, but otherwise I keep these urges under control.
<moon-child>
ooh that sounds fun
bauen1 has quit [Ping timeout: 250 seconds]
hmmmmm has joined #osdev
hmmmm has quit [Ping timeout: 252 seconds]
<mrvn>
moon-child: Are you sure that 4 1GB pages isn't a myth? With 32bit that was all you ever could have but with 64bit and todays ram having just 4 1GB pages seems rather limited for the TLB.
<mrvn>
sakasama: you should make a quaternary architecture
<moon-child>
pretty sure it was 4 prior to that, though
<moon-child>
oh this is interesting
<moon-child>
actually
<moon-child>
more complicated than that
<mrvn>
you have 2 levels though.
<moon-child>
it seems like it might be optimal to have a small number of 1gb pages and the rest 2mb pages?
slidercrank has quit [Quit: Why not ask me about Sevastopol's safety protocols?]
<mrvn>
moon-child: You have 32 2MB entries, that's 64MB. Not really anything that can replace a 1GB entry.
<moon-child>
yes, but the entries can be spread all over the address space
<moon-child>
so the question is whether applications exhibit enough spatial locality for the 1gb things to make sense
<sakasama>
mrvn: It's mixed by using balanced ternary arithmetic but with quaternary logical instructions.
<mrvn>
second level seems to be split 1024/1024. But that's just 2 1GB pages you can replace with 2MB entries at the cost of blocking all 4k pages.
<moon-child>
yes--don't care about 4k pages
<mrvn>
Then use 1GB pages for the phys mapping. You probably don't have 1TB of memory.
<moon-child>
I can't tell if this is saying the l2 tlb can store 1024 different 1gb pages
<mrvn>
that's how I read it. But 4k pages would compete for the slots and evict entries.
<mrvn>
One question is how pages map to entries. If you map 1TB memory will they all go to separate indexes or will you have collisions and holes?
<moon-child>
you mean like if they actually hash the address or just use the high bits directly?
<mrvn>
They certainly do hash in some way. The question is how. It's 8 way so I assume each address can go to one of 8 slots. But if you have no competing 4k entries you should be able to get all 1024 slots filled with unique 1G pages.
<bslsk05>
pvk.ca: How bad can 1GB pages be? - Paul Khuong: some Lisp
<mrvn>
that's nearly a decade old
<mrvn>
interesting read though, the numbers will be different nowadways so you have to benchmark again but it's a good idea to look at the best and worst case for each pagesize.
<mrvn>
He mentions that huge pages leave more memory free because page tables are smaller. But when mapping everything as 4k pages only costs about 0.2%, is that bit of extra memory really relevant?
<clever>
mrvn: there is also the cost those page tables have on the d-cache
<clever>
(i assume the tables partially live in both d-cache and tlb)
<clever>
d-cache, because the cpu read something, but it can expire without harming the tlb
<clever>
with 4 layers to the paging tables and 4k pages, a TLB miss will involve 4 cache lines becoming live (either refreshing the LRU, or causing a cache-miss&fetch)
<mrvn>
The page walk has cache too
<clever>
seperate from the i-cache and d-cache?
<mrvn>
no idea
<geist>
yup. think of it as TLB entries that jump the page walker to the end
<clever>
ive never heard of one before
<geist>
ie it can say 'for pages within virtual range X through Y, the terminal page table is located at physical address Z'
<clever>
but i can see how that might work, you can just cache every node in the tree as you walk
<clever>
and can skip to whatever node is the best, and resume the walk
<geist>
no, that's not at all how it works. it just saves the terminal
<clever>
ah
<geist>
you didn't read the arm manual deep enough then, it's described somewhat there. even a53 has it
<mrvn>
How does that work with a mix of 1G/2M/4k pages? The terminal for 2M isn't the terminal for 4k.
<geist>
you actually have to maintain it, since there's a bit in the TLB flush instruction that says whether or not you additionally flush the page table walker cache
<clever>
ahhh, was wondering if it was just hidden under all of the tlb flush operations
<geist>
mrvn: good question. i think it's best effort, so it stands to reason the walker cacche only works for 4K terminal entries
<clever>
manual control means you have to document it better
<geist>
x86 has the same thing, but it's completely transparent. it's mentioned in both intel and AMD manuals, but by default when you invlpg it also invalidates the page walker cache that may cover that range
<geist>
AMD has a feature bit you can enable that lets you take over direct control of it, but i dont think anything really uses it
<clever>
ah, so if you dont move the tables, you could keep that cache intact?
<geist>
but with armv8 you absolutely have to be aware of it and it will absolutely bite you in the ass if you arent
<clever>
and speed up walking after a tlb invalidate
<mrvn>
clever: if you just change a page in the table you don't invalidate the page walker.
<clever>
yeah
<geist>
right. for ARM you explicitly invalidate page table entries like normal, but then if you move the page table you *additionally* want to invalidate the pt walker cache
<geist>
so it's ideal, you choose to flush it when it actually changes
<mrvn>
any change of a page directory would need to clear the page walker
bauen1 has joined #osdev
<clever>
geist: but couldnt you also just say to wipe everything (tlb and walk cache) and ignore the problem?
<geist>
right, that's what x86 does by default
<mrvn>
clever: reloading the page table should do that
<geist>
invlpg also invalidates the pt walker cache for that page table
<geist>
if you dig into say the a53 manual, it'll describe how many pt walker entries there are, etc. it talks a bit about how the TLB sram array is carved up
<geist>
and how many entries are for what. the tlb walker stuff is basically a seperate list of entries, iirc, with its own tags
<geist>
it's implementation defined how it works, but a lot of the ARM cores do it more or less the same way, with varying sizes and implementation details
<geist>
maybe they handle > smallest page granule PTs
<geist>
would have to track at what level its a cache entry for, etc
<clever>
ah, i think i see the issue, 600 page document for a53
<geist>
that's really how i figured out how it works. the ARMARM only generically describes it, but then the specific core manuals basically tell you precisely what is stored in it, so you can figure out how it works
<clever>
a few years back, i picked a random arm arm (forget which core), and just started reading it from page 1 while walking outside
<clever>
either i was on the wrong core, or i didnt get deep enough in
gog has joined #osdev
vdamewood has joined #osdev
<clever>
i do remember one bit about dumping the tags for the caches, i think?
<geist>
section 5.2.4 Walk cache RAM
<clever>
ah, was trying to ctrl+f for it
<geist>
yep, if you read the part about the format of the cache dump you can infer a lot about the internal structure
<clever>
that helps, and now that i look at the section headers, its incredibly obvious
<geist>
• 4-way set-associative 64-entry walk cache.
<geist>
ah here's the large page answer: "The walk cache RAM holds the result of a stage 1 translation up to but not including the last level. If the stage 1 translation results in a section or larger mapping then nothing is placed in the walk cache."
<clever>
one slightly odd thing i noticed with the rp2040, if you disable the XIP cache, you can reuse its data ram as regular ram, its just mapped into the addr space
<clever>
but its tag ram isnt mapped
<geist>
yah the tag ram may not have standard ram layout maybe
<clever>
so while you might be able to cheat, and peek at the cache's data ram, you dont know where that came from
<clever>
so you cant abuse that as a debug method
<clever>
but arm has proper debug access, via the co-processor web
<clever>
though, arm doesnt allow arbitrary rw access to the cache ram, so you cant repurpose it as more normal ram
<geist>
Table 6-15 Walk cache descriptor fields is the one you want
<geist>
describes the 128 bit tag ram for the walk cache basically
<geist>
well, 117 bit actually
<moon-child>
it feels like with this kind of stuff the arch has a tendency to get overfit to the uarch
<clever>
section 5.2.1 says there are 10 micro tlb's, on each of the instruction and data "sides", what exactly is a "side", a core?
<clever>
or are they just refering to the 2 halves of the L1 cache, L1i and L1d as 2 sides?
<geist>
yeah i suspect that's what they're saying
<geist>
it's not strictly speaking a half, since it's possible (though not in this case) for there to be a dissimilar amount of space dedicated to i or d caches/tlbs/etc
<geist>
usually larger amount of icache and perhaps a correspondingly higher amount of micro itlb
<geist>
though seems in this case it's symmetric
<clever>
*checks notes*, pi3 is cortex-a53, it has 32kb of L1i and L1d, i think its 0x80 sets for d-cache, but i-cache doesnt have a valid set count?
<geist>
but it'd be wrong to call it a 'half'
<clever>
L1d had 0x700fe01a in its description, and L1i 0x201fe00a
<geist>
dunno, it should describe it
<clever>
yeah, i'm just not sure i'm decoding the description right, so i logged the raw encoding too
<geist>
it is only 2 way set associatinve on the i cache
<geist>
so maybe that causes it to have more sets
<geist>
see section 6.1
nyah has joined #osdev
<clever>
my notes say assoc is 3 and 1, but i think everything is stored as n-1
<geist>
yah
<clever>
ah found it, section 4.3.22 and its relatives
<clever>
thats where those 32bit numbers i pasted came from
<clever>
cache size id register
<geist>
yup
<clever>
L1i claims to have an assoc of 0xff+1
<clever>
but that seems like an escape hatch for N/A
<clever>
sets*
<clever>
6.1 says its 2-way
<clever>
ah, but thats assoc, not sets, hmmm
<clever>
where did i put my notes on caches
<geist>
yah youd need 2x sets if your assoc is 1/2 i think
<mrvn>
clever: doesn't every ARM core boot with the cache as ram and then you configure the DRAM?
<mjg>
clever: you not sleeping *yet* or *anymore*
* clever
points at rpi
* mjg
woke up around 4 am
<mrvn>
(any normal ARM, RPi is not normal :)
<clever>
mjg: not yet, i expect to pass out around 10am
<mrvn>
clever: doesn't rpi do that too though? configuring the DRAM though becomes using the VC's ram.
<clever>
mrvn: yeah, i think a normal arm starts in rom, and just turns the cache on blind (ramless)
<clever>
mrvn: yeah, but with the VPU cache, not the arm cache
<clever>
there is a tiny bit of sram, unknown why, the rom then uses vector-stores to write nulls to the L2 cache, and zero out the whole thing
<clever>
then it runs from that L2 cache
<mrvn>
I always wanted to write a mini OS that keeps running just in cache. Who needs ram anway?
<clever>
i suspect its using a vector-store, to avoid triggering a line-fill
<clever>
the way the code acts, it feels like its not a proper cache-as-ram mode
<clever>
its more, dont miss or evict, and the cache will never catch on
<mrvn>
wouldn't that fail on the first eviction though?
bauen1 has quit [Ping timeout: 265 seconds]
<clever>
yep
<mrvn>
you better be careful you don't alias any pages. :)
<clever>
the rom uses an invalidate control op, to just reset the entire L2, and then vector-stores to fill it without causing a miss
<clever>
then stage1 can live entirely within a 128kb range, starting at 0
bauen1 has joined #osdev
<mrvn>
Does the ARM on the rpi actually have ROM or is it getting that from the VC as read-only mapped memory?
<clever>
no arm rom, and its not even ro mapped
<clever>
the arm just begins with PC=0, which is plain old ram
<clever>
so the VPU has to drop some arm asm at the front of ram first
<clever>
for backwards compatibility, the arm is always forced to start in 32bit mode
<mrvn>
the VPU could map it read-only though, map a bit of actual ROM to 0
<clever>
but the VPU can opt-in to aarch64, if its aware of that, and the hw supports it
<gog>
mew?
<clever>
mrvn: i have yet to find a way to map things into the arm physical space, in a read-only manner
<mrvn>
gog: rpi silliness
* moon-child
pets gog
<clever>
but there are several unused bits in the broadcom mmu
<mrvn>
clever: you map it readonly in the VPU and then the ARM gets a "bus error"?
<clever>
the VPU has no mmu on its end
<clever>
physical or go home!
<mrvn>
Is has those 16 (was it?) memory regions you can map
<clever>
re-reading section 6.1 of the a53 docs, the L1i is 2-way, 64 byte cache lines, and my notes say 32kb, so thats 512 cache lines, broken up into 256 pairs
* gog
prr
<clever>
mrvn: 64 pages, of 16mb each, but that mmu only impacts what the arm thinks the physical space is
<clever>
the VPU is entirely unaffected by that
<clever>
and its unknown if it has any permission flags
<mrvn>
ahh, that way around. I knew there was a 16 in there
<clever>
from how the bit-shifting is done, you have a 2mb resolution, on the target of each page
<clever>
ive not tested if 16mb alignment is required
<gog>
in c++ abi are reference parameters always just pointers or are there implementations that don't do that?
<mrvn>
is it used for more than moving the peripherals around?
<clever>
gog: i always assumed references are just pointers hidden behind a simpler syntax
<gog>
that's what i'm finding too, but there were allusions to this being implementation-defined
<mrvn>
gog: references are pointers that can't be nullptr.
<clever>
mrvn: yep, i have booted a pi1 with peripherals at 0x3f00, and a pi3 with peripherals at 0x2000
<clever>
basically, rpi didnt want to fragment ram, so they put peripherals just after ram
<clever>
but ram kept growing, and they didnt want to break compat with a firmware update
<mrvn>
and that is kind of a moving target
<clever>
and now mmio is a moving target, what model are you on?
<mrvn>
both problems are basically solved by the DT
<clever>
i solved them in an even more fun way
<clever>
i have a #define for where the peripherals live
<moon-child>
gog: I don't know shit about c++, but I think that if a caller has int *x; f(*x), and the callee is void f(int& x) { &x }, the callee's &x has to match the caller's x
<moon-child>
so it has to be a pointer
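moon-child's identity argument above, as code: because the callee can take the address of its reference parameter and that address must be the caller's object, the reference is (under the common Itanium-style ABIs) passed as a plain pointer. A small demonstration:

```cpp
#include <cassert>

// The callee records the address of its reference parameter.
static int* taken_address;

void f(int& x) { taken_address = &x; }

int demo() {
    int x = 41;
    f(x);
    assert(taken_address == &x); // same object: the reference is a hidden pointer
    return *taken_address + 1;
}
```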
<clever>
the VPU side of the firmware, reads that, to configure the mmu
<clever>
the arm bootloader reads that, to offset all peripheral access
<clever>
and the arm bootloader patches the DT automatically, based on that
<gog>
that makes sense
<clever>
so i can just put peripherals in any of the 64 pages (except page 0), recompile, and it magically works
<mrvn>
moon-child: it has to eventually point to the same address. But you could have different memory representation for pointer and reference
<bslsk05>
github.com: lk-overlay/arm.c at master · librerpi/lk-overlay · GitHub
<mrvn>
gog: potentially a pointer could be struct { void *current, void *start; size_t len; } while a reference is always a single object so start/len make no sense.
<clever>
mrvn: line 333 will map every page to the framebuffer, as a sort of alarm, any writes will be visible, 338 then re-maps the lower 64mb to ram, 342 maps some highmem, 346/7 map mmio twice, and 350 map the framebuffer
<moon-child>
what if the caller says int x[2]; f(x[0]) and the callee is void f(int& x) { (&x)[1] = 27 }
<clever>
part of that is just playing around with options, and i could just do a linear map of all ram, plus mmio at the tail, for every model
<moon-child>
is that legal? I don't know c++, again, but it seems like it should be
<mrvn>
moon-child: then I would think that's UB but works.
<moon-child>
see also containerof and such like
<clever>
but i could also randomize every page of ram, and update dma-ranges on boot
<mrvn>
moon-child: breaks aliasing rules I bet
<clever>
respect dma-ranges, or fail
<clever>
no cheating, it changes on every boot!!
<moon-child>
it doesn't seem like it should but idk
<moon-child>
hmmmm. I think some stuff supports hotplug ram. Which seems fairly marginal but I'm sure is helpful sometimes. But I wonder--if a kernel has support for that, the same abstraction can be reused to let vm guests share memory dynamically with the host (and thence other guests); do any kernels support this or is there an established protocol for it?
<mrvn>
moon-child: the contract says you get a reference to a single int. Accessing the int past that violates the contract.
<mrvn>
moon-child: int y; f(y); would be bad
<clever>
the price for avoiding pointer syntax, is that you cant treat it as a pointer and increment the addr
<clever>
you must treat it as a regular variable
<moon-child>
mrvn: obviously, but I don't see what that has to do with anything. If I write void f(int *x) { x[1] = 27; }, then int x[2]; f(x) is fine, and int x; f(&x) is not. It doesn't seem obvious to me that references would be different. It does seem plausible that the standard would distinguish, but it doesn't seem at all obvious
<mrvn>
you should really forget about pointers in C++.
<moon-child>
'you should really forget about ... C++'
<moon-child>
yes, I agree with that
<clever>
yeah, references let you basically just ignore pointers
<moon-child>
you can't put a reference in a struct, can you?
<moon-child>
I don't like c++ references because they are not explicit
<moon-child>
which means you have less local reasoning
<moon-child>
is the latter equivalent to the former?
<clever>
mrvn: just double-checked the math on the custom broadcom mmu, the bus address is >>21'd before going into the MMU, so you have 2mb resolution as i remember, but you also have 21 "unused" bits in the pagetable, that could potentially contain flags?
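A hedged sketch of the coarse ARM-to-bus remapping as clever describes it (64 pages of 16 MiB on the ARM side, each targeting a bus address stored at 2 MiB resolution via the `>>21`). The field layout and names here are illustrative guesses, not taken from Broadcom documentation:

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kPageShift   = 24; // 16 MiB ARM-side pages (64 of them)
constexpr unsigned kTargetShift = 21; // 2 MiB resolution on the bus-side target

uint32_t page_table[64]; // one entry per 16 MiB ARM page (flag bits unknown)

void map_page(unsigned page, uint64_t bus_target) {
    // store the target at 2 MiB granularity, as the >>21 suggests
    page_table[page] = static_cast<uint32_t>(bus_target >> kTargetShift);
}

uint64_t translate(uint64_t arm_phys) {
    unsigned page = (arm_phys >> kPageShift) & 63;
    uint64_t base = static_cast<uint64_t>(page_table[page]) << kTargetShift;
    return base + (arm_phys & ((1u << kPageShift) - 1)); // keep 16 MiB offset
}
```

With this model, mapping page 1 to the peripherals at a 2 MiB-aligned bus address works, but a target that is not 2 MiB-aligned cannot be represented, matching the "2mb resolution" observation.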
vdamewood has quit [Quit: Life beckons]
<mrvn>
With templates you usually get a bunch of overloads for const and non-const that differ.
<clever>
i once used templates and inline asm, to make some pretty crazy code
<clever>
basically, you give the function an 8bit, 16bit, or 32bit int, and it will then dynamically generate asm based on what type you picked
<mrvn>
in/out macros?
<clever>
mostly constexpr
<clever>
let me find the code...
<mrvn>
inb/outb for 8bit, 16bit, 32bit I meant
<clever>
ah no, VPU vector opcodes
<mrvn>
ahh yes, writing your own vector __builtins basically.
<clever>
exactly
<clever>
for (int i=0; i<16; i++) { int temp = a[i] * b[i]; if (store) c[i] = temp; if (accumulate) accumulator[i] += temp; }
<clever>
the VPU can do this entire operation in just 2 clock cycles
<clever>
for mult, a/b can be 8bit or 16bit, and c can be 8/16/32, accumulator is always 48bit
<clever>
now, so you want to write a variant of the function, for every combination of (dont)store, (dont)accumulate, 8/16/32bit op1, 8/16/32 op2, 8/16/32/null dest, and every actual ALU op?
diamondbond has joined #osdev
<clever>
or do you want the compiler to just do its job and build from a template!
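The combinatorial explosion clever describes collapses nicely into one template, in the spirit of the "pure C" emulation half of the code. This is a sketch with illustrative names, not the actual vpu-support-purec.h:

```cpp
#include <cassert>
#include <cstdint>

// One template generates every (dont)store / (dont)accumulate variant of the
// 16-lane multiply at compile time, instead of hand-writing each combination.
template <bool Store, bool Accumulate, typename A, typename B, typename C>
void vmul16(const A (&a)[16], const B (&b)[16], C (&c)[16], int64_t (&acc)[16]) {
    for (int i = 0; i < 16; i++) {
        int64_t temp = int64_t(a[i]) * b[i];   // e.g. 16bit*16bit -> 32bit
        if constexpr (Store)      c[i]   = C(temp);
        if constexpr (Accumulate) acc[i] += temp; // real hw accumulator is 48-bit
    }
}
```

Dead branches vanish at compile time, so `vmul16<true,false>` and `vmul16<false,true>` each cost exactly what a hand-written variant would.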
<clever>
mrvn: and here is the cursed code i wrote!
<mrvn>
it's nice nowadays where you can use constexpr if. With template specialization that stuff becomes a nightmare.
<clever>
i ran into trouble inserting H/HX/HY (the bit size specifier) with inline asm, there doesnt seem to be any way, beyond just writing it 3 times and using a constexpr if on sizeof
<clever>
but luckily, the assembler also accepts H8, H16, and H32 as well
<clever>
so i lied to gcc, and claimed thats an immediate it has to insert into the asm
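The "lied to gcc" trick above hinges on picking the width from the element type and handing it to the asm as an immediate, so the compiler pastes `8`/`16`/`32` into the mnemonic text. A testable stand-in for the width selection (the asm itself is VPU-only, so it is shown as a comment with hypothetical register names):

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
constexpr int vpu_width() {
    static_assert(sizeof(T) <= 4, "VPU elements are 8/16/32 bit");
    return int(sizeof(T)) * 8;
}

// Sketch of the inline asm (VPU target only, operands illustrative):
//   asm("vmul H%c[w](y0,0), H%c[w](y16,0), H%c[w](y32,0)"
//       :: [w]"i"(vpu_width<T>()));
// The "i" constraint makes gcc emit the width as literal text, so the
// assembler sees H8/H16/H32 rather than the unencodable H/HX/HY forms.

static_assert(vpu_width<unsigned char>() == 8);
static_assert(vpu_width<short>() == 16);
static_assert(vpu_width<int>() == 32);
```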
<mrvn>
hehe
<mrvn>
I was wondering about that
<clever>
also, take every combination i gave earlier, and double it some more
<clever>
it supports both horizontal and vertical modes
<mrvn>
I didn't think you could use the asm arguments to construct the mnemonic. But in the end the compiler just inserts the text for the argument in the asm output and lets the assembler deal with it.
<clever>
yep
<clever>
its basically just a glorified compile-time printf statement
<mrvn>
indeed
<clever>
the only problem, is that you cant lookup a string, and inject that
<mrvn>
At least you used %[name]
<clever>
yeah, with this many args, and the order being all over the place, its the only way to make it manageable
<clever>
but, half the power of this example code, is vpu-support-purec.h
<mrvn>
I hate it when people have 20 line asm() statements with %0, %1, %2, ...
<clever>
re-implementing the same functions, in plain old C
<clever>
so you could write an algo on x86, and work within the same restrictions as the real hw
<clever>
and once you work out all the bugs, like only having 16bit*16bit->32bit mults, you can migrate to the VPU, and not have to rewrite it all
<clever>
purec mode, also allows plain old gdb to read the matrix
<clever>
because its a regular old C array
<clever>
but the purec method, also counts how many clocks it would have taken on real hw, so you can still kinda benchmark it
<clever>
though, the dual-issue scalar and single-issue vector, still give opportunities to improve the code further
<clever>
the slowest computation opcode i know of, is doing 1024 mults, and it takes 128 clock cycles to complete
<clever>
but, once you start that operation, the cpu can go off and do normal scalar opcodes in parallel
<clever>
it only blocks, if you try to do another vector operation too early
<GeDaMo>
Can you get interrupts when a vector operation completes?
<clever>
i suspect you can
<clever>
as long as the ISR never touches a vector opcode, it just wont care
<GeDaMo>
Thinking about a "job queue" for vector operations
<clever>
something i need to investigate more, is interrupt latency, and how vector operations impact that
<clever>
the scalar side is "dual issue", so the instruction decoder/scheduler can submit 2 opcodes to it in the same clock cycle, and they can both complete at once, but only certain combinations
<clever>
and i suspect the decoder block can only do vector or scalar, so it cant issue orders to both sides at once
<clever>
vector opcodes can also be rather huge, up to 80bits long, including all the operands
<clever>
my rough understanding of the pipeline, is that you can issue a vector opcode, issue some scalar opcodes, service an irq, and potentially even return from irq before the vector opcode completes
<clever>
but if you issue a vector opcode too soon after the previous, it stalls, does that stall block irq handling? or has it not really ran anything, and can abort trying to execute it?
<clever>
things i need to test
<clever>
another thing to keep in mind, is that there is no known way to fault on vector access
<clever>
so you cant implement lazy context switching there
<clever>
and the vector state is over 4kb in size
<clever>
the official firmware solves this with a dumb old mutex
<clever>
context switching can happen freely, but only 1 thread is allowed to use vector opcodes at any time
<clever>
no save/restore, so expect it to be trashed next time you get the lock
tanto has quit [Quit: Adios]
pie_ has quit []
vancz has quit []
Bitweasil has quit [Ping timeout: 246 seconds]
tanto has joined #osdev
pie_ has joined #osdev
vancz has joined #osdev
tanto has quit [Client Quit]
pie_ has quit [Client Quit]
vancz has quit [Client Quit]
<moon-child>
agh
<moon-child>
my efi bootloader is now too big so it's making calls to chkstk
<clever>
there should be a gcc flag to disable that?
tanto has joined #osdev
vancz has joined #osdev
pie_ has joined #osdev
<moon-child>
oh no I was just being stupid
<clever>
ah
<moon-child>
stack allocated the structure with the memory map (fixed 256 entries inline)
<GeDaMo>
There's no flag for that :P
<clever>
thatll use up a lot of stack
<clever>
there is a gcc flag for helping with that
<clever>
-fstack-usage
<clever>
genet.c:185:13:genet_init 4 static
<clever>
boom, this function uses up 4 bytes of stack
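An illustration of moon-child's bug above: a fixed 256-entry memory map held inline in a stack local. The descriptor layout is a simplified stand-in, not the real `EFI_MEMORY_DESCRIPTOR`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct mmap_entry { uint32_t type; uint64_t phys, virt, pages, attr; };

struct memory_map {
    mmap_entry entries[256];
    std::size_t count;
};

// ~10 KiB as a local: enough to push past the 4 KiB stack-probe threshold,
// which is why the build suddenly wanted __chkstk.
constexpr std::size_t stack_cost = sizeof(memory_map);

// The fix: one copy in static storage; -fstack-usage would now report the
// containing function at roughly zero bytes.
static memory_map g_map;
```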
<bslsk05>
github.com: lk/partition.c at master · littlekernel/lk · GitHub
<clever>
its allocating room for 1 sector on the stack
Bitweasil has joined #osdev
diamondbond has quit [Quit: Leaving]
<kof123>
" do any kernels support this" [hot plug ram] solaris? im sure i missed the context. think giant box that cannot suffer downtime maybe. of course, things would arguably be better redundant. but i guess "live hw upgrade" is an exception
<kof123>
let me rephrase: sooooooooooooolaris
<kof123>
what i mean by "arguably be better redundant" presumably another reason is swapping out something bad. so...what happened to those programs/data using that ram? vanished?
<kof123>
its different if you planned it i guess, or got warnings first and manually could disable the part you think will fail first
<kof123>
i would guess it was mainly or only for planned hw switch
<mjg>
i thought linux can do memory hotplug, in a vm tho
<mjg>
i would not put any bets on bare metal
theboringkid has joined #osdev
<kof123>
and all the same for cpus
<kof123>
i know nothing, talking out my...asterisk. feel free to tell me what is wrong here: it makes me think imaginary hw i might want some area (cpu(s)+ram) that is redundant for kernel/drivers, and also could be hot swappable (but, depends on however many are needed to reach "quorum"; so like raid, can survive some "copies" dying) .... and then a separate area, for userland programs, where the cpus and ram are hot swappable, but redundancy
<kof123>
is optional. basically, kernel and drivers get strong guarantee, programs are supposed to checkpoint or whatever else if they are concerned. this would be a hybrid, versus requiring everything be redundant.
* kof123
<-- knows nothing about hw
bnchs has quit [Read error: Connection reset by peer]
<kof123>
just seems cheaper, and not having to guess "was that cpu or ram that died running something i cant recover from?"
<moon-child>
yeah I'm sure there are all sorts of asterisks for doing it on hardware
<moon-child>
I just wanna do it in a vm
<kof123>
well, giant mainframe stuff...nonstop, tandem...fancy stuff probably figured it all out. just no idea if *everything* was redundant or not
<bslsk05>
blog.davetcode.co.uk: Bringing emulation into the 21st century - David Tyler's Blog
<kof123>
As with all modern design it’s crucial to adhere to the model of “make it work then make it fast” In 1974 when the 8080 was released it achieved a staggering 2MHz. Our new modern, containerised, cloud first design doesn’t quite achieve that in its initial iteration. One of the many beautiful things about a microservice architecture is that, because function calls are now HTTP over TCP, we’re no longer
<kof123>
limited to a single language in our environment
<kof123>
MOV Swift Moves data from one register to another 257MB 4.68ms and so forth
danilogondolfo has joined #osdev
Left_Turn has joined #osdev
theboringkid has joined #osdev
xenos1984 has quit [Quit: Leaving.]
ThinkT510 has quit [Quit: WeeChat 3.8]
ThinkT510 has joined #osdev
Left_Turn has quit [Ping timeout: 252 seconds]
Left_Turn has joined #osdev
<nikolar>
That's wonderfully cursed
<kof123>
Estimated People Required (organic) 6.666796
* sakasama
only eats organic people.
danilogondolfo has quit [Ping timeout: 260 seconds]
danilogondolfo has joined #osdev
bauen1 has quit [Ping timeout: 265 seconds]
bauen1 has joined #osdev
<gog>
y'all wanna hear me complain about how much i hate the codebase i have to work with
<gog>
blah blah blah tech debt blah blah blah
<gog>
most of this shit was written before any of us on the current team started
<mjg>
:D
<gog>
i'm so sick of fighting the bad practices of our forebears
<mjg>
brah
<mjg>
looks like you done goofed
<gog>
i didn't goof, i write perfect code all the time
<mjg>
not being handed out a piece of garbage is new employment 101
<gog>
i need a money
<gog>
and i like this company
<mjg>
rest assured the twats who wrote the code took all the rewards
<mjg>
now that it does not work it is your fault
<gog>
one of them is a member of the board lmao
<mjg>
as i said
<gog>
our sales frontend is limping along
<mjg>
this probably used to "provide great customer value"
<mjg>
or so he was able to claim
<gog>
there's so much copypaste in this codebase
<gog>
there's so much useless redundancy
<gog>
it's bad for performance, it's costing us money
<mjg>
are you reading thedailywtf yet?
<gog>
lol yeh
<gog>
i asked my boss for a sales front bug bash and feature freeze
<gog>
he's hesitant
<mjg>
here is a funny story
<gog>
but istg this thing is going to crash and burn severely
<mjg>
years back i was in a company which had a flagship product drowning in tech debt
<gog>
it only continues to work by the grace of God
<mjg>
anytime i had to touch it i was leaving nasty comments inside
<mjg>
so i came up with a plan how to unfuck it and presented it to my boss
<mjg>
he said no
<mjg>
so i quit
<mjg>
[one of the reasons]
<gog>
i totally understand
<mjg>
then they let me unfuck it on the way out [lol]
<gog>
i really don't want to quit about this
<gog>
i want to fix it for real
<mjg>
turns out the unfucking came with a funny bug which i don't remember
<gog>
instead of these constant bandages
<mjg>
but which took some time to find :D
<mjg>
i guess it would have been better if the unfucking was done by someone who stays there
<gog>
all of this coincides with yandex and majestic and petal starting to index us
<gog>
we have some critical flaw or MVC has some critical flaw
<gog>
arguably the latter
<gog>
but i can't find it in this fucking mess
<kof123>
i think my 2 (web) coding jobs were: "we're throwing this all away eventually (production)" and "we're throwing this all away eventually (maintenance)"
<gog>
i think i need to push harder for throwing the baby and the bathwater out
<gog>
it's so fucked
<gog>
i can't figure out why this one thing doesn't display
<kof123>
*maintenance == legacy production, leftovers barely touched except when needed, somewhat separate
<gog>
but the CEO is absolutely not going to approve that
<gog>
so the other option is pushing for some kind of technical process where we do a static analysis, async/await figuring outs
<gog>
etc
[itchyjunk] has joined #osdev
theboringkid has quit [Quit: Bye]
theboringkid1 has joined #osdev
theboringkid1 is now known as theboringkid
theboringkid has quit [Quit: Bye]
<mrvn>
kof123: I believe Linux on s390 can physically and kernel wise hot-plug memory.
<mrvn>
kof123: and balooning on VMs does it.
<mrvn>
gog: what language is it in?
<mrvn>
A great way to unfuck legacy c++ code is to look for all "new" calls and make them smart pointers. Then keep fixing compiler errors till it builds again. Repeat by going through all raw pointers in classes. And last annotate any raw pointer.
<mrvn>
Once you update just that to c++11/17/23 best practices you basically have gone through all the code.
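mrvn's recipe in miniature, with illustrative names: the legacy raw-`new` shape is shown as a comment, the modernised shape below it. Making the member a `std::unique_ptr` turns every remaining raw use into a compile error you fix mechanically:

```cpp
#include <cassert>
#include <memory>

struct Widget { int value = 42; };

// Before: owning raw pointer, manual delete, easy to leak or double-free.
// struct Legacy { Widget* w = new Widget; ~Legacy() { delete w; } };

// After: ownership is explicit and destruction is automatic.
struct Modern {
    std::unique_ptr<Widget> w = std::make_unique<Widget>();
};

int demo() {
    Modern m;
    return m.w->value; // reads exactly like the raw pointer did
}                      // Widget freed here, no delete to forget
```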
theboringkid has joined #osdev
<gog>
mrvn: c#
<gog>
asp.net mvc
<gog>
and an unholy mess of javascript, typescript, shitscript
theboringkid has quit [Ping timeout: 248 seconds]
<gog>
i think one of our problems is a mismatched await/async
<gog>
i need to take some time and use this vs plugin that'll analyze that
theboringkid has joined #osdev
bnchs has joined #osdev
<sakasama>
The only truly unholy mess in this world is my genetic code. Oh, and... whatever hell is being invoked when attempting to resolve data dependencies with paraconsistent logic.
<gog>
true
<gog>
according to my boss it's way better than it was when he started in 2017
<gog>
like it was unusably bad when he started
<sakasama>
That's terrifying nonetheless.
<sakasama>
May Baphomet relieve you of your suffering.
<gog>
takk
<gog>
i have yet another bandaid submitted
<bnchs>
sakasama: hi
<gog>
maybe tonight i'll actually do osdev
<gog>
or i'll play factorio
theboringkid has quit [Ping timeout: 255 seconds]
<bnchs>
i had a power outage, but my laptop was running on battery
<sakasama>
bnchs: Greetings. I have been attempting programming: it is not proceeding smoothly.
<bnchs>
what did i do to spend the time? play games on emulators, fml
<bnchs>
sakasama: what are you programming?
theboringkid has joined #osdev
<sakasama>
A programming language to control an expert system intended to rewrite me into a more elegant form.
theboringkid has quit [Client Quit]
theboringkid1 has joined #osdev
theboringkid1 has quit [Client Quit]
<bnchs>
really?
<sakasama>
Yes. I have no life.
<bnchs>
but i thought people were still reverse engineering the brain for centuries at this point
<sakasama>
Irrelevant. I have achieved absolute desperation.
<sakasama>
On the bright side, I can claim I'm still an operating systems developer, with myself as the architecture.
<bnchs>
black-box reverse engineering a brain for centuries doesn't seem so exciting
<sakasama>
I'm not reverse engineering any brains. If all I succeed at is recreating my own intelligence, or that of any human, the project shall have been an abysmal failure.
<sakasama>
For instance, here is an approximate illustration of the median level of my cognitive functions: https://i.imgur.com/TGhAdkQ.jpeg
<bslsk05>
discourse.llvm.org: Avoidable overhead from threading by default - #2 by tschuett - LLD - LLVM Discussion Forums
<kazinsal>
drake meme no: have a waifu / drake meme yes: become the waifu / drake meme activated eyes: become your waifu's waifu
gog has quit [Quit: Konversation terminated!]
rnicholl1 has quit [Quit: My laptop has gone to sleep.]
xenos1984 has joined #osdev
sortie has quit [Remote host closed the connection]
rnicholl1 has joined #osdev
bnchs has quit [Read error: Connection reset by peer]
rnicholl1 has quit [Ping timeout: 248 seconds]
sortie has joined #osdev
joe9 has quit [Quit: leaving]
frkzoid has joined #osdev
<mrvn>
mjg: I want the ld.lld to use the makeserver. It should select on the token pipe and spawn an extra thread for every token it gets. Up to maybe 8 given how malloc degrades.
gog has joined #osdev
bauen1 has quit [Ping timeout: 255 seconds]
danilogondolfo has quit [Ping timeout: 255 seconds]
danilogondolfo has joined #osdev
bnchs has joined #osdev
<gog>
hi
<lav>
a gog!
<Ermine>
hi gog
<gog>
hi gog
<Ermine>
gog
<gog>
Ermine:
vdamewood has joined #osdev
valshaped has quit [Read error: Connection reset by peer]
valshaped has joined #osdev
linearcannon has quit [Read error: Connection reset by peer]
bauen1 has joined #osdev
DrinkThePoison has joined #osdev
vdamewood has quit [Quit: Life beckons]
Vercas6 has quit [Remote host closed the connection]
Vercas6 has joined #osdev
heat has quit [Read error: Connection reset by peer]
heat has joined #osdev
DrinkThePoison has left #osdev [#osdev]
gildasio has joined #osdev
slidercrank has quit [Ping timeout: 265 seconds]
<heat>
mjg, weird mixed response
<mjg>
i'm gonna respond rather negatively tomorrow
<heat>
maskray's going to be on vacation tomorrow lol
<mjg>
lul
<mjg>
in that case i'm gonna slam him in few h
<heat>
so, glhf
<mjg>
ultimately it is pretty apparent the insanity is not going to change
<heat>
i think adding GNU make jobserver support is actually maybe a really good idea here
<mjg>
so i'm going to patch it locally
<mjg>
the *concept* would be great here
<mjg>
but the reality of the job server would make it suck even more
<mjg>
also note that wankers would spawn these threads anyway
<mjg>
they would just add extra overhead to figure out how many can run
<mjg>
the job server problem is the globally shared pipe
<mjg>
which is a massive source of contention on non-handwatch-scale system
<mjg>
s
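A hedged sketch of the jobserver client mrvn proposes for ld.lld: each extra worker thread requires one token read from the shared pipe, returned when the thread finishes (one implicit slot is always yours, so threads = tokens + 1). Parsing the fds out of `MAKEFLAGS` (`--jobserver-auth=R,W`) is elided; the fds are passed in directly:

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Grab up to `want` extra tokens without blocking; returns how many we got.
int acquire_tokens(int read_fd, int want, std::vector<char>& held) {
    int flags = fcntl(read_fd, F_GETFL);
    fcntl(read_fd, F_SETFL, flags | O_NONBLOCK);
    char tok;
    int got = 0;
    while (got < want && read(read_fd, &tok, 1) == 1) {
        held.push_back(tok); // each token byte must be written back verbatim
        ++got;
    }
    return got;
}

// Return every held token to the pipe so other jobs can run.
void release_tokens(int write_fd, std::vector<char>& held) {
    for (char tok : held)
        (void)write(write_fd, &tok, 1);
    held.clear();
}
```

The globally shared pipe mjg objects to is visible right here: every reader in the build contends on the same fd, which is exactly the scalability complaint.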
<geist>
THREEEEEEEADS
heat has quit [Read error: Connection reset by peer]
heat has joined #osdev
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
<heat>
you have to wonder what Solaris /bin/ld did here
<heat>
OR IS IT PESSIMAL
<heat>
does it scale mjg
<heat>
did it spawn negative threads? a blackhole of threads? did it spawn 10000 threads?
<mjg>
brendan gregg himself said that
<mjg>
do you know that the venerable solaris scheduler has a perf bug where it keeps migrating threads for no reason?
<mjg>
pretty funny
<mjg>
while still at joyent
<heat>
did you know that the venerable solaris was used by the nazis in world war 2?
<heat>
calling it got would also work but the fucking openbsd weirdos took that name
dutch has quit [Quit: WeeChat 3.8]
dutch has joined #osdev
<kazinsal>
even though I know my april fools osdev project isn't actually going to be done in time for april fools it's sent me down a rabbit hole of looking at custom 5.25" floppy low-level formats
<kazinsal>
because clearly if I'm dumb enough to do a unix for a 5150, I may as well start fiddling with the sector gap length and squeeze out another 80K per disk...
<klange>
I don't really have enough ready for a new PonyOS release and I'm going to be out of town this weekend.