klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
nyah has quit [Ping timeout: 246 seconds]
gog has joined #osdev
radens has joined #osdev
<radens> Is there something like the bochs magic breakpoint for qemu?
<zid`> All fixed, anyway
<zid`> Turns out the mistake was dumb, and I absolutely should have known where to look
<zid`> given I *knew* where I'd made code changes previously
xenos1984 has quit [Read error: Connection reset by peer]
<gog> mew
<mrvn> radens: that int 3 thing?
<radens> mrvn:
<zid`> The chain is restored
<zid`> for when you absolutely need to chain windows -> linux -> boros -> gameboy together
<heat> sup gog
<radens> mrvn: xchgw bx, bx will break to the bochs debugger and is a nop for normal software. I've seen arm hint space instructions used similarly
<zid`> I do kinda miss the bochs magic breakpoint
<heat> sometimes it do be like that
<heat> get a little tipsy, call her, tell her you're sorry and that you miss her
<gog> yeah afaik there's nothing like that in qemu, you actually have to do int3 and have a debugger attached
<radens> Oh so if you have gdb attached to the qemu remote and you do int3 it will break to gdb?
<heat> i don't think so?
<zid`> or just throw a completely invalid instruction in or something and -d int and wait for the fault to happen :P
<gog> no i think there's more to it than that
<mrvn> shouldn't be too hard to make the TCG translate xchgw bx, bx into int3
<zid`> void debug_magic(long n){ asm volatile("mov rax, %0; ud2" :: "r"(n) : "rax"); }
<radens> I mean I could patch xchgw bx, bx to call into the gdb stub
<heat> xchg breaks to the emulator's debugger, int3 doesn't work here
<heat> int3 will just trigger a VM-internal trap
<zid`> you can even pass a fault code that way :p
<heat> linux's BUG_ON kinda works like that
<mrvn> zid`: but that fails when you don't have a debugger attached
<bslsk05> ​elixir.bootlin.com: bug.h - arch/x86/include/asm/bug.h - Linux source code (v5.18.2) - Bootlin
<zid`> nod, it makes a lot of sense to do it like that irl
<zid`> if you don't have spooky magic
<heat> as far as I can see it's just so you have a fully consistent register dump
<heat> instead of doing $weird_stuff
<heat> you'd still need to pretend it's an interrupt entry and exit and push/pop stuff
xenos1984 has joined #osdev
vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
X-Scale` has joined #osdev
X-Scale has quit [Ping timeout: 256 seconds]
X-Scale` is now known as X-Scale
gog has quit [Ping timeout: 240 seconds]
SpikeHeron has quit [Ping timeout: 258 seconds]
Likorn has quit [Quit: WeeChat 3.4.1]
gorgonical has joined #osdev
heat has quit [Ping timeout: 272 seconds]
theruran has joined #osdev
<zid`> Pushed an update to my gameboy code, figured it deserved it after I spent so much time staring at it in qemu
<zid`> It now runs an additional bizarre scene demo!
<zid`> without any corruption on the last few rows
<Jari--> hi
<Jari--> zid`
xenos1984 has quit [Read error: Connection reset by peer]
the_lanetly_052 has joined #osdev
vdamewood has joined #osdev
xenos1984 has joined #osdev
\Test_User has quit [Ping timeout: 246 seconds]
Likorn has joined #osdev
srjek has quit [Ping timeout: 258 seconds]
eroux has joined #osdev
gorgonical has quit [Ping timeout: 246 seconds]
the_lanetly_052 has quit [Ping timeout: 244 seconds]
wand has quit [Ping timeout: 240 seconds]
zaquest has quit [Remote host closed the connection]
zaquest has joined #osdev
terminalpusher has joined #osdev
Likorn has quit [Quit: WeeChat 3.4.1]
vdamewood has quit [Quit: Life beckons]
CryptoDavid has joined #osdev
wand has joined #osdev
terminalpusher has quit [Ping timeout: 252 seconds]
cross has quit [Quit: Lost terminal]
nyah has joined #osdev
genpaku has quit [Ping timeout: 272 seconds]
genpaku has joined #osdev
Celelibi has quit [Read error: Connection reset by peer]
Celelibi has joined #osdev
<mrvn> is anyone using for dns and has slow lookups?
bauen1 has quit [Ping timeout: 276 seconds]
GeDaMo has joined #osdev
Burgundy has joined #osdev
dh` has quit [Quit: brb]
vdamewood has joined #osdev
CryptoDavid has quit [Quit: Connection closed for inactivity]
ripmalware_ has joined #osdev
ripmalware has quit [Ping timeout: 244 seconds]
heat has joined #osdev
terminalpusher has joined #osdev
bauen1 has joined #osdev
eroux has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
eroux has joined #osdev
gamozo has quit [Ping timeout: 246 seconds]
bauen1 has quit [Ping timeout: 246 seconds]
gamozo has joined #osdev
gog has joined #osdev
wand has quit [Remote host closed the connection]
terrorjack has quit [Quit: The Lounge - https://thelounge.chat]
terrorjack has joined #osdev
wand has joined #osdev
radens has quit [Quit: Connection closed for inactivity]
eroux has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
bauen1 has joined #osdev
eroux has joined #osdev
X-Scale` has joined #osdev
X-Scale has quit [Ping timeout: 240 seconds]
X-Scale` is now known as X-Scale
terminalpusher has quit [Remote host closed the connection]
dennis95 has joined #osdev
<ddevault> I can allocate memory in userspace :D
<zid`> I can't, allocators are annoying :P
Starfoxxes has quit [Ping timeout: 260 seconds]
<mrvn> ddevault: (s)brk or something sensible?
<ddevault> something sensible
<ddevault> mmap, essentially
<ddevault> the kernel design is inspired by seL4
Starfoxxes has joined #osdev
archenoth has joined #osdev
<heat> brk go brrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrk
dude12312414 has joined #osdev
bauen1 has quit [Ping timeout: 244 seconds]
Burgundy has quit [Remote host closed the connection]
srjek has joined #osdev
* Andrew pushes themselves onto the stack
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
eroux has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
bauen1 has joined #osdev
\Test_User has joined #osdev
<ddevault> bloody PIT
<ddevault> hmmm
<ddevault> the PIT stops firing when I jump to userspace
<j`ey> accidentally left IRQs masked?
<ddevault> if they were masked wouldn't they not create interrupts in the kernel, either?
nvmd has joined #osdev
<j`ey> ddevault: yeah afaik, but it sounds like you aren't getting interrupts at all..?
<ddevault> no
<ddevault> I am getting PIT interrupts before jumping to userspace, but not after
<ddevault> afk
<j`ey> if you're using qemu or something, maybe you can check if interrupts are enabled
gog has quit [Ping timeout: 272 seconds]
nvmd has quit [Quit: WeeChat 3.5]
terminalpusher has joined #osdev
terminalpusher has quit [Remote host closed the connection]
<heat> maybe you're setting the wrong eflags
<heat> does anyone know of a way to measure tlb shootdowns for a process in Linux?
Likorn has joined #osdev
gorgonical has joined #osdev
ethrl has joined #osdev
dennis95 has quit [Quit: Leaving]
<geist> mrvn: did your dns resolve?
<geist> i use but dont seem to have particularly slow lookups right now
<geist> ddevault: when you jump to user space you load a new eflags via the iret or sysexit
<geist> you can easily accidentally leave IRQs masked
k8yun has joined #osdev
genpaku has quit [Remote host closed the connection]
genpaku has joined #osdev
FatAlbert has joined #osdev
ethrl has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
genpaku has quit [Quit: leaving]
genpaku has joined #osdev
zaquest has quit [Read error: Connection reset by peer]
k8yun has quit [Quit: Leaving]
<FatAlbert> i guess this is what a channel looks like when 90% of the people in it know what they're doing
<heat> we pretend, yes
<FatAlbert> i think i'll study CS in Uni
vdamewood has quit [Quit: Life beckons]
<FatAlbert> as opposed to #linux where 90% of the people there probably use windows anyway and that's why the chat is complete nonsense
<FatAlbert> at least here when people talk ... i learn something
<FatAlbert> so don't get me wrong but i hope your computer will break in some spectacular way
zaquest has joined #osdev
pretty_dumm_guy has joined #osdev
Likorn has quit [Quit: WeeChat 3.4.1]
<sham1> Let's not hope that the breakage is too spectacular
<sham1> That might easily lead to you being admitted to a hospital
FatAlbert has quit [Ping timeout: 240 seconds]
the_lanetly_052 has joined #osdev
ethrl has joined #osdev
the_lanetly_052 has quit [Ping timeout: 246 seconds]
Likorn has joined #osdev
nvmd has joined #osdev
<heat> you should all use
<heat> just saying
dh` has joined #osdev
<sortie> heat, enough! 1.1 will NEVER happen
<sortie> No way you're getting a 1.1.1 patch release
<sortie> And is a pipe dream
<heat> :D
<heat> if is a pipe dream, what's
<sortie> /16 like god intended
<gamozo> I use IPV6 but then use /128 to make sure it's pointless
<gamozo> I change my IP every packet to prevent being hacked
GeDaMo has quit [Quit: There is as yet insufficient data for a meaningful answer.]
<mrvn> geist: https://paste.debian.net/1243576 dig works fine but ping sleeps 5s for some reason.
<bslsk05> ​paste.debian.net: debian Pastezone
<mrvn> I don't even get why it's doing the same request 3 times
<ddevault> geist: hm, aight
<mrvn> Same issue with by the way.
<heat> glibc right?
<mrvn> me?
<heat> i've seen some pretty weird behavior wrt nss when it was misconfigured
<mrvn> normal Debian install. I only configured dhcp to leave the resolv.conf alone.
<mrvn> It looks like the first message to contains 2 DNS queries and only gets one reply. Waiting for a second times out and then it sends each request again separately. Do I see that right?
<heat> that looks correct yes
<heat> let me trace mine
<heat> mrvn, how does your resolv look?
<mrvn> Do you get the same when you strace "ping www.debian.org"?
<bslsk05> ​www.debian.org: Debian -- The Universal Operating System
<mrvn> "nameserver"
<heat> ok lets see
FatAlbert has joined #osdev
<mrvn> I'm guessing the 2 dns requests are IPv4 and IPv6
<heat> mine only sends a single DNS request it seems
<heat> sendto(5, "J\254\1\0\0\1\0\0\0\0\0\0\0017\0017\0010\0010\0010\0010\0010\0010\0010\0010"..., 90, MSG_NOSIGNAL, NULL, 0) = 90
<heat> then a poll, FIONREAD, recvfrom and close
<mrvn> 90 bytes seems long
<mrvn> 48, 60 or 108 byte reply?
<heat> 122
<heat> lets wireshark it
<heat> ok....
<heat> i'm wrong
<heat> it's 2 40 byte packets
<heat> and then i get two responses
<heat> aha I was looking at the wrong thing
<heat> its connecting to twice
<heat> wait, makes sense, it's for the reverse lookup of the IP addr
<heat> so, two 32 byte msgs sent by sendmmsg, no timeouts and a 48 byte response and a 60 byte response
<heat> everything works here
<heat> mrvn, how does your wireshark look?
<heat> if you don't get two responses, you may have a broken firewall or router in the way I guess
FatAlbert has quit [Ping timeout: 258 seconds]
<mrvn> standard query for A, AAAA, response for A, 5s pause, query for A, response A, query AAAA, response AAAA
<mrvn> It looks like the second query is lost because it's sent as a separate frame
<heat> hm?
<heat> it's not lost
<heat> you're just losing the response
ethrl has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Likorn has quit [Quit: WeeChat 3.4.1]
ethrl has joined #osdev
ethrl has quit [Client Quit]
Likorn has joined #osdev
<zid`> It's not lost, heat just can't find it and doesn't know where it is
<zid`> ddevault: Did you fix your EFLAGS?
<ddevault> I will be looking into that tomorrow
<zid`> It should just be that the stack frame you constructed that iret pops hasn't got the interrupt enable bit set in eflags, from what you said
<ddevault> that's probably it
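Editor's sketch of the diagnosis above, assuming an x86_64 kernel that enters user mode with `iretq`. The struct name, helper name and selector parameters are hypothetical and not taken from ddevault's kernel; the one fixed fact is that the interrupt enable flag is bit 9 of RFLAGS and that `iretq` loads RFLAGS from the frame it pops.

```c
#include <assert.h>
#include <stdint.h>

/* x86 RFLAGS bits relevant when entering user mode. */
#define FLAGS_RESERVED_1 (1u << 1)  /* bit 1 always reads as 1 */
#define FLAGS_IF         (1u << 9)  /* interrupt enable flag */

/* Hypothetical layout of the frame a 64-bit iretq pops, lowest address first. */
struct iret_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* Build the frame for the first jump to user mode.
 * user_cs and user_ss are whatever ring-3 selectors the kernel's GDT defines. */
static struct iret_frame make_user_frame(uint64_t entry, uint64_t stack,
                                         uint16_t user_cs, uint16_t user_ss)
{
    struct iret_frame f = {
        .rip = entry,
        .cs = user_cs,
        /* If FLAGS_IF is missing here, iretq loads IF=0 and maskable IRQs
         * such as the PIT are no longer delivered while user code runs. */
        .rflags = FLAGS_RESERVED_1 | FLAGS_IF,
        .rsp = stack,
        .ss = user_ss,
    };
    assert(f.rflags & FLAGS_IF);
    return f;
}
```

A frame built this way carries RFLAGS = 0x202, so the timer keeps firing after the switch; a frame with RFLAGS = 0x2 would reproduce the symptom described above.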
ethrl has joined #osdev
citrons has joined #osdev
ethrl has quit [Read error: Connection reset by peer]
bauen1 has quit [Quit: leaving]
Likorn has quit [Quit: WeeChat 3.4.1]
floss-jas has quit [Remote host closed the connection]
nyah has quit [Quit: leaving]
<gorgonical> Do I "have" to store the address of the per_cpu data area in a register?
<gorgonical> I guess I do, don't I?
<mjg_> what arch is this? amd64?
<gorgonical> risc-v
<mjg_> oh, no opinion :)
<gorgonical> I'm trying to figure out how Linux does this: I think linux uses tpidr_el0 for TLS, and then tpidr_el1 for PDA, and then I forget how/where the task_struct is stored
<gorgonical> RISC-V so far I think uses tp register for TLS, then CSR_SCRATCH for task_struct, and I have no idea where the PDA is stored
<mrvn> Do you want to access the per_cpu data area often and fast?
<gorgonical> Even if I didn't, wouldn't I still need some way for the CPUs to know their index in the central directory?
<mrvn> the cpu normally has an opcode to get the ID
<gorgonical> There *is* a uscratch register that maybe I can shove this into. That saves one dereference
<gorgonical> My fear is that libc or something already uses uscratch
bauen1 has joined #osdev
<mrvn> How many registers do you have that the kernel can write to but user can't?
<heat> gorgonical, there's a tls register iirc
<gorgonical> Yeah, usually tp, I think that's like x4
<heat> yeah
<heat> I use it for my percpu data in the kernel since it's, well, my tls for all intents and purposes
<mrvn> can't userspace write to x4?
<gorgonical> But in user-mode thats tls storage. The kernel exception handler I'm stealing from linux shoves the task_struct* into the tp register
<gorgonical> mrvn: yes but that's why the kernel stores whatever it wants in one of these scratch registers
<heat> you keep it in the scratch register and swap it
<heat> it's like the swapgs stuff
<bslsk05> ​github.com: Onyx/interrupts.S at master · heatd/Onyx · GitHub
<bslsk05> ​github.com: Onyx/scheduler.cpp at master · heatd/Onyx · GitHub
<mrvn> For my kernel design I need at least 2 kernel only scratch registers.
<gorgonical> Do you have to make a call to the SBI firmware to read the hart id?
Burgundy has joined #osdev
<gorgonical> Manual says hartid is machine read-only
<mrvn> don't you have to call the firmware for everything on risc-v?
<gorgonical> what do you mean
<mrvn> nothing, just hearsay
Likorn has joined #osdev
<heat> yes
<heat> you need to call sbi
pretty_dumm_guy has quit [Quit: WeeChat 3.5]
<heat> mrvn is mostly right :)
<gorgonical> well that's not gonna work for percpu then is it
<gorgonical> a call down to sbi to fetch cpuid to index a percpu pda directory
nick64 has joined #osdev
<gorgonical> okay so the chip does not have the N extensions so it does not have uscratch
<gorgonical> arm really gets out easy here because it has a tpidr_el0 that riscv doesn't have a direct equivalent for
<gorgonical> i guess maybe you could stash the pda ptr in mscratch, but sbi probably won't allow that
<heat> what are you doing
<heat> whats a "pda"?
<gorgonical> per-cpu data
<mrvn> public display of affection
* heat kisses mrvn
<klange> We'll have none of that here!
<gorgonical> personal digital assistant even
<heat> gorgonical, for the kernel?
<gorgonical> yes
X-Scale has quit [Ping timeout: 256 seconds]
<klange> I only do Pocket PCs.
<heat> gorgonical, why can't you use sscratch?
<gorgonical> sscratch already has the kernel task_struct* in it
<gorgonical> for context switching and all that
<heat> if you're using linux, they solved that problem I guess
<mrvn> why not store per-core data there and task_struct in per-core data
<gorgonical> they did, but I can't figure out how. The per-cpu code is very abstracted
<mrvn> ?
<klange> I can walk through what I'm doing in misaka
<gorgonical> mrvn: that's probably a good idea
<gorgonical> that would require updating the task_struct variable whenever you switch_to though
<heat> klange, you have riscv now?
<klange> no, aarch64
<heat> this is riscv
<mrvn> gorgonical: obviously
<heat> gorgonical, btw, you're wrong (as I suspected) https://elixir.bootlin.com/linux/latest/source/arch/riscv/kernel/entry.S#L28
<bslsk05> ​elixir.bootlin.com: entry.S - arch/riscv/kernel/entry.S - Linux source code (v5.18.3) - Bootlin
<klange> > tpidr_el0
<klange> this isn't
<heat> wait maybe you're not wrong sorry
<heat> i didn't read everything that went after that
<heat> but they do their loading right there
<gorgonical> I only sort of understand how this is done on ARM64 so my understanding of how they solved this problem is vague
<gorgonical> heat: what do you mean about the loading?
<heat> loading of the tp
<gorgonical> Yeah. tp in userland points to tls. In kernel land they want it to point to the task_struct. So first insn is to swap them. CSR_SCRATCH contains the task_struct ptr. Then all the context switching is with the kernel-tp
genpaku has quit [Quit: leaving]
X-Scale has joined #osdev
<gorgonical> I've had a busy day so maybe I overlooked it, but the thread_info struct at the start of the task_struct struct doesn't appear to contain the pda
<mrvn> why should the task struct have anything about the per-core data?
<gorgonical> It shouldn't, but the thread_info might point to it I suppose
<klange> On ARM64, I tell the compiler x18 is reserved and then stick the per-core pointer in there. I stole that from geist. That's only in the kernel; x18 is then swapped on context switch with everything else
<gorgonical> klange: That might be the thing to do
<zid`> sounds like mips where k0 is free for the kernel or whatever
<klange> In userspace, tpidr_el0 is the thread pointer. This is fully controlled by userspace and dutifully restored by the kernel. Then gcc's built-in understanding of thread-locals takes over.
<klange> The important thing to note is the difference between per-core and per-thread. An execution context that is per-thread does not unexpectedly change between function calls, but a per-core one definitely can if one of those function calls is a context switch.
<klange> So trying to convince the compiler to use thread-local stuff for your per-core stuff is a no-no, as it could elide a load after a function call; it can't do that if you tell it to always reference based on the register
<mrvn> Another thing that's nicer without per task kernel stack. per-core never changes while inside the kernel.
<gorgonical> I see. I mean the naive way is to use cpuid as an index, but I think that's pretty slow on riscv
<klange> The option to tell gcc and clang that x18 is reserved is `-ffixed-x18`. You can also make that part of your ABI spec and bake it in... which Apple does in userspace on macOS for stupid legacy reasons.
<heat> gorgonical, the proper way is to keep the percpu data pointer in sscratch and tp
<mrvn> gorgonical: and every other arch too
<klange> It's slow everywhere. Reserving a register is better because register-based addressing is universally faster.
<gorgonical> heat: what do you mean?
<klange> It's even faster than the tpidr lookups on ARM, since those still need a cycle or two to pull the msr out into a general register anyway - which is why gcc and clang will happily elide that operation when they can (*if they are doing it as part of native TLS)
<heat> gorgonical, erm. just keep it there
<heat> what more do you need?
<klange> I recently did a thing on macOS to manually do TLS operations for my interpreter because macOS's default is always calling out to library functions and using (basically) GOT callbacks
<gorgonical> and then put a ptr to the currently executing task in the percpu?
<heat> yes
<heat> that's what I do
<gorgonical> I think that's the best option I have unless I discover linux does something really smart
<gorgonical> thank you
<klange> I put something a level above the task, but maybe that's a design mistake on my part
<heat> i mentioned this option like 20 minutes ago xD
<gorgonical> you did but I got confused about who mentioned what
<mrvn> You can chain it all from bottom to top: core -> thread -> process -> group
<klange> fun fact, for the userspace thread pointer macos uses tpidrro_el0 instead of tpidr_el0
<heat> what's tpidrrrrrrrrrrrrrrrrrrrrrro
<klange> which means __builtin_thread_pointer does the wrong thing
<klange> it's like tpidr but read only (and also it's a different register)
<heat> sounds like __builtin_thread_pointer needs to be fixed for the darwin targets
<klange> yeah, not sure why it hasn't been fixed to return the right thing
<mrvn> That sucks, no user space threads that way.
<klange> I assume because it's part of a gcc compatibility thing that they only care about on linux
<klange> it's only read-only in the direct sense
<klange> you can still use a syscall to set it
<klange> and they use it the same way in the end
<klange> _except_ that they push it all behind hooks that the dynamic linker sets up
<klange> rather than actually linking slot lookups
<klange> so everything is slow as hell
<heat> 10 bucks on how there's a massive exploit in the silicon and they abstracted it that way so you can't set bad tp values
<klange> and thus we come to this fantastic bit of code: https://github.com/kuroko-lang/kuroko/blob/master/src/kuroko/vm.h#L219-L223
<bslsk05> ​github.com: kuroko/vm.h at master · kuroko-lang/kuroko · GitHub
<mrvn> kind of defeats the purpose of user space threads if you have to syscall to swap threads.
<klange> This inlines the thread slot lookup _like every other platform does normally_, so thread-local storage is just as fast as it is on Linux, or ToaruOS.
<heat> you almost always have to use a syscall to swap threads
<heat> fsgsbase is super recent in the grand scheme of things
<mrvn> heat: tls too
<heat> fsgsbase is like 2014 recent
<heat> not 2001 recent
<mrvn> post c99, basically still wet paint :)
<heat> huh
<heat> how old is tls actually?
<heat> like as an actual concept
<mrvn> I would use tpidrro_el0 for the shared kernel/user thread pointer and tpidr_el0 for tls.
<heat> i used ~linux's nptl date
<heat> mrvn, you'll leak a kernel address that way
<bslsk05> ​en.cppreference.com: C++ keywords: thread_local (since C++11) - cppreference.com
<klange> tpidr_el0 on macOS _appears_ to be the core ID.
<mrvn> heat: obviously. It's shared. Things like the pid and tid.
<klange> Not the thread ID, not a _pointer_ to a core struct. Just a number for the core you are on.
<klange> The most useless thing ever.
<mrvn> So if userspace writes to tpidr_el0 the core suddenly is a different core to the kernel?
<klange> tpidrro_el0 is the base of the thread-local data, which uses the descriptor slot approach; the rest of the thread struct is behind it at a fixed offset (basically it points into the middle of a struct, where a big array of pointers happens to start, which is typical)
<heat> can't you switch tls models?
<heat> or do they just not support it
<klange> models? there are no models on macOS
<klange> The only thing that has different TLS models is ELF.
<mrvn> Another of those optimizer things. Setting the tpidrro_el0 is slow so you make it point to a pointer to thread local.
<gorgonical> Update: it actually seems like Linux does the simple thing and just takes the processor ID as an index
<klange> (I _wish_, initial-exec is what I want for my interpreter, of course)
<heat> gorgonical, that's horrific
<klange> (inlined dynamic isn't really that slow, though, if done properly)
<heat> the x86 code doesn't
<gorgonical> In the default case it does anyway
<bslsk05> ​elixir.bootlin.com: percpu.h - include/asm-generic/percpu.h - Linux source code (v5.18.3) - Bootlin
<mrvn> gorgonical: sure that isn't just the fallback in case it has no spare scratch register?
<heat> I was looking at that a few minutes ago and it seems they only bothered to optimise the x86 code
<gorgonical> There's no percpu.h in several of the arch's
<gorgonical> mrvn: That's what I think
<heat> no, you're correct
<gorgonical> And riscv doesn't seem to have a percpu.h def so I guess that's what it does here
<mrvn> if you have no other way using the cpu ID as index into an array works universally.
<bslsk05> ​elixir.bootlin.com: smp.h - arch/riscv/include/asm/smp.h - Linux source code (v5.18.3) - Bootlin
<mrvn> damn, I wanted to do some more work on my kernel this week and it's friday already.
<klange> Symbol points to the descriptor, descriptor has key (index into the thread-local pointer array) + offset (because keys can be shared by many thread-locals), then you do tp[key]+offset and try to inline that as best as you can...
<klange> anyway I made my interpreter speedy on macos by abusing knowledge of how the thread local storage model works and it's great, the end
<klange> thank you apple for at least having these parts of macos be open-source
<heat> yes, that's the standard for dynamically linked objects in linux as well afaik
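Editor's sketch of the descriptor lookup klange describes (`tp[key]+offset`), written as a plain C model rather than the real macOS or ELF structures. The struct name, field names and helper functions are hypothetical; `tp` stands in for the thread pointer, treated as the base of a per-thread array of block pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one thread-local variable. */
struct tlv_descriptor {
    size_t key;     /* index into the per-thread pointer array */
    size_t offset;  /* byte offset inside the block that slot points to */
};

/* Address of the variable for the current thread: tp[key] + offset.
 * Many thread-locals can share one key and differ only in offset. */
static inline void *tlv_address(void **tp, const struct tlv_descriptor *d)
{
    return (char *)tp[d->key] + d->offset;
}

/* Small self-check: two variables living in the same block (same key). */
static void tlv_example(void)
{
    int block[2] = {7, 9};
    void *slots[2] = {NULL, block};
    struct tlv_descriptor first = {.key = 1, .offset = 0};
    struct tlv_descriptor second = {.key = 1, .offset = sizeof(int)};

    assert(*(int *)tlv_address(slots, &first) == 7);
    assert(*(int *)tlv_address(slots, &second) == 9);
}
```

Inlining `tlv_address` at each use site is the point of the exercise: the lookup becomes two loads and an add instead of a call through a hook installed by the dynamic linker.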
<gorgonical> heat: so linux does what you suggested; store the cpuid that the thread is currently running on and use that
<klange> (also for various reasons they can't change this abi, so this inlining of what dyld's hooks do is totally safe)
<mrvn> Looking back over how long this TLS discussion is running I really have to ask: Aren't threads more trouble than they are worth?
<gorgonical> or what mrvn suggested, I literally can't remember
<heat> gorgonical: no, i didn't suggest that, because that's slow
<klange> mrvn: probably
<heat> mrvn, no?
<heat> what's the alternative?
<klange> more processes
<mrvn> just imagine all the data races you avoid by not having threads :)
<klange> "I don't see the problem with that." - CPython
<gorgonical> heat: what's the alternative? This is faster than getting the cpuid directly isnt it
<mrvn> heat: factory model with message passing works nicely.
<gorgonical> Oh you're suggesting to directly put the ptr in, not the index
<gorgonical> yes that is better strictly
<heat> gorgonical, yes
<mrvn> gorgonical: if you store the ID in a register you might as well just store a pointer
<mrvn> one less memory access and addition
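Editor's sketch of the two options compared above, as a portable C model. The `scratch` argument stands in for whatever the kernel keeps in its scratch register (sscratch or tp on RISC-V); the struct, table and function names are hypothetical and are not Linux's or Onyx's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 4

struct task;  /* opaque in this sketch */

/* Hypothetical per-CPU block holding a pointer to the running task. */
struct percpu {
    unsigned cpu_id;
    struct task *current_task;
};

static struct percpu percpu_table[MAX_CPUS];

/* Variant 1: the register holds a CPU index.
 * Every access needs the table base plus a scaled add before the real load. */
static inline struct percpu *percpu_from_index(uintptr_t scratch)
{
    return &percpu_table[scratch];
}

/* Variant 2: the register holds the pointer itself, so no table lookup. */
static inline struct percpu *percpu_from_pointer(uintptr_t scratch)
{
    return (struct percpu *)scratch;
}

/* Both variants must resolve to the same block for a given CPU. */
static void percpu_example(void)
{
    percpu_table[1].cpu_id = 1;
    assert(percpu_from_index(1) ==
           percpu_from_pointer((uintptr_t)&percpu_table[1]));
}
```

Both variants reach the same data; the pointer form simply moves the address computation to boot time, when each CPU's register is initialised once.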
<gorgonical> Makes me wonder why linux does it this way then
<bslsk05> ​github.com: Onyx/percpu.h at master · heatd/Onyx · GitHub
<mrvn> maybe some archs don't have enough bits for a pointer in the scratch register but enough for the core ID?
<heat> they just didn't bother optimizing any other arch
<heat> probably because their percpu code is so fucking obtuse
<heat> macros on top of macros
<j`ey> linux is all macros
<heat> the kernel is just a giant macro
<heat> by the way, re: my previous question, perf has a tlb shootdown tracing thing
<heat> perf has everything
<heat> it's almost as good as an io_uring eBPF folio
<klange> I should do a riscv port of Misaka.
<klange> I hear it's very similar to aarch64 anyway.
<heat> it's a very simple arch
<heat> my x86_64 code = 7110 lines, riscv64 = 2882 lines
<geist> yay fun discussions
<geist> and yeah riscv is fun in that it's the bare minimum, no fuss. just gets it dun
<geist> sometimes that's annoying, but most of the time it's just straightforward
<heat> sup geist
xenos1984 has quit [Read error: Connection reset by peer]
<geist> not much. workin. meetings
<geist> the usual
<heat> sounds horrible
<heat> i've spent the last week doing my onboarding
<mrvn> always remember: This too shall pass
<zid`> are you officially a whatever now?
<heat> cloudflare yes
<zid`> officially a cloud
<heat> an orange one, yes
<zid`> just don't get it confused with being a spider
<zid`> they're the same word in japanese
<heat> spiderflare sounds great
<zid`> spiderflare is my metal band
<zid`> The first single on our LP is called Inferno Venom
<heat> anyway if I have to go through another 1hr onboarding session I'll unlive myself
<heat> let me engineer pls
<zid`> are you going to write their cool blog posts
<heat> hope so