klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
vai has joined #osdev
<vai> yo :-)
Arthuria has quit [Ping timeout: 246 seconds]
X-Scale has quit [Ping timeout: 256 seconds]
navi has quit [Ping timeout: 264 seconds]
karenthedorf has joined #osdev
X-Scale has joined #osdev
gog has quit [Ping timeout: 265 seconds]
<heat> is there a particularly TLB shootdown-intensive *realistic* workload?
<heat> basically i'm wondering how to strike the right tlb shootdown balances between IPIs and fine-grained TLB invalidation...
<heat> something really synthetic probably won't work if you can't easily see the TLB shootdown difference
Turn_Left has quit [Read error: Connection reset by peer]
bradd has quit [Ping timeout: 252 seconds]
bradd has joined #osdev
<vin> heat: Maybe PTE access bit scanning to detect page activity. https://sjp38.github.io/post/damon/
<bslsk05> ​sjp38.github.io: DAMON: Data Access Monitor | hacklog
<heat> damon's a whole thing ;)
<vin> heat: applications used here for evaluation http://www.cs.yale.edu/homes/abhishek/kumar-asplos18.pdf might be of interest for you
<heat> thank you, i'll check it out :)
Ram-Z has quit [Server closed connection]
Ram-Z has joined #osdev
Ameisen has quit [Server closed connection]
Ameisen has joined #osdev
Arthuria has joined #osdev
Mondenkind has quit [Server closed connection]
childlikempress has joined #osdev
childlikempress is now known as Mondenkind
qxz2 has quit [Server closed connection]
gcoakes has quit [Ping timeout: 245 seconds]
Stary has quit [Quit: ZNC - http://znc.in]
CompanionCube has quit [Quit: ZNC - http://znc.in]
heat has quit [Ping timeout: 258 seconds]
Stary has joined #osdev
CompanionCube has joined #osdev
alpha2023 has quit [Server closed connection]
alpha2023 has joined #osdev
Arthuria has quit [Ping timeout: 246 seconds]
<kazinsal> https://faultlore.com/cargo-mommy/ for the rust users
<bslsk05> ​faultlore.com: cargo-mommy
kazinsal has quit [Server closed connection]
kazinsal has joined #osdev
Gordinator has joined #osdev
GeDaMo has joined #osdev
vdamewood has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
<nikolar> rust users, smh
Gordinator has quit [Quit: My client has closed - perhaps I did that, perhaps I didn't]
<zid> Imagine being a user
<klys_> something might happen
Left_Turn has joined #osdev
gsekulski has joined #osdev
rustyy has quit [Quit: leaving]
Turn_Left has joined #osdev
rustyy has joined #osdev
Left_Turn has quit [Ping timeout: 258 seconds]
jimbzy has quit [Ping timeout: 240 seconds]
memset has joined #osdev
<adder> need some help with my linker?
<bslsk05> ​bpa.st: View paste 2OMA
<adder> I have stack_top defined in my linker script, and I'm trying to use it from boot.S
<adder> I declared it as extern, so I'm not sure how it's not seeing it
<adder> and I've no idea what the error means
<adder> and yes, this is my third attempt, ditched limine
<adder> what relocation? why is it undefined? how am I making a PIE object?
<klys_> needs info about your compile and link commands
<klys_> also I cannot do this for lack of time.
xenos1984 has quit [Read error: Connection reset by peer]
<adder> this is the makefile https://bpa.st/NUFQ
<bslsk05> ​bpa.st: View paste NUFQ
xenos1984 has joined #osdev
<adder> nvm, fixed
Gooberpatrol66 has quit [Ping timeout: 246 seconds]
navi has joined #osdev
gog has joined #osdev
heat has joined #osdev
op has joined #osdev
goliath has joined #osdev
MrCryo has joined #osdev
op has quit [Remote host closed the connection]
X-Scale has quit [Ping timeout: 256 seconds]
nathanpc has left #osdev [Textual IRC Client: www.textualapp.com]
Rubikoid has quit [Server closed connection]
Rubikoid has joined #osdev
vdamewood has joined #osdev
vdamewood has quit [Quit: Life beckons]
housemate has joined #osdev
heat has quit [Read error: Connection reset by peer]
heat has joined #osdev
emm has joined #osdev
zetef has joined #osdev
zetef has quit [Client Quit]
heat has quit [Read error: Connection reset by peer]
heat_ has joined #osdev
dequbed is now known as nadja
elfenix|cloud has quit [Server closed connection]
X-Scale has joined #osdev
MrCryo has quit [Remote host closed the connection]
housemate has quit [Quit: "I saw it in a TikTok video and thought that it was the smartest answer ever" ~AnonOps Radio [LOL]]
heat_ has quit [Remote host closed the connection]
heat_ has joined #osdev
jimbzy has joined #osdev
guideX has joined #osdev
<guideX> I'm having trouble separating my os code from the programs that run inside of it
<guideX> I guess my OS is missing a code interpreter, was just looking for others to sort of help me mentally grasp that, and see if I am thinking correctly about it
heat_ is now known as heat
<heat> you're not
<heat> Traditionally (in 99% of cases), you just use whatever the CPU architecture gives you to separate the user code and kernel code
<heat> traditionally some sort of kernel mode and user mode, or in x86 ring 0 and ring 3
<guideX> ah ok
<guideX> so which mode it will run in is one thing, but what about the act of separating the code from the os? right now, there's no concept of code in the os vs program
<guideX> that's something I have to build out I guess right
<mjg> so what material on operating systems have you read so far
<heat> system calls
<zid> Files, is a good way..
<zid> see: hard drives, initrd
<guideX> this is a cosmos c# operating system https://www.gocosmos.org/ https://imgur.com/a/O2ect3s I have been reading things on os's, but honestly it all began saturday morning, I have used cosmos before though
<bslsk05> ​www.gocosmos.org: COSMOS - COSMOS
<bslsk05> ​imgur.com: Imgur: The magic of the Internet
MiningMarsh has quit [Quit: ZNC 1.8.2 - https://znc.in]
<heat> no idea how that works, sorry
<guideX> no problem, it's quite different, I am familiar with it though, I wrote a cli os with cosmos in the past
<guideX> my understanding is limited though, on how to separate concerns, https://pastebin.com/raw/i9xKrisi this is a program for instance, but it is built with the os itself
<guideX> I am so far not sure how to like, separate the code of the os from code of programs, and was just kind of wondering how other os's do it, or tips
MiningMarsh has joined #osdev
<guideX> already I have a file system, networking, gui, and a program can have a window and stuff, but how do I put that code outside the os
<guideX> I can describe a program outside of the os, but the part where the code executes is harder
X-Scale has quit [Ping timeout: 256 seconds]
<guideX> is it like, I need to build a scripting language, and interpret the commands myself, or is my goal to try and use something
<guideX> or maybe neither of those things eh
Arthuria has joined #osdev
Matt|home has joined #osdev
<adder> I have a really, really, really stubborn pml4. I'm trying to get it page aligned via various means, from attributes to link time, but it remains at 0x111507?
<kof673> guideX, you keep saying "os" where i think most people would say "kernel"
jbowen has quit [Server closed connection]
jbowen has joined #osdev
<kof673> linux is "just a kernel"; bsd includes both; gnu/linux is an os (kernel + userland).
<kof673> you can use whatever terms you want, but understanding other people requires learning their definitions :D
<kof673> "Inconceivable!"
spare has joined #osdev
Starfoxxes has joined #osdev
X-Scale has joined #osdev
stefanct has quit [Server closed connection]
stefanct has joined #osdev
<guideX> kof673, I only started working on this thing on saturday, I'm a little ahead of my own knowledge and terminology
<guideX> it is already incredibly far along for how little time I have put into it
<guideX> I am using cosmos c# sdk, which abstracts things for me to some degree also
<guideX> there is a kernel project in my os vs the logic of the os
<guideX> the kernel handles things like, the file system, memory allocator, some drivers and things, and a whole lot more, the os project is about the os, built in features, and the things you see in the os
<guideX> there also libc, .net corlib, and the cosmos bits
<guideX> I say os instead because, it's hard to describe it all in a few short words xD
<guideX> but I guess the thing I am having trouble mostly with, is those "built in programs" vs the ones that exist outside my os, it's a troubling concept so far for me, I'm not sure how to go about that in a logical way
<GeDaMo> Does your system have the concept of processes?
<guideX> GeDaMo, yeah, I built like an app container
<guideX> it is for built in apps, but I have been trying to figure out how to have external apps, that bit is kind of missing
<guideX> I have it capable of describing a window of an external app, but I'm not sure how to execute code from an external app yet
<guideX> I say external like it means something, basically just code that is not compiled with the os
<guideX> I've been trying to figure out; how do I put this program for example outside the built code of the os https://pastebin.com/raw/btNZTLfM
<guideX> that : Window is what defines what is a gui program in my os
<GeDaMo> You'll need some kind of executable format which you can compile to and load into a process
basil has quit [Server closed connection]
SanchayanMaity has quit [Server closed connection]
<guideX> GeDaMo, I kind of just am looking for what is a logical way to go about it I guess, I can write the code to do it myself.. I would need something to build something like an interpreter for the front end controls (like xml or whatever), and something to interpret the code (a scripting language), and stuff it all inside some kind of zip file, and when launched, it does all the things with the front end and code to execute
SanchayanMaity has joined #osdev
<guideX> is that sort of, how to do that in a paragraph?
basil has joined #osdev
<guideX> I basically need to interpret scripts I guess eh
<GeDaMo> If you build an interpreter into your system, you can just load text files
<bslsk05> ​thasso.xyz: Setting up an x86 CPU in 64-bit mode
<heat> guideX, tip: don't use cosmos
<heat> cosmos isn't *really* a proper operating system
X-Scale has quit [Ping timeout: 256 seconds]
<heat> in using it you may just be confusing yourself further
<heat> adder, what did you try? are you sure you aren't looking at the wrong thing?
<kof673> https://0x0.st/s/JR6YKh_XCclMaYjOa3S0aA/XLwE.jpeg diagram of common levels of separation
X-Scale has joined #osdev
aejsmith has quit [Quit: Lost terminal]
aejsmith has joined #osdev
karenthedorf has quit [Remote host closed the connection]
<guideX> heat, just curious what do you find improper about it
<guideX> it does work on bare metal if that's a concern
<guideX> also you can download the cosmos source code and make changes to the base
<heat> it is (IMO) a glorified demo thing that's only popular because it's in C#, a language that really isn't suited for kernel development in any way shape or form
<guideX> actually, I don't think I'm using much cosmos, I am using it for debugging support, and the bootloader and console and certain things, I am mostly using .net native aot
<guideX> my other os is like 100% cosmos though
X-Scale has quit [Ping timeout: 256 seconds]
<guideX> ok, per your advice I removed all the cosmos bits, it still works fine xD
<guideX> I guess I was just using it for the cli portion, which I don't need it for that even
<guideX> this is entirely just, .net7/corlib, libc, and my os code
<heat> my advice is to stop using C# altogether :)
<heat> it is seriously not the language you want for low level development
<heat> C, C++, Rust - all fine choices
<heat> probably a few others i can't think of right now
<GeDaMo> asm! :P
<heat> /votekick GeDaMo
<dzwdz> any opinions on queue(3)?
<guideX> idk that is harder, the entire thing is c# xD
<guideX> I can write c++ too, but it is too late
<mjg> dzwdz: these are semi-shite macros, but ultimately they do work
<jimbzy> heat, C# == C++++?
<mjg> the real q though is if you should be linked listin' to begin with
<mjg> jimbzy: c# == ++c++;
<jimbzy> Ahhh
<jimbzy> That makes sense.
<dzwdz> i mean, linked lists are simple and i don't have any fancy needs
X-Scale has joined #osdev
<dzwdz> and i was thinking about using some common abstraction for linked lists instead of reimplementing them everywhere
<dzwdz> i'm debating if i should do that
<dzwdz> queue(3) is kinda ugly but at least it's well known, and i think it's used in the bsd kernels too?
<dostoyevsky2> can one use floating point numbers in the linux kernel? I remember on openbsd the compiler has flags that forbid fp to make context switches cheaper
<heat> dostoyevsky2, generally no, but there are exceptions if you really need SIMD for instance
<heat> (surrounded by kernel_fpu_begin/end())
<dostoyevsky2> heat: ah, interesting
<mjg> dude
<mjg> wtf
<mjg> fp being forbidden by default is kernels 101
theruran has joined #osdev
<dostoyevsky2> unless it's a cuda kernel
<heat> i would bet 200 weimar republic papiermarks as to how fuckin windows probably does something different
<heat> damn i was wrong, they also have KeSaveExtendedProcessorState/Restore
<heat> guess i lost like 2 cents
DragonMaus has joined #osdev
netbsduser has joined #osdev
<mjg> you also lost some social kredit with the gestapo
<heat> too early
<heat> no gestapo yet
<mjg> shit, also a month
<mjg> my apologies
Left_Turn has joined #osdev
Turn_Left has quit [Ping timeout: 258 seconds]
X-Scale has quit [Ping timeout: 256 seconds]
<mjg> s/also/almost/
<mjg> wtf
<mjg> anyhow i'm on the market for temporary access to a real pentium 3
<heat> lol what
<mjg> i'm not surprised to not find any options :[
<mjg> there is magic code i am totally not going to share which i'm confident sucks terribly
<mjg> despite the author claiming it's fast (while ofc providing nothing to back it up)
<mjg> according to fog's instruction tables it is indeed bad
<mjg> the question is how much we talkin'
<heat> sometimes code size really does matter tho
<mjg> .. :D
<mjg> mofs
<mjg> not doing that shit would be less code
<mjg> look mon the code is totally geezered
<mjg> the question is hwat kind of stats we talkin' specifically
<kof673> model name: Pentium III (Coppermine) i got a system or 2
<mjg> can you give me ssh access? i only need to prod some userspace a little bit
<mjg> as an unpriv user
<mjg> is that linukkz by any chance?
<kof673> i don't know if i have access to router to portfwd :/
<kof673> yes, i can run knoppix 8.1 binaries easily lol live cd, if you can compile there
<kof673> didn't mean to tease ....ask geist :D
<mjg> how much ram you got there
<mjg> fuckery could be done with reverse ssh, but i don't remember how that's done
<kof673> this system is like 512M not gonna happen. the other system is a laptop with non-working AC...so you would get about 2 hours. 1G or 2G there
<kof673> 2 hours before the battery dies lol
<kof673> maybe more...4?
<mjg> and you can't recharge the sucker? :D
<mjg> is it dead for good after?
<kof673> yes, but not while it is inside, i was hoping to build some contraption
<mjg> well i can prep some test scriptzz
<kof673> i can do that :D
<kof673> just boot knoppix 8.1 in qemu and get it working there :D
<mjg> 8(
<mjg> aight
<kof673> unless you have another live cd/dvd/usb stick :D
<mjg> can you test if perf works though?
<kof673> if you tell me what to do
<kof673> or does it need custom kernel?
<mjg> hrm
<mjg> that's 2017 vintage
<mjg> so that's after the meltdown et al fiasco
<kof673> too new? lol
<heat> pretty sure those cpus don't have the bugs?
<mjg> ye it would be best to bench without "knowing" about the problems
<mjg> heat: dude
<mjg> the 32-bit kernels got a facelift
<mjg> full 4G space
<mjg> cause meltdown
<heat> what?
<heat> when did that happen?
<mjg> fresh after meltdown?
<mjg> kof673: yo mate can you boot the sucker and "lscpu"
<kof673> yeah, gimme 5 minutes or so
<heat> idk boss i don't follow the 32-bit kernel stuff
listentolist has joined #osdev
<kof673> actually that has knoppix 7.6.1 unless i burn a cd or dvd maybe. 1G ram , kernel 4.2.6 .... that one is Mobile Intel® Pentium® III Processor "pentium M" 1500 mhz
<kof673> its loading... :D
<kof673> this system...is pentium III coppermine, 930 "desktop" https://0x0.st/s/n0954Iym5WxGvf8efguQiQ/XL3b.txt lscpu knoppix 8.1 lol
<kof673> *930 MHz
<kof673> 8.1 has kernel 4.12.7 yes 2017
<heat> mjg, whatever you're talking about doesn't check out
<heat> just booted a new i386 alpine linux kernel and page tables are as usual, userspace mapped, kernel addresses at typical i386 places
<kof673> the pentium m.... is same lscpu but model 9, and a few more flags: fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 tm pbe bts est tm2 (no sse2 on the desktop)
<kof673> uname -a says 4.2.6 was built in dec 2015
<mjg> heat: it totes happened on freebsd, i did not verify on linux
<mjg> heat: i'm gonna check linux soon(tm)
<mjg> kof673: can you "perf top" in there
<mjg> i'm gonna need period-accurate gcc and whatnot for some tinkering, but that i'm gonna sort out on my end
<mjg> and by period accurate i mean about 2000
<kof673> linux-perf-4.12 is not installed, /usr/bin/perf fails. not installed... i can compile stuff...
<mjg> perf is compilable from the kernel source, but it has quite a few deps
Turn_Left has joined #osdev
<mjg> knoppix is probably not suited for that
<mjg> i can try to get a binary working to copy over there
<mjg> that said i'm gonna prod you some time next week
<mjg> thanks mate
<kof673> ok
<Matt|home> o\
gcoakes has joined #osdev
Left_Turn has quit [Ping timeout: 245 seconds]
<kof673> the 7.6.1 "perf top" doesn't even exist at all...so go with 8.1 :D
<kof673> unless you really need older kernel
<kof673> older knoppix than those should also all work AFAIK
gcoakes has quit [Ping timeout: 245 seconds]
<guideX> I think what I'll do is convert the c# to cil, then find a cil interpreter
<guideX> and then I'm off to using binaries
nortti has quit [Server closed connection]
nortti has joined #osdev
<guideX> and that is how to separate a program from a piece of code inside my os
sjs has quit [Server closed connection]
sjs has joined #osdev
<chiselfuse> how is the `catch syscall` implemented in gdb?
<chiselfuse> how does the process get stopped at a point where it executes a specified system call?
<heat> ptrace
<heat> see ptrace(2)'s PTRACE_SYSCALL
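A minimal sketch of the mechanism heat points at, assuming Linux: a tracer resumes the tracee with `PTRACE_SYSCALL`, and the kernel stops the tracee again at the next syscall entry or exit. The function name `count_syscall_stops` is made up for illustration, and this only shows the stop/resume loop, not how gdb filters for one particular syscall.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that asks to be traced, then keeps resuming it with
 * PTRACE_SYSCALL until it exits. Returns the number of syscall stops
 * observed, or -1 if tracing could not be set up. */
static long count_syscall_stops(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);           /* tracing not permitted here */
        raise(SIGSTOP);         /* hand control to the parent */
        (void)getppid();        /* a syscall for the tracer to observe */
        _exit(0);
    }

    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* With TRACESYSGOOD, syscall stops are reported as SIGTRAP | 0x80,
     * which separates them from ordinary signal stops. */
    ptrace(PTRACE_SETOPTIONS, child, NULL, (void *)(long)PTRACE_O_TRACESYSGOOD);

    long stops = 0;
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) < 0)
            return -1;
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80))
            stops++;
    }
    return stops;
}
```

Each syscall normally produces two stops (entry and exit), so a debugger that wants to catch one specific call would presumably read the tracee's registers at each stop and compare the syscall number.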
yuiyukihira has quit [Server closed connection]
yuiyukihira has joined #osdev
navi has quit [Quit: WeeChat 4.2.1]
navi has joined #osdev
xtex has quit [Server closed connection]
xtex has joined #osdev
ddevault has quit [Server closed connection]
ddevault has joined #osdev
pg12 has quit [Server closed connection]
pg12 has joined #osdev
sm2n has quit [Server closed connection]
sm2n has joined #osdev
exec64 has quit [Server closed connection]
exec64 has joined #osdev
torresjrjr has quit [Server closed connection]
torresjrjr has joined #osdev
X-Scale has joined #osdev
gsekulski has quit [Ping timeout: 245 seconds]
alethkit has quit [Server closed connection]
alethkit has joined #osdev
Brnocrist has quit [Server closed connection]
Brnocrist has joined #osdev
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
Starfoxxes has quit [Read error: Connection reset by peer]
gsekulski has joined #osdev
X-Scale has quit [Ping timeout: 256 seconds]
spare has quit [Remote host closed the connection]
karenthedorf has joined #osdev
Matt|home has quit [Quit: KVIrc 5.2.4 Quasar http://www.kvirc.net/]
X-Scale has joined #osdev
MrBonkers has quit [Quit: ZNC 1.8.2+deb2build5 - https://znc.in]
GeDaMo has quit [Quit: 0wt 0f v0w3ls.]
getz has quit [Server closed connection]
getz has joined #osdev
exark has quit [Quit: quit]
exark has joined #osdev
gcoakes has joined #osdev
<heat> geist, what was your solution wrt riscv page mappings requiring sfence.vma? my problem being on the kernel side
<geist> shoot em down
<heat> aww eww
<geist> the key is whether or not you can handle a stray page fault
<geist> so for kernel, on zircon, we cant, so i shoot down mappings. for user space, just sfence on the local cpu
adder has quit [Server closed connection]
<geist> but also sfence locally on page fault entry (or exit, but entry is easier) just to make sure if you dont find anything wrong it'll retry
<heat> btw am i misreading this or do i need a global tlb invalidation for a paging structure change?
adder has joined #osdev
<geist> i still dont 100% understand why it's needed, almost like the cpu can store a 'negative' entry, but i dont understand it
<geist> basically you do to be 100% correct
<heat> wack :(
<geist> the verbiage of the spec says you must sfence any time you change the page tables
<geist> but i think the key is when adding an entry the worst case is it misses the addition
<geist> so you get a stray page fault, which if you sfence inside the PF handler and retry will continue
<geist> so i avoid doing a shootdown for all cores for user mappings
<heat> oh that's even worse, christ
<heat> i didn't notice the "adding"
<heat> i've been tightening up my x86 semantics there wrt page table removal and i'm figuring i should do the same for riscv
<bslsk05> ​fuchsia.googlesource.com: zircon/kernel/arch/riscv64/mmu.cc - fuchsia - Git at Google
<heat> but, seriously, global invalidation on page table adding :sob:
<geist> well, 'adding' is changing the space
<heat> yeah but other MMUs aren't this silly
<geist> yep, and this is precisely why i dont really want to try to unify the page table logic
<geist> because when it gets to the nitty gritty here the arches start to differ
<geist> esp when you throw ASIDs into the mix
<heat> in fact most x86's are de-facto "one invalidate flushes all of the walker cache"
<heat> actually this is pretty ok to handle, my tlb code is separate
<heat> e.g most of my unmap code is arch generic, with a bunch of arch-specific accessors and helpers. but the tlb invalidation code is probably going to be entirely separate
<geist> yah at the minimum you need to abstract it. probably something like 'handle_tlb_change(is_kernel, asid, addr)' and then some sort of spattering of 'flush_tlb(situation)' that each arch deals with
<geist> and depending on the arch it decides to take action or not on all those situations
<heat> arch-independent logic just does e.g tlb_remove_page, tlb_remove_pmd, tlb_remove_pud
<heat> for instance for arm64 i'm planning on just straight up shooting a tlbi and the tlb invalidation finish just being a dsb + isb
<geist> suggestion: stop using x86 style names for layers
<geist> just use level 0, 1, 2, 3, and decide which order to name it
<geist> far simpler
<heat> haha
<heat> i LARP'd linux sorry
<geist> at the minimum it makes it easier to write recursive stuff and just use something like int layer
<heat> wdym recursive?
<geist> if you really want you can use something like enum level { PMD, PML4, etc }
<Ermine> 'while learning from its mistakes'
<geist> oh if you wanted to do something like 'traverse(pt, vaddr, level)' that recursively calls itself
<geist> with level - 1 (or + 1)
<heat> yeah i can't do that, i can't assume levels are equal in size or length or format
<geist> oh you wanna port to 68k?
<geist> i think linux does too, they just require cpus with non uniform levels to suck it up
<heat> or x86 PAE :)
<geist> sure, but that's still easy to at least quantify
<geist> `constexpr size_of_level(int level) { switch(level) .... }`
<geist> if you pull it off right it's pretty darn efficient
<geist> on arm64 that's all dynamic anyway if you choose anything but the default aspace size
<geist> so if you write it that way you just replace those with non constexpr functions
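A small sketch of geist's "just number the levels" suggestion, assuming x86_64-style 4-level paging with 4 KiB pages and level 0 as the root. Only `size_of_level` comes from the chat; `level_shift`, `index_at_level` and the constants are made up for illustration. For non-uniform layouts such as PAE, `level_shift` would become a `switch` on the level, as geist describes.

```c
#include <stdint.h>

#define PAGE_SHIFT      12
#define BITS_PER_LEVEL  9
#define NUM_LEVELS      4   /* level 0 = root, level 3 = leaf table */

/* Bit position where the index for `level` starts in a virtual address. */
static inline unsigned level_shift(int level)
{
    return PAGE_SHIFT + BITS_PER_LEVEL * (NUM_LEVELS - 1 - level);
}

/* Bytes of address space covered by a single entry at `level`. */
static inline uint64_t size_of_level(int level)
{
    return UINT64_C(1) << level_shift(level);
}

/* Index into the table at `level` for `vaddr`. */
static inline unsigned index_at_level(uint64_t vaddr, int level)
{
    return (unsigned)(vaddr >> level_shift(level)) & ((1u << BITS_PER_LEVEL) - 1);
}
```

With this layout a leaf entry covers 4 KiB and a root entry covers 512 GiB, and generic code can loop or recurse over `level` instead of having one function per named layer.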
theyneversleep has joined #osdev
<bslsk05> ​gist.github.com: unmap.c · GitHub
<heat> it's darn generic!
<heat> but very linuxy :)
<heat> tlbi_remove_* does the TLB magic for whatever arch, pte_* (et al) are all arch-dependent, set_*() is also arch-dependent
<geist> yah see you can just collapse a few of those trailing funcs into one that takes a level
<heat> yeah
<heat> i did notice my page table function codegen got larger
<geist> now the problem there that's annoying is defining the order of the levels, and annoyingly they're not the same between arches
<geist> iirc arm numbers them backwards, like the leaf nodes are always 0, independent of how deep the structure is
<geist> which i guess makes sense in a certain way
<geist> though to me it always makes sense in my mind to number level 0 as the root, and the leaf level is just wherever it happens to be
<geist> 2, 3, 4, 5? whatever you know
<heat> yeah, like x86
<heat> _64
<geist> only reason it sort of matters on ARM is that on page faults ESR_EL1 actually tells you at what level it failed
<geist> and the way they encode it is according to their naming convention
goliath has quit [Quit: SIGSEGV]
<geist> so if they say there was a tlb permission failure at level 0 it was always at the leaf node
<geist> or at least the deepest part. kinda makes sense
<geist> but anyway i still like the idea of counting up as you go down the tree, so that's my thing
<heat> yeah i prefer counting up too
<geist> x86 counts backwards too, right? PML4, PML5?
<heat> yep
<heat> PML5 is the root
<geist> i guess the logic behind counting down is you can basically 'seed' your walk from the root with how deep it is for your configuration
<geist> `int get_tree_depth() = 5` then start your walk until you're at 0
<geist> which sort of makes sense from a logic point of view, keeps you from having to test at every level if you've reached the terminal depth
<geist> or at least the test is compare with 0
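The count-down walk geist describes can be sketched as a toy in plain C, assuming uniform 9-bit levels and using ordinary pointers in place of physical addresses. `get_tree_depth` is the name from the chat; `struct table`, `table_index`, `walk` and `lookup` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT      12
#define BITS_PER_LEVEL  9
#define ENTRIES         (1u << BITS_PER_LEVEL)

/* Toy table: each slot points at the next table down, or at a "page" in the leaf. */
struct table {
    void *slot[ENTRIES];
};

/* Hypothetical: a real kernel would derive this from the paging mode. */
static int get_tree_depth(void)
{
    return 4;
}

/* Levels count down: depth - 1 is the root, 0 is the leaf table. */
static unsigned table_index(uint64_t vaddr, int level)
{
    return (unsigned)(vaddr >> (PAGE_SHIFT + BITS_PER_LEVEL * level)) & (ENTRIES - 1);
}

static void *walk(const struct table *t, uint64_t vaddr, int level)
{
    void *next = t->slot[table_index(vaddr, level)];

    if (next == NULL || level == 0)
        return next;            /* hole in the tree, or the leaf entry itself */
    return walk(next, vaddr, level - 1);
}

/* Seed the walk with the configured depth; termination is just "level == 0". */
static void *lookup(const struct table *root, uint64_t vaddr)
{
    return walk(root, vaddr, get_tree_depth() - 1);
}
```

Changing the configured depth only changes the seed passed to `walk`; the termination test stays a comparison against zero.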
<heat> the easier solution to this problem is to say "stop counting nerds lol" and adopt whatever a linux guy half-drunk in 2004 said should be the page levels
<geist> tis probably why the hardware internally works that way
<geist> 2004 haha, goes a lot farther back than that bruh
<heat> i am aware
<heat> dunno what the first 4-level arch was
<geist> like 1994 or so when they ported to alpha and had to deal with 'shit how are we gonna shoehorn this non x86 page tables into x86'
<geist> 'oh i know, lets just add a bunch of macros and deal with it'
<geist> make it look like x86
<heat> i have to say they're not macros and the type hack they found is lovely and i need to use it more
<geist> yah though originally it was probably macros since that was older C
<heat> typedef struct { pteval_t pte; } pte_t; /* no implicit type conversion now bozos */
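To make the "type hack" concrete, here is a short sketch of the struct-wrapper idea. Only the `pte_t` typedef is from the chat; `pmd_t`, `make_pte`, `pte_val`, `pte_present` and `PTE_PRESENT` are illustrative names and values, not taken from any particular kernel.

```c
#include <stdint.h>

typedef uint64_t pteval_t;

/* Distinct struct types: the compiler refuses to mix them up, even though
 * both wrap the same integer type. */
typedef struct { pteval_t pte; } pte_t;
typedef struct { pteval_t pmd; } pmd_t;

#define PTE_PRESENT UINT64_C(1)   /* assumed bit layout, for illustration only */

static inline pte_t make_pte(pteval_t v)
{
    return (pte_t){ v };
}

static inline pteval_t pte_val(pte_t p)
{
    return p.pte;
}

static inline int pte_present(pte_t p)
{
    return (pte_val(p) & PTE_PRESENT) != 0;
}

/* Neither of these would compile, which is the point of the wrapper:
 *     pmd_t m = make_pte(0);     (incompatible struct types)
 *     pte_present(0x1000);       (integer where a pte_t is required)
 */
```

Since the struct has a single integer member, compilers typically pass it the same way as the bare integer, so the extra type safety should cost nothing at runtime on common ABIs.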
<geist> i do remember they were abusing the crap out of inline functions that were not standard C. my early newos code tried to use a lot of their bits
<geist> yah or even opaque structs in C
<geist> `struct myshit; void frob(struct myshit *, int frob_func);`
<heat> personally not a fan of opaque structs
<heat> you can't declare them on the stack and that's mega lame
<geist> agreed, but i'm a big fan of hiding the contents of stuff from callers that dont need to know
<geist> one of the generally worst parts of C++
<geist> the general cheat there that sometimes works is to define some sort of sizeof(struct) for the caller
<geist> and then they can pass you a buffer of bytes for them to construct a thing on, but that's pretty lame
<heat> if you don't have the deetz where are the inlines coming from :(
<geist> `struct foo; #define FOO_SIZE 64; #define FOO_ALIGN 8` and then inside your .c file do a `static_assert(sizeof(struct foo) == FOO_SIZE)`
<geist> oh word. indeed
<geist> i use it for things where it's really an opaque thing that they shouldn't know about and dont need inlines
<geist> like a pointer to a driver or whatnot
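A single-file sketch of the opaque-struct-plus-size-constant trick geist outlines. `FOO_SIZE`, `FOO_ALIGN` and `struct foo` follow the chat; `foo_init`, `foo_value` and the field layout are hypothetical. In a real project the first half would live in a header and the second half in the `.c` file.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* ---- what foo.h would expose ---- */
struct foo;                       /* opaque: callers never see the fields */
#define FOO_SIZE  64
#define FOO_ALIGN 8

struct foo *foo_init(void *storage, int value);
int foo_value(const struct foo *f);

/* ---- what foo.c would contain ---- */
struct foo {
    int value;
    char reserved[60];
};

/* Break the build if the public constants drift from the real layout. */
static_assert(sizeof(struct foo) == FOO_SIZE, "FOO_SIZE out of date");
static_assert(alignof(struct foo) <= FOO_ALIGN, "FOO_ALIGN too small");

/* Constructs a foo inside caller-provided storage of FOO_SIZE bytes. */
struct foo *foo_init(void *storage, int value)
{
    struct foo *f = storage;
    f->value = value;
    return f;
}

int foo_value(const struct foo *f)
{
    return f->value;
}
```

A caller would declare something like `alignas(FOO_ALIGN) unsigned char buf[FOO_SIZE];` on its own stack and pass `buf` to `foo_init`. As geist says, it is a bit of a cheat: the caller gets stack allocation without seeing the fields, and constructing an object in a plain byte buffer sits in a gray area of C's aliasing rules.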
<heat> a compromise i like: <frob_types.h> struct frob { /* ... */ }; <frob.h> static inline void frob_init(struct frob *) ...
<heat> i find it neatly reduces the header hell
<geist> yah
<heat> btw geist i don't see how the lazy non-leaf PTE adding thing is supposed to work?
<geist> lazy non leaf pte adding....
<geist> not sure i get what you're getting at
<heat> if they specifically ask for a global sfence.vma, i suspect it's not the same as doing sfence.vma <addr within page table>
<heat> at least per the very-formal-very-great riscv priv spec
<geist> well, the question is what is the worst case scenario
<geist> if you dont flush it on a clean add, what could possibly happen
<geist> it appears that worst case the cpu will get a tlb miss even if it's present
<geist> as if there's a pt walker cache that has cached the non present entry
<heat> could you get stuck in a page fault loop cuz no one understands what's happening?
<geist> right, that's why i also said here and in the comment that you should always flush the page upon entry to a PF
<geist> that way worst case if it appears like nothing to be done you just restart and it should work the second time
<heat> yeah but global sfence.vma != sfence.vma <addr> right?
<geist> that's right, but a PF is only for a local cpu
<geist> so you're only concerned about the vision of that one cpu at the time
<geist> so if you have 8 cores and you map a page on the first cpu, locally sfence and continue
<geist> now you have the chance that 7 other cores will as they touch that page also fault with an extraneous fault (but probably not)
<geist> so if they do, they locally sfence and continue and will work the second time
<geist> so you're avoiding a global flush with the idea that 99% of the time the secondary cores wont trip over it
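The scheme geist lays out above can be modeled as a toy simulation in userspace C. This is not RISC-V code and not how Zircon implements it: `stale_miss` merely stands in for a hart having cached a "not present" result, `local_fence` stands in for a local `sfence.vma`, and every name here is made up for the sketch.

```c
#include <stdbool.h>
#include <string.h>

#define NCPUS  8
#define NPAGES 16

static bool page_table[NPAGES];         /* shared: is the page mapped? */
static bool stale_miss[NCPUS][NPAGES];  /* per-CPU: cached "not present" result */
static int  spurious_faults;

/* Stand-in for a local fence: drop this CPU's cached translations. */
static void local_fence(int cpu)
{
    memset(stale_miss[cpu], 0, sizeof stale_miss[cpu]);
}

/* Models a CPU caching the absence of a mapping before it is added. */
static void speculate_miss(int cpu, int page)
{
    if (!page_table[page])
        stale_miss[cpu][page] = true;
}

/* Add a mapping and fence only on the mapping CPU: no cross-CPU shootdown. */
static void map_page(int cpu, int page)
{
    page_table[page] = true;
    local_fence(cpu);
}

/* Fault handler: fence on entry, then decide whether the fault was real. */
static bool handle_fault(int cpu, int page)
{
    local_fence(cpu);
    if (page_table[page]) {
        spurious_faults++;      /* nothing to fix: just retry the access */
        return true;
    }
    return false;               /* genuinely unmapped */
}

/* Returns true if the access succeeds, possibly after one retry. */
static bool touch(int cpu, int page)
{
    for (int attempt = 0; attempt < 2; attempt++) {
        if (page_table[page] && !stale_miss[cpu][page])
            return true;
        if (!handle_fault(cpu, page))
            return false;
    }
    return false;
}
```

In this model a CPU that cached a stale miss takes exactly one extra fault and then proceeds, while CPUs that never cached anything pay nothing, which is the trade geist is making against a global flush on every added mapping.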
<heat> yeah that's not my point
<geist> oh for adding inner nodes? yah
<heat> point is: "If software modifies a non-leaf PTE, it should execute SFENCE.VMA with rs1=x0" this would imply that it's permissible for the implementation to *not* flush the walker cache on a sfence.vma rs1=addr no?
<geist> yah i dont when adding a new one (and only in that case) because it seems fine on real hardware: https://fuchsia.googlesource.com/fuchsia/+/refs/heads/main/zircon/kernel/arch/riscv64/mmu.cc#797
<bslsk05> ​fuchsia.googlesource.com: zircon/kernel/arch/riscv64/mmu.cc - fuchsia - Git at Google
<geist> but that may also not be correct
<geist> that MapPageTable routine is basically the main recurse and map routine
<heat> yeah linux seems to YOLO it too
<heat> either this should be amended in the spec or we're all fucked
<geist> so on removal we do do TLB flush but only at the end of the operation, before the page is returned to the PMM https://fuchsia.googlesource.com/fuchsia/+/refs/heads/main/zircon/kernel/arch/riscv64/mmu.cc#723
<bslsk05> ​fuchsia.googlesource.com: zircon/kernel/arch/riscv64/mmu.cc - fuchsia - Git at Google
<geist> same as arm64
<geist> note you should grab the newest version of the spec, not officially released, it's a lot clearer
<geist> someone has been cleaning it up
<geist> it might mention more bits about it
<geist> at the minimum it moved away from the default Latex look
<geist> there's also the newer, fancier flush mechanism though i dont think it changes the contract really
<heat> https://gist.github.com/heatd/0f41377789a81cd16dc44602f2c93890 my logic for x86 removal is actually Real Simple
<bslsk05> ​gist.github.com: x86_tlbi.c · GitHub
<heat> god bless x86
m3a has quit [Ping timeout: 246 seconds]
<geist> yah and of course ARM has something fairly similar as you're aware
<heat> wdym?
<geist> well, the whole page table walker cache maintenance thing
<geist> where you need to tell it to dump the inner nodes manually
<heat> yep
<geist> or always use the stronger version, in which case it acts like x86 all the time
<heat> tbf i don't know how intel cores are looking in this regard, they explicitly recommend that logic i used. amd also does, but they explicitly mention the old behavior of a single invlpg flushing the whole thing
<heat> and amd does have the EFER.TCE bit you can set, which actually opts-in
bslsk05 has quit [Server closed connection]
<geist> yah i think it just basically mentions that there is a page walker cache and you dont have to worry about it as long as you invlpg
m3a has joined #osdev
<geist> oh wow that's a very good point
<geist> never occurred to me, i had kinda written it off too because of the same PCID issue
<geist> most likely linux wont use it because they've probably invested in a bunch of 'avoid IPI storm' logic that probably scales better
<geist> i think there's even talk in the riscv manuals or one of the things i read that said they sort of explicitly didn't do the broadcast stuff because in the long run having software do it is more efficient, somewhat paradoxically
<geist> at least when you really scale up to 128, 256, etc cores. software knows best (ie, linux)
<geist> i've heard some folks grumble that on some of the ARM server cores the broadcast IPI stuff is *slow* because the hardware has some concurrency issues
<geist> and you're almost better off switching to a software solution
<heat> i kinda want to play around with it now, though i don't have a zen 3
<mjg> except these ipis are mostly self-induced on freebsd
<mjg> s/ipis/invalidations/
<geist> yah vs doing a simple broadcast ipi it's probably a win
<geist> vs doing a very sophisticated 'avoid flushing until you have to' software solution it's probably not
<mjg> the kernel makes extensive use of temporary mappings which it keeps whacking
<geist> right, exactly
<geist> or some sort of 'delay this on that cpu because it's idle, or running user space, or something'
<mjg> vast majority of that can be straight up eliminated
<mjg> most commonly seen ipis come from freeing pipe-backing buffers
<geist> what about plain thread stacks?
<mjg> they are cached
<geist> how so?
<geist> like recycled from previous mappings?
<geist> some sort of LRU?
<mjg> no lru or anything, just per-cpu caching of some number of stacks
<mjg> linux is also doing it, except as a total hack (caching up to 2 per cpu)
<geist> and when creating a thread it tries to grab one from the local cpu
<mjg> yes
<geist> well, it's kinda a lru, just distributed across the cpus
<geist> so i guess the worst case is you suddenly create and destroy a bunch of threads, so it builds up a list
<mjg> it's a static 2-sized array
<geist> but then i guess it can try to collect and free a bunch at a time, which would be a win from IPI point of view
<mjg> it notoriously overflows
<geist> right
<mjg> i added some probes and ran a kernel build
<geist> still, better than nothing, but not by much
<geist> would just soak up some little stray thread creation
<mjg> well it is better than nothing but seriously lame af
<geist> but point is when it frees those i assume it does some IPI to everyone
<mjg> they have something to delay ipis in vmalloc/vfree (which is how stacks come to be), but eventually yes, you get hit
<heat> linux vmalloc does not broadcast ipis
<heat> *if it can*
<geist> i was always wondering if there was some sort of generation style counter thing where you dont free the pages to the PMM, but you unmap them but dont TLB broadcast
<geist> but roll the gen counter
<heat> you can totes free the pages
<mjg> here is a simple solution which takes care of everything: intermediate per-node cache
<geist> as cpus cycle through the kernel they bump their counter to match, tlb flush, then when they all have you return the pages
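The generation-counter idea geist describes can be sketched roughly as below. This is a single-threaded user-space simulation and not code from Zircon, Linux, or any other kernel: every name here (`deferred_unmap`, `cpu_kernel_entry`, `pending_batch`, the counters standing in for the PMM and the local TLB flush) is hypothetical, and a real version would need atomics and memory barriers around the generation reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPUS       4
#define MAX_PENDING 8

/* Hypothetical: a batch of unmapped pages held back from the PMM. */
struct pending_batch {
    uint64_t gen;     /* generation at which the pages were unmapped */
    int      npages;  /* number of pages parked in this batch */
    bool     in_use;
};

static uint64_t global_gen;            /* rolled on every deferred unmap */
static uint64_t cpu_gen[NCPUS];        /* generation each CPU last flushed at */
static struct pending_batch pending[MAX_PENDING];
static int pages_returned;             /* stand-in for freeing to the PMM */
static int local_flushes;              /* stand-in for a local TLB flush */

/* Unmap without a TLB broadcast: roll the generation and park the pages. */
bool deferred_unmap(int npages)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!pending[i].in_use) {
            pending[i] = (struct pending_batch){
                .gen = ++global_gen, .npages = npages, .in_use = true };
            return true;
        }
    }
    return false; /* no slot: a real kernel would fall back to an IPI */
}

/* Return every batch that all CPUs have already flushed past. */
static void reclaim(void)
{
    uint64_t min_gen = cpu_gen[0];
    for (int c = 1; c < NCPUS; c++)
        if (cpu_gen[c] < min_gen)
            min_gen = cpu_gen[c];

    for (int i = 0; i < MAX_PENDING; i++) {
        if (pending[i].in_use && pending[i].gen <= min_gen) {
            pages_returned += pending[i].npages;
            pending[i].in_use = false;
        }
    }
}

/* Called as a CPU cycles through the kernel. */
void cpu_kernel_entry(int cpu)
{
    assert(cpu >= 0 && cpu < NCPUS);
    uint64_t g = global_gen;   /* read the target generation first */
    if (cpu_gen[cpu] != g) {
        local_flushes++;       /* a real kernel would flush its TLB here */
        cpu_gen[cpu] = g;      /* publish only after the flush */
    }
    reclaim();
}
```

In this sketch a batch of pages is only returned once the slowest CPU has caught up, so a CPU that sits idle or stays in user space would stall reclaim indefinitely; a real design would presumably need a fallback IPI for that case.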
<heat> it's the kernel, you can do whatever you like
<mjg> there, after some warmup you will probably never vfree any of the pages
m3a has quit [Ping timeout: 264 seconds]
<heat> UAF deref? you're fucked anyway, might as well leak some data
<mjg> and the local array overflow will add to the per-node cache instead of vfreeing
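A minimal sketch of the two-level scheme mjg proposes: a small per-CPU array backed by a larger per-node cache, so that overflow lands in the node cache instead of going back through the vmalloc/vfree pair. All names and sizes are hypothetical, `malloc`/`free` stand in for vmalloc/vfree, there is a single node, and there is no locking, so this is a single-threaded illustration and not how Linux or FreeBSD actually implement it.

```c
#include <assert.h>
#include <stdlib.h>

#define NCPUS        4
#define PERCPU_SLOTS 2      /* the "static 2-sized array" from the discussion */
#define NODE_SLOTS   64     /* hypothetical per-node capacity */
#define STACK_SIZE   16384  /* hypothetical stack size */

static void *percpu[NCPUS][PERCPU_SLOTS];
static int   percpu_n[NCPUS];
static void *node_cache[NODE_SLOTS];   /* one node only, for brevity */
static int   node_n;
static int   slow_allocs, slow_frees;  /* count trips to the slow path */

/* Stand-ins for vmalloc/vfree; the vfree side is what ends in IPIs. */
static void *slow_alloc(void)   { slow_allocs++; return malloc(STACK_SIZE); }
static void  slow_free(void *p) { slow_frees++;  free(p); }

void *stack_alloc(int cpu)
{
    assert(cpu >= 0 && cpu < NCPUS);
    if (percpu_n[cpu] > 0)              /* fastest: local CPU cache */
        return percpu[cpu][--percpu_n[cpu]];
    if (node_n > 0)                     /* next: per-node cache (needs a lock in real code) */
        return node_cache[--node_n];
    return slow_alloc();                /* last resort */
}

void stack_free(int cpu, void *stack)
{
    assert(cpu >= 0 && cpu < NCPUS);
    if (percpu_n[cpu] < PERCPU_SLOTS) {
        percpu[cpu][percpu_n[cpu]++] = stack;
        return;
    }
    if (node_n < NODE_SLOTS) {          /* local overflow goes to the node cache */
        node_cache[node_n++] = stack;
        return;
    }
    slow_free(stack);                   /* only when both levels are full */
}
```

After a warmup burst of thread creation and destruction, later allocations in this sketch are served entirely from the two caches, which is the effect described above. As geist points out a few lines later, a real version would also need a way to drain the node cache under memory pressure.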
<mjg> i'm rather negatively surprised they did not already do it
<heat> NEGATIVELY SURPRISED
<mjg> what
<heat> another one of your funny expressionz
<geist> any of those local caches always have a side effect though where you have to deal with an OOM situation
<heat> depessimize style
<geist> so they're not free necessarily
<mjg> ofc
<mjg> this is why i said per-node cache
<mjg> not bigger per-cpu caches
<geist> and more to the point folks may take extra convincing as a result
m3a has joined #osdev
<mjg> that i doubt. note the total number of cached stacks may even be the same
<geist> our security folks in fuchsia would take longer to convince because of potential use-after-free situations of recycling stacks
<mjg> the difference is you don't go to vmalloc/vfree pair
<geist> easiest way is to just not tell them
<heat> geist, re the riscv thing: on rs1!=x0 "The fence also invalidates all address-translation cache entries that contain leaf page table entries corresponding to the virtual address in rs1, for all address spaces."
<mjg> linux memsets cached stacks fwiw
<geist> yah i guess that's true
<heat> so it is weird that they explicitly recommend a global fucking flush?
<geist> yep, that was exactly my surprise too
<geist> it's pretty bad. i assume they're starting off very conservatively and over time more extensions will appear that relax things
<geist> so for example the sinval instruction is now there which is a bit looser
<geist> it makes sense, it's always easier to loosen up things over time with new features or even just an extension flag that says 'this isn't as strict as it could be' kinda things
<heat> Znotmyfirstcpuunicourse
<heat> wording around this should probably be tightened up
<geist> note i only saw the weird stray PF stuff on a newer sifive core that's not generally available yet
<geist> i have meant to ask them precisely what's going on, even their manual doesn't really go into any more details
<geist> p470/p670. they're at least announced
<geist> proper superscalar + vector riscv core
<heat> didnt you also see it on the starfive you have?
<geist> you know i dunno, i forget
<geist> i may have been talking about running it on the newer cores, i was just being coy about it because i didn't want to mention the new cores
<geist> but now they're announced, etc
<heat> yeah
melonai has quit [Quit: Ping timeout (120 seconds)]
Gooberpatrol66 has joined #osdev
<heat> https://kib.kiev.ua/x86docs/Intel/WhitePapers/317080-002.pdf i found this earlier, great overview
<geist> oh that does look good
netbsduser has quit [Ping timeout: 246 seconds]
Turn_Left has quit [Ping timeout: 264 seconds]
gruetzkopf has quit [Server closed connection]
gruetzkopf has joined #osdev
jjuran has quit [Server closed connection]
emm has quit [Ping timeout: 258 seconds]
melonai has joined #osdev
X-Scale has quit [Ping timeout: 256 seconds]
theyneversleep has quit [Remote host closed the connection]
baraq has quit [Server closed connection]
baraq has joined #osdev
jjuran has joined #osdev
jjuran has quit [Remote host closed the connection]
jjuran has joined #osdev
tom5760 has quit [Server closed connection]
tom5760 has joined #osdev
<heat> i kinda wanted a CoW filesystem tree right now
<heat> but i use REAL FILESYSTEMS so i'll have to just git clone the source code into a separate tree
<clever> heat: seen git worktrees?
<heat> nop
<heat> e
<clever> basically, `git worktree add ../path branch`
<clever> and it will checkout that branch, in a new dir
<clever> but both places, share the .git and object store
<clever> so you can act on 2 branches of a repo at once, without paying the cost of 2 .git's
<heat> oh interesting
<clever> [clever@amd-nixos:~/apps/nixpkgs-master4]$ cat .git
<clever> gitdir: /home/clever/apps/nixpkgs/.git/worktrees/nixpkgs-master4
<clever> internally, .git becomes a file, pointing to the master .git
<clever> `git worktree ls` will also show the state of each
<clever> s/ls/list/
<clever> i have 25 different worktrees on nixpkgs alone
<heat> what happens if you check out a branch, then make some changes?
<heat> can you easily merge those back?
<clever> a given branch can only be checked out in one place at a time
<clever> and the normal things like `git merge` can still work as usual
<heat> hmm guess i could fork a new branch
<heat> lets try that
<clever> there is also `git worktree add ../path -b newbranch oldbranch`