#osdev on 2024-01-31 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

13:45 _whitelogger has joined #osdev

13:50 <heat_> LOGGER'S BACK

13:50 <heat_> WOOOOOOOOOO

13:51 kfv has joined #osdev

13:54 <zid> oh nice

13:54 <zid> but no more talking about smurfs :(

13:57 <heat_> now that we're logged, what kind of SOURCE CODE LEAKS have yall been looking at

13:57 netbsduser has quit [Ping timeout: 240 seconds]

13:57 <heat_> i cannot wait to infringe on intellectual property in a logged channel

13:57 <gog> i'm generative ai

13:58 <gog> i can't not infringe on IP

14:00 <nikolapdp> zid smurfs is a great name for a filesystem

14:02 <zid> wait, what did you think I was talking about

14:02 <zid> if not smurFS

14:04 goliath has quit [Quit: SIGSEGV]

14:04 <nikolapdp> that's the spirit

14:06 Left_Turn has joined #osdev

14:07 Left_Turn has quit [Remote host closed the connection]

14:08 navi has quit [Quit: WeeChat 4.0.4]

14:13 netbsduser has joined #osdev

14:16 bauen1 has quit [Ping timeout: 268 seconds]

14:18 kfv has quit [Quit: Textual IRC Client: www.textualapp.com]

14:21 janemba has quit [Ping timeout: 245 seconds]

14:22 heat_ is now known as heat

15:13 navi has joined #osdev

15:17 Left_Turn has joined #osdev

15:32 luke9716 has quit [Remote host closed the connection]

15:44 bauen1 has joined #osdev

15:53 frkazoid333 has joined #osdev

16:04 zxrom has joined #osdev

16:23 heat_ has joined #osdev

16:23 heat has quit [Read error: Connection reset by peer]

16:30 pretty_dumm_guy has quit [Ping timeout: 252 seconds]

16:31 pretty_dumm_guy has joined #osdev

16:36 zetef has joined #osdev

16:38 TkTech has quit [Quit: Ping timeout (120 seconds)]

16:38 TkTech has joined #osdev

16:41 randm has quit [Remote host closed the connection]

16:41 randm has joined #osdev

16:59 zetef has quit [Remote host closed the connection]

17:12 joe9 has joined #osdev

17:34 jack_rabbit has quit [Read error: Connection reset by peer]

17:35 jack_rabbit has joined #osdev

17:47 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

17:49 netbsduser has quit [Ping timeout: 264 seconds]

17:49 antranigv has joined #osdev

17:53 netbsduser has joined #osdev

17:57 gog has quit [Quit: Konversation terminated!]

18:00 yoo has quit [Ping timeout: 256 seconds]

18:04 yoo has joined #osdev

18:10 goliath has joined #osdev

18:11 yoo has quit [Ping timeout: 246 seconds]

18:25 zetef has joined #osdev

18:32 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

18:33 antranigv has joined #osdev

18:35 GeDaMo has quit [Ping timeout: 264 seconds]

18:38 Shaddox404 has joined #osdev

18:40 GeDaMo has joined #osdev

18:45 bitoff has joined #osdev

18:50 <nikolapdp> heat: how do you do physical memory allocation

18:50 Neo has quit [Ping timeout: 260 seconds]

18:50 <heat_> imagine not having tab completion

18:50 heat_ is now known as heat

18:51 <nikolapdp> kek

18:51 <heat> nikolapdp, very generic question, please explain

18:51 <nikolapdp> like how do you keep track of what physical pages you have allocated or not

18:51 <nikolapdp> do you use a slab, or bitmap or whatever

18:51 <zid> bitmap is life

18:51 <zid> bitmap is love

18:52 <nikolapdp> sure is zid

18:52 <zid> bitmap of bitmaps

18:52 <heat> i have two physical memory allocators

18:52 <heat> my bootmem allocator works pre-buddy, it's basically a list of available ranges and reserved ranges, and you carve out memory from the available ranges

18:53 <heat> it's a very simple thing
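
A minimal sketch of the kind of range-carving boot allocator heat describes, not his actual code. It assumes the reserved ranges have already been subtracted from the available list, that alignment padding can simply be thrown away, and that physical address 0 is never handed out (so 0 can signal failure). The names `bootmem_add`, `bootmem_alloc` and `MAX_RANGES` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 8   /* hypothetical fixed capacity, enough for a sketch */

struct range {
    uint64_t base;     /* first free physical address in the range */
    uint64_t size;     /* bytes still free starting at base */
};

static struct range avail[MAX_RANGES];
static size_t nr_avail;

/* Register a range of usable memory (e.g. from the firmware memory map). */
static void bootmem_add(uint64_t base, uint64_t size)
{
    assert(nr_avail < MAX_RANGES);
    avail[nr_avail++] = (struct range){ base, size };
}

/* Carve `size` bytes, aligned to `align` (a power of two), out of the first
 * range that can hold them. Returns the physical address, or 0 on failure. */
static uint64_t bootmem_alloc(uint64_t size, uint64_t align)
{
    assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

    for (size_t i = 0; i < nr_avail; i++) {
        struct range *r = &avail[i];
        uint64_t start = (r->base + align - 1) & ~(align - 1);
        uint64_t skip = start - r->base;   /* alignment padding, dropped */

        if (skip > r->size || r->size - skip < size)
            continue;                      /* does not fit, try the next range */

        r->base = start + size;            /* shrink the range from the front */
        r->size -= skip + size;
        return start;
    }
    return 0;
}
```

There is no free operation here on purpose: an allocator like this only has to live until the real page allocator takes over.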

18:53 <zid> boros does a linked list cus it's boring and trivial

18:53 <heat> my actual page allocator (when memory is "properly up" and I have struct page available) is a buddy allocator

18:54 <nikolapdp> makes sense

18:54 <heat> pages in the buddy allocator get marked PAGE_FLAG_BUDDY, the order is also stashed in the struct page; these two things help me coalesce pages
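
A rough sketch of how a "free" flag plus a stored order in struct page is enough to coalesce buddies on free. This is an illustration, not heat's code: `PAGE_FLAG_BUDDY` is the name from the log, but its value, the 16-page `pages[]` array, `MAX_ORDER` and `free_block()` are assumptions, and the per-order free lists a real allocator would keep are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ORDER       4u          /* hypothetical: largest block is 2^4 pages */
#define NR_PAGES        16u         /* hypothetical: a tiny 16-page machine */
#define PAGE_FLAG_BUDDY (1u << 0)   /* name from the log, value made up */

struct page {
    uint32_t flags;   /* PAGE_FLAG_BUDDY set: this page heads a free block */
    uint32_t order;   /* only meaningful while PAGE_FLAG_BUDDY is set */
};

static struct page pages[NR_PAGES];

/* Free the block of 2^order pages starting at page frame number pfn, merging
 * with its buddy for as long as the buddy is free and has the same order.
 * Returns the order of the free block that results. */
static unsigned free_block(size_t pfn, unsigned order)
{
    assert(order <= MAX_ORDER && (pfn & (((size_t)1 << order) - 1)) == 0);

    while (order < MAX_ORDER) {
        size_t buddy = pfn ^ ((size_t)1 << order);   /* flip the order bit */

        if (buddy >= NR_PAGES)
            break;
        if (!(pages[buddy].flags & PAGE_FLAG_BUDDY) ||
            pages[buddy].order != order)
            break;                                   /* buddy busy or split */

        pages[buddy].flags &= ~PAGE_FLAG_BUDDY;      /* absorb the buddy */
        pfn &= ~((size_t)1 << order);                /* merged block starts at the lower half */
        order++;
    }

    pages[pfn].flags |= PAGE_FLAG_BUDDY;
    pages[pfn].order = order;
    return order;
}
```

The order check matters: a buddy that is free but currently split into smaller blocks has a head page with a lower order, so it must not be merged.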

18:55 <heat> then as a kind of "separate layer but not really" i have a percpu cache of order-0 pages

18:55 <heat> does this answer your question?

18:55 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

18:55 <nikolapdp> yes it does

18:58 antranigv has joined #osdev

19:01 <heat> why do you care

19:01 <nikolapdp> just curious

19:07 <heat> oh, note that my buddy allocator is zone-based

19:07 <heat> and technically-but-not-actually NUMA-node-based

19:08 <heat> say, a node has memory for DMA32 and NORMAL (> 4GB)

19:08 zetef has quit [Remote host closed the connection]

19:08 <zid> nikolapdp: When are you adding me a proper allocator to boros?

19:08 <heat> what zone you prefer/use entirely depends on flags you pass the allocator
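
One way flag-driven zone selection could look, as a sketch only. The zone names DMA32 and NORMAL come from the log; the flag `ALLOC_FLAG_DMA32`, the `pick_zone()` helper and the fallback policy (unconstrained requests prefer NORMAL and fall back to DMA32) are assumptions, not a description of heat's allocator.

```c
#include <assert.h>
#include <stddef.h>

enum zone_id { ZONE_DMA32, ZONE_NORMAL, NR_ZONES };

#define ALLOC_FLAG_DMA32 (1u << 0)   /* hypothetical: caller needs memory below 4 GiB */

struct zone { size_t free_pages; };
struct node { struct zone zones[NR_ZONES]; };

/* Pick the zone that should serve a single-page allocation.
 * Returns a zone_id, or -1 if no acceptable zone has free pages. */
static int pick_zone(const struct node *n, unsigned flags)
{
    assert(n != NULL);

    /* Unconstrained callers take NORMAL first, so that scarce low memory
     * stays available for devices that can only address 32 bits. */
    if (!(flags & ALLOC_FLAG_DMA32) && n->zones[ZONE_NORMAL].free_pages > 0)
        return ZONE_NORMAL;

    /* DMA32 memory is usable by everyone, so it is the common fallback. */
    if (n->zones[ZONE_DMA32].free_pages > 0)
        return ZONE_DMA32;

    return -1;
}
```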

19:09 <nikolapdp> zid: when i am done writing my own os

19:09 <heat> it also has some initial support for kicking off page reclamation

19:09 <zid> you mean, after you're done reading honzuki

19:09 <nikolapdp> people can do two things

19:10 <nikolapdp> heat why is it numa but not really

19:10 <heat> in practice besides the basic LRU shit (which i *still* don't have) i need memory compaction in order to reliably be able to get higher order pages

19:10 joe9 has quit [Quit: leaving]

19:11 <heat> it's numa but not really because although I do have the beginnings of a struct page_node for each NUMA node, i don't instantiate any and the alloc_page() interface does not support specifying numa nodes

19:11 <heat> nor is slab numa-aware, nor is anything else

19:12 <heat> and i cant be arsed because i don't have numa hardware, so it'd be pretty hard to test nonetheless

19:12 <heat> even if i tried to add numa

19:12 <nikolapdp> lol fair enough

19:16 Neo has joined #osdev

19:18 jack_rabbit has quit [Read error: Connection reset by peer]

19:18 jack_rabbit has joined #osdev

19:19 Shaddox404 is now known as Shaddox_AFK

19:26 <heat> geist, have you seen Svvptc?

19:26 <heat> it works around the need for the "redundant" sfence.vma when mapping in a page fault

19:27 <heat> it makes stores to PTEs that set V happen-before an sret or mret

19:33 Shaddox_AFK has quit [Ping timeout: 256 seconds]

19:38 Shaddox_AFK has joined #osdev

19:42 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

19:43 antranigv has joined #osdev

19:44 <Ermine> time for speedrun build onyx 100%

19:50 <heat> wat

19:51 <zid> gog: https://cdn.discordapp.com/attachments/983644553397534741/1202311620311646269/knj1N2J.png

19:57 Shaddox_AFK is now known as Shaddox404

19:57 <Shaddox404> Anyone using NixOS here?

19:59 <nikolapdp> no

19:59 <heat> you're the only nixos user in the world

19:59 <heat> enjoy

20:00 <Ermine> heat: I've got a new laptop and I want to check how quickly it will build onyx

20:00 <heat> cool! onyx is very fast to build

20:00 <heat> all ports - less so

20:06 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

20:06 antranigv has joined #osdev

20:18 nitrix has quit [Quit: ZNC 1.8.2 - https://znc.in]

20:19 nitrix has joined #osdev

20:29 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

20:30 antranigv has joined #osdev

20:34 <geist> ONYXXXXX

20:34 <geist> i'm not sure i've ever built it either, need to do it

20:34 <geist> also hmm, which one is svvptc....

20:35 <geist> ah yeah, dunno. no haven't fiddled with it

20:36 <heat> yeah its very new

20:37 <heat> it wasn't ratified yet tho

20:39 gareppa has joined #osdev

20:40 <geist> does remind me i should look at the oh what is it extension (i have a whole spreadsheet at work with a list of extensions, but i'm on my personal computer right now)

20:40 <geist> it's the one that lets you split MMU flushes into separate flush and sync instructions

20:40 <geist> ie, like arm

20:40 <geist> that extension is starting to show up on things

20:42 <heat> the Owhatisit extension?

20:42 <geist> Svinval

20:43 <geist> qemu will emulate it but i'm sure it makes no difference at all, probably treats the sync as a nop

20:43 <heat> i have no idea how one is supposed to support all these extensions and differing code paths

20:43 gareppa has quit [Client Quit]

20:43 <heat> this looks like opengl extension hell, but architecture

20:44 <geist> well, in general you start adding global bools and either test at the place, start code patching, or have different virtual functions

20:44 <geist> at this point it's nothing like supporting a bunch of stuff on arm64 post v8.0

20:44 <heat> is arm64 worse?

20:50 <geist> well, now that it's up through 8.7 and whatnot there are a *ton* of details that you may want to conditionalize on in the kernel

20:50 <geist> behavioral stuff

20:50 <geist> feature bits that change this or that

20:51 <geist> it's the behavioral ones i find to be more annoying, where based on feature X if you set bit Y now you need to do sequence Z instead of W

20:51 <geist> though as is usual most are optional, so you can pick and choose

20:52 <geist> like say you dont need to use x2apic vs apic kinda stuff

20:53 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

20:53 <heat> yeah

20:53 <heat> i guess with riscv it's *usually* like that too

20:53 <heat> except with Zicbom, that's really annoying

20:53 <geist> part of the problem is so far a few of the extensions are not opt out. like if this is present you must deal with it

20:53 <geist> there's a new extension for precisely that problem actually, but i haven't seen it in place yet

20:54 <geist> lets you turn features off

20:54 <heat> the riscv platform spec(i think?) says something like "cache coherency is not a problem and is expected for UNIX kernels. if cache maintenance is required, there will be an extension present for it"

20:54 <geist> whereas x86 and arm are worried enough about forward compatibility that they almost always hide new things behind some sort of opt in bit

20:54 <heat> which sounds /ok/, but you don't know if there's a cache extension present, except if you support it

20:55 <heat> and if you don't... silent breakage all around

20:55 <geist> yah

20:55 <heat> e.g there's Zicbom, but there's also a Theadcmo or something like that

20:55 <geist> stuff like whether or not the cpu writes back to the A/D bit: you cannot opt out of that

20:55 <geist> it either does it or not, and you must deal with both paths

20:56 <geist> that ones the most annoying to me personally so far. i'd just as soon have it fall back to exceptions and then if needed write code to scan later, but in this case you dont have a choice

20:57 <heat> i had to deal with the zicbom problem personally, and it was the most annoying shit ever

20:57 <heat> because the EDK2 people want to half ass it and deal with the real problems later

20:57 <heat> and i don't quite understand the device <-> cache coherency problem well enough to really be an authority on it

20:57 <geist> yah

20:58 Shaddox404 is now known as Shaddox_AFK

20:58 <geist> yah added zicbom to zircon recently

20:58 Shaddox_AFK is now known as Shaddox404

20:58 <geist> what's making the extension explosion not get out of hand is the RVA stuff which defines these baselines and mandatory extensions

20:58 <geist> so for the most part if you follow along there and pick up the mandatory bits as the RVAs roll forward. RV... uh, what is the A

20:59 <geist> oh profiles. A is for application stuff i think

20:59 Left_Turn has quit [Ping timeout: 256 seconds]

20:59 <geist> https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva22-profiles if you're not following along

20:59 <bslsk05> github.com: riscv-profiles/profiles.adoc at main · riscv/riscv-profiles · GitHub

20:59 <nortti> do we also have RVM?

21:00 <geist> https://github.com/riscv/riscv-profiles/blob/main/rvm23-profile.adoc

21:00 <bslsk05> github.com: riscv-profiles/rvm23-profile.adoc at main · riscv/riscv-profiles · GitHub

21:00 <geist> not sure it's ratified yet, but there is a microcontroller version

21:01 antranigv has joined #osdev

21:02 <heat> actually, now that you're here geist: when do you need to maintain cache coherency explicitly?

21:02 <heat> i know there's a device tree property for it

21:02 Left_Turn has joined #osdev

21:02 <geist> between cpus or between cpus and devices?

21:02 <heat> does it depend on the device? the platform? both? the architecture? all of em?

21:02 <heat> cpus and devices

21:02 <geist> yes

21:02 <geist> i dont actually know if there's a device tree thing that says if it's coherent or not

21:03 <geist> so for example the sifive hifive and visionfive class socs *are* coherent, which is why there really isn't any cache flushing you have to do

21:03 <heat> there's dma-coherent and dma-noncoherent

21:03 <geist> theres some sort of front port AXI bus that if you run your bus mastering dma device through it, the cpu gets to snoop the transfers

21:03 <geist> yah and it makes it dma coherent, and thus you dont really need to manually flush anything. basically like x86

21:04 <geist> note this is independent of i&d cache coherency. riscv and arm (and most other arches) you have to manually sync data there, but that's known

21:04 <heat> where's it stated "this architecture is coherent by default"

21:04 <geist> it does not

21:04 <geist> it quite explicitly does not state it at all

21:04 <heat> because the device tree spec states:

21:04 <heat> "For architectures which are by default non-coherent for I/O, the dma-coherent property is used ..."

21:05 <heat> and vice-versa for the dma-noncoherent
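
The rule heat is quoting boils down to "an explicit property overrides the architecture default". A small sketch of that decision, assuming the presence of the two properties has already been parsed into booleans; `dt_node_props` and `device_is_dma_coherent()` are made-up names, and inheritance of the property from parent bus nodes is ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: whether each boolean property is present on the device node. */
struct dt_node_props {
    bool dma_coherent;      /* "dma-coherent" is present */
    bool dma_noncoherent;   /* "dma-noncoherent" is present */
};

/* Decide whether DMA to this device needs explicit cache maintenance.
 * arch_default_coherent is the part the devicetree cannot tell you:
 * the kernel has to know it per architecture or platform. */
static bool device_is_dma_coherent(const struct dt_node_props *p,
                                   bool arch_default_coherent)
{
    assert(p != NULL);

    if (p->dma_coherent)
        return true;               /* overrides a non-coherent default */
    if (p->dma_noncoherent)
        return false;              /* overrides a coherent default */
    return arch_default_coherent;  /* nothing stated: fall back to the default */
}
```

This makes the open question in the conversation visible: with neither property present, the answer is whatever `arch_default_coherent` was set to, and that value has to come from somewhere outside the devicetree.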

21:05 <heat> so... how tf do you guess?

21:05 <heat> i'm assuming the device tree spec reflects reality

21:05 <geist> right. it quite possibly is Just Known, or it may be stated that you must assume it's non coherent unless specified elsewhere

21:05 <geist> depends. which spec are you reading? if it's the original spec it probably hasn't been updated in 20 years

21:06 netbsduser has quit [Remote host closed the connection]

21:06 <geist> but if you read the arm and riscv spec it may be stated somewhere that it's non coherent by default

21:06 <geist> i just cant tell you if/where that is

21:06 netbsduser has joined #osdev

21:06 <geist> however since i know it is that way because that's how it is, i dont particularly need to find it

21:07 <geist> i think what makes it more confusing is except for very high end server chips, any given ARM device is almost certainly non-dma-coherent, so it's sort of the default state: non coherent unless proven otherwise

21:08 <geist> and if you over flush stuff you're just wasting time, but it's otherwise harmless

21:08 <geist> on riscv it seems a lot of the initial cpu clusters (by sifive in general) *are* fully coherent, so it means a lot of initial code can forget about it, and then as more cores come out that are not, it gets much more messy

21:10 Shaddox404 is now known as Shaddox_AFK

21:10 <nortti> < geist> https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva22-profiles if you're not following along ← do you know why it's only a recommendation to have an illegal instruction exception for RVA22U64?

21:11 <geist> i've never heard of one that doesnt, so i can't say why

21:11 <geist> possible there were some existing cores that dont, so this is a recommendation to try to claim it back

21:11 <geist> this profile stuff seems to be a real attempt across the riscv world to make some order out of chaos

21:12 <geist> to make things a little more confusing, it's possible for machine mode (SBI in particular) to trap and emulate instructions transparently for you, so they may, for example, just nop something it doesn't understand

21:12 <geist> in that case it's not the cpus fault, but a firmware issue. from the app developer point of view it may appear as if nothing was raised

21:13 <geist> not saying thats the thing, but possible something like that exists somewhere and this is an attempt to 'please dont do that again'

21:13 gbowne1 has joined #osdev

21:15 <geist> i'd like to tell you some of the real world riscv mess i've had to deal with over the last 6 months but i can't, but precisely this sort of nonsense does exist right now

21:16 <geist> but usual 'bring up on <thing> which is weird and nonstandard'

21:19 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

21:20 antranigv has joined #osdev

21:23 Shaddox_AFK has quit [Ping timeout: 256 seconds]

21:25 Turn_Left has joined #osdev

21:28 Left_Turn has quit [Ping timeout: 255 seconds]

21:33 GeDaMo has quit [Quit: That's it, you people have stood in my way long enough! I'm going to clown college!]

21:42 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

21:43 antranigv has joined #osdev

21:57 Shaddox_AFK has joined #osdev

21:57 Shaddox_AFK is now known as Shaddox404

21:58 <Shaddox404> heat: nah, i use OpenSUSE

21:58 <Shaddox404> I was curious since it was termed to be "different"

21:59 <kof123> eh, ask in other channels, there are people

22:03 <Shaddox404> Sure

22:05 Shaddox404 has quit [Quit: Connection Terminated.]

22:06 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

22:06 antranigv has joined #osdev

22:12 <zid> heat help, my vindaloo is REALLY hot

22:17 <zid> https://www.qualys.com/2024/01/30/cve-2023-6246/syslog.txt

22:18 netbsduser has quit [Ping timeout: 255 seconds]

22:18 <zid> cute bug

22:18 <zid> they added a +1 to a buffer size to fix a bug

22:18 <zid> now if sizeof(p) is 0 cus of a failed alloc, it no-longer falls through to the error cases, instead allocating 0+1 bytes and if(p) succeeds

22:28 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

22:29 <heat> yo

22:29 <kof123> > this looks like opengl extension hell, but architecture well it sounds like /dev/duck :D

22:30 <heat> can ppl with amd zen older than 4 build https://github.com/kernelslacker/x86info and post the output of ./x86info -c

22:30 <bslsk05> kernelslacker/x86info - x86info : x86 processor register decoder. (20 forks/37 stargazers/GPL-2.0)

22:30 <heat> i also take new intel

22:30 <heat> like 10th gen forwards is interesting

22:31 <nikolapdp> https://paste.artixlinux.org/view/846add57

22:31 <bslsk05> paste.artixlinux.org: Database Error

22:31 <nikolapdp> zen 2

22:32 <nikolapdp> heat ^

22:34 <kof123> or all the feature bits sounds like "The Thirty-Million Line Problem" .....which he argued e.g. for x86 (expansion hardware-wise, not just cpu), that was what brought innovation... this is just to say, because it is new stuff, the dust hasn't settled yet?

22:35 <nortti> heat: https://plumage.oriole.systems/pale-browed-treehunter/x86info.txt for zen 3

22:36 antranigv has joined #osdev

22:37 <heat> >L2 Instruction TLB (1G): Disabled. 0 entries.

22:37 <heat> huh, does zen3 have some weird errata?

22:37 <nikolapdp> same on zen 2: L2 Data TLB (1G): Disabled. 0 entries.

22:38 <nikolapdp> L2 Instruction TLB (1G): Disabled. 0 entries.

22:38 <heat> WOAH that's even weirder

22:39 <Mondenkind> would be interesting to bench some stuff with 1g pages and see what the actual behaviour is

22:39 <nikolapdp> can you even force 1g pages

22:39 <nikolapdp> other than writing your own os and doing it that way

22:39 <Mondenkind> if it's actually doing a full page walk for every access that would be ... bad ...

22:39 <Mondenkind> nikolapdp: pretty sure it is possible under linux. might require some hoop-jumping

22:40 <heat> yall have some really beefy fucking caches

22:40 <nikolapdp> not that i even have a gig of ram free at the moment :)

22:40 <heat> the direct map uses 1GB if possible AFAIK

22:40 <nikolapdp> heat: really

22:40 <heat> yes

22:40 <heat> here's my kabylake's

22:40 <heat> https://gist.github.com/heatd/36c2bc7cdd824b2557d0fda492578034

22:40 <bslsk05> gist.github.com: x86info-kbl · GitHub

22:42 <nikolapdp> huh differently displayed

22:42 <heat> well, the cache is different

22:43 <heat> intel CPUs (at least pre-kabylake) have a shared L2 TLB, AMD zen ones seem to be separated in iTLB and dTLB, and then separated in page size

22:43 jack_rabbit has quit [Remote host closed the connection]

22:44 jack_rabbit has joined #osdev

22:44 <nikolapdp> also mostly 8-way associative on zen

22:45 <heat> i want to see anything 10th+ gen on intel

22:45 <qookie> according to https://www.7-cpu.com/cpu/Zen2.html 1G pages use the same L2 TLB entries as 2M pages?

22:45 <bslsk05> www.7-cpu.com: AMD Zen2

22:46 <qookie> > 1-Gbyte pages smashed into 2-Mbyte pages in Data TLB L2: 2048 items. 16-way.

22:46 bauen1 has quit [Ping timeout: 256 seconds]

22:46 <nikolapdp> zen 3 had dedicated dTLB

22:46 <nikolapdp> realistically, who's mapping multigigabyte executables

22:47 <heat> it's important if you're running off of 1G pages in the kernel

22:47 <heat> e.g in the direct map

22:47 <nikolapdp> yeah that's true

22:48 gog has joined #osdev

22:48 <heat> qookie, ah, i guess that makes sense

22:48 <geist> heat: re: split TLB for different page sizes, seems to be the opposite

22:48 <geist> the L1 tlbs at least are all page sizes

22:48 bitoff has quit [Ping timeout: 256 seconds]

22:49 <heat> hrm

22:49 <heat> where do you see that?

22:49 <geist> I TLB L1 : 64 items. full-assoc, all page sizes

22:49 <geist> hmm, where is the D TLB L1....

22:50 <heat> what core?

22:50 <geist> oh zen 2

22:50 <zid> where DO you see that

22:50 <nikolapdp> yeah i don't see it

22:50 <geist> https://www.7-cpu.com/cpu/Zen2.html on this page

22:50 <geist> search what i pasted

22:51 <heat> i wonder if the 5000 line was different than the 3000

22:51 <geist> 5000 is zen3

22:51 <nikolapdp> mine is 5500u but zen2

22:51 <geist> okay, except the mobile, etc stuff

22:51 <geist> that's where the numbers get confusing, darn you AMD!

22:52 <nikolapdp> they keep changing it which is also annoying

22:52 <geist> https://en.wikichip.org/wiki/amd/microarchitectures/zen_3 is in general a better place to find this stuff

22:52 <bslsk05> en.wikichip.org: Zen 3 - Microarchitectures - AMD - WikiChip

22:53 <qookie> geist: the data TLB is talked about in a bit more detail below, split into sections for each page size

22:53 <geist> it shows the zen 3 as having all page sizes for L1i/d and then L2s do 4K/2MB, 1GB flattened

22:53 <heat> hmm, i wonder if there's a bug in x86info

22:53 <geist> which is what i generally remember, AMD has generally had multi page size TLBs

22:53 <heat> x86info from what i was told essentially flattens the cpuid data into pre-baked strings

22:53 <heat> so whatever they lifted, was lifted straight from a manual

22:53 <geist> probably at the expense of being harder to implement

22:54 <nikolapdp> but i am missing both iTLB and dTLB for 1g pages

22:54 <geist> yes. almost certainly

22:54 <nikolapdp> while zen 3 is missing only iTLB

22:54 <heat> in the L2, because they smash it into 2M

22:54 <geist> Zen 4 seems to have 1GB pages

22:54 <geist> https://en.wikichip.org/wiki/amd/microarchitectures/zen_4

22:54 <bslsk05> en.wikichip.org: Zen 4 - Microarchitectures - AMD - WikiChip

22:54 <geist> in the L1s at least

22:55 <heat> in any case YALL ARE FLOODED WITH LARGE PAGE TLB

22:55 <zid> 1GB pages evil on my cpu, got it

22:56 <heat> i don't get it, maybe there's something we're missing

22:56 <nikolapdp> zen 2: 1-Gbyte pages are smashed into 2-Mbyte entries in the L2 ITLB

22:56 <nikolapdp> from wikichip

22:56 <geist> but for example skylake has split TLBs at the L1 level, shared at L2

22:56 <geist> https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client)

22:56 <bslsk05> en.wikichip.org: Skylake (client) - Microarchitectures - Intel - WikiChip

22:56 <heat> my shit kabylake from 6 years ago has a larger L1 TLB than zen 4?

22:56 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

22:57 <nikolapdp> was the zen 4 a mobile chip?

22:57 <heat> no

22:57 <heat> zen 4 is a microarchitecture

22:57 <nikolapdp> oh so no one sent a zen 4

22:57 <nikolapdp> ok

22:58 <heat> i have mcrod's zen 4

22:58 <geist> heat: well, your kabylake has 128 4K L1i and 64 4K L1d

22:58 <geist> but they're 8 way associative instead of being fully associative and all page size

22:58 <geist> so my guess is that's sort of similar

22:59 <geist> https://en.wikichip.org/wiki/intel/microarchitectures/kaby_lake#:~:text=Kaby%20Lake%20TLB%20consists%20of%20dedicated%20L1%20TLB%20for%20instruction%20cache%20(ITLB)%20and%20another%20one%20for%20data%20cache%20(DTLB).%20Additionally%20there%20is%20a%20unified%20L2%20TLB%20(STLB).

22:59 <bslsk05> en.wikichip.org: Kaby Lake - Microarchitectures - Intel - WikiChip

22:59 <heat> hmmm, cpuid does not say that

22:59 <heat> at least not x86info

23:00 <heat> Instruction TLB: 4K pages, 8-way associative, 64 entries

23:00 <geist> also interesting, it shows the DTLB being fixed partitioned between threads

23:00 <geist> which probably means 50/50

23:00 <gog> hi

23:01 <nikolapdp> hello gog

23:01 <geist> gog

23:01 <gog> geist:

23:01 <heat> gog:

23:01 <heat> bazinga

23:01 <gog> heat:

23:01 <gog> bazel

23:02 * geist shudders

23:02 <heat> do note that cpuid can be wrong as stated in the kabylake errata

23:04 <heat> it may be that amd is just really bad at describing its TLB layout through cpuid

23:04 <gog> my boss told me i'm a good programmer

23:05 <heat> congrats gog

23:05 <nikolapdp> go you gog

23:05 <heat> do you want to see windowsified linux page tables code

23:05 <heat> it's a blatant GPLv2 violation

23:05 <gog> HPAGE

23:05 <heat> i wish

23:06 <heat> https://openfw.io/edk2-devel/20240126062919.3101691-1-lichao@loongson.cn/

23:06 <bslsk05> openfw.io: [edk2-devel] [PATCH v8 14/37] UefiCpuPkg: Add CpuMmuLib to UefiCpuPkg - Chao Li

23:06 <heat> they copied the whole weird layout, code is straight up copied and then converted

23:06 <heat> there's a SWAP_PAGE_DIR that they took from swapper_pg_dir

23:07 <heat> somehow the most GPL compliant chinese corporation

23:07 bitoff has joined #osdev

23:09 antranigv has joined #osdev

23:12 <zid> oh, heat's program is where you get it

23:13 <heat> that's not my program

23:13 <geist> greetings program

23:13 <zid> the one you linked, makes it yours

23:13 <heat> https://github.com/torvalds/linux/

23:13 <bslsk05> torvalds/linux - Linux kernel source tree (52082 forks/165002 stargazers/NOASSERTION)

23:13 <heat> yall like my new kernel

23:14 <zid> nah cus I already knew about that

23:14 <zid> heat l2english already

23:14 <heat> heatux kernel

23:15 <nikolapdp> gnu/heatux

23:15 <zid> why not heatix

23:15 <zid> everything cool is *ix

23:15 <nikolapdp> UNIX

23:16 <heat> nix

23:16 <heat> wait a minute! nix isn't cool!

23:16 <zid> onix

23:16 <heat> overrated pokemon

23:16 <zid> overrated!?

23:16 <zid> Everyone considers it total trash

23:17 <heat> i agree then

23:17 <heat> total trash

23:19 <heat> i've had an idea for a while

23:19 <heat> one could /probably/ have a linux-ish style of page table management but still enjoy a pmap interface if need be

23:19 <zid> no, you can't eat tomatoes until you turn red

23:20 <heat> like, one thing doesn't necessarily exclude the other

23:20 <heat> my big qualm with having a per-arch pmap is that a lot of it is just copy-pasted

23:20 <zid> MASSIVE IFDEFS

23:20 <zid> hundreds, in one file

23:21 <zid> Just write every possible line, in a bunch of different orders

23:21 <zid> and piece together the impl with ifdefs

23:21 <heat> because arm64 - arm32 - x86 - x86 PAE - x86_64 - riscv32 - riscv64 are basically the same shit

23:21 <heat> but with varying numbers of levels, and some tiny differences when it comes to flags

23:21 <geist> indeed, however they're different enough, with different enough optimization paths, or requirements

23:22 <geist> that trying to unify them is a Bad Idea

23:22 <heat> i... i don't know

23:22 <heat> you probably know better than me

23:22 <heat> but i think there's a way to make it work

23:22 <zid> don't trust him heat

23:22 <zid> go for the combinatorial explosion of ifdefs

23:24 <heat> you dont need many ifdefs if you do it right

23:24 <heat> geist, what kind of stuff are you thinking off?

23:24 <heat> of*

23:24 <geist> well things like the precise ordering of TLB flushes. do you batch, do you do alone?

23:24 <geist> do you need to barrier here vs there

23:25 <geist> what about A/D writeback?

23:25 <geist> how many reserved bits per level? can you use intermediate page sizes?

23:25 <geist> does the cpu support combined page sizes, what about variable page size? etc etc

23:25 <gog> yes

23:25 <heat> right

23:26 <geist> stuff just starts to combinatorially explode. but the trouble is some of the patterns result in needing to do the order differently

23:26 <geist> like splitting a page table, what precise order do you need to do it in

23:26 <geist> splitting large pages that is

23:26 <heat> idk how much linux mm you've read, if any

23:26 <geist> ASID support with TLB flushing is a gigantic PITA

23:26 <heat> but you can totally offload things to helper functions

23:26 <geist> since there's no one precise pattern that works

23:27 <geist> sure i dont doubt you can completely plow through it brute force. the Linux Way

23:27 <geist> i just dont know if the result is worth it

23:27 <geist> i'd rather have N copies of highly tuned code

23:27 <heat> and even if complexity may jump up a bit, because things aren't exactly the same, you probably get a better result than having 10 pmap impls

23:28 <geist> (X) Doubt

23:28 <heat> even in a big system like freebsd most of them don't agree in the way you can go from a page to all the mappings

23:28 <geist> OTOH, it's also worth a try :)

23:28 <geist> ARM64 is the real outlier here since its page tables are so flexible

23:29 <geist> and has some extremely careful ordering of updates, primarily because of the weak memory model

23:29 <heat> absolutely

23:29 <heat> for anything that's extremely fucky to get right/properly, you could just fork the code

23:30 <heat> that's the difference between my idea and linux's. linux's doesn't even attempt to have a pmap-ish layer

23:30 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

23:30 <geist> yeah but it also treats the page table as a first class citizen

23:30 <geist> as an upper data structure, much to the chagrin to any arch that doesn't match that model

23:31 <geist> it's a fundamental design decision that has immense ramifications

23:31 gog has quit [Quit: byee]

23:31 * heat nods

23:32 <heat> i.. i don't know, this is hard to think about

23:33 <heat> i feel like the linux page table model came up as an accident, but a happy accident because by exposing them as a first-class citizen it ends up allowing for really fast "hacks" so to speak

23:34 <heat> it was just a "look haha my hobby system can map pages" that evolved into pgd/p4d/pud/pmd/pte go brrrrrrr

23:34 antranigv has joined #osdev

23:34 <geist> yah

23:35 <geist> but then it also is only a win on architectures where it lines up. ie, x86

23:35 <heat> linux generally throws away every other kernel's very pretty abstractions that map out very nicely on a whiteboard

23:35 <heat> and it ends up winning out because of that

23:35 <nikolapdp> geist where does it not line up

23:35 <heat> sun engineering ethos vs LINUX HACKER GPL!!!!

23:35 <nikolapdp> kek

23:36 <geist> nikolapdp: POWER/PPC comes to mind. or itanium

23:36 <heat> ppc, itanium, sparc

23:36 <geist> or arches that take explicit TLB misses, or even arm32

23:36 <nortti> 0

23:36 <heat> zero

23:36 <geist> iirc arm32 has some funny dual page table thing, where for every high level page table there's a second one

23:36 <nikolapdp> so a bunch

23:36 <heat> yes

23:36 <geist> yah but, if you notice all the modern ones basically copy x86

23:36 <geist> because they know where the bread is buttered

23:37 <geist> (not that x86 invented that strategy of page table)

23:37 <heat> fwiw windows also follows this idea somehow

23:37 <geist> yeah

23:39 <geist> prototype page tables, etc

23:39 <CompanionCube> the new POWER versions have more conventional page tables, don't they?

23:40 <heat> gosh linux was a fucking accident wasn't it

23:40 <nikolapdp> absolutely

23:40 <nikolapdp> just at the right place at the right time

23:41 <heat> the UNIX people generally have some sort of disdain for linux's abstractions

23:41 <heat> i guess this is what dave cutler talked about all along

23:41 <heat> the UNIX phds

23:42 <heat> vs the Linux... unemployed BSc's?

23:42 bauen1 has joined #osdev

23:42 <heat> vs the OpenVMS demigods of course

23:42 <nikolapdp> and who won out :)

23:42 <heat> IBM AIX

23:43 Matt|home has quit [Quit: Leaving]

23:43 <nikolapdp> SOLARIS

23:44 <heat> it's remarkably funny to read the svr4 internals book and see them justify the vnode as the end-all be-all of VFS's everywhere, but then when it comes to block devices and other special files, the vnode shits itself and needs a separate special filesystem to proxy

23:44 <nikolapdp> kek

23:44 <heat> whereas the linux jank has 3 separate structs with 3 separate method table structs

23:45 <heat> but everything Just Works(tm)

23:46 <nikolapdp> good enough(tm) always wins

23:47 <heat> yeah, the jank is there for a reason

23:49 <nikolapdp> if it works it ain't stupid i guess

23:49 <heat> there's a lot of stupid stuff *and* stuff that seems stupid but isn't

23:50 <heat> like struct page is really stupid and amazingly overloaded, but it's also the smallest of all the struct pages in UNIX

23:51 <nikolapdp> lol

23:53 <heat> there's a really great hairy trick in struct page: the mapcount field is biased to -1 (so 0 maps = -1 in mapcount)

23:53 <heat> this means that it's trivial and OPTIMAL to detect state transitions between mapped and unmapped

23:54 <heat> unmapped -> mapped = overflow to 0, mapped -> unmapped = underflow to 0xffffffff
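
A userspace sketch of the biased counter, using C11 atomics rather than the kernel's own helpers. The counter stores "mappings minus one", so the first map lands exactly on 0 and the last unmap goes negative; on many CPUs both conditions fall out of the flags set by the atomic add itself, with no separate compare. The 0xffffffff in the log is the same bit pattern as -1 in a 32-bit signed counter. The struct layout and function names below are illustrative, not Linux's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct page {
    atomic_int mapcount;   /* (number of mappings) - 1, so -1 means unmapped */
};

static void page_init(struct page *p)
{
    atomic_init(&p->mapcount, -1);
}

/* Add one mapping. Returns true on the unmapped -> mapped transition. */
static bool page_add_mapping(struct page *p)
{
    /* fetch_add returns the old value: old == -1 means the result is 0 */
    return atomic_fetch_add(&p->mapcount, 1) == -1;
}

/* Drop one mapping. Returns true on the mapped -> unmapped transition. */
static bool page_remove_mapping(struct page *p)
{
    /* old == 0 means the result is -1, i.e. the counter went negative */
    return atomic_fetch_sub(&p->mapcount, 1) == 0;
}

/* The real number of mappings, with the bias undone. */
static int page_mapcount(struct page *p)
{
    return atomic_load(&p->mapcount) + 1;
}
```

The transition results are what the caller cares about: they are the points where per-page accounting (for example, a "mapped pages" statistic) has to change.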

23:54 <nikolar> Interesting

23:54 <nikolar> And very hacky

23:55 <heat> this is not a story the sun engineering department would tell you

23:55 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

23:56 <heat> solaris would have a 128-bit counter to make it mega-future-proof

23:57 <nikolar> As if we're getting 128 bit processors any time soon

23:58 <heat> matthew wilcox (from linux) estimated they would pop up around 2050/2060 IIRC

23:59 <zid> in what year did he estimate this

23:59 <zid> cus if it was anytime recent, I want what he's smoking