#osdev on 2022-05-13 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:25 eryjus has joined #osdev

00:26 sonny has quit [Ping timeout: 252 seconds]

00:43 sonny has joined #osdev

00:44 <geist> kazinsal: figured out the weird segment thing from yesterday

00:44 <kazinsal> oh nice

00:45 <geist> see linux CONFIG_VMD

00:45 <geist> it's not really nice. it's a total intel hack

00:45 <geist> basically it's a fake virtual segment used by intel 'volume management device'

00:45 <geist> that basically puts devices on some other virtual segment so it can get access to more busses and resources or some nonsense like that

00:46 <kazinsal> oh nasty

00:46 <geist> that's why it starts at segment 0x1000. that's out of the range of ACPI or something

00:46 <geist> it's functionally a root bridge to another segment

00:46 <kazinsal> I think the Intel VMD thing is for NVMe hotplug and stuff

00:46 <geist> yah makes sense. gives them a new set of 256 busses to play with

00:46 <heat> i just found out why my connection was sometimes resetting for certain connections: my router has a bug where it resets the ivp6 flow label

00:47 <heat> s/ivp6/ipv6/

00:47 <heat> turned off flow labels in linux and windows and things work

00:47 <heat> this router is total crap i'm telling ya

00:48 <heat> doesn't even speak ARP properly

00:48 <geist> oh ugh

00:48 <geist> how is it doing v6? dhcpv6 or just stateless?

00:49 <heat> i think stateless

00:49 <heat> yup

00:51 friedy10- has quit [Quit: fBNC - https://bnc4free.com]

00:51 Likorn has quit [Quit: WeeChat 3.4.1]

00:53 andreas303 has quit [Ping timeout: 252 seconds]

00:54 sonny has quit [Ping timeout: 252 seconds]

00:55 xenos1984 has quit [Read error: Connection reset by peer]

01:06 gog has quit [Ping timeout: 252 seconds]

01:14 xenos1984 has joined #osdev

01:44 andreas303 has joined #osdev

01:49 nyah has quit [Ping timeout: 240 seconds]

02:27 knusbaum has quit [Ping timeout: 276 seconds]

02:30 heat has quit [Remote host closed the connection]

02:31 heat has joined #osdev

02:43 heat has quit [Ping timeout: 252 seconds]

02:44 knusbaum has joined #osdev

02:54 gamozo has quit [Changing host]

02:54 gamozo has joined #osdev

02:55 gamozo has quit [Quit: leaving]

02:55 gamozo has joined #osdev

02:56 gamozo has quit [Client Quit]

02:56 gamozo has joined #osdev

03:20 sonny has joined #osdev

03:21 sonny has left #osdev [#osdev]

03:35 orthoplex64 has quit [Ping timeout: 276 seconds]

03:38 vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

04:10 dude12312414 has joined #osdev

04:32 hyenasky has joined #osdev

04:33 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

05:00 sikkiladho has joined #osdev

05:32 sonny has joined #osdev

05:40 Likorn has joined #osdev

05:42 zaquest has quit [Remote host closed the connection]

05:46 zaquest has joined #osdev

05:55 <sikkiladho> While going through AArch64 mmu guides, I came across this: https://developer.arm.com/documentation/101811/0102/Translation-granule, which defines size per entry based on the granule size. i.e for a 4KB granule, size per entry for Level 0 page table would be 512GB. How is this calculated?

05:55 <bslsk05> developer.arm.com: Documentation – Arm Developer

05:56 <clever> sikkiladho: an entry in the L0 table, points to a single page (4096 bytes i assume) of L1 entries

05:56 <clever> and each entry in L1 points to a single page worth of L2 entries

05:57 <clever> until you hit the deepest point (i forget)

05:57 <sikkiladho> you can have 4-level tables, l0,l1,l2,l3

05:58 <clever> so the size would then be ((pagesize / entrysize) ^ depth) * granule? i think

05:58 <clever> let me check my notes

05:59 <clever> https://github.com/librerpi/rpi-open-firmware/blob/master/docs/arm-mmu.txt#L1-L12

05:59 <bslsk05> github.com: rpi-open-firmware/arm-mmu.txt at master · librerpi/rpi-open-firmware · GitHub

06:00 <clever> so for arm32, the first level of the table is just an uint32_t[4096] (16kb), and each slot represents a 1mb chunk of the virtual space, covering the full 4gig of virtual space

06:01 hyenasky has quit [Quit: Client closed]

06:01 <clever> and strangely, an L2 is only 1kb long, a uint32_t[256], with each slot representing a 4kb page
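A quick sanity check of the arm32 numbers clever quotes, as a minimal sketch. The constant names are made up for illustration; the sizes are the ones from the messages above (4-byte entries, 4096 L1 slots, 256 L2 slots of 4 KB each).

```python
ENTRY_SIZE = 4            # bytes per uint32_t descriptor
L1_SLOTS = 4096           # slots in the first-level table
L2_SLOTS = 256            # slots in a second-level table
PAGE = 4 * 1024           # bytes mapped by one L2 slot

l2_covers = L2_SLOTS * PAGE          # one full L2 table spans exactly one L1 slot
l1_covers = L1_SLOTS * l2_covers     # the whole first-level table

l1_table_bytes = L1_SLOTS * ENTRY_SIZE
l2_table_bytes = L2_SLOTS * ENTRY_SIZE

print(l2_covers // 2**20, "MB per L1 slot")
print(l1_covers // 2**30, "GB of virtual space in total")
print(l1_table_bytes // 1024, "KB L1 table,", l2_table_bytes, "byte L2 table")
```

The 1 KB L2 table falls straight out of 256 slots times 4 bytes, which is why it is smaller than a page.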

06:01 <clever> and did i name L1 and L2 right in these notes??

06:01 <clever> comparing that to the link you pasted...

06:04 <clever> scrolling down a bit, they have an example of how a 48bit address is cut up into 5 parts, and each part is an index into a table

06:04 <clever> the first part is bits 47:39, a 9bit int, so 512 slots in the L0, at 4k granules

06:05 <clever> and with 64bit slots, uint64_t[512], thats 4096 bytes for the entire L0 table
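The 48-bit split clever describes can be sketched in Python as follows. `split_va_4k` is an illustrative helper, not something from the Arm guide, and it assumes the 4 KB granule with four 9-bit indexes above a 12-bit page offset.

```python
def split_va_4k(va):
    """Split a 48-bit virtual address into (L0, L1, L2, L3, offset) for a 4 KB granule."""
    offset = va & 0xFFF              # bits 11:0, byte offset inside the page
    l3 = (va >> 12) & 0x1FF          # bits 20:12
    l2 = (va >> 21) & 0x1FF          # bits 29:21
    l1 = (va >> 30) & 0x1FF          # bits 38:30
    l0 = (va >> 39) & 0x1FF          # bits 47:39
    return l0, l1, l2, l3, offset

# Adding 1 << 39 to an address only bumps the L0 index.
print(split_va_4k(0))
print(split_va_4k(1 << 39))
print(split_va_4k(0x80001234))
```

Each index is 9 bits wide because a 4096-byte table of 8-byte entries holds 512 slots.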

06:06 <clever> oh, wait

06:06 <clever> > Math.pow(2,39)/1024/1024/1024

06:06 <clever> 512

06:06 <geist> the way i think about it is you take the log2 of each size

06:06 <clever> sikkiladho: the difference between slot0 and slot1 in the L0 table, is just +1 in bit 39 of the addr

06:06 <geist> ie, 12 bits for 4K pages

06:06 <geist> so the page tables then cover 12 + 9 + 9 + 9 + 9 bits of address space

06:07 <geist> reason for 9 is each page table entry uses 8 bytes, so that shifts 3 off the log2

06:07 <geist> okay this isn't clear, but once you grok it the math is simple

06:07 <geist> [9][9][9][9][12] kinda

06:07 <geist> so for 16k pages it's [11][11][11][11][14] and so on

06:08 <geist> but that's basically where they get those 'bits used to index' from

06:09 <geist> each level adds 9 more bits (for 4k base page granule)
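geist's log2 rule of thumb can be written as a small Python helper. This is a sketch, not architecture reference code: `index_layout` and its parameters are illustrative names, and it assumes a 48-bit virtual address, 8-byte entries, and a root level that simply takes whatever bits are left over. Under those assumptions the 16K case gets a narrower root than the [11][11][11][11][14] shorthand suggests.

```python
def index_layout(granule_bytes, va_bits=48, entry_bytes=8):
    """Return index widths from root to leaf, followed by the page-offset width."""
    offset_bits = granule_bytes.bit_length() - 1                # log2 of the granule
    per_level = offset_bits - (entry_bytes.bit_length() - 1)    # log2(entries per table)
    remaining = va_bits - offset_bits
    widths = []
    while remaining > 0:
        take = min(per_level, remaining)
        widths.insert(0, take)     # leaf levels fill first, the root gets the leftover
        remaining -= take
    return widths + [offset_bits]

for granule in (4096, 16384, 65536):
    print(granule, index_layout(granule))
```

With a 64 KB granule only three index fields are needed, which is consistent with the observation further down that L0 does not exist in that configuration.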

06:09 <clever> are my notes right, about arm32 only having L1 and L2?

06:09 <clever> it feels weird now, that it doesnt start at L0

06:09 <geist> unless you enable PSE

06:10 <geist> i dont like the way they number things, but so it goes

06:10 <geist> i generally prefer to number the root L0 and count down, and if you only have two levels you only get to L1, etc

06:10 <geist> but they number them as if the terminal layer is always L3, i guess

06:10 <geist> at least in that doc

06:10 <clever> ah

06:10 <clever> and yeah, i see similar in the aarch64 doc sikkiladho linked, with 64k granules, the L0 doesnt exist

06:11 <geist> i think that's actually codified in the arch, because the ESR has a field that actually tells you what level a permission check failed at

06:11 <geist> so they had to define some numbering scheme

06:12 <clever> is PSE like LPAE? supporting 64bit phys on 32bit virt?

06:13 <Mutabah> Kinda iirc

06:13 <Mutabah> I think it forces big pages and overloads bits 12-16 as extra address bits?

06:14 <clever> oh, that would also be why i cant find it in the v7 docs

06:14 <clever> because armv7 is pure 32bit

06:14 <geist> oh LPAE, that's right. sorry

06:14 <geist> got x86 mixed in there

06:14 <geist> LPAE in arm == PAE in x86

06:14 <clever> ahh

06:15 <Mutabah> Is PSE also an ARM thing? I was describing x86's "PSE"

06:15 <geist> armv7 has LPAE extensions, much like how x86-32 has PAE extensions

06:15 <geist> ie, you only get 32bit of address space but a larger physical space, by increasing each entry to 8 bytes and thus you now need 3 levels of page tables to fit it in, etc
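Why doubling the entry size adds a level can be checked with a few lines of Python. This is a simplified model: `levels_needed` is a made-up helper that assumes every table is exactly one page, which fits classic x86-32 paging and the 8-byte PAE/LPAE formats but not the arm32 short-descriptor tables from clever's notes (16 KB L1, 1 KB L2).

```python
def levels_needed(va_bits, page_bytes, entry_bytes):
    """Number of table levels when every table is exactly one page in size."""
    offset_bits = page_bytes.bit_length() - 1
    index_bits = (page_bytes // entry_bytes).bit_length() - 1
    to_translate = va_bits - offset_bits
    return -(-to_translate // index_bits)    # ceiling division

print(levels_needed(32, 4096, 4))   # 4-byte entries: 10 index bits per level
print(levels_needed(32, 4096, 8))   # 8-byte entries: 9 index bits per level
print(levels_needed(48, 4096, 8))
```

With 8-byte entries each level resolves only 9 bits, so the 20 bits above the page offset no longer fit in two levels.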

06:15 <clever> and LPAE is how raspi-os has gotten away with shipping a 32bit everything on devices with 8gig of ram

06:15 <geist> right

06:16 <geist> Mutabah: yah sorry, keep screwing up the names. PSE is iirc rarely used. was a temporary hack until PAE came along

06:16 <clever> the pi4 also has 2 modes for peripheral io

06:17 <clever> "high peripherals" mode puts the MMIO up at a 64bit only access, so you dont get a hole in your ram

06:17 <clever> but now an aarch32 kernel cant touch MMIO until it enables LPAE

06:17 <clever> so the default is "low peripherals" mode, which puts MMIO at the top of the 32bit addr space, creating a hole nearly dead-center in your 8gig of ram

06:18 <clever> but now a 32bit kernel can touch MMIO before the MMU is on, and isnt forced to use LPAE

06:18 * geist 's head hurts with even more stupid rpi4 shit

06:18 <geist> you're almost proud of how stupid that thing is aren't you?

06:18 the_lanetly_052 has joined #osdev

06:18 <clever> its probably the stockholm, lol

06:18 <geist> like 'hey look at this dumpster fire i keep warming me hands to! if you put truck tires in it vs regular car tires the smoke is pretty!'

06:19 <clever> how would you have designed that?

06:19 <clever> put a hole in ram? put all ram after mmio? screw 32bit?

06:19 <geist> easy: put the peripherals at or around 0, start RAM at a higher address and cross right over 4GB

06:19 <geist> that's how basically all modern SOCs do it now

06:20 <clever> ah, yeah, thats simple enough

06:20 <clever> i think the problem is that the rpi reset vector is 0, and they didnt want to fix that

06:20 <clever> so ram must start at 0

06:20 <geist> limits your mmio space to say a GB or so, but if you have a really fancy thing you probably have PCI or whatnot, and you can put a second aperture > 4GB if you want

06:20 <geist> easy: put a rom at 0, turn it off when you're done

06:20 <geist> say 64MB of rom then peripherals, and you can start RAM at say 0x4000.0000 (1GB) and you have all that space

06:20 <clever> pretty sure thats almost exactly what the amiga is doing

06:20 <geist> that's precisely what the virt machine does

06:21 <clever> the rom is aliased to 0 during reset, and once the bootrom is in control, it knows where the true copy lives in the addr space

06:22 <geist> also iirc 68k has a high starting vector, i think

06:22 <geist> so you could put your rom in the high part of the address. or maybe the other way? actually now that i think about it maybe not

06:22 <clever> from what i heard, the 68k loads a pair of uint32_t from addr 0 and 4

06:22 <clever> and those become the initial SP and PC

06:22 <clever> kinda sounds like cortex-m?

06:23 <clever> the rom exists at some fixed higher addr, and is aliased to 0 temporarily, for that reset vector to find it

06:23 <geist> yes cortex-m has the exact same thing

06:23 <clever> but the initial PC is aware of that higher addr, and jumps directly to the non-aliased copy
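The "pair of uint32_t at addr 0 and 4" convention is easy to illustrate. The sketch below decodes the first 8 bytes of a memory image as (initial SP, initial PC); `reset_vectors` and both example images are invented for illustration. It assumes big-endian words for the 68k and little-endian words for Cortex-M.

```python
import struct

def reset_vectors(image, byteorder):
    """Return (initial_sp, initial_pc) from the first 8 bytes of a memory image."""
    fmt = ">II" if byteorder == "big" else "<II"
    return struct.unpack_from(fmt, image, 0)

# Hypothetical images: the same SP value, stored in each CPU's byte order.
m68k_image = bytes.fromhex("20001000" "00f80008")
cortex_m_image = bytes.fromhex("00100020" "09000008")

for name, image, order in (("68k", m68k_image, "big"),
                           ("cortex-m", cortex_m_image, "little")):
    sp, pc = reset_vectors(image, order)
    print(f"{name}: sp={sp:#010x} pc={pc:#010x}")
```

In the made-up 68k image the PC already points at a high address, which is the trick described above: the ROM only needs to be aliased at 0 long enough for these two words to be fetched.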

06:23 <geist> and i think those both came from VAX. iirc

06:23 <geist> the double entry thing

06:24 <clever> but ive also heard that the amiga bootstrap rom doesnt use the initial SP at all

06:24 <clever> _start just loads sp like you would on any other platform, and that 32bit slot is instead used as some kind of version number

06:24 <geist> yah cortex-m explicitly doesn't push anything on the reset vector since there's nothing to save

06:24 <geist> and thus the SP doesn't really need to be valid. it's just a nice to have so your reset vector can be written in C

06:24 <clever> so you're free to ignore the initial SP

06:25 <clever> yep

06:26 <clever> i'm also thinking, about how i could modify the rpi, to behave less like a dumpster fire, lol

06:26 <clever> on VC4 era, there is a dedicated mmu with 64 x 16mb pages, that translates "arm physical" to real ram

06:27 <clever> so i could put some ram in the 1st 16mb slot, then 32mb of mmio, then 976mb of ram

06:27 <geist> actually no VAX doesn't do the reset that way, so really the cortex-m in this case is basically copying 68k

06:27 <clever> (that model range only supports 1gig max)

06:27 <clever> and then treat that first 16mb of ram as a boot rom

06:28 <clever> so the main block of ram begins at +48mb

06:28 <clever> but with no ability to map things above the 1gig point in the physical space, the higher you move ram, the more you lose
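clever's proposed layout can be checked against the 64 x 16 MB window with a short Python sketch. The region names and the `place` helper are illustrative only; the sizes are the ones quoted in the chat.

```python
SLOT_MB = 16
SLOTS = 64

# (name, size in MB) in ascending address order, as proposed in the chat
layout = [("boot ram", 16), ("mmio", 32), ("main ram", 976)]

def place(regions):
    """Assign a start offset (in MB) and a slot count to each region."""
    out, cursor = [], 0
    for name, size in regions:
        assert size % SLOT_MB == 0, "regions must be whole 16 MB slots"
        out.append((name, cursor, size // SLOT_MB))
        cursor += size
    return out, cursor

placed, total_mb = place(layout)
for name, start, slots in placed:
    print(f"{name}: starts at +{start} MB, {slots} slot(s)")
print("total:", total_mb, "MB of", SLOT_MB * SLOTS, "MB window")
```

The three regions exactly fill the 1 GB window, so any extra MMIO or any gap before RAM comes directly out of usable memory.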

06:29 <clever> i need to figure out how the pi4 handles 8gig, and how i could coerce it into being more normal

06:31 <geist> arm64 got no problem with that

06:31 <geist> 64bit and move on with things

06:31 <clever> more, about how high/low peripheral works, and can i move it even lower, all the way to 0?

06:32 <geist> get a better SOC?

06:32 <clever> never! lol

06:33 GreaseMonkey has quit [Remote host closed the connection]

06:33 <geist> moving physical stuff around is not really in your list of things you can do

06:34 <clever> given that past models had a dedicated mmu and i could change the phys addr of mmio anywhere i want....

06:34 <clever> it may still exist on the bcm2711

06:37 <clever> i see signs that it does

06:38 <clever> but it only has room for 1gig, same as before

06:42 vdamewood has joined #osdev

06:44 bauen1 has quit [Ping timeout: 256 seconds]

06:46 vinleod has joined #osdev

06:49 vdamewood has quit [Ping timeout: 240 seconds]

06:50 vinleod is now known as vdamewood

06:50 <sikkiladho> I think I finally got it. There are 4 tables in AArch64. L0,L1,L2 and L3. If we use 4KB granule, it means each entry in L3 can cover 4KB. There are 9(20:12) bits to L3 table index. Therefore, there are 2^9=512 slots. Whole L3 table can cover 512*4KB=2MB. L2 table can point to each L3 table. Therefore each entry in L2 can point 2MB and so on to L1 and L0. That's how it is calculated! Thank you.

06:51 Likorn has quit [Quit: WeeChat 3.4.1]

06:51 <geist> yep!

06:51 <geist> and then the math follows from there if you use non 4k page granules

06:51 <geist> everything is just shifted over
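sikkiladho's summary answers the original 512 GB question, and the arithmetic can be reproduced in Python. `coverage_per_entry` is a hypothetical helper that assumes every level uses full, page-sized tables of 8-byte entries.

```python
def coverage_per_entry(granule_bytes=4096, levels=4, entry_bytes=8):
    """Bytes mapped by one entry at each level, root (L0) first."""
    entries = granule_bytes // entry_bytes        # slots per table
    return [granule_bytes * entries ** (levels - 1 - lvl) for lvl in range(levels)]

for lvl, size in enumerate(coverage_per_entry()):
    print(f"L{lvl} entry covers {size} bytes (2^{size.bit_length() - 1})")
```

Each step up multiplies the coverage by 512, so an L3 entry maps 4 KB, an L2 entry 2 MB, an L1 entry 1 GB, and an L0 entry 512 GB.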

06:53 <clever> -rwxr-xr-x 1 root root 1.1K Feb 8 09:09 /boot/overlays/highperi.dtbo

06:53 <clever> oh, thats just cheating, lol

06:54 <clever> when you tell the firmware to move the peripherals, in addition to moving them, it just applies this DT overlay

06:54 <clever> and all its doing is patching the ranges= and dma-ranges= in a few spots

06:58 sonny has left #osdev [#osdev]

07:16 pretty_dumm_guy has quit [Quit: WeeChat 3.5]

07:24 <moon-child> Q: what's the point of 5-level paging, or of arm's 52-bit address space extension?

07:25 <Mutabah> More address space?

07:26 <moon-child> yeah but why? What's it for?

07:26 <moon-child> I can't imagine people are mmapping files at that scale...

07:26 <Mutabah> Who doesn't want more AS? :)

07:27 <moon-child> no but seriously what's the demand?

07:27 <Mutabah> Probably future-proofing

07:27 <Mutabah> so people can do massive memory-maps if needed

07:28 <moon-child> yeah but again: why would you need that?

07:28 * kingoffrance .oO( https://johnhartstudios.com/bc/2017/05/28/sunday-may-28-2017/ )

07:28 <bslsk05> johnhartstudios.com: Sunday May 28, 2017 - B.C. Comic Strip

07:32 <geist> moon-child: arm does not do 5 level paging

07:32 <geist> x86 has a new 5 level paging extension, and it extends the aspace out to 57 bits

07:32 <geist> the point of that is fairly obvious: 57 > 48 bits

07:33 <moon-child> geist: yeah didn't mean to imply arm had 5lp. Hence the 'or' separating the two

07:33 <moon-child> I mean 128 > 64 but no one is making 128-bit cpus...

07:33 <geist> but yeah what use cases stuff has for that? I dunno, but i'm sure some folks do

07:34 aejsmith has quit [Remote host closed the connection]

07:36 aejsmith has joined #osdev

07:49 <Griwes> 48 bits is only 256 TiB, and with machines actually having TiBs of RAM per box and stuff like RDMA, that is... getting crammed, at least for HPC use cases

07:52 <moon-child> oh yeah rdma

07:52 <moon-child> mesh stuff

07:52 <moon-child> makes sense
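For a sense of scale, the address-space sizes mentioned in this exchange can be computed directly. This is plain arithmetic; the helper name is made up, and the 12 + 9 per level relation assumes 4 KB pages with 8-byte entries.

```python
def va_size_tib(bits):
    """Size of a flat address space of the given width, in TiB."""
    return 2 ** (bits - 40)

for bits in (48, 52, 57):
    print(f"{bits} bits -> {va_size_tib(bits)} TiB")

# With 4 KB pages each extra table level adds 9 bits on top of the 12-bit offset.
four_level_bits = 12 + 9 * 4
five_level_bits = 12 + 9 * 5
print(four_level_bits, five_level_bits)
```

Going from four to five levels multiplies the 256 TiB figure by 512.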

08:09 vdamewood has quit [Read error: Connection reset by peer]

08:10 vdamewood has joined #osdev

08:22 XgF has quit [Remote host closed the connection]

08:23 XgF has joined #osdev

08:30 archenoth has joined #osdev

08:31 Oshawott has quit [Ping timeout: 240 seconds]

08:39 toluene has quit [Ping timeout: 260 seconds]

08:42 toluene has joined #osdev

08:49 vdamewood has quit [Quit: Life beckons]

08:54 nick64 has joined #osdev

09:03 GeDaMo has joined #osdev

09:35 Likorn has joined #osdev

09:36 the_lanetly_052_ has joined #osdev

09:38 the_lanetly_052 has quit [Ping timeout: 240 seconds]

10:01 gog has joined #osdev

10:13 nyah has joined #osdev

10:21 kingoffrance has quit [Ping timeout: 240 seconds]

10:41 bauen1 has joined #osdev

10:42 pretty_dumm_guy has joined #osdev

10:52 Likorn has quit [Quit: WeeChat 3.4.1]

11:03 nick64 has quit [Quit: Connection closed for inactivity]

11:35 kingoffrance has joined #osdev

12:52 Dyskos has joined #osdev

12:52 <mrvn> We have customers with TiB of ram at work. With 48 bit == 256 TiB half of that goes to user space, half to kernel. If you want a phys map for easy page table manipulations then that's half again. So 64TiB of ram before you run into problems.

12:53 <mrvn> Suddenly it's not so big an address space.
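mrvn's budget is two halvings, which a short Python sketch makes explicit. `direct_map_limit` is an illustrative name, and the even user/kernel split and the "half of the kernel half" direct map are the assumptions stated in the message, not a property of any particular kernel.

```python
TIB = 1 << 40

def direct_map_limit(va_bits):
    """RAM (in bytes) that fits in a physical direct map under the split above."""
    total = 1 << va_bits
    kernel_half = total // 2      # the other half belongs to user space
    return kernel_half // 2       # half of the kernel half is given to the direct map

for bits in (48, 57):
    print(f"{bits}-bit VA: {direct_map_limit(bits) // TIB} TiB of directly mapped RAM")
```

Under the same assumptions a 57-bit address space moves the limit from 64 TiB to 32 PiB.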

13:16 rorx has quit [Ping timeout: 246 seconds]

13:16 sikkiladho has quit [Quit: Connection closed for inactivity]

14:22 basil has quit [Ping timeout: 240 seconds]

14:25 nick64 has joined #osdev

14:25 Likorn has joined #osdev

14:29 <FireFly> funky

14:31 heat has joined #osdev

14:34 rorx has joined #osdev

14:35 sonny has joined #osdev

14:36 basil has joined #osdev

14:40 the_lanetly_052_ has quit [Ping timeout: 260 seconds]

15:31 mahmutov has joined #osdev

15:50 dude12312414 has joined #osdev

16:16 sonny has quit [Ping timeout: 252 seconds]

16:22 <mrvn> grrr, why is there no std::span<T>::at(size_t index) that does range checking?

16:39 PapaFrog has joined #osdev

16:40 LostFrog has quit [Ping timeout: 240 seconds]

16:49 sonny has joined #osdev

17:01 <heat> because c++ is, above all things, consistent

17:02 <mrvn> heat: std::vector::[] has no check, std::vector::at() has check. How is that consistent with std::span?

17:03 <heat> i r o n y

17:03 <heat> ;)

17:03 <GeDaMo> "it's like bronzy or goldy but it's made of iron" :P

17:04 <mrvn> Is class A { int x; } class B : A { int y; } still an aggregate class?

17:05 heat has quit [Remote host closed the connection]

17:05 <bauen1> mrvn: probably, you can check with https://en.cppreference.com/w/cpp/types/is_aggregate

17:05 <bslsk05> en.cppreference.com: std::is_aggregate - cppreference.com

17:06 heat has joined #osdev

17:07 <bauen1> mrvn: it is if you make all members and inheritance public

17:07 <mrvn> godbolt confirms that

17:29 puck has quit [Excess Flood]

17:29 puck has joined #osdev

17:51 <mrvn> hehe, now isn't that a brilliant use of multithreading? std::thread t(func);

17:51 <mrvn> t.join();

18:41 <heat> it's just letting the main thread's cpu rest

18:41 <heat> good programming if you ask me

18:50 <mrvn> Assuming the thread isn't running on any core then t.join() could mark the thread to destruct itself and run the code in the main thread.

19:03 gog has quit [Ping timeout: 240 seconds]

19:07 sikkiladho has joined #osdev

19:20 <Griwes> that'd be observable if you use tls

19:20 <mrvn> Griwes: you would have to set the tls reg to the thread till the thread exits.

19:20 <geist> true but why would you run it in the main thread? that would defeat the purpose

19:20 <Griwes> mrvn, what if the main thread has already shared a pointer to, say, an atomic in its tls with other threads that would want to access it concurrently with the call to join?

19:20 <mrvn> geist: save 2 context switches

19:20 <geist> well sure, but the point is to have a thread. if you want some sort of coroutine thing then build it that way

19:20 <mrvn> Griwes: then nothing. that keeps working

19:20 <Griwes> wdym by "then nothing"?

19:20 <Griwes> for the record, you cannot tell if that is the case by doing anything short of full program analysis

19:20 <mrvn> Griwes: why should the address become invalid?

19:20 <Griwes> ah, you mean swap the *tls* itself

19:20 <mrvn> push and pop it

19:20 <Griwes> I feel like that still has problems

19:20 <heat> meaningless optimizations

19:20 <heat> a totally non-trivial amount of work for a meaningless optimization

19:21 <mrvn> but those are always such fun
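Griwes's point that running the function on the joining thread would be observable through thread-local storage can be demonstrated with Python's `threading.local`, used here as a stand-in for C++ `thread_local`. This is only an analogy for the start-then-join pattern under discussion; the names `tls`, `owner`, and `func` are invented for the example.

```python
import threading

tls = threading.local()

def func(results):
    # Thread-local state is per thread: a value set by main is not visible elsewhere.
    results.append(getattr(tls, "owner", None))

tls.owner = "main"

# The pattern from the chat: start a thread, then join it immediately.
in_thread = []
t = threading.Thread(target=func, args=(in_thread,))
t.start()
t.join()

# The "optimized" version: just call the function on the current thread.
inline = []
func(inline)

print("in a real thread:", in_thread)
print("called inline:   ", inline)
```

The two runs see different thread-local values, so the shortcut is only invisible if the runtime also swaps the thread-local context, which is the push-and-pop idea mrvn suggests.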

19:23 <mrvn> at least in the kernel the join() syscall can switch to the thread the code is waiting on for the remainder of the original thread's timeslice.

19:24 <mrvn> (or the thread's if that is bigger)

19:25 <heat> there's no join syscall

19:25 <heat> it's a futex

19:25 <heat> it just blocks

19:26 <heat> on thread exit the futex gets woken up (it's set on thread spawning time)

19:27 <heat> on linux that is, dunno about other OSes

19:28 <mrvn> same thing there. If a futex blocks wake up the thread holding the futex with the remainder of the timeslice.

19:41 Likorn has quit [Quit: WeeChat 3.4.1]

19:41 Likorn has joined #osdev

19:48 sonny has quit [Quit: Client closed]

19:54 nick64 has quit [Quit: Connection closed for inactivity]

19:56 sebonirc has quit [Remote host closed the connection]

19:56 sebonirc has joined #osdev

20:05 sebonirc has quit [Remote host closed the connection]

20:06 sebonirc has joined #osdev

20:26 Dyskos has quit [Quit: Leaving]

20:28 GeDaMo has quit [Quit: There is as yet insufficient data for a meaningful answer.]

20:33 sonny has joined #osdev

20:34 sonny has left #osdev [#osdev]

21:13 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

21:17 sikkiladho has quit [Quit: Connection closed for inactivity]

21:20 DanDan has quit [Ping timeout: 260 seconds]

21:31 DanDan has joined #osdev

21:59 vdamewood has joined #osdev

21:59 xenos1984 has quit [Read error: Connection reset by peer]

22:16 xenos1984 has joined #osdev

22:29 Ali_A has joined #osdev

22:42 Ali_A has quit [Quit: Connection closed]

22:55 Ali_A has joined #osdev

23:11 nyah has quit [Ping timeout: 240 seconds]

23:14 heat has quit [Remote host closed the connection]

23:25 sikkiladho has joined #osdev