#osdev on 2022-10-21 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:03 thinkpol has joined #osdev

00:07 dude12312414 has quit [Remote host closed the connection]

00:11 dude12312414 has joined #osdev

00:12 nyah has quit [Ping timeout: 252 seconds]

00:39 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

01:30 eck has quit [*.net *.split]

01:30 IRChatter has quit [*.net *.split]

01:31 IRChatter has joined #osdev

01:31 eck has joined #osdev

01:53 smach has joined #osdev

02:03 heat has quit [Ping timeout: 246 seconds]

02:29 [itchyjunk] has quit [Remote host closed the connection]

02:35 pretty_dumm_guy has quit [Quit: WeeChat 3.5]

03:15 carbonfiber has quit [Quit: Connection closed for inactivity]

03:18 SpikeHeron has joined #osdev

03:47 vdamewood has joined #osdev

04:06 Guest5920 has joined #osdev

04:19 wand has quit [Remote host closed the connection]

04:20 scoobydoo_ has joined #osdev

04:21 scoobydoo has quit [Ping timeout: 252 seconds]

04:21 scoobydoo_ is now known as scoobydoo

04:31 scoobydoo_ has joined #osdev

04:33 scoobydoo has quit [Ping timeout: 252 seconds]

04:33 scoobydoo_ is now known as scoobydoo

04:46 wand has joined #osdev

05:26 fatal1ty has joined #osdev

05:32 garineko has quit [Quit: Connection closed for inactivity]

06:38 wand has quit [Ping timeout: 258 seconds]

06:47 scoobydoo has quit [Ping timeout: 252 seconds]

06:48 scoobydoo has joined #osdev

07:11 wand has joined #osdev

07:26 vdamewood has quit [Read error: Connection reset by peer]

07:27 vdamewood has joined #osdev

07:36 scoobydoo has quit [Ping timeout: 260 seconds]

07:37 scoobydoo has joined #osdev

07:54 scoobydoo has quit [Ping timeout: 246 seconds]

07:54 scoobydoo has joined #osdev

08:11 vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

08:21 nyah has joined #osdev

08:27 nyah has quit [Quit: leaving]

08:30 Burgundy has joined #osdev

08:32 crm has joined #osdev

08:35 orthoplex64 has quit [Ping timeout: 272 seconds]

08:37 CompanionCube has quit [Ping timeout: 248 seconds]

08:39 CompanionCube has joined #osdev

08:43 identitas has quit [Quit: Bridge terminating on SIGTERM]

08:43 Irvise_ has quit [Quit: Bridge terminating on SIGTERM]

08:43 chibill has quit [Quit: Bridge terminating on SIGTERM]

08:43 Maja[m] has quit [Quit: Bridge terminating on SIGTERM]

08:48 Maja[m] has joined #osdev

09:01 terminalpusher has joined #osdev

09:14 identitas has joined #osdev

09:14 chibill has joined #osdev

09:14 Irvise_ has joined #osdev

09:27 awita has joined #osdev

09:31 Burgundy has quit [Ping timeout: 252 seconds]

09:46 GeDaMo has joined #osdev

09:51 lg is now known as notwhonorttiwant

09:59 notwhonorttiwant is now known as lg

10:05 tomaw has quit [Quit: Quitting]

10:21 tomaw_ has joined #osdev

10:22 tomaw_ has quit [Client Quit]

10:28 awita has quit [Remote host closed the connection]

10:30 tomaw has joined #osdev

10:42 Burgundy has joined #osdev

10:44 darkstardevx has quit [Remote host closed the connection]

10:46 pretty_dumm_guy has joined #osdev

10:46 darkstardevx has joined #osdev

10:53 Celelibi has quit [Ping timeout: 246 seconds]

11:06 Celelibi has joined #osdev

11:11 axis9 has joined #osdev

11:16 pretty_dumm_guy has quit [Quit: WeeChat 3.5]

11:18 SGautam has joined #osdev

11:31 lanodan has quit [Ping timeout: 264 seconds]

11:32 lanodan has joined #osdev

11:43 bauen1 has quit [Ping timeout: 272 seconds]

11:43 bauen1 has joined #osdev

11:43 sortie has quit [Ping timeout: 272 seconds]

11:50 poyking16 has joined #osdev

11:54 sortie has joined #osdev

11:55 axis9 has quit [Read error: Connection reset by peer]

11:58 terminalpusher has quit [Remote host closed the connection]

12:07 epony has quit [Quit: QUIT]

12:11 axis9 has joined #osdev

12:17 axis9 has quit [Read error: Connection reset by peer]

12:19 xvmt has quit [Remote host closed the connection]

12:22 xvmt has joined #osdev

12:50 archenoth has joined #osdev

12:52 SpikeHeron has quit [Quit: WeeChat 3.7]

13:12 axis9 has joined #osdev

13:19 heat has joined #osdev

13:40 KaitoDaumoto has joined #osdev

13:40 heat has quit [Read error: Connection reset by peer]

13:40 heat_ has joined #osdev

13:41 Burgundy has left #osdev [#osdev]

13:43 Burgundy has joined #osdev

13:46 axis9 has quit [Remote host closed the connection]

13:50 heat_ is now known as heat

13:56 epony has joined #osdev

13:58 fatal1ty has quit [Ping timeout: 255 seconds]

13:59 wootehfoot has joined #osdev

14:04 <marshmallow> I have a question. does each process have both page tables recording the translation from virtual to physical addresses, and also page tables containing that process's mm_struct?

14:06 gildasio has quit [Remote host closed the connection]

14:06 wand has quit [Ping timeout: 258 seconds]

14:07 gildasio has joined #osdev

14:07 nvmd has joined #osdev

14:09 SpikeHeron has joined #osdev

14:13 wand has joined #osdev

14:17 SGautam has quit [Quit: Connection closed for inactivity]

14:21 smach has quit [Ping timeout: 246 seconds]

14:31 <heat> what do you mean with page tables containing the mm_struct

14:35 Burgundy has quit [Remote host closed the connection]

14:36 Burgundy has joined #osdev

14:45 <marshmallow> like, where does exactly mm_struct of each process live?

14:46 <marshmallow> kernel's page tables?

14:46 <heat> ok so mm_struct is kernel memory

14:47 <heat> but kernel memory /usually/ lives in every process' page tables

14:48 <heat> imagine you're dividing the address space 50/50, and the top page table has 4 entries; the bottom 2 are for user stuff, and the rest is always mapped/reserved for the kernel

14:48 <clever> until cpu bugs force you to not do that

14:48 <heat> this is the simple design
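
A minimal sketch of the 50/50 split heat describes, assuming a toy 64-bit address space whose top-level table has exactly 4 entries, so the top 2 address bits pick the entry. The names `TOP_ENTRIES`, `TOP_SHIFT`, `top_index`, and `is_kernel_address` are made up for illustration; real x86-64 or arm64 layouts use far more entries and non-canonical holes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy layout: the top-level table has 4 entries, so the
 * top 2 bits of a 64-bit virtual address select the entry. */
#define TOP_ENTRIES 4u
#define TOP_SHIFT   62u

/* Index of the top-level entry that covers vaddr. */
static unsigned top_index(uint64_t vaddr)
{
    unsigned idx = (unsigned)(vaddr >> TOP_SHIFT);
    assert(idx < TOP_ENTRIES);
    return idx;
}

/* Entries 0-1 belong to the process; entries 2-3 are the kernel half,
 * which the simple design maps identically in every process. */
static bool is_kernel_address(uint64_t vaddr)
{
    return top_index(vaddr) >= TOP_ENTRIES / 2;
}
```

In this model a context switch only has to replace entries 0 and 1; the kernel entries can point at the same lower-level tables in every process.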

14:48 <heat> yes

14:48 <heat> meltdown et al made KPTI

14:48 <heat> that has some trickery and now there's a shadow page table for each process (with user + kernel mapped) and the user page table which has user + trampoline

14:49 <heat> that trampoline deals with switching to the real page tables

14:49 <clever> does x86 have separate pointers to the user and kernel tables? (pre meltdown)

14:50 <heat> no

14:50 <heat> never had

14:50 <marshmallow> alright, and what exactly does it mean that kernel unmaps its address space when returning to EL0/ring3?

14:50 <heat> probably never will

14:50 <clever> ah, thats one region where i can see arm as being better, TTBR0/TTBR1

14:50 <heat> it doesn't unmap anything

14:50 <marshmallow> are kernel page tables still there?

14:50 <clever> yes

14:50 <clever> there is a bit in the tables, saying if its user or kernel memory

14:51 <heat> idk are you talking about meltdown mitigations or what?

14:51 <marshmallow> heat: spectre/meltdown mitigations on arm64, yes

14:51 <heat> KPTI would make you switch back to the user address space

14:51 <clever> heat: arm has (almost) always had 2 registers for the root of the paging tables, one for the lower half, one for the upper half

14:51 <heat> on arm64 I assume they have a dummy TTBR1 with the trampoline mapped

14:51 <clever> heat: so you dont need to keep a copy of the kernel mappings in every user table

14:51 <heat> and "unmapping the kernel" means switching to that trampoline

14:52 <heat> that sounds reasonable

14:52 <clever> ah, that

14:52 <marshmallow> sorry, in what sense "for the root of the paging tables, one for the lower half, one for the upper half"?

14:52 <heat> but im not particularly aware of how arm64 does this

14:52 <marshmallow> where do the lower and upper point to?

14:52 <clever> marshmallow: TTBR0 has the addr of the "userland" paging tables, that start at virtual addr 0

14:52 <clever> marshmallow: TTBR1 has the addr of the "kernel" paging tables, that end at INT_MAX virtual

14:53 <clever> the size of both, is set by another pair of registers

14:53 <clever> https://forums.raspberrypi.com/viewtopic.php?t=341806

14:53 <bslsk05> forums.raspberrypi.com: Understanding of mmu TCR.txsz - Raspberry Pi Forums

14:53 <clever> a recent answer in here explains it
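
A rough sketch of how the two regions clever describes can be derived from the size fields, assuming the sizes come from TCR_ELx.T0SZ and T1SZ and that each region spans 2^(64 - TnSZ) bytes. The function `ttbr_for` and the enum are hypothetical names, the accepted 16..39 range is an assumption for this sketch, and top-byte-ignore and pointer tagging are left out.

```c
#include <assert.h>
#include <stdint.h>

enum ttbr_sel { USES_TTBR0, USES_TTBR1, VA_FAULT };

/* t0sz/t1sz mirror TCR_ELx.T0SZ/T1SZ: each region spans 2^(64 - TnSZ)
 * bytes. Hypothetical helper; tagging and TBI are ignored. */
static enum ttbr_sel ttbr_for(uint64_t va, unsigned t0sz, unsigned t1sz)
{
    /* Range assumed for this sketch only. */
    assert(t0sz >= 16 && t0sz <= 39 && t1sz >= 16 && t1sz <= 39);

    /* Last address translated through TTBR0 (region starts at 0). */
    uint64_t low_limit = (UINT64_C(1) << (64 - t0sz)) - 1;
    /* First address translated through TTBR1 (region ends at 2^64 - 1). */
    uint64_t high_start = ~((UINT64_C(1) << (64 - t1sz)) - 1);

    if (va <= low_limit)
        return USES_TTBR0;
    if (va >= high_start)
        return USES_TTBR1;
    return VA_FAULT; /* the hole between the two regions */
}
```

With both fields set to 16 this gives the familiar 48-bit split: 0x0000000000000000 to 0x0000ffffffffffff through TTBR0, and 0xffff000000000000 upward through TTBR1.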

14:54 heat has quit [Remote host closed the connection]

14:54 <marshmallow> thanks! and so there's no trampoline at all in arm64?

14:55 <clever> marshmallow: you can still add your own if you want to fight meltdown style bugs

14:55 heat has joined #osdev

14:55 <clever> where TTBR1 only maps the trampoline code, and swaps TTBR1

14:56 <heat> clever, you mean ULONG_MAX btw

14:56 <heat> ofc there's a trampoline

14:57 <marshmallow> clever: but from the userland perspective it shouldn't matter (both in pre/post meltdown) as kernel page tables are not accessible, no?

14:57 <clever> marshmallow: correct

14:57 <heat> incorrect

14:57 <heat> post meltdown would leak kernel data and addresses

14:57 <heat> that's why this happened

14:58 <marshmallow> on arm64 too?

14:58 <clever> i am assuming normal operation, nobody is trying to exploit it

14:58 <heat> yes, on arm64 too

14:58 <clever> a non-malicious userland will never notice the difference

14:58 <marshmallow> and so, sorry, how does arm64 actually mitigate this?

14:59 <heat> by replacing the normal kernel TTBR1 with a dummy one that only has the trampoline mapped

14:59 axis9 has joined #osdev

15:00 <marshmallow> and what does the trampoline do/contain?

15:01 <heat> code and data needed to switch to the normal kernel environment

15:01 <clever> marshmallow: code that changes the TTBR1 and flushes the TLB, which should be mapped in both the real kernel and the fake trampoline kernel

15:01 <clever> and likely code to support the reverse as well

15:02 <marshmallow> and this however comes with a performance cost, right?

15:02 <clever> and probably the irq table too, since the trampoline TTBR1 will be mapped when the cpu tries to handle an irq

15:02 <clever> yep

15:02 <heat> yes

15:02 <clever> all of that TLB flushing ruins the cache, and you're injecting an extra dozen opcodes into every syscall and irq

15:03 <clever> if your cpu core has hw mitigations, you can skip the trampoline

15:03 <clever> if you control/trust the userland code, skip it

15:03 <heat> they probably play some games with ASIDs to skip some tlb flushing

15:04 <clever> heat: i was thinking the same, but i'm not sure if thats user, kernel, or both

15:04 <marshmallow> got it. and how/when can TTBR0 and TTBR1 possibly change? can they be seen as the equivalent of CR3 in x86_64? does TTBR0 change at each process context switch?

15:04 <heat> when switching address spaces yeah

15:05 <clever> the original idea (pre meltdown), is that you only change TTBR0 on context switch, and TTBR1 never changes

15:05 <heat> yes they (together) are the equivalent of cr3

15:05 <clever> so you dont have to duplicate the kernel half of the paging tables, in every set of tables

15:05 <heat> you also get 1 more bit of addressing

15:06 <marshmallow> alright, so TTBR0 is set by the kernel to point to the page table directory when switching between processes, right?

15:06 <clever> heat: let me see what the armv8 docs say about ASID...

15:06 <clever> marshmallow: yeah

15:06 <marshmallow> with current->mm_struct->pgd?

15:06 <heat> maybe

15:07 <heat> we're not linux devs

15:07 <marshmallow> sorry how many level of page tables arm64 has?

15:07 <heat> tough question

15:07 <heat> usually 4

15:07 <heat> i think

15:07 <heat> arm64 mmu is stupid configurable

15:08 <clever> at each level, you can just say "this is the end" and give a pointer to a crazy big page, rather than another table

15:08 <heat> i think the traditional "desktop/server" thing is 4KiB pages, 4 levels

15:08 <heat> i've heard android goes smaller sometimes
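
For the 4 KiB page, 4 level configuration heat mentions, a virtual address splits into four 9-bit table indices plus a 12-bit page offset (4 * 9 + 12 = 48 bits). A small sketch of that split, using the arm64 convention that level 0 is the root; `table_index` and `page_offset` are illustrative names, not taken from any kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12u   /* 4 KiB pages */
#define BITS_PER_LEVEL  9u    /* 512 eight-byte entries per 4 KiB table */
#define NUM_LEVELS      4u

/* Index into the table at `level`, where level 0 is the root and
 * level 3 is the last table before the page itself. */
static unsigned table_index(uint64_t va, unsigned level)
{
    assert(level < NUM_LEVELS);
    unsigned shift = PAGE_SHIFT + BITS_PER_LEVEL * (NUM_LEVELS - 1 - level);
    return (unsigned)((va >> shift) & ((1u << BITS_PER_LEVEL) - 1));
}

/* Byte offset inside the final 4 KiB page. */
static unsigned page_offset(uint64_t va)
{
    return (unsigned)(va & ((UINT64_C(1) << PAGE_SHIFT) - 1));
}
```

A block ("crazy big page") mapping at level 1 or 2 simply stops the walk early and treats the remaining low bits as the offset.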

15:10 <marshmallow> so, to recap, when the kernel needs to translate a virtual address to a physical address, it can achieve so by walking page tables via TTBR0, till the value set by TTBCR?

15:10 <clever> heat: oh, now that i think of it, when transferring from the trampoline to the real kernel, you're not unmapping any pages, so do you even have to TLB flush in that case? can the TLB hold unmapped addresses?

15:10 <clever> marshmallow: there is a dedicated opcode in arm to do that translation for you, and if you just do a read/write to an addr, the cpu does everything automatically

15:11 <heat> you shouldn't have to flush when switching to the kernel in arm64's case

15:11 <heat> the new page tables are a superset of the old tables

15:11 <heat> only when going back

15:11 <clever> yeah, thats what i was thinking

15:12 <clever> the flush is only needed when unmapping pages

15:12 <clever> or changing the mapping of a page

15:13 <heat> you need to flush when switching back to the trampoline pts

15:14 <heat> or ASIDs ofc

15:14 <clever> yeah, because your unmapping all of the sensitive bits of the kernel

15:15 <clever> > For these stage 1 translations, each of TTBR0_ELx and TTBR1_ELx has a valid ASID field, and

15:15 <clever> > TCR_ELx.A1 determines which of these holds the current ASID.

15:15 <heat> and ofc global mappings are forbidden in KPTI

15:15 <clever> heat: aha, the ASID and the phys addr of the paging tables, are BOTH in TTBR0/1!

15:16 <heat> yeah

15:16 <clever> so changing TTBR1 swaps both the addr and ASID at the same time

15:16 <heat> just like cr3

15:16 <clever> but there are 2 ASID's, in TTBR0 and TTBR1, and the TCR_ELx.A1 bit says which one is used

15:16 <clever> so you can wind up with some things being tagged to the kernel, or user, depending on A1
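
A small sketch of the arrangement clever quotes, assuming the EL1 layout as I read the Armv8-A register descriptions: the ASID sits in bits [63:48] of TTBR0_EL1/TTBR1_EL1 and TCR_EL1.A1 is bit 22. Treat the bit positions as something to check against the manual; the helper names are hypothetical, and with 8-bit ASIDs only the low 8 bits of the field are used.

```c
#include <assert.h>
#include <stdint.h>

#define TTBR_ASID_SHIFT 48u                    /* ASID in TTBRn_EL1[63:48] */
#define TCR_A1_BIT      (UINT64_C(1) << 22)    /* TCR_EL1.A1 */

/* Extract the ASID field from a TTBR value. */
static uint16_t ttbr_asid(uint64_t ttbr)
{
    return (uint16_t)(ttbr >> TTBR_ASID_SHIFT);
}

/* Combine a table base address and an ASID into one register value,
 * which is why a single TTBR write swaps both at once. */
static uint64_t ttbr_with_asid(uint64_t table_pa, uint16_t asid)
{
    assert((table_pa >> TTBR_ASID_SHIFT) == 0);
    return table_pa | ((uint64_t)asid << TTBR_ASID_SHIFT);
}

/* A1 == 0 selects the ASID in TTBR0, A1 == 1 the one in TTBR1. */
static uint16_t current_asid(uint64_t tcr, uint64_t ttbr0, uint64_t ttbr1)
{
    return (tcr & TCR_A1_BIT) ? ttbr_asid(ttbr1) : ttbr_asid(ttbr0);
}
```

Only one of the two ASID fields is live at any time, which is the oddity discussed in the following lines.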

15:17 axis9 has quit [Remote host closed the connection]

15:17 <clever> that seems a bit odd to me

15:18 <clever> if its using the TTBR0 ASID in kernel mode, then the kernel can have TLB misses, for data it was just using in another proc

15:18 <heat> presumably that makes your TTBRs switchable

15:19 <clever> but if the kernel uses the TTBR1 ASID while doing a read from a user pointer, that lands in the kernel TLB, and could return the wrong answer in another process

15:19 <heat> which makes sense

15:19 <heat> as in, you can have your kernel in 0, user in 1

15:19 <heat> that's not an issue

15:20 <heat> how is a user pointer the same as a kernel one?

15:20 <heat> AIUI you'll still use the same "64-bits" to index in the TLB

15:20 <clever> what i mean, is if the kernel reads from user memory while in pid 2

15:20 <clever> then it context-switches to pid 3, and reads the same user pointer, but now with a different TTBR0

15:21 <clever> it should respect the new tables, and not use the old TLB entry

15:21 <heat> your ASID should have switched

15:21 <clever> so that means the kernel should be using the TTBR0 ASID

15:21 <heat> user pointer -> TTBR0 -> ASID y instead of x -> lookup_tlb(user_ptr, y)

15:21 <clever> but now all of your kernel pages, are isolated to the userland TLB ASID

15:21 <heat> hm?

15:22 <clever> and when you context switch, all of the kernel memory has a TLB miss, because its looking in the wrong ASID

15:22 <heat> hm2

15:22 <clever> because while arm64 has 2 ASID's, you have to manually select which one is active, via the TCR_ELx.A1 bit

15:23 <heat> ok

15:23 <heat> where's the problem?

15:23 <heat> you obviously set it to the TTBR that changes and is thus using the ASID

15:23 <clever> while running in kernel mode, which ASID should be active?

15:24 <heat> hmmmmm

15:24 <heat> i think i'm starting to see your point

15:25 wand has quit [Ping timeout: 258 seconds]

15:25 <heat> the easy, non-meltdown solution would be to use TTBR0.ASID and have the kernel mapped as global

15:26 <clever> but now the kernel will TLB miss every time you context switch, because the kernel lookups are landing in the userland ASID

15:27 <clever> if i'm understanding this doc right

15:29 <heat> but they're globals

15:29 <heat> globals are not pinned to ASIDs

15:30 <clever> ah, is that a special flag in the table entry?

15:31 wand has joined #osdev

15:33 <heat> yes

15:34 <clever> i see it now, the nG bit

15:35 <clever> that solves that mystery, for the non-meltdown case, you can just always use the TTBR0.ASID, and leave the kernel global
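
The conclusion above can be modelled with a toy TLB: entries are tagged with an ASID unless they are global (nG clear), in which case they match under any ASID. This is only a software model under that assumption, not how any real TLB is organised; all names here (`tlb_entry`, `tlb_insert`, `tlb_lookup`, `TLB_MISS`) are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SIZE 8
#define TLB_MISS UINT64_MAX   /* hypothetical sentinel for "not cached" */

struct tlb_entry {
    bool     valid;
    bool     global;   /* descriptor had nG == 0 */
    uint16_t asid;     /* only meaningful when !global */
    uint64_t vpn;      /* virtual page number */
    uint64_t pfn;      /* physical frame number */
};

static struct tlb_entry tlb[TLB_SIZE];

static void tlb_insert(size_t slot, uint64_t vpn, uint64_t pfn,
                       uint16_t asid, bool global)
{
    assert(slot < TLB_SIZE);
    tlb[slot] = (struct tlb_entry){ true, global, asid, vpn, pfn };
}

/* Returns the cached frame number, or TLB_MISS. Global entries hit
 * regardless of which ASID is current. */
static uint64_t tlb_lookup(uint64_t vpn, uint16_t current_asid)
{
    for (size_t i = 0; i < TLB_SIZE; i++) {
        const struct tlb_entry *e = &tlb[i];
        if (e->valid && e->vpn == vpn &&
            (e->global || e->asid == current_asid))
            return e->pfn;
    }
    return TLB_MISS;
}
```

After a switch from ASID 2 to ASID 3, a user page cached under ASID 2 misses while a global kernel page still hits, so the kernel does not lose its entries on every context switch.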

15:35 <clever> but i'm not sure how you could use the ASID to help with a trampoline

15:39 axis9 has joined #osdev

15:42 Vercas6 has quit [Ping timeout: 258 seconds]

15:43 [itchyjunk] has joined #osdev

15:46 Vercas6 has joined #osdev

15:49 axis9 has quit [Remote host closed the connection]

15:54 dude12312414 has joined #osdev

15:55 dude12312414 has quit [Remote host closed the connection]

15:58 dude12312414 has joined #osdev

16:01 dude12312414 has quit [Client Quit]

16:01 dude12312414 has joined #osdev

16:15 <pbx> https://media.discordapp.net/attachments/810940400419602502/1033049841082834954/unknown.png yay :)

16:18 <heat> wohoo

16:18 <heat> FIX THAT UB

16:19 <heat> YOU FIX THAT RIGHT NOW

16:21 nvmd has quit [Quit: WeeChat 3.7]

16:23 <pbx> i will, i will

16:24 <mjg_> keep it

16:25 <heat> noooo

16:25 <pbx> what is the accepted solution for accessing misaligned ints these days?

16:26 <heat> memcpy

16:26 <pbx> manually writing a function that uses uint8_t loads and shifts, or is there some intrinsic

16:26 <heat> memcpies get optimized out

16:26 <mjg_> __builtin_memcpy is the intrinsic :-P

16:26 poyking16 has quit [Quit: WeeChat 3.5]

16:26 <mjg_> provided the size is known at compilation time

16:26 poyking16 has joined #osdev

16:26 crm is now known as orthoplex64

16:26 wand has quit [Remote host closed the connection]

16:29 xenos1984 has quit [Ping timeout: 246 seconds]

16:30 xenos1984 has joined #osdev

16:32 wand has joined #osdev

16:34 <sham1> And the smart compiler can turn your normal memcpy into that with known sizes and optimise it away

16:34 <sham1> So don't rely on undefined behaviour: read your data properly
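
A minimal sketch of the memcpy approach suggested above. `load_u32` reads in host byte order from any address; `load_le32` is the byte-and-shift alternative pbx asks about, which also fixes the byte order. Both function names are made up here; with optimisation enabled, mainstream compilers typically turn the fixed-size memcpy into a single load on targets that allow unaligned access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from any address, aligned
 * or not, without dereferencing a misaligned uint32_t pointer. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Byte-order independent variant: assemble a little-endian value
 * from individual bytes. */
static uint32_t load_le32(const void *p)
{
    const uint8_t *b = p;
    assert(p != NULL);
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

Casting the buffer to `uint32_t *` and dereferencing it is the undefined behaviour being replaced.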

16:36 axis9 has joined #osdev

16:37 dude12312414 has quit [Remote host closed the connection]

16:41 dude12312414 has joined #osdev

16:47 elastic_dog has quit [Ping timeout: 246 seconds]

16:47 elastic_dog has joined #osdev

16:54 epony has quit [Quit: QUIT]

17:14 gog has joined #osdev

17:18 <gog> mew

17:18 <heat> meooooooooooooooooooo

17:18 <heat> w

17:18 <sham1> mow

17:18 <heat> momw

17:21 xenos1984 has quit [Ping timeout: 252 seconds]

17:22 bauen1 has quit [Ping timeout: 260 seconds]

17:24 bauen1 has joined #osdev

17:29 SGautam has joined #osdev

17:34 axis9 has quit [Remote host closed the connection]

17:37 xenos1984 has joined #osdev

17:43 axis9 has joined #osdev

17:45 nvmd has joined #osdev

17:46 joe9 has quit [Quit: leaving]

17:47 gmodena has quit [Ping timeout: 268 seconds]

17:48 wand has quit [Remote host closed the connection]

17:54 xenos1984 has quit [Ping timeout: 255 seconds]

17:58 <heat> pbx, does xinit show a blank screen unless xterm is present?

17:58 <heat> cuz i have a blank screen

17:59 <mrvn> heat: xinit usually tries a few things with xterm being the last fallback

17:59 <mrvn> any menus when you press a mouse button?

17:59 <heat> i don't have input yet

18:00 <j`ey> heat: time to port xterm

18:00 <mrvn> some WMs don't have any icons and without config to also start an xterm you just have the WM running

18:00 <mrvn> ps aux

18:00 <mrvn> look what's running

18:00 <heat> im talking about my os

18:00 <mrvn> .oO(time to implement ps)

18:01 <heat> X is running, so is xinit (through startx)

18:01 <j`ey> do you have strace?

18:01 <heat> complicated question

18:01 <j`ey> / put a printf in 'exec*' syscall :P

18:01 <heat> tldr yesish

18:01 <mrvn> you must have some WM or app that startx has found that keeps the X running

18:02 <heat> i probably need ERESTARTSYS support

18:02 <axis9> yes

18:05 brynet has quit [Quit: leaving]

18:09 xenos1984 has joined #osdev

18:14 wand has joined #osdev

18:19 <mrvn> Has anyone implemented X? How much of X do you need to run everyday apps? I'm pretty sure you don't need stippled ellipsoid segments with rounded corners. But what do you need?

18:20 <pbx> heat: i've not tried xinit

18:20 <pbx> just starting the server by hand

18:20 <pbx> try -retro on the server to get the classic background pattern and an X cursor

18:20 <heat> hm

18:20 <mrvn> pbx: Just "X" will just show the default background

18:20 <heat> Xorg -retro just works?

18:21 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

18:22 <mrvn> for some form of "works". you need another shell/terminal to then start some app on the display

18:22 <zid> DISPLAY=":0" xneko & when

18:22 <heat> https://i.imgur.com/diZ6zdm.png

18:22 SpikeHeron has quit [Quit: WeeChat 3.7]

18:22 <heat> oh wow look, my OS is ugly!

18:22 <j`ey> omg

18:23 <zid> what offtopic channel is that :o

18:23 <mrvn> congrats

18:23 <mjg_> grab cdm for proper openbsd experience

18:23 <zid> oh

18:23 <mrvn> twm, twm

18:23 <heat> mjg_, what's cdm?

18:24 <mrvn> c*something* display manager

18:24 <mrvn> or desktop manage?

18:24 <zid> what the hell is that channel

18:24 <GeDaMo> https://en.wikipedia.org/wiki/Common_Desktop_Environment ?

18:24 <bslsk05> en.wikipedia.org: Common Desktop Environment - Wikipedia

18:24 <mjg_> no

18:24 <heat> zid, something b e y o n d y o u r u n d e r s t a n d i n g

18:24 <mjg_> there was this turbo unusable window manager

18:24 <mrvn> close, so close

18:24 <zid> yes yes it is

18:24 <mjg_> lemme look for it

18:24 <heat> i'm relatively sure it started out as osdev-offtopic

18:24 <zid> yea the lusers look similar

18:24 <heat> anyway bslsk05 is theirs for instance

18:25 <mjg_> maybe cwm

18:27 <shikhin> Oh I see.

18:27 <pbx> mrvn: no longer, default now is black bg with no cursor

18:28 <mjg_> now i remember

18:29 <mjg_> heat: twm

18:29 <mjg_> > twm is a window manager for the X Window System. Started in 1987 by Tom LaStrange, it has been the standard window manager for the X Window System since version X11R4.

18:29 <mjg_> utter crap

18:29 <heat> LaStrange

18:29 <heat> hehehehehe

18:29 <mjg_> used to be the default on openbsd

18:30 <pbx> fvwm2 is decent for something that doesn't require a full toolkit

18:30 <j`ey> fvwm2 is a cool retro look

18:30 <mjg_> team i3 right here, but i don't know the deps

18:31 * pbx cries in having to use fvwm for work every day

18:31 <mjg_> wut?

18:31 <zid> xfce4 has the nicest terminal so I just install xfce4-meta

18:31 <zid> and be done with it

18:31 <mjg_> you stuck on some old unix workstation? :P

18:32 <pbx> mjg_: nah, VNC into the real computers

18:32 <pbx> i guess we have plasma too, but you really don't wanna try that on VNC over vpn

18:33 <j`ey> my coworker uses fvwm2.. locally out of choice :P

18:33 <pbx> honestly fvwm is fine, it's the xterm default that kills me

18:33 <marshmallow> heat: just for the record, this is what I was referring to when I mentioned that part of kernel mappings needed to be unmapped (https://github.com/apple/darwin-xnu/blob/main/osfmk/arm64/proc_reg.h#L45)

18:33 <bslsk05> github.com: darwin-xnu/proc_reg.h at main · apple/darwin-xnu · GitHub

18:33 <marshmallow> (/cc clever too)

18:33 <pbx> i'm too used to the Ctrl-Shift-{C,V,X} keybinds

18:34 <heat> marshmallow, woah that's so fucking clever

18:36 <marshmallow> so basically the kernel address space is mapped to the user's one, when running userspace applications, but it contains only a small subset of it (w/ just the exception vector?)

18:37 <heat> yes

18:37 <heat> they contract and expand the TTBR1 address space when switching to and from EL0

18:37 <heat> Only In Arm64 moments

18:38 <mrvn> marshmallow: not having all of the kernel mapped in user space is a security mitigation against speculative execution exploits

18:46 SpikeHeron has joined #osdev

18:49 heat has quit [Remote host closed the connection]

18:49 heat has joined #osdev

18:50 <heat> yeah im porting xterm now

18:51 <heat> this is the shittiest port experience but god dang it im doing it

18:51 <j`ey> have you had to patch much?

18:52 <heat> config.sub, libtool patches in every autoconf package

18:52 * heat exhales

18:52 <heat> MESON BEST

18:52 <heat> and i'm adding features to make Xorg work, which obviously means they're not correct

18:53 <heat> like my unix sockets

18:53 <pbx> you don't need those per se

18:54 <heat> yeah

18:54 <heat> but you should

18:54 <pbx> assuming you have AF_INET

19:03 Guest5920 has quit [Quit: WeeChat 2.8]

19:03 terminalpusher has joined #osdev

19:03 vin has joined #osdev

19:07 <marshmallow> heat: not completely clear though about one thing: what isn't eventually flushed in the TLB?

19:07 <heat> context?

19:08 seer has quit [Quit: quit]

19:10 seer has joined #osdev

19:12 <marshmallow> https://github.com/apple/darwin-xnu/blob/main/osfmk/arm64/proc_reg.h#L82-L83

19:12 <bslsk05> github.com: darwin-xnu/proc_reg.h at main · apple/darwin-xnu · GitHub

19:13 <heat> if the kernel mappings are in ASID y and the user mappings are in ASID x, you don't need to invalidate all the kernel mappings when going back to user-space

19:15 netbsduser has joined #osdev

19:17 terminalpusher has quit [Remote host closed the connection]

19:18 gorgonical has joined #osdev

19:18 <gorgonical> Okay guys just to make sure I'm not dumb: a 16550 in fifo mode will continually push bytes to the front, right?

19:19 <gorgonical> As in you repeatedly do a lb instruction from the register address?

19:19 <gorgonical> And you know you are done when the status register says rxfifo empty, or so

19:21 <geist> depends on what you mean by front, but if you mean at the tail end of the fifo yes

19:21 <geist> if it's full and you keep pushing i dont think that's well defined

19:22 <gorgonical> I mean let's say you've got 8 bytes and the rx reg is 0x0 offset. You just do lb x0, (base+0x0) 8 times, then right?

19:22 brynet has joined #osdev

19:22 <geist> to read 8 bytes out of the fifo? yes

19:22 <gorgonical> Right. The hardware manages the fifo to always present the "next" byte at 0x0

19:23 <geist> internally it'll read you the head of the fifo until it's empty in which case it returns UNDEFINED

19:23 <geist> right

19:23 <gorgonical> right

19:23 <gorgonical> my risc-v forth is coming along and I'm having to build a uart driver

19:23 <geist> yeah okay, i reread your original statement. this is for the RX path yes

19:24 <geist> what's unfortunate is the 16550 isn't very good at giving you exactly all the signals you want, but i think you get the bare minimum: the rx fifo has something in it

19:24 <geist> or the equivalent: the rx fifo is empty

19:24 <geist> so you pop stuff until the status changes
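
The polling receive loop described above might look roughly like this, assuming the standard 16550 register offsets (RBR at 0, LSR at 5, data-ready in LSR bit 0) with no register stride. The `uart_read_fn` accessor, `uart_drain_rx`, and the `fake_uart` stand-in device are hypothetical; on real hardware the accessor would be a volatile MMIO byte load at `base + reg`, possibly scaled by a board-specific register shift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UART_RBR        0u     /* receive buffer register (read) */
#define UART_LSR        5u     /* line status register */
#define LSR_DATA_READY  0x01u  /* at least one byte waiting in the RX FIFO */

/* Hypothetical accessor: reads one 8-bit register at the given offset. */
typedef uint8_t (*uart_read_fn)(void *ctx, unsigned reg);

/* Pop bytes until LSR says the RX FIFO is empty or buf is full. */
static size_t uart_drain_rx(uart_read_fn rd, void *ctx,
                            uint8_t *buf, size_t cap)
{
    size_t n = 0;
    assert(rd != NULL && buf != NULL);
    while (n < cap && (rd(ctx, UART_LSR) & LSR_DATA_READY))
        buf[n++] = rd(ctx, UART_RBR);
    return n;
}

/* Stand-in device for trying the loop without hardware: every RBR
 * read presents the next queued byte, like the FIFO head. */
struct fake_uart { const uint8_t *data; size_t len, pos; };

static uint8_t fake_read(void *ctx, unsigned reg)
{
    struct fake_uart *u = ctx;
    if (reg == UART_LSR)
        return u->pos < u->len ? LSR_DATA_READY : 0;
    if (reg == UART_RBR && u->pos < u->len)
        return u->data[u->pos++];
    return 0;
}
```

Checking LSR before every read is what keeps the loop from ever reading an empty FIFO.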

19:24 <gorgonical> yeah at this point I'm doing extremely simple signalling. I think my first try will be no-int

19:24 <gorgonical> I guess tx will come soon enough too

19:24 <geist> 'better' uarts with better fifos have a counter that tells you precisely how much is in it, which is more convenient since you can just in a loop read that many bytes out before checking again

19:25 <geist> yah the TX side of the 16550 is annoying too. i think it only tells you if the fifo is empty, but not how much is in it

19:25 <heat> geist, did you read the convo about meltdown? i'm kinda unsure how you can avoid losing all of the kernel's TLB entries when switching address spaces

19:25 <geist> so you can't know necessarily where it is at if you have stuffed in a few bytes and it hasn't finished transmitting them

19:25 <gog> what about ASID or PGE? or can you not use those with meltdown?

19:25 <gorgonical> so the only "safe" approach is 1 byte, poll until empty, repeat

19:25 <gorgonical> lol

19:25 <geist> heat: with KPTI i think you have to dump the kernels TLB yes. global mappings are my understanding verboten

19:26 <gog> oof

19:26 <heat> :((

19:26 <gorgonical> even with asid?

19:26 <geist> but, if you have PCID/ASID i think that mitigates it a lot

19:26 <heat> you use ASID for kernel <-> user switches

19:26 <geist> as in the kernel gets to be a new asid

19:26 <heat> there's no way to tell the CPU that ASID 1 and ASID 2 share pages

19:26 <geist> there's a window of x86 implementations iirc where the PCID isn't useful for it so you hae to dump everything

19:26 <geist> and then on very new cpus you dont need the KPTI

19:27 <geist> so i think in the end this will all just be some Dark Era of cpus where the mid 2000s stuff was just bad

19:27 <geist> and newer stuff you dont have to do this anymore

19:27 <gog> until there's a new cache timing attack

19:27 <geist> sure, but whether or not you need KPTI is the question

19:27 <heat> https://github.com/apple/darwin-xnu/blob/main/osfmk/arm64/proc_reg.h#L82-L83

19:27 <geist> i think there will always be spectre style stuff

19:27 <geist> but whether or not you need to keep the kernel isolated i dunno. i hope now

19:27 <geist> not

19:28 <heat> linux people were revisiting the idea of isolating user and kernel completely

19:28 <heat> and only mapping user mappings in copy_to_user et al

19:28 <geist> i think on many arches it's not so bad. POWER/PPC i think historically was basically always set up for that

19:29 <geist> a whole lot of this is to do with the x86 centricness of the world. it's a trash fire, so all solutions are hard and complex (and interesting)

19:29 <geist> that being said ARM64 has one fatal flaw in this, and i dont think they solved it in v9. i have no idea why

19:29 <geist> when you have an ASID you can only assign it to one of the two active page tables

19:29 <gog> yeah didn't arm64 have a cache timing attack similar to meltdown?

19:29 <geist> i dont know why you can't do both

19:30 <geist> so you cant just independently put the kernel and user space in two different asids. by definition one of the two is always running with asids off

19:30 axis9 has quit [Remote host closed the connection]

19:30 <geist> (which i think is functionally the equivalent to using ASID=0)

19:30 gorgonical has quit [Ping timeout: 260 seconds]

19:31 <geist> but take what i wrote with a grain of salt. i haven't personally climbed this particular tech tree of KPTI solutions

19:31 <geist> i read a bit about it, sort of grokked it a while back, but haven't actually *done* it

19:32 <geist> so there are probably things i'm completely missing

19:32 <gog> but you're the closest thing to a kernel master we have

19:32 gorgonical_ has joined #osdev

19:32 <geist> even within that bucket there are things i know about and things i've personally done. the latter i have a lot more confidence in

19:33 <gog> fair

19:33 <geist> speaking of low confidence i was fixing some bugs in the x86 mmu implementations (32 and 64) in LK last night. omg it's worse than i thought

19:34 <geist> totally full of bugs. i see why we rewrote it in fuchsia

19:34 <gog> lol

19:34 <geist> it was some code that was originally submitted by intel. and it's got a really insidious set of bugs. lots of really really loosey goosey interpretation of what is physical and virtual

19:34 <geist> everything is just stored in uint64s so sometimes it's passing around pointers, sometimes physical addresses, and sometimes it's a page table entry with the bits masked off and sometimes not

19:35 <geist> so full of bugs
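
One common way to avoid the "everything is a uint64" confusion geist describes is to wrap each kind of value in its own struct type, so the compiler rejects passing a raw entry where a physical address is expected. This is a sketch of that idea, not what LK or Fuchsia actually do; the type and function names are invented, and the mask assumes x86-64 entries with the frame address in bits 51:12.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t value; } paddr_t;  /* physical address */
typedef struct { uint64_t value; } pte_t;    /* raw page table entry */

#define PTE_PRESENT   (UINT64_C(1) << 0)
#define PTE_ADDR_MASK UINT64_C(0x000ffffffffff000) /* bits 51:12 (assumed) */

/* Build an entry from a page-aligned physical address plus flag bits. */
static pte_t pte_make(paddr_t pa, uint64_t flags)
{
    assert((pa.value & ~PTE_ADDR_MASK) == 0);  /* aligned and in range */
    assert((flags & PTE_ADDR_MASK) == 0);      /* flags stay out of the address */
    return (pte_t){ pa.value | flags };
}

/* Recover the physical address with the flag bits masked off. */
static paddr_t pte_paddr(pte_t pte)
{
    return (paddr_t){ pte.value & PTE_ADDR_MASK };
}
```

Passing a `pte_t` to a function that takes a `paddr_t` is now a compile error instead of a silent bug.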

19:35 <gog> and somebody at intel wrote this

19:35 <gog> XD

19:35 <geist> yah at the time LK didn't really have an x86 port so i basically took it as 'whelp this is better than nothing'

19:35 <geist> a mistake hopefully i'll learn from

19:36 <gog> yes. never trust code from intel

19:36 <geist> it's code that *looks good* but once you try to figure it out it turns out to be full of bugs. that's where i was fixing the inline asm with the rep ins instruction the other day. also from intel

19:36 <geist> i mean good for them for submitting code. i approve of this practice

19:36 <geist> but... i'll have to be more careful

19:37 <gog> yes

19:37 <heat> what bugs did you find

19:38 <heat> how many bugs can you fit in a low level mmu really?

19:38 <heat> s/mmu/mmu code/

19:38 <geist> also annoying thing i hate about lots of x86 mmu implementations: insists on having parallel routines and whatnot for every level of the 4 page tables

19:38 gorgonical_ has quit [Read error: Connection reset by peer]

19:38 <geist> ie, a function to deal with pdpe, pml4, pdp, etc

19:38 <heat> UGH

19:38 <geist> like, just treat it as 4 levels of page table, quit calling it separate stuff

19:38 <geist> so tons of copy pasta with one thing changed, etc

19:38 <heat> but they're also not quite similar

19:38 <geist> sure they are, just a few tweaks here or there

19:39 <heat> some bits' meanings change in a few levels

19:39 <geist> this is true, but it's still better IMO to have routines like 'frob this level(uint level)' and then logic inside

19:39 <geist> that way you still get one code path
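[editor's note] The "one routine parameterized by level" idea can be sketched in C for x86-64 4-level paging. This is only an illustration, not LK's code: the names pt_shift, pt_index and pt_entry_span are hypothetical, and numbering the last-level page table as level 0 (PML4 as level 3) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 4-level paging: every table has 512 entries, each level indexed by
 * 9 bits of the virtual address. Level 0 = last-level PT, level 3 = PML4
 * (numbering chosen for this sketch). */
#define PT_LEVELS     4u
#define PT_INDEX_BITS 9u
#define PAGE_SHIFT    12u

/* Bit position in the virtual address where the index for `level` starts. */
static inline unsigned pt_shift(unsigned level) {
    assert(level < PT_LEVELS);
    return PAGE_SHIFT + PT_INDEX_BITS * level;
}

/* Index into the table at `level` for this virtual address. */
static inline unsigned pt_index(uint64_t vaddr, unsigned level) {
    return (unsigned)((vaddr >> pt_shift(level)) & ((1u << PT_INDEX_BITS) - 1u));
}

/* Bytes of virtual address space covered by one entry at `level`. */
static inline uint64_t pt_entry_span(unsigned level) {
    return 1ull << pt_shift(level);
}
```

Per-level quirks (heat's point that some bits change meaning between levels) would then live in small `if (level == ...)` branches inside the shared routines rather than in four copies of each function.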

19:39 <heat> yeah

19:40 <gog> that's how some of my code works

19:40 <geist> but with page tables there's always the eternal question: do you recursively walk down via having the walk routines call itself with level - 1, or do you try to walk the PTs in a loop

19:40 <geist> some algos work better for one or the other

19:40 <heat> level += 1;

19:40 <gog> loop

19:40 <geist> but in general the recursive walk is more powerful

19:40 <heat> aw shit

19:40 <heat> level -= 1;

19:40 <heat> x86_mmu_unmap_entry(vaddr, level, (map_addr_t)next_table_addr);

19:41 <heat> level += 1;

19:41 <heat> this is genius code

19:41 <geist> yes that's one of the things i fixed

19:41 <geist> doesn't matter, but it's still like uh?

19:41 <gog> loop loop loop

19:41 <heat> this is almost uefi-level of bad

19:41 <geist> the big function that's so full of bugs is the x86_mmu_get_entry i think

19:41 <geist> it turns out to be just totally broken

19:42 <geist> basically i was finally getting around to really unit testing the arch mmu code for the 5 mmu implementations in LK (riscv, arm32, arm64, x86-32, x86-64) and finding the x86 ones were really bad

19:42 <heat> table = (map_addr_t *)(X86_VIRT_TO_PHYS(table_entry) & X86_PG_FRAME);

19:42 <heat> heh found the bug

19:42 <geist> yep, so i was tidying all of that up to be much more clear what is a paddr and what is a vaddr, etc

19:42 <geist> where it really showed up was i tried to map something with a XN bit, and then everything exploded

19:43 <geist> because it was not properly masking off bit 63 so it'd end up thinking there were negative paddrs, etc
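[editor's note] A minimal sketch of the masking problem described above, assuming 4 KiB pages on x86-64 where bit 63 is the no-execute bit and bits 12..51 hold the physical frame. The X86_PTE_* names and the pte_t/paddr_t typedefs are hypothetical and are not LK's actual macros; the point is only that clearing the low 12 bits is not enough once NX is set.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pte_t;    /* raw page table entry */
typedef uint64_t paddr_t;  /* physical address, kept distinct from pointers */

#define X86_PTE_PRESENT   (1ull << 0)
#define X86_PTE_NX        (1ull << 63)
/* Bits 12..51: physical frame of a 4 KiB page or of the next-level table. */
#define X86_PTE_ADDR_MASK 0x000ffffffffff000ull

/* Extract the physical address, dropping both the low flag bits and the
 * high bits (including NX). */
static inline paddr_t pte_to_paddr(pte_t pte) {
    return pte & X86_PTE_ADDR_MASK;
}

/* Build an entry from a page-aligned physical address plus flag bits. */
static inline pte_t make_pte(paddr_t pa, uint64_t flags) {
    assert((pa & ~X86_PTE_ADDR_MASK) == 0);
    return pa | flags;
}
```

With an NX mapping, `pte & ~0xfffull` still has bit 63 set, which is how a "negative" physical address can appear if the result is later treated as signed or compared against memory limits.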

19:43 <geist> then i started digging in and it's like WTF is this doing

19:43 <geist> i'm debating whether or not it makes sense to maintain the 32 vs 64 bit mmu code paths, but i probably will keep it

19:43 <geist> but probably remove the PAE path from the 32bit code, which is AFAIK untested

19:44 <heat> i would some day like to try the linux way

19:44 <heat> all those macros

19:44 <gog> gross

19:45 <geist> in general with a clean

19:45 <heat> they somehow get collapsed at compile time when you have less levels

19:45 <geist> the cleanest mmu implementation in Lk i have now is probably the riscv stuff

19:46 <geist> but that's cause i recently wrote it and thus it's generally tighter and more asserted up and whatnot

19:46 <geist> OTOH i was kinda experimenting with a new style walker, and it's debatable if it worked or not

19:46 <geist> so i'm not completely sold on the design

19:47 <geist> and it doesn't currently have solid TLB shootdown logic. i need to revisit that some

19:47 <heat> what's the design?

19:47 <geist> basically most of the LK mmu code in practice on embedded things is just to map something once and leave it alone, which works well enough

19:47 <geist> https://github.com/littlekernel/lk/blob/master/arch/riscv/mmu.cpp specifically https://github.com/littlekernel/lk/blob/master/arch/riscv/mmu.cpp#L314

19:47 <bslsk05> github.com: lk/mmu.cpp at master · littlekernel/lk · GitHub

19:47 <gog> like the ronco showtime rotisserie oven

19:48 <gog> you set it

19:48 <gog> and FORGET IT

19:48 <geist> the notion that you have a generic 'pt_walker' routine that you pass a lambda into that performs operations at various decision points

19:48 <heat> oh yeah that's funny

19:48 <geist> was kinda an experiment to try to reuse the same walker logic for different use cases

19:48 <geist> it works, but i dont think it's necessarily extensible

19:49 <geist> it *does* compile out to something pretty efficient, which i was pleased to see

19:50 <heat> the iterate-and-visit pattern is pretty nice

19:50 <heat> with lambdas specifically

19:50 <heat> but that has a limitation in this case

19:50 <geist> yah the fatal flaw is it probably gets too complex for more complex situations (mmu_protect() is usually the worst)

19:51 <geist> and it's inefficient in that instead of continuing to walk it starts over at the root after each page

19:51 <heat> unmapping may want to check if the pt is all clear for instance

19:51 <geist> and that yes

19:51 <geist> could do that with more opcodes you return from the lambda

19:52 <geist> https://github.com/littlekernel/lk/blob/master/arch/riscv/mmu.cpp#L547 basically

19:52 <bslsk05> github.com: lk/mmu.cpp at master · littlekernel/lk · GitHub

19:52 <heat> i have a pattern in my mmu code where it's just <arch>_mmu_get_pt_entry()

19:52 <geist> could return some sort of argument to the opcode that says 'check if PT is zero and free it' maybe

19:53 <heat> you give it a virtual address, and it gives you a pointer to the pte

19:53 <geist> yah that's a pretty clean, if limited, design

19:53 <heat> it's maybe too simple, but simple

19:53 <pbx> with TCP how do you prevent exhausting your TX window on both sides? I'm imagining a situation where after the

19:53 <geist> another strategy that might work is extend it to return an array of pointers to all the PTs leading to the entry

19:53 <pbx> handshake, both parties send winsz bytes

19:53 <pbx> and now can't ack because window exhausted

19:53 <geist> so that the code that gets it has an opportunity to do more work, like free upper page table entries

19:54 <heat> ACK doesn't use the window AFAIK

19:54 <heat> geist, oh yeah i also have that

19:54 <geist> right, ACKs are metadata in TCP packet, not a byte

19:54 <heat> my mmu code is kind of chaotic

19:54 <geist> so you can send a zero byte packet with an ACK in it

19:54 <heat> my x86 code evolved over 7 years, so lots of different ideas

19:55 <pbx> i thought set flags also added 1 to the byte count

19:55 <heat> and riscv is x86 copied and patched, arm64 is copied from riscv and patched

19:55 <geist> no, SYN uses a byte count, which is a bit strange but for a reason

19:55 <pbx> ah, nvm then :)

19:55 <heat> seq doesn't exactly correlate to window size
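[editor's note] The point about ACKs can be sketched as a small C helper: in TCP, payload bytes, SYN and FIN each occupy sequence space, while the ACK flag does not, so a zero-length ACK can always be sent regardless of the peer's window. The struct and function names below are hypothetical and only model the accounting, not a real stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Just the fields needed for sequence-space accounting. */
struct tcp_segment {
    uint32_t payload_len;
    bool syn;
    bool fin;
    bool ack;
};

/* Sequence numbers consumed by a segment: one per payload byte, plus one
 * each for SYN and FIN. The ACK flag is deliberately not counted. */
static inline uint32_t tcp_seq_consumed(const struct tcp_segment *s) {
    return s->payload_len + (s->syn ? 1u : 0u) + (s->fin ? 1u : 0u);
}
```

So in pbx's scenario both sides can fill each other's window with data and still acknowledge it, because the pure ACKs consume nothing.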

19:56 <geist> heat: yeah that's why i think of all the strategies, the recursive walk actually is the most powerful, because it gives you the best opportunity to clean up on the way out (like check upper PTs for zeroing)

19:56 <geist> and it allows to keep walking from the same level it was already at, or bounce up a level and go to the next PT, etc

19:57 <geist> ie, the most efficient
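[editor's note] A toy illustration of why the recursive walk makes cleanup easy: each level learns on the way out whether the table below it became empty and can free it. This is a user-space model under loud assumptions, not LK's or Onyx's code: tables are heap allocations, an entry stores the next table's pointer directly (so there is no phys/virt translation), and all names (table_t, map_page, unmap_page) are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define ENTRIES     512u
#define PAGE_SHIFT  12u
#define INDEX_BITS  9u
#define PTE_PRESENT 1ull

typedef struct table { uint64_t e[ENTRIES]; } table_t;

static unsigned idx(uint64_t va, unsigned level) {
    return (unsigned)((va >> (PAGE_SHIFT + INDEX_BITS * level)) & (ENTRIES - 1u));
}

static bool table_empty(const table_t *t) {
    for (unsigned i = 0; i < ENTRIES; i++)
        if (t->e[i] & PTE_PRESENT) return false;
    return true;
}

/* Map one 4 KiB page (pa must be page aligned), allocating tables as needed. */
static void map_page(table_t *t, unsigned level, uint64_t va, uint64_t pa) {
    uint64_t *e = &t->e[idx(va, level)];
    if (level == 0) { *e = pa | PTE_PRESENT; return; }
    if (!(*e & PTE_PRESENT)) {
        table_t *next = calloc(1, sizeof(*next));
        assert(next);
        *e = (uint64_t)(uintptr_t)next | PTE_PRESENT;
    }
    map_page((table_t *)(uintptr_t)(*e & ~PTE_PRESENT), level - 1, va, pa);
}

/* Unmap one page. Returns true if `t` is empty afterwards, so the caller one
 * level up can free it and clear its own entry on the way out. */
static bool unmap_page(table_t *t, unsigned level, uint64_t va) {
    uint64_t *e = &t->e[idx(va, level)];
    if (*e & PTE_PRESENT) {
        if (level == 0) {
            *e = 0;
        } else {
            table_t *next = (table_t *)(uintptr_t)(*e & ~PTE_PRESENT);
            if (unmap_page(next, level - 1, va)) {
                free(next);
                *e = 0;
            }
        }
    }
    return table_empty(t);
}
```

A real implementation would also need TLB invalidation and would walk a range rather than a single page; the linear table_empty scan is only there to keep the sketch short.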

19:57 <heat> https://github.com/heatd/Onyx/blob/master/kernel/arch/x86_64/mmu.cpp#L1069

19:57 <bslsk05> github.com: Onyx/mmu.cpp at master · heatd/Onyx · GitHub

19:57 <geist> but i think it tends to be the most complex

19:57 <heat> i kinda have that as well

19:57 <geist> yah the arm64 code in LK does it that way

19:57 <heat> it was a bitch to write

19:57 <geist> exactly

19:58 <geist> tends to have to be rewritten for each of the higher level algorithms (map, unmap, query, change protection) since they have slightly different constraints

19:58 <geist> also when you get into large page entries that get split that always really throws a spanner in

20:02 Lumia has joined #osdev

20:03 jafarlihi has joined #osdev

20:04 <jafarlihi> I want to directly boot to 64 bit, should I go with Limine or BOOTBOOT?

20:05 <geist> on what architecture?

20:05 <jafarlihi> x86

20:05 <geist> good question. i dont personally know either of those. is the intent to avoid having to write 32 to 64bit?

20:06 <geist> UEFI is not part of the question?

20:06 <jafarlihi> No UEFI, don't want to write 32 to 64 code myself

20:06 <geist> gotcha. i dont know those projects, so i can't help, but i can help you write 32 to 64. it's not difficult at all

20:07 <j`ey> limine seems pretty popular now

20:07 <geist> and most of what you have to do to get to 64bit you have to do eventually (ie, set up your own GDT, your own IDT, your own page tables) so in the long run it doesn't save you too much

20:07 <geist> but if you wanna get running now, SGTM

20:07 <heat> what happened to the good old linux x86 boot protocol

20:07 <jafarlihi> Do I need to create i686 cross-compiler if I boot with GRUB to 32 and switch to 64 myself? Or is just x86_64 cross compiler enough?

20:07 <heat> will you boot in 16-bit mode? 32-bit? 64-bit? who knows??

20:08 <geist> actually has me thinking. grub-on-uefi on a 64bit machine. does that actually explicitly drop to 32bit before starting something?

20:08 <heat> yes

20:08 <geist> 64bit cross compiler is enough

20:08 <heat> multiboot1 and 2 specify 32-bit, so it needs to drop down

20:08 <geist> a 64bit x86 compiler can happily generate 32bit or even 16 bit assembly at least, if not actually compile 32bit C code

20:08 <heat> EXCEPT if you tell GRUB not to exit boot services (multiboot2)

20:10 <heat> something I've realized is that 32-bit x86 code is hard

20:10 <heat> at the very least it's super non-trivial to include in an x86_64 kernel

20:11 <heat> you can't link elf32 with elf64

20:11 <heat> and at least I haven't found a clearcut way to make it generate an elf64 with 32-bit code

20:11 <geist> yah that's generally why i advise just a) sticking to 32bit asm if you need to link it and b) compile it as 64bit .S

20:11 <heat> i would like to do things like KASLR

20:11 <geist> actually linking an elf32 .o with an elf64.o requires extra hackery

20:12 <heat> yeah

20:12 GeDaMo has quit [Quit: Physics -> Chemistry -> Biology -> Intelligence -> ???]

20:12 <heat> you can't just objcopy it because of the relocs I think

20:12 <geist> i think it can be done with severe limitations

20:12 <geist> right, basically you have to avoid most of the relocs, then you can try to objcopy it

20:13 <Griwes> My only 32 bit code is inside an asm file that gets compiled as elf64

20:13 <geist> usually in the past if i needed something more than a simple 32-to-64 boot32.S file, i would compile the 32 bit as a separate flat binary and define a protocol to branch between them

20:13 <geist> ie, have the 32bit binary do whatever it needs and then directly jmp to a 64bit flat binary

20:13 <geist> and just cat the two together in the payload

20:13 <heat> i had a fully physically relocatable x86 kernel thanks to multiboot2 but it turns out their relocation is broken

20:13 <heat> so i really really don't trust grub mb2

20:13 <heat> it killed off the trust i still had

20:14 <geist> yay x86 bootstrapping

20:14 <geist> it can be fun if you like that sort of masochism, but in general it's only fun in the way fiddling with esoteric old 1980s systems is fun, sometimes

20:14 k8yun has joined #osdev

20:14 <geist> ie, fun to fiddle with, unfun if you want it to actually work reliably

20:14 <jafarlihi> Is there an easier arch out there than x86 for beginners to create OS for but still not be simple enough to not support things like GUI?

20:14 <kazinsal> x86 bootstrapping gives me sad

20:14 <heat> riscv, arm64

20:15 <jafarlihi> Is RiscV really all that easier?

20:15 <heat> yes

20:15 <heat> it's just a trashy architecture

20:15 <jafarlihi> How does arm64 compare to RiscV on that?

20:15 <kazinsal> every time I crack open my repo named "x64core" I look at it for all of two minutes then go "yeah I'm just gonna go rewatch the expanse or something actually fun"

20:15 <heat> riscv is the trampstamp of kernels

20:15 <gog> neat

20:15 <heat> s/kernels/architectures/

20:15 <geist> yes riscv is far more straightforward. the main downside being that it's hard to find physical hardware to run on

20:16 <geist> so you'll have to be content for the most part with an emulator

20:16 <heat> 32-bit hw is relatively available

20:16 <geist> OTOH when you're just getting started i highly recommend mostly sticking with emulators anyway

20:16 <heat> 64-bit, hard nope

20:16 <geist> since you have way more tools available

20:16 <geist> i guess the only annoying thing to learn with riscv64 is the whole SBI thing, but you only need that once you start trying to run in supervisor mode

20:17 <geist> and even then it's not hard to implement, just a little bit of a conceptual jump

20:17 <heat> "what if your syscalls had syscalls"

20:17 <geist> yo dawg i heard you like timers

20:18 <heat> yo dawg, i heard you didn't like timers, so u dont even have to touch one directly

20:18 <geist> but once you accept that all that stuff is out of your control, it's pretty nice. if you ignore that it's probably slow

20:18 <geist> (i think there's some new spec stuff for a supervisor level timer hardware, so you can drive your own)

20:18 <heat> woah

20:18 <geist> presumably because the whole SBI timer stuff folks have decided is too slow/inefficient

20:18 <jafarlihi> Has anyone managed to make arm64 OS and run it on an Android phone device?

20:19 * geist whistles

20:19 <geist> if you mean take some random phone and hack your system on it? that's a fool's errand, it's too hard in general

20:19 <geist> but not because of arm64, but because nothing is documented anywhere

20:19 <heat> LMAO

20:19 <geist> everything is closed hardware in that world

20:20 <geist> it can be done but it's mostly reverse engineering things, etc. definitely not a beginners task

20:21 <heat> x86: i heard you like timers so i gave you like 5 of them and only 1 or 2 are decent

20:21 <heat> pit, lapic, hpet, apic, tsc

20:21 <heat> s/apic/acpi/

20:21 <geist> yah was looking at the x86 port: still uses PIT, running at 1khz! yay

20:22 <geist> having the pentium 4 out on the table is why i've been fiddling with LK x86-32 lately. kinda fun to blat it onto a floppy and boot it

20:22 <geist> gives you a different set of feels

20:22 <heat> heh

20:22 <kazinsal> the 2022 equivalent of digging the old XT out of the closet

20:22 <heat> aren't all the p4s 64-bit?

20:23 <geist> oh no. only the later ones

20:23 <geist> early p4s were out in like late 2001. AMD x86-64 wasn't until like 2003

20:23 <geist> but iirc all of the Xeon P4s are 64bit

20:24 <CompanionCube> lol 'lacpi'

20:24 <geist> actually that may have been the first xeon

20:24 <geist> P4 was all of the intel line from like 2001-2005 or so until the Core stuff came out, iirc

20:24 <kazinsal> I think I remember looking at P3 Xeons back in the day

20:25 <geist> looks like.... prescott was where P4 got x86-64

20:25 <geist> kazinsal: yah you're probably right

20:26 <geist> willamette and northwood were the first 2 p4 microarches. northwood seems to be where hyperthreading came along

20:27 <geist> and prescott added 'intel 64'

20:27 <geist> oh interesting, from wikipedia

20:27 <geist> "The Prescott microarchitecture was designed to support Intel 64, Intel's implementation of the AMD-developed x86-64 64-bit extensions to the x86 architecture, but the initial models shipped with their 64-bit capability disabled. Intel stated that it did not intend to release 64-bit CPUs in retail channels, instead releasing the 64-bit capable F-series to OEMs only.[32] However, they were later made available to the general public as

20:27 <geist> the 5x1 series. A number of low-end Intel 64-enabled Prescotts, with 533 MHz FSB speed, were also released."

20:27 <clever> geist: i was looking into aarch64 ASID with heat earlier, and ran into some confusion, the ASID is held in TTBRn_ELn, so updating the pagetable atomicaly updates the ASID as well, however

20:28 <clever> > TCR_ELx.A1 determines which of these holds the current ASID.

20:28 <geist> clever: yep! which makes sense right? you swap the aspace by writing a new value in it, so you want to load the ASID

20:28 <geist> and yep. the A1 bit is whats annoying. why can't they just have two ASIDs active at the same time on both halves

20:28 <clever> geist: both TTBR0 and TTBR1, are tagged using the same ASID, selected by the A1 bit

20:28 <clever> which then makes me wonder, how you can do aarch64 TTBR1 trampoline

20:28 <geist> i dont think so, i think it selects which of TTBR0 or TTBR1's ASID is in use

20:29 <geist> and the one that isn't tagged is intrinsically running without ASID

20:29 <geist> which i think is functionally equivalent to ASID=0

20:29 <clever> but to make the userland TLB cache right, userland should always be active?

20:30 <clever> so the kernel ASID can never vary?

20:30 <geist> right

20:30 <clever> and the trampoline needs its own ASID, or you have to TLB flush constantly

20:30 <clever> now what?

20:30 <geist> so the usual way of doing it is selecting that TTBR0's ASID is active, and assign one for user space thats > 0

20:30 Lumia has quit [Quit: ,-]

20:31 <geist> what do you mean 'trampoline'?

20:31 <clever> meltdown protections

20:31 <geist> oh i dunno, it's complicated is the answer

20:31 <clever> have a dummy kernel that just changes TTBR1

20:31 <geist> and i frankly just dont want to think about it rght now

20:31 <heat> clever, did you see the darwin solution?

20:31 <heat> it's funny, they just change the address space size to only include the trampoline entry

20:31 <clever> heat: only thing ive heard about darwin, is that the entire lower 4gig of the virtual space is banned

20:31 <geist> yah that's probably another strategy

20:32 <clever> ah, i can see that working

20:32 <geist> you can use TCR_EL1 to set the aspace size. that makes good sense

20:32 <clever> and then even if you have a TLB hit, its not in the valid size

20:32 <clever> so it should fault?

20:32 <geist> i forget the smallest you can set it to, but if it's say 30 bits you can just burn the top 1GB of the kernel as the trampoline

20:32 <clever> but then again, meltdown is it leaking when it should have faulted!

20:33 <geist> theres some new v8.x feature to build even smaller aspaces (basically extends those bits)

20:33 <clever> do you trust the cpu to not have the exact same problem with the size flag?

20:33 <geist> maybe that's what this is for. i was always wondering what that feature was for

20:33 <geist> clever: yes. you do. also easy to do when you build your own cpus

20:33 <clever> but if i was to implement a heartbleed resistant kernel, on a core i lack the source to

20:34 <clever> my only option is to either trust or test

20:34 <heat> heartbleed?

20:34 <heat> wrong vuln?

20:34 <heat> :D

20:34 <geist> or assume the worst and add all the mitigations you know about

20:34 <clever> heat: too many exploits bouncing in my head, lol

20:34 <geist> and KPTI is one of those. but i forget the precise details of how linux does it

20:34 <heat> anyway, if i were to write a shellshock resistant kernel...

20:34 <heat> :D

20:34 <geist> but that's probably the canonical solution

20:35 <geist> whatever darwin does may or may not be the 'best' solution, since they build their own hardware

20:35 <geist> they can tune whatever their mitigations are for their own design

20:35 <clever> yeah

20:35 <clever> and they can answer every question i gave above, by just reading the implementation of the core

20:35 <clever> and design a proper fix

20:35 <geist> but the TCR 'turning off' the aspace seems like a solid design

20:35 <geist> if they can be sure that the cpu will fault early based on an address being outside of the aspace

20:36 <heat> yeah i assume that's a lot simpler and more reliable

20:36 <heat> if bits[X...Y] != 0, fault
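[editor's note] The "address outside the configured size faults" check can be modeled in C. This is a simplified sketch, not the architectural pseudocode: it assumes the lower and upper regions are sized by two values t0sz and t1sz with region size 2^(64 - TxSZ), ignores top-byte-ignore and other tagging features, and does not enforce the real hardware limits on those fields. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum va_region { VA_LOWER, VA_UPPER, VA_FAULT };

/* Classify a virtual address against the two configured region sizes.
 * Anything between the end of the lower region and the start of the upper
 * region is treated as a translation fault without any table walk. */
static enum va_region classify_va(uint64_t va, unsigned t0sz, unsigned t1sz) {
    assert(t0sz >= 1 && t0sz <= 63 && t1sz >= 1 && t1sz <= 63);
    uint64_t lower_limit = 1ull << (64 - t0sz);   /* first address past the lower region */
    uint64_t upper_base  = ~0ull << (64 - t1sz);  /* first address of the upper region */
    if (va < lower_limit) return VA_LOWER;
    if (va >= upper_base) return VA_UPPER;
    return VA_FAULT;
}
```

Shrinking the upper region (raising t1sz) while in user mode turns most kernel addresses into the VA_FAULT case, which is the effect the Darwin-style approach described above relies on. Whether a given core performs this check early enough to be safe is exactly the question clever raises.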

20:36 <clever> i assume there is also a cpuid flag, to tell you if meltdown has been fixed in hw?

20:36 <clever> and then you can disable your KPTI, to regain the performance

20:36 <heat> x86 doesn't AFAIK

20:37 <heat> you just cpu_bug based on the family/model

20:37 <clever> ah, and hard-code the list of cores that are fixed?

20:37 <geist> i think there are *some* in some cases though

20:37 <heat> you assume everything past gen X has been fixed

20:37 <geist> i just can't tell you the which ones off the top of my head

20:37 <clever> bugs : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass

20:38 <clever> my AMD core at least claims to not have meltdown, and it pre-dates the bug

20:38 <geist> i do remember that zen cores for example have at least one MSR that has some sort of 'this is or isn't fixed' register i believe

20:38 <heat> lucky!

20:38 <geist> but i think those are mostly errata style things

20:38 <heat> bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds mmio_stale_data retbleed

20:38 <clever> heat: i assume the cpu is too dumb and cant do such pre-fetch

20:38 <heat> i have the whole shebang

20:38 <clever> bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds mmio_stale_data retbleed

20:39 <clever> my laptop does have meltdown

20:39 <clever> the nas managed to escape though

20:39 SGautam has quit [Quit: Connection closed for inactivity]

20:39 <geist> clever: your AMD core is a bulldozer class design, iirc right?

20:39 <geist> an AMD FX. something

20:39 <heat> the classic

20:39 <heat> 8350?

20:39 <clever> yeah, fx-8350

20:39 <heat> vintage

20:39 <heat> the best shitty core ever

20:39 <clever> model name : AMD A6-5400K APU with Radeon(tm) HD Graphics

20:39 <clever> bugs : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass retbleed

20:39 <clever> the nas is also amd, and lacks meltdown

20:39 <clever> model name : Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz

20:39 <clever> but the laptop is intel

20:40 <geist> ah i think it's a real bulldozer too, not one of the later piledrivers or whatnot

20:40 <heat> yeah

20:40 <heat> im on a kabylake r

20:40 <clever> ah, wikipedia says: AMD x86 processors are not currently known to be affected by Meltdown and don't need KPTI to mitigate them.

20:40 <heat> 2017-18 is probably the golden age of vulns

20:40 <geist> i have kinda a place in my heart for unloved microarchitectures for some reason

20:40 <geist> K5, bulldozer, etc

20:40 <geist> pentium 4

20:40 <heat> yeah bulldozer <3

20:40 <heat> tigerlake too

20:41 <geist> bonnell

20:41 <clever> the fx8350 is also very weird in terms of power draw

20:41 <geist> but yeah i think skylake is about the peak of the vulns

20:41 <clever> if i run prime95 on linux, the amps usage goes off the charts, but the TDP says it should

20:41 <clever> but if i dual-boot into windows and run prime95, the amps usage stays far more reasonable

20:41 <CompanionCube> didn't an AMD employee bascially give the game away vis a vis meltdown before the embargo ended?

20:41 <heat> hm?

20:42 <heat> tell me moar

20:42 <geist> oh i dunno, someone here in the channel was i think part of a lot of the original research on it

20:42 <geist> i dont know if they're still hanging around, but it was really interesting

20:42 <heat> who?

20:42 <geist> i forget. they were out of somewhere in europe. switzerland? zurich? i forget

20:43 <geist> pre freenode exodus so i dunno if they made the jump

20:43 <CompanionCube> ah yes, https://lkml.org/lkml/2017/12/27/2 is light on details but still has enough to get the gist across

20:43 <bslsk05> lkml.org: LKML: Tom Lendacky: [PATCH] x86/cpu, x86/pti: Do not enable PTI on AMD processors

20:43 <heat> the great march

20:43 <geist> i think in general there are some more interesting modern vulns being found on zen 1 and 2, but from what i understand 3 and 4 are very solid cores (currently)

20:44 <heat> CompanionCube, LMAO

20:44 <heat> i think it was publicized in what, january?

20:45 <CompanionCube> iirc it was just after new years?

20:46 <jafarlihi> 2/exit

20:46 jafarlihi has quit [Quit: WeeChat 3.7]

20:47 awita has joined #osdev

20:51 awita has quit [Remote host closed the connection]

21:07 netbsduser has quit [Remote host closed the connection]

21:09 <sortie> https://twitter.com/sortiecat/status/1583547872317820928 ← Wrote a little thread on my new init system for Sortix :)

21:09 <bslsk05> twitter: <sortiecat> I just finished my new init system for my Sortix operating system. A thread on how it's super simple and powerful with fast parallel startup in dependency order, portable daemon readiness signaling, reliable logging, and is easy to configure per init(5): <pub.sortix.org/sortix/release… https://t.co/ydIqY0LgUl>

21:13 invalidopcode has quit [Remote host closed the connection]

21:13 invalidopcode has joined #osdev

21:17 wootehfoot has quit [Read error: Connection reset by peer]

21:39 poyking16 has quit [Quit: WeeChat 3.5]

21:45 fkrauthan has quit [Quit: ZNC - https://znc.in]

21:46 fkrauthan has joined #osdev

21:49 fkrauthan has quit [Client Quit]

21:52 fkrauthan has joined #osdev

21:53 epony has joined #osdev

21:59 <heat> (gdb) disassemble __sys_mmap_thunk(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)

21:59 <heat> Dump of assembler code for function _Z16__sys_mmap_thunkmmmmmmm:

21:59 <heat> 0xffffffff8012ade0 <+0>: push %rbp

21:59 <heat> 0xffffffff8012ade1 <+1>: mov %rsp,%rbp

21:59 <heat> 0xffffffff8012ade4 <+4>: pop %rbp

21:59 <heat> 0xffffffff8012ade5 <+5>: jmp 0xffffffff80145f90 <_Z8sys_mmapPvmiiil>

21:59 <heat> codegen 100

21:59 <heat> "ok boss, you said -fno-omit-frame-pointer, we won't omit the frame pointer"

22:03 <zid> yea gcc likes to mess with stack frames even if you ask it nicely not to a lot

22:03 <zid> I've had it adjusting rsp in leaf functions quite aggressively too

22:04 <heat> my issue here is gcc taking me too literally :D

22:05 <heat> most of these thunk functions should either be "jmp dst_syscall" or "call dst_syscall; cltq; ret"

22:06 <zid> gcc seems to have some weird hardcoded logic for what constitutes a frame pointer etc

22:06 <zid> and it gets confused a lot

22:06 <heat> they're too small for a frame pointer, and setting up + tearing down a framepointer for a tail call is the stupid

22:24 <kazinsal> thunkmmmmmmm

22:25 <heat> it's either thinking or very hungry

22:31 <epony> after it's not that hungry it won't be thinking that much either ;-)

22:32 <epony> "stay hungry" --steve jobs, the reseller of failed computers

22:32 <epony> (expensively)

22:42 nvmd has quit [Quit: WeeChat 3.7]

22:52 axis9 has joined #osdev

23:15 axis9 has quit [Ping timeout: 272 seconds]

23:18 axis9_ has joined #osdev

23:28 <geist> yeah in that case it decided it was an interesting function thus needs a frame pointer, but then decided to tail chain it, which by definition has to happen after the frame pointer was cleaned up

23:29 axis9_ is now known as axis9

23:30 vdamewood has joined #osdev

23:35 <geist> i wonder would it still set up a frame pointer even if the function did literally nothing but return?

23:35 <geist> if so, i guess it's at least still being consistent

23:36 <geist> heat: here's an even better one: https://gcc.godbolt.org/z/q1P7hravb

23:37 <heat> ugh

23:37 Lumia has joined #osdev

23:37 <geist> though interesting gcc 12.2 doesn't do a frame pointer for arm64, x86-64 or -32

23:37 <geist> so seems to be some riscv codegen nonsense

23:38 <geist> it does for arm32 though. so it seems to be highly arch specific. i dunno if that's actual ABI requirements or just different implementations of the back end

23:38 <heat> interesting

23:38 <heat> clang does

23:38 <heat> https://godbolt.org/z/q197W8bPY

23:38 <bslsk05> godbolt.org: Compiler Explorer

23:41 axis9 has quit [Read error: Connection reset by peer]

23:45 <geist> yeah, even with -m32

23:45 <zid> I think gcc's frame pointer and stack pointer code is mainly boilerplate

23:45 <zid> that it doesn't carefully manage

23:46 <zid> it makes some simple binary decisions and splats FRAME_YUP(); and FRAME_NAH(); into the stream

23:46 <zid> given the ways in which I've seen it appear and get used

23:46 axis9 has joined #osdev

23:47 <geist> yeah

23:51 <mjg_> i know some people here are going to like it https://linusakesson.net/commodordion/index.php

23:51 <bslsk05> linusakesson.net: The Commodordion

23:56 heat has quit [Remote host closed the connection]

23:57 heat has joined #osdev