#osdev on 2022-03-04 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:08 pretty_dumm_guy has quit [Quit: WeeChat 3.4]

00:08 <mrvn> vin: the first thing you should learn is that there is no best possible value.

01:02 thinkpol has quit [Remote host closed the connection]

01:03 thinkpol has joined #osdev

01:33 gog has quit [Quit: byee]

02:19 <vin> mrvn: Agreed there can't be a single value that can be the best for all workloads but there can be one for a single workload.

02:25 <moon-child> best in what respect?

02:29 theruran has quit [Ping timeout: 240 seconds]

02:30 theruran has joined #osdev

02:30 gxt has quit [Remote host closed the connection]

02:30 gxt has joined #osdev

02:49 joe9 has quit [Quit: leaving]

02:50 joe9 has joined #osdev

02:52 knusbaum has joined #osdev

02:52 Guest127 has quit [Ping timeout: 240 seconds]

02:54 <knusbaum> Hi everyone. I'm working on generating an ELF executable. Based on the docs (https://wiki.osdev.org/ELF, http://www.skyfree.org/linux/references/ELF_Format.pdf) it looks like the section header table is optional for an executable and that only a program header table is required. Is that right?

02:54 <bslsk05> wiki.osdev.org: ELF - OSDev Wiki

02:54 <Mutabah> That's correct

02:54 <knusbaum> I'm structuring my file like this: ELF header, Program Header, TEXT

02:54 <Mutabah> Executables (anything loadable really), use the program headers

02:55 <knusbaum> objdump -d doesn't show me any disassembly. I'm not sure if that means my program is structured improperly or if it's because there's no .text section

02:56 <knusbaum> When running I get (from gdb) "During startup program terminated with signal SIGSEGV, Segmentation fault"

02:56 <Mutabah> Try `objdump -x` to see all the headers

02:57 <knusbaum> and it won't show me where execution failed. That indicates to me that loading failed.

02:57 <knusbaum> I'll put it in a paste.

02:57 <Mutabah> Or `readelf -l`

02:58 <knusbaum> http://sprunge.us/zLxigL

02:58 <knusbaum> Here's objdump -x

02:58 <knusbaum> I've been looking at the headers but can't figure out what's wrong with it.

02:58 <klange> objdump does select output based on sections, so I'm not sure what it does if there are none for an executable; my usual go to is -D and even that's "all sections"

02:58 <knusbaum> Yeah, I tried -D

02:58 <knusbaum> no dice.

02:59 <knusbaum> I examined the file and the TEXT appears at the expected offset in the file.

02:59 <klange> when you get down to just having PHDRs, there's nothing to say what's actually executable code beyond access hints

02:59 <Mutabah> Tried readelf?

03:00 smeso has quit [Quit: smeso]

03:00 <Mutabah> Also, are you manually constructing this file?

03:00 <knusbaum> Yes, I'm manually constructing this.

03:00 <knusbaum> Writing an assembler and linker.

03:00 <Mutabah> `paddr` should probably be set to `vaddr` iirc

03:01 <knusbaum> Hmm, I'm not doing that. I noticed other binaries do that, but the docs say paddr is ignored.

03:01 <knusbaum> I'll try that.

03:01 <klange> My first thought would be that while it's perfectly valid to produce executables with only phdrs, the other tools will thank you if you keep sections in your final outputs

03:02 Jari-- has joined #osdev

03:03 <Jari--> morning #osdev

03:03 <klange> Ah you're just a few minutes late for it to still be morning here.

03:04 <knusbaum> klange, Yeah I was thinking about adding a .text header just for convenience.

03:04 <knusbaum> Ok. I added paddr and same result.

03:06 <knusbaum> Hmm. Sprunge is barfing. Here's GDB output: https://pastebin.com/ebDxLtDq

03:06 <bslsk05> pastebin.com: During startup program terminated with signal SIGSEGV, Segmentation fault.(gdb - Pastebin.com

03:07 <knusbaum> Is it fair to say that it's not the TEXT that's bad, it's that the program failed to load?

03:07 <knusbaum> I think the TEXT should fail anyway, but I expected to get a fault at a program counter.

03:12 <knusbaum> Ok well I'll add the .text section and see what the tooling can tell me.

03:13 smeso has joined #osdev

03:14 <knusbaum> Do you know if linux can tell me more about *why* an executable failed to load?

03:16 <klange> is it dynamic? ld.so probably can offer more details; if it's not even getting past kernel load, you can even try invoking ld.so directly...

03:17 <geist> also there may be some environment vars you can set

03:17 <klange> ld.so takes some flags when called directly, or more normally there are some environment variables

03:17 <geist> whatsit, LD_DEBUG?

03:17 <geist> yeah

03:23 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

03:47 <Clockface> can NASM build position independent executables?

03:47 <Clockface> or would i have to write stuff to do that manually

03:48 <Clockface> as in, do i have to be careful to do something with absolute addresses, or can i tell NASM to only do stuff with relative addressing (and ideally complain if i use an abs address)

03:50 <moon-child> put 'default rel' at the top of your file

03:50 <Clockface> thank you

03:52 troseman has quit [Ping timeout: 272 seconds]

03:58 [_] has joined #osdev

03:58 [itchyjunk] is now known as Guest705

03:58 Guest705 has quit [Killed (molybdenum.libera.chat (Nickname regained by services))]

03:58 [_] is now known as [itchyjunk]

04:23 mahmutov has quit [Ping timeout: 272 seconds]

04:57 zaquest has quit [Remote host closed the connection]

04:59 zaquest has joined #osdev

05:00 bauen1 has quit [Ping timeout: 245 seconds]

05:09 bradd has quit [Ping timeout: 256 seconds]

05:10 Burgundy has joined #osdev

05:14 bradd has joined #osdev

05:37 masoudd has joined #osdev

05:39 ElectronApps has joined #osdev

05:57 srjek has quit [Ping timeout: 240 seconds]

05:57 the_lanetly_052 has joined #osdev

06:07 [itchyjunk] has quit [Read error: Connection reset by peer]

06:10 vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

06:17 xenos1984 has quit [Remote host closed the connection]

06:18 xenos1984 has joined #osdev

06:26 bradd has quit [Remote host closed the connection]

06:29 bradd has joined #osdev

06:46 jeaye is now known as jeaye_

06:48 jeaye_ is now known as jeaye

06:50 k8yun_ has quit [Quit: Leaving]

07:36 mlombard has joined #osdev

07:57 nyah has joined #osdev

08:19 gog` has joined #osdev

09:01 the_lanetly_052 has quit [Ping timeout: 240 seconds]

09:06 GeDaMo has joined #osdev

09:14 Jari-- has quit [Remote host closed the connection]

09:27 the_lanetly_052 has joined #osdev

09:59 dormito has quit [Quit: WeeChat 3.3]

10:02 bauen1 has joined #osdev

10:08 the_lanetly_052 has quit [Ping timeout: 256 seconds]

10:17 gxt has quit [Remote host closed the connection]

10:17 gxt has joined #osdev

10:42 dormito has joined #osdev

10:47 bauen1 has quit [Ping timeout: 256 seconds]

11:05 ElectronApps has quit [Remote host closed the connection]

11:14 immibis has joined #osdev

11:28 gog` is now known as gog

11:55 Payam has joined #osdev

12:14 bauen1 has joined #osdev

12:26 bauen1 has quit [Ping timeout: 268 seconds]

12:35 dennis95 has joined #osdev

12:56 roan has joined #osdev

13:12 bauen1 has joined #osdev

13:13 ElectronApps has joined #osdev

13:30 <klange> I think I've managed to unf*ck my signals sufficiently.

13:30 <klange> Only deliver on return to userspace, store userspace context in userspace stacks, support restarting system calls...

13:31 <klange> Resizing a terminal with my editor in it works well, can still suspend/resume, ^C things, etc.

13:31 <clever> that reminds me, libpam handles EINTR very poorly

13:31 <clever> it considers that to be a fatal error, and passes an error back to the user of the library

13:32 <clever> and in my case, i was FFI'ing into pam, with SIGALRM-based context switching implemented within userland

13:32 <clever> so it had a very high chance of interrupting pam, and pam then failed

13:33 <klange> I still have an issue with resizing my editor quickly on the M1 but after much digging I think it's not specifically a signal issue, and I'll look at it deeper tomorrow.

13:34 <clever> ive run into issues where screen under xterm queues up every resize event

13:34 bauen1 has quit [Ping timeout: 256 seconds]

13:34 <clever> and if i resize the window the wrong way, i have to sit and wait 15 seconds

13:34 <clever> while it repaints itself, for every size i went thru

13:34 <klange> I'm going to play some pokémon and then go to bed.

13:36 bauen1 has joined #osdev

13:46 SikkiLadho has joined #osdev

13:52 SikkiLadho has quit [Ping timeout: 245 seconds]

13:57 SikkiLadho has joined #osdev

13:58 X-Scale` has joined #osdev

14:00 X-Scale has quit [Ping timeout: 256 seconds]

14:00 X-Scale` is now known as X-Scale

14:01 roan has quit [Ping timeout: 256 seconds]

14:04 <SikkiLadho> Hi. I have built a tiny hypervisor for RPI4, it can just load a linux kernel for now. I'm trying to compare the uart logs with and without hypervisor. I see a lot of difference. How should I look for errors which might haunt me in future when I get into Stage 2 memory management? I see "CPU: CPUs started in inconsistent modes" in kernel uart with the hypervisor. What's causing it?

14:04 <SikkiLadho> uart log with hypervisor: https://gist.github.com/SikkiLadho/a722f9425433a2399622da4913321e2c

14:04 <bslsk05> gist.github.com: uart_log_with_hyp.txt · GitHub

14:06 <j`ey> SikkiLadho: https://github.com/torvalds/linux/blob/master/arch/arm64/include/asm/virt.h#L60

14:06 <bslsk05> github.com: linux/virt.h at master · torvalds/linux · GitHub

14:07 <clever> SikkiLadho: did you change the cpu enable method in DT? are all cores entering linux in SVC mode?

14:08 <j`ey> at EL2 that, not SVC mode

14:08 <clever> ah right, my mind is still mostly in arm32 mode

14:09 <clever> if you have a hypervisor in EL2, then that hypervisor should own EL2 of all cores, and linux should get EL1 i believe?

14:09 <j`ey> yeah

14:10 <SikkiLadho> I have an MPIDR gate in my hypervisor to only allow the master core, to avoid race conditions. https://github.com/SikkiLadho/Leo/blob/701aa7d0566bfc657d9967f66ea66325ddcd8022/src/boot.S#L20

14:10 <bslsk05> github.com: Leo/boot.S at 701aa7d0566bfc657d9967f66ea66325ddcd8022 · SikkiLadho/Leo · GitHub

14:10 <j`ey> (so I meant EL1 in the previous)

14:10 <clever> https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/smp.c#L427

14:10 <bslsk05> github.com: linux/smp.c at master · torvalds/linux · GitHub

14:10 <clever> which is calling is_hyp_mode_mismatched from the file j`ey linked

14:11 <clever> https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/smp.c#L249-L255

14:11 <bslsk05> github.com: linux/smp.c at master · torvalds/linux · GitHub

14:11 <clever> [ 0.075625] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]

14:11 <clever> and this line, contains the cpu#(2), the mpidr, and the cpuid

14:12 <clever> from read_cpuid_id()

14:12 <clever> https://github.com/torvalds/linux/blob/master/arch/arm64/include/asm/cputype.h#L205-L214

14:12 <bslsk05> github.com: linux/cputype.h at master · torvalds/linux · GitHub

14:12 <clever> which is just MIDR_EL1
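
The bracketed value in that boot line can be taken apart by hand. A minimal sketch in Python, assuming the standard MIDR_EL1 field layout (implementer in bits 31:24, variant 23:20, architecture 19:16, part number 15:4, revision 3:0); decode_midr is a name made up for illustration, not something from the kernel:

```python
def decode_midr(midr: int) -> dict:
    """Split a 32-bit MIDR_EL1 value into its architectural fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # 0x41 is Arm Ltd.
        "variant": (midr >> 20) & 0xF,
        "architecture": (midr >> 16) & 0xF,
        "part_number": (midr >> 4) & 0xFFF,   # 0xD08 is Cortex-A72
        "revision": midr & 0xF,
    }


# Value copied from the "Booted secondary processor" line quoted above.
fields = decode_midr(0x410FD083)
print({name: hex(value) for name, value in fields.items()})
```

For 0x410fd083 the part number field comes out as 0xd08, which fits the Cortex-A72 cores on the Pi 4.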

14:13 <SikkiLadho> Hey clever, I have an MPIDR gate in my hyp, so only the master core enters it while the others sit in a loop. Might this be causing the problem? https://github.com/SikkiLadho/Leo/blob/701aa7d0566bfc657d9967f66ea66325ddcd8022/src/boot.S#L20

14:14 <clever> SikkiLadho: and how does that function then pass the cpu off to linux?

14:15 roan has joined #osdev

14:15 <clever> looks like core0 runs kernel_main, while core1/2/3 run proc_hang?

14:15 dude12312414 has joined #osdev

14:16 <SikkiLadho> yes that is right

14:16 <clever> and proc_hang is an infinite loop with no cpu idle'ing

14:16 <clever> but, did you tell the arm stub to release the other 3 cores? or is this kernel_old=1?

14:17 <clever> it looks like a no

14:17 <clever> so core 1/2/3 never actually go into proc_hang

14:18 <clever> they are instead still running secondary_spin, from https://github.com/raspberrypi/tools/blob/master/armstubs/armstub8.S#L156-L161

14:18 <bslsk05> github.com: tools/armstub8.S at master · raspberrypi/tools · GitHub

14:18 <SikkiLadho> How do i tell the arm stub to release the other 3 cores?

14:19 <clever> you must write to the 3 spin addresses in the device-tree

14:19 <j`ey> hm, so it sounds like linux is actually starting them then?

14:19 <j`ey> hence the CPU mode mismatch

14:19 <clever> j`ey: yeah, the hypervisor never actually grabbed the other 3 cores, so the firmware gave linux those 3 in HYP/EL2 mode

14:19 <clever> https://github.com/raspberrypi/linux/blob/rpi-5.10.y/arch/arm/boot/dts/bcm2711.dtsi#L469-L504

14:19 <bslsk05> github.com: linux/bcm2711.dtsi at rpi-5.10.y · raspberrypi/linux · GitHub

14:20 <clever> SikkiLadho: in the device tree, there is an enable-method of spin-table, and a cpu-release-address of 0xe0, 0xe8, and 0xf0

14:20 <clever> if you write a 64bit addr to those slots, and then run the SEV opcode, then secondary_spin will jump to the 64bit addr you wrote

14:20 <clever> 0xe0, 0xe8, and 0xf0 matches up to lines 179-188 of armstub8.S
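
The release handshake clever describes can be modelled off-target. This is only a simulation sketch in Python: a bytearray stands in for low physical memory, the slot addresses are the ones quoted above, and release_core, poll_release and HYP_SECONDARY_ENTRY are hypothetical names. On real hardware the write is a 64-bit store followed by the SEV instruction, and the parked core goes back to waiting whenever it reads zero.

```python
import struct

# Release slots for cores 1-3, as quoted from the device tree above.
RELEASE_ADDRS = {1: 0xE0, 2: 0xE8, 3: 0xF0}


def release_core(memory: bytearray, core: int, entry: int) -> None:
    """Store a 64-bit little-endian entry point in the core's release slot."""
    struct.pack_into("<Q", memory, RELEASE_ADDRS[core], entry)
    # On real hardware, a barrier and the SEV instruction would follow here.


def poll_release(memory: bytearray, core: int) -> int:
    """Read the core's slot the way the parked core would; 0 means keep waiting."""
    (entry,) = struct.unpack_from("<Q", memory, RELEASE_ADDRS[core])
    return entry


ram = bytearray(0x100)           # stand-in for the first 256 bytes of memory
HYP_SECONDARY_ENTRY = 0x80000    # hypothetical address, for illustration only

assert poll_release(ram, 1) == 0          # slot empty: core 1 stays parked
release_core(ram, 1, HYP_SECONDARY_ENTRY)
print(hex(poll_release(ram, 1)))          # core 1 now sees its entry point
```

A hypervisor that wants all four cores would do this once per secondary core, pointing each slot at its own EL2 entry code rather than at Linux.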

14:21 <clever> SikkiLadho: you must also modify the device-tree, to remove that enable method (or re-implement it, or replace it with PSCI)

14:22 <SikkiLadho> I have been able to "edit" the device tree with libfdt previously, so I will try to modify it.

14:24 <SikkiLadho> Thank you everyone for the direction

14:24 <clever> SikkiLadho: so, there are ~4 steps

14:25 <clever> 1: delete 3 cores from DT, so linux cant start them in EL2 mode

14:25 <clever> 2: your hypervisor starts them itself, so you control all 4 cores

14:25 <clever> 3: re-add the cpu cores back to DT (or just modify them), so linux uses something else like PSCI or spin-tables at a new addr

14:26 <clever> 4: when the spintable/PSCI gets an event from linux to wake a given core, run linux in EL1 mode on that core

14:26 <clever> the same as how you ran linux in EL1 on core0

14:28 <mrvn> clever: doesn't the DT say whether to write 32bit or 64bit?

14:28 <clever> mrvn: i think its always native width?

14:28 <mrvn> I would assume the cell size to match

14:29 <clever> Documentation/arm64/booting.rst:- CPUs with a "spin-table" enable-method must have a 'cpu-release-addr'

14:29 <clever> arch/arm64/kernel/smp_spin_table.c: .name = "spin-table",

14:30 <knusbaum> Hmm. My executable isn't dynamic, so maybe this means nothing, but ld.so gives me this: "./out.o: error while loading shared libraries: ./out.o: ELF load command address/offset not properly aligned"

14:30 <clever> https://github.com/raspberrypi/linux/blob/rpi-5.10.y/arch/arm64/kernel/smp_spin_table.c#L84-L91

14:30 <bslsk05> github.com: linux/smp_spin_table.c at rpi-5.10.y · raspberrypi/linux · GitHub

14:30 <clever> mrvn: writeq_relaxed is whats used to write to the addr

14:31 SikkiLadho has quit [Ping timeout: 272 seconds]

14:31 <clever> mrvn: which i think is 64bit based

14:31 <mrvn> knusbaum: if your executable isn't dynamic then it wouldn't call ld.so

14:32 <mrvn> clever: doesn't mean that's actually "portable". :)

14:32 <mrvn> and who names their executables .o?

14:37 <knusbaum> me

14:37 xenos1984 has quit [Read error: Connection reset by peer]

14:37 <knusbaum> when I'm running a test and don't care what the file is called

14:38 <GeDaMo> I knew an engineer who would name all his programs 'fred' :P

14:40 <sham1> I suppose that's better than the default from gcc and clang which is a.out

14:42 <mrvn> ahh, back in the good old days, when an a.out actually way a.out. Nowadays the default name should be elf.

14:43 <mrvn> s/way/was/

14:43 <mrvn> knusbaum: it looks like you are trying to execute an object file, not an executable.

14:44 <mrvn> Are you aligning stuff to page boundaries?

14:44 <mrvn> padding between sections?

14:45 srjek has joined #osdev

14:47 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

14:48 sortie has quit [Remote host closed the connection]

14:49 roan has quit [Read error: Connection reset by peer]

14:49 sortie has joined #osdev

14:50 srjek has quit [Ping timeout: 240 seconds]

14:51 ElectronApps has quit [Remote host closed the connection]

14:51 X-Scale` has joined #osdev

14:51 teroshan2 has joined #osdev

14:52 buffet5 has joined #osdev

14:52 zid` has joined #osdev

14:52 DonRichie2 has joined #osdev

14:52 woky_ has joined #osdev

14:53 SikkiLadho has joined #osdev

14:53 <knusbaum> mrvn, I'm manually putting together an ELF executable, so surely I'm doing something wrong. Here's the readelf -a output: v

14:53 <knusbaum> http://sprunge.us/VlnRQX

14:54 <mrvn> Data: 2's complement, little endian

14:54 <mrvn> Is there any system with ELF that has 1's complement?

14:55 <knusbaum> Not that I'm aware of.

14:55 <knusbaum> There are a lot of strange features in ELF.

14:55 Brnocris1 has joined #osdev

14:55 Terlisimo1 has joined #osdev

14:55 <mrvn> 0x0000000000000078 isn't page aligned. so that might be the problem.

14:55 __xor has joined #osdev

14:55 EtherNet_ has joined #osdev

14:55 vin1 has joined #osdev

14:56 pg12_ has joined #osdev

14:56 <knusbaum> mrvn, Do segments need to be page aligned in the file? That's the source offset.

14:56 xenos1984 has joined #osdev

14:56 <mrvn> knusbaum: if the ld.so wants to use mmap then the alignment of offset and virtual address must match in the lower bits.

14:57 <knusbaum> hmmmmmmmmmmmm

14:57 <knusbaum> I did not know that.

14:57 <mrvn> Not sure how you got ld to place the section at 0x78.

14:57 <mrvn> Did you set pagesize for the ld to 1?

14:57 <knusbaum> I put it at 0x78. It's my assembler and linker.

14:57 <mrvn> Well, your linker isn't compatible to the ld.so then.

14:58 <knusbaum> Yes, that's what I'm trying to figure out :)

14:58 <mrvn> The default page size for x86_64 is 2MB.

14:58 <knusbaum> wat. I thought the page size was like 8k unless using big pages

14:59 <mrvn> 4k is what you use for kernels and might work with linux ld.so but I never tried.

14:59 <mrvn> knusbaum: 8k is sparc iirc. Nearly everything has 4k as smallest.

14:59 <mrvn> And elf files are aligned to huge pages on x86_64.

14:59 X-Scale has quit [*.net *.split]

14:59 nyah has quit [*.net *.split]

14:59 blockhead has quit [*.net *.split]

14:59 vin has quit [*.net *.split]

14:59 kkd has quit [*.net *.split]

14:59 zid has quit [*.net *.split]

14:59 EtherNet has quit [*.net *.split]

14:59 Brnocrist has quit [*.net *.split]

14:59 teroshan has quit [*.net *.split]

14:59 DonRichie has quit [*.net *.split]

14:59 _xor has quit [*.net *.split]

14:59 Clockface has quit [*.net *.split]

14:59 simpl_e has quit [*.net *.split]

14:59 ephemer0l has quit [*.net *.split]

14:59 Terlisimo has quit [*.net *.split]

14:59 pg12 has quit [*.net *.split]

14:59 dh` has quit [*.net *.split]

14:59 buffet has quit [*.net *.split]

14:59 woky has quit [*.net *.split]

14:59 koolazer has quit [*.net *.split]

14:59 X-Scale` is now known as X-Scale

14:59 teroshan2 is now known as teroshan

14:59 DonRichie2 is now known as DonRichie

14:59 buffet5 is now known as buffet

15:00 <knusbaum> Ok. My memory told me linux bumped from 4k to 8k but I must be misremembering.

15:00 <mrvn> absolutely not. they support it but its not what is used on x86_64.

15:01 <mrvn> Some hacks need 8k pages, 2 consecutive 4k pages.

15:01 <knusbaum> I see.

15:01 <mrvn> ARM has 16k pages made up of 4 4k pages even in hardware.

15:01 <knusbaum> Ok so let me try aligning to 0x200000

15:02 <knusbaum> You say this needs to be both in the ELF itself *and* in virtual memory?

15:02 <knusbaum> I must be misunderstanding. There are binaries smaller than 2MiB

15:04 <mrvn> I'm not sure what is allowed, as said 4k is probably sufficient. Default is 2MB alignment in ld which trips many people up when they try to get the multiboot signature into the first 8k of the file.

15:05 <mrvn> So just try matching up the lower 12 bits and see if that solves the problem.

15:06 <mrvn> If I look at e.g. /bin/bash I see:

15:06 <mrvn> [25] .data

15:06 <mrvn> PROGBITS 0000000000125700 0000000000124700 0

15:07 koolazer has joined #osdev

15:07 nyah has joined #osdev

15:07 kkd has joined #osdev

15:09 garrit has quit [Remote host closed the connection]

15:11 <knusbaum> Ok, cool. Let me try that first.

15:15 Brnocris1 is now known as Brnocrist

15:23 [itchyjunk] has joined #osdev

15:25 <mjg> http://bxr.su/FreeBSD/lib/libc/stdtime/strftime.c#261 loller

15:25 <bslsk05> bxr.su: Super User's BSD Cross Reference: /FreeBSD/lib/libc/stdtime/strftime.c

15:25 roan has joined #osdev

15:26 k8yun has joined #osdev

15:26 Payam has quit [Quit: Client closed]

15:32 roan has quit [Quit: Lost terminal]

15:34 <knusbaum> mrvn, thanks so much. I just aligned the section to 4k in the ELF and it works now.
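
The rule that fixed this (file offset and virtual address of each loadable segment agree in the low 12 bits) is easy to check mechanically. A minimal sketch in Python, assuming a little-endian ELF64 file and a 4 KiB page; misaligned_segments is a hypothetical helper, not part of any existing tool:

```python
import struct

PT_LOAD = 1
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, little endian
PHDR = struct.Struct("<IIQQQQQQ")           # Elf64_Phdr, little endian


def misaligned_segments(image: bytes, page_size: int = 4096) -> list:
    """Return (index, p_offset, p_vaddr) for every PT_LOAD entry whose file
    offset and virtual address differ modulo page_size."""
    ehdr = EHDR.unpack_from(image, 0)
    ident, e_phoff, e_phentsize, e_phnum = ehdr[0], ehdr[5], ehdr[9], ehdr[10]
    # ident[4] == 2 is ELFCLASS64, ident[5] == 1 is ELFDATA2LSB.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    bad = []
    for i in range(e_phnum):
        p_type, _flags, p_offset, p_vaddr, *_rest = PHDR.unpack_from(
            image, e_phoff + i * e_phentsize
        )
        if p_type == PT_LOAD and (p_offset - p_vaddr) % page_size != 0:
            bad.append((i, p_offset, p_vaddr))
    return bad


# Usage (file name taken from the conversation above):
#   with open("out.o", "rb") as f:
#       print(misaligned_segments(f.read()))
```

The /bin/bash line mrvn pasted passes this test: address 0x125700 and offset 0x124700 both end in 0x700.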

15:35 <mrvn> knusbaum: you should use the page-size macro to get the system's page size.

15:35 <mrvn> just in case you port to other archs later.

15:35 <knusbaum> Hmm. good idea. Hard when cross-compiling. I wonder what the safest value is.

15:35 <mrvn> knusbaum: it's a runtime function.

15:37 <knusbaum> Not cross-compiling the assembler, I mean when the assembler/linker are cross-assembling/linking.

15:37 <knusbaum> The ELF is generated on a different machine than it runs.

15:37 <mrvn> ahh, yeah. You just have to know then.

15:38 <mrvn> I guess you have to use the static define then: /usr/include/x86_64-linux-gnu/sys/user.h:#define PAGE_SIZE (1UL << PAGE_SHIFT)
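
For the runtime case, a minimal sketch in Python of asking the running system for its page size and rounding an offset up to it. It reports the page size of the machine it runs on, so it does not help when the linker output is meant for a different target; align_up is a hypothetical helper and assumes the alignment is a power of two.

```python
import mmap
import os

# Page size of the machine running this script (Unix only).
page_size = os.sysconf("SC_PAGE_SIZE")
assert page_size == mmap.PAGESIZE   # the mmap module exposes the same value


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


# 0x78 is the file offset discussed above; the result is the next page boundary.
print(hex(align_up(0x78, page_size)))
```

A cross-linker would replace page_size with a constant chosen for the target, as discussed above.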

15:39 <knusbaum> Yeah, I know my target OS so I could pull headers or something like that so it stays up to date.

15:43 k8yun_ has joined #osdev

15:44 <knusbaum> At least up to date as of whenever I build the assembler and linker

15:44 <SikkiLadho> Hey clever, you wrote that "CPUs are instead running in armstub(in secondary_spin)" but I used Trusted Firmware-A instead of the default armstub. So where are the CPU cores now?

15:47 k8yun has quit [Ping timeout: 240 seconds]

15:53 <clever> SikkiLadho: in the trusted firmware version of that, same rules apply, your hypervisor must claim those cores according to the TF's rules, and then run linux in EL1

15:59 <mrvn> knusbaum: It's unlikely to ever change once you have the define.

16:06 k8yun_ has quit [Quit: Leaving]

16:17 immibis has quit [Remote host closed the connection]

16:17 immibis has joined #osdev

16:20 immibis has quit [Read error: Connection reset by peer]

16:20 immibis has joined #osdev

16:28 <mrvn> clever: have you ever measured how much heat the arm cores produce in the secondary_spin compared to sleeping?

16:28 <clever> mrvn: the official secondary_spin is using wfe, so the cpu will properly park itself in idle mode, until an sev occurs

16:28 <mrvn> and how much it slows down the primary core if you don't put them to sleep?

16:29 <clever> and the arm arm has an entire chapter on how the core behaves in wfe mode

16:29 <clever> such as if all cores go into that mode, the l2 cache also parks and shuts things off

16:29 <clever> so its already in the most optimal state by default

16:30 <clever> https://github.com/raspberrypi/tools/blob/master/armstubs/armstub8.S#L157

16:30 <bslsk05> github.com: tools/armstub8.S at master · raspberrypi/tools · GitHub

16:30 <mrvn> Oh, that's changed then from when I looked at the first multi-core RPi. The firmware would just spin endlessly in a busy loop causing heat and using bus cycles.

16:31 <clever> https://github.com/raspberrypi/tools/blob/master/armstubs/armstub7.S#L178

16:31 <bslsk05> github.com: tools/armstub7.S at master · raspberrypi/tools · GitHub

16:31 <clever> and this is the arm32 version, for any quad-core board

16:31 <clever> https://github.com/raspberrypi/tools/commit/b23276d2ef3d832b40ae3e4dbefac992539557d3

16:31 <bslsk05> github.com: armstubs: Add wfe to ARMv7/ARMv8-32 stubs · raspberrypi/tools@b23276d · GitHub

16:31 <clever> i can then just check git history, and find that WFE was added in may of 2017

16:32 <clever> and armstub7.S was created in april of 2016, but the commit msg implies this is after the pi2 release

16:33 <clever> wikipedia says pi2 came out in February 2015

16:33 <clever> mrvn: so yes, that problem was likely in existence for ~2 years

16:33 srjek has joined #osdev

16:36 immibis has quit [Read error: Connection reset by peer]

16:37 immibis has joined #osdev

16:39 <SikkiLadho> Thank you clever. You also said that core1/2/3 never actually go into proc_hang loop but are inside secondary_spin . Why is that when I explicitly branch the cores to proc_hang?

16:40 <clever> SikkiLadho: the arm trusted firmware, will only run your kernel on core0

16:41 <clever> you have to use another api (probably PSCI) to ask the ATF to start the other 3 cores up, at a location of your choosing

16:42 <clever> reading the /cpus node in device-tree will tell you what api you should use

16:57 SikkiLadho has quit [Ping timeout: 256 seconds]

17:01 ephemer0l has joined #osdev

17:03 k8yun has joined #osdev

17:15 zid` has quit [Ping timeout: 240 seconds]

17:18 xenos1984 has quit [Remote host closed the connection]

17:19 the_lanetly_052 has joined #osdev

17:19 xenos1984 has joined #osdev

17:40 zid has joined #osdev

17:45 dennis95 has quit [Quit: Leaving]

17:52 dude12312414 has joined #osdev

17:56 nur has joined #osdev

17:58 <geist> note the model of waking up the cores by writing to an address is not the usual model, but it's what RPI does

17:58 <geist> usually PSCI exists and then you make a call to it

17:58 <nur> woah I just walked in on something interesting!

17:58 <geist> and that's probably what your hypervisor should do, wait for linux to make a PSCI call

17:59 <nur> is someone writing an RPI hypervisor?

17:59 terminalpusher has joined #osdev

17:59 <geist> yah, though they left a little while ago

17:59 <j`ey> yes!

17:59 <nur> is it up somewhere

17:59 <geist> yah there should be a log link in the topic

18:00 <j`ey> https://github.com/SikkiLadho/Leo/

18:00 <bslsk05> SikkiLadho/Leo - Leo Hypervisor. Type 1 hypervisor on Raspberry Pi 4 machine. Mailing List: https://www.freelists.org/list/leo (1 forks/8 stargazers/GPL-2.0)

18:01 <nur> ah sweet thanks

18:01 elastic_dog has quit [Ping timeout: 260 seconds]

18:02 <zid> I think my iommu was disabled in my bios, rip

18:02 <nur> why rip

18:03 <clever> geist: ah, you mean the hypervisor should emulate its own PSCI? and only when linux wants to wake a core, does the hypervisor also wake the core?, but obviously route the core into the hypervisor, so you can init it, and then run linux in EL1

18:03 <zid> not had an iommu until now

18:03 elastic_dog has joined #osdev

18:04 <zid> I had 14 pages of dram timings in the way of finding it

18:04 <geist> clever: yeah. that's pretty standard. that's i think the main reason why PSCI is specced to either be accessible via HVC or SMC

18:04 <zid> https://rog.asus.com/forum/attachment.php?attachmentid=24465&stc=1&thumb=1&d=1375620865 not even joking, look at that scroll bar :P

18:04 <geist> if you're an EL2 you set up the DT to say PSCI exists, and tell the guest to use HVC and then just emulate psci cpu on/off and the other mandatory ones

18:05 <geist> makes for a nice interface to bring cores on and off and shutdown the guest

18:06 eschaton has joined #osdev

18:07 elastic_dog has quit [Client Quit]

18:07 elastic_dog has joined #osdev

18:08 <mrvn> Is there a hypervisor already that lets me run linux on 2 cores and play with my own kernel of the other 2?

18:09 <clever> mrvn: couldnt you just use /dev/kvm for that?

18:10 <mrvn> wouldn't that time share the cores?

18:10 <clever> there may be tunables to tell the linux scheduler to never schedule to that core

18:10 <clever> so only kvm gets it

18:10 <clever> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/virtualization/ch33s08

18:10 <bslsk05> access.redhat.com: 33.8. Setting KVM processor affinities Red Hat Enterprise Linux 5 | Red Hat Customer Portal

18:10 <mrvn> pinning kvm to 2 cores would be a start but I think not a guarantee it runs all the time.

18:11 <clever> i think another thing, is how the kvm api works

18:11 <clever> https://david942j.blogspot.com/2018/10/note-learning-kvm-implement-your-own.html

18:11 <bslsk05> david942j.blogspot.com: Play With Capture The Flag: [Note] Learning KVM - implement your own kernel

18:12 <clever> i think when you run ioctl(vcpufd, KVM_RUN, NULL);, the core you did that on, will take a brief pop in EL1, then EL2, then back to EL1 within the guest

18:12 <clever> so you just need to pin your userland threads, that are driving kvm, to a given core

18:12 <clever> and when the guest hits a trap, it gets handled by that thread, on the same core that triggered the trap

18:13 <clever> mrvn: so you basically just need to tweak the default linux affinity, to only use core 0/1, and then pin your kvm threads to core2 and core3, one thread each
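
The pinning half of that can be sketched from userland. A minimal example in Python, assuming Linux; pin_current_thread is a hypothetical helper, and the thread that would drive KVM_RUN is not shown. With pid 0, Linux applies the mask to the calling thread.

```python
import os


def pin_current_thread(cpu: int) -> set:
    """Restrict the calling thread to a single CPU and return the new mask."""
    os.sched_setaffinity(0, {cpu})   # pid 0 means the caller
    return os.sched_getaffinity(0)


# Demo: pin to the highest-numbered CPU we are currently allowed to run on,
# then restore the original mask.
original = os.sched_getaffinity(0)
target = max(original)
print(pin_current_thread(target))
os.sched_setaffinity(0, original)
```

This only keeps the pinned thread on its core. Keeping everything else off that core is the separate default-affinity problem discussed next.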

18:13 <mrvn> is there a default affinity?

18:14 <clever> and if linux never has something to schedule on core2/3, it wont pre-empt the guest

18:14 <clever> https://unix.stackexchange.com/questions/396149/globally-setting-cpu-affinity

18:14 <bslsk05> unix.stackexchange.com: numa - Globally setting CPU affinity - Unix & Linux Stack Exchange

18:14 <clever> according to this, you can set it in systemd

18:14 <clever> and i'm assuming that just sets the affinity in pid1, before spawning the rest of the os

18:14 <clever> which then just inherits it

18:15 vdamewood has joined #osdev

18:15 <mrvn> Makes sense. But I bet that needs a reboot.

18:15 <clever> `man taskset`

18:15 <clever> -a, --all-tasks

18:15 <clever> Set or retrieve the CPU affinity of all the tasks (threads) for a given PID.

18:16 <mrvn> Bigger problem will be worker threads from kernel and user space. They should only have 2, not 4.

18:16 <clever> then you just need a list of all PID's, and pray there is no race condition

18:16 <clever> there are also hotplug tricks for that, one moment

18:16 <mrvn> I can turn off cpus in linux but that would prevent kvm running on them.

18:17 <clever> maxcpus=1 in the kernel cmdline, and it will nearly forget that the other cores exist

18:17 <clever> /sys/bus/cpu/devices/cpu1/online

18:17 <mrvn> yes. but if the cpu is offline how do I get kvm on it?

18:17 <clever> but if you `echo 1 > /sys/bus/cpu/devices/cpu1/online`, it will act like you hot-plugged an entire cpu core, and start doing things

18:18 <mrvn> it will actually power down the core.

18:18 <clever> and half of the system initialized on the assumption you only had 1 core, so it only made 1 worker

18:18 vinleod has joined #osdev

18:18 <clever> probably depends on what the drivers and core are able to do

18:18 <clever> the rpi cant actually power the cores off, only put them into an idle state

18:18 <mrvn> a WFE loop normaly.

18:19 <clever> the entire arm cluster is in a single power domain

18:19 <clever> turning that off results in a loss of all arm state

18:19 <clever> so that's much more in suspend-to-ram territory, rather than parking an unused core

18:20 vinleod is now known as vdamewood

18:20 vdamewood has quit [Killed (silver.libera.chat (Nickname regained by services))]

18:39 <geist> hmm, i dont know of a hard type 1 or even a lower level thing that partitions the cores totally statically

18:40 <geist> but as clever says you can get pretty close to the effect by kicking everything off the other cores

18:40 <geist> no matter what you do you probably still need at least a thin hypervisor at EL2 so the different EL1 guests don't interfere with each other

18:40 <clever> i think xen does let you do it

18:40 <geist> and that basically by definition needs to be shared across the cores, otherwise you'd have guests interfering with each other re: tlb flushes and whatnot

18:41 <mrvn> and something to associate peripherals with one of the two kernels, or emulate them.

18:41 <geist> yah

18:41 <clever> in the initial xen config, you can just say to boot dom0 on core0/1

18:41 <geist> yah i think lots of big cloud VMs and whatnot have the ability for more static configs

18:42 <geist> i'd also say esxi (which has some beta port for rpi) but i think fundamentally it's still a type 2 hypervisor. it's just not linux based

18:42 <geist> the esx kernel is just a thinner posix kernel whose job it is to be the dom0

18:42 MrBonkers has quit [Quit: ZNC 1.7.5+deb4 - https://znc.in]

18:42 <mrvn> in clouds you have the dom0 running linux doing nothing and then VMs on fixed cores.

18:43 <geist> i wouldn't say nothing though. unless you have dedicated hardware you probably are still running a bunch of network and storage stack on it

18:43 <mrvn> I could run a linux VM and my-kernel VM on 2 cores each I guess.

18:43 <geist> what the hardware fundamentally doesn't have is some way to at the physical level carve up the system into separate domains

18:44 <geist> i bet some of the z machine stuff does from IBM, but i dont think that's much of a requirement anymore

18:44 <mrvn> on the rpi most peripherals are page aligned so that could be split

18:45 <geist> sure but i'm thinking more fundamentally: you need two level paging to be able to keep two guests from touching the same ram, and some sort of virtualization of the interrupt controller to keep them from touching each others irqs, etc

18:45 MrBonkers has joined #osdev

18:45 <geist> arm and x86 hardware have this feature, but only via some amount of software virtualization

18:45 <geist> ie, a hypervisor

18:45 <mrvn> internal interrupts for me, GIC for linux?

18:46 <geist> sure but even internal interrupts involve an interrupt controller

18:46 <mrvn> How does it work with kvm? Every write to the interrupt registers faults into the hypervisor?

18:46 <geist> but at this point it's just details. you pretty quickly arrive at the same answer: run linux + KVM and then just set up the thread affinity such that you get pretty much complete access to the cpu

18:47 <geist> mrvn: for the most part yes. especially with older GICs like GICv2, etc

18:47 <geist> v3/v4 have some additional virtualization capability to reduce the amount of traps needed

18:48 <geist> but rpi4 has a GICv2, like all cheap arm socs. and prior to that rpi doesn't even have a GIC, so any GIC in any virtualization scheme on a rpi3 or whatnot by definition involves a fully virtualized interrupt controller

18:48 <geist> which just means trap-and-emulate

18:54 <geist> i was actually reading the manual the other day to see how GICv3 (or was it v4) fully virtualizes interrupts and it's kinda neat. basically has a in memory scheme to let you (the hypervisor in EL2) build a list of pending virtual irqs that will fire when you enter the guest, and vice versa

18:55 <geist> the list is limited in size, like 8 or 16 but overflow is doable in software. so in most cases most irqs are dispatched directly in hardware without any exiting

18:55 <mrvn> that's something the GIC does? Would have thought the mode change in the cpu would do that

18:55 <geist> also you can deliver irqs to a cpu that's currently running in the guest without stopping it

18:56 <geist> the mode change does it without this assist, but with the assist you can avoid a mode change even more so

18:56 <geist> basically the GIC has a context that you swap in and out as you're switching between guests, and the context itself has a list of pending irqs and whatnot

18:57 <geist> note simpler designs like GICv2 do as well, but they're basically just a bitmap of pending IRQs, so you can also do a lot of this without the list, but the list is more flexible and handles more cases

18:58 <geist> i think this design also allows for one virtual cpu to deliver an irq directly to another one without exiting, i think

18:58 <mrvn> do they have a mask register that hides bits the running VM should not read or write?

18:58 <geist> it involves at any point in time the EL2 having a translation of what virtual cpus correspond to physical ones

18:58 <nur> is there any mmio cases on rpi that addresses something beyond the 32 bit address range

18:59 <mrvn> nur: tons of them

18:59 <geist> really? i'm not sure broadcom has put any mmio > 4GB

18:59 <mrvn> you can configure all peripheral to be in 64bit space then everything will be beyond

18:59 <mrvn> geist: the VC remaps them

18:59 <geist> oh yeah, the 4 has one of those config bits that moves everything?

18:59 <geist> aaaah yes. okay you're right, yep

18:59 <nur> the VC?

19:00 <mrvn> nur: the actual CPU running the raspberry Pi. The ARM is just a secondary processor running behind it.

19:00 <geist> videocore

19:01 <nur> just... trying to figure out where to _start_ writing for rpi and I figured, I know, mmio but then I realised I didn't know jack about how to do that. The examples I've seen take a 32 bit address but I figured wait what if you needed more than that

19:01 the_lanetly_052 has quit [Ping timeout: 256 seconds]

19:01 <geist> nur: it's not really any different if it's 64bit

19:01 <geist> just use a 64bit pointer. no difference

19:01 <nur> well the function header needs to be different I guess?

19:01 <mrvn> nur: the peripherals are at different addresses on every RPi model. You should parse the Device Tree to find out where.

19:01 <geist> and if you're doing that you are on a 64bit cpu so it's natural, because the cpu will be using 64bit pointers

19:01 <geist> (and yes i know PAE, etc etc lets not confuse nur)

19:02 <nur> can I just use one mmio function that takes a 64 bit pointer

19:02 <geist> or hard code what you're doing for the particular rpi

19:02 <nur> ah

19:02 <mrvn> nur: to do what?

19:02 <nur> DT

19:02 <nur> mrvn, to write bytes to memory

19:02 <geist> nur: well, first. precisely what raspberry pi are you trying to write software for?

19:02 <nur> to init hardware and stuff

19:02 <mrvn> nur: sure. But that's awfully slow.

19:02 <geist> and secondarily, which ones do you intend to ever run your software on?

19:02 <nur> geist, the one that qemu implements...rpi3 I think

19:02 <nur> I don't actually have one of those

19:03 <geist> are you intending to run it in 64bit or 32bit mode?

19:03 <nur> 64

19:03 <geist> they're pretty different environments, i suggest 64

19:03 <mrvn> nur: usually you just declare the MMIO register volatile and be done with it.

19:03 <geist> good. okay.

19:03 <nur> mrvn, okay you lost me

19:03 <geist> so if you're building 64bit then by definition all of your pointers are 64bit, so any pointer is already 64

19:03 <nur> ok

19:03 <zid> `void *` already exists and dynamically changes size, basically

19:04 <zid> no need to do anything special

19:04 <geist> `*(volatile uint32_t *)0x1234 = 99;` would already be dereferencing a 64bit pointer even if it is a 32bit value

19:04 <geist> etc

19:04 <mrvn> nur: There are 3 ways to access the MMIO registers: You can declare them external and set their addresses in the linker script, you can initialize pointers to the address or (in C++) you can use placement new.

19:04 <geist> or you can use an accessor function as you said before

19:05 <geist> mmio_write(void *address, uint32_t val); etc

19:05 <nur> ohhh

19:05 <nur> is it not better practice to say what the size of address is

19:05 <geist> sure, it is, but i slammed that out in 2 seconds

19:05 <geist> there are better styles, cleaner code, etc etc

19:05 <geist> just trying to get you aligned in the right direction :)

19:05 <nur> okay thanks sorry :)

19:06 <mrvn> you should have mmio_write8, mmio_write16, mmio_write32, mmio_write64
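A minimal sketch of the size-specific accessors mrvn suggests, assuming a 64-bit C build where the address is passed as a `uintptr_t`. The function names follow the `mmio_write32`/`mmio_write64` naming from the chat; the read variants and the choice of `uintptr_t` over `void *` are illustrative assumptions, not anyone's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* On a 64-bit build every pointer is already 64 bits wide, so one address
 * type covers MMIO both below and above 4 GiB. */
static_assert(sizeof(uintptr_t) == sizeof(void *),
              "uintptr_t must be able to hold a pointer");

/* Each accessor casts to a volatile pointer of the exact register width,
 * which asks the compiler for a single access of that size that it may
 * not drop or merge with neighbouring accesses. */
static inline void mmio_write32(uintptr_t addr, uint32_t val)
{
    *(volatile uint32_t *)addr = val;
}

static inline uint32_t mmio_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

static inline void mmio_write64(uintptr_t addr, uint64_t val)
{
    *(volatile uint64_t *)addr = val;
}

static inline uint64_t mmio_read64(uintptr_t addr)
{
    return *(volatile uint64_t *)addr;
}
```

The 8 and 16 bit variants would follow the same pattern if a device needs them.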

19:06 <nur> got it

19:06 <geist> right, also depends on the language you're using, how fancy you want it, how type safe you want it, etc

19:06 <nur> that makes sense

19:06 <mrvn> not sure the RPi has anything other than mmio_write32/64 though.

19:06 <geist> but fundamentally you're trying to get the compiler to do something like

19:06 <nur> language is C

19:06 <geist> `mov register, a_pointer; ldr dest_register, [register]` to read from an address

19:06 <geist> and a str for the other direction

19:07 <nur> ohh

19:07 <nur> okay

19:07 <nur> ah that makes sense yes

19:07 <geist> and since in this case 'register' is a 64bit register, it will by definition be a 64bit pointer, etc

19:07 <nur> heck I could write it in asm

19:07 <nur> the mmio function

19:07 <geist> you could, and there are some reasons to do so, but it'd be overkill and a bit of a risk right now

19:08 <nur> right

19:08 <geist> in general when getting started i recommend the least amount of inline asm (or asm) you can get away with, because all safeties are off and you can have bugs that continually bite you

19:08 <geist> since it's subtle and tricky and there are no safety nets there

19:08 <nur> juggling chainsaws whee

19:08 <mrvn> nur: You must read/write from a MMIO register in exactly the documented size or you get total garbage. So I would suggest making a "struct MMIO32 { volatile uint32_t *reg; }" and struct MMIO64 if you need it. The type for MMIO registers should be incompatible with any other type so you can't accidentally call something with the wrong thing.

19:09 <geist> yah the struct method works pretty well. i generally have moved away from it, but it's more of a style thing, and i think it's perfectly usable

19:09 <geist> a good clean solution

19:09 <mrvn> Sadly the struct is the only way to make the compiler insist on the right type.
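The struct approach mrvn describes could look roughly like this in C. Only `struct MMIO32 { volatile uint32_t *reg; }` comes from the chat; the helper names (`mmio32_at`, `mmio32_read`, `mmio32_write`, and the 64-bit counterparts) are hypothetical and the alignment asserts are an added assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Distinct wrapper types: handing a struct MMIO64 to a function that takes
 * a struct MMIO32 is a compile error, which bare integers or void pointers
 * would not give you. */
struct MMIO32 { volatile uint32_t *reg; };
struct MMIO64 { volatile uint64_t *reg; };

/* Build a typed handle from a raw address (assumed naturally aligned). */
static inline struct MMIO32 mmio32_at(uintptr_t addr)
{
    assert((addr & 3u) == 0);
    return (struct MMIO32){ (volatile uint32_t *)addr };
}

static inline struct MMIO64 mmio64_at(uintptr_t addr)
{
    assert((addr & 7u) == 0);
    return (struct MMIO64){ (volatile uint64_t *)addr };
}

static inline uint32_t mmio32_read(struct MMIO32 r)               { return *r.reg; }
static inline void     mmio32_write(struct MMIO32 r, uint32_t v)  { *r.reg = v; }
static inline uint64_t mmio64_read(struct MMIO64 r)               { return *r.reg; }
static inline void     mmio64_write(struct MMIO64 r, uint64_t v)  { *r.reg = v; }
```

With this, `mmio32_write(mmio64_at(addr), 1)` is rejected by the compiler instead of silently doing an access of the wrong width.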

19:09 <geist> yah

19:10 <geist> well or some silly super fancy C++ (like we're doing in fuchsia which i dislike but does work)

19:10 <nur> I see

19:10 <mrvn> C++ has more ways but 20:06 < nur> language is C

19:10 <nur> I'm keeping it simple

19:10 <geist> agreed

19:11 <geist> yah i'm in no way advocating super fancy C++, just saying there are options there in case some day you go that direction

19:11 <nur> thanks, got it :)

19:11 <geist> for once i think mrvn and i are on the same page and agreeing!

19:12 * geist high fives mrvn

19:12 <nur> is that uncommon

19:12 <geist> both of us like to argue each other into the ground over details

19:12 <mrvn> Even if it's a bit more complex having distinct types will save you tons of headaches. It's just too easy to misuse some value for something it shouldn't be used for and distinct type will then give compiler errors.

19:12 <nur> I feel like all this good information should go into a blog entry but I don't want to get it subtly wrong and pollute the internet :)

19:13 <geist> nur: possibly. but the main problem is everyone has a different style here, so even the fact that mrvn and i agree here means there are 10 people foaming at the mouth ready to attack us if this went to a wider group

19:13 <mrvn> nur: a best practices page for the osdev wiki would be better

19:13 <geist> but. the real meta here is there are lots of ways to do this and making forward progress is more important than doing it in some ideal way on day 1

19:14 <nur> as in life itself

19:14 <mrvn> or even just a side-by-side comparison of different methods.

19:14 <geist> i'm always a huge advocate of keeping momentum and not getting bogged down in details, as we are all prone to do

19:14 <mrvn> nur: The most important part you have to take away is that all MMIO access must be a) right size, b) volatile.

19:15 <nur> "volatile" is just "write to register addresses in C" right?

19:15 <nur> is there something in the background that happens to ensure that this is done right

19:16 <geist> or even more generic, you're trying to make sure the compiler is for every mmio access emitting precisely one load or store instruction, and that it is in the right spot of the program

19:16 <nur> like it's telling the compiler "this is a _register_ I am trying to write to"

19:16 <mrvn> nur: volatile tells the compiler that any read/write to that register has an observable effect. The compiler must not optimize them out.

19:16 <geist> hence volatile which among other things tells the compiler it *must* do the access right here

19:16 <mrvn> Also a read might not return the same value you just wrote.

19:16 <nur> why not

19:17 <GeDaMo> It might be changed by something other than your program

19:17 <nur> oh like devices!

19:17 <GeDaMo> Like a device changing state

19:17 <geist> language lawyers will tell you that there are probably a ton of edge cases etc, and there are memory order and memory barrier issues to deal with in some exotic situations, but when getting started you can generally ignore that. and double plus so when against an emulator

19:17 <nur> I GUESSED RIGHT

19:17 <GeDaMo> :D

19:17 <zid> unless you're doing threads you can in fact completely ignore memory barriers, volatile is a compiler barrier and that's plenty

19:18 <geist> zid: weeeeeelllll not really. but lets not side track it

19:18 <mrvn> nur: Take the UART as example. When you write to it it outputs a character to the serial port. But when you read from it it gives you the character received over serial. Not what you wrote.

19:18 <zid> (unless you're an alpha)

19:18 <geist> zid: or ARM

19:18 <mrvn> zid: write back buffers?

19:18 <nur> mrvn, and this is important to tell the compiler "ASSUME NOTHING"

19:18 <nur> because the compiler makes assumptions when optimizing

19:18 <nur> right?

19:18 <mrvn> nur: yes.

19:18 <mrvn> nur: volatile tells the compiler it can't do that here.
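A small sketch of why `volatile` matters for the UART case discussed here, assuming a device with one data register and one status register. The struct, the `uart_putc` name and the `UART_FLAG_TX_FULL` bit position are hypothetical placeholders; real offsets and bits come from the UART's documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status bit: set by the hardware while the transmit FIFO
 * cannot take another byte. */
#define UART_FLAG_TX_FULL (1u << 5)

struct uart_regs {
    volatile uint32_t *data;   /* write: send a byte; read: the received byte */
    volatile uint32_t *flags;  /* status bits that the device changes itself */
};

static inline void uart_putc(const struct uart_regs *u, char c)
{
    assert(u && u->data && u->flags);

    /* Because flags is volatile, the compiler must reload it on every
     * iteration. Without volatile it could load once, see the bit set,
     * and spin forever on a stale value. */
    while (*u->flags & UART_FLAG_TX_FULL) {
    }

    /* A single 32-bit store. Reading data back afterwards would return
     * whatever was received, not the byte just written. */
    *u->data = (uint32_t)(unsigned char)c;
}
```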

19:18 <nur> I got another one right again.

19:18 <nur> group hug.

19:19 <geist> awww

19:19 <geist> it's nice when i virtually hear lightbulbs going off above someones head

19:19 <GeDaMo> Exploding lightbulbs? :|

19:19 <geist> it's honestly what this place is all about. screwing in bulbs

19:19 <mrvn> nur: used to be volatile was for interrupts but that use case has been basically obsoleted and shouldn't be used.

19:19 <nur> :>

19:19 <geist> s/going off/turning on/!

19:19 <zid> I screwed in some screws earlier, hated it

19:20 <mrvn> geist: exploding lightbulbs is so much more fun

19:20 <zid> (my cpu cooler is really annoying to mount)

19:20 <geist> zid: we can blab about memory barriers on ARM in a bit once this is over if you're interested

19:20 <geist> zid: i dunno, i kinda like the screw in type of coolers, vs clips

19:20 <geist> i once had the plastic clips holding one of my coolers snap off spontaneously one day

19:20 <zid> the stock intel cooler had little push pin clips

19:20 <zid> they were dire

19:20 <geist> the cooler fell off the cpu, and within about 3 seconds the machine shut down

19:21 <geist> was an athlon, late 2000s

19:21 <mrvn> lucky it didn't short anything important

19:21 <geist> yah totally

19:21 <zid> I've booted a pentium 4 machine with a glass of water

19:21 <mrvn> could have fallen on the PCI cards and break them off or damage the ports.

19:21 <zid> It had like 110C tjmax so until all the water boiled away it was perfectly fine :p

19:22 <geist> mrvn: yah really what happened is it pivoted over, since the bottom clip was still holding it. it just turn and hit the outside of the case so was at about a 20 degree angle

19:22 <geist> but wasn't touching the cpu anymore

19:22 <zid> Mine is a pain to mount so I *re*mounted it to see if it'd improve temps, there was a little bald spot of thermal paste

19:22 <geist> it was a tall heavy cooler

19:22 <zid> but it also was a 2nd pain to mount so I may have just done it again

19:23 <geist> yah sometimes with the screw ones you can overtighten and bend the mobo too so it doesnt make great contact

19:23 <zid> that's why they have backplates

19:23 <mrvn> and springs

19:23 <geist> was watching a vid on a current mobo with cheapo backplates that gamers nexus guy was complaining about

19:23 <zid> https://www.hardwareasylum.com/images/rampage4extreme/backplate.jpg

19:24 <mrvn> and stops. tighten the screws till the stop and then the spring gives the right tension

19:24 <zid> same gen as mine, not looked at mine but it's probably not too different

19:24 <geist> an issue being that lots of mobos have crummy backplates

19:24 <zid> I can find gene 3 and extreme 4, but not gene 4 on GiS

19:27 <zid> My thing is just annoying to mount because it's basically on a small contact patch of butter, while you're trying to align 4 different screw holes

19:27 <zid> would be better with a long shafted screwdriver and 8 hands and no case

19:28 <geist> reminds me, i need to build an alder lake PC to test the new E and P core stuff

19:28 <geist> been putting it off. i had a parts list all set up at newegg and then a huge bruhah started up so i decided i shouldn't use them

19:29 <zid> friendo did a new PC last week

19:29 <zid> I told him to just get a 12600k

19:30 <zid> his mobo was being dumb wrt ram though

19:30 <zid> I blame gigabyte, i hate gigabyte

19:30 <geist> yah i tend to prefer ASUS. have had good luck with them and their bioses are usually reasonable

19:30 <zid> <3 asus

19:31 <geist> yah even their cheapo PRIME series, which is my usual go to if i'm not building a high end gamer PC

19:31 <geist> aaaaaand lots of asus boards *still* come with a COM header

19:31 <zid> my 2011 mobo is still £200 second hand on ebay

19:31 <geist> most do, eactually

19:31 <geist> so for osdev it's ❤️

19:31 <zid> so if you give a shit about resale at all, asus gud

19:31 <geist> you can get a $3 10 pin COM header to DE9 and you're set

19:32 <zid> I have the super fancy superio that has EVERYTHING

19:32 <zid> but nto everything is wired up cus.. micro atx

19:33 <zid> got a soldering iron handy? :P

19:33 <geist> yah i think that's the key to the COM header. usually it's just a few pins off the nuvoton superio chip, which they'll have if they also have PS/2 or whatnot pins

19:33 <geist> but they probably won't stuff on an entire superio chip if com was the only thing they needed

19:33 <zid> there's a few models of the superio yea, I have the EXTREME MEGA 48003489 PIN PACKAGE one

19:33 <zid> so that they can give me ps/2 and stuff

19:34 <zid> NCT6776F

19:34 <kingoffrance> i noticed ack 16-bit pc86 target, i can int main(void) { char a; a = 5; a = 6; a = 7; return 0; } and it is simple enough not to optimize those out, despite no volatile, despite not being used anywhere; with optimized gcc they disappear from the disassembly :D

19:34 <zid> 102 pins

19:34 <zid> err 128 pins

19:34 <geist> anyway re: memory barriers and IO and ARM: you *should* memory barrier after writing to a device MMIO register bank because it forces the cpu to flush the transactions across the bus

19:35 <geist> if the mmio is mapped with 'device memory' (which you should) then the transactions are in order, not write combined, etc, but... they're fire and forget

19:35 <zid> I had a thing typed out and I cut it, but then copy pasted an image so I lost it

19:35 <geist> so the cpu moves on before the device has acked the transactions

19:35 <mrvn> kingoffrance: optimization can happen, but doesn't have to happen.

19:35 <geist> but a DSB forces the bus to flush it out

19:35 <zid> how strong is the arm memory model? does it reorder writes/writes, loads/loads, writes/loads?

19:35 <kingoffrance> mrvn, i just mean, older compilers perhaps are less trigger happy

19:35 <mrvn> geist: even for device memory you should use a barrier?

19:36 <geist> zid: it's complicated. with normal memory (which is usually how regular memory is mapped), it's weakly ordered

19:36 <zid> so it's up to the MTRR equivalents?

19:36 <geist> mrvn: yes. but its hard to construct a situation where you hit a problem, except when you do

19:36 <geist> zid: no. not at all. it really has no equivalent model to MTRR

19:36 <zid> You were just going on about 'device memory' and 'normal memory' and 'how it's mapped' so it sounded like it did

19:37 <geist> so one thing at a time. to zids question: ARM is weakly ordered, so for normal memory it is free to reorder things as it sees fit *on the wire* but the cpu must still appear to be in order relative to itself

19:37 <geist> ie, it can't hoist a load over a store such that it is inconsistent with itself

19:37 <mrvn> geist: take a revision 1 RPi and access different peripherals without barrier. The reads and writes do get scrambled randomly because the bus implementation they connected to is garbage.

19:37 <zid> x86 could just set UC on the range even if it wasn't as strong

19:38 <geist> zid: reason i say it's not the same thing as MTRR is ARM doesn't do it the same way. MTRRs refer to physical memory, ARM tags different memory types at the mmu, on the page table entries

19:38 <geist> so it's a more complicated thing. it's kinda the same idea but done at a different layer

19:38 <zid> I mean.. I'd consider that basically the same

19:38 <geist> sure depends on which part you're thinking about

19:39 <geist> anyway. so that's the 'weak memory model' that most programmers hit. means between threads without a memory barrier you cannot ensure that memory transactions appear in any particular order

19:39 <mrvn> and here I thought MMIO wouldn't need those barriers anymore with newer (fixed) rpi hardware.

19:39 <geist> mrvn: i'll get to the why in a bit

19:39 <geist> it's a different thing entirely to what i'm talking about with zid

19:40 <mrvn> device memory is never reordered on the wire, right?

19:40 <geist> so i think zid is satisfied, since i'm hearing no complaints

19:40 <geist> and they usually type faster :)

19:41 <geist> mrvn: armv8 defines memory model a bit more explicitly than armv7. in the case of ARMv8 you set up memory parameters in the form of a few bits that you set per page

19:41 <zid> so ultimately the question is, on 'device memory' (whatever that entails, page table bits to mark it UC or whatever by the sounds of it), does a single cpu need barriers to ensure write ordering?

19:41 <geist> okay so onto device memory. it operates completely differently from 'normal memory' since it's uncached

19:41 <zid> you would still on things like alpha, you don't on x86, but not sure about arm

19:42 <geist> so in armv8 you specify memory parameters as a series of bits

19:42 <geist> sometimes you'll see something like GRE or nGnRnE or nGnRE or whatnot

19:42 <geist> iirc off the top of my head: G = gathering, R = reordering, and E = early acknowledge

19:43 <geist> gathering = can merge subsequent stores and issue a single transaction

19:43 <geist> reordering = can rearrange the order of stuff across the bus

19:43 <zid> so don't set R and you can ignore barriers by the sounds of it

19:43 <geist> early acknowledge = can issue a write and then 'move on' without waiting for the device on the other end of the bus to ack it

19:43 <zid> E sounds useful for devices

19:43 <zid> assuming they're clever enough about it

19:44 <geist> so in armv8 sense as it maps to armv7 'device memory' == nGnRE

19:44 <mrvn> not sure why E would matter

19:44 EtherNet_ is now known as EtherNet

19:44 <geist> and 'strongly ordered' == nGnRnE

19:44 <geist> basically you avoid SO (strongly ordered) like the plague and only use it in very special situations, and map things like MMIO registers as 'device memory'
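For reference, the memory types geist lists correspond to attribute bytes in the ARMv8-A `MAIR_EL1` register, which page table entries then select by index. The sketch below uses the architectural encodings for those bytes; the macro names, the `mair_attr` helper and the slot assignment (normal memory in slot 0, device in slot 1, strongly ordered in slot 2) are arbitrary choices for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Attribute byte encodings for MAIR_EL1 (ARMv8-A). */
#define MAIR_DEVICE_nGnRnE 0x00u  /* ARMv7 "strongly ordered" */
#define MAIR_DEVICE_nGnRE  0x04u  /* ARMv7 "device memory" */
#define MAIR_DEVICE_nGRE   0x08u
#define MAIR_DEVICE_GRE    0x0Cu
#define MAIR_NORMAL_NC     0x44u  /* normal memory, non-cacheable */
#define MAIR_NORMAL_WB     0xFFu  /* normal memory, write-back cacheable */

/* Hypothetical slot layout; page table entries refer to these indices. */
enum { ATTR_IDX_NORMAL = 0, ATTR_IDX_DEVICE = 1, ATTR_IDX_STRONG = 2 };

/* Place one attribute byte into one of the eight MAIR slots. */
static inline uint64_t mair_attr(uint8_t attr, unsigned idx)
{
    assert(idx < 8);
    return (uint64_t)attr << (idx * 8);
}

/* Value that would be written to MAIR_EL1 for the layout above. */
static inline uint64_t example_mair_value(void)
{
    return mair_attr(MAIR_NORMAL_WB,     ATTR_IDX_NORMAL)
         | mair_attr(MAIR_DEVICE_nGnRE,  ATTR_IDX_DEVICE)
         | mair_attr(MAIR_DEVICE_nGnRnE, ATTR_IDX_STRONG);
}
```

MMIO pages would then be mapped with `ATTR_IDX_DEVICE`, keeping the strongly ordered slot for the rare special case.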

19:45 <geist> what this means: when you read from a mmio it reads it full stop, doesn't reorder other memory transactions *to the same device aperture*

19:45 <geist> so 3 reads of the same register happen as 3 reads, or 3 reads from different registers on the same bus, 3 different reads, not reordered

19:45 <geist> writes are similar *except* the cpu issues the writes, in order, not combined, but doesn't wait for the device to ack it

19:46 <geist> so it can take 10s or hundreds of cycles for the device to *see* the write

19:46 <geist> and by then the cpu has moved on

19:46 <mrvn> and why would that matter?

19:46 <mrvn> (other than the MMU)

19:46 <geist> why this is a problem: if you write to some device and then move over and write to another one, you can have two write transactions in flight, and they can be reordered *relative to each other* as they go across the bus

19:47 <zid> yea I can't come up with a situation off the top of my head where that's an issue, but I bet I could construct one if I tried

19:47 <geist> so the canonical example is something like: you get an irq, go read the device that caused it, handle the situation, then go back and ack the irq at the interrupt controller

19:47 <geist> since both of them are different devices, the ack can appear at the IC before the device gets its writes

19:47 <zid> involving something like IRQs, yea :P

19:47 <mrvn> Huh? I thought the reorder bit would cover that case

19:47 <zid> how does it know they're different 'devices'?

19:47 <geist> mrvn: 'early acknowledge'

19:47 <geist> setting of it means it fires and forgets the writes across the bus

19:48 <mrvn> So you are saying the CPU writes them out in-order but the bus with all its hops reorders them?

19:48 <geist> correct

19:48 <geist> but there are rules there, but the way AXI busses work on ARM think of it as a tree of point to point

19:48 <geist> as the memory transaction drills across the tree, it goes from the root (the cpu) down towards devices. as it forks and heads across branches, they can get locally buffered and appear out of order

19:49 <geist> based on contention

19:49 <mrvn> Ok, so if the device and GIC are on different branches they can get out of order in respect to each other.

19:49 <geist> so the ARM ARM is very squishy about it. basically within a 'device aperture' or whatever they call it, you can be pretty assured stuff is in order

19:49 <geist> since there's only one path to that device

19:49 <geist> but as you start crossing devices, you can't up front know if they're on the same leaf node in the tree

19:50 <mrvn> The GIC gets the interrupt cleared before the device gets it turned off and you get a spurious interrupt

19:50 <geist> mrvn: right. a DSB would force the cpu to stop and wait for all outstanding writes to be acked
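The interrupt scenario above can be sketched as follows: clear the interrupt at the device, wait for that write to complete, and only then acknowledge it at the interrupt controller. The register names in `struct irq_regs` and the function names are hypothetical. The barrier uses `dsb sy` on AArch64; the fallback for other targets is only a stand-in so the sketch builds elsewhere and is not equivalent to a DSB.

```c
#include <assert.h>
#include <stdint.h>

/* Wait until earlier memory accesses, including device writes, have completed. */
static inline void device_write_barrier(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dsb sy" ::: "memory");
#else
    __sync_synchronize();  /* GCC/Clang builtin, placeholder for non-ARM builds */
#endif
}

/* Hypothetical registers living on two different devices. */
struct irq_regs {
    volatile uint32_t *dev_irq_clear;  /* in the device that raised the irq */
    volatile uint32_t *intc_eoi;       /* in the interrupt controller */
};

static inline void handle_irq(const struct irq_regs *r, uint32_t irq)
{
    assert(r && r->dev_irq_clear && r->intc_eoi);

    *r->dev_irq_clear = 1u;   /* 1: tell the device to drop its interrupt line */
    device_write_barrier();   /* 2: do not continue until that write has landed */
    *r->intc_eoi = irq;       /* 3: now ack at the interrupt controller */
}
```

Without step 2 the two writes may travel down different branches of the bus, and the ack could arrive first, producing the spurious interrupt mrvn describes.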

19:50 <zid> now I want an acking write/read instruction and a pointer tag from C

19:50 <geist> this can also happen with things like setting up memory descriptors in uncached memory and then hitting the doorbell register in a corresponding device

19:51 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

19:51 masoudd has quit [Ping timeout: 256 seconds]

19:51 <geist> basically the ARM ARM has all sorts of squishy language about how things are in order in some cases and there are some cases where the bus can reorganize stuff, so you have to be aware of what domain various things you're interacting are on

19:51 <geist> or... you put in lots of memory barriers

19:51 immibis has quit [Ping timeout: 252 seconds]

19:52 <geist> i think the canonical thing to do is interact with a device's registers and then when you're 'done' dealing with it insert a barrier to ensure the device sees what you told it

19:52 <geist> so if you read/write 3 regs in a row and are now done (say in your irq handler), issue a DSB and then exit

19:52 <mrvn> never thought having writes to devices having to be in a certain order as long as everything ends up where it should, But you are right, with devices connected to other devices (like interrupts) the order can matter.

19:52 <zid> so it is not the cpu with the ordering issue here, but the bus itself?

19:52 <geist> or between things like setting up in memory descriptors and then liking it in a HEAD/TAIL pointer on your e1000 or whatnot

19:52 <zid> hence how it knows about 'different devices'

19:53 <mrvn> zid: yep

19:53 <geist> zid: yah the cpu itself doesn't know what it's talking to, so its a bit of the way the busses are wired up that's bleeding in

19:53 <geist> you *could* map all MMIOs as strongly ordered, but that's *really* slow

19:53 <zid> and you work around it with E bits or a manual flush

19:54 <geist> since effectively that means the cpu is stuffing in a full bus barrier between every read/write

19:54 <mrvn> geist: so we are back to having to add barriers whever we switch between peripherals?

19:54 <geist> basically

19:54 <geist> or when crossing from memory to peripheral and back

19:54 <geist> ie in memory descriptors

19:55 <geist> or you can just stuff a memory barrier after every mmio transaction

19:55 <geist> ie, mmio_write() { ....; dsb; }
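Spelled out in C, the "barrier after every MMIO write" variant geist mentions might look like this. The names `dsb_sy` and `mmio_write32_sync` are made up for the sketch, and the non-AArch64 fallback exists only so the code compiles on other hosts; it is not a real DSB.

```c
#include <assert.h>
#include <stdint.h>

static inline void dsb_sy(void)
{
#if defined(__aarch64__)
    /* Full-system data synchronization barrier. */
    __asm__ volatile("dsb sy" ::: "memory");
#else
    __sync_synchronize();  /* GCC/Clang builtin, placeholder for non-ARM builds */
#endif
}

/* Write one 32-bit register, then wait for the write to complete before
 * returning. Safe but slow: every write pays for a full barrier. */
static inline void mmio_write32_sync(volatile uint32_t *reg, uint32_t val)
{
    assert(((uintptr_t)reg & 3u) == 0);  /* registers are naturally aligned */
    *reg = val;
    dsb_sy();
}
```

The cheaper alternative from the discussion is to use plain volatile writes within one device and issue a single barrier when finished with it or before touching another device.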

19:55 <geist> i think that's *basically* what we do in fuchsia, which is sub optimal, but probably the safest thing to do

19:56 <zid> You might be able to do some silly stuff to make it automatic to get the inter-device flushy stuff working

19:56 <geist> also one of those reasons the whole struct { volatile ; } thing falls over

19:56 <zid> something like a locking pattern

19:56 <geist> yah

19:56 <mrvn> I had some code for the RPi where the last device used is a template parameter sort of. So at the first access to a peripheral it generates a barrier because it's != the last.

19:56 <geist> i'm not a fan of things that automatically do this, but since most people don't study the ARM ARM it's probably the right thing to do

19:57 <zid> Linter can just enforce that you do it

19:57 <zid> basically

19:57 <geist> i still don't think i fully grok the subtleties of it. especially the exact rules of switching between cached memory and device memory and what can be reordered relative to what

19:57 <geist> the manual reads like pages of legalese

19:57 <mrvn> geist: Today I would use something like pythons "with": with UART0 as uart { uart.data = 'h'; uart.data = 'e'; ... }

19:58 <zid> arm is silly news at 11? :p

19:58 <mrvn> or similar RAII like concept.

19:58 <geist> also as you can probably tell, most of this probably won't show up on a VM

19:58 <mrvn> What this really needs is a linear type system.

19:58 <geist> or when being emulated

19:59 <zid> I guess you could make this happen on any platform as long as your device took ages to deal with writes given to it

19:59 <geist> also since lots of implicit actions in ARM insert a memory barrier of some kind (though i haven't gotten into all the subtle variants of memory barrier on ARM)

19:59 <mrvn> emulated don't have the tree structure of the bus and VMs will flush or just take long enough.

19:59 <geist> lots of times you just dont notice

19:59 <zid> they just *happen* to all be fast

19:59 <zid> so ARM makes a situation that doesn't happen in practice, happen in practice

19:59 <geist> zid: honestly i dunno what the x86 memory model is with regards to io transactions

19:59 <mrvn> zid: you need 2 devices that are interconnected so the write order matters.

20:00 <geist> presumably they also just have a complicated back channel of acks and whatnot to make everything appear in order

20:00 <zid> if everybody knew acking an e1000 took 40uS, we'd already have code to deal with not acking the IRQ controller too early

20:00 <zid> it just happens to be sub-instruction instead so we don't

20:00 <geist> but sometimes i'll ask someone at work that really groks x86 at the physical level and they'll tell me it's fantastically complicated and there are ways to make things weakly ordered if you really know what you're doing

20:00 <mrvn> geist: doesn't x86 with iommu do cache snooping?

20:01 <zid> my personal bet is just that the fact threads *do* exist means those barriers always present in practice

20:01 <geist> even without

20:01 <geist> anyway, so that was my long answer to 'welll, actually' from an hour ago

20:01 <geist> me regurgitating this every once in a while refreshes my memory

20:01 <geist> i'm getting pretty good at the nGnRnE thing now

20:02 <mrvn> geist: is a barrier on every write better than the nE flag?

20:02 <geist> that's a good question that i honestly dont know

20:02 <geist> i *think* strongly ordered has more strictness that may be even worse than that though

20:02 <mrvn> I would think the barrier is the same as nE except you waste an extra opcode

20:03 <mrvn> but ARM has so many barrier types ....

20:03 <geist> i think it may have to do with other things like how all of this interacts with normal memory transactions that may be in flight

20:03 <geist> yah also note in this case you'd probably do a `DSB SY`

20:03 <geist> or maybe... `DSB OST`? i always have to think about it

20:04 <geist> whereas a strongly ordered barrier is probably closest to `DSB SY` which is the strongest

20:04 <geist> and also though i keep saying memory barrier, dont confuse it with DMB, which is different from DSB (DSB is 'stronger')

20:05 <geist> and that particular distinction i constantly have to go back to the manual for
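
A minimal C sketch of the distinction geist describes, assuming the usual reading of the two instructions: DMB orders earlier memory accesses against later ones, while DSB also waits for the earlier accesses to complete before anything later executes. The function and register names (`ack_device_then_irqc`, `dev_ack`, `irqc_eoi`) are hypothetical, and on non-AArch64 hosts a C11 fence stands in for the ARM instructions only so the sketch compiles. It does not settle whether a DSB is enough for a given device mapping (for example with early write acknowledgement enabled).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* DMB SY: orders earlier memory accesses against later ones. */
static inline void dmb_sy(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dmb sy" ::: "memory");
#else
    /* Stand-in so the sketch builds elsewhere; not an exact equivalent. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* DSB SY: like DMB SY, but also waits for earlier accesses to complete. */
static inline void dsb_sy(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dsb sy" ::: "memory");
#else
    /* Stand-in so the sketch builds elsewhere; not an exact equivalent. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/*
 * Hypothetical interrupt path: acknowledge the device first, then tell the
 * interrupt controller the IRQ is finished. The barrier sits between the
 * two writes because they target different devices.
 */
static void ack_device_then_irqc(volatile uint32_t *dev_ack,
                                 volatile uint32_t *irqc_eoi,
                                 uint32_t irq)
{
    assert(dev_ack != irqc_eoi);
    *dev_ack = 1;     /* write 1: ack at the device */
    dsb_sy();         /* wait for write 1 before issuing write 2 */
    *irqc_eoi = irq;  /* write 2: end-of-interrupt at the controller */
}
```

Swapping `dsb_sy()` for `dmb_sy()` in `ack_device_then_irqc` gives the weaker variant, which only constrains ordering and does not wait for completion.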

20:05 <mrvn> and it's a real mess in the manuals for the RPi 1

20:05 <geist> yah that was simply broken. and thus i would advise just tossing it in the bin and moving on

20:05 <mrvn> the barrier opcode was added in ARMv7, right?

20:06 <geist> broken hardware is just not worth fucking with

20:06 <geist> yes. i think in armv6 it was at best some control op

20:06 <mrvn> yes, lots of coprocessor ops for barrier and they are ugly to understand

20:07 <geist> fundamentally ARM arch is designed by cpu engineers for cpu engineers. when they have a choice they seem to pretty consistently make the decision to take the route that maximizes the ability for hardware to do what it wants

20:07 <geist> and makes it a SW problem

20:08 <mrvn> the RISC philosophy

20:08 <geist> i always thought there was a lot of potential to build a *really* high end cpu out of the arch (ie, apple M1) but the cost is no one fully groks it

20:08 <geist> well anyway, gotta get some work done

20:19 <mrvn> sometimes I think CPUs should be even less ordered so errors happen a lot when you use it wrong. Maybe a flag to explicitly mess up the order randomly at the cost of speed.

20:21 <geist> yah that's an issue honestly, especially with lower end ARM cores like cortex-a53. it's allowed to do all sorts of crazy stuff, but in practice it only has a limited amount of OOO

20:21 <geist> so it appears largely in order memory wise, so it really doesn't expose you to the full joy of something that'll bite you

20:22 <geist> also why trying to run on something like M1 is pretty useful. it'll shake out lots of issues pretty fast

20:23 <mrvn> I would have thought larger CPUs with more busses would be more critical

20:23 <geist> that too

20:23 <geist> was just thinking about general weak memory stuff

20:24 <geist> the more stuff in flight, the more OOO the core is, the more ability it has to really take advantage of the weak memory model

20:25 <nur> I feel like I've read about os stuff for years and I don't know what _any_ of that meant.

20:26 <geist> that's pure cpu architecture stuff

20:26 <geist> which intersects with osdev, but isn't so much the main goal

20:26 dh` has joined #osdev

20:26 <nur> is this going to be on the test

20:26 <nur> :)

20:26 <mrvn> nur: did they teach you about it?

20:26 <nur> it was a joke

20:26 <nur> I'm not in school

20:27 <mrvn> Most people think of a cpu as executing one opcode after the next sequentially

20:27 <geist> also in general this channel tends to be skewed towards low level osdev. which is of course a large part of it, but i think largely because usually you start with an empty computer and build a kernel, etc

20:27 <mrvn> But all levels of hardware do things in parallel or out of order nowadays
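
A small C11 sketch of what that means for code, assuming the standard message-passing pattern: with plain stores, a weakly ordered machine may make the flag visible before the data, so the flag is written with release semantics and read with acquire semantics. The names `payload`, `ready`, `publish` and `try_consume` are made up for illustration. In real use `publish` and `try_consume` would run on different cores or threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int payload;        /* ordinary data, written before the flag */
static atomic_bool ready;  /* zero-initialized, so it starts out false */

/* Producer: write the data, then publish it with a release store. */
static void publish(int value)
{
    payload = value;
    /* Release: the payload write cannot be reordered after this store. */
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: returns true and fills *out only once the flag was seen set. */
static bool try_consume(int *out)
{
    assert(out != NULL);
    /* Acquire: the payload read cannot be reordered before this load. */
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

With `memory_order_relaxed` on both sides the same code can still appear to work on a mostly in-order core, which is the kind of hidden bug the discussion above is about.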

20:27 <geist> but in the long run most of the meat of osdev is kernel and up

20:28 <geist> i'm just personally less interested in that, primarily because i like to deal with low level cpu bits

20:28 <zid> In the long run, most of osdev is tweaking your voltages and cpu frequency multipliers, silly

20:28 <geist> and dont forget memory timings

20:28 <nur> geist, I feel like we need to know everything

20:28 <zid> That's what I've been doing this week, so ergo it's true

20:28 <geist> nur: well doesn't *hurt*

20:28 <mrvn> who needs memory. I want to run my RPi in cache only.

20:29 <zid> my 1650 doesn't seem like an especially good bin :(

20:29 <geist> i personally find it fun to just learn as much as possible about everything

20:29 <geist> but whether or not that matters i dunno

20:29 <zid> geist we should buy me 10 more to test

20:29 <nur> like subtle interactions that make you tear your hair out

20:29 <nur> I'm intrigued by how the RPi has a "main CPU which is the GPU" doing the "real running"

20:29 <geist> zid: how about we buy one modern thing that completely destroys your old 1650 and move on

20:29 <zid> 1390p only got a paper launch though

20:29 <geist> it'd be even more power efficient at that

20:30 <nur> the GPU bits feel opaque

20:30 <mrvn> nur: they are, except when you need to do DMA

20:30 <zid> idk about the power efficiency, have you seen the listed TDPs on modern chips

20:30 <zid> they stuff them full of cores and run them at 295W TDP out of the box

20:31 <mrvn> nur: In some places you have to set the real physical address and not the address the ARM cpu sees.

20:31 <nur> uh oh

20:31 <nur> the "all addresses are virtual addresses, relax" adage goes out the window then

20:31 <geist> zid: and they also do about 10x as much work with that

20:31 <mrvn> nur: that's not true in kernel world at all
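
A hedged C sketch of the address translation mrvn is referring to. On the RPi the addresses the GPU side and the DMA engine use are usually called bus addresses. The constants below are assumptions recalled from the BCM2835 (RPi 1) peripherals datasheet, and the helper name `arm_phys_to_bus` is made up. Later boards use a different peripheral base, so check the values against the datasheet before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed BCM2835 (RPi 1) layout; verify against the datasheet. */
#define ARM_PERIPH_BASE    0x20000000u  /* peripherals as the ARM sees them */
#define BUS_PERIPH_BASE    0x7E000000u  /* the same peripherals on the bus */
#define PERIPH_SIZE        0x01000000u  /* 16 MiB peripheral window */
#define BUS_SDRAM_UNCACHED 0xC0000000u  /* uncached SDRAM alias for DMA */

/* Translate an ARM physical address into the bus address a DMA control
 * block would need. SDRAM is assumed to lie below the peripheral window. */
static uint32_t arm_phys_to_bus(uint32_t phys)
{
    if (phys >= ARM_PERIPH_BASE && phys - ARM_PERIPH_BASE < PERIPH_SIZE)
        return phys - ARM_PERIPH_BASE + BUS_PERIPH_BASE;
    assert(phys < ARM_PERIPH_BASE);
    return phys | BUS_SDRAM_UNCACHED;
}
```

Under these assumptions `arm_phys_to_bus(0x20200000u)` gives `0x7E200000` and a RAM buffer at `0x00100000` becomes `0xC0100000`.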

20:32 <geist> and personally i *wouldnt* advise getting modern cpus that burn 250W (*cough* intel)

20:32 <zid> w-1390p is actually a solid upgrade to mine but doesn't actually.. exist

20:32 <mrvn> geist: if only game programmers would understand multithreading

20:32 <geist> my ryzen 5950x comfortably pulls 150 at full tilt and is probably at the minimum many times the horsepower of your older xeon

20:32 <zid> not according to benchmarks

20:33 <geist> i think i disagree on that one

20:33 <geist> though of course the fun one is the M1 cpus sip power and get pretty darn close, at least single cpu benchmarky

20:34 wand has joined #osdev

20:34 <zid> yea the M1 is what happens if you don't waste 10 years, I guess

20:34 <zid> considering my crap is at least still competitive with 10nm+++++ and it's on 32nm, intel aren't even trying

20:34 <geist> but even that the ryzens are pretty darn efficient. the very new intel stuff is starting to get closer again, because they're finally consistently making 10nm and lower stuff

20:34 <geist> but all the new ones are still generally burning a lot more power to do the same thing than a 7 or 5nm ryzen

20:35 <zid> there's no way they can't make something with 80GB/s memory bw, 40 pci-e lanes, 4-8 cores at 5GHz without it being a $4000 xeon on 10nm, when they did exactly that in 2011 on 32nm for a cheap OEM cpu

20:35 <geist> sure, but does 80GB/sec matter?

20:35 <geist> are you doing something that actually saturates that?

20:35 <mrvn> geist: factorio :)

20:36 <geist> newer DDR tech will at the minimum have much faster access times given 10 years of development just due to higher clock rates if nothing else

20:36 <zid> They actually don't

20:36 <zid> you need to go VERY expensive and high end on ddr4 to get close to ddr3

20:36 <zid> it was significantly *slower* than ddr3 for a few years

20:36 <mrvn> aren't they slower due to deeper trees?

20:36 <geist> it's common that a newer revision is worse than the high end one from before but usually that's made up over time

20:36 <geist> 'for a few years' is the key

20:37 <zid> as in, for a few years you couldn't even BUY ddr4 that fast

20:37 <geist> but then over time it usually supersedes the last in every metric, and now we're up to DDR5

20:37 <zid> yea ddr5 actually looks decent

20:37 <zid> ddr4 just sucked

20:37 <mrvn> Why isn't there QDR?

20:37 <geist> anyway, i'm not dissing your setup, i just think it's kinda a dead end to keep trying to microoptimize that thing

20:37 <zid> which was made worse by intel not releasing quad channel setups for it so you couldn't even mitigate it

20:38 <geist> but of course, i also buy vaxes and old macs and whatnot, so i have no legs to stand on

20:38 <zid> It's.. fun though?

20:38 <geist> totes

20:38 <zid> and basically free

20:38 <geist> ah yeah if you're upgrading for free, continue :)

20:38 <zid> something tells me your 5950x, which gets what, at most double the perf on a good day for the things I use it for, wasn't £20

20:38 <geist> honestly the power usage of old machines is the reason i tend not to use them even if i have them around

20:38 <geist> i have a sandy bridge 2600 and a nehalem E5520 box that i could resurrect

20:39 <geist> but at the end of the day they're 250W boxes that are a fraction of the horsepower of something modern

20:39 <geist> definitely the case with the old G5 powermac from 2005

20:39 <geist> 250W space heater

20:39 <mrvn> Other than games I never needed the power.

20:39 <zid> my cpu runs 30W watching youtube or whatever, dram uses about 5W

20:39 <zid> gpu probably uses about that to render it

20:39 <geist> i kinda doubt it's *simply* double the perf. if nothing else it has 16 cores

20:39 <zid> PCH etc use another 10

20:39 <zid> I don't use 16 cores

20:39 <zid> I use 1 core, 99.99% of the time

20:40 <geist> ah, so now we're at the meat of it.

20:40 <zid> It's a desktop, not a webserver

20:40 <geist> sure, if single cpu is your jam, then lots of stuff in the last 10 years isn't that interesting

20:40 <zid> I did say I wanted 4-6 cores :p

20:40 <zid> I don't want to pay the heat/power costs on the extra cores, I won't use them

20:40 <geist> i do enough big compilation and stuff like fpga working and whatnot that lights them all up

20:41 <zid> I'd probably just set up distcc or something if that were my jam

20:41 <geist> FWIW modern designs do a darn good job of downclocking unused stuff

20:41 <zid> I'd still play elden ring on a quad core

20:41 <zid> turbo 3.0 from intel helps

20:41 <zid> idk what amd is like wrt that

20:41 <geist> same thing, different name

20:41 <mrvn> I use 16 cores 1% of the time. But when I do it really helps.

20:41 wand has quit [Quit: leaving]

20:41 <zid> turbo 3.0 has a single core turbo option, it still drags the other cores up but not as much

20:41 <geist> it's all dynamic nowadays

20:41 wand has joined #osdev

20:42 <mrvn> make clean; make -j

20:42 <zid> you can't go core 0 5GHz + 23 cores at 1GHz so far as I know

20:42 <zid> they end up at 4.2 or whatever

20:42 <geist> yep. same thing on AMD. it's a global optimization. some cores run faster than others, and when others come up it starts downclocking everything to maintain a reasonable TDP

20:42 <mrvn> geist: don't they all run on the same voltage and then the speed is limited to a narrow range?

20:42 <geist> 'overclocking' nowadays is generally just pushing that TDP limit up so that it can push more cores longer

20:42 <zid> yea, tau timers and shit in your firmware to illegal values etc :p

20:43 <geist> mrvn: i dunno. i think it's pretty fancy

20:43 <geist> at the minimum on my machine it has two separate dies for each 8 cores so there's a possibility

20:43 <zid> I'm actually tempted to try disabling some cores to see what it does to the heat

20:43 <geist> but yeah for the ryzen stuff 16 cores is pretty overkill. an 8 core ryzen is a good gaming system. like 5800x or whatnot usually benchmarks a teensy bit better in games anyway

20:43 <geist> simply because you have less L3 cache traversals if nothing else

20:44 <zid> even 8 is massively overkill

20:44 <geist> and theres a bit more clocking headroom

20:44 <zid> you'll run out of vram or memory or something before you run out of clockspeed with 8

20:44 <geist> i wouldn't say that necessarily. newer games are getting designed for 8 core machines (PS5, xbox, etc) so you're starting to see them more used

20:44 <mrvn> from my understanding those turbo modes are driving the cores above what you can cool in the hope the cpu sleeps enough to not get too hot.

20:44 <zid> if you were trying to say, run 18 copies of WoW

20:44 <zid> yea but their cpus are 20% as good :P

20:44 <geist> they're ryzen 2s

20:45 <geist> pretty modern. high end as of 2019 or so

20:45 <mrvn> geist: With the average core count being ~5 and single core getting only 25% performance the game industry is changing. The programmers still don't understand it.

20:45 <geist> alright.

20:45 <zid> *checks what a ps5 has in it*

20:46 <mrvn> Plus game design takes years. They can easily be a decade behind the curve.

20:46 <geist> i kinda disagree there, but also depends a lot on what kinda game you're talking about

20:46 <zid> 8 core 3.5GHz apparently

20:46 <geist> zen 2

20:46 <zid> might have some cut-corners to reduce heat, like turning off memory reordering etc idk

20:46 <zid> I doubt they'll tell us

20:46 <geist> yah so pretty darn modern. the downclock is the main loss. also ps5 and xbox have pretty huge memory busses IIRC

20:46 <mrvn> Even today game designers thing a game should do this: while true { read input; game tick; render; }

20:47 <mrvn> think even
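
For contrast with the loop mrvn quotes, here is a sketch in C of one common first step away from it: a fixed-timestep accumulator, which decouples the simulation tick rate from the render rate. This is a swapped-in illustration, not something proposed in the channel, and it is still single-threaded. `TICK_US`, `struct game_clock` and `advance` are hypothetical names, and the 10 ms step is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TICK_US 10000u  /* assumed fixed simulation step: 10 ms */

struct game_clock {
    uint64_t accumulator_us;  /* frame time not yet consumed by ticks */
    uint64_t ticks_run;       /* total simulation ticks so far */
};

/*
 * Feed one rendered frame's duration into the clock and run as many fixed
 * ticks as fit. Returns the number of ticks run for this frame; the
 * leftover time stays in the accumulator for the next frame.
 */
static unsigned advance(struct game_clock *clk, uint64_t frame_us)
{
    assert(clk != NULL);
    unsigned ticks = 0;
    clk->accumulator_us += frame_us;
    while (clk->accumulator_us >= TICK_US) {
        clk->accumulator_us -= TICK_US;
        clk->ticks_run++;  /* a real game would call its tick function here */
        ticks++;
    }
    return ticks;
}
```

A slow 25 ms frame runs two ticks and carries 5 ms over, so the simulation rate no longer depends on how long rendering takes. Moving the tick loop onto its own thread would be the next step toward what mrvn is asking for.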

20:47 <geist> zid: not really. it's just a PC in a can. the cpu is custom, but it's not exotic

20:47 <zid> that's what they did for zen3

20:47 <zid> you know?

20:47 <zid> err typo

20:47 <zid> memory *renaming*

20:47 <geist> sure because they came up with a newer/better scheme

20:47 <zid> zen2 had memory renaming, zen3 cut it, presumably it was eating too much budget for too little reward

20:47 <geist> exactly

20:47 <zid> and it's common for console cpus to cut things like that for heat, historically

20:48 <geist> and toss a few more load/store units it gets swamped by that

20:48 <zid> cus they're run in tiny little boxes with bad cooling

20:48 <geist> *shrug* okay. i mean i guess you have a point of view here so i dont particularly want to argue it

20:48 <zid> 360s famously all desoldered themselves

20:48 <geist> anyway, need to get back to work

20:48 <zid> I mean, neither of us knows either way

20:48 <zid> it's obviously not going to be stupidly slow

20:49 <zid> but I still get more total bogomips

20:49 <geist> but... this is why i want to get ahold of an alder lake. i've been Zen based for the last few years and have been pleased, so it will be interesting to see what intel has come up with in the golden cove cores

20:49 <zid> 12600k looks seriously legit

20:49 <geist> yah

20:50 <zid> 6 core, 4.9GHz stock, accepts high end DDR4 to mitigate the dual channel, ecc supported, 20 pci-e lanes instead of 16

20:50 <zid> so now you can have an ssd AND a graphics card

20:50 <mrvn> where do you put the NIC?

20:51 <zid> it cuts down the stuff I don't *really* use (high bandwidth, more pci-e lanes than I can actually use in practice) and focuses a little on the stuff I *do* use a lot, fast single cores

20:51 <geist> (sounds like zid just described a Ryzen circa 2017)

20:51 <mrvn> 16 lanes GPU, 2 lanes M2.key, 1 lane SSD, 1 lane NIC?

20:51 <zid> lmk when a 2017 ryzen is £20

20:51 <zid> 3600x or something looked good, if that's the right gen

20:51 <geist> hmm lets see there's one for $60

20:52 <geist> what's the exchange rate now?

20:52 <zid> 1:1 probably? :P 1.1 1.2

20:52 <zid> 3600X is a 6C 4.4GHz

20:52 <geist> anyway *really* have to go now. meeting in 8 minutes

20:52 <geist> bye!

20:52 <zid> £100 on ebay

20:55 <zid> new they're still £200, same price as a 12600k

21:20 pretty_dumm_guy has joined #osdev

21:21 GeDaMo has quit [Remote host closed the connection]

21:28 terminalpusher has quit [Remote host closed the connection]

21:34 dormito has quit [Quit: WeeChat 3.3]

21:36 ZipCPU has quit [Ping timeout: 240 seconds]

21:37 ZipCPU has joined #osdev

21:40 pretty_dumm_guy has quit [Ping timeout: 240 seconds]

21:40 __xor is now known as _xor

21:40 pretty_dumm_guy has joined #osdev

21:47 wootehfoot has joined #osdev

22:13 zid has quit []

22:20 Oli has joined #osdev

22:28 wootehfoot has quit [Read error: Connection reset by peer]

22:36 dequbed has quit [Quit: bye!]

22:38 dequbed has joined #osdev

22:53 Oli_ has joined #osdev

22:56 Oli has quit [Ping timeout: 272 seconds]

23:06 dude12312414 has joined #osdev

23:20 dormito has joined #osdev

23:28 troseman has joined #osdev

23:29 pretty_dumm_guy has quit [Quit: WeeChat 3.4]

23:30 pretty_dumm_guy has joined #osdev

23:30 pretty_dumm_guy has quit [Client Quit]

23:30 pretty_dumm_guy has joined #osdev

23:30 pretty_dumm_guy has quit [Client Quit]

23:43 zid has joined #osdev