#osdev on 2022-11-06 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:00 antranigv has quit [Quit: ZNC 1.8.2 - https://znc.in]

00:00 antranigv has joined #osdev

00:01 <PapaFrog> I'm all for starting it, mrvn

00:01 pie_ has quit []

00:01 vancz has quit []

00:02 MiningMarsh has quit [Ping timeout: 255 seconds]

00:03 MiningMarsh has joined #osdev

00:05 IRChatter has quit [Read error: Connection reset by peer]

00:06 dude12312414 has joined #osdev

00:07 <mrvn> PapaFrog: well, it played in March so you're a bit late.

00:08 vancz has joined #osdev

00:08 pie_ has joined #osdev

00:08 dude12312414 has quit [Client Quit]

00:12 eschaton_ is now known as eschaton

00:16 MiningMarsh has quit [Quit: ZNC 1.8.2 - https://znc.in]

00:17 ChaosWitch has joined #osdev

00:18 Stella[OotC] has quit [Read error: Connection reset by peer]

00:22 Burgundy has quit [Ping timeout: 246 seconds]

00:24 tacco_ has quit [Remote host closed the connection]

00:24 MiningMarsh has joined #osdev

00:24 matrice64 has joined #osdev

00:27 spikeheron has quit [Quit: WeeChat 3.7]

00:33 MiningMarsh has quit [Quit: ZNC 1.8.2 - https://znc.in]

00:33 matrice64 has quit [Ping timeout: 255 seconds]

00:36 MiningMarsh has joined #osdev

01:00 spikeheron has joined #osdev

01:02 matrice64 has joined #osdev

01:49 matrice64 has quit [K-Lined]

01:53 poisone has quit [Remote host closed the connection]

01:59 matrice64 has joined #osdev

02:02 scoobydoo_ has joined #osdev

02:05 scoobydoo has quit [Ping timeout: 255 seconds]

02:05 scoobydoo_ is now known as scoobydoo

02:07 matrice64 has quit [Ping timeout: 248 seconds]

02:15 justPardoned has quit [Quit: ZNC 1.8.2 - https://znc.in]

02:16 justache has joined #osdev

02:18 <heat> i did the patch send

02:18 <heat> nothing can go wrong now except linus screaming at me

02:19 Stella has joined #osdev

02:20 <mjg_> wut

02:20 <mjg_> i would assume this was already handled?

02:21 ChaosWitch has quit [Ping timeout: 260 seconds]

02:28 <heat> what was

02:29 srjek_ has quit [Ping timeout: 248 seconds]

02:46 [itchyjunk] has quit [Ping timeout: 255 seconds]

02:49 <mrvn> the screaming

02:49 <mrvn> must be going to be real bad if it takes him so long to write it all down

02:49 <mrvn> *duck*

02:50 [itchyjunk] has joined #osdev

02:52 <heat> he's getting old, slow

02:57 MiningMarsh has quit [Quit: ZNC 1.8.2 - https://znc.in]

03:01 MiningMarsh has joined #osdev

03:05 <mjg_> heat: the bug

03:05 <heat> nope

03:05 <heat> https://lkml.org/lkml/2022/11/5/382

03:05 <bslsk05> lkml.org: LKML: Pedro Falcato: [PATCH] fs/binfmt_elf: Fix memsz > filesz handling

03:05 <mjg_> weird

03:05 <mjg_> well nice of you to pick it up

03:06 <mjg_> kind of reflects on linux tho :p

03:06 pounce has quit [Remote host closed the connection]

03:06 pounce has joined #osdev

03:07 <heat> lol

03:07 <heat> i had the opportunity of finally writing a linux patch and i did it

03:07 <heat> popped the cherry

03:07 <heat> that code really reads like "old geezer elf loader"

03:08 <heat> particularly assuming you have a bss at the end, where you can put a brk

03:08 torresjrjr has quit [Remote host closed the connection]

03:09 torresjrjr has joined #osdev

03:09 MiningMarsh has quit [Ping timeout: 248 seconds]

03:09 MiningMarsh has joined #osdev

03:21 <mrvn> have you tried putting bss chunks everywhere?

03:23 <clever> heat: isnt that loader also used only for static files, and i think the dyld and initial dynamic elf, but no libs?

03:23 <clever> it feels like all dynamic libs get loaded by the userland dyld, not the kernel loader

03:23 <heat> yes

03:23 <mrvn> clever: dynamic files have INTERP pointing to the loader and you have to load that

03:24 <heat> musl's ld-musl is libc.so so the elf is a lot more complex

03:24 <heat> turns out there's a plt and ppc64 plts are NOBITS and boom

03:24 <clever> mrvn: but when loading the INTERP binary, is the original executable also loaded? i feel like parts of it are pre-mapped?

03:25 <mrvn> In zero_bss() you zero till the end of the page. What if the bss segment is between 2 other segments? Wouldn't that zero out the next segment?

03:25 <mrvn> clever: I thought the interpreter gets passed the original file but I think it gets the file mapped.

03:26 <heat> clever, yes it is

03:26 <heat> mrvn, segments are at least PAGE_ALIGNed

03:26 <clever> there is also the "open binary" flag in https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/binfmt-misc.rst

03:26 <bslsk05> github.com: linux/binfmt-misc.rst at master · torvalds/linux · GitHub

03:26 <mrvn> heat: says who?

03:26 <heat> the kernel

03:26 <clever> where the kernel will pre-open a filehandle to the binfmt-misc interpreter, and then execve by handle not path

03:26 <clever> which allows executing a binary from outside a chroot env

03:27 <mrvn> heat: you can't change the flags between pages but nothing really stops me from having: .init, .plt, .text with 16 byte alignment and all executable.

03:27 <clever> or is that "fix binary", they both use similar wording, bit confusing when you skim read

03:28 <mrvn> heat: or stuffing some of the elf headers into the holes

03:29 <clever> there is also a blog post i have related to this question...

03:30 <clever> http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html

03:30 <bslsk05> dbp-consulting.com: Linux x86 Program Start Up

03:30 <clever> there it is

03:31 <mrvn> clever: which part of that big pretty picture is supposed to be kernel?

03:31 <clever> ok, lets see, so execution begins at _start, which i believe is in the dynamic binary, even when using ld.so, and you call out to a symbol in it

03:31 <mrvn> clever: when you get to _start all the kernel and loader is already done

03:31 <clever> yeah, so that makes me wonder what ran the preinit array...

03:31 <mrvn> clever: we need everything before "loader"

03:32 <clever> > But first, how do we get to _start?

03:32 <clever> > To summarize, it will set up a stack for you, and push onto it argc, argv, and envp. The file descriptions 0, 1, and 2, (stdin, stdout, stderr), are left to whatever the shell set them to.

03:33 <mrvn> I know you can do "loader /path/binary" and it runs the program. So my guess would be that the kernel just maps the loader and stuff the original binary into args[1]

03:33 <clever> it also goes on to explain the auxv, which is in the stack

03:33 MiningMarsh has quit [Quit: ZNC 1.8.2 - https://znc.in]

03:33 <clever> mrvn: except you can execute 1 binary but set argv[0] to point to something completely different

03:33 <mrvn> clever: so? nobody said anything about arg[0]

03:34 <clever> ah, you said args, not argv

03:34 <mrvn> typo

03:34 <clever> try `LD_SHOW_AUXV=1 ls`

03:35 <clever> > The AT_PHDR is the location of the ELF program header that has information about the location of all the segments of the program in memory and about relocation entries, and anything else a loader needs to know.

03:36 <heat> mrvn, ok so that possibility you talked about isn't a problem

03:36 <mrvn> clever: inconclusive. Is that loaded by the kernel or the ld.so?

03:36 <mrvn> clever: /lib64/ld-linux-x86-64.so.2 /bin/ls

03:36 <heat> you map in PAGE_SIZE chunks, and offsets that are not page aligned are always invalid

03:36 <heat> due to mmap, etc
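
The page-granularity argument above (zeroing to the end of the page is safe because segments are mapped in page-sized chunks) can be sketched as plain arithmetic. This is only an illustration, not the kernel's code: `PAGE_SIZE` of 4096 and the helper names `page_align_up` and `bss_ranges` are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size; real systems may differ


def page_align_up(addr):
    """Round addr up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def bss_ranges(p_vaddr, p_filesz, p_memsz):
    """For one PT_LOAD segment with memsz > filesz, return
    (file_end, tail_end, anon_end):
      file_end .. tail_end : tail of the last file-backed page, to be zeroed
      tail_end .. anon_end : whole pages that can be anonymous zero pages
    """
    file_end = p_vaddr + p_filesz
    mem_end = p_vaddr + p_memsz
    tail_end = page_align_up(file_end)
    anon_end = max(tail_end, page_align_up(mem_end))
    return file_end, tail_end, anon_end


# Hypothetical segment: 0x234 bytes from the file, 0x3000 bytes in memory.
print([hex(x) for x in bss_ranges(0x1000, 0x234, 0x3000)])
```

Under these assumptions the zeroed tail never crosses into the next page, which is why a following segment is only at risk if it shares that page.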

03:37 <clever> mrvn: -bash: /lib64/ld-linux-x86-64.so.2: No such file or directory

03:37 <mrvn> clever: adjust for your arch

03:37 <heat> executing the interpreter directly is very different from asking for the interp

03:37 <clever> [root@amd-nixos:~]# LD_SHOW_AUXV=1 /nix/store/z56jcx3j1gfyk4sv7g8iaan0ssbdkhz1-glibc-2.33-56/lib/ld-linux-x86-64.so.2 ls

03:37 <clever> AT_PHDR: 0x7f6872b38040

03:37 <clever> interesting

03:37 MiningMarsh has joined #osdev

03:37 <heat> the interpreter is loaded like a normal PIE

03:37 <clever> it was at 0x400040 earlier

03:37 <mrvn> heat: how does "ls" and "ld.so ls" differ?

03:37 <heat> AT_PHDR is filled by the kernel to point at the loaded main program's PHDR
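
For readers who want to inspect the auxiliary vector themselves, here is a minimal sketch of decoding it. It assumes a little-endian 64-bit layout (pairs of 8-byte words); `parse_auxv` is a hypothetical helper name, while the `AT_NULL`, `AT_PHDR` and `AT_PAGESZ` values are the standard ELF constants.

```python
import struct

AT_NULL = 0    # end-of-vector marker
AT_PHDR = 3    # address of the program headers
AT_PAGESZ = 6  # system page size


def parse_auxv(raw, word_size=8):
    """Decode raw auxv bytes into {a_type: a_val}, stopping at AT_NULL.
    Assumes little-endian entries of two machine words each."""
    fmt = "<QQ" if word_size == 8 else "<II"
    entries = {}
    for a_type, a_val in struct.iter_unpack(fmt, raw):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries


# Synthetic vector for illustration; the values are made up.
sample = struct.pack("<6Q", AT_PHDR, 0x400040, AT_PAGESZ, 4096, AT_NULL, 0)
print(hex(parse_auxv(sample)[AT_PHDR]))
```

On Linux the same function could be fed the contents of `/proc/self/auxv`, which should report the same kind of data that `LD_SHOW_AUXV=1` prints.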

03:38 <clever> using the interpreter directly, changes the AT_PHDR addr

03:38 <clever> heat: except when running `ld.so ls`, then the kernel wasnt loading ls, so ld.so has to fill in the blanks on its own

03:38 <heat> mrvn, ls loads the main prog and then loads the interp (and jumps to it). ld.so ls loads ld.so

03:38 <clever> and it winds up loaded to a weird addr

03:38 <heat> no?

03:38 <heat> nothing is loaded to weird addresses

03:39 <mrvn> clever: is that many the AT_PHDR of ld.so in the later case?

03:39 <mrvn> s/many/maybe/

03:39 <heat> it is

03:39 <heat> ld.so is a normal program

03:39 <heat> the kernel loads elfs

03:39 <heat> ld.so works like a PIE elf

03:39 <heat> it allocates a base, mmaps the thing, jumps to it

03:39 <mrvn> heat: My question is: Why are there 2 ways to start a dynamic binary? Why doesn't the kernel always let the INTERP load the file?

03:40 <clever> mrvn: ah yes, even if i run just `LD_SHOW_AUXV=1 ld.so`, it shows that weird addr

03:40 <clever> and checking `ld.so cat /proc/self/maps`, i can see cat is still at the same addr as without

03:40 <heat> mrvn, because the libc devs chose to support explicit interp invocation

03:40 <clever> i think ld.so is linked to a diff addr, to be out of the way

03:40 <mrvn> heat: doesn't everyone?

03:40 <mrvn> clever: yes

03:40 <heat> not necessarily

03:41 <heat> you need to go out of your way to detect "invoked as an interp" vs "invoked as a program"

03:41 <mrvn> heat: seems like a smart thing to do so you can test your INTERP before you overwrite the existing one on "make install"

03:41 <heat> sure

03:42 <clever> mrvn: its also used heavily on nixos, because the INTERP is at a non-standard location (see above), you can test as your patching binaries

03:42 smach has quit []

03:42 <mrvn> The kernel should just always do that and then the INTERP wouldn't have to do anything special.

03:42 <clever> darwin is also stupid, the mach-o file has a field for the INTERP path (they call it dyld), but the kernel asserts that it hasnt been changed

03:42 <heat> the kernel should do what?

03:42 <clever> so you can never use a non-standard dyld

03:42 [itchyjunk] has quit [Remote host closed the connection]

03:43 MiningMarsh has quit [Ping timeout: 248 seconds]

03:43 <mrvn> heat: check for INTERP, if it exist map INTERP and call it with the binary pre-pended to the args.

03:43 <heat> if you have the main program open you might as well load it

03:44 <clever> heat: oh, interesting, according to `readelf -l`, my ld.so is a "DYN (Shared object file)" while cat is a "EXEC (Executable file)"

03:44 <clever> but both are executable and function as normal CLI tools

03:44 MiningMarsh has joined #osdev

03:44 immibis_ has quit [*.net *.split]

03:44 <heat> yes

03:44 <heat> have you seen a PIE executable?

03:44 <mrvn> clever: static, dynamic, exec are kind of screwed under linux.

03:44 <heat> they are also DYN

03:45 <clever> heat: ah, i had previously assumed shared libraries couldnt be executable, but maybe the difference is just the lack of an entry-point?

03:45 <heat> dude

03:45 <heat> try /lib64/libc.so.6

03:45 terrorjack has quit [Quit: The Lounge - https://thelounge.chat]

03:45 <heat> or your weird nixos path

03:45 <mrvn> clever: run: /lib/x86_64-linux-gnu/libc.so.6

03:45 <mrvn> clever: glibc has a weak main()

03:45 <mrvn> or weak _start

03:45 <clever> it spits out a version!

03:46 <clever> Entry point address: 0x278c0

03:46 <clever> 3450: 00000000000278c0 33 FUNC LOCAL DEFAULT 15 __libc_main

03:46 <clever> csu/version.c:__libc_main (void)

03:47 <mrvn> clever: I don't know if any other lib has this too

03:47 <heat> musl does

03:47 terrorjack has joined #osdev

03:47 <mrvn> or if they even can

03:47 <heat> yes they can

03:47 <mrvn> heat: any non libc lib

03:47 <heat> you can always set your shared object's entry

03:47 <clever> it just calls a thin (inlinable) function, that just write()'s a single blob and exits, zero logic

03:48 <heat> clever, anyway, yes shared libraries can be executable

03:48 <heat> there's no such thing as a shared lib really

03:48 <heat> ELF classifies your binaries as DYN or EXEC

03:48 <clever> heat: i ran into trouble before, when i tried to dlopen an EXEC

03:48 <heat> DYN need to have their base relocated (PIE or shared lib)

03:49 <heat> EXEC don't
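
The DYN/EXEC distinction lives in the `e_type` field of the ELF header, 16 bytes into the file. A small sketch of reading it, roughly what `readelf -h` reports as "Type"; `elf_type` is a hypothetical helper and the header bytes below are synthetic:

```python
import struct

# Standard e_type values from the ELF specification.
ET_NAMES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def elf_type(header):
    """Return the e_type name from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return ET_NAMES.get(e_type, hex(e_type))


# Synthetic 64-bit little-endian header prefix with e_type = ET_DYN.
fake = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<H", 3)
print(elf_type(fake))
```

Passing the first 18 bytes of a real file (for example a PIE executable or a shared library) should give "DYN" for both, which is the point being made in the discussion.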

03:49 <clever> it wouldnt let me, so i had to implement my own loader

03:49 <heat> the kernel does 0 relocations

03:49 <clever> that also makes my future plans of a PIE kernel simpler, its more obvious now that i can just compile it in DYN/-shared mode

03:49 <heat> if you're a static PIE or an elf interpreter or whatever (any DYN), you need to relocate yourself

03:49 <heat> hence the need for AT_PHDR, _DYNAMIC, etc

03:50 <heat> a PIE kernel is wasteful

03:50 <clever> ah yes, you cant even find your own sections, because youve been loaded to the wrong addr and not relocated properly

03:50 <clever> and the AT_PHDR is at the top of the stack

03:50 kof123 has quit [*.net *.split]

03:50 yuiyukihira has quit [*.net *.split]

03:50 fluix has quit [*.net *.split]

03:50 graphitemaster has quit [*.net *.split]

03:50 alexander has quit [*.net *.split]

03:50 fluix has joined #osdev

03:50 <clever> why is a PIE kernel wasteful?

03:51 alexander has joined #osdev

03:51 ornx has joined #osdev

03:51 graphitemaster has joined #osdev

03:51 <heat> because PIE is designed to leave .text intact

03:51 <clever> isnt that kind of needed for things like kaslr? (not what i want though)

03:51 <clever> ah for reuse between processes

03:51 <heat> and I'm talking about ELF PIE here, toolchain PIE

03:51 <clever> but for a kernel, you dont care, and can freely modify the .text

03:51 <heat> yes

03:52 <clever> is that called something else?

03:52 <heat> not really

03:52 <heat> relocatable I guess?

03:52 <clever> how would the toolchain args differ?

03:52 <heat> essentially you end up relocating the kernel

03:52 <heat> ld -r

03:52 <heat> anything that keeps relocation data

03:52 <mrvn> the hack to put all modifications together in fewer pages is only useful if you map the same binary at multiple addresses. The kernel isn't mapped so you can just modify the .text.

03:52 <heat> i actually support that on a toy os

03:53 cheapie has quit [Quit: Local host tripped over the cable]

03:53 <heat> https://github.com/heatd/Carbon/blob/master/kernel/Makefile#L36-L38

03:53 <bslsk05> github.com: Carbon/Makefile at master · heatd/Carbon · GitHub

03:53 <clever> heat: ah, i think linux and littlekernel also use -r, to do incremental linking

03:53 <heat> https://github.com/heatd/Carbon/blob/master/kernel/arch/x86_64/relocation.c

03:53 <bslsk05> github.com: Carbon/relocation.c at master · heatd/Carbon · GitHub

03:53 <heat> linux doesn't do -r anymore

03:53 <clever> where all .o's in a dir get linked into a dir.o

03:53 <heat> it didn't work well with LTO

03:53 kof123 has joined #osdev

03:54 <clever> ah, yeah, i can see that messing up

03:54 <clever> LTO wants to do everything in the final link

03:54 MiningMarsh has quit [Read error: Connection reset by peer]

03:54 <clever> doing 20 LTO passes is just silly, and partially linking without breaking LTO could be tricky

03:54 <mrvn> heat: I do that with LTO per dir. Each directory can choose whether to do LTO at that point or push it to the top dir.

03:54 <heat> linux uses thin ar archives these days

03:55 <clever> ah, so its more like mini static libs

03:55 <heat> those archives literally just have paths

03:55 <heat> no file copying, no nothing

03:55 <mrvn> clever: you don't do 20 passes. Once you LTO something it loses the LTO chunk in the .o file and just gets copied at the next stage.

03:55 <clever> oh

03:55 <heat> ar T I think

03:55 <clever> mrvn: ah, so you cant LTO between the dirs, unless you just put it off

03:55 <mrvn> ar files / .a libs are something else again

03:55 MiningMarsh has joined #osdev

03:55 <mrvn> clever: exactly.

03:56 cheapie has joined #osdev

03:56 <clever> and if the linker starts to fuse .text's together, would that break LTO? or does that just get thrown out when LTO re-generates .text from the compiler state?

03:57 <mrvn> clever: you get a bunch of .o files with binary data and some with lto data and the LTO linker plugin optimizes the LTO chunks and merges that with the binary chunks and then links.

03:58 <mrvn> Note: .S files don't have any LTO chunks. So you already have to merge in a kernel.

03:58 <clever> but isnt the .o containing a mix of .data, .text, .rodata, and LTO data?

03:58 <mrvn> clever: that's a compiler option. You can have both.

03:58 <clever> and the LTO contains data for recreating .text (and more?) after you merge several LTO chunks and re-optimize?

03:58 <clever> ah

03:58 Affliction has quit [*.net *.split]

03:58 night has quit [*.net *.split]

03:58 arminweigl has quit [*.net *.split]

03:58 sjs has quit [*.net *.split]

03:58 les_ has quit [*.net *.split]

03:58 Patater has quit [*.net *.split]

03:58 dza has quit [*.net *.split]

03:59 <clever> so you could omit the .text (and others?) entirely, and generate an LTO only .o file?

03:59 <heat> yes, that's what happens

03:59 <clever> i can see that being faster

03:59 night has joined #osdev

03:59 <heat> unless you pass -ffat-lto-objects

03:59 sjs has joined #osdev

03:59 <heat> if you run file on a LTO'd .o you'll see GIMPLE

03:59 arminweigl has joined #osdev

03:59 Patater has joined #osdev

03:59 <mrvn> clever: the LTO doesn't contain info to recreate the .text file. If you link an .S file and .cc file with LTO together and then LTO it later you lose the .S file.

04:00 <clever> yeah, .S files cant be LTO'd, that makes sense

04:00 <mrvn> clever: that part is kind of tricky.

04:00 les has joined #osdev

04:00 <clever> since LTO is compiler (c/c++) internal state

04:00 <clever> this also leads me to another idea

04:00 dza has joined #osdev

04:01 <clever> if i use inline asm within a .c file, could LTO inline it directly at the call-site, as-if it was an inlineable function in a .h?

04:01 <clever> and then it could perform better than calling a function in .S files?

04:01 <mrvn> clever: yes

04:02 <mrvn> You should really avoid calling functions in .S files that are trivial.

04:02 <clever> yeah, i try to use inline asm any time i can

04:02 scoobydoo has quit [Read error: Connection reset by peer]

04:03 <clever> the main reason i use .S files, is mainly when stack manipulation comes into play

04:03 scoobydoo has joined #osdev

04:03 <clever> _start creating the start, irq handlers, context switching

04:03 <mrvn> I only have boot.S to bootstrap, entry.S for IRQs and one switch_task function that swaps stacks and program counter.

04:03 <clever> yep, exactly what i listed

04:03 corecode has quit [*.net *.split]

04:03 ornitorrincos has quit [*.net *.split]

04:03 remexre has quit [*.net *.split]

04:03 Maja[m] has quit [*.net *.split]

04:03 mjg_ has quit [*.net *.split]

04:03 rb has quit [*.net *.split]

04:03 Ameisen has quit [*.net *.split]

04:03 kkd has quit [*.net *.split]

04:03 jleightcap has quit [*.net *.split]

04:03 linkdd has quit [*.net *.split]

04:03 <clever> s/creating the start/creating the stack/

04:03 patwid has quit [*.net *.split]

04:03 matthews has quit [*.net *.split]

04:03 patwid has joined #osdev

04:03 ornitorrincos has joined #osdev

04:03 jleightcap has joined #osdev

04:03 <mrvn> Making that a function call has the benefit that the compiler will save state for you for most things.

04:03 matthews has joined #osdev

04:03 rwb has joined #osdev

04:03 remexre has joined #osdev

04:04 corecode has joined #osdev

04:04 <mrvn> As in, it will save what it needs to save so you don't have to save all regs.

04:04 kkd has joined #osdev

04:04 <clever> it also means your free to do whatever with the stack, as long as you undo it when returning

04:04 Ameisen has joined #osdev

04:04 mjg has joined #osdev

04:04 <mrvn> clever: you could do that in inline asm.

04:05 <clever> but your going to have a hard time context switching with inline asm

04:05 <mrvn> With inline asm you would have to save all regs or mark all regs as clobber.

04:05 <clever> you dont know exactly what gcc was storing on the stack

04:06 <clever> and when creating a new thread, you need to falsify that saved state, so you can "restore" it when first spawning the thread

04:06 <mrvn> To be correct you actually should be saving/restoring all regs so you don't leak any data from one task to another.

04:06 <clever> yeah

04:07 <clever> my context switch routine saves all regs to the stack, saves the final SP, then loads the new SP, and restores all regs

04:07 <mrvn> One should also do that on the syscall border.

04:07 <clever> so creating a new thread, just involves creating a fake stack frame with a set of "saved" registers

04:07 <mrvn> (or zero out regs on exit)

04:07 Maja[m] has joined #osdev

04:14 <clever> mrvn: so if i was to use -r to create a relocatable kernel, either i pass the whole .o file to a suitable loader, or i make it self-relocating, but then how do i ensure the relocation data survives an objcopy to .bin?

04:14 <clever> can the linker script say to preserve it? and put it at some relative offset from .text?

04:15 <clever> or perhaps use the right PIC opcodes to find that relative offset

04:16 dh` has quit [Remote host closed the connection]

04:16 wereii has quit [*.net *.split]

04:16 \Test_User has quit [*.net *.split]

04:16 Goodbye_Vincent has quit [*.net *.split]

04:16 Irvise_ has quit [*.net *.split]

04:16 ebb has quit [*.net *.split]

04:16 <mrvn> clever: your linker script says what sections to keep and what to discard

04:17 <mrvn> clever: and if you have any addresses in your C code then you need those relocation infos to make the kernel self relocate or have an extra loader.

04:17 dh` has joined #osdev

04:17 wereii has joined #osdev

04:18 <clever> i know of at least one opcode in VPU that is properly PIC, when you try to load a symbol into a reg, it gets encoded as a PC relative offset

04:18 <clever> so i can use that to find the start of the relocation data, in a custom _start

04:18 <mrvn> clever: you have to mark the right sections so they are marked LOAD and kept in the objcopy to binary

04:19 <mrvn> clever: doesn't work on the .data section

04:19 <clever> yeah, i would want to tag them as LOAD, and ensure they dont cause huge holes that objcopy would null back-fill

04:19 <clever> dont put .text at 1mb and relocations at 512mb, lol

04:20 <mrvn> say you have "struct Task init_task; struct Task *current_task = &init_task;" then you need to relocate "current_task" on boot

04:20 <clever> the relative spacing will be preserved in the .bin, and youll end up with a 511mb .bin file

04:20 <clever> i would just relocate patch everything before the C code is even touched

04:20 <clever> that should patch all of the initial .data values, right?

04:20 <mrvn> hehe, don't do that. You can play with LMA and VMA to get everything close together even when you map it wide apart

04:21 <clever> ah yes

04:21 Irvise_ has joined #osdev

04:21 <mrvn> and yes, you relocate before _start so the C code just sees everything in place.
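
The `current_task = &init_task` case can be modelled with a toy relocation pass. This is a sketch under loud assumptions: the image is treated as linked at address 0, pointers are 64-bit little-endian, each relocation is a simplified (offset, addend) pair in the spirit of "relative" relocations, and all offsets are invented. Real relocation types (including the VC4 ones shown later) are more varied.

```python
import struct


def apply_relative_relocs(image, relocs, load_base):
    """Patch pointer slots in `image` in place.
    For each (offset, addend), store load_base + addend at image[offset]."""
    for offset, addend in relocs:
        struct.pack_into("<Q", image, offset, load_base + addend)


# Hypothetical layout: init_task at 0x100, the current_task pointer at 0x200.
image = bytearray(0x300)
relocs = [(0x200, 0x100)]      # current_task must point at init_task
load_base = 128 * 1024 * 1024  # e.g. loaded at the 128mb mark

apply_relative_relocs(image, relocs, load_base)
print(hex(struct.unpack_from("<Q", image, 0x200)[0]))
```

Running a pass like this before any C code executes is what lets the C side see initialised pointers such as `current_task` already holding their final addresses.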

04:21 <clever> [nix-shell:~/apps/rpi/lk-overlay]$ vc4-elf-readelf -l build-bootcode-fast-ntsc/platform/bcm28xx.mod.o

04:21 <clever> There are no program headers in this file.

04:21 <clever> i believe this binary was created with -r

04:22 <clever> "file" says it is: ELF 32-bit LSB relocatable, Broadcom VideoCore III, version 1 (SYSV), with debug_info, not stripped

04:23 <clever> i do also see a lot of .rela.* stuff in here, i assume that is what i would want the linker script to preserve?

04:23 <clever> with `readelf -a` instead, i can see an entry-point of 0, so that implies the headers do have room for it, and i could still use this as a kernel binary, if i just pass the right flags at link-time

04:24 <clever> ah, and there is the type, REL (Relocatable file)

04:24 ebb has joined #osdev

04:25 <mrvn> All those relocation stuff depends on the arch and pic, PIC, pie, PIE flags. It's a bit of a mess.

04:25 <clever> Relocation section '.rela.text.cmd_gpio_mode' at offset 0x9984 contains 23 entries: Offset Info Type Sym.Value Sym. Name + Addend

04:26 * clever kicks irssi

04:26 justache has quit [Quit: ZNC 1.8.2 - https://znc.in]

04:26 <clever> 0000000e 00006007 R_VC4_PCREL27_MUL 00000000 puts + 0

04:26 <heat> -fPIE/-fPIC are much easier to relocate than REL

04:26 <clever> Offset Info Type Sym.Value Sym. Name + Addend

04:26 <clever> i do see fields like this in a .o file

04:26 <heat> REL have actual linker relocations

04:26 <clever> would -r create a REL or DYN?

04:26 <mrvn> clever: I would assume REL

04:27 <dh`> it's not easier, it's just different

04:27 <mrvn> clever: You are supposed to use -r over and over and then a final link without.

04:27 <clever> `objdump -dr` the `-r` here is also very important, when disassembling files that havent been linked fully yet

04:27 <clever> 8: 00 e8 00 00 00 00 mov r0,0x0

04:27 <clever> 8: R_VC4_IMM32 .rodata.str1.4

04:27 <heat> dh`, have you seen the gnarly risc relocations of some architectures? it's just nasty

04:28 <clever> the addresses are all 0 on this partial link (ld -r i believe), but objdump -r inserts the relocation data

04:28 <dh`> I've written a portable linker :-)

04:28 <clever> revealing that in this case, a 32bit immediate needs to be slotted in, and what symbol/section it points to

04:28 <dh`> the idea with PIC relocations is that you don't have to write in the text segment

04:28 <clever> 2022-11-06 00:52:32 < heat> essentially you end up relocating the kernel

04:28 <clever> 2022-11-06 00:52:33 < heat> ld -r

04:28 <dh`> but the relocation operations themselves are considerably more complicated

04:29 <clever> mrvn: then how am i meant to get a relocatable kernel, if the final pass isnt -r?

04:29 <heat> it may not be -r but --keep-relocs

04:29 <mrvn> clever: PIC/PIE

04:29 <heat> LDFLAGS:=-Wl,--emit-relocs -Wl,--discard-none

04:29 justache has joined #osdev

04:30 <clever> *tries*

04:30 <heat> mrvn, definitely not PIC/PIE

04:30 <heat> that's overkill

04:30 <mrvn> clever: Why do you care for the kernel anyway? It runs in virtual space so just pick a fixed address. The only part where relocatable / position independent would be relevant is for boot.S

04:30 <heat> KASLR

04:31 <clever> mrvn: the VPU lacks an MMU, it doesnt run in virtual space!

04:31 <mrvn> clever: ok, you are screwed. :)

04:32 <mrvn> heat: why would PIC/PIE be overkill?

04:32 <clever> i cant load to 0, because the arm reset vector is 0, and they would clash some, and remapping the arm would cause bigger problems

04:32 <clever> i cant load to anything under 128mb, because linux and uboot like assuming the low parts of ram are available, until they parse DT

04:32 <heat> mrvn, because you're the kernel, you can patch .text

04:32 <mrvn> clever: you can link to any address, that isn't a problem

04:32 <heat> you don't need a GOT, nor a PLT

04:32 einkoder has quit [*.net *.split]

04:32 stux has quit [*.net *.split]

04:32 brynet has quit [*.net *.split]

04:32 corank_ has quit [*.net *.split]

04:32 Andrew has quit [*.net *.split]

04:32 Arsen has quit [*.net *.split]

04:32 gdd has quit [*.net *.split]

04:32 woky_ has quit [*.net *.split]

04:32 bleb has quit [*.net *.split]

04:32 HeTo has quit [*.net *.split]

04:32 HeTo has joined #osdev

04:32 <mrvn> heat: i believe there are options for avoiding GOT and PLT

04:32 <clever> mrvn: some arm kernels also assume ram is a single linear chunk, so i would need to load to the top of ram, to stay out of the way

04:32 Arsen has joined #osdev

04:33 bleb has joined #osdev

04:33 <clever> but top of ram is a moving target

04:33 <clever> so i need relocation...

04:33 <heat> i don't think you can avoid the got

04:33 <heat> PIE will not generate text relocs

04:33 <dh`> if you don't have an mmu, you're kinda screwed with standard tools

04:33 <dh`> best option is probably fdpic but it's pretty gross

04:33 <mrvn> clever: the VPU can only access 1GB ram. How is that supposed to work at all?

04:34 <clever> mrvn: the dram controller on the pi0-pi3 lineup, is also limited to 1gig of ram

04:34 Andrew has joined #osdev

04:34 <clever> but there are models with 256 and 512 mounted

04:34 woky has joined #osdev

04:34 <mrvn> clever: and on an 8GB model the vpu won't be at the end of ram.

04:34 einkoder has joined #osdev

04:34 corank_ has joined #osdev

04:35 <clever> thats the pi4, totally different dram controller, the VPU is loaded to the top of the lowest 1gig

04:35 <clever> and the extra 7gig is a second segment in device-tree

04:35 <clever> that is basically arm-only

04:35 <mrvn> clever: and any kernel that assumes you only have 1 segment will blow up

04:35 <clever> the legacy api's only report the lower 1gig

04:35 <clever> so any dumb kernel will just not see the other 7gig

04:36 <clever> i suppose i could do the same, just load to 128mb, and claim 128mb of ram via the legacy api

04:36 <clever> if you want more, use the device-tree

04:36 gdd has joined #osdev

04:37 <mrvn> clever: or learn how the REL entries work and use a loader stub.

04:37 <mrvn> or PIC/PIE

04:37 <clever> the official firmware, has its own horrid solution

04:38 <clever> the top of ram, the gpu_mem config value, and fixup.dat are mixed together (no idea how), to create a binary patch against the whole damn elf

04:38 <clever> it doesnt relocate .text, it patches the entire ELF, headers and all

04:38 <mrvn> or use the solution geist has: During build you scan the elf file for the relocation info and generate your own data structure for where to start the _start address and glue that to the kernel image.

04:38 <clever> thats kinda what the official firmware does

04:39 <mrvn> which just shows that the ELF relocation methods are so bad you don't want to do that during boot.

04:39 <dh`> there are kernels that relocate themselves via PIC at boot time, I've seen it done

04:40 <clever> arm32 linux kinda does that, the decompression stub

04:40 <clever> it has a PIC asm blob, that relocation patches the decompression (compiled C) stub

04:40 <dh`> but it also makes you exciting problems if you are trying to e.g. use address constants in trap handling

04:40 <clever> then it decompresses the real payload to some other addr, which also deals with loading it to a more sane addr

04:41 <clever> then it exits C, and fills the mmu config in, and jumps to virtual

04:41 <mrvn> On x86 where you need a 32bit stub to get to 64bit having a loader stub also makes a lot of sense. Just have a ld32.so before your kernel.elf and that maps the kernel into 64bit address space and jumps to it.

04:41 <mrvn> The loader stub you can write in asm as truly position independent code

04:41 <clever> aarch64 kinda put its foot down and just banned self decompressing stubs

04:42 <clever> so the aarch64 entry point only sets up the mmu and thats it

04:42 <mrvn> how would it prevent that?

04:42 <clever> the kernel just doesnt have the code to decompress itself anymore

04:42 <clever> its not an option in `make menuconfig` either

04:42 <mrvn> but you could.

04:42 <mrvn> linux just chooses not to

04:42 <clever> yeah

04:43 <clever> if i was making my own aarch64 kernel, i dont have to follow the same rules

04:43 <mrvn> if your firmware uncompresses your kernel for you then there isn't much point in decompressing yourself.

04:43 <clever> yeah, thats the rule linux set on aarch64

04:43 <clever> if you want a compressed kernel, the bootloader has to undo it first

04:44 <mrvn> it's much easier to decompress before you set up the mmu.

04:44 <clever> which arm32 linux did do, but it was kinda a mess

04:44 <mrvn> as in decompress the kernel to anywhere and map it to the virtual address it expects.

04:44 <clever> some PIC asm will patch the decompression code, so it can decompress with the MMU off

04:45 <clever> but if you drop decompression support, you dont have to patch bunzip to run without the MMU

04:46 dh` has quit [Quit: brb, client is hosed]

04:47 dh` has joined #osdev

04:50 gog has quit [Ping timeout: 272 seconds]

04:51 <clever> mrvn: one reason i kinda want to use what elf has, is that i dont want to create the problem of having to keep the elf and relocation data in sync

04:51 <clever> if you upgrade just the elf, the old relocation file will just shred the new code

04:51 <clever> having both bundled in one file makes it more idiot proof

04:52 <clever> but a fixed load addr like 128mb can also do that, and be far simpler

05:06 <mrvn> clever: deprecating the legacy API unless they only want 128MB sounds perfectly fine to me.

05:06 <clever> yeah

05:06 <clever> i was loading to 64mb, but i recently discovered a nasty surprise in u-boot

05:07 <mrvn> if you can use a fixed address that is worlds simpler to build.

05:07 <clever> it assumes the arm has full control of the 0-128mb range, and relocates itself to the top of 128mb

05:08 <clever> i'm already using 64mb as a fixed addr, but now see that 128mb seems like a safer bet, then u-boot and the firmware would be perfectly loading on either side of the 128mb boundary

05:08 <mrvn> one problem might be to load a kernel+initrd that is big. Things like uboot might not support loading it to the second memory segment.

05:08 <mrvn> but then how does that work now if uboot puts itself at the end of 128MB?

05:09 <clever> uboot comes in ~2 stages, the SPL is typically linked for a fixed load addr (but can be PIC), and copies uboot proper to a different fixed addr (top of 128mb), before it parses any DT i think

05:09 <mrvn> Does it only support kernel up to 120MB and loads the initrd into the second memory segment?

05:09 <clever> uboot proper then has a real malloc, and can parse DT and load things anywhere

05:09 <clever> i assume it can load the initrd after uboot, but uboot being in the middle kinda fragments your free space

05:10 <clever> i think that 128mb is also a compile-time constant

05:10 <clever> and can be tuned to whatever makes the most sense for the platform

05:10 <mrvn> maybe it's just the first stage that's at 128MB and the uboot proper can be anywhere

05:10 <clever> if the platform has predictable ram, you can put uboot at the top of ram

05:11 <clever> last time i was debugging u-boot, one of its memory allocation routines glitched hard, and allocated some state to ~+3gig physical

05:11 <clever> totally unmapped space

05:11 <clever> and to my horror, i discovered that doesnt cause a bus fault

05:11 <clever> writes are discarded, reads return zero

05:12 <clever> so all state was silently zero'd out upon readback

05:15 justache has quit [Quit: ZNC 1.8.2 - https://znc.in]

05:16 <mrvn> yeah, I don't get why the bus doesn't get a timeout and the cpu fault

05:16 <clever> its either a config flag i missed (should intentionally do that under the official firmware) or a serious oversight in the axi interface design

05:16 brynet has joined #osdev

05:17 <clever> its also complicated by the custom mmu involved

05:17 justache has joined #osdev

05:17 <clever> for the pi0-pi3, the arm's "physical" address space, is made up of 64 pages of 16mb each

05:17 <clever> each page can be mapped to any addr (2mb alignment i think) on the real bus

05:18 <clever> so i could totally scramble the arm->bus address mappings, if i wanted to wreak havoc with dma :P

05:18 <mrvn> that would require a big huge device tree range list

05:19 <clever> 64 entries max

05:19 <clever> but that could also let both sides load to 0

05:19 <clever> and just dont map 0 to 0

05:20 <clever> i could even do light, page 0 (0-16mb) is mapped weirdly, but all other pages are mapped normally, and you just dont dma from page 0

05:20 <clever> DT would be simple then, and both sides can load to 0, the first 16mb is isolated
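
A rough sketch of the arm-to-bus translation clever describes: 64 pages of 16 MB, each with its own base on the real bus. The table layout and the name `arm_to_bus` are hypothetical; the real hardware register format is not given in the log, so this only models the lookup.

```c
#include <stdint.h>

enum {
    ARM_PAGE_SHIFT = 24,   /* 16 MB pages */
    ARM_PAGES      = 64    /* 64 * 16 MB = 1 GB of ARM "physical" space */
};

/*
 * Translate an ARM "physical" address to a bus address using a
 * per-page base table. Addresses above 1 GB are wrapped here only to
 * keep the table index in bounds; the log does not say what the
 * hardware does with them.
 */
static uint32_t arm_to_bus(const uint32_t page_base[ARM_PAGES],
                           uint32_t arm_addr)
{
    uint32_t page   = (arm_addr >> ARM_PAGE_SHIFT) % ARM_PAGES;
    uint32_t offset = arm_addr & ((1u << ARM_PAGE_SHIFT) - 1);
    return page_base[page] + offset;
}
```

With an identity table, remapping only entry 0 models the "light" variant: page 0 is isolated while every other page translates to itself.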

05:21 <mrvn> still would need an extra range statement for the DMA mapping

05:21 <clever> but, i would need to toss in some L1/L2 flushing in my bootloader

05:21 <clever> because the bootloader is also loaded to 0

05:21 <clever> you already need that range statement in DT, because dma has to start from the 0xc000_0000 addr

05:22 <clever> all you would be doing, is +16mb to both the parent and child addr

05:22 <clever> this is also a problem the pi1 created, and DT later solved

05:23 <clever> the pi1, only has an L1 arm cache, 16kb i believe

05:23 <clever> no L2

05:23 <clever> to get any kind of reasonable performance, all 64 pages of the arm, are mapped via the VPU L2 cache, 128kb

05:24 <clever> and to make dma coherent, you flush the arm L1, then tell dma to read via the VPU L2

05:24 <clever> there is a config.txt to disable that, but you also had to compile linux specially to dma right

05:24 <mrvn> and you have to map pages as 16k chunks or the page coloring craps out

05:24 <clever> DT automates that mess

05:25 <clever> page coloring, is that related to what bank of dram things map to?

05:26 <heat> cache

05:26 <mrvn> bits of the address are used to pick a cache slot or might even not be remapped, can't quite remember. But if you map 4k chunks of a 16k block randomly you get problems in the cache.

05:27 <clever> ah yeah

05:27 <clever> a certain bit range of the phys addr is used to index into the cache, and then the tag (the other bits) is checked in parallel

05:27 <clever> 4-way cache, having 4 cache lines at a given index

05:27 <heat> https://docs.freebsd.org/en/articles/vm-design/#page-coloring-optimizations

05:27 <bslsk05> docs.freebsd.org: Design elements of the FreeBSD VM system | FreeBSD Documentation Portal

05:28 <mrvn> The CPU was basically designed to only run with 16k pages.

05:28 <clever> so if you only use the first 4k out of every 16k chunk, you may have an abnormally high cache wastage

05:28 <mrvn> clever: i.e. you only get 25% of the cache-

05:29 <clever> according to my notes, the L1 cache on the bcm2835 was 4-way, with 8 words per line, 16kb total

05:29 <clever> 8 words, would be 32 bytes, 4 way means a given index can hold 128 bytes, and with 16kb in total, that means there are 128 of those slots

05:30 <clever> so part of the addr, is turned into a 0-127 index? and then it compares the rest of the addr, against the 4 tags, to select one of the 32 byte lines?

05:31 <clever> and if its using bits of the addr that conflict with your allocations, you may only ever get an index in the 0-31 range?

05:32 <clever> with a 32 byte cache line, that means the lower 5 bits (4:0) are an index into the line, so those can be ignored
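
The arithmetic above can be written out as a small C sketch. The geometry (16 KB, 4-way, 32-byte lines) comes from clever's notes as quoted in the log and is not checked against a datasheet here; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Cache geometry as quoted in the log (assumed, not verified). */
enum {
    LINE_BYTES  = 32,                                 /* 8 words of 4 bytes */
    WAYS        = 4,
    CACHE_BYTES = 16 * 1024,
    SETS        = CACHE_BYTES / (LINE_BYTES * WAYS)   /* 128 sets */
};

/* Byte position inside the cache line: address bits 4:0. */
static inline uint32_t cache_line_offset(uint32_t addr)
{
    return addr % LINE_BYTES;
}

/* Which of the 128 sets the address falls into: address bits 11:5. */
static inline uint32_t cache_set_index(uint32_t addr)
{
    return (addr / LINE_BYTES) % SETS;
}

/* Remaining upper bits, compared against the 4 tags of the set. */
static inline uint32_t cache_tag(uint32_t addr)
{
    return addr / (LINE_BYTES * SETS);
}
```

Under these assumed figures one way covers 32 * 128 = 4096 bytes, so the set index is taken entirely from bits inside a 4 KB page offset.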

05:32 <mrvn> clever: I think the cache was also buggy with multiple mappings. If you map a 4K page at 0 and 4096, write to one and read from the other you get garbage.

05:32 <clever> ouch

05:32 <heat> https://www.reddit.com/r/osdev/comments/yn3oou/beginner_question_do_you_guys_use_software_to/

05:33 <clever> i believe arm uses physical addresses for all cache logic

05:33 <bslsk05> www.reddit.com: Beginner question: Do you guys use software to simulate an "empty computer" when developing an OS? : osdev

05:33 <clever> so that cant happen on arm

05:33 <mrvn> clever: not supposed to but bcm is screwy

05:33 <heat> i feel like this person has just discovered emulators and i find it lovely that it happened on r/osdev

05:34 <clever> mrvn: i have heard, that the bcm2835 axi port is ultra buggy with async and reordering

05:34 <clever> mrvn: if you fire out 2 reads to different peripherals, due to all of the axi fifo's, the answers can come back out of order

05:35 <clever> and the arm's axi master, cant deal with the re-ordering

05:35 <mrvn> clever: yep.

05:35 <clever> so the results get swapped

05:35 <clever> and they recommend you barrier any time you switch between peripherals

05:35 <clever> i assume that was fixed on the bcm2836, because you cant coordinate 4 arm cores to barrier properly when switching peripherals

05:36 <mrvn> indeed, with SMP that's impossible to do

05:36 <clever> another thing a properly functioning axi port should do, is dynamically change the transfer width

05:37 <clever> for example, (going off memory), the bcm2835 arm axi port, is only 32bits wide

05:37 <clever> so it can only ever move 32bits per clock, but you can have a 4 clock burst, sending 256 bits in total

05:38 <clever> 128*, math is hard :P

05:38 <clever> but, other internal busses are 128bit wide, so it can translate that into a single 128bit transfer

05:39 <clever> mrvn: however, some of the ports (peripherals) assume you only ever do a 32bit transaction, and do implementation defined funky things when you violate that, and dont translate into a burst

05:40 <clever> if you do a 64bit load, covering reg1+reg2, youll just get reg1 repeated twice i believe

05:40 heat has quit [Remote host closed the connection]

05:40 <mrvn> don't do a double register load/store to MMIO

05:40 <clever> exactly

05:41 <mrvn> which is why you have to declare every MMIO volatile
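
A minimal sketch of what "declare every MMIO volatile" tends to look like in C. The accessor names are hypothetical. The point is that each register is touched with its own 32-bit volatile access, so two adjacent registers are read as two loads rather than one 64-bit load; some kernels go further and use inline assembly for these accessors.

```c
#include <stdint.h>

/* One 32-bit volatile load from a device register. */
static inline uint32_t mmio_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

/* One 32-bit volatile store to a device register. */
static inline void mmio_write32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/*
 * Read two adjacent 32-bit registers as two separate accesses and
 * combine them in software, instead of issuing a 64-bit load.
 */
static inline uint64_t mmio_read_pair(uintptr_t addr)
{
    uint32_t lo = mmio_read32(addr);
    uint32_t hi = mmio_read32(addr + 4);
    return ((uint64_t)hi << 32) | lo;
}
```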

05:41 heat has joined #osdev

05:41 <clever> the VPU also cant even do a 64bit load/store in scalar mode

05:41 <clever> all regs are 32bits max

05:41 <clever> you need vector mode to even do that, which does 16 consecutive addresses, 8/16/32bits each

05:42 <clever> so 16/32/64 bytes in a burst

05:43 <clever> oh right, forgot, the vpu had load-multiple, that can trigger it

05:43 <clever> mrvn: but another fun bug, is mis-aligned 8bit loads

05:43 <clever> basically, there is a switch-case block in hardware, that assumes you're only ever giving it 32bit aligned addresses

05:44 <clever> if something isnt 32bit aligned, its not a valid register

05:44 <clever> except on the sdhci peripheral, that has 16bit registers, with bugs

05:45 <clever> consecutive access to both halves of a 32bit reg corrupt the transfer

05:58 <zid> Disregard pages of clevertext, acquire pencilcases

06:00 <clever> lol

06:02 dza6 has joined #osdev

06:02 les has quit [Ping timeout: 260 seconds]

06:02 les has joined #osdev

06:03 scoobydoo has quit [Read error: Connection reset by peer]

06:03 scoobydoo has joined #osdev

06:03 dza has quit [Ping timeout: 260 seconds]

06:03 night has quit [Ping timeout: 260 seconds]

06:03 dza6 is now known as dza

06:03 night_ has joined #osdev

06:06 matthews has quit [Ping timeout: 260 seconds]

06:06 matthews has joined #osdev

06:07 mjg has quit [Ping timeout: 260 seconds]

06:14 heat has quit [Ping timeout: 246 seconds]

06:33 zaquest has quit [Remote host closed the connection]

06:36 bleb has quit [Ping timeout: 260 seconds]

06:37 woky_ has joined #osdev

06:38 bleb has joined #osdev

06:38 woky has quit [Ping timeout: 260 seconds]

07:17 zaquest has joined #osdev

07:18 GeDaMo has joined #osdev

08:03 scoobydoo has quit [Read error: Connection reset by peer]

08:04 scoobydoo has joined #osdev

08:24 poisone has joined #osdev

08:32 poisone has quit [Read error: Connection reset by peer]

08:33 gildasio1 has quit [Ping timeout: 255 seconds]

08:36 gildasio1 has joined #osdev

08:40 ZombieChicken has quit [Quit: WeeChat 3.6]

08:47 poisone has joined #osdev

08:49 <geist> oh oh are you ready for the fallback?

08:49 <geist> awww yeah

08:52 <poisone> i

09:05 <geist> 1am again!

09:06 Stella is now known as Stella[OotC]

09:07 <poisone> UPTIME: 0 days, 0 hours, 33 minutes

09:18 Goodbye_Vincent has joined #osdev

09:29 rwb is now known as rb

09:42 Burgundy has joined #osdev

09:43 <Jari--> completed watching Terminator Salvation+Genisys on Netflix

09:43 <Jari--> now watching Deep Space Nine

09:43 <Jari--> interesting computers, operating system on those

09:44 <Jari--> 16 million colours, possible, and say, you want to use monochrome logos

09:44 <Jari--> mainstream OS logos, etc.

09:45 <Jari--> I have epilepsy, and never got attacks on colorful logos, etc. flickers

09:45 <Jari--> made JTMOSDEV (google it) system under epilepsy attacks

10:04 scoobydoo_ has joined #osdev

10:07 scoobydoo has quit [Ping timeout: 260 seconds]

10:07 scoobydoo_ is now known as scoobydoo

10:26 wootehfoot has joined #osdev

11:02 Burgundy has quit [Ping timeout: 260 seconds]

11:17 Burgundy has joined #osdev

11:49 linkdd has joined #osdev

11:53 Ali_A has joined #osdev

11:57 vdamewood has quit [Read error: Connection reset by peer]

11:58 vdamewood has joined #osdev

12:00 vdamewood has quit [Read error: Connection reset by peer]

12:00 vdamewood has joined #osdev

12:05 genpaku has quit [Remote host closed the connection]

12:08 genpaku has joined #osdev

12:27 gog has joined #osdev

13:01 [itchyjunk] has joined #osdev

13:05 alexander has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]

13:10 alexander has joined #osdev

13:15 dormito has quit [Ping timeout: 255 seconds]

13:16 eau has joined #osdev

13:37 wootehfoot has quit [Ping timeout: 252 seconds]

14:12 genpaku has quit [Ping timeout: 260 seconds]

14:14 genpaku has joined #osdev

14:19 elastic_dog has quit [Ping timeout: 276 seconds]

14:19 elastic_dog has joined #osdev

14:29 dormito has joined #osdev

14:31 gog` has joined #osdev

14:32 gog has quit [Killed (NickServ (GHOST command used by gog`))]

14:32 gog` is now known as gog

14:46 terminalpusher has joined #osdev

14:47 spikeheron has quit [Quit: WeeChat 3.7.1]

14:51 spikeheron has joined #osdev

15:00 terminalpusher has quit [Remote host closed the connection]

15:02 srjek_ has joined #osdev

15:04 <Jari--> hi

15:13 \Test_User has joined #osdev

15:20 eau has quit [Ping timeout: 260 seconds]

15:21 eau has joined #osdev

15:29 ckie has quit [Quit: *poof*]

15:32 ckie has joined #osdev

15:46 Arthuria has joined #osdev

16:12 terminalpusher has joined #osdev

16:20 terminalpusher has quit [Remote host closed the connection]

16:25 poisone has quit [Read error: Connection reset by peer]

16:36 wootehfoot has joined #osdev

16:47 Arthuria has quit [Remote host closed the connection]

17:02 srjek|home has joined #osdev

17:06 srjek_ has quit [Ping timeout: 260 seconds]

17:06 wootehfoot has quit [Ping timeout: 248 seconds]

17:07 romzx has quit [Ping timeout: 272 seconds]

17:17 linearcannon has joined #osdev

17:22 d5k has joined #osdev

17:28 bauen1 has quit [Ping timeout: 255 seconds]

17:28 bauen1 has joined #osdev

17:28 romzx has joined #osdev

17:32 <yuu_> I/

17:32 <yuu_> o/

17:36 <gog> I/O is very important

17:37 <GeDaMo> I/O/yuu_ :P

17:37 <yuu_> Hehe

17:38 <yuu_> Hello, I noticed I was back in here, glad to be here

17:44 d5k has quit [Ping timeout: 252 seconds]

17:46 heat has joined #osdev

17:50 <Jari--> https://www.quora.com/Which-hypervisors-provide-FULL-hardware-support-for-MS-DOS-Windows-95-and-Windows-98-I-have-hundreds-of-old-games-I-can-no-longer-play-because-of-OS-and-hardware-incompatibility

17:50 <bslsk05> www.quora.com: Which hypervisor's provide FULL hardware support for MS-DOS, Windows 95, and Windows 98? I have hundreds of old games I can no longer play because of OS and hardware incompatibility. - Quora

17:50 <Jari--> sorry, wrong chan

17:51 <Jari--> but was like looking for any MS-DOS DJGPP then CWSDPMI-based hypervisor which runs actually under MS-DOS environment

17:51 <Jari--> more ideal than running XEN

17:51 <Jari--> not meaning DOS boot up for Linux loadlin

17:52 <Jari--> not much of MS-DOS left, it would be actual own individual separate operating system

17:52 <Jari--> well, cwsdpmi is an OS or not ?

17:52 <gog> no

17:53 <gog> it's an implementation of DPMI

17:53 <Jari--> gog: yeah BIOS reliancy calculates it is MS-DOS I guess

17:53 <gog> it runs inside of MSDOS

17:53 <Jari--> yeah because it preserves the first megabyte

17:54 <gog> and it depends on DOS interrupt handlers for system calls

17:54 <gog> it's not a standalone OS

17:54 <Jari--> gog yeah so its VM86 it basically just compatibility mode with MS-DOS reliancy

17:55 <zid> I am also a DPMI implementation

17:55 <Jari--> zid: no clues on DPMI interface calls, etc. insights, dunno if many have this info

17:55 * gog gives zid a bagel

17:55 <zid> I've never had a bagel

17:55 <gog> whaaaaaaaa

17:55 <zid> I'm not jewish enough I guess

17:55 <Jari--> football anyone?

17:57 <heat> depends

17:57 <heat> what side of the pond

17:57 <gog> neither kind of footballs interest me anymore

17:57 <heat> booooooooooo

17:57 <zid> we need to get heat on HRT so he shuts up about it too

17:58 <heat> lol

17:58 * gog hands heat cat ears and pink stripy thigh-highs

17:58 <gog> you'll need these

17:58 <heat> UwU

17:58 <heat> what's this

17:58 <gog> OwO

18:00 <zid> PooOoo

18:00 xenos1984 has quit [Ping timeout: 255 seconds]

18:00 xenos1984 has joined #osdev

18:01 <Jari--> This program requires Microsoft Windows

18:02 <Jari--> I had this on my Commodore 64, programmed with turbo assembler

18:02 <heat> winnie the pooh moment

18:02 <Jari--> a PC simulator

18:03 <gog> zid: what kind of pizza should i get

18:03 <zid> jala + mush

18:03 <Ermine> gog: may I pet you?

18:03 <gog> Ermine: yes

18:04 * Ermine pets gog

18:04 * gog prr

18:04 <heat> has it escaped everyone how winnie the pooh is fucking yellow

18:04 <heat> bears are not yellow

18:04 <gog> he's got hepatitis

18:04 <zid> some are

18:04 <zid> for example, winnie the pooh

18:05 * Ermine is less anxious now and hopes so is gog

18:05 <gog> :)

18:06 <heat> gog, winnie went to too many gay clubs without protection 😳

18:06 <gog> oh bother

18:06 <gog> gotta safeguard your health, winnie

18:07 <zid> bouncing's what tiggers do best

18:07 <zid> tigger's the gay one

18:07 <gog> help me pick a pizza

18:07 <zid> I already did

18:07 <gog> your choice was bad

18:07 <gog> try again

18:08 <kof123> spinach, mushroom, ...

18:08 <zid> fine, if what I think is good is bad, then what I think is bad must be good

18:08 <zid> sweetcorn and pineapple

18:08 <gog> a less terrible choice

18:08 <zid> okay so gog is a non-person

18:08 <zid> some kind of weird alien

18:08 <gog> i usually get blaze

18:09 <gog> chili, garlic, jalapeno, pepperoni, pepper cheese, black pepper

18:09 <zid> https://twitter.com/i/status/1588957215419158530

18:09 <bslsk05> twitter: <hacer_kun> https://video.twimg.com/ext_tw_video/1588957159093592064/pu/vid/1076x720/ZEMl3bdpl2dxfafZ.mp4?tag=12

18:10 <gog> final fantasy 7 music

18:10 <gog> love it

18:10 <zid> and message box

18:10 <gog> yes

18:10 <gog> low-poly spinning cat

18:11 <zid> This IRC message is sponsored by squarespace, use discount code ZID for 20% off today!

18:11 <gog> not raid shadow legends?

18:12 <GeDaMo> That cat is too high poly for FFVII :P

18:12 <zid> The battle models were super good

18:13 <zid> it was the overworld models that they never.. finished

18:13 <GeDaMo> I think there are three different styles

18:13 <GeDaMo> The cut scenes are different again

18:13 <zid> well.. yea

18:14 <zid> you'll be counting advent children next

18:14 <gog> and the remake

18:14 <GeDaMo> I only know the original :|

18:14 <gog> i'm tempted to buy the PC version of FFVII on steam

18:15 <gog> the original

18:15 <j`ey> heat: u see? https://lore.kernel.org/lkml/20221105222012.4226-1-Jason@zx2c4.com/

18:15 <bslsk05> lore.kernel.org: [PATCH] drm/atomic: do not branch based on the value of current->comm[0] - Jason A. Donenfeld

18:15 <kof123> it includes a yamaha midi software synthesizer thing, and presumably a soundfont

18:15 <kof123> that's right, midi

18:16 <heat> j`ey, aw wtf

18:16 <heat> how was that ever merged

18:16 <zid> gog: It works nicely, but has 1 silly achievement to max your gil

18:17 <zid> which takes a few hours of grinding movers and selling all materia buuut ff7 has a bug where if you gain too much materia it deletes other stuff to make room

18:17 <j`ey> heat: i guess torvalds missed it go by

18:17 <zid> guess who lost the underwater materia and had to roll back

18:18 <heat> j`ey, https://lkml.org/lkml/2022/11/5/382

18:18 <bslsk05> lkml.org: LKML: Pedro Falcato: [PATCH] fs/binfmt_elf: Fix memsz > filesz handling

18:18 <heat> are you proud

18:18 <heat> my first patch

18:18 <j`ey> heat: you were testing on ppc64??

18:18 <zid> omg it's pedro

18:19 <heat> j`ey, no

18:19 <heat> it all happened on #musl

18:19 <heat> maskray then tested on ppc64

18:20 Raito_Bezarius has quit [Ping timeout: 255 seconds]

18:20 <j`ey> heat: congrats tho, now cp onyx/* linux/* and send that as a patch

18:20 <heat> a good chunk of the codebase would be much improved

18:21 <j`ey> a good chunk would be deleted

18:21 <heat> particularly all the arm and arm64 code

18:21 <heat> it wouldn't be there! how delightful

18:21 <j`ey> :<

18:21 <gog> :D

18:21 <heat> hey

18:21 <heat> there's no CoW for you to break if ARM isn't even supported

18:21 wootehfoot has joined #osdev

18:21 <heat> there, saved you some trouble

18:22 <j`ey> :<

18:22 <gog> i decided on blaze

18:24 <heat> should've gone with bazel

18:24 <heat> blaze really is... an odd build system to choose

18:24 <gog> i meant pizza

18:24 <heat> pizza? is that a new build system?

18:24 <gog> yes

18:25 <Ermine> heat: congrats on patch!

18:25 <Jari--> "synthetic DNA molecules are now considered as serious candidates for this new kind of storage"

18:25 <heat> thanks

18:25 <Jari--> new SSDs

18:25 <heat> i did slightly fuck up because I forgot to rebase on kees's tree

18:25 <heat> hopefully that doesn't matter? or I'll just send a rebased v2

18:25 <zid> wow heat

18:26 <zid> pissing off morpheus

18:26 <heat> *torvalds angrily writes an email calling me INCOMPETENT and STUPID and DUMB*

18:27 <zid> nah you're a rank amateur

18:27 <zid> that == 'X' thing is dumb as shit

18:27 <zid> but wtf is that left hand side

18:27 <zid> current process's name is called 'comm' ?

18:28 <heat> yeah

18:28 <heat> see htop

18:28 <zid> That's why nobody noticed it, anyway

18:28 <zid> the true fix would be to rewind time to change that to be process_name_do_not_strcmp_to_block_things_wtf

18:29 Ali_A has quit [Quit: Client closed]

18:29 <heat> or as BSD would call it, pndnstbtw

18:31 <gog> BaSeD

18:32 bauen1 has quit [Ping timeout: 248 seconds]

18:32 <gog> is that why you were asking if geist had any ppc64 machines

18:32 <heat> yes

18:32 <gog> nice

18:32 <heat> geist is the likeliest person to have random architectures lying around

18:32 <gog> true

18:33 <geist> heh yep

18:33 <geist> but yeah i haven't ran the g5 in a while. primary reason is it's a power hog

18:33 Raito_Bezarius has joined #osdev

18:33 <heat> i bet you don't have a loongarch yet

18:34 linearcannon has quit [Read error: Connection reset by peer]

18:34 <geist> surprisingly a lot of my other old exotic machines really aren't. (the vax, sparcstations, etc). we forget that modern stuff since about 2000 draws a shitton of power. prior to that you could get away with just a heatsink or maybe a heatsink and a little fan on your 486 or pentium

18:34 bauen1 has joined #osdev

18:34 <geist> so a early 2000s era machine pulling 200W to do what a modern machine can do in 5w feels wrong enough that i generally dont keep them on much

18:34 xenos1984 has quit [Ping timeout: 248 seconds]

18:35 <geist> whereas prior to 2000 they do even less work but maybe pull 35 or 40 so i feel better about that

18:35 <heat> dig on zid who uses a 10 year old xeon

18:35 <geist> yah but they explicitly downclock it to use less

18:35 <geist> gets a pass

18:35 <zid> hey it's on 0.8V and uses 30W

18:35 <geist> exactly

18:36 <gog> just turn down the thermostat, use it to heat your house

18:36 <zid> It's overclocked and undervolted at the same time because sandy is amazing

18:36 Raito_Bezarius has quit [Max SendQ exceeded]

18:36 <geist> gog well honestly yes, once it starts getting cold in the room and its either some space heater in the form of some computer equipment or it's a space heater in terms of a heat pump, it's not too bad

18:36 <geist> though the heat pump is hypothetically more efficient

18:37 <zid> I wish people didn't go fucking *insane* on bitcoin and start using gpus and shit so that I could mine on my cpu to heat my room

18:37 <zid> and.. have a greater than 1 in heatdeath chance of actually getting a hit

18:37 <heat> bitcoin hasn't used GPUs in a loooooooong time

18:38 <heat> all about ASICs baby

18:38 <zid> yes I was describing the progression

18:38 <zid> gpus, then shit (like asics)

18:38 <geist> yah i'm not much on regrets, but back in 2010 when i fiddled with bitcoin for lulz, i had spent a few weekends mining bitcoins back when you actually could

18:38 <gog> the heat pump is not only hypothetically more efficient, it literally is :P

18:38 <zid> heat pumps are black magic

18:38 <geist> and earned like 2.4 of them. but i stopped after like weeks because it was heating up the room and it was summer

18:38 <zid> stealing warm from outside when it's already cold outside

18:38 heat is now known as _Heat

18:38 <geist> gog: yeah, though then depends on the particular model, how cold it is outside, etc

18:38 <geist> but yeah

18:39 <geist> last year when it got substantially below freezing my main house heat pump switched to pure resistive mode and burned a *crapton* of power over the course of 2 or 3 days

18:39 <zid> I think like, chicago, which is kinda famous for getting cold, only gets too cold for the refrigerant used in a heat pump for like 1 day a year

18:39 <zid> if you get a modern one with good.. fluids

18:39 <geist> yah it was really odd here that it got that cold, hence the heat pump not being designed to deal with it

18:40 <geist> but omg i was burning power like crazy. i think for 3 days i was pulling about 250kWh a day

18:40 <geist> which actually lines up: the resistive aux heater in the house system is like 8kW

18:40 <zid> how did your house not melt

18:40 <geist> and it was just running it flat out

18:40 <geist> because 8kW os

18:40 <zid> is your house actually a swimmingpool

18:40 <geist> isn't enough. even with that running flat out the temp inside was like 60F

18:41 <zid> 8kW would get my entire house like.. hot to the touch :P

18:41 <_Heat> 250kwh is insane

18:41 <zid> the bricks would start to glow

18:41 <geist> 250 lines up though: 24 hours in the day, about 8 or 9kW per h

18:41 <geist> that was very exceptional. it rarely gets below like, -2C or so here

18:42 <geist> this was more like -10, beyond the lock out temp of the compressor, etc

18:42 <geist> and then it just stayed -10 continually for days on end

18:42 <GeDaMo> Polar vortex bulging out

18:43 <geist> the house is oddly efficient/inefficient in terms of insulation

18:43 <geist> my house is a log cabin, so there's no actual insulation in the walls per se

18:43 <zid> mine's double hulled brick with fibreglass between so yea

18:43 <zid> 8kW would literally cook me

18:43 <geist> so it's odd in that it doesn't have a traditional r factor

18:43 <GeDaMo> My house is granite, no insulation either

18:43 <gog> my house is concrete and well insulated

18:43 <gog> but it also does not breathe

18:43 <geist> so i think the model is that it has technically fairly low r factor, but it also has a strong 'memory'

18:44 <zid> Summer this year was >40C indoors :D

18:44 <geist> so in general if it dips or gets hot for short periods of time (day or two) it doesn't really respond much to it

18:44 linearcannon has joined #osdev

18:44 <geist> but eventually it'll normalize the wall temperatures to the outside and then you're in trouble

18:44 <zid> I wanna fill my walls with water when it's hot

18:44 <zid> and keep it all year

18:45 <geist> so i think that was the problem with the -10 last year. once the walls normalized to being cold to the touch (i think they were like 40F for a while) then the heat pump just has to run flat out to stay ahead

18:45 <_Heat> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3042.htm

18:45 <bslsk05> www.open-std.org: Introduce the nullptr constant

18:46 <geist> but yeah basically log cabins: do not recommend for very cold temps unless you have a lot of heat source (ie a fireplace)

18:46 <zid> mmm particulates

18:46 <zid> my favourite

18:46 <geist> but they work nicely for temperate zones where there's a fairly large daily temp swing: they act as a nice thermal battery over the day

18:47 SGautam has joined #osdev

18:47 ZombieChicken has joined #osdev

18:50 xenos1984 has joined #osdev

18:50 <geist> _Heat: oh i guess that's to C

18:50 <geist> took me a minute to figure out why they're just now talking about it

18:50 <_Heat> yeah, it's in C23

18:51 <zid> make the type of nullptr incomplete

18:51 <_Heat> clang HEAD now supports it

18:51 <zid> doesn't that stop it being embeddable into structs

18:51 <_Heat> why would you want to embed nullptr_t

18:52 <zid> I like how all of their rationale is that

18:52 <zid> NULL can be defined to 0 or (void *)0 the former of which most people view as a fucked up bug

18:52 <zid> so their solution is to do the C++ thing and add another incompatible thing on top

18:52 <zid> and all the code examples are in C++ style too as a bonus

18:53 <_Heat> nullptr works fairly well

18:53 Raito_Bezarius has joined #osdev

18:55 <mrvn> I wouldn't mind losing the conversion from nullptr to bool

18:56 <geist> yah from a very quick glance at the page before i got bored it seemed to state that most of the problem is broken NULL #defines that *dont* just define it as ((void *)0)?

18:56 <geist> ie, the void * one is basically okay its just not standardized and thus broken impls exist

18:56 <zid> spec says 0 and (void *)0 are the two alternatives

18:56 <zid> you just need to delete the former from the spec

18:57 <zid> and bam, no more problems

19:02 Raito_Bezarius has quit [Ping timeout: 252 seconds]

19:02 <mrvn> geist: I see a lot of "If NULL has integer type" in the rationale.

19:03 srjek_ has joined #osdev

19:04 * geist nods

19:05 wootehfoot has quit [Ping timeout: 260 seconds]

19:06 <mrvn> "In memory, nullptr is represented with the same bit-pattern as a null pointer constant of type void*." So really, as geist said, drop the "0" and only leave "(void*)0"

19:07 srjek|home has quit [Ping timeout: 260 seconds]

19:07 <mrvn> one step further would be: make "NULL" a keyword. Although having c++ compatibility with nullptr is nice.

19:09 <mrvn> "nullptr is permitted as argument to ..., as long as the function interprets it as pointer to void or character type." Urgs. Thinking about it it makes sense. But have you ever seen any code that casts their nullpointers to the right pointer type on function calls?

19:09 mjg has joined #osdev

19:10 <geist> hmm i thought the whole point was it could cast to any pointer type, much like a void *0 would

19:10 <geist> ie, it's a null pointer of any pointer type

19:10 <geist> or are they attempting to tighten that up?

19:10 <mrvn> geist: but with "..." what would be the type of the pointer to cast to?

19:11 <geist> fair point: and you can have short and wide pointers or whatnot

19:11 <mrvn> No, you can't. storage size must be the same.

19:11 <geist> well, i guess in C thatd be an extension. it's C++ that has the doublewide pointers (pointers to virtual members)

19:11 <mrvn> But casting between types can change the bit representation.

19:12 romzx has quit [K-Lined]

19:12 <geist> i guess in a case where you're using say x86 far pointers one defines what the null pointer bit pattern is

19:13 <geist> presumably 0:0

19:13 <mrvn> and then all pointers would have to be far

19:13 <geist> and not just 'something that ends up equating to 0'

19:13 Raito_Bezarius has joined #osdev

19:14 <mrvn> That "nullptr in ..." thing really makes no sense I think. What it really comes down to is that you can only pass nullptr to ... for void* and char*. Every other case is UB.

19:14 <mrvn> (and that is already the state of C)
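
A minimal C sketch of the point about null pointers in "..." arguments. The callee reads each argument as a character pointer, so the caller passes the terminating null pointer with an explicit cast instead of a bare 0 or NULL. The function name total_length is hypothetical and only serves as illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sums the lengths of a list of strings.
 * The list ends with a null pointer of type char *. */
size_t total_length(const char *first, ...)
{
    size_t total = 0;
    va_list ap;

    va_start(ap, first);
    /* Every variadic argument is read back as char *, so every caller
     * must pass char * (or void *) values, including the sentinel. */
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    return total;
}
```

A call would look like total_length("ab", "cde", (char *)0). A bare 0 in that position would be passed as an int, which the callee then reads as a pointer; that is the undefined case discussed above.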

19:15 <mrvn> Are there actually real archs where the bit representation for pointers of different (data) types differ?

19:16 <_Heat> hwasan :)

19:16 nyah has joined #osdev

19:16 <geist> yah actually brings up the question what the bit pattern of a hw tagged null pointer is

19:17 <geist> ie, are you obligated to strip the tag before comparing to 0 (probably)

19:17 <mrvn> geist: same as void* and char*. So they must have the same bit pattern.

19:17 <geist> yah but with hw tagging you have this whole thing dealing with also comparing the tags when comparing with another pointer, i guess.

19:17 <mrvn> Would a compare convert the nullptr to the other type or the other type to nullptr_t?

19:17 <geist> though maybe not. depends on what sw does with the tags really

19:18 <geist> or i guess also kinda specifically: would you ever tag a nullpointer or do they always implicitly have tag 0

19:18 <mrvn> geist: the specs say it has the bit pattern of "(void*)0" or "(char*)0"

19:18 <mrvn> s/or/and/

19:18 <geist> i'm thinking specifically of software based tagging schemes that use something like ARMs TBI (top byte ignore) feature. the top 8 bits are simply ignored by hardware so it's a sw problem
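
A rough sketch of the "strip the tag before comparing to 0" question, assuming a software scheme that keeps an 8-bit tag in the top byte of a 64-bit user-space address. All names are hypothetical, addresses are modelled as plain uint64_t values, and this is not how HWASan or any particular ABI defines things.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT 56
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Place an 8-bit software tag into the top byte of an address. */
uint64_t with_tag(uint64_t addr, uint8_t tag)
{
    return (addr & ADDR_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/* Extract the software tag again. */
uint8_t tag_of(uint64_t p)
{
    return (uint8_t)(p >> TAG_SHIFT);
}

/* Drop the tag, leaving only the address bits the hardware looks at. */
uint64_t strip_tag(uint64_t p)
{
    return p & ADDR_MASK;
}

/* A tagged null has a non-zero bit pattern, so a raw compare against 0
 * misses it; comparing the stripped value does not. */
bool is_null_ignoring_tag(uint64_t p)
{
    return strip_tag(p) == 0;
}
```

Whether a null pointer is ever allowed to carry a non-zero tag is a policy decision of the tagging scheme; the sketch only shows what a comparison has to do if it is.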

19:19 * geist nods

19:19 <mrvn> i.e. a sane NULL and nullptr must have the same bit pattern

19:19 romzx has joined #osdev

19:22 <geist> anyway if i didn't report it: proxmox VM binary review: 1

19:23 <geist> seems like a nice system that's doing things in a straightforward way

19:23 <geist> there's still some parts that you have to open a shell to do (like mount a NFS share to locate vms) and whatnot but it seems like a pretty straightforward to use system

19:24 <geist> additional things i can do with it easily that i haven't really figured out how to do manually with all my scripts to run qemu instances is it automatically supports virtio memory balloons

19:25 <geist> and it seems to actually work. i created 2 or 3 ubuntu instances and way oversubscribed and then observed what happened, and though it worked slowly it did eventually balloon all of them to de-oversubscribe the system

19:26 romzx has quit [K-Lined]

19:26 <mrvn> geist: "After WG14 refused a specification for a simple macro with value (void*)0 ..." They tried your way.

19:30 wootehfoot has joined #osdev

19:40 wootehfoot has quit [Ping timeout: 252 seconds]

20:02 tacco has joined #osdev

20:03 wootehfoot has joined #osdev

20:09 <_Heat> https://en.wikipedia.org/wiki/Winnie-the-Pooh_(1969_film)

20:09 <bslsk05> en.wikipedia.org: Winnie-the-Pooh (1969 film) - Wikipedia

20:09 <_Heat> russian winnie the pooh is not yellow

20:09 <_Heat> they in fact know how bears look like in russia

20:11 <_Heat> I guess polar bears are kinda yellowish?

20:11 <_Heat> but winnie is definitely not a polar bear

20:12 <GeDaMo> They're yellow if they're covered in honey :|

20:13 _Heat is now known as heat

20:19 <zid> https://en.wikipedia.org/wiki/Kermode_bear

20:19 <bslsk05> en.wikipedia.org: Kermode bear - Wikipedia

20:21 <heat> those are white

20:21 <heat> it even says so in the article

20:22 <zid> 'white'

20:22 <zid> it's actually a black bear.

20:23 <heat> yellow bears are not a thing

20:23 <zid> so ergo, winnie the pooh is actually black and you're just racist

20:23 <heat> much less so yellow bears that wear a fucking shirt

20:23 <zid> sorry you had to find out this way

20:23 <heat> why does he wear a shirt

20:23 <heat> and no pants??

20:23 <heat> is pooh a perv?

20:23 <zid> it makes you MORE naked if you do that, true fact

20:23 <zid> shirt and no pants > nothing at all > pants and no shirt > pants and shirt

20:25 <heat> programming socks and nothing else >>

20:30 <GeDaMo> Might be a bit chilly :|

20:35 <Jari--> reiserfs stable?

20:36 <Jari--> rating 5 317 reviews - but this is "recovery software"

20:37 <geist> reiserfs 3 was pretty stable at the time

20:37 <geist> had fairly bad failure modes if externally corrupted, but for a while it was hella fast and stable

20:37 <Jari--> xfs ftw

20:38 <geist> ah interesting, apparently reiserfs3 is declared deprecated

20:38 <Jari--> well a years back, hosting companies defaulted on reiserfs

20:38 <Jari--> because of saving resources hell a lot

20:38 <Jari--> heck

20:39 <geist> yeah that's what i mean. at the time the general de facto linux fs was ext3, and it would generally outperform it a lot

20:39 <geist> its heyday was like 2002

20:39 <Jari--> :) no it was like 2015

20:39 <Jari--> or

20:39 <Jari--> and I got a corruption lost my longest project

20:39 <Jari--> Perl code..

20:41 <Jari--> Facebook+YouTube => killed like 10 million small companies on the fly

20:41 <Jari--> by eliminating the competition

20:43 <geist> huh?

20:43 <geist> what was in 2015?

20:43 Piraty has quit [Quit: -]

20:44 <gog> you also can't have a reiserfs image on a reiserfs filesystem iirc

20:44 <geist> presumably due to fsck reasons

20:45 <zid> (plus it murders your wife, huhu)

20:45 <gog> yes

20:45 <geist> that being said rfs3 was the first fs i had really seriously looked at where the general idea is to just toss everything into one massive btree

20:45 Piraty has joined #osdev

20:45 <geist> it's pretty simple by 'toss everything into one btree' fs standards

20:45 <geist> but it's a fairly good example of what you get when you just follow that logical path

20:46 <geist> i think that was the general origin of the issues. a failure of the btree was a serious issue since reconstructing it was really really difficult

20:46 <Jari--> Commander X-16 implements VFAT

20:46 <geist> so i think the rfs cant be contained in it was something to do with the fsck detecting data blocks as metadata blocks

20:46 <Jari--> so VFAT is still valid market FS

20:47 <geist> which is a general problem if you let metadata and data mix it up in the same area of the device, without a clear distinction between zones

20:47 <Jari--> USB sticks still ship with exfat ?

20:47 <Jari--> 10 terabyte ones etc. ?

20:47 <Jari--> NTFS no ?

20:47 <geist> i remember BFS had the same problem: since inodes could be allocated anywhere the fsck utility could easily pick up something in a file itself as an inode

20:47 <mjg> hehe

20:48 <Jari--> oh well uefi defaults on ntfs - what else do we have ?

20:48 <geist> ext* fses avoid it by having the inodes in dedicated spots, and btrfs and presumably xfs avoid it by having clear distinction between metadata and data stripes

20:48 <gog> uefi defaults on vfat

20:48 <Jari--> uefi with ext4 support would be preferable though, would please the marketing adopt please ext4

20:48 <geist> Jari--: hmm? yes of course. FAT and exfat are still the defacto (and actually specced) fses for usb sticks

20:48 <mjg> hm i think there is no such problem on ufs

20:48 <mjg> go bsd

20:48 GeDaMo has quit [Quit: You are becoming what we French call 'Le Fruitcake'.]

20:49 <geist> mjg: same reason i'm sure: the inode table is allocated up front, thus inodes can only exist on certain spots on the disk

20:51 <Jari--> Google : exFAT's maximum file size limit is 16EiB (Exbibyte). exFAT is compatible with more devices than NTFS, making it the system to use when copying/sharing large files between OSes.

20:51 <Jari--> ?

20:51 <geist> correct

20:51 <Jari--> oh

20:51 <mrvn> geist: ext4 finally added a variable for how much of the inode table has been initialized and creates them on demand.

20:51 <Jari--> basically exfat is a major leap, I thought it would suck, but this means 128 PB is the maximum size of a disk for exfat

20:52 <geist> functionally speaking exfat is fat64, though it's different enough from the previous FAT file systems that it actually gets a new name

20:52 <mrvn> And sane FSes added a filesystem ID to inodes so they wouldn't detect a FS image on the disk as part of its own FS on recovery.

20:52 <Jari--> is it possible to boot up an exfat ? custom roms available to boot up an exfat drive? for qemu?

20:52 <geist> mrvn: yeah so there must be a clear marker of 'not initialized'

20:52 <geist> Jari--: sure. here's the problem: exfat is i think patent encumbered

20:53 <geist> so in general it hasn't been a good idea to use it

20:53 <mrvn> geist: something as simple as "largest used inode" works well

20:53 <geist> i think linux only has gotten support for it in the kernel recently, and that was because MSFT bequeathed it to the kernel

20:53 <geist> but i think the licensing terms are still not fully open

20:53 <geist> mrvn: yeah and since it's per allocation group, you can just keep a counter locally there

20:54 <geist> if there's say 32 allocation groups, each with say 10k inodes, you can just store 32 local counters of the highest initialized inode, etc

20:54 <mrvn> geist: that would require initializing allocation groups. Better to keep it in the superblock

20:54 <geist> sure but the allocation groups already have a header. that's easy to put out

20:55 <geist> frankly the whole splitting the fs into zones and having a local allocation group header, etc is archaic, but it was inherited from FFS, etc

20:55 <mrvn> it's so you reduce seek times

20:55 <geist> made sense when you had old style spinning media, but nowadays nothing really does that anymore

20:56 <mrvn> Even with spinning disks isn't most of the seek time now rotating the disk to the start of the sector and not moving the head?

20:56 <geist> right, vs having all the inodes crammed to the front of the disk it was a local win. more modern designs either dont care because SSD or the metadata can live anywhere on the disk so you can optimize it however you see fit over time

20:56 <geist> good question: i guess you can fairly easily calculate what the rotational latency is, since it's basically fixed on RPM

20:56 <mrvn> except that's variable too :)

20:57 <geist> right except not really. there are only a finite number of RPMs

20:57 SGautam has quit [Quit: Connection closed for inactivity]

20:57 <geist> ie, 4200 5400 7200 10k. so you can fairly easily compute it there

20:57 <mrvn> 7200RPM = 0.008333s per rotation

20:57 <geist> anyway that's the worst case rotational delay + whatever worst case seek delay

20:57 <mrvn> seek time is what? 12ms?

20:57 <geist> there you go up to 8.3ms for the platter to come by the sector

20:58 <mrvn> Should be <5ms on average :)
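
The arithmetic above as a small C sketch: one rotation takes 60000 / RPM milliseconds, which is the worst-case rotational delay, and the average is half of that if the target sector is equally likely to be anywhere on the track. The function names are made up for illustration.

```c
#include <assert.h>

/* Time for one full platter rotation in milliseconds.
 * This is also the worst-case rotational delay. */
double rotation_ms(unsigned rpm)
{
    assert(rpm > 0);
    return 60000.0 / rpm;
}

/* Average rotational delay, assuming the wanted sector is uniformly
 * distributed around the track: half a rotation. */
double avg_rotational_latency_ms(unsigned rpm)
{
    return rotation_ms(rpm) / 2.0;
}
```

For 7200 RPM this gives roughly 8.33 ms per rotation and about 4.17 ms on average, which matches the "<5ms on average" estimate; seek time comes on top of that.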

20:58 <geist> that's more variable, depends on the drive. dunno what modern times are, but i think in general you hear average latencies of like 12ms or so, so i think that's taking into account average seek + average rotational

20:59 <geist> though i dont know what the distribution of that is

20:59 <geist> and of course the drive is free to start reading whatever sector it finds as soon as the head gets to it and start filling in a track cache buffer

20:59 <mrvn> I always wondered if they offset sectors on neighbouring tracks so that a sequential read can seek and get the start of the sector right away.

20:59 <geist> so subsequent sector reads off the same track are probably going to be either ready because it already spun by, or will soon be there (within 8ms)

21:00 <geist> traditionally with stuff like floppy disks you actually can interleave the sectors, like 2:1 or 3:1

21:00 <geist> so you get say sector 0 3 6 9 ... and then 1 4 7 10 ...

21:00 <geist> that way as the sectors go by the host/chipset has time to decide if it wants to read the next logical sector
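
A sketch of how such an interleaved layout can be computed, assuming the usual "step k slots, take the next free slot on collision" rule. With 12 sectors and a 3:1 interleave it yields physical slots 0 3 6 9 for logical sectors 0..3, then 1 4 7 10 for logical sectors 4..7. The function and parameter names are hypothetical.

```c
#include <assert.h>

/* Compute a k:1 sector interleave for a track with n sectors.
 * slot_of[i]    = physical slot that holds logical sector i
 * logical_at[s] = logical sector stored in physical slot s */
void interleave_layout(int n, int k, int slot_of[], int logical_at[])
{
    assert(n > 0 && k > 0);

    for (int s = 0; s < n; s++)
        logical_at[s] = -1;               /* -1 marks a free slot */

    int pos = 0;
    for (int i = 0; i < n; i++) {
        while (logical_at[pos] != -1)     /* slot taken: try the next one */
            pos = (pos + 1) % n;
        logical_at[pos] = i;
        slot_of[i] = pos;
        pos = (pos + k) % n;              /* leave k-1 slots of thinking time */
    }
}
```

Read in physical order, the 12-sector example comes out as 0 4 8 1 5 9 2 6 10 3 7 11, so two other sectors pass under the head between each pair of consecutive logical sectors.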

21:00 <mrvn> true. Seek, wait for a "start of block" marker and start reading a complete revolution. Then shuffle the blocks around.

21:01 <geist> now of course how the disk controller lays out the sectors and what tracks are where is totally up to the disk

21:01 <geist> but at least with old things like floppy disks its still easy to understand the logic

21:02 <mrvn> That reminds me of the floppy for the C64. The CPU is too slow (or so they thought) to decode the 10:8 encoding while reading the data so you would read a block, decode a block, read a block, decode a block. Half speed. Only like last decade someone figured out to do it interleaved.

21:02 <geist> i assume even with SSDs there's read-ahead logic that starts to see where the next N pages are via the translation table and starts prefetching data from those flash chips + banks

21:03 <mrvn> even DRAM does that

21:03 <geist> mrvn: yep. apple2 did the same thing. it was even more basic since it didn't have a 6502 on board. since the host cpu itself was basically directly looking at the data under the head i think it generally used a 3:1 interleave so the cpu had enough time to process what it had just seen before

21:03 <geist> totally soft sectored

21:04 <mrvn> and that's just incrementing the row/column address for the next block of data.

21:04 srjek|home has joined #osdev

21:04 <geist> that was woz's great invention re: the apple 2 floppy drive. it was totally dump, offloaded all the work to the host cpu which was basically written in hand assembly to have the timing precisely right

21:04 <geist> as a result the disk drives were very cheap compared to almost all the other 8 bit micros at the time

21:04 <geist> s/dump/dumb

21:05 <mrvn> I wonder if I have my old C128+floppy somewhere in my parents basement.

21:05 <geist> yah i have an 1570 drive over here somewhere

21:05 <geist> last i fiddled with it it works fine

21:05 <mrvn> I had a 1571. No turning over the disk.

21:06 <geist> noice

21:06 <geist> a C128 i'd actually like to have, but they're getting pricey. much more rare than a c64

21:06 <mrvn> But it had that rotation lock mechanism that easily breaks and not the push down lever.

21:06 <mrvn> geist: 6502 + Z80 cpu.

21:07 <geist> yeah there are some great youtube vids about the history of it. including talking to some of the guys that designed it

21:07 <geist> it was a fascinating tale. they were just throwing everything at it and surprised they could even make it

21:07 <geist> commodore was already spiralling out of control at the time

21:07 srjek_ has quit [Ping timeout: 260 seconds]

21:08 ZombieChicken has quit [Ping timeout: 255 seconds]

21:08 JudgeChicken has joined #osdev

21:15 sympt7 has quit [Ping timeout: 246 seconds]

21:18 <heat> block groups still make some sense

21:18 <heat> vs having a big table of inodes and shit

21:18 wootehfoot has quit [Ping timeout: 260 seconds]

21:19 <heat> particularly if expanding/shrinking a filesystem, or checksumming stuff

21:20 <heat> Jari--, tianocore has ext4 support, its up to the platform builder to enable it

21:20 <heat> those nvidia SBCs have it

21:21 wootehfoot has joined #osdev

21:21 JudgeChicken is now known as ZombieChicken

21:22 wootehfoot has quit [Client Quit]

21:24 <mrvn> heat: on the other hand inode tables make not much sense.

21:25 <heat> sure they do, for expansion

21:25 <mrvn> heat: inode tables, not block groups

21:25 <geist> well it's the whole preallocating the inodes and thus having a finite number of them on fixed locations that i think is obsolete

21:25 <heat> why does an inode table not make sense?

21:25 <heat> finding an inode is O(1)

21:25 <mrvn> because it limits you to a fixed inode per block ratio

21:25 <geist> it makes sense, it's just obsolete

21:26 * heat rants on fancy pants filesystems cough cough zfs and btrfs

21:26 <mrvn> using the block location as inode number has O(1) finding too

21:27 <geist> right, that's exactly the strategy bfs did

21:27 <mrvn> and xfs and btrfs

21:27 <geist> well those are different, because an inode is part of one or more btrees

21:28 <geist> i consider that a completely different class. there's a tree, the tree has all these data structures spread across it. an inode is an amalgam of those data structures, and part of the index is the inode #

21:28 <mrvn> using the block location makes it trivial to pick a number and making sure it's unique and tells you how to find it all in one simple algorithm.
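
Both O(1) lookups from the discussion, as a hedged C sketch. The first follows the ext2-style idea of a fixed inode table per block group (1-based inode numbers); the second treats the inode number as the block address itself. Struct, field and function names are invented for illustration and do not mirror any real on-disk format.

```c
#include <assert.h>
#include <stdint.h>

/* Where an inode lives on disk: block number plus byte offset in it. */
struct inode_loc {
    uint64_t block;
    uint32_t offset;
};

/* Fixed inode tables: each group has a table at table_start[group],
 * holding inodes_per_group inodes of inode_size bytes each. */
struct inode_loc inode_table_lookup(uint32_t ino,
                                    uint32_t inodes_per_group,
                                    uint32_t inode_size,
                                    uint32_t block_size,
                                    const uint64_t table_start[])
{
    assert(ino >= 1 && inodes_per_group > 0 && block_size > 0);

    uint32_t group = (ino - 1) / inodes_per_group;
    uint32_t index = (ino - 1) % inodes_per_group;
    uint64_t byte  = (uint64_t)index * inode_size;

    struct inode_loc loc;
    loc.block  = table_start[group] + byte / block_size;
    loc.offset = (uint32_t)(byte % block_size);
    return loc;
}

/* Block address as inode number: the number is the location, so there
 * is no table and no fixed inode-to-block ratio. */
struct inode_loc inode_at_block(uint64_t ino)
{
    struct inode_loc loc = { ino, 0 };
    return loc;
}
```

The table variant caps the number of inodes at format time; the block-address variant does not, but a recovery scan then needs some other way to tell inode blocks from file data.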

21:29 <geist> right. however as i was stating before it has a serious flaw: when doing a fsck to try to fix it you can't easily tell an inode from file data

21:29 <geist> so you need a solid way to determine the difference, and such you need another mechanism to differentiate metadata blocks from data blocks

21:29 <mrvn> yeah, that's the drawback. But easily solved by including an FS uuid

21:29 <geist> so you're kinda back to square one, unless you also have a data structure that describes that

21:29 <geist> the FS uuid doesn't easily solve it at all. you can still easily maliciously include some inode in your file data

21:30 <geist> and then next fsck picks it up

21:30 <mrvn> true, it only helps against accidental inclusion

21:31 <mrvn> damn those hackers trying to insert a SUID root bash into the FS.

21:31 <geist> xfs and btrfs at least have layers of allocation schemes, so you already have a fairly clear notion of which stripe/allocation group/etc is dedicated to metadata or file data

21:31 <mrvn> zfs has block groups and each group can be data, metadata or in case of emergency: mixed.

21:31 <geist> i dunno how say XFS solved it. NTFS solves it by having all FILE records exist inside a metadata file itself. it's recursive, because the metadata file ($MFT) is also described by the MFT

21:32 <geist> but there are mechanisms there to help

21:32 <mrvn> that doesn't help in the NTFS case when the metadata file is corrupted

21:33 <mrvn> And that's the case where data blocks get picked up as metadata. When you can't follow the metadata to all files anymore.

21:33 <geist> right but there are mechanisms there to avoid it

21:33 <mrvn> Otherwise you just start at / and scan all directories.

21:33 <geist> notably the first N blocks of the MFT are guaranteed to be in one spot (described by the superblock) and those file records are used to describe the MFT itself

21:34 <geist> so it's pretty hard to corrupt the MFT's description of where itself is located

21:34 <mrvn> Where you pick up data by accident is when scanning for lost / unconnected files.

21:34 <geist> right

21:34 <mrvn> or deleted files.

21:34 <geist> in NTFS you'd just walk the MFT and look for all FILE records that say they are active, but appear to be unlinked

21:34 <geist> OTOH NTFS is also journalled, so in general it's pretty hard to get it too corrupted

21:35 <geist> as are any systems that have a functional journal

21:35 <mrvn> I always like to have multiple ways to verify files

21:37 <geist> yah and this is why more modern stuff has checksums of stuff

21:43 <heat> ufs and ext2 ideal filesystems

22:53 <geist> so interesting: playing with PCI passthrough on proxmox (and thus qemu/kvm)

22:53 <mjg> normie

22:53 <geist> so there are limitations. turns out for some dumb reason both of my server like machines with multiple on board nics put all of the nics in the same iommu group

22:53 <heat> j`ey, so arm64 linux has every kmalloc 128-byte aligned

22:53 <heat> wtf??

22:53 <geist> so trouble is you can't pull one device out of an iommu group and pass it through

22:54 <geist> it takes all the rest of the devices out

22:54 <j`ey> heat: yeah, hence the series

22:54 <geist> seems to be in both cases all of the motherboard nics are in the same group. both an intel xeon nehalem machine and a more modern ryzen

22:54 <heat> how has this not resulted in crippling fragmentation

22:54 <geist> of course any nics in slots end up in a different group

22:55 <heat> 8 byte allocations get 120 bytes of alignment
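
A back-of-the-envelope sketch of the waste being described, assuming a simplified allocator whose size classes are powers of two starting at a minimum bucket. Real Linux also has non-power-of-two caches such as kmalloc-96 and kmalloc-192, which this ignores; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power-of-two bucket >= size, never smaller than min_bucket. */
size_t kmalloc_bucket(size_t size, size_t min_bucket)
{
    assert(min_bucket > 0);

    size_t bucket = min_bucket;
    while (bucket < size)
        bucket *= 2;
    return bucket;
}

/* Bytes of padding a request of the given size ends up carrying. */
size_t kmalloc_waste(size_t size, size_t min_bucket)
{
    return kmalloc_bucket(size, min_bucket) - size;
}
```

With a 128-byte minimum an 8-byte request lands in the 128-byte bucket and carries 120 bytes of padding; with an 8-byte minimum it carries none.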

22:56 <geist> apparently there are some things you can do with it (something called ACS) but in general if it puts things in the same group seems you're outta luck

22:56 <heat> arm32 aligns to the L1 cache size

22:57 <heat> which is also horrible but less so

22:57 <geist> what do you mean 128? that's the largest L1 cache line size in arm

22:57 <geist> thats why

22:58 <heat> yeah, they're aligning every kmalloc allocation to the L1 cache size

22:58 xenos1984 has quit [Read error: Connection reset by peer]

22:58 <mjg> heat: *every*?

22:58 <geist> now a slightly better thing is to run time look up the L1 cache size, which is almost always 64

22:58 <heat> mjg, yes

22:58 <geist> but i know of at least one ARM machine that is 128

22:58 <geist> (cavium thunderx1)

22:59 <mjg> heat: does not sound legit

22:59 <mjg> sounds like a huge memory waste

22:59 <heat> https://lore.kernel.org/linux-mm/20221106220143.2129263-1-catalin.marinas@arm.com/T/#m1b9dd9b18bfab9054b316ad749184b87a492bee8

22:59 <bslsk05> lore.kernel.org: [PATCH v3 00/13] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8

22:59 <mjg> not saying this means linux would not do it

23:00 <geist> sure, i mean sometimes you gotta crack some eggs

23:00 <heat> it looks like this is very much a thing on most architectures

23:01 <geist> but yeah my guess is they are noting the fact that folks may be wanting to allocate things that they're DMAing into out of kmalloc (which is perhaps the real issue)

23:01 <geist> but if you do, you almost always gotta arrange for your buffer to exist in its own cache line(s) and not aliased with anything else

23:01 <geist> yay no cache coherency

23:01 <heat> yeah

23:01 <geist> i have some cheezass mechanism in LK to allocate +cache_line-1 and then align within that boundary or something
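
A user-space sketch of the "allocate cache_line-1 extra and align inside it" trick, assuming malloc stands in for the kernel allocator. It also rounds the size up to whole cache lines so the end of the buffer does not share a line with a neighbouring object. This is not LK's actual code; dma_buf_alloc and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Return a buffer that starts on a cache-line boundary and covers whole
 * cache lines. *raw_out receives the pointer that must be passed to
 * free() later. Returns NULL (and sets *raw_out to NULL) on failure. */
void *dma_buf_alloc(size_t size, size_t line, void **raw_out)
{
    assert(line > 0 && raw_out != NULL);

    /* Round the payload up to a multiple of the line size. */
    size_t padded = (size + line - 1) / line * line;

    /* line - 1 extra bytes guarantee an aligned start exists inside. */
    void *raw = malloc(padded + line - 1);
    *raw_out = raw;
    if (raw == NULL)
        return NULL;

    size_t misalign = (size_t)((uintptr_t)raw % line);
    size_t skip = misalign ? line - misalign : 0;
    return (char *)raw + skip;
}
```

The cost is up to 2 * line - 2 bytes of slack per buffer, paid only by the callers that actually do DMA rather than by every small allocation.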

23:02 <geist> these sort of dma buffers and whatnot are a big issue when porting cache coherency naive code to something like ARM

23:03 <mjg> heat: i would expect this is for some form of aligned alloc

23:03 <mjg> not straight up kmalloc

23:03 <geist> my guess is there's some api that bottoms out in kmalloc, so they had to pass it down

23:04 smach has joined #osdev

23:04 <heat> * kmem_cache_alloc and friends return pointers aligned to ARCH_SLAB_MINALIGN.

23:04 <heat> * kmalloc and friends return pointers aligned to both ARCH_KMALLOC_MINALIGN

23:04 <heat> * and ARCH_SLAB_MINALIGN, but here we only assume the former alignment.

23:04 <heat> */

23:04 <geist> ah there you go

23:05 <heat> arch_slab_minalign is alignof(unsigned long long)

23:05 <geist> thankfully we avoided basically all of these sort of shenanigans in zircon kernel by not doing any of that in the kernel

23:06 <mjg> heat: wut

23:06 <mjg> utter nonsense

23:07 <heat> this may not be a big deal

23:08 <heat> depends on the kernel's allocation pattern

23:08 <geist> maybe kmalloc is less of a concern, since most stuff is slab allocated

23:08 <heat> hmm idk about that

23:08 <geist> guess that's easy enough to figure out by looking at the slab info

23:08 <heat> common stuff like file, vm_area_struct, etc, sure

23:09 <heat> but idk if those represent the most allocations

23:09 <geist> and heh yeah. if you look at /proc/slabinfo on an arm64 machine, the lowest bucket is -128

23:09 <geist> goes down to 8 on PC

23:10 epony has quit [Quit: QUIT]

23:10 <heat> i don't like -8

23:10 <heat> i feel like 8 is way too little alignment

23:11 <geist> https://pastebin.com/WEaPhs7q

23:11 <bslsk05> pastebin.com: geist@rpi4a:~$ sudo cat /proc/slabinfo slabinfo - version: 2.1# name - Pastebin.com

23:11 <mjg> lol

23:11 <mjg> kmalloc-128 45312 45312 128 32 1 : tunables 0 0 0 : slabdata 1416 1416 0

23:11 <mjg> indeed the smallest one

23:11 <mjg> what the fuck

23:12 <mjg> seriously

23:12 <mjg> my amd64 laptop goes all the way down to 8

23:13 <heat> see, this is why arm64 is a shit shit architecture lol risc loser go back to mips x86 rulez intel amd oh yeah

23:13 <mjg> here is ONE WEIRD TRICK: fix whatever fucking api they are using to kmalloc_aligned or whatever it is

23:13 <mjg> how the fuck did something like this go in

23:13 <heat> mjg, OR

23:13 <geist> mjg: we just established it. it's because of DMA cache coherency on ARM

23:13 <heat> dma-kmalloc-16

23:14 <heat> isn't there a flag for this? GFP_DMA or something

23:14 <mjg> geist: except most kmalloc'ed bufs are not used for dma, are they

23:14 <geist> that's the point!

23:14 <geist> they are. and that's the problem

23:14 <mjg> what?

23:14 <geist> probably beause no one gives a shit on x86 so they built the api like that

23:14 <geist> and then on architectures where there's a problem, the 'temporary' hack was to raise the minimum bucket to maximum cache line size (128)

23:14 <mjg> they can still fix it?

23:15 <mjg> i see 0 justification in the patchset as to why they don't do that instead

23:15 <geist> i dunno, but that's how solutions are made. bandaid over it, fix it later. not all things are perfect all the time

23:15 <heat> GFP_DMA is a thing man

23:15 <heat> i think they could just use it

23:15 <mjg> geist: that's how crap lasting decades is made

23:15 <geist> making it locally work on your new architecture (arm64) is a lot easier than fixing the underlying problem. it's the essence of Real Engineering

23:16 <geist> you can't always fix everything perfectly the first time

23:16 <mjg> cue you in 10 years claiming well it clearly made sense bro, not a hack!

23:16 <geist> *shrug*

23:16 <geist> i mean something working sub-optimally is far better than not working at all

23:16 <heat> yeah i mean

23:16 <geist> what would make me more worried is a hack like this actually improving performance, such that it's harder to remove later

23:16 <heat> this is unix

23:16 <mjg> if they demonstrate how infeasible it is to fix

23:16 <mjg> then sure

23:16 <geist> by having less cache lines alias, etc etc

23:17 <mjg> but i see nothing of the sort

23:17 <mjg> so far looks like a lazy cop out, webdev style

23:17 <geist> most likely the global, architecturally neutral api expects kmalloc to work fine

23:17 xenos1984 has joined #osdev

23:17 <geist> and so fixing it involves rethinking/retooling that

23:17 <geist> and so the local (arm64) hack is this

23:17 <heat> this is defo not an arm64 only thing

23:18 <geist> you should look at something like arch/ppc and how it has to deal with page tables in linux

23:18 <mjg> or *maybe* replacing with places of kmalloc with kmalloc_aligned(..., ARCH_DMA_ALIGN)

23:18 <geist> it's a total clusterfuck. but linux.

23:18 <mjg> or similar

23:18 <mjg> which would change nothing on x86

23:18 <geist> heat: probably anything not x86 that doesn't have dma cache coherency

23:18 <mjg> but again 0 analysis performed

23:18 <mjg> this is the kind of bullshit i normally expect in the bsd land

23:18 <mjg> :]

23:19 <geist> but anyway i have no real idea. we're seriously armchair quarterbacking this thing

23:19 <heat> arm64chair you mean

23:19 <geist> eyyyooo

23:19 <heat> ok here's a funny commit

23:19 <heat> https://github.com/Tencent/TencentOS-kernel/commit/adb335972fcb7a6b59bb8034498b1ffddfb37c97

23:19 <bslsk05> github.com: ampere/arm64: Add a fixup handler for alignment faults in aarch64 code · Tencent/TencentOS-kernel@adb3359 · GitHub

23:19 <geist> (and yes i know it's an american football thing but i dont know of a global equivalent to 'armchair quarterbacking' as a phrase)

23:20 <heat> context https://twitter.com/marcan42/status/1589193324954804225

23:20 <bslsk05> twitter: <marcan42> OMG. So it turns out Ampere Altra botched their PCIe controller in a way that makes it unable to use (e)GPUs just like Macs. ␤ ␤ So what did they do? ␤ ␤ They put an ARM64 load/store emulator into the kernel. <github.com/Tencent/Tencen… https://t.co/HS6NOK4Y90> <github.com/Tencent/Tencen… https://t.co/2UFPBLMmcf>

23:21 <mjg> what's the point of linux existing if you can't shit on it

23:21 <mjg> :thinkingface:

23:21 <geist> heat: heh i've actually seen worse than that in vendor local linux trees

23:21 <geist> there was a tree years ago where $vendor just inserted memory barriers all over the place, including in some core linux macros

23:22 <geist> fixed the problem

23:22 <j`ey> more barriers less problems

23:22 <heat> there you have it

23:22 <heat> fixed

23:22 <mjg> :]

23:22 <heat> 🚢it

23:22 <mjg> have fun removing them

23:22 <geist> i eventually dug into it, turns out there was a serious memory controller bug that would occasionally reorder instruction fetches in front of data fetches

23:22 <mjg> geist: lol

23:22 <geist> so it was still a bug but really only needed to be when reading data into a block you were going to run

23:22 <mjg> gotta love the hw bugs

23:23 <geist> but their hack had been to just sprinkle isbs all over the kernel until the problem went away

23:23 <geist> with of course no explanation

23:23 <mjg> :]

23:23 <heat> have you seen an unmoveable printk

23:23 <geist> turns out they knew about the bug, so they could have at least more correctly fixed it.

23:23 <heat> / DONT REMOVE OR ELSE IT BREAKS

23:23 <geist> heh yeah

23:23 <mjg> where

23:23 <mjg> i did not

23:23 <geist> anyway gonna go take a walk. toodles

23:24 <heat> i guess we'll wait

23:24 <mjg> so how are things in the onyx land

23:24 <mjg> support wifi yet?

23:24 <heat> no

23:25 <mjg> then i'm sticking to serenityos for my daily driver

23:25 <heat> i found a bug in my KASAN quarantine code and i'm trying to track down the problem

23:25 <mjg> started listened to can't hurt me yet?

23:25 <mjg> listening

23:25 <heat> serenity supports wifi?

23:25 <heat> (X) Doubt

23:25 <mjg> fuck if i know

23:25 <mjg> does not stop me from picking it over onyx

23:25 <heat> well sure

23:26 <heat> they have a web browser, i have flamegraphs

23:26 <heat> you choose

23:26 <mjg> :thinkingface:

23:26 <heat> enjoy their LibJS, i'll enjoy my flamegraph.pl

23:26 <mjg> need to sleep on it

23:27 <j`ey> heat: did you see one of their libjs dudes is on the js community in some form now

23:27 <heat> nope

23:27 <heat> i have limited brain power

23:28 <geist> also just as i was leaving i saw a video about a guy that made a 3 meter long concrete sarcophagus with a bag of flaming out cheetos in it and buried it, not to be opened for 10000 years

23:29 <mjg> i'll set a calendar event for it

23:29 <mjg> is this stored next to nuclear waste

23:29 <heat> hopefully serenityos doesn't have the 2038 epoch issue

23:31 <heat> hey, they stole freebsd jails

23:31 <mjg> wut?

23:31 <heat> didn't they also steal pledge and unveil

23:31 <heat> https://github.com/SerenityOS/serenity/commit/5e062414c11df31ed595c363990005eef00fa263

23:31 <bslsk05> github.com: Kernel: Add support for jails · SerenityOS/serenity@5e06241 · GitHub

23:31 <mjg> let's just hope they did not steal code quality

23:32 <heat> every late stage POSIX hobby OS turns into a BSD

23:32 <heat> which I find shameful. at least turn into Linux

23:32 <mjg> i have to note freebsd jail came with several stupid security problems

23:32 <mjg> which would never show up if this was thought out, which it was not

23:33 <heat> freebsd taking a page from openbsd's book

23:33 <mjg> and i know because one day i decided i'm gonna find some

23:33 <mjg> and checked the most obvious stuff

23:33 <mjg> ... and what do you know, broken

23:33 <mjg> most notably concerning transition from the host system into the jail

23:34 <heat> have you seen https://isopenbsdsecu.re/

23:34 <bslsk05> isopenbsdsecu.re: Is OpenBSD secure?

23:34 <heat> it's hilarious

23:34 <mjg> no

23:34 <mjg> hm now that i see the slides i have a weak recollection

23:35 vdamewood has quit [Read error: Connection reset by peer]

23:35 <heat> "No public code reviews"

23:35 <heat> ok heat@

23:35 stux has joined #osdev

23:35 <mjg> i remember how their bespoke "pam is broken so let's write out own" pam replacement turned out to have a retarded bug giving root

23:36 vdamewood has joined #osdev

23:36 <mjg> basically any time someone with security clue takes a look at openbsd, it turns out there is serious problems there

23:36 <mjg> which fortuntaely for them is rare

23:37 <mjg> https://blog.qualys.com/vulnerabilities-threat-research/2019/12/04/openbsd-multiple-authentication-vulnerabilities

23:37 <bslsk05> blog.qualys.com: OpenBSD Multiple Authentication Vulnerabilities | Qualys Security Blog

23:38 eryjus has joined #osdev

23:38 eryjus has quit [Client Quit]

23:39 smach has quit [Remote host closed the connection]

23:47 ZombieChicken has quit [Remote host closed the connection]

23:51 Burgundy has left #osdev [#osdev]

23:54 <heat> mjg, have you seen the stupid codegen patches they have?

23:54 <heat> xchg A, B; mov B, α; xchg B, A instead of mov A, α.

23:55 <mjg> is that some form of spectre mitigation?

23:55 <heat> rop gadget removal

23:55 <mjg> i don't know if that's any good for the intended purpose

23:55 <mjg> i can tell you for a fact they are slow single threaded cause geezer

23:56 <mjg> and the above is probably not a big deal in the grand scheme of things

23:56 <heat> from what that dude says, removing rop gadgets is stupid because you'll always find some

23:56 <Mondenkind> https://cos.ufrj.br/uploadfile/publicacao/3061.pdf grumble

23:57 <mjg> did he try on openbsd?

23:57 <mjg> sec papers are not to be trusted man, just like perf :>

23:58 <Mondenkind> the thing that really annoys me about ROP

23:58 <Mondenkind> is that you can just ('just') do separate call/data stacks

23:58 justache has quit [Quit: ZNC 1.8.2 - https://znc.in]

23:59 <Mondenkind> it's just as fast, and it's abi-compatible