#osdev on 2022-02-21 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:00 <mrvn> you only need libgcc

00:00 <heat> except that I can't link against libgcc

00:01 <mrvn> is compiler-rt the clang equivalent?

00:01 <heat> yes

00:01 <Griwes> compiler-rt builtins is the clang equivalent, there's a lot more stuff that falls under compiler-rt

00:01 <heat> libgcc also includes unwinding

00:02 <heat> I'm just compiling the builtins part

00:02 <mrvn> do you have exceptions turned off?

00:02 <heat> of course

00:02 <Griwes> unwinding in llvm is separate, in libunwind

00:02 <heat> yea

00:02 <mrvn> you are doing something strange that needs something you aren't providing. *shrug*

00:03 <mrvn> using gcc + libgcc works for me. Maybe wait for someone using clang to wake up.

00:04 <heat> no, I don't need help

00:04 <heat> I've fixed it

00:04 <heat> you usually can't really link with libgcc/compiler-rt

00:05 <heat> so you either ruin your beautiful toolchain just to compile some special libgcc with multilib, or you need to take a different approach

00:06 <Griwes> wdym can't really link with compiler-rt

00:06 <heat> either 1) don't link with libgcc, but that ruins some division and some builtins; 2) roll your own; 3) import

00:06 <mrvn> heat: you can totally link against libgcc from your cross compiler.

00:06 matrice64 has quit [Quit: Textual IRC Client: www.textualapp.com]

00:06 <heat> no I can't lol

00:06 <Griwes> why

00:06 <heat> well, for example, in x86_64, compiler-rt and libgcc are compiled with the red zone enabled

00:06 <heat> can't use that in kernel space

00:07 <heat> if you don't have your toolchain target set up right, libgcc will compile with mcmodel=small, and that doesn't work in kernel space

00:07 <heat> (you can make it compile everything as PIC, but that's non obvious for a beginner that isn't patching gcc)
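
Editor's sketch of heat's option 2 ("roll your own"): when a target has no native instruction for an operation, the compiler may emit a call to a runtime helper (for example `__udivdi3` for 64-bit unsigned division on some 32-bit targets), and that helper has to come from somewhere. The function below is a minimal shift-subtract divider that a kernel could build with its own flags (on x86_64 typically something like `-ffreestanding -mno-red-zone -mcmodel=kernel`). The name `k_udivmod64` is hypothetical and not part of libgcc or compiler-rt.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel-side helper: restoring (shift-subtract) division.
 * Returns the quotient; stores the remainder through rem if it is non-NULL. */
uint64_t k_udivmod64(uint64_t num, uint64_t den, uint64_t *rem)
{
    uint64_t quot = 0;
    uint64_t r = 0;

    if (den == 0) {                 /* a real kernel would likely panic here */
        if (rem)
            *rem = 0;
        return UINT64_MAX;
    }

    for (int bit = 63; bit >= 0; bit--) {
        uint64_t carry = r >> 63;   /* bit that falls off when r is shifted */

        r = (r << 1) | ((num >> bit) & 1);
        if (carry || r >= den) {    /* the partial remainder holds one more den */
            r -= den;
            quot |= (uint64_t)1 << bit;
        }
    }

    if (rem)
        *rem = r;
    return quot;
}
```

Because it is ordinary C compiled together with the kernel, it inherits the kernel's code model and red-zone settings, which is the property heat is after.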

00:08 <mrvn> heat: x86_64 is so screwed up

00:08 <heat> in riscv, object files are tagged with soft-fp/hard-fp; you either add multilibs of soft-fp (whyyyyyyyyyy) or you can't link because ld doesn't mix soft-fp and hard-fp object files

00:09 <mrvn> the arm libgcc compiles right out of the box.

00:09 <heat> cool, but that doesn't really work a lot of the time

00:09 <heat> and I bet it won't work if your kernel is PIC or something

00:10 <mrvn> heat: does, tried that. It's just that PIC isn't PIC, just relocatable.

00:10 masoudd has joined #osdev

00:11 <mrvn> heat: works with and without mmu too. The default multilib setup seems to be fine.

00:11 <mrvn> s/mmu/fpu/

00:13 <heat> anyway, the point is that it's common to have incompatible libgccs and compiler-rts that compile for a regular user environment and not a kernel env

00:14 <mrvn> heat: sure, you need the standalone compiler. Not one for userspace.

00:14 <heat> so you either mess around with multilib options (you should know that it compiles everything two or three times!) and llvm (good luck diving through the cmake for something that works reliably)

00:15 Oli has quit [Ping timeout: 240 seconds]

00:15 <heat> except that my OS actually has a user-space so I do need one for user-space

00:15 <heat> and your -elf targets don't fix anything

00:15 <mrvn> heat: for your userspace you are screwed, you have to patch gcc with all the multilib nightmare.

00:15 <heat> not really

00:15 <Griwes> I mean you need separate environments for the two anyway

00:15 <Griwes> for sanity

00:15 <heat> you could do that

00:15 <heat> or

00:16 <heat> you add compiler-rt builtins as a library for the kernel, takes 3 or 4 seconds to build and there ya go, works

00:16 <heat> liberally licensed too

00:17 <mrvn> apparently it doesn't work or you wouldn't have your problems.

00:17 <heat> i don't have problems

00:17 <heat> they're 100% fixed lol

00:18 <Griwes> idk my builtins archive does look position independent, and I would not expect the builtins to actually want to use the redzone

00:19 <heat> it's regular C code, it 100% can and will if the compiler wants to

00:19 <heat> can also use SIMD lol

00:19 <mrvn> redzone is a stupid idea anyway. Either the function is so trivial it should be inlined or doesn't spill anything anyway. Or it's so complex that incrementing the stack is free.

00:21 <heat> using a libgcc/compiler-rt that was built for a user environment is frail

00:21 <heat> you either multilib the shit out of them or you pray that they don't do what you don't want them to do

00:21 <mrvn> nobody is suggesting that heat

00:22 <heat> every single libgcc is built for a user environment, that's my point

00:22 <heat> the "bare metal" targets just say "hey, no libc here"

00:22 Oli has joined #osdev

00:22 <mrvn> heat: and should set all the right multilib flags for bare metal libgcc

00:22 <heat> no

00:22 <mrvn> then start filing bugs

00:23 <heat> no

00:23 <heat> how would they know what you want to do?

00:23 <heat> do gcc maintainers decide that bare metal = kernel, or bare metal is simple bare-metal-ish app, etc?

00:24 <mrvn> heat: it's a standalone compiler, it needs to use the flags that make it safe for that use. Like no redzone.

00:24 <heat> the bare metal targets are just no-libc, simple targets with all the fancy stuff turned off

00:24 <mrvn> and then compile a bunch of libgcc, like softfloat, hardfloat, ...

00:24 <heat> mrvn, except that a redzone isn't inherently unsafe in a bare metal app

00:25 <mrvn> heat: redzone needs special support. no redzone always works.

00:25 <heat> the redzone is part of the sysv ABI

00:25 <Griwes> there's ways to specify flags for specific targets within an llvm runtimes build even without patching llvm, I should try and see how well that works

00:27 <heat> Griwes, https://github.com/heatd/llvm-project-rust/blob/rustc/13.0-2021-09-30/clang/cmake/caches/Onyx-stage2.cmake#L156

00:27 <bslsk05> github.com: llvm-project-rust/Onyx-stage2.cmake at rustc/13.0-2021-09-30 · heatd/llvm-project-rust · GitHub

00:27 <heat> half stolen from fuchsia but works like a charm

00:27 <heat> just set those at cmake time

00:33 <mrvn> "Pinky: Gee, Brain, what do you want to do tonight? Brain: The same thing we do every night, Pinky - try to take over the world!"

00:42 * kingoffrance points mrvn at scrollback from a week or two ago "he who controls the spice, controls the universe!" ancient module song

00:45 Oli has quit [Ping timeout: 240 seconds]

00:48 Oli has joined #osdev

00:56 <mrvn> fear is the mind killer

01:04 <heat> qemu-system-aarch64's -kernel doesn't give you a dtb and can even load you in ROM

01:04 <heat> if it knows you aren't linux

01:05 <heat> as soon as I flatten my elf file it seems I'm linux and does everything I've ever wanted

01:05 <heat> loads me dynamically in memory too

01:28 dude12312414 has joined #osdev

01:30 <klange> -aarch64 -kernel _does_ give you a dtb

01:30 <klange> _if_ you don't ask to be loaded somewhere that clobbers the default location

01:33 <klange> ThinkT510: I had a rust highlighter in C and I guess I forgot to port it to Kuroko when I switched over all the other highlighters; thanks for the reminder, I'll try to get around to it :)

01:34 <heat> klange, how? the registers are all 0

01:34 <klange> where are you asking to be loaded?

01:34 <klange> it wants to put the dtb at start of ram and it pads it out to a whole juicy megabyte

01:35 <heat> 0 because I want the bootloader to figure that out

01:35 <heat> it loaded me at 0

01:35 <heat> (it's ROM, but it did put me there)

01:35 <klange> What machine target?

01:35 <heat> virt

01:35 <klange> virt RAM starts at 0x4000_0000

01:36 <heat> so you hardcode that?

01:36 <klange> I am unsure if it supports loading PIE ELFs at addresses of its own choosing.

01:37 <heat> if you add the arm64 linux image header and flatten the ELF it loads you as a linux kernel

01:37 <klange> So for the path of least resistance, yeah, link yourself to be loaded at like 0x40100000
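
Editor's sketch of the arithmetic behind klange's suggestion, using only the numbers from the discussion: `virt` RAM starts at 0x4000_0000, QEMU places the dtb at the start of RAM padded to 1 MiB, so the first safe link address is 0x40100000. The macro and function names are hypothetical, and the check is deliberately conservative: it treats any load address below that point as one where the dtb should not be expected.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRT_RAM_BASE    0x40000000ull  /* start of RAM on the virt machine */
#define VIRT_DTB_RESERVE 0x00100000ull  /* dtb window, padded to 1 MiB */
#define VIRT_SAFE_LOAD   (VIRT_RAM_BASE + VIRT_DTB_RESERVE)

/* True if a non-Linux image linked at `load` sits in RAM above the dtb
 * window, so the dtb at VIRT_RAM_BASE should survive loading. */
bool virt_load_keeps_dtb(uint64_t load)
{
    return load >= VIRT_SAFE_LOAD;
}
```

With these constants, `virt_load_keeps_dtb(0)` is false, which matches heat's experience of being loaded at 0 with all registers zero.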

01:37 <heat> so it finds a place for you in memory and gives you the dtb

01:38 <klange> If you're fine with pretending to be Linux, then you do you.

01:38 <heat> there's not much pretending to be done

01:39 <heat> the whole boot protocol is "here's the dtb in a register, and here's your load address, gl hf"
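
Editor's sketch of the header heat is talking about. The layout below follows the 64-byte image header described in the Linux arm64 booting documentation; the helper names are hypothetical. The branch in `code0` assumes the real entry point immediately follows the header, the struct is filled in host byte order so it assumes a little-endian build machine, and the flag values (4K pages, "place me anywhere") are one plausible choice rather than what heat's kernel uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARM64_MAGIC_OFFSET 56

struct arm64_image_header {
    uint32_t code0;        /* executable code */
    uint32_t code1;        /* executable code */
    uint64_t text_offset;  /* image load offset */
    uint64_t image_size;   /* effective image size */
    uint64_t flags;        /* kernel flags */
    uint64_t res2;         /* reserved */
    uint64_t res3;         /* reserved */
    uint64_t res4;         /* reserved */
    uint32_t magic;        /* "ARM\x64" */
    uint32_t res5;         /* reserved */
};

_Static_assert(sizeof(struct arm64_image_header) == 64,
               "header must be 64 bytes");
_Static_assert(offsetof(struct arm64_image_header, magic) == ARM64_MAGIC_OFFSET,
               "magic must sit at offset 56");

static const unsigned char arm64_magic[4] = { 'A', 'R', 'M', 0x64 };

/* Fill in a header for a flat binary of image_size bytes. */
void arm64_header_init(struct arm64_image_header *h, uint64_t image_size)
{
    memset(h, 0, sizeof *h);
    h->code0 = 0x14000010u;            /* b +64: jump over the header (assumed entry) */
    h->image_size = image_size;
    h->flags = (1u << 1) | (1u << 3);  /* bits 1-2 = 4K pages, bit 3 = any placement */
    memcpy(&h->magic, arm64_magic, sizeof arm64_magic);
}

/* The kind of check a loader can use to decide "this is a Linux-style image". */
bool arm64_image_looks_like_linux(const void *image, size_t len)
{
    const unsigned char *p = image;

    return len >= sizeof(struct arm64_image_header) &&
           memcmp(p + ARM64_MAGIC_OFFSET, arm64_magic, sizeof arm64_magic) == 0;
}
```

Under that boot protocol the kernel is entered with the physical address of the dtb in `x0`, which is the "dtb in a register" part of heat's summary.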

01:41 <heat> i think it emulates uboot

01:41 <klange> https://qemu.readthedocs.io/en/latest/system/arm/virt.html

01:41 <bslsk05> qemu.readthedocs.io: ‘virt’ generic virtual platform (virt) — QEMU 6.2.50 documentation

01:41 <heat> which is better than hardcoding stuff

01:41 <klange> way down at the bottom

01:42 <heat> yeah, shame for the hardcoding you need to do

01:42 <klange> you'd think they could be nice enough to just pass in the dtb address in x0 regardless... "patches welcome", I guess but the QEMU patch submission process is _involved_...

01:42 <heat> i want to try and build a generic arm64 image

01:43 <klange> I went for platform shims and my generic kernel expects to be loaded at -2G

01:44 <klange> So there's the qemu virt platform shim that loads at 0x4010_0000, sets up some initial page tables to accomplish that, reads kernel from fw-cfg, and hands over

01:44 <klange> and then the RPi4 one has the kernel + ramdisk embedded in it with .incbin, loads at 0x80000, and does the same things plus setting up the framebuffer early for debug messages

01:44 <heat> oh right you don't even have -initrd right?

01:45 <klange> I can't figure out how it's providing the ramdisk location, it's not in /chosen

01:45 <heat> qemu or the rpi?

01:45 <klange> qemu

01:46 <clever> klange: it should be in chosen, but i think you need the .dtb file for that model for dtb to work right

01:46 <heat> i think i saw the initrd code being under if (is_linux)

01:46 <klange> but for virt it should all be automatically generated?

01:46 <clever> for qemu virt, i'm not sure

01:46 <klange> ^ and I have strong suspicions heat is right and they're just laughing at me without even bothering to print a warning that the initrd was ignored

01:46 <heat> klange, try info roms

01:47 <heat> it lists everything that qemu loads

01:47 <clever> i would just read the qemu src

01:47 <klange> I do that often, but this particular stuff is a mess.

01:47 <heat> I was forgetting to add -cpu so qemu-system-aarch64 was just exit(1)'ing

01:47 <heat> no error

01:47 <heat> just fuck you

01:47 <klange> nothing in `info roms`

01:48 <heat> it's not loading it then

01:48 <klange> it's the same thing with the f*ing dtb if you ask to be loaded too low, it's just not there and no warning

01:49 <klange> meanwhile rpi is still using ATAGs to hand Linux ramdisk addresses in the yold 3188 on aarch64

01:49 <heat> what's an ATAG?

01:50 <heat> klange: https://github.com/qemu/qemu/blob/0a301624c2f4ced3331ffd5bce85b4274fe132af/hw/arm/boot.c#L1120

01:50 <bslsk05> github.com: qemu/boot.c at 0a301624c2f4ced3331ffd5bce85b4274fe132af · qemu/qemu · GitHub

01:50 <klange> in days long past, before device trees, there were ATAGs - ARM tags.

01:51 <heat> guess it's simpler than editing the fdt at boot time

01:51 <AmyMalik> nyan

01:52 <heat> /bin/nyancat

01:52 <klange> started as a terminal escape sequence test, now it's mostly a test of SIGINT

01:53 <clever> klange: if the .dtb file is missing, the firmware switches to ATAGS automatically

01:54 <klange> I've got my hardware's dtb and have identified other things in it - actually dumped the whole thing once to the framebuffer which was, in retrospect, a mistake

01:54 <klange> (it was very long, and this was before I turned on the MMU)

01:54 <klange> (so it took several seconds)

01:54 <clever> heh

01:55 <clever> i recently tried to hexdump the bootrom to the framebuffer, but its giving me trouble

01:56 <klange> I'm working off of a fresh "Raspberry Pi OS" SD card image, with the EXT4 partition removed, so I've got the full set of dtbs and overlays and a modern mostly-default config.txt

01:56 <clever> ah

01:56 <heat> does the rpi not have its own dtb in rom?

01:56 <heat> or is that not a thing in rpi land?

01:56 <clever> heat: there is no dtb in the rom, the firmware loads it from the fat32 partition on the SD card

01:56 <clever> there is not one byte of arm code in the rom

01:57 <klange> it mashes together multiple files to support different software-selectable configs, even

01:57 * klange should switch to high peripherals

01:58 <clever> heat: the boot rom runs on the VPU, and loads a stage1 .bin file, stage1 then brings ram online and loads a stage2 .elf to the VPU, stage2 then loads the kernel, initrd, dtb, patches the dtb, and then turns the arm core on

01:58 <clever> heat: so the kernel+dtb is already fully in ram when the arm core runs its 1st opcode

01:58 <heat> that is stupid

01:59 <clever> heat: thats because the arm core was more of an after-thought, an optional accelerator core hanging off the side of an otherwise already complete soc

01:59 <clever> the arm was never the boss of the show

01:59 <clever> its like those compute pci-e cards you can add to a desktop

02:00 <heat> except it is lol

02:00 <heat> and they had multiple socs to fix that

02:00 <clever> the VPU also has a lot of intel ME or amd PSP like features

02:00 <clever> where you can sandbox off the arm core, and contain untrusted arm kernels

02:00 <clever> from before trustzone was a thing

02:00 <klange> every iteration of the rpi is full of insanity

02:01 <clever> firmware wise, its improving with every iteration

02:01 <clever> they just arent changing those insane hw choices

02:02 pretty_dumm_guy has quit [Quit: WeeChat 3.4]

02:02 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

02:03 <clever> for example, the 3d core on the 2711 changed too much

02:03 <heat> just run UEFI

02:03 <heat> go full SBSA

02:03 <clever> so they skipped having a closed 3d stack, and just went directly to only mesa/open-source 3d

02:03 <clever> same for the h265 decode block, it skipped the blob stage and only has source

02:04 <clever> and they have moved the 2d and camera subsystems almost entirely into the linux source tree

02:04 <heat> qemu's arm64 machine virt actually loads ACPI tables

02:04 <heat> actually, it doesn't load them but puts them in fw_cfg

02:04 <klange> I think someone forgot an if somewhere

02:05 <clever> you can also get uefi+acpi on a pi4, there is a tianocore build for it

02:05 <heat> I know

02:05 <heat> just for the 3 and 4 though

02:05 <clever> yeah, only on aarch64 platforms

02:05 <heat> dunno if the pi zero 2 is supported

02:05 <heat> probably not

02:06 <heat> I don't see the point though, considering it's chainloaded by more firmware

02:06 <klange> i got distracted by this deadlock, I should really be working on bringing up xhci...

02:06 <clever> i had a ticket open with plans on how to load tianocore from spi flash

02:06 <clever> so it would be more seamless

02:06 <heat> they added/are adding spi flash support, finally

02:07 <clever> noobs derailed the ticket with complaints about sd card reliability

02:07 <clever> they then closed the ticket, saying they would never add that feature

02:07 <heat> ticket where?

02:07 <clever> and now they implemented exactly what i wanted :P

02:07 <clever> https://github.com/raspberrypi/rpi-eeprom/issues/95

02:07 <bslsk05> github.com: SPI boot · Issue #95 · raspberrypi/rpi-eeprom · GitHub

02:07 <mrvn> heat: they might be making more money on the RPI than they ever did for the chip everywhere else.

02:08 <klange> close as won't fix → someone realizes actually that's a good idea → it happens anyway

02:08 <clever> heat: my request, was the ability to optionally load the official start4.elf from an external SPI flash chip, with my cover story being tianocore

02:08 <clever> heat: but secretly, i wanted it to be able to load a .elf from the internal SPI flash

02:08 <clever> heat: well, the net-install beta, does exactly what i wanted, it loads a .elf from internal SPI flash, lol

02:09 <heat> i do want tianocore :(

02:09 <clever> once i'm loading a custom .elf, i can load whatever else i want on the arm, like tianocore

02:11 <clever> heat: https://imgur.com/a/7iQWhad this shows each stage of the firmware, and what options can run at each, and what types they can load next

02:11 <bslsk05> imgur.com: Imgur: The magic of the Internet

02:12 <heat> well, ideally you just flash the firmware

02:12 <heat> no weird chainloading needed

02:12 <clever> for the pi0-pi3 range, the maskrom loads a bootcode.bin from the fat32 card, and ram is not online yet

02:12 <clever> there is no chip you can flash, and you must bring ram up to load more than 128kb of code

02:13 <heat> if they just loaded stuff from the SPI, you could have all the blobs in edk2 and use just that

02:13 <clever> and thats where some history comes in

02:13 <clever> the pi3 added usb-host and tftp drivers to the maskrom

02:13 <clever> it was full of bugs :P

02:13 <heat> no crappy sd card, no gpu loading stuff

02:13 <clever> pi4 learned from that mistake, and put the bootcode.bin on SPI flash

02:13 <clever> so they could change the code in the field

02:14 <heat> i guarantee that whatever they're doing in bootcode.bin, etc isn't much compared to intel platforms for instance

02:15 <clever> the old pi4 bootcode.bin, did dram init, and then loading start4.elf from one of sd, usb, tftp, or nvme

02:15 <clever> but with the latest beta firmware, they added https support to it, but it didnt fit within the 128kb limit

02:15 xenos1984 has quit [Read error: Connection reset by peer]

02:15 <clever> so, they cut bootcode.bin in half, bootcode.bin now does only dram init, and loading of bootmain

02:16 <heat> getting most of that in open source software would be way better than having it all closed

02:16 <clever> bootmain is a .elf in SPI flash (over 200kb), that deals with SD, usb, tftp, nvme, and https

02:16 <clever> heat: and thats where NDA's come into play, the rpi engineers cant even say how fast the dram controller can perform!

02:16 <clever> you expect them to release dram init source?

02:16 <heat> no, that's not the point

02:17 <heat> lots of platforms out there with mostly-oss firmware and a bunch of blobs attached to them

02:17 <clever> but with the new design they made, everything bootmain touches, is hw that is already openly documented

02:17 <clever> so i could re-implement bootmain, and make it opensource

02:18 <clever> oh, also, the toolchain to compile VPU firmware, is also behind NDA

02:18 <clever> so even if they could release the source, you cant compile it!

02:18 <mrvn> I'm always amazed about that in this day and age. It's like they don't want anyone to use the hardware.

02:18 <heat> make the vpu way smaller

02:19 <heat> get it booting the arm processor without dram, just running on CAR

02:20 <clever> heat: ive not tried yet, but i suspect you can do CAR on the arm side, however, the arm core cant touch any dram control registers

02:20 <clever> so you must still run VPU firmware to init the dram!

02:20 <mrvn> fit your whole app in cache

02:20 <clever> 128kb L2 cache is all you have

02:20 <heat> that's what's called "the big stupid"

02:20 <clever> heat: the arm wasnt meant to run the show, and for security, it cant talk to a lot of peripherals

02:21 <mrvn> clever: I thought it's only 64kB. Big win there.

02:21 <heat> "security"

02:21 <mrvn> security aka copy protection on hdmi and that crap?

02:21 <clever> heat: remember, this was for trustzone style security, where you keep the DRM keys in the vpu ram

02:21 <clever> so even if an attacker can gain kernel mode on the arm core, they cant rip your DRM keys

02:21 <mrvn> oh, and DVD decryption. that's oh so secret.

02:22 <clever> and netflix :P

02:22 <mrvn> well, netflix still is a thing

02:22 <clever> the bcm2835 was in the roku2 for example

02:22 <heat> rpi uefi needs to compile arm TF-A anyway

02:22 <clever> heat: but the "request from secure mode" signal wasnt wired to the dram controller, so TF-A cant actually be secure

02:23 <clever> its instead wired to the VPU, for the VPU's supervisor vs user mode

02:23 <clever> so secure ram can only be accessed by the VPU in supervisor mode

02:23 <heat> "The RPi4 has a single nonstandard PCI config region."

02:24 <heat> why

02:24 <clever> its a crappy pci-e controller geist has seen in many other SoC's

02:24 <clever> there is a single ecam region, and you use a reg to select which pci-e slot it maps to

02:24 <clever> "because it only has 1 lane, and there is only 1 choice 99.9% of the time"

02:25 <mrvn> .oO(except when you want an M2.key and SATA)

02:25 <clever> yeah, a pci-e switch is a violation of that assumption

02:25 <clever> so you must grab a lock, switch the ecam to the right device, then write to those regs, and release the lock

02:25 <clever> its also likely assuming you only use the ecam once during init, and then never again
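
Editor's sketch of the access pattern clever describes: one shared config window, a select register that chooses which device it maps to, and a lock held across "select, then access". This is a host-side simulation, not the BCM2711 register layout; every name here is hypothetical, and on real hardware `select` and the window would be volatile MMIO accesses, with interrupts masked while the lock is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define CFG_DWORDS 1024  /* 4 KiB of config space per device */
#define MAX_DEVS   4     /* arbitrary size for the simulation */

struct cfg_window_host {
    atomic_flag lock;                         /* guards select + window access */
    uint32_t    select;                       /* stands in for the index register */
    uint32_t    space[MAX_DEVS][CFG_DWORDS];  /* simulated config spaces */
};

void cfg_host_init(struct cfg_window_host *h)
{
    memset(h, 0, sizeof *h);
    atomic_flag_clear(&h->lock);
}

static void cfg_lock(struct cfg_window_host *h)
{
    while (atomic_flag_test_and_set_explicit(&h->lock, memory_order_acquire))
        ;  /* spin until the previous select+access pair has finished */
}

static void cfg_unlock(struct cfg_window_host *h)
{
    atomic_flag_clear_explicit(&h->lock, memory_order_release);
}

uint32_t cfg_read32(struct cfg_window_host *h, unsigned dev, unsigned off)
{
    assert(dev < MAX_DEVS && off % 4 == 0 && off / 4 < CFG_DWORDS);
    cfg_lock(h);
    h->select = dev;                              /* point the window at dev */
    uint32_t v = h->space[h->select][off / 4];    /* read through the window */
    cfg_unlock(h);
    return v;
}

void cfg_write32(struct cfg_window_host *h, unsigned dev, unsigned off, uint32_t v)
{
    assert(dev < MAX_DEVS && off % 4 == 0 && off / 4 < CFG_DWORDS);
    cfg_lock(h);
    h->select = dev;
    h->space[h->select][off / 4] = v;             /* write through the window */
    cfg_unlock(h);
}
```

Without the lock, a second CPU could change `select` between the first CPU's select and its access, and the access would then land on the wrong device.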

02:27 <clever> mrvn: also, according to geist, some SoC's put 2 of these controllers on the same chip, but because each is single-lane, you can never run it in 2-lane mode, its permanently 2 slots of 1x each

02:27 <heat> it's not an ecam though

02:27 <mrvn> clever: a 2 lane controller would be more expensive

02:28 <clever> and thats why its such a mess :P

02:28 <mrvn> fixing all the RPi hardware choices would also cost a lot of money.

02:28 <clever> oh, another fun fact

02:29 <heat> reject arm64, embrace PC?

02:29 <clever> the VC6 core isnt actually a finished project

02:29 <mrvn> they might even have stuff licensed for use in the VC that they can't use in an ARM core.

02:29 <clever> the 2711, is a VC4 core, with the new v3d core bolted on

02:29 <clever> and some slight tweaks to support 4k hdmi and more ram

02:30 <mrvn> That's nothing new. What do you think every x86 chip is? It's always the old with something bolted on.

02:30 <clever> yep

02:30 <clever> and the entire pi1/pi2/pi3 lineup, the vc4 end is virtually identical

02:31 <clever> the only change they made in the whole lineup was the arm cores, and sometimes the rom

02:31 <clever> which reminds me, the pi1 and pi2 bootrom, are almost bit for bit identical!

02:31 <clever> the addition of 3 more arm cores, had zero impact on the bootrom

02:31 <mrvn> except for the model ID and peripheral address?

02:31 <clever> mrvn: the peripheral address is purely a software construct!

02:32 <clever> there is an mmu between arm-physical and vpu-physical

02:32 <mrvn> the whole bootrom is software

02:32 <clever> and that mmu decides where the peripheral lands

02:32 <clever> the peripheral address choice, is made by start.elf

02:32 <clever> 2 stages after the rom

02:33 xenos1984 has joined #osdev

02:33 <clever> https://github.com/librerpi/lk-overlay/blob/master/platform/bcm28xx/arm/arm.c#L311-L313

02:33 <bslsk05> github.com: lk-overlay/arm.c at master · librerpi/lk-overlay · GitHub

02:33 Oli has quit [Ping timeout: 240 seconds]

02:33 <clever> mrvn: with 1 extra line of code, i can put the peripherals at BOTH addresses on every model :P

02:34 <mrvn> Maybe you could change your bootrom/start.elf code to change the address space layout to match basically every other arm system, with memory a bit further up.

02:34 gog` has joined #osdev

02:34 gog has quit [Killed (NickServ (GHOST command used by gog`))]

02:34 gog` is now known as gog

02:34 <clever> mrvn: for the vc4 line, the mmu can only control the lower 1gig of ram, 64 pages of 16mb each

02:35 <clever> well, lower 1gig of the address space

02:35 <mrvn> clever: so you're saying you can't have peripherals below 1GB or you lose too much shared memory?

02:36 <clever> both peripherals and ram must all exist in the lower 1gig of the addr space

02:37 <mrvn> clever: many ARMs have memory at 512MB.

02:37 <clever> oh, and the arm reset vector is 0

02:37 <clever> so page0 must be ram

02:37 <mrvn> so rom at 0, peripherals at 16MB, memory at 512MB

02:37 <mrvn> or pseudo rom

02:38 <clever> there is no arm rom, so you must map ram to the 0-15mb range

02:38 <mrvn> easy enough to map the first 16MB twice.

02:38 <clever> yep

02:38 <clever> but if you have 1gig of ram, you must cover at least 1 page (16mb) up with MMIO

02:38 <mrvn> That's how I would have done it anyway.

02:39 <clever> for the bcm2835, they put the MMIO after the 512mb of ram, nicely out of the way

02:39 <clever> then the ram got bigger, and rather than have an MMIO hole in the ram, they moved MMIO up to 1008mb (the top most 16mb page)

02:40 <mrvn> and then ram got bigger again and again and again

02:40 <clever> then ram got bigger, and rather than have an MMIO hole, they moved MMIO to 0xfe00_0000 (4064mb) i think

02:40 <clever> and now it cant run any further, without requiring 64bit support

02:40 <clever> so now you have an MMIO hole by default

02:41 <mrvn> and last the whole now has grown downwards

02:41 <clever> but if you turn on high-peripherals mode, the MMIO lands at a 64bit only addr, and now you have a solid 16gig of the address space dedicated purely to ram
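
Editor's sketch of the numbers in clever's history of the MMIO window. The megabyte offsets are the ones quoted above (512, 1008, 4064), and the 64 x 16 MiB page model is the VC4-era ARM-side MMU as clever describes it. The table labels and function names are hypothetical, and the high-peripherals address is left out because the log does not state it.

```c
#include <stdint.h>

#define MiB           (1024ull * 1024ull)
#define VC4_PAGE_SIZE (16 * MiB)
#define VC4_NUM_PAGES 64   /* 64 pages x 16 MiB = 1 GiB window */

/* MMIO placement per generation, in MiB from the start of the ARM address space. */
const struct { const char *era; uint64_t mmio_mb; } rpi_mmio[] = {
    { "bcm2835 (512 MiB of RAM)",  512  },
    { "1 GiB models",              1008 },
    { "bcm2711, low peripherals",  4064 },
};

uint64_t mb_to_addr(uint64_t mb)
{
    return mb * MiB;
}

/* Index of the 16 MiB page covering arm_phys, or -1 if the address lies
 * outside the 1 GiB window that MMU can control. */
int vc4_page_index(uint64_t arm_phys)
{
    if (arm_phys >= VC4_NUM_PAGES * VC4_PAGE_SIZE)
        return -1;
    return (int)(arm_phys / VC4_PAGE_SIZE);
}
```

The 1008 MiB window is page 63, the topmost page of that 1 GiB space, which is why a 1 GiB board loses exactly one 16 MiB page of RAM to MMIO.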

02:41 <mrvn> -w

02:41 <heat> do other pis work similarly?

02:41 <heat> like the orange pi and the rock pi

02:41 <clever> heat: i havent investigated those knock-off's

02:42 <mrvn> 16gig? wow, like that will never be exceeded. I mean we went from 256 to 512 to 1024 to 2048 to 4096 to 8192. We will never ever need more than 16384

02:42 <mrvn> heat: no. totally different chips. They only use the Pi for marketing.

02:42 <heat> yeah the rock pi looks solid

02:42 <heat> has a mali gpu

02:43 <heat> no pseudo boot processor and gpu

02:43 <clever> mrvn: the dram controller has been stated to max out at 16gig of ram, but nobody actually makes 16gig ddr4 chips, so the limit is currently 8gig

02:43 <mrvn> clever: so then they bolt on a second one.

02:45 <mrvn> wasn't there some stackable ram that's only limited by heat?

02:45 <heat> i dont limit no ram

02:46 <mrvn> you don't limit the ram, the ram limits you. :(

02:46 <clever> mrvn: thats actually what they did on the pi3

02:46 <clever> the pi3 is using ddr2, and they dont make 1gig ddr2 chips

02:46 <clever> so they have a pair of 512mb ddr2 dies in one epoxy package

02:46 <clever> each on half of the data bus

02:47 <clever> so its acting like a striped raid array

02:47 <mrvn> like all PCs

02:47 <clever> and the pi02, is just a pi3 soc, and a single 512mb die, now sharing the same epoxy package

02:47 <clever> so the soc and dram are bond-wired directly together

02:48 <clever> mrvn: but PC's have multiple ram chips per module, and multiple modules, while the controller on the rpi can only wire to a single chip

02:48 <clever> so it must have a much wider bus, possibly several slots wide

02:48 <clever> and drives the whole pair of slots in parallel, expecting them to both be the same size/speed

02:49 <clever> hence the whole mess of needing matching sticks in the right slots for optimal performance

02:50 <clever> i think the limiting factor then, is the width of the data bus coming out of the ram controller

03:33 [itchyjunk] has joined #osdev

03:47 elastic_dog has quit [Ping timeout: 240 seconds]

03:57 elastic_dog has joined #osdev

04:07 srjek has quit [Ping timeout: 256 seconds]

04:09 dude12312414 has joined #osdev

04:09 rustyy has quit [Quit: leaving]

04:10 dude12312414 has quit [Remote host closed the connection]

04:23 isaacwoods has quit [Quit: WeeChat 3.4]

04:43 gog has quit [Ping timeout: 240 seconds]

05:05 toastloop has joined #osdev

05:06 troseman has quit [Ping timeout: 272 seconds]

05:14 heat has quit [Ping timeout: 260 seconds]

05:16 rustyy has joined #osdev

05:18 xenos1984 has quit [Quit: Leaving.]

05:20 rustyy has quit [Client Quit]

05:20 rustyy has joined #osdev

05:30 Burgundy has joined #osdev

05:54 xenos1984 has joined #osdev

06:16 Burgundy has quit [Ping timeout: 240 seconds]

06:33 ElectronApps has joined #osdev

06:40 eroux has joined #osdev

06:53 jjuran has quit [Ping timeout: 240 seconds]

07:26 pounce has quit [Ping timeout: 272 seconds]

07:31 not_not has joined #osdev

07:35 pounce has joined #osdev

07:46 [itchyjunk] has quit [Read error: Connection reset by peer]

07:50 Jari-- has joined #osdev

07:53 bxh7 has joined #osdev

07:55 the_lanetly_052 has joined #osdev

08:10 puck has quit [Excess Flood]

08:10 puck has joined #osdev

08:28 k8yun has joined #osdev

08:28 k8yun has quit [Remote host closed the connection]

08:36 the_lanetly_052 has quit [Ping timeout: 256 seconds]

08:42 Belxjander has joined #osdev

08:48 toastloop has quit [Quit: Leaving]

08:49 toastloop has joined #osdev

08:58 wolfshappen has quit [Ping timeout: 272 seconds]

08:58 wolfshappen has joined #osdev

09:17 ddevault has quit [Ping timeout: 245 seconds]

09:17 tom5760 has quit [Ping timeout: 240 seconds]

09:17 gjnoonan has quit [Ping timeout: 256 seconds]

09:17 exec64 has quit [Ping timeout: 240 seconds]

09:18 jjuran has joined #osdev

09:18 jleightcap has quit [Ping timeout: 240 seconds]

09:18 jleightcap has joined #osdev

09:18 tom5760 has joined #osdev

09:18 sm2n has quit [Ping timeout: 250 seconds]

09:19 exec64 has joined #osdev

09:19 patwid has quit [Ping timeout: 256 seconds]

09:19 ddevault has joined #osdev

09:19 gjnoonan has joined #osdev

09:19 sm2n has joined #osdev

09:19 patwid has joined #osdev

09:42 Oli has joined #osdev

09:49 toastloop has left #osdev [Leaving]

09:52 xenos1984 has quit [Quit: Leaving.]

10:05 <not_not> x86 or write os for my pi?

10:06 <hmmmm> there's certainly more baggage on the x86

10:07 <hmmmm> it would be easier to focus on operating system concepts with the latter

10:10 <not_not> Ye

10:11 <not_not> X86 has to be a city by now

10:11 <not_not> The slums of 16 bit real mode

10:11 <hmmmm> even the older stuff isn't particularly fun

10:12 <hmmmm> shutting off a computer is quite a feat

10:12 <not_not> Nah lol

10:12 <not_not> Wow

10:12 <not_not> Arm32 was my asm virginity

10:12 <hmmmm> ive never actually gotten to the point where i implemented the acpi dsdt parser

10:13 <hmmmm> so my hacky workaround was to point the reset start address at a routine that used the bios to shutdown and then intentionally triple fault

10:13 <not_not> I dont even know what that is i barely wrote my first parser

10:13 <not_not> Ow

10:14 <hmmmm> i grew up in a world where everything's x86

10:14 <not_not> Osdev is a new world but i did some close to the metal stuff on the gba when i was 12

10:15 <hmmmm> wow neat

10:15 <hmmmm> i think i was doing visual basic when i was 12

10:16 <not_not> I was 10 when vb but first day of school

10:16 <not_not> Middle school or jr high or whatever

10:16 <not_not> My classmate dissed me so hard for writing vb shit

10:17 <not_not> Told me he was a hacker and he wrote an os

10:17 <not_not> And that he had hacked a television station

10:18 <not_not> And there was another hacker and they were fighting over the mouse

10:18 <not_not> Years later i saw 1995 hackers

10:19 <not_not> He lied he was just describing that scene when zero cool and acid burn were in the same tv system

10:20 <klange> coding in vb > pretending you did a thing in a movie

10:21 <not_not> Lol

10:21 <not_not> Ye was so lol when i first saw that movie

10:22 <kazinsal> I oughta sit down and rewatch that movie

10:22 <klange> when I was a wee klange and wasn't allowed on the dialup, and all I had was a busted old Compaq with a slot-load Pentium, VBA in Excel was the best I could do.

10:22 <kazinsal> it's probably been a decade or so

10:22 <not_not> Ye me too

10:22 <not_not> Ahh angelina joulie boobs

10:22 <not_not> Vb is a good language

10:22 <kazinsal> I did recently rewatch Starship Troopers, and it continues to be my favourite Paul Verhoeven film

10:23 <kazinsal> Such a wonderfully batshit satire

10:23 <not_not> We dont really have any visual something on linux

10:23 <klange> There's Gambas

10:24 <not_not> Vb very good for beginners

10:24 Burgundy has joined #osdev

10:24 <klange> Which is a very similar language with a very similar bit of tooling, but I think it's relatively new compared to the days when I was hacking together forms in Excel.

10:24 <not_not> Mmmm

10:24 <klange> I think "the kids these days" would just hack together web stuff; React is today's Visual Basic.

10:24 <not_not> Ahh

10:25 xenos1984 has joined #osdev

10:25 <klange> And probably with all the same ireful aftereffects of knowing how to hack together a GUI but not actually knowing "software".

10:25 <not_not> My cousin codes everything in haskell these days

10:25 <not_not> Ye unaware of the dangers of buffer overflows

10:26 <not_not> Off by one errors

10:26 <klange> Anyway, this lock shit has been so aggravating, and I don't think it was even actually a real deadlock, it was insufficient atomics...

10:27 <klange> I did "deadlock detection" the stupid way by having the acquire-loop check the system timer and panic after 5s, dump lock owners for all the critical stuff, etc.

10:28 <klange> And it revealed nothing directly. I could see three cores waiting on the main dumb lock for managing timed sleeps

10:28 <klange> And it revealed which core owned that lock and what function it was in. So I dug deeper into that and was timing every aspect of the function... and everything was reporting it was completing... immediately after five seconds.

10:29 <klange> Seemingly, the SGI from the panic was 'fixing' the problem.

10:29 <j`ey> SGI?

10:29 <not_not> Lol u can tell im the userspace brAt

10:30 <klange> ==IPI, software generated interrupt sent between processors

10:30 <j`ey> oh ok, I wasn't sure if that was the usage you meant, too many acronyms

10:31 Oli has quit [Ping timeout: 250 seconds]

10:32 <klange> so best as I can tell, one core would get stuck spinning on a lock that it had actually successfully acquired, causing the other cores wanting that lock to trip the deadlock detection, the first to do that would send the SGI, and that would unbork the test-and-set loop, and it would return and then all its timing functions would say it spent five seconds in there

10:32 <not_not> It lied?

10:35 <klange> lied, or was stuck fighting with the exclusive monitor until the interrupt did the equivalent of slapping it in the face

10:36 <not_not> Ahh

10:36 <klange> the GCC manuals say __sync_lock_test_and_set is supposed to enforce acquire semantics, but the instructions it was spitting out don't seem sufficient for that

10:37 <not_not> Im gonna get pen and paper

10:37 <klange> which, fine, I need to move away from that, it's considered "legacy" and apparently was only actually defined for Itanium and not even the x86 I was using it on previously
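
A minimal sketch of the kind of lock klange describes, written with C11 `<stdatomic.h>` rather than the legacy `__sync_lock_test_and_set` builtin. The type and function names (`spinlock_t`, `spin_lock_timeout`, `spin_unlock`) are hypothetical, and the spin budget stands in for the system-timer check with a 5s deadline mentioned above; this is not klange's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock type; the names are illustrative only. */
typedef struct {
    atomic_flag held;   /* set while some core owns the lock */
    int owner;          /* core id of the current holder, -1 if free */
} spinlock_t;

/* Try to take the lock, giving up after max_spins attempts.
 * A real kernel would compare a system timer against a deadline instead
 * of counting iterations, then panic and dump lock owners on failure. */
bool spin_lock_timeout(spinlock_t *lock, int core, uint64_t max_spins)
{
    assert(lock != NULL);
    for (uint64_t i = 0; i < max_spins; i++) {
        /* Acquire ordering: later accesses cannot move before this. */
        if (!atomic_flag_test_and_set_explicit(&lock->held,
                                               memory_order_acquire)) {
            lock->owner = core;  /* debug aid only, read when dumping */
            return true;
        }
    }
    return false;  /* caller treats this as a suspected deadlock */
}

void spin_unlock(spinlock_t *lock)
{
    assert(lock != NULL);
    lock->owner = -1;
    /* Release ordering: earlier accesses cannot move after this. */
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

Spelling out `memory_order_acquire` and `memory_order_release` leaves it to the compiler to emit whatever barriers or exclusive-access instructions the target needs, which is the property that was in doubt with the older builtin.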

10:38 GeDaMo has joined #osdev

10:38 <not_not> Bah im in insane asylum

10:39 <not_not> Gonna try hacking myself out

10:39 <kazinsal> hack the gibson

10:39 <not_not> Or the very least change my meds

10:39 <GeDaMo> How can you tell the difference between inside and outside? :|

10:39 <kingoffrance> there is a fence around one of them

10:39 <kingoffrance> or a bag

10:40 <kazinsal> nice white padded walls

10:40 <klange> i could use more fences

10:40 <GeDaMo> Is the fence to keep people out or in?

10:40 <not_not> Outside i can illicitly get enough benzo

10:40 <not_not> Both

10:40 <kingoffrance> that's a hercules question GeDaMo

10:40 <kingoffrance> are you hercules ? ghostbusters says yes

10:40 <kingoffrance> *says you are supposed to say yes

10:41 <mrvn> not_not: sketch?

10:41 <not_not> All i know is no snipers can shoot in this window

10:42 <not_not> Ye mrvn

10:42 <not_not> Gonna draw a box and think outside of it

10:43 <kazinsal> mard mk 2

10:44 <not_not> Bastards tho my ssh key is different in the insane asylum

10:44 <mrvn> not_not: think outside, no box required *tree emoji*

10:45 <not_not> Xd but im locked up and dad is building a narrative of me being insane

10:46 <mrvn> klange: ARM has this little behavior that two processes can write a var and read the result and never see a change unless you force it out of the write buffers.
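
One common way to get the visibility mrvn is talking about is a release store paired with an acquire load. The sketch below assumes C11 atomics; `payload`, `ready`, `publish` and `try_consume` are made-up names for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int payload;         /* plain data being handed over */
static atomic_bool ready;   /* flag that publishes the payload */

/* Writer side: store the data, then publish it with release ordering,
 * so the payload write is visible before the flag is. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: returns true and fills *out once the flag is visible.
 * The acquire load keeps the payload read from moving ahead of it. */
bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

With plain, non-atomic accesses on a weakly ordered machine, a reader on another core may see the flag and the payload in a different order than they were written.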

10:46 <not_not> Well i smoked pot once to try and find a singular concept that can create binary

10:47 <not_not> By process of elimination i found out it was not not not

10:47 <not_not> Or not

10:47 <not_not> And not not is is

10:48 <not_not> So u can use one word to not define binary but not not defining binary and implicitly declare and or or

10:49 <not_not> Packing the universe into one word

10:49 <kazinsal> most of us just assign one voltage level to 1 and another to 0

10:49 <mrvn> seems like the pot is still going strong

10:49 <not_not> Havent smoked in years

10:50 <GeDaMo> I watched a video about Flash memory which apparently uses multiple voltage levels to store values

10:50 <not_not> Ooh nice

10:50 <mrvn> GeDaMo: one to read, one to write, one to erase as far as I know.

10:51 <not_not> My music teacher told me the computer is an analogue machine recently and i shat brix

10:51 <not_not> Well i found out not can do all actions of god

10:52 <not_not> Generate operate and destroy the universe

10:52 <mrvn> not_not: what is 3 * 7?

10:52 <not_not> 10101

10:53 <not_not> My fav number

10:53 <not_not> U play dala wiw?

10:53 <mrvn> so not a but, just not not insane

10:53 <mrvn> s/but/bot/

10:54 <not_not> Well the turing test backfired

10:54 <GeDaMo> https://youtu.be/5Mh3o886qpg?t=342

10:54 <bslsk05> 'How do SSDs Work? | How does your Smartphone store data? | Insanely Complex Nanoscopic Structures!' by Branch Education (00:17:54)

10:55 <kazinsal> mrvn: thus my suggestion of a second revision of whatever software mard runs (ran?) on

10:55 <not_not> Well not not is the ideal language for qbits

10:55 <kazinsal> a slightly more advanced neural net, but still not able to be indistinguishable from actual human traffic

10:56 <not_not> God i miss my qbits

10:57 <not_not> Ok well if you program ur neurons to cognize brainfuck

10:58 <not_not> U gotta remember to start counting at 0

10:59 <not_not> There is a buffer overflow in the brain

10:59 <not_not> Sec looking up mard

11:00 dormito has joined #osdev

11:00 <mrvn> Some time ago I saw a nice article where they took a neural net to recognize dogs and ran an image backwards through it so it would "draw" dogs. Did someone do the same with text and get not_not?

11:00 <not_not> Ahh i seen that

11:01 <not_not> Idk i did it with my brain tho but very excited if someone got not not

11:01 <not_not> Man the nurses are hot for me

11:02 <not_not> And im like fuck off i dont need ur womb im giving birth to ai

11:02 <not_not> Unless we process her and make her the oracle

11:03 <GeDaMo> https://openai.com/blog/dall-e/

11:03 <bslsk05> openai.com: DALL·E: Creating Images from Text

11:04 <not_not> Well the brain can get not not not by doing process of elimination on the process of elimination

11:05 <not_not> But the brain has non mechanical parts

11:05 <not_not> Like teleporting information

11:06 <not_not> Good she's gone

11:10 <not_not> Its 12 o clock at noon going zzz mode

11:13 eroux has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

11:21 qubasa has quit [Ping timeout: 252 seconds]

11:31 isaacwoods has joined #osdev

11:35 zaquest has quit [Ping timeout: 272 seconds]

11:40 pretty_dumm_guy has joined #osdev

11:41 isaacwoods has quit [Quit: WeeChat 3.4]

11:47 <mrvn> GeDaMo: those images are supposed to be generated by the neural net and not just picked out of the samples?

11:47 isaacwoods has joined #osdev

11:47 <GeDaMo> That's how I read it

11:48 <mrvn> they look way too good

11:50 <mrvn> "a collection of glasses is sitting on a table" gives either perfect wine glasses or perfect reading glasses but never an oddly shaped mix? Can't believe that.

11:50 <GeDaMo> https://en.wikipedia.org/wiki/DALL-E

11:50 <bslsk05> en.wikipedia.org: DALL-E - Wikipedia

11:50 <GeDaMo> «DALL-E was developed and announced to the public in conjunction with CLIP (Contrastive Language-Image Pre-training),[1] a separate model whose role is to "understand and rank" its output.[3] The images that DALL-E generates are curated by CLIP, which presents the highest-quality images for any given prompt»

11:56 <not_not> Oh nice its beautiful

11:57 <not_not> Off the psychologist

11:57 <not_not> She's hot tho

11:58 <not_not> Smart didnt fall for the usual tricks

11:58 <not_not> Asked me if she thought i was being surveiled

11:59 <not_not> Or if i felt i was in control of my own thoughts

12:00 <not_not> Answered with a question but she rejected questions immediately

12:03 dennis95 has joined #osdev

12:04 <not_not> Shit on me

12:04 <not_not> What a coincidence

12:05 <GeDaMo> https://www.youtube.com/watch?v=rs8xE0jLen8

12:05 <bslsk05> 'Rabbit' by Chas & Dave - Topic (00:02:25)

12:06 <not_not> Dennises must be drawn to system development like flies to dead flesh

12:09 Jari-- has quit [Ping timeout: 272 seconds]

12:19 not_not has quit [Ping timeout: 272 seconds]

12:26 Oli has joined #osdev

12:44 cvemys has joined #osdev

12:57 eroux has joined #osdev

13:44 hegz has joined #osdev

13:58 cvemys has quit [Quit: Leaving]

14:07 Oli has quit [Ping timeout: 256 seconds]

14:23 gog has joined #osdev

14:28 garrit has joined #osdev

14:34 hegz7 has joined #osdev

14:35 nyah has joined #osdev

14:43 zaquest has joined #osdev

15:03 gwizon has joined #osdev

15:07 gwizon has quit [Client Quit]

15:10 [itchyjunk] has joined #osdev

15:14 troseman has joined #osdev

15:20 mahmutov_ has joined #osdev

15:21 mahmutov_ is now known as mahmutov

15:22 zaquest has quit [Ping timeout: 250 seconds]

15:24 Vercas has quit [Write error: Connection reset by peer]

15:24 gxt has quit [Remote host closed the connection]

15:25 gxt has joined #osdev

15:25 Vercas has joined #osdev

15:34 zaquest has joined #osdev

15:37 kori has quit [Quit: zzz]

15:43 <gog> so using compiler-rt, would i just need libclang_rt.builtins to be the equivalent of a bare metal libgcc?

15:44 <gog> also that ubsan stuff is pretty interesting

15:48 Oli has joined #osdev

15:49 sonny has joined #osdev

15:51 <mrvn> you need everything that gets reported as unresolved symbol on link

15:52 <gog> yeah makes sense

15:57 <g1n> hello

15:57 ElectronApps has quit [Remote host closed the connection]

15:58 <GeDaMo> Hi g1n :)

15:58 <g1n> i haven't tried to make mm last week, going to do that this week

15:58 <sonny> what is mm?

15:59 <g1n> memory manager

15:59 <g1n> (malloc and friends)

15:59 <sonny> oh

15:59 <sonny> I thought it was machine monitor

15:59 <sonny> cool

16:00 <g1n> lol

16:01 <g1n> what can be "end of memory" status (in header)?

16:01 <sonny> Is there any scheme that physically partitions memory when handing it out? It's probably super ineffective

16:03 Oli has quit [Ping timeout: 240 seconds]

16:04 <mrvn> g1n: traditionally it's NULL or nullptr
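
As a sketch of what a NULL-terminated block list could look like in C: the `struct block` layout and `find_free` are hypothetical, not g1n's actual header format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical heap block header; field names are illustrative. */
struct block {
    struct block *next;  /* next block in the heap, NULL marks the end */
    size_t size;         /* payload bytes following this header */
    bool used;           /* true while the block is allocated */
};

/* Walk the list until the NULL terminator and return the first free
 * block that can hold `want` bytes, or NULL if none fits. */
struct block *find_free(struct block *head, size_t want)
{
    assert(want > 0);
    for (struct block *b = head; b != NULL; b = b->next) {
        if (!b->used && b->size >= want)
            return b;
    }
    return NULL;
}
```

The end of the heap then needs no separate status value in the header: the walk stops as soon as `next` is NULL.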

16:04 <mrvn> sonny: NUMA partitions physical memory according to distances to different cores.

16:05 <g1n> mrvn: oh makes sense, seems i will need to make more "arrays" and "structs"

16:05 <g1n> lol

16:06 <mrvn> g1n: malloc is not something you should be using in the kernel.

16:07 <GeDaMo> malloc is usually a user-level function built on top of mmap (or brk)

16:08 <mrvn> yeah, and forget brk

16:11 patwid has quit [Remote host closed the connection]

16:11 sm2n has quit [Remote host closed the connection]

16:11 gjnoonan has quit [Remote host closed the connection]

16:11 exec64 has quit [Remote host closed the connection]

16:11 tom5760 has quit [Remote host closed the connection]

16:11 jleightcap has quit [Remote host closed the connection]

16:11 ddevault has quit [Remote host closed the connection]

16:12 Brnocrist has quit [Ping timeout: 272 seconds]

16:12 exec64 has joined #osdev

16:12 tom5760 has joined #osdev

16:12 sm2n has joined #osdev

16:12 gjnoonan has joined #osdev

16:12 ddevault has joined #osdev

16:12 patwid has joined #osdev

16:12 jleightcap has joined #osdev

16:13 <sonny> mrvn: thanks, I'll look into that

16:13 <mrvn> Hacking in movies: % TELNET <HUNTLEY_NET> % SSH CLIENT

16:14 <mrvn> ********ACCESS. GRANTES********

16:14 <mrvn> OPENING PORT: 47534534534 got to love movies.

16:15 <mrvn> Also very important while hacking: A call graph (stack backtrace) for your hacking tool. Can't hack without that.

16:18 <sonny> I think a scene where someone compromises a mainframe would be great

16:21 <sonny> "hey look at this, they still use mainframes lol"

16:22 <bauen1> iirc wargames actually had some decent "hacking scenes" with war dialing that seemed to actually be pretty accurate

16:23 <sonny> noted

16:25 <GeDaMo> Sneakers is also good

16:26 <g1n> mrvn, GeDaMo: so i need to find how to make smth like mmap? where to start?

16:27 <GeDaMo> mmap finds a free physical page and maps it to a free page in the processes virtual space

16:28 <g1n> wdym by "processes virtual space"?

16:28 <g1n> like virtual space per process?

16:28 <GeDaMo> Yes
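
A toy model of what GeDaMo describes, assuming a tiny fixed pool of frames and a flat per-process table in place of real page tables. `NFRAMES`, `NVPAGES`, `struct address_space` and `toy_mmap_page` are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NFRAMES  8      /* toy number of physical page frames */
#define NVPAGES  16     /* toy number of virtual pages per process */
#define UNMAPPED (-1)

static bool frame_used[NFRAMES];  /* shared by all address spaces */

/* Toy per-process address space: virtual page index -> frame index. */
struct address_space {
    int frame_of[NVPAGES];
};

void as_init(struct address_space *as)
{
    assert(as != NULL);
    for (int v = 0; v < NVPAGES; v++)
        as->frame_of[v] = UNMAPPED;
}

/* Map one free physical frame at one free virtual page.
 * Returns the virtual page index, or -1 if either resource ran out. */
int toy_mmap_page(struct address_space *as)
{
    int frame = -1, vpage = -1;

    assert(as != NULL);
    for (int f = 0; f < NFRAMES && frame < 0; f++)
        if (!frame_used[f])
            frame = f;
    for (int v = 0; v < NVPAGES && vpage < 0; v++)
        if (as->frame_of[v] == UNMAPPED)
            vpage = v;
    if (frame < 0 || vpage < 0)
        return -1;

    frame_used[frame] = true;
    as->frame_of[vpage] = frame;
    return vpage;
}
```

Two address spaces can both hand out virtual page 0 while pointing at different frames, which is the "virtual space per process" idea in the exchange above.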

16:29 <g1n> oh

16:29 <g1n> it should be one for kernel right?

16:33 <mrvn> g1n: do you have user processes already?

16:33 <g1n> no of course

16:33 <mrvn> then why would you need mmap?

16:34 <g1n> then what should i do?

16:34 <mrvn> write code to map pages, create a stack and a user process

16:34 <g1n> but why now?

16:34 <g1n> i thought to make fs first

16:35 <mrvn> then maybe implement a log_string syscall

16:35 <g1n> oh

16:35 <g1n> i have no idea about userland yet

16:35 <g1n> i thought to make memory things, then filesystem, then userland

16:35 <g1n> and there syscalls and other cool things

16:36 <mrvn> it's not userland as in separate programs. just some code you run with user privileges.

16:36 <g1n> oh

16:36 <g1n> but still, am i ready?

16:37 <mrvn> probably not

16:37 <g1n> so, why doing it?

16:37 <mrvn> that's the challenge

16:37 <g1n> oh

16:37 <g1n> lol

16:37 <g1n> yes

16:38 xenos1984 has quit [Remote host closed the connection]

16:39 <g1n> vfs needs allocing, isn't it?

16:39 xenos1984 has joined #osdev

16:39 heat has joined #osdev

16:39 <mrvn> no

16:40 <g1n> oh

16:40 <g1n> really???

16:40 <geist> you also probably want to do task switching and whatnot first

16:40 <mrvn> it certainly helps, but no, not needed

16:40 <geist> fs is fairly late in the game, IMO

16:40 <mrvn> geist: I think he has kernel threads already

16:40 <g1n> geist: why is fs lite lol? i need to access files isn't it?

16:41 <g1n> mrvn: no, i don't have any multitasking

16:41 <geist> because you dont *need* it for the other stuff

16:41 <heat> yes but that's only useful when you can load user programs

16:41 <geist> ie, you can build a user space even if you just hard compile in some programs, etc

16:41 <g1n> oh

16:41 <g1n> makes sense

16:41 <geist> and yeah you probably want to tackle multitasking fairly soon

16:42 <geist> since that affects how you build the rest of the subsystems

16:43 <g1n> hmm, so i need to check multitasking? also, i currently have issues with setting up keyboard and timer for some reason (page faults working, so idt should work). I thought about setting up framebuffer too.

16:43 <heat> then get the timer working first

16:44 <heat> and/or the keyboard

16:44 <klys> what to do first appears to be quite an issue. there are a few things you can do. you should probably do one of those things.

16:44 <g1n> heat, klys: ok

16:44 <heat> i like mmu -> interrupts -> scheduling -> userspace

16:45 <heat> it's kinda how I did it first time around, and that's how I did it for riscv

16:45 <mrvn> mmu, exceptions, interrupts, (user) mode switch, syscall, scheduling

16:45 <g1n> ok, so currently i need to fix timer/keyboard to make total sure that idt is working

16:46 <heat> geist, kinda out of nowhere but how fast is scudo?

16:47 <mrvn> My first FS was: std::map<std::string, std::vector<uint8_t>> basically

16:47 vin has joined #osdev

16:48 <g1n> so the hardest thing is to find addr to fs, that can be given by grub, isn't it?

16:49 <heat> the hardest thing is to design a proper fs layer

16:49 <g1n> oh, yes makes sense

16:49 <mrvn> Actually the really first one was: struct File { File *next; char name[64]; size_t size; uint8_t data[ /* size */ ]; };

16:49 <heat> std::map<string, vector<uint8_t>> is crap

16:49 <heat> turns out filesystems are way more complex than a list of paths and a bunch of bytes

16:49 <g1n> i think as first fs i will try tar (please not kill me, i will fix it, at least planning)

16:49 <g1n> lol

16:50 <heat> suggestion: don't

16:50 <mrvn> heat: totally, but walking a linked list for files gets tiresome.

16:50 <heat> tar isn't a filesystem and isn't suitable as one

16:50 <g1n> ok

16:50 <heat> I recommend you create a ram filesystem (like linux's tmpfs for instance) and unpack the tar to that

16:51 <g1n> initrd?

16:51 <heat> yes unpack the initrd to the ram fs
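
A rough sketch of the unpacking step, assuming the initrd is a ustar archive already sitting in memory. It reads only the name field (offset 0) and the octal size field (offset 124, 12 bytes) of each 512-byte header. `tar_walk`, `tar_visit_fn` and `tar_sum_sizes` are hypothetical names; a real ramfs hook would create a file where the example visitor only adds up sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TAR_BLOCK 512

/* Parse an octal ASCII field such as the ustar size field. */
static size_t parse_octal(const char *s, size_t len)
{
    size_t v = 0;
    for (size_t i = 0; i < len && s[i] >= '0' && s[i] <= '7'; i++)
        v = v * 8 + (size_t)(s[i] - '0');
    return v;
}

/* Callback invoked once per archive member (hypothetical ramfs hook).
 * Note: a name that fills all 100 bytes is not NUL-terminated. */
typedef void (*tar_visit_fn)(const char *name, const uint8_t *data,
                             size_t size, void *ctx);

/* Walk a ustar image in memory and return the number of members seen.
 * Type flags, name prefixes and checksums are ignored in this sketch. */
size_t tar_walk(const uint8_t *image, size_t len,
                tar_visit_fn visit, void *ctx)
{
    size_t off = 0, count = 0;

    assert(image != NULL);
    while (off + TAR_BLOCK <= len && image[off] != '\0') {
        const char *hdr = (const char *)(image + off);
        size_t size = parse_octal(hdr + 124, 12);
        size_t data_off = off + TAR_BLOCK;

        if (size > len - data_off)
            break;  /* truncated archive */
        if (visit != NULL)
            visit(hdr, image + data_off, size, ctx);
        count++;
        /* File data is padded up to a whole number of 512-byte blocks. */
        off = data_off + (size + TAR_BLOCK - 1) / TAR_BLOCK * TAR_BLOCK;
    }
    return count;
}

/* Example visitor: add each member's size into a size_t total. */
void tar_sum_sizes(const char *name, const uint8_t *data,
                   size_t size, void *ctx)
{
    (void)name;
    (void)data;
    *(size_t *)ctx += size;
}
```

The walk stops at the first header whose name byte is zero, which matches the zero-filled blocks that end a tar archive.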

16:51 <mrvn> just use something like my struct File above and link it into the kernel so your VFS has some data to access.

16:52 <g1n> ok, thanks, i am going to fix keyboard first

16:53 <g1n> mrvn: i thought of doing like that, to test that i am doing it correctly, and then make proper one

16:53 <g1n> also, vfs could be useful if doing like real unix (everything is a file)

16:53 <mrvn> g1n: your goal should be to get something working with the minimum of code. Understand how it works, design a proper interface and only over time replace the early stuff with proper code.

16:54 <g1n> ok

16:54 <heat> i don't agree

16:54 <heat> considering that that's totally not how filesystems work

16:54 <mrvn> g1n: So if you think you need an FS then make one that can handle 10 files compiled into the kernel image and go from there.
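
A sketch of the "handful of files compiled into the kernel image" approach, loosely following the `struct File` mrvn quoted earlier. One deliberate difference: `data` is a pointer rather than an inline array so entries can be initialized statically. The file names and contents are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Variation on the struct quoted earlier: data is a pointer instead of
 * an inline array so that entries can be initialized statically. */
struct file {
    const struct file *next;  /* NULL terminates the list */
    char name[64];
    size_t size;
    const uint8_t *data;
};

/* Made-up file contents for the demonstration. */
static const uint8_t motd_data[] = "welcome\n";
static const uint8_t init_data[] = "#!init\n";

static const struct file file_init = {
    NULL, "init", sizeof init_data - 1, init_data
};
static const struct file file_motd = {
    &file_init, "motd", sizeof motd_data - 1, motd_data
};

/* Head of the built-in file list (hypothetical name). */
const struct file *const builtin_files = &file_motd;

/* Linear lookup by exact name; returns NULL when the file is absent. */
const struct file *file_lookup(const struct file *head, const char *name)
{
    assert(name != NULL);
    for (const struct file *f = head; f != NULL; f = f->next) {
        if (strcmp(f->name, name) == 0)
            return f;
    }
    return NULL;
}
```

This gives a VFS layer something to read from before there is any allocator or disk driver, with the linear walk as the known cost.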

16:55 <g1n> heat: oh

16:55 <mrvn> heat: how do filesystems work? They can give you some data associated with the name of a file. Or store some data. Write support can wait.

16:56 <heat> filesystems have directories and paths

16:56 <heat> they're trees, not lists

16:56 <mrvn> heat: Do they? Not really, historical speaking.

16:56 <GeDaMo> Directories are files containing references to other files

16:56 <heat> that's an impl detail

16:56 <GeDaMo> Also, you don't need directories in that sense

16:56 <mrvn> heat: a list is a tree, just a really bad one.

16:56 <heat> so lets make a good one instead :)

16:57 <mrvn> heat: sure, spend 1 year implementing zfs before you implement multitasking.

16:57 <heat> <heat> I recommend you create a ram filesystem (like linux's tmpfs for instance) and unpack the tar to that

16:57 <GeDaMo> https://en.wikipedia.org/wiki/Burroughs_MCP#File_system

16:57 <GeDaMo> "In early MCP implementations, directory nodes were represented by separate files with directory entries, as other systems did. However, since about 1970, MCP internally uses a 'FLAT' directory listing all file paths on a volume"

16:57 <bslsk05> en.wikipedia.org: Burroughs MCP - Wikipedia

16:57 <heat> who's this heat guy and why does he say stuff?

16:58 exec64 has quit [Remote host closed the connection]

16:58 jleightcap has quit [Remote host closed the connection]

16:58 patwid has quit [Remote host closed the connection]

16:58 ddevault has quit [Remote host closed the connection]

16:58 sm2n has quit [Remote host closed the connection]

16:58 gjnoonan has quit [Remote host closed the connection]

16:58 tom5760 has quit [Remote host closed the connection]

16:58 gwizon has joined #osdev

16:58 <mrvn> GeDaMo: there are no directories, only users. each user has a flat list of files. :)

16:59 <GeDaMo> Users don't exist, they're just fairy stories used to scare programmers! :P

17:01 Oli has joined #osdev

17:01 <g1n> lol

17:02 <gog> there is no user, only zuul

17:02 <g1n> zuul?

17:03 <heat> gog, btw the clang builtins is the thing you want to use to replace libgcc

17:03 Oli has quit [Read error: Connection reset by peer]

17:03 <gog> heat: yes ty i objdump'd it and had a look :>

17:03 <heat> note that clang doesn't support crtbegin/end nor the good old .init and .fini sections

17:03 <gog> i don't use those anyway

17:04 <gog> at least not currently

17:04 <heat> if you want to call constructors, try iterating through init_array and fini_array
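
Iterating the array amounts to calling each function pointer in order. In the sketch below the bounds are passed in as parameters; in a kernel they would typically be symbols placed around the `.init_array` section by the linker script, and their names depend on that script. `run_init_array`, `ctor_a`, `ctor_b` and `demo_init_array` are stand-ins for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ctor_fn)(void);

/* Call every function pointer in [start, end) in order.
 * In a kernel, start and end would usually come from linker-script
 * symbols bracketing .init_array; here they are plain arguments. */
void run_init_array(ctor_fn *start, ctor_fn *end)
{
    assert(start <= end);
    for (ctor_fn *fn = start; fn != end; fn++) {
        if (*fn != NULL)
            (*fn)();
    }
}

/* Stand-in constructors that record the order they ran in. */
static int init_order[2];
static int init_count;
static void ctor_a(void) { init_order[init_count++] = 1; }
static void ctor_b(void) { init_order[init_count++] = 2; }

/* Stand-in for the section contents the toolchain would emit. */
ctor_fn demo_init_array[] = { ctor_a, ctor_b };
```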

17:04 ddevault has joined #osdev

17:04 exec64 has joined #osdev

17:04 tom5760 has joined #osdev

17:04 sm2n has joined #osdev

17:04 gjnoonan has joined #osdev

17:04 <mrvn> heat: on x86 too? I thought init_array was an ARM thing

17:04 jleightcap has joined #osdev

17:04 patwid has joined #osdev

17:04 <heat> no, it's a toolchain thing

17:04 <gog> it's on x86 too, i've done some c++ experiments

17:05 <gog> gcc does make init_array if you tell it to iirc

17:05 <heat> oh and you need to explicitly enable the init_array and fini_array support when building a cross compiler iirc

17:05 <heat> because it can't autodetect some stuff

17:05 <gog> yes

17:06 <gog> i think it enables support for it by default, just not generation

17:06 <gog> but it's been a minute since i played with that

17:06 <heat> btw not sure if you read what I said yesterday but I recommend you just ship compiler_rt with your kernel

17:06 <mrvn> gog: gcc makes them per default on arm

17:07 X-Scale` has joined #osdev

17:07 <gog> yeah i might do a git submodule for that

17:07 <gog> i'm going to be working toward using clang only i think

17:07 <heat> omg omg omg omg omg omg omg

17:07 <gog> my toolchain is rather old and a pita to set up

17:07 X-Scale has quit [Ping timeout: 250 seconds]

17:07 X-Scale` is now known as X-Scale

17:07 <gog> and i won't need it after i ditch gnu-efi

17:08 <heat> 1) I recommend you keep gcc around for better portability

17:08 <heat> 2) what does gnu-efi have to do with your gcc toolchain?

17:08 <gog> i only keep it around for the objcopy and reloc nonsense

17:08 <gog> i know it'd also work on clang

17:09 <gog> but with clang i can just make an efi application directly

17:09 <gog> and the only library function i use is Print() which i will not need once i finish my work on my printf implementation

17:10 <heat> fair enough

17:10 <gog> the NIH is very strong

17:11 <GeDaMo> When are you writing your own C compiler? :P

17:11 <gog> eventually™

17:11 <mrvn> gog: I first only had puts() and the put_hex32()

17:11 <heat> i'm very happy you embraced clang, the best permissively licensed compiler with great code gen and runtime libraries, and modular code too!

17:12 <gog> it caught an overlapping comparison error i made and i was sold

17:12 <gog> gcc does not even with every warning enabled

17:12 <heat> try clang-tidy :0

17:12 <gog> o:

17:12 <gog> will look into it

17:12 <heat> clang-format is also great

17:12 <heat> get a compile_commands.json and boom, clangd

17:13 <heat> great code completion and IDE-ish support

17:13 <gog> yeah i've been investigating improving my nvim config to have better code completion

17:13 <gog> and clang is a part of that

17:14 <gog> having to dig up a header every time i can't remember a struct field is getting tiresome

17:14 <mrvn> any decent editor can autocomplete for you

17:15 <heat> when you get to user space you'll be able to enjoy the great runtime libraries that clang brings to the table

17:15 X-Scale` has joined #osdev

17:15 <j`ey> is heat a paid clang spokesperson?

17:15 <heat> memory? asan. ub? ubsan. concurrency? tsan

17:15 <heat> i wish they paid me

17:15 <gog> i think nvim has pretty sophisticated code-completion built in i just don't know how to use it

17:16 <heat> usually editors rely on a language server to do stuff

17:16 X-Scale has quit [Ping timeout: 256 seconds]

17:16 X-Scale` is now known as X-Scale

17:16 <heat> at least fancier stuff

17:16 <gog> yes

17:16 <heat> clangd is one of the language servers

17:17 <heat> intellisense just runs as a plugin I think

17:17 <heat> it's also slow as shit

17:17 <gog> i tried vs code and was not impressed

17:17 <gog> sorry not sorry

17:18 <geist> heat: re: scudo i'm not so sure it's fast in as much as it's secure

17:18 <heat> what do you mean you didn't like the best permissively licensed open source extensible editor?

17:18 <geist> though i guess that wasn't implied with the question

17:18 <heat> I vaguely read that it's both secure and competitive in terms of perf with jemalloc and friends

17:18 <gog> honestly i like editing in the terminal and being able to switch between editing and testing with keystrokes

17:19 <heat> but I can't find numbers

17:20 <heat> geist: also I think scudo sacrifices a bit of safety for speed, lots of per thread state for instance

17:20 <geist> yah that's probably correct

17:20 <heat> things musl says they won't do because that can screw with global malloc state

17:20 h4zel has joined #osdev

17:21 <mrvn> how is per thread state less safe?

17:22 <heat> "unsynchronized per-thread state inherently sacrifices global consistency for performance and makes it impossible to detect a lot of types of memory usage errors (DF/UAF, etc) that could otherwise be caught."

17:22 <heat> "However musl has the additional constraint of being compatible with small/very-low-memory environments. Lack of global consistency inherently means you will end up using memory less efficiently and requesting significantly more from the system. The new malloc about to go upstream in musl is, to my knowledge, the first/only advanced hardened allocator using slab-type design rather than traditional dlmalloc type split/merge, but also designed for

17:22 <heat> extremely low overhead/waste at low to moderate usage rather than extreme performance. And in the vast majority of applications, this is perfectly reasonable. Even Firefox for example does very well with it."

17:22 <heat> oops

17:23 <sonny> gog so you don't like using the mouse?

17:23 <mrvn> Oh yeah, firefox. The perfect app to test how it performs in small/very-low-memory environments.

17:23 <gog> not when i'm coding

17:24 <sonny> ok

17:24 <sonny> hmm

17:24 <heat> mrvn, firefox is a good example of a big app with lots of allocations

17:24 <mrvn> heat: s/per-thread state// s/global// s/for performance and//

17:24 <sonny> guess it's a good idea to leave room for the user to make their own key commands

17:25 <mrvn> heat: yes, big app, lots of allocations, a little per thread overhead in the malloc code becomes totally irrelevant and all the gains shine.

17:26 <mrvn> heat: I would say firefox is one of the best test cases to showcase the per-thread allocations.

17:26 <heat> that's not the point

17:26 <heat> musl's malloc isn't tuned for performance at all

17:26 <heat> if they wanted a fast malloc, they could get one

17:26 <mrvn> heat: My argument is with "*Even* Firefox for example does very well with it.". Obviously. That's the kind of app you want this stuff for.

17:27 <heat> that's about musl's malloc, not a per-thread thing

17:27 <mrvn> oh, your paste looked like it was all about the same thing

17:27 <heat> it is

17:27 <heat> it's musl's author explaining the reasoning behind the slowness in musl's malloc

17:28 <heat> and the lack of per-thread state

17:28 <mrvn> heat: you forgot to paste the not having it part. :)

17:28 <heat> yeah it's got some context

17:28 <heat> https://news.ycombinator.com/item?id=23080290

17:28 <bslsk05> news.ycombinator.com: Why does musl make my Rust code so slow? | Hacker News

17:29 <j`ey> shame that jemalloc bakes the PAGE_SIZE into the binary at compile time

17:29 <mrvn> heat: The "unsynchronized" line is still stupid though. Anything unsynchronized will just crash in any multithreaded app.

17:30 <mrvn> heat: What musl has to compete with is synchronized per-thread state

17:31 <heat> i'm tired of shilling for the llvm foundation

17:31 <heat> they don't even pay me

17:31 <mrvn> heat: gcc pays (paid) me.

17:31 <heat> j`ey, scudo sucks, have you heard of mimalloc, the best allocator ever written?

17:31 * mrvn was hired by GNU for the princely sum of 10 stickers.

17:32 <heat> way better than jemalloc

17:32 <heat> also way smaller

17:32 <j`ey> Ive heard of it, never used it

17:33 <sonny> what about snmalloc?

17:33 <sonny> https://github.com/microsoft/snmalloc

17:33 <bslsk05> microsoft/snmalloc - Message passing based allocator (72 forks/679 stargazers/MIT)

17:33 <heat> sucks, use mimalloc

17:34 <heat> the best permissively licensed memory allocator with a special focus on speed

17:34 <mrvn> Isn't it sad that the best way to transfer copyright internationally is claiming your employees' work for yourself?

17:34 <sonny> well wow microsoft has been doing a lot of allocators :D

17:35 <heat> don't worry, the windows allocators still suck

17:35 <heat> everything is still normal

17:35 <sonny> lmao

17:38 <sonny> have you guys heard of the unicorn emulator?

17:39 <sonny> https://www.unicorn-engine.org/

17:39 <bslsk05> www.unicorn-engine.org: Unicorn – The Ultimate CPU emulator

17:41 <heat> anyway malloc sucks, use slub

17:42 <heat> the non-permissively licensed high performance allocator suitable for kernels

17:42 <heat> better yet, allocate whole pages

17:42 <j`ey> LARGE pages

17:43 <heat> HUGE PAGES

17:43 <heat> FUCKING HUGE

17:43 <heat> HUUUUUUUUUGE

17:45 <heat> pages so huge you'll lose your shit

17:46 dennis95 has quit [Quit: Leaving]

17:47 <gog> 512GiB pages

17:47 <GeDaMo> https://en.wikipedia.org/wiki/Page_(computer_memory)#Multiple_page_sizes

17:47 <bslsk05> en.wikipedia.org: Page (computer memory) - Wikipedia

17:48 <gog> oh shit RISCV64 actually supports 512GiB pages if you have large enough address space bits

17:49 <GeDaMo> Thinking ahead :P

17:49 <hmmmm> what is the point in having pages at that granularity

17:49 <gog> idk really

17:49 <gog> seems like overkill

17:51 mahmutov has quit [Ping timeout: 272 seconds]

17:51 <j`ey> aarch64 too

17:52 <heat> hmmmm, for the phys map

17:54 <mrvn> At some point it's just simpler to copy&paste the page table design for each level and you just get support for pages at any level for free.

17:54 mahmutov has joined #osdev

17:55 <heat> the arm64 48-bit address space's phys map is 128TB long

17:55 <heat> 512GB pages are obviously useful here

17:55 <mrvn> gog: if you have a compute cluster with TB of memory per node running a single application then 512GiB pages sound like a smart thing. Only needs 1 slot in the TLB.
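
The page sizes being thrown around follow from simple arithmetic, assuming a 4 KiB base page and 9 translation bits per table level (the layout used by x86-64 and by the 4 KiB-granule AArch64 and RISC-V Sv48 formats). `leaf_size` and `leaves_needed` are illustrative helper names.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SHIFT 12  /* 4 KiB base pages */
#define BITS_PER_LEVEL   9  /* 512 entries per table */

/* Size in bytes of a leaf mapping placed `level` steps above the
 * last-level table: 0 = 4 KiB, 1 = 2 MiB, 2 = 1 GiB, 3 = 512 GiB. */
uint64_t leaf_size(unsigned level)
{
    assert(level <= 4);
    return UINT64_C(1) << (BASE_PAGE_SHIFT + BITS_PER_LEVEL * level);
}

/* Number of leaf entries of a given level needed to cover `bytes`. */
uint64_t leaves_needed(uint64_t bytes, unsigned level)
{
    uint64_t size = leaf_size(level);
    return (bytes + size - 1) / size;
}
```

Under these assumptions a 128 TiB physical map takes 256 entries at 512 GiB each, against 131072 entries at 1 GiB each.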

17:55 <gog> yes

17:56 <gog> that is the only existing application i can think of for that

17:56 <mrvn> gog: the other is when your kernel maps all physical memory for easy access

17:56 <mrvn> or at bootup

17:56 <gog> true true

17:57 <gog> but 1GiB pages would be ok for that too, but i guess that eats up more of the precious few TLB slots for 1GiB pages

17:57 <mrvn> I was tempted to use 1GB pages on amd64 but not every cpu has it

17:57 <gog> i can't think of anybody i know that has more than 32GiB of memory in their rig

17:58 <mrvn> gog: 1GiB page needs more levels of page tables. More work to set up. Slower to look up on fault too.

17:58 <j`ey> probably a few of us in here :P

17:58 <gog> perhaps

17:58 <gog> i think geist has a rig with 64GiB?

17:58 <mrvn> gog: MemTotal: 64851252 kB

17:58 <heat> it was doug16k I think

17:59 <mrvn> heat: 16k 4k pages?

17:59 <gog> odamn

17:59 <mrvn> oh, wait, wrong order of magnitude

18:01 <j`ey> I have 64G too

18:01 <j`ey> Im assuming geist's thunder x2 has 128G at least

18:02 <mrvn> What I don't have is some insane gaming GPU with 16GB of memory.

18:03 <mrvn> And I say "gaming" because if you stick a "for data center use" sticker on it the price goes up 50%

18:05 <heat> if you stick a gaming sticker on it the price goes up by 200% :P

18:07 <j`ey> RGB is expensive

18:09 bslsk06 has joined #osdev

18:09 puckipedia has joined #osdev

18:10 puckipedia has quit [Remote host closed the connection]

18:10 bslsk06 has quit [Client Quit]

18:10 mahmutov has quit [Ping timeout: 240 seconds]

18:13 mahmutov has joined #osdev

18:30 sonny has left #osdev [Closing Window]

18:33 ravish0007_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

18:37 CaCode has joined #osdev

18:40 h4zel has quit [Ping timeout: 272 seconds]

18:44 <geist> 128 yes

18:44 <geist> On thunder x

19:07 the_lanetly_052 has joined #osdev

19:08 the_lanetly_052 has quit [Max SendQ exceeded]

19:09 the_lanetly_052 has joined #osdev

19:10 the_lanetly_052 has quit [Max SendQ exceeded]

19:10 the_lanetly_052 has joined #osdev

19:11 the_lanetly_052 has quit [Max SendQ exceeded]

19:11 the_lanetly_052 has joined #osdev

19:12 the_lanetly_052 has quit [Max SendQ exceeded]

19:13 the_lanetly_052 has joined #osdev

19:13 the_lanetly_052 has quit [Max SendQ exceeded]

19:14 the_lanetly_052 has joined #osdev

19:15 not_not has joined #osdev

19:15 <not_not> Hy

19:15 the_lanetly_052 has quit [Max SendQ exceeded]

19:16 the_lanetly_052 has joined #osdev

19:16 the_lanetly_052 has quit [Max SendQ exceeded]

19:17 the_lanetly_052 has joined #osdev

19:26 <g1n> not_not: hi

19:29 <clever> geist: you mentioned before that youve seen the pi4 pcie controller in other soc's, do you happen to know the name of that controller or where i might find proper docs on it?

19:34 Brnocrist has joined #osdev

19:36 isaacwoods has quit [Quit: WeeChat 3.4]

19:57 the_lanetly_052 has quit [Ping timeout: 245 seconds]

19:59 troseman has quit [Ping timeout: 272 seconds]

20:00 <not_not> Hmm time to contemplate on decision

20:01 <not_not> Write a VM or my own language, or x86_64 kernel or arm64 kernel?

20:01 <GeDaMo> Write a VM for your own language as a kernel :P

20:01 not_not has quit [Read error: Connection reset by peer]

20:02 <g1n> lol

20:02 <g1n> i think x86_64 will be easier on first tries, but i am not sure

20:02 * g1n tried to make a vm

20:02 * g1n thought about making own lang, but just did very little steps in compiler dev lol

20:04 not_not has joined #osdev

20:06 <gog> x86_64 with UEFI is easier in a few ways imo

20:06 <gog> firmware deals with getting you into the right mode and sets up an identity-mapped paging environment

20:06 <gog> you can load files with boot services protocols directly from FAT volumes

20:07 <gog> and it has rudimentary memory management

20:08 h4zel has joined #osdev

20:19 CaCode_ has joined #osdev

20:21 <sham1> AMD64 is also relatively easy just due to all the resources available

20:23 CaCode has quit [Ping timeout: 272 seconds]

20:24 mahmutov has quit [Ping timeout: 256 seconds]

20:28 mahmutov has joined #osdev

20:31 biblio has joined #osdev

20:32 <gog> yes

20:33 <gog> also a big fan of rip-relative addressing

20:38 biblio_ has joined #osdev

20:40 biblio_ has quit [Client Quit]

20:40 biblio has quit [Ping timeout: 260 seconds]

20:58 sortie has quit [Ping timeout: 256 seconds]

21:07 vin has quit [Ping timeout: 260 seconds]

21:19 sortie has joined #osdev

21:24 eroux has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

21:25 <not_not> Gog ye i think relative addressing rings a bell from my userspace exp

21:25 <not_not> IMO Cisc is more interesting

21:26 <gog> yes makes position-independent code much more straightforward

21:26 <gog> no thunking, no referring to offset tables

21:26 <not_not> And i hear the push multiple regs to stack

21:26 <gog> x86_64 doesn't have that anymore

21:26 <gog> pusha isn't supported

21:26 <not_not> Aww poops

21:27 <not_not> Any reason?

21:27 <gog> idk exactly, probably makes the microcode around register dependencies more complicated

21:27 <gog> thus way slower

21:27 <not_not> Ahh

21:27 <gog> might take more micro-ops to do the push vs just doing it one-by-one in the asm

21:28 GeDaMo has quit [Remote host closed the connection]

21:28 <gog> best guess

21:28 <not_not> Ye and 64 bit ur not exactly pressed for mem

21:28 <_eryjus> break the pipeline maybe?

21:28 <mrvn> gog: you don't need to push all regs and partial pushes are definitely faster

21:28 <gog> yes

21:28 <gog> also pipeline stalls

21:29 <not_not> And U have too many registers to count

21:29 <gog> mrvn: and it might not even be the registers you care about

21:29 <gog> wrt pushad

21:29 <not_not> 64 bit changes errything

21:29 <not_not> Never written kernel stuff

21:30 <not_not> Got a bootloader into 32 bit mode maybe

21:30 <not_not> But cant write to screen to confirm yet

21:30 <gog> i don't recommend it unless you enjoy frustrating bugs that you caused yourself because of a side-effect you didn't take into account

21:31 <not_not> I enjoy late night mysteries yes

21:31 <gog> i.e. me not realizing i enabled interrupts before trying to initialize the PIC :|

21:31 <not_not> Wow

21:31 <not_not> Ye its always do banale

21:31 <gog> yeah i shot myself in the foot there

21:31 <not_not> So*

21:31 air has quit [Quit: cria 0.2.9cvs17 -- http://cria.sf.net]

21:31 <not_not> Hehehe

21:31 <gog> with the tiniest change to my code

21:32 air has joined #osdev

21:32 <not_not> Did the same on my first parser

21:32 <not_not> Spent 3 days looking for a sigsegv

21:33 <gog> oops

21:33 <not_not> Turns out the index to my string pointer array was assigned with == instead of =

21:33 dormito has quit [Quit: WeeChat 3.3]

21:34 <not_not> But when i caught it i forgot it was the build that would show if my parser worked or not and hnggg

21:35 <not_not> My first line of my own programming lang was computed correctly and in polish notation

21:36 <not_not> Was satisfying

21:36 <not_not> I love bugs

21:36 <not_not> And i hate bugs

21:37 <not_not> X86 64 it is

21:37 <not_not> But the ssh key to my server is off when i Try to connect to my linux

21:38 <not_not> (they Are watching is)

21:38 <not_not> Us*

21:40 <not_not> Well when i connect from the insane asylum

21:43 <not_not> Not planning on a multi user multi core full blown system here

21:44 <not_not> And not gonna be temple os guy

21:45 <not_not> Nod

21:47 <not_not> Ftw i never finished tiberian sun

21:47 <not_not> Tfw

21:47 masoudd has quit [Ping timeout: 272 seconds]

21:52 <CompanionCube> not_not: heh, if x86_64 has too many registers then there's arches like arm64 with 32 registers

21:52 <not_not> Ye i have a pi

21:52 <not_not> Arm was my first asm

21:53 <not_not> Had to write sound mixer and set dma and interrupts in asm on the gba

21:54 <not_not> Lots of regs helped there

21:54 <klange> Happy to report that thus far my hvf vm and my rpi are both continuing to loop Doom demos ten hours later

21:54 <not_not> Gz

21:56 <not_not> Gonna try and hook up my rpi to qbits from parallell universes when i get out

21:57 sonny has joined #osdev

21:58 <not_not> But i need kpins to reboot my brain

21:59 <not_not> All they have here is 2 mg valium and 25 mg qetiapine

22:01 <klange> I've got 72mg of methylphenidate.

22:01 <not_not> Uff get adderall

22:02 <klange> Very very illegal here.

22:02 <not_not> How illegal?

22:02 <klange> Lose my house, get deported, and never be able to return to the country I have lived the last six years of my life in illegal.

22:02 <not_not> Ur not phillipineese by any chance

22:03 <not_not> Japan?

22:03 <klange> Japan.

22:03 <not_not> Awesome

22:03 <gog> quetiapine made me have lucid nightmares

22:03 <not_not> I wanna go to japan

22:03 <gog> and oversleep

22:03 <not_not> Gog lol

22:04 <not_not> Well i take 100. Mg in one go once a year and i can do amp for another year without a disaster

22:05 <not_not> My bf held an anti pill campaign against me

22:06 <not_not> Insisting i should rather take 10 hits of acid and do speed daily over taking 2 kpin every other week

22:07 <klange> ah, yes, "don't take this drug, take _this_ drug!" people

22:07 <not_not> Yes

22:07 <not_not> I have weapon grade ptsd

22:08 <not_not> 2 kpin make me chill and tired for a whole day and the whole day after

22:09 <not_not> Whereas he'd wake up with a tiger in his room

22:10 <not_not> He's such a sweetheart but omg stupid

22:11 <not_not> And i paid off all his Loan sharks

22:12 <not_not> And i told him if i start whining for pills

22:13 <not_not> Its a chrisis and U need to put a xanax in my mouth or its now i am become death the destroyer of worlds blackout

22:14 <not_not> And i have eplepsia om my visual cortex

22:15 <not_not> Guess 3 times who ate my xanax in my moment of need

22:15 <gog> i'm rewriting my memory manager

22:15 <not_not> Nice

22:16 <not_not> Im having my whole brain replaced when i get some real sedatives with alien technology

22:16 <not_not> Then im gonna tend to my hobbies

22:17 <not_not> Write some os

22:18 <not_not> And sell off a company

22:18 <not_not> Man U get loads done with amp man

22:20 <not_not> Gog U read yakuza noon?

22:20 <not_not> *moon

22:20 <gog> no

22:20 <not_not> Wait i meant klangen

22:20 <clever> gog: on the subject of memory managers, i need to make one for a display-list system, its a bit different from normal, because the metadata and the data must be in different regions of memory...

22:20 <not_not> Klange

22:22 <not_not> Lol i was like "but im really here to talk to you about the security clearance" when i was arrested

22:22 <clever> gog: basically, i have an uint32_t[4096], where i need to use chunks of semi-unpredictable size, but only ~8 objects need to exist at any time, and they can expire after 1/30th of a second at the slowest

22:22 <gog> will they ever add up to more than 16kibs?

22:22 <not_not> Mhm

22:23 <clever> gog: i think ive gotten things to work with one object taking up ~8kb before, so i could very easily fill it with just 2 objects

22:23 <clever> so certain combinations of modes will be unsolvable

22:23 <not_not> So they sendt med to the insane asylum

22:24 <gog> where do the objects come from? are they autogenerated, is this like a blitter?

22:24 <not_not> AS punishment for thinking they were gonna send me to Afghanistan

22:25 <clever> gog: auto-generated from a linked list in normal ram, and yeah, its configuring the blitter

22:25 <not_not> They spendt a ton of Money om that arrest lmao

22:25 <gog> and these are blits that need to happen at the same time or can you queue them up for a certain number of frames?

22:25 <clever> gog: basically, for each image you want to display on screen (at 1:1 scale), you need an uint32_t[7] object, scaled images are uint32_t[14], and the end-of-list marker is uint32_t[1]

22:26 <clever> gog: you describe a frame by just having a big list of the image objects, and an end-of-list marker, and for page-flip speed, having a second list pre-loaded into the memory saves a great deal of time

22:27 <clever> so yes, you can queue up the next frame, by just writing it into unused space in this limited memory

22:27 <not_not> Nice

22:27 <clever> and the hw can drive up to 3 displays, so you will have 3 frames actively being rendered, plus a potential 3 more frames that you are writing to the config, or are waiting for a vsync

22:27 <not_not> Cant wait to do video shit gonna make a kernel space gaming os

22:27 <gog> but you're still limited to that 16k

22:27 <clever> yep

22:27 <clever> so you cant queue too much up

22:27 <gog> hm yeah that is tricky

22:28 <clever> and also, the palette and up-scaling filters go into the same 16kb region, if used

22:28 <mrvn> clever: that's what allocators are for in c++

22:28 <mrvn> (different regions of memory)

22:28 <clever> mrvn: from https://en.cppreference.com/w/cpp/memory/allocator ?

22:28 <bslsk05> en.cppreference.com: std::allocator - cppreference.com

22:29 <clever> gog: all records must also be 32bit aligned, so you can just treat the 16kb region as a 4096 slot region of opaque tokens, if that makes things simpler

22:30 <mrvn> clever: yes. you should check for some talk about allocators in c++ 17/20. They have changed them.

22:30 <not_not> Ill never code a line of rust

22:30 <clever> mrvn: got a link to a talk? search tools are impossible to use with keywords like c++

22:30 <mrvn> clever: given your size constraint you might need something that can compact memory.

22:30 <clever> they just go "oh, regex", and ignore the ++ no matter what you do, lol

22:31 <mrvn> clever: try cppcon instead

22:31 <not_not> Cant tolerate languages that use " let x = 0"

22:31 <clever> mrvn: compaction is why i mentioned the 1/30th of a second thing, if you do move an object around, you have to wait for the next vsync before you can delete the old object

22:32 <clever> https://www.youtube.com/watch?v=kSWfushlvB8 CppCon 2017: Bob Steagall “How to Write a Custom Allocator”

22:32 <bslsk05> 'CppCon 2017: Bob Steagall “How to Write a Custom Allocator”' by CppCon (01:03:40)

22:32 <clever> that one sounds perfect

22:33 <not_not> Is C++ dumb if ur writing kernels?

22:33 <Griwes> there's also Arthur O'Dwyer's thing that explains memory resources and yours truly's talk on how we've used them together with gpus in Thrust

22:33 <clever> not_not: ive used c++ on no-mmu kernels, havent had any real trouble

22:33 <mrvn> clever: reading what you wrote maybe it isn't such a good idea. You need space for different objects and allocators are for just one.

22:33 <gog> c++ is perfectly suitable to write kernels with, with some caveats

22:34 <not_not> I have a hankering for asm and C being spar out by python scripts

22:34 <mrvn> gog: a subset of c++ is definitely better for writing kernels

22:34 <gog> yes

22:34 <not_not> Ye considering C++

22:34 <Griwes> raii is a beautiful thing

22:35 <clever> Griwes: found that, https://www.youtube.com/watch?v=0MdSJsCTRkY

22:35 <bslsk05> 'C++Now 2018: Arthur O'Dwyer “An Allocator is a Handle to a Heap”' by CppNow (01:28:42)

22:35 <Griwes> even if you use just that and literally nothing else, your life already improves

22:35 <mrvn> clever: iirc you are supposed to split the management into the allocator and the resource.

22:35 <not_not> Someone suggested rust but "let x = 0" is petting the dog the wrong way

22:35 <Griwes> clever, yeah, and the mentioned self plug: https://www.youtube.com/watch?v=5UVeh4_5B8I

22:35 <bslsk05> 'Memory Resources in a Heterogeneous World - Michał Dominiak - CppCon 2019' by CppCon (00:59:49)

22:36 <mrvn> clever: maybe one resource and N allocators using it would work. One for images, one for scaled images, ...

22:36 <clever> another detail, is that with the current code, i cant easily know the size of an object ahead of time

22:36 <clever> but that could be solved

22:36 <mrvn> clever: on the other hand placement new sounds easier in that case.

22:37 <clever> mrvn: the image data itself, lives in regular ram, malloc already solved that

22:37 <mrvn> you can always make a proxy object that allocates memory when it gets committed for display

22:37 <clever> and the scaled vs unscaled images, must be in consecutive slots, if you want them in the same frame

22:38 <clever> for reference, here is an unscaled image, taking up 7 slots in the dlist: https://github.com/librerpi/lk-overlay/blob/master/platform/bcm28xx/hvs/hvs.c#L100-L116

22:38 <bslsk05> github.com: lk-overlay/hvs.c at master · librerpi/lk-overlay · GitHub

22:38 <clever> https://github.com/librerpi/lk-overlay/blob/master/platform/bcm28xx/hvs/hvs.c#L697-L707

22:38 <bslsk05> github.com: lk-overlay/hvs.c at master · librerpi/lk-overlay · GitHub

22:38 <clever> and here, line 697 takes note of the starting position, 701/703 adds elements, and 707 creates the 1x32bit end-of-list marker

22:39 <clever> the amount of space taken up, depends on how many images are scaled and how many are unscaled, and how many images are planar
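
As a rough sketch of the sizing described above, the slot count of one frame's list can be estimated from the per-object sizes quoted in the discussion (7 slots unscaled, 14 scaled, 1 for the end-of-list marker). The function and constant names are hypothetical, and planar formats, which also change the size, are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Slot counts quoted in the discussion (one slot = one uint32_t). */
enum {
    DLIST_SLOTS          = 4096, /* whole dlist memory, 16kb */
    SLOTS_UNSCALED_IMAGE = 7,
    SLOTS_SCALED_IMAGE   = 14,
    SLOTS_END_MARKER     = 1
};

/* Slots occupied by one frame's list. Planar images are not modelled. */
static size_t dlist_frame_slots(size_t unscaled, size_t scaled)
{
    return unscaled * SLOTS_UNSCALED_IMAGE
         + scaled * SLOTS_SCALED_IMAGE
         + SLOTS_END_MARKER;
}

/* Nonzero if a frame with this mix of images fits in `budget` slots. */
static int dlist_frame_fits(size_t unscaled, size_t scaled, size_t budget)
{
    assert(budget <= DLIST_SLOTS);
    return dlist_frame_slots(unscaled, scaled) <= budget;
}
```

For example, a frame with 3 unscaled and 2 scaled images needs 3*7 + 2*14 + 1 = 50 slots.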

22:39 <mrvn> clever: you always alternate between 2 lists. So maybe handle the list as 2 blocks of 8kb.

22:39 <clever> mrvn: except, if i want dual-monitor support, i then need to cut it up into 4, and triple-monitor, 6

22:40 <clever> which limits you to 48 unscaled images on-screen at once

22:40 <mrvn> clever: hmm, dual/triple monitor is harder. They aren't likely to be the same size.

22:40 <clever> yeah, having an allocator would allow a monitor with 1 image to not hog 1/3rd of the dlist

22:41 <mrvn> are you likely to construct multiple lists in parallel?

22:41 <clever> not likely, i already have a mutex per monitor

22:41 <clever> and i could expand that to one global mutex

22:42 <clever> my current solution is to just blindly treat the entire memory region as a ringbuffer

22:42 <mrvn> I mean: add image to monitor 1, add image to monitor 2, add image to monitor 1, scale image for monitor 2, end monitor 1, add image to monitor 2, end monitor 2.

22:42 <clever> because if you only ever have 2 objects live at once, by the time you wrap around, the old ones have expired

22:42 <clever> for the hw to work right, all images on a given monitor, must be consecutive in the dlist memory

22:43 <not_not> Clever clever

22:43 <clever> in z-order

22:43 <mrvn> I know. which makes adding images interleaved a problem because you have to either make space or leave space.

22:43 <clever> yeah

22:43 <mrvn> On the other hand the memory isn't that big. moving a few objects around is easy.

22:43 <clever> my current hack, is that on every change, i re-write the dlist for every monitor

22:44 <clever> and then schedule pageflips on vsync

22:44 <not_not> Gonna do loads of weird shit with the mouse cursor on my os gui

22:44 <not_not> Like U can rotate it

22:44 <not_not> Split IT in 2 to grav multiple things

22:44 <not_not> Multi mouse support

22:45 <mrvn> does the hardware read the memory as ring buffer?

22:45 <clever> nope

22:45 <clever> if the hw hits the end of the array without a proper end-tag, it crashes
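
The "blindly treat it as a ringbuffer" approach, combined with the constraint that the hardware itself does not wrap, could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from lk-overlay: `struct dlist_ring` and `dlist_ring_alloc` are hypothetical names, and nothing here verifies that the slots being reused have actually expired.

```c
#include <assert.h>
#include <stddef.h>

#define DLIST_SLOTS 4096u

/* Hypothetical bump allocator over the dlist slots. */
struct dlist_ring {
    size_t next; /* first slot not yet handed out */
};

/* Reserve `count` consecutive slots and return the index of the first.
 * A list must never straddle the end of the array, because the hardware
 * reads straight on rather than wrapping, so if the request does not fit
 * in the remaining tail it restarts at slot 0.
 * This relies on only a few lists being live at once, so that whatever
 * is overwritten has already expired. */
static size_t dlist_ring_alloc(struct dlist_ring *r, size_t count)
{
    assert(count > 0 && count <= DLIST_SLOTS);
    if (r->next + count > DLIST_SLOTS)
        r->next = 0;
    size_t start = r->next;
    r->next += count;
    return start;
}
```

The caller still has to write the end-of-list marker as the last of the reserved slots, since that is what stops the hardware from running off the end.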

22:45 <not_not> In fact the mouse Will be a programming language

22:45 <not_not> Like U can record mouse macros

22:46 <not_not> And drag it throug if statements and loops

22:46 <not_not> And watch the mouse work

22:47 <mrvn> A good strategy might be to move lists that haven't changed towards the front each frame, create new lists somewhat spread out in the back of memory.

22:47 <not_not> Stole the ideal from microsoft

22:48 <clever> mrvn: one idea ive considered, is to pre-create the 7/14 slot object for an image, within the layer object that tracks its state

22:48 <clever> so i dont have to convert the state each time i make the list, i can just memcpy chunks
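
A minimal sketch of that idea, assuming a hypothetical `struct layer` that caches its already-encoded dlist words: building a list then reduces to a memcpy per layer plus the end-of-list marker. `DLIST_END_MARKER` is a placeholder value chosen for illustration, not a value confirmed by the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LAYER_MAX_SLOTS  14          /* scaled image; unscaled uses 7 */
#define DLIST_END_MARKER 0x80000000u /* placeholder value (assumption) */

/* Hypothetical layer object carrying its pre-encoded dlist words. */
struct layer {
    uint32_t words[LAYER_MAX_SLOTS];
    size_t   nwords; /* 7 or 14 in the cases discussed */
};

/* Concatenate the cached words of `n` layers (already in z-order) into
 * `out`, then append the end-of-list marker. Returns the number of
 * slots written, or 0 if `out` (capacity `cap` slots) is too small. */
static size_t dlist_build(uint32_t *out, size_t cap,
                          const struct layer *layers, size_t n)
{
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        assert(layers[i].nwords <= LAYER_MAX_SLOTS);
        if (pos + layers[i].nwords + 1 > cap)
            return 0; /* always keep room for the end marker */
        memcpy(&out[pos], layers[i].words,
               layers[i].nwords * sizeof(uint32_t));
        pos += layers[i].nwords;
    }
    if (pos + 1 > cap)
        return 0;
    out[pos++] = DLIST_END_MARKER;
    return pos;
}
```

Building into a staging buffer in normal ram first means the copy into dlist memory can later be a single memcpy of the returned length.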

22:48 <mrvn> clever: definitely.

22:50 <mrvn> I can't think of any algorithm that wouldn't leave gaps and any waste could be deadly. Except keeping the lists in normal memory and recreating them in each 8k block on every change.

22:51 <clever> also, if the monitors are running at different refresh rates, like 60hz and 59hz, the vsync'd will drift in and out of phase

22:51 <not_not> Off

22:51 <clever> so when an object expires and becomes free space, will change

22:51 <mrvn> is that realistic?

22:51 <clever> they can be running from different PLL's with different divisors

22:52 <clever> and unless you get the pixel count, divisors, and PLL's all perfectly aligned, they will have some drift

22:52 <not_not> Ahh Nice night, night is soothing

22:52 <not_not> Day is gay

22:52 <mrvn> you're screwed.

22:54 <mrvn> clever: In that case I think you have to live with some tearing or do the memcpy all in the vsync.

22:54 <clever> memcpy may solve some of the issues

22:54 <clever> previously, when i was re-creating the entire list in vsync, i had a stable tear near the top

22:54 <mrvn> if you memcpy do you even need 2 copies of each list?

22:54 <clever> i solved that by pre-creating the list, and only doing the flip on vsync

22:55 <clever> possibly not

22:55 <mrvn> You can create the whole list in memory so it's a single memcpy() call

22:55 <clever> but you can only write to the region used by a screen currently in vsync

22:56 <not_not> Man im so relaxed

22:56 <clever> so worst case, 1 screen is in vsync, and 2 are active, so you cant change it

22:56 <not_not> Its the sun

22:56 <mrvn> put one list at the start, one list to end at the end and the third floating with equal space before and after.

22:56 <clever> mrvn: oh!, but if you are copying the visible frame, and then page-flip mid-frame, the 2 frames are identical, that wont be a visible tear! (when not scaling anything)

22:56 <not_not> All my worst enemies have day jobs

22:56 <not_not> Tbh

22:57 <not_not> I can sende they stopped scheming and gone to bed

22:57 <not_not> Sense

22:58 <mrvn> clever: is there a nop entry?

22:59 <clever> no, but you can use alpha to waste a slot on a 100% transparent image, or set the w/h to 0 and maybe it wont render

22:59 <clever> or set the xy to be off-screen

23:00 <clever> yeah, a 7-slot object can have an alpha, either object wide, or per-pixel

23:00 <mrvn> I wonder if you could move a list by copying each slot and replacing the original with nops.

23:00 <clever> there is also state on each object, that the hw uses for internal purposes

23:00 <clever> and it may glitch if that state is wrong

23:01 <mrvn> so better not do that.

23:01 <clever> at the most basic level, you have 1 compute core, that is round-robin'd between all active displays

23:02 <mrvn> Going with the "recreate in vsync" idea and having a list at the front, end and middle then worst case you might have to wait one frame to move the middle list if one list grows too much between frames.
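
The front/middle/end placement can be sketched as follows. The struct and function names are hypothetical; the only facts taken from the discussion are the 4096-slot region and the idea of pinning two lists to the ends and floating the third with equal slack on both sides.

```c
#include <assert.h>
#include <stddef.h>

#define DLIST_SLOTS 4096u

struct dlist_layout {
    size_t first;  /* start slot of the list pinned to the front */
    size_t middle; /* start slot of the list floating in between */
    size_t last;   /* start slot of the list pinned to the end */
};

/* Compute start slots for three lists of the given sizes (in slots,
 * end markers included). Returns 0 on success, -1 if the three lists
 * do not fit in DLIST_SLOTS together. */
static int dlist_place3(size_t n_first, size_t n_middle, size_t n_last,
                        struct dlist_layout *out)
{
    assert(out != NULL);
    if (n_first + n_middle + n_last > DLIST_SLOTS)
        return -1;
    /* Space left between the two pinned lists. */
    size_t gap = DLIST_SLOTS - n_first - n_last;
    out->first  = 0;
    out->last   = DLIST_SLOTS - n_last;
    /* Centre the middle list so it has equal room to grow either way. */
    out->middle = n_first + (gap - n_middle) / 2;
    return 0;
}
```

With sizes 100, 200 and 300, the middle list starts at slot 1848 and has 1748 free slots on each side, so either neighbour can grow by that much before the middle list has to move.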

23:02 <clever> for each display, you have an output FIFO, that holds whole scanlines

23:02 <clever> if a FIFO has room for 1 scanline of image data, the compute core will read the display-list, find every rect intersecting that scanline, then draw those objects directly into the FIFO ram

23:03 <clever> and its not a typical push/pop only FIFO, but just a ringbuffer acting as a FIFO

23:03 <clever> i can choose what range of ram each of the 3 FIFO's occupies

23:03 <mrvn> I can't quite see those list changing (in size) from frame to frame. I imagine more that they are setup once, e.g. when the game starts a level, and then it remains that size for minutes. Then shrinks for the loading screen and grows again for the next level. Or something like that.

23:04 <clever> if you are using sprites to animate enemies, the list will change size, based on how many enemies are on-screen

23:05 <mrvn> clever: that would limit the enemy count quite a lot

23:05 <clever> https://www.youtube.com/watch?v=u7DzPvkzEGA

23:05 <bslsk05> 'ntsc dance v2, interlacing fixed' by michael bishop (00:00:22)

23:05 <clever> do you see the glitching in this video?

23:06 <mrvn> horrible.

23:06 <clever> that happens if you have too many sprites in a certain area

23:07 <clever> the compute part of the HVS cant fill the FIFO fast enough

23:07 <clever> and the electron beam catches up, and runs out of pixels to display

23:07 <mrvn> Think about it: You have the player, the enemy, all the bullets and rockets or whatever they shoot. 20 sprites is not going to cut it.

23:07 <clever> the FIFO size lets you smooth out lag spikes, from a scanline taking too long to render

23:07 <clever> but too many spikes, and it will glitch out

23:08 <clever> but also, 20 is not really the limit

23:08 <clever> the limit, is how many pixels you have to copy for a scanline

23:08 <mrvn> I think you have to render the sprites to framebuffers and use the dlist to display just those.

23:08 <clever> and those raspberries are being down-scaled a lot

23:08 <clever> so there is a lot more pixels being copied, than what you would expect

23:09 <clever> also, a large amount of bandwidth is being wasted on pixels you cant see

23:09 <clever> if the sprites had proper collision checking, you cant overload it

23:09 <mrvn> clever: overlapping sprites definitely waste bandwidth

23:09 <clever> other than bullets in a bullet-hell game, most sprites dont overlap

23:10 <mrvn> is there hardware collision detection?

23:10 <clever> nope

23:12 <clever> also, if in dual-monitor mode, the compute core has to split its clock cycles between both displays

23:12 <clever> so it becomes half as capable

23:14 <clever> *looks*

23:15 <mrvn> clever: dual monitor wouldn't be a problem. One list at the start, one at the end. Both with memcpy() in the vsync.

23:15 <clever> channel 0 is dsi0 or dpi, dsi0 isnt wired on most pi models, dpi uses a lot of gpio but can be used for many things, and i have drivers

23:15 <mrvn> Only with 3 monitor support you could run into a case where the first list needs space but all the free space is after the second list.

23:15 <clever> channel 1 is dsi1 or smi, dsi1 i lack drivers for, smi is entirely undocumented

23:15 <clever> channel 2 is hdmi or composite, ive only got composite drivers currently

23:16 <clever> so realistically, triple-monitor mode isnt possible right now, dual is the limit

23:16 <clever> due to lack of drivers

23:16 <clever> the transposer brings in new stuff, but less limits

23:17 <clever> basically, the transposer uses up 1 channel, and does 90 degree rotations, and writes the image back to ram

23:17 <mrvn> clever: try if you can memcpy() long lists in the vsync if they are prepared in real memory ahead of time.

23:17 <clever> but its not a constant scanout, so you can free the dlist stuff upon completion

23:17 <clever> its also not racing an electron beam, so it isnt bothered by lag spikes

23:18 <clever> i'll give that a try, after i eat some pizza

23:18 <clever> just remembered i have some in the oven

23:20 <clever> one other random data-point, the DPI is probably the fastest output i can use right now, and its rated for up to 100mhz pixel clock

23:20 <clever> so that would mean generating 100,000,000 pixels/second

23:21 <clever> vsync rate then depends on resolution and blanking periods

23:23 <clever> divisor 2.967268, fps bounds 89-59, DPI clock measured at 108000 KHz, hsync rate: 63084 Hz, vsync rate: 59 Hz, htotal: 1712, vtotal: 1063
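
The hsync and vsync figures in that debug line follow from the pixel clock and the total (active plus blanking) frame dimensions. A minimal sketch of the arithmetic, with hypothetical helper names and truncating integer division:

```c
#include <assert.h>
#include <stdint.h>

/* Scanline rate in Hz: pixels per second divided by pixels per
 * scanline (htotal includes horizontal blanking). */
static uint32_t hsync_rate_hz(uint32_t pixel_clock_hz, uint32_t htotal)
{
    assert(htotal != 0);
    return pixel_clock_hz / htotal;
}

/* Frame rate in Hz: pixels per second divided by pixels per frame
 * (htotal * vtotal, blanking included). */
static uint32_t vsync_rate_hz(uint32_t pixel_clock_hz,
                              uint32_t htotal, uint32_t vtotal)
{
    assert(htotal != 0 && vtotal != 0);
    return (uint32_t)(pixel_clock_hz / ((uint64_t)htotal * vtotal));
}
```

Plugging in the logged values, `hsync_rate_hz(108000000, 1712)` gives 63084 and `vsync_rate_hz(108000000, 1712, 1063)` gives 59, matching the line above.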

23:24 <clever> some of the debug code when selecting divisors to hit a target fps

23:25 <not_not> Ill eat some pizza too

23:25 <not_not> Bolognese is best cold

23:28 <not_not> Clever U have a github?

23:28 <clever> not_not: yeah, i linked it above

23:28 h4zel has quit [Ping timeout: 240 seconds]

23:28 <not_not> Ahh ty

23:31 manawyrm has quit [Quit: Read error: 2.99792458 x 10^8 meters/second (Excessive speed of light)]

23:32 manawyrm has joined #osdev

23:33 Terlisimo has quit [Quit: Connection reset by beer]

23:35 <not_not> Clever cool

23:35 <not_not> Miss bit operations in C havent used them since i was 12

23:35 <mrvn> how do you do 4k displays?

23:36 <clever> mrvn: hdmi, which i lack drivers for currently

23:36 <clever> hdmi can do a higher pixel clock than dpi

23:36 <mrvn> looks like that needs around 400MHz

23:36 <mrvn> 1/4 the sprite count

23:36 Terlisimo has joined #osdev

23:37 <not_not> Zomg clever ill compile ur code once im out of here

23:38 <clever> mrvn: there is also some 4k differences between vc4 and 2711, one min...

23:39 <clever> mrvn: for vc4, the x/y position is limited to the 0-4095 range, so it can just barely handle 4k in a given dimension

23:39 <clever> 2711 increased the bits in a few fields, allowing higher resolutions

23:40 <clever> 2711 also generates 2 pixels per "pixel clock", so it can handle more pixels at the same clock