klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
<mrvn> you only need libgcc
<heat> except that I can't link against libgcc
<mrvn> is compiler-rt the clang equivalent?
<heat> yes
<Griwes> compiler-rt builtins is the clang equivalent, there's a lot more stuff that falls under compiler-rt
<heat> libgcc also includes unwinding
<heat> I'm just compiling the builtins part
<mrvn> do you have exceptions turned off?
<heat> of course
<Griwes> unwinding in llvm is separate, in libunwind
<heat> yea
<mrvn> you are doing something strange that needs something you aren't providing. *shrug*
<mrvn> using gcc + libgcc works for me. Maybe wait for someone using clang to wake up.
<heat> no, I don't need help
<heat> I've fixed it
<heat> you usually can't really link with libgcc/compiler-rt
<heat> so you either ruin your beautiful toolchain just to compile some special libgcc with multilib, or you need to take a different approach
<Griwes> wdym can't really link with compiler-rt
<heat> either 1) don't link with libgcc, but that ruins some division and some builtins; 2) roll your own; 3) import
<mrvn> heat: you can totally link against libgcc from your cross compiler.
matrice64 has quit [Quit: Textual IRC Client: www.textualapp.com]
<heat> no I can't lol
<Griwes> why
<heat> well, for example, in x86_64, compiler-rt and libgcc are compiled with the red zone enabled
<heat> can't use that in kernel space
<heat> if you don't have your toolchain target set up right, libgcc will compile with mcmodel=small, and that doesn't work in kernel space
<heat> (you can make it compile everything as PIC, but that's non obvious for a beginner that isn't patching gcc)
<mrvn> heat: x86_64 is so screwed up
<heat> in riscv, object files are tagged with soft-fp/hard-fp; you either add multilibs of soft-fp (whyyyyyyyyyy) or you can't link because ld doesn't mix soft-fp and hard-fp object files
<mrvn> the arm libgcc compiles right out of the box.
<heat> cool, but that doesn't really work a lot of the time
<heat> and I bet it won't work if your kernel is PIC or something
<mrvn> heat: does, tried that. It's just that PIC isn't PIC, just relocatable.
masoudd has joined #osdev
<mrvn> heat: works with and without mmu too. The default multilib setup seems to be fine.
<mrvn> s/mmu/fpu/
<heat> anyway, the point is that it's common to have incompatible libgccs and compiler-rts that compile for a regular user environment and not a kernel env
<mrvn> heat: sure, you need the standalone compiler. Not one for userspace.
<heat> so you either mess around with multilib options (you should know that it compiles everything two or three times!) or with llvm (good luck diving through the cmake for something that works reliably)
Oli has quit [Ping timeout: 240 seconds]
<heat> except that my OS actually has a user-space so I do need one for user-space
<heat> and your -elf targets don't fix anything
<mrvn> heat: for your userspace you are screwed; you have to patch gcc with all the multilib nightmare.
<heat> not really
<Griwes> I mean you need separate environments for the two anyway
<Griwes> for sanity
<heat> you could do that
<heat> or
<heat> you add compiler-rt builtins as a library for the kernel, takes 3 or 4 seconds to build and there ya go, works
<heat> liberally licensed too
<mrvn> apparently it doesn't work or you wouldn't have your problems.
<heat> i don't have problems
<heat> they're 100% fixed lol
<Griwes> idk my builtins archive does look position independent, and I would not expect the builtins to actually want to use the redzone
<heat> it's regular C code, it 100% can and will if the compiler wants to
<heat> can also use SIMD lol
<mrvn> redzone is a stupid idea anyway. Either the function is so trivial it should be inlined or doesn't spill anything anyway. Or it's so complex that incrementing the stack is free.
<heat> using a libgcc/compiler-rt that was built for a user environment is frail
<heat> you either multilib the shit out of them or you pray that they don't do what you don't want them to do
<mrvn> nobody is suggesting that heat
<heat> every single libgcc is built for a user environment, that's my point
<heat> the "bare metal" targets just say "hey, no libc here"
Oli has joined #osdev
<mrvn> heat: and should set all the right multilib flags for bare metal libgcc
<heat> no
<mrvn> then start filing bugs
<heat> no
<heat> how would they know what you want to do?
<heat> do gcc maintainers decide that bare metal = kernel, or bare metal is simple bare-metal-ish app, etc?
<mrvn> heat: it's a standalone compiler, it needs to use the flags that make it safe for that use. Like no redzone.
<heat> the bare metal targets are just no-libc, simple targets with all the fancy stuff turned off
<mrvn> and then compile a bunch of libgcc, like softfloat, hardfloat, ...
<heat> mrvn, except that a redzone isn't inherently unsafe in a bare metal app
<mrvn> heat: redzone needs special support. no redzone always works.
<heat> the redzone is part of the sysv ABI
<Griwes> there's ways to specify flags for specific targets within an llvm runtimes build even without patching llvm, I should try and see how well that works
<bslsk05> ​github.com: llvm-project-rust/Onyx-stage2.cmake at rustc/13.0-2021-09-30 · heatd/llvm-project-rust · GitHub
<heat> half stolen from fuchsia but works like a charm
<heat> just set those at cmake time
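A minimal sketch of the "roll your own" option from the discussion above, assuming all you need is unsigned 64-bit division on a target without a native instruction for it. The name sketch_udivmod64 is made up for illustration; to actually stand in for the libgcc/compiler-rt routine it would have to be exported under whatever symbol the compiler emits calls to (for example __udivmoddi4), and it would be compiled with the kernel's own flags (such as -mno-red-zone on x86_64) rather than the ones the toolchain's prebuilt library used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bit-by-bit (shift-subtract) unsigned division.
 * Returns n / d and, if rem is non-NULL, stores n % d there.
 * Uses only shifts, compares and subtraction, so it does not itself
 * generate a call to a division builtin. */
uint64_t sketch_udivmod64(uint64_t n, uint64_t d, uint64_t *rem)
{
    assert(d != 0); /* division by zero is the caller's bug */

    uint64_t q = 0;
    uint64_t r = 0;

    for (int i = 63; i >= 0; i--) {
        /* Remember the bit that falls off when r is shifted left. */
        uint64_t carry = r >> 63;

        /* Bring down the next bit of the dividend. */
        r = (r << 1) | ((n >> i) & 1u);

        /* If the (65-bit) partial remainder is >= d, subtract d.
         * With carry set the true value exceeds any 64-bit d, and the
         * wrapped subtraction still yields the right remainder. */
        if (carry || r >= d) {
            r -= d;
            q |= (uint64_t)1 << i;
        }
    }

    if (rem != NULL)
        *rem = r;
    return q;
}
```

This is the slow-but-obvious version; the imported compiler-rt builtins heat describes are the more practical route when they build cleanly for your target.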
<mrvn> "Pinky: Gee, Brain, what do you want to do tonight? Brain: The same thing we do every night, Pinky - try to take over the world!"
* kingoffrance points mrvn at scrollback from a week or two ago "he who controls the spice, controls the universe!" ancient module song
Oli has quit [Ping timeout: 240 seconds]
Oli has joined #osdev
<mrvn> fear is the mind killer
<heat> qemu-system-aarch64's -kernel doesn't give you a dtb and can even load you in ROM
<heat> if it knows you aren't linux
<heat> as soon as I flatten my elf file it seems I'm linux and does everything I've ever wanted
<heat> loads me dynamically in memory too
dude12312414 has joined #osdev
<klange> -aarch64 -kernel _does_ give you a dtb
<klange> _if_ you don't ask to be loaded somewhere that clobbers the default location
<klange> ThinkT510: I had a rust highlighter in C and I guess I forgot to port it to Kuroko when I switched over all the other highlighters; thanks for the reminder, I'll try to get around to it :)
<heat> klange, how? the registers are all 0
<klange> where are you asking to be loaded?
<klange> it wants to put the dtb at start of ram and it pads it out to a whole juicy megabyte
<heat> 0 because I want the bootloader to figure that out
<heat> it loaded me at 0
<heat> (it's ROM, but it did put me there)
<klange> What machine target?
<heat> virt
<klange> virt RAM starts at 0x4000_0000
<heat> so you hardcode that?
<klange> I am unsure if it supports loading PIE ELFs at addresses of its own choosing.
<heat> if you add the arm64 linux image header and flatten the ELF it loads you as a linux kernel
<klange> So for the path of least resistance, yeah, link yourself to be loaded at like 0x40100000
<heat> so it finds a place for you in memory and gives you the dtb
<klange> If you're fine with pretending to be Linux, then you do you.
<heat> there's not much pretending to be done
<heat> the whole boot protocol is "here's the dtb in a register, and here's your load address, gl hf"
<heat> i think it emulates uboot
<bslsk05> ​qemu.readthedocs.io: ‘virt’ generic virtual platform (virt) — QEMU 6.2.50 documentation
<heat> which is better than hardcoding stuff
<klange> way down at the bottom
<heat> yeah, shame for the hardcoding you need to do
<klange> you'd think they could be nice enough to just pass in the dtb address in x0 regardless... "patches welcome", I guess but the QEMU patch submission process is _involved_...
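For reference, a sketch of the 64-byte arm64 Linux image header heat refers to above, laid out as in the Linux kernel's arm64 booting documentation. The struct and function names are made up for illustration, and the check only looks at the magic bytes; it says nothing about how QEMU itself decides what to do with an image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field layout follows the Linux arm64 boot protocol; all fields are
 * little-endian in the image. The type name is hypothetical. */
struct arm64_image_header {
    uint32_t code0;       /* executable code (e.g. a branch to the real entry) */
    uint32_t code1;       /* executable code */
    uint64_t text_offset; /* image load offset from start of RAM */
    uint64_t image_size;  /* effective image size */
    uint64_t flags;       /* kernel flags */
    uint64_t res2;        /* reserved */
    uint64_t res3;        /* reserved */
    uint64_t res4;        /* reserved */
    uint32_t magic;       /* 0x644d5241, i.e. the bytes "ARM\x64" */
    uint32_t res5;        /* reserved */
};

static_assert(sizeof(struct arm64_image_header) == 64,
              "arm64 image header must be 64 bytes");

/* Returns 1 if buf starts with something carrying the arm64 image magic.
 * Compares raw bytes, so it does not depend on host endianness. */
int looks_like_arm64_image(const void *buf, size_t len)
{
    static const unsigned char magic[4] = { 'A', 'R', 'M', 0x64 };
    const unsigned char *p = buf;

    if (len < sizeof(struct arm64_image_header))
        return 0;
    return memcmp(p + offsetof(struct arm64_image_header, magic),
                  magic, sizeof magic) == 0;
}
```

A kernel that wants to be booted this way would place such a header at the very start of its flattened binary; the loader then passes the dtb address in x0, as described in the conversation.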
<heat> i want to try and build a generic arm64 image
<klange> I went for platform shims and my generic kernel expects to be loaded at -2G
<klange> So there's the qemu virt platform shim that loads at 0x4010_0000, sets up some initial page tables to accomplish that, reads kernel from fw-cfg, and hands over
<klange> and then the RPi4 one has the kernel + ramdisk embedded in it with .incbin, loads at 0x80000, and does the same things plus setting up the framebuffer early for debug messages
<heat> oh right you don't even have -initrd right?
<klange> I can't figure out how it's providing the ramdisk location, it's not in /chosen
<heat> qemu or the rpi?
<klange> qemu
<clever> klange: it should be in chosen, but i think you need the .dtb file for that model for dtb to work right
<heat> i think i saw the initrd code being under if (is_linux)
<klange> but for virt it should all be automatically generated?
<clever> for qemu virt, i'm not sure
<klange> ^ and I have strong suspicions heat is right and they're just laughing at me without even bothering to print a warning that the initrd was ignored
<heat> klange, try info roms
<heat> it lists everything that qemu loads
<clever> i would just read the qemu src
<klange> I do that often, but this particular stuff is a mess.
<heat> I was forgetting to add -cpu so qemu-system-aarch64 was just exit(1)'ing
<heat> no error
<heat> just fuck you
<klange> nothing in `info roms`
<heat> it's not loading it then
<klange> it's the same thing with the f*ing dtb if you ask to be loaded too low, it's just not there and no warning
<klange> meanwhile rpi is still using ATAGs to hand Linux ramdisk addresses in the yold 3188 on aarch64
<heat> what's an ATAG?
<bslsk05> ​github.com: qemu/boot.c at 0a301624c2f4ced3331ffd5bce85b4274fe132af · qemu/qemu · GitHub
<klange> in days long past, before device trees, there were ATAGs - ARM tags.
<heat> guess it's simpler than editing the fdt at boot time
<AmyMalik> nyan
<heat> /bin/nyancat
<klange> started as a terminal escape sequence test, now it's mostly a test of SIGINT
<clever> klange: if the .dtb file is missing, the firmware switches to ATAGS automatically
<klange> I've got my hardware's dtb and have identified other things in it - actually dumped the whole thing once to the framebuffer which was, in retrospect, a mistake
<klange> (it was very long, and this was before I turned on the MMU)
<klange> (so it took several seconds)
<clever> heh
<clever> i recently tried to hexdump the bootrom to the framebuffer, but its giving me trouble
<klange> I'm working off of a fresh "Raspberry Pi OS" SD card image, with the EXT4 partition removed, so I've got the full set of dtbs and overlays and a modern mostly-default config.txt
<clever> ah
<heat> does the rpi not have its own dtb in rom?
<heat> or is that not a thing in rpi land?
<clever> heat: there is no dtb in the rom, the firmware loads it from the fat32 partition on the SD card
<clever> there is not one byte of arm code in the rom
<klange> it mashes together multiple files to support different software-selectable configs, even
* klange should switch to high peripherals
<clever> heat: the boot rom runs on the VPU, and loads a stage1 .bin file, stage1 then brings ram online and loads a stage2 .elf to the VPU, stage2 then loads the kernel, initrd, dtb, patches the dtb, and then turns the arm core on
<clever> heat: so the kernel+dtb is already fully in ram when the arm core runs its 1st opcode
<heat> that is stupid
<clever> heat: thats because the arm core was more of an after-thought, an optional accelerator core hanging off the side of an otherwise already complete soc
<clever> the arm was never the boss of the show
<clever> its like those compute pci-e cards you can add to a desktop
<heat> except it is lol
<heat> and they had multiple socs to fix that
<clever> the VPU also has a lot of intel ME or amd PSP like features
<clever> where you can sandbox off the arm core, and contain untrusted arm kernels
<clever> from before trustzone was a thing
<klange> every iteration of the rpi is full of insanity
<clever> firmware wise, its improving with every iteration
<clever> they just arent changing those insane hw choices
pretty_dumm_guy has quit [Quit: WeeChat 3.4]
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
<clever> for example, the 3d core on the 2711 changed too much
<heat> just run UEFI
<heat> go full SBSA
<clever> so they skipped having a closed 3d stack, and just went directly to only mesa/open-source 3d
<clever> same for the h265 decode block, it skipped the blob stage and only has source
<clever> and they have moved the 2d and camera subsystems almost entirely into the linux source tree
<heat> qemu's arm64 machine virt actually loads ACPI tables
<heat> actually, it doesn't load them but puts them in fw_cfg
<klange> I think someone forgot an if somewhere
<clever> you can also get uefi+acpi on a pi4, there is a tianocore build for it
<heat> I know
<heat> just for the 3 and 4 though
<clever> yeah, only on aarch64 platforms
<heat> dunno if the pi zero 2 is supported
<heat> probably not
<heat> I don't see the point though, considering it's chainloaded by more firmware
<klange> i got distracted by this deadlock, I should really be working on bringing up xhci...
<clever> i had a ticket open with plans on how to load tianocore from spi flash
<clever> so it would be more seamless
<heat> they added/are adding spi flash support, finally
<clever> noobs derailed the ticket with complaints about sd card reliability
<clever> they then closed the ticket, saying they would never add that feature
<heat> ticket where?
<clever> and now they implemented exactly what i wanted :P
<bslsk05> ​github.com: SPI boot · Issue #95 · raspberrypi/rpi-eeprom · GitHub
<mrvn> heat: they might be making more money on the RPI than they ever did for the chip everywhere else.
<klange> close as won't fix → someone realizes actually that's a good idea → it happens anyway
<clever> heat: my request, was the ability to optionally load the official start4.elf from an external SPI flash chip, with my cover story being tianocore
<clever> heat: but secretly, i wanted it to be able to load a .elf from the internal SPI flash
<clever> heat: well, the net-install beta, does exactly what i wanted, it loads a .elf from internal SPI flash, lol
<heat> i do want tianocore :(
<clever> once i'm loading a custom .elf, i can load whatever else i want on the arm, like tianocore
<clever> heat: https://imgur.com/a/7iQWhad this shows each stage of the firmware, and what options can run at each, and what types they can load next
<bslsk05> ​imgur.com: Imgur: The magic of the Internet
<heat> well, ideally you just flash the firmware
<heat> no weird chainloading needed
<clever> for the pi0-pi3 range, the maskrom loads a bootcode.bin from the fat32 card, and ram is not online yet
<clever> there is no chip you can flash, and you must bring ram up to load more than 128kb of code
<heat> if they just loaded stuff from the SPI, you could have all the blobs in edk2 and use just that
<clever> and thats where some history comes in
<clever> the pi3 added usb-host and tftp drivers to the maskrom
<clever> it was full of bugs :P
<heat> no crappy sd card, no gpu loading stuff
<clever> pi4 learned from that mistake, and put the bootcode.bin on SPI flash
<clever> so they could change the code in the field
<heat> i guarantee that whatever they're doing in bootcode.bin, etc isn't much compared to intel platforms for instance
<clever> the old pi4 bootcode.bin, did dram init, and then loading start4.elf from one of sd, usb, tftp, or nvme
<clever> but with the latest beta firmware, they added https support to it, but it didnt fit within the 128kb limit
xenos1984 has quit [Read error: Connection reset by peer]
<clever> so, they cut bootcode.bin in half, bootcode.bin now does only dram init, and loading of bootmain
<heat> getting most of that in open source software would be way better than having it all closed
<clever> bootmain is a .elf in SPI flash (over 200kb), that deals with SD, usb, tftp, nvme, and https
<clever> heat: and thats where NDA's come into play, the rpi engineers cant even say how fast the dram controller can perform!
<clever> you expect them to release dram init source?
<heat> no, that's not the point
<heat> lots of platforms out there with mostly-oss firmware and a bunch of blobs attached to them
<clever> but with the new design they made, everything bootmain touches, is hw that is already openly documented
<clever> so i could re-implement bootmain, and make it opensource
<clever> oh, also, the toolchain to compile VPU firmware, is also behind NDA
<clever> so even if they could release the source, you cant compile it!
<mrvn> I'm always amazed about that in this day and age. It's like they don't want anyone to use the hardware.
<heat> make the vpu way smaller
<heat> get it booting the arm processor without dram, just running on CAR
<clever> heat: ive not tried yet, but i suspect you can do CAR on the arm side, however, the arm core cant touch any dram control registers
<clever> so you must still run VPU firmware to init the dram!
<mrvn> fit your whole app in cache
<clever> 128kb L2 cache is all you have
<heat> that's what's called "the big stupid"
<clever> heat: the arm wasnt meant to run the show, and for security, it cant talk to a lot of peripherals
<mrvn> clever: I thought it's only 64kB. Big win there.
<heat> "security"
<mrvn> security aka copy protection on hdmi and that crap?
<clever> heat: remember, this was for trustzone style security, where you keep the DRM keys in the vpu ram
<clever> so even if an attacker can gain kernel mode on the arm core, they cant rip your DRM keys
<mrvn> oh, and DVD decryption. that's oh so secret.
<clever> and netflix :P
<mrvn> well, netflix still is a thing
<clever> the bcm2835 was in the roku2 for example
<heat> rpi uefi needs to compile arm TF-A anyway
<clever> heat: but the "request from secure mode" signal wasnt wired to the dram controller, so TF-A cant actually be secure
<clever> its instead wired to the VPU, for the VPU's supervisor vs user mode
<clever> so secure ram can only be accessed by the VPU in supervisor mode
<heat> "The RPi4 has a single nonstandard PCI config region."
<heat> why
<clever> its a crappy pci-e controller geist has seen in many other SoC's
<clever> there is a single ecam region, and you use a reg to select which pci-e slot it maps to
<clever> "because it only has 1 lane, and there is only 1 choice 99.9% of the time"
<mrvn> .oO(except when you want an M2.key and SATA)
<clever> yeah, a pci-e switch is a violation of that assumption
<clever> so you must grab a lock, switch the ecam to the right device, then write to those regs, and release the lock
<clever> it's also likely assuming you only use the ecam once during init, and then never again
<clever> mrvn: also, according to geist, some SoC's put 2 of these controllers on the same chip, but because each is single-lane, you can never run it in 2-lane mode, its permanently 2 slots of 1x each
<heat> it's not an ecam though
<mrvn> clever: a 2 lane controller would be more expensive
<clever> and thats why its such a mess :P
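The "grab a lock, switch the window, access, release" sequence clever describes can be sketched roughly as below. Everything here is a simulation: sim_select stands in for the real select register, sim_cfg for the devices' config space, and none of the names correspond to actual BCM2711 registers or to any driver's API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simulated hardware: 4 devices with 16 config words each,
 * reachable only through a single window chosen by sim_select. */
#define SIM_DEVICES   4u
#define SIM_CFG_WORDS 16u

static uint32_t sim_cfg[SIM_DEVICES][SIM_CFG_WORDS];
static uint32_t sim_select; /* stand-in for the "which device" register */

/* One lock guards the select register and the window together. */
static atomic_flag cfg_lock = ATOMIC_FLAG_INIT;

static void cfg_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&cfg_lock, memory_order_acquire))
        ; /* spin until the previous holder releases */
}

static void cfg_lock_release(void)
{
    atomic_flag_clear_explicit(&cfg_lock, memory_order_release);
}

void cfg_write32(uint32_t dev, uint32_t word, uint32_t val)
{
    assert(dev < SIM_DEVICES && word < SIM_CFG_WORDS);
    cfg_lock_acquire();
    sim_select = dev;                /* point the window at the device */
    sim_cfg[sim_select][word] = val; /* access goes through the window */
    cfg_lock_release();
}

uint32_t cfg_read32(uint32_t dev, uint32_t word)
{
    assert(dev < SIM_DEVICES && word < SIM_CFG_WORDS);
    cfg_lock_acquire();
    sim_select = dev;
    uint32_t val = sim_cfg[sim_select][word];
    cfg_lock_release();
    return val;
}
```

The point of holding the lock across both steps is that another CPU cannot retarget the window between the select and the access.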
<mrvn> fixing all the RPi hardware choices would also cost a lot of money.
<clever> oh, another fun fact
<heat> reject arm64, embrace PC?
<clever> the VC6 core isnt actually a finished project
<mrvn> they might even have stuff licensed for use in the VC that they can't use in an ARM core.
<clever> the 2711, is a VC4 core, with the new v3d core bolted on
<clever> and some slight tweaks to support 4k hdmi and more ram
<mrvn> That's nothing new. What do you think every x86 chip is? It's always the old with something bolted on.
<clever> yep
<clever> and the entire pi1/pi2/pi3 lineup, the vc4 end is virtually identical
<clever> the only change they made in the whole lineup was the arm cores, and sometimes the rom
<clever> which reminds me, the pi1 and pi2 bootrom, are almost bit for bit identical!
<clever> the addition of 3 more arm cores, had zero impact on the bootrom
<mrvn> except for the model ID and peripheral address?
<clever> mrvn: the peripheral address is purely a software construct!
<clever> there is an mmu between arm-physical and vpu-physical
<mrvn> the whole bootrom is software
<clever> and that mmu decides where the peripheral lands
<clever> the peripheral address choice, is made by start.elf
<clever> 2 stages after the rom
xenos1984 has joined #osdev
<bslsk05> ​github.com: lk-overlay/arm.c at master · librerpi/lk-overlay · GitHub
Oli has quit [Ping timeout: 240 seconds]
<clever> mrvn: with 1 extra line of code, i can put the peripherals at BOTH addresses on every model :P
<mrvn> Maybe you could change your bootrom/start.elf code to change the address space layout to match basically every other arm system, with memory a bit further up.
gog` has joined #osdev
gog has quit [Killed (NickServ (GHOST command used by gog`))]
gog` is now known as gog
<clever> mrvn: for the vc4 line, the mmu can only control the lower 1gig of ram, 64 pages of 16mb each
<clever> well, lower 1gig of the address space
<mrvn> clever: so you're saying you can't have peripherals below 1GB or you lose too much shared memory?
<clever> both peripherals and ram must all exist in the lower 1gig of the addr space
<mrvn> clever: many ARMs have memory at 512MB.
<clever> oh, and the arm reset vector is 0
<clever> so page0 must be ram
<mrvn> so rom at 0, peripherals at 16MB, memory at 512MB
<mrvn> or pseudo rom
<clever> there is no arm rom, so you must map ram to the 0-15mb range
<mrvn> easy enough to map the first 16MB twice.
<clever> yep
<clever> but if you have 1gig of ram, you must cover at least 1 page (16mb) up with MMIO
<mrvn> That's how I would have done it anyway.
<clever> for the bcm2835, they put the MMIO after the 512mb of ram, nicely out of the way
<clever> then the ram got bigger, and rather than have an MMIO hole in the ram, they moved MMIO up to 1008mb (the top most 16mb page)
<mrvn> and then ram got bigger again and again and again
<clever> then ram got bigger, and rather than have an MMIO hole, they moved MMIO to 0xfe00_0000 (4064mb) i think
<clever> and now it cant run any further, without requiring 64bit support
<clever> so now you have an MMIO hole by default
<mrvn> and last the whole now has grown downwards
<clever> but if you turn on high-peripherals mode, the MMIO lands at a 64bit only addr, and now you have a solid 16gig of the address space dedicated purely to ram
<mrvn> -w
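The numbers in the exchange above can be checked with a little arithmetic. This sketch assumes the commonly documented ARM-side peripheral bases (0x20000000 on the bcm2835, 0x3F000000 on the later VC4 parts, 0xFE000000 on the 2711 in its default mode); the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MIB            (1024ull * 1024ull)
#define VC4_PAGE_SIZE  (16ull * MIB) /* size of one page of the ARM-to-VPU mmu */
#define VC4_PAGE_COUNT 64ull         /* number of such pages */

/* 64 pages of 16 MiB cover exactly the lower 1 GiB of the address space. */
static_assert(VC4_PAGE_SIZE * VC4_PAGE_COUNT == 1024ull * MIB,
              "VC4 mmu window should span 1 GiB");

/* Assumed peripheral base addresses as seen from the ARM. */
#define MMIO_BASE_BCM2835 0x20000000ull
#define MMIO_BASE_BCM2837 0x3F000000ull
#define MMIO_BASE_BCM2711 0xFE000000ull

/* Address expressed in whole MiB, as quoted in the conversation. */
uint64_t addr_in_mib(uint64_t addr)
{
    return addr / MIB;
}

/* Index of the 16 MiB page an address falls into (VC4 parts only). */
uint64_t vc4_page_index(uint64_t addr)
{
    assert(addr < VC4_PAGE_SIZE * VC4_PAGE_COUNT);
    return addr / VC4_PAGE_SIZE;
}
```

So 0x20000000 is the 512 MiB mark (page 32), 0x3F000000 is 1008 MiB (page 63, the topmost 16 MiB page), and 0xFE000000 is 4064 MiB, just under the 4 GiB limit of a 32-bit address space.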
<heat> do other pis work similarly?
<heat> like the orange pi and the rock pi
<clever> heat: i havent investigated those knock-offs
<mrvn> 16gig? wow, like that will never be exceeded. I mean we went from 256 to 512 to 1024 to 2048 to 4096 to 8192. We will never ever need more than 16384
<mrvn> heat: no. totally different chips. They only use the Pi for marketing.
<heat> yeah the rock pi looks solid
<heat> has a mali gpu
<heat> no pseudo boot processor and gpu
<clever> mrvn: the dram controller has been stated to max out at 16gig of ram, but nobody actually makes 16gig ddr4 chips, so the limit is currently 8gig
<mrvn> clever: so then they bolt on a second one.
<mrvn> wasn't there some stackable ram that's only limited by heat?
<heat> i dont limit no ram
<mrvn> you don't limit the ram, the ram limits you. :(
<clever> mrvn: thats actually what they did on the pi3
<clever> the pi3 is using ddr2, and they dont make 1gig ddr2 chips
<clever> so they have a pair of 512mb ddr2 dies in one epoxy package
<clever> each on half of the data bus
<clever> so its acting like a striped raid array
<mrvn> like all PCs
<clever> and the pi02, is just a pi3 soc, and a single 512mb die, now sharing the same epoxy package
<clever> so the soc and dram are bond-wired directly together
<clever> mrvn: but PC's have multiple ram chips per module, and multiple modules, while the controller on the rpi can only wire to a single chip
<clever> so it must have a much wider bus, possibly several slots wide
<clever> and drives the whole pair of slots in parallel, expecting them to both be the same size/speed
<clever> hence the whole mess of needing matching sticks in the right slots for optimal performance
<clever> i think the limiting factor then, is the width of the data bus coming out of the ram controller
[itchyjunk] has joined #osdev
elastic_dog has quit [Ping timeout: 240 seconds]
elastic_dog has joined #osdev
srjek has quit [Ping timeout: 256 seconds]
dude12312414 has joined #osdev
rustyy has quit [Quit: leaving]
dude12312414 has quit [Remote host closed the connection]
isaacwoods has quit [Quit: WeeChat 3.4]
gog has quit [Ping timeout: 240 seconds]
toastloop has joined #osdev
troseman has quit [Ping timeout: 272 seconds]
heat has quit [Ping timeout: 260 seconds]
rustyy has joined #osdev
xenos1984 has quit [Quit: Leaving.]
rustyy has quit [Client Quit]
rustyy has joined #osdev
Burgundy has joined #osdev
xenos1984 has joined #osdev
Burgundy has quit [Ping timeout: 240 seconds]
ElectronApps has joined #osdev
eroux has joined #osdev
jjuran has quit [Ping timeout: 240 seconds]
pounce has quit [Ping timeout: 272 seconds]
not_not has joined #osdev
pounce has joined #osdev
[itchyjunk] has quit [Read error: Connection reset by peer]
Jari-- has joined #osdev
bxh7 has joined #osdev
the_lanetly_052 has joined #osdev
puck has quit [Excess Flood]
puck has joined #osdev
k8yun has joined #osdev
k8yun has quit [Remote host closed the connection]
the_lanetly_052 has quit [Ping timeout: 256 seconds]
Belxjander has joined #osdev
toastloop has quit [Quit: Leaving]
toastloop has joined #osdev
wolfshappen has quit [Ping timeout: 272 seconds]
wolfshappen has joined #osdev
ddevault has quit [Ping timeout: 245 seconds]
tom5760 has quit [Ping timeout: 240 seconds]
gjnoonan has quit [Ping timeout: 256 seconds]
exec64 has quit [Ping timeout: 240 seconds]
jjuran has joined #osdev
jleightcap has quit [Ping timeout: 240 seconds]
jleightcap has joined #osdev
tom5760 has joined #osdev
sm2n has quit [Ping timeout: 250 seconds]
exec64 has joined #osdev
patwid has quit [Ping timeout: 256 seconds]
ddevault has joined #osdev
gjnoonan has joined #osdev
sm2n has joined #osdev
patwid has joined #osdev
Oli has joined #osdev
toastloop has left #osdev [Leaving]
xenos1984 has quit [Quit: Leaving.]
<not_not> X 86 or write os for my pi?
<hmmmm> there's certainly more baggage on the x86
<hmmmm> it would be easier to focus on operating system concepts with the latter
<not_not> Ye
<not_not> X86 has to be a city by now
<not_not> The slums of 16 bit real mode
<hmmmm> even the older stuff isn't particularly fun
<hmmmm> shutting off a computer is quite a feat
<not_not> Nah lol
<not_not> Wow
<not_not> Arm32 was my asm virginity
<hmmmm> ive never actually gotten to the point where i implemented the acpi dsdt parser
<hmmmm> so my hacky workaround was to point the reset start address at a routine that used the bios to shutdown and then intentionally triple fault
<not_not> I dont even know what that is i barely wrote my first parser
<not_not> Ow
<hmmmm> i grew up in a world where everything's x86
<not_not> Osdev is a new world but i did some close to the metal stuff on the gba when i was 12
<hmmmm> wow neat
<hmmmm> i think i was doing visual basic when i was 12
<not_not> I was 10 when vb but first day of school
<not_not> Middle school or jr high or whatever
<not_not> My classmate dissed me so hard for writing vb shit
<not_not> Told me he was a hacker and he wrote an os
<not_not> And that he had hacked a television station
<not_not> And there was another hacker and they were fighting over the mouse
<not_not> Years later i saw 1995 hackers
<not_not> He lied he was just describing that scene when zero cool and acid burn were in the same tv system
<klange> coding in vb > pretending you did a thing in a movie
<not_not> Lol
<not_not> Ye was so lol when i first saw that movie
<kazinsal> I oughta sit down and rewatch that movie
<klange> when I was a wee klange and wasn't allowed on the dialup, and all I had was a busted old Compaq with a slot-load Pentium, VBA in Excel was the best I could do.
<kazinsal> it's probably been a decade or so
<not_not> Ye me too
<not_not> Ahh angelina joulie boobs
<not_not> Vb is a good language
<kazinsal> I did recently rewatch Starship Troopers, and it continues to be my favourite Paul Verhoeven film
<kazinsal> Such a wonderfully batshit satire
<not_not> We dont really have any visual something on linux
<klange> There's Gambas
<not_not> Vb very good for beginners
Burgundy has joined #osdev
<klange> Which is a very similar language with a very similar bit of tooling, but I think it's relatively new compared to the days when I was hacking together forms in Excel.
<not_not> Mmmm
<klange> I think "the kids these days" would just hack together web stuff; React is today's Visual Basic.
<not_not> Ahh
xenos1984 has joined #osdev
<klange> And probably with all the same ireful aftereffects of knowing how to hack together a GUI but not actually knowing "software".
<not_not> My cousin codes everything in haskell these days
<not_not> Ye unaware of the dangers of buffer overflows
<not_not> Off by one errors
<klange> Anyway, this lock shit has been so aggravating, and I don't think it was even actually a real deadlock, it was insufficient atomics...
<klange> I did "deadlock detection" the stupid way by having the acquire-loop check the system timer and panic after 5s, dump lock owners for all the critical stuff, etc.
<klange> And it revealed nothing directly. I could see three cores waiting on the main dumb lock for managing timed sleeps
<klange> And it revealed which core owned that lock and what function it was in. So I dug deeper into that and was timing every aspect of the function... and everything was reporting it was completing... immediately after five seconds.
<klange> Seemingly, the SGI from the panic was 'fixing' the problem.
<j`ey> SGI?
<not_not> Lol u can tell im the userspace brAt
<klange> ==IPI, software generated interrupt sent between processors
<j`ey> oh ok, I wasn't sure if that was the usage you meant, too many acronyms
Oli has quit [Ping timeout: 250 seconds]
<klange> so best as I can tell, one core would get stuck spinning on a lock that it had actually successfully acquired, causing the other cores wanting that lock to trip the deadlock detection, the first to do that would send the SGI, and that would unbork the test-and-set loop, and it would return and then all its timing functions would say it spent five seconds in there
<not_not> It lied?
<klange> lied, or was stuck fighting with the exclusive monitor until the interrupt did the equivalent of slapping it in the face
<not_not> Ahh
<klange> the GCC manuals say __sync_lock_test_and_set is supposed to enforce acquire semantics, but the instructions it was spitting out don't seem sufficient for that
<not_not> Im gonna get pen and paper
<klange> which, fine, I need to move away from that, it's considered "legacy" and apparently was only actually defined for Itanium and not even the x86 I was using it on previously
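A minimal sketch of what moving off the legacy __sync builtins might look like, using C11 <stdatomic.h> with explicit acquire/release ordering. The type and function names are made up, and the bounded variant counts spins rather than reading a system timer, which is a simplification of the 5-second check klange describes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin lock type built on a C11 atomic_flag. */
typedef struct {
    atomic_flag held;
} spin_lock_t;

#define SPIN_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once. Acquire ordering: accesses inside the critical section
 * cannot be reordered before the flag is observed as taken. */
bool spin_try_lock(spin_lock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin at most max_spins times. Returns false if the lock was never
 * obtained, so the caller can panic and dump lock owners instead of
 * hanging forever. */
bool spin_lock_bounded(spin_lock_t *l, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (spin_try_lock(l))
            return true;
    }
    return false;
}

/* Release ordering: writes made while holding the lock become visible
 * to whoever acquires it next. */
void spin_unlock(spin_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Whether this cures the specific stall described above is not something the sketch can show; it only makes the intended ordering explicit instead of relying on what __sync_lock_test_and_set happens to emit.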
GeDaMo has joined #osdev
<not_not> Bah im in insane asylum
<not_not> Gonna try hacking myself out
<kazinsal> hack the gibson
<not_not> Or the very least change my meds
<GeDaMo> How can you tell the difference between inside and outside? :|
<kingoffrance> there is a fence around one of them
<kingoffrance> or a bag
<kazinsal> nice white padded walls
<klange> i could use more fences
<GeDaMo> Is the fence to keep people out or in?
<not_not> Outside i can illicitly get enough benzo
<not_not> Both
<kingoffrance> that's a hercules question GeDaMo
<kingoffrance> are you hercules ? ghostbusters says yes
<kingoffrance> *says you are supposed to say yes
<mrvn> not_not: sketch?
<not_not> All i know is no snipers can shoot in this window
<not_not> Ye mrvn
<not_not> Gonna draw a box and think outside of it
<kazinsal> mard mk 2
<not_not> Bastards tho my ssh key is diffrent in the insane asylum
<mrvn> not_not: think outside, no box required *tree emoji*
<not_not> Xd but im locked up and dad is building a narrative of me being insane
<mrvn> klange: ARM has this little behavior that two processes can write a var and read the result and never see a change unless you force it out of the write buffers.
<not_not> Well i smokef pot once to try and fin a singular concept that can create binary
<not_not> By process of elimination i found out it was not not not
<not_not> Or not
<not_not> And not not is is
<not_not> So u can use one word to not define binary but not not defining binary and implicitly declare and or or
<not_not> Packing the universe into one word
<kazinsal> most of us just assign one voltage level to 1 and another to 0
<mrvn> seems like the pot is still going strong
<not_not> Havent smoked in years
<GeDaMo> I watched a video about Flash memory which apparently uses multiple voltage levels to store values
<not_not> Ooh nice
<mrvn> GeDaMo: one to read, one to write, one to erase as far as I know.
<not_not> My music teacher told me the computer is an analogue machine a recently and i shat brix
<not_not> Well i found out not can do all actions of god
<not_not> Generate operate and destroy the universe
<mrvn> not_not: what is 3 * 7?
<not_not> 10101
<not_not> My fav number
<not_not> U play dala wiw?
<mrvn> so not a bot, just not not insane
<not_not> Well the turing test backfired
<bslsk05> ​'How do SSDs Work? | How does your Smartphone store data? | Insanely Complex Nanoscopic Structures!' by Branch Education (00:17:54)
<kazinsal> mrvn: thus my suggestion of a second revision of whatever software mard runs (ran?) on
<not_not> Well not not is the ideal language for qbits
<kazinsal> a slightly more advanced neural net, but still not able to be indistinguishable from actual human traffic
<not_not> God i miss my qbits
<not_not> Ok well if you program ur neurons to cognize brainfuck
<not_not> U gotta remember to start counting at 0
<not_not> There is a buffer overflow in the brain
<not_not> Sec looking up mard
dormito has joined #osdev
<mrvn> Some time ago I saw a nice article where they took a neural net to recognize dogs and run an image backwards through it so it would "draw" dogs. Did someone do the same with text and get not_not?
<not_not> Ahh i seen that
<not_not> Idk i did it with my brain tho but very excited if someone got not not
<not_not> Man the nurses are hot for me
<not_not> And im like fuck off i dont need ur womb im giving birth to ai
<not_not> Unless we process her and make her the oracle
<bslsk05> ​openai.com: DALL·E: Creating Images from Text
<not_not> Well the brain can get not not not by doing procesd of elimination on the proces of elimination
<not_not> But the brain has non mechanical parts
<not_not> Like teleporting information
<not_not> Good she's gone
<not_not> Its 12 o clock at noon going zzz mode
eroux has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
qubasa has quit [Ping timeout: 252 seconds]
isaacwoods has joined #osdev
zaquest has quit [Ping timeout: 272 seconds]
pretty_dumm_guy has joined #osdev
isaacwoods has quit [Quit: WeeChat 3.4]
<mrvn> GeDaMo: those images are supposed to be generated by the neural net and not just picked out of the samples?
isaacwoods has joined #osdev
<GeDaMo> That's how I read it
<mrvn> they look way too good
<mrvn> "a collection of glasses is sitting on a table" gives either perfect wine glasses or perfect reading glasses but never an oddly shaped mix? Can't believe that.
<bslsk05> ​en.wikipedia.org: DALL-E - Wikipedia
<GeDaMo> «DALL-E was developed and announced to the public in conjunction with CLIP (Contrastive Language-Image Pre-training),[1] a separate model whose role is to "understand and rank" its output.[3] The images that DALL-E generates are curated by CLIP, which presents the highest-quality images for any given prompt»
<not_not> Oh nice its beutifull
<not_not> Off the psychologist
<not_not> She's hot tho
<not_not> Smart didnt fall for the usual tricks
<not_not> Asked me if she thought i was being surveiled
<not_not> Or if i felt i was in control of my own thoughts
<not_not> Answered with a question but she rejected questions immediately
dennis95 has joined #osdev
<not_not> Shit on me
<not_not> What a coincidence
<bslsk05> ​'Rabbit' by Chas & Dave - Topic (00:02:25)
<not_not> Dennises must be drawn to system development like flies to dead flesh
Jari-- has quit [Ping timeout: 272 seconds]
not_not has quit [Ping timeout: 272 seconds]
Oli has joined #osdev
cvemys has joined #osdev
eroux has joined #osdev
hegz has joined #osdev
cvemys has quit [Quit: Leaving]
Oli has quit [Ping timeout: 256 seconds]
gog has joined #osdev
garrit has joined #osdev
hegz7 has joined #osdev
nyah has joined #osdev
zaquest has joined #osdev
gwizon has joined #osdev
gwizon has quit [Client Quit]
[itchyjunk] has joined #osdev
troseman has joined #osdev
mahmutov_ has joined #osdev
mahmutov_ is now known as mahmutov
zaquest has quit [Ping timeout: 250 seconds]
Vercas has quit [Write error: Connection reset by peer]
gxt has quit [Remote host closed the connection]
gxt has joined #osdev
Vercas has joined #osdev
zaquest has joined #osdev
kori has quit [Quit: zzz]
<gog> so using compiler-rt, would i just need libclang_rt.builtins to be the equivalent of a bare metal libgcc?
<gog> also that ubsan stuff is pretty interesting
Oli has joined #osdev
sonny has joined #osdev
<mrvn> you need everything that gets reported as unresolved symbol on link
<gog> yeah makes sense
<g1n> hello
ElectronApps has quit [Remote host closed the connection]
<GeDaMo> Hi g1n :)
<g1n> i haven't tried to make mm last week, going to do that this week
<sonny> what is mm?
<g1n> memory manager
<g1n> (malloc and friends)
<sonny> oh
<sonny> I thought it was machine monitor
<sonny> cool
<g1n> lol
<g1n> what can be "end of memory" status (in header)?
<sonny> Is there any scheme that physically partitions memory when handing it out? It's probably super ineffective
Oli has quit [Ping timeout: 240 seconds]
<mrvn> g1n: traditionally it's NULL or nullptr
<mrvn> sonny: NUMA partitions physical memory according to distances to different cores.
<g1n> mrvn: oh makes sense, seems i will need to make more "arrays" and "structs"
<g1n> lol
<mrvn> g1n: malloc is not something you should be using in the kernel.
<GeDaMo> malloc is usually a user-level function built-on top of mmap (or brk)
<mrvn> yeah, and forget brk
patwid has quit [Remote host closed the connection]
sm2n has quit [Remote host closed the connection]
gjnoonan has quit [Remote host closed the connection]
exec64 has quit [Remote host closed the connection]
tom5760 has quit [Remote host closed the connection]
jleightcap has quit [Remote host closed the connection]
ddevault has quit [Remote host closed the connection]
Brnocrist has quit [Ping timeout: 272 seconds]
exec64 has joined #osdev
tom5760 has joined #osdev
sm2n has joined #osdev
gjnoonan has joined #osdev
ddevault has joined #osdev
patwid has joined #osdev
jleightcap has joined #osdev
<sonny> mrvn: thanks, I'll look into that
<mrvn> Hacking in movies: % TELNET <HUNTLEY_NET> % SSH CLIENT
<mrvn> ********ACCESS. GRANTES********
<mrvn> OPENING PORT: 47534534534 got to love movies.
<mrvn> Also very important while hacking: A call graph (stack backtrace) for your hacking tool. Can't hack without that.
<sonny> I think a scene where someone compromises a mainframe would be great
<sonny> "hey look at this, they still use mainframes lol"
<bauen1> iirc wargames actually had some decent "hacking scenes" with war dialing that seemed to actually be pretty accurate
<sonny> noted
<GeDaMo> Sneakers is also good
<g1n> mrvn, GeDaMo: so i need to find how to make smth like mmap? where to start?
<GeDaMo> mmap finds a free physical page and maps it to a free page in the processes virtual space
<g1n> wdym by "processes virtual space"?
<g1n> like virtual space per process?
<GeDaMo> Yes
<g1n> oh
<g1n> it should be one for kernel right?
<mrvn> g1n: do you have user processes already?
<g1n> no of course
<mrvn> then why would you need mmap?
<g1n> then what should i do?
<mrvn> write code to map pages, create a stack and a user process
<g1n> but why now?
<g1n> i thought to make fs first
<mrvn> then maybe implement a log_string syscall
<g1n> oh
<g1n> i have no idea about userland yet
<g1n> i thought to make memory things, then filesystem, then userland
<g1n> and there syscalls and other cool things
<mrvn> it's not userland as in separate programs. just some code you run with user privileges.
<g1n> oh
<g1n> but still, am i ready?
<mrvn> probably not
<g1n> so, why doing it?
<mrvn> that's the challenge
<g1n> oh
<g1n> lol
<g1n> yes
xenos1984 has quit [Remote host closed the connection]
<g1n> vfs needs allocing, isn't it?
xenos1984 has joined #osdev
heat has joined #osdev
<mrvn> no
<g1n> oh
<g1n> really???
<geist> you also probably want to do task switching and whatnot first
<mrvn> it certainly helps, but no, not needed
<geist> fs is fairly late in the game, IMO
<mrvn> geist: I think he has kernel threads already
<g1n> geist: why is fs lite lol? i need to access files isn't it?
<g1n> mrvn: no, i don't have any multitasking
<geist> because you dont *need* it for the other stuff
<heat> yes but that's only useful when you can load user programs
<geist> ie, you can build a user space even if you just hard compile in some programs, etc
<g1n> oh
<g1n> makes sense
<geist> and yeah you probably want to tackle multitasking fairly soon
<geist> since that affects how you build the rest of the subsystems
<g1n> hmm, so i need to check multitasking? also, i currently have issues with setting up keyboard and timer for some reason (page faults working, so idt should works). I thought about setting up framebuffer too.
<heat> then get the timer working first
<heat> and/or the keyboard
<klys> what to do first appears to be quite an issue. there are a few things you can do. you should probably do one of those things.
<g1n> heat, klys: ok
<heat> i like mmu -> interrupts -> scheduling -> userspace
<heat> it's kinda how I did it first time around, and that's how I did it for riscv
<mrvn> mmu, exceptions, interrupts, (user) mode switch, syscall, scheduling
<g1n> ok, so currently i need to fix timer/keyboard to make total sure that idt is working
<heat> geist, kinda out of nowhere but how fast is scudo?
<mrvn> My first FS was: std::map<std::string, std::vector<uint8_t>> basically
vin has joined #osdev
<g1n> so the hardest thing is to find addr to fs, that can be given by grub, isn't it?
<heat> the hardest thing is to design a proper fs layer
<g1n> oh, yes makes sense
<mrvn> Actually the really first one was: struct File { File *next; char name[64]; size_t size; uint8_t data[ /* size */ ]; };
<heat> std::map<string, vector<uint8_t>> is crap
<heat> turns out filesystems are way more complex than a list of paths and a bunch of bytes
<g1n> i think as first fs i will try tar (please not kill me, i will fix it, at least planning)
<g1n> lol
<heat> suggestion: don't
<mrvn> heat: totally, but walking a linked list for files gets tiresome.
<heat> tar isn't a filesystem and isn't suitable as one
<g1n> ok
<heat> I recommend you create a ram filesystem (like linux's tmpfs for instance) and unpack the tar to that
<g1n> initrd?
<heat> yes unpack the initrd to the ram fs
<mrvn> just use something like my struct File above and link it into the kernel so your VFS has some data to access.
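mrvn's `struct File` from earlier in the discussion can be sketched as a walkable list with a name lookup. This is a rough illustration only: `file_add` and `file_find` are hypothetical helpers, and the `malloc` call is just there so the sketch runs in a hosted environment; in a kernel the nodes would more likely be static data linked into the image, as mrvn suggests.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct File {
    struct File *next;
    char name[64];
    size_t size;
    uint8_t data[];   /* flexible array member: `size` bytes follow the header */
};

/* Hypothetical helper: copy `size` bytes into a new node and push it on the list head. */
static struct File *file_add(struct File **head, const char *name,
                             const void *data, size_t size) {
    struct File *f = malloc(sizeof *f + size);
    if (!f)
        return NULL;
    strncpy(f->name, name, sizeof f->name - 1);
    f->name[sizeof f->name - 1] = '\0';   /* always terminate, even if truncated */
    f->size = size;
    memcpy(f->data, data, size);
    f->next = *head;
    *head = f;
    return f;
}

/* Linear walk: fine for a handful of files, which is the point of the exercise. */
static const struct File *file_find(const struct File *head, const char *name) {
    for (const struct File *f = head; f; f = f->next)
        if (strcmp(f->name, name) == 0)
            return f;
    return NULL;
}
```

The lookup is O(n) in the number of files, which is the cost heat objects to and mrvn accepts for a first version.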
<g1n> ok, thanks, i am going to fix keyboard first
<g1n> mrvn: i thought of doing like that, to test that i am doing it correctly, and then make proper one
<g1n> also, vfs could be useful if doing like real unix (everything is a file)
<mrvn> g1n: your goal should be to get something working with the minimum of code. Understand how it works, design a proper interface and only over time replace the early stuff with proper code.
<g1n> ok
<heat> i don't agree
<heat> considering that that's totally not how filesystems work
<mrvn> g1n: So if you think you need an FS then make one that can handle 10 files compiled into the kernel image and go from there.
<g1n> heat: oh
<mrvn> heat: how do filesystems work? They can give you some data associated with the name of a file. Or store some data. Write support can wait.
<heat> filesystems have directories and paths
<heat> they're trees, not lists
<mrvn> heat: Do they? Not really, historical speaking.
<GeDaMo> Directories are files containing references to other files
<heat> that's an impl detail
<GeDaMo> Also, you don't need directories in that sense
<mrvn> heat: a list is a tree, just a really bad one.
<heat> so lets make a good one instead :)
<mrvn> heat: sure, spend 1 year implementing zfs before you implement multitasking.
<heat> <heat> I recommend you create a ram filesystem (like linux's tmpfs for instance) and unpack the tar to that
<GeDaMo> "In early MCP implementations, directory nodes were represented by separate files with directory entries, as other systems did. However, since about 1970, MCP internally uses a 'FLAT' directory listing all file paths on a volume"
<bslsk05> ​en.wikipedia.org: Burroughs MCP - Wikipedia
<heat> who's this heat guy and why does he say stuff?
exec64 has quit [Remote host closed the connection]
jleightcap has quit [Remote host closed the connection]
patwid has quit [Remote host closed the connection]
ddevault has quit [Remote host closed the connection]
sm2n has quit [Remote host closed the connection]
gjnoonan has quit [Remote host closed the connection]
tom5760 has quit [Remote host closed the connection]
gwizon has joined #osdev
<mrvn> GeDaMo: there are no directories, only users. each user has a flat list of files. :)
<GeDaMo> Users don't exist, they're just fairy stories used to scare programmers! :P
Oli has joined #osdev
<g1n> lol
<gog> there is no user, only zuul
<g1n> zuul?
<heat> gog, btw the clang builtins is the thing you want to use to replace libgcc
Oli has quit [Read error: Connection reset by peer]
<gog> heat: yes ty i objdump'd it and had a look :>
<heat> note that clang doesn't support crtbegin/end nor the good old .init and .fini sections
<gog> i don't use those anyway
<gog> at least not currently
<heat> if you want to call constructors, try iterating through init_array and fini_array
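Iterating through init_array amounts to calling each function pointer in a range. The sketch below assumes the range is passed in explicitly; in a kernel the bounds would typically come from linker-script symbols (GNU ld's default scripts use the names `__init_array_start` and `__init_array_end`, but a custom kernel linker script has to define them itself). `run_init_array`, `demo_array` and the two demo constructors are hypothetical names for illustration.

```c
#include <stddef.h>

typedef void (*ctor_fn)(void);

/* Call every function pointer in [start, end), in order. */
static void run_init_array(ctor_fn *start, ctor_fn *end) {
    for (ctor_fn *fn = start; fn != end; ++fn)
        (*fn)();
}

/* Demo state so the call order is observable; not part of any real toolchain. */
static int init_trace;
static void ctor_a(void) { init_trace = init_trace * 10 + 1; }
static void ctor_b(void) { init_trace = init_trace * 10 + 2; }
static ctor_fn demo_array[] = { ctor_a, ctor_b };

/* In a kernel, assuming the linker script exports the bounds, the call
 * would look roughly like:
 *     extern ctor_fn __init_array_start[], __init_array_end[];
 *     run_init_array(__init_array_start, __init_array_end);
 */
```

fini_array can be walked with the same loop at shutdown, conventionally in reverse order.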
ddevault has joined #osdev
exec64 has joined #osdev
tom5760 has joined #osdev
sm2n has joined #osdev
gjnoonan has joined #osdev
<mrvn> heat: on x86 too? I thought init_array was an ARM thing
jleightcap has joined #osdev
patwid has joined #osdev
<heat> no, it's a toolchain thing
<gog> it's on x86 too, i've done some c++ experiments
<gog> gcc does make init_array if you tell it to iirc
<heat> oh and you need to explicitly enable the init_array and fini_array support when building a cross compiler iirc
<heat> because it can't autodetect some stuff
<gog> yes
<gog> i think it enables support for it by default, just not generation
<gog> but it's been a minute since i played with that
<heat> btw not sure if you read what I said yesterday but I recommend you just ship compiler_rt with your kernel
<mrvn> gog: gcc makes them per default on arm
X-Scale` has joined #osdev
<gog> yeah i might do a git submodule for that
<gog> i'm going to be working toward using clang only i think
<heat> omg omg omg omg omg omg omg
<gog> my toolchain is rather old and a pita to set up
X-Scale has quit [Ping timeout: 250 seconds]
X-Scale` is now known as X-Scale
<gog> and i won't need it after i ditch gnu-efi
<heat> 1) I recommend you keep gcc around for better portability
<heat> 2) what does gnu-efi have to do with your gcc toolchain?
<gog> i only keep it around for the objcopy and reloc nonsense
<gog> i know it'd also work on clang
<gog> but with clang i can just make an efi application directly
<gog> and the only library function i use is Print() which i will not need once i finish my work on my printf implementation
<heat> fair enough
<gog> the NIH is very strong
<GeDaMo> When are you writing your own C compiler? :P
<gog> eventually™
<mrvn> gog: I first only had puts() and the put_hex32()
<heat> i'm very happy you embraced clang, the best permissively licensed compiler with great code gen and runtime libraries, and modular code too!
<gog> it caught an overlapping comparison error i made and i was sold
<gog> gcc does not even with every warning enabled
<heat> try clang-tidy :0
<gog> o:
<gog> will look into it
<heat> clang-format is also great
<heat> get a compile_commands.json and boom, clangd
<heat> great code completion and IDE-ish support
<gog> yeah i've been investigating improving my nvim config to have better code completion
<gog> and clang is a part of that
<gog> having to dig up a header every time i can't remember a struct field is getting tiresome
<mrvn> any decent editor can autocomplete for you
<heat> when you get to user space you'll be able to enjoy the great runtime libraries that clang brings to the table
X-Scale` has joined #osdev
<j`ey> is heat a paid clang spokesperson?
<heat> memory? asan. ub? ubsan. concurrency? tsan
<heat> i wish they paid me
<gog> i think nvim has pretty sophisticated code-completion built in i just don't know how to use it
<heat> usually editors rely on a language server to do stuff
X-Scale has quit [Ping timeout: 256 seconds]
X-Scale` is now known as X-Scale
<heat> at least fancier stuff
<gog> yes
<heat> clangd is one of the language servers
<heat> intellisense just runs as a plugin I think
<heat> it's also slow as shit
<gog> i tried vs code and was not impressed
<gog> sorry not sorry
<geist> heat: re: scudo i'm not so sure it's fast in as much as it's secure
<heat> what do you mean you didn't like the best permissively licensed open source extensible editor?
<geist> though i guess that wasn't implied with the question
<heat> I vaguely read that it's both secure and competitive in terms of perf with jemalloc and friends
<gog> honestly i like editing in the terminal and being able to switch between editing and testing with keystrokes ra
<heat> but I can't find numbers
<heat> geist: also I think scudo sacrifices a bit of safety for speed, lots of per thread state for instance
<geist> yah that's probably correct
<heat> things musl says they won't do because that can screw with global malloc state
h4zel has joined #osdev
<mrvn> how is per thread state less safe?
<heat> "unsynchronized per-thread state inherently sacrifices global consistency for performance and makes it impossible to detect a lot of types of memory usage errors (DF/UAF, etc) that could otherwise be caught."
<heat> "However musl has the additional constraint of being compatible with small/very-low-memory environments. Lack of global consistency inherently means you will end up using memory less efficiently and requesting significantly more from the system. The new malloc about to go upstream in musl is, to my knowledge, the first/only advanced hardened allocator using slab-type design rather than traditional dlmalloc type split/merge, but also designed for
<heat> extremely low overhead/waste at low to moderate usage rather than extreme performance. And in the vast majority of applications, this is perfectly reasonable. Even Firefox for example does very well with it."
<heat> oops
<sonny> gog so you don't like using the mouse?
<mrvn> Oh yeah, firefox. The perfect app to test how it performs in small/very-low-memory environments.
<gog> not when i'm coding
<sonny> ok
<sonny> hmm
<heat> mrvn, firefox is a good example of a big app with lots of allocations
<mrvn> heat: s/per-thread state// s/global// s/for performance and//
<sonny> guess it's a good idea to leave room for the user to make their own key commands
<mrvn> heat: yes, big app, lots of allocations, a little per thread overhead in the malloc code becomes totally irrelevant and all the gains shine.
<mrvn> heat: I would say firefox is one of the best test cases to showcase the per-thread allocations.
<heat> that's not the point
<heat> musl's malloc isn't tuned for performance at all
<heat> if they wanted a fast malloc, they could get one
<mrvn> heat: My argument is with "*Even* Firefox for example does very well with it.". Obviously. That's the kind of app you want this stuff for.
<heat> that's about musl's malloc, not a per-thread thing
<mrvn> oh, your paste looked like it was all about the same thing
<heat> it is
<heat> it's musl's author explaining the reasoning behind the slowness in musl's malloc
<heat> and the lack of per-thread state
<mrvn> heat: you forgot to paste the not having it part. :)
<heat> yeah it's got some context
<bslsk05> ​news.ycombinator.com: Why does musl make my Rust code so slow? | Hacker News
<j`ey> shame that jemalloc bakes the PAGE_SIZE into the binary at compile time
<mrvn> heat: The "unsynchronized" line is still stupid though. Anything unsynchronized will just crash in any multithreaded app.
<mrvn> heat: What musl has to compete with is synchronized per-thread state
<heat> i'm tired of shilling for the llvm foundation
<heat> they don't even pay me
<mrvn> heat: gcc pays (paid) me.
<heat> j`ey, scudo sucks, have you heard of mimalloc, the best allocator ever written?
* mrvn was hired by GNU for the princely sum of 10 stickers.
<heat> way better than jemalloc
<heat> also way smaller
<j`ey> Ive heard of it, never used it
<sonny> what about snmalloc?
<bslsk05> ​microsoft/snmalloc - Message passing based allocator (72 forks/679 stargazers/MIT)
<heat> sucks, use mimalloc
<heat> the best permissively licensed memory allocator with a special focus on speed
<mrvn> Isn't it sad that the best way to transfer copyright internationally is claiming your employees' work for yourself?
<sonny> well wow microsoft has been doing a lot of allocators :D
<heat> don't worry, the windows allocators still suck
<heat> everything is still normal
<sonny> lmao
<sonny> have you guys heard of the unicorn emulator?
<bslsk05> ​www.unicorn-engine.org: Unicorn – The Ultimate CPU emulator
<heat> anyway malloc sucks, use slub
<heat> the non-permissively licensed high performance allocator suitable for kernels
<heat> better yet, allocate whole pages
<j`ey> LARGE pages
<heat> HUGE PAGES
<heat> FUCKING HUGE
<heat> HUUUUUUUUUGE
<heat> pages so huge you'll lose your shit
dennis95 has quit [Quit: Leaving]
<gog> 512GiB pages
<bslsk05> ​en.wikipedia.org: Page (computer memory) - Wikipedia
<gog> oh shit RISCV64 actually supports 512GiB pages if you have large enough address space bits
<GeDaMo> Thinking ahead :P
<hmmmm> what is the point in having pages at that granularity
<gog> idk really
<gog> seems like overkill
mahmutov has quit [Ping timeout: 272 seconds]
<j`ey> aarch64 too
<heat> hmmmm, for the phys map
<mrvn> At some point it's just simpler to copy&paste the page table design for each level and you just get support for pages at any level for free.
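The page sizes being discussed fall out of simple arithmetic when every table level has the same layout. A small sketch, assuming a 4 KiB base page and 512 entries (9 index bits) per table level, as on x86_64, RISC-V Sv39/Sv48 and AArch64 with the 4K granule; `page_size_at_level` is a hypothetical helper name.

```c
#include <stdint.h>

/* Size in bytes of a leaf mapping placed at `level` tables above the base page,
 * assuming 4 KiB base pages and 9 index bits per level:
 *   level 0 -> 4 KiB, level 1 -> 2 MiB, level 2 -> 1 GiB, level 3 -> 512 GiB */
static uint64_t page_size_at_level(unsigned level) {
    return UINT64_C(4096) << (9u * level);
}
```

Each level multiplies the mapping size by 512, which is why a 512 GiB page appears as soon as a fourth level allows leaf entries.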
mahmutov has joined #osdev
<heat> the arm64 48-bit address space's phys map is 128TB long
<heat> 512GB pages are obviously useful here
<mrvn> gog: if you have a compute cluster with TB of memory per node running a single application then 512GiB pages sound like a smart thing. Only needs 1 slot in the TLB.
<gog> yes
<gog> that is the only existing application i can think of for that
<mrvn> gog: the other is when your kernel maps all physical memory for easy access
<mrvn> or at bootup
<gog> true true
<gog> but 1GiB pages would be ok for that too, but i guess that eats up more of the precious few TLB slots for 1GiB pages
<mrvn> I was tempted to use 1GB pages on amd64 but not every cpu has it
<gog> i can't think of anybody i know that has more than 32GiB of memory in their rig
<mrvn> gog: 1GiB page needs more levels of page tables. More work to set up. Slower to look up on fault too.
<j`ey> probably a few of us in here :P
<gog> perhaps
<gog> i think geist has a rig with 64GiB?
<mrvn> gog: MemTotal: 64851252 kB
<heat> it was doug16k I think
<mrvn> heat: 16k 4k pages?
<gog> odamn
<mrvn> oh, wait, wrong order of magnitude
<j`ey> I have 64G too
<j`ey> Im assuming geist's thunder x2 has 128G at least
<mrvn> What I don't have is some insane gaming GPU with 16GB of memory.
<mrvn> And I say "gaming" because if you stick a "for data center use" sticker on it the price goes up 50%
<heat> if you stick a gaming sticker on it the price goes up by 200% :P
<j`ey> RGB is expensive
bslsk06 has joined #osdev
puckipedia has joined #osdev
puckipedia has quit [Remote host closed the connection]
bslsk06 has quit [Client Quit]
mahmutov has quit [Ping timeout: 240 seconds]
mahmutov has joined #osdev
sonny has left #osdev [Closing Window]
ravish0007_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
CaCode has joined #osdev
h4zel has quit [Ping timeout: 272 seconds]
<geist> 128 yes
<geist> On thunder x
the_lanetly_052 has joined #osdev
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
not_not has joined #osdev
<not_not> Hy
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
the_lanetly_052 has quit [Max SendQ exceeded]
the_lanetly_052 has joined #osdev
<g1n> not_not: hi
<clever> geist: you mentioned before that youve seen the pi4 pcie controller in other soc's, do you happen to know the name of that controller or where i might find proper docs on it?
Brnocrist has joined #osdev
isaacwoods has quit [Quit: WeeChat 3.4]
the_lanetly_052 has quit [Ping timeout: 245 seconds]
troseman has quit [Ping timeout: 272 seconds]
<not_not> Hmm time to contemplate on decision
<not_not> Write a VM or my own language, or x86_64 kernel or arm64 kernel?
<GeDaMo> Write a VM for your own language as a kernel :P
not_not has quit [Read error: Connection reset by peer]
<g1n> lol
<g1n> i think x86_64 will be easier on first tries, but i am not sure
* g1n tried to make a vm
* g1n thought about making own lang, but just did very little steps in compiler dev lol
not_not has joined #osdev
<gog> x86_64 with UEFI is easier in a few ways imo
<gog> firmware deals with getting you into the right mode and sets up an identity-mapped paging environment
<gog> you can load files with boot services protocols directly from FAT volumes
<gog> and it has rudimentary memory management
h4zel has joined #osdev
CaCode_ has joined #osdev
<sham1> AMD64 is also relatively easy just due to all the resources available
CaCode has quit [Ping timeout: 272 seconds]
mahmutov has quit [Ping timeout: 256 seconds]
mahmutov has joined #osdev
biblio has joined #osdev
<gog> yes
<gog> also a big fan of rip-relative addressing
biblio_ has joined #osdev
biblio_ has quit [Client Quit]
biblio has quit [Ping timeout: 260 seconds]
sortie has quit [Ping timeout: 256 seconds]
vin has quit [Ping timeout: 260 seconds]
sortie has joined #osdev
eroux has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<not_not> Gog ye i think relative adressing rings a bell from my userspace exp
<not_not> IMO Cisc is more interesting
<gog> yes makes position-independent code much more straightforward
<gog> no thunking, no referring to offset tables
<not_not> And i hear the push multiple regs to stack
<gog> x86_64 doesn't have that anymore
<gog> pusha isn't supported
<not_not> Aww poops
<not_not> Any reason?
<gog> idk exactly, probably makes the microcode around register dependencies more complicated
<gog> thus way slower
<not_not> Ahh
<gog> might take more micro-ops to do the push vs just doing it one-by-one in the asm
GeDaMo has quit [Remote host closed the connection]
<gog> best guess
<not_not> Ye and 64 bit ur not exactly presses for mem
<_eryjus> break the pipeline maybe?
<mrvn> gog: you don't need to push all regs and partial pushes are definitely faster
<gog> yes
<gog> also pipeline stalls
<not_not> And U have too many registers to count
<gog> mrvn: and it might not even be the registers you care about
<gog> wrt pushad
<not_not> 64 bit changes errything
<not_not> Never written kernel stuff
<not_not> Got a bootloader into 32 bit mode maybe
<not_not> But cant write to screen to confirm yet
<gog> i don't recommend it unless you enjoy frustrating bugs that you caused yourself because of a side-effect you didn't take into account
<not_not> I enjoy late night mysteries yes
<gog> i.e. me not realizing i enabled interrupts before trying to initialize the PIC :|
<not_not> Wow
<not_not> Ye its always do banale
<gog> yeah i shot myself in the foot there
<not_not> So*
air has quit [Quit: cria 0.2.9cvs17 -- http://cria.sf.net]
<not_not> Hehehe
<gog> with the tiniest change to my code
air has joined #osdev
<not_not> Did the same on my first parser
<not_not> Spendt 3 days looking for a sigsegv
<gog> oops
<not_not> Turns out the index to my string pointer array was assigned with == instead of =
dormito has quit [Quit: WeeChat 3.3]
<not_not> But when i caught it i forgot it was the build that would show if my parser worked or not and hnggg
<not_not> My first line of my own programming lang was computed correctly and in polish notation
<not_not> Was satisfying
<not_not> I love bugs
<not_not> And i hate bugs
<not_not> X86 64 it is
<not_not> But the ssh key to my server is off when i Try to connenct to my linux
<not_not> (they Are watching is)
<not_not> Us*
<not_not> Well when i connect from the insane asylum
<not_not> Not planning on a multi user multi core full blown system here
<not_not> And not gonna be temple os guy
<not_not> Nod
<not_not> Ftw i never finished tiberian sun
<not_not> Tfw
masoudd has quit [Ping timeout: 272 seconds]
<CompanionCube> not_not: heh, if x86_64 has too many registers then there's arches like arm64 with 32 registers
<not_not> Ye i have a pi
<not_not> Arm was my first asm
<not_not> Had to write sound mixer and set dma and interrupts in asm on the gba
<not_not> Lots of regs helped there
<klange> Happy to report that thus far my hvf vm and my rpi are both continuing to loop Doom demos ten hours later
<not_not> Gz
<not_not> Gonna try and hook up my rpi to qbits from parallell universes when i get out
sonny has joined #osdev
<not_not> But i need kpins to reboot my brain
<not_not> All they have here is 2 mg valium and 25 mg qetiapine
<klange> I've got 72mg of methylphenidate.
<not_not> Uff get adderall
<klange> Very very illegal here.
<not_not> How illegal?
<klange> Lose my house, get deported, and never be able to return to the country I have lived the last six years of my life in illegal.
<not_not> Ur not phillipineese by any chance
<not_not> Japan?
<klange> Japan.
<not_not> Awesome
<gog> quetiapine made me have lucid nightmares
<not_not> I wanna go to japan
<gog> and oversleep
<not_not> Gog lol
<not_not> Well i take 100. Mg in one go once a year and i can do amp for another year without a disaster
<not_not> My bf held an anti pill campaign against me
<not_not> Insisting i should rather take 10 hits of acid and do speed daily over taking 2 kpin every other week
<klange> ah, yes, "don't take this drug, take _this_ drug!" people
<not_not> Yes
<not_not> I have weapon grade ptsd
<not_not> 2 kpin make me chill and tired for a whole day and the whole day after
<not_not> Whereas he'd wake up with a tiger in his room
<not_not> He's such a sweetheart but omg stupid
<not_not> And i paid off all his Loan sharks
<not_not> And i told him if i start whining for pills
<not_not> Its a chrisis and U need to put a xanax in my mouth or its now i am become death the destroyer of worlds blackout
<not_not> And i have eplepsia om my visual cortex
<not_not> Guess 3 times who ate my xanax in my moment of need
<gog> i'm rewriting my memory manager
<not_not> Nice
<not_not> Im having my whole brain replaced when i get some real sedatives with alien technology
<not_not> Then im gonna tend to my hobbies
<not_not> Write some os
<not_not> And sell off a company
<not_not> Man U get loads done with amp man
<not_not> Gog U read yakuza noon?
<not_not> *moon
<gog> no
<not_not> Wait i meant klangen
<clever> gog: on the subject of memory managers, i need to make one for a display-list system, its a bit different from normal, because the metadata and the data must be in different regions of memory...
<not_not> Klange
<not_not> Lol i was like "but im really here to talk to you about the security clearance" when i was arrested
<clever> gog: basically, i have an uint32_t[4096], where i need to use chunks of semi-unpredictable size, but only ~8 objects need to exist at any time, and they can expire after 1/30th of a second at the slowest
<gog> will they ever add up to more than 16kibs?
<not_not> Mhm
<clever> gog: i think ive gotten things to work with one object taking up ~8kb before, so i could very easily fill it with just 2 objects
<clever> so certain combinations of modes will be unsolvable
<not_not> So they sent me to the insane asylum
<gog> where do the objects come from? are they autogenerated, is this like a blitter?
<not_not> AS punishment for thinking they were gonna send me to Afghanistan
<clever> gog: auto-generated from a linked list in normal ram, and yeah, its configuring the blitter
<not_not> They spent a ton of money on that arrest lmao
<gog> and these are blits that need to happen at the same time or can you queue them up for a certain number of frames?
<clever> gog: basically, for each image you want to display on screen (at 1:1 scale), you need an uint32_t[7] object, scaled images are uint32_t[14], and the end-of-list marker is uint32_t[1]
<clever> gog: you describe a frame by just having a big list of the image objects, and an end-of-list marker, and for page-flip speed, having a second list pre-loaded into the memory saves a great deal of time
<clever> so yes, you can queue up the next frame, by just writing it into unused space in this limited memory
<not_not> Nice
<clever> and the hw can drive up to 3 displays, so you will have 3 frames actively being rendered, plus a potential 3 more frames that you are writing to the config, or are waiting for a vsync
<not_not> Cant wait to do video shit gonna make a kernel space gaming os
<gog> but you're still limited to that 16k
<clever> yep
<clever> so you cant queue too much up
<gog> hm yeah that is tricky
<clever> and also, the palette and up-scaling filters go into the same 16kb region, if used
<mrvn> clever: that's what allocators are for in c++
<mrvn> (different regions of memory)
<bslsk05> ​en.cppreference.com: std::allocator - cppreference.com
<clever> gog: all records must also be 32bit aligned, so you can just treat the 16kb region as a 4096 slot region of opaque tokens, if that makes things simpler
<mrvn> clever: yes. you should check for some talk about allocators in c++ 17/20. They have changed them.
<not_not> Ill never code a line of rust
<clever> mrvn: got a link to a talk? search tools are impossible to use with keywords like c++
<mrvn> clever: given your size constraint you might need something that can compact memory.
<clever> they just go "oh, regex", and ignore the ++ no matter what you do, lol
<mrvn> clever: try cppcon instead
<not_not> Cant tolerate languages that use " let x = 0"
<clever> mrvn: compaction is why i mentioned the 1/30th of a second thing, if you do move an object around, you have to wait for the next vsync before you can delete the old object
<clever> https://www.youtube.com/watch?v=kSWfushlvB8 CppCon 2017: Bob Steagall “How to Write a Custom Allocator”
<bslsk05> ​'CppCon 2017: Bob Steagall “How to Write a Custom Allocator”' by CppCon (01:03:40)
<clever> that one sounds perfect
<not_not> Is C++ dumb if ur writing kernels?
<Griwes> there's also Arthur O'Dwyer's thing that explains memory resources and yours truly's talk on how we've used them together with gpus in Thrust
<clever> not_not: ive used c++ on no-mmu kernels, havent had any real trouble
<mrvn> clever: reading what you wrote maybe it isn't such a good idea. You need space for different objects and allocators are for just one.
<gog> c++ is perfectly suitable to write kernels with, with some caveats
<not_not> I have a hankering for asm and C being spat out by python scripts
<mrvn> gog: a subset of c++ is definitely better for writing kernels
<gog> yes
<not_not> Ye considering C++
<Griwes> raii is a beautiful thing
<clever> Griwes: found that, https://www.youtube.com/watch?v=0MdSJsCTRkY
<bslsk05> ​'C++Now 2018: Arthur O'Dwyer “An Allocator is a Handle to a Heap”' by CppNow (01:28:42)
<Griwes> even if you use just that and literally nothing else, your life already improves
<mrvn> clever: iirc you are supposed to split the management into the allocator and the resource.
<not_not> Someone suggested rust but "let x = 0" is petting the dog the wrong way
<Griwes> clever, yeah, and the mentioned self plug: https://www.youtube.com/watch?v=5UVeh4_5B8I
<bslsk05> ​'Memory Resources in a Heterogeneous World - Michał Dominiak - CppCon 2019' by CppCon (00:59:49)
<mrvn> clever: maybe one resource and N allocators using it would work. One for images, one for scaled images, ...
<clever> another detail, is that with the current code, i cant easily know the size of an object ahead of time
<clever> but that could be solved
<mrvn> clever: on the other hand placement new sounds easier in that case.
<clever> mrvn: the image data itself, lives in regular ram, malloc already solved that
<mrvn> you can always make a proxy object that allocates memory when it gets committed for display
<clever> and the scaled vs unscaled images, must be in consecutive slots, if you want them in the same frame
<clever> for reference, here is an unscaled image, taking up 7 slots in the dlist: https://github.com/librerpi/lk-overlay/blob/master/platform/bcm28xx/hvs/hvs.c#L100-L116
<bslsk05> ​github.com: lk-overlay/hvs.c at master · librerpi/lk-overlay · GitHub
<bslsk05> ​github.com: lk-overlay/hvs.c at master · librerpi/lk-overlay · GitHub
<clever> and here, line 697 takes note of the starting position, 701/703 adds elements, and 707 creates the 1x32bit end-of-list marker
<clever> the amount of space taken up, depends on how many images are scaled and how many are unscaled, and how many images are planar
<mrvn> clever: you always alternate between 2 lists. So maybe handle the list as 2 blocks of 8kb.
<clever> mrvn: except, if i want dual-monitor support, i then need to cut it up into 4, and triple-monitor, 6
<clever> which limits you to 48 unscaled images on-screen at once
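The slot budget behind that estimate can be checked with a little arithmetic. This is only a sketch using the sizes quoted in the discussion (4096 slots, 7 per unscaled image, 14 per scaled image, 1 for the end-of-list marker); the helper name `images_per_region` is hypothetical and is not taken from the linked hvs.c.

```c
#include <stdint.h>

/* Sizes quoted in the chat, all in 32-bit slots. */
enum {
    DLIST_SLOTS      = 4096, /* 16 KiB of display-list memory */
    SLOTS_UNSCALED   = 7,    /* image shown at 1:1 scale */
    SLOTS_SCALED     = 14,   /* scaled image */
    SLOTS_END_MARKER = 1     /* end-of-list marker */
};

/* Hypothetical helper: number of images of `slots_per_image` slots that fit
 * in one of `regions` equal partitions of the display list, after reserving
 * one slot for the end-of-list marker. Returns 0 for invalid arguments. */
static uint32_t images_per_region(uint32_t regions, uint32_t slots_per_image)
{
    if (regions == 0 || slots_per_image == 0)
        return 0;

    uint32_t region_slots = DLIST_SLOTS / regions;
    if (region_slots < SLOTS_END_MARKER)
        return 0;

    return (region_slots - SLOTS_END_MARKER) / slots_per_image;
}
```

Under these assumptions, `images_per_region(6, SLOTS_SCALED)` gives 48 and `images_per_region(6, SLOTS_UNSCALED)` gives 97, so the figure of 48 per partition corresponds to the 14-slot scaled case.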
<mrvn> clever: hmm, dual/triple monitor is harder. They aren't likely to be the same size.
<clever> yeah, having an allocator would allow a monitor with 1 image to not hog 1/3rd of the dlist
<mrvn> are you likely to construct multiple lists in parallel?
<clever> not likely, i already have a mutex per monitor
<clever> and i could expand that to one global mutex
<clever> my current solution is to just blindly treat the entire memory region as a ringbuffer
<mrvn> I mean: add image to monitor 1, add image to monitor 2, add image to monitor 1, scale image for monitor 2, end monitor 1, add image to monitor 2, end monitor 2.
<clever> because if you only ever have 2 objects live at once, by the time you wrap around, the old ones have expired
<clever> for the hw to work right, all images on a given monitor, must be consecutive in the dlist memory
<not_not> Clever clever
<clever> in z-order
<mrvn> I know. which makes adding images interleaved a problem because you have to either make space or leave space.
<clever> yeah
<mrvn> On the other hand the memory isn't that big. moving a few objects around is easy.
<clever> my current hack, is that on every change, i re-write the dlist for every monitor
<clever> and then schedule pageflips on vsync
<not_not> Gonna do loads of weird shit with the mouse cursor on my os gui
<not_not> Like U can rotate it
<not_not> Split it in 2 to grab multiple things
<not_not> Multi mouse support
<mrvn> does the hardware read the memory as ring buffer?
<clever> nope
<clever> if the hw hits the end of the array without a proper end-tag, it crashes
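The "treat the whole region as a ringbuffer" approach, combined with the constraint that the hardware does not wrap, could look roughly like the following. It is a minimal sketch, not the code from lk-overlay: `struct dlist_ring` and `dlist_ring_alloc` are hypothetical names, and nothing here checks that the slots being reused have actually expired (the chat relies on timing for that).

```c
#include <stdint.h>

#define DLIST_SLOTS 4096u

/* Hypothetical allocator state: index of the next free slot. */
struct dlist_ring {
    uint32_t head;
};

/* Reserve `count` consecutive slots and return the index of the first one,
 * or UINT32_MAX if the request can never fit. Because the hardware reads
 * lists linearly, a list may not straddle the end of the array: a request
 * that does not fit in the remaining tail restarts at slot 0 instead. */
static uint32_t dlist_ring_alloc(struct dlist_ring *r, uint32_t count)
{
    if (count == 0 || count > DLIST_SLOTS)
        return UINT32_MAX;

    if (r->head + count > DLIST_SLOTS)
        r->head = 0; /* skip the unusable tail and wrap */

    uint32_t start = r->head;
    r->head += count;
    return start;
}
```

The slots skipped at the tail are simply wasted until the next wrap, which is the price of keeping each list contiguous.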
<not_not> In fact the mouse Will be a programming language
<not_not> Like U can record mouse macros
<not_not> And drag it through if statements and loops
<not_not> And watch the mouse work
<mrvn> A good strategy might be to move lists that haven't changed towards the front each frame, create new lists somewhat spread out in the back of memory.
<not_not> Stole the idea from microsoft
<clever> mrvn: one idea ive considered, is to pre-create the 7/14 slot object for an image, within the layer object that tracks its state
<clever> so i dont have to convert the state each time i make the list, i can just memcpy chunks
<mrvn> clever: definitely.
<mrvn> I can't think of any algorithm that wouldn't leave gaps and any waste could be deadly. Except keeping the lists in normal memory and recreating them in each 8k block on every change.
<clever> also, if the monitors are running at different refresh rates, like 60hz and 59hz, the vsyncs will drift in and out of phase
<not_not> Off
<clever> so when an object expires and becomes free space, will change
<mrvn> is that realistic?
<clever> they can be running from different PLL's with different divisors
<clever> and unless you get the pixel count, divisors, and PLL's all perfectly aligned, they will have some drift
<not_not> Ahh Nice night, night is soothing
<not_not> Day is gay
<mrvn> you're screwed.
<mrvn> clever: In that case I think you have to live with some tearing or do the memcpy all in the vsync.
<clever> memcpy may solve some of the issues
<clever> previously, when i was re-creating the entire list in vsync, i had a stable tear near the top
<mrvn> if you memcpy do you even need 2 copies of each list?
<clever> i solved that by pre-creating the list, and only doing the flip on vsync
<clever> possibly not
<mrvn> You can create the whole list in memory so it's a single memcpy() call
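A rough sketch of that idea: build the list in ordinary RAM ahead of time, then copy it into the display-list memory in one go during vsync. The names `struct dlist_shadow` and `dlist_commit` are hypothetical. A word-by-word loop stands in for `memcpy()` here because the destination is declared `volatile`, which is an assumption about how the hardware region would be mapped.

```c
#include <stdint.h>

#define DLIST_SLOTS 4096u

/* Hypothetical shadow copy of one display's list, prepared in normal RAM. */
struct dlist_shadow {
    uint32_t words[DLIST_SLOTS];
    uint32_t len; /* slots used, including the end-of-list marker */
};

/* Copy a prepared list into display-list memory starting at slot `start`.
 * Meant to be called while the target display is in vsync. Returns 0 on
 * success, or -1 if the list is empty or would run past the end of the
 * region (which the hardware cannot tolerate). */
static int dlist_commit(volatile uint32_t *dlist, uint32_t start,
                        const struct dlist_shadow *s)
{
    if (s->len == 0 || s->len > DLIST_SLOTS || start > DLIST_SLOTS - s->len)
        return -1;

    for (uint32_t i = 0; i < s->len; i++)
        dlist[start + i] = s->words[i];

    return 0;
}
```

Since the expensive part (converting layer state into slots) happens before vsync, the work left inside the vsync window is a bounded copy of at most 16 KiB.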
<clever> but you can only write to the region used by a screen currently in vsync
<not_not> Man im so relaxed
<clever> so worst case, 1 screen is in vsync, and 2 are active, so you cant change it
<not_not> Its the sun
<mrvn> put one list at the start, one list to end at the end and the third floating with equal space before and after.
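That placement can be written down as a small calculation. This is only an illustration of the suggestion: `struct dlist_layout` and `dlist_layout3` are hypothetical names, and the lengths are assumed to already include each list's end-of-list marker.

```c
#include <stdint.h>

#define DLIST_SLOTS 4096u

/* Hypothetical result: starting slot of each of the three lists. */
struct dlist_layout {
    uint32_t a_start; /* first list, pinned to the start of the region */
    uint32_t b_start; /* middle list, floating between the other two */
    uint32_t c_start; /* last list, ending exactly at the end of the region */
};

/* Place three lists of the given lengths (in slots). The free space is split
 * evenly before and after the middle list so either neighbour can grow.
 * Returns 0 on success, -1 if the lists do not fit together. */
static int dlist_layout3(uint32_t a_len, uint32_t b_len, uint32_t c_len,
                         struct dlist_layout *out)
{
    if (a_len > DLIST_SLOTS || b_len > DLIST_SLOTS || c_len > DLIST_SLOTS)
        return -1;
    if (a_len + b_len + c_len > DLIST_SLOTS)
        return -1;

    uint32_t free_slots = DLIST_SLOTS - (a_len + b_len + c_len);

    out->a_start = 0;
    out->b_start = a_len + free_slots / 2;
    out->c_start = DLIST_SLOTS - c_len;
    return 0;
}
```

For example, lengths of 100, 200 and 300 slots give starts of 0, 1848 and 3796, leaving 1748 free slots on each side of the middle list.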
<clever> mrvn: oh!, but if you are copying the visible frame, and then page-flip mid-frame, the 2 frames are identical, that wont be a visible tear! (when not scaling anything)
<not_not> All my worst enemies have day jobs
<not_not> Tbh
<not_not> I can sende they stopped scheming and gone to bed
<not_not> Sense
<mrvn> clever: is there a nop entry?
<clever> no, but you can use alpha to waste a slot on a 100% transparent image, or set the w/h to 0 and maybe it wont render
<clever> or set the xy to be off-screen
<clever> yeah, a 7-slot object can have an alpha, either object wide, or per-pixel
<mrvn> I wonder if you could move a list by copying each slot and replacing the original with nops.
<clever> there is also state on each object, that the hw uses for internal purposes
<clever> and it may glitch if that state is wrong
<mrvn> so better not do that.
<clever> at the most basic level, you have 1 compute core, that is round-robin'd between all active displays
<mrvn> Going with the "recreate in vsync" idea and having a list at the front, end and middle then worst case you might have to wait one frame to move the middle list if one list grows too much between frames.
<clever> for each display, you have an output FIFO, that holds whole scanlines
<clever> if a FIFO has room for 1 scanline of image data, the compute core will read the display-list, find every rect intersecting that scanline, then draw those objects directly into the FIFO ram
<clever> and its not a typical push/pop only FIFO, but just a ringbuffer acting as a FIFO
<clever> i can choose what range of ram each of the 3 FIFO's occupies
<mrvn> I can't quite see those lists changing (in size) from frame to frame. I imagine more that they are set up once, e.g. when the game starts a level, and then remain that size for minutes. Then shrink for the loading screen and grow again for the next level. Or something like that.
<clever> if you are using sprites to animate enemies, the list will change size, based on how many enemies are on-screen
<mrvn> clever: that would limit the enemy count quite a lot
<bslsk05> ​'ntsc dance v2, interlacing fixed' by michael bishop (00:00:22)
<clever> do you see the glitching in this video?
<mrvn> horrible.
<clever> that happens if you have too many sprites in a certain area
<clever> the compute part of the HVS cant fill the FIFO fast enough
<clever> and the electron beam catches up, and runs out of pixels to display
<mrvn> Think about it: You have the player, the enemy, all the bullets and rockets or whatever they shoot. 20 sprites is not going to cut it.
<clever> the FIFO size lets you smooth out lag spikes, from a scanline taking too long to render
<clever> but too many spikes, and it will glitch out
<clever> but also, 20 is not really the limit
<clever> the limit, is how many pixels you have to copy for a scanline
<mrvn> I think you have to render the sprites to framebuffers and use the dlist to display just those.
<clever> and those raspberries are being down-scaled a lot
<clever> so there are a lot more pixels being copied than you would expect
<clever> also, a large amount of bandwidth is being wasted on pixels you cant see
<clever> if the sprites had proper collision checking, you couldnt overload it
<mrvn> clever: overlapping sprites definitely waste bandwidth
<clever> other than bullets in a bullet-hell game, most sprites dont overlap
<mrvn> is there hardware collision detection?
<clever> nope
<clever> also, if in dual-monitor mode, the compute core has to split its clock cycles between both displays
<clever> so it becomes half as capable
<clever> *looks*
<mrvn> clever: dual monitor wouldn't be a problem. One list at the start, one at the end. Both with memcpy() in the vsync.
<clever> channel 0 is dsi0 or dpi, dsi0 isnt wired on most pi models, dpi uses a lot of gpio but can be used for many things, and i have drivers
<mrvn> Only with 3 monitor support you could run into a case where the first list needs space but all the free space is after the second list.
<clever> channel 1 is dsi1 or smi, dsi1 i lack drivers for, smi is entirely undocumented
<clever> channel 2 is hdmi or composite, ive only got composite drivers currently
<clever> so realistically, triple-monitor mode isnt possible right now, dual is the limit
<clever> due to lack of drivers
<clever> the transposer brings in new stuff, but fewer limits
<clever> basically, the transposer uses up 1 channel, and does 90 degree rotations, and writes the image back to ram
<mrvn> clever: try if you can memcpy() long lists in the vsync if they are prepared in real memory ahead of time.
<clever> but its not a constant scanout, so you can free the dlist stuff upon completion
<clever> its also not racing an electron beam, so it isnt bothered by lag spikes
<clever> i'll give that a try, after i eat some pizza
<clever> just remembered i have some in the oven
<clever> one other random data-point, the DPI is probably the fastest output i can use right now, and its rated for up to 100mhz pixel clock
<clever> so that would mean generating 100,000,000 pixels/second
<clever> vsync rate then depends on resolution and blanking periods
<clever> divisor 2.967268, fps bounds 89-59, DPI clock measured at 108000 KHz, hsync rate: 63084 Hz, vsync rate: 59 Hz, htotal: 1712, vtotal: 1063
<clever> some of the debug code when selecting divisors to hit a target fps
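The rates in that debug line follow from the pixel clock and the total (active plus blanking) dimensions. A minimal sketch of the arithmetic, with hypothetical function names and integer truncation assumed to match the printed values:

```c
#include <stdint.h>

/* Scanlines per second: pixel clock divided by pixels per scanline
 * (htotal includes horizontal blanking). Returns 0 if htotal is 0. */
static uint32_t hsync_rate_hz(uint64_t pixel_clock_hz, uint32_t htotal)
{
    if (htotal == 0)
        return 0;
    return (uint32_t)(pixel_clock_hz / htotal);
}

/* Frames per second: pixel clock divided by pixels per frame
 * (htotal * vtotal, both including blanking). Returns 0 on a zero total. */
static uint32_t vsync_rate_hz(uint64_t pixel_clock_hz, uint32_t htotal,
                              uint32_t vtotal)
{
    uint64_t pixels_per_frame = (uint64_t)htotal * vtotal;
    if (pixels_per_frame == 0)
        return 0;
    return (uint32_t)(pixel_clock_hz / pixels_per_frame);
}
```

With the logged values (108000 KHz, htotal 1712, vtotal 1063) this gives 63084 Hz and 59 Hz, matching the hsync and vsync rates in the output above.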
<not_not> Ill eat some pizza too
<not_not> Bolognese is best cold
<not_not> Clever U have a github?
<clever> not_not: yeah, i linked it above
h4zel has quit [Ping timeout: 240 seconds]
<not_not> Ahh ty
manawyrm has quit [Quit: Read error: 2.99792458 x 10^8 meters/second (Excessive speed of light)]
manawyrm has joined #osdev
Terlisimo has quit [Quit: Connection reset by beer]
<not_not> Clever cool
<not_not> Miss bit operations in C havent used them since i was 12
<mrvn> how do you do 4k displays?
<clever> mrvn: hdmi, which i lack drivers for currently
<clever> hdmi can do a higher pixel clock than dpi
<mrvn> looks like that needs around 400MHz
<mrvn> 1/4 the sprite count
Terlisimo has joined #osdev
<not_not> Zomg clever ill compile ur code once im out of here
<clever> mrvn: there is also some 4k differences between vc4 and 2711, one min...
<clever> mrvn: for vc4, the x/y position is limited to the 0-4095 range, so it can just barely handle 4k in a given dimension
<clever> 2711 increased the bits in a few fields, allowing higher resolutions
<clever> 2711 also generates 2 pixels per "pixel clock", so it can handle more pixels at the same clock