<doug16k>
to solve that, I alias the 1st GB at 4GB so it can reach up to data (which is physically at 0x40_0000 but virtually at 0x1_0000_0000
<doug16k>
sorry 0x1_0040_0000
<doug16k>
that's pretty close by to the code in the 0xFFFF0000-0xFFFFFFFF region
<doug16k>
er, 0xFFE00000-0xFFFFFFFF
<doug16k>
i386 has no trouble due to the wraparound, it reaches positive, wraps around, and gets to 0x400000 no problem
<doug16k>
fascinating that x86_64 (slightly) and riscv64 (seriously) both have that type of reach issue. the most gigantic and most tiny cpu of the bunch
<doug16k>
aarch64 has the problem too, but the memory layout at power up is so easy, you don't even think about reach, and you can reach ram
<doug16k>
aarch64 fetches the first instruction from 0x0
<doug16k>
qemu aarch64 virt one does anyway
Arthuria has quit [Ping timeout: 244 seconds]
<doug16k>
I guess it's because both riscv64 and x86_64 have their rom really far away from low RAM that's almost certainly there
<doug16k>
if you wanted to be sloppy, you could just require enough ram to get above the 2GB line, and put data up there, and hope ram is there
<doug16k>
then it would be reachable with no paging remap trick
<doug16k>
that project has the most perfect makefile I have ever made
<gog>
damn that is a clean makefile
<gog>
puts mine to shame
<doug16k>
does everything I wanted to do but make wouldn't let me
<doug16k>
the pattern rules wouldn't let me
<doug16k>
there are no patterns, it generates each object file explicitly
<gog>
that generally means it's something you shouldn't do with make lol
<doug16k>
in a loop that generates the rules
<gog>
that's practically heresy but i'll allow it since make is rather rigid for a build system
<doug16k>
everything in directories under obj
<doug16k>
can do make obj/arch/pci.S and get assembly, or make obj/arch/pci.i to get preprocessed
<bslsk05>
github.com: rpi-open-firmware/Makefile at master · librerpi/rpi-open-firmware · GitHub
<gog>
clever: 8/10
<clever>
gog: anything you might do differently?
<doug16k>
subdirectories in your source though?
<doug16k>
doesn't that screw you up with the pattern rules?
<clever>
doug16k: now that i look at it again, i'm thinking of moving lib_armv6 to its own makefile, and building it as a static lib
<clever>
doug16k: i don't think it does, pretty sure it's making subdirs under build
<doug16k>
oh wow you need autodependencies
<clever>
doug16k: yeah, that's an issue
<doug16k>
or you will go insane
<gog>
nothing jumps out at me
<doug16k>
nothing more disturbing than being certain that you fixed it, and it doesn't get fixed when you rerun it, because it was in a header that wasn't picked up
<clever>
doug16k: yep, the subdirs persist under the build/ dir, the patterns don't care
<doug16k>
have you ever strace make?
<doug16k>
it's ridiculous the amount of stat it does with the default rules
<doug16k>
it's almost as if it is comically checking for every extension of every filename it ever sees
transistor has joined #osdev
<doug16k>
and stat is so ludicrously fast, nobody notices
bsdbandit01 has joined #osdev
<clever>
yeah, the pattern rules do that
<doug16k>
ever since I started using make I wished I could just make a loop that makes each rule
<doug16k>
now I finally know enough to do exactly that
<doug16k>
screw patterns
<doug16k>
vpath thing? so what if I have a duplicate source name?
gog has quit [Ping timeout: 244 seconds]
bsdbandit01 has quit [Read error: Connection reset by peer]
<doug16k>
if it hurts when you go like this... I guess
<clever>
doug16k: i have run into trouble when i have both a .s and a .S with the same name, or a .S and a .c
<clever>
it tends to use the wrong one
<klange>
i am playing around with some changes to my boot splash, but outside of my laptop loading up a ramdisk with gcc it all goes by too quickly to really matter...
<doug16k>
klange, do you make it so you can do out of source build?
<klange>
nope, I don't bother for any of my three big projects
<clever>
klange: i've noticed the same issue with whats-it-called on linux, with the pi4, it boots too fast to even realize it's installed
<clever>
klange: only when i booted the same card on a pi0 was it slow enough to read the graphical progress
<klange>
plymouth? takes longer to load plymouth's theme files than the time it gets to actually spend rendering
<doug16k>
yeah I find if I try to do out of source then the pattern rules uncontrollably put everything in the source tree. I don't know how to tell it not to
<clever>
yeah plymouth
<klange>
doug16k: I looked into doing it for Kuroko but unless you're doing a `configure` step doing a pure Make build that supports it is annoying
<doug16k>
oh wow you don't even have a configure step and include a config.mk ?
<clever>
i try to avoid the need for configure in most of my current projects
<doug16k>
hard mode
<clever>
they only work on a single target, why complicate it?
<doug16k>
hard to implement I mean
<clever>
with only one target, there is nothing to auto-detect
<clever>
just hard-code it all
<doug16k>
yeah with one target, nevermind configure
<doug16k>
configure is for supporting multiple machines/platforms/architectures
<doug16k>
tolerating lack of libraries
<clever>
the arm_chainloader runs on 3 rpi models, but is always a bare .bin dropped at physical 0, any model detection happens at runtime
<clever>
the firmware dir runs on the same 3 models, always as bare .bin, loaded to the bottom of the L2 cache
<clever>
the hardest part, was choosing the right arm mmu routines
<clever>
mmu-on is simple, i just picked a common subset of features
<bslsk05>
github.com: rpi-open-firmware/mmu.c at master · librerpi/rpi-open-firmware · GitHub
<doug16k>
hmm. I should update a config.guess.mk in each arch, so you can see what values it most likely gets in the source tree
<clever>
mmu-off is harder, because of cache flush
<clever>
armv6 has a single opcode to flush the entire cache
<clever>
armv7 and up only allows you to flush a single line (by set/way), so you must iterate through every one
<clever>
and for extra fun, you basically can't do any load/store during the flushing, or you wind up having more to flush!
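A minimal C sketch of the set/way loop described above; the geometry arguments and function name are illustrative, and a real implementation derives them from CCSIDR per cache level:

    /* armv6 could clean+invalidate the whole D-cache in one opcode:
     *   mcr p15, 0, r0, c7, c14, 0
     * armv7+ walks every set/way with DCCISW instead. Keep the loop free
     * of data loads/stores so it doesn't dirty lines mid-flush. */
    static void dcache_flush_all_v7(unsigned sets, unsigned ways,
                                    unsigned set_shift, unsigned way_shift)
    {
        for (unsigned way = 0; way < ways; way++) {
            for (unsigned set = 0; set < sets; set++) {
                unsigned sw = (way << way_shift) | (set << set_shift);
                /* DCCISW: clean and invalidate by set/way (level 0) */
                __asm__ volatile("mcr p15, 0, %0, c7, c14, 2"
                                 :: "r"(sw) : "memory");
            }
        }
        __asm__ volatile("dsb" ::: "memory");
    }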
<doug16k>
someone should figure out some crypto mining code to schedule into that and make a few cents a year from cache flushes :P
<klange>
bim is a single cc call, kuroko jumps through a few hoops in the Makefile to handle a few different shared object strategies incl. Windows but mostly survives on a very small libc footprint and a healthy dose of #ifdefs, and ToaruOS could well be a shell script with how specific the build is
<klange>
The current ToaruOS build setup actually builds a system kuroko with a single big cc call as its first step :D
<klange>
(it gets used to run the magic #include sniffer that builds the Makefiles for userspace applications)
<doug16k>
yeah my rom was "throw the whole list of source at it and go all the way to link" for a while
<doug16k>
when I started needing to use make -B I fixed it
<doug16k>
I am guessing you made the dependencies work
<bslsk05>
github.com: rpi-open-firmware/hardware.h at master · librerpi/rpi-open-firmware · GitHub
<clever>
doug16k: when built for arm-linux, it assumes you always have a mmiobase pointer in scope, and does MMIO relative to that, then you just mmap /dev/mem
<clever>
so i can now write MMIO routines once, and it works under both linux and baremetal
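A minimal sketch of that write-once MMIO pattern, assuming a fixed peripheral base; the 0x20000000 base, mapping length, and function names are illustrative, not clever's actual code:

    #include <stdint.h>

    static volatile uint8_t *mmiobase;   /* set once at startup */

    static inline uint32_t mmio_read32(uint32_t off) {
        return *(volatile uint32_t *)(mmiobase + off);
    }
    static inline void mmio_write32(uint32_t off, uint32_t val) {
        *(volatile uint32_t *)(mmiobase + off) = val;
    }

    #ifdef __linux__
    #include <fcntl.h>
    #include <sys/mman.h>
    void mmio_init(void) {               /* arm-linux: map the physical range */
        int fd = open("/dev/mem", O_RDWR | O_SYNC);
        mmiobase = (volatile uint8_t *)mmap(NULL, 0x01000000,
                                            PROT_READ | PROT_WRITE,
                                            MAP_SHARED, fd, 0x20000000u);
    }
    #else
    void mmio_init(void) {               /* bare metal: identity mapped */
        mmiobase = (volatile uint8_t *)0x20000000u;
    }
    #endif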
<doug16k>
yeah my secondary-vga framebuffer is like that, it probably got the framebuffer base and mmio from what pci_init got, but it doesn't have to be, you can just know the framebuffer address and size and mmio base
<doug16k>
from what pci_init set*
<doug16k>
plan to make my rom bootstrap to a working framebuffer on every cpu that has secondary-vga
<doug16k>
then noobs can start from there and draw to the screen and call printdbg
<clever>
secondary-vga?
<doug16k>
it's a framebuffer-only mmio-only vga with no legacy in qemu
<doug16k>
it's what happens if you had two video cards
<clever>
ahh
<doug16k>
second one doesn't get the legacy I/O mapped
<doug16k>
qemu handles 7 without falling over
<doug16k>
my rom initializes them all and it's multimonitor
<doug16k>
sdl pops out a window per monitor
<clever>
i lost the link, but i saw a blog post of a guy torturing both linux and qemu
<doug16k>
gtk has the detach tab thing and all that
<clever>
he added over 26 block devices to the system
<clever>
sda thru sdz got used up, so it made an sdaa
<doug16k>
I should look at those gtk assertions while I have a qemu debugging setup set up
<clever>
he then kept cranking it up, and noting what happens at various milestones
<clever>
turns out, the list of block devices is a singly linked list
<clever>
the more you have, the slower it is to access one
<doug16k>
I hate singly linked lists very much when I need to look for a specific item. I love them very much when I can just grab the first one every time
<doug16k>
if you used a singly linked list for a thing you search, you are a fool
<clever>
and when the average machine has ~4 max, and >20 is very rare...
<doug16k>
it's pathological. it has to wait for the load latency at each step, and hasn't the slightest idea what is next until that load retires
<clever>
i think this cost is also only paid upon opening a block dev
<clever>
and then you have a pointer to that entity
<doug16k>
everyone in the memory subsystem looking at each other palms up shrugging
<doug16k>
every new address is a big surprise
<doug16k>
didn't see *that* pointer coming. wow
<doug16k>
"did you think it would be that address mike?" "no, I swear usually they want the next cache line!"
eschaton has joined #osdev
<doug16k>
it's hard enough to walk the links in a tree structure that has O(log N) performance. hammering O(N) is brutal
<doug16k>
contiguous storage is the best way
<doug16k>
then the cpus first guess of "next line" is usually correct
<doug16k>
if you just jammed the predictor to say "next line" forever, it would be pretty good
<doug16k>
...in normal programs
<klange>
Kernel Sandurz: "Why are you configuring? You're always configuring! Just Make!"
<klange>
"Just Make. Sir, shouldn't you sit down?" *Dark Helmet then falls on his ass as some broken BSD derivative fails to live up to the standards of the project.*
IRCMonkey has joined #osdev
<geist>
wow
isaacwoods has quit [Quit: WeeChat 3.1]
<klange>
Spaceballs is probably my... third favorite film? My whole top three is scifi space comedies: Fifth Element, Galaxy Quest, and Spaceballs.
IRCMonkey has quit [Quit: .oO (bbl tc folks~!)]
<geist>
solid list
richbridger has quit [Ping timeout: 245 seconds]
SweetLeaf is now known as SpikeHeron
<doug16k>
oh I should time my configure
<doug16k>
222ms
<doug16k>
it actually launches a vm twice and runs monitor commands in that time
<doug16k>
so probably instant + qemu_vm*2
<doug16k>
and thank you qemu for being so quick
<doug16k>
not one of those configures that say "making sure 1 + 1 == 2... 2"
<geist>
yah the new meson stuff is not half bad
<doug16k>
#define __HAS_GOOD_ONE_PLUS_ONE__ 1
<doug16k>
it would be funny to start emitting a bunch of stuff exactly like autotools would print out when configuring, then erase it back with terminal escapes, print "just kidding", and do the real one
<doug16k>
and add a command line arg, --no-jokes
<kingoffrance>
not sure, perl might have some jokes of that nature
<doug16k>
put some funny ones in there like ...making sure up goes above down... above
Vercas has quit [Ping timeout: 252 seconds]
Vercas3 is now known as Vercas
<kingoffrance>
look at the opening lines where sco thinks true is false lol
<bslsk05>
github.com: perl5/Configure at blead · Perl/perl5 · GitHub
nyah has quit [Quit: leaving]
<riverdc>
how do linear framebuffers usually get drawn? is there hardware polling it at a fixed clock rate or something?
<meisaka>
in most hardware, framebuffers are constantly being read in a loop and sent to the monitor in realtime, i.e. the whole buffer is sent 60 times per second
<doug16k>
endlessly
<doug16k>
even if nothing changed
johnjay has quit [Ping timeout: 264 seconds]
<doug16k>
most display technologies fade back to some undesirable value if you don't keep grabbing the pixel and putting it how you want it
<doug16k>
so the whole display interface design revolves around sending the whole screen again and again to the display
johnjay has joined #osdev
<doug16k>
you just go in circles fixing each pixel, forever
<doug16k>
if "you" are the video card CRTC/RAMDAC section :P
<meisaka>
until something comes along and yeets the power...
<doug16k>
or clears the display enable in some register, to make the monitor go into power save
<doug16k>
no sync == might as well turn off
<doug16k>
I love how the bochs dispi paravirtualized framebuffer works. the registers are all there for VGA, in the MMIO. you'd think you have to do stuff with vga. You do one store of 0x20 to base+0x3c0, and you are done. it's unblanked, never touch vga again :D
<doug16k>
store to vga_3c0 which is at base+0x400
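That one store as a sketch, assuming the bochs-display MMIO BAR is already mapped; the function name is made up:

    #include <stdint.h>

    /* In the MMIO BAR the legacy VGA ports sit at offset 0x400, so "port
     * 0x3c0" is the byte at mmio+0x400. Writing 0x20 sets the attribute
     * controller's palette-address-source bit, which unblanks output. */
    static void dispi_unblank(volatile uint8_t *mmio)
    {
        mmio[0x400] = 0x20;
    }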
ZombieChicken has joined #osdev
iorem has quit [Quit: Connection closed]
ZombieChicken has quit [Quit: WeeChat 3.1]
<riverdc>
makes sense, thanks
<riverdc>
and do OSes typically try to keep display updates atomic?
smeso has quit [Quit: smeso]
<riverdc>
I know linux must be able to somehow because wayland does
<riverdc>
I'm not sure how I'd go about making sure that a write to the framebuffer is done before it's sent to the monitor, though
<meisaka>
most video cards will have a register to select the base address of the framebuffer, or switch framebuffers
<riverdc>
ah so you do double buffering?
<meisaka>
yes
<riverdc>
got it
<meisaka>
then switch between them on the vsync
<meisaka>
my OS is lazy, it just writes to the framebuffer directly, and doesn't care about all that
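The flip meisaka describes, as a minimal sketch; wait_for_vsync, mmio_write32, and FB_BASE_REG stand in for whatever the card actually provides:

    #include <stdint.h>

    extern void wait_for_vsync(void);                  /* card-specific */
    extern void mmio_write32(uint32_t off, uint32_t val);
    #define FB_BASE_REG 0x0  /* illustrative: scanout base address register */

    static uint32_t *bufs[2];   /* two framebuffers */
    static int front;           /* index currently being scanned out */

    /* Draw into bufs[!front], then call this to swap on the vsync. */
    void present(void)
    {
        wait_for_vsync();
        mmio_write32(FB_BASE_REG, (uint32_t)(uintptr_t)bufs[!front]);
        front = !front;
    }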
ZombieChicken has joined #osdev
smeso has joined #osdev
bsdbandit01 has joined #osdev
<klange>
One of Wayland's big design philosophies was "every frame is perfect" - applications must pass off completed buffers for rendering, there's ownership handovers in place to ensure nothing is updated during a frame render, vsync is baked into the whole pipeline model...
bsdbandit01 has quit [Read error: Connection reset by peer]
farcas has joined #osdev
<doug16k>
meisaka, in qemu, in linear framebuffer modes, "the framebuffer" is just a system memory buffer that qemu reads from to update the display, it is your system ram
<doug16k>
in case you are wondering why you can write 16GB/s to "video memory" directly :)
<meisaka>
I would fully expect emulation to use system ram and dirty flags and other such magic to update it
<doug16k>
yeah, that
<klange>
almost a bit surprised at how well my ui runs on the t410...
<klange>
full resolution, all the animations...
<meisaka>
I've certainly had to deal with framebuffer updates in my emulators
<doug16k>
if you treat the actual video ram as write only, you will tend to go fast just from that
<doug16k>
if you write only changed part, then the speed starts getting very high
<klange>
I do a handful of things to reduce what I update, though the damage clipping in my current lib is horrible
<meisaka>
my kernel project uses a large graphical framebuffer for text, it only writes updates to the parts that changed based on an attributes/character buffer
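A sketch of that scheme: keep a shadow of the last-flushed character/attribute cells and repaint only what changed. draw_cell and the dimensions are illustrative:

    #include <stdint.h>

    #define ROWS 48
    #define COLS 128

    extern void draw_cell(unsigned x, unsigned y, uint16_t cell);

    static uint16_t cells[ROWS * COLS];    /* character | attribute << 8 */
    static uint16_t shadow[ROWS * COLS];   /* what is on screen right now */

    void flush_text(void)
    {
        for (unsigned i = 0; i < ROWS * COLS; i++) {
            if (cells[i] != shadow[i]) {          /* damaged cells only */
                draw_cell(i % COLS, i / COLS, cells[i]);
                shadow[i] = cells[i];
            }
        }
    }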
<klange>
should probably build pixman/cairo again and make sure that backend is still working
<meisaka>
I imagine it's a common way of handling that
<doug16k>
graphics is cool because it does exactly what your memory subsystem is hoping you are going to do, contiguous stuff
<doug16k>
and it's embarrassingly parallel, so you can vectorize it without much thought
ZombieChicken has quit [Remote host closed the connection]
ZombieChicken has joined #osdev
<doug16k>
say if you had to convert from 32 bit back buffer to 16bpp framebuffer
<doug16k>
conversion is zero cost since the memory access dominates it
<doug16k>
it's weird when code is so cheap that you might as well execute it because there is latency to cover anyway
<doug16k>
the store will end up the bottleneck and it'll all be queued up as far ahead as it can in the loading/converting part
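That conversion as a sketch, assuming XRGB8888 in and RGB565 out; the shifts are essentially free next to the memory traffic:

    #include <stddef.h>
    #include <stdint.h>

    static void convert_row_565(uint16_t *dst, const uint32_t *src, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            uint32_t p = src[i];
            dst[i] = (uint16_t)(((p >> 8) & 0xf800) |   /* top 5 bits of R */
                                ((p >> 5) & 0x07e0) |   /* top 6 bits of G */
                                ((p >> 3) & 0x001f));   /* top 5 bits of B */
        }
    }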
<meisaka>
that makes me wonder if using AVX or SSE for such a conversion is worth it
<doug16k>
it saves power
<doug16k>
race to sleep sooner
<doug16k>
avx also lets you force it to use WC stores
<doug16k>
if you do it without avx, how?
<doug16k>
one pixel at a time?
<doug16k>
instead of 8
<doug16k>
almost an order of magnitude
<meisaka>
I would guess one pixel at a time, depending on how well superscalar pipelines deal with it
<doug16k>
yeah it will end up doing better than the sequential order the program specified
<doug16k>
how about out of order executing the avx though?
<doug16k>
two stores. cache line done
<doug16k>
two cache accesses
<doug16k>
send it out
<doug16k>
pcie bus gets one huge efficient burst
<meisaka>
at least I don't have to deal with that... yet...
bradd has quit [Remote host closed the connection]
<doug16k>
avx-512 is going to be on a lot more machines soon
<doug16k>
that'd be one store per entire pcie burst
farcas has quit [Ping timeout: 272 seconds]
<doug16k>
pile up a queue of those
<doug16k>
it would be to the point where you couldn't possibly write to the video card any faster
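A sketch of the non-temporal/WC store idea, using AVX2 rather than AVX-512 and assuming a 32-byte-aligned destination; the function name is made up:

    #include <immintrin.h>
    #include <stddef.h>

    /* Stream a converted scanline to the framebuffer without polluting the
     * cache; the write-combining buffers go out as full-size bursts. */
    static void stream_to_fb(void *dst, const void *src, size_t bytes)
    {
        __m256i *d = (__m256i *)dst;
        const __m256i *s = (const __m256i *)src;
        for (size_t i = 0; i < bytes / 32; i++)
            _mm256_stream_si256(d + i, _mm256_loadu_si256(s + i));
        _mm_sfence();  /* drain the WC buffers before anything depends on it */
    }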
<meisaka>
sort of related, I have a rather old nvidia GPU that I want to write my own driver for (probably modesetting/2D acceleration), I wonder if there are any good resources for that
<doug16k>
you want qemu modesetting?
<doug16k>
and multidisplay?
<meisaka>
scary world of real hardware
<doug16k>
why
<meisaka>
because I like to think outside the bochs
<doug16k>
let's say you write 0x433e to base+0xfeec. who cares
<doug16k>
I appreciate people liking support of very specific things
<doug16k>
I gravitate toward standard things that can be generic
<meisaka>
it's very specific because it's sitting in a bin collecting dust :P
<doug16k>
ah, having that hardware is the best reason
<meisaka>
playing with VMs is fun, but sometimes I like to try hardware I have
<doug16k>
no way, I think you are downplaying the involvement of vms
<doug16k>
you don't test it against that first and it graduates to real hardware?
<meisaka>
don't get me wrong, I test _heavily_ in a VM
<meisaka>
oddly though, I started out on real hardware, then went to a VM because it was easier XD
<doug16k>
I am still freaking out inside about qemu not throwing an exception in a bunch of register write scenarios
<doug16k>
code has comments like /* XXX: should #GP */
<doug16k>
bochs beats tcg hands down in that respect
<geist>
word.
<meisaka>
my kernel supports hotplug PS/2 as a result of starting on real hardware
<meisaka>
that's a feature that probably won't see much use
<doug16k>
on PC hardware, you can just keep using it, even for usb, as long as you don't do the bios handoff on the usb controller
<doug16k>
it emulates a keyboard controller for you
<meisaka>
that's very nice of it, but I want to write a usb driver after I finish my net stack
<doug16k>
then you won't need to worry about ps2
<doug16k>
I made an xhci driver and support usb mouse, keyboard, hub, block storage, so far
<meisaka>
:o
<meisaka>
guess I know who to ask if I get stuck :P
<doug16k>
I'll try
<doug16k>
I strongly recommend that if you haven't made a few different pci drivers, go make ahci and nvme first
<doug16k>
it'll make your brain able to wrap your head around what you have to do
<doug16k>
it's like nested instances of a whole ahci/nvme, kind of :P
<meisaka>
my only pci driver is for 825** gigabit ethernet
<doug16k>
that pop in and out of existence
<doug16k>
as far as dealing with the rings and stuff
<doug16k>
ok so you get the ring idea right?
<doug16k>
if you get that you are ready to start usb driver
<meisaka>
I think so, enough to get network packets sent/received at least
<doug16k>
it's a lot of that
<doug16k>
popping in and out of existence on the fly, instead of just trivially setting it all up pristine once and for all like some other drivers
<doug16k>
it's not that bad really
<doug16k>
typing in the structs from docs is worst part :P
<doug16k>
the design is broken when one command takes 73.117 years
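The "ring idea" in miniature: a generic producer ring with an ownership bit, not any specific controller's descriptor layout:

    #include <stdint.h>

    struct desc {
        uint64_t buf;      /* physical address of the data */
        uint32_t len;
        uint32_t flags;    /* bit 0: owned by the device */
    };

    struct ring {
        struct desc *d;
        unsigned head, size;
    };

    static void ring_submit(struct ring *r, uint64_t buf, uint32_t len)
    {
        struct desc *e = &r->d[r->head];
        e->buf = buf;
        e->len = len;
        /* publish the descriptor before handing ownership to the device */
        __atomic_store_n(&e->flags, 1u, __ATOMIC_RELEASE);
        r->head = (r->head + 1) % r->size;
        /* then write the controller's doorbell register so it notices */
    }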
arch-angel has joined #osdev
<doug16k>
meisaka, if you want to straight-line to a decent framebuffer, and set it to the native monitor resolution and leave it, efi or bios call will do that much
<doug16k>
realistically, what modesetting are you going to do?
<doug16k>
something not native resolution? come on
<meisaka>
right now my bootloader just uses VESA bios calls to set native resolution
<doug16k>
you can enumerate multimonitor on efi bootloader
<doug16k>
get all the framebuffers and set all the modes
<meisaka>
I was actually more interested in accelerated 2D bitblts
<bslsk05>
github.com: dgos/modelist_efi.cc at master · doug65536/dgos · GitHub
<meisaka>
I also haven't been targeting EFI yet
iorem has joined #osdev
arch-angel has quit [Quit: WeeChat 3.1]
<doug16k>
I jumped at the opportunity to support every video card modeset, instead of some 2060 super modeset
<klange>
doug16k: I don't think I've ever seen EFI actually give multiple heads on real hardware unless you have multiple adapters that can work together (which is rare), though.
<doug16k>
it's up to the video rom
<doug16k>
I couldn't care less how to modeset one exact card
<meisaka>
modesetting many cards would be nice
<doug16k>
are they PCI compatible?
<meisaka>
I only have one, it's PCIe
<doug16k>
all you need to do is find out what you poke into the MMIO and the framebuffer is the other BAR
<doug16k>
and it probably just has an array of registers in there somewhere
<doug16k>
then just do a modeset, snapshot it, do another, snapshot it, look at the diffs, and find it by malicious attack
<doug16k>
I totally RE'd my water cooler by known plaintext attack
<doug16k>
look at the BARs and look at the ranges it maps
<doug16k>
it's in an MMIO one most likely
<doug16k>
then modeset by playback attack
<doug16k>
they didn't have a linux driver so I wrote one
<bslsk05>
doug65536/kraken - Linux kernel module to control and monitor NZXT liquid coolers (is fork /1 forks/0 stargazers/GPL-2.0)
<doug16k>
klange, if they were two different manufacturer cards, I don't see how it could not work
<doug16k>
maybe see any other card and refuse on purpose in init? if they are that mad
<doug16k>
it would execute both roms
<doug16k>
oh wait
<doug16k>
yeah if it was two video cards it would almost surely work
<doug16k>
you mean multi head on one card. yeah I can see that maybe unimplemented
<doug16k>
we better go add it to tianocore quick so when they copy paste it 10 years from now it'll work :P
warlock has quit [Quit: Lost terminal]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
iorem has quit [Ping timeout: 258 seconds]
<doug16k>
nice... I have ../qemu-rom/configure --autopilot and it just creates all the toolchains and does each arch build and launches a qemu in the background, all by itself
<doug16k>
3950x not included though
<doug16k>
what it should do is start all the architectures in parallel
<doug16k>
with 1/Nth the cpus
<doug16k>
then the stupid sequential configure steps are parallel
Sos has joined #osdev
<doug16k>
in gcc build
<doug16k>
toolchain*
angelsl has left #osdev [Quit]
<klange>
doug16k: a common thing when you have two separate cards from different manufacturers is the motherboard turns off one or the other on startup, and outside of nvidia driver situations on Linux I never really got _why_ that was a thing, but it is.
<klange>
"power savings" maybe c'mon at least let me turn on both as an option?
<Mutabah>
probably VGA emulation reasons
<Mutabah>
I recall that some BIOSes allowed you to disable that
<klange>
Back in the CGA days DOS had support for multi-heading with an MDA and CGA card.
<klange>
With the Optimus (etc.) setups you have the problem of "sure, you can turn on both cards, but only one is hooked up to the ports, and you're lucky if the firmware modesetting even supports the right resolution for the panel; the external ports? hahaha"
<doug16k>
klange, totally not a thing in my experience. I have had nvidia and amd card together a few times
<doug16k>
but I don't mean efi-wise
<doug16k>
the "real" drivers had no problem coexisting perfectly
<doug16k>
why wouldn't they? they can hardly tell the other one exists
<doug16k>
they are all dead as a doornail at power up. one gets bars set up fully
<doug16k>
at least
<klange>
I am sitting, right now, on a box that will not let me turn on its onboard Intel chipset alongside an attached PCIe card - I must pick one or the other in the EFI settings.
<doug16k>
you must pick one or the other to be the head it uses in efi screens yeah
<klange>
Granted, this is several years old and perhaps the motherboard manufacturers have decided not to do this anymore as it is quite silly.
<doug16k>
how could one lock out another if it tried?
<klange>
No, I literally do not have the card on the bus anymore.
<doug16k>
firmware could cripple it like that or it could be a defect workaround
<klange>
It's the motherboard, not the card, that does this. And it does it by just _turning it off_.
<klange>
Again, yes, it is the firmware doing the dumb evil thing that the firmware alone is capable of doing.
<klange>
For once, it is not nvidia's fault :D
<doug16k>
it could be pcie switch limitations
<doug16k>
it can't have that many lanes. has to cut one thing off or another. no switch
<klange>
That doesn't really make sense on a gaming motherboard that would happily accept two such cards.
<doug16k>
sometimes the ahci and stuff are like that. there isn't a full tree of pcie switches, so they make it one or the other
<doug16k>
builtin though
<klange>
And the on-board chipset is on a lane shared with other CPU peripherals, so...
<doug16k>
the slot ones would just be on a switch
<klange>
Honestly, you want my actual theory? It's so that Windows gamers who don't actually know anything at all about computers don't send support complaints about their on-board Intel chipset when they installed their fancy nvidia gtx 9324123
<doug16k>
yes exactly
<doug16k>
there are zero reviews of how stupid the firmware is or how bad the motherboard implementation is
<doug16k>
nobody checks
<doug16k>
I want to see an octopus of pcie expanders and 16 video cards working
<doug16k>
crc check that
<doug16k>
furmark them all at once
<klange>
I don't _think_ Windows has the same sort of library compatibility issues with DirectX or even OpenGL that we suffer through with Nvidia's binary blobs coming with their libgl, but I would not doubt there's some wackiness on that side of the world that makes someone's Overwatch not run right if they turn on both cards...
<doug16k>
not even close to a stress test to be found
bradd has joined #osdev
<doug16k>
my nvidia card is infallible
<doug16k>
I haven't had the slightest problem
<doug16k>
since 1060 until 2060 super
<doug16k>
and still fine
<doug16k>
you mean in a laptop?
<doug16k>
that's way harder to be perfect
<klange>
It's not a hardware thing, it's just that the way nvidia supports Linux is by shipping their own GLX implementation, and to make everything find that you have to just plop it in with everything else...
<klange>
Like half the annoyance and "magic" of Bumblebee and other Optimus support on Linux is getting everything to find the right libgl :D
<klysm>
doug16k, have you tested it with vfio-pci
<doug16k>
tested what
<klysm>
your nvidia card
<doug16k>
oh my video card
<doug16k>
no
<doug16k>
why
<klysm>
mine works, is a GTX 960, using it right now
<doug16k>
ah, I was almost expecting you to say "good luck"
<klange>
ugh shit that reminds me
<klange>
THAT is the most annoying part of this
<klange>
I have TWO cards, right? So what if I wanted to attach a monitor to the on-board port that connects to the Intel one and hand it to a VM? You can't because you can't turn it on!
<doug16k>
you can't
<doug16k>
there is no output on the optimus discrete gpu
<klange>
This is not optimus.
<klange>
This is a desktop.
<doug16k>
it dmas to the igpu framebuffer
<doug16k>
ah
<doug16k>
sorry
<klysm>
my bro has AMD Vega64, and his is working too, with some tweaks
<doug16k>
you should totally expect to slap any number of video cards in, and it works
<doug16k>
it did for me I swear
<klange>
Desktop: i6 with the regular on-board Intel chipset, which the firmware has disabled because there's a GTX 1080 plugged in.
<doug16k>
any ones
<doug16k>
disabled as in, it left the PCI config space command register with 0 in I/O and memory space enable, yeah
<doug16k>
that can be solved in microseconds
<doug16k>
after you set the still zero bars
<doug16k>
you know how to autoconfig bars. do that
<doug16k>
I assume you do
<klysm>
do i6 intel chips support FLR function level reset? (from lspci -vvv)
<doug16k>
klange, unless it is just plain physical limitation like mentioned, then yeah you can't
<klange>
Can't what? Are you echoing me or responding to something in particualar?
<doug16k>
I am qualifying what I said to you
<doug16k>
you can probably configure that "disabled" other video card yourself
<klange>
Maybe after work hours I'll try poking some PCI regs directly and see if I can get it to even show up in a bus scan.
<doug16k>
I have code that sets up blank pcie in my rom project
<doug16k>
goes straight to the point
<doug16k>
it's the "write all ones" "read it back" "negate to get size" thing
<doug16k>
if it needs 4K of address space, then the low 12 bits are stuck as zero
<doug16k>
(not including 4 bottom bits)
<doug16k>
they are flags
<doug16k>
pretend they say zero when doing the probe
<doug16k>
so the 16MB framebuffer, that bar has the low 24 bits stuck as zero
<doug16k>
that's how you can tell it is 16MB
<doug16k>
and not even know it is a video card
<doug16k>
all you care is header_type says 0
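That probe as a sketch, with illustrative pci_read32/pci_write32 config accessors; a 64-bit BAR would also write and read back the upper half:

    #include <stdint.h>

    extern uint32_t pci_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off);
    extern void pci_write32(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off,
                            uint32_t val);

    /* Size decoded by a 32-bit memory BAR; 0 means not implemented. */
    static uint32_t pci_bar_size(uint8_t bus, uint8_t dev, uint8_t fn, int bar)
    {
        uint8_t off = 0x10 + bar * 4;
        uint32_t old = pci_read32(bus, dev, fn, off);
        pci_write32(bus, dev, fn, off, 0xffffffff);
        uint32_t probe = pci_read32(bus, dev, fn, off) & ~0xfu; /* drop flag bits */
        pci_write32(bus, dev, fn, off, old);                    /* restore */
        return probe ? ~probe + 1 : 0;  /* stuck-at-zero low bits negate to size */
    }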
<klange>
While we're on the topic of such PCIe GPU riddles, this ThinkPad has the option to switch to "just" the nvidia GPU. I believe you when you say it's still just DMAing to the Intel chipset, as there's no way they're both hooked up to the LVDS, but in that case I get the same view of the PCI tree as I do on this desktop - the Intel card is gone. Is it quietly being managed behind the scenes?
<doug16k>
that's non bridge
<doug16k>
I am not sure exactly how the drivers present themselves. I think the intel drivers do appear to vanish yeah
<doug16k>
I think the igpu sets it up and doesn't have to do a thing anymore and basically hands off all the way
<klange>
It's not the driver, it's the PCI device not showing up in scans.
<doug16k>
in linux scans?
<doug16k>
or your own scans
<doug16k>
linux may scan more thoroughly with ACPI info
<klange>
Me, Linux, whatever, lspci - what once showed an 8086:0046 at 00:02.0, now there is nothing.
<doug16k>
you using pcie ecam or legacy i/o port?
<klange>
Hm, presumably linux is doing ecam, I'm doing legacy i/o port.
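The legacy mechanism as a sketch: address through port 0xCF8, data through 0xCFC; outl/inl stand in for whatever port I/O helpers the kernel has:

    #include <stdint.h>

    extern void outl(uint16_t port, uint32_t val);
    extern uint32_t inl(uint16_t port);

    static uint32_t pci_cfg_read32(uint8_t bus, uint8_t dev,
                                   uint8_t fn, uint8_t off)
    {
        uint32_t addr = 0x80000000u            /* enable bit */
                      | ((uint32_t)bus << 16)
                      | ((uint32_t)dev << 11)
                      | ((uint32_t)fn  << 8)
                      | (off & 0xfc);          /* dword-aligned register */
        outl(0xCF8, addr);
        return inl(0xCFC);
    }

A function the firmware has powered off typically reads all-ones here (vendor 0xffff), so it stays invisible to any scan until it's re-enabled.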
<klange>
and grub is just there trying its best to vesa, which against the nvidia card gives me a 640x480 display that the LVDS is dutifully scaling in hardware.
<doug16k>
intel vesa is awesome in my experience. nvidia is terrible?
<doug16k>
intel supports everything for vesa when I checked my laptop
<klange>
Mobile definitely was in this era, and it might be doubly so from being an Optimus chip.
<doug16k>
I mean optimus too
<klange>
I only get 1280x800 on a 1440x900 panel, but VESA at least sets up the timings for 1440x900 so I can just tell the card directly to switch the source framebuffer resolution - thank frick that's well documented.
<klange>
[on intel]
<doug16k>
does it make sense to even try to use that card?
<doug16k>
and do what? use its memory?
<doug16k>
it's the one in the cpu driving the monitor in the end
<klange>
The nvidia one? Nope, best to leave it off, it wasn't even really capable of gaming circa 2010, just a waste of power.
<doug16k>
might as well be cpu accessing same memory, no?
<doug16k>
ah
<doug16k>
you mean the intel rom you can't select native res?
<doug16k>
sorry I am confused
<klange>
sorry I'm being confusing
<doug16k>
when I checked my laptop, the intel vesa bios was so good it even had a protected mode entry point and API
<doug16k>
and had DDC thing
<klange>
With the Intel card [and further, with the nvidia card turned off in the BIOS [and yes, this is the generation immediately before Lenovo started shipping EFI]] the best vesa mode I can get is 1280x800, but the panel resolution is 1440x900.
<klange>
This is very common of the era, and around 2010 you were lucky to get _widescreen_ at all.
<doug16k>
wow
<klange>
BUT, the saving grace is...
<klange>
That vesa modeset, in the video rom, _does_ set up the timings for the LVDS perfectly for its optimal resolution and refresh rate.
<klange>
This is also somewhat ironic in my mind as obviously it has everything it needs for the LVDS timings and it's intentionally setting the "source framebuffer" size to 1280x800, so... why? why do that?
<klange>
But alas, at the end of the day, if I ask Grub for that 1280x800 mode, I can go in and poke the right registers according to the published manual and say, oh that source framebuffer? let's just make that 1440x900...
<klange>
And there we go. Almost as simple as the Bochs VBE!
<doug16k>
that sort of thing should boil down to a playback attack, if you must
<doug16k>
stick magic values in magic places
<klange>
I was ready to do one! I had the timings all ready to go from my X11 log in Linux!
<klange>
But I figured I'd actually look at what was already there and they beat me to it!
<klange>
Calculating the timings off the EDIDs is supposed to be the "fun" part!
<doug16k>
it's actually fascinating what it says in some of the descriptors in devices
<klange>
The Managarm team led me astray, they said I had to turn off the plane before configuring it but that's wrong! It doesn't accept the register modifications if it's off! It needs to be on!
<doug16k>
if I saw the device descriptor for my hub, I'd have knocked all the other hubs on the floor and kicked them down the aisle :P supports everything
<klange>
* granted their driver is for the prior generation, and they supposedly only tested it on one particular card, much like I'm only testing on this thinkpad
<doug16k>
exactly. those two bars are there on every video card
<doug16k>
the gap is probably because they are 64 bit and it's a pair of bars then
<doug16k>
if not 64 bit then space reserved for 64 someday
<klange>
nah, it's because this card is from the era of shared GPU memory, so that gap is its share of actual RAM.
<doug16k>
yeah, it's just a bus agent though. it uses bus master transactions for "ram" access and that goes through the normal memory controller. mostly tagged "no-snoop" though, so it doesn't do the usual magical stuff
<doug16k>
it doesn't really care if that is in the cache or not
<doug16k>
^ no snoop
<doug16k>
it just dmas the hell out of ram
<doug16k>
that's how they make it look
<doug16k>
what they actually do is another thing
<klange>
I should probably actually look at the memory map for this and see where the rest of my RAM is, since it has 4GiB and only 512 of that is stolen by the GPU and there's that whole region of MMIO space that _isn't_ backed by real memory... so probably just above that...
<klange>
512MiB*
<doug16k>
most likely there is a 16MB window where the cpu can reach into it
<doug16k>
everything else is just bus master bursts, same as if it were a hard disk controller doing reads and writes
<doug16k>
all over anywhere
<doug16k>
not bar based
<doug16k>
it just emits bus master reads and writes
<doug16k>
on its pcie lanes
<doug16k>
that's my understanding
<doug16k>
like when people say "why doesn't it just let me directly access my texture in video memory, why does it lock a memory copy". you can't. it's too hard to reach into the gpu in that 16MB PCI max MMIO window into video memory
<klange>
oh sorry it's not 512, it's 256
<doug16k>
you must do a dma thing like a hard disk read/write would dma
<klange>
and that is represented in the BAR
<klange>
but this _does_ let me directly access things, because this _is_ the GPU's memory, all 256MiB of it, right here in my DDR. Because that was a thing. For several years of mobile chipsets.
<doug16k>
soon there are going to be "resizable BARs", and you can just map an unlimited mmio window and just map the entire gpu ram
<bslsk05>
'Working GMA 950 Mode Switching' by froggey (00:00:36)
<klange>
bet my code is shorter
<froggey>
probably
<doug16k>
klange, yes you can access that memory, but it is not coherent
<doug16k>
you have to arrange for barriers
<doug16k>
like I mentioned, it doesn't really care if you have a line of that waiting for writeback in the cache
<doug16k>
it will just miss it
<doug16k>
it can also speculatively fetch stuff really far ahead of time, needs barrier to really work
<doug16k>
gpu can
<klange>
Ah, yeah, sure.
<klange>
And that's why I need to set up the right no-cache, write-through mode for the framebuffer or it will be slow; not because it's slow to read and write the memory, but because my writes just aren't actually getting to the memory right then and there for the GPU to see them.
<doug16k>
should use WC on framebuffer
<doug16k>
it will coalesce to bursts
<doug16k>
and allow load bursts, and expect it to not change out from under it
<doug16k>
lots of stuff will cause it to be flushed if you don't do a thing to flush it
<klange>
ah right, and I am, as I look at this code
<klange>
or at least that's what I called this flag, MMU_FLAG_WC
<doug16k>
sounds right
<doug16k>
should get good perf between the code and the bus if it uses that
<doug16k>
bus will like the nice full size bursts
<klange>
Smooth as butter and all my animations look great.
<doug16k>
awesome
<doug16k>
the pci registers tell you exactly what to do
<doug16k>
if it says prefetchable (bit 3 set) then it is ok to use either WC or WT
<doug16k>
if prefectchable is 0, then you must use UC
<doug16k>
you would use WC if it is a big buffer you throw data into and speed is more important than anything, and WT for something where you aren't joking when you do a store, you want it now
<doug16k>
think of prefetchable=1 meaning, it will never change out from under you. you read the last value you wrote
<doug16k>
prefetchable=0 means, all the values are a big surprise, every read, even if you just read it
<doug16k>
and nobody wrote it
wolfshappen has joined #osdev
MarchHare has quit [Ping timeout: 245 seconds]
<doug16k>
the name prefetchable comes from it being ok to do speculative or spurious reads in that region. it wont cause a side effect
sortie has joined #osdev
<doug16k>
if prefetch has an address that it isn't even sure the program is going to execute, and the address is in a WT, WB, WC or WP region, then it goes ahead with it
<doug16k>
if UC it says ya right, come back when you are certain you are going to execute it
<doug16k>
so every UC stalls until the pipeline is empty, then it starts
<doug16k>
everything before the UC address must be complete and it *really* is going to start it, then it starts the access
<doug16k>
being able to sneak the fetch ahead of time helps a ton if you obey prefetch bit
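doug16k's rule of thumb as code; the enum and helper are made-up names:

    #include <stdint.h>

    enum map_type { MAP_UC, MAP_WC, MAP_WT };

    /* Memory BAR bit 3 = prefetchable: reads have no side effects and you
     * read back what you wrote, so WC (bulk throughput) or WT (stores you
     * mean right now) are safe. Anything else must be UC. */
    static enum map_type bar_map_type(uint32_t bar, int bulk)
    {
        if (!(bar & (1u << 3)))
            return MAP_UC;
        return bulk ? MAP_WC : MAP_WT;
    }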
tenshi has joined #osdev
<doug16k>
I'm glad it won't write doorbells when my program hasn't even completed all if statements that it guessed about up to that doorbell write
<kazinsal>
alright, rewrote parts of my PCI device initialization process to make adding new build modules easier as well as fixing a bug where different drivers being in use affected interface ordering
<sortie>
Go go kazinsal
<sortie>
I definitely need a much better driver model
GeDaMo has joined #osdev
<klange>
My thought a week ago was to shove PCI IDs into a section in the Elf objects of modules and then have a userspace utility poke the kernel pci interface to find out what's there and then load appropriate drivers by identifying them by that section
gareppa has joined #osdev
gareppa has quit [Remote host closed the connection]
Belxjander has joined #osdev
uplime has quit [Quit: no ill be right back, i promise]
Arthuria has joined #osdev
j00ru has joined #osdev
dormito has quit [Ping timeout: 272 seconds]
SpikeHeron has quit [Quit: WeeChat 3.1]
dutch has joined #osdev
gog has joined #osdev
j00ru has quit [Quit: leaving]
j00ru has joined #osdev
j00ru has quit [Client Quit]
j00ru has joined #osdev
dormito has joined #osdev
isaacwoods has joined #osdev
knebulae has quit [Read error: Connection reset by peer]
Arthuria has quit [Remote host closed the connection]
transistor has quit [Ping timeout: 252 seconds]
blyat-73 has joined #osdev
Arthuria has joined #osdev
<doug16k>
klange, I have that in my modules, a .driver section, then a platform defined program header that points straight to the struct
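A sketch of that tagging: a match record dropped into a dedicated section that the loader (or a userspace tool reading the ELF) can enumerate. The struct layout and section name are illustrative; the IDs shown are qemu's e1000, just as an example:

    #include <stdint.h>

    struct pci_driver_match {
        uint16_t vendor, device;    /* 0xffff could mean wildcard */
        int (*probe)(void);
    };

    static int e1000_probe(void) { /* ... */ return 0; }

    /* "used" keeps it alive despite having no references; the loader finds
     * it by scanning .driver (or via a program header pointing at it). */
    __attribute__((used, section(".driver")))
    static const struct pci_driver_match e1000_match = {
        0x8086, 0x100e, e1000_probe,
    };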
<klange>
remind me, what do you build modules as?
<doug16k>
as shared object
<doug16k>
very similar to yours - mine handles unlimited distance, is the difference
<doug16k>
I assume you use relocatable link?
<doug16k>
yours is slightly faster
<klange>
I was gonna do relocatables again like I did with toaru32, but haven't done it yet
mctpyt has quit [Ping timeout: 264 seconds]
<doug16k>
ah
<doug16k>
shared object is easiest to do
<doug16k>
relocatable is uglier
<doug16k>
relocations I mean
<klange>
My test.ko is doing -mcmodel=large so I can throw it up top while my kernel is still in its low-mapped home
<doug16k>
bunch of annoying things that were relative to a base are just relative to the module
<doug16k>
to the section*
<klange>
but I have aspirations of remapping the kernel up north... I already put everything else up there
<doug16k>
why not -fpie
<klange>
for my kernel? because it scares me
<doug16k>
the cpu loves it
<klange>
will my assembly bootstraps still work? will it make things slower with ip-relative instructions I wasn't expecting? will it cause demons to fly out of my nose?
<doug16k>
show me what errors and it's a near instant fix every time
<klange>
if I do that can I literally just plop the kernel up top and jump and pretend nothing happened?
<doug16k>
almost every time, all you need is to add (%rip) after the global name
<doug16k>
if you did fancy some_array(%eax,%edx,4) then we have a slight issue and need 1 more insn
<doug16k>
a lea
<klange>
Part of my apprehension to doing anything right now is also from debugging, at least with my current mapping I know that my debug symbols have the right addresses
<doug16k>
you lea that some array
<doug16k>
lea somearray(%rip),%reg
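Both fixes as a sketch in GNU inline asm; some_global and some_array are illustrative:

    extern long some_global, some_array[];

    long pie_loads(long idx)
    {
        long v, *base;
        /* absolute `mov some_global, %rax` becomes RIP-relative: */
        __asm__("mov some_global(%%rip), %0" : "=r"(v));
        /* indexed `mov some_array(,%reg,8), ...` takes a lea first: */
        __asm__("lea some_array(%%rip), %0" : "=r"(base));
        return v + base[idx];
    }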
<j`ey>
doug16k: I cant pass --host=path/to/compiler for qemu-rom D:
<doug16k>
j`ey, oh I didn't know that was a thing
<klange>
is that... is that a thing?
<j`ey>
I dunno!
<doug16k>
if you want to force it, just do CXX=somecrazy-g++ ../src/configure
<klange>
I've not heard of that, let me poke some autotools vomit
<klange>
You're _supposed_ to set CC, CXX, etc. when calling configure
<klange>
if --host=triplet doesn't do what you want
<doug16k>
just --host means figure it all out from cross prefix
Arthuria has quit [Killed (NickServ (GHOST command used by guest2795))]
Arthuria has joined #osdev
<j`ey>
i see
<klange>
I guess there's --host-cc but idk if that's common?
<klange>
even autotools-generated configure scripts are only slightly standardized
<klange>
and it's all just "recommendations"
<doug16k>
if you had "super9000-elf-g++" in the path, then `../configure --host=super9000-elf` should pick it up
<klange>
One thing I was considering was to make misaka a combination of an elf32 stub that supports Multiboot loading, and then an Elf64 "actual" kernel patched onto the end that gets linked at -2GB and is loaded there by the elf32 stub.
<j`ey>
turns out the main toolchain i use doesn't package g++ lol
Arthuria has quit [Ping timeout: 250 seconds]
<doug16k>
j`ey, then just CXX=weird-compiler ../xxx/configure
<j`ey>
I mean it only provides a C compiler.. gcc
<doug16k>
no C++?
<doug16k>
sorry I don't do memory leak roms :P
<j`ey>
:)
<j`ey>
yeah, no C++. it's a toolchain i use for linux, so I guess that's why no c++
<doug16k>
how can you have no c++?
<klange>
jeez, even I have a g++
<j`ey>
doug16k: cos linux doesn't use c++? :P
<doug16k>
I almost have autopilot working
<doug16k>
I have it so you can just go to some directory and run `../whatever/configure --autopilot` and it goes and gets the toolchain script and uses it to build all the toolchains needed, then builds my project for every arch and spawns a background disowned qemu for each as it builds
<doug16k>
with it running
<doug16k>
do you have a modestly powerful machine?
<klange>
that sounds resource-intensive :)
<doug16k>
it is
<doug16k>
autopilot uses a lot of power, sorry
<doug16k>
it sledgehammers it to work
<doug16k>
I have it so it does a sequential loop of -j32 stuff per toolchain, when it should be 6 concurrent -jceil(32/6) makes
<doug16k>
so it will do all the configures in parallel
<doug16k>
6 j6 builds in parallel should be way faster than 6 j32 builds sequentially
<doug16k>
right?
<doug16k>
I should measure
<doug16k>
6 configures in parallel is a huge win right off the bat
<doug16k>
6 sequential parts
<doug16k>
you are right though, all architectures is a bit crazy, only for big machines
<doug16k>
autopilot=aarch64,x86_64 would be neat
<j`ey>
doug16k: for my qemu build: 2579 files, 32s
<j`ey>
(vs your 8000s 1:30 i think)
<doug16k>
not bad
<doug16k>
then you could do all arch in a couple of hours or so
<doug16k>
ya that is a lot
<doug16k>
how long does it take you to build gcc?
<j`ey>
never tried it
<j`ey>
I use a prebuilt toolchain
<doug16k>
what?
<doug16k>
how are you going to debug gdb like that? :P
<bslsk05>
github.com: dgos/module.ld at master · doug65536/dgos · GitHub
<doug16k>
elf already does the same trick for the .dynamic section
ZombieChicken has quit [Remote host closed the connection]
<doug16k>
glad I looked in that linker script, wtf am I doing with align. fixed that, and also the eh_frame_hdr handling that was broken
<doug16k>
fixed
[Brain] has joined #osdev
Arthuria has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
nyah has joined #osdev
nyah has quit [Client Quit]
nyah has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
isaacwoods has quit [Quit: WeeChat 3.1]
flx-- has quit [Quit: Leaving]
<doug16k>
klange, you talking about tagging drivers with device id matching info thing led me to looking at an old bad linker script and fixing weird linking issue. thanks!
ahalaney has joined #osdev
<klange>
whee~
flx has joined #osdev
<klange>
oh man this stack I'm building is terrible because I have the most horrible primitives for this stuff, so while I'm making it "work" I really want to go in and build new primitives
<klange>
I just want... like... an atomic queue that works with scheduling and poll...
<klange>
I have this horrid combination of non-atomic lists combined with spin locks combined with _more_ non-atomic lists that get used to implement the scheduling and... it's just bad, man
<klange>
how did I go so long without so much as a semaphore or condition variable... or _event_...
<klange>
ugh well I got the dhcp client working again but through a socket call, so that's nice at least
<klange>
time for bed, I have the famous Japanese company-provided annual physical exam tomorrow
Arthuria has quit [Read error: Connection reset by peer]
[Brain] has quit [Ping timeout: 265 seconds]
<klange>
These horrid little scheduling things are also most of the bugs in SMP right now, so if I can get rid of them that'd be great.
Arthuria has joined #osdev
warlock has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
theruran has joined #osdev
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
uplime has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
transistor has joined #osdev
MarchHare has joined #osdev
blyat-73 has quit [Quit: Leaving]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
Sos has quit [Quit: Leaving]
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
iorem has quit [Quit: Connection closed]
bsdbandit01 has joined #osdev
EtherNet has joined #osdev
les has quit [*.net *.split]
kanzure has quit [*.net *.split]
raggi has quit [*.net *.split]
livinskull has quit [*.net *.split]
LambdaComplex has quit [*.net *.split]
jeaye has quit [*.net *.split]
GreaseMonkey has quit [*.net *.split]
edr has quit [*.net *.split]
n3t has quit [*.net *.split]
buffet has quit [*.net *.split]
GeneralDiscourse has quit [*.net *.split]
mniip has quit [*.net *.split]
j`ey has quit [*.net *.split]
Darksecond has quit [*.net *.split]
raggi has joined #osdev
les_ has joined #osdev
edr has joined #osdev
buffet has joined #osdev
mniip has joined #osdev
livinskull has joined #osdev
Oli has joined #osdev
gmacd has joined #osdev
bsdbandit01 has quit [Quit: -a- Connection Timed Out]
bsdbandit01 has joined #osdev
<Oli>
Hello, and good morning: This day will feel neat, and a good step will be made towards desires.
<Bitweasil>
(on the art of silent data corruption in cores given a suitably large set of them)
bsdbandit01 has joined #osdev
<kazinsal>
I have now entered the future! Cable internet has been replaced with fancy new fibre
basil has quit [Quit: ooooh, what does this button do?]
basil has joined #osdev
isaacwoods has joined #osdev
<gog>
kazinsal: ooh what a good feeling
mctpyt has joined #osdev
<gog>
for the first week of my fiber they didn't have me throttled to the speed i was paying for and it was so nice
<gog>
until they got around to putting the qos rules on me :p
<kazinsal>
lol
bsdbandit01 has quit [Read error: Connection reset by peer]
dennis95 has quit [Quit: Leaving]
mathway has joined #osdev
mathway is now known as rednhot
GreaseMonkey has joined #osdev
<rednhot>
Hello. I've tried to create an account on the forum.osdev.org, but I didn't receive any message on my email even after asking to resend it. Yes, I am sure that the email is correct. Where is the problem?
<bslsk05>
en.wikipedia.org: Series of tubes - Wikipedia
<GeDaMo>
«"A series of tubes" is a phrase used originally as an analogy by then-United States Senator Ted Stevens (R-Alaska) to describe the Internet in the context of opposing network neutrality.»
<gog>
it's not entirely a wrong characterization, but the intent of it was to make it seem like bandwidth scarcity is natural rather than by design
<gog>
i'm gonna get elected to parliament in iceland and orchestrate the transition of communications infrastructure from private to social ownership just watch me
<rednhot>
hmm.. sorry, but i still can't understand which party is responsible for such a malfunction
<rednhot>
otherwise, what should i do to correct the problem?
<GeDaMo>
Have you checked your spam folder?
Darksecond has joined #osdev
<rednhot>
ooops... yeah, it is in spam :) Thanks and sorry guys
<kazinsal>
good, we can re-schedule the annual tube de-plaquing
<GeDaMo>
Huh, that was just a wild guess :P
gog has quit [Changing host]
gog has joined #osdev
vin has joined #osdev
vin is now known as crash
crash is now known as viin
viin is now known as vin
tenshi has quit [Quit: WeeChat 3.1]
Oli has quit [Remote host closed the connection]
gog has quit [Ping timeout: 258 seconds]
mahmutov has joined #osdev
kanzure has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
bsdbandit01 has joined #osdev
gmacd has quit [Ping timeout: 250 seconds]
<Bitweasil>
It's a good idea to send a huge slug of data through the tubes, it helps flush out the clogs.
bsdbandit01 has quit [Read error: Connection reset by peer]
GeDaMo has quit [Quit: Leaving.]
<vin>
I am looking to measure power consumption of devices (DRAM, disk and CPU) for a particular period. One way to measure the overall load is by using a power meter (like kill-a-watt) but to measure the individual components I suspect I can use a multimeter. Does anyone have prior experience doing this?
<bslsk05>
sosy-lab/cpu-energy-meter - A tool for measuring energy consumption of Intel CPUs (15 forks/46 stargazers/BSD-3-Clause)
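Roughly what such tools do under the hood: sample the RAPL package-energy counter and scale by the units register. A minimal sketch via Linux's msr driver (needs the msr module and root); no error handling:

    #include <fcntl.h>
    #include <stdint.h>
    #include <unistd.h>

    static uint64_t rdmsr(int fd, uint32_t reg)
    {
        uint64_t v = 0;
        pread(fd, &v, sizeof v, reg);   /* msr driver: offset = MSR number */
        return v;
    }

    /* Cumulative package energy in joules; sample twice and subtract. */
    double pkg_energy_joules(void)
    {
        int fd = open("/dev/cpu/0/msr", O_RDONLY);
        double unit = 1.0 / (1u << ((rdmsr(fd, 0x606) >> 8) & 0x1f)); /* MSR_RAPL_POWER_UNIT */
        double joules = (uint32_t)rdmsr(fd, 0x611) * unit;  /* MSR_PKG_ENERGY_STATUS */
        close(fd);
        return joules;
    }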
<geist>
depends on what kind of device it is
<geist>
sometimes they have the different voltage rails brought through a loop somewhere where you can measure the current flowing through it
<geist>
but most of the time not in a 'production' device, like a cell phone, etc
<geist>
i have no idea what kind of platform you're on. a PC?
<vin>
geist: This will be a server class intel machine
<kazinsal>
DRAM power requirements are usually fairly fixed based on voltage
<geist>
ah sometimes those have special circuitry to watch the different rails? otherwise you're SOL
<kazinsal>
GPUs report power usage, I know that much
<geist>
right. it *is* possible for them to have installed ICs that can monitor current of particular rails but they would have had to already do it
<kazinsal>
I think most disks you can estimate 5 watts per and still have generous overhead
<geist>
since it's otherwise too hard to 'break it out' of an existing motherboard
<kazinsal>
less so for spinning disks
<geist>
thing is, a multimeter can easily measure voltage, but measuring current is more difficult, since you have to put the meter in series with the circuit
bsdbandit01 has joined #osdev
<vin>
kazinsal: Yes I found spec sheets for most devices I have and planned to guesstimate once I measure overall load. This is sort of the last resort.
<vin>
I will check if the motherboard supports any such provisions to measure
<geist>
you could measure current at the power supply connector, which would at least let you break out the different 5V, 12V, etc rails
<geist>
but... that's likely to be a lot of amps so be careful it doesn't blow out your multimeter
<vin>
Ah yes that makes sense.
bsdbandit01 has quit [Read error: Connection reset by peer]
<geist>
i think for specialized testing they'd probably have a special interposer socket for DRAM or whatnot that runs the power lines through a loop that can be measured
<geist>
but as kazinsal says i think dram pulls a relatively constant power
<geist>
for pci cards and whatnot it's much simpler to measure the power externally since you can do the same thing
<geist>
some of the gamer review sites have rigs set up to do that and measure power of video cards and whatnot. i definitely know they do for SSDs, hard drives, NVME
<geist>
key though is you probably want to factor out the power supply by not testing at the wall (or maybe you do, if you want real world power draw)
<geist>
it all depends on what you're trying to do really
Arthuria has joined #osdev
<kazinsal>
yeah, like many questions we field here, the real factor behind the answer is "what exactly is it you're trying to accomplish"
<vin>
Right. I am interested in real world power draw. Measuring per device power consumption will help me decide how to distribute the load. The above link I shared is the kind of work I am doing.
<vin>
Essentially building energy efficient algorithms. Not just about keeping a low energy profile but also providing the highest queries per joule possible.
<vin>
This involves deciding what hardware features (SIMD, turbo) or devices (more SSD or DRAM?) I should consider and designing the algorithms accordingly.
<vin>
Which is why I am measuring the device power usage at different loads first :)
<XgF>
geist: measuring gpu power is tricky because they can draw 75w through the slot too
<kc8apf>
vin: most intel server designs use voltage regulators that provide PMbus telemetry
<kc8apf>
you'd need to figure out what parts are used on a specific board and then beg the regulator vendors to give you datasheets
<kc8apf>
or you can try to cut a trace near the regulator, insert a shunt, and measure current that way. The tricky part is that the core rails for some parts get into 100s of amps which requires some careful design.
<kc8apf>
hence why the regulators have the telemetry built in
<geist>
XgF: yah they probably have an interposer for that, but that's relatively easy for something like PCIe
Arthuria has quit [Read error: Connection reset by peer]
Arthuria has joined #osdev
Arthuria has quit [Read error: Connection reset by peer]
dormito has joined #osdev
jeaye has joined #osdev
Arthuria has joined #osdev
<kazinsal>
vin: Generally you'll see an increase in CPU power consumption (both through increased voltage and increased current) if you have any AVX instructions in the pipeline. The CPU will pre-emptively reduce its maximum clock speed by a couple multiplier bins as soon as the prefetcher decodes an AVX instruction
Arthuria has quit [Read error: Connection reset by peer]
<kc8apf>
yup. Controlling power management modes to perform repeatable measurements is extremely difficult
<vin>
Thanks kc8apf , this sounds tricky (haven't done much of hardware hackery before). I will plan it and try it though.
Arthuria has joined #osdev
Arthuria has quit [Read error: Connection reset by peer]
<vin>
Yes kazinsal if it was only about low power consumption this choice would be easy but I am also interested in queries/joule. So it will be a tradeoff
<vin>
on how many cores to use, how many EUs to use at what clock speed, the speed of devices, etc.
Arthuria has joined #osdev
<kazinsal>
My recommendation is to use all cores at whatever clock speed the box says it'll do. So long as you don't use AVX instructions you should stay within the power envelope that's roughly estimated by the TDP
<kc8apf>
Around 2008, EnergyBench was an attempt to create a benchmark for energy efficiency. They ran into all these problems back then. It's only gotten worse since.
Arthuria has quit [Read error: Connection reset by peer]
Arthuria has joined #osdev
<kazinsal>
yeah, everything since 2011 or so has extremely variable automatic power twiddling that makes guessing your effective instructions per joule quite difficult
<kc8apf>
if you manage to control all those variables, then the results are unrealistic because you'll never see such a controlled system
<kc8apf>
this is why I've had great luck with experimental physicists as computer performance analysts. They know how to design the right experiments
mahmutov has quit [Read error: Connection reset by peer]
mahmutov has joined #osdev
j`ey has joined #osdev
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
mahmutov_ has joined #osdev
mahmutov has quit [Ping timeout: 252 seconds]
pretty_dumm_guy has joined #osdev
<gorgonical>
What do you guys think of using Ada for system software? As in, on the topic of translating poor-quality, quick research code to production-usable code
mathway has quit [Ping timeout: 250 seconds]
ahalaney has quit [Quit: Leaving]
craigo has joined #osdev
gog has joined #osdev
* gog
meows
* kazinsal
pats gog
* gog
purrs
graphitemaster has quit [Ping timeout: 264 seconds]
sortie has quit [Quit: Leaving]
mathway has joined #osdev
transistor has quit [Ping timeout: 264 seconds]
<geist>
gorgonical: honestly i think you're going to have a tough time finding anyone that actually knows too much about Ada
<geist>
it *seems* like it'd be a decent enough language for it, but there are probably more modern, more appropriate languages
transistor has joined #osdev
<gorgonical>
naturally the first alternative suggested was rust, but rust is just new
<gorgonical>
it's not necessarily better
<gorgonical>
you are right that ada suffers from a lack of widespread knowledge, though
<geist>
right, but of course it has a lot of the same safety properties
Sos has quit [Quit: Leaving]
<gorgonical>
it's a spectrum, too. I have never used a language more aggressive about typing and safety than ada, and rust can't match it. but rust is probably easier to use and learn and maybe what you get is "good enough" for you
<geist>
why do i occasionally feel the need to watch the ending of Cowboy Bebop again?
<geist>
it always puts you in a mood
<gog>
my wife knows a lot about Ada
<gog>
;)
<geist>
also dunno how good the later ada object oriented extensions are
<geist>
i've heard about them but basically only have a wikipedia level knowledge of the language
<gorgonical>
my understanding is that they are... acceptable. My experience with ada is using it in a way similar to c, so I don't have anything to say about it
<geist>
yah, honestly basic level OO is, i've found, a huge jump, and then it quickly gets out of hand
<kazinsal>
geist: did you see that Yoko Kanno is doing the soundtrack for the live action adaptation?
<geist>
kazinsal: actually that's precisely why i watched the ending again
<geist>
i saw that vid too :)
<geist>
gosh that whole pan up while Blue is playing... shivers
<kazinsal>
haha, yeah, when I saw the lead actors grooving along to Tank! I knew it was going to be good
<geist>
yah i'm not going to pre-judge the live action, but gosh that's big shoes to fill
<kazinsal>
definitely
<kazinsal>
Steve Blum seems excited for it so that's hopefully a good sign
<geist>
does he have anything to do with the new one?
<kazinsal>
not that I know of unfortunately
pretty_dumm_guy has quit [Quit: WeeChat 3.2-rc1]
<geist>
anyway, am excite
bsdbandit01 has joined #osdev
<kazinsal>
same. something tells me everyone involved is doing it because they want to make an awesome Cowboy Bebop show
jaevanko has joined #osdev
chin123 has quit [Remote host closed the connection]
chin123 has joined #osdev
mingdao has quit [Ping timeout: 264 seconds]
farcas has quit [Ping timeout: 272 seconds]
bsdbandit01 has quit [Read error: Connection reset by peer]
<kc8apf>
I've only seen Ada used in defense products
<kc8apf>
specifically avionics
<gog>
i'm sure there are other places where it's in use, but yeah avionics seems to be the primary application
bsdbandit01 has joined #osdev
bsdbandit01 has quit [Read error: Connection reset by peer]
mingdao has joined #osdev
<kc8apf>
There's also SPARK which is derived from Ada
<kc8apf>
that's been used in a lot more high-rel systems
farcas has joined #osdev
bsdbandit01 has joined #osdev
<gog>
maybe i should start telling people i named myself after the programming language XD
jamestmartin has joined #osdev
chin123 has quit [Remote host closed the connection]
chin123 has joined #osdev
mingdao has quit [Ping timeout: 258 seconds]
bsdbandit01 has quit [Read error: Connection reset by peer]
<Vercas>
geist, you weren't playing Chivalry 2 a few hours ago by any chance, right?
<nur>
ok I'm _still_ trying to find out what's messing up my interrupt handler. I pass the registers etc. from an x86-32 asm wrapper to a C function, and if I save that struct in a global the function call works and everything is good. However if I don't... it crashes with a triple fault. The disassembled code for the function is _wildly_ different despite only 1 line of code differing.