dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
mctpyt has joined #osdev
nyah has quit [Quit: leaving]
Turn_Left has joined #osdev
Left_Turn has quit [Ping timeout: 252 seconds]
Burgundy has quit [Ping timeout: 246 seconds]
dutch has quit [Quit: WeeChat 3.8]
spikeheron has joined #osdev
gog has quit [Quit: byee]
<gorgonical>
Question about GIC interrupt grouping: the GICD and GICR have separate registers for configuring the group and security level that an interrupt has.
<gorgonical>
I'm guessing that the GICD is in charge of SPI interrupt configuration and the GICR is in charge of the per-CPU interrupts like SGI, PPI?
<gorgonical>
The main question is whether there's a "hierarchy": SGIs originate at the CPU interface and go through the GICR, but they have to go through the GICD to make it to another CPU. So in that case, does the sending GICR determine the type? The receiving one? The GICD?
bradd has quit [Ping timeout: 248 seconds]
mctpyt has quit [Ping timeout: 260 seconds]
tiggster has joined #osdev
<geist>
think of the GICR as the local apic and the GICD as an ioapic, iirc
<geist>
one of them is indeed per cpu, the other is more of a global thing
<geist>
SGIs i think Just Happen on the other core and there's not really any real overall configuration, since the range is basically reserved
<geist>
but this is just off of memory, so i might be wrong
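(A minimal sketch of the register split being described, assuming GICv3 and hypothetical gicd_base/gicr_sgi_base mappings: SGIs and PPIs (INTID 0-31) are grouped per CPU through that core's Redistributor, SPIs (INTID 32 and up) through the Distributor.)

    #include <stdint.h>

    #define GICD_IGROUPR(n)  (0x0080 + 4 * (n))  /* Distributor group bits, SPIs       */
    #define GICR_IGROUPR0    0x0080              /* Redistributor group bits, SGI/PPI  */

    static volatile uint32_t *gicd_base;      /* hypothetical mapped Distributor          */
    static volatile uint32_t *gicr_sgi_base;  /* hypothetical SGI frame of this CPU's GICR */

    static uint32_t rd32(volatile uint32_t *b, uint32_t off)             { return b[off / 4]; }
    static void     wr32(volatile uint32_t *b, uint32_t off, uint32_t v) { b[off / 4] = v; }

    /* Set (group1 != 0) or clear the Group 1 bit for one INTID. */
    void gic_set_group1(uint32_t intid, int group1)
    {
        volatile uint32_t *b   = intid < 32 ? gicr_sgi_base : gicd_base;
        uint32_t           off = intid < 32 ? GICR_IGROUPR0 : GICD_IGROUPR(intid / 32);
        uint32_t           bit = 1u << (intid % 32);
        uint32_t           val = rd32(b, off);

        wr32(b, off, group1 ? (val | bit) : (val & ~bit));
    }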
mctpyt has joined #osdev
<gorgonical>
hmm
<gorgonical>
If there's no configuration then that suggests anyone can SGI a secure core right?
<geist>
oh in a hypervisor situation that's a different story, but you're right, i think if there's a separate core then yeah there'd need to be some way to mask it off
<geist>
but i dont have the spec in front of me, there may be a mechanism to configure it locally
<geist>
at least some sort of local interrupt mask for sure for that SGI
<geist>
but i dont think there's necessarily a way of specifying which cores can SGI which other cores
<geist>
*aside* from whatever virtualization extensions EL2 may implement
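(Following the point about a local mask: a sketch, again assuming GICv3 and reusing the hypothetical gicr_sgi_base/wr32 helpers above, of a core disabling a given SGI for itself via its own Redistributor, so that being able to send an SGI does not mean the target actually takes it.)

    #define GICR_ICENABLER0  0x0180   /* write 1 to a bit to disable that SGI/PPI locally */

    void gic_disable_local_sgi(uint32_t sgi_id)   /* sgi_id in 0..15 */
    {
        wr32(gicr_sgi_base, GICR_ICENABLER0, 1u << sgi_id);
    }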
joe9 has quit [Quit: leaving]
spikeheron has quit [Quit: WeeChat 3.8]
<moon-child>
is it slow to send IPIs, or just to receive them?
Clockface has joined #osdev
dutch has joined #osdev
<Clockface>
what's the most practical way to emulate a specific piece of hardware for other kernel-mode code
<Clockface>
will i have to just intercept every I/O thing from everything else
<Clockface>
and then replicate all of it "for real"
<Clockface>
except the stuff connecting to the fake device
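(One way to read the "intercept every I/O thing" idea, as a sketch with hypothetical names: route device accesses through a per-region handler table, pass most regions through to real MMIO, and send only the faked device's region to an emulation handler.)

    #include <stdint.h>
    #include <stddef.h>

    struct mmio_region {
        uint64_t base, size;
        uint32_t (*read32)(uint64_t off);
        void     (*write32)(uint64_t off, uint32_t val);
    };

    /* Hypothetical emulated device: one 32-bit "status" register at offset 0. */
    static uint32_t fake_status;
    static uint32_t fake_read32(uint64_t off)              { return off ? 0 : fake_status; }
    static void     fake_write32(uint64_t off, uint32_t v) { if (!off) fake_status = v; }

    /* Everything else is replicated "for real": plain MMIO. */
    static uint32_t real_read32(uint64_t addr)              { return *(volatile uint32_t *)(uintptr_t)addr; }
    static void     real_write32(uint64_t addr, uint32_t v) { *(volatile uint32_t *)(uintptr_t)addr = v; }

    static struct mmio_region regions[] = {
        { 0xfe000000, 0x1000, fake_read32, fake_write32 },  /* the fake device        */
        { 0,          ~0ull,  real_read32, real_write32 },  /* catch-all pass-through */
    };

    uint32_t emu_read32(uint64_t addr)
    {
        for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); i++)
            if (addr - regions[i].base < regions[i].size)
                return regions[i].read32(addr - regions[i].base);
        return 0;
    }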
bradd has joined #osdev
slidercrank has joined #osdev
foudfou has quit [Ping timeout: 255 seconds]
foudfou has joined #osdev
Vercas6 has joined #osdev
Vercas has quit [Ping timeout: 255 seconds]
Vercas6 is now known as Vercas
zxrom has quit [Read error: Connection reset by peer]
mctpyt has joined #osdev
mctpyt has quit [Ping timeout: 246 seconds]
foudfou has quit [Quit: Bye]
foudfou has joined #osdev
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
bgs has joined #osdev
jjuran has quit [Quit: Killing Colloquy first, before it kills me…]
jjuran has joined #osdev
Vercas has quit [Quit: Ping timeout (120 seconds)]
Vercas has joined #osdev
epony has joined #osdev
masoudd has joined #osdev
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
bradd has quit [Remote host closed the connection]
danilogondolfo has joined #osdev
bradd has joined #osdev
hmmmm has quit [Remote host closed the connection]
gog has joined #osdev
slidercrank has quit [Ping timeout: 255 seconds]
Vercas9 has joined #osdev
fedorafan has joined #osdev
Vercas has quit [Ping timeout: 255 seconds]
Vercas9 is now known as Vercas
mahk has quit [Ping timeout: 260 seconds]
GeDaMo has joined #osdev
elastic_dog has quit [Read error: Connection reset by peer]
elastic_dog has joined #osdev
Burgundy has joined #osdev
les has joined #osdev
les has quit [Client Quit]
les has joined #osdev
mahk has joined #osdev
mahk has quit [Ping timeout: 248 seconds]
mahk has joined #osdev
foudfou has quit [Ping timeout: 255 seconds]
foudfou has joined #osdev
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
slidercrank has joined #osdev
<netbsduser`>
a question about unified buffer caches: in general i know pages of these get different treatment from e.g. anonymous pages, because pages of a page cache get written out to their backing store regularly (i think on linux every 30s) rather than only in response to page replacement deciding that a page has to be evicted to make room for another. but nonetheless they also get put back to disk in response to typical page replacement demands too
<netbsduser`>
so consider the case of certain filesystems, which have to enforce invariants like "this journal block has to be written before that metadata block is, else all hell breaks loose." i know that there are a lot of filesystems which do in fact write journals lazily. what approach is usually taken in unified buffer caches to describe such invariants and to ensure that they are not violated by the normal page replacement policy?
<netbsduser`>
i have considered two approaches: one is to let the `struct buf`s associated with a UBC hold dependency information. this would allow the pageout daemon to continue to apply its own page replacement policy (if it deems a page eligible for swapout, and finds it contains bufs which have dependencies, it would then write those dependencies out first.) another is to have it handled at the filesystem level. the page descriptors (or the bufs they
<netbsduser`>
contain) would be marked to say, "fs driver will handle these ones"
bradd has quit [Ping timeout: 248 seconds]
joe9 has joined #osdev
<mrvn>
you write out the dependencies, then throw in a barrier/flush, and only then the depending blocks.
<mrvn>
the kernel will not reorder I/O across barriers
<mrvn>
which is also a problem. Because when you fsync() a file the updates can be stuck behind a barrier with tons of unrelated data, and they can't be fast-tracked because that would require crossing the barrier.
<mrvn>
If you write your own IO system then having a dependency / order graph seems like an improvement over the simple queue strategy generally used.
<netbsduser`>
mrvn: but who writes them in that order? would, let's say, the FS driver submit asynchronous writes to the I/O system, which maintains the ordering information, so that if e.g. the pageout daemon wants to write out a page, the I/O system checks it against its queue of pending writes and orders it appropriately? that might be a wiser approach than either of what i was considering
[itchyjunk] has joined #osdev
mctpyt has joined #osdev
[itchyjunk] has quit [Read error: Connection reset by peer]
heat has joined #osdev
mctpyt has quit [Ping timeout: 248 seconds]
[itchyjunk] has joined #osdev
heat has quit [Remote host closed the connection]
heat has joined #osdev
[itchyjunk] has quit [Read error: Connection reset by peer]
[_] has joined #osdev
craigo has quit [Ping timeout: 252 seconds]
dutch has quit [Quit: WeeChat 3.8]
<mrvn>
netbsduser`: each I/O layer writes its queue in the order the barriers enforce
<mrvn>
There is also no checking. The I/O layers simply perform the I/O they are told to do. If you write out a page twice, it gets written out twice if there is a barrier between the two writes. Maybe even always.
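(A sketch of the dependency bookkeeping discussed above, with hypothetical names: before the pageout path writes a dirty buffer, it first flushes and waits for whatever that buffer depends on, e.g. the journal block that must reach disk before the metadata block, then issues a barrier.)

    #include <stdbool.h>
    #include <sys/queue.h>   /* assumes BSD-style list macros are available */

    struct buf;

    struct buf_dep {
        struct buf *must_write_first;       /* e.g. the journal block          */
        LIST_ENTRY(buf_dep) link;
    };

    struct buf {
        bool dirty;
        LIST_HEAD(, buf_dep) deps;          /* blocks that must hit disk first */
    };

    /* Provided elsewhere in this hypothetical kernel. */
    void bwrite_async(struct buf *bp);      /* queue a write                   */
    void bwait(struct buf *bp);             /* wait for that write to complete */
    void io_barrier(void);                  /* later I/O won't pass this point */

    /* Called when page replacement decides bp has to go back to disk. */
    void pageout_write(struct buf *bp)
    {
        struct buf_dep *d;

        LIST_FOREACH(d, &bp->deps, link)
            if (d->must_write_first->dirty)
                bwrite_async(d->must_write_first);
        LIST_FOREACH(d, &bp->deps, link)
            bwait(d->must_write_first);

        io_barrier();                       /* dependencies are durable now    */
        bwrite_async(bp);                   /* the depending block may follow  */
    }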
fedorafan has quit [Ping timeout: 248 seconds]
dutch has joined #osdev
fedorafan has joined #osdev
srjek has joined #osdev
aoei is now known as Stella
<kaichiuchi>
hi
<heat>
hai
<gog>
hi
masoudd has quit [Remote host closed the connection]
masoudd has joined #osdev
bauen1_ has joined #osdev
bauen1 has quit [Ping timeout: 252 seconds]
<heat>
"However, the modularity of UEFI also makes it easier for HP to innovate. HP DayStarter is a simple value-add to the system allowing users to have access to productivity information while waiting for the system to boot"
<gog>
this is not what uefi is for but it's the inevitable consequence of making pre-boot application development easier
<gog>
good job
<gog>
we heard you liked operating systems so we put an operating system into your firmware
gog has quit [Quit: Konversation terminated!]
<heat>
late stage capitalism EFI
<kof123>
late stage osdev
<kof123>
devours its children
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
<sakasama>
Thank you HP DayStarter. Without this innovative technology I may never have known that useful fact about Chuck Norris.
<heat>
i hope you all realize this is done in SMM
knusbaum has quit [Ping timeout: 248 seconds]
knusbaum has joined #osdev
<sakasama>
I've heard of that! It's kind of like BDSM but participants need double the masochism.
masoudd has quit [Remote host closed the connection]
masoudd_ has joined #osdev
<heat>
no, that is BSD
xenos1984 has quit [Ping timeout: 248 seconds]
xenos1984 has joined #osdev
Turn_Left has quit [Ping timeout: 252 seconds]
dude12312414 has joined #osdev
<clever>
heat: isn't that just a clone of a minimal linux env in the flash? or does it run alongside the os??
<clever>
oh, checking the screenshot, it looks more like an odd overlay, after the bootloader has run??
<clever>
but where is it getting that data from
<heat>
clever, The benefits to the customers are the instant-on user experience with user productivity information (such as calendar, to-do list and customizable information) available for display before and while Windows is booting. The main technology behind it is for the UEFI BIOS to locate the proper JPEG images and use the System Management Mode (SMM) to update the frame buffer content until Windows is ready for system login. At OS runtime, HP
<heat>
implements an Outlook plug-in to capture the calendar information.
<heat>
it uses fucking SMM
<heat>
i hope they do jpeg decoding in SMM for the big funny
<clever>
heat: windows already has a cheat for instant on, they renamed hibernate to shutdown :P
<clever>
so when you think you've turned it off, it just went into hibernate
<heat>
yes, this was in 2011
<clever>
ah
<heat>
imagine how much better DayStarter is these days!
<clever>
smm also explains most of my questions
<clever>
now it can be just as annoying as the HUD on my tv, getting in the way and covering up valuable UI elements
<clever>
until it times out
<heat>
modern daystarter should play youtube vids in SMM :v
<clever>
or, you know, just boot faster :P
<heat>
hmm, good point
<heat>
there's room for a tiktok or two
<clever>
but i have had a similar idea in the past, with that display on the apple keyboard
<clever>
where they replaced the F1-F12 row, with what is basically an ipad
<clever>
fully self-contained computer
<mats2>
outlook in uefi
<mats2>
amazing innovation
<clever>
why not allow that to run on the keyboard, with the system off?
<gorgonical>
though personally it does make programming directly in asm a lot easier
<mjg>
who is highlighting me
<gorgonical>
I have been writing a lot of aarch64 asm and I'm furious about it usually
<heat>
wait
<heat>
wtf is it doing
<heat>
why is it saving %rax
<gorgonical>
mjg: i don't see any mentions
<mjg>
> chad
<mjg>
that was it
<gorgonical>
lmao
<heat>
func: # @func
<heat>
pushq %rax
<heat>
callq *(%rdi)
<heat>
popq %rcx
<heat>
addl $20, %eax
<heat>
retq
<heat>
am I going cray-cray or does this make no sense?
<GeDaMo>
Aligning the stack?
<heat>
for int func(int(**f)(void)) { return (*f)() + 20; }
<heat>
ooooooooh
<heat>
maybe so
<gorgonical>
does the stack need alignment on x86?
<heat>
yes
<mjg>
yes and no
<heat>
GeDaMo, great one! seems to be it
<heat>
gcc just does sub and add
<heat>
now this makes me wonder, why does clang seem to codegen crap here?
<GeDaMo>
16 bytes
<heat>
push %rax makes it depend on %rax
<heat>
cc chad
<mjg>
again with the highlights
<GeDaMo>
The return address pushed by the call misaligns it
<gorgonical>
i wasn't aware that the stack wants/needs to be 16-byte aligned
<heat>
gorgonical, yeah, it's there on sysv at least cuz of SSE
<mjg>
gorgonical: that's only true if you use simd
<gorgonical>
oooh
slidercrank has quit [Ping timeout: 248 seconds]
<heat>
I think it's still true on -mgeneral-regs-only
<gorgonical>
because I'm used to this on arm64, hence ldp/stp instructions and stuff
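(Why the SysV rule exists, as a sketch assuming x86-64 SSE: the compiler is allowed to spill a __m128i local to a stack slot with an aligned movaps/movdqa, and that store only works if every caller kept %rsp 16-byte aligned at its call sites.)

    #include <emmintrin.h>

    __m128i demo(__m128i a, __m128i b)
    {
        /* 'volatile' forces tmp into a stack slot; compilers typically use an
         * aligned movaps/movdqa for that spill, which faults if some caller
         * left %rsp misaligned at the call. */
        volatile __m128i tmp = _mm_add_epi32(a, b);
        __m128i r = tmp;
        return _mm_sub_epi32(r, a);
    }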
<heat>
mjg, but seriously mr chad doesn't that make like 0 sense
<mjg>
dude i'm running on negative brainpower today
<heat>
unless you did something like xor %eax, %eax; push %rax to break the dependency
<GeDaMo>
You can directly alter the stack pointer too
<heat>
yes, gcc does that
Brnocrist has quit [Ping timeout: 268 seconds]
<gorgonical>
then it is a good question why clang just pushes garbage
<mjg>
lol it has tendra
Brnocrist has joined #osdev
masoudd_ has quit [Quit: Leaving]
<GeDaMo>
The only reason that comes to mind is instruction size
dude12312414 has quit [Remote host closed the connection]
dude12312414 has joined #osdev
elastic_dog is now known as Guest218
elastic_dog has joined #osdev
<gog>
gorgonical: my hypothesis is that it's to keep us more in line with business time in most of europe
<gog>
particularly banking and securities trading
<gog>
and that this is owing to iceland's recent history as a dubious and probably corrupt financial player
<heat>
GeDaMo, would make little sense considering I passed -O3 and not -Os
<gog>
and in the case of our infamous finance minister Bjarni Benediktsson, plainly corrupt
<GeDaMo>
Pfft! You can't expect compilers to make sense :P
<zid`>
clang pushes garbage because iceland is corrupt, got it
* zid`
paying attention
<heat>
Big Iceland controls the toolchains
<geist>
re push vs add, i'm guessing it's a combination of instruction size and/or various optimizations for various microarches where sometimes pushes vs direct stack instructions are faster. if you're not specifying a -march it may be up to whatever each compiler thinks it's tuning for
<geist>
i do remember there was a lot of back and forth on fiddling with the stack pointer via anything other than push/pop being slow/fast/maybe
<heat>
yeah but in this case you do not care about what you're pushing
<heat>
so doing a mindless pushq %rax can stall the pipeline no?
<geist>
right, and thus it's just there to align the stack
<geist>
i doubt it, stack stuff is optimized out the wazoo
<geist>
flip side is in some microarches, fiddling with SP directly may stall, because it may have to synchronize the stack engine, etc
<heat>
you think the cpu will notice you never look at it?
<geist>
the push probably not, the pop maybe?
<geist>
as someone else mentioned, arm64 has a lot of these trash push/pops to keep alignment
<geist>
via ldp/stp and sometimes using xzr as one of the regs
divine has quit [Quit: Lost terminal]
<heat>
wait, how much can the CPU optimize the stack?
<zid`>
I bet it doesn't matter unless eax isn't "settled" by the point of the push
<heat>
if you do e.g 1: push %rax; pop %rax; jmp 1b, is %rax ever written to the stack?
<heat>
can it do something really smart and e.g only write if you read that memory region from another thread? or if you get interrupted?
<zid`>
depends how good the uop optimization bits are I guess
<zid`>
I doubt that has a fuse though
<zid`>
zen2/4 might be able to do it
<geist>
yeah there's a ton of optimizations around the stack. it's one of the reasons arm moved the SP out of the main register file as well
<heat>
this calls for a benchmark, doesn't it
<geist>
i think it's fairly standard practice to have a cached copy of the SP floating around fairly early in the pipeline, outside of the general register file, so it can be fast-forwarded between stages to remove any interdependencies between instructions
divine has joined #osdev
<geist>
historically i remember this meant something like: if you did a bunch of push/pops in a row and then tried to read ESP you'd get a stall, because it'd have to 'write back' the cached SP to the main register file first
<zid`>
heat we playing dark souls instead of this?
<heat>
no
<zid`>
even though I was *promised* dark souls? wow
hmmmm has joined #osdev
<heat>
pushpop 3.46 ns 3.45 ns 203580329
<heat>
mov 0.968 ns 0.966 ns 704777624
<zid`>
try it on zen2/4
<heat>
benchmark of 11 push %rax; pop %rbx vs mov %rax, %rbx
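(A sketch of the kind of microbenchmark being quoted, assuming x86-64 and GNU C inline asm; the ns/iter numbers above came from heat's own harness, and the loop body and iteration count here are illustrative, not what heat ran.)

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    static uint64_t now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
    }

    int main(void)
    {
        const long iters = 100000000;
        uint64_t t0, t1;

        t0 = now_ns();
        for (long i = 0; i < iters; i++)
            /* balanced push/pop, so %rsp is unchanged after the statement */
            __asm__ volatile("push %%rax\n\tpop %%rbx" ::: "rbx", "memory");
        t1 = now_ns();
        printf("push/pop: %.3f ns/iter\n", (double)(t1 - t0) / iters);

        t0 = now_ns();
        for (long i = 0; i < iters; i++)
            __asm__ volatile("mov %%rax, %%rbx" ::: "rbx", "memory");
        t1 = now_ns();
        printf("mov:      %.3f ns/iter\n", (double)(t1 - t0) / iters);

        return 0;
    }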
<zid`>
and you definitely didn't straddle an icache line, and you put some gumpf before so the decode was nice and old etc?
<heat>
intel pt sampling on pushpop, pushpop2, subadd
<heat>
i don't fully understand what's going on here but it seems interesting
Left_Turn has joined #osdev
<heat>
my cpu does not seem to have a stalled cycles pmc
<mrvn>
Why do you have a 5-opcode function? Why isn't that inlined? Embrace LTO and your whole benchmark becomes artificial.
<mrvn>
What's the stack alignment on aarch64? 128-bit?
<heat>
yes 16b
<geist>
also a fun thing that you can enable, and virtually all systems do: there are two control bits you can set for EL0 and EL1 that cause it to instantly throw an exception if SP is ever, for any reason, unaligned to 16B
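(The control bits in question, as a sketch assuming AArch64 EL1 and GNU C inline asm: SCTLR_EL1.SA (bit 3) and SCTLR_EL1.SA0 (bit 4) make any load/store through a 16-byte-misaligned SP fault immediately at EL1 and EL0 respectively.)

    #include <stdint.h>

    static inline void enable_sp_alignment_checks(void)
    {
        uint64_t sctlr;

        __asm__ volatile("mrs %0, sctlr_el1" : "=r"(sctlr));
        sctlr |= (1ull << 3) | (1ull << 4);        /* SA (EL1) | SA0 (EL0) */
        __asm__ volatile("msr sctlr_el1, %0\n\tisb" :: "r"(sctlr));
    }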
<mrvn>
Other than the double register load/store does it even matter?
<heat>
simd
<heat>
probably perf
<mrvn>
heat: I lump SIMD load/store in with double-register load/store. Anything above 8 bytes.
<moon-child>
wtf is this benchmark
<moon-child>
like what is it even trying to measure
<heat>
sub add vs push pop
<moon-child>
but why?
<moon-child>
no one does just subs and adds or just pushes and pops
<heat>
clang appears to
<mrvn>
moon-child: except gcc vs. clang
<heat>
for alignment stuff
<moon-child>
yea they do that for stack alignment
<moon-child>
and then they go and do other stuff
<mrvn>
moon-child: The question remains though why one compiler prefers to push an extra reg while the other adjusts the stack pointer by 8 to keep the alignment.
<moon-child>
code size. push is better. But this doesn't demonstrate that because literally all it's doing is pushing and popping
<heat>
why does that mean push is better?
<mrvn>
heat: he means it's smaller.
<heat>
no, he means better
<heat>
"push is better"
<mrvn>
prefixed by "code size"
dutch has quit [Ping timeout: 256 seconds]
dutch has joined #osdev
danilogondolfo has quit [Remote host closed the connection]