<zid>
I like this argument, [rsp] might cause a false data dep, but -128 might randomly segfault or bring in a new cache line, etc
<bslsk05>
git.kernel.org: locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE - kernel/git/torvalds/linux.git - Linux kernel source tree
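[editor's sketch] The commit title above describes using a locked add instead of MFENCE for smp_mb(). A rough C illustration of that idea follows, assuming GCC or Clang on x86-64 and ordinary write-back memory. The names `full_barrier`, `announce_then_check`, `flag_a` and `flag_b` are hypothetical, the -4(%rsp) offset is picked for illustration, and this is not the kernel's actual macro.

```c
#include <stdatomic.h>

/* Hypothetical helper: a full memory barrier built from a locked add. */
static inline void full_barrier(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* A locked read-modify-write orders earlier loads and stores against
       later ones on x86. Adding 0 leaves the stack word unchanged. */
    __asm__ __volatile__("lock; addl $0, -4(%%rsp)" ::: "memory", "cc");
#else
    /* Portable fallback for other targets. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

static _Atomic int flag_a, flag_b;

/* Store-then-load is the one reordering x86 allows, so this is the
   pattern where a full barrier is actually needed there. */
int announce_then_check(void)
{
    atomic_store_explicit(&flag_a, 1, memory_order_relaxed);
    full_barrier();
    return atomic_load_explicit(&flag_b, memory_order_relaxed);
}
```

The trade-off zid mentions shows up in the operand choice: the add needs some memory location, and which stack offset is least harmful is a heuristic.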
<zid>
funfun
<nikolar>
if they weren't, they wouldn't have been called heuristics
<childlikempress>
YOU HAD THAT CACHE LINE ANYWAY AND YOU WERE GONNA SEGFAULT ANYWAY
<childlikempress>
mjg is still on libera, he just didn't come back here because zid is here
<childlikempress>
True story he actually said that lol
<heat>
lol reasonable take
<nikolar>
kek
<zid>
nice, considering I've had him ignored for 6 months
<zid>
that's pretty impressively petulant
<nikolar>
zid: weren't you on mutual ignores
<zid>
no idea
<nikolar>
i think he kept mentioning
<nikolar>
like you did too
<zid>
I just didn't wanna read his 14 page rants about ENTERPRISE
<zid>
kept lighting the channel up
<nikolar>
kek
<heat>
childlikempress, funnily enough gcc switched to lock or while clang is still on mfence
<nikolar>
locks are slow anyway :P
<heat>
clang crapper one could say
<zid>
Also I like the fact that once linux has done this
<zid>
mfence has no reason to ever get fast on new cpus
<zid>
because the kernel uses lock add, which has to remain fast now
<nikolar>
well windows might still use it
<nikolar>
(mfence i mean)
<zid>
yea but it fossilizes
<heat>
mfence does not care about performan
<zid>
which is what we call 'legacy baggage omg x86 so shit'
<zid>
we're watching it be created in realtime
<heat>
windows i mean, i am stupid
<nikolar>
they could just do a similar thing in the background
<nikolar>
since they are both full locks
<childlikempress>
heat: lol
<zid>
like how 'loop' sucks and nobody cares :P
<heat>
lock operations being fully serializing is Well Known
<childlikempress>
tbf no one does straight up fences anyway
<childlikempress>
if you fence you probably did something bad
<childlikempress>
sooooooo
<zid>
x86's fence instructions' semantics are fucky and full of holes anyway aren't they
<zid>
better to just do 'what works'
<heat>
linux has a bunch of special memory barriers like smp_mb__after_spinlock() where it compiles to a compiler barrier on x86
<nikolar>
zid: was loop ever common
<nikolar>
or used at all
<heat>
because grabbing a spinlock already did lock cmpxchg
<heat>
childlikempress, sure you do?
<zid>
nikolar: 8086 era, before uop caches and shit, i assume it saved a cycle?
<nikolar>
heat: yeah, x86 probably has the strongest memory ordering of the archs i know
<nikolar>
zid: eh maybe
<nikolar>
i feel like it came later though
<heat>
userspace people do not use fences because they enjoy the crappen
<nikolar>
not sure
<zid>
yea, x86 having most barriers be nops is so fucking nice lol
<heat>
and thus shit like TSAN does not support fences
<childlikempress>
tbf this is the sort of thing the compiler should in theory be able to handle anyway--skip a fence if you did a fencing thingy recently
<zid>
I'm amazed the kernel works on alpha at all
<zid>
given how fast and loose x86 has taught people to be with memory ordering
<nikolar>
zid: and makes you completely unaware of race conditions until you try running your software on arm or something
<nikolar>
:P
<childlikempress>
except that _still_ no one understands c++11
<zid>
nikolar: alpha not arm, arm is pretty strong still
<nikolar>
well linux has a crap load of tests on different archen
<zid>
arm's major weirdness is that weird bus ordering thing
<childlikempress>
so despite that one paper from ori lahav they are still treating it like a pile of hot potatoes
<nikolar>
not sure about alpha, but i do remember hearing someone talking about alpha hardware in the context of testing
<nikolar>
might be misremembering
<heat>
ARM isn't strong
<nikolar>
indeed
<heat>
alpha is only slightly weaker than ARM
<nikolar>
slightly??
<heat>
yes
<nikolar>
alpha is very stupidly weak
<childlikempress>
heat: ok lol actually why didn't they fix that in llvm yet
<nikolar>
like you need some kind of fence for basically everything
<childlikempress>
but also like
<childlikempress>
you don't need the fence there
<heat>
they had a bunch of special "DO NOT TOUCH BECAUSE OF ALPHA" places in the kernel, but they patched WRITE_ONCE to do the alpha memory barrier for alpha
<heat>
ggez
<nikolar>
kek
<heat>
childlikempress, i don't yeah
<heat>
i mean, i could need it
<nikolar>
alpha can like access the memory location pointed to by the pointer before accessing the pointer itself or something stupid like that
<heat>
and the matter of fact is that shit like a release store on x86 will have seq_cst semantics and the compiler doesn't know it
<childlikempress>
no it won't
<heat>
yes it will
<heat>
lock add is fully serializing boss
<childlikempress>
wym
<childlikempress>
release store maps to a regular store on x86
<heat>
uhhhh yes i am stupid
<heat>
i meant cmpxchg
<childlikempress>
ok yea
<zid>
nikolar: yea alpha is special in that reads and writes are disordered, as well as reads/reads and writes/writes
<nikolar>
ye
<nikolar>
also known as, very very weak
<zid>
most arches have a disconnect between the two, and then either one or the other
<childlikempress>
so you mean something like do { ... } while (!cas(..., relaxed)); fence()
<zid>
but alpha has none of the three
<nikolar>
yea
<Ermine>
if alpha is so good where's beta
<childlikempress>
where on x86 you'd skip the fence bc the cas is actually sc
<heat>
this is possible on the alpha: on the store side: "WRITE(ptr->a, 1); WRITE(ptr->a, 0); WRITE(global, ptr); " read side: "ptr = READ(global); READ(ptr->a) == 1"
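[editor's sketch] heat's store-side/read-side example above is the classic pointer-publication pattern. A minimal C11 version is sketched below, assuming release/acquire is used to forbid the stale read on every target. `struct item`, `slot`, `publish` and `try_read` are hypothetical names, and the kernel itself uses its own primitives rather than C11 atomics.

```c
#include <stdatomic.h>
#include <stddef.h>

struct item { int a; };

static struct item slot;
static _Atomic(struct item *) global;   /* starts out NULL */

/* Store side: initialise the object, then publish the pointer. */
void publish(void)
{
    slot.a = 1;
    slot.a = 0;
    /* Release: both writes to slot.a are visible before the pointer is. */
    atomic_store_explicit(&global, &slot, memory_order_release);
}

/* Read side: returns -1 if nothing is published yet, otherwise the field. */
int try_read(void)
{
    /* Acquire: the dependent read of p->a cannot observe the stale 1.
       A relaxed load plus the address dependency is enough on most
       hardware, but not on Alpha. */
    struct item *p = atomic_load_explicit(&global, memory_order_acquire);
    return p ? p->a : -1;
}
```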
<nikolar>
so you can't call alpha "slightly weaker" than arm
<zid>
hleol ,wrod!l
<zid>
my alpha printf
<childlikempress>
but on arm you'd want it relaxed and only do the fence once after you win
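[editor's sketch] The pattern childlikempress describes, a relaxed CAS loop followed by a single fence once the CAS succeeds, might look like this in C11. `counter` and `take_ticket` are hypothetical names chosen for illustration.

```c
#include <stdatomic.h>

static _Atomic unsigned counter;

/* Atomically claim the next ticket number and return it. */
unsigned take_ticket(void)
{
    unsigned old = atomic_load_explicit(&counter, memory_order_relaxed);

    /* Relaxed CAS: retries are cheap on weakly ordered targets.
       On failure, old is refreshed with the current value. */
    while (!atomic_compare_exchange_weak_explicit(&counter, &old, old + 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
        /* retry with the refreshed value */
    }

    /* One full fence after winning. On x86 the CAS is already a locked
       instruction, so this fence is redundant there, but the compiler
       is not guaranteed to drop it. */
    atomic_thread_fence(memory_order_seq_cst);
    return old;
}
```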
<nikolar>
kek
<childlikempress>
ok that makes sense
<heat>
the best part of all of this shit is that C11's consume memory model isn't implemented
<heat>
but it turns out a relaxed load does what you usually want
<heat>
except on da alfa
cloudowind has quit [Ping timeout: 252 seconds]
<childlikempress>
yeah consume was a mistake
<childlikempress>
there was one proposal i think from the webkit people to do consume with opaque tokens which makes a lot more sense
cloudowind has joined #osdev
<childlikempress>
and paulmck has a paper like 'just do semantic dependencies guys please semantic dependencies i want semantics in my dependencies FUCK YOU BITCH' it's like 100 pages i didn't read it cus fuck that lol but semantic dependencies makes plain relaxed do what you want most of the time
<heat>
paulmck is a god and should have my children
<childlikempress>
except for when it doesn't of course but we don't talk about that
<heat>
bro it just works and if you think it doesnt then GET YOUR FUCKING ALPHA
<heat>
else SHUT THE FUCK UP
<childlikempress>
xd
<heat>
the alpha does not exist and DEC is fucking dead get over it
<childlikempress>
there is another thing that breaks though
<heat>
you can type your insane memory models on your DEC VT100 or some shit
<nikolar>
is there any alpha hardware around at all
<nikolar>
like even on an fpga or something
<nikolar>
(new hardware i mean)
* klys
with this DEC RX02 doing nothing
<heat>
they nuked alpha
<childlikempress>
which is if you wanna do like load(&a[0&load(&b)]) or whatever
<childlikempress>
so load of a has a fake dependency on the load of b
<childlikempress>
compiler will optimise it out unless you have a way to make it not
<heat>
say it with me
<heat>
READ_ONCE
<heat>
i'll say this though
<zid>
why is it called read once btw
<zid>
all my reads read once
<heat>
they do not
<childlikempress>
nuh uh
<childlikempress>
some of them read less than once
<zid>
that's cheating
<heat>
some of them more than once
<zid>
how very droll
<zid>
more than once is VERY cheating
<Ermine>
oh, is paulmck another guy in #osdev hall of fame
<Ermine>
?
<heat>
the linux memory model and, more importantly, its primitives, are way more understandable than fucking C11 atomics
<heat>
paulmck is the RCU guy
<nikolar>
doesn't READ_ONCE just cast to a volatile pointer or something
<heat>
yes
<childlikempress>
READ_ONCE doesn't handle the example i gave though
<zid>
when do I get multiple reads
<klys>
read copy update
<childlikempress>
what you actually want is like asm("sub %0, %1, %1" : "=r"(dst) : "r"(src))
<childlikempress>
and the rest is regular atomic loads
<childlikempress>
or accesses whatever
<childlikempress>
you need the compiler to not know the value is zero so it doesn't just fold it out
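[editor's sketch] A way to write the fake dependency from the load(&a[0&load(&b)]) example so the compiler cannot fold it away, assuming GCC or Clang. `zero_depending_on` and `load_after` are hypothetical names. Only the AArch64 branch creates a real register dependency; the generic branch merely hides the zero from the optimizer, which is an assumption-laden fallback and gives no hardware ordering by itself.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Returns 0, computed in a way the compiler cannot see through. */
static inline size_t zero_depending_on(size_t src)
{
    size_t z;
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    /* z = src - src: always 0, but the result register depends on src. */
    __asm__("sub %0, %1, %1" : "=r"(z) : "r"(src));
#elif defined(__GNUC__) || defined(__clang__)
    /* Generic fallback: an empty asm that claims to modify z using src,
       so the compiler must treat z as unknown. */
    z = 0;
    __asm__("" : "+r"(z) : "r"(src));
#else
    z = src & 0;   /* no protection: a compiler may fold this to 0 */
#endif
    return z;
}

/* Load a[0], with its address computed from the value loaded from b. */
int load_after(_Atomic int *a, _Atomic size_t *b)
{
    size_t bv = atomic_load_explicit(b, memory_order_relaxed);
    return atomic_load_explicit(&a[zero_depending_on(bv)],
                                memory_order_relaxed);
}
```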
<heat>
3 operand instruction? madness.
<zid>
like, I've seen that it's just a volatile cast or whatever, and then gone "okay, so, compiler barrier etc" but never understand why read once, when can I read twice
<childlikempress>
AAAAAAAAAAAAAAARGH64
<nikolar>
heat: intel apx, get with the times
<heat>
u32 *a = ...; *a;
<heat>
compiler can permissively read this byte-by-byte
<nikolar>
zid: well you can read zero times to
<nikolar>
too
<zid>
nikolar: yea zero is understandable, that's another reason to have added volatile
<zid>
heat: oh it just means 'atomic' then?
<zid>
rather than say, read-combined from a 16bit bus
<heat>
READ_ONCE is a relaxed volatile atomic load yeah
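[editor's sketch] The "cast to a volatile pointer" idea can be illustrated with a simplified macro for scalar types, assuming the GNU `__typeof__` extension. `MY_READ_ONCE`, `MY_WRITE_ONCE`, `ready`, `set_ready` and `wait_until_ready` are hypothetical names, and this is not the kernel's real definition.

```c
/* Simplified, scalar-only sketch of the volatile-cast trick. */
#define MY_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define MY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static int ready;

void set_ready(void)
{
    MY_WRITE_ONCE(ready, 1);
}

/* Spin until ready is set or max_spins polls have failed; returns the
   number of failed polls. The volatile access forces one real load per
   iteration: the compiler may not hoist it out of the loop (zero loads
   per iteration) or split it into several narrower loads. */
int wait_until_ready(int max_spins)
{
    int spins = 0;
    while (!MY_READ_ONCE(ready) && spins < max_spins)
        spins++;
    return spins;
}
```

With a plain `ready` in the loop condition, the compiler would be entitled to read it once before the loop and never again.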
<zid>
yea I could see that it was volatile and atomic, but didn't twig that atomic meant you couldn't get a weird read-combined version
<zid>
meaning multiple loads were possible
<zid>
not sure if stupid
<heat>
it is a liiiiiiiiiitle different than non-volatile atomic because e.g this is permissible for C11: for (int i = 0; i < 16; i++) __atomic_load_n(ptr, __ATOMIC_RELAXED) -> for (int i = 0; i < 4; i++) /* load unrolled 4 times */
<bslsk05>
twitter: <fractalmagica> you raise an interesting point, however i have already opened clip studio paint and produced a 23 pagw doujin depicting you as the leashed catgirl (with slight blush) and i as the overly possessive girlfriend with a slightly cruel smile and low saturation hair
<zid>
ori ax; lahf
<zid>
is a valid instruction sequence imo
agent314 has quit [Read error: Connection reset by peer]
karenw has quit [Ping timeout: 248 seconds]
<kof673>
> is there any alpha hardware around at all i think some of the people went to amd and then intel but you will have to dig through old articles :D
<kof673>
so......philosophically perhaps..........but that was years ago........
<kof673>
i don't know if you could even point at anything and say that is where it came from........