<moon-child>
cas can be used to implement all sorts of things
gera has quit [Quit: gera]
<rnicholl1>
But, is there a faster way than CAS to implement "lock" in the happy path?
<rnicholl1>
I don't think so
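A minimal sketch of the CAS "happy path" being discussed, assuming a 4-byte lock word where 0 means unlocked and 1 means locked. `TinyLock` and `tiny_lock_demo` are hypothetical names, and the contended path simply yields where a production mutex would sleep in the kernel (for example via a futex):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical 4-byte lock: 0 = unlocked, 1 = locked.
struct TinyLock {
    std::atomic<std::uint32_t> state{0};

    void lock() {
        std::uint32_t expected = 0;
        // Happy path: one CAS from 0 to 1 succeeds on the first try.
        while (!state.compare_exchange_weak(expected, 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed)) {
            expected = 0;               // a failed CAS stored the observed value here
            std::this_thread::yield();  // a real mutex would block in the kernel instead
        }
    }

    void unlock() {
        assert(state.load(std::memory_order_relaxed) == 1);  // must be held
        state.store(0, std::memory_order_release);
    }
};

// Holds on mainstream ABIs; treated here as an assumption, not a guarantee.
static_assert(sizeof(TinyLock) == 4, "lock word is expected to be 4 bytes");

// Increment a plain int from several threads under the lock.
int tiny_lock_demo(int threads, int iterations) {
    TinyLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

When nobody else holds the lock, acquiring it costs one atomic read-modify-write and releasing it costs one store, which is the uncontended cost the conversation is comparing against.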
<rnicholl1>
imo, being 4-bytes or less is more important than having a really complex fairness scheduler
<rnicholl1>
otherwise you end up with useless mutexes polluting your cache lines
<rnicholl1>
gcc 40 byte mutex.. no thank you
<rnicholl1>
it's not even aligned to a cache line boundary
<moon-child>
in the limit we could have 1 bit mutexes
<rnicholl1>
which is the only justification I can imagine for huge mutexes
<moon-child>
pack them tightly into a single region of memory
<moon-child>
and use the low-order bits of an address we want to lock to key that region
<moon-child>
;)
<moon-child>
I think the mutexes I wrote for the j interpreter were 16 bytes? Something like that. More importantly, unlike posix mutexes, they don't have a shitton of undefined behaviour
<rnicholl1>
if you make the mutex 64-bytes... and aligned to 64-byte boundary.. at least it would be like... not able to overlap with other mutexes
<rnicholl1>
40 bytes, no align...
<rnicholl1>
you can potentially overlap with not 1 but two other mutexes
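A sketch of the padded alternative mentioned above, assuming a 64-byte cache line (common on x86-64, not universal). `PaddedLock`, `kCacheLine` and the helper functions are hypothetical names for illustration:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size.
constexpr std::size_t kCacheLine = 64;

// alignas also rounds sizeof up to a multiple of the alignment, so array
// elements of this type can never share a cache line with each other.
struct alignas(kCacheLine) PaddedLock {
    std::atomic<std::uint32_t> state{0};
};

static_assert(sizeof(PaddedLock) == kCacheLine, "one lock per line");
static_assert(alignof(PaddedLock) == kCacheLine, "starts on a line boundary");

// Index of the cache line an object starts on.
std::uintptr_t line_of(const PaddedLock& l) {
    auto addr = reinterpret_cast<std::uintptr_t>(&l);
    assert(addr % kCacheLine == 0);  // guaranteed by alignas
    return addr / kCacheLine;
}

// True if the two locks start on different cache lines.
bool on_distinct_lines(const PaddedLock& a, const PaddedLock& b) {
    return line_of(a) != line_of(b);
}
```

The trade-off is the one raised earlier in the conversation: each lock now occupies a full line, so this only pays off for locks that are actually contended.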
<moon-child>
why do you have three mutexes
<moon-child>
what are you even doing with your life at that point
gera has joined #osdev
<rnicholl1>
more mutexes = smaller scope per mutex
<rnicholl1>
more mutexes is better
<rnicholl1>
less lock contention
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
<moon-child>
in the limit, we could have just one mutex per word
<moon-child>
and then skip the mutex and lock-free casloop that word
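A small sketch of what a lock-free CAS loop on a single word can look like, here as an atomic "raise to at least this value" operation. `atomic_fetch_max` is a hypothetical helper written for this example, not a standard library function:

```cpp
#include <atomic>
#include <cassert>

// Atomically raise `word` to at least `candidate`.
// Returns the value that was observed before any update.
int atomic_fetch_max(std::atomic<int>& word, int candidate) {
    int seen = word.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `seen` with the current value,
    // so the loop condition is re-evaluated against fresh data each time.
    while (seen < candidate &&
           !word.compare_exchange_weak(seen, candidate,
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed)) {
    }
    return seen;
}

// Usage sketch: single-threaded sanity check of the helper.
void fetch_max_example() {
    std::atomic<int> high_water{3};
    atomic_fetch_max(high_water, 7);
    assert(high_water.load() == 7);
}
```

No thread ever blocks another here: a failed CAS means some other thread made progress, which is what makes the loop lock-free rather than wait-free.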
<rnicholl1>
Well, if you can get away with using an atomic instead of a mutex, sure
<rnicholl1>
I feel like most of the drive toward wait-free/lock free is just a result of mutexes in glibc being too heavyweight
<moon-child>
lol
<rnicholl1>
I have yet to see a compelling example of why wait free is better than locks
<rnicholl1>
the usual problem is just choosing the wrong lock scope
FreeFull has quit [Quit: Sleepy times]
<rnicholl1>
"here, let me get an exclusive lock on this giant tree I am reading from 64 threads simultaneously to do a read only operation"
<heat_>
what
heat_ is now known as heat
<heat>
don't make me call mjg the performance police
<rnicholl1>
I swear some people are like "mutexes are slow, look how much time we spend in this lock!"
<heat>
you're spitting the hottest takes since... i don't know, the hottest take
<heat>
YES, BECAUSE THEY'RE SLOW
<rnicholl1>
what do you mean? Mutexes aren't slow, it's a question of how people use them.
<heat>
mutexes are 200% slow
<heat>
particularly userspace ones
<rnicholl1>
since when
<heat>
ever
<heat>
do you think "cmpxchg and if we fail do a round-trip to the kernel and attempt to sleep, even if for 500us just to come back, with all the scheduling latency you'd ever want" is fast?
<heat>
do you think getting all the other threads on a chokehold while you do $OP is fast?
<zid>
omg heat I forgot to recompile my kernel
<zid>
it's set up for intel
<heat>
like, the performance benefits of doing wait-free/per-cpu/per-thread algorithms have been clear since the fucking 90s
<heat>
even bonwick could see that shit coming from a mile away
<zid>
you should help me do that instead
<heat>
zid, ohno!
IRChatter has joined #osdev
IRChatter has quit [Read error: Connection reset by peer]
<zid>
probably shouldn't be trying to load 16GB of ram off a hdd
<heat>
when's your bday zid
gog has quit [Ping timeout: 265 seconds]
<immibis>
rnicholl1: it's obviously better to do less than more. "Wait-free" means finding a way to do something without locks which generally involves doing less. However if you have to go to extreme contortions to write a wait-free algorithm then it might not be good.
<zid>
25th of october
<zid>
what are you buying me?
<zid>
new monitor? :o
Left_Turn has quit [Read error: Connection reset by peer]
<heat>
we should get together and buy you a vid from nigel farage's cameo
<immibis>
locks are a convenient mechanism because they are general; in order to cover all cases, general mechanisms do more work than specific ones. Such as guaranteeing fairness.
IRChatter has joined #osdev
<zid>
if you say so
IRChatter has quit [Read error: Connection reset by peer]
<zid>
I won't watch it
<zid>
and you're funding terrorism
<heat>
nOoOOOOOOOOOOOOOOoooooooooooooo
<zid>
/me make menuconfig
<heat>
but its nigel garage
<moon-child>
i wanna transactions
<zid>
what options do I need for AMDAMDAMDAMD
IRChatter has joined #osdev
<heat>
CONFIG_FAST
<moon-child>
-mtune=native
IRChatter has quit [Read error: Connection reset by peer]
<moon-child>
or something I don't use gentoo
<immibis>
rnicholl1: on a fundamental level, the hardware is actually using wait-free algorithms to implement locks. The exception is particularly complex cases where the processor just blocks all processors from accessing memory for a short time: https://lwn.net/Articles/790464/
<bslsk05>
lwn.net: Detecting and handling split locks [LWN.net]
<zid>
do I need AMD ACPI2Platform devices support
<heat>
probably not m8
<heat>
if it aint broke don't compile it
<immibis>
actually i think cache protocols qualify as lock-free but not wait-free
* zid
switches from intel to amd microcode loading
<rnicholl1>
here as an example
<moon-child>
immibis: 'cache protocols' are not a monolith
<moon-child>
immibis: there have been some processors where e.g. cas can spuriously fail forever
zxrom has quit [Quit: Leaving]
<zid>
[ ] Mitigations for speculative execution vulnerabilities ----
<immibis>
that would be a processor bug
<rnicholl1>
immibis: but mutexes are just implemented using atomics
<moon-child>
it depends on the architecture
<rnicholl1>
there's no reason that they'd be slower than any other algorithms *unless* the mutex is contested
<immibis>
that's correct, except for the additional operation of updating the mutex
<immibis>
if only one processor wants to access the cache line, it gets exclusive access to the cache line and it knows it has exclusive access to the cache line
<heat>
"there's no reason they would be any slower than other algorithms except when they're getting used"
<immibis>
heat: it means they don't add a whole bunch of extra overhead. It's quite common that a mutex is uncontended most of the time, because each processor doesn't access the data very frequently and they rarely coincide.
<immibis>
It's also common that a mutex is heavily contended because the data is accessed very often by multiple processors. (In this case you have a bottleneck and might want to redesign something)
<heat>
it is common, in some circumstances, in some workloads
<heat>
what's with all this handwavy language?
<rnicholl1>
I am saying that 95% of the problems people have with locks is a "big giant lock" issue.
<rnicholl1>
not a problem with locks themselves
<rnicholl1>
misuse of locks to lock large data structures
<rnicholl1>
do long operations
<immibis>
not large structures but heavily accessed structure
<heat>
it's also not a requirement at all to have a "lock free" algo using atomics
<immibis>
s
<rnicholl1>
heat: how do you implement lock free without atomics? read only operations?
<rnicholl1>
you don't need a lock for that anyway
<heat>
per-thread/per-cpu state
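One way to read "per-thread state" as code: each thread updates only its own slot, and the slots are combined after the threads finish, so the hot path needs neither locks nor atomic read-modify-write operations. `Slot` and `sharded_count` are hypothetical names, the 64-byte padding is an assumed cache-line size, and the sketch assumes C++17 so that `std::vector` honours the over-alignment:

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// One counter per thread, padded to an assumed 64-byte cache line so that
// neighbouring slots do not share a line.
struct alignas(64) Slot {
    long value = 0;
};

long sharded_count(std::size_t threads, long per_thread) {
    assert(per_thread >= 0);
    std::vector<Slot> slots(threads);  // C++17: allocation respects alignas(64)
    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < threads; ++t) {
        pool.emplace_back([&slots, t, per_thread] {
            // Only thread t ever touches slots[t], so plain increments suffice.
            for (long i = 0; i < per_thread; ++i) ++slots[t].value;
        });
    }
    for (auto& th : pool) th.join();
    long total = 0;
    for (const auto& s : slots) total += s.value;  // safe: all writers have joined
    return total;
}
```

Synchronization still exists, but it is confined to thread creation and `join`, off the hot path.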
<immibis>
when you handwave "atomics" what are you actually considering? a lot of operations that are not labeled as atomic are actually atomic
zxrom has joined #osdev
<immibis>
such as writing aligned data on most architectures
<rnicholl1>
that isn't atomic actually
<immibis>
the label "atomic" comes into play when atomically executing things that aren't usually atomic, such as increments
<rnicholl1>
something it took a bit of explaining at an old job
<rnicholl1>
that "volatile bool" was not atomic
<rnicholl1>
and in fact caused a bug
<immibis>
on which platform and which circumstance?
<rnicholl1>
x86
<rnicholl1>
the problem is
<heat>
immibis, writing aligned data is not an atomic
<immibis>
Windows (x86) comes with a guarantee that writing atomic data up to 4 bytes (IIRC) is atomic
<rnicholl1>
just because it is atomic "on x86"
<immibis>
aligned data*
<rnicholl1>
does not make it atomic in C++
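A sketch of the language-level point, assuming the bug was a `volatile bool` used as a cross-thread flag (the log does not give the details). In C++, concurrent unsynchronized access to a `volatile bool` is still a data race, whatever the hardware does; `std::atomic<bool>` is the portable replacement. `run_until_stopped` is a hypothetical name:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Returns how many loop iterations the worker ran before it saw the stop flag.
long run_until_stopped() {
    std::atomic<bool> stop{false};     // std::atomic<bool>, not `volatile bool`
    std::atomic<bool> started{false};
    long iterations = 0;               // written only by the worker

    std::thread worker([&] {
        started.store(true, std::memory_order_release);
        while (!stop.load(std::memory_order_acquire)) {
            ++iterations;
        }
    });

    // Wait until the worker is running, then ask it to stop.
    while (!started.load(std::memory_order_acquire)) std::this_thread::yield();
    stop.store(true, std::memory_order_release);

    worker.join();  // join synchronizes, so reading `iterations` is race-free
    assert(stop.load());
    return iterations;
}
```

On x86 both versions may compile to similar loads and stores, but only the atomic version constrains what the compiler is allowed to do with the loop.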
<immibis>
ok but we are talking about processors here, not C++
<moon-child>
I will note it is very dumb of the c++ memory model to say races are ub, instead of making all unqualified sub-word ops relaxed atomic
<moon-child>
considering that basically no architecture has tearing for <=word
<moon-child>
but
<immibis>
implementations are free to define undefined behaviour
<rnicholl1>
moon-child: it would make the compiler not able to do a huge number of optimizations
<rnicholl1>
if that behavior was adopted
<immibis>
what optimizations?
<moon-child>
no, it would not
<immibis>
I don't think the compiler usually recognizes races at all
<moon-child>
java implementations, for instance, optimise just fine, and have no oota or tearing
<immibis>
it just tells the processor to write the same data that the source code says to write
<rnicholl1>
have you tried to benchmark a parallel introsort in java
<rnicholl1>
I did that for a class
<rnicholl1>
C++ vs Java
<rnicholl1>
java was
<rnicholl1>
8 times slower
<heat>
ok?
<moon-child>
most benchmarks are wrong
<heat>
missingthepoint.jpeg?
<rnicholl1>
maybe I just suck at writing java
<immibis>
did you look at the assembly code to find out why?
<rnicholl1>
because java requires seq_cst behavior
<moon-child>
pretty sure it has both seqcst and acqrel
<moon-child>
~analogues
<moon-child>
I will also note that c++ compilers do basically no optimisation of concurrent stuff either
<heat>
if C/C++ compilers optimized concurrent stuff the world would burn
<rnicholl1>
the issue is that if races aren't UB then the C++ compiler can't combine a large number of operations into a smaller number of more efficient operations if concurrent modification by another thread would produce a result that was impossible under the original code
<moon-child>
heat: would it
<rnicholl1>
heat: they do though
<heat>
yes, fairly sure a good amount of code is still very handwavy when it comes to most of this shit
<immibis>
rnicholl1: when does this happen? given that moon-child is talking about giving them relaxed ordering semantics
<heat>
rnicholl1, where
<rnicholl1>
I would have to prod the compilers a lot to get an actual example
<immibis>
memory_order_relaxed means the operation is still atomic, but it can be reordered whenever (as long as it doesn't cross any other barriers)
<immibis>
(but normal accesses also can't cross other barriers)
<moon-child>
handwavy somewhat, yeah, but I don't think it's _that_ bad. Especially for code which is ported to arm or whatever
<rnicholl1>
Suppose for example, a = b + c
<rnicholl1>
compiler could modify a = b then a += c
<immibis>
(e.g. read-acquire prevents both unqualified and relaxed loads and stores from moving before it)
<rnicholl1>
not possible if the assignment must be atomic
<immibis>
true
<rnicholl1>
since a = b wouldn't be possible in relaxed ordering
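The `a = b + c` example written out as a sketch. The namespace and function names are hypothetical, and the rewrite shown is only something the rules permit for plain variables, not something a particular compiler is known to emit:

```cpp
#include <atomic>
#include <cassert>

namespace sketch {

int b = 1, c = 2;
int a_plain = 0;
std::atomic<int> a_atomic{0};

// What the source code says.
void as_written() { a_plain = b + c; }

// A rewrite the compiler may make for a plain int, because a concurrent
// reader would be a data race (undefined behaviour) anyway. A racing reader
// could observe the intermediate value b here.
void as_rewritten() {
    a_plain = b;
    a_plain += c;
}

// With an atomic store, readers see either the old value or b + c, never b
// alone, so the split above is no longer a valid transformation.
void atomic_store() {
    a_atomic.store(b + c, std::memory_order_relaxed);
}

// Single-threaded check: both plain versions produce the same final value.
int check_single_threaded() {
    as_written();
    int first = a_plain;
    as_rewritten();
    assert(first == a_plain);
    atomic_store();
    return a_atomic.load(std::memory_order_relaxed);
}

}  // namespace sketch
```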
<immibis>
why would it do that?
<rnicholl1>
This is just an example of the flexibility it gives the compiler
<immibis>
there's probably some edge case where register pressure makes it a good idea
<rnicholl1>
to dump things from registers
<rnicholl1>
yeah
<rnicholl1>
register pressure
<rnicholl1>
instead of pushing into stack
<immibis>
so there's the answer. that wasn't so hard
<immibis>
can it do that in the presence of signals?
<rnicholl1>
yup, since signals race with user code
genpaku has quit [Ping timeout: 250 seconds]
<rnicholl1>
I hate trying to explain that one
<rnicholl1>
that's why we have std::atomic_signal_fence
<rnicholl1>
in addition to std::atomic_thread_fence
<moon-child>
only thing you can access from signals legally is volatile sig_atomic_t
<heat>
your example does not involve any sort of atomics
<heat>
also, see moon-child
<rnicholl1>
moon-child: no
<rnicholl1>
you can use atomics too
<moon-child>
well, sure
<rnicholl1>
and also, as long as you guard with std::atomic_signal_fence and so on
<rnicholl1>
you can access variables from non-signal land
<rnicholl1>
example
<zid>
vmware driver is messed up now I blame you heat
<heat>
zid, sorry B i'll talk to linus
<heat>
he'll take care of it
<zid>
your bestie
<heat>
mine and mjg's
<rnicholl1>
std::atomic_signal_fence(std::memory_order_release) .. set_sig_atomic_t(...) // signal handler can now acquire "lock" on some random data from non-signal land
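A fuller sketch of that pattern, under stated assumptions: the handler runs on the same thread that published the data (here forced with `std::raise`), the flag is a `volatile std::sig_atomic_t`, and `payload`, `ready`, `observed`, `on_signal` and `publish_and_raise` are hypothetical names. `set_sig_atomic_t(...)` in the message above is shorthand, not a real function:

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

int payload = 0;                           // plain data published to the handler
volatile std::sig_atomic_t ready = 0;      // flag the handler may read
volatile std::sig_atomic_t observed = -1;  // what the handler saw

extern "C" void on_signal(int) {
    if (ready) {
        // Pairs with the release fence below: keeps the compiler from moving
        // the payload read ahead of the flag read.
        std::atomic_signal_fence(std::memory_order_acquire);
        observed = payload;
    }
}

int publish_and_raise(int value) {
    std::signal(SIGINT, on_signal);
    payload = value;
    // Compiler-only barrier: the payload write stays before the flag write.
    std::atomic_signal_fence(std::memory_order_release);
    ready = 1;
    std::raise(SIGINT);            // handler runs before raise returns
    std::signal(SIGINT, SIG_DFL);  // restore default handling
    assert(ready == 1);
    return observed;
}
```

`std::atomic_signal_fence` only restrains the compiler; it emits no hardware barrier, which is why it is sufficient for a handler on the same thread but not for another thread.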
<rnicholl1>
the only thing is I think some atomics, that are not lock free, cannot be used in signals
<rnicholl1>
I forget the rules on that
<rnicholl1>
I believe the rule was that sig_atomic_t is the only atomic guaranteed to be signal safe
<rnicholl1>
but not necessarily the case that e.g. std::atomic<int> is signal unsafe
<rnicholl1>
"C11, 5.1.2.3, paragraph 5 also allows for signal handlers to read from and write to lock-free atomic objects."
genpaku has joined #osdev
<rnicholl1>
<heat> : the example I pasted on godbolt?
<rnicholl1>
or the example of a = b + c
<rnicholl1>
or something else
<rnicholl1>
Reading about atomics always pains me
<rnicholl1>
"Actions on volatile objects cannot be optimized out by an implementation."
<rnicholl1>
hold my beer
<heat>
a = b + c
<heat>
good, the compiler people aren't that insane
IRChatter has joined #osdev
<rnicholl1>
oh damn they are pretty insane actually
<moon-child>
like I said, compilers basically don't optimise concurrent stuff
<zid>
doesn't help that C has no concept of volatile other than "might be a device"
<zid>
There's an optimization for playstation that annoys me because it can't be achieved in C
<heat>
rnicholl1, do you seriously think that's insane
<heat>
by far the worst solution would be them language lawyering over any kind of leverage the standard gives them to *break* code
<heat>
you're not supposed to fight the compiler...
<rnicholl1>
I'd argue that behavior shouldn't happen unless you made it volatile in addition to atomic
<rnicholl1>
If the write itself has side effects, you should be using volatile
<rnicholl1>
the "as-if rule" would clearly allow those two writes to be combined into one though
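The godbolt example itself is not in the log, so the following is an assumed stand-in for "two writes combined into one": two back-to-back relaxed stores to the same atomic. No other thread is guaranteed to observe the intermediate value, so eliding the first store is permitted by the as-if rule, even though, as noted earlier in the conversation, compilers generally leave such code alone. `progress` and `two_stores` are hypothetical names:

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> progress{0};

int two_stores() {
    progress.store(1, std::memory_order_relaxed);  // may legally be elided
    progress.store(2, std::memory_order_relaxed);  // the only store that must be visible
    int seen = progress.load(std::memory_order_relaxed);
    assert(seen == 2);  // holds here: nothing else writes `progress` in this sketch
    return seen;
}
```

If the intermediate write matters, for instance because something polls it as a progress indicator, the code has to say so; declaring the object `volatile` as well as atomic is the option argued for above.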
<rnicholl1>
Personally I think it would be good for the compiler to make these kinds of optimizations, if it can make code faster
<heat>
"personally I think it would be good for kids to run with scissors if that meant they could get places faster" :p
<bnchs>
hi
rnicholl1 has quit [Quit: My laptop has gone to sleep.]
<heat>
you made em quit IRC benches
<heat>
gosh darnit!
<bnchs>
:<
<bnchs>
i'm sorry :<
friedy has joined #osdev
<friedy>
Do any of you guys know of an ARM based SoC that supports bank switching?
rnicholl1 has joined #osdev
<rnicholl1>
heat: I think C++ is not beginner friendly, and it doesn't need to be
<rnicholl1>
I would rather have separate "Safe" and "Fast" languages
<rnicholl1>
If you need "Safe" you can use Rust or something
<geist>
friedy: what do you mean bank switching?
<friedy>
geist: I'm using the RPI 4 (Broadcom BCM2711) and it has an SOC that has limited address pins (16 GB). I want to find an SoC that uses bank switching so I can get a larger physical address space.
<zid>
It's a shame there's no decent low speed bus to do that off
<clever>
friedy: the dram bus is too high of a freq to allow putting 2 chips on the same bus and using CS to select which one, signal reflections and junk
<zid>
That's why you need a low speed bus to hang the latch off
<clever>
friedy: also, nobody actually makes 16gig lpddr4 chips, so the limit of the SoC has yet to become a problem
<zid>
Mechanical arm that dispenses ln2 then swaps the dimms.
<geist>
yah, they’d just put more address lines if they wanted to support more
<bnchs>
friedy: lol 16 GB?
<bnchs>
that doesn't seem so limited at all
<geist>
yah also you probably can’t easily bank switch dram like that, since you still have to select and refresh it, even if you’re not using it
<clever>
yep
<bnchs>
limited would be like.. 16 MB or something
<heat>
asking for bankswitching when you have 16GB of ram is funny yes
<clever>
dram also has banks internally
<friedy>
My prof had the question and I passed it on to you guys. Sounds like I need to switch schools lol.
<clever>
or stop asking people on irc to do your homework for you
<friedy>
clever: it's not homework. No assignment. Just wanted to see what you guys would say.
<bnchs>
your prof sounds weird
<bnchs>
imagine jimmy, he has a SoC that supports up to 512 terabytes of physical address space, but he needs bank switching to support more for some dumb reason
<bnchs>
how will he implement bank switching?
<clever>
bnchs: by redesigning the SoC with either a wider data bus or more dram controllers
<clever>
or allowing NUMA
<friedy>
It's for some research project he's working on. He posted the question in the slack. I don't know why he needs it. Thought maybe you guys would know, but I guess it sounds pretty unreasonable.
<bslsk05>
www.youtube.com: Stardust Speedway Bad Future (JPN/PAL) - Sonic the Hedgehog CD Music Extended - YouTube
[itchyjunk] has quit [Remote host closed the connection]
d34d1457 has joined #osdev
<bnchs>
i never played sonic cd
danilogondolfo has quit [Remote host closed the connection]
<friedy>
bnchs: Sonic CD is probably my favorite sonic game. Good music, great level design, and there's a time travel feature so you can play different versions of each level. You should try it.
nyah has quit [Quit: leaving]
d34d1457 has quit [Read error: Connection reset by peer]
torresjrjr has quit [Remote host closed the connection]
torresjrjr has joined #osdev
friedy has quit [Ping timeout: 260 seconds]
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
gbowne1 has quit [Quit: Leaving]
rnicholl1 has joined #osdev
slidercrank has joined #osdev
heat has quit [Ping timeout: 246 seconds]
rnicholl1 has quit [Quit: My laptop has gone to sleep.]
rnicholl1 has joined #osdev
bgs has joined #osdev
rnicholl1 has quit [Quit: My laptop has gone to sleep.]
rnicholl1 has joined #osdev
rnicholl1 has quit [Client Quit]
rnicholl1 has joined #osdev
zxrom has quit [Ping timeout: 265 seconds]
kivikakk is now known as kivikakk[i]
zxrom has joined #osdev
kivikakk has joined #osdev
kivikakk[i] has left #osdev [#osdev]
rnicholl1 has quit [Quit: My laptop has gone to sleep.]
bgs has quit [Remote host closed the connection]
les has quit [Quit: Adios]
les has joined #osdev
yuriko has quit [Quit: You have been kicked for being idle]
GeDaMo has joined #osdev
Brain__ has joined #osdev
<dminuoso>
If there's time travel, does that mean when you start you're already getting advanced guns from your future avatar, that you will collect a day in the future, and send back to yourself at the start?
<dminuoso>
I would *really* like to see the source code of that.
slidercrank has quit [Ping timeout: 276 seconds]
slidercrank has joined #osdev
bauen1 has quit [Ping timeout: 265 seconds]
Ali_A has joined #osdev
Ali_A is now known as Someone
gog has joined #osdev
elastic_dog has quit [Ping timeout: 246 seconds]
elastic_dog has joined #osdev
bnchs has quit [Read error: Connection reset by peer]
CalculusCats is now known as CalculusCat
Someone has quit [Quit: Client closed]
danilogondolfo has joined #osdev
danilogondolfo has quit [Max SendQ exceeded]
danilogondolfo has joined #osdev
danilogondolfo has quit [Max SendQ exceeded]
danilogondolfo has joined #osdev
CalculusCat is now known as CalculusCats
bauen1 has joined #osdev
bauen1 has quit [Ping timeout: 268 seconds]
sinvet has quit [Remote host closed the connection]
bauen1 has joined #osdev
[itchyjunk] has joined #osdev
sinvet has joined #osdev
d34d1457 has joined #osdev
danilogondolfo has quit [Ping timeout: 250 seconds]
danilogondolfo has joined #osdev
vdamewood has joined #osdev
CalculusCats is now known as CalculusCat
Ali_A has joined #osdev
dutch has quit [Quit: WeeChat 3.8]
d34d1457 has quit [Read error: Connection reset by peer]
elastic_dog has quit [Ping timeout: 240 seconds]
elastic_dog has joined #osdev
dutch has joined #osdev
nyah has joined #osdev
Ali_A has quit [Quit: Client closed]
torresjrjr has quit [Ping timeout: 265 seconds]
Ali_A has joined #osdev
torresjrjr has joined #osdev
Ali_A has quit [Quit: Client closed]
warlock has joined #osdev
elastic_dog has quit [Ping timeout: 246 seconds]
elastic_dog has joined #osdev
CalculusCat is now known as CalculusCats
patwid has quit [Remote host closed the connection]
jleightcap has quit [Remote host closed the connection]
vismie has quit [Remote host closed the connection]
exec64 has quit [Remote host closed the connection]
alethkit has quit [Remote host closed the connection]
utzig has quit [Remote host closed the connection]
alecjonathon has quit [Remote host closed the connection]
yuiyukihira has quit [Remote host closed the connection]
yyp has quit [Remote host closed the connection]
042AAAG7M has quit [Remote host closed the connection]
pitust has quit [Remote host closed the connection]
staceee has quit [Remote host closed the connection]
whereiseveryone has quit [Remote host closed the connection]
sm2n has quit [Remote host closed the connection]
tom5760 has quit [Remote host closed the connection]
noeontheend has quit [Remote host closed the connection]
gpanders has quit [Remote host closed the connection]
tommybomb has quit [Remote host closed the connection]
gjn has quit [Remote host closed the connection]
ddevault has quit [Remote host closed the connection]
utzig has joined #osdev
patwid has joined #osdev
pitust has joined #osdev
yyp has joined #osdev
jleightcap has joined #osdev
vismie has joined #osdev
tom5760 has joined #osdev
tommybomb has joined #osdev
whereiseveryone has joined #osdev
exec64 has joined #osdev
noeontheend has joined #osdev
ddevault has joined #osdev
gjn has joined #osdev
sm2n has joined #osdev
yuiyukihira has joined #osdev
gpanders has joined #osdev
alethkit has joined #osdev
alecjonathon has joined #osdev
staceee has joined #osdev
milesrout_ has joined #osdev
slidercrank has quit [Ping timeout: 240 seconds]
Ali_A has joined #osdev
Ali_A has quit [Client Quit]
dude12312414 has joined #osdev
dude12312414 has quit [Client Quit]
staceee has quit [Remote host closed the connection]
tommybomb has quit [Remote host closed the connection]
milesrout_ has quit [Remote host closed the connection]
patwid has quit [Remote host closed the connection]
pitust has quit [Remote host closed the connection]
utzig has quit [Remote host closed the connection]
alecjonathon has quit [Remote host closed the connection]
whereiseveryone has quit [Write error: Broken pipe]
gpanders has quit [Write error: Broken pipe]
yuiyukihira has quit [Write error: Broken pipe]
sm2n has quit [Write error: Broken pipe]
alethkit has quit [Write error: Broken pipe]
noeontheend has quit [Write error: Broken pipe]
tom5760 has quit [Write error: Broken pipe]
exec64 has quit [Write error: Broken pipe]
jleightcap has quit [Remote host closed the connection]
yyp has quit [Remote host closed the connection]
vismie has quit [Remote host closed the connection]
gjn has quit [Write error: Broken pipe]
ddevault has quit [Remote host closed the connection]
utzig has joined #osdev
staceee has joined #osdev
vismie has joined #osdev
gpanders has joined #osdev
tommybomb has joined #osdev
milesrout_ has joined #osdev
yyp has joined #osdev
alethkit has joined #osdev
ddevault has joined #osdev
tom5760 has joined #osdev
patwid has joined #osdev
noeontheend has joined #osdev
whereiseveryone has joined #osdev
yuiyukihira has joined #osdev
exec64 has joined #osdev
alecjonathon has joined #osdev
jleightcap has joined #osdev
sm2n has joined #osdev
gjn has joined #osdev
danilogondolfo has quit [Read error: Connection reset by peer]
pitust has joined #osdev
Ali_A has joined #osdev
Dyskos has joined #osdev
Ali_A has quit [Quit: Client closed]
dude12312414 has joined #osdev
bgs has joined #osdev
slidercrank has joined #osdev
slidercrank has quit [Client Quit]
slidercrank has joined #osdev
gog has quit [Quit: Konversation terminated!]
danilogondolfo has joined #osdev
dove has quit [Read error: Connection reset by peer]
heat has joined #osdev
gog has joined #osdev
<heat>
gog
<heat>
hi
<gog>
heat
<zid>
me
<zid>
:(
<heat>
hi zid
<gog>
zid
<zid>
linus torvalds
<gog>
hi
<heat>
hello its me linux torval
<heat>
give me your credit card information
<Griwes>
last 4 digits are 6969
<Griwes>
cvv is 420
rnicholl1 has joined #osdev
<heat>
haha funny sex number
<heat>
only a true nerd would take sex and make a number out of it
<gog>
sex nerd
rnicholl1 has quit [Quit: My laptop has gone to sleep.]