<mjg>
i have to stress, as much as slab is an ok idea, i don't think really crediting bonwick here is all that great
<mjg>
i think the key takeaway from the paper which was not 100% obvious was converting malloc to use it
<mjg>
but past that even the paper explicitly states people were hacking up their own caching layers in subsystems
<mjg>
and his is just making it general
<mjg>
which is not that much of a stroke of genius
<heat>
sure
<mjg>
basically a number of other people would have done it if they could be fucked
<heat>
otoh, it is different
<heat>
it's like a funky object pool
<heat>
but funnily enough it lost most of its funkiness in linux
<heat>
it's just a pool now
<heat>
a particularly named and organized pool
<mjg>
"pool" is how they refer to it in the theo land
<heat>
.theo
<mjg>
although! it is plausible the pool thing is just for the per-cpu layer
<mjg>
i don't remember now
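(Context for the slab discussion above: the core of the "general object cache" idea is small. A minimal single-threaded sketch with hypothetical names, not Bonwick's or Linux's actual interface — the point is that objects come back already constructed instead of being re-initialized on every allocation:)

    /* Minimal per-type object cache: a named, organized pool. No locking and
     * no per-CPU layer; purely illustrative. */
    #include <stdlib.h>

    struct objcache {
        const char *name;            /* e.g. "struct file" */
        size_t objsize;
        void (*ctor)(void *);        /* run once when fresh memory is grabbed */
        void *freelist;              /* constructed-but-unused objects */
    };

    static void *objcache_alloc(struct objcache *c)
    {
        if (c->freelist) {           /* fast path: reuse a constructed object */
            void *obj = c->freelist;
            c->freelist = *(void **)obj;
            return obj;
        }
        void *obj = malloc(c->objsize);  /* stand-in for carving up a new slab */
        if (obj && c->ctor)
            c->ctor(obj);
        return obj;
    }

    static void objcache_free(struct objcache *c, void *obj)
    {
        /* NB: unlike real slab, this clobbers the first pointer-sized bytes of
         * the constructed object for freelist linkage; fine for a sketch. */
        *(void **)obj = c->freelist;
        c->freelist = obj;
    }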
<heat>
The world doesn't live off jam and fancy perfumes - it lives off bread and meat and potatoes. Nothing changes. All the big fancy stuff is sloppy stuff that crashes. I don't need dancing baloney - I need stuff that works. That's not as pretty, and just as hard.
<heat>
theo dissing you
<mjg>
theocracy
<mjg>
look man
<mjg>
unscrew that vfs bench
<mjg>
then i'm gonna take a look at patching obsd to retain the lead
<heat>
lmao
<heat>
which one
<heat>
open3?
<mjg>
whichever one you are behind on
<mjg>
i think it was open3
<mjg>
but i'm happy to take any other
<mjg>
as long as you are slower there now
<heat>
I'm gonna take a little while here
<heat>
so, my problem
<heat>
I have the generic virtual address space allocator
<heat>
it's kind of inefficient and holds a big, sleepable lock around the address space
<heat>
i need a non-sleepable thing for vmalloc
<heat>
I was going to take my time and get another tree impl
<mjg>
heat: and un-single-list your namecache entries, ffs!
<heat>
I think linux uses a hashtable
<mjg>
everyone uses a hash table
<mjg>
except for net and open
<heat>
I don't know how that looks in contention
<mjg>
it's great because you don't take locks when you do the lookup
<heat>
you still do no?
<mjg>
no
<mjg>
R C U
<mjg>
well hash table is not *inherently* scalable, but it is pretty good with a delayed memory reclamation mechanism
<mjg>
like rcu
<heat>
do your lookups use ebr?
<mjg>
smr
<heat>
oh yeah
<heat>
That other rcu thing
<mjg>
it's a freebsd-specific variation
<mjg>
for the sake of this convo you may as well assume ebr
<mjg>
key point is 0 ping pong against others accessing the same chain
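(Context: what a lockless hash-chain lookup looks like under an EBR/SMR/RCU-style scheme, in sketch form. All names here — rcu_read_begin/end, nchash_fn, the ncentry layout — are made up for illustration; the point is that readers take no lock and dirty no shared cache line.)

    #include <string.h>

    #define NCHASHSZ 1024

    struct vnode;

    struct ncentry {
        struct ncentry *next;
        struct vnode   *dir;
        const char     *name;
        struct vnode   *vp;
    };

    extern struct ncentry *nchash[NCHASHSZ];
    extern unsigned nchash_fn(struct vnode *dir, const char *name);
    extern void rcu_read_begin(void);    /* stand-ins for EBR/SMR/RCU enter/exit */
    extern void rcu_read_end(void);

    static struct vnode *nc_lookup(struct vnode *dir, const char *name)
    {
        struct vnode *res = NULL;

        rcu_read_begin();                /* no lock taken, nothing shared written */
        struct ncentry *e = __atomic_load_n(&nchash[nchash_fn(dir, name) % NCHASHSZ],
                                            __ATOMIC_ACQUIRE);
        for (; e != NULL; e = __atomic_load_n(&e->next, __ATOMIC_ACQUIRE)) {
            if (e->dir == dir && strcmp(e->name, name) == 0) {
                res = e->vp;             /* entry can't be reclaimed until rcu_read_end() */
                break;
            }
        }
        rcu_read_end();
        return res;
    }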
<mjg>
there used to be a global rw lock for the namecache
<mjg>
and this is what it looked like :S
<heat>
ugh
<heat>
see I don't suck that bad
<heat>
when was this?
<mjg>
2015
<heat>
i'm better than 2015 fbsd
<heat>
good enough for me
<mjg>
actually you are not
<mjg>
this is -j 40
<mjg>
2 sockets
<hmmmm>
didn't linux have this fixed several decades ago with RCU?
<mjg>
yes and no
<heat>
linux has several tens of times the manpower of freebsd
<hmmmm>
yeah pretty much the reason why i couldn't stay using freebsd
<mjg>
funnily enough, lookup started to scale way prior of course
<mjg>
but rest assured there was tons of really nasty locking in there for a long time past that
* mjg
might have run into it
<heat>
what do you think of a per-filesystem rename lock?
<mjg>
fwiw it was performing ok enough on meh hardware at the time
<heat>
like linux had
<mjg>
rename is a "fuck this shit" problem and a rename lock is an ok way to acknowledge it
<heat>
I know they do seqlock voodoo now
<mjg>
fighting rename races is a losing battle
<mjg>
so i support it 100%
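(Sketch of the per-filesystem rename lock being discussed, with made-up names: only cross-directory renames serialize on it, which sidesteps the hairy two-parent lock ordering and directory-loop races; same-directory renames and lookups don't take it.)

    #include <stdbool.h>

    struct mutex;                    /* kernel's sleepable mutex, assumed */
    extern void mutex_lock(struct mutex *);
    extern void mutex_unlock(struct mutex *);
    struct dentry;

    struct mount {
        struct mutex *rename_lock;   /* one per mounted filesystem */
        /* ... */
    };

    extern int rename_locked(struct dentry *olddir, struct dentry *old,
                             struct dentry *newdir, struct dentry *new);

    int vfs_do_rename(struct mount *mp, struct dentry *olddir, struct dentry *old,
                      struct dentry *newdir, struct dentry *new)
    {
        bool cross_dir = olddir != newdir;

        if (cross_dir)
            mutex_lock(mp->rename_lock);    /* one cross-dir rename per fs at a time */

        int err = rename_locked(olddir, old, newdir, new);

        if (cross_dir)
            mutex_unlock(mp->rename_lock);
        return err;
    }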
<mjg>
heat: you may want to say hi to dh` on this one
<heat>
hi dh`
<heat>
how are you?
<heat>
I have no context for this but I'll roll with it
<heat>
I feel tempted to do the old vm allocation trick of having two trees, one with used regions and one with free regions, keyed by length
<heat>
wait, no, how does this ever work? if you have two free regions with the same length
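(The usual answer to the question above: the free tree isn't keyed on length alone but on the (length, base) pair, so two free regions of the same length still compare as distinct nodes, and a best-fit allocation takes the leftmost node whose length covers the request. A comparator sketch with a hypothetical node layout:)

    /* Free-region tree keyed by (length, base): equal-length regions stay
     * distinct, and in-order traversal gives increasing sizes. */
    struct free_region {
        unsigned long base;
        unsigned long length;
        /* tree linkage ... */
    };

    static int free_region_cmp(const struct free_region *a, const struct free_region *b)
    {
        if (a->length != b->length)
            return a->length < b->length ? -1 : 1;
        if (a->base != b->base)
            return a->base < b->base ? -1 : 1;
        return 0;
    }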
<heat>
geist, i really like how simple lk is, super easy to take nice bits from it!
<heat>
binary_search_tree is pretty decent, it gets my LittleKernel(r) Integration Ready(tm) stamp of approval
<heat>
only had to add a min function
<heat>
could also be a bit nicer when it comes to callbacks, could just take callables and the functions could be templates, but those things are trivially addable
<bslsk05>
github.com: [PATCH] x86_64: prefetch the mmap_sem in the fault path · torvalds/linux@a9ba9a3 · GitHub
<heat>
this is the kind of inquisitive prefetches I subscribe to
<heat>
microbench slow? prefetch the lock!
<geist>
heat: ah yeah that tree is nice. i need to use it for more stuff, only recently integrated it
<geist>
wanted to write some more unit tests for it too
<geist>
re: wavl trees vs rb trees, i honestly don't know. i didn't implement the wavl tree implementation in zircon
<geist>
but it has some properties that are nice, but i forget off the top of my head
<geist>
something about O(1) removals, or something, which is surprising, but i've been told it's the case
<geist>
i haven't fully grokked how that's supposed to work on a balanced tree
<heat>
mjg, squeezed an extra 20k on open3 with my still-crappy slab allocator for struct files
<mjg>
heat: right on
<heat>
I still have big contention in my locks as you can imagine, without the magazine
<heat>
also I figured out why I never read bonwick's paper talking about the magazines last night
<heat>
it's literally the same one as vmem
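(For context, the magazine layer heat doesn't have yet is roughly a small per-CPU stash of objects sitting in front of the locked slab lists, so the common alloc/free never touches a shared cache line. A simplified sketch — Bonwick's real design has paired magazines and a depot, and all names below, including the struct objcache from the earlier sketch, are hypothetical:)

    /* Hypothetical per-CPU magazine in front of a locked slab cache. Only the
     * refill/drain paths hit the shared (locked) layer. */
    #define MAG_SIZE 64

    struct objcache;                                   /* shared, locked cache */

    struct magazine {
        int   rounds;                                  /* objects currently cached */
        void *objs[MAG_SIZE];
    };

    extern void preempt_disable(void);
    extern void preempt_enable(void);
    extern struct magazine *this_cpu_magazine(struct objcache *c);
    extern void *objcache_alloc_slow(struct objcache *c);   /* takes the cache lock */
    extern void  objcache_free_slow(struct objcache *c, void *obj);

    static void *objcache_alloc_percpu(struct objcache *c)
    {
        preempt_disable();
        struct magazine *m = this_cpu_magazine(c);
        if (m->rounds > 0) {
            void *obj = m->objs[--m->rounds];          /* fast path: CPU-local only */
            preempt_enable();
            return obj;
        }
        preempt_enable();
        return objcache_alloc_slow(c);                 /* slow path: shared lists */
    }

    static void objcache_free_percpu(struct objcache *c, void *obj)
    {
        preempt_disable();
        struct magazine *m = this_cpu_magazine(c);
        if (m->rounds < MAG_SIZE) {
            m->objs[m->rounds++] = obj;                /* fast path: CPU-local only */
            preempt_enable();
            return;
        }
        preempt_enable();
        objcache_free_slow(c, obj);                    /* magazine full: hand back */
    }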
<heat>
and there's big inefficiencies in my unlock routines for sleepable locks
<heat>
as in, they always spin_lock(), try to wake up threads, spin_unlock
<heat>
which is, as you would probably describe, "super pessimal"
<mjg>
this is how you normally do it though
<mjg>
well depends on other factors
<mjg>
key point being actual sleeping should be rare
<mjg>
:>
<mjg>
oh i misread. if your actual unlock always starts with a spin lock, then ye
<mjg>
that's turbo pessimal
<heat>
no
<heat>
well, in mutexes, more or less so
<heat>
but for instance
<heat>
if (__atomic_sub_fetch(&lock->lock, 1, __ATOMIC_RELEASE) == 0) rw_lock_wake_up_thread(lock);
<heat>
rw_lock_wake_up_thread ends up taking a spinlock
<mjg>
that's ok
<geist>
ah yeah, usually the trick is to encode in the atomic if something is waiting
<geist>
so you can avoid that trip in the optimal case
<heat>
yea
<geist>
it's not too difficult, worth a quick couple hours to bash it together and do it
<geist>
i need to go back and retrofit the LK mutexes actually. did it on zircon, but the lk mutexes are still spinlocky
<mjg>
's what solaris is doing.... :-P
<mjg>
and really everyone
<mjg>
one note is to align your threads to 16 or so, so you have plenty of bits to spare
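(A sketch of the trick geist and mjg describe: the lock word holds the owner thread pointer plus flag bits — hence aligning thread structs to 16 — and unlock only visits the wait queue when a waiters bit is set. Names and layout are hypothetical:)

    #include <stdbool.h>
    #include <stdint.h>

    #define MUTEX_WAITERS ((uintptr_t)1)  /* set by a contending locker in its slow path */

    struct thread;
    extern struct thread *current_thread(void);

    struct mutex {
        uintptr_t word;              /* 0 = free, else owner pointer | flag bits */
        /* wait queue and its spinlock live elsewhere; only slow paths touch them */
    };

    extern void mutex_unlock_slow(struct mutex *m);  /* takes the queue lock, wakes a waiter */

    static void mutex_unlock(struct mutex *m)
    {
        uintptr_t me = (uintptr_t)current_thread();

        /* fast path: we own it and no waiters bit is set -> one atomic, no spinlock */
        if (__atomic_compare_exchange_n(&m->word, &me, (uintptr_t)0, false,
                                        __ATOMIC_RELEASE, __ATOMIC_RELAXED))
            return;

        mutex_unlock_slow(m);        /* someone set MUTEX_WAITERS while we held it */
    }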
<geist>
debating if i was going to keep both paths, since on a UP machine, or an embedded one, just entering a critical section and then doing the mutex stuff is probably faster
<mjg>
if you really care about UP you can hotpatch yourself at boot time
<geist>
especially on machines where an atomic is really just a critical section
<geist>
no, i can't. not on those kinda machines
<geist>
since they'd literally be running in ROM
<mjg>
oh that
<geist>
but i can #ifdef it. that's what i really mean, do i want to ifdef the older, UP version, per arch
<mjg>
i was thinking of something which can happen to boot to either
<geist>
or just toss it and move to an atomic based solution
<mjg>
you find out there is only 1 cpu, there are savings to be made
<heat>
> lk mutexes are still spinlocky
<heat>
isn't that the point?
<heat>
mjg
<geist>
hmm?
<heat>
mjg says you should heavily spin for all kinds of sleepable locks
<heat>
and that they "make or break performance"
<mjg>
yep
<heat>
i don't know if thats the kind of "spinlocky" you're talking about
<geist>
sure you can do that too but that's a level 2 optimization, and highly centric on the workload, etc
<geist>
at some point you have to start tailoring what you're doing to the workload. for highly contentious, short mutexes on a SMP machine it may make sense to spin for a period of time before giving up and blocking
<geist>
we do that now in the zircon mutexes too, helps, but only for certain kinds of locks
<mjg>
i'm yet to see a real-world case which ultimately loses from it. i also note even linux tries to spin on all of its locks
<mjg>
including semaphores
<mjg>
the do or don't factor is whether the lock owner is still running
<mjg>
well there is one degenerate pattern which does lose, but the answer to that is "don't employ the degenerate pattern"
<mjg>
and that's multiple cpus taking the same locks one after another
<mjg>
each cpu: for (i = 0; i < n; i++) { lock(obj[i]); ... ; unlock(obj[i]); }
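(The "do or don't factor" in sketch form: spin only while the current owner is on a CPU, otherwise block. Field and helper names are made up, and this assumes the owner's thread struct is safe to dereference — which is exactly the problem the conversation turns to just below:)

    #include <stdbool.h>
    #include <stddef.h>

    struct thread {
        int on_cpu;                  /* nonzero while the thread is running */
        /* ... */
    };

    struct mutex;
    extern bool mutex_try_acquire(struct mutex *m);
    extern struct thread *mutex_owner(struct mutex *m);   /* NULL if unowned */
    extern void mutex_block(struct mutex *m);             /* enqueue self and sleep */
    extern void cpu_relax(void);                          /* pause/yield hint */

    static void mutex_lock_adaptive(struct mutex *m)
    {
        for (;;) {
            if (mutex_try_acquire(m))
                return;

            struct thread *owner = mutex_owner(m);
            if (owner == NULL)
                continue;            /* just got released; retry the acquire */
            if (!__atomic_load_n(&owner->on_cpu, __ATOMIC_ACQUIRE)) {
                mutex_block(m);      /* owner is off-CPU: spinning is futile */
                continue;
            }
            cpu_relax();             /* owner is running; keep spinning */
        }
    }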
<mjg>
geist: do i read it right zircon mutex code does not track the owner?
<mjg>
// Stop spinning if it looks like we might be running on the same CPU which
<mjg>
// was assigned to the owner of the mutex.
<geist>
it should
<mjg>
what's up with this bit then
<geist>
seems fairly self explanatory
<mjg>
well let me restate
<mjg>
can you reliably check from that code if the lock owner is running?
<mjg>
i don't see this bit but maybe i'm blind here
<geist>
possibly not. i dont have it in front of me
<geist>
it tracks who the owner is, but i dont know if it goes back and marks it as preempted
<geist>
if the owner gets preempted, etc
<mjg>
ye no
<mjg>
// It looks like the mutex is most likely contested (at least, it was when we
<mjg>
// just checked). Enter the adaptive mutex spin phase, where we spin on the
<mjg>
// mutex hoping that the thread which owns the mutex is running on a different
<mjg>
// CPU, and will release the mutex shortly.
<mjg>
if this is the state you are dealing with, speculatively spinning, then i'm not surprised you see wins if you decide to stop
<mjg>
this needs to be patched to check the owner
<mjg>
and not guess
* geist
nods
<mjg>
then you will see consistent wins
<geist>
there's reasons it's not easy to do that
<mjg>
i presume this was not done because you have no means to safely access the owner's struct
<geist>
correct
<mjg>
i got ya covered without rcu
<mjg>
:)
<geist>
there's work underfoot to restructure all of that, which will unlock the ability to do that
<mjg>
one way, which i don't recommend, is what freebsd did: threads never get actually freed
<mjg>
solaris added a special barrier: should you end up releasing pages backing threads, you issue a special per-cpu barrier
<mjg>
to make sure whoever is spinning buggers off
* geist
nods
<mjg>
i think it's an ok approach
<heat>
swear to god, unixes never freed shit, did they?
<geist>
none of this code i 'own' anymore, so i can send it to the correct folks
<mjg>
heat: normally they did not
<geist>
but they'll nod at me and say 'yes we know, this is something we want to get to eventually'
<mjg>
well i'm negatively surprised here
<mjg>
a lot of machinery to timeout a spin and whatnot
<geist>
aaaaaand once again this is why i don't post links to fuchsia code here anymore
<geist>
because then it just turns into an mjg shit-on-it fest
<mjg>
welp
<geist>
and frankly i dont want to deal with that right now.
<mjg>
np
<mjg>
anyway my recommendation to heat is to do an adaptive spin
<heat>
:(
<heat>
no negativity guys
<mjg>
which is what everyone(tm) is doing
<mjg>
heat: i would say show your unix roots and make the thread-backing slab never free the pages
<mjg>
for the time being
<mjg>
no point adding complexity to handle it at this stage
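(The "never free the pages" suggestion, sketched with made-up names: thread structs are recycled through a private freelist and their backing memory is never returned to the VM, so a spinning CPU that dereferences a stale owner pointer always lands on memory that is still a struct thread — possibly a recycled one, which the spin loop has to tolerate:)

    #include <stddef.h>

    struct thread {
        struct thread *free_next;    /* freelist linkage while not in use */
        /* ... the rest of the thread ... */
    };

    struct spinlock;                 /* kernel's spinlock, assumed */
    extern void spin_lock(struct spinlock *);
    extern void spin_unlock(struct spinlock *);
    extern struct spinlock *thread_freelist_lock;
    extern void *vmalloc(size_t size);

    static struct thread *thread_freelist;

    struct thread *thread_alloc(void)
    {
        spin_lock(thread_freelist_lock);
        struct thread *t = thread_freelist;
        if (t)
            thread_freelist = t->free_next;
        spin_unlock(thread_freelist_lock);

        if (!t)
            t = vmalloc(sizeof(*t)); /* fresh pages; never handed back below */
        return t;
    }

    void thread_free(struct thread *t)
    {
        /* no page release: this memory stays a struct thread forever */
        spin_lock(thread_freelist_lock);
        t->free_next = thread_freelist;
        thread_freelist = t;
        spin_unlock(thread_freelist_lock);
    }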
<heat>
you realize I just look at the thread and yolo right?
<mjg>
well yolo is not even looking
<mjg>
which i'm not going to say what system is apparently doing
<heat>
yo
<heat>
chill with the hostilities
<heat>
banter is opt-in, not opt-out
<mjg>
for real though, all other things equal, you would beat openbsd just by not having pessimal locking behavior
<mjg>
which they do have a lot
<mjg>
[unless they noticed it's a huge deal now that they got flamegraphs a little while back]
<mjg>
heat: fun fact, until 2005 or so solaris would walk all per-cpu state to check who is running there
<mjg>
heat: instead of derefing threads
<heat>
im not particularly interested in un-pessimizing locks since, erm, it doesn't matter much
<heat>
if I have bad IO performance or poor VM code some nanoseconds shaved off the mutex_lock won't matter
<geist>
right. optimizing without real load is generally a way to get side tracked and not get anything done
<geist>
and honestly again, i'd rather we not get into a This Vs That style discourse here
<geist>
i'd like folks to remember this is about everyone developing their own os, learning as they go, etc
<geist>
let's try not to throw up seemingly artificial barriers where we have to compete with other things, etc
<geist>
that might scare folks off
<mjg>
i would like to note that locks were top of the profile in onyx
* geist
gives up
<heat>
mjg, they were top of the profile because I have big locks in a few places
<heat>
don't forget we're looking at 4 threads, not 40
<mjg>
well i'm not gonna flame about it today
<mjg>
i would say an osdev article about the realities of locking would be nice
<mjg>
meanwhile if you find time you can read the mcs paper to get a general idea
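(For anyone following along, the MCS lock from that paper boils down to: each waiter spins on a flag in its own queue node rather than on the shared lock word, so contention doesn't ping-pong one cache line among all the waiters. A compact sketch:)

    #include <stdbool.h>
    #include <stddef.h>

    struct mcs_node {
        struct mcs_node *next;
        bool locked;
    };

    struct mcs_lock {
        struct mcs_node *tail;       /* NULL when the lock is free */
    };

    static void mcs_acquire(struct mcs_lock *l, struct mcs_node *self)
    {
        self->next = NULL;
        self->locked = true;

        /* swap ourselves in as the new tail; the old tail is our predecessor */
        struct mcs_node *prev = __atomic_exchange_n(&l->tail, self, __ATOMIC_ACQ_REL);
        if (prev == NULL)
            return;                  /* lock was free, we own it */

        __atomic_store_n(&prev->next, self, __ATOMIC_RELEASE);
        while (__atomic_load_n(&self->locked, __ATOMIC_ACQUIRE))
            ;                        /* spin on our own node only */
    }

    static void mcs_release(struct mcs_lock *l, struct mcs_node *self)
    {
        struct mcs_node *next = __atomic_load_n(&self->next, __ATOMIC_ACQUIRE);

        if (next == NULL) {
            /* no known successor: try to swing the tail back to NULL */
            struct mcs_node *expected = self;
            if (__atomic_compare_exchange_n(&l->tail, &expected, NULL, false,
                                            __ATOMIC_RELEASE, __ATOMIC_RELAXED))
                return;
            /* a successor is mid-enqueue; wait for its next pointer to show up */
            while ((next = __atomic_load_n(&self->next, __ATOMIC_ACQUIRE)) == NULL)
                ;
        }
        __atomic_store_n(&next->locked, false, __ATOMIC_RELEASE);
    }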
<heat>
<p>very herd</p>
<heat>
post it on the wiki
<mjg>
you should keep up the perf work, just ping me when you beat openbsd in a fair bench
<mjg>
with a flamegraph
<mjg>
:)
<mjg>
and if possible find out how to grab one for their kernel
<heat>
you know
<mjg>
it is plausible you are going to need to recompile it, which may be too much hassle
<heat>
I've been compiling with ubsan all this time
<mjg>
oh?
<heat>
yeah
<heat>
it may already be faster
<mjg>
well i do note obsd is partially intentionally self-shafted single-threaded
<mjg>
due to mitigations to stuff
<mjg>
which you don't have
<mjg>
(other self-shafting is them just being slow though)
<mjg>
i tried hotpatching the kernel once to not employ meltdown mitigations et al
<mjg>
but then it crashed
<mjg>
:)
<mjg>
i don't think you can disable retpoline stuff either and i don't think you are going to add it for yourself
<mjg>
so tell you what, get the fastest multithreaded result you can
<mjg>
and if that beats openbsd we will think
<mjg>
sounds good?
<mjg>
being able to run on more threads would be a great nullifier for that factor
<mjg>
i'm saying the lower quality of locks, the more they degrade when faced with contention
<mjg>
if you lessen a bottleneck somewhere, but your workload is contended, you add to said contention
<mjg>
and very well may suffer a performance loss
<mjg>
just get that per-cpu stuff sorted out kthx
<mjg>
there are many moving parts here, but when turbo bottlenecked, like you are right now, perf may do anything from going down, to not changing, to going up a little
<heat>
are you personally invested in this
<mjg>
no
<mjg>
it's not my code
<heat>
you're lyin
<heat>
you wanted me to beat openbsd
<mjg>
you already beat openbsd so at this point it's whatever