<heat> mjg, i got from 30k -> 90k, and then -> 120k by skipping the slow, sleepable locked lookup for tmpfs
<heat> in the meanwhile I'm pretty sure I broke rename(2)
<heat> but I think that all needs a revamp sooooooooooooooooooooooo
<heat> not horrible
<mjg> wut
<mjg> what's your single threaded perf
<heat> 80k
<geist> Rename is optional
<mjg> well then you got a positive scaling factor
<mjg> geist: not used in the benchmark, so who even cares!
<mjg> ship
<mjg> heat: still, waiting for the *legitimate* openbsd comparison :)
<heat> mjg, well funnily enough from -t 2 to -t 4 I get ~0% performance difference
<heat> so it scales up to 2 threads? :P
<mjg> :)
<heat> tbf it's hitting a single lock
<mjg> i have to say this little shit does not scale almost whatsoever
<mjg> and that's partially by design
<heat> vfsmix?
<mjg> yea
<mjg> well let me restate, in all current kernels it serializes on the same lock
<mjg> in principle they could make it scale though
<mjg> at least for tmpfs
<heat> which lock? /tmp's dentry?
<mjg> yea
<mjg> well not dentry, more like inode
<mjg> anyhow if you want something which has a chance to scale, see the stat benches i pasted
<mjg> doing almost linear scaling there (modulo power throttling and whatnot issues)
<heat> i think I'll push this and look into my make -j4 issues
<heat> and that will take a good revamp of IO like I was wanting to do
<heat> make -j4 is pretty much serialized on /bin/gcc's vm object, it's horrific
<mjg> wait
<mjg> what exactly did you do to skip the lock?
<mjg> is that something legit?
<heat> just a SB_FLAG_IN_MEMORY for filesystems that have all their directories in the dcache
<mjg> i find it suspicious you don't want to do a legit bench against obsd
<heat> the rwspinlock change was good but I still needed to take a rwlock for actual filesystem changes as you may guess
<mjg> it is fine if you don't feel ready, but just say so
<heat> ok fuck it
<heat> mjg, actually I'll do open3
<mjg> i'll defo want to see stat4 man
<mjg> open3 looks good tho
<heat> 160k on mine, 200k on openbsd
<heat> fuck
<mjg> :::)
<heat> fuck you
* mjg smells an allnighter
<mjg> is that _processes variant?
<mjg> can you try _threads?
<heat> no
<heat> its 3am and I've spent too many allnighters on this crap
<mjg> :)
<mjg> normally i would tease you, but on a serious note turning in sounds like a good plan
<bslsk05> ​gist.github.com: onyx-open3.svg · GitHub
<mjg> how do you store dentries
<mjg> hash? tree?
<heat> linked list baby
<heat> not like it matters here
<mjg> well you are traversing some number of entries before you even get to /tmp
<mjg> in /
<heat> sure
<heat> does it matter? probably not?
<mjg> openbsd has a rb tree
<mjg> it may hurt you single-threaded
<mjg> strcmp is pretty nasty
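One common way to keep a linked-list walk cheap is to store a hash and length with each entry, so most siblings are rejected without touching the string. The sketch below assumes a hypothetical `struct dentry`; it is not Onyx's or OpenBSD's actual layout, and the hash function is picked only for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical dentry, for illustration only. */
struct dentry {
    const char *name;
    size_t name_len;
    uint32_t name_hash;     /* computed once, when the dentry is created */
    struct dentry *next;    /* next sibling in the parent's child list */
};

/* FNV-1a, chosen only because it is short; any decent string hash works. */
static uint32_t name_hash(const char *s, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Walk the sibling list, rejecting most entries on hash and length alone. */
static struct dentry *dentry_lookup(struct dentry *first_child, const char *name)
{
    size_t len = strlen(name);
    uint32_t hash = name_hash(name, len);

    for (struct dentry *d = first_child; d != NULL; d = d->next) {
        if (d->name_hash != hash || d->name_len != len)
            continue;                   /* cheap reject, no string compare */
        if (memcmp(d->name, name, len) == 0)
            return d;
    }
    return NULL;
}
```

The walk is still linear in the number of siblings; a tree or hash table, as in the rb tree mjg mentions, changes that part.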
<heat> I see malloc taking 20% of the runtime :)
<heat> actually, more
<heat> around 30%
<mjg> what are you mallocing?
<mjg> path buf?
<mjg> oh right, file
<heat> strings, struct file
<heat> also free(strings)
<mjg> welp i'm turning in, i would say the slab approach from s-system is the way to go in general
<mjg> :>
<heat> right
<heat> i would at least be able to turn this chunky lock into something better
<mjg> the system which shall not be named has per-cpu locks there
<mjg> which is quite wasteful, but good enough as first stab
<heat> you know, this is an unfair comparison
<heat> i really should compare myself to plan9
<mjg> which one
<heat> yes
<mjg> to open?
<heat> any operation
<heat> the more IPC hops the better
<mjg> what's next, you gonna bench against fuchsia
<heat> sure
<mjg> obsd is derived from a codebase which had significant tech debt before they even started
<heat> geist, flamgraf where
<mjg> lol
<mjg> well i'm getting out of the blast radius
<geist> hmm?
<heat> mjg, my middle name is technical debt
<heat> geist, just a joke ping
<geist> ah okay
<heat> although do you have flamegraphs in fuchsia?
<heat> or is it all perfetto
<geist> what is perfetto?
<heat> the thing fuchsia uses to get nice tracing stuff
<heat> ui.perfetto.dev
<geist> i dont think it is perfetto based
<geist> but we do have a trace thing, yes
<heat> yes, and that trace thing generates a file that you feed into perfetto
<geist> okay, then you know more than i do
<geist> i know some of the lower level parts, but past a certain point it's just some magic ui goop i dont care about
<heat> yeah I asked around a few months ago
<geist> usually the traces are rendered in chrome
<heat> hrm
<geist> maybe perfetto is some new thing
<heat> so you convert the fuchsia format into chrome tracing format?
<geist> but yes, there's a system trace thing that can be used to generate renders
<geist> yes
<heat> yeah, perfetto is newish
<heat> google is switching to it in a bunch of projects
<geist> oh
<heat> it has its own tracing library/daemon which uses protobufs, and then a viewer/trace processor that supports chrome, fuchsia, protobuf, etc
<geist> yeah something like that
<geist> and it's blended with in-kernel logs
<heat> ye
<heat> i gotta say, I miss doing driver work :|
<heat> these last performance days were fun but I miss the other stuff
nur has joined #osdev
<zid> I need £4 of ewaste to purchase
<zid> so I can hit minimum shipping
<GeDaMo> I bought some thermal compound, that was about four quid
<zid> £1.62 for some lighter flints is all I've thought of so far
<zid> maybe a weird food or something
<GeDaMo> Cables? Batteries?
<zid> £4 for a cable? hmm
<zid> £5.99 for the only cable I can think of
<GeDaMo> USB memory stick?
<zid> GeDaMo do you have prime
<GeDaMo> Yes, for this month, but only because of a deceptive dialog box (and I'm not happy about it)
<zid> haha
<GeDaMo> There was no obvious way to say you didn't want it and after clicking, no way to back out of it
<zid> if you need anything worth.. £15.99 we could trade :P
<ddevault> nothing, it seems
<mrvn> ddevault: if it is used then TLS, otherwise it uses %fs
<clever> mrvn: i assume that when you call something like pthread_create, it doesnt actually run your entrypoint directly, it runs some wrapper, that allocates more TLS space, and sets that reg, then runs your entrypoint?
<ddevault> correct
<mrvn> clever: it's something ld-linux.so sets up and pthread_create calls some hooks I believe. The TLS space has a pristine copy of the data parts that get copied per thread.
<mrvn> The reg points to a struct so that has to be created too
<ddevault> ld-linux does not set it up
<ddevault> it's rather disgusting
<ddevault> libc's startup code scans the current process's elf file looking for TLS headers
<ddevault> then allocates the initial thread's copy
<ddevault> my kernel does what mrvn thought ld-linux did, which is to make the loader set up the pristine copy
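The step ddevault describes (find the TLS program header, then allocate and fill one thread's copy) could look roughly like the sketch below. The function name and the `load_bias` parameter are assumptions made for illustration, and the sketch ignores alignment and the thread control block that a real libc also sets up.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build one thread's copy of the TLS image described by a PT_TLS header.
 * `load_bias` is the address the object was loaded at (0 for a non-PIE
 * executable). Returns NULL if there is no usable PT_TLS header or the
 * allocation fails.
 */
static void *tls_copy_for_thread(const Elf64_Phdr *phdrs, size_t phnum,
                                 uintptr_t load_bias)
{
    for (size_t i = 0; i < phnum; i++) {
        const Elf64_Phdr *ph = &phdrs[i];
        if (ph->p_type != PT_TLS)
            continue;
        if (ph->p_memsz == 0 || ph->p_filesz > ph->p_memsz)
            return NULL;

        /* p_memsz covers .tdata plus .tbss; calloc zeroes the .tbss part. */
        unsigned char *block = calloc(1, ph->p_memsz);
        if (block == NULL)
            return NULL;

        /* The first p_filesz bytes at p_vaddr are the pristine .tdata image. */
        memcpy(block, (const void *)(load_bias + ph->p_vaddr), ph->p_filesz);
        return block;
    }
    return NULL;
}
```

Each new thread gets its own block built the same way, which is why the pristine image has to stay untouched.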
<linkdd> ignorant, maybe dumb, question: what is TLS space? I see that acronym and can only think about TLS/SSL/x.509 certificates
<j`ey> thread local storage
* linkdd sucks at acronyms
<linkdd> j`ey: thx
heat has joined #osdev
<heat> geist: what's the most reliable way you've found to get symbols at runtime?
<heat> i usually have grub set that up for me but not for riscv, arm64 since the boot protocol doesn't give me any of that
<clever> heat: one option ive looked into before, was arranging my linker script so i could just drop the whole .elf file in memory raw, and jump into it, then the code can find its own elf headers
<clever> in my case, the previous stage expected a raw binary, and jumped to offset 0x200 within that binary, and the elf headers fit in the hole, so a linker script could make both things match up
<clever> but then all of the other stuff like symbols and debug info put me over a different size limit, so i abandoned the idea
<heat> yes, that's a horrific solution and I hate it
<mrvn> ddevault: the crt does it? Kind of makes sense though if you have a static binary it still has to work.
<clever> i still need to look into how software like linux can embed its own symbols into the binary, without then changing the symbols
<mrvn> heat: following the C standard you can't get symbols because function pointers don't have to fit in void*
<heat> i don't care about the C standard
<mrvn> clever: better make that offset 4096
<mrvn> isn't the elf header passed as argument to _start?
<heat> no
<heat> auxv is
<heat> and auxv has the phdrs
<mrvn> +indirectly
<heat> you never get the first header afaik
<heat> the Elf_Ehdr
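For a Linux userspace process, the program headers heat mentions can be reached through `getauxval` without ever seeing the `Elf_Ehdr`. This is a minimal sketch assuming glibc or musl on Linux; `count_phdrs` is a made-up helper name. It only reaches program headers, so it does not by itself give access to section headers.

```c
#include <link.h>       /* ElfW(), pulls in <elf.h> */
#include <stddef.h>
#include <sys/auxv.h>   /* getauxval(), Linux-specific */

/* Count the running executable's program headers of the given type. */
static size_t count_phdrs(unsigned int type)
{
    const unsigned char *base = (const unsigned char *)getauxval(AT_PHDR);
    size_t entsize = getauxval(AT_PHENT);   /* size of one phdr entry */
    size_t num = getauxval(AT_PHNUM);       /* number of entries */
    size_t count = 0;

    if (base == NULL)
        return 0;

    for (size_t i = 0; i < num; i++) {
        const ElfW(Phdr) *ph = (const ElfW(Phdr) *)(base + i * entsize);
        if (ph->p_type == type)
            count++;
    }
    return count;
}
```

Calling `count_phdrs(PT_LOAD)` in an ordinary executable should report at least one loadable segment.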
<mrvn> do you need it?
<clever> mrvn: the 0x200 limit was basically hw imposed, its how the maskrom loads .bin files
<clever> but yeah, i would probably use something more like auxv maybe, if i had control of the previous stage
<mrvn> ahh, hardware, things you can kick but not change
<clever> now that i say that, i do have a .bin and a .elf stage
<clever> so i could do just that
<bslsk05> ​github.com: lk-overlay/stage1.c at master · librerpi/lk-overlay · GitHub
<clever> this bit of code parses a .elf in a void*, copies the segments based on PT_LOAD, and jumps to the entry-point
<clever> i could just give it a pointer to 1 (or more) parts of the input .elf
<clever> then it will have a reference to the original .elf, separate from the copy that is running
<clever> ah, but minor issue there, if i open an elf in a filehandle, it doesnt actually store the whole elf in a void*
<clever> it reads it in chunks as needed, and never keeps the whole thing
<clever> so i would have to follow that api, load the region with symbols, and pass it along
<geist> leet I just learned today: replace some error define like `#define ERR_FOO (-123)` with `#define ERR_FOO (printf("XXX error at %s:%u\n", __FILE__, __LINE__), -123)`
<geist> obviously doesn't work in all cases, but pretty gnarly
<clever> neat
<mjg> there is a system which uses that to slap an instrumentation probe
<mjg> so you can just check where an error is coming from
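A small sketch of the error-define trick geist describes, using `fprintf` to stderr rather than `printf`. `do_thing` is a hypothetical caller added only to show the macro in use.

```c
#include <stdio.h>

/* The plain definition would be: #define ERR_FOO (-123) */
#define ERR_FOO \
    (fprintf(stderr, "XXX error at %s:%d\n", __FILE__, __LINE__), -123)

/* Hypothetical function that fails for negative input. */
static int do_thing(int x)
{
    if (x < 0)
        return ERR_FOO;     /* logs this file and line, then yields -123 */
    return 0;
}
```

The cases where it does not work include anything that needs a constant expression (`case ERR_FOO:`, static initializers, `#if`), and a comparison such as `ret == ERR_FOO` would print a spurious message every time it is evaluated.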
<clever> i'm also reminded of the HasCallStack thing in haskell
<clever> basically, by adding that constraint to the function, the compiler will add an extra invisible parameter to every call-site for your function, which includes the source of the call-site
<clever> and if that source also HasCallStack, then it forms a linked list, the stack
<bslsk05> ​hackage.haskell.org: GHC.Stack
<clever> so its kinda like declaring your function as: void foo(CallStack *stack = generateStack());
<clever> but it dynamically detects if the call-site has the same var in its own args, and it wont collide with other vars in scope
<clever> but does add overheads, due to having an extra arg to pass on every call
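C has no implicit parameters, but the linked list of call sites clever describes can be imitated by passing a frame explicitly. This is only an analogy: `struct callsite` and the `HERE` macro are invented for illustration and are not part of GHC or any C library.

```c
#include <stddef.h>
#include <stdio.h>

/* One frame of an explicit call-site chain (hypothetical type). */
struct callsite {
    const char *file;
    int line;
    const struct callsite *caller;  /* NULL at the root */
};

/* Build a frame for "here", linked to the enclosing function's frame. */
#define HERE(parent) (&(struct callsite){ __FILE__, __LINE__, (parent) })

/* Number of frames in the chain. */
static int callsite_depth(const struct callsite *cs)
{
    int depth = 0;
    for (; cs != NULL; cs = cs->caller)
        depth++;
    return depth;
}

/* Print the chain, innermost call site first. */
static void callsite_print(const struct callsite *cs)
{
    for (; cs != NULL; cs = cs->caller)
        fprintf(stderr, "  called from %s:%d\n", cs->file, cs->line);
}

static int inner(const struct callsite *cs)
{
    callsite_print(cs);
    return callsite_depth(cs);
}

/* The extra argument is the overhead mentioned above: one more
 * pointer passed on every call. */
static int outer(const struct callsite *cs)
{
    return inner(HERE(cs));
}
```

Unlike the Haskell version, nothing here is automatic: every function in the chain has to accept and forward the pointer by hand.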
<jafarlihi> Is concept of "remote"s totally a local thing in git? I specifically want to know if adding someone else's repo as remote will change anything about your GitHub repo, following this: https://gist.github.com/wtbarnes/56b942641d314522094d312bbaf33a81
<bslsk05> ​gist.github.com: Brief instructions for how to modify and push to someone else's PR on github · GitHub
<j`ey> remotes are local
<mrvn> geist: I need to do that to fuse but I'm afraid it will return errors from libc functions it calls in turn and not return constants.
<heat> oh yeah lmao
<heat> comma operator is OP
<heat> I originally learned a similarish trick for errno from sortix
<heat> return errno = ENOENT, NULL;
<heat> I've also done return spin_unlock(lock), -1;
<moon-child> I've heard people say comma operator is evil or w/e
<dzwdz> oh nice, i never thought about using it like that
<zid> It's a bit like goto imo, you need it as part of a 'pattern' that's easy to parse
<zid> else it's a sign you fucked up
<heat> gotos are easily abusable
<heat> the comma operator really isn't
<j`ey> arent you just saving {}?
<heat> btw I think they're doing something in C++ wrt that
<heat> because doing Matrix m; m[10, 20] is becoming possible
<heat> j`ey, yes
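The two `return` idioms heat mentions can be sketched as below. The name table and the tiny spinlock are hypothetical stand-ins, written with C11 atomics so the example is self-contained.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

static const char *known[] = { "tmp", "bin" };
static atomic_flag table_lock = ATOMIC_FLAG_INIT;

/* Minimal spinlock, for illustration only. */
static void spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set(l))
        ;   /* busy-wait until the flag is cleared */
}

static void spin_unlock(atomic_flag *l)
{
    atomic_flag_clear(l);
}

/* Set errno and return NULL in a single statement. */
static const char *find_name(const char *name)
{
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
        if (strcmp(known[i], name) == 0)
            return known[i];
    return errno = ENOENT, NULL;
}

/* Drop the lock and return in a single statement on every exit path. */
static int find_index_locked(const char *name)
{
    spin_lock(&table_lock);
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
        if (strcmp(known[i], name) == 0)
            return spin_unlock(&table_lock), (int)i;
    return spin_unlock(&table_lock), -1;
}
```

As j`ey notes, the saving is only a pair of braces: assignment binds tighter than the comma, so `errno = ENOENT` runs first and the value of the whole expression is the right-hand operand.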
<heat> oh yeah, there's also this trick if you want to avoid the comma (gcc/clang)
<heat> ({printf("Error!\n"); -123})
<heat> sorry, ({printf("Error!\n"); -123;})
<bslsk05> ​gcc.gnu.org: Statement Exprs (Using the GNU Compiler Collection (GCC))
<heat> these are super-abused in linux
<bslsk05> ​github.com: Onyx/percpu.h at master · heatd/Onyx · GitHub
<moon-child> j`ey: no because {} is a statement
<moon-child> , is an expression
<moon-child> (and yeah in gnu you have ({ }), as heat says)
<j`ey> moon-child: i mean in the return examples
<moon-child> oh, sure
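A short sketch of the GNU statement-expression form heat shows, which GCC and Clang accept as an extension to ISO C. `MAX_ONCE` and `fail` are illustrative names, not taken from Linux or Onyx.

```c
#include <stdio.h>

/* The value of a statement expression is that of its last expression statement. */
#define ERR_FOO \
    ({ fprintf(stderr, "Error at %s:%d\n", __FILE__, __LINE__); -123; })

/* Evaluates each argument exactly once, unlike the classic ternary macro. */
#define MAX_ONCE(a, b) ({           \
    __typeof__(a) _a = (a);         \
    __typeof__(b) _b = (b);         \
    _a > _b ? _a : _b;              \
})

static int fail(void)
{
    return ERR_FOO;     /* logs, then yields -123 */
}
```

Unlike the comma operator, the body can hold declarations and full statements, which is what makes the single-evaluation `MAX_ONCE` possible.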
