klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
<gorgonical> I'm getting a weird feeling I've somehow broken some cache functionality. A hashtable lookup that's never broken before is exhibiting now the same kind of cache coherence behavior from earlier, requiring a manual dcache flush to show up on the other core
<gorgonical> I don't like when cache stuff like this breaks because it's very confusing
xvanc has joined #osdev
slidercrank has quit [Ping timeout: 268 seconds]
xvanc has quit [Ping timeout: 255 seconds]
<geist> yeah
<geist> you were not doing SMP before and now you are?
<geist> what changed?
<kazinsal> ha, yeah, that's something that's pretty universal. "hey this works great" *turns on another core* "what the fuck"
<klange> I continue to be rather afraid of how little I did in this regard.
<klange> What mysterious issues lurk in unseen corners?
<Mutabah> Meanwhile, I reaped the benefits from using rust recently. Turned on SMP, and after getting the bringup logic working - everything Just Worked
<moon-child> rust won't help you not screw up the cache
<gog> rust rust rust
<gorgonical> geist: I don't know if we ever got kitten with multiple cores working on this board, but we've got it working on a different one
<gorgonical> with multiple cores. that was how we learned about the morass that's inner/outer shareable and how both the tcr registers and the page tables are marked for it separately, etc
<gorgonical> I don't *think* I did anything to mess with that, but now I'm seriously paranoid lol
<moon-child> RUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUST
<Mutabah> moon-child: Well, not things like TLB shootdown sure... but other race conditions yep
<moon-child> kindaaa
<moon-child> it can at least attempt to draw attention to cases when something is shared
<moon-child> which you can argue is advantageous
<moon-child> but you can always manifest your own race conditions at a higher level
<gorgonical> I haven't changed any of the code to do with memory configuration and stuff, just checked. I did check and the TCR has shareability set to inner, which is the less inclusive choice
<gorgonical> I should check the page tables themselves now
<moon-child> not saying useless. But not panacea either
<geist> yah this may be where you have been missing some memory barriers and now you need them
<geist> for example updating a page table entry and not DSBing when done
xvanc has joined #osdev
<gorgonical> Is it possible that this is "IMPLEMENTATION DEFINED" behavior varying? I'm seeing with an alarmingly large amount of the arm cache/translation stuff that the exact behavior is impl. def.
<geist> not anything in v8
<gorgonical> I am currently chalking it up to the pine64 board being different from the rk3399. After all, the rk3399 actually has two clusters, meaning in theory there is a difference between inner/outer shareable. But I was under the impression that if PE 0 and 1 are in the same "shareability domain" and you mark pts as inner shareable you shouldn't have to manually manage cache
<geist> v8 standardized things pretty solidly. before that there were some exceptions
<geist> and no. the two clusters should absolutely be inner sharable
<gorgonical> then what is outer for?? the manual at least suggests clusters could be in different domains
<geist> otherwise it wouldn't work. all cores that expect to be in the same SMP domain *must* be inner sharable
<geist> outer is basically deprecated. idea is you could put cores that are off running different things
<geist> or say, a GPU or whatnot
<geist> but i think in practice nothing uses it
<gorgonical> oooh a gpu is not a bad idea for is
<gorgonical> but I see what you mean
<gorgonical> but basically it means that for all systems if you mark it as *either* shareability then it should work as you expect, without explicit cache management?
<geist> well, not entirely sure what you're asking there
xvanc has quit [Ping timeout: 255 seconds]
<gorgonical> What I'm really getting at is that since inner is fully covered by outer and outer is deprecated, it really means you have no-share and inner/outer
<geist> well, and 'sys' which is the outer to outer
<geist> at least in things like DSB and whatnot
<geist> think of it this way: inner sharable is for synchronizing between cpus in an SMP system
<geist> the SYS domain is for synchronizing with things like devices
<geist> so if you just want to say insert a barrier so that another cpu sees it: dsb ish
<geist> but if you want it to do the whole thing it's 'dsb sy'
<gorgonical> Holy crap those acronyms just clicked
<geist> in linux for example a bunch of these are hidden behind 'mb()' macros
<gorgonical> Before they just looked like perl sigils to me but it's "data sync barrier inner-share"
<geist> like smp_mb vs mb. and in the case of smp_ it'll probably be ish
<geist> yep!
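
A minimal sketch of the distinction geist draws above between `dsb ish` (other CPUs in the SMP system) and `dsb sy` (the whole system, devices included), written as C macros with GCC/Clang inline assembly. The names `smp_barrier`, `sys_barrier`, `mailbox_data`, `mailbox_ready`, `publish_to_cpu` and `kick_device` are hypothetical and come from neither Linux nor kitten. The non-ARM fallback uses the GCC/Clang `__atomic_thread_fence` builtin and exists only so the sketch builds on other hosts.

```c
/* Sketch only: hypothetical names, not the Linux or kitten definitions. */
#if defined(__aarch64__)
/* Barrier scoped to the inner shareable domain (CPUs in the same SMP system). */
#define smp_barrier() __asm__ volatile("dsb ish" ::: "memory")
/* Barrier scoped to the full system, for example before ringing a device doorbell. */
#define sys_barrier() __asm__ volatile("dsb sy" ::: "memory")
#else
/* Assumption: on non-ARM hosts fall back to a full fence via the compiler builtin. */
#define smp_barrier() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define sys_barrier() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

static volatile unsigned int mailbox_data;
static volatile unsigned int mailbox_ready;

/* Hand a value to another CPU: write the data, barrier, then write the flag. */
static void publish_to_cpu(unsigned int value)
{
    mailbox_data = value;
    smp_barrier();
    mailbox_ready = 1;
}

/* Hand a value to a device: same shape, but with the system-wide barrier. */
static void kick_device(volatile unsigned int *doorbell, unsigned int value)
{
    mailbox_data = value;
    sys_barrier();
    *doorbell = 1;
}
```

The reader on the other CPU would need a matching barrier (or an acquire load) between reading the flag and reading the data; that half is left out here.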
outfox_ has quit [Ping timeout: 255 seconds]
<gorgonical> Hmm. Okay maybe I'm forgetting some basic CS theory here but what's the point of setting them shareable if, to be sure it happens, I have to dsb ish after each malloc for a smp-shared resource?
<geist> because ARM is a weakly ordered architecture (vs x86)
<gorgonical> Like, is the setting in tcr_el1 and the pt entries just a hint to the caching engine about what to flush and stuff when I *do* run dsb ish?
<geist> without actual barriers at particular points, other cores are not guaranteed to see writes in any particular order
<geist> it's still cache coherent, but the order at which things happens is undefined
<gorgonical> Gross. I'm also shocked this wasn't somehow a problem before
<geist> so you use barriers like dmb/dsb/isb at various points or use instructions with built in barriers (lots of atomics do) so that it's not a problem
<gorgonical> I mean it's not shocking considering this is originally an x86 OS but we *did* port it
<geist> in general if you aren't doing a lot of volatile sharing, and you're probably using locks and atomics, it just works
<geist> since locks and atomics almost always have a barrier built in them already
<gorgonical> oh that's a good point
<gorgonical> what's the acronym for dmb then?
<moon-child> data memory barrier?
<geist> ie you grab the mutex (barrier), now you can fiddle with the object(s) all you want, and writes to the cpu can happen in any order, then you go to release the mutex (barrier), now all other cpus see everything occurred
<geist> (this is where acquire and release semantics come into play, where the barrier can operate in either direction or both
<geist> )
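
The mutex pattern geist describes maps onto C11 atomics roughly as follows. This is a sketch, not kitten's or Linux's spinlock: `sketch_lock_t`, `sketch_lock`, `sketch_unlock`, `shared_counter` and `bump_counter` are hypothetical names. The acquire on lock keeps later accesses from moving above it, and the release on unlock keeps earlier accesses from moving below it. On AArch64, compilers typically lower these to acquire/release instruction forms, though the exact instructions depend on the compiler and target options.

```c
#include <stdatomic.h>

/* Hypothetical minimal spinlock built on a C11 atomic_flag. */
typedef struct {
    atomic_flag locked;
} sketch_lock_t;

/* Spin until the flag was previously clear; acquire ordering on success. */
static void sketch_lock(sketch_lock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire)) {
        /* busy-wait; a real kernel would add a pause/wfe hint here */
    }
}

/* Clear the flag with release ordering so prior writes are visible first. */
static void sketch_unlock(sketch_lock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

static sketch_lock_t counter_lock = { ATOMIC_FLAG_INIT };
static int shared_counter; /* plain (non-atomic) data protected by the lock */

/* Everything between lock and unlock may be reordered freely inside the
 * critical section, but cannot leak out past either end. */
static int bump_counter(void)
{
    sketch_lock(&counter_lock);
    int value = ++shared_counter;
    sketch_unlock(&counter_lock);
    return value;
}
```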
<gorgonical> ah
<geist> and yeah DMB is a weaker and more subtle form of DSB
<geist> DMB just informs the cpu that when it gets around to writing out stuff it has to do it in this particular order (before + after)
<gorgonical> Then in theory if this stuff was working before I need to go snoop around my atomics/locks and make sure I didn't break something there. We use locks and such properly
<geist> but doesn't otherwise stop the world
<geist> DSB is more of a stop the world, make it get out into the cache subsystem
baine has quit [Ping timeout: 248 seconds]
<geist> i think you'll also notice the cache flush routines have a DSB at the end right?
<gorgonical> yes
<geist> same with TLB sync routines on arm64
<geist> that DSB also acts as a 'make sure those pending cache flushes/tlb syncs happened'
<gorgonical> yeah it's transactional, right
<geist> since they are *also* weakly ordered
<gorgonical> It is more flexible but also more complex
<geist> think of both of those kinda operations as a special kinda bus write, essentially. queued up with all the other writes the cpu may be doing
<geist> yah makes you appreciate (or be grossed out) by all the efforts x86 cpus must go through to make things appear in order
<geist> even if they aren't really
<gorgonical> seymour cray was right
<gorgonical> multicore is a mistake
<gorgonical> lol
baine has joined #osdev
<geist> x86 cpus are tracking a fairly substantial amount of state to keep everything looking in order from the outside
<moon-child> from what I understand from what david chisnall said, the effect of tso is basically just that you have to have a bigger write buffer
<moon-child> weak ordering def better though
<geist> right you have a large write buffer and strict dependencies between entries in it
<moon-child> mehh
<moon-child> need lots of fancy dependency tracking anyway
<moon-child> I actually kinda wonder what would happen if you moved more of that out of the uarch into the arch
<geist> true, though it's much simpler to just slam stuff out as the cache lines become available.
<geist> but it's a good question: what is the real cost of all of that
<moon-child> eg don't do memory disambiguation in hardware; instead, have alias tags on loads/stores
<geist> really the apple M1 cpus are probably the best way to test that theory, since they support both modes dynamically
<moon-child> maybe. Maybe they underestimate the effect, since you're still paying in area
nyah has quit [Quit: leaving]
<geist> they need it because of x86 emulation, on top of an otherwise weakly ordered machine
<geist> but in that case yeah they have to build the silicon to do it, but can turn it off for regular ARM code
<geist> so presumably there must be a win, since they have both situations at the same time
<geist> otherwise they could just choose to leave it on all the time since they've already got the silicon for it
outfox has joined #osdev
<moon-child> yeah, but point is there might be an even bigger win if you could take that silicon and use it for something else
<gorgonical> so do the ldaxr/stxr instructions do something like dmb/dsb? They don't, right? This is just for synchronizing on the lock address itself
<geist> moon-child: oh 100%. they dont have that choice but most arm cores of course do, and thus dont have the overhead of silicon for it
<moon-child> there are some other big arm cores with a tso mode iirc
<geist> gorgonical: they do, the 'a' is the acquire part
<geist> otherwise it's just ldxr
<geist> and similarly stlxr vs stxr (the 'l' is for release)
<gorgonical> That's disturbing then
<geist> so using one or the other on either end gets you an acquire/release or both (seqcst)
<geist> https://marabos.nl/atomics/hardware.html#instructions-overview is a pretty good cheat sheet once you grok it
<bslsk05> ​marabos.nl: Rust Atomics and Locks — Chapter 7. Understanding the Processor
<gorgonical> If you always acquire then you never need to release, right?
<geist> no not at all
<geist> acquire vs release are all about which 'direction' the barrier applies to
<geist> ie, before vs after memory transactions
<bslsk05> ​www.cl.cam.ac.uk: C/C++11 mappings to processors
<geist> if you put both then you have created a full before & after barrier
<geist> otherwise an acquire or release barrier only works in one direction, which is fine for the nomenclature: you're acquiring access to the data vs releasing it
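
The one-directional nature of acquire and release that geist describes can be sketched with C11 atomics by publishing a pointer from one CPU to another. Everything named here (`struct htable`, `table_storage`, `global_table`, `publish_table`, `read_nbuckets`) is hypothetical and only illustrates the pairing: the writer's release store keeps the initialization before the pointer write, and the reader's acquire load keeps the dereference after the pointer read.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared hash table. */
struct htable {
    int nbuckets;
};

static struct htable table_storage;
static _Atomic(struct htable *) global_table = NULL;

/* Writer side: initialize the object, then publish the pointer with release
 * ordering so the initialization cannot be reordered after the publish. */
static void publish_table(int nbuckets)
{
    table_storage.nbuckets = nbuckets;
    atomic_store_explicit(&global_table, &table_storage, memory_order_release);
}

/* Reader side: the acquire load pairs with the release store above, so a
 * non-NULL pointer implies the fields behind it are visible too.
 * Returns -1 if nothing has been published yet. */
static int read_nbuckets(void)
{
    struct htable *t = atomic_load_explicit(&global_table, memory_order_acquire);
    return t ? t->nbuckets : -1;
}
```

Using only one half of the pair (acquire on both sides, say) would not give the writer's stores any ordering relative to the pointer store, which is why both directions are needed.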
<gorgonical> Then I must be misunderstanding something
gog has quit [Ping timeout: 255 seconds]
<gorgonical> Or something is super borked
<gorgonical> I allocated this htable inside a spinlock, which has ldaxr for locking and stlr for unlocking the lock
<gorgonical> But if I don't explicitly flush_dcache_area on the *pointer* for the htable, the pointer shows up in core 1 as null.
<geist> did you enable the cache on core 1?
<gorgonical> But the malloc is inside the spinlocks
<geist> what kinda cores are these?
<gorgonical> These are the standard cores on the rk3399, so a72
<gorgonical> Cache should be enabled on core 1
<geist> double check that
<geist> (almost certainly is because otherwise your spinlocks would asplode)
<geist> are you positive your spinlock works properly?
<gorgonical> Positive in the sense that they have been working for years at this point and to port it about 18 months ago all we did was copy Linux's spinlock primitives for the arch port
<gorgonical> We did not re-engineer the asm for the spinlocks
<geist> when you initialize the second core did you invalidate its cache? (should be fine, but a good idea)
<geist> in case it has stale entries with garbage in it
<gorgonical> Yeah part of head.S has a big dcache flush I believe
<geist> flush or invalidate?
<geist> ie, clean vs invalidate vs clean + invalidate?
<geist> (to use arm's nomenclature which is very precise)
<gorgonical> bl __flush_dcache_all; ic iallu; dsb sy
<geist> on the secondary core?
<gorgonical> Yes
<geist> before the cache is enabled?
<geist> what precisely does that flush dcache routine do?
<gorgonical> Should be. We go from cpu_setup to enable_mmu
<gorgonical> Let me point you to it
<geist> okay
<bslsk05> ​github.com: kitten/cache.S at master · HobbesOSR/kitten · GitHub
<geist> yeah that's probably fine. doubly so if you boot it from PSCI which i think guarantees that the cache is clean
<gorgonical> I wasn't responsible for this code so I don't know it well
<geist> hmm, yeah that code is not really a great idea, but it's a super hard problem to generically do
<geist> so for example that code uses a `dc cisw` at the bottom of it
<geist> so it's really clean+invalidate
<geist> which strictly speaking isn't a good idea here because what it does is cause the secondary cpu to write out any stale cache lines it may have
<geist> since the cache isn't enabled, and/or it was initialized with garbage it could hypothetically overwrite something
<geist> but it probably doesnt
<geist> usually the right idea is if you really want all your ducks in a row either dont do anything (because PSCI guarantees cpus come up with an empty cache) or do a pure *invalidate* over your cache
<geist> but *only* to the point of unification
<geist> so you end up needing like 8 variants of that
<geist> i kinda doubt any of this is a problem
<gorgonical> I have a suspicion that running in secure mode is contributing to the problem but I am not sure
<gorgonical> So we flush+invalidate, then load sctlr_el1 val which enables caching. Since it's the same kernel code the pts and control registers are the same, so they should all have the same sharing settings
<geist> yeah
<geist> must be something you dont know could be a problem, and thus aren't reporting
<geist> ie, everything you have told me about seems correct at least on the surface
tiggster79 has joined #osdev
xvanc has joined #osdev
heat has joined #osdev
<heat> hell
<heat> i saw i missed a RUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUST moment, so there it is
xvanc has quit [Ping timeout: 248 seconds]
xvanc has joined #osdev
frkzoid has joined #osdev
wand has quit [Remote host closed the connection]
gxt__ has quit [Remote host closed the connection]
gxt__ has joined #osdev
wand has joined #osdev
heat has quit [Read error: Connection reset by peer]
heat has joined #osdev
[itchyjunk] has quit [Remote host closed the connection]
xvanc has quit [Remote host closed the connection]
baine has left #osdev [WeeChat 3.7.1]
xvanc has joined #osdev
terrorjack has quit [Quit: The Lounge - https://thelounge.chat]
terrorjack has joined #osdev
elderK has joined #osdev
heat has quit [Ping timeout: 248 seconds]
Arthuria has joined #osdev
<gorgonical> The world is chaos and I still don't understand where these sync errors are from
SpikeHeron has quit [Quit: WeeChat 3.8]
SpikeHeron has joined #osdev
gxt__ has quit [Remote host closed the connection]
<geist> the silly thing is it'll end up being something tiny
<geist> or worse, something you aren't doing
gxt__ has joined #osdev
<zid> I have the solution to all your code woes
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
<geist> got the visionfive 2 rv64 running
gxt__ has quit [Remote host closed the connection]
<geist> better performance than i expected. sits nicely alongside quad a53 boards i've used
<geist> which is exactly in line with the kinda cpu tech thats in it (quad sifive u74 cores @ 1.5Ghz)
gxt__ has joined #osdev
Arthuria has quit [Remote host closed the connection]
bgs has joined #osdev
<gorgonical> Here's a thought. If I run core 0, which should be releasing the spinlock, in a loop waiting for core 1 to change some sentinel, if I run dsb sy each loop iteration will that sort of act as a total ordering flush for the system?
<gorgonical> Just for debugging purposes?
<gorgonical> Core 0 reads that the spinlock is released but core 1 doesn't and is stuck forever
slidercrank has joined #osdev
xvanc has quit [Remote host closed the connection]
<qxz2> what are your opinions of go? i know it's mostly meant for distributed/cloud apps.
<zid> only thing I really know is that the binaries are absolutely enormous and the ecosystem is so fragile that people just maintain all their own deps
<qxz2> i think they're statically linked
<qxz2> which probably explains the bloated binaries
<zid> some language feature like reflection or debug api means they have to include the source for every line and stuff in everything
<zid> there's still afaik an open bug for "yea, maybe hello world shouldn't be 10MB?"
<qxz2> hah
<zid> that former thing is also annoying to corps like nvidia who use it afaik
<zid> I can't remember what their solution was, source obfuscation or something
<zid> or hacking the binaries up
<qxz2> nvidia is known for using go? i didn't know that
<zid> no, they're known for making graphics cards, you'd think you'd have heard of them before :P
<geist> i suspect go is quite good at doing precisely what it sets out to do and isn't so great in other contexts
<geist> ie, is more domain specific than it probably wants to be
<qxz2> i'm very familiar with nvidia
<qxz2> what do they use go for?
<geist> but i can't say i have a tremendous amount of go experience
<geist> i was briefly swept up in the hype 10 years ago or so
<qxz2> the generics are idiosyncratic
<qxz2> it's an interesting reaction to typical OO langs
<qxz2> composition instead of inheritance
<qxz2> interface types
<geist> gorgonical: how are you configuring the page tables?
<geist> there's a shared/no shared bit
<zid> qxz2: idk, wouldn't surprise me if some random stuff like the debugger or experience etc was written in it, I was just talking to someone working on some go at nvidia listening to his rambling
<qxz2> i googled around and saw nvidia job ads looking for go experience, mostly for backend type dev, which makes sense
<geist> and/or web backend where i think it shines
<qxz2> distributed systems are go's niche
<qxz2> "Building automation for routine maintenance tasks for GPU farm
<qxz2> fun
<geist> yah makes sense.
<qxz2> concurrency is built in. i think the runtime has its own scheduler.
<zid> neat, there's a tool that can unpack the nvidia driver packages, do the installation by hand with better configurability
<zid> and apply a driver patch to enable MSI
Vercas has quit [Quit: Ping timeout (120 seconds)]
Vercas has joined #osdev
<gorgonical> geist: the shared/unshared bit refers to the two bits in the pte right?
<gorgonical> where 11 is ish and 10 is osh?
<gorgonical> or so
<geist> iirc there's a NS bit
<geist> no. it's another one
<gorgonical> That's the non-secure bit
<geist> yeah okay, maybe there's another one
<geist> basically it says 'dont bother doing cache coherency with this' i think. basically unused. lemme see
<qxz2> is this x86_64?
<immibis> ARM documentation tells me that cache coherency is system-specific, but that is from the point of view of the single-core CPU
<geist> ah yeah i guess it is bit 8
<qxz2> oh nm
<gorgonical> Because I'm losing my mind over here. I can't even get the spinlock to communicate even when I loop dsb sy
<immibis> DSB implies ARM I think
<bslsk05> ​github.com: lk/mmu.h at master · littlekernel/lk · GitHub
<gorgonical> qxz2: yeah arm64
<geist> anyway you def want to set inner sharable there on all the PTEs
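
For reference, a small sketch of the shareability field being discussed. In ARMv8 stage 1 block and page descriptors the SH field occupies bits [9:8], with 0b00 non-shareable, 0b10 outer shareable and 0b11 inner shareable, which matches the encodings gorgonical quotes. The macro and function names below are hypothetical and are not the ones used by kitten or LK.

```c
#include <stdint.h>

/* SH[1:0] lives in bits [9:8] of a stage 1 block/page descriptor. */
#define PTE_SH_SHIFT 8
#define PTE_SH_MASK  (UINT64_C(3) << PTE_SH_SHIFT)
#define PTE_SH_NON   (UINT64_C(0) << PTE_SH_SHIFT) /* 0b00: non-shareable   */
#define PTE_SH_OUTER (UINT64_C(2) << PTE_SH_SHIFT) /* 0b10: outer shareable */
#define PTE_SH_INNER (UINT64_C(3) << PTE_SH_SHIFT) /* 0b11: inner shareable */

/* Return the descriptor with its SH field forced to inner shareable,
 * leaving every other bit untouched. */
static uint64_t pte_set_inner_shareable(uint64_t pte)
{
    return (pte & ~PTE_SH_MASK) | PTE_SH_INNER;
}

/* Check whether a descriptor is marked inner shareable. */
static int pte_is_inner_shareable(uint64_t pte)
{
    return (pte & PTE_SH_MASK) == PTE_SH_INNER;
}
```

A debugging pass over the page tables could call `pte_is_inner_shareable` on each leaf entry to confirm that the mappings on both cores agree with the TCR settings.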
<gorgonical> Yeah those are set
<geist> yah using non sharable pages there is a feature that i dont think anything uses, and is possibly unimplemented in most cores
<geist> but still worth setting
<gorgonical> it is definitely implemented in the pine64
<gorgonical> lol
<geist> there's some verbiage on a lot of this inner/outer stuff that an implementation may actually implement something 'larger'
<gorgonical> That was how we discovered it was a feature in the first place
<geist> ah
<gorgonical> When we first ported kitten to arm64 fully
<geist> so just to be clear this is the very first time you've tried to enable SMP on ARM64?
<geist> whereas you have working SMP on x86?
<gorgonical> No, this is the first time I've tried smp on arm64 in secure mode on the rockpro64
<gorgonical> We have smp working in regular mode on qemu and the pine64
<geist> hmm
<gorgonical> yes exactly
<geist> can't say i think there's much of a difference there, except the whole secure bit in the page tables
<gorgonical> And these are basically the exact same problems that we solved the last time by fixing coherency issues
<geist> maybe that interacts with the cache subsystem in a weird way
epony has quit [Quit: QUIT]
<gorgonical> That's my only thought but I did look at the manual with some scrutiny and ctrl+f and didn't find any... clear indication that it might mess with it
<geist> but in general secure vs insecure mode should be hidden. the cpu even boots in secure mode, so it's possible that a random linux is actually running in secure mode, just because there's nothing that took it away
<geist> probably does a lot of the time anyway
<gorgonical> yes that's right. your bootloader actually needs to be sure and untick the box, but if you leave it on most stuff will 'just work'
<geist> and the secure/nonsecure bit only really kicks in when you have both modes active at the same time, otherwise they dont really *do* anything
<gorgonical> yes
<gorgonical> right
<gorgonical> so in theory I wouldn't expect the secure bit to have such dramatic impacts on the coherence subsystem
<gorgonical> Unless something horrifying like one of the cores running in secure mode and the other one not is happening
<gorgonical> Wait
<geist> yah there's some complicated verbiage about generating cache entries in one mode and then trying to use it in another, etc
<geist> oooh.
<gorgonical> Boy I hope that's not true lol
<geist> that might do it, because i think secure cache lines can't really be accessed from non secure cpus, etc
<gorgonical> trusted firmware and the psci interface sucks anyway man. It's always something
<geist> and then i honestly dont know what happens, up to and including UNDEFINED
<gorgonical> trusted firmware doesn't even allow secure world callers to issue psci calls
<geist> having fun building linux here on this riscv machine. it's slow! but it's cortex-a53 slow, so really that means it's performing as expected
<gorgonical> I'm jealous. I haven't had time to play with my riscv machines for a while and won't for more time
<gorgonical> dissertation research and stuff looms
<gorgonical> But I badly want to port kitten to the d2 riscv core I have
<geist> yah sadly nothing on the market yet supports virtualization mode, except qemu
<geist> looks like some of the new sifive performance cores, the P670 in particular, does
<geist> but dont know when that'll show up in hardware
<gorgonical> I am very pleased that sifive has so much traction. I was working somewhere we could afford the boards but I'm happy someone's actually doing it
<gorgonical> we had the unmatched
<geist> yah i have an unleashed around here somewhere
<geist> and an unmatched
<gorgonical> The d2 I got is this goofy clockwork pi devterm thing
<gorgonical> It comes with a thermal ink receipt printer
<gorgonical> ...
<geist> probably will get ahold of one of the horse creeks when they come out, but they're going to be P550 based, which is just before the cut to get virt, etc
<gorgonical> This nonsecure core theory can be tested, too. I haven't actually put the code in kitten to turn on the trustzone memory controller for the kernel image
<gorgonical> That would quickly tell me if secondary core is secure or not
<gorgonical> geist have you played with zig at all? It came to my attention recently and its constant presence on HN makes me wonder about it
xvanc has joined #osdev
xvanc has quit [Remote host closed the connection]
<geist> hmm, zig the language no. someone mentioned it at work as something to look at
<geist> and they seemed to be pretty impressed
xvanc has joined #osdev
<gorgonical> I haven't done much work in it but it seemed very simple and even pleasing. In the way that c++ is impressive but because of its complexity, zig felt impressive because of what it seems to be able to do while still looking like c to me
<gorgonical> I wrote a little stub kernel in D's betterC subset and I remember wishing "why isn't this just a whole language"
<moon-child> isn't it a whole language?
<moon-child> also: betterc is mostly a marketing thing
<zid> am I allowed to pronounce that like "beturk"
<gorgonical> nobody can stop you, zid
<zid> To make something turkish
<gorgonical> moon-child: how do you mean about it being marketing?
<gorgonical> that it has neat interop or something like that?
<moon-child> there's not much point in using it over 'regular d'
<gorgonical> I mean unless you're trying to do something like a simple kernel
<zid> using regular D also lets you make crappy memes about sex
<moon-child> gorgonical: no
<moon-child> omg
<moon-child> 'use lldb', they said
GeDaMo has joined #osdev
<moon-child> 'it's great', they said
<moon-child> I tried it, and it span up to 100% cpu doing who knows what
<zid> First they came for the lldb users but I did not speak out, for I was sane
<moon-child> never doing that again
* moon-child remains annoyed gdb doesn't work on mac
<gorgonical> I've been using gnu global and ggtags recently and I quite like it
smach has joined #osdev
epony has joined #osdev
levitating has quit [Ping timeout: 268 seconds]
levitating has joined #osdev
Vercas has quit [Quit: Ping timeout (120 seconds)]
Vercas has joined #osdev
gog has joined #osdev
<gorgonical> Good job forgetting for nearly an hour that ioremap_nocache exists and fumbling around like a dingus with head.S
<gorgonical> Probably time to sleep for the night
danilogondolfo has joined #osdev
nyah has joined #osdev
Turn_Left has joined #osdev
Left_Turn has quit [Ping timeout: 264 seconds]
danilogondolfo has quit [Remote host closed the connection]
<geist> hmm i really should give zig a shot though
<geist> that's at least two people that have recommended it for all the reasons i'm interested in
levitating has quit [Ping timeout: 255 seconds]
danilogondolfo has joined #osdev
levitating has joined #osdev
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
gildasio has joined #osdev
tepperson has quit [Ping timeout: 248 seconds]
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
Amorphia is now known as Rust
epony has quit [Remote host closed the connection]
zxrom has quit [Quit: Leaving]
epony has joined #osdev
Rust is now known as Amorphia
Vercas1 has joined #osdev
<epony> is it supported by GCC ?
Vercas has quit [Remote host closed the connection]
Vercas1 is now known as Vercas
<mjg> geist: zig?
<mjg> geist: quite frankly i would just spend time on rust man :)
<mjg> geist: why the zig
<davros1> geist have you tried any other C/C++ alternatives cough rust cough
<davros1> I must admit although 100% capable rust is a PITA for low level (IMO).
<mjg> excuse me sir, can i talk to you about a fast and memory-safe alternative to whatever you are using right now?
<davros1> Look what I did there with a disclaimer before I get onto its virtues
<davros1> Seems there are some other options appearing that offer better C++ interoperability
<davros1> I can't afford another language switch though
<pitust> isnt zig the language where the compiler is a minefield?
<mjg> is it?
<mjg> i think that's the one which compiles to c
<davros1> They have their own backends
<mjg> can't be arsed to check
<davros1> Its not compile to C, they do LLVM and their own codegen
<davros1> The aspect in which the compiler may be a minefield perhaps is it lets you run code at compile time
<davros1> To be fair thats probably less of a minefield than an accidentally discovered Turing complete language in the type system
<pitust> afaik the zig codegen itself can be buggy
<davros1> I must admit I thought they were biting off more than they could chew even trying that
<davros1> Rust has stuck to LLVM
<pitust> even with llvm
<davros1> I even wish rust could compile to C
<davros1> Heh ok
<pitust> you can compile llvm to c with llvm-cbe
<davros1> Well maybe if they weren't trying to write their own codegen they could fix those bugs
<davros1> Oh I thought that had all bitrotted (last time I tried)
<davros1> Rust -> C would have made getting onto some platforms a little easier
riverdc has quit [Quit: quitting]
<bslsk05> ​JuliaHubOSS/llvm-cbe - resurrected LLVM "C Backend", with improvements (127 forks/678 stargazers/NOASSERTION)
<davros1> Ok nice thanks
<pitust> it works
riverdc has joined #osdev
gabi-250_ has quit [Remote host closed the connection]
gabi-250_ has joined #osdev
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
heat has joined #osdev
<heat> mjg, hello sir can i talk to u for a second about a fast and unsafe alternative to whatever ur using rn
<heat> its called ansi c (anzi see)
<zid> heat fix my shader
<heat> whats the problem
<zid> my uniform is being optimized out
<zid> uniform sampler2D tex; outcol = texture(tex, frag_uv);
<zid> getAttribLocation(p, "tex") is returning -1
<mrvn> p is undefined
<heat> out of my Area Of Expertise(tm)
<mrvn> is "uniform" an attribute like static?
<zid> why did you ask then :P
<heat> now anyone else can help you
<heat> win for everyone
<mrvn> why oh why does /proc/pid/fd/ have a link count of 2? Should be 2+<num fds> :(
<sham1> OpenGL should be able to tell you why the operation fails
<sham1> I just don't remember the API off the top of my head
tepperson has joined #osdev
<tepperson> this is a dump from qemu with gdb, what might cause this behavior? I'm seeing instructions executed that don't exist. https://pastebin.com/6fuJwtwB
<bslsk05> ​pastebin.com: weird long mode behavior? - Pastebin.com
gxt__ has quit [Remote host closed the connection]
gxt__ has joined #osdev
<heat> tepperson, info registers pls
<tepperson> info registers here -> https://pastebin.com/5XAs7LfE
<bslsk05> ​pastebin.com: registers for long mode - Pastebin.com
<zid> they're not aligned
<zid> 0x000000000010007e in enter_long ()
<zid> add [rax], al is 0x00 afaik
<zid> 0x0000000000100078 <+14>:mov %ax,%gs
<zid> 0x000000000010007b <+17>:mov $0x0,%eax
<zid> you've executed the 0x0 in the mov eax
<tepperson> 0x10007b seems to execute fine
<heat> you're not in long mode
<zid> yea mismatched cpu mode is 99% the likely reason
<heat> unless gdb is batshit crazy atm
<heat> "efer 0x0"
<zid> idk which mode it disassembled it in though without the machine code
<heat> i wanted a qemu info registers btw
<tepperson> for efer, mov ecx, 0xc0000080
<tepperson> rdmsr
<tepperson> or ecx, 0x100
<tepperson> wrmsr ?
<heat> that's wrong yeah
<heat> "select register to operate on, rdmsr, or (register to operate on), 0x100, wrmsr"
<heat> you're lucky it doesn't crash
<heat> you're pretty much just copying EFER to 0xc0000180
<zid> mine is.. mov ecx, 0xc0000080; rdmsr; or eax, 1 | 1<<11 | 1<<8; wrmsr
epony has quit [Remote host closed the connection]
<zid> cus you know, it's eax that has the value, ecx that has the *address*
<heat> mov $IA32_EFER, %ecx
<heat> rdmsr
<heat> or $(IA32_EFER_LME | IA32_EFER_NXE), %eax
<heat> xorl %edx, %edx
<heat> wrmsr
<heat> defines 4 life
<zid> I have comments in, irl
<zid> don't need a define for a oneshot, comment is fine
<zid> 1 is syscall, 11 is NX, 8 is long
<heat> yeah mine isn't a oneshot, i later enable syscall
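The register mix-up discussed above can be reproduced on paper. The following is a minimal sketch, assuming a toy model of `rdmsr`/`wrmsr` in which ECX selects the MSR and EDX:EAX carry the value; `ToyCpu` and the helper names are hypothetical and exist only for illustration, while the MSR number and bit positions are the architectural ones for IA32_EFER.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

constexpr std::uint32_t IA32_EFER = 0xC0000080;  // MSR address
constexpr std::uint32_t EFER_SCE  = 1u << 0;     // SYSCALL/SYSRET enable
constexpr std::uint32_t EFER_LME  = 1u << 8;     // long mode enable
constexpr std::uint32_t EFER_NXE  = 1u << 11;    // no-execute enable

// Toy model (hypothetical): just enough state to follow the data flow.
struct ToyCpu {
    std::uint32_t eax = 0, ecx = 0, edx = 0;
    std::map<std::uint32_t, std::uint64_t> msr;

    // rdmsr: EDX:EAX <- MSR[ECX]
    void rdmsr() {
        const std::uint64_t v = msr[ecx];
        eax = static_cast<std::uint32_t>(v);
        edx = static_cast<std::uint32_t>(v >> 32);
    }
    // wrmsr: MSR[ECX] <- EDX:EAX
    void wrmsr() {
        msr[ecx] = (static_cast<std::uint64_t>(edx) << 32) | eax;
    }
};

// The sequence from the question: the OR lands on ECX, the MSR address.
inline void enable_long_mode_buggy(ToyCpu& cpu) {
    cpu.ecx = IA32_EFER;
    cpu.rdmsr();
    cpu.ecx |= EFER_LME;  // bug: the address becomes 0xC0000180
    cpu.wrmsr();          // copies the old EFER value to the wrong MSR
}

// The corrected sequence: the OR lands on EAX, the low half of the value.
inline void enable_long_mode_fixed(ToyCpu& cpu) {
    cpu.ecx = IA32_EFER;
    cpu.rdmsr();
    cpu.eax |= EFER_LME | EFER_NXE;
    cpu.wrmsr();
}

inline void example_efer() {
    ToyCpu bad;
    bad.msr[IA32_EFER] = 0;
    enable_long_mode_buggy(bad);
    assert(bad.msr[IA32_EFER] == 0);          // EFER untouched
    assert(bad.msr.count(0xC0000180u) == 1);  // stray write happened

    ToyCpu good;
    good.msr[IA32_EFER] = 0;
    enable_long_mode_fixed(good);
    assert(good.msr[IA32_EFER] == (EFER_LME | EFER_NXE));
}
```

On real hardware a write to an unimplemented MSR normally raises #GP, which is why the buggy version was called lucky not to crash.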
<gog> why aren't you clearing eax
<heat> idk gog
<zid> as in mov instead of or?
<gog> wait
<gog> nvm
<gog> i
<gog> i am v stup
<heat> i don't know why i'm even clearing the top part in the first place
<zid> sick
<zid> heat loves bimbos btw
<heat> isn't EFER guaranteed to be 32-bits
<heat> what
<zid> wait no, not bimbos, footballers
<gog> :<
<gog> i am not a footballer
<zid> I always get those confused, they both spend a lot of time doing their hair and drive jeeps
<bslsk05> ​www.bimbo.pt: Bimbo | Bimbo
<heat> luv me some bimbo
<gog> i do not drive a jeep
<zid> you're pre-bimbo
<zid> early access bimbo
<zid> a real bimbo is married in exchange for the black credit card
<gog> dang
<zid> so she can afford the jeep and barrels of spray tan
<zid> gog do you want to fix my shader
<gog> i don't know how to do that
<zid> same
<gog> i'm way out of my depth in third dimension
<gog> get it
<zid> but I know how to write the code that doesn't work at least
<heat> hehe
<gog> 'cause i'm so shallow
<heat> hilarious
<zid> I'm not even using a third dimension :(
foudfou has quit [Quit: Bye]
foudfou has joined #osdev
tepperson has quit [Read error: Connection reset by peer]
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
<mrvn> gdb has problems stepping through the switch from 32-bit to 64-bit
<mrvn> Do you know that feeling after you cut some hot peppers that your eye itches?
<mrvn> .oO(Don't scratch, don't scratch!)
<sham1> Well gdb isn't exactly designed to cope with going from 32-bit mode to 64-bit mode
<sham1> mrvn: wash your damn hands and eyes
<zid> I feexed it
<mjg> eeey
<mrvn> sham1: you have to wash pretty thoroughly to get all the hot off your hands
<mjg> how you doin peeps todey
<sham1> I don't see the problem. You want to get the hot out of your hands after all
<mrvn> mjg: annoyed at DHL. Package tracking shows my package in Bremen sind Friday.
<mrvn> s/sind/since/
<mjg> :]
Vercas has quit [Ping timeout: 255 seconds]
* mrvn wonders if "Bremen GVZ" is actually the processing center in China and the package is now on some ship.
Vercas has joined #osdev
tepperson_ has joined #osdev
<mrvn> As in: "It's going to Bremen GVZ" instead of "It arrived at Bremen GVZ"
<tepperson_> for intel assembly, how do i define n instances of a 32-bit data?
<mrvn> 4 db?
<mrvn> dw
<zid> that's an assembler directive
<zid> so ask your assembler's manual
<gog> dd
<mrvn> tepperson_: do you want data or zeroes?
<tepperson_> zeros in this case
<mrvn> then just pad n*4 bytes
foudfou has quit [Remote host closed the connection]
<mrvn> Does intel have .zero?
<gog> you can use it in GAS
<gog> iirc
<gog> idk about nasm
foudfou has joined #osdev
<mrvn> also try: .lcomm my_data,4*1354
<mrvn> .oO(or try assembling with gcc so you have GNU syntax everywhere)
<gog> don't assemble
<zid> for nasm it'd be times 32 dd 0 I think
<zid> gl.NEAREST_MIPMAP_LINEAR (default value)
<zid> Who did this and why do they hate fun
xvanc has quit [Remote host closed the connection]
bauen1 has quit [Ping timeout: 256 seconds]
<kof123> "seymour cray was right" "multicore is a mistake" no comment, but...it could use a comic .oO( Chicken Attack! (Legend of Zelda LTTP SNES) )
<kof123> cow tools also can work here
xvanc has joined #osdev
<tepperson_> does this look like a valid pml4 table? .align 64
<tepperson_> PAGE_TABLE_BOOT2:
<tepperson_> .fill 510, 8, 0
<tepperson_> .quad 0x200083
<tepperson_> .quad 0x83
<gog> no
<zid> no
<gog> if your goal is to make 2MiB mappings you need two more levels
<gog> you need a PDPT and a PD
<zid> what is the 0x80 flag anyway
<gog> that's page size
<zid> ah!
<zid> 512GB pages :o
<gog> o:
<gog> do you have 512 gogglebytes of memory
<zid> if you .quad some_other_table | 3 you can do 1GB pages in some_other_table with | 0x83
<zid> then if you do .quad some_other_other_table | 3 in some_other_table you can do 2M pages with | 0x83
<gog> yes
<gog> you can do that
<zid> This is why people generally just.. do it at runtime
<zid> easier to mov [rsi+(8*510)], pdpt
<zid> than fuck around with trying to eyeball a bunch of arrays and stuff
<gog> yeah
<gog> and if you have an allocator or preallocate some pages for tables you can do arbitrary mappings
<gog> which makes your life very easy
<tepperson_> obviously my kernel needs 3 exabytes of ram to run correctly
<zid> I have my tables fully built in my ELF loader and pass them along to a small stub that does "set EFER, load value into pml4"
<gog> you don't need 3 exabytes
<zid> mine needs 1024GB
<gog> you only need three tables
<gog> 4KiB each
<gog> PML4[0] = PDPT, PDPT[0] = PD, PD[0] = page entry
<gog> for a 2MiB page
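The three-table layout described above can be sketched as follows. This is a minimal illustration, not anyone's actual boot code: the `BootTables` type, the helper name, and the physical addresses passed in are assumptions, while the flag bits (present, writable, page size) are the architectural ones.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t PTE_PRESENT  = 1ull << 0;
constexpr std::uint64_t PTE_WRITABLE = 1ull << 1;
constexpr std::uint64_t PTE_PS       = 1ull << 7;  // in a PD entry: 2 MiB page

// One paging structure: 512 entries of 8 bytes = 4 KiB.
using Table = std::array<std::uint64_t, 512>;

struct BootTables {
    Table pml4{};
    Table pdpt{};
    Table pd{};
};

// Fills the first entry of each level so that virtual 0..2 MiB maps to
// physical 0..2 MiB. pdpt_phys and pd_phys are the physical addresses at
// which the PDPT and PD will live (assumed to be 4 KiB aligned).
inline void build_identity_2mib(BootTables& t,
                                std::uint64_t pdpt_phys,
                                std::uint64_t pd_phys) {
    assert((pdpt_phys & 0xFFF) == 0 && (pd_phys & 0xFFF) == 0);
    t.pml4[0] = pdpt_phys | PTE_PRESENT | PTE_WRITABLE;       // PML4[0] -> PDPT
    t.pdpt[0] = pd_phys | PTE_PRESENT | PTE_WRITABLE;         // PDPT[0] -> PD
    t.pd[0]   = 0x0 | PTE_PRESENT | PTE_WRITABLE | PTE_PS;    // PD[0]   -> 2 MiB page at 0
}

inline void example_tables() {
    static BootTables t;  // 12 KiB, kept off the stack
    build_identity_2mib(t, 0x2000, 0x3000);  // hypothetical table addresses
    assert(t.pml4[0] == 0x2003);
    assert(t.pdpt[0] == 0x3003);
    assert(t.pd[0] == 0x83);
}
```

Setting the page-size bit in the PML4 entry itself, as in the earlier attempt, does not produce a large page; that bit is reserved at that level.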
<tepperson_> i was kidding about the ram usage
<gog> o
<zid> gog is a very srs person
<zid> pls no tease
<gog> yes
<gog> i am srs bsns all day every day
<gog> be nice to me i have anxiety
<tepperson_> i dont write hello world apps anymore. i only write i am groot apps
<gog> same
<zid> I have anxiety about webgl's default
<zid> texturing mode
<nikolar> /me pets gog
* gog prr
<nikolar> What
<bnchs> what! >:(
* gog pet bnchs
<heat> mjg, rust
* bnchs purrs
<mjg> heat: RUST
<mjg> show some respect and upcase
<heat> RUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUST
<heat> show some respect and scream
<gog> IRON (III) OXIIIIIIIIIIIIIDE+
<heat> hehehehe
<gog> hehehehehe
<heat> you are comedy
<gog> i'm hilarious
<mjg> you know what's the funniest
<heat> ricky gervais? more like
<heat> ricky gogervais
* bnchs derustifies mjg
<mjg> project/company names being puns on rust
<gog> wow i must be really funny
<mjg> like OXIDE
<gorgonical> turns out the secondary core isn't starting in secure mode, because life is suffering and nothing is simple
<mjg> gog: you are not as funny as solaris diaspora
<gog> who
<heat> mjg, you know what's worse?
<mjg> OXIDE
<mjg> an actual company using rust
<gog> o
<heat> BSD people that have bsd in their nickname/email
<mjg> i mean RUST
<mjg> dude there is a not-bsd guy who literally has a nickname ending in bsd
<bnchs> rust has a really bad syntax
<zid> OXIDE is a bad language and also a bad guess in wordle
<zid> double whammy
<mjg> bnchs: that is a widespread opinion
<mjg> which i do share :)
<mrvn> What CPUs support 512GB pages? Even 1GB is optional.
<bnchs> like i have to type more symbols in rust than C
<gog> mrvn: none of them :P
<gog> maybe some hypothetical risc-v machine
<heat> mjg, there's this guy in linux kernel dev that named himself after you
<mrvn> What you can do is a fractal mapping. Then you can map a lot of memory with a single 4k page.
<heat> and this "linus" guy that named himself after linux
<bnchs> heat: is he the guy who shills microsoft pluton?
<mjg> heat: did you know original name for the kernel was FREAKS or something like that
<heat> mjg, yep
<heat> bnchs, mjg does not shill for pluton
<FireFly> linux torvalds
<gog> he made linux os
<mjg> it's unix, i know this
<gog> unix is really good
<gog> i know it
<heat> gog, how much do you like fork() and how little do you not like CreateProcess
<mjg> nah dawg fuckitix is the feature
<gog> creating processes is cringe no matter how you do it
<gog> i will never
<mjg> sounds like templeos is for you then
<heat> do you only create threads?
<mjg> no processes
<mjg> i think
<gog> no i got rid of that code
<gog> single process forever
<mjg> you basically print 'lol' in a tight loop?
<gog> yes
<mjg> aight i would sign off on that kernel
<heat> depends
<mjg> i would accept LoL
<gog> it uses PrintLolFactory to do it
<heat> mjg would only sign off on it if it was OPTIMAL
<mjg> right
<heat> PESSIMAL CODE = PESSIMAL
<mjg> no printf("%s\n", "lol");
<mjg> or similar
<heat> i hope your locking primitives are on point
<mjg> in fact i would probably only be satisfied if you hacked the cpu
<mjg> and patched the microcode to do it the fastest way it can
<mjg> which is probably inaccessible through regular code
<mjg> you could even possibly block SMIs
<mjg> for more performance
<mjg> so ye, ultimately NACK, try harder
<heat> not-ok mjg@
<mjg> what he said
<kof123> i watched that temple vid linked the other day...the funny part was ba ha ha ha ha ha ha
<kof123> i never used it because 64-bit only
<mjg> you are somehow stuck with a 32-bit cpu?
<mjg> *today*?
<kof123> no, just saying why i never used it
<heat> he only enjoys Portable(tm) Software(tm)
<bnchs> mjg: what about a CPU that does operations on 64-bit integers, but has a 32-bit address bus
<kof123> i was hoping his typing "notes" he would show some move like: "The Oberon System has an unconventional visual text user interface (TUI)" and plan 9 IIRC
<mjg> bnchs: on your laptop?
<mjg> i got a product on a 32 bit arm *today*, the pain is real
<gorgonical> bnchs: in my head your handle is "bean cheese"
<heat> gorgonical: in my head your handle is "gorgonical"
<kof123> it is true i would never go for his U8 U16 U32 U64
<kof123> i have worse ideas than he does i think, so i show a bit of respect
<mrvn> Poll: What are "and *plant shaped C4 charges* on the four pipelines". Explosives shaped like plants so they aren't so obvious, right? Thank you google translate.
<gorgonical> Sounds like you mean you are placing shaped charges to achieve a more targeted effect
<gorgonical> the phrase is garden path-y though
<mrvn> gorgonical: obviously. "plant shaped charges" not "plant-shaped charges". But google doesn't get that.
<gorgonical> Ooh I see what you're saying
<mrvn> They reported plant-shaped charges on the news on a German TV channel.
<mrvn> quality TV, I say. The highest quality.
smach has quit []
<kof123> (*plant_p) (charges_ptr *)
<zid> it's not gorgonical?
xvanc has quit []
<heat> have you tried looking for it
<zid> That's pretty mean, making them look it up themselves heat
<zid> you should type it out for them
moberg has quit [Remote host closed the connection]
<zid> oh hey latest sdm has pml5, I was using an older version before
<zid> HLAT paging too, whatever that is
<heat> is pml5 even implemented in silicon yet?
<zid> not sure, saw the whitepaper for it yeeears ago though
<heat> oh yes, ice lake apparently
<zid> anyway, figure 4-11
<zid> smh size bit is Rsvd in pml4 still, no 512GB pages yet
<heat> ohno.jpeg
<heat> that would actually only work if large pages were remotely useful in Intel
<heat> versus the TLB shittery that was going on
<zid> Imagine having a good TLB
<zid> it manages to have good L1, with way more addresses and shit in it
<zid> instead we get "idk, 40 bytes is the best I can do, and the lookups are bad"
<heat> i wonder how much does 5-level paging actually cost
<heat> (in perf, not monies hehe)
<zid> I am assuming the same perf delta as 2M pages gains you over 4k, ignoring TLB misses
<zid> 1 extra lookup step
<zid> but it doesn't really cost any space
<zid> Unless you're making gallions of pml4es for some reason
<mrvn> heat: if you have the same mappings as with pml4 then you only use 2 entries and they are easily cached.
<mrvn> the extra lookup step would always be in cache for kernel and have one miss per ASID change for user space.
<mrvn> sorry, no, one miss per CR3 reload each.
<mrvn> Is there a debug register counting page walk cache misses? How often does the pml4 miss?
<mrvn> My opinion is: You shouldn't use 5-level paging unless you need it. It surely can't be faster.
<gog> that's a lot of vm space
<heat> sharp observation
<gog> hehehehe
<tepperson_> does this look more correct than my last attempt at paging data? https://pastebin.com/Kt7v6SuU
<bslsk05> ​pastebin.com: paging structures? - Pastebin.com
<gog> do you want your pages mapped there to be writeable?
<gog> otherwise i think it's right
<zid> http://shogun.rm-f.net/~zid/page.html might be useful before you started, but now you're finished
<bslsk05> ​shogun.rm-f.net <no title>
<tepperson_> not sure what that link actually does, any explanation for it?
<zid> put an address in
<zid> 0xB00B0000 -> pml4[0]->pdpt[2]->pd[384]->pt[176] = blah | PT_PRESENT;
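The breakdown that page produces can be reproduced by hand: with 4-level paging and 4 KiB pages, each level consumes 9 bits of the virtual address above a 12-bit page offset. A small sketch, with `PageIndices` and `split_vaddr` as hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Index into each paging level for a 4-level walk with 4 KiB pages.
struct PageIndices {
    unsigned pml4, pdpt, pd, pt, offset;
};

constexpr PageIndices split_vaddr(std::uint64_t va) {
    return PageIndices{
        static_cast<unsigned>((va >> 39) & 0x1FF),  // bits 47:39
        static_cast<unsigned>((va >> 30) & 0x1FF),  // bits 38:30
        static_cast<unsigned>((va >> 21) & 0x1FF),  // bits 29:21
        static_cast<unsigned>((va >> 12) & 0x1FF),  // bits 20:12
        static_cast<unsigned>(va & 0xFFF)};         // bits 11:0
}

inline void example_split() {
    const PageIndices i = split_vaddr(0xB00B0000ull);
    assert(i.pml4 == 0);
    assert(i.pdpt == 2);
    assert(i.pd == 384);
    assert(i.pt == 176);
    assert(i.offset == 0);
}
```

For a 2 MiB mapping the walk stops at the PD entry, so the `pt` index and `offset` together form the 21-bit offset into the large page.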
wootehfoot has joined #osdev
<tepperson_> hmm, as soon as i load cr0, i cannot access any memory at all
<tepperson_> i mean cr3
<tepperson_> wait nevermind, it is cr0 (enable paging)
danilogondolfo has quit [Remote host closed the connection]
<zid> you immediately triple fault?
<tepperson_> after a few instructions i triple fault, no idt setup
<zid> paging works then
<tepperson_> gdb complains of unable to access memory
<zid> what's the faulting instruction?
Left_Turn has joined #osdev
<zid> If you're in qemu you can also -d int to get info about the page fault in its monitor
<tepperson_> ah interesting
Turn_Left has quit [Ping timeout: 256 seconds]
<zid> you can also info tlb to see what translations are currently set up, you'll need to -no-reboot -no-shutdown to be able to do that one
<tepperson_> info tlb in qemu shows nothing after the fault
<bslsk05> ​pastebin.com: broken paging? - Pastebin.com
<zid> x /1i 0x104044
<tepperson_> address 0x104044 is out of bounds
<zid> xp then
<zid> now that paging is enabled
<heat> 0x101000 <PAGE_TABLE_PML4_BOOT>:0x00102000 <-- wrong
wootehfoot has quit [Ping timeout: 255 seconds]
<heat> 0x102000 <PAGE_TABLE_PDP_BOOT>:0x00103001 <-- also wrong (technically right but in practice probably not what you want at this stage)
<heat> 0x103000 <PAGE_TABLE_DIRECTORY_BOOT>:0x00000081 <-- see above
<tepperson_> what would be right? my goal is to map the first 2 megabytes of ram
<heat> exercise left to the reader
<gog> oh yeah
<gog> yeah it's wrong
<gog> sorry
<gog> i'm not well today
<heat> i could tell you exactly what's wrong but that's unhelpful
<gog> i'll tell you exactly what's wrong for $500
<zid> If 0x1000 is your pml4, pml4[0] = 0x2003, then 0x2000 is your pdpt
<heat> i'm cheaper than gog, i'll do 50
<zid> pdpt[0] = 0x3003, making 0x3000 your PD
<gog> zid is doing it for free already
<gog> we're ruined
<heat> sucker
<zid> PD[0] = 131;
<zid> That'll be $499.99
<zid> if it's correct
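To see why the dumped entries were flagged, it helps to decode the low bits of each one. The sketch below is a simplification that only looks at the present, writable, and page-size bits and masks the address field as bits 51:12 (it ignores PAT, NX, and the extra reserved bits of large-page entries); `EntryInfo` and `decode_entry` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct EntryInfo {
    bool present;        // bit 0
    bool writable;       // bit 1
    bool large;          // bit 7 (page size)
    std::uint64_t addr;  // bits 51:12, simplified
};

constexpr EntryInfo decode_entry(std::uint64_t e) {
    return EntryInfo{(e & 0x1) != 0,
                     (e & 0x2) != 0,
                     (e & 0x80) != 0,
                     e & 0x000FFFFFFFFFF000ull};
}

inline void example_decode() {
    // 0x00102000: address only, present bit clear, so the walk faults here.
    assert(!decode_entry(0x00102000).present);

    // 0x00103001: present but read-only.
    assert(decode_entry(0x00103001).present);
    assert(!decode_entry(0x00103001).writable);

    // 0x81: present 2 MiB page, but read-only.
    assert(decode_entry(0x81).large);
    assert(!decode_entry(0x81).writable);

    // 0x2003 / 0x3003 / 0x83 (131): present and writable at every level.
    assert(decode_entry(0x2003).present && decode_entry(0x2003).writable);
    assert(decode_entry(0x3003).addr == 0x3000);
    assert(decode_entry(131).present && decode_entry(131).writable &&
           decode_entry(131).large);
}
```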
gog has quit [Quit: byee]
<tepperson_> (gdb) x 0x101000
<tepperson_> 0x101000 <PAGE_TABLE_PML4_BOOT>:0x00102003
<tepperson_> (gdb) x 0x102000
<tepperson_> 0x102000 <PAGE_TABLE_PDP_BOOT>:0x00103003
<tepperson_> (gdb) x 0x103000
<tepperson_> 0x103000 <PAGE_TABLE_DIRECTORY_BOOT>:0x00000083
<tepperson_> no change
<zid> I take it the instruction before xx44 is the mov cr0, eax?
<tepperson_> it does make info tlb in qemu do stuff now
<zid> oh, screenshot/paste?
<tepperson_> gdb appears to be interpreting my 32-bit code as 64-bits, trying to figure out how to make it stop
<zid> just dump the bytes then
<zid> as long as it ends up on a paste not a screenshot
<zid> I ain't typing it back in
<zid> (or use objdump)
<heat> use qemu
<heat> gdb will always interpret it as whatever the elf bitness is
<heat> qemu will interpret it based on the mode you're on
<tepperson_> it appears to be only mapping one 4kb section instead of a 2mb section
<Ermine> last time I tried it with "OS from 0 to 1" gdb disassembled 16 bit code as 32 bit
<Ermine> s/ it/ gdb/
<heat> i bet 10 usd dollar on how you're missing a certain cr4 flag
<tepperson_> bit PSE on cr4?
<heat> most certainly
<zid> So I get to keep my $499.99?
<tepperson_> bit pse didnt change it
<heat> didn't change what
wootehfoot has joined #osdev
<zid> did you remember to or the correct reg this time? :P
<mjg> burp
<zid> heat: http://shogun.rm-f.net/~zid/gl_tex.html I fixed my opengl btw
<bslsk05> ​shogun.rm-f.net <no title>
<mjg> and another geezer does not think atomic ops are expensive
<mjg> fml
<heat> zid, sweeeeeeeeeeeeeeeeeeeeeet
<heat> mjg, link pls
<mjg> NAH
<heat> yes
<mjg> no slut shaming
<mjg> i'll link if he doubles down
<heat> linus would never say that shit
<heat> praise be the linux, creator of linus
<zid> now the test rig is complete, I need to actually figure out how to make perlin noise I guess
<heat> is that a wizard or what
<mjg> heat: he indeed would not
<mjg> heat: hey heat, wanna patch intel pcm tools?
<mjg> it prints process pids in hex instead of dec
<mjg> std::cerr << "Program " << sysCmd << " launched with PID: " << child_pid << "\n";
<mjg> someone with c++ 101 needs to patch it
<sham1> Wait, atomic ops aren't expensive? But they affect every other pipeline, no? Because it has to be, well, atomic
<zid> impossible mjg
<zid> That's 40MB of machine code, it's iostream
<mjg> sham1: on amd64 they flush the store buffer on all uarchs
<mjg> sham1: it is *turbo* slow
zxrom has joined #osdev
<sham1> Oh, I read your comment in the opposite manner
<sham1> I was all about to get argumentative and tell you that you're wrong
<heat> mjg, what's intel pcm tools
<bslsk05> ​intel/pcm - Intel® Performance Counter Monitor (Intel® PCM) (384 forks/2022 stargazers/BSD-3-Clause)
<heat> std::cerr << "Program " << sysCmd << " launched with PID: " << std::dec << child_pid << "\n";
<heat> if that works tell me and I'll open a PR
<sham1> I've never liked the use of bit shifts for C++'s iostreams
<mjg> aight, give me 5
<heat> sham1, it's a verbose% speedrun
<zid> heat what is the portobalkanssub
<bslsk05> ​www.reddit.com: PORTUGALCYKABLYAT
<Ermine> you're welcome
tepperson_ has quit [Ping timeout: 252 seconds]
<mjg> heat: that works
<heat> nice
<mjg> heat: but there are more problems
<mjg> std::cerr << "Process " << child_pid << " was terminated with status " << WTERMSIG(res) << "\n";
<mjg> add this
<mjg> there is missing newline for that sucker
<mjg> consider it a freebie
<heat> does anyone use that fucking tool?
* mjg
<heat> you are the one user of that tool, ever
<mjg> hm
<mjg> are you sure the std::dec thing is correct?
<heat> yes
<heat> why?
<mjg> Program sleep launched with PID: b429
<mjg> Process b429 was terminated with status 2
<mjg> if i use that std::dec patch *both* change to dec
<mjg> even though only one statement is patched
<mjg> sounds like it flips the default?
<heat> yes
<heat> iostream does this globally
<heat> it's clinically insane
<mjg> wtf
<mjg> that's retarded
<heat> the format options are all global
<mjg> well
<zid> and nobody ever saves and restores it either
<zid> so you just have to pray nobody shits all over you
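The behaviour mjg ran into follows from the numeric base being sticky state on the stream object: once some code inserts `std::hex` into `std::cerr`, every later integer written to `std::cerr` is hex until something inserts `std::dec`. A minimal sketch using a string stream in place of `std::cerr`; `print_pid` is a hypothetical helper, and the save/restore via the standard `flags()` member is one possible way to avoid leaking a base change, not what the pcm patch does.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Writes pid in decimal regardless of the stream's current base, then
// restores whatever formatting flags the caller had before the call.
inline void print_pid(std::ostream& os, int pid) {
    const std::ios_base::fmtflags saved = os.flags();
    os << std::dec << pid;
    os.flags(saved);
}

inline void example_sticky() {
    std::ostringstream os;
    os << std::hex << 255 << ' ';  // "ff "
    os << 255 << ' ';              // still hex: "ff "
    print_pid(os, 0xb429);         // decimal: "46121"
    os << ' ' << 255;              // caller's hex setting is back: "ff"
    assert(os.str() == "ff ff 46121 ff");
}
```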
<mjg> then point out please that there are std::cerr uses all of which want to print dec
<mjg> in that func
<mjg> and add that newline
<zid> I'd just submit a patch that changes them all to printfs
<zid> it'll compile and run faster
<Ermine> heat: graphs suggest that living in Portugal is fun
<heat> Ermine, you'll feel right at home
<heat> <mjg> then point out please that there are std::cerr uses all of which want to print dec <-- wdym?
<mjg> std::cerr << "Program " << sysCmd << " launched with PID: " << child_pid << "\n";
<mjg> std::cerr << "Program exited with status " << WEXITSTATUS(res) << "\n";
<mjg> std::cerr << "Process " << child_pid << " was terminated with status " << WTERMSIG(res) << "\n";
<mjg> these fucking guys all want to print decimal
<heat> yes
<Ermine> heat: looking forward!
<heat> also, can you check if I didn't accidentally break any other thingy that wanted to print hex?
<heat> Just Iostream Things
<mjg> look ok here
<mjg> aha!
<mjg> you need to use 'dec'
<mjg> src/pcm-core.cpp: cout << "Time elapsed: " << dec << fixed << AfterTime-BeforeTime << " ms\n";
<heat> what
<heat> dec is exactly the same as std::dec
<mjg> aight
<heat> it's just that they "using namespace std;" that file
<mjg> i assumed it would not be fucking with global, but you are right
<mjg> even in the same file they use that or std::dec
<mjg> fuckin'
* mjg pets printf
<mjg> alright, just remember that \n and we are set here
<mjg> kthx
<zid> std::cout << std::hex << std::setfill('0') << std::setw(8) << x << std::dec << std::endl;
<zid> is the preferred method.
gog has joined #osdev
<heat> haha
tepperson_ has joined #osdev
<heat> aka %08x
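For comparison, here are the two spellings side by side, as a small sketch with hypothetical helper names. Both produce an 8-digit, zero-padded, lowercase hex string. Note that `std::setw` applies only to the next insertion, while `std::hex` and `std::setfill` stay in effect on the stream.

```cpp
#include <cassert>
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// iostream version: hex base, '0' fill, field width 8.
inline std::string hex8_iostream(unsigned x) {
    std::ostringstream os;
    os << std::hex << std::setfill('0') << std::setw(8) << x << std::dec;
    return os.str();
}

// printf-family version of the same format.
inline std::string hex8_printf(unsigned x) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%08x", x);
    return std::string(buf);
}

inline void example_hex8() {
    assert(hex8_iostream(0xb429u) == "0000b429");
    assert(hex8_printf(0xb429u) == "0000b429");
    assert(hex8_iostream(0xdeadbeefu) == hex8_printf(0xdeadbeefu));
}
```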
<bslsk05> ​yosefk.com: C++ FQA Lite: Input/output via <iostream> and <cstdio>
<tepperson_> ok i now have the first 2 megabytes of ram mapped with my page tables (with a 2mb page directory entry). jmp 0x104080 actually jumps to 0x407e, what might cause that?
<sham1> Just use std::printf like a sane person
<zid> user error?
<zid> either the assembler, or one of the debugger tools got told/asked the wrong thing somewhere
<mjg> lol > for some reason, it doesn't throw an exception (it's not really bad, because what's really bad is C++ exceptions).
<gog> %p
<gog> %pp
<Ermine> gog: may I pet you
<gog> you may
* Ermine pets gog
* gog prr
<mjg> heat: where the pull request at mofo
<bslsk05> ​github.com: Print PIDs and exit status in decimal by heatd · Pull Request #522 · intel/pcm · GitHub
<tepperson_> is my processor in long mode here? https://pastebin.com/WWvNZZy4
<bslsk05> ​pastebin.com: (gdb) i rrax 0x10405b 1065051rbx 0x10000 - Pastebin.com
<heat> no
<mjg> heat: maybe ship them with sample bad output: https://dpaste.com/3F4Q5VQGQ
<bslsk05> ​dpaste.com <no title>
<zid> you want figure 4-1
<heat> actually yes you may be in long mode
<zid> he is
<zid> PG and LM are set
<heat> turns out PSE is a don't-care bit in 64-bit
<heat> PG and LM are set, but I know jack shit about his GDT
<tepperson_> do i need to set LMA in the EFER register?
<heat> have you tried reading docs
<heat> it could help
<tepperson_> i cant find efer anywhere in the doc i am looking at
<heat> how?
<tepperson_> ctrl f, "efer", whole words, phrase not found
<heat> "whole words"
<heat> there's your issue
<heat> try IA32_EFER whole words
<tepperson_> ah i see now. it looks like i told it to do long mode, but the IA32_EFER.LMA isn't turning on to indicate long mode active
<zid> oh right, needs to do the gdt swap to get out of *compat* mode
<zid> which is what he's in
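The rule being applied here can be written down compactly. This is a sketch under the usual reading of the Intel SDM: the processor sets EFER.LMA once paging is enabled while EFER.LME is set, and whether code then runs as 64-bit or in compatibility mode depends on the L bit of the code segment descriptor loaded in CS. The type and function names are hypothetical, and prerequisites such as CR4.PAE are left out.

```cpp
#include <cassert>

enum class CpuMode { Legacy, Compatibility, Long64 };

struct ModeBits {
    bool cr0_pg;    // CR0.PG: paging enabled
    bool efer_lme;  // EFER.LME: long mode requested by software
    bool cs_l;      // L bit of the descriptor currently loaded in CS
};

// EFER.LMA is set by the CPU, not by software.
constexpr bool efer_lma(ModeBits b) { return b.cr0_pg && b.efer_lme; }

constexpr CpuMode classify(ModeBits b) {
    return !efer_lma(b) ? CpuMode::Legacy
                        : (b.cs_l ? CpuMode::Long64 : CpuMode::Compatibility);
}

inline void example_mode() {
    // LME set but paging still off: long mode is not active yet.
    assert(classify(ModeBits{false, true, false}) == CpuMode::Legacy);
    // Paging on with LME, still running on the old 32-bit code segment.
    assert(classify(ModeBits{true, true, false}) == CpuMode::Compatibility);
    // After a far jump to a code segment with L=1.
    assert(classify(ModeBits{true, true, true}) == CpuMode::Long64);
}
```

This is why a new GDT with a 64-bit code descriptor and a far jump are still needed after enabling paging.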
<heat> qemu's monitor makes it painfully obvious as all the register names and widths change
zxrom has quit [Ping timeout: 268 seconds]
wootehfoot has quit [Ping timeout: 264 seconds]
<tepperson_> ok i figured out how to get the qemu monitor and gdb at the same time, cooking bacon now
heat has quit [Ping timeout: 248 seconds]
heat has joined #osdev
Matt|home has joined #osdev
<davros1> Anyone here code for retro consoles
<davros1> (I dont have a question, I'm just curious if there's overlap of interest in this community)
<davros1> Machines where you could hold the hardware in your head and you code by hitting the metal
<davros1> I do miss that (perhaps doing something on web assembly is making me want to 'purge' this way)
<clever> davros1: i recently wrote a basic framework for c on gba, to help a friend out
<davros1> Ok nice, never used one of them
<clever> same, ive never owned a gba, but the .elf works in an emulator
<clever> it can draw to the framebuffer, wait for vsync irq, and poll all inputs on every frame
<clever> all thats missing is audio support, and an actual game, lol
qubasa has joined #osdev
<davros1> Wondering about using generative AI and downscaling to retro machines
<davros1> Wont look as crisp as pixelart but you could make a low-effort game look more interesting that way
zxrom has joined #osdev
wootehfoot has joined #osdev
<bslsk05> ​old.reddit.com: Wizard mouse in laboratory, pixel art : dalle2
<davros1> Hah yeah I hadn't looked for actual pixelart finetunes and so on
<GeDaMo> Have you seen the DALL-E prompt book?
<davros1> No. I've been experimenting with stablediffusion . I like the ability to run locally.
mctpyt has joined #osdev
elderK has quit [Quit: Connection closed for inactivity]
<GeDaMo> This site allows you to search images generated by Stable Diffusion https://lexica.art/?q=pixel+art
<bslsk05> ​lexica.art: Lexica - pixel art
<davros1> What I'm seeing for pixel art there is quite different to the tailored palette work of classic pixel art.. but the Dalle things above were closer.
<geist> will be interesting to see when the first game that somehow integrates this sort of image generation in the game itself
<davros1> Still it doesn't matter. I think its still a good "quality : effort" tradeoff.. doesn't have to be perfect
<geist> can't think of how that'd work, but you could imagine some sort of clever puzzle thing where part of the puzzle is generated art
<geist> or machine generated art treated as a level, or whatnot
<davros1> Heh yeah Geist - stable diffusion on the highest end GPU can spit out new images in a couple of seconds . Imagine a scroll speed tuned such that you're always moving in to newly generated things
<geist> yah. and of course that'll get more optimized over time, etc
<davros1> But even without goign that far.. being able to make random mazes or whatever , and just visually enhance them with "Img2img" would be great
<geist> yah, probably some sort of indie game tries it first, with it being the main trick of the game, and then it slowly becomes more standard
<davros1> I think for AAA hand tuned art will win in 3d. Generative will produce a rough dreamlike style that will suit indies (quality:budget)
<davros1> Indies and modders
<geist> yah
<geist> i was planning on piddling with stable diffusion maybe this weekend. my older 1080ti should still be able to crunch numbers fairly well
<davros1> Things like Minecraft/roblox ..
<GeDaMo> It might be possible to make smaller models trained for specific games / styles
<davros1> Yeah that should do 10-15 seconds per image.. better than hugging face certainly
<geist> yah, https://cdn.mos.cms.futurecdn.net/iURJZGwQMZnVBqnocbkqPa.png toms hardware has some benchmarks
<geist> iirc the 1080ti is somewhere like a 3060 iirc
<GeDaMo> Facebook / Meta just released a paper on their new LLM which showed a smaller model trained with more data can match GPT-3
<geist> assuming it doesn't need specific hardware that the new rtxes have
<davros1> Its worth grabbing a 3000 series card or even 4000 if you can
<geist> yah i've been putting it off until i can nicely find a strong replacement for it
<geist> but this may be something that pushes me to
<davros1> I'm fine with splashing out on a big GPU but electricity is the limiting factor really
<davros1> UK electric bills are high
<davros1> If it wasn't for that I'd be running 2
<geist> yah i get ya there
<davros1> SD img2img has also encouraged me to get back to amateur art
<davros1> Doing my own doodles and AI enhancing
<davros1> AI can't do 3d game ready yet, hence the interest in retro styles
<davros1> And hence the interest in retro hardware again hah
<bnchs> AI isn't accurate yet lol
<gog> it's not very intelligent
<gog> like me
<davros1> Neither are people in many ways,although I know what you're saying
<bnchs> yeah, most of the times you catch it saying dumb shit
<davros1> chatGPT bullshits a lot, like a human ..
<gog> yes
<bnchs> especially when you tell it to explain something
<gog> chatGatekeep, chatGaslight, chatGPT
mctpyt has quit [Ping timeout: 252 seconds]
<davros1> But the art generators - if you use 'img2img' you can sketch, say what it's supposed to be, and it'll detail for you. This is pretty good quality/control/productivity balance. man+machine
<tepperson_> i asked chatgpt to make a dummy kernel driver and it used structs that don't exist in any kernel
<davros1> Yeah I wont use it for code
<gog> it doesn't really have a capacity for abstract reasoning
<gog> such a thing can't really code
<davros1> Script kiddies are getting excited by it though..
<bnchs> yeah
<bnchs> "OH NO CHATGPT CAN MAKE RESPONSES, OUR PROGRAMMING JOBS WILL BE DOOOOMED!!!"
<bnchs> can't even make a working kernel driver
wootehfoot has quit [Read error: Connection reset by peer]
<mats1> its not good but its not useless
<mats1> just needs some care and sanitisation
<mats1> like cvs where there's six self checkout stations and one guy to tend to them / greet guests
<sbalmos> those six checkout stations require 6 truckloads of receipt printer paper
<mats1> there's great business value in eating the low hanging fruit jerbs where retards are googling and including left pad libraries
<bslsk05> ​ke0z/VulChatGPT - Use IDA PRO HexRays decompiler with OpenAI(ChatGPT) to find possible vulnerabilities in binaries (17 forks/203 stargazers)
GeDaMo has quit [Quit: That's it, you people have stood in my way long enough! I'm going to clown college!]
gabi-250_ has quit [Quit: WeeChat 3.0]
gabi-250_ has joined #osdev
<heat> geist, do you have any perf numbers on 5-level paging vs 4-level paging?
<geist> i do not. on x86?
<heat> it sounds like something you'd have
<heat> yeah
<heat> also cc mjg
<geist> no i haven't actually fiddled with any machines with 5 level enabled
<geist> hypothetically with page table caching and aggressive data caching there's probably not *too* much of a hit
<geist> possibly the bigger hit is simply that it allows the system to be even more scattered
<mjg> i don't, but i would expect intel to have a hard-to-find paper with it
<mjg> alternatively lkml with submission for la5
<geist> FWIW riscv has thesame thing now too, but surprisingly linux mainline seems to only currently support sv39
<gorgonical> geist: isn't that because no hardware really supports anything higher than sv39?
<heat> i think linux should enable PML5 by default?
<heat> huh you sure it only supports 39?
<heat> i think they usually lock that stuff behind a CONFIG_
heat has quit [Remote host closed the connection]
<geist> yah, except qemu and probably stuff in dev
<gorgonical> I can't help but wonder if the kernel support is just lagging along then
heat has joined #osdev
<geist> or it's sitting in a branch yet and hasn't been merged
<geist> heat: yeah it seems to only have the SV39 config option
<geist> obviously someone will add it, just don't see it yet
<gog> hi
<heat> at the very least they have layouts for 48 and 57
<geist> yah makes sense
<geist> may as well design it even if it's not a build option yet
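The 39, 48, and 57 figures all come from the same arithmetic: a 12-bit page offset plus 9 index bits per table level, on both x86-64 and RISC-V. A small sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// Virtual address width for a radix page table with 4 KiB pages and
// 512-entry (9-bit) tables at every level.
constexpr unsigned va_bits(unsigned levels) { return 12 + 9 * levels; }

// Size of the addressable virtual space in bytes.
constexpr std::uint64_t va_span_bytes(unsigned levels) {
    return std::uint64_t{1} << va_bits(levels);
}

static_assert(va_bits(3) == 39, "sv39: 3 levels");
static_assert(va_bits(4) == 48, "sv48 / x86-64 4-level paging");
static_assert(va_bits(5) == 57, "sv57 / x86-64 5-level paging");

inline void example_va_bits() {
    assert(va_span_bytes(3) == (std::uint64_t{512} << 30));  // 512 GiB
    assert(va_span_bytes(4) == (std::uint64_t{256} << 40));  // 256 TiB
    assert(va_span_bytes(5) == (std::uint64_t{128} << 50));  // 128 PiB
}
```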
<netbsduser> on chatgpt: it has declined significantly, in november it was able to give me competent advice and spot real issues, now it can barely match a typical stackoverflow answer and it only spots superficial issues. i was really disappointed to see how much it declined
<sham1> As an OpenAI language model, I cannot be expected to actually be useful at this time.
<heat> geist, yeah you're right, seems to assume 39 only
<heat> very weird IMO
<tepperson_> i thought i would try bochs, but on ubuntu it can't even load grub2
<heat> 48 sounds like the no-brainer (because of x86, etc)
nj0rd_ has quit [Ping timeout: 255 seconds]
<heat> i guess they wanted to support smaller devices?
<heat> *shrug*
epony has joined #osdev
<heat> does android still do 39-bits on arm64 too?
<geist> well, yeah i think all the existing hardware except for qemu only supports up to sv39. the next batch will probably pick up sv48, but it does seem that 39 is the general sweet spot for current non datacenter hardware
<heat> i have not messed with support multiple va sizes yet
<heat> supporting*
<geist> so an annoying thing i just discovered about my vf2 board
<geist> it seems to not support that many ASID bits
<geist> possibly 0, which is valid. linux dmesg shows that it basically decides it can't do ASID because not enough bits
<geist> and the logic in linux to determine that is (num_asids >= 2*NR_CPUS)
<geist> since NR_CPUS is 8 in this particular kernel, that means it must have declared less than 4 bits of ASID support
<geist> (probably 0)
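The inference geist makes can be checked with a few lines. This sketch follows the condition as quoted in the conversation (`num_asids >= 2*NR_CPUS`), with `num_asids` taken to be `1 << asid_bits`; it is a model of that description, not a copy of the kernel source, and the function names are hypothetical.

```cpp
#include <cassert>

constexpr unsigned long num_asids(unsigned asid_bits) {
    return 1ul << asid_bits;
}

// True if the ASID allocator would be used, per the quoted condition.
constexpr bool use_asid_allocator(unsigned asid_bits, unsigned nr_cpus) {
    return num_asids(asid_bits) >= 2ul * nr_cpus;
}

inline void example_asid() {
    const unsigned nr_cpus = 8;  // NR_CPUS in the kernel being discussed
    assert(!use_asid_allocator(0, nr_cpus));  // 1 ASID
    assert(!use_asid_allocator(3, nr_cpus));  // 8 ASIDs, still short of 16
    assert(use_asid_allocator(4, nr_cpus));   // 16 ASIDs is the threshold
}
```

So with 8 CPUs, a kernel that declines to use ASIDs must be seeing fewer than 4 implemented ASID bits, which is consistent with the hardware implementing none.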
m5zs7k has quit [Ping timeout: 264 seconds]
m5zs7k has joined #osdev
Turn_Left has joined #osdev
Left_Turn has quit [Ping timeout: 252 seconds]
valshaped has quit [Quit: Gone]
valshaped has joined #osdev
SpikeHeron has quit [Quit: WeeChat 3.8]
heat has quit [Read error: Connection reset by peer]
heat has joined #osdev
Vercas has quit [Quit: Ping timeout (120 seconds)]
Vercas has joined #osdev
SpikeHeron has joined #osdev
<mjg> yo
<mjg> what, if any, cpus today do *software* tlb?
<mjg> not mips
<mjg> something which is not slow
<mjg> :]
<\Test_User> > not slow
<\Test_User> > run code for every (uncached) memory access to translate pages
<\Test_User> seems to conflict a bit to me
<heat> mjg, IA64 you FUCKEN BITCH
heat has quit [Remote host closed the connection]
heat has joined #osdev
mctpyt has joined #osdev
mctpyt has quit [Ping timeout: 252 seconds]
moberg has joined #osdev
<heat> geist, i'm starting to think that riscv really is digging its own grave with i n f i n i t e e x t e n s i o n s
<heat> in this case, infinite optional crap
bradd has joined #osdev
robem has joined #osdev
thaumavorio_ has quit [Quit: ZNC 1.8.2 - https://znc.in]
<geist> there appears to be an effort to standardize via the riscv profiles stuff
thaumavorio has joined #osdev
<geist> ie RVA20, RVA22, etc
robem has quit [Changing host]
robem has joined #osdev
robem has left #osdev [#osdev]
<geist> would be somewhat akin to armv8.x in that you define a series of increasing profiles that include previous ones and add to, with a list of mandatory and optional bits
<chibill> Honestly I start writing an OS, get to where I can print to the screen (So basically nothing) and then just sort of can't decide how to proceed because my whole drive to write an OS is to just do it for fun so I have no real goal. :(
<geist> well, you can make interrupts work. that's fun
<heat> for fun != real goal
<heat> everyone here does this for fun
<heat> no one's getting rich off this
<heat> erm
<heat> for fun = real goal
<chibill> Like even for fun I have no end goal of where I want to end up xD
<geist> right i find it fun to just climb up the tech tree
<geist> ie, start adding more and more features akin to a Real OS
<geist> that's fine. treat it as a journey
<geist> a series of features you add
robem has joined #osdev
<heat> chibill, what features do you like from your system (linux, etc)?
<heat> *that do not involve graphics*
<heat> think small, and use that as an objective
<heat> like "oh I really like this bash thing to screw around", so you work towards that
<heat> or an http server, etc
<chibill> Hm. Just realized I can set a goal of at least having as much stuff as XV6 (So a basic terminal system and file system access of some sort.) Biggest challenge for me is I am working in Rust. I feel like I am going to run into issues when I need access to things in multiple places. Might start over in C.
<mjg> oh noez
<mjg> someone had a kernel written in rust
<mjg> you could probably use it as a reference when stuck
<heat> note that xv6 sucks
<heat> (IMO)
<chibill> mjg: You mean the RustOS blog thing?
<heat> xv6 is a stupid simple example of a unix-ish kernel but man the code is not really high quality
<heat> much like the original UNIX I guess
<mjg> chibill: no, made by a local
<geist> but that's a goal, do a better job than xv6
<mjg> Mutabah: where ya kernel at
* Mutabah is away (Sleep)
<heat> kernel is sleep
<mjg> dafaq that mirc shieeet
<chibill> heat: I agree, had to do some deep inside work on it for a College class. (Adding new syscalls, file systems blocks, a way to show the memory layout and things like that.)
<bnchs> ding dong ding dong wake up
<bslsk05> thepowersgang/rust_os - An OS kernel written in rust. Non POSIX (43 forks/627 stargazers/NOASSERTION)
<heat> one day (maybe today hrmm) I should try to write a better xv6
<heat> except in x86 because screw whatever old crap xv6 was targeting
<heat> i don't want a workflow based around simh lol
bauen1 has joined #osdev
<heat> s/xv6/v6/ in that second statement
<mjg> while(xchg(&lk->locked, 1) != 0)
<mjg> // The xchg is atomic.
<mjg> ;
<mjg> fuck me
<heat> hahahaha
<heat> do you always open locking functions when you see a new OS
<mjg> // Release the lock, equivalent to lk->locked = 0.
<mjg> // This code can't use a C assignment, since it might
<mjg> // not be atomic. A real OS would use C atomics here.
<mjg> asm volatile("movl $0, %0" : "+m" (lk->locked) : );
<mjg> fucking
<heat> what's really irking me is that they use __sync_synchronize but then never bother to use any of the other ones
<mjg> heat: it is my litmus test for quality
<heat> __sync_synchronize is way overkill
<heat> mjg, what do my locking primitives say
<mjg> what does it expand to? a full barrier?
<heat> yes, mfence
<mjg> heat: i already told you why they suck
<mjg> wow, that's stupid
<mjg> xchg already provides a full fence
<heat> mjg, yes, just wondering what they say about my OS
<heat> "passable but mildly incompetent" is probably a solid conclusion
<mjg> that it outperforms openbsd at best
<heat> and net
<mjg> this is the real crime: asm volatile("movl $0, %0" : "+m" (lk->locked) : );
<mjg> with the comment above it
<heat> why?
<mjg> stackoverflow-level disinformation
<heat> wait, why does it do a memory fence before the store
<heat> in fact, you don't need a memory fence here at all
<mjg> you don't
<mjg> on that cpu
<mjg> it is common to not know that bit though
<mjg> you *do* need it on everything else
<heat> do you need a full one tho?
<heat> or just a store memory barrier?
<mjg> "release" it is called
<heat> yes so I think it's just a store memory barrier
<heat> "Creates an inter-thread happens-before constraint to acquire (or stronger) semantic loads that read from this release store. Can prevent sinking of code to after the operation."
<mjg> it guarantees or *loads* and *stores* earlier in program order are finished
<mjg> at the same time does not prevent ops *past* the fence from leaking up
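For reference, the acquire/release pairing described here can be sketched with C11 atomics, which is roughly what the quoted xv6 comment means by "a real OS would use C atomics". This is an illustrative sketch, not xv6's code; the struct layout and the spin_* names are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: 0 = free, 1 = held. */
struct spinlock {
    atomic_uint locked;
};

static inline void spin_init(struct spinlock *lk)
{
    atomic_init(&lk->locked, 0);
}

/* One acquire attempt: an atomic exchange with acquire ordering, so nothing
 * in the critical section can be reordered before taking the lock. */
static inline bool spin_trylock(struct spinlock *lk)
{
    return atomic_exchange_explicit(&lk->locked, 1, memory_order_acquire) == 0;
}

static inline void spin_lock(struct spinlock *lk)
{
    while (!spin_trylock(lk)) {
        /* Wait with plain loads until the lock looks free, then retry. */
        while (atomic_load_explicit(&lk->locked, memory_order_relaxed) != 0)
            ;
    }
}

/* Release store: earlier loads and stores complete before the lock is seen
 * as free; later operations are still allowed to move up past it. */
static inline void spin_unlock(struct spinlock *lk)
{
    /* Unlocking a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&lk->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&lk->locked, 0, memory_order_release);
}
```

On x86 the release store typically compiles to a plain mov with no fence instruction, while weaker-ordered architectures get whatever barrier they need, so no separate full barrier is written by hand in either function.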
<heat> fun fact: in x86 linux, store fences literally just expand to something like (addl $0, -8(%esp))
<mjg> s/or//
<mjg> i think you mean a full fence
<mjg> i don't remember if the 32 sucker has the same trick
<mjg> it may be it needs it for release
<bslsk05> elixir.bootlin.com: barrier.h - arch/x86/include/asm/barrier.h - Linux source code (v6.2.1) - Bootlin
<heat> although TIL there's an alternative there for the fence instructions
<mjg> that's not smp_*
<mjg> time to head off, happy to flame tomorrow
<heat> it seems that smp_rmb and smp_wmb are just compiler barriers
<heat> huh
<heat> hmm do you really need that lock there?
<mjg> they are for amd64
<mjg> i don't know about i386
<mjg> for a smp_mb you do
<heat> why don't you for a mov $0, (lock) then?
<mjg> see the previous remark about ops leaking up
<heat> for RELEASE semantics at least
<mjg> there is no lock mov
<mjg> or at least i have not seen one :]
<mjg> burp really need to go dawg
<heat> ah so mov works differently here, got it
<mjg> mov is literally just store this shit over there
<mjg> so does not matter what other cpus are doing in the area
<mjg> in contrast something like 'addl' could lose existing state
<mjg> which is why for proper smp use it gets the lock prefix
<mjg> and lock addl 0 is a no-op in terms of content of the target area
<mjg> while still resulting in all synchro normally associated with a locked op
* mjg off
* heat nods
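The distinction discussed above (compiler-only barrier versus a locked add of zero used as a full barrier) can be sketched in C like this. The function names are hypothetical, the inline assembly is a GCC/Clang-style assumption guarded for x86-64 only, and other targets fall back to a standard C11 fence.

```c
#include <assert.h>
#include <stdatomic.h>

/* Compiler-only barrier: keeps the compiler from reordering memory accesses
 * across this point; it is not a CPU fence. */
static inline void compiler_barrier(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

/* Full barrier. On x86-64 this uses the locked add-zero trick: the add
 * leaves the target memory unchanged but still orders like any locked op. */
static inline void full_barrier(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ __volatile__("lock; addl $0, -4(%%rsp)" ::: "memory", "cc");
#else
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* The same idea in portable C11: an atomic add of zero returns the current
 * value and leaves the contents untouched. */
static inline int locked_add_zero(atomic_int *p)
{
    return atomic_fetch_add_explicit(p, 0, memory_order_seq_cst);
}

/* Small sanity check that none of the above changes the stored value. */
static inline void barrier_selfcheck(void)
{
    atomic_int v;
    atomic_init(&v, 42);
    compiler_barrier();
    full_barrier();
    assert(locked_add_zero(&v) == 42);
    assert(atomic_load(&v) == 42);
}
```

A plain store has no read-modify-write step, so it does not need the lock prefix to avoid losing a concurrent update; the prefix matters for operations like add that read and then write back.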
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
<chibill> <heat> "except in x86 because screw..." <- lol XV6 was targeting x86, they stopped that and rewrote it to target risc-v now.
<heat> yes, yes they were
<heat> my plans were more to write a simple early unix clone
<heat> kinda started mixing up unix and xv6 when writing those sentences
catern has quit [Read error: Connection reset by peer]
<kof123> " cvs where there's six self checkout stations and one guy to tend to them / greet guests" <squints eyes, not familiar enough with cvs to tell if double-joke about cvs checkouts>
<epony> sixty-six
<epony> mirrors
<epony> of 1 main CVS server
<epony> and 1 backup CVS server
Vercas8 has joined #osdev
<epony> figure it out, beats github 100% of the time since 1991
gabi-250_ has quit [Ping timeout: 255 seconds]
Vercas has quit [Ping timeout: 255 seconds]
gildasio has quit [Ping timeout: 255 seconds]
Vercas8 is now known as Vercas
<geist> i was gonna say but but cvs has some nice properties... but then i can't think of them
gabi-250_ has joined #osdev
gildasio has joined #osdev
<chibill> CVS = Customer Values Separated
<epony> text format of the metadata / repo files
<epony> and known "operation" / generic support everywhere
<epony> simple enough to be implemented as a yak-shaving task
<epony> and suitable for coherent development teams
<epony> existed in the 90s