klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
Emil has joined #osdev
varad has joined #osdev
gog has joined #osdev
Likorn has quit [Quit: WeeChat 3.4.1]
thaumavorio has quit [Ping timeout: 258 seconds]
dragestil has quit [Ping timeout: 276 seconds]
acidx has quit [Ping timeout: 252 seconds]
nickster has quit [Ping timeout: 252 seconds]
Ameisen has quit [Ping timeout: 276 seconds]
thaumavorio has joined #osdev
Ameisen has joined #osdev
dragestil has joined #osdev
nickster has joined #osdev
acidx has joined #osdev
heat has quit [Ping timeout: 272 seconds]
<geist> yeah
<geist> also yay a pair of loud bangs in the distance and then power outage
<geist> time to settle in for some lack of lights
<gamozo> Light a candle!
<gamozo> Get a good mood goin
<geist> yep, basically shutting everything down on ups so i can turn off the generator and save gas
<klange> I linked this before, but, macOS TPIDRRO_EL0 fun: https://github.com/kuroko-lang/kuroko/blob/master/src/kuroko/vm.h#L219-L223
<bslsk05> ​github.com: kuroko/vm.h at master · kuroko-lang/kuroko · GitHub
<Jari--> gamozo, geist, klange morning
<gamozo> morn morn
<Jari--> TODO: PCI scan, and reserve bits, of pages, from I/O memory space.
<Jari--> This is easy.
<Jari--> Physical memory.
Emil has quit [Ping timeout: 252 seconds]
gruetzkopf has quit [Ping timeout: 252 seconds]
merry has quit [Ping timeout: 252 seconds]
varad has quit [Ping timeout: 260 seconds]
xvanc has joined #osdev
merry has joined #osdev
xvanc has quit [Quit: leaving]
Emil has joined #osdev
gruetzkopf has joined #osdev
<geist> Hmm, what are you doing with that klange?
<geist> Is that for user or kernel space?
varad has joined #osdev
<geist> oh I guess userspace because macOS puts something in it? What are they using the tpidrs for?
smeso has quit [Quit: smeso]
<klange> geist: TPIDR seems to be a core index, it varies on a given thread. TPIDRRO is just the regular thread pointer that everyone else puts in TPIDR, but the TLS model is horrible and does library calls to dyld for every reference, so this is just inlining the descriptor+offset lookup instead of doing the library call
<geist> Yah that’s really odd. I’d expect the core index to be in the RO one
<geist> Kinda like the top part of what rdtscp returns
<geist> Actually using it as a scratch register as j`ey pointed out makes a lot of sense. I never thought about that, will have to consider it for stack overflow
<klange> It is a mystery... that could possibly be solved by looking more at Darwin kernel code, I think all of this stuff is in their source dumps now - the dyld stuff was at least.
<geist> But that’s only if you have no other use for it in user space
<geist> Maybe they had some security reason to not let user space mess with their thread pointer
<geist> So they switched it so the harmless thing is in the RW one
<geist> You could if you took that argument from the gs: side of things and didn’t allow fsgsbase instructions on x86
<klange> The weird thing is __builtin_thread_pointer on clang still gives you TPIDR - presumably because it's a GCC compatibility thing no one cares to update / make target-specific.
<geist> And only allow values in the TP register that are vetted by the kernel
smeso has joined #osdev
srjek has quit [Ping timeout: 255 seconds]
sprock has quit [Quit: Reconnecting]
sprock has joined #osdev
Likorn has joined #osdev
<klange> Not really getting any clearer information on TPIDR_EL0. _EL1 is used for the kernel thread pointer, they seem to be trying to zero _EL0 but something's leaking somewhere or I'm missing something somewhere else in the userspace side...
<klange> And TPIDRRO_EL0 actually does have the CPU number maybe in the low three bits? I missed that last time I looked at this, but I see them zeroing it here: https://github.com/apple-opensource/dyld/blob/master/src/threadLocalHelpers.s#L236-L237
<bslsk05> ​github.com: dyld/threadLocalHelpers.s at master · apple-opensource/dyld · GitHub
<bslsk05> ​github.com: darwin-xnu/pcb.c at 2ff845c2e033bd0ff64b5b6aa6063a1f8f65aa32 · apple/darwin-xnu · GitHub
<Jari--> Developer Charles W. Sandmann also hoped to eventually supply code for CWSDPMI r7 that allows CWSDPMI to map up to 64 GB memory into the address space upon request.
<Jari--> Interesting, DOS protected mode extender with 64 gigs of RAM support.
<bslsk05> ​www.bttr-software.de: DOS ain't dead - > 3 GB of RAM (hypothetical), CWSDPMI
dude12312414 has quit [Remote host closed the connection]
<klange> Hm, I don't do the mask, but they also seem uncertain about whether it's needed, and I just did a spot check and the low bits are always clear even when I'm definitely bouncing around between cores...
dude12312414 has joined #osdev
<Jari--> 2022, I probably should not use BIOS or HIMEM.SYS to get memory map.:) PCI bus scan rules dude.
<klange> I accidentally exited the shell on the serial console to my RPi, but the way I set that up means it just proceeded to launch the GUI. No more uptime checks, but I have a clock that ticks, and more going on continuously.
dude12312414 has quit [Remote host closed the connection]
dude12312414 has joined #osdev
Jari-- has quit [Ping timeout: 246 seconds]
<mrvn> klange: they might have retrofitted TLS onto existing code and the register everyone else uses was already used.
* mrvn waits for a DOS clone that runs in 64bit with 32bit userspace.
<geist> oh my, putting the cpu number in the bottom bits is an incredibly short-sighted thing. how many bits did they reserve?
<bslsk05> ​github.com: darwin-xnu/machine_machdep.h at 8f02f2a044b9bb1ad951987ef5bab20ec9486310 · apple/darwin-xnu · GitHub
<geist> looks like only 3 bits
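The masking discussed above can be sketched roughly as follows. This is a minimal Python illustration, assuming the CPU number really does sit in the low three bits of TPIDRRO_EL0 with the thread pointer in the remaining bits; the names `CPU_BITS`, `split_tpidrro`, and `max_cpus` are made up for the example and are not from xnu or dyld.

```python
# Sketch only: assumes the low 3 bits of TPIDRRO_EL0 carry a CPU number
# and the remaining bits carry the thread pointer, as discussed above.
CPU_BITS = 3                      # hypothetical name for "only 3 bits"
CPU_MASK = (1 << CPU_BITS) - 1    # 0b111
REG_MASK = (1 << 64) - 1          # the register is 64 bits wide

def split_tpidrro(raw):
    """Return (thread_pointer, cpu_number) from a raw register value."""
    thread_pointer = raw & ~CPU_MASK & REG_MASK
    cpu_number = raw & CPU_MASK
    return thread_pointer, cpu_number

def max_cpus():
    """How many distinct CPU numbers fit in the reserved bits."""
    return 1 << CPU_BITS
```

With three bits reserved, `max_cpus()` is 8, which is the limitation being poked at here.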
<zid> 3 bits of cpus should be enough
Likorn has quit [Quit: WeeChat 3.4.1]
<geist> i remember beos had some sort of similar limitation. something like there's a global constant B_MAX_CPUS set to i think 4
<geist> and a few syscalls that just return B_MAX_CPUS number of items
opal has quit [Ping timeout: 240 seconds]
vai has joined #osdev
vai is now known as Jari--
zid has quit [Ping timeout: 276 seconds]
gamozo has quit [Quit: Lost terminal]
the_lanetly_052_ has joined #osdev
opal has joined #osdev
zid has joined #osdev
gog` has joined #osdev
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
gog has quit [Ping timeout: 250 seconds]
<doug16k> 512 CPUs ought to be enough for anyone
<\Test_User> "ought to be enough for anyone" and so the hindrances begin when you try to expand later and want more
arch-angel has joined #osdev
jack_rabbit has quit [Remote host closed the connection]
<doug16k> if 16 cpus is a reasonable high end cpu now, then we should have 512 by 2040 or so?
knusbaum has joined #osdev
<Jari--> doug16k: Moore's law indicates calculating power will be doubled every 18 months - so how does this apply to cores? I think that's more than 512 cores by 2040 in this sense?
<\Test_User> matters less when it happens so much as that it happens :P
<doug16k> the other way around. moores law only says there will be more logic, not faster
<mats1> fat lot of good that does when nontrivial percentages of it are dedicated to windows
<Jari--> doug16k: Really wondering how long it takes until this processor is mobile-ready: Xeon Platinum, Total Cores 28. Total Threads 56. Max Turbo Frequency 3.80 GHz. Processor Base Frequency 2.50 GHz. Cache 38.5 MB L3 Cache. Max # of UPI Links 3. TDP 205 W.
<doug16k> that has a monstrous amount of I/O bandwidth
<doug16k> are you comparing that to a low power embedded cpu?
<doug16k> the compute of the mobile is fine, it's the I/O that is starved
<mats1> more bandwidth for windows
<mats1> amazing
<doug16k> I wonder what curve PCB manufacturing is following
<doug16k> they are not going to be able to go that much past 2000 pins or whatever
<doug16k> imagine routing the pcb under a 2000 pin cpu?
<doug16k> with over 100 amps going into it
<doug16k> and coming out the grounds
<geist> yah those boards have to be using 12-14 layer boards at least
<doug16k> imagine trying to minimize ground bounce with 120 amp peaks on a couple of mm of copper traces, planes desperately via'd together to get the current in
<geist> side note the new zen 4 AM5 socket is a bit strange if you haven't looked at it
<geist> it's just a single grid, no hole in the middle for any caps or whatnot
<doug16k> haven't seen it in detail
<geist> so they moved those to the top of the cpu, but outside of the heat sink, so it has these cuts around the edge
<clever> something ive been wondering about, is how that cpu "package" is made/assembled
<clever> is it just a regular fiberglass pcb, with a raw die wire-bonded or flip-chipp'd onto it?
<zid> Good news, it's 400 half amp supplies instead *stares in 1200 pins*
<bslsk05> ​'Grand Theft Auto V PC - Ultra Settings 1080p60. Intel Xeon + GTX 970' by ModCollection (00:02:31)
<bslsk05> ​wccftech.com: AMD Ryzen 9 7950X With 16 Zen 4 Cores Shows Up In AM5 'LGA 1718' Desktop CPU Installation Video Guide
<geist> a strange looking cpu
<zid> yea saw der bauer take a look at it
<zid> It is really weird
<zid> I really want a zen4
<geist> but i guess it makes sense, the caps you usually see there (presumably bypass caps?) are typically on the bottom of the cpu in the gaps
<clever> photos like this, make it look like the raw die is BGA mounted to a regular pcb?
<clever> and then some epoxy underfill helps to bond it in place
dzwdz has joined #osdev
<geist> unrelated: I saw Everything Everywhere all at Once yesterday, and i can't recommend this movie enough
<geist> it's astonishingly good
<zid> okay but what is it called
<geist> yes
<zid> no yes is on first
<geist> no thats Yes
<geist> Roundabout over there
<doug16k> I wonder what the threadripper package is like
<doug16k> giant lga?
<bslsk05> ​en.wikipedia.org: Socket SP3 - Wikipedia
<geist> dunno if they've announced the new one yet
<doug16k> zen4 though yeah
<doug16k> could be that same socket?
<geist> i doubt it. probably will get some new one as well. TR5 or whatnot
<doug16k> I guess it already is LGA so that would make sense
<doug16k> yeah DDR5 I guess
<doug16k> different fiddly impedance stuff or something
<geist> haven't looked into it, but DDR5 may physically require more traces
<geist> looks like they're going to skip a socket number and SP3 -> SP5, presumably to sync up with AM5
<geist> TR4 ->? TR5?
<doug16k> they would get it all lined up
<doug16k> except first digit of 4 digit desktop cpu models
<geist> that of course assumes they'll continue the split SPn/TRn socket thing
<clever> something ive thought about, who says the cpu power has to come in over the same interface as the data?
<clever> could you have a big honking molex power port on the top of the cpu?
<clever> and just cut the heatsink around it
<doug16k> you need that high performance VRM though
<zid> it has voltage domains
<zid> multiple phases
<zid> etc
<clever> it could be a different connector, with all of the right voltages, and properly sized pins
<zid> still better to have 400 half amp pins
<clever> rather than trying to force it thru 400 undersized pins
<zid> unless you want crackling and warping and stuff
<zid> arcing is *really* healthy for electronics
<geist> now you're playing with power!
<clever> another idea ive had before, what if you put fiber transceivers directly on the cpu package?
<clever> what if you replaced all of the data with fiber ports?
puck has quit [Excess Flood]
puck has joined #osdev
<clever> as-in, the cpu socket has a bunch of light pipes, connecting the cpu's fiber ports, to the motherboard's fiber ports, or even potentially directly into a fiber off to a new pci-e type thing
<clever> and then just beef up the size of the remaining power only pins
<doug16k> how much does that motherboard cost to manufacture?
<clever> good question
<clever> part of the thinking there, is that the motherboard can either convert the optical data back into standard pci-e
<doug16k> the ones we have now are cheap, but manufactured with such precision, it puts the cost in
<clever> or the board can just have a fiber routing the optical data right to an optical pcie port
<geist> you could just drive the light over to the storage crystals
<clever> so you dont have to deal with differential traces and routing anymore, just run a strand of fiber
<geist> your memory becomes the healing crystals that glow in the forest when you go search when both suns are down
<clever> i'm not that crazy :P
<geist> if you install windows 12 on them they become kyber crystals
<doug16k> if you could figure out how to use light for all data transfer, you would save power
<doug16k> right now, mosfet gates are essentially capacitors that we fill up and drain with huge pulses of current that are largely losses
<doug16k> every time something switches, there is a pulse of loss
<doug16k> extremely brief though
<doug16k> neat thing though, holding it on or off is almost free
<geist> https://youtu.be/tYiIpjaJ86E?t=232 he talks a bit about ryzen 7000 delidding
<doug16k> it charges up to whatever exactly cancels out your signal and no current flows at all
<bslsk05> ​'HW News - Intel Ships 1 GPU, Delidded Ryzen 7000 CPU, Apple M1 Vulnerability' by Gamers Nexus (00:26:06)
<geist> so you can see what's under it
<clever> doug16k: but which is costing more power, a gate inside the cpu switching a net that remains within the die, or a gate driving an external bus over the motherboard and into a pcie card?
<clever> i feel like the longer trace will have more capacitance, that you have to (dis)charge every time it changes state
<doug16k> yes, you would get parasitic capacitance on the traces, from that to the ground plane and adjacent traces
<doug16k> if the adjacent trace has the opposite value
<doug16k> and with ground if not low
<clever> and probably also against vcc if low
<clever> or even other voltage rails, if high but not that high
<clever> any time a voltage difference exists between 2 parallel-ish bits of copper?
<doug16k> yeah. driving them to a different value will cause that capacitance to charge
<doug16k> if they were the same, then there is no difference to charge it up
<clever> yep
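The capacitance argument above can be put into numbers with the usual first-order model: each transition of a net costs about half of C times V squared, so power scales with capacitance, voltage squared, and how often the net toggles. A rough Python sketch follows; the two capacitance values are illustrative guesses, not measurements, and the function names are made up for the example.

```python
# First-order CMOS switching-loss model; all concrete values are illustrative.

def energy_per_transition_j(capacitance_f, voltage_v):
    """Energy dissipated per 0->1 or 1->0 transition of a capacitive net."""
    return 0.5 * capacitance_f * voltage_v ** 2

def switching_power_w(capacitance_f, voltage_v, clock_hz, toggle_probability):
    """Average power if the net toggles with the given probability each clock."""
    return (energy_per_transition_j(capacitance_f, voltage_v)
            * clock_hz * toggle_probability)

# Hypothetical capacitances: a short on-die net vs. a board-level trace.
ON_DIE_NET_F = 2e-15
BOARD_TRACE_F = 5e-12

on_die = switching_power_w(ON_DIE_NET_F, 1.0, 1e9, 0.5)
off_chip = switching_power_w(BOARD_TRACE_F, 1.0, 1e9, 0.5)
```

Under these assumed values the board trace costs a few thousand times more per toggle than the on-die net, which is the intuition clever is describing. A net held at a constant level has a toggle probability of zero and so costs nothing in this model.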
<clever> off the top of my head, isnt a modern cpu only really connecting to 3 things? 1: power, 2: every dram module, 3: pci-e lanes out the wazoo?
<doug16k> a whole bunch of stuff to communicate with VRM
<doug16k> the cpu participates in the feedback loop very much
<clever> ah right, power wont just be dumb voltage rails, and talking to it over pci-e is probably too much overhead?
<clever> need a dedicated comms channel for that feedback loop
<doug16k> needs to be super low latency yeah
<doug16k> you are right though, it is a huge number of power pins, same number of grounds, and tons of memory interface and pcie lanes, plus a bunch of SoC stuff like USB and audio
<clever> and something with an integrated gpu, would also have hdmi lanes over the cpu socket
<clever> or displayport
<doug16k> I'm not sure how they do the display output in detail
<clever> yeah, you could just have a dumb 2d only gpu on the motherboard, and a 3d core in the cpu
<geist> yah i think AMD in particular has N differential lanes that can be dynamically configured for PCI or SATA or USB or whatnot
<clever> some laptops kinda do similar, with both a 2d and a 3d gpu, and the ability to just cut power to the 3d gpu
<geist> but indeed, some of the pins on AM4 (and probably AM5) are video lanes
<clever> ive also heard of one cpu, that had 128 differential lanes, that can be configured as either 128 pcie lanes
<clever> or 64 pcie lanes, and 64 cpu<->cpu interconnect lanes
<clever> and the 2nd cpu would give you the other 64 pcie lanes
<clever> so it becomes numa
<geist> yep
<clever> but what if you just replaced that same idea, with 256 fiber transceivers?
<clever> and you just fiber the ram modules directly into the cpu
<clever> and fiber in the pci-e slots
<doug16k> one fibre can carry lots of different signals at once
<clever> now the motherboard is basically just doing VRM and decoding to slower electrical things
<clever> yeah, so you could get away with far fewer channels, say 1 fiber per ram module, 1 fiber per pcie slot
<doug16k> that would probably give you some feasibility
<clever> and the motherboard can either convert that fiber into the old pci-e 16x, or keep it as fiber for a new socket type
<clever> you could potentially even have the connections in new/weird places on the cpu
<clever> what if the transceivers are on the top of the cpu package, in a ring around the IHS?
<clever> and the clamp that holds the cpu down, includes the fiber couplers
gxt has quit [Ping timeout: 240 seconds]
gxt has joined #osdev
ccx has quit [Ping timeout: 246 seconds]
<doug16k> I have been wondering if transparent system memory encryption has any negative effect on ram longevity
<doug16k> doesn't it cause it to be a 50% chance that the adjacent bit is a different value?
<geist> oh how you figure?
<geist> sure
<doug16k> so you maximize the chance of leakover happening
<geist> or i guess any given bit will flip to the opposite state more often than perhaps before
<Mutabah> It would probably help with predicting the lifetime
<doug16k> yeah, trigger the ecc
<doug16k> it also causes approximately half the data lines to be 1 and half to be 0, no matter what the values
<doug16k> does that increase data integrity due to it being less electrically bouncy?
<doug16k> it probably increases power losses a bit too. it causes it to be a 50% chance that each data line changes at each clock edge
<doug16k> where it might have got away with runs of 1 or 0 with low loss
<clever> hdmi scrambling also does something related, i think for emi reasons
<clever> there is an 8:10 encoding on the wire, so an 8bit value (the raw color) gets converted into a 10 bit symbol
<clever> that offers both ecc, and also some emi reduction, the symbols are chosen to have a low edge count
<clever> but, during the blanking intervals, it uses a different 2:10 encoding, with 4 specially chosen 10bit symbols, that instead have a very high edge density
<clever> so the receiver can calibrate its phase offset for sampling
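The low-edge-count versus high-edge-count contrast described above can be checked with a short sketch. This Python code implements only the first (transition-minimising) stage of TMDS as it is commonly described for DVI/HDMI, plus the four control symbols; the DC-balancing second stage that produces the final 10-bit data symbol is deliberately left out, and the function names are made up for the example.

```python
# Sketch of TMDS stage 1 (transition minimisation) only.
# The DC-balancing stage that yields the final 10-bit symbol is omitted.

def popcount(x):
    return bin(x).count("1")

def tmds_stage1(d):
    """Encode 8 bits into 9: bits 0-7 are data, bit 8 is 1 for XOR, 0 for XNOR."""
    ones = popcount(d)
    use_xnor = ones > 4 or (ones == 4 and (d & 1) == 0)
    q = d & 1
    prev = q
    for i in range(1, 8):
        cur = prev ^ ((d >> i) & 1)
        if use_xnor:
            cur ^= 1
        q |= cur << i
        prev = cur
    if not use_xnor:
        q |= 1 << 8
    return q

def tmds_stage1_decode(q):
    """Invert tmds_stage1, recovering the original 8-bit value."""
    use_xnor = ((q >> 8) & 1) == 0
    d = q & 1
    for i in range(1, 8):
        bit = ((q >> i) ^ (q >> (i - 1))) & 1
        if use_xnor:
            bit ^= 1
        d |= bit << i
    return d

def transitions(word, nbits):
    """Count adjacent bit pairs that differ within the low nbits of word."""
    return popcount((word ^ (word >> 1)) & ((1 << (nbits - 1)) - 1))

# The four blanking-interval control symbols, keyed by the 2 control bits.
CONTROL_SYMBOLS = {
    0b00: 0b1101010100,
    0b01: 0b0010101011,
    0b10: 0b0101010100,
    0b11: 0b1010101011,
}
```

Comparing `transitions(tmds_stage1(d), 8)` over all byte values against `transitions(symbol, 10)` for the control symbols shows the split: the data words are edge-poor and the control symbols are edge-rich, which is what lets a receiver lock its sampling phase during blanking.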
<mrvn> clever: can you trace fibre optics on a PCB?
<clever> mrvn: first thought that comes to mind, cnc out a channel on an interior layer, and lay some fiber in there
<clever> but that means an entirely new way of fabbing a multi-layer board
zaquest has quit [Remote host closed the connection]
<clever> a far simpler thing, is to just route some fibers on the back side, and glue them down
zaquest has joined #osdev
ccx has joined #osdev
<clever> from memory, a 4 layer pcb, is just a pair of 2layer (double-sided) pcb's, with a copperless fiberglass separator, all 3 parts glued into a stack
<clever> but what if you cnc'd some slots into that spacer layer, and ran some fibers?
<doug16k> what about blind vias and other madness they have to deal with on motherboards
<clever> thats just drilling holes in the layers before you glue them together
<mrvn> clever: how thin can you make fibres? And what about crossing them?
<clever> for crossing, you could maybe come up with a kind of fiber via?
<mrvn> can you etch traces into the board and then fill them with something that becomes fibre optics?
<clever> where the fiber terminates with a 90 degree prism, and just shoots up/down
<mrvn> clever: can't do a 90° turn with fibre.
<clever> and the next pcb layer has a matching prism to catch it
<clever> thats why you have a prism on the end, that reflects the light
<mrvn> have fun placing those prisms.
<clever> the prism would be bonded to a pre-cut length of fiber
<clever> and yeah, you would have some loss at every one of those junctions
<mrvn> Does the fibre expand at the same rate as the board as it heats up?
<clever> and then a total budget of allowed loss over the whole link
PapaFrog has quit [Ping timeout: 258 seconds]
<mrvn> Does fibre optics even get the kind of throughput your hugely parallel memory bus has?
<mrvn> Or were you thinking of having 256 parallel fibre lines?
<clever> what kind of bandwidth would you commonly get from todays ram?
nyah has joined #osdev
PapaFrog has joined #osdev
<mrvn> 50GBit/s?
<clever> > Fiber optic Ethernet can typically achieve speeds up to or greater than 100 Gbps.
<clever> from a random hit in google
heat has joined #osdev
<mrvn> that's with big transmitters and receivers.
<doug16k> you'd have to make sure the transceivers don't use more power than the copper
<clever> yeah, so you would need to find ones that are small enough to fit on a cpu module
<clever> one min
<mrvn> I doubt they would net you an energy save.
<bslsk05> ​'Finally Revealing my BIG SECRET - Corning Optical Thunderbolt 3' by Linus Tech Tips (00:14:19)
<clever> skip back to 4:09
<zid> and they have to not fail at 100C
<doug16k> if you made an optical processor, and it was only light, then it would make sense
<clever> 3:48 shows the emitters used
<clever> and i think this specific cable, is doing pcie over thunderbolt
<doug16k> the reason you use optical for that application is noise immunity
<clever> and distance
<mrvn> things to do with your fibre optic cables.
<clever> he is crazy, and putting every computer in his new house, in a single server rack
<clever> including things like the xbox
<clever> and then fiber hdmi'ing them to every room, lol
<clever> so the same computer, has monitors in multiple rooms
<clever> and all of the heat/noise is contained in one place
<clever> doug16k: i'm thinking less avoid noise, and more about reducing the pin-count
<clever> what if you just entirely did away with the traditional cpu socket?
<clever> what if the cpu was a brick in a hdd bay, with a bunch of fibers coming out, and a VRM module stuck onto the side of it?
<zid> speed of light
<zid> there's a reason my RAM hits my cooler
<zid> it's not because they're too lazy to move it farther away
CryptoDavid has joined #osdev
<clever> yeah, that could add latency
<doug16k> the closer it is, the easier it is to get it working too
<doug16k> because they are extremely pushing their luck on the data rate
<doug16k> imagine how high the frequencies are in the edges
<clever> well, you have 2 separate things there, data rate, and latency
<doug16k> somewhat more than the ram frequency, into the GHz
<clever> nothing says you cant have both high latency and high data rates
<zid> I mean, dram is already high latency
<clever> exactly
<zid> making it worse doesn't sound fun
<clever> but i think thats more about the dram module itself, taking a few clock cycles to come up with a response
<clever> rather than time of flight for the messages
<zid> well you're talking about transceiving it twice, that's going to add some nanos
<zid> plus some distance, that'll add 1 or 2 more
<clever> yeah
<zid> and it's already 'slow' at 10-20ns, I wouldn't wanna turn it into 30 or 40
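A back-of-the-envelope version of the latency estimate above, as a Python sketch. It assumes a fibre refractive index of roughly 1.5 (so light covers about 20 cm per nanosecond) and treats the electrical/optical conversion cost as a single made-up per-direction figure; neither number comes from the log.

```python
# Rough latency estimate for moving DRAM further away over fibre.
C_VACUUM_M_PER_S = 299_792_458   # speed of light in vacuum

def propagation_delay_ns(length_m, refractive_index=1.5):
    """One-way time of flight through fibre of the given length."""
    return length_m * refractive_index / C_VACUUM_M_PER_S * 1e9

def added_round_trip_ns(length_m, conversion_ns, refractive_index=1.5):
    """Request out plus data back; conversion_ns is a hypothetical
    per-direction cost covering both ends' transceivers."""
    one_way = propagation_delay_ns(length_m, refractive_index) + conversion_ns
    return 2 * one_way
```

For example, `propagation_delay_ns(0.3)` is about 1.5 ns one way, in line with the "1 or 2 more" guess, while the conversion term depends entirely on what transceivers are assumed.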
<clever> moar cache!
<clever> hide that latency!
<zid> so we just pair it with the 768MB ryzen
<mrvn> what's the ARM doc that deals with memory barriers and concurrency issues called again?
<kazinsal> as someone whose project suffers from high latency on DMA, please, don't hide it with cache :(
<clever> kazinsal: you also have the problem of dma being coherent or non-coherent!
<zid> Your DMA is going to time travel, clever
<mrvn> kazinsal: but without caches we won't get side channel attacks
<clever> if a pcie device is doing dma, and the pcie hub is in the cpu, then the pcie can read directly from the caches
<clever> so the caches help the dma
<doug16k> how hard is it to disable all the caches in linux?
<doug16k> it would be funny to make every instruction serializing and see what happens
<clever> doug16k: i have run the rpi with the L2 cache disabled by accident before
<clever> it was noticeably slower
<mrvn> doug16k: on the RPI when I clear the screen without caches you can see it progressing line by line.
<doug16k> I have done it before on earlier processors where it wasn't that big of a change
<clever> mrvn: i assume you're not doing write combining?
<mrvn> clever: that would leave artefacts in the framebuffer
<mrvn> (for a time)
<clever> which model of pi?
<mrvn> doesn't really matter. With the caches turned off they are all dead slow.
<mrvn> No instruction caches either
<clever> the axi port size differs by model
<doug16k> yeah, it's always awful. modern fancy out of order ones have really long pipelines, so it's worse
<clever> the bcm2835 has a 32bit axi port coming out of the arm, so it can only ever move 32bits per clock
<clever> 2836/pi2 and up, have a 64bit port, so it can potentially move 64 bits per clock
<clever> and then you have axi burst stuff, which i think is covered by write combining
<clever> and you then need a cache-flush thing, to prevent the framebuffer artifacting
<clever> you can also use the 2d sprite hw to perform much faster updates, that are always locked to vsync
<mrvn> clever: and none of that matters with the cpu running without any caches. It's dead slow.
bauen1 has joined #osdev
<clever> yeah
<clever> the sprite hw would just let you hide that visually
<mrvn> I don't think it even schedules more than a single opcode per cycle without icaches.
<clever> or avoid needing to draw at all
<mrvn> there is no optimization better than removing code. :)
<clever> yep
<doug16k> does it dare fetch the next instruction before this instruction completes, when the instruction memory is uncacheable? it should expect the next instruction to change out from under it if part of being uncacheable is treating it as volatile
<mrvn> uncached or uncachable?
<mrvn> I don't think the latter is supported
<doug16k> in general not being cacheable means not prefetchable
<doug16k> depends on how the implementation treats it
<clever> on arm, you dont have to configure on a per-page basis, what can be i-cached, because its assumed that anything you're executing is going to be code, and all code should be cached equally
<clever> but d-cache needs per-page controls, for mmio to not get cached
<mrvn> and if you try to execute the MMIO registers bad things will probably happen.
<clever> thats how a lot of credit-warp exploits work in older consoles
<clever> in one case, its using the button state register as an opcode!
<mrvn> doug16k: you could check the ARM specs to see if it does any pipelining with the mmu/caches disabled.
<clever> so the opcode it runs, is directly linked to what buttons you're holding at that instant
<mrvn> clever: talk about frame perfect input :)
<doug16k> mrvn, yeah. I half expect them to recklessly prefetch regardless
<doug16k> which makes sense
<clever> mrvn: well, it only reads once at a known delay, so you can just hold the buttons before that point in time
eroux has quit [Ping timeout: 248 seconds]
eroux has joined #osdev
<doug16k> have you seen that memory corruption glitch where you can complete SNES mario by going down a pipe that shouldn't end the game?
<doug16k> you have to do a sequence of moves to fill memory with particular values at particular frames
<clever> yeah
<clever> part of that is that you have a jmp into the sprite xy configs
<clever> so you need to set the coords of sprites correctly
<doug16k> if you make a game, leave all the bugs in that don't crash it. apparently everyone loves it when games glitch
<clever> that reminds me, one developer hooked all of the cpu exception vectors, and routed them to a "secret level selection screen"
<clever> because if the game crashed during review, it had to start the review all over again
<clever> but if it randomly opens a secret level selector, thats not a fail :P
<dminuoso> Redefining error behavior, nice!
<clever> and then decades later, people discovered that if you whack the console hard enough, you can open that menu
DonRichie has quit [Quit: bye]
Burgundy has joined #osdev
<dminuoso> Curious, how would sudden physical acceleration induce cpu exceptions?
<doug16k> capacitors microphoning for one, ringing
<clever> doug16k: in this case, it was whacked hard enough for the cartridge to come loose
<doug16k> physical shock also seriously disrupts crystals
<bslsk05> ​'Why does PUNCHING Sonic 3D trigger a Secret Level Select?' by GameHut (00:03:02)
<doug16k> yeah connector bounce would happen
Jari-- has quit [Ping timeout: 248 seconds]
<doug16k> I was thinking more along the lines of the balding guy that smashed his keyboard 3x with his fist, then hit the monitor with the keyboard
<doug16k> I had a friend run over and rip out Street Fighter II and cave in one side from smashing it on his knee. worked fine
<doug16k> the whole back of the cartridge was gone, just board
<doug16k> near the middle
<doug16k> makes me wonder what you have to do to it to break a SNES and/or one of its games
<kingoffrance> "have you seen that memory corruption glitch where you can " theres something like that for zelda 3 snes too IIRC...beat it in 15 minutes or something?
<Mutabah> Nothing can beat the shenanigans people have done with Pokemon Yellow in a virtualboy
<Mutabah> (Or whatever that SNES GB cartridge adapter was)
<kingoffrance> super game boy
<kingoffrance> it added color :D
<kingoffrance> before the color game boy existed
<kazinsal> an acquaintance of mine is writing a new cycle-accurate game boy family emulator and they're running into so many bizarre things that some games do
<doug16k> when I imagine hitting a SNES with a sledgehammer, it damages it pretty bad, but also hilariously just pings up in the air and the sledgehammer bounces off it pretty nicely
<kazinsal> 80s/90s game dev was way more of a wild west than a lot of people realize
<Mutabah> A mix of pure magic/engineering... and "it runs, ship it"
<bradd> some early version of final fantasy didn't re-seed the rng if you died. So you had to kill in a different order and hope that worked. else you'd never get past the fight
<Mutabah> I once did some RE/decompilation work on Gen3 Pokemon, a whole lot of places where a BL instruction was used where a B was intended
<doug16k> how mad would you be on a processor where the only unconditional branch always sets ra
<doug16k> if you want to reduce the instruction set, get rid of the silly no-ra branch, right? hehe
<bslsk05> ​'How We Solved the Worst Minigame in Zelda's History' by Linkus7 (00:24:32)
<clever> bradd: basically, any loaded entity can call the rng function at any time, potentially multiple times per frame
<doug16k> I am playing with a toy core in verilog where every branch sets ra
<clever> and these crazy guys found a way to reverse the rng function, and discover its current internal seed, based on the random numbers it was outputting, and the total time the game was running
<doug16k> because opcode space is pretty jam packed and I don't want a branch off at some weird value
<clever> and then used that to predict the next rng
<Mutabah> Pure stats, pretty darn cool
<clever> Mutabah: with the complication, that the total number of rng calls can vary, based on how long you've been playing
<clever> and even what direction the camera was facing
<Mutabah> Yep.
<Mutabah> Working from rough memory of that video (saw it not long after it was released)
<clever> same
<doug16k> You can just record the whole sequence and know the next one trivially
<Mutabah> They figured out expected spread of the RNG at a given elapsed time, combined that with some observations, and used that to cut down the probabilities
<clever> with the added complication that you're not actually getting numbers out of the rng code
GeDaMo has joined #osdev
<clever> the rng algo used, will repeat after 7 trillion outputs
<doug16k> in zelda
<doug16k> years ago on msvc toolchain, I made an array of 2^32 bools and called rand() 2^32 times and set the bit from the return value, and it returned every possible number and repeated exactly
<clever> neat
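A scaled-down sketch of the experiment doug16k describes: walk a linear congruential generator until its state recurs and check that every state was visited. The 16-bit modulus is an assumption made so the walk finishes instantly, and the constants are the ones commonly cited for MSVC's rand(); neither is confirmed by the log.

```python
# Hypothetical scaled-down version of the experiment: a 16-bit state
# instead of 32 bits, so the full walk is cheap.
MOD_BITS = 16
MOD = 1 << MOD_BITS
A = 214013    # multiplier commonly cited for MSVC rand(); A % 4 == 1
C = 2531011   # increment commonly cited for MSVC rand(); odd


def lcg_next(state):
    """One step of the linear congruential generator, modulo 2**MOD_BITS."""
    return (A * state + C) % MOD


def period_and_coverage(seed=0):
    """Walk the generator until the seed state recurs.

    Returns (period, every_state_visited).
    """
    seen = bytearray(MOD)  # one flag per possible state
    state = seed
    steps = 0
    while True:
        seen[state] = 1
        state = lcg_next(state)
        steps += 1
        if state == seed:
            break
    return steps, all(seen)


if __name__ == "__main__":
    print(period_and_coverage())
```

With an odd increment and a multiplier congruent to 1 modulo 4, the Hull-Dobell conditions hold for a power-of-two modulus, which is why the walk covers every state before repeating.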
<clever> the light-house 2 protocol used in vr tracking, doesnt do that, by design
<clever> there are 7? channels, each with 2 seeds
<bslsk05> ​en.wikipedia.org: Linear congruential generator - Wikipedia
<clever> so you have 14 unique prng streams, that each repeat, but dont cover every possible value
<clever> and its not a list of 32bit ints, but a bit sequence
<clever> if you have 17? bits in a row, you can determine which seed you're on, and your position within the stream
<clever> the channel# is used to separate each lighthouse
<clever> while the 2 seeds, are used for a low rate comms channel, sending either seed-A or seed-B for a single sweep, giving you 1 bit per sweep
<clever> so the position in the stream, gives you the angular location of the tracker, within the LH's field of view
<clever> which pair of seeds, tells you which LH it is
<clever> and the low-speed comms, gives you a serial#, firmware version, and other stuff
<clever> if you then have multiple receivers getting hits from the same tracker, and you know the physical shape of the controller, you can then solve for distance
<doug16k> use ARC4 and use it to encrypt an infinite stream of zeros. output is ridiculously random, right?
<clever> so if 2 sensors are say 1 degree apart, in the LH's view, and 1 inch apart physically
<clever> then you can use basic math to figure out the distance from the LH
<clever> assuming its pointing square at the LH
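The "basic math" clever mentions can be sketched as follows, assuming the two sensors sit symmetrically about the line of sight and the controller faces the lighthouse squarely. The function name and units are illustrative, not taken from any lighthouse library.

```python
import math


def distance_from_separation(separation, angle_deg):
    """Distance to the lighthouse from two sensors a known distance apart.

    separation: physical distance between the sensors (any unit)
    angle_deg:  angle between the two sensors as seen from the lighthouse

    Geometry: separation / 2 = distance * tan(angle / 2).
    """
    half_angle = math.radians(angle_deg) / 2.0
    return separation / (2.0 * math.tan(half_angle))


if __name__ == "__main__":
    # clever's example: 1 inch apart physically, 1 degree apart in view
    print(distance_from_separation(1.0, 1.0))
```

For small angles this is close to separation divided by the angle in radians, so the 1 inch / 1 degree case comes out at a bit over 57 inches.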
<doug16k> ah. not random. ints aren't bits
xenos1984 has quit [Read error: Connection reset by peer]
<doug16k> you can pick constants for a simple LC generator that give a small period
<doug16k> if you had to do something like that with hardly any memory
<mrvn> but why would you ever want a period less than one less than the data type allows?
<doug16k> you probably wouldn't. why does RAND_MAX suck on several platforms
<mrvn> concerning ARC4. you don't encrypt a stream of 0. you encrypt a stream of something that occasionally changes its value, like the cpu temp.
<doug16k> dumb choices for the constants
<clever> https://github.com/jdavidberger/lighthouse2tools has more info for what i was saying
<bslsk05> ​jdavidberger/lighthouse2tools - General tools for working with / figuring out the LH2 (index) technology stack (5 forks/7 stargazers/MIT)
<doug16k> watching a video on youtube that mentions spinning around makes me sick. I can't even touch a VR headset
<clever> its using a Linear feedback shift register for its rng generation
<clever> doug16k: but this tracking hardware can also be used for non-vr things
<clever> any time you want to know the position and rotation of an object in a space
<doug16k> yeah, lfsr is a really good generator for small memory and simple cpus
<doug16k> the one I mentioned earlier is for big fat processors that don't mind multiplication
<clever> yeah, a LFSR could be implemented entirely in an asic, with relatively few gates
CryptoDavid has quit [Quit: Connection closed for inactivity]
<doug16k> yeah, almost nothing
<doug16k> it's more wire than gate
<clever> and this is only running at a measly 6mhz!
<doug16k> N flip flips in a chain with a couple of xors
<doug16k> flip flops*
<doug16k> picks off a couple of things to xor together to put into first flip flop input
<doug16k> at different bits
<clever> yep
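A minimal sketch of the flip-flop chain doug16k describes, plus the trick clever mentioned earlier of locating a position in the stream from a short run of bits. The 16-bit width, the tap positions (16, 14, 13, 11) and the seed are a textbook maximal-length example chosen for illustration; they are not the LH2 polynomials.

```python
SEED = 0xACE1  # any nonzero 16-bit value works


def lfsr16_step(state):
    """Shift right by one; the new top bit is the XOR of four tapped bits."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def lfsr16_period(seed=SEED):
    """Count steps until the register returns to the seed."""
    state = lfsr16_step(seed)
    steps = 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps


def output_bits(count, seed=SEED):
    """Emit the low bit of the register before each step."""
    bits = []
    state = seed
    for _ in range(count):
        bits.append(state & 1)
        state = lfsr16_step(state)
    return bits


def build_position_table(bits, window=16):
    """Map every run of `window` consecutive bits to its stream position."""
    n = len(bits)
    table = {}
    for pos in range(n):
        key = tuple(bits[(pos + k) % n] for k in range(window))
        table[key] = pos
    return table
```

Because 16 consecutive output bits are exactly the register contents at that moment, every window is unique within one period, which is what makes the position lookup work.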
xenos1984 has joined #osdev
<doug16k> I wonder how bad ARC4 is if you naively initialize the state with 00 to FF in order and just start using it to encrypt immediately, no seed or anything
<doug16k> probably appears encrypted to the naked eye, you think?
<doug16k> everyone is trying to figure out the key when the key is "do nothing"
<doug16k> it's the same logic that makes the password "password" seem good
<heat> no one expects the password password
<heat> security by stupidity
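A sketch of doug16k's thought experiment: the ARC4 output loop run on a state left as 00 to FF in order, with the key schedule skipped entirely. The function names are hypothetical. This is only meant to show that the keystream is then a fixed sequence anyone can regenerate.

```python
def rc4_keystream_no_ksa(n):
    """ARC4 output bytes with the state left as 0..255 (no key mixed in)."""
    s = list(range(256))  # naive init, key schedule skipped
    i = j = 0
    out = []
    for _ in range(n):
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) & 0xFF])
    return out


def xor_with_keystream(data):
    """Encrypt or decrypt: XOR the data with the keyless keystream."""
    keystream = rc4_keystream_no_ksa(len(data))
    return bytes(b ^ k for b, k in zip(data, keystream))


if __name__ == "__main__":
    ciphertext = xor_with_keystream(b"attack at dawn")
    print(ciphertext.hex())
    print(xor_with_keystream(ciphertext))
```

The output may look scrambled to the naked eye, but with no key involved, applying the same function again recovers the plaintext for anyone.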
<doug16k> funniest thing - everyone is convinced somehow that you aren't allowed spaces in a password
<doug16k> it's one of the rarest things in password dbs
<heat> i personally enjoy the caesar cipher with a shift of 0
the_lanetly_052_ has quit [Ping timeout: 248 seconds]
<heat> doug16k, depends on the service probably
<clever> doug16k: i never even thought to put one there!
<heat> some are quite picky
<doug16k> I read somewhere that the most unused character in password databases is <
<heat> ^^this is why chrome's password generation is very conservative
<heat> how do they know tho
<doug16k> breaches
<clever> ive heard a story before about a website that would truncate passwords that are too long
<heat> plaintext passwords are sus
<clever> and then your pw isnt valid, because its comparing the truncated to the non-truncated
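The failure clever describes can be reproduced in a few lines, assuming a site that truncates at registration but not at login. MAX_LEN and the function names are hypothetical, and the bare unsalted hash is only there to keep the sketch short.

```python
import hashlib

MAX_LEN = 8  # hypothetical length limit applied by the signup form


def digest(password):
    """Unsalted hash, used here only to keep the example small."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register_buggy(password):
    """Signup silently truncates before storing."""
    return digest(password[:MAX_LEN])


def login_buggy(stored, password):
    """Login forgets to truncate, so long passwords never match."""
    return stored == digest(password)


def login_consistent(stored, password):
    """Login applies the same truncation as signup."""
    return stored == digest(password[:MAX_LEN])


if __name__ == "__main__":
    stored = register_buggy("correct horse")
    print(login_buggy(stored, "correct horse"))
    print(login_consistent(stored, "correct horse"))
```

Making login consistent removes the lockout, though rejecting over-long passwords at signup is usually kinder than truncating them silently.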
<heat> use a salted sha256 and be done with it
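A minimal sketch of what heat suggests, using only the Python standard library. The function names are hypothetical. Production systems generally prefer a deliberately slow key derivation function such as hashlib.pbkdf2_hmac or hashlib.scrypt over a single hash round; the single round below just mirrors the suggestion as stated.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    salt, stored = hash_password("hunter2")
    print(verify_password("hunter2", salt, stored))
    print(verify_password("hunter3", salt, stored))
```

Only the salt and digest are stored, so a scheme like "enter the 3rd letter of your password" cannot be built on top of it.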
<doug16k> people were way dumber about security back then
<clever> heat: i just used openid, let somebody else deal with the passwords :P
<heat> they still are
<doug16k> really really bad then
<GeDaMo> I've seen sites which say things like "enter the 3rd letter of your password" :/
<mrvn> I xor twice, for extra security.
<doug16k> I wonder how many security system installers face people saying "make it 1111"
<heat> the passcode for my apartment block is a straight fucking line
<clever> GeDaMo: when i was going thru account recovery with netflix, a dedicated dialog, separate from the chat popped up, to confirm the last 4 digits of my credit card number
<mrvn> heat: and the keys are probably showing wear.
<heat> the security of the whole building is compromised because "hurr durr pins are hard"
<doug16k> then after a couple of years, 1 is blank and the rest of the buttons are brand new
<clever> GeDaMo: i suspect its designed in such a way, that the support guy, only gets a boolean, and cant steal my number
<GeDaMo> Yeah, showing the last four digits of a card is common
<doug16k> I know what you mean. people pretend they are too dumb to remember 4 digits
<clever> GeDaMo: but in this case, its not even showing the last 4, its asking for the last 4, to confirm who i am, but also not revealing the answer to the random support dude
<mrvn> clever: But if they know the 3rd letter of the password then they probably have it stored somewhere. total security fail even if the GUI only shows a bool
<doug16k> how many people here know the IT crowd emergency services number?
<clever> mrvn: yep, plaintext == fail
<clever> doug16k: oh god, i dont remember it, lol
<mrvn> I would even go one step further: password == fail
<clever> mrvn: that reminds me, i had designed a "saved password" feature years ago, it used ssl client certs
<clever> it didnt actually save the pw, but instead registered your client cert to the acct
<mrvn> clever: we use zeromq public/private keys for our management software at work.
<mrvn> and one time tokens for new users to log in the first time.
<mrvn> token + public key creates an account, private key is only known to the user.
<clever> and if they lose the private key?
<mrvn> then they get a new token and can register a new private key.
<mrvn> And tokens are even encrypted with a password. So you can email them the token and send the pass via sms for 2 factor auth to become a new user.
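The enrolment flow mrvn outlines might look roughly like this. It is a sketch with hypothetical names, public keys are treated as opaque bytes, and the password-encrypted token step is left out.

```python
import secrets


class AccountRegistry:
    """Toy model: one-time tokens bind a public key to a user name."""

    def __init__(self):
        self._tokens = {}  # outstanding token -> user name
        self._keys = {}    # user name -> registered public key

    def issue_token(self, username):
        """An admin issues a single-use token for a user."""
        token = secrets.token_urlsafe(16)
        self._tokens[token] = username
        return token

    def register(self, token, public_key):
        """Consume the token and bind the key; False if the token is unknown."""
        username = self._tokens.pop(token, None)
        if username is None:
            return False
        self._keys[username] = public_key
        return True

    def key_for(self, username):
        """Return the currently registered public key, or None."""
        return self._keys.get(username)


if __name__ == "__main__":
    registry = AccountRegistry()
    token = registry.issue_token("alice")
    print(registry.register(token, b"public-key-1"))
    print(registry.register(token, b"public-key-2"))  # token already used
```

Losing the private key is handled the way mrvn says: a new token is issued and the replacement public key overwrites the old binding.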
elastic_dog has quit [Ping timeout: 248 seconds]
elastic_dog has joined #osdev
vai has joined #osdev
vai has quit [Client Quit]
vai has joined #osdev
vai is now known as Jari--
dennis95 has joined #osdev
bliminse has quit [Quit: leaving]
bliminse has joined #osdev
pretty_dumm_guy has joined #osdev
srjek has joined #osdev
bliminse has quit [Quit: leaving]
bliminse has joined #osdev
the_lanetly_052 has joined #osdev
xenos1984 has quit [Read error: Connection reset by peer]
xenos1984 has joined #osdev
arch-angel has quit [Remote host closed the connection]
corecode_ has joined #osdev
mcfrd has joined #osdev
mcfrdy has quit [Quit: quit]
corecode has quit [Quit: ZNC - http://znc.in]
corecode_ is now known as corecode
mcfrd is now known as mcfrdy
<sbalmos> was doing some random reading this morning. if you're doing a C++ kernel, are the global constructors and destructors still in .ctors & .dtors, because it looks like some say .init_array should be used moreso nowadays?
pretty_dumm_guy has quit [Ping timeout: 255 seconds]
pretty_dumm_guy has joined #osdev
<dminuoso> If you're doing a C++ kernel, you better define what global constructors/destructors mean and how they are implemented yourself.
<heat> no that's a compiler detail
<heat> sbalmos, I think they get put into init_array and fini_array
<sbalmos> I was about to say
<sbalmos> heat: That's what I thought. Older stuff online I had bookmarked from ages back was using ctors & dtors. But I figured it got moved more recently (FSVO "recently")
<sbalmos> heat: Does clang follow that also? I thought it was slightly different than GCC
<heat> clang does init_array only afaik
<heat> gcc can /if you enable it/
<sbalmos> thx
<heat> it's a configure time switch. usually it defaults to on because it can test if the host system supports them. it defaults to no when cross-compiling
<heat> so use --enable-initfini-array(? need checking)
<sbalmos> that looks familiar
<sbalmos> although I'm only using clang, so ¯\_(ツ)_/¯
puck has quit [Excess Flood]
puck has joined #osdev
gildasio1 has quit [Ping timeout: 240 seconds]
<geist> i think maybe it'll also change based on the triple and/or arch?
<geist> not sure it's the absolutely proper way but you can also just merge them together and define your own symbols for start/stop: https://github.com/littlekernel/lk/blob/master/arch/arm64/system-onesegment.ld#L81
<bslsk05> ​github.com: lk/system-onesegment.ld at master · littlekernel/lk · GitHub
dennis95 has quit [Quit: Leaving]
<sbalmos> geist: oh cute. yeah, hadn't even started into anything non-amd64 yet. was kind of wondering what the others looked like
<sbalmos> linker scripts are a whole other gray-art area that I'm crash-learning too
bliminse has quit [Quit: leaving]
k0valski1889 has quit [Ping timeout: 248 seconds]
bliminse has joined #osdev
ethrl has joined #osdev
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
k0valski1889 has joined #osdev
gog` has quit [Ping timeout: 248 seconds]
<ddevault> is anyone aware of an OS project which has attempted to implement the linux loadable module API
<ddevault> to run linux drivers as-is
<ddevault> not that I want to do this, just curious if anyone has tried
Likorn has joined #osdev
zid` has joined #osdev
Ram-Z_ has joined #osdev
bgs_ has joined #osdev
auronandace has joined #osdev
pie__ has joined #osdev
vancz_ has joined #osdev
bliminse has quit [*.net *.split]
PapaFrog has quit [*.net *.split]
zid has quit [*.net *.split]
Celelibi has quit [*.net *.split]
Teukka has quit [*.net *.split]
\Test_User has quit [*.net *.split]
mavhq has quit [*.net *.split]
pie_ has quit [*.net *.split]
vancz has quit [*.net *.split]
andreas303 has quit [*.net *.split]
Ram-Z has quit [*.net *.split]
dh` has quit [*.net *.split]
ThinkT510 has quit [*.net *.split]
Terlisimo has quit [*.net *.split]
MiningMarsh has quit [*.net *.split]
bgs has quit [*.net *.split]
bgs_ is now known as bgs
Vercas has quit [Quit: Ping timeout (120 seconds)]
Vercas has joined #osdev
Terlisimo has joined #osdev
PapaFrog has joined #osdev
bliminse has joined #osdev
Celelibi has joined #osdev
MiningMarsh has joined #osdev
andreas303 has joined #osdev
Test_User has joined #osdev
mavhq has joined #osdev
mahmutov_ has joined #osdev
<heat> ddevault, hmmmmm, the BSDs do kinda that with DRM
<heat> there's also the NDISWrapper stuff for windows network drivers on linux/BSD
<heat> but the whole API? Probably not
<heat> it would affect your whole design substantially I assume
<heat> (and it's not even stable!)
<ddevault> not much benefit, either
<ddevault> you get a bunch of drivers but at that point are you even particularly different from linux
<heat> you have become the thing you set out to destroy :0
<heat> it may be more feasible when rust on linux becomes a thing
<heat> the API surface will undoubtedly be smaller AFAIK
netbsduser` has joined #osdev
brenns102 has joined #osdev
Goodbye_Vincent6 has joined #osdev
CYKS2 has joined #osdev
klange_ has joined #osdev
AndrewYu has joined #osdev
dostoyev1ky2 has joined #osdev
dostoyev1ky2 has quit [Client Quit]
rwb has joined #osdev
ethrl has quit [*.net *.split]
Goodbye_Vincent has quit [*.net *.split]
mrvn has quit [*.net *.split]
netbsduser has quit [*.net *.split]
klange has quit [*.net *.split]
dostoyevsky2 has quit [*.net *.split]
janemba has quit [*.net *.split]
CYKS has quit [*.net *.split]
Andrew has quit [*.net *.split]
brenns10 has quit [*.net *.split]
rb has quit [*.net *.split]
Goodbye_Vincent6 is now known as Goodbye_Vincent
CYKS2 is now known as CYKS
brenns102 is now known as brenns10
janemba has joined #osdev
mahmutov_ is now known as mahmutov
GeDaMo has quit [Quit: There is as yet insufficient data for a meaningful answer.]
floss-jas has quit [Ping timeout: 240 seconds]
heat is now known as a
a is now known as heat
terminalpusher has joined #osdev
gildasio has joined #osdev
pretty_d1 has joined #osdev
pretty_dumm_guy has quit [Ping timeout: 252 seconds]
gildasio has quit [Remote host closed the connection]
SGautam has joined #osdev
mahmutov has quit [Ping timeout: 240 seconds]
sortiecat has joined #osdev
sortie has quit [Ping timeout: 260 seconds]
terminalpusher has quit [Remote host closed the connection]
<geist> iirc the linux kernel module stuff is a pretty unstructured system in the sense that it loads raw .o files and just resolves symbols as it sees fit
Likorn has quit [Quit: WeeChat 3.4.1]
Likorn has joined #osdev
eck has quit [Quit: PIRCH98:WIN 95/98/WIN NT:1.0 (build]
eck has joined #osdev
* heat yawns
<heat> is gregkh at google? I thought he worked for the linux foundation but he has a @google.com and does a bunch of android work
<heat> weird
<j`ey> contracting maybe?
<heat> guess so
pretty_dumm_guy has joined #osdev
pretty_dumm_guy has quit [Client Quit]
<heat> looks like all he does is merge upstream -stable stuff to the android kernel
* heat . o 0 { Reverse upstream - make a fork with so many unmergeable patches and hire the upstream maintainers to downstream the changes to your fork }
<geist> POWER MOVE
pretty_d1 has quit [Ping timeout: 240 seconds]
<heat> large phallus energy
nyah has quit [Quit: leaving]
<heat> what's with the steve jobs picture linus
<heat> he's not even a tech ceo wtf
<zid`> I'm redesigning my emulator to be.. complicated :D
<heat> is it in enterprise C# or Java?
<zid`> oh god heat
<bslsk05> ​github.com: JADE/InstructionManager.cs at master · BLNJ/JADE · GitHub
<heat> if it's not, i don't want to hear about it
<zid`> someone posted this a day or two ago
<heat> i know
<zid`> it runs at 0.00002fps he let slip
<heat> this is beautiful
<heat> very poor OO programming though
<zid`> anyway, I was thinking of having a fast path which does really lazy emulation until the next mmio, which runs for min(lcd, timer, audio) where those are 'how many cycles until that device will generate an interrupt'
<zid`> then it switches into an accurate mode to deal with all that interaction, then goes back to being fast
<heat> what's lazy emulation?
<zid`> full instructions rather than t-cycles
<zid`> vblank loop skipping
<zid`> for the lcd: entire scanlines rather than individual pixels
<zid`> for the timer: not incrementing and checking for overflow, just calculating when overflow will be then doing timer += 93039;
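A sketch of the fast path zid` describes, assuming a hypothetical counter that wraps to zero at a fixed modulus (real hardware timers often reload from another register, which is ignored here). It shows that jumping the timer forward in one addition gives the same result as ticking cycle by cycle.

```python
def cycles_until_next_event(lcd, timer, audio):
    """Length of the next fast-path run: cycles until the earliest interrupt."""
    return min(lcd, timer, audio)


def step_timer_accurate(counter, cycles, modulo=256):
    """Slow path: tick one cycle at a time and count overflows."""
    overflows = 0
    for _ in range(cycles):
        counter += 1
        if counter == modulo:
            counter = 0
            overflows += 1
    return counter, overflows


def step_timer_fast(counter, cycles, modulo=256):
    """Fast path: add the whole batch, then derive the wrap count."""
    total = counter + cycles
    return total % modulo, total // modulo


if __name__ == "__main__":
    run = cycles_until_next_event(lcd=456, timer=1024, audio=87)
    print(run, step_timer_fast(250, run))
```

The emulator would run the fast path for that many cycles, then drop into the accurate mode only around the point where a device actually interacts with the CPU.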
Test_User has quit [Quit: e]
eroux has quit [Ping timeout: 260 seconds]
SGautam has quit [Quit: Connection closed for inactivity]
eroux has joined #osdev
srjek|home has joined #osdev
srjek has quit [Ping timeout: 250 seconds]
<doug16k> qemu has an option for that in tcg, where it can just jump forward until the next event or make it really wait
<doug16k> icount related? I don't remember exactly
<doug16k> ah, yeah, icount sleep=off. makes the clock just jump forward if time would elapse in halt waiting for interrupt
<heat> how does that work in smp?
<doug16k> good question, but probably min deadline across cpus if all halted
Burgundy has quit [Ping timeout: 260 seconds]
<doug16k> I used it some, it seemed to work on smp, but I am not certain, I only used icount to make it deterministic for debugging and the sleep=off sped it up
<doug16k> when idling, it makes it seem like your kernel is pegging the cores but really, time is flying by on the clock
<doug16k> icount just sets a fixed amount of virtual time to elapse per instruction
<doug16k> another way of looking at it: executing an instruction bumps the clock forward
<heat> that scares me
<heat> might uncover some weird bugs in timer code
<doug16k> you can't feel it inside the guest
<doug16k> other than the cpu being faster or slower. it already expects that
<heat> i can see shit going south if the code between me getting the timestamp (deadline) and setting the timer is unrealistically slow
<heat> maybe my timer code is just crap idk
<doug16k> yeah you don't make it unreasonable
<doug16k> sure you could put it so it does nothing but dispatch timer irqs
<doug16k> it can't even execute one user instruction
<doug16k> don't do that, though
<doug16k> you set the shift. shift=N causes each instruction to take 2^N ns
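A toy model of the behaviour doug16k describes, not QEMU's implementation: each executed instruction advances virtual time by 2^shift ns, and with sleep turned off a halted CPU jumps straight to its next deadline instead of waiting. The class and method names are hypothetical.

```python
class ICountClock:
    """Virtual clock driven by instruction count rather than host time."""

    def __init__(self, shift, sleep=True):
        self.ns_per_insn = 1 << shift  # shift=N means 2**N ns per instruction
        self.sleep = sleep
        self.virtual_ns = 0

    def execute(self, instructions):
        """Running guest code bumps the clock forward."""
        self.virtual_ns += instructions * self.ns_per_insn

    def halt_until(self, deadline_ns):
        """Idle until the next timer deadline.

        Returns the host time (ns) this model would spend waiting:
        the full gap when sleep is on, nothing when sleep is off.
        """
        gap = max(0, deadline_ns - self.virtual_ns)
        self.virtual_ns += gap
        return gap if self.sleep else 0


if __name__ == "__main__":
    clock = ICountClock(shift=3, sleep=False)
    clock.execute(1000)
    print(clock.virtual_ns)
    print(clock.halt_until(1_000_000), clock.virtual_ns)
```

In both modes the guest sees the same virtual timestamps, which matches the point that you cannot feel the difference from inside the guest.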
<heat> do you remember details about the tsc deadline mode?
<doug16k> I don't use it because of errata
<heat> can't remember if it triggers if the counter == tsc or if counter <= tsc
<doug16k> are they crazy enough to try to make wraparound ok?
<doug16k> if they are sane it is <=
<doug16k> what year does it wrap around?
<heat> oh yes, indeed
<heat> it's sane
<heat> no, I don't think they make it wraparound
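Why the inequality matters can be sketched with plain integers, reading heat's "<=" as "deadline <= TSC". If the code between reading the timestamp and arming the timer is slow, the counter may already be past the deadline, and an equality test would then never fire. The function names and the 3 GHz figure are assumptions for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def fires_on_equal(tsc, deadline):
    """Hypothetical 'crazy' rule: fire only when the counter hits exactly."""
    return tsc == deadline


def fires_on_reached(tsc, deadline):
    """Sane rule: fire once the counter has reached or passed the deadline."""
    return tsc >= deadline


def years_until_wrap(tsc_hz, bits=64):
    """How long a counter of the given width takes to wrap at tsc_hz."""
    return (1 << bits) / tsc_hz / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Deadline was computed as 1000, but the counter is at 1050 when armed.
    print(fires_on_equal(1050, 1000), fires_on_reached(1050, 1000))
    print(years_until_wrap(3e9))
```

At a few GHz a 64-bit counter takes on the order of two centuries to wrap, which is why nobody needs to make wraparound work.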
<doug16k> did you read the errata about it?
<doug16k> if you use it you really should
<heat> no
<heat> what errata
<doug16k> model specific bugs
<heat> how did intel screw this up
<heat> it's not that complex lol
<doug16k> if windows didn't use it, good luck
<doug16k> there is a microcode workaround
<heat> linux does though
<doug16k> can you degrade to the normal way if there is no deadline?
<doug16k> example: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x52 (or later)
<doug16k> linux does that too
<heat> theoretically
<doug16k> right, just saying, you might want to make it chicken out and use the normal timer if you can tell it has the defect
<heat> my gen's errata mentions messing with IA32_TSC_ADJUST makes IA32_TSC_DEADLINE trigger at the wrong time
<heat> which seems sane I guess?
<heat> technically a bug but that's not something I would feel comfortable about doing anyway
<doug16k> is that all? ok as long as you know that, you can tippy toe around IA32_TSC_ADJUST changes
<heat> for 7th gen seems so
<doug16k> what if the entire capability isn't there
<heat> i use the lapic
<doug16k> that's pretty good then
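The "chicken out" logic doug16k recommends could be sketched like this. CPUID leaf 1 reports TSC-deadline support in ECX bit 24; the microcode threshold is whatever the errata for a given model says, with 0x52 borrowed from the example message above. The function name and return strings are hypothetical.

```python
TSC_DEADLINE_BIT = 1 << 24  # CPUID.01H:ECX bit 24


def pick_timer_mode(cpuid_1_ecx, microcode_rev, min_fixed_rev=None):
    """Choose between TSC-deadline mode and the ordinary LAPIC timer.

    cpuid_1_ecx:   ECX value returned by CPUID leaf 1
    microcode_rev: currently loaded microcode revision
    min_fixed_rev: first revision with the erratum fixed, or None if the
                   model has no known TSC-deadline erratum
    """
    if not (cpuid_1_ecx & TSC_DEADLINE_BIT):
        return "lapic"  # capability absent
    if min_fixed_rev is not None and microcode_rev < min_fixed_rev:
        return "lapic"  # capability present but known to be buggy
    return "tsc-deadline"


if __name__ == "__main__":
    print(pick_timer_mode(TSC_DEADLINE_BIT, 0x4F, min_fixed_rev=0x52))
    print(pick_timer_mode(TSC_DEADLINE_BIT, 0x52, min_fixed_rev=0x52))
```

A real kernel would key the threshold off the CPU family, model and stepping; that table is omitted here.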
<doug16k> what makes tsc any better?
<heat> faster iirc and definitely more precise