klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
<geist> wow, someone put together a ridiculously detailed M1 analysis
<bslsk05> ​twitter: <handleym99> Bigger than Jesus! Longer gestating than Chinese Democracy! Rarer than Once Upon a Time in Shaolin! ␤ It's finally available in (very) preliminary form! My first masterpiece -- M1 Explainer. <drive.google.com/file/d/1WrMYCZ… https://t.co/h3RuiXlro2> 1/ @dougallj @andreif7 @trav_downs @silicongang @stuntpants
elastic_dog has joined #osdev
<clever> let me check on the comments you left in the PR
asskoala has quit [Ping timeout: 252 seconds]
<clever> geist: oh, i just had a bit of a hacky idea, after arch_chain_load() turns irq's off on one core, and calls platform_quiesce, can platform_quiesce still spawn 3 pinned threads, and get the other cores to re-schedule?, and then block until they act via a spinlock maybe?
<clever> so when arch_chain_load tries to quiesce the entire system, platform_quiesce will re-park the other cores in a known location
<geist> probably not
<clever> what is most likely to fail there?
<geist> well, okay with the current scheduler it'll probably work
<geist> because it's not possible to deadlock on the 'pinned' cpu0 that has disabled interrupts
<geist> but in a queue-per-cpu style design, you've already stepped off the edge the moment you essentially hijack cpu 0 by disabling ints
<geist> that's fine as long as you dont intend to ever touch the scheduler again, on any cores
<clever> yeah
<geist> but if you do, then it's possible there's another thread blocked up on cpu0 that is holding a mutex in the heap, for example
<geist> and then another cpu tries to malloc something, boom
<geist> the current scheduler is a single queue, so there's no blocking up like that
<geist> so it'll probably work
<clever> reading the code, i can see that arch_chain_load will do: 1: arch_disable_ints, 2: target_quiesce (no-op), 3: platform_quiesce
<clever> platform_quiesce could temporarily turn IRQ's back on, and ask the scheduler to get all 4 cores running code i control, each grabbing a spinlock
<geist> yeah so if either target or platform goes and does stuff that involves grabbing mutexes or whatnot (heap) or fiddling with the scheduler
<clever> with spinlocks held, irq's are already off
<geist> then the fact that it disabled ints is now blown
<clever> platform_quiesce can then return back to arch_chain_load for hijacking core-0
<heat> what's a good resource for knowing how a modern CPU actually works under the hood?
<clever> and i can hijack the other 3 in my own way
<heat> uops and whatnot
<geist> i think we came up with a good solution about an hour ago: run LK in UP mode, grab the other cores, park them for eventual handoff
<geist> that's basically what all LK based bootloaders do
<heat> although something lower level would actually be cool as well
<geist> heat: a lot of what i learned was in the mid to late 2000s with an excellent series of articles on arstechnica, later collapsed into a book
<geist> basically a whole series of cpu architecture articles. more specifically superscalar cpu architecture
<geist> a lot of the rest of it i've learned here by talking to doug16k and whatnot
<clever> line numbers are also missing from your comments on the PR
<geist> oh?
<clever> normally, a comment is on a range of lines
<geist> huh. i just pushed a little +_ next to the line and started typing
<clever> weird
<geist> they seemed a little strange though, like they were some sort of mini-comment
<geist> i never saw a ui for 'start a review' or whatnot
<geist> i still dont fully grok the github review UI
<geist> and it seems to change on me every time i use it
<clever> for the first comment, i should check to see if loader_pa is within the vmm_get_kernel_aspace()->arch_aspace first, right?
<geist> yah
<clever> and then if its not, add my own aspace
<geist> a good example of the previous is arm-virt qemu
<geist> kernel aspace starts at 0x4000.0000+ and lo and behold that's also where physical ram starts
<geist> lots of socs i know (seems to be most modern ones i've seen) now start DRAM later on, 0x4000.0000 or 0x8000.0000 and just go right past 4GB
<geist> and then stuff all the peripheral stuff below that
<clever> thats a good reason for that testcase i mentioned earlier
<clever> can qemu be configured easily to make ram start anywhere?
<geist> no
<geist> it's extremely hard coded
<clever> ah, but if i pick the rpi qemu mode, it starts at 0
gog has quit []
anon16_ has quit [Ping timeout: 252 seconds]
<clever> so i can just swap between arm-virt and rpi, in qemu
<geist> sure
gog has joined #osdev
<geist> and write some sort of bootloader test case sure
<clever> so i could write a testcase, where i try to chainload linux, and ensure it works in both cases
<clever> how does qemu-arm-virt deal with unparking the other cores? PSCI was it?
<geist> yes
<geist> basically the way all modern arm64s do it
<geist> you call a piece of firmware to unpark/park the cores
<clever> so LK boots in UP mode, and just tells linux to go ask PSCI for the other cores?
<geist> what does linux have to do with it?
<clever> when you chainload a SMP capable kernel, and it wants more cores
<geist> oh if it's a bootloader yes
<geist> just dont touch the other cores, let linux deal with it
<heat> x86 also works like that
<clever> the other option, would be to make LK into a PSCI firmware
<geist> and PSCI specifically has a defined state the cores are brought up in
<heat> cores go through the BIOS single-threadedly and just wait for the wakeup in a loop
<geist> right
<geist> in the case of PSCI on arm it's the firmware job to do what it does. most of them actually do park the core and really bring them up from cold
<clever> heat: and my problem, is that the bios is missing, and i'm using LK as my bios
<geist> saves power that way
<geist> but that's the nice thing about it, it abstracts how the cores are brought up
<geist> well yeah. so PSCI runs in EL3 for one thing
<geist> and a proper PSCI firmware does *nothing* if it's not told. so really LK is completely inappropriate for 'sticking around' like that
<clever> but i'm on arm32, so EL3 style things would be a bit more tricky
<geist> there is no PSCI on arm32 i think
<geist> well okay no that's not true, but i haven't seen it implemented on arm32
<geist> because arm32 is dead.
<clever> and i'm the one re-animating the zombie :P
<geist> well to be more precise: a pure 32bit only core i dont think implements PSCI. but you can make PSCI calls from a 32bit only os (running at say EL1) on a 64bit core with a 64bit EL3
<clever> that makes sense
<geist> arm32 as a subordinate EL is still somewhat alive (though newer cpus dont implement it above EL0 or at all)
<heat> nintendo switches still have arm7s in them
<geist> the big wrinkle is... cortex-a32 which is a 32bit only armv8 core (with four ELs)
<clever> i think i wound up fixing both your PR comments at once
<geist> so actually in that case everything i said is a lie.
<heat> note 7, not v7
<clever> i switched aspace to being a pointer, so i can trivially select between the 2 available ones
<geist> so really it's pre-armv8 that doesn't do PSCI
<clever> and now i have to malloc it, which solves the 2nd issue
<clever> if (loader_pa isin vmm_get_kernel_aspace()->arch_aspace) {
* zid makes a note in geist's fine: Do not trust on star sequences or arm boot processes
<clever> so i just need to fill in this blank
<zid> file*, bah
<geist> heat: that's probably the little hidden portalplayer 'security processor' in the tegra
<heat> yup
<heat> also the boot cpu
<geist> yep
<geist> in a past life i dealt with a nvidia tegra. it booted the arm7 first, then booted the other cores
System123 has joined #osdev
<geist> very much like the broadcom mess that is the early raspberry pi cpus
<geist> basically an existing design (arm7tdmi + some dsp stuff) with some 'big' arm cores bolted onto the side
<clever> geist: i'm guessing i use arch_mmu_query to see if a given VA is within an aspace?
<geist> over the years the arm7 has turned into more and more of a security thing
<geist> clever: no. you have to test against KERNEL_ASPACE_BASE, etc
<clever> DEBUG_ASSERT(is_valid_vaddr(aspace, vaddr));
<clever> oh, maybe that
<geist> it's hard coded
<clever> is_valid_vaddr feels like the best option, because if that returns false, the aspace can never map it
gog has quit [Remote host closed the connection]
gog has joined #osdev
<clever> kernel/vm/vmm.c: arch_mmu_init_aspace(&_kernel_aspace.arch_aspace, KERNEL_ASPACE_BASE, KERNEL_ASPACE_SIZE, ARCH_ASPACE_FLAG_KERNEL);
<klange> I wonder what I can poke on my ThinkPad through ACPI without having to go all the way with AML... would be nice to have a battery level widget...
<heat> geist, why the arm7 though?
<klange> Would also be nice if I actually did the thing with panel widgets I said I was going to do and make them shared objects...
<geist> heat: because nvidia bought an existing design from a company called portalplayer in the mid 2000s
<geist> and then morphed it into tegra over time
<klange> Where is my notebook... of the paper variety..
<bslsk05> ​en.wikipedia.org: PortalPlayer - Wikipedia
<geist> much the way broadcom had bought another company and then morphed it into their ARM base of socs (they had been doing mips before)
<geist> and then the VCPU and whatnot came in via that path
System123 has quit [Ping timeout: 268 seconds]
<geist> in 2009 or so i was dealing with the second gen tegra, tegra 2
<geist> it still was basically a portalplayer with a cortex-a8 bolted onto the side
<heat> still doesn't explain why they didn't change it
* geist shrugs
<heat> why do they really like old CPUs booting new CPUs (see intel x86)
<geist> gotta ask them. 'them not changing it' is extremely common
pony has joined #osdev
<geist> well that kinda makes sense i guess. you can run a very old, small, extremely power efficient security processor off just internal SRAM
<geist> and do whatever crypto/etc you need to then decide to power up the cores that are many orders of magnitude larger
<geist> nice for things like sitting there with the soc 'off' and just sipping power keeping the battery running, etc
<geist> in their case they probably just kept it since they already paid for the arm7 IP
<geist> sometimes you see various socs stuff in one or more cortex-m class cpus for the same thing
<clever> that reminds me, ive had issues in the past, where a laptop battery just entirely died, because i left it fully charged and unused for months
<geist> same. worth bringing them out and powering them up every once in a while
<clever> ive heard of a product somewhere, that would automatically drain its own battery, if left unused for too long
<clever> to prevent exactly that
<clever> it had an MCU keeping track of idle time, and probably a mosfet and resistor, to dump things into heat
<heat> D:
<clever> and thats the kind of thing you could add into that security processor, if the battery is non-removable
<clever> its already running when "off"
pony has quit [Client Quit]
<clever> and you could even omit the dummy load resistor, just turn the fat cores on, and spin!
<geist> dumping things into heat is always fun
<heat> noooooooooooo
pony has joined #osdev
<clever> static inline bool is_valid_vaddr(arch_aspace_t *aspace, vaddr_t vaddr) { return (vaddr >= aspace->base && vaddr <= aspace->base + aspace->size - 1); }
<clever> geist: i think this function does exactly what we need, but its private to arch/arm/arm/mmu.c, so arch/arm/arm/arch cant see it
<geist> elevate it to an arch_mmu_* routine or an arch_aspace routine
<heat> vaddr < aspace->base + aspace->size is way clearer :)
<geist> seems like it could be generically used
<geist> heat: problem is wraparound
<clever> arch_aspace does sound like a good place to move it
<geist> heat: in the very common case where base + size == 0 it explicitly calculates based on base + size - 1
freakazoid343 has joined #osdev
<heat> hmm good point
<geist> when dealing with inner VM address calculations and whatnot these sort of wraps and whatnot are extremely common, have to be very very careful
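A minimal sketch of the wraparound geist describes, assuming a 32-bit address space. The type `aspace_range` and both function names are hypothetical stand-ins for illustration, not LK's actual `arch_aspace_t` or API. With a base of 0x80000000 and a size of 0x80000000, `base + size` wraps to 0, so the "clearer" exclusive comparison rejects every address, while the `base + size - 1` form still works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for LK's vaddr_t on a 32-bit target. */
typedef uint32_t vaddr_t;

/* Hypothetical stand-in for the base/size pair inside an aspace. */
struct aspace_range {
    vaddr_t base;
    uint32_t size;
};

/* Inclusive upper bound: compares against the last valid address,
 * base + size - 1, which is still representable when the range
 * ends exactly at the top of the address space. */
static bool in_range_inclusive(const struct aspace_range *r, vaddr_t vaddr) {
    return vaddr >= r->base && vaddr <= r->base + r->size - 1;
}

/* Exclusive upper bound: base + size wraps to 0 when the range ends
 * at the top of the address space, so nothing compares below it. */
static bool in_range_exclusive(const struct aspace_range *r, vaddr_t vaddr) {
    return vaddr >= r->base && vaddr < r->base + r->size;
}
```

For a range covering the upper half of a 32-bit address space, the inclusive form accepts 0xC0000000 and the exclusive form rejects it.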
<clever> arch_mmu_ actually, compared to the other funcs
<geist> problem is arch_aspace_t is an opaque type, so you can't really put it in a public header next to the other methods
<geist> or at least can't do an inline version
<geist> tis the one really nice thing you can do easily in C and not as easily in C++: have completely opaque types with a bunch of methods defined on it
<geist> C++ tends to force you to expose the guts of your object *or* use a pimpl to hide the guts
<zid> seems like a job for a macro
<clever> *looks*
<moon-child> geist: isn't that what 'private' is for?
<zid> no
<moon-child> or is that not private enough
<geist> moon-child: sure but you still have to expose the guts in the .h
<zid> you still get the declaration
<moon-child> oh abi stuff?
* moon-child has not read scrollback
<clever> arch/include/arch/mmu.h:typedef struct arch_aspace arch_aspace_t;
<clever> ah, thats where its defined
<zid> so it still compiles slow and still breaks when things change (read: needs recompiling)
<geist> in C you can build OO style stuff, but just use an opaque struct for your pointer
<geist> but... then you dont get the advantage of lots of little inline accessors. so it's all a tradeoff
<zid> by like.. 'default' the idiom in C is basically to make things OO but on the TU level instead of the runtime memory level
<clever> geist: i think you can also do that with `class Foo;` in c++, you can pass a `Foo*` around, but you can never allocate a Foo or access any of its members
<geist> exactly same thing. you can *write* C style OO in C++ if you want
<clever> but member functions are a thing, and then you need to declare it fully
<geist> but my point is if you do it the C vs C++ way
<clever> arch/arm/arm/include/arch/aspace.h:struct arch_aspace {
<clever> ahhh, thats where its hidden
<geist> i'm not saying its great but it always pains me when you have to stuff so much crap in the .h file for some nominally opaque C++ object in its header because that's how you do it
gog has quit [Remote host closed the connection]
gog has joined #osdev
<geist> clever: right, each arch defines their own version of it
<clever> the typedef was making it harder to spot, but i found it now
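A small sketch of the opaque-struct idiom discussed above, shown in a single file for brevity. The `counter` type and its functions are made up for illustration and have nothing to do with LK's `arch_aspace_t`. The first part is what a public header would carry, the second is what stays private to one .c file. Callers can hold a `counter_t *` but cannot see or touch the fields, which is also why the accessors cannot be inlined from the header.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what the public header would contain --- */
typedef struct counter counter_t;      /* incomplete type: layout hidden */
counter_t *counter_create(int start);
void counter_increment(counter_t *c);
int counter_value(const counter_t *c);
void counter_destroy(counter_t *c);

/* --- what stays private in the implementation file --- */
struct counter {
    int value;
};

counter_t *counter_create(int start) {
    counter_t *c = malloc(sizeof(*c));
    if (c != NULL)
        c->value = start;
    return c;                          /* NULL if allocation failed */
}

void counter_increment(counter_t *c) {
    assert(c != NULL);
    c->value++;
}

int counter_value(const counter_t *c) {
    assert(c != NULL);
    return c->value;
}

void counter_destroy(counter_t *c) {
    free(c);
}
```

Changing the fields of `struct counter` only requires recompiling the implementation file, at the cost of a function call per access.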
<geist> this M1 pdf is really interesting
<geist> it spends about 40 pages really trying to explain superscalar design but then really gets into it
<zid> The first chip in the range so it's going over all the basics super deep?
<clever> link?
<geist> see above
<geist> warning it's 300 pages
<geist> and very detailed
<clever> geist: what happens if i change the core affinity for the currently running thread, and then hit reschedule or yield?
<moon-child> Wow. It's 100 pages longer than agner vol 3
<geist> clever: good question, looks like it might kinda fall through
<geist> it'll not pick it, but i dont see it send any sort of broadcast for the other cores to pick it up
<geist> so if the target cpu is idle it wont 'pick it up'
<clever> ah
<clever> i do see a wakeup_cpu_for_thread function
<clever> oh, what if i change the pinned core, and then thread_sleep() ?
gog has quit [Remote host closed the connection]
<geist> that'll do it
<clever> it will temporarily suspend, and the irq will wake it up later
<clever> and route it to whatever it should be on now
<clever> so i can use that, to ensure i'm on a given core, without having to spin up a new thread
<geist> welll... not so sure
<geist> the affinity stuff is kinda half baked. it's intended to be a thing you set once
<clever> yeah
<clever> but i do notice it has both curr_cpu and pinned_cpu
<clever> i'll review the pinned_cpu code, and do some testing
<geist> the gist is with the single unified run queue, 'pinning' a thread to a cpu just means any other cpu will skip it
<geist> so there's some logic in the thread wakeup path to make sure the target cpu wakes up if it's idle and/or reevaluates
<clever> so the scheduler has to grab a global lock, when deciding which thread to run next?
<geist> yes
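A toy model of the "single run queue, pinned threads get skipped" idea described above. This is only a sketch of the concept, not LK's scheduler code: `toy_rq_thread`, `TOY_NO_PIN` and `toy_pick_next` are hypothetical names, and locking is left out (in the real thing the scan would happen under the global lock just mentioned).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_PIN (-1)     /* hypothetical marker: thread may run anywhere */

struct toy_rq_thread {
    int id;
    int pinned_cpu;         /* TOY_NO_PIN or the only cpu allowed to run it */
};

/* Scan the shared queue in order and return the index of the first thread
 * this cpu is allowed to run, or -1 if every queued thread is pinned to
 * some other cpu. */
static int toy_pick_next(const struct toy_rq_thread *queue, size_t len, int cpu) {
    for (size_t i = 0; i < len; i++) {
        if (queue[i].pinned_cpu == TOY_NO_PIN || queue[i].pinned_cpu == cpu)
            return (int)i;
    }
    return -1;
}
```

The -1 case is the situation geist points at: a thread pinned to an idle cpu sits in the queue until something wakes that cpu up to look.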
<clever> ive also noticed, if there are no other competing threads, the pre-emption timer is never set
<geist> that's right
<clever> but if a new thread is created, pinned to such a core, and resumed
<clever> then you need to add the timer in
<geist> right
<clever> so it has to interrupt the task immediately?
<geist> probably
<clever> since it would be too costly to set a timer on the current core
<geist> there's a bit of a mental disconnect in my brain because i also completely rewrote all of this in zircon
<clever> "note to self, wake the neighbor at 5am", lol
<geist> this particular part (affinity) i did a complete gut and rewrite and it's far more complicated to handle all the edge cases
<clever> ah
<geist> but lots of it is also because i rewrote the scheduler to be per-cpu queues
<geist> and then there's this whole 'the thread is in the wrong queue' sort of edge cases
<clever> when using per-cpu queues, what happens if one of the cores winds up idle ahead of plans, and another queue is filling up?
<geist> also just the other day the trusty branch of LK added a bunch of logic for affinity
<geist> i need to pull it back
<clever> does something re-balance them?
<geist> i did some code reviews at work for it
<bslsk05> ​android.googlesource.com: 01d4cc46a1a8f108bcb118bff9bc73b2ab2bac56 - trusty/lk/common - Git at Google
<geist> i occasionally cherry pick stuff out of that branch
<geist> clever: yeah per cpu queues is clearly much more efficient lock/etc wise and scales better
<geist> but suddenly your scheduler isn't 'perfect' as far as keeping all the cores occupied
<clever> i also need to investigate the thread priority some
<geist> a single queue that all cpus opportunistically pull from is 'ideal' in the sense that no cpu is wasted (if you're aggressive about waking them)
<geist> but clearly doesn't scale
<clever> i moved all of my animation code into threads, that block on wait_queue_block()
<geist> modern complicated systems have lots of logic to deal with trading off efficiency vs throughput vs overhead with regards to balancing threads between cpus
<clever> but if one thread is hogging cpu, the pre-emption wont interrupt it much
<clever> and the animation then slows to a crawl
<geist> why not?
<clever> the pre-emption is not going to interrupt a task at 60hz, to run 2 other threads
<clever> because the priorities are all equal
<geist> sure it will, it'll just round robin them
<clever> but how much time does each one get? before it rotates?
<geist> oh depends on what the quantum is set to
<geist> probably higher than you want
<clever> if i mess with the thread priority, then wait_queue_wake_all and INT_RESCHEDULE, will forcibly switch to the animation threads on each vsync irq
<clever> and those are supposed to be very quick routines, so it will get back to the cpu heavy part
<bslsk05> ​github.com: lk/thread.c at master · littlekernel/lk · GitHub
<geist> looks like 50ms, since that's 5 ticks of 10ms
<clever> ah, and that would explain the stuttering, when i have a 16ms vsync interrupt
<geist> yep
<clever> so its missing 3 or 4 frames
<geist> but yes the priorities are hard so if you have some long running thing make it lower priority
<clever> even with the irq saying INT_RESCHEDULE, the scheduler says no and lets the quantum run out
<geist> but actually it's more complicated than that, the other threads *should* be preempting it
<geist> since the scheduler has a feedback loop
<clever> the long-running thread, is doing tga decode with irq enabled
<clever> the 2 animation threads, are blocked on wait_queue_block waiting for vsync
<geist> oh wait no. yes, again i have a disconnect. the LK scheduler is hard priority
<clever> and the vsync irq handler will wait_queue_wake_all(&channels[hvs_channel].vsync, false, NO_ERROR); and INT_RESCHEDULE
<geist> no feedback. if you set it to priority 18 something that is set to 19 will *always* preempt it
<geist> the only difference is threads marked as 'real time' will not get a preemption timer on them
<geist> so they're even more uber, they will simply run until something higher priority gets them or they yield
<clever> and a INT_RESCHEDULE from an interrupt handler, can force that pre-emption, without any care about the remaining quantum?
<geist> yes
<clever> thats what i was expecting
<geist> well, okay actually no. it's more subtle
<clever> so i can use a slightly higher priority for animations, to keep them smooth, but i might use realtime for thermal throttling, so i dont cook things
<clever> or just handle that entirely in the irq handler, and dont give the scheduler a chance to mess up
<geist> it means 'call thread_preempt()' on this which will decrement the quantum by 1 and invoke the scheduler
<clever> ah right, let me check the arch irq routine
<geist> so it doesn't always reschedule the current thread. it only reschedules if the current thread runs out of quantum or something higher priority is in the queue
<clever> thread_preempt(); yep, exactly
xenos1984 has joined #osdev
<clever> so each vsync irq, is eating up one quantum, making it run out slightly faster than normal
<geist> if the current thread is out of quantum it may still pick it if it's still the highest priority thread and there's nothing else in the same queue
<clever> until it runs dry, and has to reschedule
<clever> yeah, makes sense
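A toy model of the preemption rule as it is described in the chat: each call costs the running thread one unit of quantum, and the thread is only switched out if something of higher priority is ready, or if its quantum is used up and an equal-priority thread is waiting. This is not LK's thread_preempt(); `toy_thread` and `toy_preempt` are hypothetical names, and the quantum of 5 comes from the "5 ticks of 10ms" figure quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_thread {
    int priority;            /* larger number = more important */
    int remaining_quantum;   /* counted in "times we've checked" */
};

/* Model one preemption check against the best thread waiting in the
 * run queue. Returns true if the scheduler should switch away from cur. */
static bool toy_preempt(struct toy_thread *cur, int highest_ready_priority) {
    cur->remaining_quantum--;

    /* Hard priorities: anything more important always wins. */
    if (highest_ready_priority > cur->priority)
        return true;

    /* Equal priority only gets a turn once the quantum is used up. */
    if (cur->remaining_quantum <= 0 && highest_ready_priority == cur->priority)
        return true;

    /* Otherwise cur is picked again, even with no quantum left. */
    return false;
}
```

With equal priorities and a 50ms quantum, a thread woken by a 16ms vsync can wait roughly three frames for its turn. Raising the animation threads one priority level makes the first check return true immediately.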
<geist> right the quantum stuff is explicitly sloppy: only uses a counter of ticks and then the accounting of the ticks is sloppy
<geist> it's intended to not invoke current_time() or use a higher res thing like ms or us
<clever> and realtime threads can still be interrupted by irq's i assume, they just dont get interrupted by a pre-emption timer, and the scheduler will likely only interrupt it with another realtime?
<geist> the obvious thing to do there is to make quantum be tracked in actual time or at least some sort of jiffies thing
<geist> but in this case it's a sloppy notion of 'times we've checked'
<clever> i can see how avoiding current_time might help with speed
<clever> at least on the rpi, that involves MMIO leaving the cpu cluster, and going off to the clock peripheral
<geist> on a lot of cortex-m class stuff it can involve a divide or two
<geist> so it can take hundreds of cycles
<clever> yeah, my clock peripheral returns uSec, so it also needs a /1000
<clever> i wonder...
<clever> c4001e1c: 80 90 d0 61 bl c400e1bc <__udivdi3>
<clever> yeah, thats a pretty big function
<geist> yah few hundred instructions at least
<clever> i can see why you want to avoid it
<clever> 0x488 bytes, and an opcode is a minimum of 16 bits
<geist> looks like the realtime thread stuff is still a bit half baked. it still calls thread_preempt on it
<geist> which really it shouldn't be fiddling with the quantum on a real time thread
<geist> but it does keep it from interrupting the cpu that's running one
<geist> which is really what it's for
<geist> so if you took a real time thread, pinned it on cpu 2 then it'll basically leave that cpu alone, not send it IPIs, etc. that was the intent at the time
<clever> yeah
<clever> but if its pinned on a core that can service hw irq's, it will be getting interrupted by things like uart and vsync
<geist> so *really* that's all the real time flag does: it marks a cpu that's running a real time thread as off limits to ipis
<geist> right
<clever> for the rpi arm, only one core can ever get a hw irq, the rest are IPI only
<clever> for rpi vpu, each core has its own irq mask set, and vector table
<clever> so i could balance it however i want
<geist> been a while since i looked at this stuff. it's odd this sort of disconnect where i've been dealing with multiple derivatives of this and to go back to the ancestor
<geist> it's like going back and looking at linux 1.0 or something
<clever> heh
<clever> i recently updated my linux source, just in case it could fix something (it had basically no effect)
<geist> but... honestly i still mostly like how the LK stuff is for simple designs. it's good for 'i need some threads and wanna run some shit'
<clever> then i noticed it printing weird things on boot
<clever> C:0x010000C0-0x015B43E0->0x01095700-0x01649A20
<bslsk05> ​github.com: linux/head.S at rpi-5.10.y · raspberrypi/linux · GitHub
<clever> its debug info, for when the kernel copies itself
<clever> linux will take the PC for the decompression stub, round it down to the nearest 128mb (to find the start of ram), and then add 32kb
<clever> and it unpacks itself to that addr
<clever> but if the compressed kernel is in the way, it has to first memcpy itself out of the way
freakazoid343 has quit [Ping timeout: 252 seconds]
<clever> so ideally, the kernel should be at least 32kb + $uncompressed_size away from the start of ram, and the start of ram should be 128mb aligned
<clever> this is also where LK is in danger
<clever> -r-xr-xr-x 1 root root 126K Dec 31 1969 result/rpi2-test/lk.bin
<clever> my LK build is 126kb, so 94kb of it gets overwritten
<clever> any other cores LK had spinning, will then promptly malfunction on chainload
<clever> if they are idle, i expect them to be waiting for an IPI
<clever> and if linux never pokes the bear, they will remain idle forever
<clever> so any kind of parking i do, must either be within the first 32kb of the binary, or be a blob i copy to a safe place
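The arithmetic clever walks through can be sketched as below, under loud assumptions: the 128MB rounding and 32KB offset are taken from the chat, the resident image (LK here) is assumed to sit at the very start of RAM, and the unpacked kernel is assumed to be large enough to cover the rest of that image. All names are hypothetical and none of this is taken from Linux's head.S.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_ALIGN   (128u * 1024u * 1024u)  /* 128MB, per the chat */
#define TEXT_OFFSET (32u * 1024u)           /* 32KB, per the chat */

/* Round the decompression stub's PC down to a 128MB boundary to guess
 * where RAM starts. */
static uint32_t guess_ram_start(uint32_t stub_pc) {
    return stub_pc & ~(RAM_ALIGN - 1u);
}

/* Address the kernel unpacks itself to. */
static uint32_t kernel_dest(uint32_t stub_pc) {
    return guess_ram_start(stub_pc) + TEXT_OFFSET;
}

/* Bytes of a resident image [image_base, image_base + image_size) that
 * sit at or above dest, i.e. the part the unpacked kernel lands on
 * (assuming the kernel is big enough to reach the end of the image). */
static uint32_t bytes_at_risk(uint32_t image_base, uint32_t image_size,
                              uint32_t dest) {
    uint32_t image_end = image_base + image_size;
    if (image_end <= dest)
        return 0;
    uint32_t first = image_base > dest ? image_base : dest;
    return image_end - first;
}
```

With the stub running near 0x010000C0 the destination comes out as 0x8000, and a 126KB image loaded at address 0 has 94KB past that point, matching the numbers above. Only code within the first 32KB survives.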
<clever> PSCI doesnt have to worry, because it can use EL3 features to protect itself
<bslsk05> ​github.com: [arch][arm] improve arm chainload by cleverca22 · Pull Request #305 · littlekernel/lk · GitHub
freakazoid12345 has joined #osdev
<clever> ive also confirmed it can still boot linux on an rpi
<gorgonical> i'm reading about MLIR and the cool stuff it does, and I realized I was never taught what a lattice was in mathematics, even though I acquired a degree in computer science with a focus on math
<gorgonical> wtf america
<klange> "Hm, there's nothing in my compositor to send the hotspot for the cursor to the vbox driver..." I say to myself... and sure enough, I have harded my wacky choice... of 26,26.
<klange> hard-coded*
System12_ has joined #osdev
<mjg> klange: does your os boot on bare metal?
System123 has quit [Ping timeout: 265 seconds]
System12_ has quit [Ping timeout: 268 seconds]
<mjg> nice
<mjg> i never had the guts to try even my hello world kernel
<klange> I have a photo from what appears to be December of 2011, before I had a GUI - actually, before I even seem to have had a proper userspace - of this same ThinkPad running in VGA text mode on my desk in my dorm.
<klange> It must be December of 2011, because there's a date in the kernel shell prompt of 12/14 and my nameplate from Apple is on the pegboard behind the screen...
<klange> And a later one from what looks to be April, and possibly with a real userspace shell: https://i.imgur.com/JPRYk.jpg
vai has quit [Quit: Lost terminal]
<klange> Old GUI running on a desktop that had a funny idea of display centering: https://i.imgur.com/Pps6H.jpg
<klange> running on an old netbook: https://i.imgur.com/u9Kz7.jpg ← This little guy is one of the reasons I wasn't doing 64-bit support for so long, those early Atoms were 32-bit.
<klange> that netbook playing quake: https://i.imgur.com/RR8ahQO.jpg
System123 has joined #osdev
<mjg> pretty solid stuff, fortunately does not make me want to go back to writing an os from scratch :)
<clever> mjg: thats why ive opted for the simpler, yet more complex route, of porting an existing kernel, to an under-documented cpu core!
<klange> Just booted on my desktop as I hadn't actually tried the new kernel here yet, and it works and brings up SMP and sees all 64GB of memory and Grub happily hands me a nice 1080p framebuffer for one of my four displays
<klange> But this box has a Realtek 8168-series NIC, so no network support, and also the PS/2 emulation layer is giving me a really slow mouse cursor (I think because this mouse is super-high DPI and it's being lazy about dealing with that)
<klange> I also booted on a Surface once, got to the desktop and full res, though I know that doesn't have older ACPI tables I'm looking for so no SMP, and since it has no PS/2 emulation it's utterly useless until I get this USB stack into a state of existence
<klange> But the clock ticked, so that's nice.
<klange> I think those two Renesas USB controllers are integrated on the PCIe video capture boards I have installed, amusing that they're the same model as the one in the ExpressCard card I use in my laptop so I've got them in my hand-written PCI ID database.
<klange> I guess that NIC's in the 8169 series so we have a quick reference page... https://wiki.osdev.org/RTL8169
<bslsk05> ​wiki.osdev.org: RTL8169 - OSDev Wiki
<junon> So the deal with NICs is that there is a typical set of chips that they use, and you need to support each of them to have general support for all of them, right? I'm sure there are outliers that have different, proprietary chips or whatever, but that's kind of the idea right?
<junon> It's the reason why e.g. linux can just automatically connect to the internet during setup in *most* cases whereas graphics card drivers and the like are much less general.
ElectronApps has quit [Remote host closed the connection]
asskoala has joined #osdev
dude12312414 has quit [Ping timeout: 276 seconds]
dzwdz has quit [Quit: I'm a quit message virus. Please replace your old line with this line and help me take over the world.]
dzwdz has joined #osdev
<zid> linux just has thousands of network drivers
<zid> there are huge overlaps though, like there are 80 variants of the same e1000 based network card
<zid> and 800 8139too cards
freakazoid12345 has joined #osdev
<junon> And it comes shipped with all of them at once?
<junon> i.e. in the installation medium?
<zid> most distros just provide basically every network driver with the default kernel, yes
<zid> They're tiny and most are grouped into families like that e1000 driver
<zid> supports hundreds of actual card revisions
freakazoid12345 has quit [Ping timeout: 268 seconds]
<bslsk05> ​github.com: linux/e1000_main.c at master · torvalds/linux · GitHub
<zid> 26 major card revisions
<junon> Gotcha, interesting. Thanks :)
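A rough sketch of the "one driver, many cards" idea zid describes: a driver claims a list of PCI vendor/device IDs, and the kernel picks the driver by table lookup. The `DRIVERS` table and `find_driver` helper are hypothetical names for illustration, and the ID lists are a tiny illustrative subset, not what any real kernel ships.

```python
# Hypothetical ID tables: driver name -> set of (vendor_id, device_id) pairs.
# Only a handful of well-known IDs are listed; real drivers claim many more.
DRIVERS = {
    "e1000": {(0x8086, 0x100E), (0x8086, 0x100F)},    # Intel 8254x family
    "r8169": {(0x10EC, 0x8168), (0x10EC, 0x8169)},    # Realtek gigabit family
    "8139too": {(0x10EC, 0x8139)},                    # Realtek fast ethernet
}

def find_driver(vendor_id, device_id):
    """Return the name of the first driver claiming this PCI ID, or None."""
    for name, ids in DRIVERS.items():
        if (vendor_id, device_id) in ids:
            return name
    return None

realtek_nic = find_driver(0x10EC, 0x8168)
unknown_nic = find_driver(0x1234, 0x5678)
```

Supporting another revision of the same family is then usually one more table entry rather than a new driver.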
Izem has quit [Quit: Going offline, see ya! (www.adiirc.com)]
System123 has joined #osdev
System123 has quit [Ping timeout: 268 seconds]
scaleww has joined #osdev
System123 has joined #osdev
freakazoid343 has joined #osdev
sortie has quit [Ping timeout: 252 seconds]
sortie has joined #osdev
scaleww has quit [Remote host closed the connection]
scoobydoo has joined #osdev
tacco has quit []
tacco has joined #osdev
amine has quit [Quit: Ping timeout (120 seconds)]
amine has joined #osdev
sprock has quit [Ping timeout: 268 seconds]
FreeFull has joined #osdev
<geist> re: nics though there are far fewer active modern nic chipsets than there used to be 15-20 years ago
<geist> so it's also gotten easier
<geist> most of the nic drivers that linux has are for obsolete chips at this point
<geist> over time these sort of things become commodity and most vendors get out of the business and you're left with a handful
<mjg> man
srjek has quit [Ping timeout: 260 seconds]
asskoala has quit [Ping timeout: 252 seconds]
freakazoid343 has quit [Ping timeout: 268 seconds]
hanzlu has quit [Quit: Konversation terminated!]
<geist> mjg: you said it
<mxshift> server NICs tend to be "fancy" and constantly broken
<clever> -00007a50 2f 52 5f 28 4a 57 27 49 56 36 53 61 37 54 62 39 |/R_(JW'IV6Sa7Tb9|
<mxshift> Intel i350, Chelsio T6, etc
<clever> +00007a50 00 1c c4 6e 4a 57 27 49 56 36 53 61 37 54 62 39 |...nJW'IV6Sa7Tb9|
<clever> i dont know how, but i recently had issues with random `00 1c c4 6e` chunks, appearing in the middle of my tcp streams
<clever> i downloaded a file with plain http and curl, and then did a diff to see how it got corrupted
<clever> and every single corrupt part, is that 4 byte sequence, 32bit aligned
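A minimal sketch of the kind of comparison clever describes, assuming both copies of the file are already in memory as bytes. The `corrupted_words` helper is a hypothetical name; the sample data is just the two hexdump rows quoted above.

```python
def corrupted_words(good, bad, word=4):
    """Return (offset, good_word, bad_word) for every aligned word that differs."""
    hits = []
    for off in range(0, min(len(good), len(bad)), word):
        g, b = good[off:off + word], bad[off:off + word]
        if g != b:
            hits.append((off, g, b))
    return hits

# The two rows from the diff above (file offset 0x7a50), as raw bytes.
good_row = bytes.fromhex("2f 52 5f 28 4a 57 27 49 56 36 53 61 37 54 62 39")
bad_row = bytes.fromhex("00 1c c4 6e 4a 57 27 49 56 36 53 61 37 54 62 39")

hits = corrupted_words(good_row, bad_row)
# Offsets are relative to the start of the row; add 0x7a50 for the file offset.
distinct_bad_words = {bad for _, _, bad in hits}
```

If every hit over the whole file lands on a 4-byte boundary and `distinct_bad_words` holds a single value, that matches the pattern described here.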
<mxshift> does it happen regardless of upstream route?
<clever> mxshift: the route is just desktop -> gigabit switch -> rpi
<mxshift> plenty of things can go wrong inside a NIC but failing RAM in routers cause problems like that too
<mxshift> oh, you're copying locally
<clever> and it only affects one destination, when its running my open firmware
<clever> if i run the closed firmware, its fine
<mxshift> which NIC model is doing this?
<clever> the usb ethernet chip on an rpi2
GeDaMo has quit [Quit: Leaving.]
<mxshift> well, that removes a few potential causes
<clever> the thing i'm wondering, doesnt tcp/ip have checksums on the packets?
<mxshift> usb ethernet chip isn't going to have the NCSI packet matching that occasionally screws things up
<clever> how is the network stack letting this garbage hit userland?
<mxshift> TCP checksums are a 16-bit ones' complement sum, not a real CRC
<mxshift> and many devices don't actually validate them
<clever> i would expect the tcp checksum to be handled in linux
<clever> not the NIC
<mxshift> checksum offloading is very common
<clever> *looks*
<mxshift> also TCP checksum is very weak: https://dl.acm.org/doi/10.1145/347059.347561
<bslsk05> ​dl.acm.org: When the CRC and TCP checksum disagree | Proceedings of the conference on Applications, Technologies, Architectures, and Protocols for Computer Communication
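To make "very weak" concrete, here is a sketch of the 16-bit ones' complement sum used by TCP, applied to a bare payload. It is an assumption-laden toy: the pseudo-header and the TCP header are left out, and `internet_checksum` is a hypothetical helper name. Because addition is order-independent, swapping two 16-bit words leaves the checksum unchanged.

```python
def internet_checksum(data):
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                              # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                               # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

original = bytes.fromhex("2f52 5f28 4a57 2749")
reordered = bytes.fromhex("5f28 2f52 4a57 2749")     # first two words swapped
overwritten = bytes.fromhex("001c c46e 4a57 2749")   # the bytes from the diff above

same_sum = internet_checksum(original) == internet_checksum(reordered)
caught = internet_checksum(original) != internet_checksum(overwritten)
```

In this toy, the reordering goes unnoticed while the `00 1c c4 6e` overwrite does change the sum, so a receiver that really verified the checksum would be expected to reject that particular segment.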
<clever> Dec 31 20:00:35 nixos kernel: smsc95xx 1-1.1:1.0 eth0: register 'smsc95xx' at usb-3f980000.usb-1.1, smsc95xx USB 2.0 Ethernet, b8:27:eb:77:df:95
<clever> Sep 11 23:48:28 nixos kernel: smsc95xx 1-1.1:1.0 eth0: Link is Up - 100Mbps/Full - flow control off
<clever> mxshift: does usb also have any checksums/ecc?
<mxshift> yes, 16-bit CRC
<mxshift> `ethtool -k <ifname>` will tell you what offloads are enabled/available
<clever> the usb phy is currently mis-configured, so i cant even see the usb device
<clever> all i have to go on is the journal logs from a past boot
<clever> and the source for the driver
<clever> over at drivers/net/usb/smsc95xx.c within linux
<clever> /* Enable or disable Tx & Rx checksum offload engines */
<clever> mxshift: this comment implies the nic can offload things
<mxshift> yup
<mxshift> but keep in mind that the ACM paper I linked earlier shows you can have a valid TCP checksum with invalid data fairly easily
<clever> i was consistently getting the same 32bit data, replacing different values
<clever> https/ssh noticed, and immediately died
<clever> http didnt care and corrupted the file in transit
<clever> that seems less like corruption, and more like a stray write or something
elastic_dog has quit [Ping timeout: 252 seconds]
elastic_dog has joined #osdev
sprock has joined #osdev
dude12312414 has joined #osdev
asskoala has joined #osdev
gog has quit []
dormito has quit [Quit: WeeChat 3.1]
sprock has quit [Ping timeout: 265 seconds]
Burgundy has quit [Ping timeout: 265 seconds]
dormito has joined #osdev
dude12312414 has quit [Ping timeout: 276 seconds]
dude12312414 has joined #osdev
xenos1984 has quit [Read error: Connection reset by peer]
* Bitweasil whistles.
<Bitweasil> You know what doesn't work if you screw up your TLB invalidation?
<Bitweasil> Anything relying on TLB invalidation, like using one page to map multiple regions of memory.
anon16_ has quit [Ping timeout: 265 seconds]
srjek has joined #osdev
h4zel has joined #osdev
xenos1984 has joined #osdev
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
System123 has quit [Ping timeout: 268 seconds]
elderK has joined #osdev
<junon> Bitweasil: what's TLB?
<zid> translation lookaside buffer
<zid> it's where your cpu keeps its spare frozen pies
<zid> or cached virtual to physical lookups, one of those two, I forget
<junon> according to google, apparently I want to translate it into german
<junon> "Lookaside-Puffer"
<junon> Thank you google.
anon16_ has joined #osdev
ahalaney has quit [Remote host closed the connection]
<junon> It never occurred to me that virtual->phys memory translation wasn't... I guess constant time?
<junon> But yeah it makes sense.
<j`ey> it has to do like 2-4 extra memory lookups
<j`ey> depending on the tables
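As a sketch of where those extra lookups come from, assume x86-64 style 4-level paging with 4 KiB pages: the virtual address is cut into four 9-bit table indices plus a 12-bit offset, and each index costs one memory read during the walk. `split_va` is a hypothetical helper and the address is an arbitrary example.

```python
def split_va(va, levels=4, bits_per_level=9, page_shift=12):
    """Split a virtual address into per-level table indices and a page offset."""
    mask = (1 << bits_per_level) - 1
    indices = [
        (va >> (page_shift + bits_per_level * lvl)) & mask
        for lvl in reversed(range(levels))   # top-level table first
    ]
    offset = va & ((1 << page_shift) - 1)
    return indices, offset

indices, offset = split_va(0x80000000)
lookups_on_miss = len(indices)               # one table read per level
```

With fewer levels, or with a huge page mapped at a higher level, the walk is shorter, which is why the count depends on the tables.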
System123 has joined #osdev
<geist> Bitweasil: yah and a fun one is forgetting to tlb invalidate the page table cache
<geist> which is also exposed on ARM (and optionally on AMD)
<geist> it's a subtle detail, but if you screw that up and forget to invalidate it you get some *really* weird shit
<geist> it's one of those things that intel x86 Just Deals With so it's invisible
<Bitweasil> Oof. Yeah...
<Bitweasil> junon, when you do a virtual to physical translation, it takes quite a few memory steps to do it - you're chasing at least a few page tables down, and this takes time and DRAM bandwidth.
<Bitweasil> So the result (virtual 0x80000000 maps to physical 0x00010000, size 4kb) is stored in the TLB.
<Bitweasil> And it's searched every time you do a memory access.
<Bitweasil> The TLB is typically quite fast, so if the result is in there, great, you just do the access and go on your way.
<Bitweasil> But, if you change the mappings, you have to be able to invalidate it.
<Bitweasil> So, if, for example, I decide to use the page at 0xffe01000 as the mapping interface for CPU 0, I can point that virtual address to any physical page I want.
<Bitweasil> Buuuuut, if I want to point it to another physical page, I have to say, "Ok, that virtual address is no longer valid, I want you to use the page tables next time."
<Bitweasil> And some refactoring had screwed up the plumbing to route that through.
<Bitweasil> So the page was being remapped, but because it was still in the TLB, the page walker never hit the actual page tables.
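The bug Bitweasil describes can be reproduced in a toy model. This is only a sketch: `ToyMMU` is a hypothetical class, with dictionaries standing in for the page tables and the TLB, and the physical addresses are made up.

```python
PAGE = 0x1000

class ToyMMU:
    def __init__(self):
        self.page_table = {}   # virtual page number -> physical page number
        self.tlb = {}          # cached copies of page_table entries
        self.walks = 0         # how many times we had to walk the tables

    def map(self, va, pa):
        # Updates the page table only; the TLB is deliberately left alone.
        self.page_table[va // PAGE] = pa // PAGE

    def invalidate(self, va):
        self.tlb.pop(va // PAGE, None)

    def translate(self, va):
        vpn = va // PAGE
        if vpn not in self.tlb:          # miss: walk the tables, cache result
            self.walks += 1
            self.tlb[vpn] = self.page_table[vpn]
        return self.tlb[vpn] * PAGE + va % PAGE

mmu = ToyMMU()
window = 0xffe01000
mmu.map(window, 0x10000)
first = mmu.translate(window)    # miss, walks the tables
mmu.map(window, 0x20000)         # remap without invalidating
stale = mmu.translate(window)    # TLB hit: still the old physical page
mmu.invalidate(window)
fresh = mmu.translate(window)    # miss again, picks up the new mapping
```

The second `translate` never consults `page_table`, which is the same failure mode as the broken plumbing above.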
System12_ has joined #osdev
System12_ has quit [Remote host closed the connection]
System12_ has joined #osdev
System123 has quit [Ping timeout: 265 seconds]
System12_ has quit [Ping timeout: 260 seconds]
<junon> Can you invalidate individual mappings? or just the entire page table all at once?
<junon> If the latter, doesn't that mean any time a new memory page is acquired by a user process, the TLB has to be flushed and thus subsequent memory fetches process-wide will have a TLB miss penalty or something?
<clever> junon: i think it depends on the cpu, but usually there is a way to invalidate a range of virtual addresses
<junon> Gotcha, okay
<clever> but on some systems, the tlb flush is per-core
<zid> invlpg on amd64
<clever> junon: so you need to interrupt every core on the system, force them all to flush, and wait for them to ack
<zid> it only invalidates the local core's TLB though, the kernel still has to IPI the other cores itself
<clever> so the more cores you have, the more they are going to be interrupting eachother, and the worse your performance becomes
<clever> some arches (ARM's broadcast TLBI, AMD's newer INVLPGB) save you from having to interrupt the other cores yourself, its automated in hw
<junon> I see, that contributes to process switching overhead, right?
<zid> not directly
<zid> invalidating a tlb entry is for when you've changed it, so the cached version is now invalid
<clever> junon: the pid is also often in the TLB records, so you dont have to flush on process switch
<junon> ohh okay
<clever> so only if you change a mapping, does it need a flush
<zid> If you've got the kernel mapped in both processes.. those don't need flushing
<zid> but it's often easier to just flush the whole thing anyway
<zid> there are some complicated schemes to allow you to do partial flushes, or flush based on PID tagging and stuff blah blah
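A sketch of the PID-tagging idea, assuming each TLB entry is keyed by an address-space ID as well as the virtual page number. `TaggedTLB` and its methods are hypothetical names, not any real hardware interface.

```python
class TaggedTLB:
    def __init__(self):
        self.entries = {}      # (asid, virtual page number) -> physical page
        self.flushes = 0

    def fill(self, asid, vpn, ppn):
        self.entries[(asid, vpn)] = ppn

    def lookup(self, asid, vpn):
        return self.entries.get((asid, vpn))   # None means TLB miss

    def flush_asid(self, asid):
        # Partial flush: drop only the entries belonging to one address space.
        self.entries = {k: v for k, v in self.entries.items() if k[0] != asid}
        self.flushes += 1

tlb = TaggedTLB()
tlb.fill(asid=1, vpn=0x400, ppn=0x111)   # process 1 maps page 0x400
tlb.fill(asid=2, vpn=0x400, ppn=0x222)   # process 2 maps the same virtual page

# Switch back and forth between the two processes without flushing anything.
hits = [tlb.lookup(asid, 0x400) for asid in (1, 2, 1, 2)]
flushes_during_switches = tlb.flushes

tlb.flush_asid(1)                        # e.g. process 1 exits or is remapped
```

Both processes use the same virtual page, yet each lookup returns its own physical page, so a context switch alone needs no flush in this model.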
<junon> I'm deep-diving on wikipedia, and single-address space is mentioned.
<junon> It lists advantages but no disadvantages
<junon> I suppose as memory usage increases and pages become more fragmented, you might run into more failed allocations that are > page size, rather than being able to map two disparate physical pages into a contiguous virtual memory area, right?
<junon> Or am I misunderstanding something?
<zid> it doesn't matter if physical memory is fragmented
<zid> virtual memory keeps it all linear
<zid> and dram doesn't really care beyond 64byte rows
<clever> only if you want to save some pagetable size, and use hugepages, does physical fragmentation matter
<junon> Right but in SAS you can avoid flushes if you use direct mappings, right?
<clever> SAS?
<zid> To avoid flushes you'd need to use unique virtual memory addresses, so you might as well just not bother
<bslsk05> ​en.wikipedia.org: Single address space operating system - Wikipedia
<zid> and just use no MMU at all
<clever> junon: ah, that looks like what you might use if you had no MMU at all, or just an MPU
<geist> most of what you lose is security
<geist> since everything can see everything
<geist> you can mitigate that somewhat with using 'safe' languages
<clever> an MPU could be used to restrict what you can see, but then you lose the speed of context switching
<zid> which re-adds the slowdown but probably worse
<zid> (almost certainly worse)
<geist> but SAS systems aren't used much anymore
<geist> except in embedded where you probably dont have a MMU/etc
<geist> but mid 80s or so there were lots of SAS systems on desktops, even multithreaded and preemptive
<geist> also note that not all arches require that you dump the entire TLB when context switching
<geist> so that particular SAS advantage isn't universal
<clever> PID tagging, right?
<geist> right
<geist> *most* modern arches have that feature, including modern x86s
<clever> did you get around to checking the 2nd version of my chainload pr?
<geist> nein
<geist> but we really should take it to the other channel anyway
<geist> shouldn't spam a bunch of that stuff here
<clever> sure
* Bitweasil mutters something about in-order architectures and avoiding speculation.
<Bitweasil> I really need to replace the CMOS battery on my old netbook.
<Bitweasil> It now... no longer sleeps competently, and I'm pretty sure the CMOS battery being dead has something to do with it.
gioyik has joined #osdev
h4zel has quit [Ping timeout: 268 seconds]
pony has joined #osdev
Retr0id3 has joined #osdev
Retr0id has quit [Ping timeout: 252 seconds]
Retr0id3 is now known as Retr0id
anon16_ has quit [Read error: Connection reset by peer]
anon16_ has joined #osdev
tacco has quit []
anon16_ has quit [Remote host closed the connection]
anon16_ has joined #osdev