ChanServ changed the topic of #armlinux to: ARM kernel talk [Upstream kernel, find your vendor forums for questions about their kernels] | https://libera.irclog.whitequark.org/armlinux
Grimler has joined #armlinux
djrscally has quit [Ping timeout: 240 seconds]
prabhakarlad has quit [Ping timeout: 256 seconds]
mraynal has quit [Quit: WeeChat 3.0]
XV9 has quit [Quit: Textual IRC Client: www.textualapp.com]
Pali has quit [Ping timeout: 240 seconds]
Nact has joined #armlinux
amitk has joined #armlinux
macromorgan has joined #armlinux
Nact has quit [Read error: Connection reset by peer]
cbeznea has joined #armlinux
macromorgan has quit [Read error: Connection reset by peer]
macromorgan has joined #armlinux
milkylainen has quit [Ping timeout: 240 seconds]
iivanov has joined #armlinux
djrscally has joined #armlinux
mraynal has joined #armlinux
frieder has joined #armlinux
frieder has quit [Remote host closed the connection]
frieder has joined #armlinux
nsaenz has joined #armlinux
sszy has joined #armlinux
matthias_bgg has joined #armlinux
milkylainen has joined #armlinux
headless has joined #armlinux
alpernebbi has quit [Ping timeout: 272 seconds]
alexels has joined #armlinux
alpernebbi has joined #armlinux
prabhakarlad has joined #armlinux
<geertu> mraynal: But the actual DMA controller register block is not part of the System Controller register block, right?
Pali has joined #armlinux
macromorgan_ has joined #armlinux
macromorgan_ is now known as macromorgan
macromorgan has quit [Killed (copper.libera.chat (Nickname regained by services))]
<geertu> mraynal: So it's about the CFG_DMAMUX register
<geertu> The only user for that register is indeed the DMAC driver.
<geertu> Just export a function to access that register? We have similar things in e.g. include/linux/soc/renesas/rcar-rst.h
jlinton has quit [Ping timeout: 256 seconds]
<mraynal> geertu: yes, the DMAMUX register is the one
<mraynal> geertu: I was about to add a syscon compatible; is there a reason this wasn't done in the first place?
<geertu> mraynal: Personally, I don't like syscon compatibles ;-)
<mraynal> geertu: why? :)
<geertu> mraynal: It allows more access than you may want to export
<mraynal> that's true
<geertu> In addition, what if the RZ/N1 DMAC block is reused on a new SoC, where the DMAMUX register is located at a different offset in the SYSC block?
<mraynal> well in this case instead of having the provider knowing where the register is, it is the caller's responsibility I would say
<mraynal> I mean, syscon or not you'll have to handle the move either way
<mripard> mraynal: from an abstraction pov, it shouldn't really be the consumer that has the knowledge of how the provider is laid out though
<mraynal> mripard: o/
<mraynal> mripard: agreed, but here we're talking about a DMA controller which tries to access one of 'its' registers that is located in a syscon, maybe the word 'consumer' above was a bit too much
<mraynal> it's quite common to have platform data structures defining how the registers are laid out depending eg. on the compatible
<mraynal> having this logic in the clock driver does not make more sense to me
<geertu> mraynal: the clock driver also does power management
<mripard> I mean, I don't really know your use case, so maybe it does make sense, but in general syscon is mostly used to punch a hole into all the nice abstractions we have
<mripard> and it's sad, really :)
<geertu> mripard: exactly (about the punching)
ravan has quit [Ping timeout: 272 seconds]
prabhakarlad has quit [Quit: Client closed]
headless has quit [Quit: Konversation terminated!]
Amit_T has joined #armlinux
matthias_bgg has quit [Ping timeout: 272 seconds]
Misotauros has quit [Ping timeout: 252 seconds]
monstr has joined #armlinux
headless has joined #armlinux
headless has quit [Quit: Konversation terminated!]
torez has joined #armlinux
elastic_dog has quit [Ping timeout: 240 seconds]
jlinton has joined #armlinux
matthias_bgg has joined #armlinux
jlinton has quit [Ping timeout: 256 seconds]
<ajb-lina-> I'm trying to track down why some QEMU test cases are so slow and I can track it to some guest pc's triggering invalidation of PCs - but the addresses don't show up in kallsyms
<ajb-lina-> Could this be bits of the EFI code?
<ajb-lina-> head /proc/kallsyms
<ajb-lina-> ffff800010000000 T _text
<ajb-lina-> tail -n 1 /proc/kallsyms
<ajb-lina-> ffff800008b00000 T loop_register_transfer [loop]
<ajb-lina-> pc in question 0xffff800011032b5c
<ajb-lina-> hmm kallsyms isn't sorted?
* ajb-lina- suspects EFI runtime mappings
<ardb> ajb-lina-: EFI runtime mappings are in the user portion of the VA space
<ardb> so anything with 0xffff in the top u16 is definitely not EFI runtime stuff
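The rule of thumb above (0xffff in the top u16 means kernel half, so not an EFI runtime mapping) can be sketched in a few lines of Python. This is only an illustration: `top_u16` and `in_kernel_half` are hypothetical helper names, and the user-space sample address is made up for the example.

```python
def top_u16(address):
    """Return the top 16 bits of a 64-bit virtual address."""
    return (address >> 48) & 0xFFFF


def in_kernel_half(address):
    """True when the top u16 is 0xffff, i.e. the address is in the kernel half."""
    return top_u16(address) == 0xFFFF


hot_pc = 0xFFFF800011032B5C   # the pc quoted in this log
user_va = 0x0000AAAAB0000000  # made-up user-space address, for contrast

print(hex(top_u16(hot_pc)), in_kernel_half(hot_pc))
print(hex(top_u16(user_va)), in_kernel_half(user_va))
```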
<ajb-lina-> ardb: aside from eBPF does the kernel generate any code in its VA space? Or could it be a module?
<ajb-lina-> modules seem to go from 0xffff800008b00000 - 0xffff800008e60000 if /proc/modules is to be believed
<ardb> ajb-lina-: it depends on which kernel version
<ajb-lina-> currently testing with Alpine's 5.15.4-0-lts
<ajb-lina-> gdbserver can see the code and put breakpoints in it - but obviously without decent debug symbols it's hard to figure out what's going on
cengiz_io has quit [Ping timeout: 245 seconds]
cengiz_io has joined #armlinux
<ardb> ajb-lina-: check /proc/vmallocinfo?
<ardb> or /sys/kernel/debug/kernel_page_tables if it exists
<ajb-lina-> ardb: vmallocinfo only goes up to 0x00000000ff6646f3
<ardb> ajb-lina-: those addresses are scrambled i think
<ardb> echo 1 > /proc/sys/kernel/kptr_restrict
<ajb-lina-> 0xffff800000000000 - 0xffff800020000000/0xfffffbffeffc6000
<ajb-lina-> might be in there
<ajb-lina-> localhost:/# cat /proc/vmallocinfo | grep 0xffff8000110
<ajb-lina-> 0xffff800010f70000-0xffff800011060000 983040 paging_init+0x29c/0x944 phys=0x0000000068970000 vmap
<ajb-lina-> 0xffff800011060000-0xffff8000114a0000 4456448 paging_init+0x308/0x944 phys=0x0000000068a60000 vmap
<ardb> ok so it is inside the core kernel
<ajb-lina-> ardb: I mean I figure /proc/kallsyms would be contiguous even if it didn't have all symbols in it.. it's just symbols that are exported for modules right?
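Finding which /proc/vmallocinfo range covers a given pc, as done by eye above, could be scripted roughly as follows. This is a sketch that assumes the `start-end size caller ...` line layout shown in the paste; `parse_vmallocinfo` and `region_for` are hypothetical names, and the sample text is the two lines quoted in this log.

```python
def parse_vmallocinfo(text):
    """Yield (start, end, details) for each 'start-end size caller ...' line."""
    for line in text.splitlines():
        fields = line.split(None, 1)
        if not fields or "-" not in fields[0]:
            continue  # skip blank or unexpected lines
        start_text, end_text = fields[0].split("-", 1)
        details = fields[1] if len(fields) > 1 else ""
        # int(..., 16) accepts the leading 0x used by vmallocinfo
        yield int(start_text, 16), int(end_text, 16), details


def region_for(text, address):
    """Return the (start, end, details) range containing address, or None."""
    for start, end, details in parse_vmallocinfo(text):
        if start <= address < end:
            return start, end, details
    return None


SAMPLE = """\
0xffff800010f70000-0xffff800011060000 983040 paging_init+0x29c/0x944 phys=0x0000000068970000 vmap
0xffff800011060000-0xffff8000114a0000 4456448 paging_init+0x308/0x944 phys=0x0000000068a60000 vmap
"""

start, end, details = region_for(SAMPLE, 0xFFFF800011032B5C)
print(f"{start:#x}-{end:#x} {details}")
```

With the real file, the text would come from reading /proc/vmallocinfo as root after setting kptr_restrict so the addresses are not scrambled.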
<ardb> ajb-lina-: no it has everything
<ardb> except for stuff that gets inlined etc of course
<ajb-lina-> ardb: so I'm confused - so vmallocinfo shows the memory is allocated for the kernel but it doesn't contain code from the vmlinux?
<ardb> it does
<ardb> maybe try nokaslr?
<ajb-lina-> ardb: I'll see if I can interrupt grub
<ajb-lina-> hot pc is now 0xffff80001057bf10
<ajb-lina-> 0xffff800010010000-0xffff800010b00000 11468800 paging_init+0x208/0x944 phys=0x0000000067a10000 vmap
jlinton has joined #armlinux
<ajb-lina-> cat /proc/kallsyms | grep 0xffff800010 - nothing
<ajb-lina-> cat /proc/modules | grep 0xffff800010 - nothing
<ardb> kallsyms doesn't have the leading 0x IIRC
<ajb-lina-> doh!
<ajb-lina-> cat /proc/kallsyms | grep ffff80001057b
<ajb-lina-> ffff80001057bd80 T __arch_copy_to_user
<ajb-lina-> ffff80001057bfa0 T csum_ipv6_magic
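Grepping by address prefix works, but the containing symbol is simply the one with the highest address that is not above the pc. A minimal Python sketch of that lookup, assuming kallsyms-style lines of `address type name [module]` with no leading 0x; `parse_kallsyms` and `symbol_for` are hypothetical helper names, and the sample lines are ones quoted in this log. It sorts the entries first because the file is not necessarily in address order, and since kallsyms carries no symbol sizes, the result is only "nearest symbol below".

```python
import bisect


def parse_kallsyms(text):
    """Return a list of (address, name) tuples sorted by address."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(fields[0], 16), fields[2]))
    symbols.sort()
    return symbols


def symbol_for(symbols, pc):
    """Return (name, offset) for the last symbol at or below pc, or None."""
    addresses = [address for address, _ in symbols]
    index = bisect.bisect_right(addresses, pc) - 1
    if index < 0:
        return None
    address, name = symbols[index]
    return name, pc - address


# Deliberately unsorted, like the head/tail output earlier in the log.
SAMPLE = """\
ffff80001057bfa0 T csum_ipv6_magic
ffff80001057bd80 T __arch_copy_to_user
ffff800008b00000 T loop_register_transfer [loop]
"""

symbols = parse_kallsyms(SAMPLE)
name, offset = symbol_for(symbols, 0xFFFF80001057BF10)
print(f"{name}+{offset:#x}")  # prints __arch_copy_to_user+0x190
```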
prabhakarlad has joined #armlinux
<ajb-lina-> well I guess __arch_copy_to_user - but I would expect that usually to be scribbling over code pages?
<ardb> ?
<ardb> why would it be doing that?
Misotauros has joined #armlinux
<ajb-lina-> ardb: the slowdown is because QEMU's SMC detection is triggering, causing it to flush translations (a lot)
<ajb-lina-> TB invalidate count 550447
<ardb> maybe some spectre/meltdown mitigation triggering?
<ajb-lina-> compared to maybe 7000 on my debian bullseye test image
<ardb> ajb-lina-: i'd assume you can trace the source of the SMC call no?
<ajb-lina-> ardb: that was an early theory - I enabled KAISER in my test kernel but couldn't replicate those high numbers
<ajb-lina-> http://ix.io/3PLU
<ajb-lina-> ardb: top 5 ^ 0xffff80001057bf10 is by far the highest offender
* ardb has no clue what he is looking at
<ardb> ajb-lina-: SMC 'detection' sounds like it is based on some heuristic but I assume QEMU can decide whether an SMC is issued pretty definitively, no?
<ajb-lina-> ardb: yes - if we generate code in a particular page we mark it as such to trigger QEMU's slow path if the page is ever written to
alexels has quit [Quit: WeeChat 3.4]
<ajb-lina-> (notdirty_write and !cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE) in QEMU's cputlb code)
<ardb> and how is this related to SMC?
alexels has joined #armlinux
<ajb-lina-> ardb: I guess not SMC in the classic sense - but potentially invalidating existing translations because you've changed code in the page
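The mechanism described here (pages holding translated code are marked, and any later write to such a page takes a slow path that drops the translations) can be illustrated with a toy model. This is not QEMU's implementation or its data structures: `ToyTranslationCache`, its methods, and the 4 KiB page size are all assumptions made for the sketch.

```python
PAGE_SIZE = 4096  # assumed page size for the toy model


class ToyTranslationCache:
    """Toy model: remember which pages hold translated blocks (TBs)."""

    def __init__(self):
        self.tbs_by_page = {}      # page number -> set of TB start addresses
        self.invalidate_count = 0  # running total of dropped TBs

    def translate(self, phys_addr):
        """Record that a TB starting at phys_addr was generated."""
        page = phys_addr // PAGE_SIZE
        self.tbs_by_page.setdefault(page, set()).add(phys_addr)

    def write(self, phys_addr):
        """Model a guest store; return how many TBs it invalidated."""
        page = phys_addr // PAGE_SIZE
        tbs = self.tbs_by_page.pop(page, None)
        if tbs is None:
            return 0  # fast path: no translated code in this page
        self.invalidate_count += len(tbs)
        return len(tbs)


cache = ToyTranslationCache()
cache.translate(0x1000)
cache.translate(0x1010)
print(cache.write(0x2000))  # page without code: nothing to drop
print(cache.write(0x1FF0))  # same page as both TBs: both dropped
print(cache.write(0x1FF0))  # page is clean again until retranslated
```

In this model a write only costs anything when the target page currently holds translated code, which is why data written next to frequently executed code shows up as a steady stream of invalidations.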
<ajb-lina-> ardb: we don't usually have executable code in the stack frame right?
<ardb> ajb-lina-: not sure what you mean by 'stack frame'
<ardb> but we never execute code from the stack
<ajb-lina-> ardb: or the heap?
<ardb> kernel code is rarely modified
<ajb-lina-> ardb: it's probably not kernel space being changed but userspace pages containing code
<ardb> ajb-lina-: heap is a bit vague, but generally, only code pages are executable, and those all live in the vmalloc area
<ajb-lina-> interestingly most of the invalidations from __arch_copy_to_user occur during boot up
jlinton has quit [Quit: Client closed]
<ajb-lina-> the current address triggering tb invalidations is
<ajb-lina-> localhost:~# cat /proc/kallsyms | grep ffff8000102927
<ajb-lina-> ffff8000102927e4 T strncpy_from_kernel_nofault
<ajb-lina-> I guess I should tweak my QEMU trace to show the affected page address
<ajb-lina-> tb_invalidate_phys_page_fast page:0x27d32fbc/4 pc:0xffff800010292778
<ajb-lina-> tb_invalidate_phys_page_fast page:0x27d333d8/4 pc:0xffff800010292778
<ajb-lina-> I think those are physical addresses
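Ranking the offending pcs from trace output like the lines above can be sketched in Python. The regular expression assumes exactly the `tb_invalidate_phys_page_fast page:<addr>/<len> pc:<addr>` layout pasted in this log, which comes from a locally tweaked trace; `count_by_pc` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the trace layout pasted in the log; other QEMU builds may differ.
TRACE_RE = re.compile(
    r"tb_invalidate_phys_page_fast page:(0x[0-9a-fA-F]+)/\d+ pc:(0x[0-9a-fA-F]+)"
)


def count_by_pc(lines):
    """Count invalidation events per guest pc."""
    counts = Counter()
    for line in lines:
        match = TRACE_RE.search(line)
        if match:
            counts[int(match.group(2), 16)] += 1
    return counts


SAMPLE = [
    "tb_invalidate_phys_page_fast page:0x27d32fbc/4 pc:0xffff800010292778",
    "tb_invalidate_phys_page_fast page:0x27d333d8/4 pc:0xffff800010292778",
    "unrelated line",
]

for pc, hits in count_by_pc(SAMPLE).most_common(5):
    print(f"{hits:8d} {pc:#x}")
```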
headless has joined #armlinux
sudeepholla has quit [Ping timeout: 250 seconds]
<ajb-lina-> ardb: is there any way to figure out what virtual userspace addresses will be using those physical pages?
<ardb> but the flushing is related to code translations, right?
<ardb> and the hotspot is in the kernel code?
<ajb-lina-> heh a single cat /proc/self/maps triggers about 40 pages
<ajb-lina-> ardb: the kernel pc is what triggered the flush
<ajb-lina-> (well close - it's actually the pc of the start of the tb that triggered the flush)
<ajb-lina-> it seems to repeat several times
<ajb-lina-> http://ix.io/3PM9
<ardb> ajb-lina-: so *why* does it get invalidated each time? can you log that as well?
sudeepholla has joined #armlinux
<ajb-lina-> ardb: sure - I shall add some tracepoints to QEMU - it is possible we are triggering a QEMU bug
frieder has quit [Remote host closed the connection]
<ajb-lina-> http://ix.io/3PMp
alpernebbi has quit [Ping timeout: 240 seconds]
<ajb-lina-> ardb: ^ curious - this tells me 2 things, a) sometimes there are no TBs to invalidate so we must have missed clearing a flag somewhere and b) when we do invalidate a TB it's for a kernel routine
<ajb-lina-> http://ix.io/3PMr
<ajb-lina-> and even stranger the code being invalidated is for kmem_cache_alloc
alpernebbi has joined #armlinux
<ajb-lina-> there is certainly a bug (or two) here. I just don't know if it's all QEMU or something has gone very wrong with the kernel
Misotauros has quit [Ping timeout: 256 seconds]
jlinton has joined #armlinux
<ardb> ajb-lina-: without knowing why QEMU decides to perform invalidation, it is hard to reason about that
<mrutland> ajb-lina-: are those logs for the attempt to execute or the attempt to write?
<mrutland> because if it's happening at a deterministic addr, if you could find the write, it would be the smoking gun
<mrutland> Generally I'd be surprised if we were writing to kernel text mappings since our Stage-1 maps those read-only anyhow
<mrutland> ... and so those writes should be limited to boot-time patching / alternatives, static_keys, and kprobes
<mrutland> are we perhaps patching a boot-time alternative once, but qemu forgets it has done the invalidate, and so *every* subsequent attempt to execute that page results in an invalidate?
<mrutland> ... that could explain why __arch_copy_to_user was triggering this, since we boot-time patch that depending on PAN, etc
<mrutland> ... or at least we used to in older kernels, and I assume this is an older kernel since you said it's a test case
alpernebbi has quit [Ping timeout: 256 seconds]
<ajb-lina-> mrutland: attempts to write (triggering QEMU's invalidation of those TBs in the page)
alexels has quit [Quit: WeeChat 2.8]
<ajb-lina-> actually it's not strncpy, it's copy_to_kernel_nofault
<ajb-lina-> starts at ffff8000102926c4
<ardb> ajb-lina-: but it is not an attempt to write to that code address, right?
<ardb> i.e., the hot code path is not being modified, it is being executed
<ajb-lina-> 0xffff800010292780 (copy_to_kernel_nofault) is the PC of the code triggering the invalidation, occasionally the tb that gets invalidated starts at 0xffff80001033184c (kmem_cache_alloc)
<ajb-lina-> it makes no sense
<ajb-lina-> although I put a breakpoint at copy_to_kernel_nofault and
<ajb-lina-> http://ix.io/3PMB
<mrutland> what's the insn at 0xffff800010292780 ?
<ajb-lina-> 0xffff800010292780: str w1, [x2]
<mrutland> can you step to there *then* print w1 and x2? they get altered between the bit in the paste and 0xffff800010292780
<ajb-lina-> so single stepping while watching my trace point
<ajb-lina-> => 0xffff800010292780: str w1, [x2]
<ajb-lina-> (gdb) p/x $x2
<ajb-lina-> $4 = 0xfffffbfffdbfe148
<ajb-lina-> triggered
<ajb-lina-> tb_invalidate_phys_page_fast page:0x27d31148/4 pc:0xffff800010292780
<ajb-lina-> let me find one that actually invalidates a TB
<mrutland> actually, can you bt here?
<ajb-lina-> #0 0xffff800010292788 in ?? ()
<ajb-lina-> #1 0xffff800011661000 in ?? ()
<ajb-lina-> not much really - no debug symbols or fp
<mrutland> (to see why are we doing a copy_to_kernel_nofault)
<mrutland> Generally, copy_to_kernel_nofault implies we're patching code, which sort-of implies the right thing is happening here, but I don't know what's going on at a high-level that causes us to call copy_to_kernel_nofault
<ajb-lina-> mrutland: ahh ok - let me dig further but I've just been called to dinner
<ajb-lina-> mrutland: bbs
<mrutland> I suspect this is kprobes, which is self-modifying-code
alpernebbi has joined #armlinux
mort has quit [Quit: The Lounge - https://thelounge.chat]
mort has joined #armlinux
Misotauros has joined #armlinux
sszy has quit [Ping timeout: 240 seconds]
<ajb-lina-> mrutland: why would that fire every time though?
<ajb-lina-> mrutland: but yes stepping through the ret gets to
<ajb-lina-> 0xffff800010ae4d20
<ajb-lina-> localhost:~# cat /proc/kallsyms | grep ffff800010ae4d
<ajb-lina-> ffff800010ae4df0 t aarch64_insn_patch_text_cb
<ajb-lina-> actually probably
<ajb-lina-> ffff800010ae4c30 t __aarch64_insn_write
<ajb-lina-> via arch_jump_label_transform
<ajb-lina-> hand decoding bt from $x30 values is tiresome
monstr has quit [Remote host closed the connection]
<ajb-lina-> could this be kasan?
<ardb> ajb-lina-: kasan does not rely on code patching, so that seems unlikely
<ajb-lina-> but patching the kernel for every executable run seems odd
<ardb> ajb-lina-: yes that is unexpected
<ardb> ajb-lina-: can you rebuild that kernel from source so you have a vmlinux to give to gdb?
<ardb> ah hold on
<ardb> "via arch_jump_label_transform"
<ajb-lina-> ardb: sadly my hand built kernels don't exhibit the same wild TB invalidation as the distro ones.. I even built the alpine kernel and ran directly and it didn't
<ardb> so this is a static key being toggled
<ardb> maybe this is alpine value add?
<ajb-lina-> it seems a fairly lightly patched kernel
<ajb-lina-> ardb: so what are static keys for, is this runtime-guided perf tweaking or something?
<ardb> ajb-lina-: global boolean variables that are modified so rarely [typically] that it pays to patch the code directly instead of using if/else
<ajb-lina-> ardb: is there a kernel flag that turns this on/off?
<ardb> CONFIG_JUMP_LABEL
<ardb> but for obvious reasons, this is a compile time option only :-)
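As a loose analogy for the static key idea described above, here is a Python sketch in which the "branch" is a callable that gets rebound when the key is toggled, so the hot path never tests a flag. The kernel does this by patching instructions in place (the arch_jump_label_transform path seen in this log), not by rebinding anything; `ToyStaticKey` and its methods are hypothetical names for illustration only.

```python
class ToyStaticKey:
    """Analogy only: swap the callable instead of testing a boolean each call."""

    def __init__(self, when_true, when_false, enabled=False):
        self._when_true = when_true
        self._when_false = when_false
        self.patch_count = 0  # how many times the "code" was rewritten
        self.branch = when_true if enabled else when_false

    def enable(self):
        self._patch(self._when_true)

    def disable(self):
        self._patch(self._when_false)

    def _patch(self, target):
        # Only an actual state change counts as a patch.
        if self.branch is not target:
            self.branch = target
            self.patch_count += 1


key = ToyStaticKey(when_true=lambda: "slow path", when_false=lambda: "fast path")
print(key.branch())     # hot path: a direct call, no condition evaluated
key.enable()
print(key.branch())
key.disable()
print(key.patch_count)  # each toggle rewrote the call target once
```

The trade-off shown is the one described in the chat: calls are cheap, toggles are expensive, so the scheme only pays off when the key changes rarely. Under an emulator that tracks writes to code pages, every toggle is also a write to translated code.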
<ajb-lina-> ardb: hmm odd I have CONFIG_JUMP_LABEL=y in my "good" kernel - so maybe this is a subtle interaction with something else?
<ardb> ajb-lina-: jump labels are always enabled
<ardb> the question is which code that uses a jump label gets invoked here, and not on the "good" kernel
<ardb> hence the question regarding alpine value add
* ajb-lina- continues working up the call chain
<ajb-lina-> __jump_label_update
<ajb-lina-> jump_label_update
<ajb-lina-> static_key_disable_cpuslocked
<ajb-lina-> or
<ajb-lina-> static_key_enable_cpuslocked
<ajb-lina-> toggle_allocation_gate
* ajb-lina- builds a kernel with CONFIG_KFENCE_STATIC_KEYS to check
<ajb-lina-> ok well kfence.sample_interval=0 stops the invalidations every time I do cat /proc/self/maps
<ajb-lina-> so now I just need to work out what is happening during boot
<ajb-lina-> 163968 0xffff800011032b5c
<ajb-lina-> ffff8000110327d8 T memmap_init_range
<ajb-lina-> that at least makes sense - but I would hope most of those didn't trigger tb flush because there isn't any code in them
russ has quit [Ping timeout: 256 seconds]
russ has joined #armlinux
Amit_T has quit [Ping timeout: 272 seconds]
russ has quit [Read error: Connection reset by peer]
amitk has quit [Ping timeout: 252 seconds]
russ has joined #armlinux
djrscally has quit [Quit: Konversation terminated!]
jlinton has quit [Ping timeout: 256 seconds]
headless has quit [Quit: Konversation terminated!]
System_Error has quit [Read error: Connection reset by peer]
djrscally has joined #armlinux
abelvesa has quit [Quit: leaving]
abelvesa has joined #armlinux
wolfshappen has quit [Ping timeout: 256 seconds]
torez has quit [Quit: torez]
XV8 has joined #armlinux
iivanov has quit [Remote host closed the connection]
matthias_bgg has quit [Ping timeout: 256 seconds]
olofj has quit [Ping timeout: 252 seconds]
pjw has quit [Read error: Connection reset by peer]
dianders has quit [Read error: Connection reset by peer]
arnd has quit [Read error: Connection reset by peer]
broonie has quit [Read error: Connection reset by peer]
ccaione has quit [Read error: Connection reset by peer]
unixsmurf has quit [Read error: Connection reset by peer]
narmstrong has quit [Read error: Connection reset by peer]
jamestperk has quit [Read error: Connection reset by peer]
maennich has quit [Read error: Connection reset by peer]
roxell has quit [Read error: Connection reset by peer]
mturquette has quit [Read error: Connection reset by peer]
drewfustini has quit [Read error: Connection reset by peer]
netonaut_ has quit [Read error: Connection reset by peer]
robclark has quit [Read error: Connection reset by peer]
zx2c4 has quit [Read error: Connection reset by peer]
robher has quit [Read error: Connection reset by peer]
ccaione has joined #armlinux
pjw has joined #armlinux
broonie has joined #armlinux
unixsmurf has joined #armlinux
jamestperk has joined #armlinux
mturquette has joined #armlinux
roxell has joined #armlinux
arnd has joined #armlinux
robclark has joined #armlinux
robher has joined #armlinux
dianders has joined #armlinux
olofj has joined #armlinux
maennich has joined #armlinux
drewfustini has joined #armlinux
narmstrong has joined #armlinux
netonaut_ has joined #armlinux
zx2c4 has joined #armlinux
XV8 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
djrscally has quit [Quit: Konversation terminated!]