sorear changed the topic of #riscv to: RISC-V instruction set architecture | https://riscv.org | Logs: https://libera.irclog.whitequark.org/riscv
Trifton has joined #riscv
balrog has quit [Read error: Connection reset by peer]
balrog has joined #riscv
gdd has quit [Ping timeout: 265 seconds]
gdd has joined #riscv
rurtty has quit [Quit: Leaving]
wingsorc has quit [Remote host closed the connection]
wingsorc has joined #riscv
wingsorc has quit [Remote host closed the connection]
wingsorc has joined #riscv
jacklsw has joined #riscv
___nick___ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
hrberg has quit [Ping timeout: 250 seconds]
___nick___ has joined #riscv
hrberg has joined #riscv
___nick___ has quit [Client Quit]
vagrantc has quit [Quit: leaving]
___nick___ has joined #riscv
Stat_headcrabed has joined #riscv
Dyskos has quit [Ping timeout: 250 seconds]
motherfsck has joined #riscv
wiagn has joined #riscv
Stat_headcrabed has quit [Ping timeout: 255 seconds]
wiagn is now known as Stat_headcrabed
motherfsck has quit [Ping timeout: 265 seconds]
BootLayer has joined #riscv
motherfsck has joined #riscv
pabs3 has quit [Quit: Don't rest until all the world is paved in moss and greenery.]
pabs3 has joined #riscv
matoro has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
matoro has joined #riscv
wiagn has joined #riscv
Stat_headcrabed has quit [Ping timeout: 265 seconds]
wiagn is now known as Stat_headcrabed
wiagn has joined #riscv
Stat_headcrabed has quit [Ping timeout: 265 seconds]
wiagn is now known as Stat_headcrabed
billchenchina- has quit [Ping timeout: 246 seconds]
motherfsck has quit [Ping timeout: 260 seconds]
BootLayer has quit [Quit: Leaving]
wiagn has joined #riscv
Stat_headcrabed has quit [Ping timeout: 252 seconds]
wiagn is now known as Stat_headcrabed
mahk has joined #riscv
junaid_ has joined #riscv
junaid_ has quit [Remote host closed the connection]
mahk has quit [Changing host]
mahk has joined #riscv
Gravis has quit [Ping timeout: 276 seconds]
Gravis has joined #riscv
wiagn has joined #riscv
Stat_headcrabed has quit [Ping timeout: 255 seconds]
wiagn is now known as Stat_headcrabed
wiagn has joined #riscv
Stat_headcrabed has quit [Ping timeout: 265 seconds]
wiagn is now known as Stat_headcrabed
MaxGanzII has joined #riscv
TheEldest_ has joined #riscv
TheEldest has quit [Ping timeout: 265 seconds]
JanC_ has joined #riscv
JanC has quit [Killed (lithium.libera.chat (Nickname regained by services))]
JanC_ is now known as JanC
wiagn has joined #riscv
Stat_headcrabed has quit [Ping timeout: 276 seconds]
wiagn is now known as Stat_headcrabed
jobol has joined #riscv
Stat_headcrabed has quit [Client Quit]
ldevulder has joined #riscv
<bjdooks> conchuod: should we do anything to advance the CMO dma-coherent memory issue?
sh1r4s3 has quit [Ping timeout: 276 seconds]
<conchuod> Ping arnd ;)
sh1r4s3 has joined #riscv
<conchuod> He was planning to submit some cross-arch refactor which became a prereq for Prabhakar's stuff
<arnd> yes, I need to get back to that, I wonder if there is any way to split it up
<arnd> the first step I had planned was to go through https://docs.google.com/spreadsheets/d/1qDuMqB6TnRTj_CgUwgIIm_RJ6EZO76qohpTJUMQjUEo/edit#gid=0 and ensure everything follows the same rules
<arnd> followed by moving the then common logic into shared code
<bjdooks> yeah, i was going to send something for that, a couple of the flushes i think are meant to be invalidate in riscv
<bjdooks> arnd: my issue is the dma-allocator mapping, if you use dma_alloc and expect the CMO to work, it still maps the memory as uncached on riscv
<bjdooks> which makes the CMO code irrelevant
<arnd> which dma_alloc variant?
<bjdooks> any of them
<arnd> dma_alloc_coherent() must map things as uncached if the DMA is non-coherent
<arnd> if DMA is fully coherent, then you need no CMO
<arnd> so I think that bit is correct
<bjdooks> no
<bjdooks> so i've faked a non-coherent system, done extensive testing, if you have a device marked noncoherent and do a dma-alloc on it, it does the CMOs but the page is marked as uncached.
<bjdooks> the issue is if SVPBMT and ZICBOM both exist, SVPBMT gets picked first for both ioremap() and dma-alloc
<bjdooks> where you want SVPBMT for ioremap() and ZICBOM for dma-alloc
<arnd> help me out with the acronyms: what does SVPBMT mean, and how is it different from ZICBOM?
<bjdooks> ZICBOM is the cache management like clean/flush/inval
<bjdooks> SVPBMT allows memory pages to be marked uncached/weakly-ordered
<arnd> still confused. so ioremap() should generally require uncached/strongly-ordered, not uncached/weakly-ordered in order to guarantee ordering between individual mmio register accesses, but that is probably covered by the barriers in readl()/writel(), so that's fine.
<arnd> the riscv_page_dmacoherent() function in the patch looks correct to me (assuming only ZICBOM is supported, not custom cache management operations)
<arnd> at least as far as I can tell, this does the same as any other architecture: if the DMA is fully coherent, pgprot_dmacoherent() is regular cached mapping, and the CMOs are all nops
<conchuod> arnd: are your changes there a strict requirement for Prabhakar's series adding CMO stuff for Renesas/Andes?
<arnd> but if you have ZICBOM, then pgprot_dmacoherent() returns some variant of uncached memory and CMOs do the appropriate flushes
<bjdooks> but if it isn't, then you get uncached pages
<bjdooks> totally uncached is stupid, then you don't need any cmo ops as it is uncached
<arnd> bjdooks: I think you mix up the streaming mapping (dma_map_*) with the coherent mapping (dma_alloc_*)
<arnd> CMO is only used for streaming mappings, but cannot work for coherent mapping
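A minimal sketch of the rule arnd describes in the lines above, written as a plain user-space C model rather than kernel code. The names `toy_mapping`, `toy_pgattr` and `toy_dma_policy_for` are hypothetical and exist only for illustration; the sketch assumes the three cases stated in the discussion (coherent device, non-coherent device with a coherent allocation, non-coherent device with a streaming mapping).

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum toy_mapping { TOY_MAP_COHERENT_ALLOC, TOY_MAP_STREAMING };
enum toy_pgattr  { TOY_PG_CACHED, TOY_PG_UNCACHED };

struct toy_dma_policy {
    enum toy_pgattr attr; /* how the CPU maps the buffer */
    int needs_cmo;        /* 1 if sync/map/unmap must do cache maintenance */
};

/* Model of the rule from the discussion:
 *  - coherent device: regular cached mapping, CMOs are no-ops
 *  - non-coherent + dma_alloc_coherent(): uncached mapping, no CMOs
 *  - non-coherent + dma_map_*(): cached mapping, CMOs required
 */
static struct toy_dma_policy toy_dma_policy_for(int dev_is_coherent,
                                                enum toy_mapping kind)
{
    struct toy_dma_policy p = { TOY_PG_CACHED, 0 };

    assert(kind == TOY_MAP_COHERENT_ALLOC || kind == TOY_MAP_STREAMING);

    if (dev_is_coherent)
        return p;
    if (kind == TOY_MAP_COHERENT_ALLOC)
        p.attr = TOY_PG_UNCACHED;
    else
        p.needs_cmo = 1;
    return p;
}
```

In this model no combination yields both an uncached mapping and required CMOs, which is the point arnd is making.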
<arnd> doing a flush on an uncached page should result in a CPU fault
<arnd> (not sure if it does on zicbom, but it does on some other ones)
<bjdooks> arnd: those instructions do not fault
<arnd> ok
<arnd> not a big deal, just makes it a little harder to find bugs in drivers that get the interface wrong
<bjdooks> I still think riscv is doing it wrong
<arnd> the typical example is a network driver using dma_alloc_coherent() to create a buffer for its descriptors that is uncached, and dma_map_sg() for the SKBs
<arnd> In the descriptors, you need the individual accesses to be strictly ordered (first the data pointer, then the valid flag), which you cannot enforce on cached memory
<bjdooks> so i'm fairly sure in that case, riscv with both svpbmt and zicbom will provide an uncached area and then do CMO ops, which seems strange when uncached should be more than sufficient there,
<arnd> for the descriptor access, the fictional network driver in my example only does a dma_wmb() between the address write and the flag write.
<arnd> if dma_wmb() turns into a CMO, that is indeed a bug
<arnd> I only see
<arnd> include/asm-generic/barrier.h:#define dma_wmb() wmb()
<arnd> arch/riscv/include/asm/barrier.h:#define wmb() RISCV_FENCE(ow,ow)
<arnd> that's not a CMO, right?
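For illustration, the descriptor ordering arnd describes (data pointer first, then the valid flag) can be modelled in user space with C11 atomics. This is only an analogy: the release fence stands in for dma_wmb(), and the struct layout and the `toy_desc_*` names are made up for this sketch. What it shows is that the ordering comes from a fence between two stores, not from any cache maintenance operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout, for illustration only. */
struct toy_desc {
    uint64_t addr;          /* buffer address the device should use */
    _Atomic uint32_t valid; /* written last; the consumer polls this flag */
};

/* Publish a descriptor: write the address, fence, then set the flag.
 * The release fence is a user-space stand-in for dma_wmb(): it orders
 * the earlier store before the later one and touches no cache lines. */
static void toy_desc_publish(struct toy_desc *d, uint64_t addr)
{
    assert(d != NULL);
    d->addr = addr;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&d->valid, 1, memory_order_relaxed);
}

/* Consumer side: returns 1 and fills *addr only once the flag is seen. */
static int toy_desc_consume(struct toy_desc *d, uint64_t *addr)
{
    assert(d != NULL && addr != NULL);
    if (!atomic_load_explicit(&d->valid, memory_order_acquire))
        return 0;
    *addr = d->addr;
    return 1;
}
```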
<bjdooks> no
* bjdooks is now confused
<arnd> bjdooks: do you have a particular driver that you were looking at, or just the architecture code?
<conchuod> bjdooks: oh, I think I got what you meant in your original message wrong. I didn't realise you meant your recent patches, saying "ping arnd" was for allowing CMO stuff from functions.
<bjdooks> so I've been using a test driver i wrote for testing, that does dma_alloc() with different attributes and then uses dma_sync_single_for_cpu and dma_sync_single_for_device
<arnd> ah, that makes sense. So you are using a broken testcase ;-)
<arnd> dma_sync_single_for_cpu() is only defined on memory you got from dma_map_*()
<conchuod> arnd: are your cross-arch changes a strict requirement for Prabhakar's series adding CMO stuff for Renesas/Andes?
prabhakarlad has joined #riscv
<arnd> conchuod: as far as I'm concerned, the strict requirement for new CMOs is that we come up with a sensible definition of what each dma operation should do
<arnd> both the current riscv definition for ZICBOM and the version that prabhakarlad was adding are common across other architectures, but they are fundamentally at odds with one another, so the bit I'm interested in is making them do the same thing first
<conchuod> Okay, that makes sense.
zjason` is now known as zjason
<arnd> I think the most controversial bit is the question about DMA_BIDIRECTIONAL: powerpc started the flush/flush semantics a long time ago, and this has made it into parisc, microblaze and now riscv over time
mahk has quit [Ping timeout: 268 seconds]
<arnd> the idea was to deal with a partially shared cache line at the beginning of the mapping, where one part of it is used by the CPU and another part is used by a device
<arnd> having a flush in dma_map_*() here makes sense, as this means the device will see the data that was written by the CPU and the CPU doesn't lose any of its own data
mahk has joined #riscv
<arnd> but in dma_unmap_*() there is absolutely no way to preserve both the data from the device and the CPU, if they concurrently write into the same cacheline
<arnd> invalidate loses any new data written by the CPU, and flush loses data written by the device
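The trade-off arnd describes can be reproduced with a toy model of a single shared cache line. This is a user-space sketch under simplifying assumptions (one 8-byte line, whole-line writeback, device writes bypass the cache); all `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_LINE 8 /* toy cache-line size in bytes (assumption) */

struct toy_line {
    unsigned char mem[TOY_LINE];   /* backing memory, written by the device */
    unsigned char cache[TOY_LINE]; /* CPU's cached copy of the same line */
    int dirty;                     /* cache holds data not yet in memory */
};

/* invalidate: drop the cached copy and reload it from memory */
static void toy_inval(struct toy_line *l)
{
    memcpy(l->cache, l->mem, TOY_LINE);
    l->dirty = 0;
}

/* flush: write the whole dirty line back, then invalidate */
static void toy_flush(struct toy_line *l)
{
    if (l->dirty)
        memcpy(l->mem, l->cache, TOY_LINE);
    toy_inval(l);
}

/* CPU store: lands in the cache only and marks the line dirty */
static void toy_cpu_write(struct toy_line *l, size_t off, unsigned char v)
{
    assert(off < TOY_LINE);
    l->cache[off] = v;
    l->dirty = 1;
}

/* device DMA write: goes straight to memory, bypassing the cache */
static void toy_dev_write(struct toy_line *l, size_t off, unsigned char v)
{
    assert(off < TOY_LINE);
    l->mem[off] = v;
}
```

If the CPU writes byte 0 and the device writes byte 7 of the same line, `toy_inval()` leaves the device's byte visible but drops the CPU's, while `toy_flush()` writes the CPU's byte to memory and overwrites the device's byte with the stale cached value.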
BootLayer has joined #riscv
<conchuod> That sounds like a topic for the Wills and Christophs of the world :)
<arnd> absolutely. There are a number of easier changes to make where I hope we can easily agree
<arnd> such as powerpc always doing the same thing for map and unmap, afaict that is just a historic artifact and changing it just makes it more efficient
<conchuod> I won't hold my horses for this to be resolved soon so! I did like the idea of removing the ability to decide what op is called for what, if there's gonna end up being several methods for doing this on riscv, that approach sounds ideal.
Sos has joined #riscv
<bjdooks> ok, in the case of a non-coherent device, then zicbom isn't going to cut it with dma_alloc as there's no way to sync the data or make it uncached
Sos has quit [Quit: Leaving]
<bjdooks> ok, so one of the tests is dma_alloc_noncoherent and that if i read it correctly should require the dma-sync calls
<bjdooks> ^arnd ?
pecastro has joined #riscv
<arnd> bjdooks: correct, though note that dma_alloc_noncoherent() is rarely used, it pretty much only exists for old MIPS and Itanium workstations from 25 years ago that had custom requirements
<arnd> it's not even mentioned in Documentation/core-api/dma-api-howto.rst
<jrtc27> why would anyone try to support a cache line being shared between dma and something else...
<jrtc27> that's just broken by design
<jrtc27> unless you have an architecture that guarantees partial writebacks
<arnd> jrtc27: yes, that was pretty much my point. I think we have a couple of device drivers that did this in violation of the interface, and they worked fine on machines with coherent caches but caused bugs on certain machines
<arnd> and then we had architecture maintainers trying to work around this without fully understanding the problem
<jrtc27> perhaps you want a sanitiser mode where the cmo implementation zeroes out the partial ends...
<jrtc27> (or some other junk pattern)
<arnd> right, I had already considered adding a WARN_ONCE(unaligned address or size)
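Both ideas above (a warning on unaligned address or size, and a sanitiser mode that fills the partial ends with junk) can be sketched in user-space C. This is not kernel code: the `toy_*` helpers are hypothetical, the buffer `mem` stands in for memory starting at a line-aligned address, and the cache-line size is passed in as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if both the start address and the size are multiples of the
 * cache-line size, i.e. the mapping shares no line with other data. */
static int toy_dma_range_aligned(uintptr_t addr, size_t size, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0); /* power of two */
    return ((addr | size) & (line - 1)) == 0;
}

/* Fill the bytes that share a cache line with the mapping [off, off+size)
 * but lie outside it with a junk pattern. Returns the number of bytes
 * poisoned. 'mem' models memory whose first byte is line-aligned. */
static size_t toy_poison_partial_ends(unsigned char *mem, size_t mem_len,
                                      size_t off, size_t size, size_t line,
                                      unsigned char junk)
{
    size_t start, end, i, n = 0;

    assert(line != 0 && (line & (line - 1)) == 0);
    start = off & ~(line - 1);                   /* round down */
    end = (off + size + line - 1) & ~(line - 1); /* round up */
    assert(end <= mem_len);

    for (i = start; i < off; i++, n++)
        mem[i] = junk;
    for (i = off + size; i < end; i++, n++)
        mem[i] = junk;
    return n;
}
```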
bauruine has joined #riscv
mahk has quit [Changing host]
mahk has joined #riscv
sh1r4s3 has quit [Ping timeout: 264 seconds]
<bjdooks> https://www.pinterest.de/pin/672936369316516141/ <= when someone mentions new laptop
ldevulder has quit [Remote host closed the connection]
jacklsw has quit [Ping timeout: 265 seconds]
sh1r4s3 has joined #riscv
wingsorc has quit [Ping timeout: 246 seconds]
<arnd> jrtc27: I wonder if KASAN could do even better here: mark whole cache line as invalid in dma_sync_*_for_device(..., DMA_FROM_DEVICE) but mark only the actual data as valid in dma_sync_*_for_cpu(..., DMA_FROM_DEVICE)
<arnd> DMA_TO_DEVICE on partial cache lines is not harmful because there is no data corruption as long as the device only reads
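A rough user-space model of the KASAN idea for DMA_FROM_DEVICE, assuming a one-byte-per-byte shadow array in which a non-zero entry means the CPU may touch that byte. The function names and the shadow layout are invented for this sketch and do not reflect how KASAN is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of dma_sync_*_for_device(..., DMA_FROM_DEVICE): every cache line
 * that overlaps [off, off+size) becomes inaccessible to the CPU. */
static void toy_shadow_for_device(unsigned char *shadow, size_t len,
                                  size_t off, size_t size, size_t line)
{
    size_t start, end;

    assert(line != 0 && (line & (line - 1)) == 0); /* power of two */
    start = off & ~(line - 1);
    end = (off + size + line - 1) & ~(line - 1);
    assert(end <= len);
    memset(shadow + start, 0, end - start);
}

/* Model of dma_sync_*_for_cpu(..., DMA_FROM_DEVICE): only the bytes that
 * were actually mapped become accessible again. */
static void toy_shadow_for_cpu(unsigned char *shadow, size_t len,
                               size_t off, size_t size)
{
    assert(off + size <= len);
    memset(shadow + off, 1, size);
}
```

After a for_device/for_cpu round trip on an unaligned buffer, the bytes in the partial head and tail lines stay marked inaccessible, so a later CPU access to them would be flagged.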
sh1r4s3 has quit [Remote host closed the connection]
sh1r4s3 has joined #riscv
joev has quit [Ping timeout: 255 seconds]
joev has joined #riscv
Andre_Z has joined #riscv
sh1r4s3 has quit [Read error: Connection reset by peer]
sh1r4s3 has joined #riscv
joev has quit [Ping timeout: 255 seconds]
joev has joined #riscv
joev has quit [Ping timeout: 250 seconds]
joev has joined #riscv
joev has quit [Ping timeout: 250 seconds]
joev has joined #riscv
billchenchina- has joined #riscv
billchenchina- has quit [Remote host closed the connection]
billchenchina has joined #riscv
cwebber has joined #riscv
rneese has joined #riscv
Tenkawa has joined #riscv
Andre_Z has quit [Quit: Leaving.]
jmdaemon has quit [Ping timeout: 265 seconds]
ldevulder has joined #riscv
MaxGanzII has quit [Remote host closed the connection]
MaxGanzII has joined #riscv
billchenchina- has joined #riscv
billchenchina has quit [Ping timeout: 265 seconds]
elastic_dog has quit [Killed (zinc.libera.chat (Nickname regained by services))]
elastic_dog has joined #riscv
rurtty has joined #riscv
sh1r4s3_ has joined #riscv
sh1r4s3 has quit [Read error: Connection reset by peer]
Andre_Z has joined #riscv
motherfsck has joined #riscv
Andre_Z has quit [Ping timeout: 265 seconds]
MaxGanzII has quit [Remote host closed the connection]
jacklsw has joined #riscv
elastic_dog has quit [Remote host closed the connection]
elastic_dog has joined #riscv
Andre_Z has joined #riscv
lagash has quit [Quit: ZNC - https://znc.in]
lagash has joined #riscv
rneese has quit []
<arnd> geertu: I'm trying to make sense of the m68k arch_sync_dma_for_device() function, which has operations called 'push' and 'clear' instead of the normal 'clean'/'invalidate'/'flush'.
<arnd> is this a write-through or write-back cache?
<geertu> arnd: '020/'030 is write-through, '040/'060 is write-back
<arnd> ok, so 'push' is 'clean' (on WB) plus 'invalidate' on all, while 'clear' is just 'invalidate', right?
<geertu> arnd: yes, cfr. the documented semantics in arch/m68k/mm/memory.c
<arnd> geertu: got it, so this uses the regular writeback semantics, except that it does an extra invalidate in sync_dma_for_device(..., DMA_TO_DEVICE), where others just do a 'clean', and no 'invalidate'.
<arnd> I'm still unsure what semantics we actually want on write-through caches. I think what you do here (all operations in *_for_device, just skip the clean when that is a nop) would be the easiest, but it's not what other architectures do today
<arnd> on sparc32, xtensa and writethrough variants of armv4, the invalidate happens in _for_cpu() rather than for_device(), and I'm not sure whether there are any important tradeoffs
Noisytoot has quit [Read error: Connection reset by peer]
<geertu> Doing it in for_device() avoids ever pushing out the data twice, corrupting memory if the DMA wrote something in between
<geertu> BTW, I don't like "clean"
<geertu> Your Google Docs document also uses "flush", which is ambiguous.
<arnd> is 'wback' better?
<geertu> IIRC, "push" and "invalidate" are the non-ambiguous terms?
<arnd> I don't think anyone else uses 'push', so that would be more confusing
lagash has quit [Quit: ZNC - https://znc.in]
<arnd> 'wbinv' instead of 'flush' would be less ambiguous but smells very x86
<geertu> That's write-back + invalidate?
<arnd> right
<geertu> wback is unambiguous, too.
Noisytoot has joined #riscv
<geertu> "flush" is typically used in sayings like "yeah, you have to flush the cache to avoid corruption", but doesn't clarify what exactly needs to be done (push/wback? invalidate? Both?)
<arnd> I'll stick with the wback/inval/flush naming for the moment, hopefully that's clear enough. clean/inval/flush is the terminology from arch/arm, so I started with that, but that is a bit ambiguous as both 'clean' and 'flush'
<arnd> have been used with multiple meanings
lagash has joined #riscv
<geertu> Exactly.
<geertu> What terminology does the buffer cache use?
<geertu> OK, that one is not write-through
Andre_Z has quit [Ping timeout: 276 seconds]
Tenkawa has quit [Ping timeout: 250 seconds]
Tenkawa has joined #riscv
vagrantc has joined #riscv
<dh`> the last time I needed names for those, I used wb/wbinv/inv
motherfsck has quit [Ping timeout: 276 seconds]
<geertu> dh`: These are unambiguous, too.
pecastro has quit [Ping timeout: 264 seconds]
Andre_Z has joined #riscv
billchenchina has joined #riscv
MaxGanzII has joined #riscv
billchenchina- has quit [Ping timeout: 256 seconds]
lagash has quit [Quit: ZNC - https://znc.in]
lagash has joined #riscv
Perflosopher has joined #riscv
<geist> i really find the arm clean and invalidate to be about as unambiguous as it gets
jacklsw has quit [Read error: Connection reset by peer]
<Esmil> I'm guessing wback means write what is in cache to ram and inval means forget what is in the cache and read from ram. But what is flush then?
<geist> yah i think you need like 2 of the 3 terms at the same time, because wback/clean/flush are kinda ambiguous with each other
<geist> flush does tend to get codified into various apis as 'synchronize i and d cache' annoyingly
<Esmil> ah, sorry. arnd said earlier that flush is just wback + inval
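To keep the competing names straight, the mappings mentioned in this discussion can be collected in a small lookup table. This is an illustrative sketch only: the enum, the table and `toy_lookup()` are hypothetical, and the entries record just what was said in the channel (for m68k, 'push' is taken as write-back plus invalidate, as on the write-back parts).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_cache_op {
    TOY_OP_UNKNOWN = 0,
    TOY_OP_WBACK   = 1, /* write dirty data back to memory */
    TOY_OP_INVAL   = 2, /* discard the cached copy */
    TOY_OP_WBINV   = 3  /* write back, then invalidate */
};

struct toy_term {
    const char *who;  /* naming convention, as referred to in the chat */
    const char *term; /* name used in that convention */
    enum toy_cache_op op;
};

static const struct toy_term toy_terms[] = {
    { "arnd",     "wback", TOY_OP_WBACK },
    { "arnd",     "inval", TOY_OP_INVAL },
    { "arnd",     "flush", TOY_OP_WBINV },
    { "arch/arm", "clean", TOY_OP_WBACK },
    { "arch/arm", "inval", TOY_OP_INVAL },
    { "arch/arm", "flush", TOY_OP_WBINV },
    { "m68k",     "push",  TOY_OP_WBINV },
    { "m68k",     "clear", TOY_OP_INVAL },
    { "dh`",      "wb",    TOY_OP_WBACK },
    { "dh`",      "inv",   TOY_OP_INVAL },
    { "dh`",      "wbinv", TOY_OP_WBINV },
};

/* Look up what a term means in a given convention. */
static enum toy_cache_op toy_lookup(const char *who, const char *term)
{
    size_t i;

    assert(who != NULL && term != NULL);
    for (i = 0; i < sizeof(toy_terms) / sizeof(toy_terms[0]); i++) {
        if (strcmp(toy_terms[i].who, who) == 0 &&
            strcmp(toy_terms[i].term, term) == 0)
            return toy_terms[i].op;
    }
    return TOY_OP_UNKNOWN;
}
```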
prabhakarlad has quit [Quit: Client closed]
<geist> yah it depends on what api, they're all used differently
<geist> the flush one i'm thinking about is iirc a builtin in gcc/llvm for 'synchronize i & d' that's called flush
<geist> and thus is somewhat codified everywhere, across a bunch of OSes
sh1r4s3 has joined #riscv
<geist> though hmm, now i see it as __builtin___clear_cache
sh1r4s3_ has quit [Ping timeout: 255 seconds]
sh1r4s3 has quit [Ping timeout: 246 seconds]
jmdaemon has joined #riscv
jobol has quit [Quit: Leaving]
motherfsck has joined #riscv
pecastro has joined #riscv
<bjdooks> ok, now i've moved to using kzalloc() and dma_map it seems the kernel is possibly using bounce buffers, which sort of defeats the idea of trying to use cmo ops... dma_alloc_noncoherent does however work
vagrantc has quit [Quit: leaving]
sh1r4s3 has joined #riscv
prabhakarlad has joined #riscv
KombuchaKip has quit [Quit: Leaving.]
BootLayer has quit [Quit: Leaving]
billchenchina has quit [Ping timeout: 248 seconds]
jmdaemon has quit [Ping timeout: 268 seconds]
lagash has quit [Quit: ZNC - https://znc.in]
lagash has joined #riscv
jmdaemon has joined #riscv
___nick___ has quit [Ping timeout: 265 seconds]
ntwk has quit [Ping timeout: 248 seconds]
MaxGanzII_ has joined #riscv
MaxGanzII has quit [Ping timeout: 255 seconds]
ntwk has joined #riscv
Andre_Z has quit [Ping timeout: 268 seconds]
vineetg762 has joined #riscv
<palmer> bjdooks: IIRC we've got bounce buffers enabled by default as some systems need them (SiFive's ethernet, for example). Not sure what you're running on...
KombuchaKip has joined #riscv
<bjdooks> The sifive_u qemu and an internal FPGA farm
vineetg762 has quit [Client Quit]
<palmer> I guess the FPGAs are up to you to decide, but I think the sifive_u would end up emulating the same ethernet addressing related issues as in the Unleashed and thus have bounce buffers
ldevulder has quit [Ping timeout: 256 seconds]
bauruine has quit [Remote host closed the connection]
pedja has quit [Quit: Leaving]
MaxGanzII_ has quit [Quit: Leaving]
jmdaemon has quit [Ping timeout: 264 seconds]
ntwk has quit [Ping timeout: 255 seconds]
jmdaemon has joined #riscv
wingsorc has joined #riscv