sorear changed the topic of #riscv to: RISC-V instruction set architecture | | Logs: | Matrix:
rsalveti has joined #riscv
motherfsck has joined #riscv
handsome_feng has joined #riscv
KREYREN has joined #riscv
EchelonX has quit [Quit: Leaving]
Tenkawa has quit [Quit: Was I really ever here?]
Gravis has quit [Quit: Murdered]
naoki has quit [Quit: naoki]
Gravis has joined #riscv
Gravis has quit [Client Quit]
JanC has quit [Ping timeout: 256 seconds]
JanC has joined #riscv
heat has quit [Ping timeout: 240 seconds]
BootLayer has joined #riscv
jacklsw has joined #riscv
sevan has quit [Ping timeout: 264 seconds]
davidlt has joined #riscv
zv_ has joined #riscv
davidlt has quit [Remote host closed the connection]
zv has quit [Ping timeout: 268 seconds]
zv has joined #riscv
bgamari has quit [Ping timeout: 264 seconds]
zv_ has quit [Ping timeout: 252 seconds]
KREYREN has quit [Remote host closed the connection]
KREYREN has joined #riscv
bgamari has joined #riscv
coaden has joined #riscv
<coaden> Hi room
<coaden> Saw the room title and immediately thought of the old '88ish IBM RS/6000s running AIX. Nice machines those were, the top ends were even banned from export lolol crazy considering what kinda raw horse-power we have today huh?
<sorear> can I help
coaden has quit [Quit: - Chat comfortably. Anywhere.]
BootLayer has quit [Quit: Leaving]
jfsimon1981 has joined #riscv
davidlt has joined #riscv
naoki has joined #riscv
peeps[zen] has quit [Ping timeout: 260 seconds]
crossdev has joined #riscv
zero-xray has joined #riscv
esv has quit [Remote host closed the connection]
Leopold has quit [Ping timeout: 260 seconds]
Leopold has joined #riscv
Leopold has quit [Remote host closed the connection]
junaid_ has joined #riscv
Leopold has joined #riscv
jacklsw has quit [Quit: Back to the real world]
Gravis has joined #riscv
Gravis has quit [Ping timeout: 260 seconds]
Gravis has joined #riscv
zero-xray has quit [Quit: connection reset by purr]
zero-xray has joined #riscv
zero-xray has quit [Quit: connection reset by purr]
zero-xray has joined #riscv
zero-xray has quit [Quit: connection reset by purr]
zero-xray has joined #riscv
zero-xray has quit [Quit: connection reset by purr]
zero-xray has joined #riscv
scrts8 has joined #riscv
zero-xray has quit [Excess Flood]
scrts has quit [Ping timeout: 252 seconds]
scrts8 is now known as scrts
zero-xray has joined #riscv
zero-xray has quit [Excess Flood]
zero-xray has joined #riscv
zero-xray has quit [Excess Flood]
zero-xray has joined #riscv
mlw has joined #riscv
zero-xray has quit [Excess Flood]
zero-xray has joined #riscv
crossdev has quit [Remote host closed the connection]
crossdev has joined #riscv
zero-xray has quit [Excess Flood]
zero-xray has joined #riscv
zero-xray has quit [Excess Flood]
mlw has quit [Ping timeout: 260 seconds]
mlw has joined #riscv
motherfsck has quit [Ping timeout: 256 seconds]
esv has joined #riscv
motherfsck has joined #riscv
Guest19 has joined #riscv
<Guest19> Hi, I'm trying to create U-Boot SPL for RISCV and I'm not much experienced with the topic. Is there any documentation like "U-Boot SPL for RISCV from scratch"?
Tenkawa has joined #riscv
dipankar has joined #riscv
dipankar has quit [Changing host]
dipankar has joined #riscv
naoki has quit [Quit: naoki]
<mps> Guest19: in u-boot source doc/develop/spl.rst maybe could help
<Guest19> mps thanks
DesRoin has joined #riscv
mlw has quit [Ping timeout: 268 seconds]
mlw has joined #riscv
psydroid has joined #riscv
mlw has quit [Ping timeout: 264 seconds]
mlw has joined #riscv
heat has joined #riscv
prabhakalad has quit [Quit: Konversation terminated!]
prabhakalad has joined #riscv
Stat_headcrabed has joined #riscv
Gravis has quit [Quit: Murdered]
Gravis has joined #riscv
Guest19 has quit [Quit: Client closed]
Gravis has quit [Ping timeout: 260 seconds]
Gravis_ has joined #riscv
ntwk has quit [Read error: Connection reset by peer]
Stat_headcrabed has quit [Quit: Stat_headcrabed]
EchelonX has joined #riscv
handsome_feng has quit [Quit: Connection closed for inactivity]
<davidlt> T-HEAD C930 is a thing
<conchuod> davidlt: google has very little info, do we have anything on extensions supported?
<davidlt> conchuod, I haven't googled it
<davidlt> Someone just mentioned this article, and apparently there is C930
peeps[zen] has joined #riscv
Gravis_ is now known as Gravis
jfsimon1981 has quit [Remote host closed the connection]
jfsimon1981 has joined #riscv
<geertu> davidlt: 4 billion C910 units, wow
<davidlt> geertu, no idea how and where that went
BootLayer has joined #riscv
junaid_ has quit [Quit: Lost terminal]
heat has quit [Remote host closed the connection]
heat has joined #riscv
<another|> and all of them with V 0.7.1
<davidlt> Probably not, C920V2 is v1.0.0
jfsimon1981 has quit [Remote host closed the connection]
<sorear> vectors are a configuration option in c910 anyway
Stat_headcrabed has joined #riscv
ntwk has joined #riscv
Stat_headcrabed has quit [Quit: Stat_headcrabed]
DesRoin has quit [Quit: WeeChat 4.2.1]
DesRoin has joined #riscv
<another|> but c920V2 != c910
ldevulder has quit [Quit: Leaving]
Starfoxxes has quit [Ping timeout: 252 seconds]
jljusten has quit [Ping timeout: 240 seconds]
jljusten has joined #riscv
Narrat has joined #riscv
vagrantc has joined #riscv
BootLayer has quit [Quit: Leaving]
<geist> yah what davidlt is saying, AFAICT they're making a real effort to get up to v 1.0, so i'm not that worried about thead stuff
<geist> they were just early to the game and went with what they had
<davidlt> There are other changes to C920V2, but it seems to fix/fill the gaps
<geist> but tis true, there are a lot of c910s out there. question there is what percentage of those actually get their vector unit used at all
<davidlt> I could speculate that this will land in SG2044 (SG2042 replacement)
<geist> i worry a lot more about all the other thead ISA and system extensions
<sorear> collectively?
<davidlt> I bet majority of those cores are in some devices that probably don't care running upstream Linux
<davidlt> Aren't they used in NVMe controllers too?
<geist> yeah i worry they become some sort of defacto alternate riscv spec. you have the mainline riscv and the thead style processors
<geist> and that's how a large fork is born
<geist> new vendors in china start making their own cores and choose to be Thead compatible as their first priority
<davidlt> Somehow I doubt it, again vectors in C920V2 are now RVI compatible (or should be).
<geist> yah not thinking about the vector stuff. they have a fair amount of plain ISA extensions
<geist> plain integer stuff and some amount of extensions to the system mode side of things
<davidlt> IIRC, Alibaba/T-HEAD is active in RVI. There is no point for them to continue cooking too many vendor extensions.
<geist> the plain ISA extensions seem mostly reasonable, but they're not vetted by anyone but themselves, and i honestly dunno how they fit into the extension space
<geist> i hope so, but then of course engineering logic doesn't necessarily apply to product decisions
<sorear> the plain extensions are in OP_CUSTOM_N
<davidlt> I think T-HEAD is just running ahead a bit too fast, that's why they have existing extensions.
<geist> indeed. i dont think they're being nefarious, but like i said it may cause a defacto fork of the isa as more vendors maybe choose to follow them
<sorear> if a company cares so little about bespoke custom extensions that they'd willingly surrender their access to OP_CUSTOM_N in the name of T-HEAD compatibility, they'd be using aarch64 instead of riscv64 and the whole thing is moot
<davidlt> There is already another two ISAs in China. I doubt they want a 3rd one.
<geist> except riscv is free
<sorear> I'll be surprised if we see a single core that chooses to be T-HEAD compatible and doesn't turn out on closer inspection to be a rebranding of a T-HEAD developed core
<davidlt> T-HEAD open sources some of their cores :)
<davidlt> It's on GitHub
<geist> yah, i hope you're right
<geist> yah very convenient, though i dont know if they're open sourcing their newer ones
<geist> and/or they're getting too complicated :)
<davidlt> I don't think so, they didn't do that for C908.
<davidlt> But C910/C920 is not anything impressive.
<geist> yeah
<davidlt> I think C920V2 really exists to just fix problems (vector, Sv48 support, new cache coherence protocol, etc.)
<sorear> the open source cores don't have V 0.7.1
<geist> iirc it's a simple dual issue in-order?
<davidlt> It's OoO IIRC, but it loses to JH7110 U74 in-order core
<sorear> C910 is out of order, I don't know the exact specs but they're not huge
<conchuod> They do have a bunch of custom extensions that they don't appear to have walked away from yet.
<davidlt> Ok, it's faster on microbenchmarks, slower on real life things :)
<sorear> which suggests either memory system problems or weird bugs
<geist> might be due to a terrible branch predictor or L1 cache latency or whatnot
<geist> yah
<geist> been working with a P470 and P670 lately, and they're definitely a lot healthier
<conchuod> They've gone and implemented vector, svpbmt and zicbom on the c908, but iirc they still use all of their custom bitmanip stuff rather than the RVI versions?
<davidlt> Oh yeah, a lot of fancy features landed in newer SiFive cores
<davidlt> conchuod, I think C908 they marked it as RVA22 compatible
<geist> conchuod: i think so, that was what i recalled. but i need to dig around for newer docs. i haven't gotten a real clear picture on the newer thead cores
<davidlt> C930 is probably a major change, maybe C920 but RVA22 compatible (?)
JanC_ has joined #riscv
JanC has quit [Killed ( (Nickname regained by services))]
JanC_ is now known as JanC
<conchuod> btw, who is "bruce hoult"?
<davidlt> I think he is doing a lot of microarch/silicon benchmarks, etc.
<sorear> /r/riscv mod
<conchuod> He seems like a confidently incorrect kinda guy
<davidlt> also, ex-SiFive
<conchuod> I do find it interesting that no one really talks about the t-head bitmanip stuff etc.
<conchuod> maybe cos you can just use it in userspace on their stuff if you want without needing there to be kernel support etc for saving/restoring state
<davidlt> yeah, that's the best part of bitmanip :)
<davidlt> no state
<geist> yah that's what i mean, userspace ISA extensions may be a bit more insidious
<davidlt> SpacemiT X60 also incl. some AI extensions (16 IIRC)
<geist> not that i'm saying the sky is falling or anything, but these sort of things can become defacto over time unless thead makes an effort to realign with mainline
<geist> and maybe they are with 930 or whatnot. i dont have the data sheet in front of me
<conchuod> geist: Ye I don't disagree with you on that.
<conchuod> davidlt: It was fun in here the other week when we found out the c908 can toggle on and off svpbmt and zicbom and use the t-head versions instead.
<davidlt> conchuod, sounds like attempt to have compatibility with existing software
<geist> ah yeah svpbmt is indeed one that does collide with thead
<geist> noticed that when i implemented svpbmt in Zircon
<sorear> that's kind of the obvious way for it to work?
<conchuod> Oh it makes total sense, but it was a surprise to find out that they'd put in a bit you could flip and it'd toggle it
<sorear> c906 can toggle off its custom PBMT version, it just doesn't have anything to replace it with
<conchuod> Actually, for zicbom I think both of them work, no bit required.
<geist> i definitely would love more toggles of things in riscv in general, a-la ARM
<conchuod> its just the pbmt stuff that has a bit
<sorear> and the t-head memory attributes are more flexible in ways that may or may not matter (5-bit field vs 2)
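A minimal sketch of the "5-bit field vs 2" point, added for illustration. It assumes the ratified Svpbmt layout (PBMT in PTE bits 62:61) and assumes the T-HEAD memory-type field occupies PTE bits 63:59, as described in the Linux errata definitions; the helper names `field_mask` and `extract` are hypothetical.

```python
# Svpbmt: 2-bit PBMT field in PTE bits 62:61 (0 = PMA, 1 = NC, 2 = IO).
SVPBMT_SHIFT, SVPBMT_WIDTH = 61, 2
# T-HEAD custom memory type: assumed to be a 5-bit field in PTE bits 63:59.
THEAD_SHIFT, THEAD_WIDTH = 59, 5


def field_mask(shift, width):
    """Return a mask covering `width` bits starting at bit `shift`."""
    return ((1 << width) - 1) << shift


def extract(pte, shift, width):
    """Pull a bit field out of a 64-bit PTE value."""
    return (pte >> shift) & ((1 << width) - 1)


SVPBMT_MASK = field_mask(SVPBMT_SHIFT, SVPBMT_WIDTH)
THEAD_MASK = field_mask(THEAD_SHIFT, THEAD_WIDTH)

# The standard field sits entirely inside the vendor field, so the same
# PTE bits mean different things depending on which scheme is enabled.
overlap = SVPBMT_MASK & THEAD_MASK

# Example PTE: valid bit set, Svpbmt PBMT = NC (value 1, i.e. bit 61).
pte = (1 << 61) | 0x1
print(hex(overlap))
print(extract(pte, SVPBMT_SHIFT, SVPBMT_WIDTH))
print(bin(extract(pte, THEAD_SHIFT, THEAD_WIDTH)))
```

Under these assumptions the two encodings cannot both be interpreted at once, which fits the toggle bit mentioned for the PBMT case.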
<sorear> if the LLVM commit history is a useful source sifive still has their custom CMOs
<davidlt> Here, C930!
<davidlt> RVA24
<conchuod> noice
<davidlt> Not sure that happens, RVA23 spec was changed again just few days ago
<conchuod> sorear: Ye I heard they do.
<geist> yeah, not sure RVA24 exists yet? (/me goes to check the profile doc)
<conchuod> And I got a patchset the other day that has non-zicbom CMO stuff for the jh8100
<geist> but anyway re: bit manip instructions, if it's at least RvA22 it has to have zba/zbb, and so maybe thead moves away from them
<geist> though i dunno how the thead bit manips line up feature wise with zbb
<conchuod> That one just annoys me cos I doubt that SoC was created after Zicbom's development.
<sorear> *reads patchset* this appears to be a MMIO interface to the cache controller itself, not an alternate set of instructions
<davidlt> conchuod, because specs + silicon design these days happens in parallel
<geist> re: the sifive cmos they're not substantial
<geist> just some extra supervisor instructions to do global cache flushes
<geist> which is otherwise not covered with zicbom
<davidlt> SiFive U74 in JH7110 happened before bitmanip v1.0 was available. I think I saw emails saying it's 0.94 or something, no major changes IIRC.
<davidlt> I also saw some emails claiming that some bits in T-HEAD C908 are "RC" spec versions
<conchuod> davidlt: but this soc will be newer than the spec being frozen by like 4 years.
<sorear> geist: not 1-1 between zbx and xtheadbx
<conchuod> sorear: Yah, but it's still something non-standard well after we have something standard. And I don't think that's gonna change for a while, I expect to see more non-standard mechanisms.
<sorear> conchuod: if you have 64 GB of physical memory and want to ensure that none of it is dirty in L2, you need 1 billion instructions with Zicbom and about 10 with starfive's latest thing
<sorear> we *don't* have a reasonable standard solution for range flushes
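The "1 billion instructions" figure can be checked with a quick calculation. This is a sketch that assumes a 64-byte cache block, which the discussion does not state; `cbo_ops_needed` is a hypothetical helper name.

```python
GIB = 1 << 30


def cbo_ops_needed(mem_bytes, line_bytes=64):
    """Number of per-line cache-block operations needed to cover a region.

    One Zicbom instruction (e.g. cbo.flush) handles one cache block, so the
    count is the region size divided by the block size, rounded up.
    """
    return -(-mem_bytes // line_bytes)  # ceiling division


ops = cbo_ops_needed(64 * GIB)
print(ops)            # number of cbo.* instructions for 64 GiB
print(ops == 1 << 30)  # True when the block size is 64 bytes
```

With a different block size the count scales inversely, but it stays far above the handful of MMIO writes a range-based cache controller interface would need.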
<geist> yah seems like maybe that'd be a reasonable solution for SBI
<geist> then a hypervisor can even choose how to deal with the stacking effect of different translation domains
<sorear> sbi can't actually be used for useful things with our current social structures
<geist> heh
<conchuod> sorear: I didn't think of that
<conchuod> sorear: elaborate about SBI please
<conchuod> (the social part)
<sorear> we came up with a great solution for using the SBI to abstract over remote fence hardware a few years ago
<geist> yah re: global flushes this is why ARM has two sets of cache flush instructions: by virtual address (cache line at a time) and by way/set (iterate over all the levels of the hierarchy)
<sorear> the AIA people are trying to rip it out right now
<geist> zicbom only defines the former
<geist> sadly the SBI rfence had a major flaw: it only allows for a single run at a time
crabbedhaloablut has quit []
<geist> so doesn't let you cache up what happens a lot in the real world: flushing N pages that are not contiguous
<sorear> on the theory that _obviously_ the only possible implementation of remote fences is an IPI and you should do IPIs in S-mode
<geist> i used it a bit in zircon and then switched to IPIs for that reason
<sorear> if it's a hardware remote fence it probably makes sense to push a single run to the hardware at once. batching makes sense for an IPI backend
<geist> yep
<geist> of course all the SBI implementations i had seen are just doing an IPI and then feeding it a single one
<geist> but yeah SBI could abstract over something else
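To illustrate the single-run limitation discussed above: the SBI remote fence calls take one (start address, size) range per call, so a set of non-contiguous pages needs one call per contiguous run. The sketch below only counts those runs; `coalesce_pages` is a hypothetical helper and a 4 KiB page size is assumed.

```python
def coalesce_pages(pages, page_size=4096):
    """Merge page addresses into contiguous (start, size) runs.

    Each resulting run corresponds to one range-based remote fence request;
    scattered pages therefore turn into many separate requests.
    """
    runs = []
    for addr in sorted(set(pages)):
        if runs and runs[-1][0] + runs[-1][1] == addr:
            # Page directly follows the previous run: extend it.
            start, size = runs[-1]
            runs[-1] = (start, size + page_size)
        else:
            # Gap before this page: start a new run.
            runs.append((addr, page_size))
    return runs


# Three pages, two of them adjacent: two runs, so two range requests.
runs = coalesce_pages([0x5000, 0x1000, 0x2000])
print([(hex(start), hex(size)) for start, size in runs])
print(len(runs))
```

An IPI-based backend could instead ship the whole list to the target harts in one interrupt, which is the batching argument made in the conversation.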
crabbedhaloablut has joined #riscv
<geist> there's another extension in some level of flight to just add ARM style TLB shootdowns
<geist> i'm sure coming from currently existing ARM vendors
<sorear> where?
<geist> i thought i saw something like that, but maybe i shouldn't have mentioned it
<geist> may be in the pre-widespread phase, which means who knows what
<geist> assume that pretty much every major ARM feature has someone that wants to bring it to riscv
<davidlt> yeah, it could be it's just being discussed in a small circle before bringing it up with a decent support for it
<geist> if for no other reason than that there are a few vendors that make their own arm cores that would love to just plonk a riscv decoder in front of it
<geist> ie, things like load/store two words at a time i've seen float around, etc
<sorear> that's Zilsd and it's on the list
<davidlt> sorear, initial discussions are just discussions, and it takes time before it becomes a proper proposal
<geist> oh that's a great page
<sorear> davidlt: are you implying that you've seen said discussions?
<davidlt> sorear, not the one geist mentioned
<sorear> anyway, the starfive thing probably shouldn't be called an "errata" if it's actually an extension to speed up something that can be done slowly with Zicbom, and wrapping it in an SBI call seems like an entirely reasonable approach until the privileged isa hc comes up with a better solution
<conchuod> sorear: Ye they already got told not to make it an erratum
<sorear> why are people so afraid of having more than a few vendor extensions? is there some deep assumption in linux that extensions can be represented as bits in a long or something?
<conchuod> I dunno man
<conchuod> I'm the one saying "make vendor extensions"
<sorear> powerpc is up to AT_HWCAP4, armv9 and the z/architecture principles of operation are up to several dozens of pages of "features" and "facilities", and that's just _standard_ functionality
<conchuod> I was thinking that we should get them to do what was done for the andes stuff and hide the detail behind an ecall - but I am wondering if there's gonna be a starfive one, at least one (maybe more) sifive one and an andes one should I go to tech-prs and get some standard ecall for it?
<sorear> throw the things in a struct and don't worry about it until we have 1000 of them
<gurki> sorear: riscv is on the edge of being as convoluted as other isas
<gurki> sorear: albeit "this does not happen" was one of the major goals
<geist> agreed. doesn't bother me, as long as the presence of the feature doesn't require software to know about it or it cant boot
<gurki> (just my personal impression, not neccesarily justified)
<geist> ie, add an enable bit if it causes forward compatibility issues
<sorear> my personal impression is that it was always intended as "you pay for what you're using"
<geist> but then there are a few cases where that doesn't happen so far
<gurki> "you get what you pay for" is a very difficult approach to isa, since somebody has to spend money to make a chip out if it
<sorear> supporting both vector ISA research and baby's first CS250 risc ISA in the same architecture would never have worked in another way
<gurki> out of it*
<geist> SVADU vs SVADE, not being able to cap the size of the vector registers in supervisor mode, etc
<geist> but iirc there's a new feature to let you more properly cap that stuff
<geist> though i'm sure it's m mode only which means it's not generically useful to S mode systems unless some new way exists to get to them
<sorear> if software doesn't know about the vector extension, VS in mstatus will never get set to not-Off, vector instructions will never run and the state doesn't bother anyone
<conchuod> The only thing that annoys me about extensions in linux is that it needs to know about them to report to userspace that theyre there.
<geist> sorear: the trouble is supervisor mode has to opt into the *entire* size of the register
<geist> cant cap say a 512 bit register at 256, or 128
<conchuod> And I don't mean hwprobe, just in cpuinfo itself.
<geist> like you can on ARM SVE
<geist> all it needs is a register that says 'user mode sees this size'
___nick___ has joined #riscv
<sorear> 512 bits isn't an "upgrade" of 256 bits, so 512-bit-unaware software doesn't really make sense
<sorear> moore's law is dead so VLEN isn't going to monotonically increase, it's an implementation property of chips that can vary freely
<geist> but for the kernel it has to save the entire state of the hardware
<geist> because you can't mask off and only expose a subset to user space
<geist> which costs time and space
___nick___ has quit [Client Quit]
<sorear> I can't see any reasonable scenario where exposing a subset of the vector state to user space would make sense energy-wise
<geist> sure it would. you have some simple apps that you only give say 128 bits to, and thus you dont have to save the full state during context switch
<geist> so you allocate less space in the kernel to do it, save the subset of the registers, etc
<geist> then you have an app or thread that wants more? great, give it the full beans
<geist> and/or you set some sort of global cap of say 128 independent of whatever hardware you're running on
___nick___ has joined #riscv
<geist> because you dont want or have support for arbitrarily sized vectors in your kernel
<sorear> you're wasting half the energy on every vector _instruction_ to save a bit of memory on _context switches_, which should be much rarer. better would be to just run the simple apps without vectors at all, which allows the vector unit to be clock-gated as a whole
davidlt has quit [Ping timeout: 246 seconds]
mlw has quit [Ping timeout: 246 seconds]
<geist> ah but as soon as you compile with vectors the compiler jumps all over it, so now you need two sets of libs/etc that do and dont have vectors
<geist> and the memory is fairly substantial, since that's space stored per thread
<gurki> geist: thats precisely what pretty much all our math libraries do
<gurki> if youre not a math library / heavily math oriented the question arises whether you benefit enough from simd to justify the clock drops
<geist> yah my point is if you had the ability to cap the vector size in the kernel you can pick and choose at a much nicer granularity
<gurki> the latter got better for x86, its only a few hundred MHz for avx512, but if you just have a single avx instruction in 100 lines of asm you might end up with less performance
<geist> and dont have to have vector vs non vector libs/code, you can just choose that certain apps get more vector
<geist> and otherwise dont waste space in the kernel for all the vector state
<sorear> I doubt any existing SVE core has clock drops for SVE _that depend on the configured vector length_
<geist> we side stepped it in zircon by at the moment just saying if the hardware is > 128 we dont support it, because adding arbitrarily large sized vector state save in the kernel is annoying
<geist> if you actually go up to 64kbits (which obviously isnt happening any time soon) the state is like 2MB
<gurki> sorear: now im curious *shoots up benchmarks*
<sorear> intel vectors are weird because of their self-inflicted boundary conditions, it's unreasonable to think that riscv hardware would do the same thing
<sorear> max vector state for V 1.0 is 256 kB with VLEN=65536
<geist> ah yes that's right, 2Mb, 256KiB
<sorear> to my knowledge no shipped chip has VLEN>1024, so allocate 4kB and call it a day
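The sizes quoted here follow from the V 1.0 register file: 32 vector registers of VLEN bits each. A small sketch of the arithmetic, ignoring the few vector CSRs (vtype, vl, vstart, vcsr) that a kernel also saves; `vector_state_bytes` is a hypothetical helper name.

```python
NUM_VREGS = 32  # v0..v31 in the V extension


def vector_state_bytes(vlen_bits):
    """Bytes needed to save the vector register file for a given VLEN."""
    return NUM_VREGS * vlen_bits // 8


for vlen in (128, 256, 512, 1024, 65536):
    size = vector_state_bytes(vlen)
    print(f"VLEN={vlen:>5}: {size} bytes ({size / 1024:g} KiB)")
```

VLEN=65536 gives 262144 bytes (256 KiB, or 2 Mbit), and VLEN=1024 gives 4096 bytes, matching the "allocate 4kB" suggestion. The cost is paid per thread, so it multiplies by the number of threads that have used vectors.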
<geist> yah that's why we just set the cap at 128 for now, and will just raise the cap over time
<sorear> 256 is very common though, so capping at 128 is just being difficult for the sake of being difficult
<sorear> even if it worked you'd be taking a huge efficiency penalty for no reason
<geist> well, when i see a 256 i'll raise it
<geist> it is memory, when you have thousands of threads the space adds up
<gurki> i'm inclined to agree that most numerical stuff shoots for 256 now
<sorear> the current sifive x280 datasheet is vlen=512, does "see" mean "own"
<geist> oh 100%, i just haven't personally seen a riscv core with 256
<geist> yes. as in i do not own a x280
<geist> or more importantly i do not have access to one to run on
unlord has quit [Ping timeout: 255 seconds]
unlord has joined #riscv
prabhakar has quit [Ping timeout: 272 seconds]
crossdev has quit [Read error: Connection reset by peer]
crossdev has joined #riscv
Narrat has quit [Quit: They say a little knowledge is a dangerous thing, but it's not one half so bad as a lot of ignorance.]
crossdev has quit [Remote host closed the connection]
ntwk has quit [Read error: Connection reset by peer]
ntwk has joined #riscv
zjason has quit [Read error: Connection reset by peer]
Leopold has quit [Ping timeout: 260 seconds]
Leopold has joined #riscv
zjason has joined #riscv
psydroid has quit [Quit: KVIrc 5.0.0 Aria]
germ has joined #riscv
Starfoxxes has joined #riscv
germ has quit [Changing host]
germ has joined #riscv
___nick___ has quit [Ping timeout: 240 seconds]
naoki has joined #riscv
jmdaemon has joined #riscv
sevan has joined #riscv
sevan has quit [Changing host]
sevan has joined #riscv
jmdaemon has quit [Ping timeout: 256 seconds]
vagrantc has quit [Quit: leaving]
jmdaemon has joined #riscv
naoki has quit [Quit: naoki]