#riscv on 2024-05-05 — irc logs at libera.irclog.whitequark.org

2023-08-11 11:05 sorear changed the topic of #riscv to: RISC-V instruction set architecture | https://riscv.org | Logs: https://libera.irclog.whitequark.org/riscv | Matrix: #riscv:catircservices.org

00:03 eightthree has quit [Ping timeout: 264 seconds]

00:10 <dh`> to do any kind of useful coroutining you need to mess with the stack pointer

00:10 <dh`> at least for anything I'd think of as a coroutine

00:11 <dh`> maybe if the compiler's done CPS conversion first that's not necessary

00:11 <dh`> but in that case it's just a call so nu...?

00:16 <sorear> wondering if it might work better for ada or something else pre-c-hegemony

00:36 eightthree has joined #riscv

01:03 sympt has joined #riscv

01:27 eightthree has quit [Ping timeout: 264 seconds]

01:36 vagrantc has joined #riscv

01:44 heat has joined #riscv

02:16 eightthree has joined #riscv

02:16 heat has quit [Ping timeout: 268 seconds]

02:34 Tenkawa has quit [Quit: Was I really ever here?]

02:51 eightthree has quit [Ping timeout: 260 seconds]

03:00 BootLayer has joined #riscv

03:00 eightthree has joined #riscv

03:29 vagrantc has quit [Ping timeout: 268 seconds]

03:54 <dramforever[m]> oh hey i got it working, kinda https://github.com/dramforever/crobber/blob/main/src/lib.rs#L68

03:58 <dramforever[m]> rustc wouldn't let me clobber s0 and s1 on riscv so it's a bit annoying

04:03 <dramforever[m]> it's also really annoying to have to do jalr ra, t0 like this...

04:48 test924 has joined #riscv

05:15 <sorear> if you're switching between coroutines that have their own procedure calls, it might not be appropriate to use the RAS hint

05:16 <sorear> actually it's rather useless unless crob_yield is #[inline(always)] because the predicted next instruction address will _always be the same_

05:17 <sorear> I don't see why it should deny s0/s1 when the other registers are obviously fine (if it were only s0 I might suspect that it wants you to use the fp name)

05:59 vagrantc has joined #riscv

06:07 vagrantc has quit [Quit: leaving]

06:26 <dramforever[m]> i see it being inlined alright and didn't bother putting #[inline(always)] in it, but yes it is intended to be inlined

06:27 <dramforever[m]> if it weren't and was using the "usual" calling conventions it's no different from the usual "save all callee saved regs" kind of coroutines

06:28 <dramforever[m]> https://github.com/rust-lang/rust/issues/85056 mentions llvm problems with no explanation

06:30 <dramforever[m]> oh, link wrt clobbering s0 and s1

06:32 mlw has joined #riscv

06:53 coldfeet has joined #riscv

06:57 stolen has joined #riscv

07:33 coldfeet_ has joined #riscv

07:34 coldfeet has quit [Ping timeout: 252 seconds]

07:40 mlw has quit [Ping timeout: 246 seconds]

07:43 mlw has joined #riscv

07:45 jacklsw has joined #riscv

07:47 mlw has quit [Ping timeout: 256 seconds]

07:58 mlw has joined #riscv

08:03 coldfeet_ has quit [Quit: leaving]

08:07 psydroid has joined #riscv

08:19 jacklsw has quit [Ping timeout: 268 seconds]

08:21 coldfeet has joined #riscv

08:22 courmisch_ has quit [Quit: Reconnecting]

08:22 courmisch has joined #riscv

08:33 <courmisch> dramforever[m]: wouldn't this work better with naked_fn perhaps?

09:31 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

09:31 TMM has joined #riscv

09:51 coldfeet has quit [Remote host closed the connection]

10:01 <dramforever[m]> courmisch: it would *work*, but the benefits of it being inlinable would be lost

10:07 <dramforever[m]> i also found that at least gcc is fine with clobbering the frame pointer register with -fomit-frame-pointer, and doesn't have a "reserved register" problem, which would mean a much cleaner implementation (only tried x86 so far, but i think it should work analogously for riscv)

10:09 felixonmars has quit [Ping timeout: 268 seconds]

10:10 jfsimon1981_b has quit [Remote host closed the connection]

10:10 jfsimon1981_b has joined #riscv

10:13 BootLayer has quit [Quit: Leaving]

10:21 felixonmars has joined #riscv

10:25 jfsimon1981_b has quit [Remote host closed the connection]

10:26 jfsimon1981_b has joined #riscv

10:32 coldfeet has joined #riscv

10:46 BootLayer has joined #riscv

11:03 jfsimon1981_b has quit [Remote host closed the connection]

11:03 jfsimon1981_c has joined #riscv

12:00 zjason``` has quit [Ping timeout: 260 seconds]

12:15 luca_ has joined #riscv

12:15 luca_ is now known as OwlWizard

12:25 Noisytoot has quit [Excess Flood]

12:27 Andre_Z has joined #riscv

12:28 Noisytoot has joined #riscv

12:29 Tenkawa has joined #riscv

12:45 OwlWizard has quit [Quit: OwlWizard]

12:56 heat has joined #riscv

12:57 stolen has quit [Quit: Connection closed for inactivity]

13:03 Forty-Bot has quit [Ping timeout: 252 seconds]

13:38 test925 has joined #riscv

13:40 eightthree has quit [Ping timeout: 256 seconds]

13:40 test924 has quit [Ping timeout: 240 seconds]

13:43 eightthree has joined #riscv

13:53 BootLayer has quit [Quit: Leaving]

13:55 beber_ has quit [Quit: Gateway shutdown]

14:02 beber_ has joined #riscv

14:19 <unlord> question for you fine people

14:19 <unlord> vwmulu.vx v16, v8, t3 <-- this is giving me an illegal instruction when LMUL=m8

14:21 <courmisch> mixed width is illegal with m8

14:21 <courmisch> since it would result in EMUL=16 for the wide operand(s)

14:24 beber_ has quit [Quit: Gateway shutdown]

14:24 <unlord> right, that is why I put it in v16

14:24 <courmisch> maybe but EMUL>8 is illegal

14:24 <unlord> lame :)

14:25 <courmisch> it wouldn't give any better performance than two instructions with EMUL=8 in any reasonable IP, I think

14:26 marcj has quit [Ping timeout: 272 seconds]

14:26 beber_ has joined #riscv

14:26 <courmisch> I mean, if your hardware can parallelise that much, then it's time to double the vector length

14:27 <unlord> it just means more special cases

14:28 <courmisch> it's just that conventionally LMUL is for the narrow operand(s)

14:28 <courmisch> in turn, that convention is because it makes it more likely that you won't need to change vcfg

14:29 <courmisch> (though I'm not sure that claim is based in any reality)
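
The EMUL rule courmisch describes above can be checked mechanically. This is a minimal Python sketch, assuming the vector spec's rules that a widening instruction's destination group uses EMUL = 2 * LMUL and that any EMUL outside 1/8..8 is reserved. The function names are hypothetical and only for illustration.

```python
from fractions import Fraction

def widening_dest_emul(lmul):
    """EMUL of the double-width destination group of a widening op (2 * LMUL)."""
    return 2 * Fraction(lmul)

def widening_is_legal(lmul):
    """True if the destination EMUL stays inside the allowed 1/8..8 range."""
    return Fraction(1, 8) <= widening_dest_emul(lmul) <= 8

# LMUL=m8 would need EMUL=16 for the wide destination, which is reserved.
print(widening_is_legal(8))   # False
# LMUL=m4 gives EMUL=8, which is fine.
print(widening_is_legal(4))   # True
```

Under these assumptions, an m8 source group has to be handled as two m4 widening operations, each writing an EMUL=8 destination.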

14:52 ___nick___ has joined #riscv

15:14 psydroid has quit [Quit: KVIrc 5.0.0 Aria http://www.kvirc.net/]

15:19 psydroid has joined #riscv

15:46 <unlord> is there any hardware out there where V does not imply Zbb ?

16:03 <courmisch> no

16:04 Andre_Z has quit [Quit: Leaving.]

16:05 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

16:05 TMM has joined #riscv

16:27 BootLayer has joined #riscv

16:48 esv has joined #riscv

17:02 <unlord> courmisch: why is there no mf16 if there is encoding space for it?

17:04 dzaima[m] has joined #riscv

17:04 <dzaima[m]> fractional LMUL is there for converting between different width integers while not increasing register usage; and mf8 is enough to convert 8-bit elements to a m1 register of 64-bit elements. Were 128-bit element types to be added, an mf16 would supposedly be added too

17:06 <dzaima[m]> spec: "Implementations must provide fractional LMUL settings that allow the narrowest supported type to occupy a fraction of a vector register corresponding to the ratio of the narrowest supported type’s width to that of the largest supported type’s width."

17:10 psydroid has quit [Quit: KVIrc 5.0.0 Aria http://www.kvirc.net/]

17:18 <courmisch> unlord: in theory, as dzaima pointed out, it would only make sense for e128

17:18 <courmisch> that being noted, at least K230 does see performance improvements from using fractional multipliers even in absence of mixed width

17:19 <courmisch> I guess that VLMUL=0b100 is reserved for either MF16 or M16 whichever (if any) ends up making sense in the future
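
The fractional-LMUL requirement quoted above comes down to LMUL >= SEWMIN/ELEN. Here is a small Python sketch of that ratio together with the vlmul field encoding from the vector spec, where 0b100 is the reserved slot courmisch mentions. The names `VLMUL_ENCODING` and `smallest_required_lmul` are hypothetical, and the ELEN=128 case is purely speculative.

```python
from fractions import Fraction

# vlmul[2:0] field encoding; None marks the reserved value.
VLMUL_ENCODING = {
    0b000: Fraction(1),
    0b001: Fraction(2),
    0b010: Fraction(4),
    0b011: Fraction(8),
    0b100: None,            # reserved
    0b101: Fraction(1, 8),  # mf8
    0b110: Fraction(1, 4),  # mf4
    0b111: Fraction(1, 2),  # mf2
}

def smallest_required_lmul(sew_min, elen):
    """Smallest fractional LMUL an implementation must support: SEWMIN / ELEN."""
    return Fraction(sew_min, elen)

print(smallest_required_lmul(8, 64))    # 1/8: mf8 is enough for e8 -> e64
print(smallest_required_lmul(8, 128))   # 1/16: a hypothetical e128 would want mf16
```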

17:28 sakman_ is now known as sakman

17:55 <sorear> it's annoying because (a) pipelined multipliers generate the high and low halves in the same cycle, so mul; mulh wastes half the output in each cycle (b) sifive has a double-width write port, so widening and single-width operations take the same number of cycles at a given LMUL

17:57 <sorear> best multiplier occupancy for a bignum multiply you can get is iirc around 40%, I forgot most of the details

17:58 <sorear> (tip: if you interleave a large vector between several vector registers, you can do a "slide by 1" by renaming the registers with only one slide instruction)
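
One way to read the interleaving tip above, modelled with Python lists standing in for vector registers. This is an interpretation, not necessarily the exact scheme sorear has in mind. It assumes element i of the logical vector lives in register i % k at lane i // k, and that the length is a multiple of k. All function names are hypothetical.

```python
def interleave(vec, k):
    """Spread vec over k 'registers': element i goes to register i % k, lane i // k."""
    return [vec[r::k] for r in range(k)]

def deinterleave(regs):
    """Rebuild the logical vector from the interleaved registers."""
    k = len(regs)
    n = sum(len(r) for r in regs)
    return [regs[i % k][i // k] for i in range(n)]

def slide_down_by_one(regs, fill=0):
    """Logical slide-down by 1 using a single real slide plus register renaming."""
    wrapped = regs[0][1:] + [fill]  # the only actual slide (like a slide1down)
    return regs[1:] + [wrapped]     # the other registers are just renamed

regs = interleave(list(range(8)), 4)
print(deinterleave(slide_down_by_one(regs)))  # [1, 2, 3, 4, 5, 6, 7, 0]
```

In this model only register 0 is touched by a slide. Registers 1..k-1 keep their contents and simply take the names 0..k-2.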

18:28 Tenkawa has quit [Ping timeout: 268 seconds]

18:30 Tenkawa has joined #riscv

18:50 mlw has quit [Ping timeout: 268 seconds]

18:57 Andre_Z has joined #riscv

19:15 hightower2 has joined #riscv

19:17 Andre_Z has quit [Quit: Leaving.]

19:31 jfsimon1981_c has quit [Remote host closed the connection]

19:32 jfsimon1981_c has joined #riscv

19:47 coldfeet has quit [Remote host closed the connection]

19:52 BootLayer has quit [Quit: Leaving]

20:01 jfsimon1981_c has quit [Remote host closed the connection]

20:02 jfsimon1981_c has joined #riscv

20:04 ___nick___ has quit [Ping timeout: 260 seconds]

21:05 jfsimon1981_c has quit [Read error: Connection reset by peer]

21:05 jfsimon1981_c has joined #riscv

21:20 marcj has joined #riscv

21:45 markh has quit [Remote host closed the connection]

21:49 markh has joined #riscv

22:06 luca_ has joined #riscv

22:06 luca_ has quit [Remote host closed the connection]

23:03 Trifton has quit [Quit: ~~~RiDiN tHe WaVeS~~~]

23:21 Noisytoot has quit [Ping timeout: 256 seconds]

23:22 DesRoin has quit [Ping timeout: 272 seconds]

23:23 DesRoin has joined #riscv

23:24 Noisytoot has joined #riscv

23:29 beber_ has quit [Quit: Gateway shutdown]

23:41 beber_ has joined #riscv

23:42 beber_ has quit [Client Quit]

23:51 beber_ has joined #riscv