DesRoin has quit [Ping timeout: 256 seconds]
DesRoin has joined #riscv
heat has quit [Ping timeout: 264 seconds]
fuwei has joined #riscv
armand__ has quit [Remote host closed the connection]
armand__ has joined #riscv
fuwei has quit [Quit: Konversation terminated!]
vagrantc has quit [Ping timeout: 246 seconds]
u0_a127 has joined #riscv
eshep_ has joined #riscv
eshep has quit [Ping timeout: 260 seconds]
eshep has joined #riscv
eshep_ has quit [Ping timeout: 255 seconds]
BootLayer has joined #riscv
eshep has quit [Ping timeout: 255 seconds]
eshep_ has joined #riscv
eshep has joined #riscv
eshep_ has quit [Ping timeout: 252 seconds]
Guest36 has joined #riscv
coldfeet has joined #riscv
jacklsw has joined #riscv
naoki has quit [Remote host closed the connection]
coldfeet has quit [Remote host closed the connection]
vagrantc has joined #riscv
vagrantc has quit [Quit: leaving]
jack_lsw has joined #riscv
jack_lsw has quit [Quit: Back to the real life]
davidlt has joined #riscv
jacklsw has quit [Ping timeout: 252 seconds]
unnick has quit [Ping timeout: 264 seconds]
unnick has joined #riscv
Stat_headcrabed has joined #riscv
TMM has joined #riscv
paddymahoney has quit [Remote host closed the connection]
Stat_headcrabed has quit [Quit: Stat_headcrabed]
jacklsw has joined #riscv
paddymahoney has joined #riscv
jacklsw has quit [Ping timeout: 268 seconds]
jacklsw has joined #riscv
jacklsw has quit [Remote host closed the connection]
jacklsw has joined #riscv
paddymahoney has quit [Quit: Leaving]
coldfeet has joined #riscv
eshep_ has joined #riscv
eshep has quit [Ping timeout: 256 seconds]
eshep has joined #riscv
eshep_ has quit [Ping timeout: 240 seconds]
jacklsw has quit [Ping timeout: 268 seconds]
davidlt has quit [Ping timeout: 272 seconds]
eshep_ has joined #riscv
eshep has quit [Ping timeout: 264 seconds]
davidlt has joined #riscv
eshep has joined #riscv
eshep_ has quit [Ping timeout: 264 seconds]
naoki has joined #riscv
u0_a127 has quit [Ping timeout: 268 seconds]
davidlt has quit [Ping timeout: 246 seconds]
u0_a127 has joined #riscv
eshep has quit [Read error: Connection reset by peer]
eshep has joined #riscv
naoki has quit [Ping timeout: 260 seconds]
naoki has joined #riscv
naoki1 has joined #riscv
naoki has quit [Ping timeout: 268 seconds]
naoki1 is now known as naoki
naoki1 has joined #riscv
naoki has quit [Ping timeout: 268 seconds]
naoki1 is now known as naoki
mubluekoor has quit [Quit: mubluekoor]
naoki1 has joined #riscv
naoki has quit [Ping timeout: 264 seconds]
naoki1 is now known as naoki
naoki1 has joined #riscv
naoki has quit [Ping timeout: 260 seconds]
naoki1 is now known as naoki
naoki1 has joined #riscv
naoki has quit [Ping timeout: 255 seconds]
naoki1 is now known as naoki
davidlt has joined #riscv
psydroid2 has joined #riscv
sgerhold has quit [Quit: :/]
sgerhold has joined #riscv
Stat_headcrabed has joined #riscv
mubluekoor has joined #riscv
naoki has quit [Quit: naoki]
coldfeet has quit [Quit: leaving]
Guest36 has quit [Quit: Client closed]
u0_a127 has quit [Remote host closed the connection]
mlw has joined #riscv
davidlt has quit [Ping timeout: 264 seconds]
davidlt has joined #riscv
eshep has quit [Ping timeout: 264 seconds]
eshep has joined #riscv
coldfeet has joined #riscv
leah2 has quit [Ping timeout: 246 seconds]
heat has joined #riscv
davidlt has quit [Ping timeout: 255 seconds]
davidlt has joined #riscv
leah2 has joined #riscv
u0_a127 has joined #riscv
fuwei has joined #riscv
pecastro has joined #riscv
Narrat has joined #riscv
leah2 has quit [Ping timeout: 246 seconds]
<drewfustini> wmat: thanks!
leah2 has joined #riscv
<courmisch> anybody toyed with SpacemiT IME?
dramforever[m] has joined #riscv
<dramforever[m]> actually reads username oh you probably already know that
<courmisch> the year is 2024, and people still write inline assembler that assumes the C compiler won't clobber state between asm stanzas
<dramforever[m]> when was allocating vector registers in inline asm added to gcc and llvm?
<dramforever[m]> courmisch: are you referring to the benchmark i posted? i don't see inline assembler there
<dramforever[m]> oh wait, a little bit
<dramforever[m]> but it doesn't assume the c compiler won't clobber state
<dramforever[m]> s/posted/linked/ above
<dramforever[m]> ah, read again, that's just reading the max vl in a convoluted way
<dramforever[m]> the actual kernels are in asm/ with names like vector_vfmacc_vv_f32f32f32
<courmisch> dramforever[m]: 1) that's invalid too because it fails to mark vl and vtype clobbers
<dramforever[m]> ooh that's true
<courmisch> dramforever[m]: and 2) it's not clear if reg_new_isa expands to assembler or what
<dramforever[m]> no reg_new_isa is just for printing results
<dramforever[m]> but i missed the vsetvli the first time and thought those were okay when i caught a glimpse the second time, mb
<dramforever[m]> i mean, the vl problem
<courmisch> I have never seen vsetvl inline assembler that did it right
<courmisch> except code I wrote myself
<dramforever[m]> i think done right it should just be a csrr vlenb
<courmisch> in this case, yes
<courmisch> but I mean generally
<dramforever[m]> yes
<courmisch> really, vector code should be outlined to .S or nakedfn, or written in intrinsics if you really can't asm
<dramforever[m]> good that it is .S here!
<dramforever[m]> it's also really contrived it seems
<courmisch> yeah, why is it preserving vectors in .S
<courmisch> and why #ifdef __APPLE__
<dramforever[m]> i think that's just cargo cult programming
<courmisch> and why vxor.vv. Somebody did too much x86
<dramforever[m]> yeah i'm trying to figure that out rn, it seems they're just repeatedly multiplying zero matrices?
<dramforever[m]> and it's just a big unrolled loop doing nothing in particular
<courmisch> dramforever[m]: vmadot is, as the mnemonic vaguely implies, a MAC, not a MUL
<courmisch> so you do have to zero the output initially
<courmisch> well, at least unless you already have something to add to
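The MAC-versus-MUL point above can be sketched in plain Python. This is only a model of multiply-accumulate semantics on small integer matrices, not the actual vmadot instruction; the function name `madot` and the 2x2 operands are made up for illustration.

```python
def madot(acc, a, b):
    """Return acc + a @ b for integer matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [acc[i][j] + sum(a[i][k] * b[k][j] for k in range(inner))
         for j in range(cols)]
        for i in range(rows)
    ]

# Hypothetical operands, chosen only to keep the arithmetic easy to follow.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
zero = [[0, 0], [0, 0]]

# With a zeroed accumulator the MAC yields the plain product.
product = madot(zero, a, b)

# Reusing the previous result as accumulator adds the product a second time.
running = madot(product, a, b)

print(product)
print(running)
```

If the accumulator is not zeroed first, whatever was in the destination ends up added to the result, which is presumably why the benchmark clears its output registers before the loop.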
mlw has quit [Quit: leaving]
<dramforever[m]> i'm pretty sure they zeroed everything
<courmisch> the weird thing is that they compute the same thing over and over, but that's probably for benchmarking
<dramforever[m]> yeah right it still makes sense if the performance of the instructions is the same
<dramforever[m]> it's just awfully unclear what computation it's measuring the performance of
<dramforever[m]> my current guess is "nothing"
<dramforever[m]> starting to regret linking to it now, a coworker was looking into it so i thought it was relevant
<courmisch> AFAIU, it's products of 8x4 by 4x8 matrices of s8
<dramforever[m]> > A CPU tool for benchmarking the peak of floating points
<dramforever[m]> "floating point" is out of the window now
<dramforever[m]> i guess, just how many arithmetic operations per second you can fit into this thing?
<courmisch> and I don't understand why it outputs to a pair of vectors. a 4x4 matrix of s32 should fit in one single vector (assuming VLEN=256), no?
<courmisch> dramforever[m]: would have to look how they assemble vmadot, but the FP version is supposed to be named vfmadot{,1,2,3,n}
dilfridge is now known as Agent_K
<courmisch> yeah I have the PDF, but I don't understand the in-vector layout
Agent_K is now known as dilfridge
<courmisch> there must be something incredibly obvious to AI people that I don't know or something
<courmisch> the PDF also does not explain what 'e4' is? I'd guess vtype.vsew = 0b111 ??
<dramforever[m]> > This extension instructions only support cases where LMUL is less than or equal to 1.
<dramforever[m]> i'm now very confused as well
<courmisch> yeah, but SEW=4 ? the encoding is not specified anywhere. I can only guess it's vsew = 7 (-1)
sevan has quit [Changing host]
sevan has joined #riscv
<courmisch> well, I don't care about e4 anyway. That's only for AI buzz people
<courmisch> I guess I'll just have to VS8R to try to reverse engineer the vector layout :/
<dramforever[m]> actually where were you looking at wrt outputting a pair of vectors
<courmisch> "the index of VD(L) must be even"
<courmisch> also in the instruction encoding section, the low bit for VD is forced 0
<dramforever[m]> i kinda get the integer case, you multiply a 4x(64/SEW) matrix of SEW-bit elements and a (64/SEW)x4 matrix to get a 4x4 s32 matrix
<dramforever[m]> so the inputs are both 4 * (64/SEW) * SEW = 256 bits, but the output is 4 * 4 * 32 = 512 bits
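The bit counting above can be checked with a few lines of Python. This is a sketch of the arithmetic as described in the chat, not taken from the SpacemiT PDF; the constant names are made up, and VLEN=256 is the assumption mentioned earlier in the discussion.

```python
ROWS = 4          # rows of the first operand, per the description above
GROUP_BITS = 64   # each row packs 64 bits of SEW-bit elements
ACC_BITS = 32     # s32 accumulator elements
VLEN = 256        # assumed vector register width in bits

def madot_footprint(sew):
    """Return (inner dimension, bits per input, bits of output) for a given SEW."""
    inner = GROUP_BITS // sew
    input_bits = ROWS * inner * sew
    output_bits = ROWS * ROWS * ACC_BITS
    return inner, input_bits, output_bits

for sew in (4, 8):
    inner, in_bits, out_bits = madot_footprint(sew)
    regs = out_bits // VLEN
    print(f"SEW={sew}: 4x{inner} by {inner}x4, inputs {in_bits} bits each, "
          f"output {out_bits} bits = {regs} registers")
```

Under these assumptions each input fills exactly one 256-bit register regardless of SEW, while the 4x4 s32 result needs two, which would be consistent with the requirement that the destination index be even.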
<dramforever[m]> i don't get why this is still the case for vfmadot which has 16-bit output element width
iooi has quit [Ping timeout: 252 seconds]
<courmisch> dramforever[m]: it isn't? it just says VD, not VD(L) / VD(H) ...
iooi has joined #riscv
<courmisch> okay, I think I get it now
<courmisch> it's actually VD += VS1 x TRANSPOSE(VS2)
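One way to read VD += VS1 x TRANSPOSE(VS2) is that both sources are stored with the same shape, and each output element is the dot product of a row of VS1 with a row of VS2. The Python sketch below models that reading only; the function name `madot_transposed` and the tiny operands are hypothetical, and the real in-register layout is not confirmed here.

```python
def madot_transposed(vd, vs1, vs2):
    """Return vd + vs1 @ transpose(vs2): element (i, j) adds dot(vs1[i], vs2[j])."""
    return [
        [vd[i][j] + sum(x * y for x, y in zip(vs1[i], vs2[j]))
         for j in range(len(vs2))]
        for i in range(len(vs1))
    ]

# Hypothetical 2x3 operands; both sources share the same row-major shape.
vs1 = [[1, 0, 2], [0, 1, 0]]
vs2 = [[1, 1, 1], [2, 0, 1]]
vd = [[0, 0], [0, 0]]

result = madot_transposed(vd, vs1, vs2)
print(result)
```

With this reading, neither operand has to be rearranged in memory: both are loaded row by row, and the transpose is implicit in how rows are paired.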
<courmisch> the charts are wrong or misleading, if you ask me
BootLayer has quit [Quit: Leaving]
iooi has quit [Read error: Connection reset by peer]
iooi has joined #riscv
naoki has joined #riscv
TMM has joined #riscv
Stat_headcrabed has quit [Quit: Stat_headcrabed]
fuwei has quit [Ping timeout: 264 seconds]
fuwei has joined #riscv
davidlt has quit [Ping timeout: 264 seconds]
jfsimon1981_c has joined #riscv
mark4o has joined #riscv
markh has quit [Ping timeout: 264 seconds]
mark4o is now known as markh
jfsimon1981_c has quit [Remote host closed the connection]
jfsimon1981_c has joined #riscv
coldfeet has quit [Remote host closed the connection]
iooi has quit [Read error: Connection reset by peer]
iooi has joined #riscv
raym has quit [Ping timeout: 264 seconds]
raym has joined #riscv
iooi has quit [Read error: Connection reset by peer]
iooi has joined #riscv
iooi has quit [Read error: Connection reset by peer]
iooi has joined #riscv