sorear changed the topic of #riscv to: RISC-V instruction set architecture | | Logs:
hendursaga has quit [Remote host closed the connection]
hendursaga has joined #riscv
jimwilson has joined #riscv
vagrantc has quit [Quit: leaving]
radu242407 has quit [*.net *.split]
mcfrdy has quit [*.net *.split]
dobson has quit [*.net *.split]
mcfrdy has joined #riscv
radu242407 has joined #riscv
dobson has joined #riscv
Raito_Bezarius has joined #riscv
linkliu59 has joined #riscv
Raito_Bezarius has quit [Max SendQ exceeded]
Raito_Bezarius has joined #riscv
zapb_ has joined #riscv
edf0_ has joined #riscv
Bigcheese_ has joined #riscv
nosliot has joined #riscv
gordonDrogon has joined #riscv
jc has joined #riscv
stefanct has joined #riscv
riff-IRC has joined #riscv
hl has joined #riscv
sirn has joined #riscv
sjs has joined #riscv
scruffyfurn_ has joined #riscv
cp- has joined #riscv
awordnot has joined #riscv
Gravis has joined #riscv
leah2 has joined #riscv
pho has joined #riscv
Maylay has joined #riscv
kbingham_ has joined #riscv
awordnot has quit [Signing in (awordnot)]
awordnot has joined #riscv
kgz has joined #riscv
klys has joined #riscv
pierce has joined #riscv
BOKALDO has joined #riscv
freakazoid12345 has quit [Read error: Connection reset by peer]
charlesap[m] has joined #riscv
kaji has joined #riscv
khem has joined #riscv
winterflaw has joined #riscv
CarlosEDP has joined #riscv
EmanuelLoos[m] has joined #riscv
winterflaw has quit [Ping timeout: 244 seconds]
GenTooMan has quit [Ping timeout: 240 seconds]
GenTooMan has joined #riscv
geertu has quit [Quit: leaving]
geertu has joined #riscv
leah2 has quit [Quit: trotz alledem!]
leah2 has joined #riscv
valentin has joined #riscv
TMM_ has quit [Quit: - Chat comfortably. Anywhere.]
TMM_ has joined #riscv
winterflaw has joined #riscv
hendursa1 has joined #riscv
hendursaga has quit [Ping timeout: 244 seconds]
zjason has joined #riscv
theruran has quit [Quit: Connection closed for inactivity]
Esmil has joined #riscv
wolfshappen has joined #riscv
dlan has quit [Ping timeout: 245 seconds]
dlan has joined #riscv
drewfustini has quit []
wolfshappen has quit [Quit: later]
drewfustini has joined #riscv
wolfshappen has joined #riscv
jedix has quit [Ping timeout: 258 seconds]
jedix has joined #riscv
jwillikers has joined #riscv
dogukan has joined #riscv
dogukan has quit [Quit: Konversation terminated!]
dogukan has joined #riscv
dogukan has quit [Client Quit]
dogukan has joined #riscv
dogukan has quit [Client Quit]
rjek has quit []
rjek has joined #riscv
mthall has quit [Quit: - Chat comfortably. Anywhere.]
mthall has joined #riscv
GenTooMan has quit [Ping timeout: 256 seconds]
GenTooMan has joined #riscv
GenTooMan has quit [Excess Flood]
GenTooMan has joined #riscv
hendursa1 has quit [Quit: hendursa1]
hendursaga has joined #riscv
GenTooMan has quit [Ping timeout: 272 seconds]
GenTooMan has joined #riscv
GenTooMan has quit [Ping timeout: 256 seconds]
GenTooMan has joined #riscv
Andre_H has joined #riscv
compscipunk has joined #riscv
freakazoid333 has joined #riscv
adomas has quit []
iorem has joined #riscv
iorem has quit [Quit: Connection closed]
nvmd has joined #riscv
TMM_ has quit [Quit: - Chat comfortably. Anywhere.]
TMM_ has joined #riscv
<solrize> sorear around?
psydroid has joined #riscv
<sorear> hi
<solrize> hey can i go way off topic here for a little while, or move somewhere else? i want to talk about some AVR8 code
<solrize> i'm looking at an AVR flashlight controller which fills up the code space on the smaller AVR parts, and it seems to me that the avr-gcc output is not all that dense, so i'm wondering about the idea of using a bytecode interpreter or similar
<sorear> sure i guess, no one else seems to want the floor right now, although i wonder what makes this the most attractive channel for you
<solrize> i remember you wrote a post about code density on different cpus so i looked for you here
<solrize> figuring you might have thoughts on the topic
<solrize> sec
<sorear> ah. I don't recall ever making "a post" on that subject although it's come up on IRC a few times
<sorear> I also have only the most passing knowledge of AVR instruction encoding
<solrize> hmm maybe i'm confusing you with someone... it wasn't about avr or hardware cpus as much as encodings in general. it compared forth with smalltalk
<solrize> or the question of machine code (C compiler output that does a fair amount of 16 bit ops) vs interpreted code on 8 bit cpus in general
<sorear> the one that comes up somewhat regularly is vincent weaver's work (which I rather disagree with on the grounds that his choice of mostly compression-related benchmarks is not representative of benchmarks I would pick), but that doesn't address either forth or smalltalk
<solrize> hmm ok i'll see if i can find vincent weaver's work and also will look for the post i'm thinking of
<sorear> this is actually the first time i've heard of anyone using *smalltalk* specifically as a base for deeply embedded systems; forth is much more well-trodden ground
<solrize> the smalltalk comparison was only about code density
<solrize> this flashlight thing might have been an ok forth application though
<sorear> no
<sorear> (my github handle is my irc handle)
<sorear> i just spent a few minutes trying to find a post based on your description above
<solrize> ah ok sorry
<solrize> the person who wrote that is another regular here
<solrize> no wonder i confused you
<sorear> (who's a regular here?)
<solrize> dercuano i don't remember what nick he uses
<sorear> xentrac?
<solrize> yes, thanks
<solrize> sorry to have confused the two of you
<sorear> it's rare for this to happen *to* me, usually i'm the one that can't tell other people apart
<solrize> heh
<solrize> here is the other weaver/mckee paper
<jrtc27> tbf sorear and xentrac both show as green to me, albeit slightly different shades :D
<solrize> but yeah i looked at the later one and it is a little bit suspect
<meowray> what's the summary of the psABI Task Group meeting - 2021/08/09?
<solrize> the earlier weaver/mckee paper is not very informative, i just looked at it
mahmutov has joined #riscv
<meowray> "Yes, issue is embedded, people care a lot about code size there so can’t change the implementation until binutils has relaxation support." citation needed for the embedded claim
<jrtc27> yes, well, I've given up fighting over 10/12 bytes there
<wingsorc> I care about code size you can cite me :)
<jimwilson> I have pointed at uses of undefined weak in newlib many times. Particularly in crt0.S.
<jimwilson> The main issue is with naive users that just build a toolchain, build a benchmark, and then decide that RISC-V is broken because code is larger than ARM, without any attempt to understand what is actually going on. This is a problem for the entire RISC-V community. psABI changes that increase code size are reckless, and I won't agree to them.
<wingsorc> to be honest people roll their own crt0.S
<jimwilson> but a naive user looking at RISC-V for the first time for a quick evaluation isn't going to do that
<wingsorc> true. Actually we had people coming in complaining that RISC-V code was 10% larger than ARM
<wingsorc> I don't remember the exact configuration that was used though...
<meowray> the people who roll their own crt0.o very likely need -mcmodel=medany -fno-pic ..
<meowray> s/likely/unlikely/
haritz has joined #riscv
haritz has quit [Changing host]
haritz has joined #riscv
BOKALDO has quit [Quit: Leaving]
zjason has quit [Read error: Connection reset by peer]
zjason has joined #riscv
<solrize> is risc-v code larger than arm in real life?
<solrize> is it a matter of adding a feature to binutils (relaxation = shrinking down variable length operations when possible?)
<solrize> brb
GenTooMan has quit [Ping timeout: 258 seconds]
<jrtc27> the answer is likely "which Arm, which RISC-V and what software"...
GenTooMan has joined #riscv
<jimwilson> for embedded code, yes, risc-v is larger than arm in real life, the B extension helps a little, the zce* extensions will help more
<jimwilson> the C extension was designed using SPEC, which is a good unix benchmark but useless for embedded. this is why we have compressed float/double load/store (SPEC needs them) but no compressed char/short load/store (SPEC doesn't need them). many embedded systems have no float and use a lot of char/short data to reduce data size, so this hurts embedded code size, but zce* will fix this
GenTooMan has quit [Ping timeout: 248 seconds]
<jrtc27> Zce ranges from "this is an obvious omission" to "what on earth no that's not what RISC-V should look like" IMO..
<jrtc27> hopefully the latter ones are not needed to be competitive for code size, because I really don't like them...
<jrtc27> how much has GCC been optimised for code size, too? I know Craig and people keep finding new code size wins in LLVM
<jrtc27> some of it could just be a lack of having time (money...) poured into it
GenTooMan has joined #riscv
<jimwilson> gcc is well optimized for dhrystone and coremark code size and performance
<jimwilson> we get slightly better results for SPEC CPU2006 with gcc than llvm, but we have more people working on llvm than gcc now, so I expect that to eventually change
<jrtc27> I know a lack of linker relaxation support does hurt LLD, we see that with our tiny set of embedded benchmarks
<jimwilson> there were some jump threading patches in llvm recently that helped narrow the gap to gcc
<jrtc27> oh I remember that one, caught my eye as it mentioned coremark explicitly
Andre_H has quit [Ping timeout: 248 seconds]
<solrize> i hadn't heard about zce before
<solrize> it's different from C extension
<solrize> hmm
<solrize> thanks
dermato has quit [Ping timeout: 258 seconds]
<solrize> i'm glad this stuff is being addressed, like 1 and 2 byte operations
dermato has joined #riscv
<solrize> i still want to see bignum benchmarks to check the claim that int overflow detection doesn't matter
<jrtc27> what do you mean? why would trapping be helpful?
<jrtc27> (or flags)
<jrtc27> surely you'd need exactly the same amount of code to proactively detect overflow and allocate more space as to reactively detect it?
<solrize> well on most cpus if you want a multi precision add, you use a carry flag, and there is an add with carry instruction
<solrize> and if you divide by 0 there is a hardware trap
<solrize> and ideally since int overflow is usually a bug, a hw trap would help there too
<solrize> so you have to emit extra instructions to test all that stuff
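[editor's note] A minimal sketch of the "extra instructions" solrize describes, assuming 64-bit limbs stored least-significant first. The function name `bignum_add` is hypothetical and not from the discussion. With no carry flag, the carry out of each limb is recovered with unsigned comparisons, which a RISC-V compiler would typically lower to `sltu`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two n-limb little-endian bignums into dst; returns the final carry (0 or 1).
 * Hypothetical helper for illustration only. */
static uint64_t bignum_add(uint64_t *dst, const uint64_t *a,
                           const uint64_t *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t sum = a[i] + b[i];   /* unsigned add wraps modulo 2^64 */
        uint64_t c1  = sum < a[i];    /* 1 if a[i] + b[i] wrapped */
        uint64_t res = sum + carry;   /* fold in the carry from the previous limb */
        uint64_t c2  = res < sum;     /* 1 if adding the carry-in wrapped */
        dst[i] = res;
        carry = c1 | c2;              /* both cannot be set for the same limb */
    }
    return carry;
}
```

On an ISA with a carry flag the loop body is roughly one add-with-carry per limb; here each limb costs two adds, two compares and an OR, which is the overhead the conversation is weighing.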
<jrtc27> if I wanted to make add-with-carry efficient I'd probably have c.slti[u] exist and then do c.slti[u]; c.addi
<jrtc27> and then macro-op fuse that
<jrtc27> uh, no, you do not want to trap on int overflow
<jrtc27> mips tried that, it was unused
vagrantc has joined #riscv
<jrtc27> everything just used the non-trapping instruction
<solrize> were the trapping ones slower or anything like that?
<solrize> and mips, that was before people cared about this stuff
<jrtc27> it was mips, everything was slow
<jrtc27> but, you just broke too much code
<solrize> if code depended on non-trapping it was already broken--signed int overflow in C is UB
<jrtc27> r6 removed the trapping version
<jrtc27> sure
<jrtc27> lots of things are UB
<jrtc27> shitty code still exists
<jrtc27> and people like to assume two's complement
<solrize> thus the desirability of traps, to flag the shitty code instead of running it and letting it corrupt stuff
<solrize> if they want 2s complement they can use unsigned or -fwrapv
<solrize> which disables some optimizations
<jrtc27> I like your optimism that this forces people to fix their code rather than makes people just ignore mips
<solrize> they ignore mips for many other reasons why not one more
<solrize> anyway it's a significant sticking point, if people want C to always allow wrapping then they should take it up with the C standard committee. unintentional overflow may not happen much on 64 bit machines but it was a real issue with 32 bit because it often escaped detection. with 16 bit it happened so much that it usually got caught
<jrtc27> -fsanitize=undefined
<solrize> hmm ok if that reliably catches overflow, but i mean if it inserts a bunch of extra code and slows down the program then people won't use it
<solrize> i tried -ftrapv and there wasn't much difference on x86
<jrtc27> well it does a whole bunch of things, integer overflow detection being just one of them
<solrize> nice
<solrize> i will start using it
<solrize> i've also wanted to try kcc
<solrize> or switching from C to ada lol
<jrtc27> ubsan is pretty cheap, it's things like msan where it gets slow
<jrtc27> the headline figure for msan is ~3 times slower, and ~2 times slower for asan
<solrize> nice thanks right now i primarily use gcc
<solrize> but i think gcc also has sanitize undefined
<jrtc27> gcc has support for some of them, don't know exactly what though
<solrize> yeah
<jrtc27> yeah it vendors parts of llvm in its tree
<solrize> wow interesting i didn't know that
<jrtc27> (the run-time parts of the sanitizers, in libsanitizer)
<solrize> thanks
valentin has quit [Quit: Leaving]
peeps[zen] has quit [Read error: Connection reset by peer]
peepsalot has joined #riscv
nvmd has quit [Quit: Later, nerds.]
<meowray> -fsanitize-trap=undefined is needed to make ubsan cheap
<dh`> "with 16 bit it happened so much that it usually got caught"
<dh`> so you'd think, but virtually every DOS game has some 16-bit overflow in it
<dh`> I remember in the original railroad tycoon there was a whole succession of 16-bit overflows you'd hit as you expanded your railroad
<sorear> zce is surprisingly reasonable imo... i'd like to see the detailed benchmark results (later), hopefully this wasn't just tested on one decompression algorithm
<jrtc27> which parts of zce?
<jrtc27> most of it is fine
<jrtc27> a couple of the instructions are way too specialist, and a couple are just "no" (e.g. tbljal, no, don't do that, please)
<jrtc27> push/pop, meh, I hate it but people do that on microcontroller ISAs
<sorear> tbljal is close to word for word something I worked out months ago while trying to come up with a non-terrible version of the andes code density instruction
<jrtc27> non-terrible != good...
<jrtc27> if you want to do tbljal, make it a less architecturally crippled version and just add a load-and-branch instruction...
<jrtc27> what I don't like is that it's using a new CSR
<jrtc27> as the implicit base
<sorear> hmm, if it used gp it'd be compatible with fdpic shared libs... or you could make it truncate pc
<jimwilson> push/pop and tbljal are the ones that give the most benefit, but you don't have to implement them on unix parts where performance matters more than code size
<jrtc27> did they consider a generalised load-and-branch?
<jrtc27> because that has wider applicability
<jrtc27> and yeah you could have a compressed form that used say gp as the base
<sorear> I-types don't grow on trees, especially if you insist on encoding imm[0:1] despite the fact it will always be zero
<jrtc27> you could make it a J-type at least and shave off bit 0
<sorear> J = 8 times the space of I
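[editor's note] The factor of 8 can be checked with a little arithmetic, sketched here under the standard 32-bit base formats: one I-type instruction fixes a 7-bit opcode plus a 3-bit funct3, leaving 22 free bits (rd, rs1, imm[11:0]), while one J-type instruction fixes only the 7-bit opcode, leaving 25 free bits (rd, imm[20:1]). The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

enum { INSN_BITS = 32, OPCODE_BITS = 7, FUNCT3_BITS = 3 };

/* Number of distinct 32-bit encodings consumed by an instruction that
 * fixes `fixed_bits` bits and leaves the rest free for operands. */
static uint64_t encodings(unsigned fixed_bits)
{
    assert(fixed_bits <= INSN_BITS);
    return UINT64_C(1) << (INSN_BITS - fixed_bits);
}

/* One I-type instruction: opcode and funct3 are fixed. */
static uint64_t i_type_space(void) { return encodings(OPCODE_BITS + FUNCT3_BITS); }

/* One J-type instruction: only the opcode is fixed. */
static uint64_t j_type_space(void) { return encodings(OPCODE_BITS); }
```

Under these assumptions `j_type_space() / i_type_space()` is 2^25 / 2^22 = 8, matching sorear's figure.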
<jrtc27> oh right
<jrtc27> hmm
<jimwilson> I don't recall discussion of load-and-branch, but I haven't followed all of the discussions
wingsorc__ has joined #riscv
<jrtc27> yeah I haven't either for various reasons
<jrtc27> still have concerns about the mismatch between the code corpus in use and the intended application space for the more interesting instructions...
wingsorc has quit [Read error: Connection reset by peer]
Xark has joined #riscv
devcpu has joined #riscv
theruran has joined #riscv
mahmutov has quit [Ping timeout: 272 seconds]
winterflaw has quit [Ping timeout: 244 seconds]
ntwk has joined #riscv