sorear changed the topic of #riscv to: RISC-V instruction set architecture | https://riscv.org | Logs: https://libera.irclog.whitequark.org/riscv
hendursaga has quit [Quit: hendursaga]
hendursaga has joined #riscv
riff-IRC has quit [Read error: Connection reset by peer]
riff-IRC has joined #riscv
frost has joined #riscv
jwillikers has quit [Remote host closed the connection]
___nick___ has quit [Ping timeout: 265 seconds]
jonasbits has quit [Ping timeout: 240 seconds]
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
jacklsw has joined #riscv
tgamblin has joined #riscv
TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM_ has joined #riscv
tgamblin has quit [Remote host closed the connection]
jamtorus has joined #riscv
jellydonut has quit [Ping timeout: 252 seconds]
compscipunk has quit [Quit: WeeChat 3.2.1]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
mahmutov has quit [Ping timeout: 252 seconds]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
bgamari has quit [Ping timeout: 265 seconds]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
freakazoid343 has joined #riscv
tgamblin has joined #riscv
freakazoid333 has quit [Ping timeout: 245 seconds]
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
bgamari has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
BOKALDO has joined #riscv
riff-IRC has quit [Remote host closed the connection]
riff-IRC has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
<kaddkaka[m]> Which of the tune targets does u74 correspond to? `‘rocket’, ‘sifive-3-series’, ‘sifive-5-series’, ‘sifive-7-series’, ‘size’`
jonasbits has joined #riscv
<pierce> Afaik 7 series
<pierce> U74
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
Doraemon has joined #riscv
NeoCron has quit [Ping timeout: 260 seconds]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
<kaddkaka[m]> Thanks. U7 is dual-issue in-order; U5 and E3 are both single-issue in-order. SiFive has a lot of cores (https://www.sifive.com/documentation), so I guess many either don't have a dedicated tune target or share a (similar enough) processor implementation with one that does
tgamblin has joined #riscv
jamtorus is now known as jellydonut
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
<kaddkaka[m]> Is there any documentation for the rocket riscv impl? I can only find a tutorial from 2015
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
jjido has joined #riscv
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
pecastro has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
jjido has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
hendursa1 has joined #riscv
tgamblin has joined #riscv
eduardas has joined #riscv
tgamblin has quit [Remote host closed the connection]
hendursaga has quit [Ping timeout: 276 seconds]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
valentin has joined #riscv
smartin has joined #riscv
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
BOKALDO has quit [Quit: Leaving]
jacklsw has quit [Quit: Back to the real world]
tgamblin has joined #riscv
jacklsw has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
smartin has quit [Quit: smartin]
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
jwillikers has joined #riscv
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
Guest9190 has joined #riscv
tgamblin has joined #riscv
Guest9190 has quit [Client Quit]
dolonbus has joined #riscv
dolonbus has quit [Remote host closed the connection]
tgamblin has quit [Remote host closed the connection]
dolonbus has joined #riscv
dolonbus has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
tgamblin has quit [Remote host closed the connection]
jjido has joined #riscv
<jrtc27> xypron: I gave a whole load of feedback on gnu-efi patches at https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/merge_requests/4836 without realising they've since been upstreamed
<jrtc27> most of those look like they still apply to the patches you upstreamed
<jrtc27> also https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/issues/1227#note_674058473 gives you a way to make EFI_SUBSYSTEM work
jjido has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
riff_IRC has joined #riscv
riff-IRC has quit [Killed (NickServ (GHOST command used by riff_IRC))]
riff_IRC is now known as riff-IRC
<xypron> @jrtc27: https://github.com/rhboot/gnu-efi/pull/3 contains similar patches. I will look at how your comments apply there.
aburgess_ has joined #riscv
<jrtc27> ack
<jrtc27> my "most" was perhaps overstating things, just so happened the few I sampled at first applied, but a fair few do look like they've been resolved
<xypron> @jrtc27: unfortunately every project seems to be maintaining its own version of gnu-efi instead of maintaining a good upstream.
<jrtc27> :(
aburgess has quit [Ping timeout: 260 seconds]
tgamblin has joined #riscv
frost has quit [Quit: Connection closed]
BOKALDO has joined #riscv
BOKALDO has quit [Client Quit]
pehaef has joined #riscv
hendursa1 has quit [Quit: hendursa1]
hendursaga has joined #riscv
wolfshappen has quit [Ping timeout: 252 seconds]
wolfshappen has joined #riscv
BOKALDO has joined #riscv
aburgess_ is now known as aburgess
pehaef has quit [Quit: leaving]
jjido has joined #riscv
jjido has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
mahmutov has joined #riscv
vagrantc has joined #riscv
jacklsw has quit [Read error: Connection reset by peer]
pjw has quit [Read error: Connection reset by peer]
sorear has quit [Ping timeout: 252 seconds]
adomas has quit [Ping timeout: 240 seconds]
geist has quit [Read error: Connection reset by peer]
mobius has quit [Read error: Connection reset by peer]
rsalveti has quit [Ping timeout: 252 seconds]
NishanthMenon_ has quit [Read error: Connection reset by peer]
rsalveti has joined #riscv
geist has joined #riscv
pjw has joined #riscv
sorear has joined #riscv
mobius has joined #riscv
NishanthMenon_ has joined #riscv
valentin has quit [Quit: Leaving]
adomas has joined #riscv
elastic_dog has quit [Quit: elastic_dog]
jjido has joined #riscv
elastic_dog has joined #riscv
jjido has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
jjido has joined #riscv
cwebber has quit [Ping timeout: 245 seconds]
tgamblin has quit [Quit: Leaving]
<palmer1> is kito in here?
<jimwilson> palmer1, don't think so, and it is way too early for him
<palmer1> ya, makes sense
<palmer1> do you know if he was going to submit that plumbers talk he was talking about yesterday?
<palmer1> about the probing extensions to turn on ifuncs?
<jimwilson> I don't know
<palmer1> OK
<palmer1> I'll email him
jjido has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
BOKALDO has quit [Quit: Leaving]
eduardas has quit [Quit: Konversation terminated!]
jjido has joined #riscv
pehaef has joined #riscv
pehaef has quit [Client Quit]
cwebber has joined #riscv
<jrtc27> don't know the content of it but would be good to make sure it's not linux-specific so we can have a standard means to do this across all OSes
<meowray> another instance of gcc atomic mix-and-match libcalls vs open coding: https://github.com/tikv/jemallocator/pull/14#issuecomment-917179231
<jrtc27> technically yes in practice no
<jrtc27> I thought philipp had patches to rewrite gcc's riscv atomics though
<jrtc27> did those ever get finished off?
<sorear> mix and match won't cause problems for subword. it WILL cause problems for double-word if that's ever added to gcc but not libatomic
<jrtc27> it can cause problems because whether or not you use libatomic is ABI
<jrtc27> since a legal libatomic implementation is to take a spinlock not to do a hardware atomic
<jrtc27> which then doesn't synchronise with a real atomic (in this case, masked)
<jrtc27> in practice the libatomic implementation does use the equivalent instructions
<jrtc27> but it technically doesn't have to
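
Editor's sketch of the point above, not taken from libatomic or compiler-rt: a lock-based fallback is only atomic with respect to other callers that take the same lock. The address-hashed lock table, `NUM_LOCKS`, and the function names are all hypothetical, and a Python dict stands in for memory.

```python
import threading

NUM_LOCKS = 16  # hypothetical table size, chosen only for illustration
_LOCKS = [threading.Lock() for _ in range(NUM_LOCKS)]


def _bucket(addr):
    """Pick a lock from the table by hashing the address (assumed scheme)."""
    return _LOCKS[(addr >> 3) % NUM_LOCKS]


def locked_exchange(memory, addr, new_value):
    """Swap in new_value and return the old value while holding the bucket lock."""
    with _bucket(addr):
        old = memory[addr]
        memory[addr] = new_value
        return old


def locked_fetch_add(memory, addr, delta):
    """Read-modify-write under the same bucket lock; returns the old value."""
    with _bucket(addr):
        old = memory[addr]
        memory[addr] = old + delta
        return old
```

Any code that writes `memory[addr]` directly (standing in for an open-coded hardware atomic) bypasses the lock, so it is not ordered against these helpers. That is the sense in which using or not using the library call becomes part of the ABI.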
<meowray> aarch64 without lse can use lib call for __atomic_exchange_1 as well. is that benign?
<jrtc27> I've never seen that
<meowray> compile my example with aarch64-linux-gnu or clang --target=aarch64-linux-gnu. you'll get __aarch64_swp1_acq_rel (default is -moutline-atomics for libgcc>=9.3.1 and compiler-rt) unless -march=armv8-a+lse
<jrtc27> that's not __atomic_exchange_1, that's outlined atomics
<jrtc27> which is specifically "do a real hardware atomic"
<jrtc27> which will be either an LSE AMO or an older LDX/STX thing
<jrtc27> but importantly those still compose fine
<jrtc27> though I do find it hard to believe that the overhead of a function call for every atomic doesn't outweigh the cost of an LDX/STX loop...
<meowray> for riscv libgcc, how is __atomic_exchange_1 mixed with open coding benign? doesn't __atomic_exchange_1 use mutex?
<jrtc27> (yes there are also contention and progress issues, but... most atomics are not contended)
<jrtc27> I assume the implementation is just what the open-coded version would be
<xentrac> is it a tail call with no extra movs to set up the arguments? that's what the github thread seems to suggest
<xentrac> that sounds pretty cheap
<xentrac> (if there's no issues with contention and progress anyway)
<jrtc27> you still have to juggle arguments to be in argument registers
<jrtc27> and in a leaf function you now are no longer a leaf
<xentrac> tail calls don't preserve leafness?
<jrtc27> it's only a tail call if it's in a tail position
<xentrac> true (and you would normally expect it to not be, though the github example seems to have been?)
<jrtc27> well yes github examples are deliberately minimal
<xentrac> so perhaps it's not good for me to imagine that the real-world case would have a similar performance cost?
<sorear> meowray: it doesn't use the mutex array, it's just a cas loop
<xentrac> so it does guarantee progress
<meowray> sorear: libatomic/exch_n.c uses libatomic/cas_n.c. ok, i find no mutex array. compiler-rt's impl uses mutex array...
<jrtc27> riscv/atomic.c in libgcc
<jrtc27> oh but that's the __sync_foo
<sorear> i would strongly argue compiler-rt is in the wrong here, since it can't be mixed with libgcc and the intent is obvious in the manuals
<jrtc27> you can't mix and match compiler-rt and libgcc
<jrtc27> they conflict wrt symbols
<jrtc27> that's like linking two libc's
<meowray> compiler-rt is definitely bad regarding atomics
<meowray> it's simple, though. reading libatomic/libgcc impl on atomics seems challenging
zjason` has joined #riscv
<jrtc27> wow it looks like libatomic really uses pthread_mutex_lock for the fallback case when there is no atomic of that size to use
<jrtc27> that seems a bit heavyweight
zjason has quit [Ping timeout: 260 seconds]
<xentrac> is libatomic pervasively coupled to pthreads or is that limited to a few places?
<jrtc27> there's an abstraction layer
<sorear> you can't just use a spinlock because spinlocks on a uniprocessor are bad times
<jrtc27> that's just the posix implementation of the locking
<xentrac> that's not so bad then, I guess
<sorear> even pthread mutexes are Not Great because of priority inversion...
<xentrac> though, as you say, heavyweight
<jrtc27> the FreeBSD-specialised implementation of locks in compiler-rt uses its umtx
<sorear> pthread_mutex is just a couple of CASes in the uncontended case, although it goes through a fair amount of code to do so
<xentrac> sorear: presumably the lock in question is only held within the atomic swap, avoiding the priority inversion problem, no?
<jrtc27> no futex equivalent for Linux though
<jrtc27> yeah exactly
<jrtc27> it still has a fast path, but it's not as fast as a spinlock in the uncontended case
<sorear> xentrac: low priority thread starts to do an atomic swap, gets preempted in the middle by a high priority thread that wants to do an atomic on the same lock bucket
<sorear> really wish pthreads didn't try to multiplex so many lock types on a single set of functions, much of the overhead is dispatching between implementations
<xentrac> sorear: hmm, so then the high priority thread blocks, and if there's a medium-priority thread that has a lot of work to do, the low-priority thread may not get scheduled for a long time. my mistake!
<xentrac> somehow I had thought that you needed two locks to get priority inversion. thanks!
<xentrac> (a long time or never, if you're using strict priorities, as you would in the cases where this matters most)
jjido has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
<xentrac> in more spartan news, stikonas has reduced the hex0 RV64 seed down to 392 bytes: https://github.com/oriansj/stage0-posix/pull/43/files
<jrtc27> running on what though?
<xentrac> Linux
<xentrac> stage0-posix uses exit, execve, fork, waitpid, brk, open, close, read, write, lseek, chmod for up to M2-Planet+mescc-tools (excluding Kaem)
<xentrac> although in that case you can see the seed is invoking openat instead of open
<xentrac> it's using the Linux system call interface
<xentrac> no pthreads though :)
<jrtc27> how do you trust that linux kernel?
<jrtc27> especially if it's a riscv one?
<xentrac> it would be better to have a very simple bootstrapping kernel, but nobody has written one yet
<jrtc27> beyond a certain point I just don't see what the use of all this is, unless you go right back to hardware you built running some bare-metal code you hand assembled and input into the machine
<xentrac> yes, but getting there will take time
<jrtc27> sure, but you wouldn't use the hex0 for that, you'd skip a load of those stages
<jrtc27> because whatever runs those needs a far more complex environment than those do
<xentrac> once you have a hand-assembled kernel sufficient to run hex0, you can use it to compile the later stages
<jrtc27> but the point is currently you need one per architecture
<jrtc27> which isn't practical
<jrtc27> if you just have a single more featureful one you can skip all the hex0 stuff
<xentrac> no, you only need one architecture
<xentrac> because once you have one architecture bootstrapped in a trustworthy way, you can cross-compile the others and verify that they're bit-identical
<jrtc27> then why are there x86, arm, aarch64 and riscv64 hex0's?
<jrtc27> all but one of those are a waste of time
<xentrac> yes, but we don't know which one yet
<jrtc27> and that's my point, it's investing effort now that you know is going to be thrown away
<jrtc27> if it were me I'd work up not down
<jrtc27> to avoid that
<jrtc27> with a tiny amount of down just to get beyond the point of some kind of trusting trust attack being practically feasible
<xentrac> my guess is that someone will build an inspectable RV64 CPU out of individual transistors long before anyone builds an inspectable aarch64 or amd64 CPU :)
<xentrac> but oriansj was not very sanguine about risc-v at all once he saw the headaches involved in the instruction encoding
<jrtc27> I mean it's a mess for humans but it should mean fewer transistors...
<xentrac> right!
<xentrac> he's more enthusiastic about the Knight, a TTL design from the 01970s, which I think is about 20 times more complicated to implement than RV64
<jrtc27> but likely harder to reason that it's correct
<xentrac> oh? how would you design an ISA to be as easy as possible to reason that it's correct, without giving up the possibility of running things like a C compiler on a POSIX implementation?
<jrtc27> as in just that the jumbled bits are harder to look at and say "yes that's correct"
<jrtc27> laying down a big block of bit slicing is easier
<jrtc27> even if it's larger
<xentrac> ah, so, like, 64-bit instructions, or 128-bit like Kay and Nguyen's Chifir?
<xentrac> (also if you have *two* architectures that are fully bootstrapped, then a Karger–Thompson attack would have to have compromised *both* of them, so it's not *completely* a waste of time)
<jrtc27> 32 bits is enough for a simple CPU
<jrtc27> only gets too small when you want to cram all kinds of features in
<xentrac> Chifir avoids needing multiple types of linker relocations ;)
<jrtc27> my point is just an un-jumbled riscv, whilst less efficient in terms of hardware resources, is likely easier to understand the implementation of
<xentrac> aaaah, I see! that's a great idea that I hadn't thought of at all
<xentrac> thank you!
<jrtc27> having said that, if you're not doing C (why would you for this) there's not all that much jumbling
<xentrac> stage0 gets to C fairly quickly
<jrtc27> just JAL and branches
<jrtc27> I meant C the RISC-V extension, not the language
<xentrac> oh sorry
<xentrac> naturally
<xentrac> you can see hex0 uses addi a lot
<xentrac> well, hopefully you can. M1 assembly is not very readable
<xentrac> but yeah, M1 uses $ for J jumbling, @ for B jumbling, and ! for I jumbling
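
Editor's sketch of what the J and B "jumbling" refers to, following the immediate layouts in the RISC-V base ISA instruction formats. The helper names are made up for illustration; this is not how M1 or hex0 implement it.

```python
def encode_j_imm(offset):
    """Scatter a signed, even byte offset into the J-type immediate bits."""
    assert offset % 2 == 0 and -(1 << 20) <= offset < (1 << 20)
    imm = offset & 0x1FFFFF                   # 21-bit two's complement
    return ((((imm >> 20) & 0x1) << 31)       # imm[20]    -> inst[31]
            | (((imm >> 1) & 0x3FF) << 21)    # imm[10:1]  -> inst[30:21]
            | (((imm >> 11) & 0x1) << 20)     # imm[11]    -> inst[20]
            | (((imm >> 12) & 0xFF) << 12))   # imm[19:12] -> inst[19:12]


def encode_b_imm(offset):
    """Scatter a signed, even byte offset into the B-type immediate bits."""
    assert offset % 2 == 0 and -(1 << 12) <= offset < (1 << 12)
    imm = offset & 0x1FFF                     # 13-bit two's complement
    return ((((imm >> 12) & 0x1) << 31)       # imm[12]   -> inst[31]
            | (((imm >> 5) & 0x3F) << 25)     # imm[10:5] -> inst[30:25]
            | (((imm >> 1) & 0xF) << 8)       # imm[4:1]  -> inst[11:8]
            | (((imm >> 11) & 0x1) << 7))     # imm[11]   -> inst[7]


def jal(rd, offset):
    """Assemble `jal rd, offset` (opcode 0x6F)."""
    return encode_j_imm(offset) | (rd << 7) | 0x6F


def beq(rs1, rs2, offset):
    """Assemble `beq rs1, rs2, offset` (opcode 0x63, funct3 0)."""
    return encode_b_imm(offset) | (rs2 << 20) | (rs1 << 15) | 0x63
```

For instance, `hex(jal(1, 8))` gives `0x8000ef`, the encoding of `jal ra, 8`.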
<xentrac> thoughts about the minimal practical ALU repertoire for this kind of bootstrapping?
<jrtc27> hmm
<jrtc27> well you could ditch the immediate instructions other than addi and/or ori (since you need one of those to load constants in the first place)
<jrtc27> don't need jal if you have jalr
<jrtc27> well, I guess you pick jal or auipc
jjido has joined #riscv
<jrtc27> probably keep just blt and beq or similar
elastic_dog has quit [Ping timeout: 245 seconds]
<jrtc27> could ditch one of those too but starts to get annoying to code for
<jrtc27> slt[u] not needed if you have branches
<jrtc27> don't need sub if you have add and xor
<jrtc27> you probably want all three types of shifts
<xentrac> yeah, you could ditch beq but it would be a pain
<jrtc27> (not having the sign-extending ones is painful, and zero-extending ones are easy if you have sign-extending)
<xentrac> you could probably drop non-immediate add
<jrtc27> obviously fences and exception-y things you can do without unless relevant
<jrtc27> you'd do it that way round?
<jrtc27> I'd drop the immediate add instead
<jrtc27> more general
<xentrac> it's easier to make add out of sub than vice versa
<jrtc27> oh if you keep sub, sure
<jrtc27> I'd ditch sub and addi and just use add for everything
<jrtc27> or no
<jrtc27> I'd keep addi solely for lui+addi being li
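
Editor's sketch of the lui+addi pairing behind `li`, assuming a 32-bit register view (on RV64 the result would additionally be sign-extended). Because addi sign-extends its 12-bit immediate, the upper part has to be rounded up whenever the low 12 bits are 0x800 or more. The function names are hypothetical.

```python
def split_li(value):
    """Split a 32-bit constant into (lui 20-bit immediate, addi 12-bit signed immediate)."""
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo >= 0x800:
        lo -= 0x1000                 # addi will sign-extend, so treat it as negative
    hi = ((value - lo) >> 12) & 0xFFFFF
    return hi, lo


def run_li(hi, lo):
    """Model `lui rd, hi` followed by `addi rd, rd, lo` on a 32-bit register."""
    return ((hi << 12) + lo) & 0xFFFFFFFF
```

For example, `split_li(0x12345FFF)` yields `(0x12346, -1)`: lui loads 0x12346000 and addi then subtracts 1.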
<xentrac> yeah. and addi is subi after all
<xentrac> to use add for non-immediate sub you need some way to negate
elastic_dog has joined #riscv
<jrtc27> so then it's just add or sub
<jrtc27> and I don't see why you'd pick sub over add
<xentrac> well, a += b; is b = 0 - b; a -= b;
<xentrac> but a -= b; in terms of addition requires some way to negate b. I guess you could xor with an immediate -1?
elastic_dog has quit [Client Quit]
<xentrac> which you loaded with li
elastic_dog has joined #riscv
<jrtc27> yes + and - are symmetric, you just do the opposite with -
<xentrac> which would be less of a pain in rv32
<jrtc27> it's entirely equivalent
<jrtc27> and yeah
<jrtc27> xor with -1 and add 1
<xentrac> it's not symmetric in the sense that b = 0 + b; a += b; does not perform a -= b;
<jrtc27> it is if you view your `b = 0 - b` as a unary negation of b
<jrtc27> which is what you actually do
<xentrac> yeah, but you can do b = 0 - b; with sub and r0; you don't need unary negation
<xentrac> you don't need r0 either since you can do z = z - z;
<jrtc27> oh I see, yes, that's true, sub gets you negation for free
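
Editor's sketch of the two emulations discussed above, modelling 64-bit registers with a mask. `add`, `sub` and `xor` stand in for the hardware instructions; the helper names are made up for illustration.

```python
MASK = (1 << 64) - 1  # model 64-bit wraparound registers


def add(a, b):
    return (a + b) & MASK


def sub(a, b):
    return (a - b) & MASK


def xor(a, b):
    return a ^ b


def sub_via_add(a, b):
    """a - b using only add and xor: negate b as (b xor -1) + 1."""
    minus_one = MASK                      # what `li t0, -1` would load
    neg_b = add(xor(b, minus_one), 1)
    return add(a, neg_b)


def add_via_sub(a, b):
    """a + b using only sub: build zero as z - z, then negate b as 0 - b."""
    zero = sub(b, b)                      # no hardwired zero register needed
    neg_b = sub(zero, b)
    return sub(a, neg_b)
```

The sub-only version needs no constant at all, which is the asymmetry xentrac points out; the add-only version needs xor and a way to load -1.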
<xentrac> but sooner or later you're going to want xor and some nonlinear bitwise operation like and or or
<xentrac> so maybe add vs. sub is just a question of taste, I have an ugh field around 'xor with -1 and add 1' that may not be actually justifiable
<jrtc27> I think it just depends on the relative frequency of operations
<jrtc27> I would naively assume add is more frequent and thus makes crappily emulating sub worth it
<xentrac> IIRC Chifir only supplies NAND for bitwise operations :)
<xentrac> yeah, you're probably right about that
<jrtc27> then again, could just support both in hardware, it's pretty trivial to make an adder into a subtracter...
<xentrac> I think Wirth-the-RISC omits bitwise operations entirely, just like Pascal
<xentrac> hmm, that sounds like an exercise I should try
<jrtc27> I can imagine that not going so well for things like writing assemblers...
<xentrac> well, you *can* do bit shifts with division (which both Chifir and Wirth-the-RISC supply, which seems like a terrible idea to me)
<xentrac> and multiplication, of course, which you can do with addition
<xentrac> and you *can* do LSB tests with modulo, which likewise
<jrtc27> "we don't give you bitwise operators but we give you multiplication, division and modulo" is an interesting take for a minimal design
<xentrac> IKR?
<xentrac> original MIX also did that, for an arguably slightly less ridiculous reason in the context of 01965
jjido has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
<xentrac> Wirth-the-RISC also has floating-point, as did, for example, Zuse's Z3
<jrtc27> o.O
<xentrac> I remembered wrong about it, though; it has bitwise and, or, xor, and abjunction
<xentrac> speaking of minimal sets of operations, intuitively it seems like abjunction ought to be more expressive than NAND or NOR (if you have constants!) since it's non-commutative, but so far I haven't found that circuits are systematically simpler in abjunction gates than in NAND or NOR gates, although of course there are individual circuits that are (abjunction is one gate with abjunction gates, 3 gates with NAND)
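
Editor's sketch of the gate counts mentioned above, with abjunction taken as "a AND NOT b" and a constant 1 assumed to be available. These are straightforward constructions and are not claimed to be minimal.

```python
def abj(a, b):
    """Abjunction: a AND NOT b, on single bits."""
    return a & (1 - b)


def nand(a, b):
    """NAND on single bits."""
    return 1 - (a & b)


def nand_from_abj(a, b):
    """NAND from three abjunction gates plus the constant 1."""
    not_b = abj(1, b)
    a_and_b = abj(a, not_b)
    return abj(1, a_and_b)


def abj_from_nand(a, b):
    """Abjunction from three NAND gates."""
    not_b = nand(b, b)
    t = nand(a, not_b)          # NOT (a AND NOT b)
    return nand(t, t)           # invert again
```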
<xentrac> stikonas points out that it's probably simpler to run cut-down versions of hex0, hex1, and hex2 as subroutines that read and write RAM than it is to implement a kernel in hex that is sufficient to run them
<xentrac> and if you do that you can write the kernel in assembly instead of hex, which would be a big improvement
stikonas has joined #riscv
<xentrac> I think doing floating-point, multiplication, and division in hardware would probably be fatal to hand-auditability
<stikonas> well, floating-point multiplication and division are not required for early bootstrapping software (assembler can be written with just integer operations)
<xentrac> agreed. that was in the context of 21:44 < xentrac> Wirth-the-RISC also has floating-point, as did, for example, Zuse's Z3
<xentrac> but I meant integer multiplication and division, which are also not needed for compilers and assemblers until they're fairly sophisticated, at which point it's easy enough to supply them as subroutines as we did on, for example, the 8080 or 6502
<xentrac> or RV32I :)
TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM_ has joined #riscv
<dh`> surely you mean "08080" and "06502"
* dh` hides
<jrtc27> back when I was an undergrad, and in my first year, our intro to assembly lab was to implement division in RV32I assembly (that we'd later use to write pong)
<jrtc27> though that was before the big re-encoding of RISC-V
<stikonas> most of the work for software division is I guess writing some high level prototype (e.g. in C)
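
Editor's sketch of the kind of high-level prototype stikonas mentions: unsigned 32-bit restoring (shift-and-subtract) division, which maps directly onto shifts, compares and subtracts. The function name is hypothetical, and the divide-by-zero result is borrowed from the RISC-V M extension's `divu`/`remu` convention as an assumption, not something stated in the channel.

```python
def udiv32(n, d):
    """Return (quotient, remainder) of unsigned 32-bit n / d by shift-and-subtract."""
    n &= 0xFFFFFFFF
    d &= 0xFFFFFFFF
    if d == 0:
        # Assumed convention: all-ones quotient, remainder equal to the dividend.
        return 0xFFFFFFFF, n
    q = 0
    r = 0
    for i in range(31, -1, -1):
        r = (r << 1) | ((n >> i) & 1)   # bring down the next dividend bit
        if r >= d:                      # does the divisor fit?
            r -= d
            q |= 1 << i
    return q, r
```

The loop always runs 32 iterations, which keeps an assembly translation simple at the cost of speed for small dividends.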
gioyik has joined #riscv
<xentrac> dh`: haha
<xentrac> jrtc27: oh heh, I didn't realize you were younger than I am
<jrtc27> I know, with how cynical I can be you're forgiven for thinking I'm old :P
<xentrac> I'm reminded of Shannon's story about Tukey
<xentrac> uh, Hamming's
* stikonas never had any assembly classes...
<xentrac> (not that I'm comparable to Hamming or Shannon)
<xentrac> > One day about three or four years after I joined, I discovered that John Tukey was slightly younger than I was. John was a genius and I clearly was not. Well I went storming into Bode's office and said, ``How can anybody my age know as much as John Tukey does?'' He leaned back in his chair, put his hands behind his head, grinned slightly, and said, ``You would be surprised Hamming, how much you would know if you worked as hard as he did that many years.''
<xentrac> was it a good intro to assembly? I'm surprised they didn't give you an easier problem for the first lab...
<jrtc27> dunno, I already knew assembly
<jrtc27> I think people mostly managed to get it done though
<jrtc27> wrote pong for a soft-core attached to an LCD screen and various controller inputs later in the course (in C though), that was a bit of fun
<xentrac> nice!
<jrtc27> main challenge was the pitiful amount of memory the core exposed...
<jrtc27> loads of BRAMs on that chip but only like 4K or 8K instruction memory available
<xentrac> a lot more than the original Pong machine had! and the Atari 2600 potentially had 64KiB of instruction memory but only 1024 bits of RAM
<xentrac> I guess RV32I is less dense than 6502
<jrtc27> yeah
<jrtc27> also I'm sure writing in assembly would make it denser than the emitted C
<jrtc27> given it was probably a rather less mature GCC than exists today and dutifully following the ABI
<xentrac> and especially a rather less mature RISC-V backend
<jrtc27> yeah that's what I meant
<xentrac> ah
<jrtc27> it wasn't *that* long ago, x86 and arm were mature
<jrtc27> just not this weirdo out-of-tree port
<xentrac> GCC has gotten substantially better in the last ten years
<jrtc27> this was late 2015 (and actually my second year, not first year, bleh, memory)
<jrtc27> that or early 2016, one of those terms
<xentrac> sounds more reasonable for a second-year class, by then all the CS majors would have learned to program a little even if they didn't program before going to college
<xentrac> one of my favorite microbenchmarks is stupid fibonacci (fib(n) { return n < 1 ? 1 : fib(n-1) + fib(n-2); }) because you can write it in one line of code; it exercises integer arithmetic, recursion, and control flow; and it's easily adaptable to a wide range of implementation speeds by giving it different arguments
<xentrac> but modern GCC has totally obsoleted this
<xentrac> used to be, a simple Forth would beat GCC at this on i386 or amd64, because it's *so* recursion-heavy
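
Editor's sketch: a Python transliteration of the one-liner exactly as written above (base case `n < 1`), with a call counter added to show how recursion-heavy it is. The counter argument is an addition for illustration and says nothing about what GCC or a Forth generates.

```python
def fib(n, counter):
    """Stupid fibonacci as quoted, counting every call in counter[0]."""
    counter[0] += 1
    return 1 if n < 1 else fib(n - 1, counter) + fib(n - 2, counter)


def fib_with_calls(n):
    """Return (result, number of calls made)."""
    counter = [0]
    result = fib(n, counter)
    return result, counter[0]
```

With this base case the number of calls is always `2 * result - 1`, so the work grows as fast as the result itself.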
jwillikers has quit [Remote host closed the connection]
<xentrac> in March I tried it on a reasonably modern GCC (though maybe not post-02015), and that (well, with n < 2 and ANSI declarations!) compiled to 169 instructions
<xentrac> I think GCC inlined it into itself 13 times
<jrtc27> 185 lines on godbolt
<jrtc27> what in god's name is that
<xentrac> that was what I said!
<jrtc27> presumably there's a fib benchmark out there that calls it for n < 13 or whatever...
<xentrac> maybe, but maybe it's just standard optimizations
<xentrac> it makes it a lot faster for large n
<xentrac> like about 40 times faster
<xentrac> and by "large" of course I mean "in the 30s or 40s"
<jrtc27> I mean, if it wants to make it faster, why doesn't it just notice that fib(n) is pure and thus do CSE on the inlined recursive version and some form of induction...
<xentrac> I suspect it might be doing that, yeah
<xentrac> because it's not plausible that it could be doing all the subtractions and additions and get 40 times faster
<xentrac> though I admit I haven't grokked the astounding assembly program GCC produces these days
<xentrac> I mean the naive assembly compilation doesn't spend 96% of its time in argument passing and prologues and epilogues
<jrtc27> clang doesn't do a whole lot, but does notice it can turn it into fib(n-1) + fib(n-3) + ... + fib(3/2) + fib(1/0) and that fib(0) = fib(1) = 1 so it turns that into a + 1 instead
<jrtc27> still explosive but a bit less so
<jrtc27> as has GCC I think in the mess of it all?
<jrtc27> there's ultimately a tight inner loop that has a sub 2 in there
<jrtc27> though even that bit's an utter mess
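The transformation described for Clang (keeping the fib(n-1) call, turning the fib(n-2) call into a loop that subtracts 2) can be sketched in Python as follows. This is only an illustration of the idea under the `n < 2` base case, not a reconstruction of either compiler's actual output; `fib_loop` is a name made up here.

```python
def fib(n):
    # Reference version: naive double recursion, fib(0) = fib(1) = 1.
    return 1 if n < 2 else fib(n - 1) + fib(n - 2)

def fib_loop(n):
    # The fib(n-2) call is replaced by a loop that steps n down by 2,
    # leaving only the fib(n-1) call as a real recursive call.
    acc = 0
    while n >= 2:
        acc += fib_loop(n - 1)
        n -= 2
    # n is now 0 or 1, and fib(0) = fib(1) = 1, so add 1 for the last term.
    return acc + 1

if __name__ == "__main__":
    print([fib_loop(n) for n in range(10)])
```

The result is fib(n-1) + fib(n-3) + ... + 1, which is still exponential but avoids the second recursive call at every level.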
<xentrac> heh
<jrtc27> 7 stack spills and 7 stack reloads in the loop
<jrtc27> so yeah I imagine for small n GCC might fare better but as you ramp up it's going to tank compared to Clang
<xentrac> ramp up how far?
<jrtc27> that would require me to work out what the other 80% of the code does...
<xentrac> haha
<xentrac> or you could just run int
<xentrac> *it
<jrtc27> yeah I should...
<xentrac> if it's, say, inlining things four levels deep, then it has a tree of 31 calls which CSE can simplify down to maybe 13
<xentrac> which might be more important than the stack spills in the loop
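The figure of 31 can be checked with a short Python sketch that expands the call tree four levels and records which fib(n - k) each node computes. It assumes n is large enough that no base case is reached, and it assumes an idealized CSE that merges every node with an equal argument, which is a stronger assumption than anything established above about GCC; `expand` is a hypothetical helper.

```python
def expand(offset, depth):
    # Yield the offset k of every fib(n - k) node in the call tree,
    # expanded `depth` levels below the root fib(n - offset).
    yield offset
    if depth > 0:
        yield from expand(offset + 1, depth - 1)
        yield from expand(offset + 2, depth - 1)

nodes = list(expand(0, 4))
print(len(nodes))       # nodes in the fully expanded tree
print(len(set(nodes)))  # distinct arguments if all equal nodes were merged
```

Under those assumptions the tree has 31 nodes but only 9 distinct arguments (n down to n - 8), so the "maybe 13" estimate is, if anything, on the cautious side for a perfect CSE.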
rpb has joined #riscv
gioyik_ has joined #riscv
gioyik has quit [Ping timeout: 276 seconds]
elastic_dog has quit [Ping timeout: 245 seconds]
<dh`> if it's really doing enough CSE to bust it out of being exponential it should be easy to tell by running it
<xentrac> it isn't, but it's doing something that knocks a quite hefty constant factor off the exponential
winterflaw has quit [Ping timeout: 276 seconds]
<jrtc27> hm, interesting, clang is about 80% slower for larger numbers
<jrtc27> so maybe the sheer number of calls to the smaller numbers makes it better in the long run
<xentrac> hmm, I hadn't thought about that! you're right, even if it doesn't do CSE on the internal nodes of the call tree, trimming down the leaf nodes would help a lot
<xentrac> the call tree for F₃₈ has F₃₈ leaf nodes, after all, and cutting that down to F₃₄ would speed it up a lot
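The leaf-count claim can be checked with a small Python sketch, again assuming fib(0) = fib(1) = 1. `fib_counting` and `fib_iter` are illustrative names, and the counting run uses n = 20 only to keep it quick.

```python
def fib_counting(n, counts):
    # Naive fib that also tallies total calls and base-case (leaf) calls.
    counts["calls"] += 1
    if n < 2:
        counts["leaves"] += 1
        return 1
    return fib_counting(n - 1, counts) + fib_counting(n - 2, counts)

def fib_iter(n):
    # Iterative Fibonacci with the same convention, for large arguments.
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

counts = {"calls": 0, "leaves": 0}
value = fib_counting(20, counts)
print(value, counts["leaves"], counts["calls"])

# Ratio of leaf counts between the trees for fib(38) and fib(34).
leaf_ratio = fib_iter(38) / fib_iter(34)
print(leaf_ratio)
```

The leaf count equals the returned value, the total call count is twice that minus one, and going from 38 to 34 shrinks the leaf count by a factor of a little under 7.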
TMM_ is now known as TMM
TMM is now known as TMM_
elastic_dog has joined #riscv
<xentrac> anyway, that level of optimization makes it useless for my intended purpose, which is to quickly get a crude first-order feel for how fast or slow a language is. bash, for example, takes 2.6 seconds to run fib 15
<xentrac> as fib() { if [[ $1 -lt 2 ]]; then echo 1; else echo $(($(fib $(($1 - 1))) + $(fib $(($1 - 2))) )); fi; }
<xentrac> which makes it about F₃₃/F₁₅ times slower than CPython
<xentrac> about 6000
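The arithmetic behind "about 6000" can be reproduced in a few lines of Python, assuming running time is roughly proportional to the Fibonacci value and using the fib(0) = fib(1) = 1 convention. `fib_iter` is an illustrative helper.

```python
def fib_iter(n):
    # Iterative Fibonacci with fib_iter(0) = fib_iter(1) = 1.
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

ratio = fib_iter(33) / fib_iter(15)
print(fib_iter(33), fib_iter(15), round(ratio))  # 5702887 987 5778
```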
<xentrac> it's not a very precise benchmark but it's usually good to within an order of magnitude. but not with modern GCC
<dh`> it seems plausible for a compiler to unroll it, do CSE, and thereby make it nonexponential
<dh`> but I haven't looked at it that closely, maybe there's a good reason that doesn't work
<xentrac> well, you can't unroll it to infinite depth
<xentrac> you have to stop inlining at some point
elastic_dog has quit [Ping timeout: 260 seconds]
<xentrac> and at that point you have, say, calls to fib(n-8), fib(n-9), fib(n-10),...
<xentrac> as long as you have more than one such call it's still exponential, and inlining doesn't decrease the number of such calls; it increases them
elastic_dog has joined #riscv
pecastro has quit [Ping timeout: 265 seconds]
gioyik has joined #riscv
<jrtc27> same growth rate, just a constant factor between them
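A rough Python model of this point: count the real (non-inlined) calls needed for fib(n) when every function body has the recursion inlined a fixed number of levels, then look at how that count grows with n. The model is a simplification made up for illustration (no CSE, base cases resolved inline, names `residual`, `count_calls` and `growth_ratio` are hypothetical) and says nothing about what GCC actually emits.

```python
def residual(m, levels, calls):
    # Real calls caused by evaluating fib(m) with `levels` of inlining left.
    if levels == 0:
        return calls[m]          # an actual call to fib(m)
    if m < 2:
        return 0                 # base case handled inline, no call
    return (residual(m - 1, levels - 1, calls)
            + residual(m - 2, levels - 1, calls))

def count_calls(n, depth):
    # calls[m] = real calls made to compute fib(m), including the call itself.
    calls = [1, 1]
    for m in range(2, n + 1):
        calls.append(1 + residual(m - 1, depth, calls)
                       + residual(m - 2, depth, calls))
    return calls

def growth_ratio(n, depth):
    # How much the call count grows when n increases by one.
    calls = count_calls(n, depth)
    return calls[n] / calls[n - 1]

if __name__ == "__main__":
    for depth in (0, 1, 2, 4):
        print(depth, count_calls(30, depth)[30], growth_ratio(200, depth))
```

In this model the call counts for different inlining depths differ by a constant factor, while the per-step growth ratio settles near the golden ratio (about 1.618) in every case.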
gioyik_ has quit [Ping timeout: 276 seconds]