<jrtc27>
how much do the added c.nops all over the place cost you instead?
<jrtc27>
those will surely have a cycle of overhead each on such simple pipelined cores?
Orac is now known as Tenkawa
mlw has joined #riscv
nexR has quit [Ping timeout: 246 seconds]
saulosilva has joined #riscv
tlwoerner has quit [Quit: Leaving]
tlwoerner has joined #riscv
nexR has joined #riscv
SpaceCoaster has quit [Ping timeout: 252 seconds]
alexghiti has joined #riscv
saulosilva has quit [Quit: Client closed]
alifib has joined #riscv
BootLayer has joined #riscv
crabbedhaloablut has quit []
heat_ has joined #riscv
dakralex has joined #riscv
<cousteau>
jrtc27: the added nops might be replaced with expanded compressed instructions
<cousteau>
so instead of a compressed instruction followed by a compressed nop and the 4-byte target, you get the expanded version of that instruction, no nop, and the target
<jrtc27>
linker relaxation makes that very awkward to do
<cousteau>
I think I've seen gcc do that
<cousteau>
When I was messing with godbolt, I saw that -falign-labels=4 just added `R_RISCV_ALIGN *ABS*+0x2` lines to the generated assembly code. I suppose that's a cue for the assembler/linker(?) to expand compressed instructions
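A hedged aside on what `R_RISCV_ALIGN *ABS*+0x2` encodes. This sketch assumes the RISC-V psABI description of the relocation: the addend is the number of bytes of nop padding the assembler emitted, the alignment boundary is the addend rounded up to the next power of two, and the linker deletes whatever padding is not needed once final addresses are known. The function name `nop_bytes_kept` is hypothetical. The model covers only nop deletion; it does not show any expansion of compressed instructions.

```python
def nop_bytes_kept(loc, addend):
    """Bytes of nop padding that remain at address `loc` after alignment.

    Assumes `loc` is already 2-byte aligned and that the boundary is the
    addend rounded up to the next power of two (addend 2 -> 4-byte boundary).
    """
    align = 1 << addend.bit_length()      # next power of two above the addend
    aligned = (loc + align - 1) & -align  # round loc up to the boundary
    return aligned - loc                  # padding still needed; the rest is deleted


# A label at 0x102 needs the 2-byte nop to reach 0x104; at 0x104 it needs none.
print(nop_bytes_kept(0x102, 2), nop_bytes_kept(0x104, 2))
```

Under these assumptions, the `+0x2` seen with -falign-labels=4 corresponds to at most one c.nop per aligned label.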
<cousteau>
Practical example: my use case went from sometimes taking 502k clock cycles and sometimes taking 452k clock cycles to always taking 452k clock cycles.
<cousteau>
so yes, -falign-labels=4 did help
<cousteau>
(in THAT use case, at least)
<jrtc27>
you don't know if .option norelax was applied to the previous instruction though
<jrtc27>
or the subsequent
<jrtc27>
though that's way harder to expand
<jrtc27>
... nvm that doesn't help anyway
saulosilva has joined #riscv
<jrtc27>
you only get positive information from R_RISCV_RELAX being applied to instructions
<jrtc27>
but that's only for instructions that have known-relaxable relocations
<jrtc27>
or norvc even
<cousteau>
For context: when I mentioned this issue here a few months ago, someone suggested that I open an issue in the GCC Bugzilla to discuss it, but since my logs are unreliable and spread across a multitude of machines I can't remember who it was (maybe you, not sure). But I personally know very little about the implications of using -falign-labels=4, linker voodoo, relocation, and all that stuff
<cousteau>
...oh, it was palmer who suggested it :)
<cousteau>
palmer: I filed the bug as you suggested four months ago ^_^"
saulosilva has quit [Quit: Client closed]
heat_ has quit [Remote host closed the connection]
heat has joined #riscv
SpaceCoaster has joined #riscv
saulosilva has joined #riscv
<cousteau>
jrtc27: ...and even if the added nops did have a cost, if the loop iterates more than once, it will be better to suffer the nop delay once than to suffer the misalignment delay N times.
<cousteau>
But that only works under the assumption that the branch is going to be taken at least once. It might not.
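The trade-off in the two messages above can be put into a toy cost model. This is a minimal sketch, assuming one cycle per executed c.nop and one cycle of penalty each time a misaligned target is jumped to; both figures are illustrative assumptions rather than measurements of any real core, and the function names are hypothetical.

```python
def extra_cycles_unpadded(jumps_taken, misalign_penalty=1):
    """Misaligned target: the penalty is paid on every taken jump to it."""
    return jumps_taken * misalign_penalty


def extra_cycles_padded(fallthrough_entries, nop_cost=1):
    """Padded target: the nop is executed only when the label is entered by fallthrough."""
    return fallthrough_entries * nop_cost


def padding_pays_off(jumps_taken, fallthrough_entries,
                     misalign_penalty=1, nop_cost=1):
    """True when padding costs strictly fewer extra cycles than leaving the target misaligned."""
    return (extra_cycles_padded(fallthrough_entries, nop_cost)
            < extra_cycles_unpadded(jumps_taken, misalign_penalty))


# A loop entered once by fallthrough and branched back to 10 times: padding wins.
print(padding_pays_off(jumps_taken=10, fallthrough_entries=1))
# A branch that is never taken: the nop is pure overhead.
print(padding_pays_off(jumps_taken=0, fallthrough_entries=1))
```

With equal per-event costs, padding only helps when the label is jumped to more often than it is fallen into, which is the caveat about the branch possibly never being taken.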
<jrtc27>
for loop labels, sure
<jrtc27>
not for labels where a high % of them are reached via fallthrough
<cousteau>
are you suggesting making a distinction between -falign-labels, -falign-jumps, and -falign-loops?
<cousteau>
(because I'm still not entirely sure how those work)
<jrtc27>
yes, -falign-loops is the important one
<jrtc27>
and you might as well do -falign-jumps
<jrtc27>
but -falign-labels will include fallthroughs
<cousteau>
the documentation wasn't entirely clear to me in this regard, but I understood that -loops was for "loops", which I guess means "conditional branches", whereas -jumps was for "stuff that is not the target of conditional branches", and -labels was for both kinds of targets
<jrtc27>
jumps doesn't include labels that have fallthrough
<cousteau>
-falign-jumps Align branch targets to a power-of-two boundary, for branch targets where the targets can only be reached by jumping.
<jrtc27>
it's ones that are *only* reachable by jumps
<cousteau>
-falign-loops Align loops to a power-of-two boundary. (for some obscure definition of "loop")
<cousteau>
I think my assumption here back in the day was that loops were "targets that can not only be reached by jumping"
<cousteau>
so that GCC classified targets as "can only be reached by jumping" and "can not only be reached by jumping", and that -jumps and -loops referred to each of them, whereas -labels was the union of both sets.
<cousteau>
This was my assumption and not something I found the documentation to state explicitly, though.
<jrtc27>
loops are whatever the compiler's loop detection heuristics decide is a loop
<cousteau>
ok
<jrtc27>
I would assume
<cousteau>
oh, ok
<cousteau>
I would assume that your assumption is better than my assumption :)
<cousteau>
-falign-labels [...] "If -falign-loops or -falign-jumps are applicable and are greater than this value, then their values are used instead." -> this statement led me to think that exactly one of the two was applicable to any given label.
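One reading of the quoted sentence, as a minimal sketch: each option contributes only where it applies, and the largest applicable value wins. The function and its boolean parameters (`is_loop_header`, `jump_only`) are hypothetical stand-ins for whatever the compiler decides internally; this is not GCC's implementation.

```python
def effective_alignment(align_labels, align_loops, align_jumps,
                        is_loop_header, jump_only):
    """Alignment applied to one label under the 'largest applicable value wins' reading."""
    candidates = [align_labels]          # -falign-labels applies to every label
    if is_loop_header:
        candidates.append(align_loops)   # only where the compiler detected a loop
    if jump_only:
        candidates.append(align_jumps)   # only for targets never reached by fallthrough
    return max(candidates)


# With -falign-labels=1 -falign-loops=4 -falign-jumps=4:
print(effective_alignment(1, 4, 4, is_loop_header=True, jump_only=False))   # loop header
print(effective_alignment(1, 4, 4, is_loop_header=False, jump_only=False))  # plain fallthrough label
```

In this reading a label may match neither option, one of them, or both, so the two sets do not have to partition all labels.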
<palmer>
it's really a cost modeling thing, and IIRC that's kind of clunky here
mlw has quit [Ping timeout: 248 seconds]
damian101_ has joined #riscv
mlw has joined #riscv
<cousteau>
ok, so let's assume there are three types of label: those that are only reachable by jumping (conditionally or unconditionally), those that can be reached by jumping or by fallthrough, but will frequently be reached by jumping, and those that can be reached by jumping or by fallthrough, but will rarely or never be reached by jumping
damian101 has quit [Ping timeout: 248 seconds]
pbsds30 has joined #riscv
<cousteau>
in the first case, it doesn't matter if we pad them with nops before the target, because those nops will never be executed
<cousteau>
in the second case, if we pad them with a nop before the target, then sure, we will have to run the nop once, but the benefit will offset that cost
<cousteau>
and in the third case, an extra nop before the target would always have a penalty and rarely a benefit
pbsds3 has quit [Ping timeout: 260 seconds]
pbsds30 is now known as pbsds3
<jrtc27>
which is why -falign-labels itself defaults to 1
<cousteau>
BUT, if the compiler is smart enough to replace that "extra nop" with a "decompressed instruction", then there won't be any such penalty
<jrtc27>
and that's the wrinkle
<jrtc27>
the compiler can't know that when linker relaxation is in use
<palmer>
oh, I was posting on the bug: we actually had some code to do that, but we didn't merge it. It was for exactly this case; it'd save a cycle in Dhrystone on Rocket ;)
SpaceCoaster has quit [Read error: Connection reset by peer]
SpaceCoaster has joined #riscv
bjdooks has quit [Read error: Connection reset by peer]
bjdooks has joined #riscv
prabhakalad has quit [Ping timeout: 248 seconds]
pbsds33 has joined #riscv
pbsds3 has quit [Ping timeout: 276 seconds]
pbsds33 is now known as pbsds3
mightysands has joined #riscv
<mightysands>
Anyone know if booting your Linux distro of choice (gentoo/slackware) on a Milk-V Jupiter is as simple as dd-ing the ISO onto a USB stick?
<mightysands>
I'm wondering how unique each risc-v machine's methods of booting a new OS are
coldfeet has joined #riscv
cousteau has quit [Quit: Client closed]
<palmer>
mightysands: it's generally a bit clunkier than that in practice, we're not quite at the point where stuff is that portable yet
naoki has joined #riscv
<mightysands>
palmer: Do you know how one might go about installing a new linux OS on the Jupiter ?
<mightysands>
I take it every machine is a little different then ?
<palmer>
ya, there's generally just things like vendor kernel trees and bootloader issues. Nothing super fundamental, they're just poorly supported systems so things tend to be fragile. Best bet is to just look at the vendor docs
psydroid has quit [Read error: Connection reset by peer]
psydroid has joined #riscv
danilogondolfo has quit [Quit: Leaving]
craigo has quit [Quit: Leaving]
fuwei has quit [Ping timeout: 276 seconds]
mightysands has quit [Remote host closed the connection]
coldfeet has quit [Remote host closed the connection]
damian101_ has joined #riscv
damian101 has quit [Ping timeout: 248 seconds]
alifib has quit [Ping timeout: 252 seconds]
coldfeet has joined #riscv
dakralex has quit [Quit: Leaving]
mlw has quit [Ping timeout: 276 seconds]
JanC has quit [Remote host closed the connection]
JanC has joined #riscv
ldevulder has quit [Quit: Leaving]
saulosilva has joined #riscv
<drmpeg>
HiFive P550 running my ATSC 3.0 transmitter. This is the first RISC-V board with enough horsepower to run this flow. https://www.w6rz.net/p550.mp4