sorear changed the topic of #riscv to: RISC-V instruction set architecture | https://riscv.org | Logs: https://libera.irclog.whitequark.org/riscv | Backup if libera.chat and freenode fall over: irc.oftc.net
_whitelogger has joined #riscv
iorem6 has joined #riscv
Gravis has quit [Ping timeout: 264 seconds]
Gravis has joined #riscv
mrkajetanp_ has joined #riscv
_whitelogger has joined #riscv
zvijezda has joined #riscv
JSharp has quit [Ping timeout: 272 seconds]
JSharp has joined #riscv
riff_IRC has quit [Quit: PROTO-IRC v0.73a (C) 1988 NetSoft - Built on 11-13-1988 on AT&T System V]
riff-IRC has joined #riscv
smaeul has joined #riscv
zvijezda has quit [Ping timeout: 264 seconds]
a2800276 has quit [Quit: Lost terminal]
iorem6 has quit [Ping timeout: 272 seconds]
vagrantc has quit [Quit: leaving]
SanchayanMaity has joined #riscv
Oli has quit [Quit: leaving]
riff-IRC has quit [Remote host closed the connection]
Forty-B8 has joined #riscv
Forty-B8 is now known as Forty-Bot
SanchayanMaity has quit [Quit: SanchayanMaity]
_whitelogger has joined #riscv
smaeul has quit [Quit: cya]
smaeul has joined #riscv
smartin has joined #riscv
geist has quit [Quit: leaving]
geist has joined #riscv
geertu_ is now known as geertu
geertu has quit [Quit: confusion]
geertu has joined #riscv
geist is now known as geist2
geist2 is now known as geist
smartin has quit [Ping timeout: 264 seconds]
_whitelogger has joined #riscv
smartin has joined #riscv
cmuellner has quit [*.net *.split]
hl has quit [*.net *.split]
jimwilson has quit [*.net *.split]
peepsalot has quit [*.net *.split]
edef has quit [*.net *.split]
kgz has quit [*.net *.split]
balrog has quit [*.net *.split]
Slide-O-Mix has quit [*.net *.split]
owl has quit [*.net *.split]
GreaseMonkey has quit [*.net *.split]
sorear has quit [*.net *.split]
stefanct has quit [*.net *.split]
tux3 has quit [*.net *.split]
aurel32 has quit [*.net *.split]
jn has quit [*.net *.split]
kgz has joined #riscv
balrog has joined #riscv
hl has joined #riscv
jn has joined #riscv
sorear has joined #riscv
stefanct has joined #riscv
aurel32 has joined #riscv
edef has joined #riscv
owl has joined #riscv
peepsalot has joined #riscv
jimwilson has joined #riscv
cmuellner has joined #riscv
GreaseMonkey has joined #riscv
Slide-O-Mix has joined #riscv
tux3 has joined #riscv
hl has quit [*.net *.split]
cmuellner has quit [*.net *.split]
peepsalot has quit [*.net *.split]
edef has quit [*.net *.split]
jimwilson has quit [*.net *.split]
kgz has quit [*.net *.split]
owl has quit [*.net *.split]
GreaseMonkey has quit [*.net *.split]
balrog has quit [*.net *.split]
tux3 has quit [*.net *.split]
stefanct has quit [*.net *.split]
sorear has quit [*.net *.split]
Slide-O-Mix has quit [*.net *.split]
aurel32 has quit [*.net *.split]
jn has quit [*.net *.split]
cmuellner has joined #riscv
stefanct has joined #riscv
hl has joined #riscv
edef has joined #riscv
owl has joined #riscv
peepsalot has joined #riscv
balrog has joined #riscv
jn has joined #riscv
jimwilson has joined #riscv
kgz has joined #riscv
Slide-O-Mix has joined #riscv
sorear has joined #riscv
aurel32 has joined #riscv
tux3 has joined #riscv
GreaseMonkey has joined #riscv
smartin has quit [Ping timeout: 264 seconds]
smartin has joined #riscv
_whitelogger has joined #riscv
aquijoule_ has quit [Remote host closed the connection]
_whitelogger has joined #riscv
TwoNotes has joined #riscv
smartin has quit [Ping timeout: 244 seconds]
ats has quit [Ping timeout: 272 seconds]
ats has joined #riscv
dlan has joined #riscv
dlan has quit [Client Quit]
dlan has joined #riscv
dlan has quit [Quit: leaving]
dlan has joined #riscv
mhorne has quit [Ping timeout: 272 seconds]
riff-IRC has joined #riscv
tnt has joined #riscv
<tnt> So huh ... I'm trying to do something like : lhu x8, %lo(some_symbol)(x30) and it builds with no error ... but instead of x30, zero gets used silently :/
<jrtc27> that assembly is fine
<jrtc27> what are you *actually* doing, and what are you seeing *exactly*?
mhorne has joined #riscv
<tnt> A bit more detail https://pastebin.com/Rn9kFNxF
<jrtc27> linker relaxation
<jrtc27> %lo is meant to be used with a register that was written to by a lui %hi
<jrtc27> so it sees the tiny symbol value and transforms the load to one relative to x0 as the value fits entirely in the immediate
<jrtc27> if there were a lui %hi it would have been deleted
<jrtc27> whatever you're doing here such that it cares about that is likely bad code
<jrtc27> but if you *really* need it, use .option norelax for that section, or build your entire code base with -mno-relax
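[Illustration, not part of the channel log] A minimal sketch of the %hi/%lo split jrtc27 is describing, assuming a 32-bit address space. The helper names (`sign_extend_12`, `hi_lo`, `fits_imm12`) are made up for this note. It shows why a tiny symbol value has a zero %hi part: the whole address then fits in the 12-bit load immediate, which is the condition under which the load gets rewritten relative to x0.

```python
def sign_extend_12(value):
    """Interpret the low 12 bits of value as a signed two's-complement number."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def hi_lo(addr):
    """Split a 32-bit address into the (%hi, %lo) pair used by lui + load/store."""
    lo = sign_extend_12(addr)
    # %hi compensates for the sign extension of %lo, so (hi << 12) + lo == addr.
    hi = ((addr - lo) >> 12) & 0xFFFFF
    return hi, lo


def fits_imm12(addr):
    """True when addr is reachable from x0 with a 12-bit signed immediate."""
    return -2048 <= addr <= 2047


for addr in (0x00000120, 0x00000FFC, 0x12345678):
    hi, lo = hi_lo(addr)
    assert ((hi << 12) + lo) & 0xFFFFFFFF == addr
    print(f"{addr:#010x}: %hi={hi:#x} %lo={lo} fits_in_imm12={fits_imm12(addr)}")
```

With a base register that does not hold the %hi value (tnt's x30), the result of that rewrite differs from what the source line asked for, hence the need for `.option norelax` or `-mno-relax`.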
<tnt> Err, unfortunately I'd need it just for that line ...
<jrtc27> if you are absolutely sure you really need to write your code that way
<tnt> I know that symbol fits entirely in %lo because the system doesn't even have more than 4k of addressable data space.
<jrtc27> .option push .option norelax <code> .option pop
<jrtc27> but I would seriously encourage you to change your code to not be so dodgy
<tnt> The system doesn't even have _any_ RAM ... and only 1kbyte of instruction-ROM, so every opcode I can save, I need to save.
<jrtc27> so why do you need x30?
<tnt> A bit hard to explain, but that symbol can be used in 2 different "addressing contexts" depending on the type of hw access and they have different base addresses. 'x30' is set as a constant that always contains one of the base addresses for a type of access.
<jrtc27> then you're using symbol values in a non-standard way
<tnt> Yes
<jrtc27> don't do that
<tnt> But short of hard coding them I couldn't find any other way ...
<tnt> Even if I changed it so that the symbol contains the base address ... how would I tell the compiler that x30 _always_ contains the proper upper bits to not reload them each time ? (I can't afford extra instructions to set x30 to the same value over and over)
<jrtc27> you wouldn't need to
<jrtc27> so long as the register input to the load has the value that a lui %hi would have
<jrtc27> it works
<tnt> Actually another type of access that would have the same issue: Imagine I want to do array access. a[x30] and I know 'a' fits in imm12 ... that'd be the same code.
<tnt> How exactly would it "know" the value of that register ?!?
<jrtc27> compilers never know that so the relocations do not support that
<tnt> Anyway, the .option thing works, thanks. I'll use that since I don't see any better option that wouldn't cost me added instructions. I wish I could do like lhu x8, some_symbol(x30) (without %lo) and it just errors out during linking if it turns out 'some_symbol' doesn't fit ...
<jrtc27> would be a waste of a relocation id
<tnt> ¯\_(ツ)_/¯
<tnt> I know my use case is special but on a very tiny micro, I wouldn't rule out knowing _by_design_ that a symbol fits in the first 2k of memory and allowing single opcode array indexing seems useful to me.
<jrtc27> it does, you just have to turn off linker relaxation
<jrtc27> because linker relaxation is built on the premise of knowing higher-level information about what you're doing
<jrtc27> that matches what any compiler will do
<jrtc27> and is designed to take pessimistic general code emitted by a compiler (ie no knowledge of what value symbols have) and optimise it based on what's known at link time
<jrtc27> but if you're writing code with that knowledge already then it's not needed and wrong
<tnt> I'm just not seeing what the "correct" way would be that would yield the same result.
<jrtc27> I don't think there is
<jrtc27> this is a slightly interesting oversight
<jrtc27> https://godbolt.org/z/65se31jcx could be one instruction shorter even in the general case if the ABI had a way to express that
<tnt> Turns out I can use -mno-relax globally, I thought this would affect 'li' of constants (that auto use 1 or 2 opcodes depending on the constant), but I guess given the constants are known, that's a different process and I know all my symbols' values fit.
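[Illustration, not part of the channel log] A rough sketch of why `li` with a literal constant does not depend on relaxation: the instruction count follows from the constant alone. This assumes plain RV32I without compressed instructions; `li_instruction_count` is a hypothetical helper, not an assembler API.

```python
def li_instruction_count(value):
    """Number of RV32I instructions a plain `li rd, value` needs (no compressed forms)."""
    value &= 0xFFFFFFFF
    signed = value - (1 << 32) if value & 0x80000000 else value
    if -2048 <= signed <= 2047:
        return 1  # addi rd, x0, imm12
    if value & 0xFFF == 0:
        return 1  # lui rd, imm20
    return 2      # lui rd, imm20 followed by addi rd, rd, imm12


for constant in (5, -1, 0x1000, 0x800, 0x12345678):
    print(constant, li_instruction_count(constant))
```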
<dh`> if you have two base registers, you might be able to make the linker behave if you use gp for one and tp for the other
<dh`> but tbh if you have only 1K of code, why bother using a linker?
<tnt> I have 1 k of code but a few k of data ROM and I don't want to manually keep track of what ends up in ROM at which address and manually fix up each reference.
<dh`> seems like you could still keep it all in one source file without it becoming unmaintainable, though
<tnt> Huh ... what ? I mean it's all in one file, that doesn't mean you don't get the linker involved.
<dh`> if it's one file you can make the assembler resolve everything
<dh`> though gas wasn't ever really intended to be used that way so it may not actually work
<jrtc27> still needs -mno-relax otherwise gas will defer everything to link time in case you link in something else
<dh`> even if you make the base addresses absolute?
<dh`> seems like that ought to turn them into constants
<dh`> but see: gas wasn't ever really intended to be used that way
<jrtc27> you can't make labels absolute at assemble time
<jrtc27> you can have absolute symbols, but they're not labels
<jrtc27> you can't put anything there
<jrtc27> except maybe if you do nasty things with .org?
<jrtc27> (does that work/exist outside of x86? only ever seen that for bios boot block thingies)
<dh`> that's the point of .org
djdelorie has joined #riscv
<dh`> doesn't seem to work though
<dh`> riscv gas accepts it but the net result is weirdly wrong
<dh`> it seems to treat the .org value as an offset from the beginning of .text
<dh`> (regardless of whether that's the current section, too)
<dh`> ok, not actually that broken, it is actually an offset from the beginning of the current section
<dh`> but it doesn't generate absolute symbols and it seems like it ought to
<dh`> apparently this is the intended behavior, how stupid
<dh`> ah well
<dh`> it seems like someone might be interested in a riscv assembler meant for use with such projects
<dh`> also, it looks like if you write la t0, sym; lw t0, 0(t0) and sym is within range of gp, the linker produces two instructions, not one
<dh`> and writing "lw t0, sym" does not fix this, still produces different instructions (but different ones)
<dh`> erm
<dh`> still produces _two_ instructions
<dh`> (is there an explicit relocation widget for gp-relative %lo? none of %gplo, %gprel, or %gp works and the gas manual doesn't document the riscv relocation widgets)
<jrtc27> not currently
<jrtc27> the compact code ~~model~~ ABI introduces one to expose the existing binutils-internal relocation
<jrtc27> lw t0, sym should result in one instruction post-relaxation
<jrtc27> la t0, sym; lw t0, 0(t0) can't do anything with the load because it's just a load
<jrtc27> though there's nothing stopping you using explicit %pcrel_hi and %pcrel_lo relocs
<jrtc27> (and you can reuse the same %pcrel_hi with multiple %pcrel_lo's for the same symbol if you say want to load, add 1 and store)
jimbzy has joined #riscv
jimbzy has quit [Killed (NickServ (GHOST command used by jim_!~jim@67.6.38.172))]
ats has quit [Ping timeout: 265 seconds]
vagrantc has joined #riscv
ats has joined #riscv
s0ph0s has quit [Read error: Connection reset by peer]
s0ph0s has joined #riscv
<dh`> lw t0, sym generates auipc and lw
<dh`> depending on how the linker relaxations work I might or might not expect it to pick up la followed by lw
<dh`> does it scan for patterns or is it driven entirely by relocations that arise from expansions in gas?
<jrtc27> entirely relocations
<jrtc27> each relocation is relaxed independently
<geist> and then as a result of it it has to tweak all the jumps that are now the wrong thing, etc. lots of rel entries in an .o file
<jrtc27> and designed such that, provided you use them "normally" (ie not like the original question here), the composition is thus relaxed fine
<jrtc27> yes, the bloat in .o files isn't great
<jrtc27> but who cares about build intermediates
<dh`> there are packages where build intermediates are large enough to still cause problems occasionally
* dh` glares at rust
<dh`> but yeah, it doesn't really matter
<jrtc27> disks are cheap
<dh`> ISTM that scanning for relaxable patterns would be more effective
<dh`> but I suppose that makes it hard to disable them
<geist> yah totally. first time i bumped into an arch like this that relied on relaxations so much was microblaze. it has a quirk that *all* immediates in the arch can either be 16 or 32 bit and thus all can be relaxed if they're not resolved at link time
<jrtc27> I mean, the real answer is LTO that's tightly integrated with the linker layout algorithm so you just generate the right code in the first place
<geist> by inserting an opcode that loads the top 16 bits into a hidden register before the real instruction
<geist> so it's rel entries out the wazoo
<jrtc27> interesting
<dh`> (how hidden? what happens if you trap between the two instructions?)
* jrtc27 wonders how long arc/microblaze/nios will stick around given the long-term sensible thing is surely riscv for that use case
<dh`> (I suppose I have a microblaze manual here somewhere)
<dh`> I am starting to feel like unix linkers are holding the world back a fair amount
<dh`> but not sure what would constitute an improvement
<geist> microblaze has a 'load upper' i think instruction that loads the top 16 bits into a hidden register that's consumed by the next instruction that uses a 16bit immediate
<geist> and/or it patches exactly the next instruction
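[Illustration, not part of the channel log] A small sketch of the prefix mechanism geist describes: an `imm` instruction supplies the upper 16 bits for the next instruction's 16-bit immediate, and without it the 16-bit field is sign-extended. The function and parameter names are hypothetical and this models only the operand value, not any pipeline behaviour.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_immediate(imm16, imm_prefix=None):
    """32-bit operand seen by an instruction with a 16-bit immediate field.

    imm_prefix is the upper half latched by a preceding `imm`, or None
    when the instruction stands alone.
    """
    if imm_prefix is None:
        return sign_extend_16(imm16) & 0xFFFFFFFF
    return ((imm_prefix & 0xFFFF) << 16) | (imm16 & 0xFFFF)


print(hex(effective_immediate(0x5678, imm_prefix=0x1234)))
print(hex(effective_immediate(0xFFFF)))
```

This is why every unresolved immediate is a relaxation candidate: the linker can drop the prefix whenever the final value fits in the sign-extended 16-bit form.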
<jrtc27> so basically mips $at but not normally addressable?
<geist> i guess yeah
<geist> mblaze has an *extremely* simple opcode format, such that there's precisely one kinda immediate: bottom 16 bits are immediate
<geist> so all branches, alu immediate, etc use the one format
<geist> so you get 16 bit branches or 32bit branches, period. kinda nice and simple and elegant
<geist> if not inefficient
<geist> which actually riscv is pretty close to, but it doesn't have the 'hack the next instruction instruction'
<jrtc27> rv32ixgeist
<jrtc27> :)
<geist> well all is not great: mblaze has the branch delay slot
<geist> so now you can push it under the bus
<sorear> sounds like fun when you start dealing with interrupts
<geist> another arch i fiddled with for work once was a TI DSP called piccolo i think
<geist> it had a relaxation phase that ran *after* linking
<geist> that then looked at the binary and did constant folding and general relaxations
<geist> i have no idea how that worked 100%
s0ph0s|alt has joined #riscv
<dh`> the answer to my question seems to be that you can't trap between the imm instruction and the following one
<geist> i *guess* if you know where all the text is (it was a separate I and D memory address) you could reconstruct what's going on
<geist> dh`: in mblaze? yeah i remember there being some interlock. i forget what happens if you imm imm imm ....
<geist> some hack for that
s0ph0s has quit [Ping timeout: 264 seconds]
<dh`> geist: the manual I have says that imm is only useful for the next instruction
<dh`> so presumably there's some internal interlock that clears if the next instruction doesn't use it
<geist> mblaze is a cheezy arch but it has enough quirkiness that i kinda like it. xilinx makes it for their fpgas, so i suspect at any time they'll ditch it in favor of riscv
<dh`> and you get your interrupt before the next one takes effect
<geist> yah. their hack for the branch delay slot is to waste a bit on every branch so you can specify if you want a delay slot or an extra cycle with no delay slot
Gravis has quit [Remote host closed the connection]
<dh`> given when microblaze was invented, having delay slots at all seems stupid
Gravis has joined #riscv
<geist> yah it is kinda baked into the arch in one way i remember: to return from IRQ you basically branch to the return spot and in the slot restore the CSR (or whatever the control register is)
<geist> i think MIPS did that too?
<geist> ie, no dedicated return-from-irq instruction
<dh`> mips-I did
<dh`> there's an RFE instruction but it just perturbs the status register, the jump is an ordinary jump
<dh`> but that was changed from mips-III and (like other mips-I stuff) has been retconned out of subsequent docs
<geist> yah iirc that also means on microblaze that one of the registers is trashed in irq
<geist> so the ABI says 'dont touch r15' or something like that
<geist> so it's pretty lame, all in all
<dh`> so they could have copied it but if they were paying attention they would have realized they probably shouldn't
<dh`> really? that's pretty weak
<jrtc27> mips does the same with k0/k1
<dh`> I mean, mips does that but for real reasons
<geist> see, yeah it feels like microblaze copied basically mips-1 in lots of ways
<geist> well the irq is trashed because you need a register to branch *from* when exiting the irq
<jrtc27> ah
<geist> so i think the convention is to use essentially this reserved register to hold the return address
<jrtc27> so no epc then
<geist> right
<geist> actually the hardware maybe actually overwrites this register on irq entry?
<geist> either way it's do not touch in the ABI
* jrtc27 glares at arm's pc register
Narrat has joined #riscv
<dh`> yeah it seems that it does
<dh`> and there are two of them, one for interrupts and one for breakpoints
<dh`> I guess because there's some kind of external breakpoint facility
<geist> yah probably so. since it's intended to be used in FPGAs it probably has a lot of jtaggability
<geist> i guess they put those in the register file so they dont have to create any sort of MSR like functionality
<geist> i guess there are no gigantic bank of control registers
<geist> though there must be at least a status register somewhere
<dh`> they have a bunch of control registers anyway
<geist> ah hmm
<dh`> e.g. it seems that if you get a trap in a delay slot you're supposed to use a control register for the return address instead of the value it drops in the magic general-purpose register
<dh`> does not seem that well designed
<geist> :)
<geist> also why i kinda expect xilinx and altera to both eventually announce that their new soft core support is riscv
<jrtc27> mips has some weirdness around that too though
<jrtc27> where you have to emulate the branch in your trap handler
<geist> mblaze and nios2 (altera's) are highly configurable, so they'd need similar functionality for riscv but no reason they couldn't
<jrtc27> not in the common case though
<dh`> no, you don't, there's a bit to tell you it happened but you can return to the EPC address regardless
<jrtc27> in the normal case yes
<dh`> you only need to care if you're trying to disassemble the faulting instruction
<jrtc27> maybe it's only for dtrace then that it matters
<jrtc27> I just know freebsd has this nasty MipsEmulateBranch function
<dh`> you are perhaps thinking about sparc where the way you return from a trap is two successive branches
<dh`> hmm that seems gross, pretty sure we don't have any such thing
<geist> hmm, the second branch goes to a recovery thing (on sparc?)
<dh`> no, basically sparc handles delay slots by having two PCs, and to return you need to set both of them
<jrtc27> jmp + rett
<dh`> so if you trap in a delay slot the PC is the delay slot and the next-PC is the branch target, and branching to both of them in sequence will continue execution
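[Illustration, not part of the channel log] A toy model of the PC/nPC scheme dh` describes, to show why trap return has to restore both values. The addresses and the `step` helper are invented for this note; real SPARC details (annul bits, register windows, `rett` semantics) are left out.

```python
def step(pc, npc, branch_target=None):
    """Advance one instruction in a PC/nPC machine.

    The instruction at pc executes; if it is a taken delayed branch,
    branch_target replaces the sequential successor of npc.
    """
    next_pc = npc
    next_npc = branch_target if branch_target is not None else npc + 4
    return next_pc, next_npc


# A branch at 0x100 to 0x200, with its delay slot at 0x104.
pc, npc = 0x100, 0x104
pc, npc = step(pc, npc, branch_target=0x200)  # the branch executes
trap_state = (pc, npc)                        # trap taken in the delay slot

# Restoring both values resumes in the delay slot and then reaches the target.
pc, npc = step(*trap_state)
print(hex(pc), hex(npc))

# Restoring only PC and assuming nPC = PC + 4 would lose the pending branch.
lost_pc, lost_npc = step(trap_state[0], trap_state[0] + 4)
print(hex(lost_pc), hex(lost_npc))
```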
<geist> gross!
<dh`> yup!
<geist> i always meant to hack sparc low level, i even have a few old sparc v8 machines floating around, but never really got the activation energy to do so
<jrtc27> the PC + nPC model is kinda nice in that you can have branches in delay slots, but that was only architecturally guaranteed since sparcv9 and deprecated in ultrasparc
<geist> sparc is only vaguely interesting to me, but not over the threshold of other older interesting arches
<jrtc27> (as in, it's the "elegant" model for branch delay slots)
<dh`> how did you do trap return in v8? I thought I read all that in a v8 manual but I might be misremembering
<geist> plus i just really never wanted to deal with that damn register window
<jrtc27> pre-v9 still had the notion of nPC
<jrtc27> it's an interesting question though for exception return
<jrtc27> maybe rett in the delay slot is special-cased?
<dh`> idk
<dh`> could go look but I don't care enough to figure out where the docs are hiding
<dh`> :-)
<dh`> I vaguely recall that it might have been something like "this is the only case where executing a jump in a delay slot is guaranteed to work correctly"
<jrtc27> ah
<jrtc27> v9 adds "done" and "retry"
<jrtc27> depending on whether you want to return to PC or nPC
<jrtc27> and I guess rett for v8 and below wasn't a *delayed* instruction
<jrtc27> thus was legal to have in a delay slot
<dh`> I don't remember, all I ever actually did with sparc was port an assembler/linker
<dh`> where this stuff isn't of immediate concern
richbridger has joined #riscv
* jrtc27 ported libreoffice's uno
<jrtc27> which is a big pile of wat
<jrtc27> ok, rett is delayed, but v8 only left unspecified what happened if you did `conditional branch ; delayed branch of any kind` without the annul bit set on the first
<jrtc27> I guess because they couldn't be bothered to add the checkpointing complexity to hardware
<geist> i guess the gist of all of this is branch delay slots are bad for your health
<jrtc27> depends if you have perverse masochistic tendencies
jimbzy has joined #riscv
zvijezda has joined #riscv
<dh`> anyway back to the earlier topic for a sec, I can't get gas to emit a gp-relative load
<dh`> or gas/ld combined rather
zvijezda has quit [Read error: Connection reset by peer]
<jrtc27> is the target in .sdata (or otherwise within +/-2K of __global_pointer$)?
<dh`> it seems to be
<dh`> oh wait it's in range but not in .sdata
<dh`> ... putting it in .sdata causes it to be assigned address 0 instead of anything reasonable so it's then not in range
<dh`> though it does manage to relax that to a one-instruction read relative to x0
somlo_ is now known as somlo
Narrat has quit [Quit: They say a little knowledge is a dangerous thing, but it's not one half so bad as a lot of ignorance.]
awordnot has joined #riscv