riff_IRC has quit [Quit: PROTO-IRC v0.73a (C) 1988 NetSoft - Built on 11-13-1988 on AT&T System V]
riff-IRC has joined #riscv
smaeul has joined #riscv
zvijezda has quit [Ping timeout: 264 seconds]
a2800276 has quit [Quit: Lost terminal]
iorem6 has quit [Ping timeout: 272 seconds]
vagrantc has quit [Quit: leaving]
SanchayanMaity has joined #riscv
Oli has quit [Quit: leaving]
riff-IRC has quit [Remote host closed the connection]
Forty-B8 has joined #riscv
Forty-B8 is now known as Forty-Bot
SanchayanMaity has quit [Quit: SanchayanMaity]
_whitelogger has joined #riscv
smaeul has quit [Quit: cya]
smaeul has joined #riscv
smartin has joined #riscv
geist has quit [Quit: leaving]
geist has joined #riscv
geertu_ is now known as geertu
geertu has quit [Quit: confusion]
geertu has joined #riscv
geist is now known as geist2
geist2 is now known as geist
smartin has quit [Ping timeout: 264 seconds]
_whitelogger has joined #riscv
smartin has joined #riscv
cmuellner has quit [*.net *.split]
hl has quit [*.net *.split]
jimwilson has quit [*.net *.split]
peepsalot has quit [*.net *.split]
edef has quit [*.net *.split]
kgz has quit [*.net *.split]
balrog has quit [*.net *.split]
Slide-O-Mix has quit [*.net *.split]
owl has quit [*.net *.split]
GreaseMonkey has quit [*.net *.split]
sorear has quit [*.net *.split]
stefanct has quit [*.net *.split]
tux3 has quit [*.net *.split]
aurel32 has quit [*.net *.split]
jn has quit [*.net *.split]
kgz has joined #riscv
balrog has joined #riscv
hl has joined #riscv
jn has joined #riscv
sorear has joined #riscv
stefanct has joined #riscv
aurel32 has joined #riscv
edef has joined #riscv
owl has joined #riscv
peepsalot has joined #riscv
jimwilson has joined #riscv
cmuellner has joined #riscv
GreaseMonkey has joined #riscv
Slide-O-Mix has joined #riscv
tux3 has joined #riscv
hl has quit [*.net *.split]
cmuellner has quit [*.net *.split]
peepsalot has quit [*.net *.split]
edef has quit [*.net *.split]
jimwilson has quit [*.net *.split]
kgz has quit [*.net *.split]
owl has quit [*.net *.split]
GreaseMonkey has quit [*.net *.split]
balrog has quit [*.net *.split]
tux3 has quit [*.net *.split]
stefanct has quit [*.net *.split]
sorear has quit [*.net *.split]
Slide-O-Mix has quit [*.net *.split]
aurel32 has quit [*.net *.split]
jn has quit [*.net *.split]
cmuellner has joined #riscv
stefanct has joined #riscv
hl has joined #riscv
edef has joined #riscv
owl has joined #riscv
peepsalot has joined #riscv
balrog has joined #riscv
jn has joined #riscv
jimwilson has joined #riscv
kgz has joined #riscv
Slide-O-Mix has joined #riscv
sorear has joined #riscv
aurel32 has joined #riscv
tux3 has joined #riscv
GreaseMonkey has joined #riscv
smartin has quit [Ping timeout: 264 seconds]
smartin has joined #riscv
_whitelogger has joined #riscv
aquijoule_ has quit [Remote host closed the connection]
_whitelogger has joined #riscv
TwoNotes has joined #riscv
smartin has quit [Ping timeout: 244 seconds]
ats has quit [Ping timeout: 272 seconds]
ats has joined #riscv
dlan has joined #riscv
dlan has quit [Client Quit]
dlan has joined #riscv
dlan has quit [Quit: leaving]
dlan has joined #riscv
mhorne has quit [Ping timeout: 272 seconds]
riff-IRC has joined #riscv
tnt has joined #riscv
<tnt>
So huh ... I'm trying to do something like: lhu x8, %lo(some_symbol)(x30) and it builds with no error ... but instead of x30, zero gets used silently :/
<jrtc27>
that assembly is fine
<jrtc27>
what are you *actually* doing, and what are you seeing *exactly*?
<jrtc27>
%lo is meant to be used with a register that was written to by a lui %hi
<jrtc27>
so it sees the tiny symbol value and transforms the load to one relative to x0 as the value fits entirely in the immediate
<jrtc27>
if there were a lui %hi it would have been deleted
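(Editor's sketch, not part of the conversation.) The arithmetic behind this can be modelled roughly as follows. Python is used purely for illustration; the function names and sample addresses are made up, and `base_after_relaxation` is a simplified assumption about the behaviour described above, not the linker's actual code. The usual `%hi`/`%lo` split adds 0x800 before taking the upper 20 bits so that the sign-extended low 12 bits add back to the full address. When the upper part is zero, a `lui %hi` would contribute nothing, so the access can be rewritten relative to x0.

```python
def hi_lo(addr):
    """Split a 32-bit address into (%hi, %lo) as consumed by lui + load/store."""
    addr &= 0xFFFFFFFF
    hi = ((addr + 0x800) >> 12) & 0xFFFFF  # 20-bit lui immediate
    lo = addr & 0xFFF
    if lo >= 0x800:                        # the load/store immediate is signed
        lo -= 0x1000
    return hi, lo


def base_after_relaxation(addr, base="x30"):
    """Simplified model (assumption): if %hi is zero the whole value fits in
    the 12-bit immediate, so the base register is replaced by x0."""
    hi, _ = hi_lo(addr)
    return "x0" if hi == 0 else base


print(hi_lo(0x12345FFC))                  # upper part rounds up, low part is -4
print(base_after_relaxation(0x400))       # small symbol: base becomes x0
print(base_after_relaxation(0x12345678))  # large symbol: base is kept
```

In tnt's case x30 does not hold what a `lui %hi` would have produced, which is why swapping it for x0 changes the result.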
<jrtc27>
whatever you're doing here such that it cares about that is likely bad code
<jrtc27>
but if you *really* need it, use .option norelax for that section, or build your entire code base with -mno-relax
<tnt>
Err, unfortunately I'd need it just for that line ...
<jrtc27>
if you are absolutely sure you really need to write your code that way
<tnt>
I know that symbol fits entirely in %lo because the system doesn't even have more than 4k of addressable data space.
<jrtc27>
.option push .option norelax <code> .option pop
<jrtc27>
but I would seriously encourage you to change your code to not be so dodgy
<tnt>
The system doesn't even have _any_ RAM ... and only 1kbyte of instruction-ROM, so every opcode I can save, I need to save.
<jrtc27>
so why do you need x30?
<tnt>
A bit hard to explain, but that symbol can be used in 2 different "addressing context" depending on the type of hw access and they have different base addresses. 'x30' is set as a constant that always contains one of the base address for a type of access.
<jrtc27>
then you're using symbol values in a non-standard way
<tnt>
Yes
<jrtc27>
don't do that
<tnt>
But short of hard coding them I couldn't find any other way ...
<tnt>
Even if I changed it so that the symbol contains the base address ... how would I tell the compiler that x30 _always_ contains the proper upper bits so it doesn't reload them each time? (I can't afford extra instructions to set x30 to the same value over and over)
<jrtc27>
you wouldn't need to
<jrtc27>
so long as the register input to the load has the value that a lui %hi would have
<jrtc27>
it works
<tnt>
Actually another type of access that would have the same issue: Imagine I want to do array access. a[x30] and I know 'a' fits in imm12 ... that'd be the same code.
<tnt>
How exactly would it "know" the value of that register ?!?
<jrtc27>
compilers never know that so the relocations do not support that
<tnt>
Anyway, the .option thing works, thanks. I'll use that since I don't see any better option that wouldn't cost me added instructions. I wish I could do like lhu x8, some_symbol(x30) (without %lo) and it just errors out during linking if it turns out 'some_symbol' doesn't fit ...
<jrtc27>
would be a waste of a relocation id
<tnt>
¯\_(ツ)_/¯
<tnt>
I know my use case is special but on a very tiny micro, I wouldn't rule out knowing _by_design_ that a symbol fits in the first 2k of memory, and allowing single opcode array indexing seems useful to me.
<jrtc27>
it does, you just have to turn off linker relaxation
<jrtc27>
because linker relaxation is built on the premise of knowing higher-level information about what you're doing
<jrtc27>
that matches what any compiler will do
<jrtc27>
and is designed to take pessimistic general code emitted by a compiler (ie no knowledge of what value symbols have) and optimise it based on what's known at link time
<jrtc27>
but if you're writing code with that knowledge already then it's not needed and wrong
<tnt>
I'm just not seeing what the "correct" way would be that would yield the same result.
<jrtc27>
I don't think there is
<jrtc27>
this is a slightly interesting oversight
<jrtc27>
https://godbolt.org/z/65se31jcx could be one instruction shorter even in the general case if the ABI had a way to express that
<tnt>
Turns out I can use -mno-relax globally, I thought this would affect 'li' of constants (that auto use 1 or 2 opcodes depending on the constant), but I guess given the constants are known, that's a different process and I know all my symbol values fit.
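(Editor's sketch, not part of the conversation.) A rough model of why `li` is unaffected: the constant is known when the instruction is assembled, so the one-or-two-instruction choice can be made right there with no relocation and no link-time relaxation. The function below is a hypothetical illustration for RV32, not gas's actual implementation.

```python
def expand_li(rd, value):
    """Return an RV32 instruction sequence for `li rd, value` (illustrative)."""
    value &= 0xFFFFFFFF
    signed = value - (1 << 32) if value & 0x80000000 else value
    if -2048 <= signed <= 2047:
        # Fits the signed 12-bit immediate: a single addi from x0.
        return [f"addi {rd}, x0, {signed}"]
    hi = ((value + 0x800) >> 12) & 0xFFFFF  # rounded upper 20 bits
    lo = value & 0xFFF
    if lo >= 0x800:                         # addi sign-extends its immediate
        lo -= 0x1000
    seq = [f"lui {rd}, {hi:#x}"]
    if lo != 0:
        seq.append(f"addi {rd}, {rd}, {lo}")
    return seq


for constant in (42, 0x12345000, 0x12345678):
    print(hex(constant), expand_li("a0", constant))
```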
<dh`>
if you have two base registers, you might be able to make the linker behave if you use gp for one and tp for the other
<dh`>
but tbh if you have only 1K of code, why bother using a linker?
<tnt>
I have 1 k of code but a few k of data ROM and I don't want to manually keep track of what ends up in ROM at which address and manually fix up each reference.
<dh`>
seems like you could still keep it all in one source file without it becoming unmaintainable, though
<tnt>
Huh ... what ? I mean it's all in one file, that doesn't mean you don't get the linker involved.
<dh`>
if it's one file you can make the assembler resolve everything
<dh`>
though gas wasn't ever really intended to be used that way so it may not actually work
<jrtc27>
still needs -mno-relax otherwise gas will defer everything to link time in case you link in something else
<dh`>
even if you make the base addresses absolute?
<dh`>
seems like that ought to turn them into constants
<dh`>
but see: gas wasn't ever really intended to be used that way
<jrtc27>
you can't make labels absolute at assemble time
<jrtc27>
you can have absolute symbols, but they're not labels
<jrtc27>
you can't put anything there
<jrtc27>
except maybe if you do nasty things with .org?
<jrtc27>
(does that work/exist outside of x86? only ever seen that for bios boot block thingies)
<dh`>
that's the point of .org
djdelorie has joined #riscv
<dh`>
doesn't seem to work though
<dh`>
riscv gas accepts it but the net result is weirdly wrong
<dh`>
it seems to treat the .org value as an offset from the beginning of .text
<dh`>
(regardless of whether that's the current section, too)
<dh`>
ok, not actually that broken, it is actually an offset from the beginning of the current section
<dh`>
but it doesn't generate absolute symbols and it seems like it ought to
<dh`>
apparently this is the intended behavior, how stupid
<dh`>
ah well
<dh`>
it seems like someone might be interested in a riscv assembler meant for use with such projects
<dh`>
also, it looks like if you write la t0, sym; lw t0, 0(t0) and sym is within range of gp, the linker produces two instructions, not one
<dh`>
and writing "lw t0, sym" does not fix this, still produces different instructions (but different ones)
<dh`>
erm
<dh`>
still produces _two_ instructions
<dh`>
(is there an explicit relocation widget for gp-relative %lo? none of %gplo, %gprel, or %gp works and the gas manual doesn't document the riscv relocation widgets)
<jrtc27>
not currently
<jrtc27>
the compact code ~~model~~ ABI introduces one to expose the existing binutils-internal relocation
<jrtc27>
lw t0, sym should result in one instruction post-relaxation
<jrtc27>
la t0, sym; lw t0, 0(t0) can't do anything with the load because it's just a load
<jrtc27>
though there's nothing stopping you using explicit %pcrel_hi and %pcrel_lo relocs
<jrtc27>
(and you can reuse the same %pcrel_hi with multiple %pcrel_lo's for the same symbol if you say want to load, add 1 and store)
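(Editor's sketch, not part of the conversation.) The pc-relative pair works like `%hi`/`%lo`, except the split is applied to the distance from the `auipc` to the symbol. The assembly in the comment shows the load, add 1, store pattern mentioned above; the label `1` and the symbol name `sym` are illustrative. The Python helpers are hypothetical and only model the arithmetic.

```python
# Assembly pattern being modelled (note that %pcrel_lo names the label of
# the auipc, not the symbol itself):
#
#   1: auipc t0, %pcrel_hi(sym)
#      lw    t1, %pcrel_lo(1b)(t0)
#      addi  t1, t1, 1
#      sw    t1, %pcrel_lo(1b)(t0)

def pcrel_hi_lo(auipc_pc, sym):
    """Split the distance from the auipc to sym into (hi20, lo12)."""
    offset = (sym - auipc_pc) & 0xFFFFFFFF
    hi = ((offset + 0x800) >> 12) & 0xFFFFF
    lo = offset & 0xFFF
    if lo >= 0x800:  # low part is a signed 12-bit immediate
        lo -= 0x1000
    return hi, lo


def effective_address(auipc_pc, hi, lo):
    """Address formed by auipc (pc + hi << 12) plus the load/store offset."""
    return (auipc_pc + (hi << 12) + lo) & 0xFFFFFFFF


hi, lo = pcrel_hi_lo(0x1000, 0x2FFC)
# The load and the store reuse the same t0 and the same low part.
print(hex(effective_address(0x1000, hi, lo)))
```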
jimbzy has joined #riscv
jimbzy has quit [Killed (NickServ (GHOST command used by jim_!~jim@67.6.38.172))]
ats has quit [Ping timeout: 265 seconds]
vagrantc has joined #riscv
ats has joined #riscv
s0ph0s has quit [Read error: Connection reset by peer]
s0ph0s has joined #riscv
<dh`>
lw t0, sym generates auipc and lw
<dh`>
depending on how the linker relocations work I might or might not expect it to pick up la followed by lw
<dh`>
er, relaxations
<dh`>
does it scan for patterns or is it driven entirely by relocations that arise from expansions in gas?
<jrtc27>
entirely relocations
<jrtc27>
each relocation is relaxed independently
<geist>
and then as a result of it it has to tweak all the jumps that are now the wrong thing, etc. lots of rel entries in an .o file
<jrtc27>
and designed such that, provided you use them "normally" (ie not like the original question here), the composition is thus relaxed fine
<jrtc27>
yes, the bloat in .o files isn't great
<jrtc27>
but who cares about build intermediates
<dh`>
there are packages where build intermediates are large enough to still cause problems occasionally
* dh`
glares at rust
<dh`>
but yeah, it doesn't really matter
<jrtc27>
disks are cheap
<dh`>
ISTM that scanning for relaxable patterns would be more effective
<dh`>
but I suppose that makes it hard to disable them
<geist>
yah totally. first time i bumped into an arch like this that relied on relaxations so much was microblaze. it has a quirk that *all* immediates in the arch can either be 16 or 32 bit and thus all can be relaxed if they're not resolved at link time
<jrtc27>
I mean, the real answer is LTO that's tightly integrated with the linker layout algorithm so you just generate the right code in the first place
<geist>
by inserting an opcode that loads the top 16 bits into a hidden register before the real instruction
<geist>
so it's rel entries out the wazoo
<jrtc27>
interesting
<dh`>
(how hidden? what happens if you trap between the two instructions?)
* jrtc27
wonders how long arc/microblaze/nios will stick around given the long-term sensible thing is surely riscv for that use case
<dh`>
(I suppose I have a microblaze manual here somewhere)
<dh`>
I am starting to feel like unix linkers are holding the world back a fair amount
<dh`>
but not sure what would constitute an improvement
<geist>
microblaze has a 'load upper' i think instruction that loads the top 16 bits into a hidden register that's consumed by the next instruction that uses a 16bit immediate
<geist>
and/or it patches exactly the next instruction
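(Editor's sketch, not part of the conversation.) A toy model of the mechanism geist describes, assuming the behaviour is as stated here: an `imm` prefix supplies the upper 16 bits for the very next instruction, and without a prefix the 16-bit immediate is sign-extended. The function name and the `imm_prefix` parameter are hypothetical.

```python
def effective_immediate(imm16, imm_prefix=None):
    """Return the 32-bit immediate seen by an instruction with a 16-bit field.

    imm_prefix is the value latched by a preceding `imm` instruction, or
    None when the previous instruction was not an `imm`.
    """
    imm16 &= 0xFFFF
    if imm_prefix is None:
        # No prefix: sign-extend the 16-bit field to 32 bits.
        return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16
    # Prefix present: it provides the upper half, no sign extension.
    return ((imm_prefix & 0xFFFF) << 16) | imm16


print(hex(effective_immediate(0x5678, imm_prefix=0x1234)))  # full 32-bit value
print(hex(effective_immediate(0x8000)))                     # sign-extended
```

This is also why an unresolved immediate costs a relocation: the linker has to decide whether the prefix instruction is needed at all.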
<jrtc27>
so basically mips $at but not normally addressable?
<geist>
i guess yeah
<geist>
mblaze has an *extremely* simple opcode format, such that there's precisely one kinda immediate: bottom 16 bits are immediate
<geist>
so all branches, alu immediate, etc use the one format
<geist>
so you get 16 bit branches or 32bit branches, period. kinda nice and simple and elegant
<geist>
if not inefficient
<geist>
which actually riscv is pretty close to, but it doesn't have the 'hack the next instruction instruction'
<jrtc27>
rv32ixgeist
<jrtc27>
:)
<geist>
well all is not great: mblaze has the branch delay slot
<geist>
so now you can push it under the bus
<sorear>
sounds like fun when you start dealing with interrupts
<geist>
another arch i fiddled with for work once was a TI DSP called piccolo i think
<geist>
it had a relaxation phase that ran *after* linking
<geist>
that then looked at the binary and did constant folding and general relaxations
<geist>
i have no idea how that worked 100%
s0ph0s|alt has joined #riscv
<dh`>
the answer to my question seems to be that you can't trap between the imm instruction and the following one
<geist>
i *guess* if you know where all the text is (it was a separate I and D memory address) you could reconstruct what's going on
<geist>
dh`: in mblaze? yeah i remember there being some interlock. i forget what happens if you imm imm imm ....
<geist>
some hack for that
s0ph0s has quit [Ping timeout: 264 seconds]
<dh`>
geist: the manual I have says that imm is only useful for the next instruction
<dh`>
so presumably there's some internal interlock that clears if the next instruction doesn't use it
<geist>
mblaze is a cheezy arch but it has enough quirkiness that i kinda like it. xilinx makes it for their fpgas, so i suspect at any time they'll ditch it in favor of riscv
<dh`>
and you get your interrupt before the next one takes effect
<geist>
yah. their hack for the branch delay slot is to waste a bit on every branch so you can specify if you want a delay slot or an extra cycle with no delay slot
Gravis has quit [Remote host closed the connection]
<dh`>
given when microblaze was invented, having delay slots at all seems stupid
Gravis has joined #riscv
<geist>
yah it is kinda baked into the arch in one way i remember: to return from IRQ you basically branch to the return spot and in the slot restore the CSR (or whatever the control register is)
<geist>
i think MIPS did that too?
<geist>
ie, no dedicated return-from-irq instruction
<dh`>
mips-I did
<dh`>
there's an RFE instruction but it just perturbs the status register, the jump is an ordinary jump
<dh`>
but that was changed from mips-III and (like other mips-I stuff) has been retconned out of subsequent docs
<geist>
yah iirc that also means on microblaze that one of the registers is trashed in irq
<geist>
so the ABI says 'dont touch r15' or something like that
<geist>
so it's pretty lame, all in all
<dh`>
so they could have copied it but if they were paying attention they would have realized they probably shouldn't
<dh`>
really? that's pretty weak
<jrtc27>
mips does the same with k0/k1
<dh`>
I mean, mips does that but for real reasons
<geist>
see, yeah it feels like microblaze copied basically mips-1 in lots of ways
<geist>
well the irq is trashed because you need a register to branch *from* when exiting the irq
<jrtc27>
ah
<geist>
so i think the convention is to use essentially this reserved register to hold the return address
<jrtc27>
so no epc then
<geist>
right
<geist>
actually the hardware maybe actually overwrites this register on irq entry?
<geist>
either way it's do not touch in the ABI
* jrtc27
glares at arm's pc register
Narrat has joined #riscv
<dh`>
yeah it seems that it does
<dh`>
and there are two of them, one for interrupts and one for breakpoints
<dh`>
I guess because there's some kind of external breakpoint facility
<geist>
yah probably so. since it's intended to be used in FPGAs it probably has a lot of jtaggability
<geist>
i guess they put those in the register file so they dont have to create any sort of MSR like functionality
<geist>
i guess there are no gigantic bank of control registers
<geist>
though there must be at least a status register somewhere
<dh`>
they have a bunch of control registers anyway
<geist>
ah hmm
<dh`>
e.g. it seems that if you get a trap in a delay slot you're supposed to use a control register for the return address instead of the value it drops in the magic general-purpose register
<dh`>
does not seem that well designed
<geist>
:)
<geist>
also why i kinda expect xilinx and altera to both eventually announce that their new soft core support is riscv
<jrtc27>
mips has some weirdness around that too though
<jrtc27>
where you have to emulate the branch in your trap handler
<geist>
mblaze and nios2 (alteras) are highly configurable, so they'd need similar functionality for riscv but no reason they couldn't
<jrtc27>
not in the common case thoug
<jrtc27>
*h
<dh`>
no, you don't, there's a bit to tell you it happened but you can return to the EPC address regardless
<jrtc27>
in the normal case yes
<dh`>
you only need to care if you're trying to disassemble the faulting instruction
<jrtc27>
maybe it's only for dtrace then that it matters
<jrtc27>
I just know freebsd has this nasty MipsEmulateBranch function
<dh`>
you are perhaps thinking about sparc where the way you return from a trap is two successive branches
<dh`>
hmm that seems gross, pretty sure we don't have any such thing
<geist>
hmm, the second branch goes to a recovery thing (on sparc?)
<dh`>
no, basically sparc handles delay slots by having two PCs, and to return you need to set both of them
<jrtc27>
jmp + rett
<dh`>
so if you trap in a delay slot the PC is the delay slot and the next-PC is the branch target, and branching to both of them in sequence will continue execution
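(Editor's sketch, not part of the conversation.) The two-PC scheme can be modelled in a few lines. This is a deliberately tiny simulation with made-up addresses: `step` and `return_from_trap` are hypothetical helpers, and annul bits, register windows and everything else are ignored.

```python
def step(pc, npc, branch_target=None):
    """Execute the instruction at pc and return the next (pc, npc).

    The old nPC always becomes the new PC. A taken delayed branch only
    changes the new nPC, which is what creates the delay slot.
    """
    new_npc = branch_target if branch_target is not None else npc + 4
    return npc, new_npc


def return_from_trap(jmp_pc, saved_pc, saved_npc):
    """Model `jmp saved_pc` followed by `rett saved_npc` in its delay slot."""
    pc, npc = jmp_pc, jmp_pc + 4
    pc, npc = step(pc, npc, branch_target=saved_pc)   # jmp to the saved PC
    pc, npc = step(pc, npc, branch_target=saved_npc)  # rett to the saved nPC
    return pc, npc


# A branch at 0x100 targeting 0x200, with its delay slot at 0x104.
pc, npc = step(0x100, 0x104, branch_target=0x200)
print(hex(pc), hex(npc))  # state while the delay slot runs

# A trap taken there saves both values; the two branches restore them.
print([hex(v) for v in return_from_trap(0x4000, pc, npc)])
```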
<geist>
gross!
<dh`>
yup!
<geist>
i always meant to hack sparc low level, i even have a few old sparc v8 machines floating around, but never really got the activation energy to do so
<jrtc27>
the PC + nPC model is kinda nice in that you can have branches in delay slots, but that was only architecturally guaranteed since sparcv9 and deprecated in ultrasparc
<geist>
sparc is only vaguely interesting to me, but not over the threshold of other older interesting arches
<jrtc27>
(as in, it's the "elegant" model for branch delay slots)
<dh`>
how did you do trap return in v8? I thought I read all that in a v8 manual but I might be misremembering
<geist>
plus i just really never wanted to deal with that damn register window
<jrtc27>
pre-v9 still had the notion of nPC
<jrtc27>
it's an interesting question though for exception return
<jrtc27>
maybe rett in the delay slot is special-cased?
<dh`>
idk
<dh`>
could go look but I don't care enough to figure out where the docs are hiding
<dh`>
:-)
<dh`>
I vaguely recall that it might have been something like "this is the only case where executing a jump in a delay slot is guaranteed to work correctly"
<jrtc27>
ah
<jrtc27>
v9 adds "done" and "retry"
<jrtc27>
depending on whether you want to return to PC or nPC
<jrtc27>
and I guess rett for v8 and below wasn't a *delayed* instruction
<jrtc27>
thus was legal to have in a delay slow
<jrtc27>
*t
<dh`>
I don't remember, all I ever actually did with sparc was port an assembler/linker
<dh`>
where this stuff isn't of immediate concern
richbridger has joined #riscv
* jrtc27
ported libreoffice's uno
<jrtc27>
which is a big pile of wat
<jrtc27>
ok, rett is delayed, but v8 only left unspecified what happened if you did `conditional branch ; delayed branch of any kind` without the annul bit set on the first
<jrtc27>
I guess because they couldn't be bothered to add the checkpointing complexity to hardware
<geist>
i guess the gist of all of this is branch delay slots are bad for your health
<jrtc27>
depends if you have perverse masochistic tendencies
jimbzy has joined #riscv
zvijezda has joined #riscv
<dh`>
anyway back to the earlier topic for a sec, I can't get gas to emit a gp-relative load
<dh`>
or gas/ld combined rather
zvijezda has quit [Read error: Connection reset by peer]
<jrtc27>
is the target in .sdata (or otherwise within +/-2K of __global_pointer$)?
<dh`>
it seems to be
<dh`>
oh wait it's in range but not in .sdata
<dh`>
... putting it in .sdata causes it to be assigned address 0 instead of anything reasonable so it's then not in range
<dh`>
though it does manage to relax that to a one-instruction read relative to x0
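(Editor's sketch, not part of the conversation.) What dh` observed can be summarised as a reachability check: a single load works whenever the target lies within a signed 12-bit offset of some register whose value is known at link time, here x0 or gp. The helper below is hypothetical, the order in which the bases are tried is an assumption, and the sample gp value is made up.

```python
IMM12_MIN, IMM12_MAX = -2048, 2047  # range of a signed 12-bit immediate


def single_instruction_base(sym, gp):
    """Return (base_register, offset) if one load can reach sym, else None."""
    for name, base in (("x0", 0), ("gp", gp)):
        offset = sym - base
        if IMM12_MIN <= offset <= IMM12_MAX:
            return name, offset
    return None  # needs an auipc/lui plus the load


GP = 0x10800  # illustrative value for __global_pointer$

print(single_instruction_base(0x0, GP))      # address 0: reachable from x0
print(single_instruction_base(0x10000, GP))  # reachable from gp
print(single_instruction_base(0x20000, GP))  # out of range of both
```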
somlo_ is now known as somlo
Narrat has quit [Quit: They say a little knowledge is a dangerous thing, but it's not one half so bad as a lot of ignorance.]