<jporquet>
and then for all the other possible combinations (e.g. rv32imafd == rv32g), these libs are reused
<jporquet>
the problem is that if libgcc is compiled using a rv32imac configuration, then it won't work when running gcc with -march=rv32g since the application will be linked against code that contains compressed instructions
<jporquet>
in other words, if my cpu doesn't support the C extension, I can't make sure that all my code is compiled in rv32g
<jporquet>
am I missing something?
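The linking concern raised above can be sketched as a subset check: a prebuilt library is only safe on a CPU if every extension the library was compiled for is implemented by that CPU. This is a simplified illustration, not GCC's actual multilib matching logic. The function names are hypothetical, only single-letter extensions are handled, `g` is expanded to `imafd` as stated in the discussion, and the floating-point ABI is ignored.

```python
def extensions(march):
    """Return the set of single-letter extensions in a simple -march string.

    Assumption: only strings like "rv32imac" or "rv64g" are handled;
    multi-letter extensions (Z*, X*) and version numbers are not parsed.
    """
    march = march.lower()
    if not march.startswith(("rv32", "rv64")):
        raise ValueError("unsupported ISA string: " + march)
    exts = set()
    for letter in march[4:]:
        if letter == "g":
            exts.update("imafd")  # G treated as shorthand for IMAFD
        else:
            exts.add(letter)
    return exts


def can_link(lib_march, cpu_march):
    """True if code built for lib_march uses only extensions in cpu_march."""
    same_xlen = lib_march[:4].lower() == cpu_march[:4].lower()
    return same_xlen and extensions(lib_march) <= extensions(cpu_march)


print(can_link("rv32imac", "rv32g"))   # library may contain C instructions
print(can_link("rv32imafd", "rv32g"))  # same extension set as rv32g
```

Under these assumptions, an rv32imac libgcc fails the check for an rv32g CPU because `c` is not in the CPU's set, while an rv32imafd build passes.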
<jrtc27>
the output is valid and works, it's just not helpful for your specific use case
<jrtc27>
I agree though that some of those mappings seem rather strange
<jrtc27>
but at the same time, imac and imafdc are by far the most common sets of extensions
<jrtc27>
it's strongly recommended that you support one of those
<xentrac>
does the c extension usually make code run faster?
<jrtc27>
it reduces icache pressure
<xentrac>
right
<jrtc27>
and, if you have virtual memory, itlb pressure
<xentrac>
(I mean obviously that's a question about implementations, not architectures, so I'm asking about your experience with popular implementation techniques)
<jporquet>
what's weird is that the G meta-extension (IMAFD) was marketed as the default extension, but it seems like with the default configuration, GCC is compiled for GC only
<pabs3>
do many RISC-V CPU designs not support the C extension?
<xentrac>
not many, surely some
<jrtc27>
at the time G was defined it was not clear what the state of C would be
<jporquet>
if you have a CPU that doesn't support the C extension, you can't use vanilla GCC
<jrtc27>
distributions, implementations and embedded OS'es all settled on C being assumed in the end
<jrtc27>
sure you can
<jrtc27>
just make sure you have the right multilib
<jrtc27>
I assume
<jporquet>
all the generated multilib use the C extension
<jrtc27>
yes but if you build an rv32imafd libgcc and put it in the right place will it not be picked up?
<jrtc27>
and if not, well, -L is your friend I guess
<jporquet>
sure, but as said previously, it still means that vanilla GCC does not strictly support RV{32,64}G even though it's supported to be the default set of extensions :|
<jporquet>
*supposed to be
<jrtc27>
"support" is not a well-defined term here
<jporquet>
what do you mean?
<jrtc27>
GCC the compiler supports everything
<jrtc27>
it ships with some pre-built libgcc's for convenience
<jrtc27>
those happen to not include a configuration that you support
<jrtc27>
but that's libgcc the runtime, not GCC the compiler
<jporquet>
I'll rephrase by saying: I'm surprised that no libgcc is pre-compiled to be strictly compatible with rv32/64g since the G extension is supposed to be the default hardware-wise
<jrtc27>
the second part of your statement is false
<jrtc27>
and it's not surprising because it would be a waste of space when the linux community has collectively agreed on GC as the base ISA
<jrtc27>
with IMAC if you want to do soft-float
<jrtc27>
but on a linux-capable system the complexity of implementing C is insignificant
<jporquet>
I hear you but I don't think my statement is false
<jporquet>
there's a whole chapter in the specs about the G ISA
<jrtc27>
G means general-purpose not default
<jporquet>
hmmm
<jrtc27>
because C doesn't add any new *functionality*
<jporquet>
I'm not a native speaker so I won't argue further, but I find it confusing nonetheless
<jporquet>
anyway, thanks for your insight, really appreciate it
<geist>
yah 'pv' is a nice app for a quickie benchmark: 'pv /dev/zero > /dev/null'
<leah2>
yeah i used pv :)
<xentrac>
ha cool
<jrtc27>
too scared of dd? :P
<xentrac>
pv tells you the answer before it exits
<geist>
it gives you a running display which is nice
<jrtc27>
so does dd if you killall -INFO it
<jrtc27>
or, if you're on FreeBSD, just press ^T
<geist>
also helpful for quickie benchmarks like 'pv /dev/zero | md5sum'
<xentrac>
oh neat, I didn't know that about dd
<xentrac>
I miss ^T from VMS
<xentrac>
(because I'm not running BSD, of course. self-inflicted injury)
<geist>
pv has a kinda neat thing that it also uses splice() fairly aggressively on linux, so sometimes depending on what you're piping from and to, it avoids a copy
<geist>
so pv /dev/zero > /dev/null *probably* just splices between those two fds
<geist>
OTOH, that may also skew your particular benchmark here, depending on if linux decides to short circuit that internally
<leah2>
i'll try fio later
<leah2>
but machine is building rust atm
<xentrac>
thanks jrtc27!
<geist>
oh that's slow no matter what arch you're on!
<jrtc27>
macOS also has ^T
<jrtc27>
it's really just Linux that sucks here
<geist>
yah that's why i dont get too invested in particular flavors of dd
<geist>
it's like tar, you're always finding that your version doesn't have this or that
<jrtc27>
it's the OS not supporting ^T for SIGINFO, not a property of the dd
<geist>
sure, but same result
<xentrac>
Linux doesn't support ^Y either, which I also used to miss a lot
<xentrac>
although I don't actually know if ^Y is useful with ssh
<geist>
hmm, what is ^Y supposed to do?
<xentrac>
dsusp
<geist>
and yeah VMS DCL and whatnot is pretty neat
<xentrac>
it sends SIGTSTP like ^Z, but not to the tty pgrp, but rather to whatever process tries to read the poisoned ^Y
<geist>
the job control is wonky if you came from unix, but it's pretty powerful once you grok it
<jrtc27>
(also I meant -USR1 for Linux, not -INFO, because Linux doesn't have SIGINFO, except on alpha because Linux's ABI is a mess)
<xentrac>
so in particular you could rsh hercules, start some stuff on hercules, ~^Y, rsh hephaestus, and do stuff on hephaestus, while still seeing the output from whatever you were doing on hercules
<xentrac>
because only the outbound rsh process got suspended, not the inbound half
<xentrac>
ssh doesn't do the forking into two unidirectional processes thing that rsh did, so I don't think it would work
<xentrac>
I never learned to use DCL job control. I liked DCL a lot but didn't understand it much. but I was just a kid
<geist>
that's kinda expected, IMO. far fewer syscalls than the probable default of 512
<geist>
(1MB instead of 512)
<xentrac>
yeah, Emacs ^Y is "yank" (paste "killed" text), and bash and zsh copied that. I think tcsh too
<xentrac>
amusingly enough Emacs doesn't have convenient keys for either Unix's ^U (originally @) or ^W (which I think was added in BSD, post-printing-teletype)
<xentrac>
I end up using alt-← for ^W in Emacs (which also works in bash's readline)
<xentrac>
2GB/s sounds significantly different from 0.12GB/s. does that help, leah2?
<geist>
the block size of the access probably matters a lot here
<geist>
the bs=1024k is doing a syscall per 1MB
<geist>
vs whatever block size the other tests are
<geist>
i bet if you start lowering that bs to 256k then 64k etc you'll see the speed roll off
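The block-size argument above comes down to simple arithmetic. The sketch below counts syscalls for a copy, assuming each block costs exactly one `read()` and one `write()` and that no short reads occur. The function name and the 1 GiB transfer size are illustrative choices, not figures from the discussion.

```python
def syscalls_needed(total_bytes, block_size):
    """Estimate syscalls for a copy loop: one read() and one write() per block."""
    blocks = -(-total_bytes // block_size)  # ceiling division
    return 2 * blocks


ONE_GIB = 1 << 30
for label, bs in [("1024k", 1024 * 1024), ("256k", 256 * 1024),
                  ("64k", 64 * 1024), ("512", 512)]:
    print("bs=" + label, syscalls_needed(ONE_GIB, bs))
```

For a 1 GiB transfer this gives roughly two thousand syscalls at bs=1024k and roughly four million at bs=512, which is why per-syscall overhead dominates at small block sizes.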
<dh`>
^U in emacs is ^A^K, which is deeply wired into emacs users' fingers to the extent that running screen means your windows disappear without realizing what happened
<dh`>
(^A is screen's attention key and ^K means kill this screen)
<xentrac>
more recent versions of screen have rebound that to ^Aky
<xentrac>
because it used to be a real annoyance
<dh`>
I solved that problem by not using screen
<dh`>
I still don't understand why screen never fixed their shit so you could use a function key as the attention key
<dh`>
it has to be a single byte, not an escape sequence
<xentrac>
my cousin rebound F1 to ^A and ^A to ^Aa in his xterm, problem solved
<sorear>
hah
<xentrac>
not all his xterms, only the ones he launches to connect to screen sessions
<xentrac>
using a function key as the attention key in screen itself requires some kind of timeout mechanism to decide when to just pass the initial ^[ on to vi rather than waiting for the rest of the function key escape sequence
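A minimal sketch of the timeout mechanism described above, working offline over a recorded list of `(arrival_time, byte)` pairs rather than a live terminal. The 50 ms threshold, the function name, and the small arrow-key table are assumptions chosen for illustration. Real programs implement this with timers on a file descriptor.

```python
ESC = 0x1B
ARROWS = {b"[A": "UP", b"[B": "DOWN", b"[C": "RIGHT", b"[D": "LEFT"}


def decode(events, timeout=0.05):
    """Turn (arrival_time_seconds, byte) pairs into key names.

    An ESC followed quickly by a known two-byte tail is treated as an arrow
    key; otherwise the ESC is passed through as a key of its own.
    """
    keys = []
    i = 0
    while i < len(events):
        when, byte = events[i]
        if byte == ESC:
            tail = events[i + 1:i + 3]
            seq = bytes(b for _, b in tail)
            in_time = all(t - when <= timeout for t, _ in tail)
            if seq in ARROWS and in_time:
                keys.append(ARROWS[seq])
                i += 3
                continue
            keys.append("ESC")
            i += 1
            continue
        keys.append(chr(byte))
        i += 1
    return keys


up = [0x1B, ord("["), ord("A")]
print(decode(list(zip([0.000, 0.001, 0.002], up))))  # arrow key, bytes arrive together
print(decode(list(zip([0.0, 0.3, 0.6], up))))        # human typing Esc [ A
print(decode(list(zip([0.0, 0.2, 0.201], up))))      # arrow key delayed by the network
```

The third call shows the weakness of the approach: the bytes of a genuine arrow key, spread out by network delay, are indistinguishable from slow typing, so the decoder guesses wrong.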
<dh`>
sure
<dh`>
but every other damn program does that, why can't screen?
<xentrac>
well, most don't; instead they just don't use ^[ by itself
<dh`>
lots do, including vi
<xentrac>
vim and irssi do the timeout thing, vi doesn't last I checked
<dh`>
I mean, this is stupid and should have been fixed at the os level 40 years ago
<dh`>
it has to, there is no other way to use esc as a keystroke
<xentrac>
it's profoundly annoying in irssi because in irssi network latency makes it guess wrong pretty often
<xentrac>
sure there is, you can support esc and not support function keys
<xentrac>
as for ^A^K, by default I think of a two-chord sequence as being profoundly different from a one-chord sequence, but I guess that's mostly for things that repeat, like ^W^W^W. and repeating ^U doesn't make sense
<dh`>
that's not a viable proposition for programs that actually do any kind of input editing
<xentrac>
not sure what you mean by input editing but vi spent decades supporting esc and not supporting function keys
<xentrac>
I don't know if the OS really needs to be involved with solving this. you just need a protocol for sending streams of keystrokes and maybe other events like touchmove events that doesn't have this kind of parsing ambiguity
<dh`>
the OS needs to be involved with this because it's the OS that sends the input stream
<dh`>
anyway yeah true, archaic vi doesn't support anything that has an escape prefix
<xentrac>
maybe in a virtual console, but I'm typing this in gnome-terminal
<dh`>
which gets its input from a pty
<xentrac>
well no, it sends my keystrokes to a pty
<xentrac>
it gets the keystrokes as input over its socket to the X server
<dh`>
yes, and the pty munges them because that's what unix ttys do
<xentrac>
(it does get input from a pty but that input isn't keystrokes, it's screen contents)
<xentrac>
a little, right now the pty is in raw mode because I'm running ssh on it
<dh`>
anyway the standards for what you read from ptys as input keystrokes are an OS thing
<xentrac>
potentially? I mean historically Unix treats them as a private matter between the terminal and the application
<dh`>
which is why the situation remains broken, because nobody has the authority to fix it
<dh`>
yes and no, the mapping between input sequences and any kind of useful keystroke concept beyond ascii sits in libcurses
<xentrac>
sure, if you consider curses part of the operating system. and, hey, not only curses but also terminfo are in posix
<xentrac>
my design for Wercam sends the keystrokes over a seqpacket socket rather than a pty and represents key events with packets containing "key up %d %n" or "key down %d x", where %d is a numeric scan code from the USB HID standard and x is a URL-encoded string which represents its default UTF-8 value
<dh`>
curses is definitely part of the operating system
<xentrac>
it's just a library. it doesn't have anything to do with securely multiplexing resources
<dh`>
next you're going to say /bin/sh isn't part of the operating system
<xentrac>
agreed, it's not
<xentrac>
although there are lots of historical operating systems where the shell *was* part of the operating system
<xentrac>
it's just a semantic argument, though
<dh`>
this is a silly thing to argue about, but there is historical practice
<xentrac>
I recognize that the sense I'm using "operating system" in is a bit old-fashioned
<xentrac>
so in the broader sense of "functionality shared between many or all applications" certainly key event encoding is part of the operating system
<xentrac>
oh I see I wrote "key up %d %n" where I meant "key up %d x"
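The key-event packet format described above ("key up %d x" and "key down %d x") can be sketched as follows. This is not Wercam's actual code. The function names are hypothetical, the socket layer is omitted, and the scan codes in the usage lines (4 for the A key, 44 for the space bar) are illustrative values taken from the USB HID keyboard usage table.

```python
from urllib.parse import quote, unquote


def encode_key_event(direction, scan_code, text):
    """Build one packet: 'key <up|down> <scan code> <URL-encoded text>'."""
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    # safe='' percent-encodes every reserved character, so the encoded
    # text can never contain the space used as a field separator.
    encoded = quote(text, safe="")
    return "key {} {} {}".format(direction, scan_code, encoded).encode("ascii")


def decode_key_event(packet):
    """Parse a packet back into (direction, scan_code, text)."""
    word, direction, code, encoded = packet.decode("ascii").split(" ")
    if word != "key":
        raise ValueError("not a key event")
    return direction, int(code), unquote(encoded)


print(encode_key_event("down", 4, "a"))
print(encode_key_event("down", 44, " "))
print(decode_key_event(encode_key_event("up", 4, "a")))
```

Because each packet carries exactly one event with explicit fields, there is no need to guess from timing whether a byte starts a longer sequence.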
<dh`>
anyway this is all bust in unix because it was stuffed into curses 40+ years ago when doing it properly would have been unacceptably expensive, and nobody since has taken the trouble to make it work nicely
<xentrac>
at the time it *wasn't* a matter of the OS either
<xentrac>
because your VT100 was what was encoding your keystrokes into a bytestream, with the ambiguity already baked in
<xentrac>
and the OS didn't have any control over that
<dh`>
yes, but instead of interpreting it in a driver like it should have been done, it was passed straight to applications
<dh`>
just like printers in msdos
<xentrac>
you mean, it should have been done in the kernel instead of in a library?
<xentrac>
that would help in a few cases but it doesn't help the underlying ambiguity problem
<dh`>
no but it isolates the problem so it can be fixed rather than making it part of the standard os/application interface
<xentrac>
now you've switched back to using *my* definition of operating system, it seems ;)
<dh`>
no, because if everything used the input interpreter in curses there wouldn't be a problem
<xentrac>
all the ridiculous complexity of curses and termcap and later terminfo was an attempt to work around the inability to reprogram the commonly used terminals to do more convenient things
<dh`>
but they don't
<dh`>
for various reasons
<xentrac>
curses inherently can't tell the difference between me pressing ↑ and typing Esc [ A
<xentrac>
on a VT100 or on a modern emulator of it like gnome-terminal
<dh`>
neither can anything the way the world is structured
<xentrac>
so there would still be a problem
<dh`>
nardly
<dh`>
er
<dh`>
hardly
<dh`>
because if this had been fixed properly when it should have been, today typing esc [ A would cause curses to feed you ESCAPE LEFT-BRACKET A
<xentrac>
yeah, that would be nice. and actually curses does guess and get that right most of the time
<dh`>
whereas pushing the up arrow would cause you to receive UP
<dh`>
today the escape sequences are generated in your keyboard driver in order to be ambiguous for curses to try to cope with
<xentrac>
but at low baud rates and particularly with packet loss and retransmission and jitter, it doesn't work reliably enough
<dh`>
well yes
<xentrac>
which of course leads to application writers trying to fix it
<xentrac>
and since it can't be fixed they just end up making different tradeoffs
<dh`>
and you end up with more and more layers of hack pasted on because nobody can be arsed to fix it properly
<xentrac>
which is why every once in a while I'll send a message in irssi beginning with U preceded by a different message I'd decided not to send
<xentrac>
because the lack of delay around my ^U caused by TCP retransmission made irssi decide that I was pasting text (another thing not initially contemplated)
<xentrac>
or there will be literally a [[A or something in there
<xentrac>
I think the basic reason we're still dealing with unfixable workarounds for 50-year-old protocol design errors like this is that nobody's come up with a protocol that's a Pareto improvement
<dh`>
nobody's seriously tried
<dh`>
how much work would it be to add a tty mode that produces useful input symbols?
<dh`>
might take a whole weekend.
<xentrac>
you don't need to modify the kernel; you just need to change the protocol that applications like IRC clients and text editors speak to the "terminal emulator"
<dh`>
yes you do, because the input characters ultimately come from the kernel
<xentrac>
to a great extent what has happened is that DHTML has displaced VT100 emulation, for better or worse
<xentrac>
what comes ultimately from the kernel in this case are USB HID input events, not characters
<dh`>
when you're in X
<xentrac>
Wayland too
<xentrac>
it's true that if you're on a virtual console it's the kernel that does it
<xentrac>
the X server (in my case) transforms those into XKeyEvents, and then gnome-terminal (via GTK) does the translation to ASCII
<xentrac>
if you have an application that's running on a virtual console, though, it doesn't have to suffer through the kernel's lossy transformation to ASCII; it can read the key events from (on Linux) /dev/input/*
<xentrac>
anyway, defining an unambiguous representation for <k><e><y><s><r><BS><t><r><o><k><e><s> isn't the hard part; it's changing all the applications to use it
<dh`>
yes, but the raw console interface isn't standard
<dh`>
anyway there are only about a dozen things to patch to get a large amount of traction (just readline, curses, vim, and emacs will go a long way)
<xentrac>
I compiled a longer list at the link above
<GreaseMonkey>
if i'm reading this correctly, what you'd want to do here is roll your own termcap config
<GreaseMonkey>
see how far you get with that
<GreaseMonkey>
...also i'm curious as to if anyone's working on to-riscv dynamic recompilers, currently i'm having a go at writing a backend for dosbox-staging
<xentrac>
no, that doesn't address the problem at all
<sorear>
there's a qemu tcg backend
<sorear>
haven't tried to use it
<GreaseMonkey>
a fun part is as part of the process of writing one you end up with gold like this:
<GreaseMonkey>
(i accidentally fed in a pointer to what expected a register)
<xentrac>
heh
<GreaseMonkey>
i really do need to confirm if i'm actually using a dynarec when using qemu
<GreaseMonkey>
i mean, qemu works, but don't expect performance miracles
<jrtc27>
yeah I've done that in LLVM before and got immediates and registers confused, though -verify-machineinstrs is a godsend for finding that kind of thing
<GreaseMonkey>
i'm also quite impressed with dosbox's admittedly slightly wonky core_dynrec core, backends are about 1000 lines each
<GreaseMonkey>
could be made smaller almost trivially, but still a pretty good effort
<sorear>
if you didn't build it with --enable-tcg-interpreter and you don't have KVM loaded (and you're not in an environment where HVF/etc is applicable), you can establish by exclusion that you're using a dynrec
<GreaseMonkey>
alright, probably got a dynarec then
<jrtc27>
I think it's safe to say that GreaseMonkey isn't running RISC-V binaries on HVF :P
<jrtc27>
Xen is technically kinda a thing
<GreaseMonkey>
the only "hypervisor" i've got here is OpenSBI
<GreaseMonkey>
speaking of which, the unaligned memory access code is in dire need of some optimisation
<GreaseMonkey>
a 64-bit load is done as 8 individual "temporarily give ourselves U-mode privileges and suppress traps" byte loads
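The byte-by-byte emulation described above can be modelled in a few lines. This sketch only shows how eight single-byte loads are combined into one little-endian 64-bit value. It does not model the privilege switching or trap suppression that each load is said to involve, and the function name is hypothetical rather than taken from OpenSBI.

```python
def load_u64_bytewise(memory, addr):
    """Assemble a little-endian 64-bit value from eight single-byte loads."""
    value = 0
    for i in range(8):
        value |= memory[addr + i] << (8 * i)  # byte i fills bits 8*i .. 8*i+7
    return value


memory = bytes(range(16))
print(hex(load_u64_bytewise(memory, 3)))  # misaligned address, still readable
```

Each iteration stands in for one emulated load, so the fixed per-load overhead is paid eight times for a single 64-bit access.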
<jrtc27>
opensbi is not a hypervisor...
<GreaseMonkey>
yeah, hence the "quotes"
<jrtc27>
yeah, uh, don't do unaligned accesses
<jrtc27>
just because it works doesn't mean it's fast and a good idea
<jrtc27>
really they should've just been banned like sparc did
<jrtc27>
but then you break crappy software and hurt adoption of your new architecture
<GreaseMonkey>
yeah my intention is to avoid them in my dynarec backend
<GreaseMonkey>
and also nowadays there's plenty of code that runs on ARM that had to go via cores which didn't support unaligned accesses
<GreaseMonkey>
GCC handles those things decently
<jrtc27>
arm has supported unaligned accesses for ages
<jrtc27>
the weirdo rotation on unaligned accesses is in the past
<GreaseMonkey>
except it's not, because the Cortex-M0+ exists
<GreaseMonkey>
it's in an alternate timeline, but it's not merely in the past
<jrtc27>
no, there it faults
<jrtc27>
which is the correct behaviour
<jrtc27>
I'm talking about the pre-armv4(?) behaviour where loading 4 bytes from address 0x1003 would load 4 bytes from address 0x1000 and rotate it by 24 bits
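The rotate-on-unaligned-load behaviour described above can be simulated as below, assuming a little-endian memory image. The function names and the `base` parameter, which maps the byte buffer to address 0x1000, are illustrative.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def legacy_ldr(memory, addr, base=0):
    """Model the old load: read the aligned word, rotate by 8 * (addr & 3)."""
    aligned = (addr & ~3) - base
    word = int.from_bytes(memory[aligned:aligned + 4], "little")
    return ror32(word, 8 * (addr & 3))


memory = bytes([0x11, 0x22, 0x33, 0x44])  # bytes at 0x1000..0x1003
print(hex(legacy_ldr(memory, 0x1000, base=0x1000)))  # 0x44332211
print(hex(legacy_ldr(memory, 0x1003, base=0x1000)))  # 0x33221144
```

The load from 0x1003 never touches 0x1004 and beyond. It returns the word at 0x1000 rotated right by 24 bits, which places the byte at the requested address in the low 8 bits of the result.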
<geist>
yah pre-armv5 indeed
<jrtc27>
pre-armv6 apparently, that's newer than I thought...
<geist>
they added an ability to generate a fault in v5 i think, and then in v6 they started to add the ability to just deal with unaligned, etc
<geist>
and v7 made it the default
<jrtc27>
ah that sounds more like the right timeline
<jrtc27>
the old behaviour actually made sense from a hardware perspective though :P
<dh`>
I blame the mips lwl/lwr patent rubbish
<geist>
oh yeah? there was a period where it was dangerous to implement it?
<dh`>
yes
<geist>
ah that's interesting. probably mostly impacting other load/store architectures more than CISC ones?
<dh`>
idk
<geist>
never heard of it, but totally not surprised
<dh`>
I didn't hear about it until years after
<jrtc27>
alpha's equivalent sucked
<geist>
just not allowing you to do unaligned at all and only on a 64bit boundary?
<jrtc27>
but was probably done that way so there were no data dependencies for the loads
<geist>
(EV4 at least)
<GreaseMonkey>
...oh right
<GreaseMonkey>
if i understand correctly, ARMv7-M goes for the 32-bit -> 16+16 or 8+16+8 approach for unaligned accesses
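The splitting scheme mentioned above can be sketched as a function that breaks a 4-byte access into naturally aligned pieces. This follows the 16+16 and 8+16+8 description as given in the chat and has not been checked against ARM documentation. The function name is hypothetical.

```python
def split_access(addr):
    """Return (address, size_in_bytes) pieces for a 4-byte access at addr."""
    offset = addr & 3
    if offset == 0:
        return [(addr, 4)]                    # already word-aligned
    if offset == 2:
        return [(addr, 2), (addr + 2, 2)]     # halfword-aligned: 16+16
    return [(addr, 1), (addr + 1, 2), (addr + 3, 1)]  # odd address: 8+16+8


for addr in (0x1000, 0x1001, 0x1002, 0x1003):
    print(hex(addr), split_access(addr))
```

Every piece is aligned to its own size, so no individual bus transaction is misaligned.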
<jrtc27>
also lwl/lwr only worked for 32-bit values, the 16-bit version was more verbose
<GreaseMonkey>
ah yes, 16-bit, the curse of many a 32-bit or 64-bit RISC machine
<dh`>
because so much code does 16-bit accesses :-)
<dh`>
though there was more when mips was invented
<jrtc27>
lots of short's in the TCP/IP stack
<GreaseMonkey>
riscv64 gives you 32-bit sign extends as ADDIW rd, rs, 0
<GreaseMonkey>
and 16-bit sign extends as two shifts
<geist>
alpha EV4 is hilarious to watch the codegen for string routines. it's always doing 64bit load/stores and a bunch of shifting and masking
<GreaseMonkey>
although 8-bit sign extends are also two shifts but at least the zero extends are one op
<jrtc27>
bitmanip adds single-instruction zext.[hw] and sext.[bh]
<jrtc27>
all in Zbb, and ext.[hw] are in Zbp
<jrtc27>
*zext.[hw]
<jrtc27>
(and if you don't have the relevant extension, they're implemented as pseudoinstructions that expand to the shifts in the assembler)
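The shift-based expansions discussed above can be modelled on 64-bit register values as below. The helpers `sll`, `srl`, and `sra` imitate the RV64 shift instructions on Python integers. The `sext_w` helper models the result of `ADDIW rd, rs, 0` with two shifts, since both produce the same value. All function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def sll(x, n):
    """Shift left logical within a 64-bit register."""
    return (x << n) & MASK64


def srl(x, n):
    """Shift right logical: zero-fill from the top."""
    return (x & MASK64) >> n


def sra(x, n):
    """Shift right arithmetic: replicate the sign bit from the top."""
    x &= MASK64
    if x >> 63:
        x -= 1 << 64  # reinterpret as a negative two's-complement value
    return (x >> n) & MASK64


def sext_b(x): return sra(sll(x, 56), 56)   # 8-bit sign extend: two shifts
def sext_h(x): return sra(sll(x, 48), 48)   # 16-bit sign extend: two shifts
def sext_w(x): return sra(sll(x, 32), 32)   # same result as ADDIW rd, rs, 0
def zext_b(x): return x & 0xFF              # one AND with an immediate
def zext_h(x): return srl(sll(x, 48), 48)   # 16-bit zero extend: two shifts


print(hex(sext_h(0x8000)))      # 0xffffffffffff8000
print(hex(sext_b(0x7F)))        # 0x7f
print(hex(zext_h(0x12345678)))  # 0x5678
```

The left shift moves the narrow value's sign bit into bit 63, and the right shift brings the value back down while filling the upper bits with either the sign bit or zeros.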
<GreaseMonkey>
what i've seen of bitmanip is a nice mix of impressive and weird
* sorear
still unsold on the "single bit" and "shift ones" instructions, does anyone else have or use those