#riscv on 2021-06-26 — irc logs at libera.irclog.whitequark.org

2021-05-20 20:58 sorear changed the topic of #riscv to: RISC-V instruction set architecture | https://riscv.org | Logs: https://libera.irclog.whitequark.org/riscv | Backup if libera.chat and freenode fall over: irc.oftc.net

00:05 jporquet has joined #riscv

00:07 <jporquet> Hi all!

00:07 <jporquet> I'm a bit confused by the multilib feature in gcc, I was wondering if someone could clarify it

00:08 <jporquet> according to https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/t-linux-multilib, it looks like the few multilibs that are actually generated are rv{32,64}imac+rv{32,64}imacfd

00:09 <jporquet> and then for all the other possible combination (e.g. rv32imafd == rv32g), these libs are reused

00:10 <jporquet> the problem is that if libgcc is compiled using an rv32imac configuration, then it won't work when running gcc with -march=rv32g since the application will be linked against code that contains compressed instructions

00:11 <jporquet> in other words, if my cpu doesn't support the C extension, I can't make sure that all my code is compiled in rv32g

00:12 <jporquet> am I missing something?

00:14 choozy has quit [Remote host closed the connection]

00:16 <jrtc27> the output is valid and works, it's just not helpful for your specific use case

00:16 vagrantc has quit [Ping timeout: 250 seconds]

00:17 <jrtc27> I agree though that some of those mappings seem rather strange

00:17 <jrtc27> but at the same time, imac and imafdc are by far the most common sets of extensions

00:18 vagrantc has joined #riscv

00:18 <jrtc27> it's strongly recommended that you support one of those

00:19 <xentrac> does the c extension usually make code run faster?

00:20 <jrtc27> it reduces icache pressure

00:20 <xentrac> right

00:20 <jrtc27> and, if you have virtual memory, itlb pressure

00:20 <xentrac> (I mean obviously that's a question about implementations, not architectures, so I'm asking about your experience with popular implementation techniques)

00:20 <jporquet> what's weird is that the G meta-extension (IMAFD) was marketed as the default extension, but it seems like with the default configuration, GCC is compiled for GC only

00:20 <pabs3> do many RISC-V CPU designs not support the C extension?

00:21 <xentrac> not many, surely some

00:21 <jrtc27> at the time G was defined it was not clear what the state of C would be

00:21 <jporquet> if you have a CPU that doesn't support the C extension, you can't use vanilla GCC

00:21 <jrtc27> distributions, implementations and embedded OS'es all settled on C being assumed in the end

00:21 <jrtc27> sure you can

00:21 <jrtc27> just make sure you have the right multilib

00:21 <jrtc27> I assume

00:22 <jporquet> all the generated multilib use the C extension

00:22 <jporquet> look at https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/t-linux-multilib

00:23 <jporquet> variable `MULTILIB_REQUIRED`

00:23 <jrtc27> yes but if you build an rv32imafd libgcc and put it in the right place will it not be picked up?

00:24 <jrtc27> and if not, well, -L is your friend I guess

00:24 <jporquet> sure, but as said previously, it still means that vanilla GCC does not strictly support RV{32,64}G even though it's supported to be the default set of extensions :|

00:25 <jporquet> *supposed to be

00:27 <jrtc27> "support" is not a well-defined term here

00:28 <jporquet> what do you mean?

00:28 <jrtc27> GCC the compiler supports everything

00:28 <jrtc27> it ships with some pre-built libgcc's for convenience

00:28 <jrtc27> those happen to not include a configuration that you support

00:29 <jrtc27> but that's libgcc the runtime, not GCC the compiler

00:29 <jrtc27> which is optional

00:30 <jrtc27> note that https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/t-elf-multilib *does* have non-C configs

00:30 <jporquet> gotcha

00:31 <jporquet> I'll rephrase by saying: I'm surprised that no libgcc is pre-compiled to be strictly compatible with rv32/64g since the G extension is supposed to be the default hardware-wise

00:33 <jrtc27> the second part of your statement is false

00:33 <jrtc27> and it's not surprising because it would be a waste of space when the linux community has collectively agreed on GC as the base ISA

00:33 <jrtc27> with IMAC if you want to do soft-float

00:34 <jrtc27> but on a linux-capable system the complexity of implementing C is insignificant

00:34 <jporquet> I hear you but I don't think my statement is false

00:35 <jporquet> there's a whole chapter in the specs about the G ISA

00:35 <jrtc27> G means general-purpose not default

00:35 <jporquet> hmmm

00:36 <jrtc27> because C doesn't add any new *functionality*

00:36 <jporquet> I'm not a native speaker so I won't argue further, but I find it confusing nonetheless

00:37 <jporquet> anyway, thanks for your insight, really appreciate it

00:37 <jporquet> it definitely clarifies my confusion

00:37 <jrtc27> https://github.com/riscv/riscv-platform-specs/blob/master/riscv-unix.adoc FWIW

00:38 <jrtc27> although the repo's since been reworked (and the branch renamed to main)

00:42 jporquet has quit [Quit: Client closed]

00:48 <xentrac> ah, they renamed from master to main?

00:48 <xentrac> that link still seems to work tho

00:57 <jrtc27> yeah, just means the file's old

00:59 <xentrac> maybe they should put a note at the top of the github page, above the file, about the branch renaming

01:00 <xentrac> oh well. I'm not going to go pester github about it

01:01 <jrtc27> https://github.com/riscv/riscv-platform-specs/tree/master tells you

01:01 <jrtc27> if you rename the branch in GitHub though I think they do various bits of redirection

01:02 <jrtc27> so might be that they side-stepped that

01:03 <xentrac> oh, could be

01:19 Sos has quit [Quit: Leaving]

01:47 peepsalot has quit [Ping timeout: 268 seconds]

01:50 peepsalot has joined #riscv

01:51 aquijoule_ has joined #riscv

01:54 richbridger has quit [Ping timeout: 265 seconds]

03:41 davidlt has joined #riscv

03:52 FluffyMask has quit [Quit: WeeChat 2.9]

05:31 vagrantc has quit [Ping timeout: 272 seconds]

05:57 dionysos has quit [Ping timeout: 252 seconds]

07:12 hendursaga has joined #riscv

07:26 davidlt has quit [Ping timeout: 265 seconds]

08:01 dmang has quit [Ping timeout: 258 seconds]

08:02 dmang has joined #riscv

08:04 gector has joined #riscv

08:07 dmang has quit [Ping timeout: 272 seconds]

08:09 hendursaga has quit [Ping timeout: 244 seconds]

08:15 dmang has joined #riscv

08:16 gector has quit [Ping timeout: 258 seconds]

08:33 hendursaga has joined #riscv

08:44 jeancf_ has joined #riscv

08:49 jeancf_ has quit [Quit: Konversation terminated!]

08:49 helium-3 has joined #riscv

08:49 jeancf_ has joined #riscv

08:50 jeancf_ has quit [Client Quit]

08:50 jeancf_ has joined #riscv

08:51 helium-3 is now known as dionysos

09:01 jeancf_ has quit [Ping timeout: 265 seconds]

09:33 abelvesa_ has joined #riscv

09:36 abelvesa has quit [Ping timeout: 252 seconds]

10:38 jeancf_ has joined #riscv

10:46 jeancf_ has quit [Ping timeout: 272 seconds]

10:48 TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

10:48 TMM_ has joined #riscv

11:11 frost has joined #riscv

11:28 choozy has joined #riscv

11:43 mahmutov has quit [Ping timeout: 250 seconds]

11:57 choozy has quit [Remote host closed the connection]

12:03 zjason` is now known as zjason

12:47 mahmutov has joined #riscv

14:01 FluffyMask has joined #riscv

14:21 gector has joined #riscv

14:25 mhorne has quit [Ping timeout: 252 seconds]

14:34 gector has quit [Ping timeout: 272 seconds]

14:36 Andre_H has joined #riscv

14:39 frost has quit [Quit: Connection closed]

14:45 gector has joined #riscv

14:58 mhorne has joined #riscv

16:46 vagrantc has joined #riscv

17:01 vagrantc has quit [Quit: leaving]

17:04 gector has quit [Ping timeout: 258 seconds]

17:12 gector has joined #riscv

17:16 riff-IRC has joined #riscv

18:02 TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

18:02 TMM_ has joined #riscv

18:34 gector has quit [Ping timeout: 244 seconds]

19:34 gector has joined #riscv

19:56 smaeul has quit [Remote host closed the connection]

19:56 smaeul has joined #riscv

20:08 riff_IRC has joined #riscv

20:08 smaeul has quit [Quit: SIGPWR]

20:08 riff-IRC has quit [Ping timeout: 244 seconds]

20:16 <leah2> what kind of speeds do you get on nvme disks with the unmatched?

20:35 gector has quit [Ping timeout: 265 seconds]

20:41 <xentrac> unmatched speeds?

20:42 <leah2> i only seem to get 120MB/s which sounds a bit slow?

20:43 <xentrac> sorry, I was just joking. it does sound slow!

20:45 Sos has joined #riscv

20:46 <jrtc27> well how fast is memcpy for you?

20:46 <geist> also may only be a pcie 1x link?

20:46 <jrtc27> that should give you an upper bound

20:46 <jrtc27> I think it's 2x for the NVMe

20:47 <jrtc27> or maybe it was 4x

20:47 <geist> ah yeah that should be good. even with 1x pcie 2.0 that'd be 500MB/sec
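
The 500MB/sec figure for a single PCIe 2.0 lane follows from the signalling rate and the line encoding. A minimal Python sketch of that arithmetic, assuming the commonly published per-generation rates (2.5, 5 and 8 GT/s) and ignoring all protocol overhead above the line encoding; the table and function names are made up for illustration:

```python
# Per-lane signalling rate in MT/s and line encoding (payload bits, total bits).
# Generations 1 and 2 use 8b/10b encoding; generation 3 uses 128b/130b.
PCIE_GENERATIONS = {
    1: (2500, 8, 10),
    2: (5000, 8, 10),
    3: (8000, 128, 130),
}


def pcie_mb_per_s(gen: int, lanes: int) -> float:
    """Approximate one-direction bandwidth in MB/s (10**6 bytes per second)."""
    mt_per_s, payload_bits, total_bits = PCIE_GENERATIONS[gen]
    per_lane = mt_per_s * payload_bits / total_bits / 8  # Mbit/s -> MB/s
    return per_lane * lanes


if __name__ == "__main__":
    for gen, lanes in [(1, 1), (2, 1), (2, 4), (3, 4)]:
        print(f"gen{gen} x{lanes}: {pcie_mb_per_s(gen, lanes):.0f} MB/s")
```

Real-world throughput will be lower than these numbers because packet headers and flow control are not modelled here.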

20:47 <geist> always have to go back to the table to see

20:47 <leah2> jrtc27: what's a good benchmark for that?

20:48 <jrtc27> yeah x4 for NVMe, x1 for the M.2 E, x2 for the xHCI

20:48 <jrtc27> leah2: dd if=/dev/zero of=/dev/null (with appropriate bs and count) does more than a memcpy but is closer to real I/O

20:49 <leah2> gives me 1GB/s which sounds realistic

20:50 <leah2> [ 3.386049] nvme nvme0: 4/0/0 default/read/poll queues

20:51 <geist> yah 'pv' is a nice app for a quickie benchmark: 'pv /dev/zero > /dev/null'

20:51 <leah2> yeah i used pv :)

20:51 <xentrac> ha cool

20:51 <jrtc27> too scared of dd? :P

20:52 <xentrac> pv tells you the answer before it exits

20:52 <geist> it gives you a running display which is nice

20:52 <jrtc27> so does dd if you killall -INFO it

20:52 <jrtc27> or, if you're on FreeBSD, just press ^T

20:52 <geist> also helpful for quickie benchmarks like 'pv /dev/zero | md5sum'

20:52 <xentrac> oh neat, I didn't know that about dd

20:52 <xentrac> I miss ^T from VMS

20:53 <xentrac> (because I'm not running BSD, of course. self-inflicted injury)

20:53 <geist> pv has a kinda neat thing that it also uses splice() fairly aggressively on linux, so sometimes depending on what you're piping from and to, it avoids a copy

20:53 <geist> so pv /dev/zero > /dev/null *probably* just splices between those two fds

20:53 <geist> OTOH, that may also skew your particular benchmark here, depending on if linux decides to short circuit that internally

20:54 <leah2> i'll try fio later

20:55 <leah2> but machine is building rust atm

20:55 <xentrac> thanks jrtc27!

20:55 <geist> oh that's slow no matter what arch you're on!

20:55 <jrtc27> macOS also has ^T

20:56 <jrtc27> it's really just Linux that sucks here

20:56 <geist> yah that's why i dont get too invested in particular flavors of dd

20:56 <geist> it's like tar, you're always finding that your version doesn't have this or that

20:57 <jrtc27> it's the OS not supporting ^T for SIGINFO, not a property of the dd

20:57 <geist> sure, but same result

20:57 <xentrac> Linux doesn't support ^Y either, which I also used to miss a lot

20:57 riff_IRC is now known as riff-IRC

20:57 <xentrac> although I don't actually know if ^Y is useful with ssh

20:57 <geist> hmm, what is ^Y supposed to do?

20:57 <xentrac> dsusp

20:58 <geist> and yeah VMS DCL and whatnot is pretty neat

20:58 <xentrac> it sends SIGTSTP like ^Z, but not to the tty pgrp, but rather to whatever process tries to read the poisoned ^Y

20:58 <geist> the job control is wonky if you came from unix, but it's pretty powerful once you grok it

20:59 <jrtc27> (also I meant -USR1 for Linux, not -INFO, because Linux doesn't have SIGINFO, except on alpha because Linux's ABI is a mess)

20:59 <xentrac> so in particular you could rsh hercules, start some stuff on hercules, ~^Y, rsh hephaestus, and do stuff on hephaestus, while still seeing the output from whatever you were doing on hercules

20:59 <xentrac> because only the outbound rsh process got suspended, not the inbound half

21:00 FluffyMask has quit [Quit: WeeChat 2.9]

21:01 <xentrac> ssh doesn't do the forking into two unidirectional processes thing that rsh did, so I don't think it would work

21:02 <xentrac> I never learned to use DCL job control. I liked DCL a lot but didn't understand it much. but I was just a kid

21:04 <xentrac> BSD does have ^Y. not sure about MacOS

21:05 <jrtc27> ^Y is the opposite of ^U for me

21:09 <jimwilson_> nvme speed is discussed in this forum thread, with iflag=direct and bs=1024 you should get close to 2GB/s, https://forums.sifive.com/t/ssd-performance/4850/3

21:19 <geist> ah actually more specifically bs=1024k

21:19 <geist> that's kinda expected, IMO. far fewer syscalls than the probable default of 512

21:20 <geist> (1MB instead of 512)

21:29 <xentrac> yeah, Emacs ^Y is "yank" (paste "killed" text), and bash and zsh copied that. I think tcsh too

21:30 <xentrac> amusingly enough Emacs doesn't have convenient keys for either Unix's ^U (originally @) or ^W (which I think was added in BSD, post-printing-teletype)

21:30 <xentrac> I end up using alt-← for ^W in Emacs (which also works in bash's readline)

21:31 <xentrac> 2GB/s sounds significantly different from 0.12GB/s. does that help, leah2?

21:50 gector has joined #riscv

21:53 <leah2> so i think something is wrong here

21:54 <leah2> READ: bw=75.7MiB/s (79.4MB/s), 75.7MiB/s-75.7MiB/s (79.4MB/s-79.4MB/s), io=3070MiB (3219MB), run=40559-40559msec

21:54 <leah2> WRITE: bw=25.3MiB/s (26.5MB/s), 25.3MiB/s-25.3MiB/s (26.5MB/s-26.5MB/s), io=1026MiB (1076MB), run=40559-40559msec

21:54 <leah2> mixed rw test. it just adds up to 100mb/s

21:56 <leah2> split:

21:56 <leah2> READ: bw=117MiB/s (123MB/s), 117MiB/s-117MiB/s (123MB/s-123MB/s), io=4096MiB (4295MB), run=34945-34945msec

21:56 <leah2> WRITE: bw=78.4MiB/s (82.2MB/s), 78.4MiB/s-78.4MiB/s (82.2MB/s-82.2MB/s), io=4096MiB (4295MB), run=52238-52238msec

21:57 <xentrac> sounds like a 100MiB/s bottleneck somewhere, yeah

21:57 <leah2> hmm

21:57 <leah2> dd if=/dev/nvme0n1p1 of=/dev/null bs=1024k status=progress iflag=direct iflag=fullblock gives me 1.6GB/s tho

21:58 <leah2> ok, note that the fio tests random access

22:09 <geist> the block size of the access probably matters a lot here

22:10 <geist> the bs=1024k is doing a syscall per 1MB

22:10 <geist> vs whatever block size the other tests are

22:10 <geist> i bet if you start lowering that bs to 256k then 64k etc you'll see the speed roll off
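
The block-size effect described above can be put in numbers. A small Python sketch, assuming every read() returns a full block and counting only the read side of each copy; the function name is hypothetical and the 4 GiB total is taken from the io= size in the fio output:

```python
def read_calls(total_bytes: int, block_size: int) -> int:
    """Number of read() calls needed, assuming each call returns a full block."""
    return -(-total_bytes // block_size)  # ceiling division


if __name__ == "__main__":
    total = 4 * 1024**3  # 4 GiB
    for bs in (512, 64 * 1024, 256 * 1024, 1024 * 1024):
        print(f"bs={bs:>8}: {read_calls(total, bs):>8} read() calls")
```

Going from bs=512 to bs=1024k cuts the number of calls by a factor of 2048, so per-call overhead matters far less at the larger size.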

22:12 <dh`> ^U in emacs is ^A^K, which is deeply wired into emacs users' fingers to the extent that running screen means your windows disappear without realizing what happened

22:12 <dh`> (^A is screen's attention key and ^K means kill this screen)

22:15 <xentrac> more recent versions of screen have rebound that to ^Aky

22:15 <xentrac> because it used to be a real annoyance

22:16 <dh`> I solved that problem by not using screen

22:16 <dh`> I still don't understand why screen never fixed their shit so you could use a function key as the attention key

22:16 <dh`> it has to be a single byte, not an escape sequence

22:17 <xentrac> my cousin rebound F1 to ^A and ^A to ^Aa in his xterm, problem solved

22:17 <sorear> hah

22:17 <xentrac> not all his xterms, only the ones he launches to connect to screen sessions

22:18 <xentrac> using a function key as the attention key in screen itself requires some kind of timeout mechanism to decide when to just pass the initial ^[ on to vi rather than waiting for the rest of the function key escape sequence
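
The timeout mechanism mentioned here can be sketched offline in Python over recorded (timestamp, byte) pairs. This is only an illustration under simplifying assumptions: it recognises nothing but three-byte `ESC [ <letter>` sequences, the 50 ms timeout is an arbitrary choice, and the function name is made up; a real program would wait on its input with a timer instead.

```python
ESC = 0x1B


def decode(events, timeout=0.05):
    """Turn (timestamp, byte) pairs into key names.

    An ESC followed by '[' and a final byte, all arriving within `timeout`
    seconds of the ESC, is treated as one function-key sequence; otherwise
    the ESC is reported as a key by itself.
    """
    keys = []
    i = 0
    while i < len(events):
        t, b = events[i]
        if (
            b == ESC
            and i + 2 < len(events)
            and events[i + 1][1] == ord("[")
            and events[i + 2][0] - t <= timeout
        ):
            keys.append("CSI " + chr(events[i + 2][1]))
            i += 3
        else:
            keys.append("ESC" if b == ESC else chr(b))
            i += 1
    return keys


if __name__ == "__main__":
    arrow = [(0.000, ESC), (0.001, ord("[")), (0.002, ord("A"))]
    typed = [(0.0, ESC), (0.3, ord("[")), (0.6, ord("A"))]
    delayed_arrow = [(0.0, ESC), (0.2, ord("[")), (0.2, ord("A"))]
    print(decode(arrow))          # one function key
    print(decode(typed))          # three separate keys
    print(decode(delayed_arrow))  # an arrow key misread as three keys
```

The last case shows the weakness of the approach: once the link delays part of the sequence past the timeout, the decoder has no way to recover the original keystroke.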

22:19 <dh`> sure

22:19 <dh`> but every other damn program does that, why can't screen?

22:19 <xentrac> well, most don't; instead they just don't use ^[ by itself

22:19 <dh`> lots do, including vi

22:19 <xentrac> vim and irssi do the timeout thing, vi doesn't last I checked

22:20 <dh`> I mean, this is stupid and should have been fixed at the os level 40 years ago

22:20 <dh`> it has to, there is no other way to use esc as a keystroke

22:20 <xentrac> it's profoundly annoying in irssi because in irssi network latency makes it guess wrong pretty often

22:20 <xentrac> sure there is, you can support esc and not support function keys

22:21 gector has quit [Ping timeout: 268 seconds]

22:21 <xentrac> as for ^A^K, by default I think of a two-chord sequence as being profoundly different from a one-chord sequence, but I guess that's mostly for things that repeat, like ^W^W^W. and repeating ^U doesn't make sense

22:21 <dh`> that's not a viable proposition for programs that actually do any kind of input editing

22:21 <xentrac> not sure what you mean by input editing but vi spent decades supporting esc and not supporting function keys

22:22 <xentrac> I don't know if the OS really needs to be involved with solving this. you just need a protocol for sending streams of keystrokes and maybe other events like touchmove events that doesn't have this kind of parsing ambiguity

22:23 <dh`> the OS needs to be involved with this because it's the OS that sends the input stream

22:23 <dh`> anyway yeah true, archaic vi doesn't support anything that has an escape prefix

22:23 <xentrac> maybe in a virtual console, but I'm typing this in gnome-terminal

22:24 <dh`> which gets its input from a pty

22:24 <xentrac> well no, it sends my keystrokes to a pty

22:24 <xentrac> it gets the keystrokes as input over its socket to the X server

22:25 <dh`> yes, and the pty munges them because that's what unix ttys do

22:25 <xentrac> (it does get input from a pty but that input isn't keystrokes, it's screen contents)

22:25 <xentrac> a little, right now the pty is in raw mode because I'm running ssh on it

22:28 <dh`> anyway the standards for what you read from ptys as input keystrokes are an OS thing

22:29 <xentrac> potentially? I mean historically Unix treats them as a private matter between the terminal and the application

22:29 <dh`> which is why the situation remains broken, because nobody has the authority to fix it

22:30 <dh`> yes and no, the mapping between input sequences and any kind of useful keystroke concept beyond ascii sits in libcurses

22:31 <xentrac> sure, if you consider curses part of the operating system. and, hey, not only curses but also terminfo are in posix

22:31 <xentrac> my design for Wercam sends the keystrokes over a seqpacket socket rather than a pty and represents key events with packets containing "key up %d %n" or "key down %d x", where %d is a numeric scan code from the USB HID standard and x is a URL-encoded string which represents its default UTF-8 value

22:31 <dh`> curses is definitely part of the operating system

22:32 <xentrac> it's just a library. it doesn't have anything to do with securely multiplexing resources

22:32 <dh`> next you're going to say /bin/sh isn't part of the operating system

22:32 <xentrac> agreed, it's not

22:32 <xentrac> although there are lots of historical operating systems where the shell *was* part of the operating system

22:33 <xentrac> it's just a semantic argument, though

22:33 <dh`> this is a silly thing to argue about, but there is historical practice

22:33 <xentrac> I recognize that the sense I'm using "operating system" in is a bit old-fashioned

22:34 <xentrac> so in the broader sense of "functionality shared between many or all applications" certainly key event encoding is part of the operating system

22:35 <xentrac> oh I see I wrote "key up %d %n" where I meant "key up %d x"
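
A rough Python sketch of what packets in the "key up %d x" / "key down %d x" shape could look like. Only the format string comes from the description above; the function names, the use of an empty string for keys with no text value, and the example codes (4 for A, 82 for Up Arrow, 41 for Escape and 47 for `[` in the USB HID keyboard usage table) are assumptions made for illustration.

```python
from urllib.parse import quote, unquote


def encode_key_event(direction: str, scan_code: int, text: str) -> bytes:
    """Build one packet: 'key <up|down> <scan code> <URL-encoded text>'."""
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    return f"key {direction} {scan_code} {quote(text, safe='')}".encode("ascii")


def decode_key_event(packet: bytes):
    """Parse a packet back into (direction, scan code, text)."""
    word, direction, scan_code, text = packet.decode("ascii").split(" ")
    if word != "key":
        raise ValueError("not a key event")
    return direction, int(scan_code), unquote(text)


if __name__ == "__main__":
    # Pressing the Up Arrow key is a single packet with no text value...
    print(encode_key_event("down", 82, ""))
    # ...while typing Esc, '[' and 'A' produces three different packets.
    for code, text in [(41, "\x1b"), (47, "["), (4, "A")]:
        print(encode_key_event("down", code, text))
```

Because every key event is its own packet carrying a scan code, an arrow key and a typed escape sequence can no longer be confused, whatever the timing.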

22:36 <dh`> anyway this is all bust in unix because it was stuffed into curses 40+ years ago when doing it properly would have been unacceptably expensive, and nobody since has taken the trouble to make it work nicely

22:37 <xentrac> at the time it *wasn't* a matter of the OS either

22:38 <xentrac> because your VT100 was what was encoding your keystrokes into a bytestream, with the ambiguity already baked in

22:38 <xentrac> and the OS didn't have any control over that

22:39 <dh`> yes, but instead of interpreting it in a driver like it should have been done, it was passed straight to applications

22:39 <dh`> just like printers in msdos

22:39 <xentrac> you mean, it should have been done in the kernel instead of in a library?

22:39 <xentrac> that would help in a few cases but it doesn't help the underlying ambiguity problem

22:40 <dh`> no but it isolates the problem so it can be fixed rather than making it part of the standard os/application interface

22:40 <xentrac> now you've switched back to using *my* definition of operating system, it seems ;)

22:40 <dh`> no, because if everything used the input interpreter in curses there wouldn't be a problem

22:40 <xentrac> all the ridiculous complexity of curses and termcap and later terminfo was an attempt to work around the inability to reprogram the commonly used terminals to do more convenient things

22:40 <dh`> but they don't

22:40 <dh`> for various reasons

22:41 <xentrac> curses inherently can't tell the difference between me pressing ↑ and typing Esc [ A

22:41 <xentrac> on a VT100 or on a modern emulator of it like gnome-terminal

22:41 <dh`> neither can anything the way the world is structured

22:41 <xentrac> so there would still be a problem

22:42 <dh`> nardly

22:42 <dh`> er

22:42 <dh`> hardly

22:42 <dh`> because if this had been fixed properly when it should have been, today typing esc [ A would cause curses to feed you ESCAPE LEFT-BRACKET A

22:42 <xentrac> yeah, that would be nice. and actually curses does guess and get that right most of the time

22:42 <dh`> whereas pushing the up arrow would cause you to receive UP

22:43 <dh`> today the escape sequences are generated in your keyboard driver in order to be ambiguous for curses to try to cope with

22:43 <xentrac> but at low baud rates and particularly with packet loss and retransmission and jitter, it doesn't work reliably enough

22:43 <dh`> well yes

22:44 <xentrac> which of course leads to application writers trying to fix it

22:44 <xentrac> and since it can't be fixed they just end up making different tradeoffs

22:44 <dh`> and you end up with more and more layers of hack pasted on because nobody can be arsed to fix it properly

22:45 <xentrac> which is why every once in a while I'll send a message in irssi beginning with U preceded by a different message I'd decided not to send

22:45 <xentrac> because the lack of delay around my ^U caused by TCP retransmission made irssi decide that I was pasting text (another thing not initially contemplated)

22:46 <xentrac> or there will be literally a [[A or something in there

22:46 jotweh has quit [Ping timeout: 258 seconds]

22:47 <xentrac> I think the basic reason we're still dealing with unfixable workarounds for 50-year-old protocol design errors like this is that nobody's come up with a protocol that's a Pareto improvement

22:48 <dh`> nobody's seriously tried

22:49 <dh`> how much work would it be to add a tty mode that produces useful input symbols?

22:49 <dh`> might take a whole weekend.

22:49 jotweh has joined #riscv

22:50 <xentrac> you don't need to modify the kernel; you just need to change the protocol that applications like IRC clients and text editors speak to the "terminal emulator"

22:50 <xentrac> I wrote a bit about the problem a couple of months ago in https://news.ycombinator.com/item?id=26815196

22:51 <dh`> yes you do, because the input characters ultimately come from the kernel

22:51 <xentrac> to a great extent what has happened is that DHTML has displaced VT100 emulation, for better or worse

22:51 <xentrac> what comes ultimately from the kernel in this case are USB HID input events, not characters

22:52 <dh`> when you're in X

22:53 <xentrac> Wayland too

22:53 <xentrac> it's true that if you're on a virtual console it's the kernel that does it

22:53 <xentrac> the X server (in my case) transforms those into XKeyEvents, and then gnome-terminal (via GTK) does the translation to ASCII

22:54 <xentrac> if you have an application that's running on a virtual console, though, it doesn't have to suffer through the kernel's lossy transformation to ASCII; it can read the key events from (on Linux) /dev/input/*
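
For reference, the records read from /dev/input/* on Linux are fixed-size `struct input_event` values. A minimal Python sketch of decoding them, assuming the 64-bit little-endian layout (two 8-byte time fields, then 16-bit type, 16-bit code and 32-bit value); the function name is hypothetical, and opening a real device node normally needs extra permissions.

```python
import struct

# struct input_event on 64-bit little-endian Linux: struct timeval (two
# 64-bit fields), then __u16 type, __u16 code, __s32 value.
INPUT_EVENT = struct.Struct("<qqHHi")
EV_KEY = 0x01  # event type used for key presses and releases


def parse_key_events(raw: bytes):
    """Yield (code, value) for each key event in a buffer read from evdev.

    value is 1 for press, 0 for release and 2 for autorepeat.
    """
    for offset in range(0, len(raw) - INPUT_EVENT.size + 1, INPUT_EVENT.size):
        _sec, _usec, ev_type, code, value = INPUT_EVENT.unpack_from(raw, offset)
        if ev_type == EV_KEY:
            yield code, value


if __name__ == "__main__":
    # Synthetic buffer: press and release of KEY_A (code 30), with an
    # EV_SYN record (type 0) in between.
    raw = (
        INPUT_EVENT.pack(0, 0, EV_KEY, 30, 1)
        + INPUT_EVENT.pack(0, 0, 0, 0, 0)
        + INPUT_EVENT.pack(0, 0, EV_KEY, 30, 0)
    )
    print(list(parse_key_events(raw)))
```

Note that press and release arrive as separate events with a key code, not as characters, so nothing here depends on escape-sequence parsing.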

22:56 <xentrac> anyway, defining an unambiguous representation for <k><e><y><s><r><BS><t><r><o><k><e><s> isn't the hard part; it's changing all the applications to use it

22:56 <dh`> yes, but the raw console interface isn't standard

22:57 <dh`> anyway there are only about a dozen things to patch to get a large amount of traction (just readline, curses, vim, and emacs will go a long way)

22:58 <xentrac> I compiled a longer list at the link above

22:59 <GreaseMonkey> if i'm reading this correctly, what you'd want to do here is roll your own termcap config

22:59 <GreaseMonkey> see how far you get with that

23:02 <GreaseMonkey> ...also i'm curious as to if anyone's working on to-riscv dynamic recompilers, currently i'm having a go at writing a backend for dosbox-staging

23:02 <xentrac> no, that doesn't address the problem at all

23:02 <sorear> there's a qemu tcg backend

23:02 <sorear> haven't tried to use it

23:03 <GreaseMonkey> a fun part is as part of the process of writing one you end up with gold like this:

23:03 <GreaseMonkey> => 0x0000003fb1ffb3cc: lbu a0,0(zero) # 0x0

23:03 <GreaseMonkey> (i accidentally fed in a pointer to what expected a register)

23:03 <xentrac> heh

23:03 <GreaseMonkey> i really do need to confirm if i'm actually using a dynarec when using qemu

23:04 <GreaseMonkey> i mean, qemu works, but don't expect performance miracles

23:04 <jrtc27> yeah I've done that in LLVM before and got immediates and registers confused, though -verify-machineinstrs is a godsend for finding that kind of thing

23:05 <GreaseMonkey> i'm also quite impressed with dosbox's admittedly slightly wonky core_dynrec core, backends are about 1000 lines each

23:05 <GreaseMonkey> could be made smaller almost trivially, but still a pretty good effort

23:05 <sorear> if you didn't build it with --enable-tcg-interpreter and you don't have KVM loaded (and you're not in an environment where HVF/etc is applicable), you can establish by exclusion that you're using a dynrec

23:06 <GreaseMonkey> alright, probably got a dynarec then

23:07 <jrtc27> I think it's safe to say that GreaseMonkey isn't running RISC-V binaries on HVF :P

23:07 <jrtc27> Xen is technically kinda a thing

23:08 <GreaseMonkey> the only "hypervisor" i've got here is OpenSBI

23:08 <GreaseMonkey> speaking of which, the unaligned memory access code is in dire need of some optimisation

23:09 <GreaseMonkey> a 64-bit load is done as 8 individual "temporarily give ourselves U-mode privileges and suppress traps" byte loads
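
The byte-at-a-time approach described here amounts to the following, shown as a Python model over a plain byte buffer. This is not OpenSBI's code: `load_u8` is a hypothetical stand-in for one guarded byte load, and the privilege switching and trap suppression are not modelled at all.

```python
def load_u8(mem: bytes, addr: int) -> int:
    """Stand-in for a single guarded byte load."""
    return mem[addr]


def emulate_load_u64(mem: bytes, addr: int) -> int:
    """Assemble a little-endian 64-bit value from eight single-byte loads."""
    value = 0
    for i in range(8):
        value |= load_u8(mem, addr + i) << (8 * i)
    return value


if __name__ == "__main__":
    mem = bytes(range(16))
    print(hex(emulate_load_u64(mem, 3)))
```

Each of the eight loads pays the full cost of the guarded access, which is where the overhead comes from.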

23:09 <jrtc27> opensbi is not a hypervisor...

23:09 <GreaseMonkey> yeah, hence the "quotes"

23:10 <jrtc27> yeah, uh, don't do unaligned accesses

23:10 <jrtc27> just because it works doesn't mean it's fast and a good idea

23:10 <jrtc27> really they should've just been banned like sparc did

23:10 <jrtc27> but then you break crappy software and hurt adoption of your new architecture

23:10 <GreaseMonkey> yeah my intention is to avoid them in my dynarec backend

23:11 <GreaseMonkey> and also nowadays there's plenty of code that runs on ARM that had to go via cores which didn't support unaligned accesses

23:11 <GreaseMonkey> GCC handles those things decently

23:11 <jrtc27> arm has supported unaligned accesses for ages

23:11 <jrtc27> the weirdo rotation on unaligned accesses is in the past

23:11 <GreaseMonkey> except it's not, because the Cortex-M0+ exists

23:12 <GreaseMonkey> it's in an alternate timeline, but it's not merely in the past

23:13 <jrtc27> no, there it faults

23:13 <jrtc27> which is the correct behaviour

23:13 <jrtc27> I'm talking about the pre-armv4(?) behaviour where loading 4 bytes from address 0x1003 would load 4 bytes from address 0x1000 and rotate it by 24 bits
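
That old behaviour can be modelled in a few lines of Python. The sketch assumes little-endian memory and a rotate to the right by 8 times the low two address bits, which is how I recall the architecture manual describing it; the function name and the 0x1000 base address of the buffer are made up for the example.

```python
def legacy_arm_ldr(mem: bytes, addr: int, base: int = 0x1000) -> int:
    """Model a 32-bit load on old ARM cores: read the aligned word, then
    rotate it right by 8 * (addr & 3) bits."""
    aligned = addr & ~0x3
    offset = aligned - base
    word = int.from_bytes(mem[offset:offset + 4], "little")
    rotation = 8 * (addr & 0x3)
    return ((word >> rotation) | (word << (32 - rotation))) & 0xFFFFFFFF


if __name__ == "__main__":
    mem = bytes([0x11, 0x22, 0x33, 0x44])
    print(hex(legacy_arm_ldr(mem, 0x1000)))  # aligned: the word as stored
    print(hex(legacy_arm_ldr(mem, 0x1003)))  # same word rotated by 24 bits
```

The byte at the requested address ends up in the low byte of the result, but the other three bytes come from below the address, not from the following word as a true unaligned load would give.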

23:14 <geist> yah pre-armv5 indeed

23:15 <jrtc27> pre-armv6 apparently, that's newer than I thought...

23:15 <geist> they added an ability to generate a fault in v5 i think, and then in v6 they started to add the ability to just deal with unaligned, etc

23:15 <geist> and v7 made it the default

23:15 <jrtc27> ah that sounds more like the right timeline

23:16 <jrtc27> the old behaviour actually made sense from a hardware perspective though :P

23:17 <dh`> I blame the mips lwl/lwr patent rubbish

23:17 <geist> oh yeah? there was a period where it was dangerous to implement it?

23:17 <dh`> yes

23:17 <geist> ah that's interesting. probably mostly impacting other load/store architectures more than CISC ones?

23:18 <dh`> idk

23:18 <geist> never heard of it, but totally not surprised

23:18 <dh`> I didn't hear about it until years after

23:19 <jrtc27> alpha's equivalent sucked

23:19 <geist> just not allowing you to do unaligned at all and only on a 64bit boundary?

23:19 <jrtc27> but was probably done that way so there were no data dependencies for the loads

23:19 <geist> (EV4 at least)

23:22 <GreaseMonkey> ...oh right

23:22 <GreaseMonkey> if i understand correctly, ARMv7-M goes for the 32-bit -> 16+16 or 8+16+8 approach for unaligned accesses

23:22 <jrtc27> also lwl/lwr only worked for 32-bit values, the 16-bit version was more verbose

23:23 <GreaseMonkey> ah yes, 16-bit, the curse of many a 32-bit or 64-bit RISC machine

23:23 <dh`> because so much code does 16-bit accesses :-)

23:23 <dh`> though there was more when mips was invented

23:23 <jrtc27> lots of short's in the TCP/IP stack

23:24 <GreaseMonkey> riscv64 gives you 32-bit sign extends as ADDIW rd, rs, 0

23:24 <GreaseMonkey> and 16-bit sign extends as two shifts

23:24 <geist> alpha EV4 is hilarious to watch the codegen for string routines. it's always doing 64bit load/stores and a bunch of shifting and masking

23:24 <GreaseMonkey> although 8-bit sign extends are also two shifts but at least the zero extends are one op
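
The shift pairs mentioned above can be checked with a small Python model of 64-bit registers. The helper names mirror the RV64 instructions they imitate (slli, srai, srli, andi, addiw) but are plain functions written for this sketch, not an existing library.

```python
MASK64 = (1 << 64) - 1


def sll(x: int, shamt: int) -> int:
    """Logical shift left, truncated to 64 bits (slli)."""
    return (x << shamt) & MASK64


def sra(x: int, shamt: int) -> int:
    """Arithmetic shift right on a 64-bit register value (srai)."""
    if x & (1 << 63):
        x -= 1 << 64  # reinterpret as a negative two's-complement value
    return (x >> shamt) & MASK64


def srl(x: int, shamt: int) -> int:
    """Logical shift right (srli)."""
    return (x & MASK64) >> shamt


def sext_b(x: int) -> int:
    return sra(sll(x, 56), 56)  # slli 56; srai 56


def sext_h(x: int) -> int:
    return sra(sll(x, 48), 48)  # slli 48; srai 48


def sext_w(x: int) -> int:
    """Effect of addiw rd, rs, 0: sign-extend the low 32 bits."""
    low = x & 0xFFFFFFFF
    if low & 0x80000000:
        return low | 0xFFFFFFFF00000000
    return low


def zext_b(x: int) -> int:
    return x & 0xFF  # andi rd, rs, 0xff


def zext_h(x: int) -> int:
    # 0xffff does not fit andi's 12-bit signed immediate, so two shifts again.
    return srl(sll(x, 48), 48)


if __name__ == "__main__":
    print(hex(sext_h(0x8000)), hex(sext_b(0x7F)), hex(sext_w(0x80000000)))
```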

23:25 <jrtc27> bitmanip adds single-instruction zext.[hw] and sext.[bh]

23:26 <jrtc27> all in Zbb, and ext.[hw] are in Zbp

23:26 <jrtc27> *zext.[hw]

23:28 <jrtc27> (and if you don't have the relevant extension, they're implemented as pseudoinstructions that expand to the shifts in the assembler)

23:29 <GreaseMonkey> what i've seen of bitmanip is a nice mix of impressive and weird

23:35 * sorear still unsold on the "single bit" and "shift ones" instructions, does anyone else have or use those

23:39 Andre_H has quit [Quit: Leaving.]