sorear changed the topic of #riscv to: RISC-V instruction set architecture | https://riscv.org | Logs: https://libera.irclog.whitequark.org/riscv
pabs3 has quit [Read error: Connection reset by peer]
pabs3 has joined #riscv
haritz has quit [Quit: ZNC 1.8.2+deb2 - https://znc.in]
haritz has joined #riscv
Maylay has quit [Ping timeout: 252 seconds]
EchelonX has quit [Quit: Leaving]
crabbedhaloablut has quit [Remote host closed the connection]
Maylay has joined #riscv
crabbedhaloablut has joined #riscv
crabbedhaloablut has quit [Remote host closed the connection]
dramforever has quit [Ping timeout: 252 seconds]
crabbedhaloablut has joined #riscv
dramforever has joined #riscv
dramforever has quit [Ping timeout: 252 seconds]
dramforever has joined #riscv
handsome_feng has joined #riscv
Gravis has quit [Ping timeout: 245 seconds]
jacklsw has joined #riscv
vagrantc has quit [Quit: leaving]
Maylay has quit [Ping timeout: 268 seconds]
Maylay has joined #riscv
crabbedhaloablut has quit [Remote host closed the connection]
crabbedhaloablut has joined #riscv
dramforever_ has joined #riscv
dramforever has quit [Read error: Connection reset by peer]
crabbedhaloablut has quit [Remote host closed the connection]
crabbedhaloablut has joined #riscv
crabbedhaloablut has quit [Remote host closed the connection]
crabbedhaloablut has joined #riscv
ldevulder has joined #riscv
crabbedhaloablut has quit [Remote host closed the connection]
crabbedhaloablut has joined #riscv
kehvo has quit [*.net *.split]
octav1a has quit [*.net *.split]
jcm has quit [*.net *.split]
xypron has quit [*.net *.split]
tmiw has quit [*.net *.split]
mrkajetanp has quit [*.net *.split]
meowray has quit [*.net *.split]
meowray has joined #riscv
octav1a has joined #riscv
xypron has joined #riscv
tmiw has joined #riscv
jcm has joined #riscv
kehvo has joined #riscv
mrkajetanp has joined #riscv
BootLayer has joined #riscv
khem has quit [*.net *.split]
kido_ has quit [*.net *.split]
Sofia has quit [*.net *.split]
elms has quit [*.net *.split]
geist has quit [*.net *.split]
pjw has quit [*.net *.split]
ln5 has quit [*.net *.split]
arnd has quit [*.net *.split]
englishm has quit [*.net *.split]
merry has quit [*.net *.split]
kido_ has joined #riscv
elms has joined #riscv
geist has joined #riscv
englishm has joined #riscv
Sofia has joined #riscv
pjw has joined #riscv
arnd has joined #riscv
Maylay has quit [Ping timeout: 252 seconds]
Sofia has quit [Changing host]
Sofia has joined #riscv
ln5 has joined #riscv
khem has joined #riscv
merry has joined #riscv
dramforever__ has joined #riscv
dramforever_ has quit [Read error: Connection reset by peer]
lagash has quit [Quit: ZNC - https://znc.in]
lagash has joined #riscv
Maylay has joined #riscv
dramforever_ has joined #riscv
dramforever__ has quit [Ping timeout: 255 seconds]
prabhakarlad has joined #riscv
jacklsw has quit [Ping timeout: 245 seconds]
pecastro has joined #riscv
dor has joined #riscv
pecastro has quit [Quit: Lost terminal]
<dramforever_> Hi, I'm trying to get U-boot to boot Linux with EFI, but I couldn't get it to load the initrd (log: https://fars.ee/2KW9 ). Any ideas on what I could be missing?
dramforever_ is now known as dramforever
<dramforever> Oops, full log here, with u-boot interaction: http://fars.ee/gCG4
lopa has joined #riscv
<lopa> hi, when trying to DFU-flash the Longan Nano I get this error
<lopa> Cannot open DFU device 28e9:0189 found on devnum 9 (LIBUSB_ERROR_ACCESS)
<lopa> it has been put in bootloader mode
<dramforever> I think that means 'permission denied'
<lopa> oh alright
<dramforever> Get the bus and device number from lsusb and do a 'sudo chmod +w /dev/bus/usb/{bus}/{device}'
<dramforever> You can also set up a udev rule or something...
<lopa> but do I need to run dfu with sudo?
<dramforever> I think either would work, but I just prefer to run less stuff with sudo
<lopa> ok so I tried sudo chmod +w /dev/bus/usb/001/009 but nothing has changed
<lopa> oh, it was a+w
<lopa> it did work, awesome, Imma make a udev rule
<lopa> :DDD
<lopa> thanks bye
lopa has left #riscv [WeeChat 1.6]
<dramforever> Just found out that the initrd= thing needs CONFIG_EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER=y
<dramforever> depends on !RISCV
<dramforever> why
kettenis has quit [Quit: Lost terminal]
handsome_feng has quit [Quit: Connection closed for inactivity]
<dramforever> Looks like it's replaced by LoadFile2
strlst has joined #riscv
<dramforever> ... and LINUX_EFI_INITRD_MEDIA_GUID, which is not supported by systemd-boot
TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM_ has joined #riscv
dor has quit [Remote host closed the connection]
jmdaemon has quit [Ping timeout: 272 seconds]
<dramforever> That's a bit disappointing, but I did end up successfully building systemd-boot and grub2 and they do seem to run, so at least we have that
Andre_H has joined #riscv
kehvo has quit [Changing host]
kehvo has joined #riscv
Andre_H has quit [Ping timeout: 268 seconds]
zjason` is now known as zjason
<Sofia> Hello world. Looking at Pine64's Star64 SBC which uses the StarFive JH7110, which in turn is based on the FU740 which builds on the SiFive U74-MC (4x U74 + 1x S7).
<Sofia> Indicates this U74-MC has 32 KB L1d and 32 KB L1i per core; and 2 MB shared L2.
<Sofia> According to lstopo my laptop has 32 KB L1d and 32 KB L1i per core and 256 KB L2 caches per core.
<Sofia> Am I misinterpreting something or does this RISC-V chip have more RAM than my 10th gen Intel i7 laptop?
<Sofia> cache**
<Sofia> This says 8 MB cache, but doesn't say how it is divided.
<Sofia> lstopo gives me 4 lines of:
<Sofia> L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
<Sofia> Ah, there we go.
<Sofia> lscpu shows:
<Sofia> L1d: 128 KiB (4 instances)
<Sofia> L1i: 128 KiB (4 instances)
<Sofia> L2: 1 MiB (4 instances)
<Sofia> L3: 8 MiB (1 instance)
<Sofia> lstopo just doesn't mention the L3; and apparently Intel doesn't count below L3 in their cache size.
<Sofia> More fun. U74-MC explicitly states ECC cache. Does Intel even use ECC for caches?
<Sofia> Per Intel's spec page, ECC is not supported for *RAM*. Doesn't mention caches.
prabhakarlad has quit [Quit: Client closed]
dramforever_ has joined #riscv
dramforever has quit [Read error: Connection reset by peer]
loki_val has joined #riscv
crabbedhaloablut has quit [Quit: No Ping reply in 180 seconds.]
dramforever_ has quit [Remote host closed the connection]
dramforever_ has joined #riscv
dramforever__ has joined #riscv
<conchuod> My 10th gen has the L3 in lstopo, Sofia, but since it is not shared it doesn't show up in the individual core entries
<conchuod> since it is shared**
dramforever_ has quit [Ping timeout: 268 seconds]
<Sofia> Oh
<Sofia> Package L#0 + L3 L#0 (8192KB)
<Sofia> So it does, just put it up on the package layer.
<Sofia> Thanks for pointing that out :)
<conchuod> Also, cache size != RAM size. Your laptop has L3 cache and prob more DDR4 than the ~4 GB that the Quartz64 has (I think I heard that was going to be the basis for the JH7110 one)
<Sofia> 4 and 8 G
<Sofia> And yes, my RAM is 16G, or 15 as lstopo shows. Don't think I have 1G integrated video allocation, but at least that'd take a bite.
<Sofia> I checked the P650 as well; (32 KB L1d + 32 KB L1i + 256 KB L2) * 16 cores + 16 MB shared L3.
<Sofia> LOTS more than my laptop, heh.
<conchuod> My midrange 10th gen (10400F) is configured identically, 32/32, 256 L2 & 12 MB shared L3
<conchuod> Actually not identically since that's 12 threads not 12 cores..
<Sofia> Eek. Network issues. What was my last message?
<Sofia> Anything from anyone else between conchuod "Also cache size != RAM size. ..." and "My midrange" ?
<octav1a> >"LOTS more than my laptop, heh."
<Sofia> Ty octav1a <3
<octav1a> :3
<Sofia> Am I correct in estimating instructions per second as max = clock * cores * issue, min = max / stages, as a conservative and lazy estimate? Ex. 1.5 GHz * 4 cores * 1 issue: max = 6 billion, min = 750 million. 1.5 GHz * 16 cores * 4 issue: max = 96 billion, min = 7.3 billion.
<Sofia> Without diving into all the microarchitectural details for specifics or distributions. Notably out of order biases towards its best case much more often.
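The back-of-the-envelope bound above can be written out as a small sketch. The function name `ips_bounds` is hypothetical, and the stage counts (8 and 13) are not stated in the chat; they are inferred from the quoted figures (6 billion / 750 million, and 96 billion / roughly 7.3 billion).

```python
def ips_bounds(clock_hz, cores, issue_width, pipeline_stages):
    """Return (min, max) instructions per second under the rough model
    max = clock * cores * issue and min = max / stages."""
    peak = clock_hz * cores * issue_width
    floor = peak / pipeline_stages
    return floor, peak


# Stage counts below are assumptions inferred from the numbers in the chat.
for label, args in [
    ("4 cores, 1-issue", (1.5e9, 4, 1, 8)),
    ("16 cores, 4-issue", (1.5e9, 16, 4, 13)),
]:
    lo, hi = ips_bounds(*args)
    print(f"{label}: min {lo / 1e9:.2f} G, max {hi / 1e9:.2f} G instructions/s")
```

This only brackets throughput; it says nothing about where a real workload lands between the two bounds.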
dramforever_ has joined #riscv
<Sofia> The P650. (32 KB L1d + 32 KB L1i + 256 KB L2) * 16 cores + 16 MB shared L3
<Sofia> Oh wait, that was received. :)
dramforever__ has quit [Ping timeout: 268 seconds]
rsalveti has joined #riscv
cousteau has joined #riscv
<cousteau> Hi!
<cousteau> Off-topic, but I don't know where else to ask... Do binary patterns 0xAAAAAAAA 0xCCCCCCCC 0xF0F0F0F0 0xFFFF0000 etc have any specific name?
<cousteau> These patterns have the peculiarity that their i'th bit is set if a specific bit of the binary number i is set. For example, 0xAAAAAAAA has all odd bits set and all even bits cleared
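A minimal sketch of the property just described, where bit i of the k-th pattern is set exactly when bit k of the index i is set. The name `bit_index_mask` is made up for illustration.

```python
def bit_index_mask(k, width=32):
    """Mask whose bit i is set exactly when bit k of the index i is set."""
    return sum(1 << i for i in range(width) if (i >> k) & 1)


for k in range(5):
    print(f"k={k}: 0x{bit_index_mask(k):08X}")
# k=0: 0xAAAAAAAA
# k=1: 0xCCCCCCCC
# k=2: 0xF0F0F0F0
# k=3: 0xFF00FF00
# k=4: 0xFFFF0000
```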
<Esmil> dramforever_: grub (in ubuntu at least) does implement the right protocol to load an initramfs, but unfortunately systemd-boot only implements it in the stub, but not in sd-boot: https://github.com/systemd/systemd/pull/20918#issuecomment-943780719
<cousteau> They're quite often used in bit manipulation functions (I've seen them used in the pseudocode description of some RISC-V bitmanip instructions), and play an important role in the generation of LUT functions, but I couldn't figure out a name for them
<Sofia> cousteau: Magic numbers or magic constants are terms I've seen used to describe them.
<cousteau> Yeah, I also saw "binary magic numbers" but that's kind of a broad term...
<cousteau> But thanks!
<cousteau> A paper I was reading (linked from an OEIS sequence) refers to them as magic numbers or magic constants (I think), and then b-constants
<Sofia> Not familiar with any more specific name
* Sofia nods
<Sofia> Does OEIS support set queries or only subsequence queries? Hmm
<cousteau> Sofia: yeah that's the article where I read them being called that for the first time
<cousteau> Maybe they should have a more specific name
<dramforever_> Esmil: Thanks for the tip. I did get grub to work last time, so the case is closed now
<Sofia> cousteau: Also Hacker's Delight, I think, uses the same name.
<dramforever_> Unfortunately NixOS relies on a perl script to generate the grub config, and some of its deps fail to cross, so that's why I ended up trying sdboot
<cousteau> Let me check. I think I've seen that article too
<Esmil> dramforever_: yeah, i like the sd-boot model where you put kernel, initrd and devicetree on the efi partition and reuse the efi fat implementation much better
<cousteau> I think the name may come from a (pretty lame) magic trick in which the magician guesses a number by asking for its inclusion in multiple cards with numbers
<cousteau> One has all the odd numbers from 1 to 63, another has all that are a multiple of 4 plus 2 or 3, etc. I.e., the bits that are set in this sequence.
<cousteau> I've seen them used in bit manipulation hacks, LUT function generation, and some error-correcting codes such as Reed-Muller (and maybe Hamming too)
Andre_H has joined #riscv
<cousteau> OK, I'll go with "magic binary constants/patterns"... Thanks!
<Sofia> cousteau: I think this name also applies to division by constants / mul-high constants. https://en.wikipedia.org/wiki/Division_algorithm#Division_by_a_constant
<cousteau> Oh, I have to check that because I was going to generate these values by dividing UINT_MAX by a power of 2 plus 1
<dramforever_> Not sure how you're going to use this name, but if I was writing an article I'd probably just call it mask(5, 0) or something
<cousteau> And division is kind of undesirable...
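A sketch of the division idea for 32-bit values, next to a division-free alternative that builds the same pattern by repeated doubling. Both function names are hypothetical. Dividing UINT_MAX by 2^(2^k) + 1 gives the pattern whose low run is ones (0x55555555 for k = 0), so it is shifted left by one run length to get 0xAAAAAAAA and friends.

```python
UINT_MAX = 0xFFFFFFFF


def mask_by_division(k):
    """Pattern k via UINT_MAX // (2**(2**k) + 1), shifted up by one run."""
    run = 1 << k
    low = UINT_MAX // ((1 << run) + 1)  # e.g. 0x55555555 for k = 0
    return (low << run) & UINT_MAX


def mask_by_doubling(k, width=32):
    """Same pattern without division: seed one period, then double it."""
    run = 1 << k
    m = ((1 << run) - 1) << run  # one run of zeros below one run of ones
    period = 2 * run
    while period < width:
        m |= m << period  # copy the pattern into the next higher block
        period *= 2
    return m & ((1 << width) - 1)


for k in range(5):
    assert mask_by_division(k) == mask_by_doubling(k)
    print(f"k={k}: 0x{mask_by_doubling(k):08X}")
```

The doubling version needs at most five shift-and-OR steps for 32 bits, which may matter on targets where division is expensive.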
<cousteau> dramforever_: it was for documenting a function. "This function fills a memory region with... uh... binary magic patterns"
<dramforever_> Then I'd probably call it 'alternating bit pattern'
<dramforever_> You know, alternating between 0 and 1 at different frequencies
<cousteau> Well, 111000111000 also does that and it's not in the list
<dramforever_> true
<cousteau> A descriptive phrase may be "alternating power-of-two runs of zeros and ones"
<Sofia> Sounds like a wrapper for the more general tool which may be useful to have defined anyway
<cousteau> I guess I'll go with "magic", followed by a description and maybe an explanation of the properties
<Sofia> What is your purpose for the term? Writing an article or documenting usage of the numbers?
motherfsck has quit [Ping timeout: 245 seconds]
<cousteau> For now, documenting the function that generates them, which I will use to analyze memory of an FPGA
<Sofia> Fun
<Sofia> Which analysis passes are you interested in?
<cousteau> I don't want to give too many details because it's a work-related thing but it has to do with partial reconfiguration
<Sofia> Like updating the bitstream with micro patches? Sounds like a JIT?
<cousteau> Sort of, yeah
<Sofia> Interesting :)
<Sofia> Does the FPGA need to be suspended to update it?
* Sofia has never done FPGA dev but has looked into the languages and tooling a little.
<dramforever_> Okay perhaps the most hilarious name, Bluespec compiler has an option to fill (basically) uninitialized values with alternating 0101010 and they just call it 'A'
<dramforever_> So uninitialized bits can be 0, 1, X, A
<Sofia> But 0b010101 = U, not A. o.o
<Sofia> 0b01010101
<Sofia> Oops you wanted 0b0101010, that is *
motherfsck has joined #riscv
<dh`> it's 0xa?
<dramforever_> Uhh this is getting off-topic so anyone having anything else to talk about please interrupt, but apparently it fills LSB first 0 then 1 then 0 then 1 etc, so it is indeed A
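A small sketch of the LSB-first fill as described in the message above (bit 0 gets 0, bit 1 gets 1, and so on), together with the ASCII readings of the other two bit strings mentioned. The helper name is hypothetical and this is not taken from the Bluespec sources.

```python
def lsb_first_fill(width):
    """Fill bit 0 with 0, bit 1 with 1, bit 2 with 0, and so on."""
    return sum(1 << i for i in range(width) if i % 2 == 1)


print(hex(lsb_first_fill(4)))   # 0xa
print(hex(lsb_first_fill(8)))   # 0xaa
print(chr(0b01010101))          # U
print(chr(0b0101010))           # *
```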
<Sofia> 0b1010, closer but there isn't a 0xx <_<
haritz has quit [Read error: Connection reset by peer]
<conchuod> 0xX does exist as does 0xZ ;)
<conchuod> Not that you ever want to see a 0xX haha
<Sofia> Are you considering 0xX as an overflowing digit, such as to defer carries?
<conchuod> No, as unknown/uninitialised (tbf there are times you might use X, never was hyperbole)
<Sofia> In that case, X = 33 = 0x21 = 0b100001.
<Sofia> Oh
loki_val has quit [Remote host closed the connection]
crabbedhaloablut has joined #riscv
<strlst> I'm not sure if I understood correctly, but if this discussion is about possible binary values for wires in HDLs, there are a great variety of possible states outside 0 and 1 due to the physical reality of the wiring and for simulations
<Sofia> /o\
<strlst> there is U (uninitialized), X (unknown), 0 and 1, but also (W, L and H, weak undetermined, weak low, weak high), Z (high impedance, like for busses) and - (don't care)
<strlst> those are the possible states of the std_logic datatype in VHDL (I think it was comparable for verilog)
cousteau has quit [Read error: Connection reset by peer]
EchelonX has joined #riscv
Maylay has quit [Ping timeout: 252 seconds]
Raito_Bezarius has quit [Ping timeout: 240 seconds]
haritz has joined #riscv
Maylay has joined #riscv
Raito_Bezarius has joined #riscv
lagash has quit [Ping timeout: 255 seconds]
prabhakarlad has joined #riscv
<muurkha> I'm thinking weak low and weak high are more useful for designing with transistors and resistors than designing with LUTs? or are there FPGAs that provide a weak-drive capability?
lagash has joined #riscv
lagash has quit [Ping timeout: 240 seconds]
TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM_ has joined #riscv
lagash has joined #riscv
cousteau_ has joined #riscv
<cousteau_> Sofia: sorry was away
<cousteau_> Yeah no, the point of dynamic partial reconfiguration is that you don't need to stop the FPGA to reconfigure it
<cousteau_> You can even do it from the FPGA itself
<strlst> muurkha: I've never encountered weak driving of logical signals, so I couldn't tell, but from what I remember learning it's a consequence of what kind of logic drives your signal: PMOS can drive a strong 1 and NMOS can drive a strong 0, but if you only use one type of logic you'll have problems driving one or the other, making it a weak signal
<strlst> so weak driving of signals is rather a shortcoming than a capability
<cousteau_> strlst: btw, Verilog keeps it simpler and only has 0 1 Z and X
<cousteau_> X is somewhat like VHDL's X, U, and -
<strlst> oh good to know, goes to show that I should've taken a closer look at verilog, thanks for the comment
<cousteau_> I learned both, started with VHDL but learned some Verilog afterwards
<strlst> they both have interesting properties/advantages, but people seem to prefer verilog in the long run
<strlst> VHDL is quite verbose after all
<cousteau_> strlst: Verilog is more commonly used in the US, and VHDL in Europe, or that was my understanding
<muurkha> strlst: things like the pullup resistor on the I²C bus, or open-collector or open-drain logic in general, intentionally provide weak driving of signals
<strlst> cousteau_: yeah, it's mine as well, we definitely learn VHDL over here in Europe
<cousteau_> And well, RISC-V was developed at Berkeley and so was Chisel, so they didn't bother with VHDL and all the generated code is Verilog in the end
<muurkha> heh
<cousteau_> Not that you should care, unless you really plan to understand the generated code...
<strlst> muurkha: true, that was a thing as well, it's curious that there are so many states, one would think 0 and 1 is all you need
<muurkha> still is a thing! and an important one! but generally outside the scope of an FPGA
<strlst> but on the other hand it makes sense that verilog reduces these states, because it's not often that you come across them inside an FPGA really (that's what it seems like to me)
<strlst> yeah
<cousteau_> If you decide to make a blackbox, VHDL is fine. You can put that in your Verilog code.
<muurkha> DRAM amplifiers also do an interesting weak-drive thing but I don't think VHDL is useful in that context
<cousteau_> strlst: inside an FPGA there should only be 0 and 1
<cousteau_> And for tri-state signals you usually want to explicitly have three separate signals. But I find VHDL's '-' useful for optimization.
<muurkha> yeah, don't-care can help a lot for combinational optimization
<strlst> also helps a lot to deal with the verbosity, effectively reducing it
<strlst> although I guess that is combinational optimization
<muurkha> reducing the verbosity of a behavioral specification is not at all what I meant by "combinational optimization" ;)
<cousteau_> Still useful. You want the tool to do that for you, rather than doing it manually
<strlst> I wonder though if VHDL/verilog will ever be displaced, right now all other HDLs seem to sit on top of them, that doesn't stop people from making HDLs though
<cousteau_> You don't know if the optimal thing will be to leave a counter unchanged, tied to 0, or incremented as in the other state, and you shouldn't care
<cousteau_> strlst: I don't think so, they're still used as backend languages
<cousteau_> Verilog specifically seems to be the backest end, since it's the lowest level
<cousteau_> Like, I've seen it used as a language to specify elaborated netlists
<strlst> cousteau_: yeah, I guess it has to do with the synthesis/place&route toolchains, it doesn't seem easy to rebuild them for something other than what we already have
<strlst> but it's so weird, especially as VHDL was originally meant to be only a documentation language to make specifications of circuits (I hope I got that right)
<cousteau_> Verilog was for verification, right?
<cousteau_> Sofia: so back in the day I did dynamic partial reconfiguration on a Virtex-5. It was cool. Basically you can hot-swap fragments of a circuit at runtime
<cousteau_> Or "download hardware"
<conchuod> On-the-fly programming of FPGAs is cool
<cousteau_> And the cool part is that you can reconfigure half of it while the other half is still working
<conchuod> Linux has an FPGA manager subsystem that does this sort of thing
<conchuod> But I don't understand what the userland interface is meant to be for it (it seems not to exist, and the existing users of it have spun their own out-of-tree implementations)
<cousteau_> (although now that the idea is to put a small FPGA as part of a larger integrated system, doing so partially isn't as important. This was cooler when EVERYTHING was in an FPGA)
<muurkha> huh
<cousteau_> conchuod: in Xilinx it's just writing to a /dev file, right?
<muurkha> you can still put everything in an FPGA for a small system
<cousteau_> Yeah sure
<conchuod> idk if it was Intel or Xilinx that I looked at, but they were using a sysfs file
<muurkha> do the lattice parts support dynamic partial reconfiguration?
<muurkha> you can't run linux on them
<cousteau_> muurkha: not sure but I recall that you couldn't do DPR on them
<conchuod> From a quick google, some of them seem to, muurkha
<cousteau_> But maybe things have changed
<cousteau_> ...see, I talked about stuff I have no idea about
<muurkha> but generating bitstreams for lattice parts might be more practical to do quickly
<muurkha> rather than just timesharing an FPGA between a few precomputed bitstreams
<conchuod> I think there's an upstream driver for lattice FPGA reprogramming & there's another in progress here: https://lore.kernel.org/linux-fpga/20220719112335.9528-1-i.bornyakov@metrotek.ru/
<muurkha> neat
<conchuod> But from a quick look, that seems to be full reprogramming rather than partial
<muurkha> doesn't mention partial reconfig, yeah
<cousteau_> Aaah, could be
<conchuod> I've not looked into the partial reconfiguration side of that subsystem at all so dunno what would be required to extend it
BootLayer has quit [Quit: Leaving]
<muurkha> I suppose if you have two FPGAs rather than one you can do "dynamic partial reconfiguration" by having one of them reconfigure the other ;)
<muurkha> cousteau_: what kind of things were you reconfiguring it for?
<muurkha> I'm thinking that you don't really want to try to invoke Vivado as a subroutine from a JIT compiler
Maylay has quit [Ping timeout: 245 seconds]
<muurkha> there are some really teensy FPGAs these days; the 7-year-old 640-LUT iCE40UL640 is US$2.80 in quantity 1 and 1.4 millimeters square as a WLCSP
<muurkha> supposedly 9 ns best-case delay per LUT level, so at most that's a few billion bit operations per second. enough to be interesting for some kinds of vector coprocessor applications maybe?
Maylay has joined #riscv
motherfsck has quit [Ping timeout: 268 seconds]
Maylay has quit [Ping timeout: 268 seconds]
<cousteau_> muurkha: evolvable hardware
<cousteau_> Circuits that are trained for doing different stuff
<cousteau_> But yeah, this was done with pre-built hardware
<muurkha> cousteau_: what was the paper path from genome to bitstream?
<cousteau_> You just have small pieces that do different functions, and put them together making a puzzle
<muurkha> ah, so you were doing floorplanning and place-and-route but not logic synthesis as such?
<cousteau_> It was like a large array of boxes connected to their neighbors, each gene defined the function that each of the boxes executed
ln5 has quit [Ping timeout: 245 seconds]
<cousteau_> Yeah, logic synthesis was done beforehand
<muurkha> but what were you using to do the PNR from the genome and get a bitstream you could feed into the device? were you in fact running Vivado or something?
<cousteau_> Xilinx even has a PR flow that allows you to do that sort of thing
<cousteau_> Ah well, you feed it a partial bitstream, which is like "reconfigure this fragment of the FPGA with this function"
<cousteau_> Ie a bitstream that only reconfigures part of the FPGA, not all of it
<muurkha> sure
<cousteau_> Xilinx has tools for making those
<muurkha> I was just wondering what the tools were and how long they took to run for each new partial bitstream
<cousteau_> Honestly I don't know for sure because I don't remember that part, but not much since it doesn't have to P&R the whole design, only the part that changed
<muurkha> sure
<muurkha> I'm thinking that software that takes hundreds of milliseconds to run would be a problem to run as part of a process you want to do every fifty microseconds
<cousteau_> Well yeah that was a concern
<muurkha> and software that takes ten seconds to run would be a problem to run as part of a process you want to do every second
<cousteau_> At the end we managed to reduce the reconfiguration to a very small part
<muurkha> but either one might be described as "not much" from a human standpoint :)
<cousteau_> It was a few microseconds per reconfiguration, so not that much of a problem
<cousteau_> At the end, with lots of optimizations and parallel evaluations, I had a system that evolved in about a second
crabbedhaloablut has quit [Write error: Connection reset by peer]
<cousteau_> But that involved "cheating" a lot (doing things not supported natively by the tool)
crabbedhaloablut has joined #riscv
<cousteau_> Example application: you may have RoCC accelerators that are dynamically swapped, and have a processor that changes its functionality at runtime. (Although you probably prefer regular memory-mapped peripherals for that.)
<muurkha> aha, "a few microseconds", that's great! exactly what I was hoping for
<muurkha> what's RoCC?
<strlst> apparently it stands for rocket [chip] custom coprocessor
Maylay has joined #riscv
<strlst> on the topic of evolving circuits, there was this really fascinating and old attempt at using Darwinian evolution to make a chip that could discriminate a 10 kHz signal from a 1 kHz signal (by outputting 5 V or 0 V)
<strlst> the paper was "An evolved circuit, intrinsic in silicon, entwined with physics" by Adrian Thompson ( https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.9691 )
<strlst> what fascinated me was how old it was, this was someone using FPGAs to make circuits by evolution over 20 years ago
Gravis has joined #riscv
<cousteau_> muurkha: RoCC is sort of an interface for extending RISC-V ISAs
<cousteau_> With custom, sort-of-standardized instructions
<cousteau_> strlst: is that the tone detector thing?
<cousteau_> I think that went into the state of the art of my thesis
<cousteau_> It was super weird. I think it used circuit delays instead of flip flops
<strlst> cousteau_: yeah, it discriminated between signals of different frequencies
<cousteau_> I think it was the first "real" application of evolvable hardware. The first one was a 6-mux, but that's kind of a silly demonstration example
<cousteau_> (The first "non-real" one)
<strlst> if I recall correctly the end result was an incomprehensible circuit that is completely asynchronous and uses the intrinsic physical properties (and variations) of the chips and the parts itself
<strlst> but it actually solved the problem, if held at the right temperature range (the one it was evolved at)
<cousteau_> Yep, that was exactly it
<strlst> I'm not so sure about other such works, but seeing that was really quite fascinating, it's such a radical idea
<muurkha> yeah, one of those things where you learn more about what not to do than what to do
<muurkha> on the other hand
<muurkha> the human brain of this body also fails to work properly if its temperature rises from 37° to 41°
<muurkha> or falls to 27°
<cousteau_> It really takes the idea of "make a circuit that you have no idea how the hell it works, but it does" to an extreme
<cousteau_> For the record, mine was synchronous
<strlst> muurkha: good point, we don't critique ourselves for that :)
<strlst> or do we now?
<cousteau_> And based on incomprehensible logic functions, not on incomprehensible laws of physics
<muurkha> well, it would be an inefficient design for a micromachine, which is why shrews are orders of magnitude larger than varroa mites
crabbedhaloablut has quit [Remote host closed the connection]
<strlst> cousteau_: that sounds very cool, what was the thing that the thesis tried to solve?
<strlst> or find out
crabbedhaloablut has joined #riscv
<muurkha> and "warmblooded" ovenized crystal oscillators are still standard equipment for high-precision analog test equipment, even if less so than 30 years ago
<strlst> muurkha: well, yeah, the world can appear very different depending on the scale you're looking at
<cousteau_> strlst: my use case application was an image filter
<muurkha> OCXO ovens are typically a few tens of millimeters across
<muurkha> cousteau_: oh, that's awesome! what loss function did you use?
<muurkha> hmm, we're asking questions that are probably answered in the abstract of your dissertation
<cousteau_> I used a mean absolute error, because I didn't have time for a quadratic function
<muurkha> well, the benefit of quadratics is that they're differentiable and work well with gradient descent
<muurkha> I tried using absolute error with gradient descent. that sure did not go well
<cousteau_> Like, "compare the output of a filtered photo of a playboy model with noise with the original one, and get the sum of absolute error"
<muurkha> right
<muurkha> what kinds of logic functions did your evolved filters end up using?
<cousteau_> I did compare the result of evolving with absolute and quadratic error, the results were similar
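For reference, a minimal sketch of the two fitness measures being compared, applied to flat lists of pixel values. The sample data is invented for illustration and is not from the thesis.

```python
def sum_abs_error(a, b):
    """Sum of absolute per-pixel differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mean_abs_error(a, b):
    return sum_abs_error(a, b) / len(a)


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


original = [10, 20, 30, 40]   # made-up reference pixels
filtered = [12, 20, 27, 40]   # made-up filter output

print(sum_abs_error(filtered, original))       # 5
print(mean_abs_error(filtered, original))      # 1.25
print(mean_squared_error(filtered, original))  # 3.25
```

The absolute-error version needs only subtractors and an accumulator, whereas the quadratic version needs a multiplier per pixel, which is one plausible reading of "didn't have time for a quadratic function".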
<muurkha> I've been exploring some extensions of Urbach and Wilkinson's work on fast algorithms for mathematical morphology
<muurkha> I think there's an unexplored space there of "factoring" kernels with the Minkowski sum operator
<cousteau_> Functions were like max, min, mean, add, subtract, pass-through, divide by 2, etc
<cousteau_> They could be implemented with 16 LUTs
<muurkha> sure, but I mean, what did it compose them into?
<strlst> cousteau_: wow, that sounds like an incredibly interesting thing to work on
<cousteau_> In fact, they WERE implemented using 16 LUTs
<muurkha> heh
<muurkha> the Virtex-5 uses 6-input LUTs, doesn't it?
<cousteau_> Yep
<muurkha> so 16 LUTs is pretty powerful
<cousteau_> They can be used as a half-adder, too
<muurkha> right
<muurkha> were you feeding the pixels in one at a time, or one row at a time, or all at once?
<cousteau_> It used 8 LUTs to compute a simple (configurable) 8-bit addition/comparison, and then another 8 LUTs for selecting/post-processing the result
<cousteau_> Feeding one 3x3 window at a time
<cousteau_> Sliding window
<strlst> I guess image kernels always look at the area surrounding a pixel, hence the sliding window
<cousteau_> Exactly
<strlst> okay, got it
<cousteau_> It was a very nice use case
<strlst> it's an interesting problem to choose because you can quantify the performance/quality of a result
<cousteau_> Although the goal was to make evolvable filters, not image filters specifically
<muurkha> oh interesting, so it didn't have to evolve its own buffering
<cousteau_> Yeah also images look super cool on papers
<strlst> they do
<strlst> =D
<cousteau_> That's why you see more Lenas than audio samples
<cousteau_> Yeah the filter had some "previous info" on how to solve the problem
<muurkha> the Cytocomputer could do (non-arbitrary) 3×3-window functions, but it buffered internally
<muurkha> you fed in one 8-bit pixel at a time
<cousteau_> The window generation was already provided
<muurkha> and internally it had FIFO buffers that held two lines plus two pixels
Maylay has quit [Ping timeout: 240 seconds]
<cousteau_> Yeah well, I was feeding one pixel at a time and then building the window internally too, but that functionality was fixed
<muurkha> per pipeline stage
Maylay has joined #riscv
<cousteau_> With fifos or shift registers or whatever
<cousteau_> Yeah, good ol' 2 lines + 2 pixels
<cousteau_> Used those a lot
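A software sketch of the "2 lines + 2 pixels" buffering discussed above: pixels arrive one at a time in raster order, and a FIFO holding two full lines plus two pixels of history (plus the current pixel) is enough to read out a complete 3x3 window. The function name and the choice to report the window centre are assumptions, and border handling is simply skipped.

```python
from collections import deque


def windows_3x3(pixels, width):
    """Yield (row, col, window) for each full 3x3 window in a raster-order
    pixel stream; (row, col) is the window centre."""
    # 2 lines + 2 pixels of history, plus the pixel that just arrived.
    buf = deque(maxlen=2 * width + 3)
    for n, p in enumerate(pixels):
        buf.append(p)
        r, c = divmod(n, width)
        if r >= 2 and c >= 2:  # buffer is full and the window does not wrap
            yield (r - 1, c - 1, [
                [buf[0], buf[1], buf[2]],
                [buf[width], buf[width + 1], buf[width + 2]],
                [buf[2 * width], buf[2 * width + 1], buf[2 * width + 2]],
            ])


# A 4-pixel-wide, 3-line image with pixel values 0..11.
for row, col, win in windows_3x3(range(12), width=4):
    print(row, col, win)
# 1 1 [[0, 1, 2], [4, 5, 6], [8, 9, 10]]
# 1 2 [[1, 2, 3], [5, 6, 7], [9, 10, 11]]
```

In hardware the three taps per line would be fixed positions in the shift register or FIFO chain, so one window comes out per clock once the buffer has filled.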
<muurkha> because it had multiple pipeline stages you could do filters that covered a much larger area than 3×3
<muurkha> and you always got one output pixel per cycle
<muurkha> regardless of how many stages you configured the Cytocomputer for
<cousteau_> Yeah that's the cool thing of pipelined architectures
<muurkha> yeah
<cousteau_> Could filter a 1080p video in real time
<muurkha> on the Cytocomputer? I think its max clock speed was like 10 MHz and its max width was 1024 pixels
<muurkha> so you could reasonably do a morphological dilation or erosion with a twenty-pixel-wide kernel at one pixel per cycle, as long as you could compose your desired kernel out of the Minkowski sum of 3×3 kernels
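[editor's note] The point about composing a wide kernel from 3×3 stages rests on dilation being associative with the Minkowski sum: dilating stage by stage gives the same result as dilating once by the sum of the stage kernels. A minimal sketch of that identity follows, assuming a binary image represented as a set of (row, col) foreground coordinates on an unbounded grid (so image borders are ignored). The names `minkowski_sum`, `dilate`, `compose`, and `SQUARE_3X3` are illustrative, not taken from any library.

```python
def minkowski_sum(a, b):
    """Minkowski sum of two sets of (row, col) offsets."""
    return {(ay + by, ax + bx) for ay, ax in a for by, bx in b}

def dilate(image, kernel):
    """Binary dilation of a set of foreground coordinates by a structuring element."""
    return minkowski_sum(image, kernel)

def compose(kernels):
    """Single equivalent kernel for a chain of dilation stages."""
    total = {(0, 0)}  # identity element of the Minkowski sum
    for k in kernels:
        total = minkowski_sum(total, k)
    return total

# Full 3x3 square centred on the origin.
SQUARE_3X3 = {(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)}

if __name__ == "__main__":
    image = {(0, 0), (5, 7)}
    stages = [SQUARE_3X3] * 3

    # Dilate stage by stage, as a pipeline of 3x3 units would.
    staged = image
    for k in stages:
        staged = dilate(staged, k)

    # Dilate once by the composed kernel (7x7 for three square stages).
    direct = dilate(image, compose(stages))
    print(staged == direct)
```

With full 3×3 squares, n stages compose to a (2n+1)-wide square, so ten stages would give a 21-pixel-wide kernel. Other shapes are reachable only if they decompose into 3×3 pieces, which is the restriction stated in the message above.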
<cousteau_> No, on the thing I was making
<muurkha> oh, fantastic!
<cousteau_> It went 400 MHz
<cousteau_> Or 250 on a Zynq 7000 because I didn't spend that much time optimizing or overclocking it
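[editor's note] A quick sanity check of the real-time 1080p claim, assuming one pixel per clock, 60 frames per second, and no blanking intervals. None of those figures are stated in the conversation; only the 250 MHz and 400 MHz clocks are.

```python
def pixel_rate(width, height, fps):
    """Active pixels per second for a video format (blanking ignored)."""
    return width * height * fps

if __name__ == "__main__":
    needed = pixel_rate(1920, 1080, 60)  # assumed 1080p at 60 fps
    for clock_hz in (250_000_000, 400_000_000):
        # At one pixel per clock, the clock rate is the achievable pixel rate.
        print(clock_hz, needed, clock_hz >= needed)
```

Under those assumptions the stream needs roughly 124 million pixels per second, so both quoted clock rates leave headroom.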
<muurkha> nice
<cousteau_> Since each processing element was a very simple function, and clocked
<muurkha> right
<cousteau_> So each function added a 1 clk delay
<muurkha> sure
<muurkha> was there additional buffering so that your overall filtering could take into account a larger neighborhood?
<cousteau_> In the end the whole thing was structured so that you weren't mixing pixels from different "front waves"
<cousteau_> Or "wave fronts"
<cousteau_> Can't remember
<cousteau_> So the filter just had a dozen clock cycles of delay or so
<cousteau_> And you took this into account
<cousteau_> But no, the filter only had each element take the input from its immediate neighbors
<cousteau_> Seemed good enough though
zjason` has joined #riscv
pabs3 has quit [Ping timeout: 252 seconds]
zjason has quit [Ping timeout: 244 seconds]
EchelonX has quit [Quit: Leaving]
<muurkha> there are kinds of filtering you'd sometimes like to compute that require information from further away
Raito_Bezarius has quit [Ping timeout: 268 seconds]
Maylay has quit [Ping timeout: 240 seconds]
Maylay has joined #riscv
GenTooMan has quit [Ping timeout: 272 seconds]
cousteau_ has quit [Ping timeout: 268 seconds]
Raito_Bezarius has joined #riscv
GenTooMan has joined #riscv
vagrantc has joined #riscv
Raito_Bezarius has quit [Max SendQ exceeded]
Raito_Bezarius has joined #riscv
cousteau has joined #riscv
cousteau has quit [Quit: Quit]
sorear_ has joined #riscv
JanC_ has joined #riscv
moto-timo_ has joined #riscv
theruran_ has joined #riscv
DynamiteDan_ has joined #riscv
JanC has quit [Killed (lead.libera.chat (Nickname regained by services))]
JanC_ is now known as JanC
indy_ has joined #riscv
HdkR has quit [Ping timeout: 245 seconds]
clandmeter has quit [Ping timeout: 252 seconds]
abelvesa has quit [Ping timeout: 260 seconds]
Finde has quit [Ping timeout: 252 seconds]
muurkha_ has joined #riscv
gruetze_ has joined #riscv
alexfanq1 has joined #riscv
rimrunner has joined #riscv
_koolazer has joined #riscv
oaken-so1rce has joined #riscv
JTL1 has joined #riscv
agraf_ has joined #riscv
khem has quit [*.net *.split]
indy has quit [*.net *.split]
la_mettrie has quit [*.net *.split]
theruran has quit [*.net *.split]
koolazer has quit [*.net *.split]
agraf has quit [*.net *.split]
oaken-source has quit [*.net *.split]
gruetzkopf has quit [*.net *.split]
muurkha has quit [*.net *.split]
JTL has quit [*.net *.split]
Esmil[m] has quit [*.net *.split]
moto-timo has quit [*.net *.split]
alexfanqi has quit [*.net *.split]
sorear has quit [*.net *.split]
DynamiteDan has quit [*.net *.split]
theruran_ is now known as theruran
sorear_ is now known as sorear
moto-timo_ is now known as moto-timo
DynamiteDan_ is now known as DynamiteDan
agraf_ is now known as agraf
Esmil[m] has joined #riscv
rimrunner is now known as la_mettrie
khem has joined #riscv
strlst has quit [Quit: Lost terminal]
muurkha_ is now known as muurkha
Andre_H has quit [Ping timeout: 268 seconds]
dramforever_ has quit [Remote host closed the connection]
dramforever__ has joined #riscv
HdkR has joined #riscv
Noisytoot has quit [Excess Flood]
Noisytoot has joined #riscv
crabbedhaloablut has quit [Remote host closed the connection]
crabbedhaloablut has joined #riscv
abelvesa has joined #riscv
HdkR has quit [Ping timeout: 268 seconds]
clandmeter has joined #riscv
HdkR has joined #riscv
clandmeter8 has joined #riscv
clandmeter has quit [Read error: Connection reset by peer]
clandmeter8 is now known as clandmeter
Finde has joined #riscv
vagrantc has quit [Quit: leaving]