azonenberg changed the topic of ##openfpga to: Open source tools for FPGAs, CPLDs, etc. Silicon RE, bitfile RE, synthesis, place-and-route, and JTAG are all on topic. Channel logs: https://libera.irclog.whitequark.org/~h~openfpga
<pie_> is there any consensus on whether xilinx or altera have "better" architectures?
<mwk> xilinx.
<mwk> fuck altera with a chainsaw.
<mwk> <whitequark> it should be a rusty chainsaw
<pie_> heh
<pie_> why?
<pie_> From the aforementioned review article, and also I guess justifying my question of whether there have been any (major) changes, I guess this kind of explains FPGAs showing-up-ish in hpc;
<pie_> "More recently, FPGAs have been widely deployed in datacenters to accelerate various types of workloads such as search engines and network packet processing [9]. In addition, DL has emerged as a key component of many applications both in datacenter and edge workloads, with MAC being its core arithmetic operation. Driven by these new trends, the DSP block architecture has evolved in two different directions. The first direction targets the high-performance computing (HPC) domain by adding native support for single-precision floating-point (fp32) multiplication. Before that, FPGA vendors would supply designers with IP cores that implement floating-point arithmetic out of fixed-point DSPs and a considerable amount of soft logic resources. This created a huge barrier for FPGAs to compete with CPUs and GPUs (which have dedicated floating-point units) in the HPC domain. Native floating-point capabilities were first introduced in Intel's Arria 10 architecture, with a key design goal of avoiding a large increase in DSP block area [79]."
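
(A rough Python sketch of what "floating-point arithmetic out of fixed-point DSPs" involves: the 24x24 significand multiply below maps onto a fixed-point DSP block, while the unpacking, exponent handling, normalization, and repacking are the parts that consumed soft logic. Denormals, NaN/Inf, exponent overflow, and round-to-nearest are all omitted for brevity, so this is illustrative rather than IEEE-complete.)

    import struct

    def fp32_mul_soft(x: float, y: float) -> float:
        # Unpack the IEEE-754 single-precision bit patterns.
        xb = struct.unpack("<I", struct.pack("<f", x))[0]
        yb = struct.unpack("<I", struct.pack("<f", y))[0]
        sign = (xb ^ yb) >> 31
        ex, ey = (xb >> 23) & 0xFF, (yb >> 23) & 0xFF
        if ex == 0 or ey == 0:              # treat zero/denormal inputs as zero
            return -0.0 if sign else 0.0
        mx = (xb & 0x7FFFFF) | 0x800000     # restore the implicit leading 1
        my = (yb & 0x7FFFFF) | 0x800000
        prod = mx * my                      # the 24x24 fixed-point DSP multiply
        exp = ex + ey - 127                 # add exponents, remove one bias
        if prod & (1 << 47):                # significand product landed in [2, 4):
            prod >>= 1                      # normalize and bump the exponent
            exp += 1
        mant = (prod >> 23) & 0x7FFFFF      # truncate; real hardware rounds
        out = (sign << 31) | ((exp & 0xFF) << 23) | mant
        return struct.unpack("<f", struct.pack("<I", out))[0]

    print(fp32_mul_soft(1.5, 2.5))          # 3.75
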
<pie_> "The second direction targets increasing the density of low-precision integer multiplication specifically for DL inference workloads."
<pie_> Though I still wonder about competing with GPUs or dedicated ML inference chips.
<pie_> yeesh;
<pie_> "For example, a single channel of high-bandwidth memory (HBM) has a 128-bit double data rate interface operating at 1 GHz, so a bandwidth-matched soft bus running at 250 MHz must be 1024 bits wide. With recent FPGAs incorporating up to 8 HBM channels [91] as well as numerous PCIe, Ethernet and other interfaces, system level interconnect can rapidly use a major fraction of the FPGA logic and routing resources. In addition, system-level interconnect tends to span large distances. The combination of very wide and physically long buses makes timing closure challenging and usually requires deep pipelining of the soft bus, further increasing its resource use. The system-level interconnect challenge is becoming more difficult in advanced process nodes, as the number and speed of FPGA external interfaces increases, and the metal wire parasitics (and thus interconnect delay) scales poorly [92]."
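
(The quoted 1024-bit figure follows directly from the numbers given; a quick check:)

    # One HBM channel: 128 bits, double data rate (two transfers per clock), 1 GHz.
    channel_bw = 128 * 2 * 1e9          # 256 Gb/s per channel

    # A soft bus closing timing at 250 MHz must carry the same rate:
    bus_width = channel_bw / 250e6
    print(bus_width)                    # 1024.0 bits, as quoted

    print(8 * bus_width)                # 8192.0 bits of soft bus for 8 channels [91]
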
<pie_> Apparently leading to networks-on-chip
<pie_> > Recent Xilinx (Versal) and Achronix (Speedster7t) FPGAs integrate a hard NoC [102], [103] similar to the academic proposals discussed above.
<sorear> I wonder if any of the "make it easier to do GPU and ML stuff" is actually "make it easier for GPU and ML companies to use our products for pre-silicon validation". there are enough ML accelerator startups that it might not be a small market
<pie_> sorear: I was kind of wondering about that but I have no idea of the economics so I kind of discounted that as a viable option
<pie_> is it worth it to tape out an fpga due to hype cycle? at what point does it become worth it when asics are coming?
<cr1901> pie_: Could you relink the article you're quoting?
<pie_> yeah one sec
<pie_> <pie_> afaict, good (introductory?) fpga review article on some basic internals of fpga architectures from 2021 from Andrew Boutros and Vaughn Betz (dunno if these are known names, at least the former seems to be part of VTR which is mentioned in this article and i think may come up here, or maybe that's VPR - well apparently vpr is part of vtr); https://www.eecg.utoronto.ca/~vaughn/papers/casm2021_arch_survey.pdf
<cr1901> tyvm
<pie_> \o/
<sorear> neat review article, would be nice if it mentioned more companies
<pie_> i would ask if anyone else matters but they must because they stay afloat somehow
<pie_> and i suppose lattice is a given in this channel
<pie_> and I guess now that you mention it, i wonder if there is something on lattice?
<sorear> lattice, siliconblue, wasn't ice40 also alcatel/lucent at some point?
<sorear> then there's the actel^Wmicrosemi^Wmicrochip designs which replace all of the "SRAM cells" in the review article with flash transistors, could be important to mention
<mwk> ice40 was siliconblue, bought by lattice
<mwk> all the rest of lattice chips were at&t, alcatel/lucent, then lattice
<mwk> they're two technologically unrelated lineages
<cr1901> I read in Xilinx app notes that Spartan3E multipliers have configurable polarities for CLK and RST, but there's no parameter for the Verilog primitives to control them. Is this undocumented behavior, or Just Plain False?
<sorear> til about nanoxplore, trying to find information on the gr765 efpga