_whitelogger has joined #prjcombine
<mwk> meow
_catircservices has joined #prjcombine
whitequark[cis] has joined #prjcombine
<whitequark[cis]> test
<mwk> excellent
<whitequark[cis]> okay, i think it's fully set up now
<whitequark[cis]> might want to set a topic?
<whitequark[cis]> also might want to +O _catircservices
Wanda[cis] has joined #prjcombine
<Wanda[cis]> umm
<Wanda[cis]> does it not sync?
<Wanda[cis]> it's set on the IRC side
<whitequark[cis]> it only syncs matrix->irc
<Wanda[cis]> +O is already done
<whitequark[cis]> what does it say on irc side
<whitequark[cis]> hm. is that going to sync or not
<whitequark[cis]> i guess not
<Wanda[cis]> doesn't seem to be
<whitequark[cis]> cursed ;;
<whitequark[cis]> anyway, libera guidelines require the topic to state at the least that the channel is logged
<Wanda[cis]> _catircservices does have +o on the irc side
<Wanda[cis]> yeah
azonenberg has joined #prjcombine
<azonenberg> mwk: I can definitely write a lot of info on the XC2C interconnect and macrocell stuff when i have time
<azonenberg> where's the doc source stored?
<Wanda[cis]> in docs/ of course
<Wanda[cis]> the tables are all autogenerated from the database though
<Wanda[cis]> also uh. I really need to fix up that sphinx theme to at least remove max-width
<Wanda[cis]> the experience of wide tables is currently Not Great
<Wanda[cis]> (I have a local css hack but I never got around to fixing it in the published docs)
<whitequark[cis]> oh, THAT's why it's unusable when published
<azonenberg> it might be a bit as i'm pushing on trying to get ngscopeclient v0.1 out the door by EOY but i have a lot of internal notes, programming algorithms at least verified on the 2c32a, and other stuff i can write up
<Wanda[cis]> yeah sorry >_>
<azonenberg> as well as some interesting notes on the internal structure of the ZIA/AIM
<Wanda[cis]> anyway I have to leave in like negative 20 minutes
<Wanda[cis]> see you later
<azonenberg> (and a working verilog emulation model of the 2c32a that implements everything except eeprom programming)
<azonenberg> and some quirks of the IOBs
<azonenberg> not sure if you want that shoved in your repo anywhere, but it exists somewhere and is BSD-3 licensed if you wanna make a separate repo or whatever
<azonenberg> even implements the JTAG, you can run it on an artix7 and hook its GPIOs up to iMPACT and it'll happily program a 2c32a jed to it
<azonenberg> I dont think i can contribute much to the other device families but definitely xc2c i can help with
<azonenberg> (the emulation model is parameterizable and could easily be extended with ZIA tables for larger devices but I only ever actually implemented the 32a codepath)
jn has joined #prjcombine
mupuf has joined #prjcombine
<mupuf> mwk: wow, you've been productive! Congrats!
<mupuf> How would nextpnr be able to make use of all this work? Is there an IR that can be used to document FPGAs?
<whitequark[cis]> there's himbaechel
ari has joined #prjcombine
<mupuf> whitequark[cis]: thanks, that's just what I was looking for :)
BluRaf has joined #prjcombine
<Wanda[cis]> alright
<Wanda[cis]> back
<Wanda[cis]> holy crap it's cold
<Wanda[cis]> <mupuf> "How would nextpnr be able to..." <- so this is kinda a complex question
<Wanda[cis]> first off, there's no way to do that with just an IR, you're going to need a bunch of target-specific code in the P&R tool
<Wanda[cis]> though hopefully not that much
<Wanda[cis]> second
<Wanda[cis]> a big goal of prjcombine is getting the chip databases to manageable size
<Wanda[cis]> which is... tricky
<Wanda[cis]> the largest devices are kind of huge
<Wanda[cis]> so the way prjcombine works is that the device geometry is specified as a very small "blueprint", which is expanded to a proper tile grid by target-specific code
<Wanda[cis]> which has been a reasonably successful approach, allowing me to fit all Xilinx devices up to ultrascale+ within 4.4MiB of compressed database total
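(Illustrative aside, not prjcombine's actual data model: a minimal Python sketch of the "small blueprint expanded to a tile grid" idea described above. The `Blueprint` class, its fields, and the tile kind names are all hypothetical; the real expansion is target-specific and far more involved.)

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Blueprint:
    """Hypothetical compact device description: one tile kind per column, plus a row count."""
    columns: tuple[str, ...]  # e.g. ("IO", "CLB", "BRAM", ...); names are made up
    rows: int


def expand(bp: Blueprint) -> list[list[str]]:
    """Expand the blueprint into a full rows x columns grid of tile kinds."""
    return [list(bp.columns) for _ in range(bp.rows)]


# The stored description is 6 column kinds and 1 integer;
# the expanded grid has rows * columns entries.
bp = Blueprint(columns=("IO", "CLB", "CLB", "BRAM", "CLB", "IO"), rows=4)
grid = expand(bp)
print(len(grid), len(grid[0]))
```

(The point of the sketch is only the size asymmetry: the stored description grows with the number of distinct column kinds, while the expanded grid grows with the device area.)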
<h_ro> What kind of information is included in "device geometry"?
<azonenberg> Wanda[cis]: i wish actual xilinx toolchains did that kind of thing lol
<azonenberg> i tried to do that in my xc2c code years ago
<Wanda[cis]> what kind of tiles every device is made of
<Wanda[cis]> what positions
<Wanda[cis]> what wires are in each kind of tile, what muxes
<Wanda[cis]> etc.
<h_ro> got it
<Wanda[cis]> unfortunately the 4.4MB figure doesn't include timing data, which is likely to be quite large and will probably make up the bulk of the final database
<Wanda[cis]> azonenberg: how did that work out?
<Wanda[cis]> I find CPLDs don't really benefit from deduplication that much
<azonenberg> Wanda[cis]: yeah the ZIA didn't dedup well but at least i only had to store the macrocell structures once
<azonenberg> it was actually procedural rather than data driven
<azonenberg> so i just had a loop making a bunch of macrocell objects etc
<Wanda[cis]> I mean, I still did that, but... well there's much less benefit in deduplicating a 512-macrocell CPLD than a million-LUT FPGA
<azonenberg> well yeah lol
<azonenberg> This reminds me i wanted to make nice APB-based VIO/ILA cores that could interface with an attached MCU and bridge to ngscopeclient
<azonenberg> the idea was that i could just have a SCPI interface on a uart, ethernet port, whatever
<azonenberg> and interface to one or more virtual instruments in the DUT
<azonenberg> without using any of xilinx's IPs
<azonenberg> the other thing i wanted to do differently was have symbol tables (at least optionally) baked into block ram
<azonenberg> or a flash chunk on the mcu or something
<h_ro> azonenberg: have you looked into chipscopy: https://github.com/Xilinx/chipscopy/tree/master AFAIK this only supports versal boards
<azonenberg> basically the equivalent of a xilinx .ltx but built into firmware so you can just take a device and talk to it without needing separate symbols
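(Illustrative aside: one way a probe symbol table could be packed into a small binary blob suitable for block RAM or MCU flash, and read back by a host tool. The record layout here, a count followed by name length, bit offset, width, and ASCII name, is purely an assumption made for this sketch; it is not the .ltx format and not anything azonenberg describes implementing.)

```python
import struct


def pack_symbols(symbols):
    """Pack (name, bit_offset, width) tuples into a little-endian blob."""
    blob = struct.pack("<H", len(symbols))  # 16-bit record count
    for name, offset, width in symbols:
        raw = name.encode("ascii")
        # 1-byte name length, 16-bit bit offset, 16-bit width, then the name
        blob += struct.pack("<BHH", len(raw), offset, width) + raw
    return blob


def unpack_symbols(blob):
    """Inverse of pack_symbols: recover the list of (name, bit_offset, width)."""
    (count,) = struct.unpack_from("<H", blob, 0)
    pos = 2
    out = []
    for _ in range(count):
        name_len, offset, width = struct.unpack_from("<BHH", blob, pos)
        pos += 5  # size of the fixed "<BHH" header
        name = blob[pos:pos + name_len].decode("ascii")
        pos += name_len
        out.append((name, offset, width))
    return out


table = [("axi_awvalid", 0, 1), ("axi_awaddr", 1, 32)]
blob = pack_symbols(table)
print(len(blob), unpack_symbols(blob) == table)
```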
<azonenberg> h_ro: Yes. i have no versal hardware, nor am i likely to ever get any any time soon
<azonenberg> so it's dead to me
<azonenberg> if and when they add support for 7 series or ultrascale+ i want to make a scopehal driver for the xilinx ILA/VIO using it
<azonenberg> either way i want a fully f/oss alternative
<Wanda[cis]> versal is deliciously insane hardware
<Wanda[cis]> perfect to self-harm with
<azonenberg> i was at a customer last week that had VMK108's *everywhere*
<azonenberg> they had one that was a glorified ethernet to [redacted] bridge
<azonenberg> i asked for an fpga devkit to generate a handful of simple digital signals as part of the test i was doing and they gave me another
<azonenberg> there must have been half a dozen VMK108s just on this one bench i was sitting at
<azonenberg> it took me most of a day just to set things up and figure out the stupid block design flow and infrastructure enough i could get a blinky working
<azonenberg> the versal (and zynq) chips and flows embody everything i think xilinx is doing wrong
<azonenberg> (extra funny because it seems to be their primary focus moving forward)
<azonenberg> i tried making a systemverilog top level design like i usually do, then it complained about me not having their stupid PS9 wrapper IP, which had to be a block design
<azonenberg> (of course it didnt tell me until i tried to make a bitstream)
<azonenberg> then it wouldnt let me put my sv design in a block design because that flow doesnt support sv
<azonenberg> so i had to make a v2005 wrapper around my sv code and put THAT in the bd
<Wanda[cis]> idk I think zynq is kinda cute
<Wanda[cis]> but then I never used it with the official tools
<azonenberg> lol
<azonenberg> my big problem is that the PS isn't isolated enough from the PL
<azonenberg> i like rtl-centric security architectures where you can build a root of trust out of gateware and guarantee that no matter what else happens, X invariant will hold
<azonenberg> If the PS can load a new bitstream on the PL without its consent at any time, it turns that on its head
<Wanda[cis]> oh.
<azonenberg> also i dont like how they have all the hardware AXI interfaces (and on their IPs) exposed as 50 separate discrete named ports. SV interfaces and the VHDL equivalent exist for a reason
<azonenberg> by all means have the primitive work that way under the hood since you need discrete wires
<Wanda[cis]> would you prefer virtex5-style FPGA-hard core combo?
<azonenberg> but then wrap it in interfaces
<azonenberg> Yes. I want an FPGA with a CPU just sitting somewhere like a block ram
<azonenberg> that does nothing until i ask it to
<azonenberg> and can't talk to anything i don't allow it to
<azonenberg> ideally it would have a couple of ~1 GHz M85 class processors and a few dozen m0+ class i can use as offloads for what would otherwise be an annoyingly large rtl state machine in some logic block
<azonenberg> a m0+ is like the size of a bram transistor wise
<azonenberg> put a column of m0's next to every 3rd bram column or so
<azonenberg> and give me pips to hook them to the adjacent bram as TCM and then provide an AHB interface out to fabric
<azonenberg> anyway my other problems with xc7z are smaller things, like the inability to boot the PL and PS independently from spi flash and the lack of hard TRNG + crypto IPs
<Wanda[cis]> I... hm
<Wanda[cis]> I'm not sure about that
<azonenberg> you can jtag the PL
<Wanda[cis]> but there's a distinct possibility that the PL actually can be independently booted from SPI
<azonenberg> but there is no documented way to boot the pl from spi flash
<azonenberg> key word documented
<Wanda[cis]> oh yes.
<azonenberg> there are some strap pins and bits of bootrom where i think it probably is possible
<Wanda[cis]> just well
<azonenberg> Just havent bothered to hack on it when xc7z isnt even that great CPU wise by modern standards
<Wanda[cis]> there are three RSVDVCC and RSVDGND pins that are suspiciously in the same area as M0-M2 on other virtex7 devices
<azonenberg> Yes. I noticed that too
<azonenberg> never bothered to tinker with them
<azonenberg> but i can guess
<azonenberg> how buggy the mode is is anybody's question
<Wanda[cis]> I wonder if it actually works
<Wanda[cis]> yeah
<azonenberg> In my own bigger projects lately I've been using a stm32h735 with the parallel memory controller connecting to an APB bridge on an adjacent 7 series or, soon, ultrascale+, FPGA
<Wanda[cis]> ... of course I don't have any board that'd be actually wired for it, so...
<azonenberg> the h735 CPU is coremark-wise competitive with an xc7z A9
<Wanda[cis]> I wonder if I can like
<azonenberg> and it has internal sram and flash and a ton of IO independent of the FPGA
<Wanda[cis]> INTEST the configuration logic
<azonenberg> Which is especially important for u+ because the low end parts like the au20p, ku3p, etc only come in ffg676 which is pretty light on IO (low end OG ultrascale were available in ffg1156)
<azonenberg> so being able to throw all my slow IOs on the stm32 and save the FPGA IOs for fast stuff is important
<azonenberg> I did have another cursed xc7z idea i've been meaning to play with, though
<azonenberg> i may have mentioned it to you, basically porting antikernel to the platform
<Wanda[cis]> using coresight for external context-switching?
<azonenberg> Yeah
<azonenberg> with each A9 locked up in a padded cell with access to a small chunk of ddr and a mailbox to the PL
<azonenberg> i've never had time to work on it but i have a zybo i bought years ago meaning to try it out
<h_ro> Wanda[cis]: Got toolchain file set up, but failing on dump_ise_parts step: https://bpa.st/EAFA Is this something you have encountered before?
<Wanda[cis]> this maaaay be the thing I'm working around with LD_PRELOAD
<h_ro> Oh lol
<Wanda[cis]> save to fixuseafterfree.c ; gcc -shared -fPIC fixuseafterfree.c -o fixuseafterfree.so ; add fixuseafterfree.so to LD_PRELOAD within the toolchain toml file
<Wanda[cis]> see if it fixes the problem
<h_ro> brb
<Wanda[cis]> it certainly seems like that could be it; the problem manifested with emitting (possibly non-ascii) junk in xdlrc files
<Wanda[cis]> (I think if you actually use the ancient RHEL version that ISE nominally requires, you don't hit this issue or something?)
<h_ro> It worked. Thanks for that fix.
<Wanda[cis]> ISE is a great piece of software.
<Wanda[cis]> grep prjcombine sources for your favorite obscenity to find more examples of greatness.
<Wanda[cis]> I particularly like this one
<h_ro> 0 == O of course
<h_ro> That is pretty hilarious actually
<Wanda[cis]> I have spent a while looking at the seemingly-nonsensical name mismatch error in the terminal before I realised.
<Wanda[cis]> I was ... rather unhappy about it
<azonenberg> Wanda[cis]: oh god lol
<azonenberg> that's awful
<Wanda[cis]> https://github.com/prjunnamed/prjcombine/blob/main/prjcombine_ise_hammer/src/main.rs#L145 oh right another one I was particularly annoyed about
<Wanda[cis]> there's also like
<Wanda[cis]> countless cases where ISE emits subtly broken bitstreams because of literal typos in their database files
<azonenberg> lool
<Wanda[cis]> which I'm fixing up manually
<azonenberg> oh here's a fun question
<azonenberg> did you ever figure out what the root cause of this is by looking at older/newer bitstreams?
<Wanda[cis]> oh, isn't that actually documented?
<Wanda[cis]> anyway it's pretty simple
<azonenberg> AFAIK it's just documented "9k bram init doesnt work" and "the fix is available in newer ISE but isn't compatible with encrypted bitstreams"
<Wanda[cis]> s6 has 16kbit blockrams, splittable into two 8kbit blockrams
<azonenberg> i'm not aware of any root cause explained
<Wanda[cis]> and, like on any other FPGA, uploading BRAM initial contents borrows one of the bram read/write ports
<Wanda[cis]> it turns out the borrowing logic does not take the split-8kbit configuration into account, and breaks when it is active
<Wanda[cis]> so it'll mangle the data somehow (I haven't checked how)
<Wanda[cis]> ISE normally works around it by not turning on the "split into 2×8kbit" bit on the first pass
<Wanda[cis]> and then overwriting the relevant configuration frames later, after the bram contents are uploaded
<Wanda[cis]> (this is why the workaround bitstreams are larger than normal)
<azonenberg> oh interesting
<Wanda[cis]> but this is not possible with encrypted bitstreams, because encrypted bitstreams for security reasons only allow you one upload pass, from start to finish, in order
<Wanda[cis]> (encrypted s6 bitstreams that is; v6/v7 encryption works differently)
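(Illustrative aside: a toy Python model of the write ordering described above. The frame addresses, the `split_8k` flag, and the `plan_writes` function are hypothetical names for this sketch; it only models the order of writes, not the real Spartan-6 bitstream format.)

```python
def plan_writes(config_frames, bram_frames, encrypted):
    """Return an ordered list of (kind, address, split_8k) writes.

    config_frames: {address: wants_split_8k}; bram_frames: list of addresses.
    """
    if encrypted:
        # Only one in-order pass is allowed, so frames cannot be patched later:
        # the split bit has to be written as-is before the BRAM contents.
        writes = [("config", a, s) for a, s in sorted(config_frames.items())]
        writes += [("bram", a, None) for a in bram_frames]
        return writes
    # Pass 1: write every config frame with the split bit forced off.
    writes = [("config", a, False) for a in sorted(config_frames)]
    # Upload BRAM contents while no block is in split mode.
    writes += [("bram", a, None) for a in bram_frames]
    # Pass 2: rewrite only the frames that actually need split mode.
    writes += [("config", a, True) for a, s in sorted(config_frames.items()) if s]
    return writes


frames = {0: False, 1: True}
plain = plan_writes(frames, [10], encrypted=False)
enc = plan_writes(frames, [10], encrypted=True)
print(len(plain), len(enc))
```

(In this model the unencrypted plan has one extra write per split-mode frame, which matches the observation that workaround bitstreams are larger than normal.)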
<azonenberg> yeah i really do wonder what happened to s6's dev team
<azonenberg> it was absolutely xilinx's windows ME
<Wanda[cis]> yes.
<azonenberg> even right down to "they axed the entire product line and rebuilt all the new low end products as cut down virtex6's"
<Wanda[cis]> they managed to kill off the spartan line
<Wanda[cis]> I mean
<Wanda[cis]> it wasn't really a separate product line for long
<azonenberg> There's a few products i've been very curious about
<azonenberg> The XC7A350T, for example
<Wanda[cis]> other than s6 and sooooomewhat s3, every other spartan has been rebadged something else
<Wanda[cis]> well, what about it?
<Wanda[cis]> it got cancelled
<Wanda[cis]> can't really tell you why
<azonenberg> yeah thats the thing
<azonenberg> It never launched. How far did it get? did it tape out? were there bugs? did they nuke it because they didn't want it cutting into kintex's market share?
<azonenberg> ditto the xc2c1024
<Wanda[cis]> idk
<Wanda[cis]> it does have an IDCODE
<azonenberg> Yeah and it's referenced in some older ISE versions, datasheets, etc
<azonenberg> it was to be in the same ffg1156 as the 7a200t but pin out the two NC banks
<azonenberg> there's info about how many luts, ram, etc it was supposed to have
<azonenberg> idk if there was ever a full P&R DB for it
<Wanda[cis]> there's still traces of it, and of many other canceled virtex7 devices, in ISE files
<Wanda[cis]> but not complete
<Wanda[cis]> the main P&R DBs got cut
<azonenberg> The one i'm most curious about, though
<azonenberg> are the BladeRunner and StarFighter CPLD platforms
<azonenberg> BladeRunner-I was XC2C afaik
<azonenberg> hence "xbr"
<azonenberg> BladeRunner-II/III were according to a leaked roadmap to have been 1.5 and 1.2V, likely on UMC 110 and 90nm based on some other sources
<azonenberg> and then StarFighter I/II were to be 1.8 / 1.5V descendants of XC9500
<Wanda[cis]> pfft.
<Wanda[cis]> they killed off the 2.5V XC9500 already
<azonenberg> There was also a patent i found about a weird FPGA-CPLD hybrid architecture
<azonenberg> with a 2D routing interconnect like an FPGA
<azonenberg> but with PLAs instead of LUTs as the basic logic primitive
<Wanda[cis]> ... I think lattice specialised in monstrosities like this
<azonenberg> it would essentially be a grid of XC2C FBs in a 2D array with FPGA-style pip routing between them
<azonenberg> i'm really curious how far the internal projects based on that got
<azonenberg> and why they got killed
<Wanda[cis]> shrug that I have no idea about
<azonenberg> ah ok same roadmap says BladeRunner-II was to be 150nm, as was StarFighter II
<azonenberg> launching in 2002-2003
<Wanda[cis]> heh
<Wanda[cis]> what I'm curious about is whether fpgacore ever like, existed
<azonenberg> yeah
<azonenberg> thats another one i was wondering about
<Wanda[cis]> that one actually has complete support in released ISE! just kind of... disabled
<Wanda[cis]> yeah I know it was an IBM thing
<Wanda[cis]> ... in exchange for the PPC cores or something
<Wanda[cis]> (btw that thing is essentially just a filleted out spartan3)
<azonenberg> yeah i figured
<azonenberg> it's probably an xc3s50 sans pad ring pretty much
<Wanda[cis]> sans pad ring, DCMs, and BRAMs
<azonenberg> oh, no brams? interesting
<Wanda[cis]> mhm
<Wanda[cis]> oh and also BUFGMUXes got downgraded to just plain BUFGs
<azonenberg> multipliers or no
<azonenberg> i guess the idea is you have all of the clock tree coming out of the parent asic
<azonenberg> so that makes some sense
<Wanda[cis]> no multipliers either (they're closely tied to BRAMs in s3)
<azonenberg> yeah thats why i asked
<azonenberg> probably dedicated clock inputs on the periphery not shared with the normal IOs?
<Wanda[cis]> nope
<Wanda[cis]> shared with IO just like on plain s3
<azonenberg> oh interesting
<Wanda[cis]> the IO cell is special
<Wanda[cis]> S3 has 3 IOBs per tile, of which (on average) 2.2 are actually connected to pads
<azonenberg> i would have figured you'd just have like a bunch of pips on the perimeter of the array that would just route signals that would normally go to IOBs to a fixed buffer and an interconnect track on metal 3 or so
<Wanda[cis]> fpgacore has 4 IPADs and 4 OPADs per IO tile
<azonenberg> and then the asic integrator would wire that to whatever
<Wanda[cis]> they have FFs and loopback mode
<azonenberg> interesting
<Wanda[cis]> the "pad" being likely a misnomer
<Wanda[cis]> the thing also has its own JTAG TAP
<Wanda[cis]> including BSCAN for the "pads"
<azonenberg> i really wanna get my hands on one of those chips lol
<Wanda[cis]> ... if they exist
<Wanda[cis]> if you're interested
<azonenberg> yeah i know
<azonenberg> also wow my xc3s50a substrate sample is FILTHY
<Wanda[cis]> here's the raw tile grids from ISE: https://0x04.net/~mwk/chtml/fpgacore/
<azonenberg> i need to clean that up and re-image it
<azonenberg> i never put it on pr0n because it was too ugly to share lol
<azonenberg> but it's just lots of dirt, there's only one chip and it's in a corner mostly out of the way