#prjcombine on 2024-11-29 — irc logs at libera.irclog.whitequark.org

2024-11-12 16:41 ChanServ changed the topic of #prjcombine to: repo: https://github.com/prjunnamed/prjcombine/ | docs: https://prjunnamed.github.io/prjcombine/ | logs: https://libera.irclog.whitequark.org/prjcombine/

00:06 <h_ro> Wanda[cis]: never used XACT before, so I am interested in hearing what you came up with and how your reversing methodology for XACT differs from ISE

00:07 <Wanda[cis]> methodology is the same, but I structured the code a lot better

00:07 <Wanda[cis]> well, at least the same for the actual reversing; I improved geometry extraction methodology a bit

00:11 <Wanda[cis]> the ISE bitstream reversing code uses Rust macros to construct testcases, with some weird sexpr-like syntax to describe the attributes to be tested and the prerequisites. this was a major mistake, as it centralizes pretty much every kind of attribute I could possibly want to test in all supported devices in a single unwieldy macro, and then in a single unwieldy enum which is matched over by incredibly unwieldy test-batcher code

00:12 <Wanda[cis]> when I started, I thought I'd only need like a handful of those, but... well reality turned out to be complex

00:13 <Wanda[cis]> the testcase generation code is split in a funny way: first, I generate a declarative list of things I'd like to test within every tile type

00:13 <Wanda[cis]> this is the majority of the code

00:13 <h_ro> Are you referring to the fuzzing macros in ise_hammer?

00:14 <Wanda[cis]> second, a test batcher goes through this list and assigns every testcase to an actual test batch, and to an actual tile of the desired type

00:14 <Wanda[cis]> yes

00:14 <Wanda[cis]> they are absolutely horrible.

00:14 <Wanda[cis]> so. the second part is much shorter than the first, and I hoped it'd be simple

00:15 <Wanda[cis]> unfortunately turns out that sometimes assigning the testcase to an actual tile of the device involves complex logic
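
The two-stage split described above (a declarative list of testcases per tile type, then a batcher that assigns each testcase to a batch and a concrete tile) might be sketched roughly as follows. This is only an illustration, not the code in ise_hammer: `TestReq`, `Batch`, `assign`, and the greedy first-fit policy are all hypothetical, and the real second stage involves far more device-specific logic.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Stage 1 output (hypothetical): "I want to test `name` on some tile of `tile_type`".
#[derive(Debug, Clone)]
struct TestReq {
    name: &'static str,
    tile_type: &'static str,
}

/// One test batch: each tile is used by at most one testcase per batch.
#[derive(Debug, Default)]
struct Batch {
    used: BTreeSet<(u32, u32)>,
    assigned: Vec<(&'static str, (u32, u32))>,
}

/// Stage 2 (assumed greedy first-fit): put each request into the first batch
/// that still has a free tile of the requested type, else open a new batch.
fn assign(
    reqs: &[TestReq],
    grid: &BTreeMap<&'static str, Vec<(u32, u32)>>,
) -> Vec<Batch> {
    let mut batches: Vec<Batch> = Vec::new();
    for req in reqs {
        // Skip requests for tile types this device does not have.
        let Some(tiles) = grid.get(req.tile_type) else { continue };
        if tiles.is_empty() {
            continue;
        }
        let mut placed = false;
        for batch in batches.iter_mut() {
            if let Some(&t) = tiles.iter().find(|t| !batch.used.contains(*t)) {
                batch.used.insert(t);
                batch.assigned.push((req.name, t));
                placed = true;
                break;
            }
        }
        if !placed {
            let mut batch = Batch::default();
            batch.used.insert(tiles[0]);
            batch.assigned.push((req.name, tiles[0]));
            batches.push(batch);
        }
    }
    batches
}

fn main() {
    let mut grid = BTreeMap::new();
    grid.insert("CLB", vec![(0, 0), (1, 0)]);
    let reqs = [
        TestReq { name: "attr_a", tile_type: "CLB" },
        TestReq { name: "attr_b", tile_type: "CLB" },
        TestReq { name: "attr_c", tile_type: "CLB" },
    ];
    for (i, batch) in assign(&reqs, &grid).iter().enumerate() {
        println!("batch {i}: {:?}", batch.assigned);
    }
}
```

The "complex logic" mentioned above would live in the tile-selection step, which here is reduced to "any unused tile of the right type".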

00:16 <Wanda[cis]> the macros are bad, but the really bad part is this monster file: https://github.com/prjunnamed/prjcombine/blob/main/prjcombine_ise_hammer/src/fgen.rs

00:16 <Wanda[cis]> it contains pretty much all of the second stage

00:16 <Wanda[cis]> it has highly generic code mixed in with very device-specific code

00:17 <Wanda[cis]> there are also problems with using macros in the first place. the immediately noticeable one is that compiling the crate is ridiculously slow

00:18 <Wanda[cis]> it's also annoying when writing the code because rustfmt either gives up on the macro invocations, or in rare cases formats them in a nonsensical way

00:19 <h_ro> yeah that looks cumbersome

00:19 <Wanda[cis]> the corresponding code in xact_hammer is structured differently; the macros are replaced by a builder pattern, the TileKV/BelKV/... enums are replaced with dynamic dispatch (boxed dyn trait objects)

00:20 <Wanda[cis]> the "generic" attributes stay in generic code and get used most of the time, and when I need weird device-specific processing in the second stage, I just define a new impl of the trait right next to whatever first-stage code uses it
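
A minimal sketch of the structure described above, assuming a builder that collects boxed trait objects instead of enum variants. None of these names (`Prop`, `BelAttr`, `OnlyEvenRows`, `TestBuilder`, `TileLoc`) come from xact_hammer; they are placeholders to show how a device-specific impl can sit next to its only user without touching a central enum.

```rust
use std::fmt::Debug;

/// Hypothetical location of a concrete tile chosen by the second stage.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TileLoc {
    col: u32,
    row: u32,
}

/// Hypothetical second-stage property: given a tile, either produce a
/// key/value setting for the test design, or reject the tile.
trait Prop: Debug {
    fn apply(&self, loc: TileLoc) -> Option<(String, String)>;
}

/// A "generic" property that would live in shared code.
#[derive(Debug)]
struct BelAttr {
    bel: &'static str,
    attr: &'static str,
    value: &'static str,
}

impl Prop for BelAttr {
    fn apply(&self, loc: TileLoc) -> Option<(String, String)> {
        let key = format!("X{}Y{}.{}.{}", loc.col, loc.row, self.bel, self.attr);
        Some((key, self.value.to_string()))
    }
}

/// A device-specific property, defined right next to the code that needs it.
#[derive(Debug)]
struct OnlyEvenRows;

impl Prop for OnlyEvenRows {
    fn apply(&self, loc: TileLoc) -> Option<(String, String)> {
        if loc.row % 2 == 0 {
            Some(("ROW_PARITY".to_string(), "EVEN".to_string()))
        } else {
            None
        }
    }
}

#[derive(Debug)]
struct TestCase {
    name: String,
    props: Vec<Box<dyn Prop>>,
}

impl TestCase {
    /// Returns the settings only if every property accepts the tile.
    fn place(&self, loc: TileLoc) -> Option<Vec<(String, String)>> {
        self.props.iter().map(|p| p.apply(loc)).collect()
    }
}

struct TestBuilder {
    case: TestCase,
}

impl TestBuilder {
    fn new(name: &str) -> Self {
        TestBuilder { case: TestCase { name: name.to_string(), props: Vec::new() } }
    }
    /// Accepts any property, including ones defined far away from this builder.
    fn prop(mut self, p: impl Prop + 'static) -> Self {
        self.case.props.push(Box::new(p));
        self
    }
    /// Convenience wrapper for the common generic case.
    fn bel_attr(self, bel: &'static str, attr: &'static str, value: &'static str) -> Self {
        self.prop(BelAttr { bel, attr, value })
    }
    fn build(self) -> TestCase {
        self.case
    }
}

fn main() {
    let case = TestBuilder::new("example")
        .bel_attr("BEL", "ATTR", "VAL")
        .prop(OnlyEvenRows)
        .build();
    println!("{}: {:?}", case.name, case.place(TileLoc { col: 3, row: 4 }));
    println!("{}: {:?}", case.name, case.place(TileLoc { col: 3, row: 5 }));
}
```

The trade-off in this sketch is the usual one: dynamic dispatch gives up exhaustive matching, but adding a new kind of property no longer means editing one shared enum and one shared match.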

00:21 ari has quit [Ping timeout: 246 seconds]

00:22 sdomi has quit [Ping timeout: 272 seconds]

00:22 sdomi has joined #prjcombine

00:22 Maja has quit [Ping timeout: 245 seconds]

00:22 Maja has joined #prjcombine

00:23 <Wanda[cis]> then there's geometry extraction

00:23 <Wanda[cis]> which would be xrd2geom for ISE

00:25 <Wanda[cis]> this stage produces several results

00:25 <Wanda[cis]> 1) an interconnect tile database (list of tile types, list of wires in all tiles, list of connections between wires, list of bels and assignment of their pins to wires)

00:27 <Wanda[cis]> 2) a naming database (whatever is necessary to map prjcombine wire/bel/... IDs to toolchain identifiers; for ISE, it's wire names; for XACT, it's (X, Y) coordinates of every connection point)

00:28 <Wanda[cis]> 3) device database (enough information about each device of FPGA family to reconstruct a complete tile grid and whatever else we need)

00:28 <Wanda[cis]> 4) package database (pin <-> I/O pad mapping)
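
As a rough picture of the four outputs listed above, the data could be shaped something like this. Every type and field name below is hypothetical and much simpler than anything in prjcombine; the sketch only mirrors the list: interconnect tiles, naming, devices, packages.

```rust
use std::collections::BTreeMap;

/// 1) Interconnect tile database (hypothetical shape).
#[derive(Debug, Default)]
struct IntDb {
    wires: Vec<String>,
    tile_types: BTreeMap<String, TileType>,
}

#[derive(Debug, Default)]
struct TileType {
    /// Connections as (from, to) indices into `IntDb::wires`.
    pips: Vec<(usize, usize)>,
    /// Bel name -> (pin name -> wire index).
    bels: BTreeMap<String, BTreeMap<String, usize>>,
}

/// 2) Naming database: how the toolchain refers to an internal ID.
#[derive(Debug, PartialEq, Eq)]
enum ToolName {
    /// ISE-style: a wire name.
    WireName(String),
    /// XACT-style: coordinates of a connection point.
    Coord { x: u32, y: u32 },
}

#[derive(Debug, Default)]
struct NamingDb {
    /// (tile type, wire index) -> toolchain identifier.
    names: BTreeMap<(String, usize), ToolName>,
}

/// 3) Device database: just enough to rebuild the tile grid.
#[derive(Debug)]
struct Device {
    name: String,
    columns: u32,
    rows: u32,
}

impl Device {
    /// Reconstructs the coordinates of every tile in a plain rectangular grid.
    fn tile_coords(&self) -> Vec<(u32, u32)> {
        (0..self.rows)
            .flat_map(|r| (0..self.columns).map(move |c| (c, r)))
            .collect()
    }
}

/// 4) Package database: package pin -> I/O pad.
#[derive(Debug, Default)]
struct Package {
    pins: BTreeMap<String, String>,
}

fn main() {
    let mut intdb = IntDb::default();
    intdb.wires.push("OUT".to_string());
    intdb.wires.push("IN".to_string());
    let mut tile = TileType::default();
    tile.pips.push((0, 1));
    tile.bels.entry("LUT".to_string()).or_default().insert("I1".to_string(), 1);
    intdb.tile_types.insert("CLB".to_string(), tile);

    let mut naming = NamingDb::default();
    naming.names.insert(("CLB".to_string(), 0), ToolName::WireName("CLB_OUT".to_string()));
    naming.names.insert(("CLB".to_string(), 1), ToolName::Coord { x: 12, y: 34 });

    let device = Device { name: "example".to_string(), columns: 2, rows: 2 };
    let mut package = Package::default();
    package.pins.insert("P1".to_string(), "IOB_X0Y0".to_string());

    println!("{intdb:?}\n{naming:?}\n{package:?}");
    println!("{} has {} tiles", device.name, device.tile_coords().len());
}
```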

00:29 <Wanda[cis]> the ISE code is a bit of a mess; for 3., each family has a bit of code that scans the toolchain grid and extracts just enough information

00:30 <Wanda[cis]> for 1. and 2.... well this is the really messy part

00:31 <Wanda[cis]> first I manually produce the list of wires and their geometry for each device kind, plus enough information to figure out the mapping to ISE wire names

00:32 <Wanda[cis]> then I go over all tile types, look for a specimen within the device, and extract the pips and bels

00:33 <Wanda[cis]> using the ugliest piece of code in the entire project, the prjcombine_rdintb crate

00:35 <Wanda[cis]> it's an incredibly delicate piece of code that automatically extracts pips based on a ton of knobs that can be tweaked by the device-specific code before invoking it; it kinda suffers from the same problem as fgen.rs, ie. containing a lot of device-specific hacks that can sometimes interact with each other in unpredictable ways, and I feel really uneasy whenever I have to tweak something there

00:35 ari has joined #prjcombine

00:35 <Wanda[cis]> the other problem is that it operates on the source database only locally

00:36 <Wanda[cis]> it does some verification, but it can easily fail to notice more complex problems

00:37 <Wanda[cis]> so the final stage uses the information previously extracted to do a full verification pass of the resulting database against the source database

00:37 <Wanda[cis]> that's what prjcombine_rdverify and prjcombine_<device>_rdverify are for

00:37 <Wanda[cis]> this, of course, duplicates a bunch of work

00:38 <h_ro> Just curious: what would you say sitePIPs (and site wires) are? Are these just bits you enable in the bitstream for the corresponding tile/site?

00:39 <Wanda[cis]> the XACT database extractor does a single extraction+verification pass

00:39 <Wanda[cis]> not sure what you mean by site wires

00:39 <Wanda[cis]> or site pips for that matter

00:40 <Wanda[cis]> are these the vivado terms?

00:40 <h_ro> Routing BELs on CLB for instance can select between an inverted signal or non-inverting signal. So I was wondering how these map to the bitstream

00:41 <Wanda[cis]> what device is this about?

00:41 <h_ro> 7 series; likely vivado terms

00:42 <Wanda[cis]> that's strange; the virtex7 CLB contains like only a single programmable inverter for the clock signal, I think?

00:42 <h_ro> ultrascale has a few more I believe

00:43 <Wanda[cis]> yeah ultrascale has a bunch more of that

00:43 <h_ro> DSP48 has several for some inputs

00:43 <Wanda[cis]> also older devices have more

00:43 <Wanda[cis]> v5/v6/v7/s6 have mostly just clocks

00:43 <Wanda[cis]> anyway, well

00:44 <Wanda[cis]> it's just a programmable inverter on the clock input of the CLB

00:44 <Wanda[cis]> the clock signal from the interconnect MUX is essentially XORed with a bitstream bit before being routed to the FFs within CLB; that's it
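
The behavior described above is small enough to model directly. In this sketch `inv_bit` stands for the bitstream bit and the function name is made up for illustration:

```rust
/// Models a programmable inverter: the incoming signal is XORed with a
/// configuration bit, so `inv_bit = true` inverts and `false` passes through.
fn programmable_inverter(signal: bool, inv_bit: bool) -> bool {
    signal ^ inv_bit
}

fn main() {
    for inv_bit in [false, true] {
        for clk in [false, true] {
            println!(
                "inv_bit={inv_bit} clk={clk} -> ff_clk={}",
                programmable_inverter(clk, inv_bit)
            );
        }
    }
}
```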

00:44 <h_ro> I see

00:45 <Wanda[cis]> you can find the relevant bit in prjcombine database as SLICE*.INV.CLK on the CLB* tiles

00:45 <Wanda[cis]> in general I use `INV.<pin>` on bels for invertible inputs, as a convention

00:46 <Wanda[cis]> that's for virtex7 and similar devices at least, where the inversion is conceptually part of the bel

00:47 <Wanda[cis]> on the other hand, for virtex2/virtex4, the inversion is actually part of the interconnect tile (ie. the relevant inverter bit is always at the same location for a given interconnect mux, regardless of what bel it feeds, if any)

00:49 <Wanda[cis]> (this is actually observable if you make use of the interconnect test circuitry to U-turn the input mux output back into the interconnect; on virtex4, you'll see the inverted signal, while on virtex5 and up you'll see the signal before inversion)

00:50 <Wanda[cis]> as for site pips and site wires

00:50 <Wanda[cis]> vivado uses them to name internal wires and muxes within the bel (ie. internal stuff not exposed to general interconnect)

00:51 <Wanda[cis]> prjcombine just considers them part of the bel, and whatever bits control the muxes get represented as bel attributes

00:51 <Wanda[cis]> (this is consistent with how ISE represents them)

00:53 <Wanda[cis]> https://prjunnamed.github.io/prjcombine/xilinx/virtex7/clb.html#bits-xc7v-CLBLL-SLICE0:AFFMUX this, I believe, counts as a bunch of site pips between a bunch of site wires within vivado; it's a mux that controls what goes into one of the FFs within a CLB slice

00:53 <h_ro> ISE PlanAhead does have the relevant tcl command `get_site_pips`

00:53 <Wanda[cis]> PlanAhead is basically an early and incredibly buggy vivado.

00:54 <h_ro> true

00:54 <h_ro> Alright, thanks for answering

00:54 <Wanda[cis]> I tried extracting geometry data from it instead of dealing with xdl, but quickly abandoned this plan when it became clear that it's just going to segfault if you so much as look at it wrong

00:55 <Wanda[cis]> there's btw another related concept

00:55 <Wanda[cis]> to the site pip

00:55 <Wanda[cis]> I don't remember what it was called in vivado

00:55 <Wanda[cis]> ISE calls them "routethrough pips"

00:55 <h_ro> I think I have heard of it in RapidWright

00:55 <Wanda[cis]> it's basically a fake pip that's in the routing database, which is actually realized by programming a bel in a particular way

00:56 <Wanda[cis]> eg. whenever you have a LUT, you also have a routethrough pip that goes from its I1 input to its comb output

00:56 <Wanda[cis]> (or from any other input for that matter)

00:57 <Wanda[cis]> it's not a real separate thing; if you attempt to use it, ISE/vivado will just realize it by configuring a passthrough truth table into the LUT
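
To make "passthrough truth table" concrete, here is a sketch that computes one for a k-input LUT. It assumes the common convention that bit `a` of the table is the output for input address `a`, with LUT input `i` (0-based) driving address bit `i`; how pin names such as I1 map to address bits differs between devices and tools, so that mapping is left out. Function names are hypothetical.

```rust
/// Truth table for a `k`-input LUT whose output simply follows input `input`
/// (0-based). Bit `a` of the result is the LUT output for address `a`.
fn passthrough_init(k: u32, input: u32) -> u64 {
    assert!((1..=6).contains(&k) && input < k);
    let mut init = 0u64;
    for addr in 0..(1u64 << k) {
        // Output is 1 exactly when the selected input is 1.
        if (addr >> input) & 1 == 1 {
            init |= 1 << addr;
        }
    }
    init
}

/// Evaluates a LUT with the given truth table at the given address.
fn lut_eval(init: u64, addr: u64) -> bool {
    (init >> addr) & 1 == 1
}

fn main() {
    for input in 0..4 {
        println!("LUT4 passthrough of input {input}: {:#06x}", passthrough_init(4, input));
    }
}
```

With such a table loaded, the LUT ignores its other inputs, which is why the tools can present it as if it were a pip from that input to the LUT output.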

00:57 <Wanda[cis]> prjcombine just strips them out, they're not represented in the database in any way

00:59 <Wanda[cis]> btw both this and the decision on how to represent site pips are done this way because I already have a rough plan on how a mostly-generic P&R (and, in particular, the packing stage) interface should be implemented

01:01 <Wanda[cis]> see, site pips should not be treated the same as proper interconnect pips because of what they imply for your routing stage

01:02 <Wanda[cis]> the site pip is kinda the same thing as a regular pip in its function, but from routing PoV it implies a heavy constraint: the two things you connect must be within the same block

01:03 <Wanda[cis]> ie. either the two primitives you're connecting are already in the right position and you can and should trivially use the site pip, or they are not and the site pip is useless

01:04 <Wanda[cis]> ie. site pips actually come into play during packing/placement stage, not routing

01:05 <h_ro> I'd love to pick your brain on PNR stages, but I will need to head out soon.

01:06 <Wanda[cis]> that's why I represent them the exact same way as bel attributes (which can become packing constraints just like site pips)

04:55 <Wanda[cis]> ... XC2000 done

07:31 <mupuf> Wanda[cis]: jeez, you are on fire! Have you slept at all?

07:31 <mupuf> or did you just leave your fuzzer run while taking a cat nap?

14:18 h_ro has quit [Ping timeout: 248 seconds]

14:19 h_ro has joined #prjcombine

14:41 <Wanda[cis]> I just got a proper 8h of sleep, yes

15:45 <whitequark[cis]> nice

22:25 anuejn has joined #prjcombine