#prjcombine on 2025-02-27 — irc logs at libera.irclog.whitequark.org

2024-11-12 16:41 ChanServ changed the topic of #prjcombine to: repo: https://github.com/prjunnamed/prjcombine/ | docs: https://prjunnamed.github.io/prjcombine/ | logs: https://libera.irclog.whitequark.org/prjcombine/

06:02 <_whitenotifier-4> [prjcombine] wanda-phi opened issue #11: Add a crate for handling xilinx bit files - https://github.com/prjunnamed/prjcombine/issues/11

06:21 <_whitenotifier-4> [prjcombine] wanda-phi opened issue #12: create a single `prjcombine-cli` entry point for public-facing functionality - https://github.com/prjunnamed/prjcombine/issues/12

12:50 <_whitenotifier-4> [prjcombine] wanda-phi opened issue #13: Document and clean up interconnect database format and interface - https://github.com/prjunnamed/prjcombine/issues/13

14:41 <_whitenotifier-4> [prjunnamed/prjcombine] wanda-phi pushed 1 commit to main [+2/-0/±17] https://github.com/prjunnamed/prjcombine/compare/0636c650d18e...a090dce97157

14:41 <_whitenotifier-4> [prjunnamed/prjcombine] wanda-phi a090dce - jed: add a shared JESD3 parser/emitter.

14:41 <_whitenotifier-4> [prjcombine] wanda-phi closed issue #10: Deduplicate the way too many copies of JED parse/emit code. - https://github.com/prjunnamed/prjcombine/issues/10

14:46 <_whitenotifier-4> [prjcombine] wanda-phi edited issue #1: Be a good little sub~~module~~ for prjunnamed - https://github.com/prjunnamed/prjcombine/issues/1

18:13 <cr1901_> Why isn't the harvester crate called "sickle"?

18:13 cr1901_ is now known as cr1901

18:13 Wanda[cis] has joined #prjcombine

18:13 <Wanda[cis]> I don't think they make combine sickles

18:14 whitequark[cis] has joined #prjcombine

18:14 <whitequark[cis]> ... "combine harvester"

18:14 <whitequark[cis]> oh my god.

18:14 <whitequark[cis]> that's so upsetting.

18:14 <Wanda[cis]> that has always been the intended reading of the project name!

18:14 <cr1901> I thought it a play on communism- hammer and sickle

18:15 <Wanda[cis]> well. either that or as a HL2 reference, if you prefer to think of me as an evil alien hell-bent on assimilating everything around me.

18:15 <Wanda[cis]> which, fair

18:15 <cr1901> Never played it

18:24 <cr1901> Issue #1 says to document them eventually; what's the short version of "how is the RE job split between hammer and harvester?"

18:25 <Wanda[cis]> they're two completely separate approaches to RE

18:26 <Wanda[cis]> hammer relies entirely on controlling every single feature of the bitstream, down to individual routing pips

18:27 <Wanda[cis]> it's the first one I designed, 5 years ago or so (a few rewrites of this project ago), and it is what the ISE and XACT reversing code in prjcombine is based on

18:31 <Wanda[cis]> unfortunately, it has two fundamental flaws: first, it is entirely incapable of dealing with toolchains that simply don't allow this kind of control (or where it's prohibitively hard); second, it is often very tricky to design hammer samples that pinpoint the exact thing you're trying to reverse without dragging a bunch of other stuff into the bitstream diff; this is particularly true for toolchains with strong required DRC (XACT

18:31 <Wanda[cis]> already gave me a lot of trouble here; Vivado ... I believe would be theoretically possible to handle, but would be even worse to deal with)

18:32 <Wanda[cis]> so, I decided to design an alternate approach, capable of reversing more targets, and the result is called harvester

18:33 <Wanda[cis]> it relies on generating mostly-random crap, letting vendor P&R do whatever it wants with it, and then just using the resulting placement and routing information to correlate bits within the bitstreams

18:33 <Wanda[cis]> the only requirement on the toolchain becomes that you have to be able to extract the final routing from it, you don't have to control it
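
A minimal sketch of the correlation idea described above, not the actual prjcombine-harvester algorithm. It assumes each sample is a pair of sets (the routing features the vendor tool reported, and the bits set in the resulting bitstream); the function name `correlate` and the feature names are hypothetical.

```python
def correlate(samples):
    """Attribute bitstream bits to features, given uncontrolled samples.

    samples: list of (features, bits) pairs, where `features` is the set of
    routing/placement features the vendor tool reported for that run and
    `bits` is the set of bit positions set in the resulting bitstream.

    A bit is attributed to a feature if it is set in every sample that uses
    the feature and in no sample that lacks it. (Simplifying assumption:
    bits are only ever set because some feature is present.)
    """
    all_features = set().union(*(feats for feats, _ in samples))
    result = {}
    for feat in sorted(all_features):
        with_feat = [set(bits) for feats, bits in samples if feat in feats]
        without_feat = [bits for feats, bits in samples if feat not in feats]
        # Bits common to every sample that contains the feature...
        candidates = set.intersection(*with_feat)
        # ...minus anything that also shows up when the feature is absent.
        for bits in without_feat:
            candidates -= bits
        result[feat] = candidates
    return result


# Hypothetical samples: we did not choose which pips got used,
# we only observed what the router produced.
samples = [
    ({"pipA", "pipB"}, {1, 2, 5}),
    ({"pipA"}, {1}),
    ({"pipB", "pipC"}, {2, 5, 7}),
    ({"pipC"}, {7}),
]
result = correlate(samples)
print(result)
```

Features that never appear apart from each other in any sample cannot be separated this way, which matches the point that you have to wait for the router to produce the combinations you need.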

18:34 <cr1901> >often very tricky to design hammer samples <-- For trellis, we manually edited NeoCAD files, which Diamond happily ingested and emitted a bitstream. The main issue was figuring out what the valid NeoCAD fields were :P

18:34 <cr1901> ^That sounds like hammer

18:35 <Wanda[cis]> it also has its downsides. it's slower, because you don't get to just directly ask for the bits you want, you have to wait until they fall out from the router by chance.

18:35 <Wanda[cis]> and you sometimes have to manually nudge it into giving you what you want anyway

18:36 <Wanda[cis]> but after getting really frustrated with XC4000 interconnect reversing, I believe this should be the way to go for further RE work going forward

18:37 <cr1901> I'm poking around and reading the src to see how MachXO2 RE mk II would work

18:37 <Wanda[cis]> the harvester approach is still basically in testing, by its first target (prjcombine-siliconblue), which was explicitly picked as something easy that won't require me dealing with ridiculously large bitstreams

18:38 <Wanda[cis]> the core works perfectly, but I still need to work on diagnostics when something goes wrong

18:43 <cr1901> Maybe gatecat or you know other ways, but getting the routing info in plaintext from Diamond requires a NeoCAD parser. It was presumably easier to write NeoCAD by hand until Diamond didn't bitch, and then correlate bitstreams for each pip,non-routing config,LUT vals, etc

18:51 <Wanda[cis]> what exactly is a "NeoCAD parser"? NeoCAD is a toolchain, not an interchange format

18:52 <cr1901> "The file format that original NeoCAD toolchain that Diamond is derived from used"

18:52 <Wanda[cis]> that describes dozens of formats

18:53 <cr1901> https://github.com/YosysHQ/prjtrellis/blob/master/fuzzers/machxo2/005-reg_config/reg.ncl <-- these things

18:53 <Wanda[cis]> ah, ncl.

18:53 <Wanda[cis]> so, yes

18:53 <cr1901> The NC is for NeoCAD, and there's an extra "L"

18:53 <cr1901> so I call it "NeoCAD" for short

18:53 <Wanda[cis]> the plan for prjcombine-lattice is exactly to parse ncl and use harvester.

18:56 <whitequark[cis]> <cr1901> "Maybe gatecat or you know..." <- ki have a neocad netlist parser on my hard drive

18:56 <whitequark[cis]> s/ki/i/

18:57 <whitequark[cis]> it supports specifically Diamond netlists

18:57 <whitequark[cis]> it's written in C++ however but it has all the RE bits you could wish for

18:58 <cr1901> Ahh, well at least that work doesn't need to be done then :P

19:01 <cr1901> whitequark[cis]: Can you post a link so I could look at it?

19:03 <whitequark[cis]> no, for... reasons

19:03 <whitequark[cis]> I can however DM you an archive of it

19:08 <cr1901> Ahhhh... I understand.

19:09 <whitequark[cis]> the reasons are entirely personal, the code itself isn't private. long story

19:17 <cr1901> Yea, no worries, I left my lone q in DM

21:07 <Wanda[cis]> alright, so back to this discussion

21:08 <Wanda[cis]> <cr1901> ">often very tricky to design..." <- yup that sounds like the hammer approach, though perhaps without all the massive batching stuff that makes hammer fast

21:09 <Wanda[cis]> so I was originally planning to write prjcombine-diamond-hammer as the second target right after ISE; there's even some initial work in the repo for it

21:10 <Wanda[cis]> this was motivated exactly by Diamond 1) allowing this kind of control via ncl, 2) allowing you to just dump the complete interconnect database via the tcl api just like ISE does, 3) being otherwise similar to ISE due to shared neocad ancestry

21:11 <Wanda[cis]> point 2) is also kind of important for prjcombine-hammer: you don't just need to have control, you also have to know what to aim for

21:12 <Wanda[cis]> but then I have decided that doing this as the second target because it's easy is the obviously wrong decision

21:14 <Wanda[cis]> instead, I should make sure that prjcombine is capable of working with diverse toolchains before cementing the existing design even harder than it already is (by 77kLOC of prjcombine-ise-hammer)

21:17 <Wanda[cis]> siliconblue is the perfect target. it fails all of criteria 1-3, while also being small and already mostly reversed, so I could just look up what the results should be and focus on developing the actual RE approach

21:18 <Wanda[cis]> (well, and also prjcombine-xact-hammer became the actual second FPGA target after I needed a distraction while sleep deprived and on train)

21:21 <Wanda[cis]> other targets that have been considered for developing prjcombine-harvester were Vivado (which was another obvious next target, given where prjcombine's supported device list currently ends...) and Quartus.

21:23 <Wanda[cis]> I rejected Vivado because it still would be "too easy" by allowing you to dump the interconnect database (prjcombine already has the full interconnect database for ultrascale, in fact) and being similar to ISE, and also because developing core RE methodology while dealing with a bitgen that takes minutes and produces hundreds of megabits of bitstreams would be incredible suffering

21:26 <Wanda[cis]> and I rejected Quartus because, once started, that could easily tie me up in an incredibly long and complex RE campaign with a toolchain and vendor that are batshit insane

21:29 <Wanda[cis]> I'll deal with it, one day. but it'll take an intentional decision to deal with Quartus and all the devices it supports, it's not something to be casually done on the way to a more interesting goal

21:41 <Wanda[cis]> and now that prjcombine-harvester exists, I believe it should be used for all new targets by default, even when you have the level of control that ncl allows. that is, unless something weird comes up that requires a different approach.

21:42 <Wanda[cis]> there is, by the way, a particular reason why I am somewhat distrustful of XDL- or ncl-based stuff

21:43 <Wanda[cis]> you use xdl/ncl so you can have low-level control over the netlist, yes?

21:43 <Wanda[cis]> the primitive attributes correspond more closely to the bitstream than verilog does, after all

21:43 <Wanda[cis]> but that is dangerous.

21:44 <Wanda[cis]> you run a serious risk of missing something about how verilog corresponds to xdl/ncl

21:46 <Wanda[cis]> an approach with Verilog at the frontend lets you verify the whole thing end-to-end, something that xdl/ncl-based reversing is not really capable of

21:47 <Wanda[cis]> I was worried enough about this problem that prjcombine has a completely separate step where it fuzzes the verilog to xdl transformation and verifies the results are as expected. and this step has indeed found a few surprises.

21:50 <Wanda[cis]> this is actually one of the core problems faced by prjcombine: it is incredibly easy to miss little details when you're operating on this scale of scope and automation. this is why I try to design every step with cross-checking in mind, as much as is practical.

21:54 <Wanda[cis]> so one obvious reason why prjcombine-hammer massively batches samples together into shared bitgen runs is performance. however, the other, equally important reason is that this allows me to detect when two features turn out to be not as independent as I had assumed. the hammer core explicitly includes redundancy and randomization to make it more likely that it'll blow up in this case.
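
A rough sketch of why redundancy plus randomization exposes features that are not actually independent. This is not how the hammer core is implemented and it does not model batching into shared bitgen runs; `find_entangled`, `fake_bitgen`, and the feature names are all hypothetical stand-ins.

```python
import random


def find_entangled(features, bitgen, rounds=64, seed=0):
    """Flag features whose bit diff changes depending on what else is enabled.

    bitgen: callable taking a frozenset of enabled features and returning
    the set of bits in the resulting bitstream (a stand-in for a real tool).
    Each feature is measured `rounds` times alongside a random set of
    companions; an independent feature must yield the same diff every time.
    """
    rng = random.Random(seed)
    entangled = set()
    for feat in features:
        others = [f for f in features if f != feat]
        reference = None
        for _ in range(rounds):
            companions = frozenset(f for f in others if rng.random() < 0.5)
            # Bits that change when only `feat` is toggled.
            diff = bitgen(companions | {feat}) ^ bitgen(companions)
            if reference is None:
                reference = diff
            elif diff != reference:
                entangled.add(feat)
                break
    return entangled


# Toy bitgen with a hidden interaction: bit 9 is set only when
# both A and B are enabled, so neither can be reversed in isolation.
BITS = {"A": {0}, "B": {1}, "C": {2}}


def fake_bitgen(enabled):
    bits = set()
    for f in enabled:
        bits |= BITS[f]
    if "A" in enabled and "B" in enabled:
        bits.add(9)
    return frozenset(bits)


print(find_entangled(["A", "B", "C"], fake_bitgen))
```

Measuring each feature only once, or always with the same companions, would silently attribute bit 9 to whichever feature happened to be tested next to the other.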

21:59 <Wanda[cis]> this is also why I don't consider any database included in prjcombine to be at all reliable until it has been properly verified in-hardware and the results documented. unfortunately, I am still not quite sure of the details of how to do that verification, particularly at scale.

22:02 <Wanda[cis]> I may be saying that "I have extracted all bitstream bits that can be extracted from ISE", and it is the literal truth for the most part, but that's... not quite a gold standard

22:04 <Wanda[cis]> there's a bunch of bits that I found completely incomprehensible even though I understand their verilog-to-xdl and xdl-to-bitstream mapping behavior, and I basically decided that the only way to deal with them is to just hook up a scope to the device and start poking at things

22:04 <Wanda[cis]> the usual suspects being I/O tiles. or whatever the hell is going on with Spartan 6 clock distribution.

22:12 <cr1901> Gimme a few mins to read please :)

22:14 * cr1901 is eating, so can't focus on reading :(

22:27 <cr1901> >an approach with Verilog at the frontend lets you verify the whole thing end-to-end <-- so, FWIW, trellis _does_ use Verilog for fuzzing I/O standards and a few other things. The minitests directory was for doing Verilog to ncl tests to get a feel for what template Verilog/NCL file will extract the most information 1/2

22:28 <cr1901> Unfortunately, as you can prob guess, I/O standards are enough of a clusterf*** already and they are somewhat handwaved in Trellis (as in, I only have coarse "set these bits for this exact I/O standard". I don't know how those bits are split into further groups)

22:29 <Wanda[cis]> I have actually managed to split them for the most part for ISE, mostly with incredible and manually-applied violence

22:30 <cr1901> If harvester can be made to work on Diamond, that's great! I'm not attached to any particular way of fuzzing things. I'm mostly thinking out loud today :D

22:30 <Wanda[cis]> I/O tile reversing code tends to be the absolute worst

22:32 <Wanda[cis]> (said violence has involved a scope on a few occasions)

22:32 <cr1901> ("Given that I've done MachXO2 REing, and I understand mostly* how Trellis works, how can I apply those skills to help combine when the time comes, _without_ getting too deep into the weeds and burning out again?")

22:32 <Wanda[cis]> "again", eh

22:34 <cr1901> Doing the nextpnr port took everything I had. I desperately wish it didn't take that much effort/out of my comfort zone, but I wanted nothing to do with FPGA code for a decent period after the port was done. That's burnout, I think.

22:35 <Wanda[cis]> oh hey that's pretty much exactly what happened when I was doing a spartan6 nextpnr port

22:36 <Wanda[cis]> I have since concluded that this means the proper solution to the problem is to just reverse things, skipping the "write the nextpnr backend afterwards" part

22:37 <Wanda[cis]> worked well so far

22:37 <cr1901> When I go back to MachXO2, it helps me to focus on the things I enjoy most on the chip. Mostly playing with the UFM (User Flash Memory)

22:37 <Wanda[cis]> ... I wonder how bad prjunnamed-pnr is going to be

22:38 <Wanda[cis]> well

22:39 <Wanda[cis]> I guess we'll see what rule #0 is made of

22:41 <cr1901> Hey, I'm taking a look at the code and chatting when my bandwidth permits :P. That's progress :D!

22:42 <cr1901> (Re: mostly*- I never had to touch the actual 'find the differing bits' logic in Trellis. That stuff was rock-solid for getting MVP. So I never really looked at it. I'm sure I'd be fine after a few hours.)

22:43 <whitequark[cis]> <Wanda[cis]> "... I wonder how bad prjunnamed..." <- i think it'll be fine, probably

22:44 <Wanda[cis]> probably

22:44 <Wanda[cis]> we just have a goal of designing a P&R flow competitive with vivado

22:44 <Wanda[cis]> no big deal

22:45 <whitequark[cis]> yeah!

22:54 <Wanda[cis]> now here's where I'd say something rude, except I'm still not over that part where we just sat down for a month and wrote a synthesis tool

22:57 <whitequark[cis]> nice

22:57 <whitequark[cis]> i think we'll hit scaling issues but i also think we'll be in a better position to resolve them than nextpnr

22:58 <Wanda[cis]> as in, when we actually hook up larger devices?

22:58 <whitequark[cis]> yeah

22:59 <Wanda[cis]> mmm, I'm not even worried about scaling issues, I'm worried about handling sparse interconnect

22:59 <whitequark[cis]> i think you've already effectively started working on prjunnamed's P&R (although not `prjunnamed-pnr`) because the choices in database format are pretty significant

22:59 <whitequark[cis]> Wanda[cis]: right

22:59 <whitequark[cis]> I mean I think of that as a sort of scaling issue because bigger FPGA families have sparser interconnect

23:00 <Wanda[cis]> I guess

23:00 <whitequark[cis]> I am quite curious to see how this works out

23:00 <Wanda[cis]> same

23:01 <whitequark[cis]> either way I don't feel like there are unresolvable problems there since FPGA P&R tools, empirically, exist

23:02 <whitequark[cis]> not only that but there exists a vast and diverse amount of them, notwithstanding how many of them are just NeoCAD again and how many of them are terrible

23:02 <Wanda[cis]> whitequark[cis]: it didn't really feel like that at the time; I've always been optimizing for deduplication and understandability of the resulting database, not any sort of fitness for P&R purposes

23:02 <whitequark[cis]> but that's a part of it!

23:02 <whitequark[cis]> I think one of our bigger risks is that our poor database design (well, if it would be poor) crushes our ambition

23:03 <whitequark[cis]> basically nobody seriously makes tools that are portable across FPGA families

23:04 <Wanda[cis]> nobody ever really did that, yeah

23:04 <Wanda[cis]> neocad maybe?

23:05 <whitequark[cis]> i wonder if they did or if it was more like nextpnr where you'd use separate backends

23:06 <Wanda[cis]> shrug

23:06 <Wanda[cis]> there are per-family .sos

23:06 <Wanda[cis]> but then, unclear how much there is in them, I've never bothered looking

23:07 <Wanda[cis]> it's not like we'll avoid target-specific code either

23:40 <whitequark[cis]> right