whitequark changed the topic of #amaranth-lang to: Amaranth hardware definition language · code https://github.com/amaranth-lang · logs https://libera.irclog.whitequark.org/amaranth-lang
lf has quit [Ping timeout: 248 seconds]
lf has joined #amaranth-lang
cesar has quit [Read error: Software caused connection abort]
cesar has joined #amaranth-lang
electronic_eel has quit [Read error: Software caused connection abort]
electronic_eel has joined #amaranth-lang
<_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://github.com/YoWASP/yosys/compare/805a79256e0e...b0d3656d4ea1
<_whitenotifier> [YoWASP/yosys] whitequark b0d3656 - Update dependencies.
ovf has quit [Read error: Software caused connection abort]
ovf has joined #amaranth-lang
electronic_eel has quit [Ping timeout: 260 seconds]
electronic_eel has joined #amaranth-lang
Degi_ has joined #amaranth-lang
Degi has quit [Ping timeout: 252 seconds]
Degi_ is now known as Degi
gruetzkopf has quit [Read error: Software caused connection abort]
gruetzkopf has joined #amaranth-lang
jevinskie[m] has quit [Read error: Software caused connection abort]
jevinskie[m] has joined #amaranth-lang
bob_twinkles_ has quit [Read error: Software caused connection abort]
bob_twinkles has joined #amaranth-lang
richardeoin has quit [Ping timeout: 248 seconds]
richardeoin has joined #amaranth-lang
richardeoin has quit [Ping timeout: 260 seconds]
richardeoin has joined #amaranth-lang
Mrmaxmeier has quit [Quit: The Lounge - https://thelounge.chat]
Mrmaxmeier has joined #amaranth-lang
<d1b2> <gatin00b> I'm currently working on a project which takes an impractical amount of time to convert to Verilog. Not being a Python programmer, but wanting to learn more about Amaranth internals, I'm looking for guidance on debugging the issue
Lord_Nightmare has quit [Quit: ZNC - http://znc.in]
Lord_Nightmare has joined #amaranth-lang
<tpw_rules> i feel like i remember that problem long ago with large memories and simulation
<d1b2> <gatin00b> Yeah, more likely due to the use of Array, but I don't really see a way around that. So to me, it's either we find a way to make Amaranth generate the RTLIL/Verilog in a practical time frame, either by finding better ways to code the design or despite the way it was coded... or I have to use something else. Since I've got time and energy, I'd rather help improve the Amaranth ecosystem than drop it because it's not mature enough for my use case.
<tpw_rules> i mean how big are your Arrays?
<tpw_rules> the quirk i recall i thought was fixed though
<tpw_rules> but maybe that was just for Memory and the Array has the same underlying issue
<d1b2> <gatin00b> Array((Signal(4) for idx in range(256)))
<tpw_rules> what are you doing with that? can you not just have a 1024 bit signal and slice it up later?
<tpw_rules> iirc there's a function to do that even
<d1b2> <gatin00b> It's used as a sort of async read memory. slice accesses are dynamic(?) and this is my first Amaranth design.
<d1b2> <gatin00b> Arrays were what was recommended for the use case
<tpw_rules> are the docstrings put online anywhere?
<d1b2> <gatin00b> Is that question directed at me?
<tpw_rules> not directly
<tpw_rules> yeah you can use Value.word_select(x, 4) to replace Value[x]
<tpw_rules> maybe that can be a custom subclass of Value. or you can wait a bit for someone else to weigh in
<d1b2> <gatin00b> Good, 'cause I don't know what that meant
<tpw_rules> i mean if you want to learn python and amaranth's structure that would be something good to chew on
<tpw_rules> subclass Signal, override __getitem__ to call the super's word_select
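A minimal sketch of the indexing that `Value.word_select(x, 4)` describes, modelled on a plain Python integer rather than on Amaranth itself. The `word_select` function, the constants, and the test data below are hypothetical illustrations; only the element width (4) and count (256) come from the log.

```python
WIDTH = 4      # bits per element, as in Signal(4)
COUNT = 256    # number of elements, as in range(256)

def word_select(value, index, width):
    """Integer model: return the index-th width-bit word of value."""
    mask = (1 << width) - 1
    return (value >> (index * width)) & mask

# Pack 256 nibbles into one 1024-bit integer, element 0 in the low bits.
elements = [(i * 7) % 16 for i in range(COUNT)]
packed = 0
for i, nibble in enumerate(elements):
    packed |= nibble << (i * WIDTH)

# The packed value fits in 1024 bits and every element can be recovered.
assert packed.bit_length() <= WIDTH * COUNT
assert all(word_select(packed, i, WIDTH) == elements[i] for i in range(COUNT))
```

This only shows that one wide value plus a word index carries the same information as 256 separate 4-bit signals; it says nothing about elaboration time.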
<d1b2> <gatin00b> Thanks, I'll wait a bit, but I'm willing to give it a try and see if it improves the situation.
<d1b2> <gatin00b> Are there any tools that would allow me to trace the process to get some insight of what is going on and why the process is slow?
<tpw_rules> there is a basic profiler built into python itself
<tpw_rules> and iirc the problem is in the python logic, not yosys
<d1b2> <gatin00b> I would think so, as the process is stuck in python land generating the rtlil, and not in the yosys processing
<d1b2> <gatin00b> Thanks for the help
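A sketch of using the standard-library profiler mentioned above. `slow_elaborate` is a hypothetical stand-in for whatever call converts the design; in a real run it would be replaced by the project's own conversion entry point, which the log does not show.

```python
import cProfile
import io
import pstats

def slow_elaborate(n):
    # Hypothetical stand-in for the slow conversion step being investigated.
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = slow_elaborate(100_000)
profiler.disable()

# Print the ten entries with the highest cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

The same profiler can also be run over a whole script from the command line with `python -m cProfile -s cumulative script.py`, where `script.py` is a placeholder name.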
<_whitenotifier> [YoWASP/yosys] whitequark pushed 1 commit to develop [+0/-0/±1] https://github.com/YoWASP/yosys/compare/b0d3656d4ea1...8b1bc9b30bed
<_whitenotifier> [YoWASP/yosys] whitequark 8b1bc9b - Update dependencies.
<_whitenotifier> [YoWASP/yosys] whitequark pushed 11 commits to release [+0/-0/±13] https://github.com/YoWASP/yosys/compare/32bb4208bd8e...8b1bc9b30bed
<_whitenotifier> [YoWASP/yosys] whitequark 11a2b93 - [skip ci] README: add a note on build and development platforms.
<_whitenotifier> [YoWASP/yosys] whitequark d19d108 - Update dependencies.
<_whitenotifier> [YoWASP/yosys] whitequark 1fba5d1 - Update dependencies.
<_whitenotifier> [YoWASP/yosys] ... and 8 more commits.
<d1b2> <gatin00b> And after hours of processing: amaranth._toolchain.yosys.YosysError: Could not find an acceptable Yosys binary. The `amaranth-yosys` PyPI package, if available for this platform, can be used as fallback
<d1b2> <gatin00b> That one is on me
<whitequark> that sounds like you have an exponential expansion somewhere where you're using Arrays, or where you are using part-select on LHS as a part of a complex expression
<whitequark> tpw_rules: you can't subclass Value
<whitequark> you can use ValueCastable though
<d1b2> <gatin00b> At least, now I'm sure it doesn't get stuck in an infinite loop
<d1b2> <gatin00b> I guess I am, how should I address that?
<tpw_rules> like i said i recall there being a problem in how Memory was simulated that led to this behavior. it wasn't hours though, especially for something only 1024 bits. but maybe it's being used a lot more
<whitequark> tpw_rules: gatin00b is generating Verilog, no?
<d1b2> <gatin00b> Yes
<d1b2> <gatin00b> It simulates within a sec, I think
<d1b2> <gatin00b> Or not much more
<whitequark> yeah, it doesn't have anything to do with Memory then
<d1b2> <gatin00b> Which is why it's surprising it's so long when converting to verilog
<tpw_rules> ok. well wq is undoubtedly the expert here :)
<whitequark> gatin00b: first off: this is not really an Amaranth issue. you would have the same problem in Verilog if you attempted to use the same approach that Array expands down into
<whitequark> it is just that in Verilog you would have to write down what sounds like several gigabytes of code to do so, which people are usually unwilling to do
<whitequark> could you show me your code, please?
<d1b2> <gatin00b> I've found some other issue in Amaranth's elaboration, which I've sidestepped, but haven't committed yet.
<whitequark> aha, I see
<whitequark> in smolcpu.py, every time you use data_memory[], it expands into a 256-input multiplexer
<whitequark> in addition, when you use several data_memory[] in the same m.d.sync statement, it effectively expands into a 16777216-input multiplexer
<d1b2> <gatin00b> a) That is pretty often b) why every time?
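The 16777216 figure equals 256 cubed, which is consistent with three independently indexed 256-entry lookups combined in one statement. That count of three is an inference from the arithmetic, not something stated in the log, and `combined_mux_inputs` is a hypothetical helper.

```python
ARRAY_LENGTH = 256  # elements in Array(Signal(4) for idx in range(256))

def combined_mux_inputs(lookups, length=ARRAY_LENGTH):
    # Each independently indexed lookup multiplies the number of
    # index combinations the combined selection has to cover.
    return length ** lookups

for lookups in (1, 2, 3):
    print(lookups, combined_mux_inputs(lookups))
```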
<tpw_rules> why did you decide to use Array instead of Memory, actually? if it's just because it's async, you can have async ports on a Memory just fine. but they won't be synthesizable to a real FPGA. not that a 16777216-input mux is either
<d1b2> <gatin00b> Okay... so steer clear of Array in my use case. Then I'm very curious how the Array class should be used
<d1b2> <gatin00b> Because it's effectively not a memory, and you're right in that it is more of a 1024-bit sliced register
<d1b2> <gatin00b> That's what was recommended as an alternative to bit slicing
<whitequark> why every time: Amaranth is a low-level language that makes no attempt to determine which branches of the `m.Switch(self.opcode)` or `m.Switch(self.operandX)` are mutually exclusive, or which indexed lookups of `data_memory[]` have the same index. so it assumes that all of the m.d.sync/m.d.comb/etc statements you have written can execute in parallel. so for every one, it emits a separate multiplexer, since they could all have different inputs
<whitequark> it's a bit like a powerful macroassembler, if you're familiar with those
<whitequark> and Array is very much like a macro
<d1b2> <gatin00b> OOOOOOOOOOOOOOOOOOOOOOOOOOOOOH
<d1b2> <gatin00b> Now it makes sense
<whitequark> I'm glad it does ^_^
<d1b2> <gatin00b> memory would really be impractical for my use case so bit slicing makes much more sense
<whitequark> you could try hoisting `data_memory[x]` for identical `x` into top-level signals (i.e. add some `m.d.comb += data_memory_x.eq(data_memory[x])`)
<tpw_rules> it's a memory though, right?
<d1b2> <gatin00b> It's complicated and ultimately not my design. Just my implementation.
<whitequark> and for things like `Cat(data_memory[0x10+(3*self.sp)-3],Cat(data_memory[0x10+(3*self.sp)-2],data_memory[0x10+(3*self.sp)-1]))`, try grabbing the entire three-nibble word as a whole and then operating on it as a bit vector
<d1b2> <gatin00b> Some part of that memory is used as GPRs
<whitequark> the second suggestion is actually the one that I think will make synthesis times on your design manageable
<whitequark> the first one, applied on top, will make it actually compact
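A plain-integer sketch of the second suggestion: reading the three-nibble word once gives the same bits as three separate nibble lookups concatenated with the first one in the low bits (which is how `Cat` orders its arguments). The helper names, the test data, and the range of `sp` values are assumptions made for illustration; only the `0x10+(3*sp)-3` addressing comes from the log.

```python
NIBBLE = 4  # bits per memory element

def nibble_at(packed, index):
    return (packed >> (index * NIBBLE)) & 0xF

def three_lookups(packed, sp):
    # Three independent lookups, concatenated with the first in the low bits.
    base = 0x10 + 3 * sp - 3
    return (nibble_at(packed, base)
            | nibble_at(packed, base + 1) << 4
            | nibble_at(packed, base + 2) << 8)

def one_word(packed, sp):
    # A single 12-bit read starting at the same nibble offset.
    base = 0x10 + 3 * sp - 3
    return (packed >> (base * NIBBLE)) & 0xFFF

# Arbitrary 256-nibble test pattern packed into one integer.
packed = 0
for i in range(256):
    packed |= ((i * 5 + 3) % 16) << (i * NIBBLE)

assert all(three_lookups(packed, sp) == one_word(packed, sp) for sp in range(1, 8))
```

In hardware terms the single read needs one selection on `sp` instead of three, which is the reduction being suggested.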
<whitequark> tpw_rules: it's an async memory with too many read ports. ultimately using an Array here carefully is not any worse than using a Memory with async read ports, but it will actually synthesize
<whitequark> it seems like it'd be pretty large when synthesized, but since this is an existing design, I'm not sure if much can be done about that
<d1b2> <gatin00b> Alright, I'll revise the code based on that and try to write some sort of blog post on Array and memory after.
<whitequark> at some point this should be the part of first-party documentation, but yeah
<d1b2> <gatin00b> Yeah, for hardware purposes, it's not a great design. The original design is meant to be emulated on a uC and this one is really a learning exercise more than anything
<whitequark> I'm not sure if it'll actually fit into the tapeout constraints, to be honest
<whitequark> from a glance at your code, you're going to have a dozen 256-input multiplexers
<whitequark> (at least)
<d1b2> <gatin00b> Yeah, I want to bring it there, but I'll work with what I can first, then check with you on how you'd best like it integrated into Amaranth's docs
<d1b2> <gatin00b> I know
<whitequark> oh, now that I've looked at the terms of the contest, I think I see where a misunderstanding may lie
<whitequark> you don't actually have to have this many read ports. you could write an FSM that sequences reads and implement it with a Memory with even a single read port (which is how CPUs are often implemented)
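A behavioural sketch of sequencing reads through a single port, written as ordinary Python rather than Amaranth. Each loop iteration stands for one clock cycle in which the state machine issues exactly one read; `sequenced_read`, the memory contents, and the base address are hypothetical.

```python
def sequenced_read(memory, base, count=3):
    """Model a single read port: one memory access per clock cycle."""
    state = 0    # FSM state, doubling as the offset being fetched
    word = 0
    cycles = 0
    while state < count:
        data = memory[base + state]      # the only read issued this cycle
        word |= data << (4 * state)      # assemble nibbles, first one lowest
        state += 1
        cycles += 1
    return word, cycles

memory = [(i * 5 + 3) % 16 for i in range(256)]
word, cycles = sequenced_read(memory, base=0x10)
print(hex(word), cycles)
```

The trade is more cycles per operation in exchange for a single read port.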
<tpw_rules> i don't think this is an existing design
<tpw_rules> there's nothing in the contest about clocks per instruction :)
<tpw_rules> could you just implement a PIC and submit that i wonder
<whitequark> yeah, I misunderstood what "an existing design" meant in this context
<whitequark> tpw_rules: I think a PIC might not fit either?
<d1b2> <gatin00b> Yes, but first goal was not to get a practical design for tapeout, just a working design in Amaranth. Figuring out how to make tapeout happen comes after
<whitequark> I see
<tpw_rules> have you done digital logic design before?
<d1b2> <gatin00b> Contest asked to replicate the badge. I decided to make it cycle accurate
<d1b2> <gatin00b> Plenty
<d1b2> <gatin00b> This is my first Amaranth design and will be my first ASIC design, but I've done plenty of FPGA designs
<adamgreig[m]> cycle accurate to the pic running it?
<d1b2> <gatin00b> I'm still on the fence about getting it to ASIC in the first place, but if it makes Matt happy and we get to prove another Amaranth design on silicon, then that's a good win
<d1b2> <gatin00b> My contract with AMD is going out in December, so learning Amaranth is more of skill development for a new job.
<d1b2> <gatin00b> No, cycle accurate to the badge as documented
<whitequark> oh, nice! best of luck on your endeavours
<d1b2> <gatin00b> Thanks. In the process, I hope to help improve Amaranth.
<d1b2> <gatin00b> By the way, really looking forward to your live stream
<whitequark> oh yeah! I need to set that up
<d1b2> <gatin00b> There's glimesh.tv which I think you might prefer over Twitch, for your use case, but that's really off topic
<whitequark> oh, that actually looks really handy, thank you!
<d1b2> <gatin00b> Yeah, I was reading the tweets and figured it'd be a good match, but I really don't want to derail anything.
nelgau has quit []
nelgau has joined #amaranth-lang
cr1901 has quit [Read error: Connection reset by peer]
cr1901 has joined #amaranth-lang