<re_irc>
<@adamgreig:matrix.org> mostly I think there's one set of c-m peripherals per core
<re_irc>
<@adamgreig:matrix.org> and each core sees them at the same address
<re_irc>
<@adamgreig:matrix.org> but some are shared by both cores iirc
<re_irc>
<@adamgreig:matrix.org> and anyway it's not clear how to even conceptualise multicore execution in bare-metal rust right now
<re_irc>
<@9names:matrix.org> 9names: Can confirm that pinning signal-hook to 0.3.10 builds fine, so it's a new regression.
<re_irc>
<@dirbaio:matrix.org> adamgreig: wot
<re_irc>
<@dirbaio:matrix.org> isn't everything at 0xExxx_xxxx core-local?
<re_irc>
<@adamgreig:matrix.org> hmm, you're probably right
<re_irc>
<@adamgreig:matrix.org> I didn't remember if stuff like MPU was shared but I guess it wouldn't make sense
<re_irc>
<@dirbaio:matrix.org> it's not
<re_irc>
<@dirbaio:matrix.org> the MPU is a check the core does before the access goes out to the bus
<re_irc>
<@adamgreig:matrix.org> so what you're saying is the cortex-m peripherals need to be given out once per core but the device peripherals need to be given out once per device to be safe...
<re_irc>
<@adamgreig:matrix.org> ...but we might have both cores running code from one elf, or might have one elf per core on the same device....
<re_irc>
<@adamgreig:matrix.org> ....so it's a completely lost cause to try and provide singletons for the device peripherals
<re_irc>
<@dirbaio:matrix.org> yeah it's mega cursed
<re_irc>
<@dirbaio:matrix.org> but again it's up to the definition of "program"
<re_irc>
<@dirbaio:matrix.org> if both cores run the same "program" then you have two "mains"
<re_irc>
<@grantm11235:matrix.org> I'm starting to think that the best option might just be for `take` to be unsafe
<re_irc>
<@dirbaio:matrix.org> so for device peripherals, you have one `Peripherals::take()` that must give out only one instance per program
<re_irc>
<@dirbaio:matrix.org> so it needs to use atomics, or a multicore-sound critical section
<re_irc>
<@dirbaio:matrix.org> and for core peripherals well.... you don't
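A minimal sketch of the "only one instance per program" idea for device peripherals, using an atomic flag. The `Peripherals` struct here is a hypothetical stand-in, not any real PAC's type, and the sketch assumes the target has atomic read-modify-write instructions:

```rust
use core::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for a PAC's device peripherals struct.
pub struct Peripherals {
    _private: (),
}

/// Set once the peripherals have been handed out.
static TAKEN: AtomicBool = AtomicBool::new(false);

impl Peripherals {
    /// Returns `Some` exactly once per program, whichever core calls first.
    pub fn take() -> Option<Self> {
        // `swap` is a single atomic read-modify-write, so two cores racing
        // here cannot both observe `false`.
        if TAKEN.swap(true, Ordering::AcqRel) {
            None
        } else {
            Some(Peripherals { _private: () })
        }
    }
}
```

On cores without atomic read-modify-write support (thumbv6m, which includes the rp2040's Cortex-M0+), `swap` is not available, so the flag would have to be guarded by a multicore-sound critical section instead.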
<re_irc>
<@grantm11235:matrix.org> What about device peripherals that are core-local?
<re_irc>
<@adamgreig:matrix.org> you might not even have two mains if the second core starts off and gets turned on by passing it a function to run or something
<re_irc>
<@dirbaio:matrix.org> yeah
<re_irc>
<@dirbaio:matrix.org> the rp-hal folks are doing it that way, it's super cool
<re_irc>
<@dirbaio:matrix.org> you can start core1 from core0, giving it a closure
<re_irc>
<@adamgreig:matrix.org> yea, it seems nice, like starting a thread on a hosted platform
<re_irc>
<@dirbaio:matrix.org> so you can "pass" some singletons from core0 to core1
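A host-side sketch of the ownership pattern being described, using `std::thread` as the analogue of the second core. This is not the rp-hal API; `Led` and `run_on_other_core` are hypothetical names used only for illustration:

```rust
use std::thread;

/// Hypothetical singleton standing in for a peripheral owned by core0.
pub struct Led {
    pub toggles: u32,
}

impl Led {
    pub fn toggle(&mut self) {
        self.toggles += 1;
    }
}

/// Host analogue of "start core1 with a closure": the closure takes
/// ownership of `led`, so the spawning side can no longer touch it.
pub fn run_on_other_core(mut led: Led) -> Led {
    let handle = thread::spawn(move || {
        for _ in 0..3 {
            led.toggle();
        }
        led // hand the singleton back when the "core" is done
    });
    handle.join().expect("worker panicked")
}
```

Because the closure is a `move` closure, the compiler rejects any later use of `led` on the spawning side, which is the property that makes passing singletons between cores attractive.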
<re_irc>
<@adamgreig:matrix.org> that's what I can't remember if the h7 can do
<re_irc>
<@adamgreig:matrix.org> you can start the cm4 from the cm7 and v.v., but can you start it at some arbitrary address?
<re_irc>
<@dirbaio:matrix.org> even if the hw can't do it, you can make a runtime that can
<re_irc>
<@adamgreig:matrix.org> I guess you can do it the same way, right
<re_irc>
<@adamgreig:matrix.org> yea
<re_irc>
<@adamgreig:matrix.org> the rp2040 only does it because the rom for the second core is a spinloop waiting on a semaphore right?
<re_irc>
<@grantm11235:matrix.org> dirbaio:matrix.org: I think that's how ISRs should work too
<re_irc>
<@dirbaio:matrix.org> in the rp2040 both cores boot, but the ROM traps core1 in a loop until core0 sends it a start addr
<re_irc>
<@adamgreig:matrix.org> nice to have that in the rom though
<re_irc>
<@adamgreig:matrix.org> on the h7 it would depend on what the option bytes were set to and you'd need to dump a little startup code into some random far away flash address that's the cm4 boot
<re_irc>
<@dirbaio:matrix.org> hehe
<re_irc>
<@adamgreig:matrix.org> I guess the HAL could set the flash option bytes but it's a bit surprising
<re_irc>
<@adamgreig:matrix.org> I think the cm7 can kill/reset the cm4 so I guess it could be done
<re_irc>
<@dirbaio:matrix.org> can you have them both boot from the same vectors then __start queries "which core am I"?
<re_irc>
<@adamgreig:matrix.org> in pre_init so they don't fight over initialising sram I guess
<re_irc>
<@dirbaio:matrix.org> yeah, or custom rt
<re_irc>
<@adamgreig:matrix.org> seems a bit unnecessary to have a custom rt just for that
<re_irc>
<@dirbaio:matrix.org> because they'll fight over the stack
<re_irc>
<@dirbaio:matrix.org> yeah :S
<re_irc>
<@adamgreig:matrix.org> would be nice if c-m-rt was a bit more flexible about extending it to prevent people needing to customise it
<re_irc>
<@adamgreig:matrix.org> wish the h7 rm had a bit more detail about the dual core setup, it's not really explained anywhere i've seen
<re_irc>
<@adamgreig:matrix.org> I couldn't say how you even query which core you are
<re_irc>
<@adamgreig:matrix.org> I guess read your own coresight ID tables, heh
<re_irc>
<@dirbaio:matrix.org> the rp2040 has a reg you can read from 0xDxxxx :)
<re_irc>
<@adamgreig:matrix.org> and a datasheet that actually tells you all about the two cores
<re_irc>
<@dirbaio:matrix.org> lol
<re_irc>
<@adamgreig:matrix.org> I guess at least on the h7 they're two different types of core
<re_irc>
<@adamgreig:matrix.org> so you can just read the "am i a cortex-m7" or "am i a cortex-m4"
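A sketch of that check, assuming the value comes from the core-local CPUID register in the System Control Block (address 0xE000_ED00), whose PARTNO field is 0xC24 on a Cortex-M4 and 0xC27 on a Cortex-M7. `CoreKind` and `core_kind` are hypothetical names; the decoding is kept separate from the register read so it can be checked on a host:

```rust
/// Which kind of core a CPUID value identifies.
#[derive(Debug, PartialEq, Eq)]
pub enum CoreKind {
    CortexM4,
    CortexM7,
    Other(u16),
}

/// Decode the PARTNO field (bits 15:4) of a CPUID register value.
pub fn core_kind(cpuid: u32) -> CoreKind {
    match ((cpuid >> 4) & 0xFFF) as u16 {
        0xC24 => CoreKind::CortexM4,
        0xC27 => CoreKind::CortexM7,
        other => CoreKind::Other(other),
    }
}
```

On the target, `cpuid` would be a volatile read of the register at 0xE000_ED00; since that address is core-local, each core reads its own value.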
<re_irc>
<@adamgreig:matrix.org> adamgreig: fine, I looked and found the 36 page app note on the dual core architecture, the 24 page app note on the inter-core communication, and 45-page note on debugging dual-core stuff
<re_irc>
<@grantm11235:matrix.org> What if each hal implemented its own runtime, using `cortex-m-rt` as a "runtime-building toolbox"
<re_irc>
<@dirbaio:matrix.org> sometimes it's the user that wants custom stuff though
<re_irc>
<@dirbaio:matrix.org> like using the HAL under some RTOS
<re_irc>
<@grantm11235:matrix.org> The hal could make the rt feature optional
<re_irc>
<@grantm11235:matrix.org> Then the user could use `cortex-m-rt` to make their own custom rt
<re_irc>
<@adamgreig:matrix.org> some hals more or less already do this, right? they define a default memory.x and depend on cortex-m-rt
<re_irc>
<@adamgreig:matrix.org> user still has to pass the linker flag though
<re_irc>
<@adamgreig:matrix.org> but anyway does it buy you anything? the HAL would decide it's safe if it's providing the reset vector or something?
<re_irc>
<@grantm11235:matrix.org> The hal could take a `fn(hal::Peripherals) -> !` as its main program
<re_irc>
<@grantm11235:matrix.org> And then start a second core the same way that rpi does it
<re_irc>
<@firefrommoonlight:matrix.org> Re some of the dual core chat in the `stm32-rs` room and the discussion here: I'm curious how to set up smart abstractions for dual core. I may dive into a project that does this, and will hopefully work through some ideas. It appears that `cortex-m-rt`, in conjunction with specifying the correct addresses in `memory.x`, will let you flash either core, with standalone programs
<re_irc>
<@firefrommoonlight:matrix.org> This isn't really ideal from an application perspective (For the uses I'm envisioning), compared to a single program
<re_irc>
<@firefrommoonlight:matrix.org> So, the open question is, how to set up abstractions so the programs "talk" to each other in an organized, easy-to-code way
<re_irc>
<@adamgreig:matrix.org> it shouldn't be too awful to put together something that will boot both cores from a single elf
<re_irc>
<@adamgreig:matrix.org> be a bit hacky at first though
<re_irc>
<@firefrommoonlight:matrix.org> And avoid race conditions between shared memory, periphs etc
<re_irc>
<@adamgreig:matrix.org> it really might be best to generate two elfs at first
<re_irc>
<@adamgreig:matrix.org> save a lot of questions around how initialisation and statics and vector tables and stuff will work
<re_irc>
<@adamgreig:matrix.org> and probably not much harder to coordinate sharing access (maybe even easier...), but flashing and stuff is more annoying
<re_irc>
<@adamgreig:matrix.org> check out the rp2040 hal, it has some multi-core support already
<re_irc>
<@firefrommoonlight:matrix.org> Good point. I think initial moves might be #1 Setting up a clear code style (With published examples for other people who try!). #2 Set up HAL code for Semaphore periphs etc (eg HSEM on STM32)
<re_irc>
<@firefrommoonlight:matrix.org> So you have a high-level API for the semaphores
<re_irc>
<@adamgreig:matrix.org> adamgreig: for example: add a new flash2 memory to memory.x at 0810_0000, add a new section there for a vector table, construct that table in your code and put its link_section to that new section, and have it point to the address of a function you want the cm4 to run
<re_irc>
<@adamgreig:matrix.org> in principle at that point cm4 will boot and start running that function
<re_irc>
<@adamgreig:matrix.org> (and crucially won't try and do bss/static initialisation)
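A rough sketch of the Rust side of those steps, not the code that was actually shared. It assumes a hypothetical section name `.cm4_vector_table` that memory.x maps into a FLASH2 region at 0x0810_0000 and wraps in KEEP(...), and it uses a placeholder stack address that must be checked against the reference manual:

```rust
/// Minimal two-entry vector table: initial stack pointer, then reset handler.
#[repr(C)]
pub struct Cm4Vectors {
    pub initial_sp: usize,
    pub reset: unsafe extern "C" fn() -> !,
}

/// ASSUMPTION: placeholder stack top for the second core. Take the real
/// value from the reference manual or from a linker-script symbol.
pub const CM4_STACK_TOP: usize = 0x3004_0000;

/// Entry point for the second core. Nothing runs before it on that core,
/// so it must not expect any per-core initialisation to have happened.
pub unsafe extern "C" fn cm4_main() -> ! {
    loop {
        core::hint::spin_loop();
    }
}

/// The table itself. The section name is hypothetical and is only applied
/// on ARM targets so the sketch still builds on a host.
#[cfg_attr(target_arch = "arm", link_section = ".cm4_vector_table")]
#[used]
pub static CM4_VECTORS: Cm4Vectors = Cm4Vectors {
    initial_sp: CM4_STACK_TOP,
    reset: cm4_main,
};
```

`#[used]` keeps the compiler from discarding the static, but the linker can still drop the section unless the linker script keeps it explicitly.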
<re_irc>
<@dirbaio:matrix.org> did anyone else's rust-analyzer just break?
<re_irc>
<@dirbaio:matrix.org> today's nightly is borked
<re_irc>
<@adamgreig:matrix.org> I wonder if the h7 sets the cm4 vtor to match its boot address or if you have to do that yourself
<re_irc>
<@firefrommoonlight:matrix.org> LMK if there's anything I can do to test that. I'm not too good with linking/memory.x!
<re_irc>
<@adamgreig:matrix.org> perhaps it just maps the boot address to the cm4's address 0, so vtor 0 still works fine
<re_irc>
<@adamgreig:matrix.org> do you have the h745zi-q nucleo?
<re_irc>
<@firefrommoonlight:matrix.org> Yes
<re_irc>
<@firefrommoonlight:matrix.org> (Until I desolder the chip to put on a custom board lol)
<re_irc>
<@adamgreig:matrix.org> lol
<re_irc>
<@adamgreig:matrix.org> ugh good luck, i hate de/resoldering lqfp144
<re_irc>
<@firefrommoonlight:matrix.org> I never have, and struggle enough with qfp48, but I'm out of options
<re_irc>
<@adamgreig:matrix.org> i had to do a few after a bom incident at work and i didn't enjoy it
<re_irc>
<@adamgreig:matrix.org> lcsc have some 743 lqfp100 in stock pretty often but no help if you want that dual core goodness
<re_irc>
<@firefrommoonlight:matrix.org> Oh! I'm very interested
<re_irc>
<@firefrommoonlight:matrix.org> Showing out of stock now, but will F5 occasionally
<re_irc>
<@firefrommoonlight:matrix.org> Also nice re that single program!
<re_irc>
<@adamgreig:matrix.org> (plus note that FLASH2 is defined in my memory.x MEMORY section, and I've cheekily hardcoded the top of SRAM2 for the second core's initial stack pointer, but you could easily get that from the linker)
<re_irc>
<@firefrommoonlight:matrix.org> I'm getting `error: could not execute process `target\thumbv7em-none-eabihf\release\h7dc` (never executed)` when clone + run release
<re_irc>
<@dirbaio:matrix.org> missing runner in .cargo/config
<re_irc>
<@adamgreig:matrix.org> I've given you basically a loaded shotgun pointed directly at your feet so it's worth being a little cautious about whether this is better than two separate programs and stuff
<re_irc>
<@adamgreig:matrix.org> but it does seem kinda fun
<re_irc>
<@adamgreig:matrix.org> probe-rs seems to handle flashing the two disjoint parts of flash well too, nice
<re_irc>
<@firefrommoonlight:matrix.org> Hah. I'm still clueless with linking and everything dual core beyond "hello world", but we'll see
<re_irc>
<@adamgreig:matrix.org> the setup in memory.x is pretty reasonable
<re_irc>
<@adamgreig:matrix.org> most of the time it took to get this working was remembering that you need KEEP(...) in the linker script to stop it throwing everything away
<re_irc>
<@firefrommoonlight:matrix.org> Idea is M7 does program logic, UI, realtime audio processing. M4 receives a copy of the microphone data from M7, uses it to adjust the filter coeffs the M7 core uses
<re_irc>
<@firefrommoonlight:matrix.org> So the realtime audio isn't interrupted by the sound-analysis and filter adjustment
<re_irc>
<@firefrommoonlight:matrix.org> So, what needs to pass is mic data, filter coefficients, and some signals like "Filter updated"
<re_irc>
<@adamgreig:matrix.org> seems like you should be able to cook something up
<re_irc>
<@firefrommoonlight:matrix.org> Probably a few ways to do it
<re_irc>
<@adamgreig:matrix.org> hmm, you can really tell the cm7 is dual-issue
<re_irc>
<@adamgreig:matrix.org> the cm4 and cm7 are both calling delay(16_000_000) and in theory the cm7 runs at twice the clock of the cm4
<re_irc>
<@adamgreig:matrix.org> but it's definitely blinking more than twice as fast
<re_irc>
<@adamgreig:matrix.org> HSI is 64MHz, gosh
<re_irc>
<@firefrommoonlight:matrix.org> Same
<re_irc>
<@firefrommoonlight:matrix.org> So, I made a wrong assumption. It's only twice the speed if hclk scaler is 2
<re_irc>
<@firefrommoonlight:matrix.org> I'm not sure what the default is; might be 8
<re_irc>
<@firefrommoonlight:matrix.org> It was twice on my board only because I had it configured to 2
<re_irc>
<@adamgreig:matrix.org> and actually hpre is /1 by default
<re_irc>
<@adamgreig:matrix.org> and d1cpre=/1 too, so by default both cores run at 64MHz
<re_irc>
<@adamgreig:matrix.org> the reason they blink at different speeds is that the cortex-m7 core can retire more instructions per cycle than the cortex-m4 core
<re_irc>
<@firefrommoonlight:matrix.org> Oh! So it's a quirk of asm::delay?
<re_irc>
<@adamgreig:matrix.org> but I'm a bit surprised it turns out to only be due to that and by default they're the same clock speed, very fun
<re_irc>
<@adamgreig:matrix.org> well, sort of
<re_irc>
<@adamgreig:matrix.org> asm::delay promises to delay for "at least this many clock cycles"
<re_irc>
<@adamgreig:matrix.org> but it might be more, and indeed on cortex-m0/1/3/4 it will be like twice as much
<re_irc>
<@firefrommoonlight:matrix.org> So, I tested with A: PLL with SYSTICK1 = 400MHz and SYSTICK2 = 200MHz. I configured the `cortex-m` crate delay, telling it I'm using 400MHz systick. Got 1Hz flash on the M7 LED, and 1/2Hz flash on M4 LED as expected with those settings
<re_irc>
<@firefrommoonlight:matrix.org> I also tested with default config, (Presumably HSI with no PLL, both at 64MHz as you said), and didn't time it, but got much slower flashing
<re_irc>
<@adamgreig:matrix.org> systick delay does not depend on core type, unlike cortex_m::asm::delay
<re_irc>
<@firefrommoonlight:matrix.org> That explains it entirely then
<re_irc>
<@adamgreig:matrix.org> the systick delay counts systicks, which are either the cpu clock or cpu clock /8, but is accurate
<re_irc>
<@adamgreig:matrix.org> cortex_m::asm::delay just counts from 0 to the number of cycles/2, which takes about 3-4 clock cycles on a cortex-m4 but only 2 on a cortex-m7
<re_irc>
<@adamgreig:matrix.org> but it means on that sample code I posted, the difference in flash rates between the two cores is entirely down to the m7 doing more counting per clock cycle, which is pretty wild
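A back-of-the-envelope version of that arithmetic, using only the rough figures from this discussion (a loop of cycles/2 iterations, about 2 clock cycles per iteration on the cortex-m7 and 3 to 4 on the cortex-m4, both cores at 64MHz). `busy_delay_seconds` is a hypothetical helper, and real timings also depend on things like flash wait states:

```rust
/// Approximate wall-clock time of a busy loop that runs `cycles / 2`
/// iterations, each costing `cycles_per_iter` core clock cycles.
pub fn busy_delay_seconds(cycles: u32, cycles_per_iter: u32, core_hz: u32) -> f64 {
    let iterations = (cycles / 2) as f64;
    iterations * cycles_per_iter as f64 / core_hz as f64
}
```

With `delay(16_000_000)` at 64MHz this gives 0.25s for 2 cycles per iteration and 0.375s to 0.5s for 3 to 4 cycles per iteration, so the two LEDs blink at different rates even though the clocks match.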
<re_irc>
<@firefrommoonlight:matrix.org> Yea! I didn't know it did that!
<re_irc>
<@xnorman:matrix.org> so i've figured out where i'm overflowing my stack.. haven't tested yet if i hit a hardfault outside of the debugger.. and i'm trying to figure out, what is it that is getting pushed to the stack that takes up so much space
<re_irc>
<@xnorman:matrix.org> my stack is at 0x20000000 and the method just before the segfault i'm getting has sp = 0x2001f9d0..
<re_irc>
<@xnorman:matrix.org> so it seems like i should have a lot of space. any advice on how to figure out what is taking up all that space?
<re_irc>
<@jamesmunns:beeper.com> You could use the bt command to get a backtrace, and figure out where all your stack frames start
<re_irc>
<@jamesmunns:beeper.com> One or more of them are probably going to be chonky bois
<re_irc>
<@jamesmunns:beeper.com> One blind guess is you're probably moving/returning some large buffer by value, and the compiler didn't apply an RVO to it, meaning you end up with more instances live at one time than you expect.
<re_irc>
<@xnorman:matrix.org> yeah, it happens a couple of times, one out of this function which calls postcard::from_bytes which also returns a value on the stack.. and, my struct is big.. like 13000 bytes, but that's just about 10x smaller than 0x1f9d0
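The arithmetic behind that estimate, as a small sketch. It assumes 0x20000000 is the lowest address of the stack region (the stack grows downwards on Cortex-M), and the helper names are hypothetical:

```rust
/// Bytes left between the current stack pointer and the lowest address
/// of the stack region.
pub fn stack_headroom(sp: u32, stack_bottom: u32) -> u32 {
    sp.saturating_sub(stack_bottom)
}

/// How many whole copies of a `value_size`-byte value fit in `headroom`.
pub fn copies_that_fit(headroom: u32, value_size: u32) -> u32 {
    headroom / value_size
}
```

With sp = 0x2001f9d0 the headroom is 0x1f9d0 = 129488 bytes, which is room for 9 whole copies of a 13000-byte value before anything else in those frames is counted.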
<re_irc>
<@jamesmunns:beeper.com> Yeah, with how serde works, you can sometimes end up with multiple instances (like 2-4x) when you return structs by value
<re_irc>
<@jamesmunns:beeper.com> There might be a way to extend postcard to fill into a maybeuninit object, but I'm not sure off the top of my head how to do that
<re_irc>
<@xnorman:matrix.org> and.. with maybeuninit, would you use &mut MaybeUninit<T> ?
<re_irc>
<@xnorman:matrix.org> so you wouldn't have to pass down all the values?
<re_irc>
<@jamesmunns:beeper.com> But yeah, I definitely became aware of it through use of postcard on embedded systems, where I also blew my stack trying to create fairly huge structs from some data source (packets on the wire? data in flash? can't remember)
<re_irc>
<@xnorman:matrix.org> any suggested workarounds? I'll definitely need to grow the size of this data that I'm deserializing ..
<re_irc>
<@xnorman:matrix.org> i guess.. would passing down `&mut Option<T>` at least avoid the RVO issue until I get to postcard?
<re_irc>
<@xnorman:matrix.org> i also wonder if i could just move my stack to this big sdram i have.. but I figure I'd have to go with a bootloader approach for that because the stack would need to be setup (there is some memory protection etc initialization for the sdram) before my program starts??
<re_irc>
<@jamesmunns:beeper.com> I think that korken89's MU PR is probably the right approach, though it has gotten stale (my fault: When he first opened it I was way less comfortable/familiar with MU, so I let it bit rot).
<re_irc>
<@jamesmunns:beeper.com> If you're interested in updating it or working on it, I'd be happy to mentor, especially since you have an immediate test case to verify against :D
<re_irc>
<@jamesmunns:beeper.com> Re: your earlier question, I don't think the GenericArray stuff is needed anymore, since Postcard switched to using const generics instead
<re_irc>
<@adamgreig:matrix.org> in theory you can still put your stack on the sdram and move your setup to the pre_init method that runs before c-m-rt initialises bss
<re_irc>
<@xnorman:matrix.org> yeah, i can definitely look into that.. I was thinking that that work was only ser and not deser but I can give deser a go if so
<re_irc>
<@xnorman:matrix.org> adamgreig: so pre_init wouldn't need the stack?
<re_irc>
<@jamesmunns:beeper.com> Ah, I'm not sure what it covers anymore. I actually should double check to make sure deser is possible with serde and maybeuninit
<re_irc>
<@adamgreig:matrix.org> you'd have to write a pre_init that didn't use the stack
<re_irc>
<@adamgreig:matrix.org> potentially you could leave the internal sram as the vector table SP, and then in pre_init configure the SDRAM and then update the MSP register
<re_irc>
<@adamgreig:matrix.org> wonder if that should really be deprecated
<re_irc>
<@adamgreig:matrix.org> I guess you probably can't call it from rust really
<re_irc>
<@xnorman:matrix.org> adamgreig: I guess it would have to be assembly then?
<re_irc>
<@jamesmunns:beeper.com> Hmm, actually thinking about it, I don't think you can deserialize directly into a MU, since Serde returns everything by value...
<re_irc>
<@adamgreig:matrix.org> in an ideal world your pre_init would be a naked fn with assembly but that requires nightly rust atm
<re_irc>
<@adamgreig:matrix.org> it's a bit cursed for sure
<re_irc>
<@jamesmunns:beeper.com> Alex Norman what does your data actually look like? Do you have some huge array of bytes (like sample data)? Or is it actually a ton of nested structs?
<re_irc>
<@adamgreig:matrix.org> hmm, perhaps you could set up the PSP in SDRAM and swap to using PSP in main for postcard, while using MSP for interrupts
<re_irc>
<@xnorman:matrix.org> jamesmunns:beeper.com: nested structs, some of those are arrays
<re_irc>
<@jamesmunns:beeper.com> Would you be able to link it for me? I might be able to suggest some tweaks to your data model to sidestep the problem
<re_irc>
<@xnorman:matrix.org> adamgreig: I'm not sure what this means.. i'll have to look up psp
<re_irc>
<@adamgreig:matrix.org> cortex-m can have two stacks, MSP and PSP
<re_irc>
<@adamgreig:matrix.org> interrupts always use MSP but your main thread can use either
<re_irc>
<@jamesmunns:beeper.com> I can suggest some hacks to avoid doing it all at once, basically deserializing a chunk at a time, and manually pushing each chunk into the final collection
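A std-only sketch of the chunk-at-a-time idea, not postcard code. The wire format (two little-endian 32-bit fields per record) and the names `Sample`, `take_sample` and `fill_samples` are made up for illustration. The destination is filled in place, so only one small record is live on the stack at a time:

```rust
/// Hypothetical record standing in for one element of a large struct.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct Sample {
    pub id: u32,
    pub gain: f32,
}

/// Size of one encoded record in this made-up wire format.
pub const ENCODED_LEN: usize = 8;

/// Decode a single record and return it with the unread remainder.
pub fn take_sample(bytes: &[u8]) -> Option<(Sample, &[u8])> {
    if bytes.len() < ENCODED_LEN {
        return None;
    }
    let (head, rest) = bytes.split_at(ENCODED_LEN);
    let id = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    let gain = f32::from_le_bytes([head[4], head[5], head[6], head[7]]);
    Some((Sample { id, gain }, rest))
}

/// Fill `out` one record at a time and return how many were written.
pub fn fill_samples(mut bytes: &[u8], out: &mut [Sample]) -> usize {
    let mut written = 0;
    for slot in out.iter_mut() {
        match take_sample(bytes) {
            Some((sample, rest)) => {
                *slot = sample;
                bytes = rest;
                written += 1;
            }
            None => break,
        }
    }
    written
}
```

With postcard the same shape should be possible using `take_from_bytes`, which returns the decoded value together with the unused bytes, provided the big struct can be split into separately encoded pieces.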
<re_irc>
<@jamesmunns:beeper.com> it's not super elegant though
<re_irc>
<@xnorman:matrix.org> yeah.. that is one thing i was thinking i could do.. i don't love it, but might-could work
<re_irc>
<@xnorman:matrix.org> I'll edit that, i don't like it .. but would do it if i need to
<re_irc>
<@jamesmunns:beeper.com> Yeah, let me think...
<re_irc>
<@jamesmunns:beeper.com> It might also be interesting taking something non-serde for a spin, like `zerocopy`: https://docs.rs/zerocopy
<re_irc>
<@xnorman:matrix.org> jamesmunns:beeper.com: cool...i'll check that out. i'm at my day job now but have this all noted for later
<re_irc>
<@xnorman:matrix.org> thanks for your suggestions James Munns and adamgreig !
<re_irc>
<@jamesmunns:beeper.com> I'll keep thinking, but I think without RVO, changes to Serde, or a little bit of hacks (e.g. walking through the bytes, deserializing one larger item at a time), I'm not sure I'll have any magic for you.
<re_irc>
<@jamesmunns:beeper.com> I'm assuming you already have `lto = "full"` and `codegen-units = 1` turned on? Those will help give you the best chance for llvm to apply space-saving opts
<re_irc>
<@xnorman:matrix.org> jamesmunns:beeper.com: i have `lto = true` and `codegen-units = 1`
<re_irc>
<@jamesmunns:beeper.com> Darn. Perils of heuristics based optimizers.
<re_irc>
<@dirbaio:matrix.org> is it possible to tell Cargo to `[patch]` a path dependency to a *different* path?
<re_irc>
<@dirbaio:matrix.org> and this parses but does nothing:
<re_irc>
<@dirbaio:matrix.org> > warning: Patch `stm32-metapac v0.1.0 (/home/dirbaio/embassy/embassy/stm32-metapac-gen/out)` was not used in the crate graph.