ChanServ changed the topic of #rust-embedded to: Welcome to the Rust Embedded IRC channel! Bridged to #rust-embedded:matrix.org and logged at https://libera.irclog.whitequark.org/rust-embedded, code of conduct at https://www.rust-lang.org/conduct.html
AlexandrosLiarok has joined #rust-embedded
<AlexandrosLiarok> any tips on profiling stm32 projects?
<Darius> AlexandrosLiarok: Orbuculum? :)
<Darius> depends what MCU and what tooling you have
cinemaSundays has quit [Quit: Connection closed for inactivity]
bartmassey[m] has joined #rust-embedded
<bartmassey[m]> Henk jamesmunns adamgreig tamme and anyone else who is interested in participating: the MB2 Discovery Book sprint will start in a few hours: October 26, 17:00-21:00 CEST, 15:00-19:00 UTC, 8:00AM-12:00PM PDT. Please meet here and we will organize and make a plan to produce a new chapter or two. The focus this time is on interrupts, DMA, and PWM as time permits.
<bartmassey[m]> I look forward to writing with you folks!
Makarov has joined #rust-embedded
Makarov has quit [Ping timeout: 256 seconds]
ivmarkov[m] has joined #rust-embedded
<ivmarkov[m]> <thejpster[m]> "Ugh, Long File Names are messy..." <- The whole fs is not async. Does that mean that the impl busyloops when hitting flash and thus delays all other tasks in the same executor, or is it so fast, that this is barely noticeable?
Makarov has joined #rust-embedded
emerent has quit [Ping timeout: 252 seconds]
emerent has joined #rust-embedded
<thejpster[m]> There’s a PR to make it async but it hasn’t been merged. The delays reading and writing blocks will be perhaps milliseconds long so you will definitely notice them.
GuineaWheek[m] has joined #rust-embedded
<GuineaWheek[m]> i'm trying to use the dwt cycle counter to do some performance measurements but the compiler keeps eliding one of the reads so i get inaccurate measurements
<GuineaWheek[m]> any idea on how to get rust to....not do this?
<GuineaWheek[m]> i've been trying to compiler fence it
<GuineaWheek[m]> and i look at the source code and it uses volatile reads
<GuineaWheek[m]> and yet that first read just gets elided
rafael[m] has joined #rust-embedded
<GuineaWheek[m]> which Defeats The Point
<JamesMunns[m]> If you change all the orderings to AcqRel does it change?
<JamesMunns[m]> I think you might want Acquire ordering really, but I always have to look up which one to use in which situation :D
<JamesMunns[m]> the other question is whether you have DWT running, I know it requires the debugger to be attached (and maybe some manual init?) for it to actually return valid values
<JamesMunns[m]> like - is dwt just always returning zero
<JamesMunns[m]> generally volatile operations can NEVER be reordered relative to each other, but they could be reordered across your non-volatile code (e.g. the micromath ops)
<JamesMunns[m]> but I think if you have the fences right then it shouldn't
<GuineaWheek[m]> DWT is running, it returns correct values
<JamesMunns[m]> I can't remember if black_box is stable these days
<GuineaWheek[m]> i can see it returning correct values in the debugger
<GuineaWheek[m]> i've tried black_box, every single combination of Ordering, and still no dice
<JamesMunns[m]> you can wrap your micromath call in core::hint::black_box(micromath::F32(theta).sin_cos());
<JamesMunns[m]> is the compiler just optimizing the operation away entirely?
<GuineaWheek[m]> so either start gets optimized away
<GuineaWheek[m]> or if i use asm!
<GuineaWheek[m]> start and end get put adjacent to each other
<JamesMunns[m]> when you say "optimized away", do you mean you just can't print it in the debugger?
<GuineaWheek[m]> so the dwt difference ends up being 1 cycle
<GuineaWheek[m]> no
<GuineaWheek[m]> there's no asm operation
<GuineaWheek[m]> no second read
<JamesMunns[m]> gotcha! are you looking with objdump or cargo-show-asm?
<GuineaWheek[m]> actually i take it back
<GuineaWheek[m]> they just get put adjacent to each other
<GuineaWheek[m]> you can see the adjacent LDRs reading from the same value
<GuineaWheek[m]> sorry, address
<GuineaWheek[m]> lemme try Release
<JamesMunns[m]> you could use a real fence instead of compiler_fence, which shouldn't be necessary, but will emit dmb instructions that might make the sequencing more clear in the asm
<GuineaWheek[m]> yeah release still emits adjacent reads
<GuineaWheek[m]> even with real fences
<JamesMunns[m]> what target is that, btw?
<GuineaWheek[m]> thumbv7em-none-eabihf
<GuineaWheek[m]> it's a cortex-m4f
<JamesMunns[m]> (I don't remember seeing the "BCC" instruction before, but im not an asm expert)
<GuineaWheek[m]> lemme try black-box
<JamesMunns[m]> I think you want an acquire before capturing start, and then a Release before you capture DWT the second time
<JamesMunns[m]> but I would set them all to AcqRel or SeqCst until it starts looking right
<GuineaWheek[m]> huh black_box on the micromath call seems to put the reads in the right place
<GuineaWheek[m]> lemme test if the DWT math actually checks out
<JamesMunns[m]> that's a start! Now we can work back :D
wassasin[m] has quit [Quit: Idle timeout reached: 172800s]
<JamesMunns[m]> what is that debugger, btw? Is that Ozone?
diondokter[m] has quit [Quit: Idle timeout reached: 172800s]
<GuineaWheek[m]> yeah it's ozone
<JamesMunns[m]> neat! Never used it.
<GuineaWheek[m]> it's what we were previously experienced with in c++, but it just got rust support in the last release or so
<JamesMunns[m]> I think you need both to establish the relationship between the fences
<JamesMunns[m]> ah, maybe that's backwards, again I use AcqRel as a default because I always have to re-think orderings out, and my gut always seems wrong :D
<GuineaWheek[m]> other way it seems i could solve this is writing start to a static which seems to force the compiler to consider the possibility that there's an intervening access elsewhere...which seems Cursed
<JamesMunns[m]> like, "nothing before getting the dwt is moved after it, and nothing after it is moved before it"
<JamesMunns[m]> (in both places you acquire it)
<JamesMunns[m]> making them all the same ordering like in your code snippet will only prevent the reorderings in one direction, I think?
<GuineaWheek[m]> i think the ordering is correct but the fundamental problem is that the micromath operation is reorderable relative to the dwt fetches
<GuineaWheek[m]> because i think the dwt fetches are correct relative to each other but the compiler thinks that the micromath calls don't affect DWT
<JamesMunns[m]> as I understand it, that should not happen if the orderings are right, but if you try my code and tell me I'm wrong then I can't really argue with that :)
<GuineaWheek[m]> i tried it without black_box and it reorders
<GuineaWheek[m]> but with black_box it's correct
<GuineaWheek[m]> yes
<JamesMunns[m]> got it!
<JamesMunns[m]> then... yeah. I'm not sure why the orderings there aren't sufficient.
<JamesMunns[m]> thanks for trying!
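A minimal sketch of the measurement pattern discussed above, assuming a free-running cycle counter that is read before and after the code under test. The names `measure` and `read_cycles` are hypothetical; on the target, `read_cycles` would wrap a volatile read of the DWT cycle counter. `black_box` is only a best-effort hint to the optimizer, so the generated asm still needs to be checked.

```rust
use core::hint::black_box;
use core::sync::atomic::{compiler_fence, Ordering};

/// Times `work(input)` with a caller-supplied cycle counter.
/// `read_cycles` is a hypothetical stand-in for a volatile read of the
/// DWT cycle counter on a Cortex-M target.
pub fn measure<T, R>(
    read_cycles: impl Fn() -> u32,
    input: T,
    work: impl FnOnce(T) -> R,
) -> (R, u32) {
    let start = read_cycles();
    // Compiler fences constrain memory accesses only; pure register math
    // may still be moved across them.
    compiler_fence(Ordering::SeqCst);
    // black_box hides the input and the result from the optimizer, which
    // discourages it from hoisting or sinking the computation.
    let result = black_box(work(black_box(input)));
    compiler_fence(Ordering::SeqCst);
    let end = read_cycles();
    // wrapping_sub keeps the difference correct across counter overflow.
    (result, end.wrapping_sub(start))
}
```

For the case in the discussion, the call would look roughly like `measure(read_dwt, theta, |t| micromath::F32(t).sin_cos())`, where `read_dwt` is an assumed helper that returns the current cycle count.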
<GuineaWheek[m]> the dwt calls are likely correct relative to each other
<JamesMunns[m]> yep
<JamesMunns[m]> volatile ops can never be reordered relative to other volatile ops
<GuineaWheek[m]> but i think the compiler just thinks that the intervening code is reorderable relative to those blocks yeah
<JamesMunns[m]> their ordering wrt non-volatile ops (and even atomic ops!) is undefined, IIRC
<JamesMunns[m]> but we do rely on fences like this for things like DMA transactions
<GuineaWheek[m]> side effects like this are not something that the memory model expects to care about i suspect
<JamesMunns[m]> we certainly care for reasons like that ^ !
<GuineaWheek[m]> yeah i am really curious why it's just...broken here
<GuineaWheek[m]> compared to the dma case
<JamesMunns[m]> otherwise you can get cases where you end up DMA'ing an empty buffer instead of a buffer with the data you expect to send
<GuineaWheek[m]> of note is that the micromath case is the only library i was benchmarking that really suffered from these issues
<GuineaWheek[m]> maybe it's some sort of inlining thing?
<JamesMunns[m]> unsure! It could be good to make a minimal repro and report upstream?
<JamesMunns[m]> like, it should be visible looking at the generated asm
<GuineaWheek[m]> idk what upstream i would report to here
<JamesMunns[m]> GuineaWheek[m]: rust-lang/rust
<JamesMunns[m]> if it's a rustc or llvm bug, then they'll sort it out
<dirbaio[m]> I think fences only apply to memory operations?
<JamesMunns[m]> like, have two functions, with the same fences, but one has black_box, and show the dumped asm
<dirbaio[m]> so if it's just math on registers it can still get reordered?
<dirbaio[m]> so it's not necessarily a bug
<dirbaio[m]> I think if nonvolatile_op2 is just math on registers it can still get reordered
<GuineaWheek[m]> dirbaio[m]: yeah it's likely math on registers because micromath doesn't use LUTs for sin/cos
<dirbaio[m]> it's only the visible effects on memory that are guaranteed to be ordered
<GuineaWheek[m]> DWT (or perhaps hardware timers in general) is in a spot where like, the very number of cycles executed affects what it can read
<GuineaWheek[m]> so like, how do you even guarantee the desired behavior if you never touch memory?
<dirbaio[m]> with something stronger than fences 🤷
<dirbaio[m]> black_box, inline_asm
<JamesMunns[m]> <dirbaio[m]> "it's only the visible effects..." <- ahhh, you're saying it catches with the DMA because the op is observable because we use the ptr of the buf?
<dirbaio[m]> no, because they are memory accesses
<dirbaio[m]> what fences do is "these memory accesses should happen before these other memory accesses"
<dirbaio[m]> s/should/must/
<dirbaio[m]> which is exactly what you want for DMA: the memory accesses for filling the buff must happen before the memory access that starts DMA
<dirbaio[m]> but if you're doing just math on registers... that's not a memory access. it's not visible from outside the core, so the compiler is still free to reorder it
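A minimal sketch of the DMA case described above, where the fence sits between two sets of memory accesses. The function name `fill_then_start` and the `start_reg` pointer are hypothetical; `start_reg` stands in for whatever register write kicks off the transfer on a real device, and the value `1` is an assumed enable bit.

```rust
use core::ptr::write_volatile;
use core::sync::atomic::{fence, Ordering};

/// Copies `data` into `buf`, then "starts" a transfer by writing to
/// `start_reg`. Returns the number of bytes copied.
///
/// # Safety
/// `start_reg` must be valid for a volatile write of a `u32`.
pub unsafe fn fill_then_start(buf: &mut [u8], data: &[u8], start_reg: *mut u32) -> usize {
    let n = buf.len().min(data.len());
    // Memory accesses that fill the buffer.
    buf[..n].copy_from_slice(&data[..n]);
    // The fence keeps the buffer stores above from being moved below the
    // register write that follows.
    fence(Ordering::Release);
    // Memory access that starts the transfer (hypothetical register).
    unsafe { write_volatile(start_reg, 1) };
    n
}
```

Both sides of the fence are memory accesses here, which is why the fence is enough in this case and not in the cycle-counting case, where the code being timed may never touch memory.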
<GuineaWheek[m]> i'm...still having reorderings with black_box on a different piece of code and i have Zero Idea Why
<GuineaWheek[m]> now my dwt memory access is getting inserted in the Middle of the Code I Want To Benchmark
<GuineaWheek[m]> the only consistent way I can seem to stop this is to write start to a static variable
<GuineaWheek[m]> my guess is it's because it forces start to get written to memory and not just a register
<JamesMunns[m]> writing to a static means that the "visibility" of the write has a much larger scope
<JamesMunns[m]> like, other threads could theoretically access it, it's not local to the current execution context
<JamesMunns[m]> which I assume makes the optimizer much more pessimistic
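A minimal sketch of the static-variable workaround mentioned above, assuming a target with 32-bit atomic load/store. The names `START`, `measure_via_static`, and `read_cycles` are hypothetical. This is the behaviour reported in the discussion, not something the language guarantees, so the asm is still worth inspecting.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical global slot for the start timestamp. Because it is a
/// static, the store is visible outside the current function.
static START: AtomicU32 = AtomicU32::new(0);

/// Times `work` by parking the start timestamp in a static rather than
/// in a local. `read_cycles` is a hypothetical stand-in for a volatile
/// read of the DWT cycle counter.
pub fn measure_via_static<R>(
    read_cycles: impl Fn() -> u32,
    work: impl FnOnce() -> R,
) -> (R, u32) {
    // Forces the start value out to memory before the work begins.
    START.store(read_cycles(), Ordering::SeqCst);
    let result = work();
    let end = read_cycles();
    // wrapping_sub keeps the difference correct across counter overflow.
    (result, end.wrapping_sub(START.load(Ordering::SeqCst)))
}
```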
Makarov has quit [Ping timeout: 256 seconds]
<thejpster[m]> <JamesMunns[m]> "(I don't remember seeing the "..." <- BCC is a conditional branch where the condition is Carry Clear IIRC.
<JamesMunns[m]> yeah
<JamesMunns[m]> I did look it up, it just didn't ring a bell :)
<thejpster[m]> Not so Reduced these days.
<thejpster[m]> More like Reduced Addressing Mode Instruction Set really.
<JamesMunns[m]> "everything but the MMU architecture"
<AlexandrosLiarok> <Darius> "depends what MCU and what..." <- stm32h745 and I am flexible enough to pick whatever I need as long as I can get a better workflow
<AlexandrosLiarok> I think native profiling is something that I've been missing a lot
<AlexandrosLiarok> I do have a VST emulator but that's not the same especially due to fpu capabilities
<AlexandrosLiarok> orbuculum seems nice
<Darius> I've only played with it a little with a blackmagic probe but it is pretty neat
<Darius> using an orbtrace would give more bandwidth though
dandels has quit [Quit: ZNC 1.9.1 - https://znc.in]
dandels has joined #rust-embedded
adamgreig[m] has joined #rust-embedded
<adamgreig[m]> they really are
<bartmassey[m]> Henk jamesmunns adamgreig tamme and anyone else who is interested in participating: the MB2 Discovery Book sprint starts now. Please say "hi" if you're participating…
<adamgreig[m]> hi!
<JamesMunns[m]> Ah! I did not put it in my calendar and am afk for a bit, but can join later!
<bartmassey[m]> adamgreig: Shall we set up a temp channel for this?
tamme[m] has joined #rust-embedded
<tamme[m]> hi
<bartmassey[m]> James Munns: No problem. Join us when you can
<adamgreig[m]> personally i don't mind here or a new channel
<bartmassey[m]> tamme: Welcome!
<bartmassey[m]> Ok, here is fine. We'll probably only be here for a few minutes anyhow.
Henk[m] has joined #rust-embedded
<Henk[m]> 👋
<bartmassey[m]> I've set up a HackMD for us to work in: https://hackmd.io/@bart-massey/S1OBmFcgyl
<Henk[m]> bartmassey[m]: hi!
<bartmassey[m]> @Henk hi!
<bartmassey[m]> Henk: Hi!
<bartmassey[m]> Sigh
<bartmassey[m]> I've also set up a Zoom meeting in case folks want to use it? https://pdx.zoom.us/j/81646005591
<bartmassey[m]> Henk and I are on Zoom…
<adamgreig[m]> sorry, wasn't set up for zoom calls at all, one min
<bartmassey[m]> Sorry, should have warned folks in advance.
<bartmassey[m]> tamme: Can you join us on Zoom?
<tamme[m]> yup, just setting it up now
<bartmassey[m]> Cool
Makarov has joined #rust-embedded
<AlexandrosLiarok> does anyone know how I can set up the tracebus on the stm32h745 (cortex-m7) for use with orbuculum?
Makarov91 has joined #rust-embedded
<adamgreig[m]> ( James Munns, feel free to hop on that zoom link when you're around)
Makarov has quit [Ping timeout: 256 seconds]
<JamesMunns[m]> <adamgreig[m]> "( James Munns, feel free to..." <- will do, just finished up, should jump on in 5-10m or so.
<bartmassey[m]> Cool. We have a thing for you to do when you're ready 😀
<norineko> AlexandrosLiarok: the h7 has a few quirks but if you go to the 1bitsquared discord #orbuculum channel you'll find folks who've done it.
<norineko> I can't recall the exact details off the top of my head
Makarov76 has joined #rust-embedded
Makarov65 has joined #rust-embedded
Makarov91 has quit [Ping timeout: 256 seconds]
jonored has quit [Ping timeout: 252 seconds]
Makarov76 has quit [Quit: Client closed]
Makarov48 has joined #rust-embedded
Makarov65 has quit [Ping timeout: 256 seconds]
berkus[m] has joined #rust-embedded
<berkus[m]> https://crates.io/crates/aarch64-cpu new version auto-released, folks.
Makarov76 has joined #rust-embedded
Makarov82 has joined #rust-embedded
scorpion2185[m] has quit [Quit: Idle timeout reached: 172800s]
Makarov48 has quit [Ping timeout: 256 seconds]
Makarov76 has quit [Ping timeout: 256 seconds]
<AlexandrosLiarok> <norineko> "Alexandros Liarokapis: the h7..." <- I have, haven't had much luck so far although the orbuculum folks were very helpful.
lulf[m] has quit [Quit: Idle timeout reached: 172800s]
Makarov34 has joined #rust-embedded
Makarov82 has quit [Ping timeout: 256 seconds]
Makarov51 has joined #rust-embedded
Makarov34 has quit [Ping timeout: 256 seconds]
Makarov9 has joined #rust-embedded
<bartmassey[m]> Henk jamesmunns adamgreig tamme: The text we wrote so far for Book Sprint is in https://github.com/rust-embedded/discovery-mb2 in the branch `sprint-20241026`. Let's handle further edits and updates as PRs to that branch rather than dumping it in the HackMD (https://hackmd.io/zcqqCWvVRrqdhLOCzWLakg)? That will help me keep track of what's changing as I edit…
Makarov51 has quit [Ping timeout: 256 seconds]
<AlexandrosLiarok> any idea where I can get proper SVDs for the cortex-m registers?
<AlexandrosLiarok> eg: [this](https://github.com/ARM-software/CMSIS_5/blob/develop/Device/ARM/SVD/ARMCM7.svd) is useless because it contains no register info.
<rmsyn[m]> aliarokapis: there is the `rust-embedded/cortex-m` repo that has register definitions. I wasn't able to find SVDs with register definitions, however I wrote a tool for parsing DTS files and producing an SVD. If you want help taking register info and adding it to the known definitions in that tool, you could generate your own SVD. here is a link to the tool: https://codeberg.org/weathered-steel/svd-generator
<rmsyn[m]> it will be a bit of upfront manual work doing the translation, though. so, if others know of vendor supplied SVD files, maybe that would be better
inara has quit [Ping timeout: 248 seconds]
inara has joined #rust-embedded
Makarov9 has quit [Ping timeout: 256 seconds]
M9names[m] has joined #rust-embedded
<M9names[m]> github has a search function. it lets you filter not only on text, but on file type too.
<M9names[m]> i suspect you'll be able to find what you're looking for if you specify a register name and a file type like xml or svd, as below:
<M9names[m]> https://github.com/search?q=SHCSR+path%3A*.svd&type=code
<bartmassey[m]> In the MB2 Discovery Book, we are about evenly divided right now between "Arm" and "ARM". Which capitalization should we be using? 😀
<berkus[m]> arM
<berkus[m]> https://www.arm.com/company they are mentioning themselves as Arm nowadays
i509vcb[m] has joined #rust-embedded
<i509vcb[m]> ArrrrrrM
<M9names[m]> <berkus[m]> "https://www.arm.com/company they..." <- this is kinda the problem though. if you rely on how the company stylises itself you will be wrong again in a few years when they change their mind again.
<berkus[m]> We can call them LEG or "A company that shall not be named"
<M9names[m]> Love the wikipedia entry: ARM (stylised in lowercase as arm) and run by Arm Holdings.
<M9names[m]> So i guess ARM when referring to the chips, arm when in logo form, and Arm when talking about the company.
<berkus[m]> Makes sense ©
<rmsyn[m]> hdoordt: hi!
<AlexandrosLiarok> I have jlink gdb server running. How do I connect to it from gdb, get a backtrace, and immediately exit while resuming execution?
<AlexandrosLiarok> https://github.com/Amanieu/minicov seems promising
Jubilee[m] has joined #rust-embedded
<Jubilee[m]> <bartmassey[m]> "In the MB2 Discovery Book, we..." <- "aarch32", clearly.