<TeXitoi[m]>
<jacksonn97[m]> "I made a library for bme280..." <- Why not https://crates.io/crates/bme280 (there are a few other bme280 crates on crates.io)?
<jacksonn97[m]>
TeXitoi: because it is overloaded with useless async code (I was planning a library for single-threaded Arduino)
<jacksonn97[m]>
also it
<TeXitoi[m]>
Async is optional in this crate. The big drawback of yours is that you depend on a specific HAL, not on embedded-hal, which makes your crate useless for any non-AVR MCU.
<jacksonn97[m]>
TeXitoi[m]: yeah, any other MCU is supported by some other library; I didn't find a library for AVR, so I decided to write one myself
<TeXitoi[m]>
Embedded hal libs should work on avr, that's the goal of it: write a driver once for every MCU.
pronvis has quit [Ping timeout: 240 seconds]
<jacksonn97[m]>
I think avr-hal is more user-friendly because it provides functionality specifically for Arduino-like boards
xiretza[cis] has joined #rust-embedded
<xiretza[cis]>
avr-hal is just a wrapper around embedded-hal AFAIK, when you're writing drivers you should be targeting the more generic embedded-hal
<M9names[m]>
it's their time, they can develop something that only works on one hal if they choose, even if you think it's a strange choice
<xiretza[cis]>
their decision seems to be based on a misunderstanding of what avr-hal is though
<M9names[m]>
alright, well, if everyone else is piling on "don't do that" i'll try to give other driver feedback.
<M9names[m]>
- taking &mut to the i2c/spi bus everywhere is pretty un-ergonomic. the most common approach these days is to have the driver own the bus, with something like embedded_hal_bus ensuring exclusive access for one device at a time and handling chip-select for spi
<M9names[m]>
- it's usually a good idea to add `#[derive(Debug)]` on any struct that will be user facing, because they'll often want to print it out while debugging.
<TeXitoi[m]>
I don't agree on the last point. Passing the SPI at each call allows you to easily wrap the driver, while the other way around is not possible.
<TeXitoi[m]>
The shared bus pattern can give you runtime failures and deadlocks, while an explicit bus gives you compile-time checks.
<sourcebox[m]>
So just FYI: the original maintainer of usbd-midi passed it over to me. I opened a join request issue for rust-embedded-community as described in the instructions so that I can pass over the repo to the group. Ownership on crates.io is not transferred yet.
<jacksonn97[m]>
9names: the reason for the shared bus is that you might want to connect several sensors to one bus
haobogu[m] has joined #rust-embedded
<haobogu[m]>
A dumb question: is there any crate for representing bitfield enums? like
<M9names[m]>
jacksonn97[m]: oh right, you can't use `embedded_hal_bus` because you're not targeting `embedded_hal`
<M9names[m]>
but that is usually what does that for you
<JamesMunns[m]1>
sourcebox[m]: (there are a lot of different bitfield/bitflag libraries, most of the differences are just macro syntax and such)
<haobogu[m]>
thank you all!
AlexandervanSaas has joined #rust-embedded
<AlexandervanSaas>
I need to do a couple simple matrix operations like multiplication and inversion of small matrices. Nalgebra can run on no_std but it's quite a big dependency so I'd like to avoid it if I can. How do people usually do this? Copy some of the well-known pure float-based algorithms into your own code, C style?
<JamesMunns[m]1>
if it's like one or two small things, probably yeah: make your own functions and call those, maybe unit test them against nalgebra on the host to make sure they check out. There are a bunch of math libs (nalgebra, idsp, cortex-dsp), but iirc they were all a little chonky; it probably depends on exactly how much you need.
<JamesMunns[m]1>
tho I mean, if you only use one function from a big dep, the rest of the dead code gets optimized away
<JamesMunns[m]1>
so worth just using the lib and seeing how much of an impact it makes, it might not be a problem at all
<AlexandervanSaas>
I recall adding nalgebra and doing a couple simple operations added around 50kb.
<AlexandervanSaas>
Quite a lot IMO
<JamesMunns[m]1>
with all the opts turned on, measured with cargo-size or arm-none-eabi-size? it'd be interesting to see the diff of cargo-bloat to see where all that's going, or checking with `panic_immediate_abort` to see if it's all panics or something
<JamesMunns[m]1>
oh, is this on RP2040?
<AlexandervanSaas>
Yes
<JamesMunns[m]1>
on rp2040 you will pull in a lot of softfloat code for f32, and even more for f64, but 50k does sound excessive
<AlexandervanSaas>
Do you pay that price once or for every operation?
<JamesMunns[m]1>
shouldn't, but could if it gets inlined
<JamesMunns[m]1>
there's also the pseudo-accelerated ROM functions available, which should cut down that a lot, I think you have to activate it on embassy-rp tho
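The ROM float functions mentioned here are exposed through embassy-rp cargo features. A sketch of what enabling them might look like; the version number and feature names are assumptions and should be checked against the embassy-rp version actually in use:

```toml
# Cargo.toml (sketch; verify features against the embassy-rp docs)
[dependencies]
embassy-rp = { version = "0.1", features = [
    "intrinsics",          # route compiler float builtins to optimized versions
    "rom-v2-intrinsics",   # use the faster V2 bootrom functions where available
] }
```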
<AlexandervanSaas>
Yep I have those activated
<AlexandervanSaas>
Makes quite a speed difference
<JamesMunns[m]1>
yeah, idk how complex of math you're doing, 50k on the rp2040 is kinda whatever IMO (unused flash is wasted flash), but yeah idk how it would scale or what is causing it
<AlexandervanSaas>
I also experimented with the fixed crate but I found that not to be much faster. Maybe the conversion between float and fixed point is slow.
<JamesMunns[m]1>
yeah, I've tried to use the fixed crate before and found it unsatisfying. I've rolled my own fixed point types for projects before that were massively faster
<AlexandervanSaas>
That's interesting. Do you need to implement num-traits traits to use your own numeric types with other libraries?
<AlexandervanSaas>
Ah yeah I used that article before
<JamesMunns[m]1>
For that project I just spent a couple of hours making my own fixed types, I didn't need to do that much math, mostly sin/cos, amplitude adjustment, and linear ramping/lerp
<JamesMunns[m]1>
it was pretty simple audio-ish stuff
<sourcebox[m]>
JamesMunns[m]1: Good to know that I'm not the only one here trying to do audio
<JamesMunns[m]1>
A couple of jobs back we were doing a lot of 3d math in cylindrical coordinates, on a fast processor that didn't have hardware floating point
<JamesMunns[m]1>
so I learned a lot of fun fixed point math tricks lol
<AlexandervanSaas>
I'm a bit disappointed that the fixed crate is so much slower than rolling your own
<JamesMunns[m]1>
to be fair, I could have been holding it wrong? idk where the difference came from, I should try to replicate it at some point.
<JamesMunns[m]1>
since I hand-rolled all the math, there weren't a lot of layers, so the optimizer was probably able to smoosh smoosh it down pretty easily
<JamesMunns[m]1>
(sadly that was all private client work, so I don't have it still, but it was pretty standard 16 and 32 bit math, where I was prioritizing speed over absolute accuracy)
<sourcebox[m]>
The speed possibly depends on the formats you use because certain amounts of shifts can be optimized better than others.
<JamesMunns[m]1>
happy to share notes if it's useful, but again that wasn't really matrix math, but rather generating and mixing a handful of oscillators
<JamesMunns[m]1>
yep, it was basically all wrapping add, shift, and muls
<JamesMunns[m]1>
and this was still M4 (not F), which I think has better "shifts are free" instructions
korken89[m] has quit [Quit: Idle timeout reached: 172800s]
<JamesMunns[m]1>
i'd expect it to be almost but not quite as fast on M0+.
<sourcebox[m]>
Some Cortexes have instructions that just work on the lower/upper half of registers IIRC.
<AlexandervanSaas>
The matrix operations are only to calibrate an accelerometer so they don't have to be fast. I'm also considering sending the raw measurements to a host PC and doing this math there. Something to think about.
<AlexandervanSaas>
JamesMunns[m]1: Love this talk
<JamesMunns[m]1>
yep, rp2040 is much better at shuffling bytes than doing math, imo :D
<sourcebox[m]>
A Hammond B3 emulation has quite some overall complexity.
<JamesMunns[m]1>
ah, "Direct Digital Synthesis" is the term I always forget
<JamesMunns[m]1>
Yeah, I had more like 5 oscillators, not 91 :D
<AlexandervanSaas>
JamesMunns[m]1: I'm thinking about switching to an ESP32 but I haven't hit the limits of the RP2040 yet because everything is still running on core0.
<sourcebox[m]>
I did a few experiments with the RP2040 for audio, but I gave up quickly. One of the showstoppers is the lack of 32-bit multiplication with 64-bit result.
<JamesMunns[m]1>
yep, in my project I would always drop down to 16 bits for gain calcs
<JamesMunns[m]1>
because 16 x 16 fits losslessly in 32 bits, then shift it back down 16 to renormalize
<JamesMunns[m]1>
I had a "gain" type that was just a u16 that represented `0.0..1.0`
<JamesMunns[m]1>
technically that's not correct, because u16::MAX is only 99.998% volume, so every gain mul shaved a tiny bit off, but in general it was close enough lol
<JamesMunns[m]1>
anyway, this is probably not actually helpful, but I will say it was fun for me to sit down and write a handful of manual fixed point math types and ops, though your definition of fun may vary
<sourcebox[m]>
Yeah, that difference doesn't really matter. The more relevant thing is that 16-bit internal resolution is typically not enough for the required headroom.
<AlexandervanSaas>
I found it fun to read about
<JamesMunns[m]1>
sourcebox[m]: yep, what I was doing was decidedly lo-fi, and speed beat precision
<JamesMunns[m]1>
definitely wouldn't win prizes for lossless replication.
<sourcebox[m]>
That's totally ok if you do it intentionally.
<sourcebox[m]>
But people are sometimes disappointed when they don't expect it.
<JamesMunns[m]1>
yeah, this is why I haven't turned it into a library, by the time you allowed for variable precision, I'd probably be back at the fixed crate
<sourcebox[m]>
In my experiments, I initially had the naive assumption that the pure clock frequency of the RP2040 would compensate for the lack of floating point. But that was completely wrong.
<JamesMunns[m]1>
yeah, softfloat is like 100-1000x slower than fixed point or hardfloat, sadly
<JamesMunns[m]1>
the romfuncs help with that, but it's a big gap to make up
<JamesMunns[m]1>
(and worse: the perf is variable)
K900 has quit [Quit: Idle timeout reached: 172800s]
<AlexandervanSaas>
🤞 for a RP2040 successor with FPU
<sourcebox[m]>
A variant using a Cortex-M4F would be nice.
<JamesMunns[m]1>
I'm interested to see what the price point of something like the Espressif P4 is going to be; if it ends up being a reasonably priced alternative to other "crossover" processors like the imxrt family, then it could be a fun alternative
<JamesMunns[m]1>
more options is always good tho :D
<sourcebox[m]>
I'm also very interested in the P4.
<sourcebox[m]>
But since esp-hal even removed all code related to it, I'm not sure if it's gonna be released soon.
<JamesMunns[m]1>
ah, interesting, haven't been following it that closely
<andreas[m]>
<AlexandervanSaas> "π€for a RP2040 successor with FPU" <- Rumors have it, they are working on a `RP235x`, whatever this is going to be. M23 perhaps π€
<sourcebox[m]>
andreas[m]: M23 also doesn't feature an FPU.
<sourcebox[m]>
For the P4, I would like to know mostly 2 things:
<sourcebox[m]>
- How does performance (FPU) compare to the S3? Is it what would be expected from the difference in clock speed?
<sourcebox[m]>
- Are the peripherals quite similar to the other series so that code for the drivers is already available?
<AlexandervanSaas>
<andreas[m]> "Rumors have it, they are working..." <- Do you have a link to these rumours? I can't find anything more than a hint by the CEO that there will be a successor.
<vollbrecht[m]>
that also gives you a rough idea of what peripheral translates to what driver
<sourcebox[m]>
My question was mostly whether the peripherals work similarly internally so that there's little effort to make the existing drivers work.
<vollbrecht[m]>
ah ok, makes sense for esp-hal to think about that. It will not be zero work ;D
<sourcebox[m]>
Otherwise the esp-hal team would take way longer to get them working.
<andreas[m]>
<AlexandervanSaas> "Do you have a link to these..." <- Nothing more than some chit-chat over at the 1BitSquared Discord.
<sourcebox[m]>
My question about performance is not really one about the P4, more about how RISC-V floats perform in general. There are not that many MCUs out there to prove it.
<sourcebox[m]>
Compiler optimization surely is top-notch for ARM, it shouldn't be that bad for RISC-V, but how good is it for Xtensa?
IlPalazzo-ojiisa has joined #rust-embedded
<sourcebox[m]>
So if RISC-V has much better optimizations, then the P4 would give way more performance than just the ratio of clock frequencies would suggest.
TomB[m] has joined #rust-embedded
<TomB[m]>
<sourcebox[m]> "Compiler optimization surely..." <- Xtensa is complicated
SArpnt[m] has quit [Quit: Idle timeout reached: 172800s]
jannic[m] has joined #rust-embedded
<jannic[m]>
<andreas[m]> "Rumors have it, they are working..." <- Interesting. Searching for that term lead me to https://forums.raspberrypi.com/viewtopic.php?t=371165#p2223383. Looks like this was published because of the upcoming IPO. From the linked document: "We are designing and developing a family of microcontrollers, RP235x, which will serve as successors to RP2040, and which we expect to launch in second half of 2024. RP235x products are
<jannic[m]>
designed to operate at higher speeds, use less power and provide greater security than RP2040."
<jannic[m]>
So no technical details, but at least a time frame; and if the naming scheme holds, it won't be an M0+, and it will have double the RAM of the RP2040. And the x instead of the 0 hints that it may have internal flash? ("floor(log2(nonvolatile / 16k)) or 0 if no onboard nonvolatile storage")
<AlexandervanSaas>
I hope they include enough onboard flash to fit decently sized projects. Or that adding external flash doesn't reduce the IO available to other peripherals.
firefrommoonligh has quit [Quit: Idle timeout reached: 172800s]
AtleoS has joined #rust-embedded
whitequark[cis] has quit [Quit: Idle timeout reached: 172800s]