ChanServ changed the topic of #rust-embedded to: Welcome to the Rust Embedded IRC channel! Bridged to #rust-embedded:matrix.org and logged at https://libera.irclog.whitequark.org/rust-embedded, code of conduct at https://www.rust-lang.org/conduct.html
<re_irc> <@firefrommoonlight:matrix.org> How can I lose?
<re_irc> <@firefrommoonlight:matrix.org> Wait I thought we were supposed to buy low and sell high, know when to walk away, know when to run
<re_irc> <@dngrs:matrix.org> the cryptocurrency claims are getting more ridiculous and desperate by the hour, bags must be really heavy
<re_irc> <@grantm11235:matrix.org> Do we not have any mods online?
<re_irc> <@grantm11235:matrix.org> If we need more mods, I'll volunteer. I promise I'm trustworthy 👼
<re_irc> <@thejpster:matrix.org> : Because I'm about to have File Handles which have references back to the Volume which has a reference to the SD Card and basically everyone will need a ref cell. The goal is simplicity, not performance.
<re_irc> <@xiretza:xiretza.xyz> : they should at least look into setting up a moderation bot, that user was banned on the Community Moderation Effort list hours before they posted here
<re_irc> <@admin:orangemurker.com> : That's pretty awesome actually
emerent has quit [Ping timeout: 256 seconds]
emerent has joined #rust-embedded
IlPalazzo-ojiisa has joined #rust-embedded
starblue has joined #rust-embedded
dc740 has joined #rust-embedded
<re_irc> <@ubik:matrix.org> does anyone have any tricks/tips on reverse-engineering a CRC algorithm? tried to use RevEng to no avail...
<re_irc> <@peter9477:matrix.org> : Do you have code you could disassemble or at least view as hex?
<re_irc> <@ubik:matrix.org> : yes, I lost a few hours on ghidra yesterday 😄
<re_irc> <@peter9477:matrix.org> Have you been able to discover or narrow down basic parameters, like is it 16-bit, polynomial, starting value (e.g. 0 vs 0xffff) etc?
<re_irc> <@ubik:matrix.org> it's a C++ program on windows/QT, so it's a mess to figure out
<re_irc> <@peter9477:matrix.org> In some simple cases I've been able to figure it out with some small test inputs and using a tool like https://www.lammertbies.nl/comm/info/crc-calculation to compare outputs.
dc_740 has joined #rust-embedded
<re_irc> <@ubik:matrix.org> well, it's a MIDI SysEx message, and there are some algorithms out there
<re_irc> <@peter9477:matrix.org> You might have a hope of even just scanning the binary for some of the well-known polynomials.
<re_irc> <@ubik:matrix.org> but they don't seem to match this
<re_irc> <@ubik:matrix.org> I am not sure whether it's 8 or 16 bit...
<re_irc> <@peter9477:matrix.org> Are you sure it's even a CRC, specifically? That's a particular type of check, but there are many that aren't actual CRCs. Often the term "checksum" is mistakenly replaced with "CRC" when that's incorrect.
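A rough sketch of the "scan the binary for well-known polynomials" idea mentioned above, in Python. The function name `find_polys` and the choice of constants are illustrative assumptions; the search only looks for little-endian encodings, and 16-bit constants will produce many false positives in a real binary. A table-driven CRC may not contain the bare polynomial at all, so an empty result proves nothing.

```python
# Hypothetical helper: look for a few common CRC polynomial constants in a blob.
POLYS = {
    "CRC-16/CCITT 0x1021": 0x1021,
    "CRC-16 0x8005": 0x8005,
    "CRC-16 reflected 0xA001": 0xA001,
    "CRC-32 0x04C11DB7": 0x04C11DB7,
    "CRC-32 reflected 0xEDB88320": 0xEDB88320,
}


def find_polys(blob):
    """Return {name: [offsets]} for each constant found in little-endian form."""
    hits = {}
    for name, value in POLYS.items():
        width = 2 if value <= 0xFFFF else 4
        needle = value.to_bytes(width, "little")
        offsets = []
        pos = blob.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = blob.find(needle, pos + 1)
        if offsets:
            hits[name] = offsets
    return hits
```

In practice the blob would come from reading the executable as bytes; any hit is only a hint about where to look in the disassembly.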
<re_irc> <@ubik:matrix.org> most messages seem to finish with "xx01f7", I'd expect 01 to be a fixed byte, but then I've seen a couple of examples
<re_irc> <@peter9477:matrix.org> e.g. in https://forum.fractalaudio.com/threads/mfc-101-sysex-checksum-help.145377/ I see it's possibly a relatively trivial checksum.
<re_irc> <@ubik:matrix.org> yeah, it's some time of checksum, not necessarily crc
<re_irc> <@ubik:matrix.org> * type
<re_irc> <@peter9477:matrix.org> (I've never heard of MIDI SysEx before so starting from zero knowledge here...)
<re_irc> <@ubik:matrix.org> : wow. hadn't bumped into this one yet. I'll give it a try, thanks!
<re_irc> <@peter9477:matrix.org> https://github.com/shingo45endo/sysex-checksum ?
<re_irc> <@peter9477:matrix.org> It might help to retry some searches you've been doing with "checksum" instead of "CRC", or even adding "-crc" though that of course skips pages where the info is correct but the terminology is wrong. Is this not covered by some MIDI standard document? I thought MIDI was ancient and well understood.
<re_irc> <@peter9477:matrix.org> Do you have any sample input/output pairs to check?
dc740 has quit [*.net *.split]
<re_irc> <@ubik:matrix.org> this is a sysex message, it's non-standard extensions to MIDI, basically
<re_irc> <@ubik:matrix.org> every manufacturer does it differently
<re_irc> <@ubik:matrix.org> I do have sample messages
<re_irc> <@peter9477:matrix.org> GPT-4 suggests:
<re_irc> fn midi_sysex_checksum(data: &[u8]) -> u8 {
<re_irc>     let total: u16 = data.iter().map(|&byte| u16::from(byte)).sum();
<re_irc>     let checksum = (-(total as i16)) as u8 & 0x7F;
<re_irc>     checksum
<re_irc> }
<re_irc> and to test it:
<re_irc> #[cfg(test)]
<re_irc> mod tests {
<re_irc>     use super::*;
<re_irc>     #[test]
<re_irc>     fn test_midi_sysex_checksum() {
<re_irc>         let data1 = [0x43, 0x10, 0x4C, 0x00, 0x00, 0x7E, 0x00];
<re_irc>         let expected_checksum1 = 0x59;
<re_irc>         assert_eq!(midi_sysex_checksum(&data1), expected_checksum1);
<re_irc>         let data2 = [0x7D, 0x01, 0x02, 0x03, 0x04, 0x05];
<re_irc>         let expected_checksum2 = 0x6E;
<re_irc>         assert_eq!(midi_sysex_checksum(&data2), expected_checksum2);
<re_irc>     }
<re_irc> }
<re_irc> <@peter9477:matrix.org> This of course will work only if it's correct that "A MIDI SysEx (System Exclusive) message is a type of MIDI message used for transmitting device-specific data between MIDI-compatible devices. The checksum is typically an 8-bit value calculated from the data bytes in the message and used for error detection. One common method for calculating the checksum is to sum up all the data bytes, then take the two's complement of the least significant byte."
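For comparison, a minimal Python sketch of the scheme described in that quote: sum the data bytes, negate, and keep the low 7 bits. The function names are made up for illustration, and nothing here is claimed to match the device under discussion.

```python
def twos_complement_checksum(data):
    """Sum the bytes, negate, and keep 7 bits (MIDI data bytes are 7-bit)."""
    return (-sum(data)) & 0x7F


def verify(data, checksum):
    """With this scheme, data plus checksum sums to 0 modulo 128."""
    return (sum(data) + checksum) & 0x7F == 0


# The two input vectors from the quoted test.
data1 = bytes([0x43, 0x10, 0x4C, 0x00, 0x00, 0x7E, 0x00])
data2 = bytes([0x7D, 0x01, 0x02, 0x03, 0x04, 0x05])
print(hex(twos_complement_checksum(data1)))  # 0x63
print(hex(twos_complement_checksum(data2)))  # 0x74
```

Worked by hand, the first vector sums to 285, and (-285) mod 128 is 99 (0x63); the second sums to 140, giving 116 (0x74). These differ from the 0x59 and 0x6E in the quoted test, so those expected values are worth double-checking before relying on it.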
<re_irc> <@dngrs:matrix.org> I don't think it's a good idea to copy paste GPT output verbatim with no personal domain knowledge (thus no way of verifying it)
<re_irc> <@ubik:matrix.org> nope, that doesn't work
<re_irc> <@ubik:matrix.org> some example data if anyone fancies a challenge
<re_irc> <@peter9477:matrix.org> It's clearly tagged "GPT-4" so anyone using it would know the source, and it's offered merely for the possibility it's of some use. Not everyone wants to pay for GPT-4 access...
<re_irc> <@peter9477:matrix.org> : Is the checksum the 3rd last byte in each line?
<re_irc> <@dngrs:matrix.org> I think GPT salad is actively detrimental to communities and knowledge, even if tagged (since not everyone knows LLMs are just hallucinating with zero regard for facts), but I'll let it rest with that
<re_irc> <@ubik:matrix.org> : I don't know. I'd expect either that or that one plus the 2nd last
<re_irc> <@peter9477:matrix.org> : Any context for that data? Are any of the bytes known to be metadata (e.g. header/trailer) that isn't included in the checksum? And do you know which is the checksum data? Little/big-endian?
<re_irc> <@ubik:matrix.org> the first and last bits are the delimiters of the sysex message
<re_irc> <@peter9477:matrix.org> If it includes the second-last byte too, that makes it seem like a little-endian 16-bit value, which I think would be a little unusual for a checksum unless it's more of an actual sum.
<re_irc> <@peter9477:matrix.org> So f0 and f7 are delimiters, and everything else might be data/payload?
<re_irc> <@ubik:matrix.org> this one is a 2-byte checksum
<re_irc> <@peter9477:matrix.org> ok
<re_irc> <@ubik:matrix.org> : i also read that bytes 2, 3, 4 and 5 left-to-right are header bytes, and that only the stuff after is checksummed
<re_irc> <@ubik:matrix.org> but I'm starting to doubt they took that approach
<re_irc> <@peter9477:matrix.org> : How much do you trust those packets? And is it 7-bit data, except for the delimiters? There doesn't appear to be a length field, but also no values with 8th bit high except the f0 and f7.
<re_irc> <@peter9477:matrix.org> (re "trust"... could any be corrupted packets, like that really long one?)
<re_irc> <@ubik:matrix.org> i tested two midi sniffers, they both gave me the same result
<re_irc> <@ubik:matrix.org> yeah, that's correct. the actual data is 7-bit
<re_irc> <@ubik:matrix.org> also, i tried replaying some of them, and the device accepts them
<re_irc> <@ubik:matrix.org> if i change the checksum, it fails
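The frame layout guessed at so far (f0/f7 delimiters, four header bytes, 7-bit data, a two-byte trailer) can be sketched in Python as below. Every part of this layout is an assumption from the conversation, not a documented format, and `split_frame` is a hypothetical name.

```python
def split_frame(frame):
    """Split a SysEx frame into (header, body, checksum) under the guessed layout."""
    if frame[0] != 0xF0 or frame[-1] != 0xF7:
        raise ValueError("missing SysEx delimiters")
    inner = frame[1:-1]
    if any(b & 0x80 for b in inner):
        raise ValueError("data byte with bit 7 set")
    header = inner[:4]          # assumption: bytes 2-5 of the frame are header
    body = inner[4:-2]          # assumption: everything else before the trailer
    lo, hi = inner[-2], inner[-1]
    checksum = lo | (hi << 8)   # assumption: little-endian two-byte trailer
    return header, body, checksum


header, body, checksum = split_frame(
    bytes.fromhex("f000320949000040020000000018000000026e01f7")
)
print(header.hex(), body.hex(), f"{checksum:04x}")
```

Keeping the split in one place makes it cheap to try other guesses, such as treating the header as part of the checksummed data.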
<re_irc> <@peter9477:matrix.org> : Well, I give up for now. Maybe with your better context you can make more progress. I threw this together (in Python) to explore, but although it approximates or even matches some of the values, others are wildly different like 4 of the last 5.
<re_irc> TESTCASES = [
<re_irc> 'f000320949000040020000000018000000026e01f7',
<re_irc> 'f000320949000040020000000018000000017001f7',
<re_irc> 'f000320949000040020200000018000000095c01f7',
<re_irc> 'f000320949000040020900000018000000015e01f7',
<re_irc> 'f0003209490000400205000000180000003f6a00f7',
<re_irc> 'f000320949000040020a00000018000000005e01f7',
<re_irc> 'f000320949000040020a00000018000000015c01f7',
<re_irc> 'f000320949000040020b00000018000000005c01f7',
<re_irc> 'f000320949000040020b00000018000000015a01f7',
<re_irc> 'f000320949000040020c00000018000000005a01f7',
<re_irc> 'f000320949000040020c00000018000000015801f7',
<re_irc> 'f000320949000000000060000010000000001c03f7',
<re_irc> 'f000320949000000000060000010000000011a03f7',
<re_irc> 'f000320d41030040020000000000060000020c2828502650001f1004081020000000001010502001163c2e78386020400001000000106041030568342341210101010204000000403af7',
<re_irc> 'f000320d49000000000060000010000000011a03f7',
<re_irc> 'f000320d410000000000600000100000004e01f7',
<re_irc> ]
<re_irc> seen = set()
<re_irc> def checksum(data):
<re_irc>     payload = data[4:-3]
<re_irc>     theirs = data[-3] | data[-2] << 8
<re_irc>     seen.update(payload)
<re_irc>     chk = sum(x for x in payload)
<re_irc>     chk = chk << 1
<re_irc>     return chk, theirs
<re_irc> for case in TESTCASES:
<re_irc>     data = bytes.fromhex(case)
<re_irc>     chk, theirs = checksum(data)
<re_irc>     print(f'{theirs:04x} -> {chk:04x}: {case}')
<re_irc> print(f'seen: {", ".join(f"{x:02x}" for x in sorted(seen))}')
<re_irc> Output (original checksum at left, then mine, then raw data):
<re_irc> 0170 -> 0148: f000320949000040020000000018000000017001f7
<re_irc> 016e -> 014a: f000320949000040020000000018000000026e01f7
<re_irc> 015c -> 015c: f000320949000040020200000018000000095c01f7
<re_irc> 015e -> 015a: f000320949000040020900000018000000015e01f7
<re_irc> 006a -> 01ce: f0003209490000400205000000180000003f6a00f7
<re_irc> 015e -> 015a: f000320949000040020a00000018000000005e01f7
<re_irc> 015c -> 015c: f000320949000040020a00000018000000015c01f7
<re_irc> 015c -> 015c: f000320949000040020b00000018000000005c01f7
<re_irc> 015a -> 015e: f000320949000040020b00000018000000015a01f7
<re_irc> 015a -> 015e: f000320949000040020c00000018000000005a01f7
<re_irc> 0158 -> 0160: f000320949000040020c00000018000000015801f7
<re_irc> 031c -> 0172: f000320949000000000060000010000000001c03f7
<re_irc> 031a -> 0174: f000320949000000000060000010000000011a03f7
<re_irc> 3a40 -> 0d00: f000320d41030040020000000000060000020c2828502650001f1004081020000000001010502001163c2e78386020400001000000106041030568342341210101010204000000403af7
<re_irc> 031a -> 0174: f000320d49000000000060000010000000011a03f7
<re_irc> 014e -> 0162: f000320d410000000000600000100000004e01f7
<re_irc> seen: 00, 01, 02, 03, 04, 05, 06, 08, 09, 0a, 0b, 0c, 10, 16, 18, 1f, 20, 21, 23, 26, 28, 2e, 34, 38, 3c, 3f, 40, 41, 49, 50, 60, 68, 78
<re_irc> <@peter9477:matrix.org> It feels like something critical is missing. I tried checking if there might be a parity bit included in the checksum at least (use of 7-bit data suggests ancient tech) but that didn't "feel" right either. I've also seen checksums that do things like add 1 for every byte included, effectively including a count after adding in all the data, which might fit with why the long one gets up to 0x3a40, but ... no luck yet.
<re_irc> <@peter9477:matrix.org> It does seem like all the checksums are at least <<1 for some reason, as none have the low bit set. Maybe that will help.
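The "low bit is never set" observation is easy to check mechanically as more captures come in. A small Python sketch, using four of the frames from the list above and assuming the third-last byte is the low byte of the checksum:

```python
SAMPLES = [
    "f000320949000040020000000018000000026e01f7",
    "f0003209490000400205000000180000003f6a00f7",
    "f000320949000000000060000010000000001c03f7",
    "f000320d410000000000600000100000004e01f7",
]


def low_bit_clear(samples):
    """True if the assumed low checksum byte (third-last) is even in every frame."""
    return all(bytes.fromhex(s)[-3] & 1 == 0 for s in samples)


print(low_bit_clear(SAMPLES))  # True
```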
<re_irc> <@peter9477:matrix.org> (Edited to fix copy/paste glitch with the long output.)
<re_irc> <@ubik:matrix.org> thanks a lot!
<re_irc> <@ubik:matrix.org> trying to decompile the code
<re_irc> <@peter9477:matrix.org> It might also help to focus in on a few of those where only a couple of bits are different, especially the 0x015c ones where you have 3-4 examples. Maybe collect more if you can, to compare.
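One way to follow that suggestion is to pair up frames that differ in a single data byte and see how the trailer moves. The Python sketch below does this for four of the captured frames; `parse` and `near_pairs` are hypothetical helpers, and the trailer is again assumed to be a little-endian two-byte value.

```python
from itertools import combinations

FRAMES = [
    "f000320949000040020a00000018000000005e01f7",
    "f000320949000040020a00000018000000015c01f7",
    "f000320949000040020b00000018000000005c01f7",
    "f000320949000040020b00000018000000015a01f7",
]


def parse(frame_hex):
    data = bytes.fromhex(frame_hex)
    body = data[1:-3]                  # everything between f0 and the trailer
    chk = data[-3] | (data[-2] << 8)   # assumption: little-endian trailer
    return body, chk


def near_pairs(frames, max_diff=1):
    """Yield (differing positions, sum delta, checksum delta) for similar frames."""
    parsed = [parse(f) for f in frames]
    for (a, ca), (b, cb) in combinations(parsed, 2):
        if len(a) != len(b):
            continue
        diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
        if 0 < len(diff) <= max_diff:
            yield diff, sum(b) - sum(a), cb - ca


for positions, d_sum, d_chk in near_pairs(FRAMES):
    print(positions, d_sum, d_chk)
```

For these four frames, every single-byte change of +1 in the data coincides with a change of -2 in the trailer. That fits the remark that the values look shifted left by one bit, but on its own it does not identify the algorithm.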
<re_irc> <@sourcebox:matrix.org> : I've not read the whole thread, but what I can say for sure is that there are at least 2 different options how SysEx checksums are calculated with different devices: sum and xor. Complement can also be applied additionally.
<re_irc> <@sourcebox:matrix.org> * ways
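The sum/xor variants mentioned here, with and without a complement, can be brute-forced against a captured value. A minimal Python sketch follows; the variant names and the 7-bit masking are assumptions for illustration, and the example bytes are made up rather than taken from the device above.

```python
from functools import reduce
from operator import xor


def candidates(body):
    """Compute a few simple 7-bit checksum variants over the given bytes."""
    s = sum(body) & 0x7F
    x = reduce(xor, body, 0) & 0x7F
    return {
        "sum": s,
        "sum, two's complement": (-s) & 0x7F,
        "sum, one's complement": ~s & 0x7F,
        "xor": x,
        "xor, one's complement": ~x & 0x7F,
    }


def matching(body, expected):
    """Names of the variants whose result equals the captured checksum."""
    return [name for name, value in candidates(body).items() if value == expected]


# Made-up example bytes: only the two's-complement sum matches 0x41.
print(matching(bytes([0x40, 0x00, 0x7F, 0x00]), 0x41))
```

Running `matching` over several frames, and over different guesses for which bytes are covered, narrows things down quickly when the scheme is one of these simple ones.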