#scopehal on 2023-11-20 — irc logs at libera.irclog.whitequark.org

2023-10-21 05:40 azonenberg changed the topic of #scopehal to: ngscopeclient, libscopehal, and libscopeprotocols development and testing | https://github.com/ngscopeclient/scopehal-apps | Logs: https://libera.irclog.whitequark.org/scopehal

00:14 juri__ has joined #scopehal

00:15 juri_ has quit [Ping timeout: 255 seconds]

00:19 <azonenberg> balrog, miek, tnt: when you get a chance can you let me know what make/model scope(s) you are actively using ngscopeclient with, and what commit hash of scopehal-apps you're testing with, and if there are any problems related to the scope driver?

00:20 <azonenberg> (and anyone else who wants to throw in a test report that's different, i don't need multiple reports of "works fine" or of the same bug)

00:20 <azonenberg> i'm trying to build out the dashboard we discussed on the last dev call at https://github.com/ngscopeclient/scopehal-apps/wiki

00:20 <azonenberg> it may move, and eventually i want some sort of semi automated way to report status

00:20 <azonenberg> but for starters, periodic reports of "X got scope Y working with commit Z" are enough

00:21 <d1b2> <azonenberg> also @hansemro same goes for you and whatever siglent you're testing on

00:48 juri__ has quit [Read error: Connection reset by peer]

00:48 juri_ has joined #scopehal

00:58 <d1b2> <miek__> @azonenberg i've been using it a bit lately on an Agilent MSO6104A, 0f7fe878d00085001dcaf6676bdc4f2c9cf8b775, no issues

01:08 juri_ has quit [Read error: Connection reset by peer]

01:09 juri_ has joined #scopehal

01:22 <_whitenotifier-b> [scopehal-apps] juh2600 opened issue #632: "Setup > Manage instruments" menu item disabled after opening it once - https://github.com/ngscopeclient/scopehal-apps/issues/632

01:37 <_whitenotifier-b> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://github.com/ngscopeclient/scopehal-apps/compare/ecc525e6d344...1cb68dba7847

01:37 <_whitenotifier-b> [scopehal-apps] azonenberg 1cb68db - Fixed bug where m_manageInstrumentsDialog was not set to null when the dialog was closed, making it impossible to re-open the dialog after closing it once. Fixes #632.

01:37 <_whitenotifier-b> [scopehal-apps] azonenberg closed issue #632: "Setup > Manage instruments" menu item disabled after opening it once - https://github.com/ngscopeclient/scopehal-apps/issues/632

02:30 <_whitenotifier-b> [scopehal] juh2600 opened issue #815: Stack smashing or FPE when changing coupling DC > Ground > DC - https://github.com/ngscopeclient/scopehal/issues/815

02:30 <d1b2> <david.rysk> Haven't been running/testing lately, when I do it's with a Siglent usually

03:10 <azonenberg> juh: so i see "failed to read 0 bytes" in the log

03:11 <juh> aye, and failed to bind to an empty buffer or something like that

03:11 <juh> I'm wondering if the rigol maybe just doesn't report anything when a channel is coupled to ground?

03:11 <azonenberg> I suspect these are all related and the root cause is earlier on in the trace

03:12 <azonenberg> Shouldn't be the case. ground coupling is normally used to measure the scope's inherent noise level

03:12 <azonenberg> and generally to establish a baseline for the frontend performance when it's not measuring anything

03:12 <azonenberg> the better question is if ds1000z even supports ground coupling

03:12 <azonenberg> and if it doesn't perhaps it misbehaves when you try to enable it

03:13 <juh> mine supports it, yeah...or at least says it does

03:13 <azonenberg> yeah ok

03:13 <azonenberg> so that's not the problem then

03:14 <azonenberg> anyway at this point my assumption (which might be disproven later) is that something is scribbling over memory and that the sigfpe and stack smashing are dependent on random elements of e.g. what waveform data gets dumped over some other variable

03:19 <azonenberg> juh: the ground coupling is key here?

03:19 <juh> I haven't tried it with AC, I'll do that now

03:19 <azonenberg> have you tried changing to say AC coupling, or some other setting like probe attenuation?

03:19 <azonenberg> ok

03:19 <azonenberg> yeah the more you can narrow it down the better

03:21 <_whitenotifier-b> [scopehal-apps] juh2600 opened issue #633: [UX] Provide feedback to the user about whether the application is thinking/waiting - https://github.com/ngscopeclient/scopehal-apps/issues/633

03:22 <azonenberg> Wrt #633, we don't have any wait cursors or anything because the intent has always been to never block

03:23 <azonenberg> anything that makes it freeze for any significant amount of time is a bug, generally speaking

03:23 <juh> Not a blocking cursor, more like a uhhh

03:23 <azonenberg> as much as possible anything slow should be done in a background thread

03:23 <azonenberg> and load asynchronously

03:23 <azonenberg> changing any settings that are slow should be dispatched when the scope is available

03:23 <juh> https://i.imgur.com/xbhLjQm.png this kinda deal, so you can still interact, but you know it's doing something

03:23 <azonenberg> yeah i know

03:24 <azonenberg> and then we do currently block when loading a scopesession, that's tricky to do async but is on the longer term todo

03:26 <azonenberg> Anyway, so the first thing that jumps out at me looking at your log is that when it's in the bad state

03:26 <azonenberg> it's reading the waveform in a very strange way

03:26 <azonenberg> (the stack smashing log)

03:28 <azonenberg> See how it's sending WASV:STAR 1, 3, 5... and then WAV:STOP 1200?

03:28 <azonenberg> WAV:STAR*

03:29 <azonenberg> That's not right. For less than i think 250K points it should read entirely in one block

03:29 <azonenberg> for larger blocks, they should be consecutive and non-overlapping

03:29 <_whitenotifier-b> [scopehal] juh2600 commented on issue #815: Stack smashing or FPE when changing coupling DC > Ground > DC - https://github.com/ngscopeclient/scopehal/issues/815#issuecomment-1818171423

03:29 <azonenberg> My conjecture is that we are allocating rx space for 1200 samples of data, then reading multiple just-under-1200-sample blocks, and overrunning the buffer as a result

03:30 <azonenberg> And the root cause of the bug is whatever is making it do that
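
The "consecutive and non-overlapping" block layout described above can be sketched as follows. This is a minimal illustration, not the RigolOscilloscope driver code: `chunk_ranges` is a hypothetical helper, the 250,000-point block limit is the figure recalled in the log ("i think 250K"), and treating WAV:STAR/WAV:STOP as 1-based inclusive indices is an assumption.

```python
def chunk_ranges(total_points, max_block=250_000):
    """Return 1-based inclusive (start, stop) pairs that cover total_points.

    Hypothetical helper: max_block is the per-read limit mentioned in the
    log, and the 1-based inclusive indexing is an assumption.
    """
    ranges = []
    start = 1
    while start <= total_points:
        # Each block begins right after the previous one ends, so the
        # blocks never overlap and never leave a gap.
        stop = min(start + max_block - 1, total_points)
        ranges.append((start, stop))
        start = stop + 1
    return ranges


# A short record fits in one block; a long one is split into
# consecutive blocks whose sizes sum to the allocated buffer size.
print(chunk_ranges(1200))
print(chunk_ranges(600_000))
```

With this layout the total number of samples requested always equals the buffer size, whereas start offsets of 1, 3, 5... with a fixed stop of 1200 request far more than 1200 samples in total.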

03:31 <azonenberg> [SCPISocketTransport::SendCommand] [10.8.0.207] Sending WAV:PRE?

03:31 <azonenberg> [SCPISocketTransport::ReadReply] [10.8.0.207] Got RUN

03:31 <azonenberg> ok that is very wrong lol

03:31 <azonenberg> ok i think this is a race condition in the driver and i think i have a handle as to what might be going on

03:31 <azonenberg> give me some time and i'll refactor the driver to use the new queued command API

03:31 <azonenberg> which may fix it

03:33 <azonenberg> That should also improve responsiveness, because write-only commands like SetChannelVoltageRange() won't need to block until the mutex is free
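
The idea that write-only commands need not wait for the transport can be sketched roughly as below. This is only an illustration of the general queuing pattern, not the libscopehal queued command API: the class `QueuedTransport`, its methods, and the example command strings are all hypothetical.

```python
import threading
from collections import deque


class QueuedTransport:
    """Hypothetical wrapper illustrating queued writes (not libscopehal)."""

    def __init__(self, send):
        self._send = send                     # callable that writes one command
        self._queue = deque()
        self._queue_lock = threading.Lock()   # guards only the queue, held briefly
        self._io_lock = threading.Lock()      # guards the actual instrument I/O

    def send_queued(self, cmd):
        """Enqueue a write-only command and return immediately."""
        with self._queue_lock:
            self._queue.append(cmd)

    def flush(self):
        """Send everything queued so far, in order (called by the I/O thread)."""
        with self._io_lock:
            while True:
                with self._queue_lock:
                    if not self._queue:
                        return
                    cmd = self._queue.popleft()
                self._send(cmd)


sent = []
transport = QueuedTransport(sent.append)
transport.send_queued(":CHAN1:COUP GND")   # illustrative command strings
transport.send_queued(":CHAN1:COUP DC")
transport.flush()
print(sent)
```

The caller of `send_queued` only ever takes the short-lived queue lock, so a UI thread changing a setting does not stall behind a long waveform download that holds the I/O lock.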

03:34 <azonenberg> hmm, on second thought, i don't think it's a race condition

03:34 <azonenberg> [SCPISocketTransport::SendCommand] [10.8.0.207] Sending :TRIG:STAT?

03:34 <azonenberg> [SCPISocketTransport::ReadReply] [10.8.0.207] Got

03:35 <azonenberg> The queued command conversion should still happen, i'll file a ticket, but let's not touch that until we've found the bug

03:35 <_whitenotifier-b> [scopehal] azonenberg opened issue #816: Rigol: refactor to use queued SCPITransport API - https://github.com/ngscopeclient/scopehal/issues/816

03:58 Degi has quit [Ping timeout: 256 seconds]

03:58 Degi_ has joined #scopehal

03:58 Degi_ is now known as Degi

03:59 <_whitenotifier-b> [scopehal] juh2600 commented on issue #815: Stack smashing or FPE when changing coupling DC > Ground > DC - https://github.com/ngscopeclient/scopehal/issues/815#issuecomment-1818189791

04:00 <_whitenotifier-b> [scopehal] azonenberg commented on issue #815: Stack smashing or FPE when changing coupling DC > Ground > DC - https://github.com/ngscopeclient/scopehal/issues/815#issuecomment-1818190650

04:00 <_whitenotifier-b> [scopehal] azonenberg edited issue #815: Memory corruption when changing coupling DC > Ground > DC - https://github.com/ngscopeclient/scopehal/issues/815

04:03 <azonenberg> juh: BTW, a feature you might not know of (which won't work with the rigol driver until i refactor it to use the queued command API, but you should know of once I do that) is the window | SCPI console dialog

04:03 <azonenberg> Most instruments only let you have one concurrent TCP connection open so you can't telnet/netcat to the scope to debug something while you have ngscopeclient open

04:04 <azonenberg> the built in console is (again, if the driver uses the queued API) properly interlocked so it won't step on critical sections within the driver, but lets you send commands and read back replies while you're connected

04:04 <azonenberg> in order to, say, test out syntax for a new feature you want to add or something

04:06 <_whitenotifier-b> [scopehal] juh2600 opened issue #817: Occasional floating point exception when triggering capture from ground-coupled channel - https://github.com/ngscopeclient/scopehal/issues/817

04:08 <_whitenotifier-b> [scopehal] azonenberg edited issue #817: Rigol: Occasional floating point exception when triggering capture from ground-coupled channel - https://github.com/ngscopeclient/scopehal/issues/817

04:09 <_whitenotifier-b> [scopehal] azonenberg commented on issue #817: Rigol: Occasional floating point exception when triggering capture from ground-coupled channel - https://github.com/ngscopeclient/scopehal/issues/817#issuecomment-1818196413

04:14 <azonenberg> juh: ok so this is definitely looking like a rigol firmware bug. we just need to add a workaround

04:15 <azonenberg> In the stack-smashing-filtered pcap, look at byte offset 0x3024 in the data coming back from the scope

04:16 <_whitenotifier-b> [scopehal-apps] juh2600 opened issue #634: Unclear how to delete a filter graph node - https://github.com/ngscopeclient/scopehal-apps/issues/634

04:16 <azonenberg> and offset 0x21b in the command stream

04:17 <azonenberg> (you can see that in wireshark if you go "follow TCP stream" and select show data as hexdump)

04:17 <azonenberg> the driver sends WAV:DATA?

04:17 <azonenberg> The scope replies with #9000000000

04:17 <azonenberg> then a newline

04:18 <azonenberg> Which means "nine ascii length digits, 000000000 bytes of data"

04:18 <azonenberg> Which is a bug, because just prior we sent WAV:PRE? and it returned 0, 2,1200, <snip>

04:18 <azonenberg> which means we SHOULD have 1200 samples in the reply not zero

04:20 <azonenberg> So what's happening is, the driver reads the length header, sees there's zero bytes of sample data, then stops

04:22 <azonenberg> It does *not* read the newline. We hit the "ran out of data after %zu points" path, but there is a bug in that we do not read and discard the trailing newline there

04:22 <azonenberg> So then we send TRIG:STAT? and the newline is returned

04:22 <azonenberg> then we send CHAN1.WAV:PRE? and get back the RUN meant as a reply to TRIG:STAT?

04:22 <azonenberg> and from then on we're desynced and TSHTF
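
The desync described above can be reproduced with a small parser for the `#<n><length><data>` block format. This is a sketch under stated assumptions, not the actual RigolOscilloscope code: `read_definite_block` is a hypothetical function, an in-memory byte stream stands in for the socket, and the reply bytes are modelled on the `#9000000000` plus newline seen in the capture.

```python
import io


def read_definite_block(stream):
    """Read one '#<n><length><payload>' reply plus its trailing newline.

    Hypothetical helper: stream is any binary file-like object.
    """
    if stream.read(1) != b"#":
        raise ValueError("expected '#' at start of block")
    ndigits = int(stream.read(1))        # '9' means nine length digits follow
    length = int(stream.read(ndigits))   # '000000000' means zero payload bytes
    payload = stream.read(length)
    if len(payload) != length:
        raise ValueError("short read in block payload")
    # Always consume the terminator, even when length is zero. Leaving it
    # in the stream makes it look like the reply to the next query.
    if stream.read(1) != b"\n":
        raise ValueError("expected newline after block")
    return payload


# Simulated instrument output: an empty waveform block, then the reply
# to the following TRIG:STAT? query.
stream = io.BytesIO(b"#9000000000\nRUN\n")
payload = read_definite_block(stream)
next_reply = stream.readline()
print(len(payload), next_reply)
```

Because the newline is consumed together with the empty block, the next line read from the stream is the genuine `RUN` reply, and later queries stay paired with their own answers.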

04:29 <_whitenotifier-b> [scopehal] azonenberg pushed 4 commits to master [+0/-0/±4] https://github.com/ngscopeclient/scopehal/compare/2f49730aefb5...3a5636c3c565

04:29 <_whitenotifier-b> [scopehal] azonenberg c8ffb64 - RigolOscilloscope: always read trailing newline if the data stream ends early

04:29 <_whitenotifier-b> [scopehal] azonenberg da18976 - RigolOscilloscope: use AllocateAnalogWaveform() to reduce unnecessary allocations

04:29 <_whitenotifier-b> [scopehal] azonenberg db57335 - RigolOscilloscope: don't attempt to read a waveform with zero samples in it. See #817.

04:29 <_whitenotifier-b> [scopehal] azonenberg 3a5636c - RigolOscilloscope: avoid leaking waveform if we unexpectedly get zero data in reply to a waveform readback request

04:29 <_whitenotifier-b> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://github.com/ngscopeclient/scopehal-apps/compare/1cb68dba7847...19b65f1f87fb

04:29 <_whitenotifier-b> [scopehal-apps] azonenberg 19b65f1 - Updated submodules

04:37 <_whitenotifier-b> [scopehal-apps] azonenberg commented on issue #634: Unclear how to delete a filter graph node - https://github.com/ngscopeclient/scopehal-apps/issues/634#issuecomment-1818215629

04:38 <_whitenotifier-b> [scopehal-apps] azonenberg commented on issue #634: Unclear how to delete a filter graph node - https://github.com/ngscopeclient/scopehal-apps/issues/634#issuecomment-1818216789

04:40 <_whitenotifier-b> [scopehal-apps] azonenberg commented on issue #634: Unclear how to delete a filter graph node - https://github.com/ngscopeclient/scopehal-apps/issues/634#issuecomment-1818217886

06:47 <_whitenotifier-b> [scopehal] azonenberg opened issue #818: Agilent: refactor to use queued command API - https://github.com/ngscopeclient/scopehal/issues/818

06:47 <_whitenotifier-b> [scopehal] azonenberg opened issue #819: KeysightDCA: refactor to use queued command API - https://github.com/ngscopeclient/scopehal/issues/819

06:52 <_whitenotifier-b> [scopehal] juh2600 closed issue #815: Memory corruption when changing coupling DC > Ground > DC - https://github.com/ngscopeclient/scopehal/issues/815

06:52 <_whitenotifier-b> [scopehal] juh2600 commented on issue #815: Memory corruption when changing coupling DC > Ground > DC - https://github.com/ngscopeclient/scopehal/issues/815#issuecomment-1818335303

07:54 <_whitenotifier-b> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://github.com/ngscopeclient/scopehal/compare/3a5636c3c565...10899c0f58e6

07:54 <_whitenotifier-b> [scopehal] azonenberg 10899c0 - RigolOscilloscope: general code cleanup, refactored to use queued command API. Fixes #816.

07:54 <_whitenotifier-b> [scopehal] azonenberg closed issue #816: Rigol: refactor to use queued SCPITransport API - https://github.com/ngscopeclient/scopehal/issues/816

07:54 <_whitenotifier-b> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://github.com/ngscopeclient/scopehal-apps/compare/19b65f1f87fb...45d255de8795

07:54 <_whitenotifier-b> [scopehal-apps] azonenberg 45d255d - Updated submodules

10:31 juri_ has quit [Read error: Connection reset by peer]

10:31 juri_ has joined #scopehal

11:06 juri_ has quit [Read error: Connection reset by peer]

11:22 juri_ has joined #scopehal

18:36 nelgau has quit [Ping timeout: 260 seconds]

21:14 nelgau has joined #scopehal