<azonenberg>
balrog, miek, tnt: when you get a chance can you let me know what make/model scope(s) you are actively using ngscopeclient with, and what commit hash of scopehal-apps you're testing with, and if there are any problems related to the scope driver?
<azonenberg>
(and anyone else who wants to throw in a test report that's different, i don't need multiple reports of "works fine" or of the same bug)
<_whitenotifier-b>
[scopehal-apps] azonenberg 1cb68db - Fixed bug where m_manageInstrumentsDialog was not set to null when the dialog was closed, making it impossible to re-open the dialog after closing it once. Fixes #632.
<d1b2>
<david.rysk> Haven't been running/testing lately, when I do it's with a Siglent usually
<azonenberg>
juh: so i see "failed to read 0 bytes" in the log
<juh>
aye, and failed to bind to an empty buffer or something like that
<juh>
I'm wondering if the rigol maybe just doesn't report anything when a channel is coupled to ground?
<azonenberg>
I suspect these are all related and the root cause is earlier on in the trace
<azonenberg>
Shouldn't be the case. ground coupling is normally used to measure the scope's inherent noise level
<azonenberg>
and generally to establish a baseline for the frontend performance when it's not measuring anything
<azonenberg>
the better question is if ds1000z even supports ground coupling
<azonenberg>
and if it doesn't perhaps it misbehaves when you try to enable it
<juh>
mine supports it, yeah...or at least says it does
<azonenberg>
yeah ok
<azonenberg>
so that's not the problem then
<azonenberg>
anyway at this point my assumption (which might be disproven later) is that something is scribbling over memory and that the sigfpe and stack smashing are dependent on random elements of e.g. what waveform data gets dumped over some other variable
<azonenberg>
juh: the ground coupling is key here?
<juh>
I haven't tried it with AC, I'll do that now
<azonenberg>
have you tried changing to say AC coupling, or some other setting like probe attenuation?
<azonenberg>
ok
<azonenberg>
yeah the more you can narrow it down the better
<azonenberg>
My conjecture is that we are allocating rx space for 1200 samples of data, then reading multiple just-under-1200-sample blocks, and overrunning the buffer as a result
<azonenberg>
And the root cause of the bug is whatever is making it do that
<azonenberg>
[SCPISocketTransport::ReadReply] [10.8.0.207] Got RUN
<azonenberg>
ok that is very wrong lol
<azonenberg>
ok i think this is a race condition in the driver and i think i have a handle as to what might be going on
<azonenberg>
give me some time and i'll refactor the driver to use the new queued command API
<azonenberg>
which may fix it
<azonenberg>
That should also make it more responsive, because write-only commands like SetChannelVoltageRange() won't need to block until the mutex is free
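A rough illustration of why queuing helps with responsiveness: the caller hands a write-only command to a queue and returns immediately, while a worker thread sends commands in order. This is a generic sketch, not the scopehal queued command API; `QueuedTransport`, `queue_command`, and `flush` are hypothetical names invented for the example.

```python
import queue
import threading


class QueuedTransport:
    """Hypothetical wrapper: callers enqueue commands, one worker sends them in order."""

    def __init__(self, send_fn):
        self._send = send_fn              # function that actually writes to the instrument
        self._q = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def queue_command(self, cmd):
        # Returns immediately; the caller never waits on the transport
        self._q.put(cmd)

    def flush(self):
        # Block until every queued command has been sent
        self._q.join()

    def _run(self):
        while True:
            cmd = self._q.get()
            try:
                self._send(cmd)
            finally:
                self._q.task_done()


sent = []
transport = QueuedTransport(sent.append)   # list.append stands in for a socket write
transport.queue_command("CMD_A")           # placeholder command strings
transport.queue_command("CMD_B")
transport.flush()
print(sent)
```

A query that needs a reply would flush the queue first, so replies still come back in the order the commands went out.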
<azonenberg>
hmm, on second thought, i don't think it's a race condition
<azonenberg>
juh: BTW, a feature you might not know of (which won't work with the rigol driver until i refactor it to use the queued command API, but you should know of once I do that) is the window | SCPI console dialog
<azonenberg>
Most instruments only let you have one concurrent TCP connection open so you can't telnet/netcat to the scope to debug something while you have ngscopeclient open
<azonenberg>
the built in console is (again, if the driver uses the queued API) properly interlocked so it won't step on critical sections within the driver, but lets you send commands and read back replies while you're connected
<azonenberg>
in order to, say, test out syntax for a new feature you want to add or something
<azonenberg>
and offset 0x21b in the command stream
<azonenberg>
(you can see that in wireshark if you go "follow TCP stream" and select show data as hexdump)
<azonenberg>
the driver sends WAV:DATA?
<azonenberg>
The scope replies with #9000000000
<azonenberg>
then a newline
<azonenberg>
Which means "nine ascii length digits, 000000000 bytes of data"
<azonenberg>
Which is a bug, because just prior we sent WAV:PRE? and it returned 0,2,1200, <snip>
<azonenberg>
which means we SHOULD have 1200 samples in the reply not zero
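For reference, the reply format being decoded here is the IEEE 488.2 definite-length block: `#`, one digit giving the number of length digits, then the length itself. A minimal sketch of the header decode follows; `parse_block_header` is a hypothetical helper written for this note, not a function from the Rigol driver.

```python
def parse_block_header(header: bytes) -> tuple:
    """Decode a definite-length block header such as b'#9000001200'.

    Returns (header_size_in_bytes, payload_length_in_bytes).
    """
    if len(header) < 2 or header[0:1] != b"#":
        raise ValueError("not a definite-length block header")
    ndigits = int(header[1:2])            # one ASCII digit: how many length digits follow
    if ndigits == 0:
        raise ValueError("indefinite-length block, not handled in this sketch")
    digits = header[2:2 + ndigits]
    if len(digits) != ndigits or not digits.isdigit():
        raise ValueError("truncated or malformed length field")
    return 2 + ndigits, int(digits)


print(parse_block_header(b"#9000000000"))  # the reply seen in the capture
print(parse_block_header(b"#9000001200"))  # a header announcing a 1200-byte payload
```

The first call yields a payload length of zero, which is the case the driver was not handling.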
<azonenberg>
So what's happening is, the driver reads the length header, sees there's zero bytes of sample data, then stops
<azonenberg>
It does *not* read the newline. We hit the "ran out of data after %zu points" path, but the bug is that this path does not read and discard the trailing newline
<azonenberg>
So then we send TRIG:STAT? and the newline is returned
<azonenberg>
then we send CHAN1.WAV:PRE? and get back the RUN meant as a reply to TRIG:STAT?
<azonenberg>
and from then on we're desynced and TSHTF
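The desync can be reproduced with a toy model. This is a sketch under stated assumptions: `FakeScope` and `poll` are hypothetical stand-ins for the socket and the driver's acquisition loop, the canned replies are modelled on the capture described in the chat, and `<preamble>` is a placeholder for the real WAV:PRE? reply.

```python
from collections import deque


class FakeScope:
    """Hypothetical instrument: replies are queued and read back strictly in order."""

    def __init__(self):
        self.rx = deque()

    def send(self, cmd):
        if cmd == "WAV:DATA?":
            self.rx.append("#9000000000")  # block header claiming zero bytes of data
            self.rx.append("\n")           # trailing newline after the empty block
        elif cmd == "TRIG:STAT?":
            self.rx.append("RUN")
        elif cmd == "WAV:PRE?":
            self.rx.append("<preamble>")   # placeholder, not a real preamble

    def read(self):
        return self.rx.popleft()


def poll(scope, drain_newline):
    """One acquisition pass; returns what the caller believes each query replied."""
    scope.send("WAV:DATA?")
    header = scope.read()
    if header == "#9000000000" and drain_newline:
        scope.read()                       # read and discard the trailing newline
    scope.send("TRIG:STAT?")
    trigger_state = scope.read()
    scope.send("WAV:PRE?")
    preamble = scope.read()
    return trigger_state, preamble


print(poll(FakeScope(), drain_newline=False))  # replies shifted by one
print(poll(FakeScope(), drain_newline=True))   # replies line up with queries
```

With `drain_newline=False` the trigger query receives the stray newline and the preamble query receives `RUN`, which matches the sequence seen in the log.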
<_whitenotifier-b>
[scopehal] azonenberg c8ffb64 - RigolOscilloscope: always read trailing newline if the data stream ends early
<_whitenotifier-b>
[scopehal] azonenberg da18976 - RigolOscilloscope: use AllocateAnalogWaveform() to reduce unnecessary allocations
<_whitenotifier-b>
[scopehal] azonenberg db57335 - RigolOscilloscope: don't attempt to read a waveform with zero samples in it. See #817.
<_whitenotifier-b>
[scopehal] azonenberg 3a5636c - RigolOscilloscope: avoid leaking waveform if we unexpectedly get zero data in reply to a waveform readback request