<_whitenotifier-3>
[scopehal-waveforms-bridge] azonenberg 2ad656c - Force 64-bit values in protocol even when building on a 32-bit environment (such as ADP3450 embedded OS)
<_whitenotifier-3>
[scopehal-waveforms-bridge] azonenberg 935fd3d - Cap interpolation to max of 10 samples, fixed possible race condition in sending memory depth
<_whitenotifier-3>
[scopehal-waveforms-bridge] azonenberg 058bc6f - Initial support for IP connection to the remote instrument
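(A minimal C++ sketch of the idea behind the "force 64-bit values" commit above, using hypothetical helper names rather than the bridge's actual serialization code: on a 32-bit host such as the ADP3450's embedded OS, size_t is only 4 bytes, so protocol fields are widened to an explicit uint64_t before they hit the wire.)
    #include <cstdint>
    #include <vector>

    // Pack a value as exactly 8 little-endian bytes regardless of the host word size.
    static void AppendUint64(std::vector<uint8_t>& buf, uint64_t value)
    {
        for(int i = 0; i < 8; i++)
            buf.push_back(static_cast<uint8_t>((value >> (8 * i)) & 0xff));
    }

    // Hypothetical example: memory depth is a size_t locally, but is promoted to
    // uint64_t before serialization so 32- and 64-bit builds produce the same layout.
    std::vector<uint8_t> SerializeDepth(size_t depth)
    {
        std::vector<uint8_t> buf;
        AppendUint64(buf, static_cast<uint64_t>(depth));
        return buf;
    }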
<tnt>
azonenberg: demo scope, zoom out making sure the end of the waveform is in view.
<tnt>
I had partially debugged the problem and it's related to the termination condition of the loop in the shader, which uses float and so is probably subject to precision issues (possibly platform/hw dependent).
<d1b2>
<johnsel> @azonenberg I'm having some trouble building the latest commit
<d1b2>
<johnsel> [ 78%] Linking CXX executable ngscopeclient
/usr/bin/ld: ../../lib/scopeprotocols/libscopeprotocols.so: undefined reference to `InvertFilter::InvertFilter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ../../lib/scopeprotocols/libscopeprotocols.so: undefined reference to `InvertFilter::GetProtocolName[abi:cxx11]'
collect2: error: ld returned 1 exit status
make[2]:
<d1b2>
<azonenberg> glscopeclient could do it but ng can't (yet)
<d1b2>
<johnsel> ah well
<d1b2>
<johnsel> another quick question; is your scpi bridge updated to talk to aklabs scopehal driver?
<d1b2>
<azonenberg> i have not used the aklabs driver in years, its essentially abandoned until i get back to building my own scopes again
<d1b2>
<azonenberg> i have no idea what if anything it does today
<d1b2>
<johnsel> any other reference I can use for a minimal driver for the thunderscope?
<d1b2>
<azonenberg> we use the RemoteBridgeOscilloscope class to talk to pico, digilent, etc scopes via external usb bridges. check the pico or digilent drivers for references of minimal drivers there
<d1b2>
<azonenberg> this is what i expect thunderscope to be based on
<d1b2>
<azonenberg> (RemoteBridge is the base with a common set of commands, then you extend with instrument specific stuff)
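(A rough C++ skeleton of such a minimal driver, with simplified, illustrative method names rather than the exact scopehal API; check the Pico and Digilent drivers for the real signatures. The base class supplies the common bridge commands over the transport; the derived class adds the instrument-specific parts.)
    #include <cstdint>
    #include <string>
    #include <vector>

    class SCPITransport;            // provided by scopehal
    class RemoteBridgeOscilloscope  // provided by scopehal; stubbed here so the sketch stands alone
    {
    public:
        virtual ~RemoteBridgeOscilloscope() = default;
    };

    // Hypothetical ThunderScope driver skeleton
    class ThunderScopeOscilloscope : public RemoteBridgeOscilloscope
    {
    public:
        explicit ThunderScopeOscilloscope(SCPITransport* transport)
            : m_transport(transport)
        {
            // Query the bridge for channel count / capabilities and create channel objects here
        }

        // Instrument-specific capabilities the base class can't know about
        std::vector<uint64_t> GetSampleRatesNonInterleaved()
        {
            return { 1000000000ULL };   // placeholder: advertise 1 Gsps
        }

        // Pull the next acquisition's waveform data from the bridge socket
        bool AcquireData()
        {
            // read headers + sample data from m_transport, build waveform objects
            return true;
        }

    protected:
        SCPITransport* m_transport;
    };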
<d1b2>
<johnsel> eventually we'll reuse TS.Net with the new firmware
<d1b2>
<johnsel> but I'd like to get a full chain operational if I can with a minimal driver
<d1b2>
<azonenberg> wrt vulkan packages, we should mention in docs that the vulkan package on older ubuntu is a no go
<d1b2>
<johnsel> but pico or digilent then, thanks
<d1b2>
<azonenberg> but yes there were some breaking header changes some time ago
<d1b2>
<azonenberg> we call for specific SDK versions in the docs because we've tested them and know they work
<d1b2>
<johnsel> yeah.. bit of an annoyance
<d1b2>
<johnsel> yes but we still depend on the libvulkan-dev of the OS repo
<d1b2>
<azonenberg> do we?
<d1b2>
<johnsel> yep
<d1b2>
<azonenberg> my understanding was when you installed the SDK from upstream .deb packages it completely replaced the distro package
<d1b2>
<azonenberg> that's how my dev machine is set up
<d1b2>
<azonenberg> then it just pulls in the ICD for your GPU driver
<d1b2>
<johnsel> I followed the docs/CI
<d1b2>
<azonenberg> ah its possible if you install from source it doesnt do that
<d1b2>
<azonenberg> I'm using the lunarg apt repo
<d1b2>
<johnsel> it requires you to install the libvulkan-dev using apt
<d1b2>
<johnsel> yeah that may be the better way forward
<d1b2>
<johnsel> I also don't like that I can't copy paste the CI commands because we use a template variable for the version number
<d1b2>
<johnsel> but that is an aside
<azonenberg>
tnt: i suspected it was loop termination in the shader but i didn't think we used floats in the loop for exactly that reason??
<azonenberg>
that's the whole reason we have int64's for coordinates everywhere
<azonenberg>
to avoid roundoff
<azonenberg>
and we even have our own glsl bignum library for platforms (like some intel) that need to use cascaded 32 bit operations to mimic int64 math
<d1b2>
<azonenberg> @johnsel in other news i'm drawing up notes for the dev call next week and have a list of bugs i'm tentatively designating release critical for v0.1
<d1b2>
<azonenberg> i want to get concrete plans for addressing each
<d1b2>
<johnsel> sounds good
<d1b2>
<azonenberg> as well as a blanket policy that even if not designated as such any segfault or hang is release critical
<d1b2>
<johnsel> yeah I run into those every once in a while, I'll start writing them up when I do
<d1b2>
<azonenberg> yeah please do. stability, docs, and packaging are my three big pillars for v0.1
<d1b2>
<azonenberg> idk if you've seen but i've written a lot of content on the general UI in the last few days
<d1b2>
<johnsel> in the docs?
<d1b2>
<azonenberg> yeah
<d1b2>
<johnsel> I did not, let me take a look
<d1b2>
<azonenberg> i dont know if i've updated the doc submodule in the top level repo recently enough
<d1b2>
<azonenberg> e.g. the "dialogs" section is all new, the "filter graph editor" section was completely rewritten
<d1b2>
<azonenberg> a bunch of stale screenshots got replaced
<d1b2>
<azonenberg> we're up to 272 pages now in the pdf, although iirc 100+ of those are filter docs that are just a heading with no content, so there's still a lot to do on that front
<d1b2>
<azonenberg> this is latest as of an hour ago
<d1b2>
<johnsel> I'm going to try to make some "tutorial" videos on ngscopeclient as well
<d1b2>
<azonenberg> yeah that would be good. i did a quick introduction in my BERT video but nothing specifically meant as an intro for new users
<d1b2>
<azonenberg> i was planning to but unsurprisingly too busy
<d1b2>
<azonenberg> how goes the CI? are you pretty close wrt orchestration stuff?
<d1b2>
<johnsel> I still need to put everything nicely in a repo and likely rewrite some stuff to put secrets where they belong instead of hardcoding them
<d1b2>
<johnsel> and then add the windows cloud-config/terraform script
<d1b2>
<johnsel> I think it's maybe another 4-6 hours work, just have to find time for it
<d1b2>
<azonenberg> awesome. if you can have that done by the dev meeting monday that would be great so we can talk about next steps. but dont lose any sleep over it
<d1b2>
<azonenberg> even if just linux is running that would be a huge milestone
<d1b2>
<johnsel> I'll see if I can get that done
<d1b2>
<azonenberg> Awesome. also would be interested in hearing any performance data from you
<d1b2>
<azonenberg> (wrt building with more than 4 threads)
<d1b2>
<johnsel> on CI?
<d1b2>
<azonenberg> yeah
<d1b2>
<azonenberg> the github azure runners are on like 2 vCPUs with 8GB ram so we capped the build there
<d1b2>
<johnsel> yesterday it crashed when I ran with -j
<d1b2>
<johnsel> but the same happens locally
<d1b2>
<azonenberg> LOL
<d1b2>
<azonenberg> even i can't run -j on my machine with 192GB RAM
<d1b2>
<johnsel> not sure why, I had a weird linking bug as well
<d1b2>
<azonenberg> i usually do -j32
<d1b2>
<johnsel> Oh hmm, I have 80GB and 12 threads
<d1b2>
<johnsel> I would have expected that to be fine
<d1b2>
<johnsel> but we're not surprised about that then
<d1b2>
<azonenberg> yeah scopehal and scopeprotocols have a huge number of complex source files with few dependencies among them
<d1b2>
<azonenberg> it'll launch like 200+ jobs if you let it
<d1b2>
<azonenberg> each one needing a gig or two of ram
<d1b2>
<azonenberg> i dont remember the exact numbers but it's a lot. you will get cpu bound long before then
<d1b2>
<azonenberg> -j32 works fine on my machine with dual socket * 8c/16t and pretty much saturates it
<d1b2>
<johnsel> I'm not on Windows right now but I'll push a -j limited config next then
<d1b2>
<azonenberg> yeah try like -j8 or -j16 and see what peak ram usage in the vm is
<d1b2>
<johnsel> I think I have 64GB and 32 threads set now for the CI
<d1b2>
<azonenberg> that is likely to be our limiting factor. -j32 might be ok but work your way up
<d1b2>
<johnsel> yes it was definitely RAM limited
<d1b2>
<azonenberg> If we think we have CPU to spare and could speed the build with more RAM, let me know
<d1b2>
<azonenberg> the vm server has 256GB right now but iirc only half the dimm slots are stuffed
<d1b2>
<azonenberg> i will gladly throw more ram at the problem if we have data showing that will improve it
<d1b2>
<johnsel> well, since you're so enthusiastic about it and I've been staring at SCPI too long anyway, let me see now
<d1b2>
<azonenberg> see if -j32 is any faster. linking is still going to be a bottleneck, i suspect there are things we can do to optimize that as well
<d1b2>
<johnsel> already pushed it
<d1b2>
<azonenberg> great 😄
<d1b2>
<johnsel> hmm looks like the Windows template is broken somehow
<d1b2>
<johnsel> looks like you need to upgrade your internet buddy
<d1b2>
<johnsel> we're bandwidth limited now
<azonenberg>
lol
<azonenberg>
i want to :p
<azonenberg>
nobody will sell me a symmetric fiber pipe
<d1b2>
<johnsel> aww we should be getting that next year if I'm not mistaken
<azonenberg>
well ok let me rephrase
<azonenberg>
the local public utility district will
<d1b2>
<johnsel> though cable works fine for me
<azonenberg>
they own the cable plant and will hook me up, i'd have to contract with an ISP separately for transit service to the internet
<azonenberg>
but the PUD would first need me to pony up somewhere around 50000 USD for the... 2km ish? of fiber they'd have to trench to me from the closest splice point
<d1b2>
<johnsel> yeah that's always the problem. somebody here decided we all need ftth so it's going to be rolled out
<d1b2>
<azonenberg> (down from twice that a few years ago, they've expanded the plant... now it runs past the end of my street)
<d1b2>
<azonenberg> but the wrong end :p
<d1b2>
<johnsel> oof, if I'm not mistaken we have the fiber already running down the street
<d1b2>
<johnsel> just needs splicing to customer handover points in the houses
<d1b2>
<azonenberg> yeah i would literally have to get them to dig up / bore under every front yard from the main drag downtown to my house
<d1b2>
<johnsel> yeah that's precisely what happened here, months of noise and heavy machinery
<d1b2>
<johnsel> but doing it for everyone is cheaper than for 1 person
<d1b2>
<johnsel> 50k is insane
<d1b2>
<johnsel> we'll have to look into that broken template at some point
<d1b2>
<johnsel> I haven't decided yet whether I'll install msys2 etc. manually for the time being. As much as a fully scripted infra is desirable, we're throwing it out next year anyway
<d1b2>
<azonenberg> yeah i'd say preinstall it, make it work, and bake it into the template
<d1b2>
<azonenberg> it'll speed turnaround times
<d1b2>
<johnsel> can you take a look at whether there are permissions / self-service group assignments for the Win11 template? Strangely enough I don't see it in the list of templates, only when creating a new VM (but it produces a VM that won't start)
<azonenberg>
Stand by...
<azonenberg>
CI_Windows11?
<d1b2>
<johnsel> yes
<azonenberg>
yes you have perms on it
<d1b2>
<johnsel> weird
<azonenberg>
But right now scopehal-ci-set is using 91.89 GB of RAM
<azonenberg>
and the self service pool is capped at 96
<azonenberg>
so you may be hitting the ram limit
<azonenberg>
try shutting down another instance and see if thats the problem?
<d1b2>
<johnsel> I tested this but let me double check
<d1b2>
<johnsel> my dashboard says max 128
<azonenberg>
I do have another 75GB of RAM i'm not using so i can assign a bit more to the CI set
<azonenberg>
we're using 181 of 256 right now
<d1b2>
<johnsel> I think it would be useful to have 2x 64GB+ for the main runners, plus some for the orchestrator and testing
<d1b2>
<azonenberg> done and added to your self service list
<d1b2>
<azonenberg> ok anyway, so if i wanted to bump up to 512GB of RAM i'd need 8x 32GB ddr4 2667 lrdimm
<d1b2>
<david.rysk> rdimm should work too I think? what CPU do you have in there?
<d1b2>
<azonenberg> looks like around $135 per stick is the going rate these days on newegg for sk hynix dimms
<d1b2>
<david.rysk> (RAM is more tied to CPU nowadays than motherboard)
<d1b2>
<azonenberg> I don't think you can mix LR and R since there's an extra cycle of latency?
<d1b2>
<johnsel> that produced a working template
<d1b2>
<azonenberg> Woot
<d1b2>
<david.rysk> oh you already have LRDIMMs in there?
<d1b2>
<azonenberg> anyway, so about $1080 to fill it up the rest of the way
<d1b2>
<david.rysk> but yeah what CPU?
<d1b2>
<azonenberg> @david.rysk yeah we're talking about potentially bumping my existing 256GB up to 32
<d1b2>
<azonenberg> xeon scalable gold 5320 iirc
<d1b2>
<johnsel> I'll have to tinker a bit more to reproduce it with a terraform script
<d1b2>
<johnsel> and set up msys2 and any system dependencies we take for granted on github runners
<d1b2>
<azonenberg> up to 512*
<d1b2>
<azonenberg> 256 is looking like it might be a little light for all of my normal virtualization workloads plus all the CI runners
<d1b2>
<azonenberg> I could bump up a bit to 384 or something but the nice round option would be to go from 1 to 2 dimms per channel on all 8 memory channels
<d1b2>
<azonenberg> the cpu is capable of running ram up to 2933 but the existing 256GB is 2667
<d1b2>
<azonenberg> so i dont see a point in paying for faster dimms unless i were to replace all of them
<d1b2>
<azonenberg> which i don't think makes sense unless i am also moving to a new cpu with ddr5 and this one is only a year old
<d1b2>
<david.rysk> yeah you'd be looking at 8x ddr4 lrdimm indeed. Faster might not be that much more expensive though
<d1b2>
<azonenberg> well no i'd need 16x lrdimm is my point
<d1b2>
<azonenberg> if i wanted to upgrade
<d1b2>
<david.rysk> can't you mix speeds?
<d1b2>
<azonenberg> since i'd have to replace all of the existing ram
<d1b2>
<azonenberg> you can but it runs at the minimum
<d1b2>
<david.rysk> and it will run at the slower speed
<d1b2>
<johnsel> or overclock
<d1b2>
<azonenberg> so i dont see the point as i dont intend to ever retire the existing dimms
<d1b2>
<johnsel> but if that's a good idea...
<d1b2>
<azonenberg> I'll be keeping them until i upgrade the whole mobo etc and move to a ddr5/6 platform in the indefinite future
<d1b2>
<david.rysk> heh, I overclocked the ECC RAM in the PC I just built by finding the gamer-memory DIMMs the vendor sells with the same ICs and matching their timings
<d1b2>
<david.rysk> cost and availability, sometimes slower speeds become more expensive or unavailable
<d1b2>
<azonenberg> as of right now ddr4 2667 lrdimms are readily available on newegg and i dont actually see 2933 in lrdimm
<d1b2>
<david.rysk> you're more likely to see 3200
<d1b2>
<johnsel> you can get lucky and have that work, indeed, though it's always a gamble wrt system stability. I also run a set of 2666s at 3200 but every once in a while I doubt the stability
<d1b2>
<david.rysk> with ECC you can look for reported errors
<d1b2>
<johnsel> oh 3600 actually
<d1b2>
<david.rysk> and then adjust based on that
<d1b2>
<azonenberg> yeah and this is a prod server that i use for lots of things i actually care about
<d1b2>
<azonenberg> so OC is not happening :p
<d1b2>
<johnsel> sure but who has the time for that with a production server
<d1b2>
<david.rysk> LOL yeah
<d1b2>
<david.rysk> still at this point I'm not willing to build even a gaming PC with non-ECC
<d1b2>
<azonenberg> yeah same here
<d1b2>
<david.rysk> one near data loss incident that took a month to fully recover from 🙃
<d1b2>
<azonenberg> anyway its not a huge deal for me to throw a line item in my q1 2024 budget for another 256 gigs of ram if you think we will need it
<d1b2>
<johnsel> eh my previous build had a flaky CPU, it would crash so often I now have everything in the cloud
<d1b2>
<johnsel> I spent so much time chasing that bug but eventually gave up and replaced everything
<d1b2>
<johnsel> I'm sure @azonenberg remembers my complaining lol
<d1b2>
<david.rysk> ouch
<d1b2>
<johnsel> anyway I'm not saying no to some extra RAM
<d1b2>
<david.rysk> the CPU was defective?
<d1b2>
<azonenberg> I have everything on the cloud too 😛 My cloud
<d1b2>
<david.rysk> yeah the next step up from 2667 would be 3200
<d1b2>
<johnsel> I mean I can't prove it but based on replacing everything it's either the motherboard or the CPU
<d1b2>
<johnsel> I am using the same RAM sticks now after a while
<d1b2>
<johnsel> I don't think it's the motherboard though. I think I pushed the CPU too hard while overclocking and/or had a flaky one from the start
<d1b2>
<david.rysk> I think 3200 is the highest JEDEC speed
<d1b2>
<azonenberg> top to bottom: xen box, router/firewall (old and due for replacement in Q1 2024 as well), storage cluster
<d1b2>
<johnsel> JEDEC afaik is all 2133
<d1b2>
<david.rysk> no, JEDEC added speeds up through 3200
<d1b2>
<johnsel> oh, do you have a link somewhere?
<d1b2>
<azonenberg> @johnsel also speaking of storage
<d1b2>
<azonenberg> let me know if you run low on disk space
<d1b2>
<azonenberg> as of right now i have about 3T free on the cluster and in a few weeks i will be retiring some older drives and should have closer to 5
<d1b2>
<johnsel> I will, but I don't think I will
<d1b2>
<david.rysk> you'll have to find one of the newer revisions of JESD79-4D
<d1b2>
<azonenberg> (swapping some 1.92T drives out for 3.84)
<d1b2>
<david.rysk> but e.g. newer AMD chips like the 5000 series run at 3200 stock
<d1b2>
<azonenberg> i can also look at potentially providing additional LAN bandwidth, the vm server currently has a single 10G pipe to the core switch that's shared by VM NICs as well as access to the storage cluster
<d1b2>
<azonenberg> at some point i'm planning on splitting that so i have one for storage and one for vNICs
<d1b2>
<johnsel> I don't see it
<d1b2>
<azonenberg> (the physical nic is dual port)
<d1b2>
<azonenberg> but only one is lit up right now
<d1b2>
<johnsel> Yes that's true, in the end the chipset decides what's acceptable or not
<d1b2>
<azonenberg> i will also be doing some tuning on the storage cluster soonish that should improve write performance
<d1b2>
<david.rysk> it's paywalled
<d1b2>
<azonenberg> unfortunately upload BW to the internet is one thing i can't easily do much about 😦
<d1b2>
<johnsel> yeah improved storage speed would be great
<d1b2>
<david.rysk> yeah, but JEDEC speeds "just work", while overclocking involves XMP profiles or manual tuning on top of that