<ccox>
set CI benchmarking as a long-term goal. For now, just get something reliable when run on a single machine by a developer. Then we can compare results across several developers to make sure there are no hidden bogosities.
<ccox>
Once we have something we can trust locally, then we can think about how it should be tweaked for running in an uncertain cloud.
<InPhase>
ccox: I understand you had some compilation issues starting out, but honestly that threshold was hit a pretty long time ago.
<ccox>
well, I still haven't found documentation for benchmarks (or any smoke tests beyond the CI)
<InPhase>
Every system configuration will be a little bit different. You might have been the first person to ever attempt an M1 compilation for example, a platform notorious for lots of standard things not working on it. But a lot of developers come through and get compilation going pretty quickly.
<ccox>
What brought up compilation while we're talking about benchmarking?
<InPhase>
I thought you meant set benchmarking as a long term goal.
<InPhase>
If you meant just CI benchmarking vs local benchmarking, then ignore everything I replied. :)
<ccox>
Ok, safely ignored.
<InPhase>
As for local benchmarking, the critical thing isn't cross-developer comparisons, but cross-version comparisons to check for increases and decreases.
<ccox>
yes, but we need some cross-developer checking to make sure the benchmarks are useful and reliable.
<ccox>
I could regale you with tales of broken stopwatches and systems with bad clocks - but you should never trust benchmarks run on a single system and OS.
<ccox>
To get useful benchmarks (and to know where the edge cases lie), you have to compare across systems. Then you can trust the results (well, except for those darn edge cases).
<InPhase>
Well, in general you can't compare across systems unless there are extreme deviations, as a lot of things differ between systems.
<InPhase>
So you can pick up if one platform runs everything 5 times slower than reasonable while otherwise being fast. But nuanced things aren't trivial to pick up with any confidence.
<InPhase>
However, you can separately monitor for regressions, which in some rare cases might be platform specific.
<ccox>
That's why you don't want too much nuance in a general benchmark, and keep the edge cases separate.
<InPhase>
In this case though, what we want is a collection of targeted benchmarks. Like one that does a lot of recursive functions with embedded vectors, one that does TCO, a heavy list comprehension one, some high-intensity fractal geometry unioning...
<InPhase>
That way when there's a performance regression, the test result tells us right away where the problem is.
<InPhase>
Or when peepsalot goes to swap out the malloc, we can know exactly which part of the code it's speeding up, and check to see if there's a part negatively impacted.
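A minimal sketch of what one of those targeted benchmarks could look like, assuming a hypothetical standalone .scad file for the recursive-functions-with-embedded-vectors case (the function name and recursion depth are illustrative, not an existing test):

// Hypothetical targeted benchmark: recursive functions building nested vectors.
// A regression in recursion or vector handling shows up here in isolation,
// instead of being buried inside one monolithic benchmark.
function nest(depth) =
    depth <= 0 ? [0]
               : [depth, nest(depth - 1), nest(depth - 1)];

// echo() forces evaluation so the work cannot be skipped.
echo("top-level length", len(nest(16)));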
<dalias>
how does rational arithmetic even help? most stuff would need arbitrary algebraic field extensions to be exact
<dalias>
double would probably make it like 10-30x as fast
<peepsalot>
idk, i'm not a math professor
<dalias>
well it should be clear that most points don't have rational coordinates
<dalias>
even simple shapes start to bring in cos(30°)
<dalias>
and of course sqrt(2)
<peepsalot>
i know, radicals and transcendentals and all that
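As a concrete illustration of the point above, a minimal OpenSCAD sketch (illustrative values only) showing how a single simple rotation already produces coordinates that no rational can represent exactly:

// A unit square rotated by 45 degrees: the corner at (1, 0) lands at
// (sqrt(2)/2, sqrt(2)/2), which has no exact rational representation,
// so a rational kernel ends up storing a rounded value anyway.
rotate(45) square(1);

echo(cos(30));   // sqrt(3)/2 ~= 0.866..., irrational
echo(sqrt(2));   // ~= 1.41421..., irrational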
<InPhase>
"<peepsalot> and what would you use instead?" python3 -c 'num = ....'
<dalias>
it would be awesome if we could get a 30x speedup just by not doing something that doesn't make sense to begin with
<InPhase>
dalias: I also really do not understand the rationals thing.
<peepsalot>
i'm guessing it mainly allows for affine transformations to at least be applied consistently (although the calculation of the matrices won't be exact in the first place, especially for rotations)
<InPhase>
No rational survives battle with the first transcendental it encounters.
<peepsalot>
yeah
<InPhase>
(Except for the handful of special cases where it does, but that doesn't sound as epic.)
<peepsalot>
LEDA implements some form of that, and is supported by CGAL, but it's not open source
<InPhase>
peepsalot: Which makes me curious on our benchmarking topic... What happens if you take one of those fractals with cubes, and rotate everything by 17 degrees before the union? Does this obliterate CGAL performance?
<peepsalot>
InPhase: well the example024 is already rotated to orient it as a diagonal through the cube
<InPhase>
But by 45.
<InPhase>
Oh, well I guess that's still irrational.
<peepsalot>
not really
<peepsalot>
rotate([45, atan(1/sqrt(2)), 0])
<InPhase>
Ok.
<InPhase>
Although that's after most of the unioning.
<peepsalot>
i did also try a quick replacement of CGAL::Gmpq with double the other day, but couldn't get it to compile. something was failing in cgalutils, still needs a bit more work than just that
<InPhase>
28.9s to 26.4s by changing only rotate([45, atan(1/sqrt(2)), 0]) to rotate([0, 0, 0]) in example024.scad.
<InPhase>
So about 10% faster just by getting rid of one irrational on the rotation.
<InPhase>
I did not until today expect that would cause such a performance difference. But, now I know.
<InPhase>
Oh, well to be fair the output is 20% smaller without that rotation. Maybe this is not a perfectly fair test.
<InPhase>
I can fudge the translation to try to get those equal.
<InPhase>
Yeah, matching the output sizes by sliding the intersecting cube down leaves the unrotated one still about 10% faster.
<InPhase>
I bet it would be even slower if the rotation was before the menger union.
<peepsalot>
yeah seems quite likely
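A rough sketch of the kind of A/B comparison described above, using an illustrative pile of cubes rather than the actual example024.scad (each variant would be rendered and timed separately):

// Variant A: axis-aligned cubes; all vertices stay rational-friendly.
module pile() {
    for (i = [0:10], j = [0:10])
        translate([i, j, 0]) cube(1.5);
}
pile();

// Variant B: the same pile with the irrational rotation quoted above applied,
// so the union has to work with irrational coordinates at every vertex.
translate([0, 20, 0])
    rotate([45, atan(1/sqrt(2)), 0]) pile();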
<InPhase>
peepsalot: FYI, your tricubic interpolation blob implementation just today inspired a research method for looking at high frequency ripples in brain data. A postdoc is working on making that happen right now. :)
<peepsalot>
wow, ok :)
<InPhase>
No guarantees it works out, as about 50% of such approaches fail. But, an attempt is being made. :)
<peepsalot>
brain data like waves or like 3D scans?
<InPhase>
Oscillatory signals from depth electrodes mapped into 3D.
<InPhase>
As in, overlaid on a rendered brain.
<peepsalot>
so no measuring how smooth someone's brain is then? :-P
<InPhase>
Smooth is generally not good for brains. :) It's a thing called sharp-wave ripples which are found to correspond to memory consolidation, but there are outstanding questions about where exactly these are active when memory is being consolidated. And so it would be helpful to be able to see these regions in a visual form based on the values at the points we can measure.
<InPhase>
And someone asked me for an algorithm for this today, and I went, "Wait..." and then went to dig up my #openscad logs.
<InPhase>
Then when I saw your code again and remembered how tricky it was to implement, I tracked down one of the Python libraries for it, for the postdoc. The hard part on these sorts of things is figuring out the right algorithm to use, so random bits of information like that can be quite useful. :)
<peepsalot>
ah, cool. yeah i know smooth is not good for brains, smoothbrain is an insult i've seen people use, mostly in reference to our former president
<InPhase>
He'd probably take that as a compliment.
<peepsalot>
i've got a project with openscad linking to mimalloc. it's actually very convenient for the benchmarking discussion
<peepsalot>
if you set env MIMALLOC_VERBOSE=1, you get all kinds of useful stats, e.g. the last line: process: user: 2.203 s, system: 0.283 s, faults: 4134, rss: 262.5 mb, commit: 165.4 mb
<peepsalot>
there's even more breakdown of peak and total reserved and freed, etc., and if a debug build of mimalloc is used, you get the breakdown of the size buckets