jaeger changed the topic of #crux to: CRUX 3.6 | Homepage: https://crux.nu/ | Ports: https://crux.nu/portdb/ https://crux.ninja/portdb/ | Logs: https://libera.irclog.whitequark.org/crux/
_moth_1 has joined #crux
_moth_ has quit [Ping timeout: 246 seconds]
_moth_1 is now known as _moth_
SiFuh_ has joined #crux
SiFuh has quit [Ping timeout: 240 seconds]
<SiFuh_> groovy2shoes: Yeah, it is a bit annoying; that's the reason for the addition of "${D}", so that it still opens the Download folder. I will look into it a bit later
<SiFuh_> groovy2shoes: It was using %20 for spaces https://dpaste.com/CBLH5DN3T
<SiFuh_> I have left the entries that write to log so I can see what is happening as it goes
<SiFuh_> farkuhar: At the bottom of this perl script https://dpaste.com/6YY5YGTTW I have a for loop. Each time I run the program it seems to just randomly spit out the results rather than showing them in a static order.
ppetrov^ has joined #crux
<farkuhar> SiFuh: if you want a definite order each time, try enclosing the array inside sort(). In your script the loop might begin instead: for(sort(keys %langs)) {print(
<SiFuh_> farkuhar: I am aware of that, I am just curious why the results appear to be random
<farkuhar> It's the natural side-effect of Perl storing its hashes in *Random* Access Memory
<SiFuh_> Ahh I see thanks!
<SiFuh_> I have a friend who hates talking politics but he likes talking code. So I write the politics in the code :-P
<farkuhar> that whitespace issue with qBitTorrent and mc ... does it actually work now that you filter the urls through s/%20/\ /g?
<SiFuh_> Yes
<farkuhar> clever detective work there.
<SiFuh_> And I didn't need to do anything special for brackets ( ). I haven't tested " ' or :
<SiFuh_> farkuhar: No i just echoed the output to log files so I can see what is going on behind the scenes
<SiFuh_> It is originally how I knew it was sending file://
<SiFuh_> s/%20/\\\\ /g
<SiFuh_> Also it appears I didn't need to use '\ '. It looks like it accepts the whitespace only.
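The filter under discussion can be tried on an invented sample URL; sed swaps every %20 escape for a literal space, and as SiFuh_ found, no backslash escaping of the space is needed:

```shell
url='file:///home/user/Download/Some%20Torrent%20File.mkv'
# s/%20/ /g: replace each percent-encoded space with a real one
printf '%s\n' "$url" | sed 's/%20/ /g'
# -> file:///home/user/Download/Some Torrent File.mkv
```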
<farkuhar> interpreting saved logs (and print debugging more generally) is an underappreciated skill.
<SiFuh_> Thought that was a normal thing. When I wrote ckut.bash and tuxedo-keys, everything was printed through echo first or dumped into logs.
<SiFuh_> I'd even have echo 1, echo 2, echo 3 ... throughout the scripts so I can see where exactly it went wrong. Like mile markers
<farkuhar> it definitely was the norm, among coders of our age cohort. younger coders might learn the use of an interactive debugger early in their training, and never fall back on old-school methods.
<SiFuh_> Oh I see now. Yeah, I have friends who have programs for writing scripts and debugging. I have always used vi/vim but these days have been using emacs more often
<SiFuh_> I wrote this one entirely in VIM on FreeBSD
samsep10l has joined #crux
SiFuh_ has quit [Ping timeout: 258 seconds]
<ppetrov^> hmmm,... my ancient xfce has a problem with xorg-libx11 1.8.1 -- Xfce's settings no longer get applied (icons, theme, keyboard switch); so, i guess i just lock the lib to v1.8
SiFuh has joined #crux
<farkuhar> ppetrov^: I'm starting to appreciate why you bioinformatics people use Perl so much. With all the features built into its standard library, I was able to write a working clone of prt-get in about 850 lines: https://git.sdf.org/jmq/Documentation/src/branch/master/scripts/prt-auf
<ppetrov^> farkuhar, i am a "wannabe" bioinformatician (though I have several papers) and unfortunately do not know perl. I have used some programs that take advantage of libraries from the bioperl project though
<farkuhar> it was the lengthy directory listing in your crux-ports/p5 repository that suggested to me a strong affinity between bioinformaticians and Perl.
<ppetrov^> these are all needed for bioperl, and this was the reason I switched to CRUX; I could not create that many SlackBuilds, while CRUX has a tool cpan2crux that makes Pkgfiles automatically
_moth_ has quit [Remote host closed the connection]
_moth_ has joined #crux
<ppetrov^> there's cpan2tgz for Slackware but it did not resolve deps when I tried it for bioperl
<ppetrov^> btw, isn't pkg-get written in perl also?
<ppetrov^> as for bioinformaticians, they use mainly R and python nowadays... at least these are required for RNASeq analyses, which is a hot topic
<braewoods> Perl has declined in use. I mainly still see it used in legacy software that can't afford to switch to something else.
<farkuhar> tbh, I've never installed pkg-get myself; most of my hardware is sufficient for building ports directly on the machine that will use them. If I had an underpowered device lying around, I might try setting up pkg-get.
<ppetrov^> farkuhar, for me the pkg-get use is really convenient, installing stuff on my x230 thinkpad is a breeze
<ppetrov^> braewoods, that's what i have heard also
<groovy2shoes> almost everywhere i've worked has still been writing internal tools in Perl
<groovy2shoes> if they were running *nix in any capacity, they were still using Perl
<groovy2shoes> maybe not as extensively as they used to. there was always some interest in Python and the occasional Tcl script lying around.
<braewoods> Python. Ick. lol
<farkuhar> i recall that jaeger was toying around with the idea of rewriting pkgutils (or was it prt-get?) in python first, where iterating towards a polished product would be faster than a C rewrite.
<groovy2shoes> a year ago i'd've been on board with that idea. i've gotten very annoyed with python lately ¬_¬
<farkuhar> +1 on the "Ick", braewoods. I couldn't imagine the inconvenience of having to be so careful with whitespace, if I had tried my rewrite in python rather than perl.
<braewoods> python is very slow in pure python code. no idea why but that's the general rule i've been seeing.
<groovy2shoes> and it's a big mess, too
<braewoods> about the only positive i see is the popularity
<braewoods> that said i wonder how hard it is to write Perl C modules
<braewoods> people in #perl previously told me it was not recommended
<groovy2shoes> really easy with Lua, Tcl, or Chibi Scheme. PITA with Perl, Python, or Ruby. that's been my experience.
<farkuhar> in my perl rewrite I made the assumption that pkgadd and pkgrm would continue to have the same userland interface as they do now. There's no attempt to connect with them through a C api, only through the shell or Unix execve.
<braewoods> perl rewrite of what?
<farkuhar> prt-get
<groovy2shoes> prt-get
<braewoods> hm.
<braewoods> i should get around to finishing that C library port of pkgutils
<braewoods> then you could eliminate the overhead of using pkgadd / pkgrm though probably doesn't make much difference for prt-get most of the time
<braewoods> due to build time
<braewoods> i do know one trick that could improve the package processing time for pkgadd but it would require changing the way we produce binary packages
<braewoods> if you make the first file of the tar archive into a file listing, then you could avoid an extra pass through the tar archive which can be expensive
<braewoods> you need the file listing to do some early validation tests
<braewoods> or could just use an archive format that supports a central directory listing like ZIP does :P
<farkuhar> you mean like explicitly listing a footprint file on the line where bsdtar creates the pkg.tar.gz?
<braewoods> farkuhar: well just a path file listing would be helpful. i believe ARCH's pacman stores it as a plain text file just containing all the full file paths of the tarball
<braewoods> it helps avoid an expensive pass to collect metadata for validation tests that the operation can proceed as planned
<braewoods> .FILELIST or something i believe
<farkuhar> shouldn't be too hard to implement. We have a footprint file already, just send it to cut -f 3.
<groovy2shoes> some mechanism for metadata in packages would be nice
<braewoods> it's more of a cache than anything due to tar not being seekable
<braewoods> the metadata files are stored as the first files tarred up by their makepkg
<braewoods> they don't actually get installed, they're just used by pacman to store quick data or other stuff
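A minimal sketch of the listing-first idea (package name, paths, and the .FILELIST name are invented; pkgmk does not do this today): generate the path list from the real package contents and name it first on the tar command line, so it becomes the first archive entry and can be read without walking the whole archive.

```shell
pkgdir=$(mktemp -d)
pkgfile=$pkgdir.pkg.tar.gz           # invented package filename
mkdir -p "$pkgdir/usr/bin"
printf '#!/bin/sh\necho hi\n' > "$pkgdir/usr/bin/demo"

(
  cd "$pkgdir"
  # Build the listing from the actual contents, not the footprint:
  find . -mindepth 1 ! -name .FILELIST | sed 's;^\./;;' | sort > .FILELIST
  # Naming .FILELIST first makes it the first archive entry:
  tar -czf "$pkgfile" .FILELIST usr
)

tar -tzf "$pkgfile" | head -n 1      # -> .FILELIST
tar -xzOf "$pkgfile" .FILELIST       # the path list, without a full pass
```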
<cruxbot> [opt.git/3.6]: clang: 14.0.4 -> 14.0.5
<groovy2shoes> do all of the various versions of tar have that limitation? i know GNU tar and BSD tar store filenames differently, and pax probably does, too, but i don't know the details
<cruxbot> [opt.git/3.6]: compiler-rt: 14.0.4 -> 14.0.5
<cruxbot> [opt.git/3.6]: lld: 14.0.4 -> 14.0.5
<cruxbot> [opt.git/3.6]: lldb: 14.0.4 -> 14.0.5
<cruxbot> [opt.git/3.6]: llvm: 14.0.4 -> 14.0.5
<cruxbot> [opt.git/3.6]: polly: 14.0.4 -> 14.0.5
<braewoods> groovy2shoes: the seek limitation?
<groovy2shoes> yeah
<braewoods> why wouldn't they? tar is naturally not seekable at the binary format level.
<braewoods> it's also commonly compressed as a giant stream
<groovy2shoes> because there's more than one tar format
<braewoods> these are naturally non-seekable by design
<braewoods> i can't say i've known any tar format to be seekable. even if it was, the compression layer makes it unseekable.
<groovy2shoes> ah
<braewoods> you have to decompress until the part of the archive you want to use shows up
<braewoods> the compression layer is the main reason this ends up being expensive
<braewoods> otherwise you could probably skip through pretty fast
<braewoods> another option is to switch the archive format in use
<braewoods> some are competitive with TAR and actually seekable
<braewoods> https://en.wikipedia.org/wiki/Lzip is one i've seen suggested before
<braewoods> hm
<braewoods> oh wait this is just compression
<groovy2shoes> yeah, but good compression. i use lzip a lot.
<groovy2shoes> looks like none of the usual unix archives are seekable. cpio and ar seem to both have the same problem. hm.
<braewoods> zip definitely is but support for non-DEFLATE algorithms is pretty hit or miss.
<groovy2shoes> yeah, but zip is also... not very unixy ;)
<braewoods> infozip can store unix filesystem stuff but i wouldn't count on it
<braewoods> i suspect the only reasonable option would probably be 7-zip
<braewoods> most archive + compression options are windows only or not very unix friendly
<groovy2shoes> yeah
<braewoods> oh, well this shuts that idea down
<groovy2shoes> or they use kinda awkward tools. i use zpaq a lot for backups, and it's great, but it makes no effort at all to function like a normal unix program.
<braewoods> i think the current situation is the downside of a basic unix philosophy of every tool doing a single job well
<braewoods> it basically destroys any advantages you can get from tight integration
<braewoods> such as archive seeking
<groovy2shoes> i'd done 7z before, with the addition of an mtree listing to the archive. but then we're just back to adding a manifest to the tarball.
<farkuhar> braewoods, would this one-line change to pkgmk create tarballs with the first file you want? http://sprunge.us/UsXaeU
<groovy2shoes> not if there's a footprint mismatch
<braewoods> eh...
<braewoods> no
<braewoods> you should use find or similar to generate the file
<braewoods> let me see
<groovy2shoes> you'd probably want `cut -f3-`, in case the file name contains embedded delimiters
<braewoods> the idea is to generate it not from the footprint but from the actual package contents
<braewoods> the footprint won't be reliable enough
<groovy2shoes> but the bigger thing is it'll be missing files if you're ignoring new, and might be listing files that don't exist if you're ignoring missing
<braewoods> footprint wasn't intended for this now that i think about it
_moth_ has quit [Remote host closed the connection]
<braewoods> something like
_moth_ has joined #crux
<braewoods> find $PKGDIR | sed "s;^$PKGDIR/;;"
<braewoods> it may need more finetuning but that's the idea
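Trying braewoods' command against a throwaway tree shows the prefix stripping in action (paths invented). One of the finetuning points: find prints the top directory itself as its first line, and that line has no trailing slash, so the sed pattern leaves it untouched.

```shell
PKGDIR=$(mktemp -d)
mkdir -p "$PKGDIR/usr/lib/demo"
: > "$PKGDIR/usr/lib/demo/placeholder"

# List everything under the staging dir, then strip the staging prefix;
# the first line (the staging dir itself) slips through unmodified.
find "$PKGDIR" | sed "s;^$PKGDIR/;;"
```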
<farkuhar> ah, i see what you're getting at now.
<braewoods> yea, it needs to hold the path of every file in the package, regardless of the type
<braewoods> the main reason this is needed is to check for file conflicts
<braewoods> otherwise an expensive pass through the archive to collect such metadata is needed
<braewoods> if you can reduce to a single pass you can cut package processing time in half
<groovy2shoes> then a specially-crafted archive could trick pkgadd into unwittingly obliterating files. just leave some files out of the manifest.
<groovy2shoes> a C program using libarchive could make a single pass through the archive, but still check each file for conflict before writing it. but then if you have to bail halfway through, you'd wind up with a partially-extracted archive.
<groovy2shoes> hmm
SiFuh has quit [Remote host closed the connection]
<braewoods> groovy2shoes: easily solved by a feature i was planning to add
<braewoods> the ability to rollback the transaction due to an unexpected failure
<groovy2shoes> that would be a good feature
<braewoods> basically i don't make any permanent changes to the previous filesystem contents (that were being tracked by pkgutils)
<braewoods> this is mostly needed by pkgadd
<braewoods> pkgrm doesn't really have this issue
<groovy2shoes> yeah
SiFuh has joined #crux
<braewoods> it was intended to cover external interruptions but it also needs to cover things like disk space exhaustion
<groovy2shoes> yeah
<braewoods> i figured out a way to handle it in any case
<farkuhar> that solves the problem of having to bail out in the middle of a partially-extracted archive, but the other problem of a "specially-crafted archive" with an incomplete manifest can easily be guarded against. Just refuse to unpack any files that weren't found on the manifest during pre-validation.
<braewoods> or just fail the whole operation if the archive is found to be malformed during processing
<braewoods> a properly built package should never trigger this in any case
<farkuhar> if you have access to an OpenBSD system, read the manpage for pkg_add(1). The section called "Technical details" gives a complete overview of an elaborate unpacking process, which involves a staging area for temporary extraction before finally merging the changes onto the host filesystem.
<braewoods> i have a different idea in mind i think... the problem with an isolated staging area is the moving cost.
<groovy2shoes> not ideal for large packages, either
<braewoods> i could use a common subdirectory for the new files though to make it easier to find leftovers from a failed operation
<braewoods> the main thing i'd want to do is guarantee that rename() will work and that's only possible if i extract to the final directory of each file or a subdir of that directory
<braewoods> since that's the only way to avoid crossing filesystem boundaries
<braewoods> rename being the final pass i do once i've completed everything else in the operation
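The constraint braewoods describes, rename(2) being guaranteed to work only within one filesystem, can be sketched with mv (which uses rename when source and target share a filesystem). The paths and the staging-directory name here are invented:

```shell
destdir=$(mktemp -d)                 # stands in for e.g. /usr/bin
printf 'old\n' > "$destdir/tool"     # the file already on disk

# Stage the replacement inside a subdirectory of its final directory,
# so the last step can never cross a filesystem boundary:
mkdir -p "$destdir/.pkgadd-stage"
printf 'new\n' > "$destdir/.pkgadd-stage/tool"

# Final pass: one rename() per file, run only after everything else
# in the operation has already succeeded.
mv "$destdir/.pkgadd-stage/tool" "$destdir/tool"
cat "$destdir/tool"                  # -> new
```

Until the final mv, the old file is untouched, which is what makes rollback after an interruption straightforward.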
<ocb> heya cruxrs
<ocb> how are you? :)
<ppetrov^> there was an interesting discussion about package compression
<SiFuh> lz4
frinnst has quit [Remote host closed the connection]
Guest57 has joined #crux
Guest57 has quit [Client Quit]
<farkuhar> groovy2shoes: i just pushed a quick fix for pkgmeek. download a fresh copy if you haven't started your sysup testing yet.
<groovy2shoes> okay, thanks :)
<ppetrov^> what does pkgmeek do?
_moth_ has quit [Ping timeout: 255 seconds]
cybi has joined #crux
<groovy2shoes> ppetrov^, it's meant as a pkgmk replacement, with less code, and more straightforward.
<ppetrov^> thanks
<groovy2shoes> farkuhar, it looks like pkgmeek does a bit of extra work when used with `-do` (the work dir preparation and the up-to-date check in particular). i'm not actually sure how pkgmk does it, but regardless i'd expect it to only fetch the sources
<farkuhar> groovy2shoes, thanks for the feedback. The work dir preparation is meant to accommodate all sorts of configurations, either the sources downloaded right into the ports tree, or saved in a central location. The download step eventually moves the sources to where the user meant to put them, but first they might appear in the work directory.
<groovy2shoes> ah, okay
<farkuhar> it's not ideal for respecting filesystem boundaries, i agree, so maybe some rethinking of that step is warranted. i was building out from the base that Fun wrote four years ago, without trying to deviate too much from his straightforward approach.
<groovy2shoes> it's not a big deal, it was just a little surprising
cybi has quit [Read error: Connection reset by peer]
_moth_ has joined #crux
jue has quit [Quit: killed]
samsep10l has quit [Quit: leaving]
cybi has joined #crux
<cruxbot> [contrib.git/3.6]: docker-compose: updated to version 2.6.0
<cruxbot> [contrib.git/3.6]: runc: updated to version 1.1.3
<cruxbot> [contrib.git/3.6]: docker: updated to version 20.10.17
<cruxbot> [contrib.git/3.6]: containerd: updated to version 1.6.6
cybi has quit [Read error: Connection reset by peer]
cybi has joined #crux
<farkuhar> groovy2shoes: by design, pkgmeek doesn't have any modifiable subroutines. The handful of Pkgfiles that rely on being able to redefine unpack_source() will probably fail. The bash script itself hints at the recommended way to accomplish what the maintainer wanted, but an example is always more instructive: http://sprunge.us/Q2XChf
<cruxbot> [contrib.git/3.6]: open-vm-tools: updated to version 12.0.5-19716617
<braewoods> farkuhar: i can see a point to that but it's worth remembering that pkgfiles can execute code in the regular top level of the file since it's a shell script still. if that was intended for security it doesn't really help.
<ppetrov^> hey, graphene installs stuff in /usr/libexec; isn't CRUX supposed not to use this folder? I don't mind, just asking.
cybi has quit [Ping timeout: 248 seconds]
<braewoods> libexec is used by libraries for stuff that shouldn't be exposed under normal PATH
<ppetrov^> yes, however the handbook states: /usr/libexec/ is not used in CRUX, thus packages should never install anything there. Use /usr/lib/<prog>/ instead.
_moth_ has quit [Remote host closed the connection]
_moth_ has joined #crux
cybi has joined #crux
<farkuhar> braewoods: understood. the main security issue i was aiming to solve was to have two calls to check_signature(), one before the Pkgfile is sourced, and one before the sources are unpacked.
<farkuhar> but having all subroutines read-only also offers some measure of predictability, so the build process is not wildly different from one port to the next.
<braewoods> you always have the option of using special function names if you want to allow extension or something
<braewoods> most of the time people want to keep the original function behavior and just add to it instead of completely replacing it
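A small bash sketch of both ideas together: core subroutines locked down with readonly -f, plus an opt-in hook under a special name that a Pkgfile can define to add behaviour rather than replace it. The function names here are invented, not pkgmeek's actual API.

```shell
#!/bin/bash

unpack_source() { echo "unpacking sources"; }
readonly -f unpack_source            # a sourced Pkgfile can no longer redefine it

run_hook() {
    # Call the named function only if the Pkgfile chose to define it
    declare -F "$1" > /dev/null && "$1"
    return 0
}

# What a Pkgfile might provide, extending the default step:
post_unpack() { echo "applying local patches"; }

unpack_source
run_hook post_unpack
run_hook post_build                  # undefined hook: silently skipped
```

Running it prints "unpacking sources" then "applying local patches"; an attempt by a Pkgfile to redefine unpack_source after the readonly would fail with an error instead of silently changing the build.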
samsep10l has joined #crux
maledictium has quit [Remote host closed the connection]
ppetrov^ has quit [Quit: Leaving]
cybi has quit [Ping timeout: 248 seconds]
cybi has joined #crux
cybi has quit [Read error: Connection reset by peer]
cybi has joined #crux
cybi has quit [Ping timeout: 246 seconds]
cybi has joined #crux
tilman has quit [Ping timeout: 276 seconds]
tilman has joined #crux