jaeger changed the topic of #crux to: CRUX 3.6 | Homepage: https://crux.nu/ | Ports: https://crux.nu/portdb/ https://crux.ninja/portdb/ | Logs: https://libera.irclog.whitequark.org/crux/
_moth_1 has joined #crux
_moth_ has quit [Ping timeout: 246 seconds]
_moth_1 is now known as _moth_
SiFuh_ has joined #crux
SiFuh has quit [Ping timeout: 240 seconds]
<SiFuh_> groovy2shoes: Yeah, it is a bit annoying; that's the reason for the addition of "${D}", so that it still opens the Download folder. I will look into it a bit later
<SiFuh_> groovy2shoes: It was using %20 for spaces https://dpaste.com/CBLH5DN3T
<SiFuh_> I have left the entries that write to log so I can see what is happening as it goes
<SiFuh_> farkuhar: At the bottom of this perl script https://dpaste.com/6YY5YGTTW I have a for loop. Each time I run the program it seems to just randomly spit out the results rather than showing them in a static order.
ppetrov^ has joined #crux
<farkuhar> SiFuh: if you want a definite order each time, try enclosing the array inside sort(). In your script the loop might begin instead: for(sort(keys %langs)) {print(
<SiFuh_> farkuhar: I am aware of that, I am just curious why the results appear to be random
<farkuhar> It's the natural side-effect of Perl storing its hashes in *Random* Access Memory
<SiFuh_> Ahh I see thanks!
<SiFuh_> I have a friend who hates talking politics but he likes talking code. So I write the politics in the code :-P
<farkuhar> that whitespace issue with qBitTorrent and mc ... does it actually work now that you filter the urls through s/%20/\ /g?
<SiFuh_> Yes
<farkuhar> clever detective work there.
<SiFuh_> And I didn't need to do anything special for brackets ( ). I haven't tested " ' or :
<SiFuh_> farkuhar: No i just echoed the output to log files so I can see what is going on behind the scenes
<SiFuh_> It is originally how I knew it was sending file://
<SiFuh_> s/%20/\\\\ /g
<SiFuh_> Also it appears I didn't need to use '\ '. It looks like it accepts the whitespace only.
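The filter under discussion can be tried on an invented sample URL; sed swaps every %20 escape for a literal space, and as SiFuh_ found, no backslash escaping of the space is needed:

```shell
url='file:///home/user/Download/Some%20Torrent%20File.mkv'
# s/%20/ /g: replace each percent-encoded space with a real one
printf '%s\n' "$url" | sed 's/%20/ /g'
# -> file:///home/user/Download/Some Torrent File.mkv
```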
<farkuhar> interpreting saved logs (and print debugging more generally) is an underappreciated skill.
<SiFuh_> Thought that was a normal thing. When I wrote ckut.bash and tuxedo-keys, everything was printed through echo first or dumped into logs.
<SiFuh_> I'd even have echo 1, echo 2, echo 3 ... throughout the scripts so I can see where exactly it went wrong. Like mile markers
<farkuhar> it definitely was the norm, among coders of our age cohort. younger coders might learn the use of an interactive debugger early in their training, and never fall back on old-school methods.
<SiFuh_> Oh I see now. Yeah, I have friends who have programs for writing scripts and debugging. I have always used vi/vim but these days have been using emacs more often
<SiFuh_> I wrote this one entirely in VIM on FreeBSD
samsep10l has joined #crux
SiFuh_ has quit [Ping timeout: 258 seconds]
<ppetrov^> hmmm,... my ancient xfce has a problem with xorg-libx11 1.8.1 -- Xfce's settings no longer get applied (icons, theme, keyboard switch); so, i guess i just lock the lib to v1.8
SiFuh has joined #crux
<farkuhar> ppetrov^: I'm starting to appreciate why you bioinformatics people use Perl so much. With all the features built into its standard library, I was able to write a working clone of prt-get in about 850 lines: https://git.sdf.org/jmq/Documentation/src/branch/master/scripts/prt-auf
<ppetrov^> farkuhar, i am a "wannabe" bioinformatician (though I have several papers) and unfortunately do not know perl. I have used some programs that take advantage of libraries from the bioperl project though
<farkuhar> it was the lengthy directory listing in your crux-ports/p5 repository that suggested to me a strong affinity between bioinformaticians and Perl.
<ppetrov^> these are all needed for bioperl, and this was the reason I switched to CRUX; I could not create that many SlackBuilds, while CRUX has a tool cpan2crux that makes Pkgfiles automatically
_moth_ has quit [Remote host closed the connection]
_moth_ has joined #crux
<ppetrov^> there's cpan2tgz for Slackware but it did not resolve deps when I tried it for bioperl
<ppetrov^> btw, isn't pkg-get written in perl also?
<ppetrov^> as for bioinformaticians, they use mainly R and python nowadays... at least these are required for RNASeq analyses, which is a hot topic
<braewoods> Perl has declined in use. I mainly still see it used in legacy software that can't afford to switch to something else.
<farkuhar> tbh, I've never installed pkg-get myself; most of my hardware is sufficient for building ports directly on the machine that will use them. If I had an underpowered device lying around, I might try setting up pkg-get.
<ppetrov^> farkuhar, for me the pkg-get use is really convenient, installing stuff on my x230 thinkpad is a breeze
<ppetrov^> braewoods, that's what i have heard also
<groovy2shoes> almost everywhere i've worked has still been writing internal tools in Perl
<groovy2shoes> if they were running *nix in any capacity, they were still using Perl
<groovy2shoes> maybe not as extensively as they used to. there was always some interest in Python and the occasional Tcl script lying around.
<braewoods> Python. Ick. lol
<farkuhar> i recall that jaeger was toying around with the idea of rewriting pkgutils (or was it prt-get?) in python first, where iterating towards a polished product would be faster than a C rewrite.
<groovy2shoes> a year ago i'd've been on board with that idea. i've gotten very annoyed with python lately ¬_¬
<farkuhar> +1 on the "Ick", braewoods. I couldn't imagine the inconvenience of having to be so careful with whitespace, if I had tried my rewrite in python rather than perl.
<braewoods> python is very slow in pure python code. no idea why but that's the general rule i've been seeing.
<groovy2shoes> and it's a big mess, too
<braewoods> about the only positive i see is the popularity
<braewoods> that said i wonder how hard it is to write Perl C modules
<braewoods> people in #perl previously told me it was not recommended
<groovy2shoes> really easy with Lua, Tcl, or Chibi Scheme. PITA with Perl, Python, or Ruby. that's been my experience.
<farkuhar> in my perl rewrite I made the assumption that pkgadd and pkgrm would continue to have the same userland interface as they do now. There's no attempt to connect with them through a C api, only through the shell or Unix execve.
<braewoods> perl rewrite of what?
<farkuhar> prt-get
<groovy2shoes> prt-get
<braewoods> hm.
<braewoods> i should get around to finishing that C library port of pkgutils
<braewoods> then you could eliminate the overhead of using pkgadd / pkgrm though probably doesn't make much difference for prt-get most of the time
<braewoods> due to build time
<braewoods> i do know one trick that could improve the package processing time for pkgadd but it would require changing the way we produce binary packages
<braewoods> if you make the first file of the tar archive into a file listing, then you could avoid an extra pass through the tar archive which can be expensive
<braewoods> you need the file listing to do some early validation tests
<braewoods> or could just use an archive format that supports a central directory listing like ZIP does :P
<farkuhar> you mean like explicitly listing a footprint file on the line where bsdtar creates the pkg.tar.gz?
<braewoods> farkuhar: well just a path file listing would be helpful. i believe ARCH's pacman stores it as a plain text file just containing all the full file paths of the tarball
<braewoods> it helps avoid an expensive pass to collect metadata for validation tests that the operation can proceed as planned
<braewoods> .FILELIST or something i believe
<farkuhar> shouldn't be too hard to implement. We have a footprint file already, just send it to cut -f 3.
<groovy2shoes> some mechanism for metadata in packages would be nice
<braewoods> it's more of a cache than anything due to tar not being seekable
<braewoods> the metadata files are stored as the first files tarred up by their makepkg
<braewoods> they don't actually get installed, they're just used by pacman to store quick data or other stuff
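A minimal sketch of the listing-first idea (package name, paths, and the .FILELIST name are invented; pkgmk does not do this today): generate the path list from the real package contents and name it first on the tar command line, so it becomes the first archive entry and can be read without walking the whole archive.

```shell
pkgdir=$(mktemp -d)
pkgfile=$pkgdir.pkg.tar.gz           # invented package filename
mkdir -p "$pkgdir/usr/bin"
printf '#!/bin/sh\necho hi\n' > "$pkgdir/usr/bin/demo"

(
  cd "$pkgdir"
  # Build the listing from the actual contents, not the footprint:
  find . -mindepth 1 ! -name .FILELIST | sed 's;^\./;;' | sort > .FILELIST
  # Naming .FILELIST first makes it the first archive entry:
  tar -czf "$pkgfile" .FILELIST usr
)

tar -tzf "$pkgfile" | head -n 1      # -> .FILELIST
tar -xzOf "$pkgfile" .FILELIST       # the path list, without a full pass
```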
<cruxbot> [opt.git/3.6]: clang: 14.0.4 -> 14.0.5
<groovy2shoes> do all of the various versions of tar have that limitation? i know GNU tar and BSD tar store filenames differently, and pax probably does, too, but i don't know the details
<cruxbot> [opt.git/3.6]: compiler-rt: 14.0.4 -> 14.0.5
<cruxbot> [opt.git/3.6]: lld: 14.0.4 -> 14.0.5
<cruxbot> [opt.git/3.6]: lldb: 14.0.4 -> 14.0.5
<cruxbot> [opt.git/3.6]: llvm: 14.0.4 -> 14.0.5
<cruxbot> [opt.git/3.6]: polly: 14.0.4 -> 14.0.5
<braewoods> groovy2shoes: the seek limitation?
<groovy2shoes> yeah
<braewoods> why wouldn't they? tar is naturally not seekable at the binary format level.
<braewoods> it's also commonly compressed as a giant stream
<groovy2shoes> because there's more than one tar format
<braewoods> these are naturally non-seekable by design
<braewoods> i can't say i've known any tar format to be seekable. even if it was, the compression layer makes it unseekable.
<groovy2shoes> ah
<braewoods> you have to decompress until the part of the archive you want to use shows up
<braewoods> the compression layer is the main reason this ends up being expensive
<braewoods> otherwise you could probably skip through pretty fast
<braewoods> another option is to switch the archive format in use
<braewoods> some are competitive with TAR and actually seekable
<braewoods> https://en.wikipedia.org/wiki/Lzip is one i've seen suggested before
<braewoods> hm
<braewoods> oh wait this is just compression
<groovy2shoes> yeah, but good compression. i use lzip a lot.
<groovy2shoes> looks like none of the usual unix archives are seekable. cpio and ar seem to both have the same problem. hm.
<braewoods> zip definitely is but support for non-DEFLATE algorithms is pretty hit or miss.
<groovy2shoes> yeah, but zip is also... not very unixy ;)
<braewoods> infozip can store unix filesystem stuff but i wouldn't count on it
<braewoods> i suspect the only reasonable option would probably be 7-zip
<braewoods> most archive + compression options are windows only or not very unix friendly
<groovy2shoes> yeah
<braewoods> oh, well this shuts that idea down
<groovy2shoes> or they use kinda awkward tools. i use zpaq a lot for backups, and it's great, but it makes no effort at all to function like a normal unix program.
<braewoods> i think the current situation is the downside of a basic unix philosophy of every tool doing a single job well
<braewoods> it basically destroys any advantages you can get from tight integration
<braewoods> such as archive seeking
<groovy2shoes> i'd done 7z before, with the addition of an mtree listing to the archive. but then we're just back to adding a manifest to the tarball.
<farkuhar> braewoods, would this one-line change to pkgmk create tarballs with the first file you want? http://sprunge.us/UsXaeU
<groovy2shoes> not if there's a footprint mismatch
<braewoods> eh...
<braewoods> no
<braewoods> you should use find or similar to generate the file
<braewoods> let me see
<groovy2shoes> you'd probably want `cut -f3-`, in case the file name contains embedded delimiters
<braewoods> the idea is to generate it not from the footprint but from the actual package contents
<braewoods> the footprint won't be reliable enough
<groovy2shoes> but the bigger thing is it'll be missing files if you're ignoring new, and might be listing files that don't exist if you're ignoring missing
<braewoods> footprint wasn't intended for this now that i think about it
_moth_ has quit [Remote host closed the connection]
<braewoods> something like
_moth_ has joined #crux
<braewoods> find $PKGDIR | sed "s;^$PKGDIR/;;"
<braewoods> it may need more finetuning but that's the idea
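Trying braewoods' command against a throwaway tree shows the prefix stripping in action (paths invented). One of the finetuning points: find prints the top directory itself as its first line, and that line has no trailing slash, so the sed pattern leaves it untouched.

```shell
PKGDIR=$(mktemp -d)
mkdir -p "$PKGDIR/usr/lib/demo"
: > "$PKGDIR/usr/lib/demo/placeholder"

# List everything under the staging dir, then strip the staging prefix;
# the first line (the staging dir itself) slips through unmodified.
find "$PKGDIR" | sed "s;^$PKGDIR/;;"
```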
<farkuhar> ah, i see what you're getting at now.
<braewoods> yea, it needs to hold the path of every file in the package, regardless of the type
<braewoods> the main reason this is needed is to check for file conflicts
<braewoods> otherwise an expensive pass through the archive to collect such metadata is needed
<braewoods> if you can reduce to a single pass you can cut package processing time in half
<groovy2shoes> then a specially-crafted archive could trick pkgadd into unwittingly obliterating files. just leave some files out of the manifest.
<groovy2shoes> a C program using libarchive could make a single pass through the archive, but still check each file for conflict before writing it. but then if you have to bail halfway through, you'd wind up with a partially-extracted archive.
<groovy2shoes> hmm
SiFuh has quit [Remote host closed the connection]
<braewoods> groovy2shoes: easily solved by a feature i was planning to add
<braewoods> the ability to rollback the transaction due to an unexpected failure
<groovy2shoes> that would be a good feature
<braewoods> basically i don't make any permanent changes to the previous filesystem contents (that were being tracked by pkgutils)
<braewoods> this is mostly needed by pkgadd
<braewoods> pkgrm doesn't really have this issue
<groovy2shoes> yeah
SiFuh has joined #crux
<braewoods> it was intended to cover external interruptions but it also needs to cover things like disk space exhaustion
<groovy2shoes> yeah
<braewoods> i figured out a way to handle it in any case
<farkuhar> that solves the problem of having to bail out in the middle of a partially-extracted archive, but the other problem of a "specially-crafted archive" with an incomplete manifest can easily be guarded against. Just refuse to unpack any files that weren't found on the manifest during pre-validation.
<braewoods> or just fail the whole operation if the archive is found to be malformed during processing
<braewoods> a properly built package should never trigger this in any case
<farkuhar> if you have access to an OpenBSD system, read the manpage for pkg_add(1). The section called "Technical details" gives a complete overview of an elaborate unpacking process, which involves a staging area for temporary extraction before finally merging the changes onto the host filesystem.
<braewoods> i have a different idea in mind i think... the problem with an isolated staging area is the moving cost.
<groovy2shoes> not ideal for large packages, either
<braewoods> i could use a common subdirectory for the new files though to make it easier to find leftovers from a failed operation
<braewoods> the main thing i'd want to do is guarantee that rename() will work and that's only possible if i extract to the final directory of each file or a subdir of that directory
<braewoods> since that's the only way to avoid crossing filesystem boundaries
<braewoods> rename being the final pass i do once i've completed everything else in the operation
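The constraint braewoods describes, rename(2) being guaranteed to work only within one filesystem, can be sketched with mv (which uses rename when source and target share a filesystem). The paths and the staging-directory name here are invented:

```shell
destdir=$(mktemp -d)                 # stands in for e.g. /usr/bin
printf 'old\n' > "$destdir/tool"     # the file already on disk

# Stage the replacement inside a subdirectory of its final directory,
# so the last step can never cross a filesystem boundary:
mkdir -p "$destdir/.pkgadd-stage"
printf 'new\n' > "$destdir/.pkgadd-stage/tool"

# Final pass: one rename() per file, run only after everything else
# in the operation has already succeeded.
mv "$destdir/.pkgadd-stage/tool" "$destdir/tool"
cat "$destdir/tool"                  # -> new
```

Until the final mv, the old file is untouched, which is what makes rollback after an interruption straightforward.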
<ocb> heya cruxrs
<ocb> how are you? :)
<ppetrov^> there was an interesting discussion about package compression
<SiFuh> lz4
frinnst has quit [Remote host closed the connection]
Guest57 has joined #crux
Guest57 has quit [Client Quit]
<farkuhar> groovy2shoes: i just pushed a quick fix for pkgmeek. download a fresh copy if you haven't started your sysup testing yet.
<groovy2shoes> okay, thanks :)
<ppetrov^> what does pkgmeek do?
_moth_ has quit [Ping timeout: 255 seconds]
cybi has joined #crux
<groovy2shoes> ppetrov^, it's meant as a pkgmk replacement, with less code, and more straightforward.
<ppetrov^> thanks
<groovy2shoes> farkuhar, it looks like pkgmeek does a bit of extra work when used with `-do` (the work dir preparation and the up-to-date check in particular). i'm not actually sure how pkgmk does it, but regardless i'd expect it to only fetch the sources
<farkuhar> groovy2shoes, thanks for the feedback. The work dir preparation is meant to accommodate all sorts of configurations, either the sources downloaded right into the ports tree, or saved in a central location. The download step eventually moves the sources to where the user meant to put them, but first they might appear in the work directory.
<groovy2shoes> ah, okay
<farkuhar> it's not ideal for respecting filesystem boundaries, i agree, so maybe some rethinking of that step is warranted. i was building out from the base that Fun wrote four years ago, without trying to deviate too much from his straightforward approach.
<groovy2shoes> it's not a big deal, it was just a little surprising
cybi has quit [Read error: Connection reset by peer]
_moth_ has joined #crux
jue has quit [Quit: killed]
samsep10l has quit [Quit: leaving]
cybi has joined #crux
<cruxbot> [contrib.git/3.6]: docker-compose: updated to version 2.6.0
<cruxbot> [contrib.git/3.6]: runc: updated to version 1.1.3
<cruxbot> [contrib.git/3.6]: docker: updated to version 20.10.17
<cruxbot> [contrib.git/3.6]: containerd: updated to version 1.6.6
cybi has quit [Read error: Connection reset by peer]
cybi has joined #crux
<farkuhar> groovy2shoes: by design, pkgmeek doesn't have any modifiable subroutines. The handful of Pkgfiles that rely on being able to redefine unpack_source() will probably fail. The bash script itself hints at the recommended way to accomplish what the maintainer wanted, but an example is always more instructive: http://sprunge.us/Q2XChf
<cruxbot> [contrib.git/3.6]: open-vm-tools: updated to version 12.0.5-19716617
<braewoods> farkuhar: i can see a point to that but it's worth remembering that pkgfiles can execute code in the regular top level of the file since it's a shell script still. if that was intended for security it doesn't really help.
<ppetrov^> hey, graphene installs stuff in /usr/libexec; isn't CRUX supposed not to use this folder? I don't mind, just asking.
cybi has quit [Ping timeout: 248 seconds]
<braewoods> libexec is used by libraries for stuff that shouldn't be exposed under normal PATH
<ppetrov^> yes, however the handbook states: /usr/libexec/ is not used in CRUX, thus packages should never install anything there. Use /usr/lib/<prog>/ instead.
_moth_ has quit [Remote host closed the connection]
_moth_ has joined #crux
cybi has joined #crux
<farkuhar> braewoods: understood. the main security issue i was aiming to solve was to have two calls to check_signature(), one before the Pkgfile is sourced, and one before the sources are unpacked.
<farkuhar> but having all subroutines read-only also offers some measure of predictability, so the build process is not wildly different from one port to the next.
<braewoods> you always have the option of using special function names if you want to allow extension or something
<braewoods> most of the time people want to keep the original function behavior and just add to it instead of completely replacing it
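A small bash sketch of both ideas together: core subroutines locked down with readonly -f, plus an opt-in hook under a special name that a Pkgfile can define to add behaviour rather than replace it. The function names here are invented, not pkgmeek's actual API.

```shell
#!/bin/bash

unpack_source() { echo "unpacking sources"; }
readonly -f unpack_source            # a sourced Pkgfile can no longer redefine it

run_hook() {
    # Call the named function only if the Pkgfile chose to define it
    declare -F "$1" > /dev/null && "$1"
    return 0
}

# What a Pkgfile might provide, extending the default step:
post_unpack() { echo "applying local patches"; }

unpack_source
run_hook post_unpack
run_hook post_build                  # undefined hook: silently skipped
```

Running it prints "unpacking sources" then "applying local patches"; an attempt by a Pkgfile to redefine unpack_source after the readonly would fail with an error instead of silently changing the build.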
samsep10l has joined #crux
maledictium has quit [Remote host closed the connection]
ppetrov^ has quit [Quit: Leaving]
cybi has quit [Ping timeout: 248 seconds]
cybi has joined #crux
cybi has quit [Read error: Connection reset by peer]
cybi has joined #crux
cybi has quit [Ping timeout: 246 seconds]
cybi has joined #crux
tilman has quit [Ping timeout: 276 seconds]
tilman has joined #crux