IRC channel logs

2023-06-24.log


<pabs3>fossy: stikonas suggested I ask you about these: how are you detecting pregenerated files in git? are you using any tools other than grep? do you have a list of greps you use?
<pabs3>same question for anyone else doing pregenerated file detection
<sam_>pabs3: from lanodan, https://git.sr.ht/~lanodan/deblob
<pabs3>nice, thanks.
<pabs3>I'm wondering if there is anything for tricky stuff, like generated C code
<pabs3>other stuff I'm aware of: Debian licensecheck, Debian suspicious-source (from devscripts), grepping images for gimp/inkscape/Synfig/POV-Ray/gnuplot, grepping html/svg for base64
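[Editor's note] The "grepping html/svg for base64" idea mentioned above could look roughly like the following. This is a minimal sketch, not a tool anyone in the channel referred to: the function name `find_base64_blobs`, the 64-character threshold, and the choice to look only at `data:` URIs are all assumptions made for illustration.

```python
import re

# Assumption: an "embedded blob" is a data: URI whose base64 payload is at
# least 64 characters long. The threshold is arbitrary and only filters out
# tiny inline icons.
DATA_URI = re.compile(
    r"data:([\w.+-]+/[\w.+-]+)?(?:;[\w=.+-]+)*;base64,([A-Za-z0-9+/=]{64,})"
)


def find_base64_blobs(text):
    """Return a list of (mime_type, payload_length) for each base64 data URI."""
    hits = []
    for match in DATA_URI.finditer(text):
        # The MIME type is optional in a data: URI, so fall back to a label.
        mime = match.group(1) or "unknown"
        hits.append((mime, len(match.group(2))))
    return hits
```

A hit only means that something is embedded; whether it is a pregenerated artifact still needs a manual look.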
<lanodan> https://salsa.debian.org/debian/devscripts/-/blob/master/scripts/suspicious-source interesting, I wonder how slow it is, one of my reasons for deblob was to get something fast enough to be used all the time, this one seems more appropriate for manual audits
<pabs3>lanodan: on my 2012-era desktop, it takes 1m13s to check the u-boot git repo.
<lanodan>That's pretty decent
<lanodan>For u-boot, suspicious clocks at "0m10.93s real 0m10.26s user 0m00.54s system"; deblob at "0m00.46s real 0m00.15s user 0m00.31s system"
<pabs3>so deblob is way faster
<lanodan>I guess for pregenerated you could modify the magic file, this way you don't have to read everything twice
<lanodan>Yeah, way faster but not the same goals at all, deblob catches known binaries where only reading 4096 bytes of each file is enough, suspicious lists non-source which is doomed to be slower (whole file + huge database)
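[Editor's note] The speed difference described above comes from reading only a fixed-size header of each file and matching it against known signatures. The sketch below illustrates that general technique; it is not deblob's actual implementation, and the signature table, the names `classify_header` and `scan_tree`, and the decision to skip `.git` are assumptions made for the example.

```python
import os

HEADER_SIZE = 4096  # only this many bytes of each file are ever read

# Illustrative signatures only; a real tool would carry a much longer list.
SIGNATURES = {
    b"\x7fELF": "ELF object",
    b"PK\x03\x04": "ZIP/JAR archive",
    b"\x00asm": "WebAssembly module",
    b"!<arch>\n": "ar archive (static library)",
}


def classify_header(header):
    """Return a label if the header starts with a known magic, else None."""
    for magic, label in SIGNATURES.items():
        if header.startswith(magic):
            return label
    return None


def scan_tree(root):
    """Walk root and return sorted (path, label) pairs for matching files."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: the VCS metadata directory is not worth scanning.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    header = fh.read(HEADER_SIZE)
            except OSError:
                continue  # unreadable files are skipped, not reported
            label = classify_header(header)
            if label is not None:
                hits.append((path, label))
    return sorted(hits)
```

Because each file costs one short read regardless of its size, a scan like this scales with the number of files rather than the size of the tree. Generated C code, as raised earlier in the log, has no magic number, so this approach would not catch it.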
<oriansj>notgulll: well the creation of a floppy disk manually is in theory possible (given steady enough hands and a magnetized needle) but in practical terms historically directly writing to memory is much easier to do (unless your system lacks a ROM or hardware which makes it possible)
<oriansj>lanodan: well, running it all of the time wouldn't fit most downstream source models, where changes would only be pulled every so often; but yes, a check running on every pull would catch a good bit