IRC channel logs

2021-03-01.log


<stikonas>hmm, newer ar and ranlib have deterministic mode...
<stikonas>(but that's only in 2.20... much newer than what we have)
<Hagfish> https://news.ycombinator.com/item?id=26298022 interesting link to "One Man Unix" on hackernews (no discussion yet)
<stikonas>fossy: do we want to regenerate files like https://github.com/hermitcore/binutils/blob/36e4dce69dd23bea9ea2258dea35f034b6d6351c/bfd/bfd-in2.h ?
<stikonas>need to run "make headers"
<stikonas>although, running this produces no changes
<stikonas>hmm, and I still can't get ar to create deterministic archives :(
<stikonas>patched mtime,uid, gid to 0, file mode to 0644 but there is still something left...
<stikonas>I guess diffoscope time...
<stikonas>hmm, timestamps are still there for some reason
<fossy>stikonas: yes, regenerate all files
<fossy>even if the results are identical, the entire point is so one can verify that
<stikonas>ok...
<stikonas>will do so
<stikonas>don't merge that PR yet anyway
<stikonas>I'm still trying to find what introduces timestamps
<stikonas>probably some stat call
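[Illustration, not from the channel: a minimal sketch of the kind of header normalization stikonas describes, assuming the common ar layout (8-byte magic, then 60-byte member headers with fixed-width ASCII fields). The function name normalize_ar is hypothetical. It only rewrites the mtime, uid, gid and mode fields of an existing archive; it does not explain where the remaining timestamps in binutils 2.14 come from.]

```python
AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60  # name 16, mtime 12, uid 6, gid 6, mode 8, size 10, end marker 2


def normalize_ar(data: bytes) -> bytes:
    """Return a copy of an ar archive with mtime/uid/gid set to 0 and mode to 644.

    Assumption: every member (including special ones such as a symbol
    table) uses the standard 60-byte header, so all are treated alike.
    """
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    out = bytearray(data)
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(data):
        header = data[pos:pos + HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {pos}")
        size = int(header[48:58].decode("ascii").strip())
        out[pos + 16:pos + 28] = b"0".ljust(12)   # mtime
        out[pos + 28:pos + 34] = b"0".ljust(6)    # uid
        out[pos + 34:pos + 40] = b"0".ljust(6)    # gid
        out[pos + 40:pos + 48] = b"644".ljust(8)  # mode, octal text
        # Member data is padded to an even length.
        pos += HEADER_LEN + size + (size & 1)
    return bytes(out)
```

[Two archives that differ only in those four header fields normalize to identical bytes; any difference that survives (for example inside the member data) is what a tool like diffoscope would then have to locate.]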
<pabs3>I'm getting an expired cert on bootstrappable.org, anyone know who can fix that?
<siraben>Oh dear, same here
***mephista is now known as spicy_icecream
<siraben>trying to reduce Nixpkgs' stdenv be like
<siraben> https://matrix.org/_matrix/media/r0/download/matrix.org/aJfZEhIrRlCjEqoQaWRnsTlJ/stdenvMinimal.dot.png
<siraben>still need to get rid of gcc, bash and other miscellanea
<Hagfish>wow, that image looks like it's the result of a stress test for graphviz, but i assume it's genuine data and not generated by some sort of runaway fuzzing process :)
<Hagfish>i'd love to see a "before and after" comparison if you do get rid of gcc and bash etc.
<Hagfish> https://www.perl.com/article/the-hijacking-of-perl-com/ "Features such as two-factor authentication probably would have saved us much of this trouble"
<Hagfish>i guess there's a lesson in there
<Hagfish>in terms of bootstrapping, they say: "This incident only affected the domain ownership of Perl.com and there was no other compromise of community resources."
<siraben>Hagfish: with the full stdenv graphviz takes forever to finish on my machine
<OriansJ`>pabs3: the person you need to notify is rekado_ about the cert expiration.
<OriansJ`>Hagfish: I think the lesson is more no one does security until after its lack results in a great deal of pain for them. Even those who should know better.
<OriansJ`>Depending on such things as a dns record alone for security is a bad design
<OriansJ`>In security, you are only as secure as your weakest link
<Hagfish>i think people haven't fully realised what an attack vector domain registrars are
<Hagfish>or they haven't thought about the financial incentives for attackers
<Hagfish>interestingly, although control of a domain can allow an attacker to obtain a new SSL certificate, that certificate would appear in public transparency logs, so a domain owner could detect such an attack
<OriansJ`>Hagfish: well security classification is often ranked on the $$$ required to break the security. Once the value of the target exceeds the cost of breaking its security, its compromise is certain.
<Hagfish>or at least, if you're not attacked, it's due to luck
<OriansJ`>Hagfish: I think it is more of the "You don't have to outrun the bear, just the person next to you"
<Hagfish>yeah
<Hagfish>i guess a good threat model will factor in how many bears there are, and how many runners
<OriansJ`>well to a degree that seems reasonable but few people know how easy it would be to put lightyears between you and the average runner.
<Hagfish>nuclear propulsion? ;)
<OriansJ`>A single offline system, binary white-listing and 2-factor auth puts one light years ahead of most organizations.
<Hagfish>oh, absolutely
<Hagfish>people seem to imagine that cyber security involves things like "military grade firewalls", when in reality it's just "pull out the network cable, dummy"
<OriansJ`>That and contract terms involving stiff penalties for contracted parties for violating security requirements (That is a hard sell outside of Military Contracting)
<Hagfish>in theory cyber-security insurance should fix this in the private sector
<Hagfish>insurance companies should compete on low premiums by having their auditors do security reviews
<OriansJ`>Hagfish: The insurance model is a race to checkbox the problem and shift blame around.
<Hagfish>not if insurance companies want to make a profit
<Hagfish>well, i guess they will do the standard tricks of saying "sorry, your policy doesn't cover payouts for these sorts of attacks"
<OriansJ`>Hagfish: 100% profit if you always blame the company that gets breached.
<OriansJ`>you needed to complete section 12 but didn't
<Hagfish>right
<Hagfish>i guess the checkbox is "see, we have cyber insurance", when that cyber insurance never pays out
<OriansJ`>Government penalties might work if set properly though.
<Hagfish>the GDPR setting fines to a percentage of _revenue_ is a nice approach
<OriansJ`>Hagfish: multiples of _revenue_ not percentage ensures compliance.
<Hagfish>even 1x of revenue would be a death sentence to most companies
<OriansJ`>literally do this and die; it only takes a few companies to shake the industry into getting their shit together.
<Hagfish>there is value in making an example of a few companies
<OriansJ`>Like current regulations on nuclear material transportation.
<OriansJ`>You screw up once, you are done and your investors lose *EVERYTHING*
<Hagfish>we don't treat data the same way as nuclear material (although opinions are starting to move in that direction)
<OriansJ`>well it is just as dangerous in skilled hands.
<Hagfish>calibrating these penalties is quite difficult for governments
<Hagfish>you don't want to prevent all investments in tech companies (or tech companies from operating in your jurisdiction)
<OriansJ`>Hagfish: starting with don't store any data unless it is explicitly needed to operate your business seems like a good starting point.
<Hagfish>or some nasty endemic problem like Spectre turning up one year and causing every single tech business to go bankrupt
<Hagfish>yeah, GDPR has finally established that as a benchmark
<OriansJ`>Hagfish: Well in spectre's case, it would have just killed Intel
<Hagfish>part of the problem is that there is so little competition among tech companies
<Hagfish>if there were 10 different Intels, we could afford for a couple of them to go out of business
<OriansJ`>Hagfish: libre hardware would solve that too
<Hagfish>yeah, i think the tensions between "intellectual property" and the free market are becoming clearer with each passing year
<OriansJ`>A full source code/design exception for manufacturers of computing products.
<OriansJ`>Hagfish: please don't use that phrase.
<Hagfish>yeah, that would be a nice 80:20 solution
<Hagfish>i was going to say "intellectual monopoly"
<OriansJ`>not much better
<Hagfish>or were you questioning "free market"?
<OriansJ`> https://www.gnu.org/philosophy/not-ipr.html
<Hagfish>unfortunately there doesn't really exist a well established and accurate term which captures this artificially created and amalgamated set of legal privileges
<Hagfish>so i just used scare-quotes, like that article does
<OriansJ`>well sometimes it is best to break the argument into smaller pieces to provide more accurate and useful detail. For example there is by definition a big cost in the market for the existence of patents (By design to encourage the disclosure of technical implementation details). But not so much with Trademarks which are a traditional solution in the market to enable one to identify products from companies which have certain reputations.
<Hagfish>yeah, that would be more nuanced
<Hagfish>another rhetorical trick might have been to say something like: IPR (Innovation Prevention Racket)
<Hagfish>i.e. hijacking the existing term, rather than scare-quoting it
<OriansJ`>Hagfish: potentially proper propaganda technique there. Although personally I prefer being as clear and meaningful as possible.
<Hagfish>i think that "Innovation Prevention Racket" conveys quite a lot of meaning for how succinct it is
<OriansJ`>It probably does. However when discussing how to actually solve a specific problem, more precise terms and solutions generally would enable more productive discourse
<Hagfish>i suppose "patent and copyright law" wouldn't have been that much more to type
<Hagfish>and i could have avoided this unnecessary tangent :)
***Noisytoot is now known as impostor
<OriansJ`>back to the previous point on how to encourage organizations to improve security
<OriansJ`>User data should be thought of as toxic waste, not a priceless resource to be collected at all costs.
<OriansJ`>For one can not leak data that one does not have.
<OriansJ`>The lost laptop with +1M patient records just shouldn't even be a possibility.
<OriansJ`>let alone a single config mistake away from allowing an attacker to access/modify all patient records for a hospital.
***impostor is now known as Noisytoot
<OriansJ`>attacks like the recent solarwinds could have quickly been detected and addressed if reproducible builds had been done for all binaries shipped to customers.
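[Illustration, not from the channel: a minimal sketch of the check that reproducible builds make possible, assuming an independent party can rebuild the shipped binary from the published source. The function names and file paths are hypothetical.]

```python
import hashlib


def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def matches_rebuild(shipped_path, rebuilt_path):
    """True only if the vendor's binary is bit-for-bit equal to the rebuild."""
    return sha256_file(shipped_path) == sha256_file(rebuilt_path)
```

[With a reproducible build, a False result means the shipped binary does not correspond to the source it claims to come from, which is the signal that was missing in a build-server compromise.]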
***ChanServ sets mode: +o rekado_
***rekado_ is now known as rekado
<Hagfish>yeah, it wouldn't be too difficult for a government to demand that all updates from a vendor be signed with a private key, and all uses of that key be recorded in a public log
<Hagfish>if the vendor fails to sign the update, then they pay a fee for the paperwork of registering a new key (and for delivering the update late), and if the government find an update that isn't in the public log then the company pay a pre-agreed fee
<Hagfish>by making the failure modes of a contract unambiguous, it becomes easier to invest in the proper engineering (rather than paying lawyers to argue about the contract)
<OriansJ`>indeed
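[Illustration, not from the channel: a minimal sketch of the "public log of key uses" idea, assuming a simple hash chain stands in for the log. The actual signing step is left out, and the class and method names are hypothetical. Each entry commits to the previous one, so an entry cannot be altered later without breaking the chain.]

```python
import hashlib

GENESIS = "0" * 64  # placeholder link before the first entry


class SigningLog:
    """Append-only log of update digests, linked by a hash chain."""

    def __init__(self):
        self.entries = []  # list of (update_digest, chain_link)

    @staticmethod
    def _link(previous, digest):
        return hashlib.sha256((previous + digest).encode("ascii")).hexdigest()

    def record(self, update: bytes) -> str:
        """Record that an update was signed; return the new chain link."""
        digest = hashlib.sha256(update).hexdigest()
        previous = self.entries[-1][1] if self.entries else GENESIS
        link = self._link(previous, digest)
        self.entries.append((digest, link))
        return link

    def contains(self, update: bytes) -> bool:
        """Check whether an update found in the wild was ever logged."""
        digest = hashlib.sha256(update).hexdigest()
        return any(d == digest for d, _ in self.entries)

    def verify(self) -> bool:
        """Recompute the chain to detect entries altered after the fact."""
        previous = GENESIS
        for digest, link in self.entries:
            if link != self._link(previous, digest):
                return False
            previous = link
        return True
```

[An auditor holding a copy of the log can call contains() on any update seen in the field; an update that is missing from the log is the unambiguous failure case Hagfish describes.]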
<OriansJ`>or requiring full source to be provided that corresponds to the binary to get copyright protection would be a simple one to enforce (With reproducible builds)
<OriansJ`>with probably a grandfather clause which applies as of some near future date.
<OriansJ`>Probably could use trademark law to require a registered GPG signature for all companies and the requirement for the signatures of sold products to meet requirements.
<OriansJ`>So in theory just leveraging existing copyright and trademark processes with minor tweaks could easily improve existing security foundation for all businesses with minimal costs at a society level.
<OriansJ`>But if you are willing to abandon trademark and copyright benefits, it becomes trivial to skip those requirements.
<OriansJ`>It provides an opt-in model which is mandatory for companies that expect to generate revenue but doesn't impact FLOSS projects as copyright for written works (The source code) remains unchanged, only those that distribute binaries need to change to comply with the revised copyright law.
<OriansJ`>Trademark should incur a cost but copyright shouldn't
***gio_ is now known as gio
<roptat>stikonas, I've looked a bit at live-bootstrap, and I see there are some .kaem files with #!/bin/sh, does that mean /bin/sh is part of the bootstrap? do you consider it only as the build driver, or are you trying to replace it? or is it actually bootstrapped and I'm blind?
<stikonas>roptat: kaem files are run by kaem, not /bin/sh
<stikonas>that shebang is a bit confusing
<stikonas>it's there only for dev reasons, so you can run those scripts with sh too
<stikonas>but in live-bootstrap only kaem is used
<roptat>ah, I see
<stikonas>there are actually two kaems
<stikonas>one is 737 byte binary written in x86 assembly
<stikonas>that one is very limited
<stikonas>and then there is a simplified C version (I think built with M2-Planet or cc_x86) and has a few more features
<stikonas>(variable substitution and some escaping)
<roptat>so before m2-planet, you use the bootstrap kaem, then you build the better one and use it after m2-planet?
<stikonas>indeed
<roptat>nice
<stikonas>maybe they should have had different names
<stikonas>but I was not involved in that
<roptat>I'm wondering about the steps to get to guile, since you mentioned last time it was needed for gash and a few other utilities
<stikonas>hmm, well, so far we completely ignored guile
<stikonas>we built make with tcc, then used kaem in combination with make, then after some more trouble got bash
<roptat>fair enough ^^
<stikonas>(bash was a bit tricky because we also wanted to rebuild .y file)
<stikonas>and since we didn't have bison yet, we used heirloom yacc
<stikonas>roptat: so either mes / gash should be made compatible (might make sense for guix) or we build guile much later, e.g. after gcc
<roptat>I wonder if we could integrate that into guix, using kaem as the build driver instead of guile
<roptat>not sure exactly where the derivations and builders are generated, but that should be possible
<civodul>roptat: the goal is rather to start with a Scheme; that's kinda why Mes exists :-)
<roptat>but mes is way bigger than kaem, no?
<civodul>possibly
<roptat>I mean, we'd use it up to mes, then we'd use mes
<civodul>(i'm talking but i haven't looked into that very closely)
<civodul>actually i think that's +/- what janneke did in the full-source bootstrap branch
<roptat>though I'm pretty sure the build driver is still guile
<civodul>yes
<civodul>that's the elephant in the room
<roptat>I'm ok with guile on the host-side, but I think that means guile gets injected on the build side too, so you'd need it as a seed
<civodul>it's not "injected", it's %bootstrap-guile that shows up in the derivation graph
<roptat>actually I'm not entirely sure what happens on the build side exactly
<roptat>so %bootstrap-guile is an input to the derivation
<civodul>i think we need a strategy where we don't end up rewriting (guix build gnu-build-system) & co. in shell or whatever
<roptat>right
<civodul>see: guix graph -e '(@@ (gnu packages commencement) gash-boot)' -t derivation | xdot -
<roptat>so guile-bootstrap is built in a derivation, but I don't understand how, if we don't have guile yet
<roptat>is that the role of build-bootstrap-guile.sh?
<civodul>(gnu packages bootstrap) is where it's "built"
<civodul>i.e., extracted
<civodul>actually it's +/- described in https://guix.gnu.org/manual/en/html_node/Reduced-Binary-Seed-Bootstrap.html :-)
<stikonas>civodul: by the way, I'll be looking at binutils 2.14 timestamps. I'll let you know once I get it working (you raised that in https://issues.guix.gnu.org/45962).
<stikonas>it's now the same problem in live-bootstrap
<stikonas>(that binutils-2.14 is not reproducible)
<civodul>stikonas: oh nice, thanks!
<siraben>I just reviewed the bootstrap process of Nixpkgs today. It wasn't too bad and doesn't contain any cycles.
<siraben>Reducing the stage0 would bring the most improvements, it looks like.
<siraben>Here's the linux bootstrap https://github.com/NixOS/nixpkgs/blob/master/pkgs/stdenv/linux/default.nix
<civodul>the big thing is here: https://github.com/NixOS/nixpkgs/blob/master/pkgs/stdenv/linux/bootstrap-files/x86_64.nix
<civodul>those seeds include gcc 8.3.0, glibc, zlib, pcre, binutils, etc.
<civodul>128 MiB uncompressed
<stikonas>well, those are standard...
<civodul>sure, but since we're talking about reducing binary seeds...
<stikonas>well, I only meant it as, should be similar in what guix did when it went to reduced bootstrap seed
<siraben>Yes. Looks promising.
<OriansJ`>roptat: there is a different name for the bootstrap kaem: kaem-optional-seed as it is designed to be buildable by hex0. So only the hex0-seed is actually required to build if you have a shell that you trust. kaem-optional-seed is only needed if you wish to remove your init+shell from your bootstrap graph (as it can function as an init)
<OriansJ`>civodul: the issue with mes.c is that it is unable to run gash, gash-utils or bootar but that is why the blynn-compiler proposal for a scheme written in Haskell appeals to me (Mostly that it eliminates work from janneke and me while still moving the needle forward)
<OriansJ`>If nothing else, it forces Haskell programmers to prove that their code is bootstrappable beyond the subset we currently support in blynn-compiler.
<OriansJ`>siraben: I am extremely curious in what way you think stage0 could be improved as the steps between hex0, hex1, hex2, M0, cc_* and M2-Planet+mescc-tools are near the edges of reasonable human audit efforts
<OriansJ`>(as I had to include a safety margin for unknown future architectures)
<stikonas>OriansJ`: my reading of that sentence was that siraben meant reducing Nixos bootstrap seed to stage0 would bring most improvements
<stikonas>not reducing stage0 itself
<stikonas>but I might be wrong
<OriansJ`>stikonas: definitely a possible parsing of that sentence. But I do always actively look for improvements in the work that I do.
<stikonas>well, depends on what the goal is, optional init seed can definitely be made smaller, but I'm not sure if that will make the whole thing more readable...
<stikonas>at the moment kaem-optional-seed does quite a bit...
<OriansJ`>737 bytes to read kaem.run, print what it is executing, drop comment lines, spawn processes and halt if any one of them returns failure.
<OriansJ`>and as it is trivial to build (it is a hex0 program) its audit can be on the hex0 sources rather than the binary itself.
<OriansJ`>So in theory only the hex0 seed needs to be squeezed further to drop bytes and there is a large amount of room left to squeeze out
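[Illustration, not from the channel: a minimal Python sketch of the behaviour OriansJ` lists for kaem-optional-seed (read kaem.run, print each command, drop comment lines, spawn processes, halt on the first failure). This is not the real implementation, which is a hex0 program; the function name, the printed prefix, the whitespace-only argument splitting and the skipping of blank lines are all assumptions made for the sketch.]

```python
import subprocess


def run_script(path="kaem.run"):
    """Run each command in the script; stop at the first non-zero exit status."""
    with open(path) as script:
        for raw_line in script:
            line = raw_line.strip()
            # Drop comment lines (and, as an assumption, blank lines).
            if not line or line.startswith("#"):
                continue
            # Print what is about to be executed.
            print(" +> " + line, flush=True)
            # Naive whitespace splitting: no quoting or variable substitution.
            status = subprocess.call(line.split())
            if status != 0:
                # Halt: the remaining commands are never run.
                return status
    return 0
```

[Even in a high-level language most of the code is about reading and filtering the script, which matches stikonas' point that file handling and parsing dominate the seed.]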
<stikonas>well, I'm just saying, binary size depends on what tradeoffs you accept. E.g. one can write smaller binary that hardcodes 3 commands that are read from kaem.run, that will reduce initial binary seed, but will increase total amount of hex0 code
<OriansJ`>well kaem-optional-seed's hex0 is 415 lines (including comments and license header)
<OriansJ`>(it also includes just whitespace-only lines)
<OriansJ`>hex0 is 231 lines (comments, license header and whitespace-only lines included)
<stikonas>but most of kaem-optional-seed is dealing with reading and parsing kaem.run file
<stikonas>can probably go down to even less than hex0 if it doesn't have to deal with file stuff...
<OriansJ`>even if it takes you 1 day to read a single line and validate it, in 646 days you'll be done
<stikonas>exactly, but I'm not convinced that it's the best course of action to reduce binary size...