IRC channel logs

2020-08-22.log

<OriansJ`>bauen1: I would take some time to review it first but I always accept patches provided they don't introduce issues.
<OriansJ`>mihi: actually the crc32 of an empty file is 00000000 but the crc32 of those same zeros is 2144df1c and appending that produces a file with a crc32 of 189eb7b1 (stage0's hex and xeh make it very easy to build such files)
***terpri_ is now known as terpri
<OriansJ`>now doing FFFFFFFF does get a crc32 of FFFFFFFF but doing FFFFFFFFFFFFFFFF gets a crc32 of 2144df1c but FFFFFFFF00000000 gives us a crc32 of FFFFFFFF still
<OriansJ`>mihi: you are probably right if we just get the bits at the start of the file and the crc32 for the rest of the file to equal 00000000; then the crc32 will match; we can probably use half of the padding to make it happen for arbitrary rest of files.
<OriansJ`>I could possibly modify the crc32.c program I have to generate the needed values; although crc64 presents a much harder problem
<OriansJ`>bauen1: hex1_x86's table is at max 1024bytes (256*4) which when added to the 689bytes of binary is less than the 4KB page size; which means it can't hit unallocated memory
<OriansJ`>hex2_x86.S sbrk's 8,192,000 should be larger than the amount of memory required to build cc_x86.S or mescc-tools' hex2; which does proper calloc
<OriansJ`>bauen1: you are right about the names not being exactly correct. They were just sort of what felt good at the time I wrote it. So improvements and enhancements are of course welcome.
<OriansJ`>2144df1c bit inverted is DEBB20E3 but appending that doesn't get the desired result but appending E320BBDE does (FFFFFFFF)
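The CRC-32 values quoted above can be checked with a few lines of Python. This is a minimal sketch that assumes the checksum under discussion is the common zlib CRC-32; the helper names crc_hex and force_all_ones are made up for illustration and are not part of stage0 or crc32.c.

```python
import zlib


def crc_hex(data: bytes) -> str:
    """Return the zlib CRC-32 of data as 8 lowercase hex digits."""
    return format(zlib.crc32(data), "08x")


def force_all_ones(data: bytes) -> bytes:
    """Append the bit-inverted CRC in little-endian byte order."""
    inverted = zlib.crc32(data) ^ 0xFFFFFFFF
    return data + inverted.to_bytes(4, "little")


zeros = bytes(4)        # 00000000
ones = b"\xff" * 4      # FFFFFFFF

print(crc_hex(b""))             # 00000000
print(crc_hex(zeros))           # 2144df1c
print(crc_hex(ones))            # ffffffff
print(crc_hex(ones + ones))     # 2144df1c
print(crc_hex(ones + zeros))    # ffffffff

# For the four zero bytes the appended suffix is E3 20 BB DE.
print(force_all_ones(zeros)[4:].hex())                  # e320bbde
print(crc_hex(force_all_ones(zeros)))                   # ffffffff
print(crc_hex(force_all_ones(b"any other payload")))    # ffffffff
```

The same four-byte suffix trick works for any payload, which is why a plain CRC cannot resist deliberate tampering.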
<bauen1>OriansJ`: if "defeating" crc is as easy as inverting a qword and shuffling it around a bit, then it is trivial for a backdoor to do
<bauen1>OriansJ`: so i'm not really sure if embedding the hash / checksum in the binary itself and having it self-verify will do very much
<bauen1>OriansJ`: comparing (and differential compilation) and a sha256sum program however could probably go a lot further
<bauen1>OriansJ`: i was under the impression that hex1 uses 8 bytes per entry resulting in 256 * 8 = 2048 (still under the page size)
<bauen1>OriansJ`: and i kind of want to modify hex1 to only allow printable ascii characters as labels
<OriansJ`>bauen1: true however crc was just an example of a checksum we could put in the 8bytes available in the ELF header; finding an alternate checksum that is more computationally expensive would however probably be the better course of action.
<OriansJ`>probably right about it not doing much, as it is to just increase the amount of work/functionality that an attacker putting in a backdoor would need to handle. Thus making the hiding of the backdoor impossible at the individual ELF level (save for in something big like the kernel)
<OriansJ`>bauen1: 4bytes is the size of a register in x86 but yes in AMD64 the register size is 8bytes and thus the entries in hex1 are 8bytes in size.
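The table-size arithmetic from this exchange can be written out as a short sketch. It assumes one table slot per possible byte value (256 entries) and takes the 689-byte binary size and the 4KB page size from the discussion; the constant and function names are hypothetical.

```python
PAGE_SIZE = 4096         # 4KB page, as stated in the discussion
ENTRIES = 256            # assumption: one slot per possible byte value
HEX1_X86_BINARY = 689    # bytes, the figure quoted for hex1_x86


def table_bytes(entry_size: int) -> int:
    """Size of the label table for a given entry (register) size."""
    return ENTRIES * entry_size


x86_table = table_bytes(4)      # 4-byte entries on x86
amd64_table = table_bytes(8)    # 8-byte entries on AMD64
x86_total = HEX1_X86_BINARY + x86_table

print(x86_table)                # 1024
print(amd64_table)              # 2048
print(x86_total)                # 1713
print(x86_total < PAGE_SIZE)    # True
```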
<OriansJ`>I wouldn't put the printable ascii only requirement in hex1 but rather in the mescc-tools hex2 implementation (as it is just C code) and you could use hex2 to check hex1 binaries for such bad behavior and report where such things are found. hex1 itself is only supposed to be as large as required to get hex2 off the ground.
<OriansJ`>eg smaller hex1 -> better (as manually calculating offsets by hand sucks)
<bauen1>e.g. https://github.com/bauen1/mescc-tools-seed/commit/9a014284cea5eeaec0c9c6f7d8c0d13185c56c33
<bauen1>and a sort of backdoor into hex0-seed: https://gitlab.com/bauen1/stage0-backdoor
<bauen1>currently expects a slightly patched source code so syscalls follow the format `B8 beefbeef 0F05`
<bauen1>and doesn't really work with anything that tries to write past its own memory (according to p_memsz / p_filesz)
<dannym>janneke: If you have the time can you show me how to compile your fork of tinycc? (on GNU Mes for ARM)
<janneke>dannym: yeah, sure -- it's been a while since i did that for x86 and never did it yet for ARM
<janneke>i think it involved some silly changing of commented sections in a build script -- let me have a look
<janneke>dannym: i had MES_PREFIX point to a pre-built mes directory, because working on tcc often meant also changing stuff in mes, or adding tests
<dannym>janneke: Right now I'm trying to use: tcc-0.9.26-1103-g6e62e0e$ ./doit
<dannym>janneke: I get: /bin/sh: 1: etags: not found
<dannym>janneke: What package is etags in? Also, why do we need it for tcc? Also, does this script set up the guix environment or not?
<janneke>you don't need etags; that's for emacs source code navigation
<dannym>janneke: Well, ./doit seems to need it
<janneke>weird; via "configure/make" maybe
<dannym>+ ./configure --tccdir=/home/dannym/src/mes-tinycc/tcc-0.9.26-1103-g6e62e0e --crtprefix= '--extra-cflags=-DHAVE_FLOAT=1 -DHAVE_BITFIELD=1'
<dannym>Creating config.mak and config.h
<dannym>config.h is unchanged
<dannym>+ make ETAGS
<janneke>ah, that's in build-gcc.sh
<janneke>you can remove that
<dannym>janneke: Ok, removed it
<dannym>janneke: Thanks!
<dannym>janneke: Hmm, it seems to use gcc to compile tcc and not mescc ?
<janneke>yeah
<janneke>most of the work in tcc meant: instrumenting the tcc code with tracing, build with gcc and mescc, run both and compare traces
<janneke>so "doit" helps with automating that, i guess
<dannym>cp mes-source/gcc-lib/libc+tcc.a .
<dannym>cp: cannot stat 'mes-source/gcc-lib/libc+tcc.a': No such file or directory
<dannym>*makes symlink*
<dannym>I already made one with target ../mes
<dannym>source*
<janneke>good
<dannym>cp: cannot stat 'mes-source/gcc-lib/libtcc1.a': No such file or directory
<janneke>as you come to see, these scripts are of throwaway/edit-when-needed quality (or less)
<janneke>hmm
<janneke>13:21:09 janneke@dundal:~/src/mes/wip-arm
<janneke>$ find . -name libtcc1.a
<janneke>./mescc-lib/arm-mes/libtcc1.a
<janneke>./mescc-lib/libtcc1.a
<janneke>i also have ./gcc-lib/arm-mes/libtcc1.c
<janneke>but apparently we don't build it (anymore)
<dannym>(env) dannym@banana:~/src/mes$ find -name "libtcc1.*"
<dannym>./gcc-lib/arm-mes/libtcc1.c
<dannym>./lib/libtcc1.c
<janneke>dannym: are we still configuring --with-courage?
<janneke>build-aux/build-lib.sh:
<janneke>if $courageous; then
<janneke> exit 0
<janneke>fi
<janneke>(just before building libtcc1) -- to save time
<dannym>janneke: Oh. That explains a lot...
<janneke>the thinking was: --with-courage => port is not finished => don't waste time towards tcc yet
<janneke>i forgot, i was looking where to (re-)add libtcc1 build instructions and stumbled upon it just now
<dannym>janneke: *recompiles without --with-courage*
<janneke>dannym: what mes branch are you working on?
<dannym>janneke: master
<dannym>janneke: mescc for ARM backend has been finished for quite a while now
<dannym>janneke: We can do another release of Mes, I guess
<janneke>yeah, i've been postponing that too long
<dannym>janneke: However, I changed a few (very few) things for all archs on master, so we should retest guix bootstrap on x86 or something too
<janneke>guess we want some bug reports from vagrantc to be addressed and then make a release
<janneke>yeah, sure -- i also seem to remember some minor breakage (x86_64?)
<janneke>that was to be expected with such a major operation
<dannym>janneke: My non-ARM-specific changes in master are: lib/mes/ntoab.c (ntoab), build-aux: Increase test timeout to 20 s., core: Fix unreadchar on string port when unreading EOF., test: Test signed division.,
<janneke>yes, i didn't really worry about that -- just being accurate and thorough with testing is not something that comes naturally to me ;)
<dannym>janneke: I also made the __syscall things available for non-libc targets so we can use it when dividing by 0
*janneke loves guix
<janneke>ah yes, that's great
<dannym>janneke: (which was basically just cut&paste)
<dannym>into new files
<dannym>Which then makes it into (new) libmescc.a
<janneke>right, i remember
<janneke>hmm, HACKING on mes only talks about x86, and PORTING doesn't use (guix/git/mes.scm's) mescc-tools-0.7 yet
*janneke tries to play along a bit
<janneke>it would be great if we could remove all junk and make it so that a next port is easier to get into
*janneke runs guix environment -s armhf-linux --pure --ad-hoc bash coreutils diffutils gawk gcc-toolchain grep guile help2man make nyacc pkg-config sed texinfo mescc-tools
<dannym>janneke: You have an account janneke@banana.pyramid.dynv6.net on the ARM build server btw; but yeah, we can make sure that it works on transparent emulation, too (everything should work fine there by now)
<janneke>ah, sure -- i should use that!
<dannym>janneke: By now, you can also use "guix environment -s armhf-linux" on x86_64 and everything should just work
<dannym>janneke: (Good if someone tests that, too)
<dannym>In unknown file:
<dannym> 1 (scm-error misc-error #f "~A ~S" ("unhandled consta?" ?) ?)
<dannym>unhandled constant #<procedure >= (#:optional _ _ . _)>
<dannym>Uhhh
<janneke>ugh
<dannym>(Sounds like a guile thing; maybe need to clean more after make clean)
<dannym>After make clean, ./module/mescc/armv4/info.go and ./module/mescc/armv4/as.go are still there
<janneke>dannym: ah, see build-aux/GNUmakefile.in
<janneke>the clean target needs to be amended
<janneke>(there are simpler 'clean-go' and 'all-go' targets)
<dannym>janneke: Pushed cleaning fix; rebuilt; now the error message is gone, but it still fails
<dannym>*edits GNUmakefile to add "-x" to build target's $(SHELL) build.sh*
<janneke>hmm 'lib/linux/arm-mes-gcc/exit-42.S': No such file or directory
<janneke>ah -- wrong branch
<janneke>hmm -- no code for module (nyacc lang c99 pprint)
<janneke>ah, nyacc/we need guile-2.2
<janneke>this is better => guix environment --system=armhf-linux --pure --ad-hoc bash coreutils diffutils gawk gcc-toolchain@7 grep git guile@2.2 help2man make mescc-tools nyacc openssh-sans-x pkg-config sed texinfo
<janneke>dannym: hmm, it seems guile wants something like https://paste.debian.net/1160841/
<janneke>not sure what mes thinks about that, though
<janneke>(how wasn't this a problem before?)
<janneke>oh, numbers do not need to be unquoted either
<dannym>janneke: Not sure why it wasn't a problem before--worked fine for a lot of mescc tests...
<dannym>I think...
<dannym>janneke: The unquote in the macro is basically because I hate spooky variable capture
<dannym>janneke: Like, with a quoted >= it would use whatever >= is in the environment at the user site. Ugh
<dannym>janneke: Even the "if"... but I guess people don't overwrite that anyway
<dannym>janneke: In any case, trying your change on banana...
<janneke>yeah, np for now -- i'm just curious
*janneke is afk for a bit
<dannym>janneke: lib/tests/scaffold/7l-struct-any-size-array-simple.c fails on ARM gcc and so does lib/tests/scaffold/7r-sign-extend.c on ARM gcc, both for obvious reasons.
<dannym>janneke: (char is unsigned on ARM)
<dannym>janneke: For 7r-sign-extend, gcc even warns: ./../lib/tests/scaffold/7r-sign-extend.c:64:23: warning: large integer implicitly truncated to unsigned type [-Woverflow]
<dannym> char a[2] = { -1, -129 };
<dannym> ^
<dannym>./../lib/tests/scaffold/7r-sign-extend.c:100:23: warning: large integer implicitly truncated to unsigned type [-Woverflow]
<dannym> unsigned char b = -129;
<dannym>janneke: And on the 7l-struct-any-size-array-simple.c the struct field "d" is char.
<janneke>dannym: ah the unsigned char thing -- that was known, wasn't it?
<dannym>janneke: Yes
<dannym>janneke: The reason I didn't want to fiddle around in 7l-struct-any-size-array-simple.c myself is that it's testing something else (the other struct members are int); I think it's trying to find out whether it's possible to have a char unaligned in a struct
<dannym>janneke: I think it's an accident that the char is negative, or that it doesn't specify "signed char" (in which case it would work just fine)
<dannym>janneke: Do you still know what the test was supposed to test?
<dannym>(as for 7r-sign-extend, it seems obvious to me that it was meant to test sign extension of signed things. So should be fine to put "signed" on all the chars there)
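The effect of char signedness on the two test values (-1 and -129) can be simulated in a few lines. This is only a sketch: it models 8-bit storage with modulo-256 wraparound (what gcc does for out-of-range conversions), and the helper names are invented for illustration.

```python
def as_signed_char(value: int) -> int:
    """Store value in an 8-bit signed char and read it back as an int."""
    return ((value + 128) & 0xFF) - 128


def as_unsigned_char(value: int) -> int:
    """Store value in an 8-bit unsigned char and read it back as an int."""
    return value & 0xFF


for v in (-1, -129):
    print(v, as_signed_char(v), as_unsigned_char(v))
# -1 -1 255
# -129 127 127
```

With plain char being unsigned on ARM, a comparison against -1 after reading the value back fails, while an explicit "signed char" gives the same result on both architectures.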
<janneke>OK
<janneke>i can have a look for 7l-struct-any-size-array-simple.c
<janneke>from the 7x- number i know that it's a test that was isolated from the tcc sources
*janneke looks
<dannym>janneke: Thanks :)
<janneke>ah that __packed thingy
<janneke>hmm
<janneke>dannym: yes, it checks whether you can have an array of structs of any size and have them placed in memory without gaps
<janneke>dannym: iow, initially i was lazy and used 4 bytes for everything
<janneke>some of that kludge still lingered in mescc's struct code -- so it pretty much tests what the name says ;-)
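What "placed in memory without gaps" means can be sketched with Python's struct module. The record layout here (one 32-bit int followed by one signed char) is hypothetical and is not taken from the actual test file.

```python
import struct

# Hypothetical record: one 32-bit int followed by one signed char.
RECORD_PACKED = struct.Struct("=ib")    # "=": standard sizes, no padding
TWO_PACKED = struct.Struct("=ibib")     # two records back to back
TWO_NATIVE = struct.Struct("@ibib")     # "@": native alignment between members

# Two packed records laid out like an array without gaps.
blob = TWO_PACKED.pack(1, -2, 3, -4)
records = list(RECORD_PACKED.iter_unpack(blob))

print(RECORD_PACKED.size)   # 5
print(len(blob))            # 10
print(records)              # [(1, -2), (3, -4)]
print(TWO_NATIVE.size)      # platform dependent, larger when ints are aligned
```

A compiler that rounds every member up to 4 bytes would need 8 bytes per record instead of 5.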
<janneke>dannym: you're right about the signed char/sign extend thing, playing with signedness of char is just a missing feature
***terpri__ is now known as terpri
<janneke>arm-unknown-linux-gnueabihf-gcc -g -o arm-unknown-linux-gnueabihf-tcc -nostdinc ... tcc.c libtcc1.a libc.a => ntoab.c:54: undefined reference to `__aeabi_uidivmod'
<janneke>dannym: so...this needs div.c for gcc...hmm
<janneke>(or -lmescc for now...hmm)
***terpri_ is now known as terpri
<janneke>i copied build-x86.sh => build-arm.sh
<janneke>ah no, div.c should also be in libc+tcc!
*janneke adds $libmescc_SOURCES to libc_tcc_SOURCES="
<janneke>dannym: just for fun...wrt the struct size thing, there was also "mescc: Support --align, off by default."
<janneke>)
*janneke pushes some stuff to 'wip'
<dannym>janneke: Actually, earlier ARM also used 4 bytes for everything. Struct packing is not really portable.
<dannym>janneke: I mean all of ARM, not mes
<janneke>yeah, so i guess for ARM this test makes no sense?
<janneke>dannym: trying to link with arm-unknown-linux-gnueabihf-gcc sans libc, with div.c in libc+tcc, i now get more undefined functions: https://paste.debian.net/1160869/
<janneke>should be mostly trivial for you, i guess...
<dannym>janneke: No, armv7 can do struct packing just fine.
<dannym>janneke: So we can totally use it
<dannym>janneke: The question is why would a compiler like tcc use it? Sounds totally bogus to me
<dannym>janneke: That only concerns the packing of internal tcc structs in the RAM image of the running tcc, right? WTF...
<dannym>janneke: Should make no difference whatsoever to a compiler whether the compiler's source code uses packed structs or non-packed structs
<dannym>janneke: I just say that so we keep it in mind for the future. Maybe this part of tcc is unnecessarily convoluted?
<dannym>janneke: Right now, if we are sure that the test 7l-struct-any-size-array-simple is still meaningful with "signed char" instead of "char"
<janneke>dannym: yeah, the meaning does not change
<dannym>janneke: ... if we are sure then we can fix up the test on mes master
<dannym>janneke: To use "signed char"
***ChanServ sets mode: +o rekado_
***rekado_ is now known as rekado
<dannym>janneke: Right? Totally weird to use "packed" in the first place...
<janneke>dannym: i didn't really look into the why of struct packing; i only sought to recreate (bit for bit) the same data structures with mescc as that gcc did
<janneke>dannym: otherwise debug-printing / comparing trace logging becomes very hard / impossible
<dannym>janneke: I see. I think that's a good way to do it.
<janneke>dannym: so, i'm all for it to let that rest for now and first proceed building tcc and looking what we find
<dannym>janneke: Ok, I'll push the fixes to master.
<dannym>janneke: Both
<dannym>janneke: Hmmm... about the arm-unknown-linux-gnueabihf-gcc sans libc, with div.c, is it "uidivmod" or "uldivmod"?
<dannym>janneke: I guess we can just s/__aeabi_uidiv/__aeabi_uldiv/ in div.c, or we need both
<dannym>janneke: Even though sizeof(unsigned long) == sizeof(unsigned int) on ARM :P
<janneke>dannym: from the commented error messages (let's remove them) we need both -- or at least we need the others on x86/x86_64?
<dannym>janneke: Yeah, the __aeabi are gcc builtin names
<dannym>janneke: So when compiling with gcc and using their builtins (the latter is almost always the case), then we need those
<dannym>janneke: Trying it right now with gcc on banana
<dannym>__aeabi_uidiv used for unsigned long, apparently :P
<dannym>gcc 5.4.0
<dannym>int main(int argc) {
<dannym> unsigned long a;
<dannym> unsigned int b;
<dannym> return a/argc;
<dannym>}
<dannym>gcc -c a.c
<dannym>objdump -S a.o |grep __aeabi
<dannym>The division by argc is to make sure it can't optimize it ;)
<janneke>:)
<dannym>In the mes environment, I have gcc 7.5.0 ?
<dannym>Which gcc are we targeting for the comparison thing?
<janneke>i have gcc-7.5.0 now (i'm sure i had gcc-5.5 when building x86 tcc)
<dannym>gcc 7.5.0: unsigned long / something: __aeabi_uidiv
<janneke>i didn't expect a change here -- we could mandate gcc-5...it's just that gcc-7 is convenient nowadays with guix --prebuilt substitutes
<janneke>well, i get /home/janneke/src/tcc-boot/tccasm.c:218: undefined reference to `__aeabi_uldivmod'
<dannym>janneke: It's fine, we can just add the other one. But we have to be careful because sometimes gcc means unsigned long long (not a typo) by "ul"
<dannym>janneke: What types are used in that line?
<janneke>218: pe->v %= e2.v;
<janneke>both are ExprValue
<janneke>#ifdef CONFIG_TCC_ASM
<janneke>typedef struct ExprValue {
<janneke> uint64_t v;
<dannym>Yeah, so that's unsigned long long on ARMv7
<dannym>And on x86, actually ?
<janneke>yeah (wondering how we work/limp around this in mescc ...)
<janneke>yes, same
<janneke>guess mescc just ignores the most significant 4 bytes
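A small sketch of what ignoring the most significant 4 bytes would do to a 64-bit modulo such as "pe->v %= e2.v". It only illustrates janneke's guess and is not a statement about what mescc actually emits; the function names and sample values are made up.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mod_u64(a: int, b: int) -> int:
    """Modulo on full 64-bit unsigned operands."""
    return (a & MASK64) % (b & MASK64)


def mod_low32(a: int, b: int) -> int:
    """Modulo that only looks at the least significant 4 bytes."""
    return (a & MASK32) % (b & MASK32)


a = 0x1_0000_0005   # needs more than 32 bits
b = 3

print(mod_u64(a, b))      # 0
print(mod_low32(a, b))    # 2
print(mod_u64(5, 3) == mod_low32(5, 3))   # True: small values are unaffected
```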
<dannym>janneke: I see :)
<dannym>janneke: The "__aeabi" things are for gcc only, so we have to make sure to use prototypes compatible with what gcc expects
<dannym>janneke: Alternatively, we could also force usage of gcc's builtin library on gcc (-lgcc) instead
<dannym>janneke: Back when I did the stuff, it seemed that the rest of mes took care not to pull in any gcc-specific things
<dannym>janneke: Also, it's nice that our division algorithm is "independently tested" by having gcc use it in its compiled programs
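For readers wondering what a helper like __aeabi_uidivmod (32-bit) or __aeabi_uldivmod (64-bit) has to compute, here is a sketch of classic shift-and-subtract unsigned division returning quotient and remainder together. It is not the code in mes's div.c; the function name udivmod and its bits parameter are invented for this example.

```python
def udivmod(numerator: int, denominator: int, bits: int = 32):
    """Unsigned long division, one bit per iteration.

    Returns (quotient, remainder) for operands truncated to `bits` bits.
    """
    mask = (1 << bits) - 1
    numerator &= mask
    denominator &= mask
    if denominator == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in reversed(range(bits)):
        # Bring down the next bit of the numerator.
        remainder = (remainder << 1) | ((numerator >> i) & 1)
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1 << i
    return quotient, remainder


print(udivmod(7, 2))                        # (3, 1)
print(udivmod(0xFFFFFFFF, 10))              # (429496729, 5)
print(udivmod(2**64 - 1, 3, bits=64) == divmod(2**64 - 1, 3))   # True
```

Comparing against Python's built-in divmod is a cheap way to cross-check such a routine, in the same spirit as having gcc-compiled programs exercise it.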
<janneke>dannym: i did not think of -lgcc -- that could also work; i guess i felt "safer" providing everything in mes
<janneke>dannym: safer as in not accidentally pull in stuff from glibc or so -- standard libraries -- to try making gcc-built resemble mescc-built as much as possible
<janneke>dannym: if you like to try removing these workarounds and use -lgcc, feel free to have a go ;-)
<dannym>janneke: Yeah, sounds good
<dannym>janneke: But that also means that we have to provide gcc builtins
<dannym>janneke: Look at https://github.com/openzfs/zfs/pull/706 for what would be involved in that
<dannym>janneke: https://github.com/behlendorf/spl/commit/93b0dc92eab55f8729b4798b383d4670073ebddc arrrrgh
<janneke>dannym: no, we're using -fno-builtin, right?
<janneke>again, to stay close
<dannym>janneke: I mean we have to provide the interface the gcc compiler expects when it is emitting calls to gcc builtins
<dannym>janneke: For example gcc will emit division builtin calls
<janneke>dannym: yes, this __aeabi*div/mod stuff felt/feels uncomfortable
<vagrantc>/18/18
<janneke>arbitrary gcc-internals that could be fragile -- dunno
*vagrantc waves
<janneke>\o
<dannym>Hi vagrantc!
<dannym>janneke: Yeah, I think gcc people split off -lgcc exactly so you could use those but not libc
<dannym>janneke: If you wanted
<janneke>vagrantc: still thankful for your bug reports -- still postponing fixing them and making a release
<vagrantc>janneke: :)
<janneke>dannym: yeah, makes sense -- i really had no clue
<janneke>vagrantc: getting back to mescc work (dannym's fault!) anyway, at last
<vagrantc>managed to get it building again with gcc10 on i386, but amd64 is still grumpy with one of its tests
<vagrantc>dannym: thanks for the faults :)
<janneke>lol
<vagrantc>not sure mes belongs in a debian stable release at this point anyways
<janneke>right
<janneke>dannym: oohh, now i see the libmescc analogy -- which means that the patch i pushed to WIP is wrong, adding div.c to libc+tcc
<dannym>janneke: I think so, yes. I wouldn't say wrong, but that's not how gcc would do it, and not how mescc does it either
<dannym>janneke: It would totally work, but the split between libc and builtins does make sense
<janneke>dannym: the correct fix is to change the build-x86.sh/build-arm.sh script in tcc to add -lgcc (or -l mescc when building with mes)
<janneke>*mescc
<janneke>dannym: yeah, sure
<dannym>janneke: Yeah, I agree
<janneke>"love it"
<dannym>janneke: Did you rebase wip on master btw? The wip I checked out is kinda behind I think
<janneke>oops
<janneke>dannym: fixed
<janneke>dannym: eh, but as discussed, "b5d7e1749 build: Include div.c in libc+tcc.a." is probably wrong/not what we want
*janneke -> zZzzz
<dannym>janneke: Yeah
<dannym>janneke: Good night :)
<dannym>janneke: (__aeabi_d2ulz, from your error message list, is for floats to integer conversion btw; see <https://www.redhat.com/archives/edk2-devel-archive/2019-May/msg01546.html>)