IRC channel logs

2020-12-04.log

<OriansJ>pder: it should also build correctly with M2-Planet as well
<OriansJ>now I'm going to try to figure out how to break the generation of raw into smaller steps.
<yt>OriansJ: AArch64 hex0 is up https://github.com/oriansj/mescc-tools-seed/pull/12
<yt>thought we best start small and see if there are any big changes that need to be applied across the board
<OriansJ>hex0_AArcd64.M1 ? did you mean hex0_AArch64.M1 ?
<yt>ahhh I'd forgotten about that typo
<yt>that's the problem with thinking "I'll fix that later" XD
<yt>should be fixed now
<OriansJ>It'll take me a bit to validate the DEFINEs but I'll get to that shortly
<yt>do you trust objdump when verifying the opcodes? :)
<yt>I think when I was doing hex0 I even hand-encoded them all, but I gave up on that with the bigger tools
<yt>I didn't really try to minimise the number of opcodes, aiming for something that's more or less readable in M1
<OriansJ>yt: DATA_OFFSET ? couldn't $output DATA_OFFSET be better written as &output?
<OriansJ>or is &label not behaving correctly in hex2 for AArch64?
<yt>ah I'm sure it can, I'm sure I used &label in the later tools
<yt>hex0 lived at 0x00000000 originally when I started in qemu, but a real linux kernel didn't like that whatsoever
<yt>hence the botched in DATA_OFFSET
<OriansJ>well hex0 doesn't actually have support for labels at all but hex2 will accept any arbitrary base address
<yt>yeah, I'd missed that option to hex2 initially; maybe the default for the base address shouldn't be 0x0 as that's unlikely to generate a working executable
<OriansJ>yt: well hex2 by default generates KNIGHT binaries which are loaded at address 0x0 by default
<OriansJ>perhaps I could make hex2 show a warning for all other architectures if --base-address isn't set to something other than zero
<OriansJ>that would help people avoid hitting similar problems in the future.
<OriansJ>or would an error be better yt ?
<yt>I'm all for an error if it doesn't make sense to ignore it
<OriansJ>well one could use hex2 to build rom images and then using address 0x0 would be entirely reasonable.
<yt>ah fair enough, warning seems fine then!
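The check agreed on above (warn, rather than fail, when a non-KNIGHT target is linked at base address zero) might look roughly like the following. This is only a sketch in Python, not hex2's actual code: the function name, the architecture strings and the message wording are all assumptions made for illustration.

```python
def base_address_warning(architecture, base_address):
    """Return a warning string, or None when nothing looks suspicious.

    Hypothetical helper: KNIGHT binaries load at 0x0, and ROM images may
    legitimately use 0x0 too, so this only warns and never raises.
    """
    if architecture != "knight" and base_address == 0:
        return ("warning: --base-address is 0x0 for architecture "
                f"{architecture}; the result is unlikely to load as an executable")
    return None


print(base_address_warning("aarch64", 0x0))
print(base_address_warning("aarch64", 0x600000))  # None: a non-zero base was given
```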
<yt>a very useful error message: "A displacement of 1073742988 does not fit in 2 bytes"
<yt>would probably help if the displacement was in hex though :D
<OriansJ>yt: true and I certainly will add that to the backlog of nice-to-have features
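A small Python sketch of what yt is asking for: the same range check, with the displacement also shown in hex. The helper names are hypothetical and the assumption that the displacement is a signed value is mine; hex2_linker.c itself may check differently.

```python
def fits_signed(value, nbytes):
    """True when value is representable as a signed integer of nbytes bytes."""
    bits = 8 * nbytes
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def describe_displacement(value, nbytes):
    """Build a message that shows the displacement in decimal and in hex."""
    if fits_signed(value, nbytes):
        return f"A displacement of {value} ({value:#x}) fits in {nbytes} bytes"
    return f"A displacement of {value} ({value:#x}) does not fit in {nbytes} bytes"


print(describe_displacement(1073742988, 2))
# A displacement of 1073742988 (0x4000048c) does not fit in 2 bytes
```

In hex the value reads as 0x4000048c, which makes the large high part visible at a glance.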
<OriansJ>yt: warning message for hex2 is up; hopefully you find that obvious enough.
<yt>is that in https://github.com/oriansj/mescc-tools/commits/master ? don't see it yet
<yt>ah I see hex2_linker.c can't print hex in error messages yet? sounds like something for on the backlog :)
<OriansJ>7f7039e6e59a1d35e1fe9bbaa92c7806a4836cf1 is up on github
<yt>OriansJ: just ran into another little niggle trying to test hex0_AArch64.M1 with the change you suggested: aarch64 is little endian only (I think) but hex2 defaults to big endian even when architecture is set to aarch64
<yt>most tools (readelf, objdump, file) will fall over hard with a big-endian aarch64 ELF file
<OriansJ>yt: bot M1 and hex2 support --little-endian
<OriansJ>^bot^both^
<yt>support yes, default to with --architecture aarch64, no :)
<OriansJ>certainly a nice to have but unsure if setting Endianness is something that should be implicit when dealing with such low level files.
<yt>hmm, then maybe a warning (or error) for big-endian aarch64 might be more appropriate
<OriansJ>pder: sizeof(prog) wouldn't work in M2-Planet as prog isn't a type.
<OriansJ>well a warning for armv7l, x86 and AMD64 too
<yt>doesn't armv7l allow both big and little endian?
<OriansJ>yt: yes it does; hence just a warning
<OriansJ>as big endian armv7 is more rare as I understand it.
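The endianness warning discussed here could be sketched along these lines in Python. The set of architectures comes from the conversation (aarch64, armv7l, x86, AMD64); the function name, the lowercase architecture strings and the message text are assumptions, and this is not how M1 or hex2 are actually written.

```python
# Architectures named in the discussion as normally little-endian.
USUALLY_LITTLE_ENDIAN = {"aarch64", "armv7l", "x86", "amd64"}


def endianness_warning(architecture, big_endian):
    """Return a warning string for an unusual combination, else None.

    Only a warning, since armv7l can run in either byte order.
    """
    if big_endian and architecture in USUALLY_LITTLE_ENDIAN:
        return (f"warning: big-endian output requested for {architecture}; "
                "did you forget --little-endian?")
    return None


print(endianness_warning("aarch64", big_endian=True))
print(endianness_warning("knight", big_endian=True))  # None: no warning
```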
<yt>sounds good, yeah, TIL that armv7 is mostly little-endian anyway
<OriansJ>well I can't say that for certain
<yt>I've had more exposure to the theoretical bounds of the architecture than what's out there in the wild
<OriansJ>as I have yet to find an armv7 setup that is in big-endian mode
<OriansJ>but perhaps I just haven't looked in the right places or it is OS image dependent or etc
<OriansJ>if I could find an armv7 in big endian mode, it'll probably have MUCH nicer M1 definition files.
<yt>yeah, why did we ever settle on little-endian, it's so annoying when hand-editing hex files /s
<xentrac>yt: heh
<OriansJ>yt: it is cheaper in TTL circuits
<xentrac>why TTL? I don't think the logic family is a relevant distinction here
<xentrac>and the chips that established little-endian as the standard (8080, 6502) were NMOS or CMOS, not TTL
<xentrac>I've seen big-endian ARMs but not very often
<OriansJ>xentrac: fair; it generally required less chips when working with individually packaged gates when making a processor out of them
<xentrac>really? how come?
<OriansJ>8080 was based on a ?Datapoint? calculator design if I remember correctly which was little endian to reduce costs
<OriansJ>xentrac: simpler circuit if I remember correctly; saves about 8 gates
<xentrac>yeah, the 8080 was sort of a less crippled version of the 8008, which was a clone of the Datapoint intelligent terminal, which was TTL for speed
<xentrac>Datapoint ended up not using the 8008 because it was too slow: not as slow as CD4000-family discrete CMOS gates, but slower than the Datapoint 2200 design made out of discrete TTL gates
<xentrac>most of my notes on the 8008 are in an extended digression in https://dercuano.github.io/notes/computer-algebras-ii.html
<OriansJ>anyway; once enough software was written changing from little endian to big endian became an impossible task (or close enough people missed obvious solutions)
<xentrac>(I was investigating the history of the DAA instruction to give you BCD addition by fixing up the results of binary addition, which survived until the i386 and was dropped in amd64)
<xentrac>I'm interested to see what the simplification in the 2200 was
<OriansJ>for example when DEC went from pdp-11 to VAX or VAX to Alpha; they never fixed endianness. Even though it would have been easier with the Alpha change to go Big Endian by simply adding little endian LOAD and STORE instructions
<xentrac>the Alpha isn't really binary-compatible with the VAX, so I'm not sure the compatibility argument applies
<xentrac>if you're going to run PDP-11 software on your AXP you'd better have source or you're going to be running a PDP-11 emulator like SIMH or something :)
<OriansJ>yt: I should have the hex0 fully checked by tomorrow evening.
<OriansJ>xentrac: yes but then why did they opt for little endian if not to skimp on having to update code that depended upon little endian data.
<yt>OriansJ: that's awesome! I'm about to push the change to remove DATA_OFFSET; had to debug hex0_AArch64.M1 first, had one branch instruction going the wrong direction
<yt>I don't think I tested the M1 / hex2 versions very well
<xentrac>OriansJ: maybe they liked little-endian better after using it for 25 years
<OriansJ>yt: sounds fair; assuming the DEFINEs don't change, it will not slow me down too much
<xentrac>the PDP-8 and PDP-10 families were non-endian
<yt>OriansJ: hopefully not! I'll keep it as separate commits in the pull requests, so you should be able to easily see what's changed
<yt>hah! I *had* fixed that bug in the hex2 version, but forgotten to do so in hex0_AArch64.M1
<OriansJ>yt: I am going to check M1 version first; then hex2 and finally hex0
<OriansJ>M1, hex2 and hex0 versions should all have the exact same checksum when built
<yt>as long as your ELF headers match :)
<yt>which I don't think they do... at least the hex2 version has a debug header for objdump purposes
<yt>but that's something I can fix
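One way to check the "same checksum" property is to hash each built binary and compare the digests. The Python sketch below assumes SHA-256 and hypothetical output file names; the project's own test scripts may use a different tool or hash.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at path."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def all_identical(paths):
    """Return (True/False, {path: digest}) for the given build outputs."""
    digests = {p: sha256_of(p) for p in paths}
    return len(set(digests.values())) == 1, digests


# Hypothetical usage, with made-up output names for the three builds:
# same, digests = all_identical(["hex0-from-M1", "hex0-from-hex2", "hex0-from-hex0"])
```

Any difference in the ELF header, such as the debug header mentioned above, changes the digest even when the code bytes match.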
<OriansJ>yt: if you notice ELF-i386.hex2 has the same ELF header as is embedded in x86's hex0, hex1 and hex2
<OriansJ>the only difference is hard coded values for the stages that don't support proper & or %label>label displacement
<OriansJ>one trick you might find handy is gdb does a better job of disassembling than objdump
<OriansJ>do readelf -h $file and b* 0x600078 (using the value for Entry point address)
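For reference, the "Entry point address" that readelf -h prints is the e_entry field of the ELF header. The Python sketch below pulls it out by hand; the helper name is hypothetical and the header in the demo is a fabricated 32-byte stub, not a real hex0 binary.

```python
import struct


def elf_entry_point(header):
    """Extract e_entry from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big
    # e_entry follows e_ident (16), e_type (2), e_machine (2), e_version (4).
    (entry,) = struct.unpack_from(endian + ("Q" if is_64bit else "I"), header, 24)
    return entry


# Fabricated little-endian ELF64 header stub with entry point 0x600078.
stub = (b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
        + struct.pack("<HHI", 2, 0xB7, 1)      # ET_EXEC, EM_AARCH64, version 1
        + struct.pack("<Q", 0x600078))
print(f"b *{elf_entry_point(stub):#x}")        # b *0x600078
```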
<yt>gotcha. I'll fix that up tomorrow, it's only slightly past my bedtime :)
<yt>OriansJ: yeah, I figured out that trick, thanks though ;)
<OriansJ>yt: fair; besides it's time for me to put my son to bed now anyway ^_^
<yt>the other changes we talked about are up now; looks like the pull request has updated itself :)
<yt>see you around; and thanks for the immediate review, much appreciated!
<OriansJ>xentrac: now that I think more about it; I am gonna blame IBM's 360 project for little endian lock-in
<xentrac>OriansJ: oh, was the 360 little-endian?
<xentrac>no, it was big-endian
<xentrac> https://en.wikipedia.org/wiki/IBM_System/360#Architectural_overview
<xentrac>are you suggesting that Intel, DEC, and MOSTek picked little-endian byte ordering as a way of thumbing their nose at IBM? :)
<Darius>what's wrong with little-endian, anyway?
<xentrac>it's confusing. but hey, so is binary
<OriansJ>xentrac: more they changed the financial dynamics about software; eg binaries without source code became more standard and the lock-in to endianness resulted.
<OriansJ>Darius: short version 0x12345678 looks like this 78 56 34 12 in memory
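The byte order described here can be reproduced with Python's struct module; this is just an illustration of the two layouts for a 32-bit value.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)   # least significant byte at the lowest address
big = struct.pack(">I", value)      # most significant byte at the lowest address

print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78
```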
<xentrac>that was the DoJ consent decree in the 1970s IIRC, not the 360
<xentrac>but it was standard pre-360 for people to write nonportable programs, which had lock-in to not just endianness (if you had byte addressing anyway) but lots of other details about your hardware
<xentrac>I suspect the 1401 was "little-endian" but can't remember
<OriansJ>the big problem in computer history; we delete too much
<xentrac>well, you'll be pleased to learn that the Utah Data Center is now in operation...
<OriansJ>xentrac: wrong sort of computer history
<OriansJ>I think I got it
<OriansJ>it is mostly wrong but close enough
<Darius>big-endian decimals in everyday writing may have been a mistake too
<Darius>they were little-endian in the arabic source, at least, so i gather
<OriansJ>Darius: then why is virtually every instruction set big bit endian?
<Darius>i don't follow
<Darius>if you mean the most-significant bits of an instruction tend to be function selectors, that's a different thing
<OriansJ>for example little bit endian with little byte endian would make 0x12345678 into 87 65 43 21
<Darius>who's advocating for that?
<OriansJ>where as big bit endian with little byte endian would look like 78 56 34 12
<OriansJ>but big bit endian with big byte endian would look like 12 34 56 78
<Darius>in my world little-endian means bit k has significance 1<<k, byte b has significance 256**b
<OriansJ>Darius: so you are for big bit endianness but little byte endianness?
<OriansJ>because bit endianness doesn't extend past the byte boundary
<Darius>i don't see how you're getting that from what i said
<OriansJ>ok which of these four do you like best to encode in memory the number 0x12345678: 87 65 43 21 or 78 56 34 12 or 21 43 65 87 or 12 34 56 78
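The four candidates can be generated mechanically. In the Python sketch below, "little bit endian" is read the way the hex strings in the log are written, that is, as swapping the two hex digits (nibbles) of each byte; that reading is an interpretation of the notation, not a statement about how any hardware orders bits.

```python
def swap_nibbles(data):
    """Swap the high and low hex digit of every byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in data)


def four_arrangements(value):
    """Return the four layouts listed above for a 32-bit value."""
    big = value.to_bytes(4, "big")
    little = value.to_bytes(4, "little")
    return {
        "little 'bit', little byte": swap_nibbles(little).hex(" "),
        "big 'bit', little byte": little.hex(" "),
        "little 'bit', big byte": swap_nibbles(big).hex(" "),
        "big 'bit', big byte": big.hex(" "),
    }


for name, layout in four_arrangements(0x12345678).items():
    print(f"{name}: {layout}")
```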
<Darius>you're talking about displaying numbers as text; i'm talking about addressing
<OriansJ>Darius: no that is how the bits are arranged on consecutive memory cells
<OriansJ>pder: nearly got vm.c to be behavior match for rts.c; it isn't quite right but if we could figure that out we can eliminate rts.c and the need for compiling repeatedly.
<OriansJ>Darius: as 0-F are only the order of the nybbles in each byte
<Darius>i already said: bit k has significance 1<<k, byte b has significance 256**b. if you divide a word into bytes, then e.g. the 8th bit of the word would be the 0th bit of byte 1, in that addressing.
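Darius's definition is purely about addressing, and it can be written down directly. In this Python sketch the function names are made up for illustration; memory is a sequence of bytes where index b is the byte address.

```python
def word_from_bytes(memory):
    """Byte b has significance 256**b."""
    return sum(byte * 256**b for b, byte in enumerate(memory))


def word_bit(memory, k):
    """Bit k of the word is bit (k % 8) of byte (k // 8)."""
    return (memory[k // 8] >> (k % 8)) & 1


memory = bytes([0x78, 0x56, 0x34, 0x12])
value = word_from_bytes(memory)
print(hex(value))  # 0x12345678

# Bit k has significance 1 << k, wherever byte boundaries fall.
assert all(word_bit(memory, k) == (value >> k) & 1 for k in range(32))
# The 8th bit of the word is the 0th bit of byte 1.
assert word_bit(memory, 8) == memory[1] & 1
```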
<pder>OriansJ: very cool. there seems to be lots of opportunities to remove duplication
<OriansJ>pder: I just don't have it quite right and I have to get some sleep; hopefully you'll see what stupid thing that I am missing.
<Darius>i'm confused why you would call that scheme big-bit-endian; maybe it has something to do with the big-endian display of the individual bytes in the way you're writing them out, but i don't know and don't feel like arguing about it
<xentrac>OriansJ: typically the bits in a physical word of memory (a byte, if the memory is 8-bit) are arranged along a different dimension than successive words in the memory system. for example, typically each bit in a word is in a separate DRAM chip
<xentrac>on an EPROM, each bit in a word comes out (or goes in, when you're programming it) on a separate pin
<xentrac>so in a bit-parallel CPU (like very nearly all CPUs since 1960) bits don't have an endianness; bits don't come in a sequence
<xentrac>(bit-serial CPUs are invariably bit-little-endian because that way carry propagation is feasible)
<bauen1>nice to see some progress on aarch64, i recently got some sbcs that i plan on making into a sort of root-of-trust, and bootstrapping from scratch on those would be kind of fun
<rain1> https://softwarediversity.eu/hardening-the-software-supply-chain-with-multi-compilers/
<OriansJ>Darius: fair, I guess we don't need this then: https://paste.debian.net/1175568/
<OriansJ>ironically 6 and 9 are the only encodings that match
<yt>OriansJ: a little M2-planet fix https://github.com/oriansj/M2-Planet/pull/6
<yt>M2-planet test suite passes with that on AArch64
<Darius>OriansJ, if i came across as combative last night i'm sorry; i was actually curious about what problems you had in mind, and then in answer you seemed to be attributing things to me i hadn't said.
<Darius>of course there's a mismatch between big-endian writing systems and little-endian addressing, but i think that's boring.
<Hagfish>(for what it's worth, i don't see anything egregious from a skim through what you wrote, but i really respect that pre-emptive apology. it's great for this community to see professionalism like that)
<QuickBootstrapQu>The 60MB bootstrap binary seed, Is there anything they require to run that isn't provided by, say, a Linux 2.6.8 kernel?
<Hagfish>QuickBootstrapQu: good question
<Hagfish>i think a 2.6 kernel (and a shell?) should be all that's needed, but OriansJ knows much more than me
***ChanServ sets mode: +o rekado_
***rekado_ is now known as rekado