IRC channel logs

2023-01-09.log


<doras>It would be great if it was a format that rootfs.py itself consumes when downloading sources.
<stikonas>doras: that is harder, because there is an option to download sources live in sysc
<doras>This would leave us with one implementation of directory-walking logic for getting sources.
<stikonas>and we might not have json or yaml parser yet
<doras>stikonas: This of course should be preserved and untouched. I'm only referring to the --external-sources logic.
<doras>i.e., whichever Python logic we have now behind this flag.
<stikonas>anyway, fossy wrote that checksum-transcriber and sources stuff
<stikonas>so he is probably a better person to talk to about sources format
<doras>I'm not a Python expert, but I think it's best to use whichever human-readable format is considered the most Python friendly/native.
<stikonas>well, most python friendly is probably yaml...
<stikonas>but yaml is harder to parse manually using just M2 style C code
<stikonas>so most likely it will be just a plain text list (possibly in space separated columns)
<doras>Again, I don't want to change our bootstrap source downloading logic, only our Python source downloading logic.
<plantman>if you choose csv, you can parse a line with just 2 functions
<doras>Unless... it was unified since I last reviewed the source code.
<plantman>one function to count the commas, and the other to get the Nth string
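A minimal sketch of the two functions plantman describes, written in Python for readability but restricted to simple scanning so it could be ported to M2-style C. The names `count_commas` and `nth_field` and the sample row are made up for illustration, and the sketch assumes no quoting, so a field cannot itself contain a comma.

```python
def count_commas(line: str) -> int:
    """Count the commas in one line; the number of fields is this plus one."""
    count = 0
    for ch in line:
        if ch == ",":
            count += 1
    return count


def nth_field(line: str, n: int) -> str:
    """Return the n-th field (0-based) of a comma-separated line."""
    if n < 0 or n > count_commas(line):
        raise IndexError(f"line has no field {n}")
    start = 0
    # Skip past n commas to reach the start of the wanted field.
    for _ in range(n):
        start = line.index(",", start) + 1
    # The field ends at the next comma, or at the end of the line.
    end = line.find(",", start)
    if end == -1:
        end = len(line)
    return line[start:end]


line = "name,url,checksum"  # made-up sample row, not a real manifest entry
fields = [nth_field(line, i) for i in range(count_commas(line) + 1)]
```

Empty fields come back as empty strings, and a trailing newline should be stripped by the caller before parsing.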
<doras>It wasn't unified, it's here: https://github.com/fosslinux/live-bootstrap/blob/d1d36a4b8d2c5b9687bf91dbefe8ca0df91314ab/lib/sysgeneral.py#L102-L117
<stikonas>doras: so what do you want this function to be? I'm a bit confused
<doras>stikonas: I want it to accept the source manifest (whichever format we decide) as a function argument and not open files and parse them itself.
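A rough sketch of the shape doras is asking for: the download routine receives already-parsed manifest entries and never opens or parses files itself. Everything here is hypothetical (`SourceEntry`, its field names, `fetch_sources`, and the `fetch` callback); it is not the existing code in lib/sysgeneral.py.

```python
from typing import Callable, Iterable, NamedTuple


class SourceEntry(NamedTuple):
    """One manifest entry. The three field names are assumptions."""
    url: str
    checksum: str
    filename: str


def fetch_sources(manifest: Iterable[SourceEntry],
                  fetch: Callable[[SourceEntry], None]) -> int:
    """Call fetch() for every entry and return how many were processed.

    The manifest arrives as an argument, so whoever calls this decides
    where the entries come from (a file, a plugin, a test fixture).
    """
    count = 0
    for entry in manifest:
        fetch(entry)
        count += 1
    return count
```

With this split, the file format only matters to whatever builds the list of entries, so it can change without touching the download logic.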
<stikonas>I see...
<stikonas>anyway, since I wasn't really involved in that refactoring, it's probably best to talk to fossy
<stikonas>I don't remember exactly the reason behind this
<stikonas>there might be some good reasons why it's split
<stikonas>I guess if we have one big manifest
<stikonas>we also need to keep track of extra metadata
<stikonas>e.g. which package needs those sources
<muurkha>most python friendly is probably a sequence of lines
<muurkha>for line in open(filename):
<muurkha>space-separated columns adds another line, but of course without any quoting convention eliminates the possibility of whitespace within values
<muurkha> words = line.split()
<muurkha>yaml isn't even in the Python standard library
<stikonas>anyway, I was not advocating for yaml here...
<muurkha>csv is tho
<muurkha>by default it uses Excel-compatible delimiter and quoting conventions
<plantman>that excel stuff is more complicated
<muurkha>it is, but it's in the Python standard library, so if you're looking for "the most Python friendly" it qualifies
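For comparison, a small sketch of the standard-library route muurkha mentions. `csv.reader` uses the "excel" dialect by default, so a quoted field may contain the delimiter. The helper name `read_rows` and the sample text are made up for illustration.

```python
import csv
import io


def read_rows(text: str) -> list[list[str]]:
    """Parse CSV text with the default (Excel-compatible) dialect."""
    return list(csv.reader(io.StringIO(text)))


# The quoted middle field keeps its embedded comma.
rows = read_rows('a,"b,c",d\n')
```

The cost is the one plantman points out: the quoting rules are what make this format harder to parse by hand outside Python.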
<muurkha>toml is apparently in the standard library too now actually
<muurkha>and of course xml has been for quite a while
<doras>Of course. fossy: I'd appreciate your opinion on the idea if you read this.... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/d1f6b78545ada211f33288852d92bd2c08133aad>)
<doras>The BuildStream plugin would live in a separate project of course.
<doras>muurkha: thanks. So I understand that csv or xml are our most likely candidates.
<doras>I really don't want to add new dependencies for running rootfs.py so I understand yaml is out of the question. I do, however, want the manifest format to be trivially consumed, so it's better to use a standard format.
<muurkha>delimiter-separated columns are even easier in a sense; they don't even require an import statement
<muurkha>just [line.split(':') for line in open(filename)] or something similar
<muurkha>.split() without an argument splits on any whitespace, but for example .split('\t') splits on tabs and .split('|') splits on | characters
<doras>muurkha: I do need a concept of a tuple though. I need 3 values per entry. So we'll need two different delimiters with that approach; the second probably being \n.
<doras>Oh, well you added "for line in...", so you already made that assumption.
<muurkha>yeah, in Python iterating over a file by default iterates over its lines
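Putting the last few messages together, a minimal sketch of reading three-value entries with one delimiter between fields and a newline between entries. The function name `parse_manifest`, the tab default, and the skipping of blank lines are assumptions for illustration, not a decided format.

```python
def parse_manifest(lines, delimiter="\t"):
    """Turn an iterable of lines into a list of 3-tuples.

    Accepts an open file or a plain list of strings, since iterating
    over a file yields its lines.
    """
    entries = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split(delimiter)
        if len(fields) != 3:
            raise ValueError(
                f"line {lineno}: expected 3 fields, got {len(fields)}"
            )
        entries.append(tuple(fields))
    return entries


# Made-up sample data; with a real file this would be
# parse_manifest(open(filename)).
sample = ["a\tb\tc\n", "\n", "d\te\tf\n"]
entries = parse_manifest(sample)
```

A tab is used as the default here because a colon, as in the `.split(':')` example, would also split any URL in the manifest.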