[pacman-dev] [PATCH v2 5/8] Avoid problematic use of Python's StringIO.

Jeremy Heiner scalaprotractor at gmail.com
Mon Oct 14 11:01:23 EDT 2013


On Sun, Oct 13, 2013 at 7:55 PM, Allan McRae <allan at archlinux.org> wrote:
> I am going to merge all these patches apart from this one and the final
> patch.  If a consensus can be found on how to deal with this issue, I
> will pull it in - I am not familiar enough with python issues to make
> the decision myself.

Thanks, Allan. I'm gratified that I can help make (some small) improvements.

Sorry I got delayed, but I said I would explain how the Python 2
string gotchas impact the pacman testing framework. I think I found a
way to shorten this from what I had anticipated, so hopefully it won't
be completely boring...

There are two pmtests with non-7bit-ascii chars: remove071 and
sync600. remove071 creates one pmpkg (p2) and adds it to the "local"
pmdb. sync600 copy-n-pastes that same p2 pmpkg setup, but also creates
and adds sp2 to the "sync" pmdb.

The framework does very different things for the pmdbs: "local" stuff
gets written to the filesystem (simulating in Python code what pacman
would do to install), while "sync" stuff gets written to a tarfile (for
later processing by the pacman binary being tested). That is the key
difference and stumbling block (and also why this can't be dealt with
in sync600).

Python's filesystem write API gracefully handles strings of all sorts,
automatically converting char-to-byte as needed, so the "local" pmpkg
p2 (in both pmtests) works great, but...

The tarfile.addfile API requires a fileobject, so the caller of that
API is responsible for handling the low-level char-to-byte conversion.
Python 2.7's StringIO meets that need. But in 3.x there aren't just
fileobjects: there's BufferedIOBase (the parent class for BytesIO),
which reads and writes bytes, and TextIOBase (the parent class for
StringIO), which reads and writes chars. tarfile.addfile writes bytes,
so in 3.x it fails when it tries to read bytes from a TextIOBase.
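
A minimal sketch of the bytes-only path on Python 3, where the caller
does the char-to-byte conversion explicitly before handing a BytesIO to
tarfile.addfile. The helper names (add_text_member, build_archive,
read_member), the member name "desc" and the UTF-8 encoding are
assumptions made for illustration; none of them come from the pacman
test framework. This is not a cross-version fix: on Python 2, calling
.encode() on a byte str that already holds non-ASCII data raises
UnicodeDecodeError.

```python
import io
import tarfile


def add_text_member(tar, name, text, encoding="utf-8"):
    """Add `text` to the open TarFile `tar` as a regular file `name`."""
    data = text.encode(encoding)       # explicit char-to-byte conversion
    info = tarfile.TarInfo(name=name)
    info.size = len(data)              # size in bytes, not in characters
    tar.addfile(info, io.BytesIO(data))


def build_archive(members, encoding="utf-8"):
    """Build an in-memory tar archive from a {name: text} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, text in members.items():
            add_text_member(tar, name, text, encoding)
    return buf.getvalue()


def read_member(archive_bytes, name, encoding="utf-8"):
    """Read one member back out of the archive and decode it to text."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return tar.extractfile(name).read().decode(encoding)
```

For example, read_member(build_archive({"desc": u"caf\u00e9"}), "desc")
round-trips the non-ASCII text, because the member size is taken from
the encoded bytes rather than from the character count.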

So how do we feed tarfile.addfile what it wants without special-casing
for the Python runtime version? Rather than typing up a long
explanation of why there is no way to meet that goal, I've attached a
Python script that tries all the options I could think of and produces
a nice printout of the reason for failure in each case. The last line
of the printout lists the successful options - those that work for
that particular Python runtime. Running it on 2.7 and 3.3 shows no
single option is successful for both.
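
The general shape of such a try-every-option check might look like the
sketch below. It is not the attached stringmadness.py, whose contents
are not reproduced here; the three candidates, the sample text and the
names CANDIDATES and try_candidates are all hypothetical, and it is
written for Python 3 only. Each candidate builds a fileobject and a
size, and the failure reason is recorded by exception type.

```python
import io
import tarfile

TEXT = u"caf\u00e9"  # illustrative non-ASCII payload

# Each candidate returns (fileobject, size) for tarfile.addfile.
CANDIDATES = {
    "StringIO(text)": lambda t: (io.StringIO(t), len(t)),
    "BytesIO(text)": lambda t: (io.BytesIO(t), len(t)),
    "BytesIO(text.encode('utf-8'))": lambda t: (
        io.BytesIO(t.encode("utf-8")),
        len(t.encode("utf-8")),
    ),
}


def try_candidates(text=TEXT):
    """Map each candidate label to None (success) or an exception name."""
    results = {}
    for label, make in CANDIDATES.items():
        try:
            fileobj, size = make(text)
            info = tarfile.TarInfo(name="desc")
            info.size = size
            with tarfile.open(fileobj=io.BytesIO(), mode="w") as tar:
                tar.addfile(info, fileobj)
            results[label] = None
        except Exception as exc:  # record why this option failed
            results[label] = type(exc).__name__
    return results


if __name__ == "__main__":
    outcome = try_candidates()
    for label, error in outcome.items():
        print("%-32s %s" % (label, error or "ok"))
    print("successful:", [k for k, v in outcome.items() if v is None])
```

On a Python 3 runtime only the explicitly encoded BytesIO candidate
should succeed, which matches the point made above about addfile
needing bytes.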

The attached script covers what Martin suggested (assuming I haven't
misunderstood what he meant). And if anyone can think of an option
that I didn't, please post a reply - I love learning new things.

Jeremy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: stringmadness.py
Type: application/octet-stream
Size: 2610 bytes
Desc: not available
URL: <http://mailman.archlinux.org/pipermail/pacman-dev/attachments/20131014/09b40db4/attachment-0001.obj>

