[pacman-dev] filesystem performance

Vesa Kaihlavirta vpkaihla at gmail.com
Mon Jan 28 04:57:44 EST 2008


> I said you could use that, not should.
>
> The other way would be to experiment with other backends. In this case,
> an actual sqlite proof of concept (that is, working code) would help.


Hey.
I got interested in this last week, and started breaking libalpm apart and
trying to fit in an sqlite-ish implementation. The code was new to me and my
only consideration was to get something working as fast as possible, so the
result is nasty.
Basically, I first commented out treename from struct __pmdb_t so the
compiler would tell me all (or most of) the places where the old db is used,
and then either disabled those functions or reimplemented them with sqlite.
The additions are mainly in be_sqlite.c (renamed from be_files.c), where
_alpm_db_open opens the sqlite db, and in db.c: _alpm_db_search, which
executes a simple SELECT * FROM packages WHERE name LIKE "%foo%" and
populates the return list.
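
To make the query above concrete, here is a minimal sketch of that kind of
search. It is written in Python with the stdlib sqlite3 module purely for
brevity; it is not the C code from the patch. The table layout (name,
version, description), the helper names open_db and search, and the sample
rows are all assumptions made up for illustration.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) a package db. The schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packages ("
        "name TEXT PRIMARY KEY, version TEXT, description TEXT)"
    )
    return conn

def search(conn, needle):
    """Return all rows whose name contains needle, sorted by name."""
    # Bind the pattern as a parameter instead of pasting it into the SQL.
    # Note: '%' or '_' inside needle still act as LIKE wildcards.
    pattern = "%" + needle + "%"
    cur = conn.execute(
        "SELECT name, version, description FROM packages "
        "WHERE name LIKE ? ORDER BY name",
        (pattern,),
    )
    return cur.fetchall()

conn = open_db()
# Illustrative rows only, not real repo data.
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [
        ("gcc", "4.2.2-3", "The GNU Compiler Collection"),
        ("glibc", "2.7-5", "GNU C Library"),
        ("pacman", "3.1.1-1", "A library-based package manager"),
    ],
)
results = search(conn, "g")
names = [row[0] for row in results]
```

With these sample rows, searching for "g" matches gcc and glibc but not
pacman. Keep in mind that pacman -Ss looks at more than the package name,
which is part of why the timings below are not a fair comparison.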

So, I implemented about 40% of pacman -Ss. If someone cares about timings
(and you probably shouldn't, since my version doesn't do quite the same
thing), here they are:

(running pacman -Ss g three times after a reboot)

pacman-3.1: 41.866s, 0.765s, 0.762s
mutilated-pacman-with-sqlite: 1.036s, 0.131s, 0.133s

pacman-3.1 probably looks rather worse in its worst (first) run than it
usually would, since my /var/ was 99% full at the time :)


Anyway. The timing is not the most important issue, I think. libalpm has a
lot of code that is merely there because C sucks for things like string and
directory manipulation. And we need to do a lot of that. My humble guess is
that a proper implementation of libalpm done with sqlite could be at least
50% smaller with a more understandable codebase.

If we want to do this, then how? Some options off the top of my head:

1) For the parts that deal with the db, start from scratch. With the talent
you guys have, that shouldn't be a problem? Libalpm isn't very large...

2) For the development phase, consider sqlite to be a cache for the filedb,
and gradually move each piece of code over to the sqlite side. This way,
the legacy code would weigh us down a bit, but the change might be more
sustainable.

3) Just hack in the functionality somehow, anyhow.

4) Refactor alpm to support different backends and implement whatever
backend du jour.
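
To give options 2 and 4 some shape, here is a rough sketch of what a backend
split with sqlite acting as a cache of the file db might look like. Again
this is Python for brevity, not a proposal for the actual C API. Everything
in it is hypothetical: the class names, the build_cache helper, and the
simplified assumption that the file db is a directory holding one
"name-version-release" subdirectory per package.

```python
import os
import sqlite3
import tempfile
from abc import ABC, abstractmethod

class Backend(ABC):
    """Hypothetical interface the rest of the library would code against."""
    @abstractmethod
    def search(self, needle):
        """Return the sorted names of packages whose name contains needle."""

def list_file_db(root):
    """Yield (name, version) for each package directory under root."""
    for entry in os.listdir(root):
        if os.path.isdir(os.path.join(root, entry)):
            # Assumed layout: <name>-<pkgver>-<pkgrel>
            name, pkgver, pkgrel = entry.rsplit("-", 2)
            yield name, pkgver + "-" + pkgrel

class FilesBackend(Backend):
    def __init__(self, root):
        self.root = root

    def search(self, needle):
        return sorted(
            name for name, _ in list_file_db(self.root)
            if needle.lower() in name.lower()
        )

class SqliteBackend(Backend):
    def __init__(self, conn):
        self.conn = conn

    def search(self, needle):
        cur = self.conn.execute(
            "SELECT name FROM packages WHERE name LIKE ? ORDER BY name",
            ("%" + needle + "%",),
        )
        return [row[0] for row in cur]

def build_cache(root, conn):
    """Option 2: (re)fill the sqlite cache from the file db."""
    conn.execute("CREATE TABLE IF NOT EXISTS packages (name TEXT, version TEXT)")
    conn.execute("DELETE FROM packages")
    conn.executemany("INSERT INTO packages VALUES (?, ?)", list_file_db(root))
    conn.commit()

# Small demo with a throwaway file db; the package dirs are made up.
with tempfile.TemporaryDirectory() as root:
    for d in ("gcc-4.2.2-3", "glibc-2.7-5", "xorg-server-1.4-2"):
        os.mkdir(os.path.join(root, d))
    conn = sqlite3.connect(":memory:")
    build_cache(root, conn)
    files_hits = FilesBackend(root).search("g")
    sqlite_hits = SqliteBackend(conn).search("g")
```

Both backends return the same names for the same query, so callers would not
need to know which one is in use. That is what would allow moving the code
over one piece at a time.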


Ideas, praise, flames welcome. Code available on request.

--vk