list of package names, versions[, descriptions]

Eli Schwartz eschwartz at archlinux.org
Wed Oct 2 15:52:07 UTC 2019


On 10/2/19 11:32 AM, Greg Minshall wrote:
> i'm not sure i can explain why i find having the complete list, with
> descriptions, local on my machine useful.  but, i do.  "search locally,
> build globally" somehow works well for me.  (one rationalization might
> be that searching is inherently more interactive than building, so
> random network latencies, etc., during building are less annoying than
> during searching.)
> 
> anyway, grant me the desire to maintain, offline, a complete list of AUR
> packages, version numbers, descriptions.

Could be, I dunno. All I know is what I would consider personally useful
-- your use cases remain your own, regardless of my opinions or
expressed doubts. :)

> let's say that i've managed, over a period of a week or so, to download
> the entire database (or, at least, the "rows" in which i am interested:
> package name, version, description) into my own local database.
> 
> then, a week later, i'd like to *update* my local database with what's
> changed in the AUR repository.  how would i proceed?  as things
> currently stand, iiuc (always a dubious proposition), i'd need to again
> download the entire database.
> 
> on the other hand, if there were a packages-vers.gz (*), i could
> download that, then compare the package names and versions in it with
> those in my local database, and schedule to download the database
> entries for those packages whose version numbers had changed (as well as
> those packages in packages-vers.gz that are new; and at the same time
> delete those packages in my local database that are no longer in
> packages-vers.gz); one can visualize this code.
> 
> my presumption is that this would be much lighter on server resources
> than downloading the entire database each week.  and, maybe (you'll know
> the "churn" in the repository) would even be very light.
> 
> and, i think this could be useful for general use.  i may only care
> about descriptions, but if someone cares about dependencies,
> maintainers, etc., they would still use the version-number mechanism
> (again, see (*) below) to determine which packages have changed, and
> only download the information from those changed packages.

Well, I guess I could hear the argument you make for providing a way to
invalidate offline assumptions about a package, even if a dump of
names and versions is not strictly useful by itself.
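
For what it's worth, the comparison described above could be sketched
roughly like this. It is only an illustration: packages-vers.gz does not
exist today, the "name version" one-pair-per-line format is an
assumption, and parse_index() and plan_sync() are made-up names. Versions
are compared for inequality only, so no vercmp-style ordering is needed.

```python
from typing import Dict, NamedTuple, Set


class SyncPlan(NamedTuple):
    fetch: Set[str]   # packages whose details should be (re)downloaded
    delete: Set[str]  # packages to drop from the local database


def parse_index(text: str) -> Dict[str, str]:
    """Parse a hypothetical 'name version' listing, one package per line."""
    index = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            name, version = parts
            index[name] = version
    return index


def plan_sync(local: Dict[str, str], remote: Dict[str, str]) -> SyncPlan:
    """Work out what to fetch and delete so that local matches remote."""
    new = remote.keys() - local.keys()
    removed = local.keys() - remote.keys()
    changed = {
        name
        for name in remote.keys() & local.keys()
        if remote[name] != local[name]
    }
    return SyncPlan(fetch=set(new) | changed, delete=set(removed))
```

Only the names in plan.fetch would then need a per-package details
request, which is where the saving over a full re-download would come
from.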

> ps -- thanks for the pointer to expac.  i'll look at converting to that.
> no one ever accused me of writing overly-efficient code... :)
> 
> (*) NB:
> 
> note that, for "true consistency", using "version" depends on the
> assumption, likely to be at least occasionally, maybe often, invalid,
> that if the *metadata* for a package in the database changes then the
> *version* of the package itself also changes.

This is "supposed to be true", as in, it's generally considered pretty
bad if people update a PKGBUILD so that it creates a different package
but don't update the pkgrel for metadata or package content changes. It
isn't guaranteed, sure, but I guess there are worse things than simply
failing to detect a cache invalidation for that package.

On the other hand...

> if "last modified" time in the database is updated when any of the
> metadata changes, that would be better to use than package version
> number.
>
> if "last modified" time isn't updated when (some) metadata is updated,
> one could also run an md5sum(1) over (a textual representation of) each
> package's database entry, and provide packages-md5sums.gz, say.  i'll
> note that a simple test shows that adding an md5sum to each line
> inflates the size of the file considerably
> : % ls -skh packages*.gz
> : 1.5M packages-md5sums.gz  344K packages.gz
> 
> the inflation for version numbers and/or "last modified time" (as
> seconds since the epoch) would probably be less, maybe double the size
> of packages.gz?

The package details key "last modified" is indeed updated to the time of
the latest push to the package's git repository, see
https://git.archlinux.org/aurweb.git/tree/aurweb/git/update.py#n92

So this would be a valid method for guaranteeing cache invalidation.
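
A minimal sketch of that check, assuming the client stores the "last
modified" value (seconds since the epoch) it saw for each package, and
assuming some listing of current name-to-timestamp pairs were made
available. Both the listing and the stale_packages() helper are
hypothetical here.

```python
from typing import Dict, Set


def stale_packages(
    local_seen: Dict[str, int], remote_modified: Dict[str, int]
) -> Set[str]:
    """Return packages whose cached entry is older than the last push.

    local_seen maps package name to the "last modified" timestamp stored
    when the entry was cached; remote_modified maps package name to the
    current timestamp. Packages missing locally count as stale.
    """
    return {
        name
        for name, modified in remote_modified.items()
        if local_seen.get(name, -1) < modified
    }
```

Unlike the version comparison, this does not rely on maintainers
remembering to bump pkgrel, since the timestamp follows the push itself.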

-- 
Eli Schwartz
Bug Wrangler and Trusted User


More information about the aur-dev mailing list