[OE-core] [PATCH v4 00/10] Hash Equivalency Server

Joshua Watt jpewhacker at gmail.com
Tue Dec 18 15:30:51 UTC 2018


Apologies for cross-posting this to both the bitbake-devel and
openembedded-devel lists; this work necessarily intertwines both
projects, and you need to look at both parts to understand what is
going on. For convenience, the bitbake patches are listed first,
followed by the oe-core patches.

The basic premise is that any given task no longer hashes a dependent
task's taskhash to determine its own taskhash, but instead hashes the
dependent task's "unique hash" (which doesn't strictly need to be a
hash, but is for consistency). This allows multiple taskhashes to map to
the same unique hash, meaning that trivial changes to a recipe that
would change the taskhash don't necessarily need to change the unique
hash, and thus don't need to cause downstream tasks to be rebuilt (with
caveats, see below).
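
As a rough illustration of the premise above (not the actual siggen
code), the sketch below hashes a downstream task once with its
dependency's taskhash and once with its dependency's unique hash. The
function name, the task data strings, and the hand-written
taskhash-to-unihash mapping are all made up for the example; the only
assumption is that two variants of the upstream task are already known
to be equivalent.

```python
import hashlib


def compute_taskhash(base_data, dep_ids):
    """Hash a task's own data together with identifiers of its dependencies."""
    h = hashlib.sha256(base_data.encode("utf-8"))
    for dep in sorted(dep_ids):
        h.update(dep.encode("utf-8"))
    return h.hexdigest()


# Two variants of a hypothetical upstream task that differ only trivially.
up_taskhash_a = compute_taskhash("do_compile: make", [])
up_taskhash_b = compute_taskhash("do_compile: make  # comment added", [])

# Assumption for this sketch: both variants are known to produce the same
# output, so they share one unique hash (here, the first variant's taskhash).
unihash = {up_taskhash_a: up_taskhash_a, up_taskhash_b: up_taskhash_a}

# Old scheme: the downstream task hashes its dependency's taskhash.
old_a = compute_taskhash("do_install", [up_taskhash_a])
old_b = compute_taskhash("do_install", [up_taskhash_b])

# New scheme: the downstream task hashes its dependency's unique hash.
new_a = compute_taskhash("do_install", [unihash[up_taskhash_a]])
new_b = compute_taskhash("do_install", [unihash[up_taskhash_b]])

print("old scheme rebuilds downstream:", old_a != old_b)
print("new scheme rebuilds downstream:", new_a != new_b)
```

With the unique hash equal to the taskhash (the default), the new
scheme yields the same downstream hash as the old one, which is why the
default behavior is unchanged.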

In the absence of any interaction by the user, the unique hash for a
task is just that task's taskhash, which effectively maintains the
current behavior. However, if the user enables the "OEEquivHash"
signature generator, they can direct it to look at a hash equivalency
server (of which a reference implementation is provided). The sstate
code will provide the server with an output hash that it calculates, and
the server will record all tasks with the same output hash as
"equivalent" and report the same unique hash for them when requested.
When initializing tasks, bitbake can ask the server about the unique
hash for new tasks it has never seen before and potentially skip
rebuilding, or restore the task from an equivalent sstate file. To
facilitate restoring tasks from sstate, sstate objects are now named
based on the task's unique hash instead of the taskhash (which, again,
has no effect if the server is not in use).
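
The flow described above might be pictured with the toy in-memory
stand-in below. It is not the hashserv reference implementation or its
API: the class, its method names, and the sstate naming helper are
hypothetical, and picking the first reported taskhash as the unique
hash for an output hash is an assumption made for the sketch.

```python
class EquivalenceServer:
    """Toy in-memory stand-in for a hash equivalence server."""

    def __init__(self):
        self._by_outhash = {}   # output hash -> unique hash
        self._by_taskhash = {}  # taskhash -> unique hash

    def report(self, taskhash, outhash):
        # The first task reported for a given output hash donates its
        # taskhash as the unique hash; later tasks with the same output
        # hash are recorded as equivalent to it.
        unihash = self._by_outhash.setdefault(outhash, taskhash)
        self._by_taskhash[taskhash] = unihash
        return unihash

    def query(self, taskhash):
        # A task the server has never seen keeps its own taskhash.
        return self._by_taskhash.get(taskhash, taskhash)


def sstate_name(recipe, task, unihash):
    # Hypothetical naming scheme: the object is keyed by unique hash.
    return "sstate:%s:%s:%s.tgz" % (recipe, task, unihash)


server = EquivalenceServer()
server.report("aaa", "out1")   # first build
server.report("bbb", "out1")   # trivially changed recipe, same output

# A later build with taskhash "bbb" resolves to the existing sstate object.
print(sstate_name("zlib", "do_install", server.query("bbb")))
```

Because unknown taskhashes map to themselves, a build that never
contacts a server names its sstate objects exactly as it would by
taskhash.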

This patchset doesn't make any attempt to dynamically update a task's
unique hash after bitbake initializes the tasks, and as such there are
some cases where it doesn't accelerate the build as much as it possibly
could. I think it will be possible to add support for this, but this
preliminary support needs to come first.

You can also see these patches (and my first attempts at dynamic task
re-hashing) on the "jpew/hash-equivalence" branch in poky-contrib.

As always, thanks for your feedback and time.

VERSION 2:

At the core, this patch does the same thing as V1 with some very minor
tweaks. The main things that have changed are:
 1) Per request, the Hash Equivalence Server reference implementation is
    now based entirely on built-in Python modules and requires no
    external libraries. It also has a wrapper script to launch it
    (bitbake-hashserv) and unit tests.
 2) There is a major rework of persist_data in bitbake. I
    think these patches could be submitted independently, but I doubt
    anyone is clamoring for them. The general gist of them is that there
    were a lot of strange edge cases that I found when using
    persist_data as an IPC mechanism between the main bitbake process
    and the bitbake-worker processes. I went ahead and added extensive
    unit tests for this as well.

VERSION 3:

Minor tweak to version 2 that should fix timeout errors seen on the
autobuilder.

VERSION 4:

Based on discussion, the term "dependency ID" was dropped in favor of
"unique hash" (unihash).

The hash validation checks were updated to properly fall back to the old
function signatures (that don't pass the unihashes) for compatibility
with older implementations.
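
One way such a fallback could look is sketched below. This is not the
code from the runqueue patch: the helper, the hook names, and the
"unihashes" parameter name are hypothetical, and using
inspect.signature() to detect what a hook accepts is a choice made for
the example only.

```python
import inspect


def call_hash_validate(validate, taskhashes, unihashes):
    """Call a validation hook, passing unihashes only if it accepts them."""
    params = inspect.signature(validate).parameters
    accepts_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if "unihashes" in params or accepts_kwargs:
        return validate(taskhashes, unihashes=unihashes)
    # Older hook: fall back to the signature without unihashes.
    return validate(taskhashes)


def old_validate(taskhashes):
    # Hypothetical older implementation that only knows about taskhashes.
    return sorted(taskhashes)


def new_validate(taskhashes, unihashes):
    # Hypothetical newer implementation that checks by unique hash.
    return sorted(unihashes[t] for t in taskhashes)


hashes = ["b", "a"]
uni = {"a": "x", "b": "y"}
print(call_hash_validate(old_validate, hashes, uni))
print(call_hash_validate(new_validate, hashes, uni))
```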

Joshua Watt (10):
  bitbake: fork: Add os.fork() wrappers
  bitbake: persist_data: Close databases across fork
  bitbake: tests/persist_data: Add tests
  bitbake: siggen: Split out task unique hash
  bitbake: runqueue: Track task unique hash
  bitbake: runqueue: Pass unique hash to task
  bitbake: runqueue: Pass unique hash to hash validate
  classes/sstate: Handle unihash in hash check
  bitbake: hashserv: Add hash equivalence reference server
  sstate: Implement hash equivalence sstate

 bitbake/bin/bitbake-hashserv         |  67 ++++++++++
 bitbake/bin/bitbake-selftest         |   3 +
 bitbake/bin/bitbake-worker           |   9 +-
 bitbake/lib/bb/fork.py               |  73 +++++++++++
 bitbake/lib/bb/persist_data.py       |  32 ++++-
 bitbake/lib/bb/runqueue.py           |  73 +++++++----
 bitbake/lib/bb/siggen.py             |   7 +-
 bitbake/lib/bb/tests/persist_data.py | 188 +++++++++++++++++++++++++++
 bitbake/lib/hashserv/__init__.py     | 152 ++++++++++++++++++++++
 bitbake/lib/hashserv/tests.py        | 141 ++++++++++++++++++++
 meta/classes/sstate.bbclass          | 102 +++++++++++++--
 meta/conf/bitbake.conf               |   4 +-
 meta/lib/oe/sstatesig.py             | 167 ++++++++++++++++++++++++
 13 files changed, 978 insertions(+), 40 deletions(-)
 create mode 100755 bitbake/bin/bitbake-hashserv
 create mode 100644 bitbake/lib/bb/fork.py
 create mode 100644 bitbake/lib/bb/tests/persist_data.py
 create mode 100644 bitbake/lib/hashserv/__init__.py
 create mode 100644 bitbake/lib/hashserv/tests.py

-- 
2.19.2


