[OE-core] Hash Equiv Server experiment results

Thu Aug 22 16:20:12 UTC 2019

I wanted to summarise what my local tests with the hash server
concluded.

a) the opendb() changes I made in:
   http://git.yoctoproject.org/cgit.cgi/poky/commit/?id=ca04aaf7b51e3ee2bb04da970d5f20f2c9982cb8
   broke things, as the database is now opened for each request. I have
   local patches to fix that but they only help by about 10%.

b) the separate receive and handling threads are not worth the
   overhead:
   http://git.yoctoproject.org/cgit.cgi/poky/commit/?id=d40d7e43856f176c45cf515644b5f211c708e237
   This probably halves throughput.

c) I moved the database writes to their own thread with a queue but it 
   doesn't seem to help much other than allowing other threads to 
   handle requests in parallel.

d) the ThreadedMixin regresses performance further and is the worst 
   change I've tested.

e) Using ThreadPoolExecutor along the lines of:
   self.executor = concurrent.futures.ThreadPoolExecutor(max_workers=10)
   self.executor.submit(self.process_request_thread, request, client_address)
   doesn't help speed. It's faster than ThreadedMixin but slower than no threads.

f) The profile data I have suggests we spend a lot of time in TCP/HTTP 
   header overhead and connection setup which confirms Joshua's 
   thoughts.

g) The best-performing setup is therefore the original server with no
   threading.

h) The autobuilder would need to cope with 9000*40 requests in under a 
   minute, preferably faster. The current server does not have 
   anywhere near that speed.
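
For reference, the ThreadPoolExecutor variant from (e) can be sketched
roughly as below. This is only an illustration of where the two lines
sit, not the code that was tested: it assumes a plain
socketserver.TCPServer subclass that overrides process_request(), and
the names PooledTCPServer and EchoHandler are made up for the example.
The handler just echoes one line back in place of a real hash query.

```python
import concurrent.futures
import socketserver


class EchoHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: echoes one line, standing in for a hash query."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line)


class PooledTCPServer(socketserver.TCPServer):
    """TCPServer that hands each accepted connection to a fixed worker pool."""

    allow_reuse_address = True

    def __init__(self, server_address, handler_class, max_workers=10):
        super().__init__(server_address, handler_class)
        self.executor = concurrent.futures.ThreadPoolExecutor(
            max_workers=max_workers)

    def process_request_thread(self, request, client_address):
        # Same shape as the method of this name in socketserver.ThreadingMixIn.
        try:
            self.finish_request(request, client_address)
        except Exception:
            self.handle_error(request, client_address)
        finally:
            self.shutdown_request(request)

    def process_request(self, request, client_address):
        # Called from the accept loop; queues the work and returns at once.
        self.executor.submit(
            self.process_request_thread, request, client_address)

    def server_close(self):
        super().server_close()
        # Let queued requests finish before the pool goes away.
        self.executor.shutdown(wait=True)
```

Each connection still pays for its own setup and headers before a
worker ever sees it, which fits the observation in (f) that the pool
by itself does not buy any throughput.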

My conclusion based on this is that we need to rewrite the way runqueue
makes the hash computations, perhaps seeding the cache in advance so we
minimise the number of single calls. We can make one query for all 9000
entries to the server on a single connection/request, or batch in
blocks of 1000 or similar.
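
To put rough numbers on that, here is a small sketch of the batching
arithmetic. The helper names (batches, requests_per_second) and the
placeholder hash strings are invented for the example, and the factor
of 40 is simply carried over from the 9000*40 figure above. It says
nothing about how long the server takes to answer a batch, only how
many round trips are needed.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive blocks of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def requests_per_second(entries: int, multiplier: int,
                        window_seconds: float, batch_size: int = 1) -> float:
    """Request rate needed to serve entries*multiplier lookups in the window."""
    per_client = -(-entries // batch_size)  # ceiling division
    return per_client * multiplier / window_seconds


# Placeholder values only; real entries would be task hashes from runqueue.
taskhashes = [f"hash{i:04d}" for i in range(9000)]
blocks = list(batches(taskhashes, 1000))

print(len(blocks))                              # 9
print(requests_per_second(9000, 40, 60))        # 6000.0
print(requests_per_second(9000, 40, 60, 1000))  # 6.0
```

So one call per entry needs around 6000 requests per second to finish
inside a minute, while blocks of 1000 bring that down to about 6.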

The other option would be a custom server/protocol but I think what
we're using is fine, we just need to change how we use it.

I have some profiling code for the hashserver but it's doing profiling
per thread so doesn't integrate well until we decide what form the
codebase should have. Simpler is looking to perform better.

I haven't had a chance to work on these patches, nor will I over the
next few days as I'm taking a break, but I think I know what we need to
do now.

Cheers,

Richard