Age | Commit message | Author |
|
Moving the code and related definitions from
hashserv/__init__.py to asyncrpc/client.py,
allowing this function to be used in other asyncrpc clients.
(Bitbake rev: b67bb05e431414866b8e8c6a4c88d20b9cdb44a3)
Signed-off-by: Michael Opdenacker <michael.opdenacker@bootlin.com>
Suggested-by: Joshua Watt <JPEWhacker@gmail.com>
Cc: Tim Orling <ticotimo@gmail.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Adds support for the hashserver to have per-user permissions. User
management is done via a new "auth" RPC API where a client can
authenticate itself with the server using a randomly generated token.
The user can then be given permissions to read, report, manage the
database, or manage other users.
In addition to explicit user logins, the server supports anonymous users
which is what all users start as before they make the "auth" RPC call.
Anonymous users can be assigned a set of permissions by the server,
making it unnecessary for users to authenticate to use the server. The
set of Anonymous permissions defines the default behavior of the server,
for example if set to "@read", Anonymous users are unable to report
equivalent hashes without authenticating. Similarly, setting the Anonymous
permissions to "@none" would require authentication for users to perform
any action.
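The permission model described above can be sketched roughly as follows. This is an illustration only, not the server's code: the `Connection` class, the `expand` helper and the "@report" permission name are assumptions; only "@read" and "@none" appear in the text above.

```python
# Permission names other than "@read" and "@none" are assumptions made for
# this sketch; they are not taken from the commit message.
READ = "@read"
REPORT = "@report"
NONE = "@none"


def expand(perms):
    """Return the usable permission set, treating "@none" as empty."""
    return {p for p in perms if p != NONE}


class Connection:
    """Hypothetical per-connection state on the server side."""

    def __init__(self, anonymous_perms):
        # Every connection starts out as the Anonymous user.
        self.user = None
        self.perms = expand(anonymous_perms)

    def authenticate(self, username, user_perms):
        # Stand-in for the "auth" RPC call; token checking is omitted.
        self.user = username
        self.perms = expand(user_perms)

    def can(self, perm):
        return perm in self.perms
```

With Anonymous set to "@read", a fresh connection can read but cannot report until it authenticates as a user that holds the report permission.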
User creation and management is entirely manual (although
bitbake-hashclient is very useful as a front end). There are many
different mechanisms that could be implemented to allow user
self-registration (e.g. OAuth, LDAP, etc.), and implementing these is
outside the scope of the server. Instead, it is recommended to
implement a registration service that validates users against the
necessary service, then adds them as a user in the hash equivalence
server.
(Bitbake rev: 69e5417413ee2414fffaa7dd38057573bac56e35)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Adds an SQLAlchemy backend to the server. While this database backend is
slower than the more direct sqlite backend, it easily supports just
about any SQL server, which is useful for large scale deployments.
(Bitbake rev: e0b73466dd7478c77c82f46879246c1b68b228c0)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Abstracts the way the database backend is accessed by the hash
equivalence server to make it possible to use other backends.
(Bitbake rev: 04b53deacf857488408bc82b9890b1e19874b5f1)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Adds support to the hash equivalence client and server to communicate
over websockets. Since websockets are message orientated instead of
stream orientated, a new connection class is needed to handle them.
Note that websocket support does require the 3rd party websockets python
module be installed on the host, but it should not be required unless
websockets are actually being used.
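One way to keep an optional dependency out of the common path is to import it only when a websocket address is actually used. The sketch below assumes that websocket addresses start with "ws://" or "wss://"; the helper names are hypothetical and do not come from the commit.

```python
import importlib


def needs_websockets(address):
    """Assumed convention: websocket addresses start with ws:// or wss://."""
    return address.startswith(("ws://", "wss://"))


def load_optional(name):
    """Import a module on demand, with a clear error if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise RuntimeError(
            f"the '{name}' module is required for this transport"
        ) from exc


def pick_transport(address):
    if needs_websockets(address):
        # Only websocket users pay the cost of the extra dependency.
        load_optional("websockets")
        return "websocket"
    return "stream"
```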
(Bitbake rev: 56dd2fdbfb6350a9eef43a12aa529c8637887a7e)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Rewrites the asyncrpc client and server code to make it possible to have
other transport backends that are not stream based (e.g. websockets
which are message based). The connection handling classes are now shared
between both the client and server to make it easier to implement new
transport mechanisms.
(Bitbake rev: 2aaeae53696e4c2f13a169830c3b7089cbad6eca)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Fixes the hash equivalence server to resolve the diverging report race
error. This error occurs when the same task (hash) is run simultaneously on
two different builders, and then the results are reported back but the
hashes diverge (e.g. have different outhashes), and one outhash is
equivalent to a hash and another is not. If the taskhash was not originally
in the database, the client will fall back to using the taskhash as the
suggested unihash and the server will see reports come in like:
    taskhash: A
    unihash: A
    outhash: B

    taskhash: C
    unihash: C
    outhash: B

    taskhash: C
    unihash: C
    outhash: D
Note that the second and third reports are the same taskhash, with
diverging outhashes.
Taskhash C should be equivalent to taskhash (and unihash) A because they
share an outhash B, but the server would not do this when tasks were
reported in the order shown.
It became clear while trying to fix this that a single large table to
store all reported hashes was going to make these updates difficult
since updating the unihash of all entries would be complex and time
consuming. Instead, it makes more sense to split apart the database into
two tables: One that maps taskhashes to unihashes and one that maps
outhashes to taskhashes. This should hopefully improve the parsing query
times as well since they only care about the taskhashes to unihashes
table, at the cost of more complex INNER JOIN queries on the lesser used
API.
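A minimal sketch of the two-table idea, using the standard sqlite3 module. The table names, column names and the `report`/`get_unihash` functions are hypothetical, and the matching rule is simplified; it is not the server's actual schema or algorithm.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Maps taskhashes to unihashes (the table parsing queries care about).
    CREATE TABLE unihashes (
        method TEXT, taskhash TEXT, unihash TEXT,
        UNIQUE(method, taskhash)
    );
    -- Maps outhashes to taskhashes.
    CREATE TABLE outhashes (
        method TEXT, taskhash TEXT, outhash TEXT,
        UNIQUE(method, taskhash, outhash)
    );
""")


def get_unihash(method, taskhash):
    row = db.execute(
        "SELECT unihash FROM unihashes WHERE method = ? AND taskhash = ?",
        (method, taskhash),
    ).fetchone()
    return row[0] if row else None


def report(method, taskhash, outhash, suggested_unihash):
    db.execute(
        "INSERT OR IGNORE INTO outhashes VALUES (?, ?, ?)",
        (method, taskhash, outhash),
    )
    # Look for an earlier, different taskhash that produced the same outhash.
    row = db.execute(
        """
        SELECT unihashes.unihash FROM outhashes
        INNER JOIN unihashes
            ON unihashes.method = outhashes.method
           AND unihashes.taskhash = outhashes.taskhash
        WHERE outhashes.method = ? AND outhashes.outhash = ?
          AND outhashes.taskhash != ?
        ORDER BY outhashes.rowid LIMIT 1
        """,
        (method, outhash, taskhash),
    ).fetchone()
    unihash = row[0] if row else suggested_unihash
    # An existing taskhash -> unihash mapping is kept, not overwritten.
    db.execute(
        "INSERT OR IGNORE INTO unihashes VALUES (?, ?, ?)",
        (method, taskhash, unihash),
    )
    return get_unihash(method, taskhash)
```

Replaying the three reports listed earlier through this sketch maps taskhash C to unihash A, and the later report with outhash D does not change that mapping.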
Note this change does delete existing hash equivalence data and starts a
new database table rather than converting existing data.
(Bitbake rev: dff5a17558e2476064e85f35bad1fd65fec23600)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
The -r/--readonly argument is added to the bitbake-hashserv app. If this
argument is given then clients may only perform read operations against
the server. The read-only mode is implemented by simply not installing
handlers for write operations; this keeps the permission model simple
and reduces the risk of accidentally allowing write operations.
As a sqlite database can be safely opened by multiple processes in
parallel, it's possible to start two hashserv instances against a single
database if you wish to export both a read-only port and a read-write
port.
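The "do not install the write handlers" approach can be illustrated as below. The command names "get" and "report", the handler functions and the dict standing in for the database are all assumptions made for this sketch.

```python
def handle_get(db, params):
    return db.get(params["taskhash"])


def handle_report(db, params):
    db[params["taskhash"]] = params["unihash"]
    return params["unihash"]


def build_handlers(read_only):
    """Return the command table; write commands are simply absent when read-only."""
    handlers = {"get": handle_get}
    if not read_only:
        handlers["report"] = handle_report
    return handlers


def dispatch(handlers, db, name, params):
    handler = handlers.get(name)
    if handler is None:
        # A read-only server never sees the write code path at all.
        raise ValueError(f"unrecognized command: {name}")
    return handler(db, params)
```

Because the read-only server has no entry for the write command, there is no per-request permission check that could be bypassed by mistake.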
(Bitbake rev: 492bb02eb0e071c792407ac3113f92492da1a9cc)
Signed-off-by: Paul Barker <pbarker@konsulko.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Adds support for an upstream server to be specified. The upstream server
will be queried for equivalent hashes whenever a miss is found in the
local server. If the server returns a match, it is merged into the
local database. In order to keep the get stream queries as fast as
possible since they are the critical path when bitbake is preparing the
run queue, missing tasks provided by the server are not immediately
pulled from the upstream server, but instead are put into a queue to be
backfilled by a worker task later.
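One possible reading of the backfill design, sketched with asyncio. The `LocalServer` class is hypothetical and plain dicts stand in for the local database and the upstream server; the real server's behaviour on a miss may differ.

```python
import asyncio


class LocalServer:
    def __init__(self, upstream):
        self.db = {}              # stand-in for the local database
        self.upstream = upstream  # stand-in for the upstream server
        self.backfill_queue = asyncio.Queue()

    async def get_stream(self, taskhash):
        """Fast path: answer from the local database, queue any miss."""
        unihash = self.db.get(taskhash)
        if unihash is None:
            self.backfill_queue.put_nowait(taskhash)
        return unihash

    async def backfill_worker(self):
        """Slow path: pull queued misses from upstream in the background."""
        while True:
            taskhash = await self.backfill_queue.get()
            try:
                unihash = self.upstream.get(taskhash)
                if unihash is not None:
                    self.db.setdefault(taskhash, unihash)
            finally:
                self.backfill_queue.task_done()


async def demo():
    server = LocalServer(upstream={"aaa": "u-aaa"})
    worker = asyncio.create_task(server.backfill_worker())
    first = await server.get_stream("aaa")   # local miss, queued
    await server.backfill_queue.join()       # wait for the worker
    second = await server.get_stream("aaa")  # now served locally
    worker.cancel()
    return first, second
```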
(Bitbake rev: e6d6c0b39393e9bdf378c1eba141f815e26b724b)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Adds support for creating a client that operates using Python asynchronous
I/O.
(Bitbake rev: cf9bc0310b0092bf52b61057405aeb51c86ba137)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
The hash equivalence client and server can occasionally send messages
that are too large for the server to fit in the receive buffer (64 KB).
To prevent this, support is added to the protocol to "chunkify" the
stream and break it up into manageable pieces that each side can put
back together.
Ideally, this would be negotiated by the client and server, but it's
currently hard coded to 32 KB to prevent the round-trip delay.
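The chunking idea can be illustrated with a few lines of Python. The function names are hypothetical and the sketch ignores how chunks are framed on the wire; only the 32 KB limit comes from the text above.

```python
import json

MAX_CHUNK = 32 * 1024  # hard-coded limit mentioned above


def chunkify(message, max_chunk=MAX_CHUNK):
    """Split an encoded message into pieces of at most max_chunk characters."""
    return [message[i:i + max_chunk] for i in range(0, len(message), max_chunk)]


def reassemble(chunks):
    """Join the pieces back into the original message."""
    return "".join(chunks)
```

A message of 100,000 characters would be sent as four pieces here, each no larger than the limit, and joined again on the receiving side.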
(Bitbake rev: e27a28c1e40e886ee68ba4b99b537ffc9c3577d4)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Reworks the hash equivalence server to address performance issues that
were encountered with the REST mechanism used previously, particularly
during the heavy request load encountered during signature generation.
Notable changes are:
1) The server protocol is no longer HTTP based. Instead, it uses a
simpler JSON over a streaming protocol link. This protocol has much
lower overhead than HTTP since it eliminates the HTTP headers.
2) The hash equivalence server can either bind to a TCP port, or a Unix
domain socket. Unix domain sockets are more efficient for local
communication, and so are preferred if the user enables hash
equivalence only for the local build. The arguments to the
'bitbake-hashserv' command have been updated accordingly.
3) The value to which BB_HASHSERVE should be set to enable a local hash
equivalence server is changed to "auto" instead of "localhost:0". The
latter didn't make sense when the local server was using a Unix
domain socket.
4) Clients are expected to keep a persistent connection to the server
instead of creating a new connection each time a request is made for
optimal performance.
5) Most of the client logic has been moved to the hashserv module in
bitbake. This makes it easier to share the client code.
6) A new bitbake command has been added called 'bitbake-hashclient'.
This command can be used to query a hash equivalence server, including
fetching the statistics and running a performance stress test.
7) The table indexes in the SQLite database have been updated to
optimize hash lookups. This change is backward compatible, as the
database will delete the old indexes first if they exist.
8) The server has been reworked to use python async to maximize
performance with persistently connected clients. This requires Python
3.5 or later.
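As an illustration of points 1 and 4, the sketch below sends JSON messages over one persistent stream connection. Newline-delimited framing and the message contents are assumptions made for this example; the commit message does not specify the wire format.

```python
import json
import socket


def send_message(sock, msg):
    """Encode one message as a single JSON line (assumed framing)."""
    sock.sendall((json.dumps(msg) + "\n").encode("utf-8"))


def recv_message(reader):
    """Read one JSON line; return None when the peer has closed the stream."""
    line = reader.readline()
    if not line:
        return None
    return json.loads(line)


def demo():
    # A connected socket pair stands in for a client/server link.
    client, server = socket.socketpair()
    reader = server.makefile("r", encoding="utf-8")
    try:
        # Several requests reuse the same connection.
        send_message(client, {"get": {"taskhash": "abc"}})
        send_message(client, {"get": {"taskhash": "def"}})
        return [recv_message(reader), recv_message(reader)]
    finally:
        reader.close()
        client.close()
        server.close()
```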
(Bitbake rev: 2124eec3a5830afe8e07ffb6f2a0df6a417ac973)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
This can cause a huge backlog of closing sockets on the server and
in our case we don't really want/need the protection TCP is trying to
give us so work around it.
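Assuming the backlog above refers to closed connections lingering in the TCP TIME_WAIT state (the commit message does not name it), one common workaround is to enable SO_LINGER with a zero timeout so that close() resets the connection. The helper name is hypothetical and this may not be what the commit does.

```python
import socket
import struct


def close_without_time_wait(sock):
    """Ask the kernel to reset the connection on close (assumed workaround)."""
    # l_onoff=1, l_linger=0: close() discards unsent data and sends a reset,
    # so the socket does not wait around after closing.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
```

This gives up the protection against stray late packets, which matches the trade-off the message describes.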
(Bitbake rev: 7bc79fdf60519231da7c0c7b5b6143ce090ed830)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
At exit the hashserv code was causing tracebacks as join() wasn't
being called from the thread that started the process. Ensure that
the hashserver is started from the pre_serve hook which is the
final thread the cooker runs in. This avoids the traceback at the
expense of some horrific poking into data stores which will ultimately
need improving through a proper API.
(Bitbake rev: 05888700e5f6cba48a26c8a4c447634a28e3baa6)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
There were hard to debug lockups when trying to use threading to start
hashserv as a thread. Switch to multiprocessing which doesn't show the
same locking problems.
(Bitbake rev: be23d887c8e244f1ef961298fbc9214d0fd0968a)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Experience with the prserv shows that having two threads, one accepting
and queueing connections and one handling the requests leads to much
more reliable behaviour than having everything in a single thread.
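The two-thread split can be sketched with a queue between an accepting thread and a handling thread. Strings stand in for connections here, and all names are hypothetical.

```python
import queue
import threading


def accept_loop(incoming, pending):
    """Accept "connections" and queue them without doing any real work."""
    for conn in incoming:
        pending.put(conn)
    pending.put(None)  # sentinel: no more connections


def handle_loop(pending, handled):
    """Take queued connections one at a time and process them."""
    while True:
        conn = pending.get()
        if conn is None:
            break
        handled.append(conn.upper())  # stand-in for request handling


def run(incoming):
    pending = queue.Queue()
    handled = []
    threads = [
        threading.Thread(target=accept_loop, args=(incoming, pending)),
        threading.Thread(target=handle_loop, args=(pending, handled)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```

The accepting thread never blocks on request processing, so new connections are queued promptly even while a slow request is being handled.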
(Bitbake rev: a03d60671a53d9ff70e07cc42fe35f6f8776dac2)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
We're seeing performance problems with hashserv running on a normal build
system. The cause seems to be the large amounts of file IO that builds involve
blocking writes to the database. Since sqlite blocks on the sync calls, this
causes a significant problem.
Since if we lose power we have bigger problems, run with synchronous=off
to avoid locking and put the journal into memory to avoid any write issues
there too.
This took writes from 120s down to negligible in my tests, which means
hashserv then responds promptly to requests.
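With the standard sqlite3 module, the two settings mentioned above can be applied as shown in this sketch. The `open_fast_db` helper and the database file name are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_fast_db(path):
    """Open a database that trades crash safety for write speed."""
    db = sqlite3.connect(path)
    # Do not wait for data to reach the disk after each write.
    db.execute("PRAGMA synchronous = OFF")
    # Keep the rollback journal in memory instead of in a file.
    db.execute("PRAGMA journal_mode = MEMORY")
    return db


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        db = open_fast_db(os.path.join(tmpdir, "hashserv.db"))
        try:
            sync = db.execute("PRAGMA synchronous").fetchone()[0]
            journal = db.execute("PRAGMA journal_mode").fetchone()[0]
            return sync, journal
        finally:
            db.close()
```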
(Bitbake rev: 7ae56a4d4fcf66e1da1581c70f75e30bfdf3ed83)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
BB_HASHSERVE
It's useful, particularly in the local developer model of usage, for
bitbake to start and stop a hash equivalence server on a local port,
rather than relying on one being started by the user before the build.
The new BB_HASHSERVE variable supports this.
The database handling is moved internally into the hashserv code so that
different threads/processes can be used for the server without errors.
(Bitbake rev: a4fa8f1bd88995ae60e10430316fbed63d478587)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Implements a number of optimizations to the SQL used in the hash
equivalence server:
1) Two indexes are created for the two methods (method, taskhash and
method, outhash) by which rows are found in order to speed up the
lookup.
2) An extra SELECT to look up the just-inserted row was removed. This
SELECT is unnecessary since all of the information about the newly
inserted row is already available.
3) A uniqueness constraint was added to the table. This should allow
the server to be multithreaded in the future since duplicate inserts
can be detected (and ignored). This change requires bumping the
database version to '2', since a uniqueness constraint can't be
added to an existing table.
4) Some comments are added to clarify the trick SELECT statement used
when inserting new equivalent hashes
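Points 1 and 3 can be illustrated with sqlite3 as follows. The table, column and index names are hypothetical, and the `insert_task` helper is only a sketch of how duplicate inserts could be detected and ignored.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        method TEXT NOT NULL,
        outhash TEXT NOT NULL,
        taskhash TEXT NOT NULL,
        unihash TEXT NOT NULL,
        -- Duplicate reports of the same row are rejected by the database.
        UNIQUE(method, outhash, taskhash)
    );
    -- One index per lookup pattern.
    CREATE INDEX taskhash_lookup ON tasks (method, taskhash);
    CREATE INDEX outhash_lookup ON tasks (method, outhash);
""")


def insert_task(method, outhash, taskhash, unihash):
    """Insert a row; return True if it was new, False if it was a duplicate."""
    cursor = db.execute(
        "INSERT OR IGNORE INTO tasks (method, outhash, taskhash, unihash) "
        "VALUES (?, ?, ?, ?)",
        (method, outhash, taskhash, unihash),
    )
    return cursor.rowcount == 1
```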
(Bitbake rev: 7aec8632e67b4f0ab7b72692c40a42f6926608c3)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
With the introduction of SPDX-License-Identifier headers, we don't need a ton
of header boilerplate in every file. Simplify the files and rely on the top
level for the full licence text.
(Bitbake rev: 695d84397b68cc003186e22f395caa378b06bc75)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
This adds the SPDX-License-Identifier license headers to the majority of
our source files to make it clearer exactly which license files are under.
The bulk of the files are under GPL v2.0 with one found to be under V2.0
or later, some under MIT and some have dual license. There are some files
which are potentially harder to classify where we've imported upstream code
and those can be handled specifically in later commits.
The COPYING file is replaced with LICENSE.X files which contain the full
license texts.
(Bitbake rev: ff237c33337f4da2ca06c3a2c49699bc26608a6b)
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|
|
Implements a reference implementation of the hash equivalence server.
This server has minimal dependencies (and no dependencies outside of the
standard Python library), and implements the minimum required to be a
conforming hash equivalence server.
[YOCTO #13030]
(Bitbake rev: 1bb2ad0b44b94ee04870bf3f7dac4e663bed6e4d)
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
|