[Open SoC Debug] Major changes in Open SoC Debug ahead!

Philipp Wagner philipp.wagner at tum.de
Thu Sep 21 14:59:56 CEST 2017

Hi all,

Over the last roughly two years, Open SoC Debug has grown into a
reliable debugging tool for the needs of lowRISC and OpTiMSoC. A lot of
effort went into fixing small bugs to improve reliability and to add
some features such as the emulated UART device, UART-DEM. And it was
worth the effort, as we've seen over the summer when we added Linux
support to OpTiMSoC. Control flow traces generated by the CTM modules,
as well as the UART-DEM module were major enablers for this work.

This work has also given us a better understanding of areas where our
current design limits extensibility. Furthermore, we found some parts of
the reference implementation to be tricky or fragile to use. To fix all
this, we've started brainstorming and refactoring the spec and the
reference implementation.

What changes are coming?

Debug Interconnect Addresses are now 16 bit wide

Previously, addresses used to identify debug modules were 10 bit wide,
allowing up to 1024 modules to be present in a debug system. We're
extending addresses to 16 bit, addressing up to 65536 debug modules.
("65k should be enough for everybody.") This change also requires
modifications to the packet format of the on-chip interconnect, and hence
modifications to all debug modules which send and receive packets. 16
bit addresses enable us to address more debug modules, but they also
enable us to reserve some parts of the address for special purposes.
(More on that later.)
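
To make the numbers concrete, here is a small sketch of the address arithmetic. Only the two widths (10 and 16 bit) come from the text above; the function names and the `upper_bits` split are made up for illustration and are not taken from the OSD specification, which may carve up the address differently.

```python
# Illustration only: the field split below is hypothetical and not taken
# from the OSD specification.
OLD_ADDR_BITS = 10
NEW_ADDR_BITS = 16


def address_count(bits: int) -> int:
    """Number of distinct addresses representable with `bits` bits."""
    return 1 << bits


def split_address(addr: int, upper_bits: int) -> tuple[int, int]:
    """Split a 16 bit address into (upper, lower) fields.

    `upper_bits` is a made-up parameter: how many of the most significant
    bits are set aside for some special purpose.
    """
    if not 0 <= addr < address_count(NEW_ADDR_BITS):
        raise ValueError("address does not fit into 16 bits")
    if not 0 <= upper_bits <= NEW_ADDR_BITS:
        raise ValueError("upper_bits must be between 0 and 16")
    lower_bits = NEW_ADDR_BITS - upper_bits
    return addr >> lower_bits, addr & ((1 << lower_bits) - 1)


print(address_count(OLD_ADDR_BITS))  # 1024
print(address_count(NEW_ADDR_BITS))  # 65536
print(split_address(0x1234, 6))      # (4, 564)
```

With six bits reserved in this made-up split, each of the 64 upper values would still leave room for 1024 modules, the same number the old 10 bit scheme could address in total.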

Changes to the base register map

All debug modules in OSD conform to a common base register map. These
registers describe the type of the module, its version, amongst other
things. In order to be more extensible, we've split the module type into
two fields (vendor and type identifier) and rearranged them in the
register map. The specification is already updated to describe the new
register map.
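
As a rough sketch of what the split could mean for host software: a tool might key its lookup of known modules on the pair (vendor, type) instead of a single number. Everything below is an assumption for illustration. The class name, the field names, the numeric values and the registry entries are invented; the real register layout is the one described in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleDescriptor:
    """Identity of a debug module as read from its base registers.

    Field names are hypothetical; see the specification for the real ones.
    """
    vendor_id: int
    type_id: int
    version: int


# Hypothetical registry: the IDs and names are placeholders, not real
# assignments. Two vendors can use the same type_id without clashing.
KNOWN_MODULES = {
    (0x0001, 0x0001): "example-vendor-A/module-X",
    (0x0002, 0x0001): "example-vendor-B/module-Y",
}


def describe(desc: ModuleDescriptor) -> str:
    """Return a human-readable name for a module, if it is known."""
    name = KNOWN_MODULES.get((desc.vendor_id, desc.type_id), "unknown module")
    return f"{name} (version {desc.version})"


print(describe(ModuleDescriptor(vendor_id=0x0001, type_id=0x0001, version=3)))
print(describe(ModuleDescriptor(vendor_id=0x0002, type_id=0x0001, version=1)))
print(describe(ModuleDescriptor(vendor_id=0x0003, type_id=0x0007, version=0)))
```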

Cocotb-based hardware testing/verification

Currently the hardware portion of the OSD reference implementation is
mainly tested using manual tests, together with system-level tests in
OpTiMSoC and lowRISC. To make changes to the code base easier and the
results more predictable, we're adding unit tests to the reference
implementation using the excellent Python-based cocotb testing framework.

A full rewrite of the software reference implementation

The software running on the host is the main entry point for users to
OSD. It must be as robust as possible to give a smooth debugging and
tracing experience. But it also must be extensible to add new debug
tools easily.

The current implementation has a couple of very nice properties:

-   Multiple debug tools can consume data coming from the target. For
     example, run-control debugging with GDB doesn't interfere with
     logging a system trace.
-   A scriptable interface makes it easy to automate the interaction
     with OSD-enabled SoCs.
-   Debug tools can be separated into individual processes, or combined
     into one process.

We'll keep these properties, but extend them in a couple of ways:

-   Instead of using our own TCP-based communication protocol between
     the debug tools on the host, we'll rely on ZeroMQ. Using ZeroMQ is
     great, as it solves a couple of problems at once:
     -   It supports different types of transports. Components connected
         by ZeroMQ can live in the same process using the inproc shared
         memory transport, but they can also live on different machines
         using the tcp transport. All of that is fully transparent to
         the application.
     -   ZeroMQ has bindings for just about any programming language
         out there. This enables writing debug tools in all those
         languages, as long as the host communication protocol is adhered
         to (something we are also documenting in more detail as part of
         the refactoring process).
     -   Finally, ZeroMQ is great at handling all the tiny little details
         of communication over unreliable links -- connects and
         disconnects, timeouts, signal handling, and much more.
-   We're redesigning the architecture of libopensocdebug to make it
     easier to test. This mostly involves splitting the current
     matryoshka-doll-like structure (a debug tool encapsulates the tool
     client component which encapsulates the tool server component and
     ultimately GLIP for communication) into smaller classes. For
     example, GLIP is no longer encapsulated in libopensocdebug, but it
     remains separate and the two libraries are connected on a higher
     level. A more detailed look at the new architecture will follow in a
     later blog post.
-   We're adding unit tests and code coverage metrics to make sure we're
     not breaking things when extending our implementation in the future.
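
The transport transparency mentioned in the ZeroMQ bullet can be sketched with the Python binding, pyzmq. This is not the OSD host communication protocol: the PUSH/PULL socket pair, the function name, the endpoint names and the payload are all chosen just for this example. It only shows that the same code runs unchanged over the inproc and the tcp transport.

```python
import zmq


def send_over(endpoint: str, payload: bytes) -> bytes:
    """Send `payload` from a PUSH socket to a PULL socket over `endpoint`.

    Both sockets share one context, which the inproc transport requires.
    """
    with zmq.Context() as ctx:
        with ctx.socket(zmq.PULL) as receiver, ctx.socket(zmq.PUSH) as sender:
            receiver.setsockopt(zmq.RCVTIMEO, 2000)  # fail instead of hanging
            sender.setsockopt(zmq.LINGER, 0)         # don't block on close
            receiver.bind(endpoint)
            # Ask ZeroMQ where we actually bound; this resolves a wildcard
            # TCP port ("*") to the port the OS picked.
            actual = receiver.getsockopt(zmq.LAST_ENDPOINT).decode()
            sender.connect(actual)
            sender.send(payload)
            return receiver.recv()


# Same function, two transports: only the endpoint string differs.
for endpoint in ("inproc://osd-demo", "tcp://127.0.0.1:*"):
    print(endpoint, send_over(endpoint, b"trace event"))
```

A real tool would keep its sockets open and would more likely use a pattern with addressing or fan-out than a bare PUSH/PULL pair; the point here is only that the choice of transport stays outside the application logic.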

One more thing: Subnets

OSD subnets are something which is mostly in our heads right now. For
now this only means: all debug tools on the host are part of one "subnet",
and all debug modules on the target device are another subnet. There's
more to it, but we'll keep that for a later time.

So where's the code?

A large rework like the one we're currently attempting involves changing
code in various places. Unfortunately, it's not always possible to
completely decouple the dependencies between these changes. This is
especially true for the changes to the communication protocol, which
require changes to both software and hardware parts.

So to keep OSD working and usable by downstream projects while we're
working on this major refactoring, we've decided to take the following
approach:

-   We'll keep the master branches of the reference implementation in a
     working, stable state. All our refactoring will happen on a
     different branch, called osd-next. If you're currently using OSD in
     your designs (that's most likely only true for OpTiMSoC and
     lowRISC), stay on the master branch for now.
-   The specification is continuously updated to reflect the current
     state of our thinking, i.e. how we want the spec to be. This, in
     turn, implies that the reference implementation temporarily diverges
     from the spec. But since there has been no formal release of the
     spec anyways, we feel that approach shouldn't cause too many
     problems for our users.
-   We'll try to review and merge individual chunks of work making up
     the rewrite as usual, and commit them into the respective
     osd-next branches.
-   The new software reference implementation is currently in the
     process of being cleaned up for an initial review round. It still
     has a lot of rough edges, but already exhibits the properties we're
     looking for, and I hope no major redesign will be needed before it
     can get merged. Expect a first pull request in the coming weeks.

In addition to the upstream work at OSD, OpTiMSoC is maintaining a
continuously updated branch osd-rework in its repository. This ensures
that the changes in OSD fit well into the needs of our downstream

Give feedback and get involved

Changes like the ones we're attempting here present an excellent
opportunity to get involved. Let us know on the mailing list what you
think, or what your questions are.


