Category Archives: Engine

Under the hood: Transactions

The final major “under-the-hood enhancement” in Toves’s design worth discussing is the transactional-based access scheme. In Logisim, there is a persistent bug where a circuit’s simulation occasionally messes up. It seems to be quite rare, even for large circuits, and it happens unpredictably. My best guess is that it has to do with multithreaded access.

To avoid multithreading issues, Toves uses the concept of a “transaction” from databases. Its key data structures (the circuit layout, the simulation engine, the overall project) provide no way to access them directly. Instead, you have to nest all access within a transaction. In the program, it looks like this:

Transaction xn = new Transaction();
ILayoutAccess layout = xn.RequestReadAccess(layoutStructure);
ISimulationAccess sim = xn.RequestWriteAccess(simulationStructure);
using (xn.Start()) {
    // code using layout and sim to read layoutStructure and modify simulationStructure
}

The ‘layoutStructure‘ object actually doesn’t even have methods like ‘GetWires‘ or ‘AddWire‘. Such methods are provided in the ILayoutAccess interface, and such methods always confirm that the transaction is still live before proceeding with their work.

The example above shows how a transaction can request simultaneous locks on different structures. If there prove to be issues with deadlock between transactions, these will hopefully be able to be addressed by making Transaction’s Start method more sophisticated in the order in which it obtains locks.

(I suspect that the distinction between read access and write access is a matter of overengineering: In practice, it probably happens that two threads never end up getting read access simultaneously.)

This should degrade performance somewhat, as repainting the circuit (which happens often) will never happen at the same time as stepping the circuit (which happens extremely often). But this simultaneous access is exactly what makes bugs possible – and in practice I believe the interface events like repainting the circuit happen at most 50 times per second, whereas the simulation-stepping transactions happen upwards of 1,000 times per second. In any case, the locking can easily be removed (by making Transaction’s Start method do absolutely nothing) if performance proves problematic – though that requires more fine-grained concurrency that may actually hurt performance even more than the exclusive access.

Under the hood: Hash tables

[I have a few more posts describing what’s going on “under the hood.” I hope you’re finding this interesting… but what I’m really hoping for is to find commenters who are ready to discuss/critique what they’d like to see from the user-facing features. For instance, how should a counter be drawn in the circuit, and with what inputs/outputs? Or what file format should Toves use?]

Another change “under the hood” in Toves is a move away from hash tables. Logisim uses hash tables liberally: For the circuit under simulation, there is one hash table that maps locations to the values stored at those, and there is another hash table that maps circuit components to their individual state (for components that have state, like flip-flops). And there are several hash tables beyond that.

I believe a large fraction of Logisim’s simulation time is spent in hash table lookups, and so Toves is much more careful in avoiding hash tables. For example, stylistically it’s tempting to have a hash table mapping each “node” to its current “value,” since the current value is conceptually not a property of the node. But it’s more efficient for each node to have a mutable field saying what its value is. (Actually, Toves groups linked nodes together into “subnets,” and each node has a mutable field saying what subnet it belongs to; the subnet is actually where the value is stored.) To look up a node’s value, it has only to dereference a couple of pointers, which should be far faster than a hash table lookup.

In fact, right now the simulation engine in Toves includes just two hash tables. (More could easily be added as more features are added – two big missing features that potentially will involve substantial edits to the simulation engine are subcircuits and splitters.) One of these hash tables is used to track which subnets have changed in the current time step, so that subnets’ values aren’t re-computed when the result is guaranteed to be the same. Another tracks which components have seen their inputs changed, so that we don’t ask the components to recompute their output values.

As I write this, I believe each of these remaining two hash tables could be replaced with an array-backed list accompanied by a boolean flag in the subnet/instance indicating whether it is already in the list. That should be more efficient (though there are potential problems with this for a multithreaded implementation to take advantage of multicore computation, which is something that I hope in the vague future to tackle). I plan to look into that after adding subcircuits and splitters.

Under the hood: Simulation data

Many of the biggest changes from Logisim to Toves are “under the hood,” some of the more important of which I’ll explain in the next few days. These changes are basically invisible to a user, though it has major implications for the code design and features provided.

This post will discuss the biggest of the changes: How circuits are represented during simulation. Logisim has just one representation of the circuit layout: During simulation, it remembers, “there’s a 0 at this coordinate, X at that coordinate” and so on. By contrast, as you build a layout, Toves “compiles” the physical layout into a separate data structure corresponding to the circuit’s logical connections. Toves represents a set of linked wire segments as a system of “nodes” and “links” between them. The simulation code has no knowledge of coordinates. It also forgets about components and subcircuits, dealing simply with a collection of “instances.”

Naturally, Logisim’s approach has some advantages. It is conceptually simpler, and it has less redundancy, so there are fewer data structures to keep “in sync.” By contrast, Toves maintains two redundant representations of the circuit – one representation used for drawing and editing, with another representation being used for simulation. Keeping these two representations in sync is tricky. In fact, I started work on this concept several times over the last several years before finally getting this version to work.

While Toves’s technique is more complex, there are of course important reasons to keep them separate. First is the efficiency angle. Actually, the Logisim approach has the potential to be more efficient for *drawing the circuit*, which works with the layout information but Toves must repeatedly map to the simulation to determine what values to draw. But for *simulating a circuit*, the Toves approach can potentially be faster – both because the data structure it’s working with is designed explicitly for simulation, and because the structure itself is simpler, providing more room for further optimization. So do we go for efficiency at drawing or at simulation? Well, there’s little reason to repaint the circuit more than 30 times a second, but we hope to execute thousands of simulation steps each second, so priority should go to simulation.

Another advantage to Toves’s approach is an issue of software architecture. Once you get past the complexity of maintaining a mapping between layout and simulation representations, Toves has the advantage of being cleanly broken into “layers”: At the bottom, you have the simulation model, then the simulation state on top of that, then the layout model, then the abstract GUI code (written to be independent of platform), then the actual GUI implementation. Toves tries to keep the layers more separate, with one-way dependencies between them; by contrast, Logisim is not very cohesive, with packages/namespaces freely depending on one another.

Finally, separating the simulation representation from the physical layout should make it easier to support project modules being defined not by a physical layout but by HDL code (whether Verilog or VHDL). With Logisim, we’d have to make up a fictional graphical layout corresponding to the textual code, whereas with Toves we can compile both the HDL code and the physical layout into the same underlying representation. Admittedly, support for HDL code is a long way off, but I’m hopeful that Toves will be ready for this when the time comes.