ECE 576 - Homework Assignment 2

Due Monday, March 01, 11:59PM


Announcements and Clarifications:

March 08: A solution to Homework Assignment 2 has been posted on the embedded server at /scratch/ece576/tlm_sharedbus.

February 25: Grading Rubric.

February 23: The due date for Assignment 2 has been extended to Monday, March 1 at 11:59PM.

February 23: Bus Protocol Clarification: The bus Request() should provide all information necessary to indicate what transaction is requested on the bus, similar to the two bus protocols discussed in lecture. During the Request(), sufficient information should be provided such that any servant component listening to the bus will receive the information required for responding (or for determining whether it should respond) to the request. For example, consider an implementation in which only one bus master is present. In this scenario, the bus master's Request() would need to include all information for the request such that the servants listening to the bus (i.e., waiting within the Listen() function) can respond with the Acknowledge() if the request is for them. In this scenario, the acknowledge comes directly from the servant component, as no arbiter is present and only one master is waiting for the acknowledge.

Thus, in a TLM implementation, components acting as bus masters should not be aware of the arbitration method. Instead, all bus masters are provided the same protocol, which hides the details of the arbitration such that WaitForAcknowledge() will wait until an Acknowledge() has been made by a servant component in response to the bus master's request. While the bus arbitration is responsible for sending the acknowledge to the correct master component, the acknowledge comes from the servant component.

The WaitForAcknowledge() called by a master component should not return until an Acknowledge() has been provided by a servant component in response to the current bus request. In other words, the bus component is not responsible for generating the acknowledge, but rather simply forwarding an acknowledge from a servant component.
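The clarification above can be illustrated with a minimal sketch in plain C++ (no SystemC, single-threaded). The names BusRequest, BusState, ListenAndAcknowledge, and AcknowledgeReceived are hypothetical and are not part of the required interfaces. The sketch only shows that the request carries enough information for a servant to decide whether to respond, and that the acknowledge flag is set by a servant, never by the bus itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical request payload: everything a servant needs in order to
// decide whether the transaction is addressed to it.
enum class BusOp { Read, Write };

struct BusRequest {
    unsigned master_id;  // which master issued the request
    std::uint32_t addr;  // target memory-mapped address
    BusOp op;            // read or write
};

// Single-threaded stand-in for the bus state. A SystemC model would block
// inside WaitForAcknowledge() and Listen() instead of polling.
struct BusState {
    bool has_pending = false;   // set by Request()
    BusRequest pending{};       // valid only while has_pending is true
    bool acknowledged = false;  // set only by a servant

    void Request(const BusRequest& r) {
        assert(!has_pending);  // one outstanding request at a time
        pending = r;
        has_pending = true;
        acknowledged = false;
    }

    // Servant side: acknowledge only if the address lies in [base, base + size).
    bool ListenAndAcknowledge(std::uint32_t base, std::uint32_t size) {
        if (!has_pending || pending.addr < base || pending.addr - base >= size)
            return false;  // not for this servant: stay silent
        acknowledged = true;
        return true;
    }

    // Master side: true only after a servant has acknowledged.
    bool AcknowledgeReceived() const { return acknowledged; }
};
```

In a SystemC implementation, the polling in AcknowledgeReceived() would likely be replaced by a blocking wait on an event that the servant's Acknowledge() notifies.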

February 23: SW <-> HW Communication: All communication between the software and hardware components must take place using memory mapped addresses.

February 23: SW Performance Estimation: The provided profile data will be needed to add the performance estimation information for the software component, specifically for estimating the performance of any loop(s) that are not partitioned to hardware. Please note that the provided profile is in cycles, not instructions.


1. (15 points) System-level Design of Hardware/Software Partitioned Application.

Using SystemC and transaction-level modeling, implement the matrix multiplication application as a hardware/software partitioned design consisting of a software component (SW), a hardware coprocessor (HW), and a memory communicating over a shared bus (Bus). Using the profile information annotated within the C code, determine which of the two innermost loops will yield the greatest performance improvement when partitioned to a hardware coprocessor.



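#define LOOPS 1000
#define SIZE 5

// Declarations not shown in the original listing. Assumed: indices 1..SIZE
// are used, so each dimension needs SIZE + 1 entries.
int a[SIZE + 1][SIZE + 1], b[SIZE + 1][SIZE + 1], c[SIZE + 1][SIZE + 1];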

int main() // Total Cycles: 8193437
{
  int n;
  int i,j,k;

  for (n = 0 ; n < LOOPS ; n++) // Total Cycles: 8186006, Execs: 1, Iters: 1000
  {
    for(i=1;i<=SIZE;i++) // Total Cycles: 579000, Execs: 1000, Iters: 5
      for(j=1;j<=SIZE;j++) // Total Cycles: 520000, Execs: 5000, Iters: 5
        c[i][j] = 0;

    for(i=1;i<=SIZE;i++) // Total Cycles: 7579000, Execs: 1000, Iters: 5
      for(j=1;j<=SIZE;j++) // Total Cycles: 7520000, Execs: 5000, Iters: 5
        for(k=1;k<=SIZE;k++) // Total Cycles: 7225000, Execs: 25000, Iters: 5
          c[i][j] += a[i][k] * b[k][j];
  }

  return 0;
}
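
As a starting point for the partitioning decision, the profile annotations can be turned into an upper bound on speedup. The sketch below assumes that "the two innermost loops" refers to the j and k loops of the multiplication nest, and it treats the hardware version of a loop and all of its bus traffic as free. A real comparison must add the bus transfer costs back in, which may change the outcome. The function names are hypothetical.

```cpp
#include <cassert>

// Cycle counts copied from the profile annotations in the listing.
constexpr long kTotalCycles = 8193437;  // main()
constexpr long kJLoopCycles = 7520000;  // j loop of the multiplication nest
constexpr long kKLoopCycles = 7225000;  // k loop of the multiplication nest

// Software cycles that remain if a loop costing `loop_cycles` is removed.
constexpr long RemainingCycles(long loop_cycles) {
    return kTotalCycles - loop_cycles;
}

// Upper bound on speedup: assumes the hardware loop and its communication
// cost nothing, which an actual design will not achieve.
double IdealSpeedup(long loop_cycles) {
    assert(loop_cycles >= 0 && loop_cycles < kTotalCycles);
    return static_cast<double>(kTotalCycles) / RemainingCycles(loop_cycles);
}
```

Under these assumptions, moving the k loop leaves 968437 software cycles and moving the j loop leaves 673437, so the ideal bounds are roughly 8.5x and 12.2x respectively.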

Shared Bus

Your implementation should consist of two interfaces, a bus master interface (bus_master_if) and a bus servant interface (bus_servant_if) with support for directly modeling the various stages of the bus communication protocol. The names of all required transactions (or functions) for each interface are provided, but the exact parameters and return types are left open ended.

// Bus Master Interface
class bus_master_if : virtual public sc_interface
{
  public:
    virtual Request() = 0;
    virtual WaitForAcknowledge() = 0;
    virtual ReadData() = 0;
    virtual WriteData() = 0;
};


// Bus Servant Interface
class bus_servant_if : virtual public sc_interface
{
  public:
    virtual Listen() = 0;
    virtual Acknowledge() = 0;
    virtual SendReadData() = 0;
    virtual ReceiveWriteData() = 0;
};

For this assignment, your bus implementation only needs to support single data read and single data write operations. You do not need to support burst transactions, although you are welcome to do so. The shared bus component should support a round robin arbitration scheme.
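
One possible way to express round robin arbitration is sketched below in plain C++. The function NextGrant and its parameters are hypothetical; the assignment does not prescribe how the arbiter is structured. The search starts at the master after the one granted last, so no requesting master is skipped twice in a row.

```cpp
#include <cassert>
#include <vector>

// Returns the index of the next master to grant, or -1 if none is requesting.
// `last_granted` is the master that won the previous arbitration round
// (pass requesting.size() - 1 initially so that master 0 is checked first).
int NextGrant(const std::vector<bool>& requesting, int last_granted) {
    const int n = static_cast<int>(requesting.size());
    assert(n > 0 && last_granted >= 0 && last_granted < n);
    for (int offset = 1; offset <= n; ++offset) {
        const int candidate = (last_granted + offset) % n;
        if (requesting[candidate]) return candidate;
    }
    return -1;  // bus stays idle
}
```

With two masters both requesting, the grant alternates between them; with only one requesting, that master is granted every round.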

Software Component

The application software that is not partitioned to hardware can be directly modeled as C/C++ code within the software component. All array data (specifically arrays a, b, and c) is stored within the memory component, and all reads from or writes to these arrays must be performed through the shared bus.

Hardware Component

The hardware component can act as both a master and a servant of the shared bus. Thus, the hardware component should have two ports: one connecting to the bus_master_if of the shared bus and one connecting to the bus_servant_if of the shared bus. All communication between the software and hardware components must be implemented using memory mapped addressing. You must also define the specific memory mapped addresses to which this component is mapped.
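
A memory mapped interface for the coprocessor might look like the sketch below. Every address, register name, and register meaning here is an assumption made for illustration (word addressing, a base of 0x1000, four registers); the actual map is yours to define.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical memory map; the assignment leaves the addresses to you.
constexpr std::uint32_t kHwBase = 0x1000;  // assumed coprocessor base address
constexpr std::uint32_t kHwRegCount = 4;   // assumed number of word registers

enum class HwReg : std::uint32_t {
    Control = 0,  // SW writes 1 to start the partitioned loop
    Status  = 1,  // SW reads 1 once the coprocessor has finished
    Arg0    = 2,  // e.g. row index i
    Arg1    = 3,  // e.g. column index j
};

// True if a bus address selects one of the coprocessor's registers.
constexpr bool IsHwAddress(std::uint32_t addr) {
    return addr >= kHwBase && addr - kHwBase < kHwRegCount;
}

// Word address the software would place in its bus request for register r.
constexpr std::uint32_t HwRegAddress(HwReg r) {
    return kHwBase + static_cast<std::uint32_t>(r);
}

// Decodes an address back into a register; check IsHwAddress() first.
HwReg DecodeHwReg(std::uint32_t addr) {
    assert(IsHwAddress(addr));
    return static_cast<HwReg>(addr - kHwBase);
}
```

The servant side of the hardware component could use a check like IsHwAddress() to decide whether to acknowledge a request, while its master side issues ordinary requests to the memory's address range.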

Memory

The arrays a, b, and c are located within a single memory component. You will need to determine the minimum required size for the memory as well as define the specific addresses at which the memory is located.

The a and b data values within the memory should be initialized at the beginning of simulation from the memory component's constructor. The contents of these arrays are as follows:

a = { 0,0,0,0,0,0,0,0,9,4,7,9,0,12,14,15,16,11,0,2,3,4,5,6,0,4,3,2,1,2,0,2,7,6,4,9 };
b = { 0,0,0,0,0,0,0,0,9,4,7,9,0,12,14,15,16,11,0,2,3,4,5,6,0,4,3,2,1,2,0,2,7,6,4,9 };
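
Each list above holds 36 values, which is consistent with (SIZE + 1) x (SIZE + 1) arrays indexed from 1 to SIZE. The sketch below shows one possible word-addressed, row-major layout with the three arrays packed back to back. The base address and all names are assumptions, and a design that stores only the elements actually used could be smaller.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kSize = 5;         // SIZE in the C code
constexpr std::uint32_t kDim = kSize + 1;  // indices 1..SIZE, so 6 per dimension
constexpr std::uint32_t kWordsPerArray = kDim * kDim;  // 36 words per array

// Hypothetical layout: a, b, and c stored back to back from kMemBase.
constexpr std::uint32_t kMemBase = 0x0000;  // assumed base address
constexpr std::uint32_t kABase = kMemBase;
constexpr std::uint32_t kBBase = kABase + kWordsPerArray;
constexpr std::uint32_t kCBase = kBBase + kWordsPerArray;
constexpr std::uint32_t kMemWords = 3 * kWordsPerArray;  // size under this layout

// Word address of element [row][col] of the array that starts at `base`.
std::uint32_t ElementAddress(std::uint32_t base, std::uint32_t row,
                             std::uint32_t col) {
    assert(row < kDim && col < kDim);
    return base + row * kDim + col;  // row-major, as in C
}
```

Under these assumptions the memory spans 108 words, and an access such as c[5][5] maps to word address 107.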

2. (12 points) Performance Modeling of Software, Hardware, and Bus Communication.

Extend your system-level implementation to incorporate approximate time/cycle accurate performance data using the following information, assuming a system clock operating at 100 MHz.

  • A bus Request requires at least two clock cycles
  • A bus Acknowledge requires one clock cycle
  • A bus Write operation requires one clock cycle
  • A bus Read operation requires two clock cycles
  • The performance of the hardware component is only limited by the speed of the data transfers
  • All software instructions require 1.5 cycles (i.e., software execution delays can be estimated using the provided profile data)
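
The bus figures above can be combined into a per-transaction delay, as in the sketch below. It assumes an uncontended bus and the minimum Request latency, and it treats one transaction as Request, then Acknowledge, then the data phase. The constant and function names are hypothetical.

```cpp
#include <cassert>

constexpr double kClockMHz = 100.0;
constexpr double kCycleNs = 1000.0 / kClockMHz;  // 10 ns per cycle at 100 MHz

// Cycle costs from the list above; the Request cost is a stated minimum.
constexpr int kRequestCycles = 2;
constexpr int kAcknowledgeCycles = 1;
constexpr int kWriteCycles = 1;
constexpr int kReadCycles = 2;

// Minimum cycles for one uncontended single-word transaction.
constexpr int TransactionCycles(bool is_read) {
    return kRequestCycles + kAcknowledgeCycles +
           (is_read ? kReadCycles : kWriteCycles);
}

// Converts a cycle count to nanoseconds at the system clock.
double CyclesToNs(long cycles) {
    assert(cycles >= 0);
    return static_cast<double>(cycles) * kCycleNs;
}
```

Under these assumptions a single read takes at least 5 cycles (50 ns) and a single write at least 4 cycles (40 ns). In a SystemC model, such delays would typically be applied inside each protocol function with a timed wait, for example wait(sc_time(20, SC_NS)) for a two-cycle Request.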

3. (3 points) Report

You must submit a Word or PDF document providing the following details regarding your implementation:

  • Provide complete details and a brief description of the bus_master_if and bus_servant_if utilized within your design.
  • Briefly provide an overview of how you integrated the performance modeling within your SystemC TLM implementation. (Note: you may include code snippets in your report, but please keep these to only what is minimally necessary.)
  • Provide a breakdown of the resulting hardware/software performance for the matrix multiplication application.

Submission Requirements:

You must submit your SystemC files and Report via D2L as a single ZIP or TAR/GZIPPED archive. Note: Do not submit executables, Makefiles, or Visual Studio project files.

Linux Server Requirements:

The server embedded.ece.arizona.edu will be utilized to test all homework assignments. The embedded server is available for development and testing of your design. If you want to use the embedded server, please email the instructor with your ECE account login so that a directory can be set up for your development efforts.

Students can utilize the departmental servers and workstations for their projects, and are free to use Microsoft Visual Studio, if desired. If you choose to use Microsoft Visual Studio, you will need to compile the SystemC library, for which many tutorials exist. While you are free to use any development environment, you should test your code for correct functionality on the embedded server before submitting.