Next stop - PCIE 3.0 MR-iov SR-iov

This is going to be my next blog. I will try to cover what the ideas are and what to look for! Before I start, I need to revisit PCIE error handling. It confused me a lot, especially when I was responsible for shedding light on the design alternatives in architectural meetings.
First thing to note is that PCIE is a packet-based serial communication method. Since it is packet based, it has different layers to handle the complexity of communication between devices on point-to-point links. Different layers can detect error conditions and trigger events to report them, so error classification, reporting, and handling are fundamental to such a protocol. This is not new, though: TCP/IP and other layer 3 and layer 4 protocols do this type of error management.
A few fundamentals, as an aside, concern the difference between asynchronous and synchronous communication. Any packet-based communication is asynchronous! Synchronizing clocks across devices proved to be both difficult and time consuming, hence asynchronous packetized protocols are the current trend, and clocking information is carried in the handshaking or link training packets. Note that I am not going into the details of synchronous versus asynchronous communication here. But to note, PCIE is really asynchronous serial communication between the devices on a link. It employs 8b/10b self-synchronizing line coding (in the 1.x and 2.x generations; 3.0 moves to 128b/130b).
The signaling scheme is also quite elegant. There are no dedicated address and data signals like those of a parallel bus; addresses and data travel inside the packets. Nor is there a sideband clock signal traveling along with the data. It is also scalable, since a link can be x1, x4, x8, x16, or x32 wide. An x4 configuration means 4 lanes between two devices. Each lane is a pair of unidirectional bit flows, one in each direction. An x4 link has a theoretical limit of 1 GB/s in each direction, so with x16 close to 4 GB/s is possible! This is pretty much the base specification, PCIE 1.0. Versions 2.0 and 3.0 achieve higher speeds.
Now, coming back to the PCIE error mechanism.
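Before moving on, the lane arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the PCIE 1.x signaling rate of 2.5 GT/s per lane and 8b/10b coding, and the function names are made up for illustration.

```python
# Sketch: theoretical per-direction bandwidth of a PCIE 1.x link.
# Assumptions (not from any driver API): 2.5 GT/s raw rate per lane and
# 8b/10b line coding, i.e. 8 data bits for every 10 bits on the wire.

RAW_BITS_PER_SEC_PER_LANE = 2_500_000_000  # 2.5 GT/s, one bit per transfer
DATA_BITS, LINE_BITS = 8, 10               # 8b/10b coding ratio


def lane_data_bytes_per_sec(raw_bits=RAW_BITS_PER_SEC_PER_LANE):
    """Usable bytes per second on one lane, in one direction."""
    data_bits = raw_bits * DATA_BITS // LINE_BITS  # strip coding overhead
    return data_bits // 8                          # bits -> bytes


def link_bandwidth_gb_per_sec(lanes):
    """Theoretical GB/s (10**9 bytes) per direction for an xN link."""
    return lanes * lane_data_bytes_per_sec() / 1e9


if __name__ == "__main__":
    for lanes in (1, 4, 8, 16, 32):
        print(f"x{lanes}: {link_bandwidth_gb_per_sec(lanes):.2f} GB/s per direction")
```

These are ceilings only; packet headers, DLLP traffic, and flow control reduce what a real device sees.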
Since PCIE has to be backward compatible, it supports the old PERR# and SERR# error reporting. PERR is a data parity error; SERR is a system error. While PERR errors are potentially recoverable, SERR errors are usually considered unrecoverable. PCI-X basically follows the same rules as PCI, but defines device-specific error handling. It is really a prelude to the more comprehensive error handling in PCIE.
Since error detection, reporting, and handling are defined from the PCIE device point of view rather than the platform point of view, the discussion here sticks to the device point of view. Moreover, I will try to stress some salient points and the complexities that come with virtualization of device functions. Briefly, when we introduce virtual functions off of a physical function and assign them to virtual machines (VMs), each VM is now running a PCIE device (sort of).
First, the detection mechanism from the driver. In its simplest form, the device should have an error status register, like the one in the PCI config space. But that is just the simplest case. Since errors can occur at several protocol layers, not every error needs to percolate to the top layer for detection by the driver, and the reporting from the device (i.e., from the endpoint) can vary as well. For example, most TLP (Transaction Layer Packet) errors actually show up in the error status register. Since PCIE is backward compatible, both the PCI-compatible status register and the PCIE error status register get set. But software written for one bus specification does not necessarily clear the bits of the other when it is done with error handling. So the driver will know from the error status register whether an error was reported. At the driver level, this is the basic error detection.
To understand the reporting, transactions are classified into (1) non-posted requests: reads, I/O writes, and configuration requests; and (2) posted requests: messages and memory writes.
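Going back to the driver-level detection step for a moment, here is a rough Python sketch of checking both status registers and working out what to write back so that neither is left with stale bits. The bit positions are my reading of the PCI Status register and the PCIE Device Status register and should be checked against the specifications; the helper names are hypothetical, and actually reading or writing config space is left out.

```python
# Sketch: decode error bits from the legacy PCI Status register and the
# PCIE Device Status register. Bit positions are assumptions taken from
# the specs as I understand them; helper names are hypothetical.

PCI_STATUS_ERROR_BITS = {
    8: "Master Data Parity Error",
    14: "Signaled System Error",
    15: "Detected Parity Error",
}

PCIE_DEVICE_STATUS_ERROR_BITS = {
    0: "Correctable Error Detected",
    1: "Non-Fatal Error Detected",
    2: "Fatal Error Detected",
    3: "Unsupported Request Detected",
}


def decode_errors(value, bit_names):
    """Return the names of the error bits set in a register value."""
    return [name for bit, name in sorted(bit_names.items())
            if value & (1 << bit)]


def clear_mask(value, bit_names):
    """Value to write back, assuming the bits are write-1-to-clear."""
    mask = 0
    for bit in bit_names:
        mask |= value & (1 << bit)
    return mask


if __name__ == "__main__":
    pci_status = 0x8110   # example value: bits 15, 8 and 4 set
    pcie_status = 0x0005  # example value: bits 0 and 2 set
    print(decode_errors(pci_status, PCI_STATUS_ERROR_BITS))
    print(decode_errors(pcie_status, PCIE_DEVICE_STATUS_ERROR_BITS))
    # Clear both views of the error, not just the one this driver cares about.
    print(hex(clear_mask(pci_status, PCI_STATUS_ERROR_BITS)))
    print(hex(clear_mask(pcie_status, PCIE_DEVICE_STATUS_ERROR_BITS)))
```

The point of computing a clear mask for both registers is the compatibility trap mentioned above: a handler that only knows one register leaves the other one set.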
For a non-posted request, the completer reports the error using the completion status. The reporting target is the requester and, optionally, the root complex. Usually it is the requester that decides how to handle the error. For posted requests, the requester does not expect any completion TLP to be returned from the completer (i.e., fire and forget), so the completer creates an error message and sends it to the root complex, and the root complex needs to handle the error.
From the software driver point of view, it is vital to look at what level of device support exists in the chipset for PCIE compatibility. This becomes a decision point when we introduce virtual functions off of a physical function. Depending on the OS, there could be different instances of the driver in kernel space, or a single driver with different contexts and scopes. When an error gets reported, perhaps every instance or context of the driver might detect it. Now who should handle it, and how? This is where the host VM comes into play. Any VM other than the host should report the error back to the host VM for handling!
I will end this installment here. I did not cover a lot of things: for example, error sources, error message types, advanced error reporting, DLLP and TLP error types, etc.
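As a postscript, the reporting rules above can be written down as a small Python sketch. It is a toy model of the classification and of the "forward to the host VM" policy, not an implementation of any real driver interface; every name in it is hypothetical.

```python
# Sketch: posted vs non-posted classification and who gets the error.
# A toy model only; all names here are hypothetical.
from enum import Enum


class Request(Enum):
    MEMORY_READ = "memory read"
    MEMORY_WRITE = "memory write"
    IO_READ = "I/O read"
    IO_WRITE = "I/O write"
    CONFIG_READ = "configuration read"
    CONFIG_WRITE = "configuration write"
    MESSAGE = "message"


# Posted requests: no completion TLP comes back (fire and forget).
POSTED = {Request.MEMORY_WRITE, Request.MESSAGE}


def is_posted(request):
    return request in POSTED


def error_report_path(request):
    """Return (mechanism, primary target) for an error on this request."""
    if is_posted(request):
        # No completion exists, so the completer sends an error message.
        return ("error message", "root complex")
    # Non-posted: the completion status carries the error to the requester.
    return ("completion status", "requester")


def vm_action(vm_name, host_vm="host"):
    """Only the host VM handles the error; every other VM forwards it."""
    return "handle" if vm_name == host_vm else f"forward to {host_vm}"


if __name__ == "__main__":
    for req in Request:
        print(req.value, "->", error_report_path(req))
    print(vm_action("guest1"), "/", vm_action("host"))
```

For non-posted requests the root complex can optionally be told as well; the sketch only returns the primary target to keep the table small.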
Posted on Friday, August 20, 2010 at 03:44PM by Prokash Sinha
