Saturday, August 31, 2013

Notes on PCB traceability

One of the little things in board design that a lot of hobbyists neglect (although the practice is fairly widespread in industry) is traceability. This pretty much means you should be able to do two things:
  • Given a physical PCB, find the exact version of the CAD files used for it (and, in a team environment, figure out who designed it)
  • Tell individual boards apart: each physical PCB should carry a unique marking so that repairs, etc. can be logged.
Most people sign their PCBs with a name and/or company logo in a corner, but don't go further than that. Once you have two or three versions of a board sitting around your lab and you don't know which ones are the most recent, it gets difficult to even know which bare board to assemble when you need an extra!

The first half of my standard traceability tag is normally on the front side of the board along one edge. (The board shown is the first prototype of my SNMP-managed 5V/12V DC power distribution unit, which will be described in more detail in a future post once I've debugged it a bit more.)

Traceability tag on PDU board
This tag consists of five distinct pieces of information:
  1. Logos / icons providing general info about the board. In this case there are three: open hardware, recyclable electronics, and lead-free. (Almost all of my boards are BSD licensed; all of them use SAC305 solder and are Pb-free and RoHS compliant.)
  2. A brief one-line summary of what the board does: "IP-Controlled 5V/12V DC PDU". This could include a product name if applicable.
  3. My name. This is obviously a matter of personal pride to some extent, but in a team environment it's handy for the firmware engineer to know who to ask if there's a question about layout, etc.
  4. The Subversion revision number of the layout file. KiCAD files are plain text so I can simply use svn:keywords to insert the revision number directly into the silkscreen layer. As long as I remember to commit before exporting Gerber files for fab, this tag will let me look up the exact layout revision for the board.
  5. A short-URL pointing to the directory in my public Google Code repository for this board.
The underside of the board has one last item: the serial number.

Serial number on XC6SLX9 dev board
I usually assemble the first prototype of a new board by itself, tinker with it for a while, perhaps do some rework, and then assemble the rest if things look good. Sometimes there are slight BOM changes, for example changing the speed grade of a part or the capacity of a memory device. Having a serial number on the board makes it easy to keep track of which boards have had which fixes applied.
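The revision tag from item 4 can also be read back by a script, which is handy when checking an exported file before sending it to fab. The sketch below assumes the layout file has the svn:keywords property set (for example with `svn propset svn:keywords "Rev"`) so that Subversion expands the keyword in place; the sample string is made up for illustration and is not taken from a real board file.

```python
import re

# Matches the expanded forms Subversion writes for the revision keyword,
# e.g. "$Rev: 123 $", "$Revision: 123 $" or "$LastChangedRevision: 123 $".
_REV_KEYWORD = re.compile(
    r"\$(?:Rev|Revision|LastChangedRevision):\s*(\d+)\s*\$"
)


def find_revision(text):
    """Return the first expanded revision number found in text, or None.

    An unexpanded keyword such as "$Rev$" yields None, which usually means
    the file was never committed with svn:keywords enabled.
    """
    match = _REV_KEYWORD.search(text)
    return int(match.group(1)) if match else None


# Hypothetical silkscreen text, for illustration only.
sample = "PDU layout $Rev: 123 $"
revision = find_revision(sample)
```

A check like this only confirms that the keyword was expanded; it cannot tell whether the working copy had uncommitted changes at export time.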

Tuesday, August 27, 2013

Laser IC decapsulation experiments

Laser decapsulation is commonly used by professional shops to rapidly remove material before finishing with a chemical etch. When I found out that one of my friends had purchased a laser cutting system, we decided to see how well it performed at decapping.

Infrared light is absorbed strongly by most organics as well as some other materials such as glass. Most metals, especially gold, reflect IR strongly and thus should not be significantly etched by it. Silicon is nearly transparent to IR. The hope was that this would make laser ablation highly selective for packaging material over the die, leadframe, and bond wires.

Unfortunately I don't have any in-process photos. We used a raster scan pattern at fairly low power on a CO2 laser with near-continuous duty cycle.

The first sample was a Xilinx XC9572XL CPLD in a 44-pin TQFP.

Laser-etched CPLD with die outline visible
If you look closely you can see the outline of the die and wire bonds beginning to appear. This probably has something to do with the thermal resistances of gold bonding wires vs silicon and the copper leadframe.

Two of the other three samples (other CPLDs) turned out pretty similar except the dies weren't visible because we didn't lase quite as long.
Laser-etched CPLD without die visible
I popped this one under my Olympus microscope to take a closer look.

Focal plane on top of package
Focal plane at bottom of cavity
Scan lines from the laser's raster-etch pattern were clearly visible. At first glance the laser was quite effective at removing material; however, higher magnification gave reason to believe the process was not as effective as I had hoped.
Raster lines in molding compound
Raster lines in molding compound
Most engineers are not aware that "plastic" IC packages are actually not made of plastic. (The curious reader may find the "epoxy" page on siliconpr0n.org a worthwhile read.)

Typical "plastic" IC molding compounds are actually composite materials made from glass spheres of varying sizes as filler in a black epoxy resin matrix. The epoxy blocks light from reaching the die and interfering with circuits through induced photocurrents and acts to bond the glass together. Unfortunately the epoxy has a thermal expansion coefficient significantly different from that of the die, so glass beads are added as a filler to counteract this effect. Glass is usually a significant percentage (80 or 90 percent) of the molding compound.

My hope was that the laser would vaporize the epoxy and glass cleanly without damaging the die or bond wires. It seems that the glass near the edge of the beam fused together, producing a mess which would be difficult or impossible to remove. This effect was even more pronounced in the first sample.

The edge of the die stood out strongly in this sample even though the die is still quite a bit below the surface. Perhaps the die (or the die-attach paddle under it) is a good thermal conductor and acted to heatsink the glass, causing it to melt rather than vaporize?
The first sample seen earlier in the article, showing the corner of the die
A closeup showed a melted, blasted mess of glass. About the only things able to easily remove this are mechanical abrasion or HF, both of which would probably destroy the die.
Fused glass particles
Fused glass particles

I then took a look at the last sample, a PIC18F4553. We had etched this one all the way down to the die just to see what would happen.
Exposed PIC18F4553 die
Edge of the die showing bond pads
Most bond wires were completely gone - it appeared that the glass had gotten so hot that it melted the wires even though they did not absorb the laser energy directly. The large reddish sphere at the center of the frame is what remains of a ball bond that did not completely vanish.

The surface of the die was also covered by fused glass. No fine structure at all was visible.

Looking at the overview photo, reddish spots were visible around the edge of the die and package. I decided to take a closer look in hopes of figuring out what was going on there.
Red glass on the edge of the hole
I was rather confused at first because there should have only been metal, glass, and plastic in that area - and none of these were red. The red areas had a glassy texture to them, suggesting that they were partly or mostly made of fused molding compound.

Some reading on stained glass provided the answer - cranberry glass. This is a colloid of gold nanoparticles suspended in glass, giving it color from scattering incoming light.

The normal process for making cranberry glass is to mix Au2O3 in with the raw materials before smelting them together. At high temperatures the oxide decomposes, leaving gold particles suspended in the glass. It appears that I've unintentionally found a second synthesis which avoids the oxidation step: flash vaporization of solid gold and glass followed by condensation of the vapor on a cold surface.

Saturday, August 17, 2013

SoC framework, part 2: layer 2/3 protocols

Introduction

This is the second post in a series on the SoC framework I'm developing for my research. I'm going to get into more interesting topics (such as my build/test framework and FPGA cluster) shortly, but to understand how all of the parts communicate it's necessary to understand the basics of the SoC interconnect.

I'm omitting some of the details of link-layer flow control and congestion handling for now, as they're not necessary to understand the higher-level concepts. If anyone really wants to know the dirty details, comment and I'll do a post on it at some point in the future.

As I mentioned briefly in part 1 of the series, my interconnect actually consists of two independent networks with the same topology. The RPC network is intended for control-plane transactions and supports function call/return semantics (request followed by response) as well as interrupts (one-way datagrams). The DMA network is meant for bulk data transfers between cores and memory devices.

Layer-2 header

The layer-2 header is the same for both networks:
31:24 | 23:16 | 15:8 | 7:0
Source address (bits 31:16) | Dest address (bits 15:0)

This is then followed by the layer-3 header for the protocol of interest. Which protocol is in use depends on the interface; the routers are optimized for one or the other. I may consider changing this in the future.
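As a minimal sketch, the layer-2 header can be packed into and unpacked from a single 32-bit word. It assumes the source address occupies the upper 16 bits and the destination address the lower 16 bits; the function names are made up for illustration.

```python
def pack_l2_header(src, dst):
    """Pack two 16-bit routing addresses into one 32-bit header word.

    Assumed layout: source in bits 31:16, destination in bits 15:0.
    """
    for name, value in (("src", src), ("dst", dst)):
        if not 0 <= value <= 0xFFFF:
            raise ValueError(f"{name} must fit in 16 bits")
    return (src << 16) | dst


def unpack_l2_header(word):
    """Split a 32-bit header word back into (src, dst)."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF
```

For example, a packet from 0x8001 to 0x8003 would carry the header word 0x80018003 under this assumption.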

DMA network

Packet format

31:24 | 23:0
Layer-2 header (full word)
Opcode | Payload length in words (only rightmost 10 bits implemented)
Physical memory address (full word)
Zero or more application-layer data words

Protocol description

The DMA network is meant for bulk data transfers and is normally memory mapped when used by a CPU.

It supports read and write transactions of an integer number of 32-bit words, up to 512 data words plus three header words. This size was chosen so that a DMA transfer could transport an entire Ethernet frame or typical NAND page in one packet.

Byte write enables are not supported; it is expected that a CPU core requiring this functionality will use read-modify-write semantics inside the L1 cache and then move words (or cache lines containing several words) over the DMA network.

The physical DMA address space is 48 bits: each of the 2^16 possible cores in the SoC has 32 bits of address space. If one core requires more than 4GB of address space it may respond to several consecutive DMA addresses. CPU cores are expected to translate the 48-bit physical addresses into 32 or 64 bit virtual addresses as required by their microarchitecture.
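The 48-bit address arithmetic can be illustrated with a short sketch. It assumes the 16-bit routing address of the core forms the upper bits and the 32-bit per-core address forms the lower bits; the post does not spell out the bit placement, and the function names are hypothetical.

```python
def make_phys_addr(core, offset):
    """Combine a 16-bit core address and a 32-bit offset into 48 bits."""
    if not 0 <= core <= 0xFFFF:
        raise ValueError("core address must fit in 16 bits")
    if not 0 <= offset <= 0xFFFFFFFF:
        raise ValueError("offset must fit in 32 bits")
    return (core << 32) | offset


def split_phys_addr(addr):
    """Split a 48-bit physical address into (core, offset)."""
    if not 0 <= addr < (1 << 48):
        raise ValueError("address must fit in 48 bits")
    return addr >> 32, addr & 0xFFFFFFFF
```

A core that needs more than 4GB would simply own several consecutive values of the upper 16 bits.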

Write transactions are unidirectional: a single packet with opcode set to "write request" is all that is required. The destination host may send an RPC interrupt back on success or failure of the write; however, this is not required by the layer 3 protocol. Specific application layer APIs may mandate write acknowledgements.

Read transactions are bidirectional: a "read request" packet with length set to the desired read size, and no data words, is sent. The response is a "read data" packet with the appropriate length and data fields. As with write transactions, failure interrupts are optional at layer 3 but typically required by application layer APIs.
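The write and read-request packets described above could be assembled roughly as follows. This is a sketch under stated assumptions: the numeric opcode values are invented (the post names the opcodes but not their encodings), and the layer-2 header is assumed to hold the source address in its upper 16 bits.

```python
# Hypothetical opcode encodings; only the opcode names come from the post.
OP_WRITE_REQUEST = 0
OP_READ_REQUEST = 1
OP_READ_DATA = 2

MAX_DATA_WORDS = 512  # limit stated in the protocol description


def _header_words(src, dst, opcode, length, address):
    """Build the three header words shared by all DMA packets."""
    if not (0 <= src <= 0xFFFF and 0 <= dst <= 0xFFFF):
        raise ValueError("routing addresses must fit in 16 bits")
    if not 0 <= length <= MAX_DATA_WORDS:
        raise ValueError("length must be between 0 and 512 words")
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("memory address must fit in 32 bits")
    return [(src << 16) | dst, (opcode << 24) | length, address]


def dma_write(src, dst, address, data):
    """Write request: header words followed by the data words."""
    data = list(data)
    if any(not 0 <= word <= 0xFFFFFFFF for word in data):
        raise ValueError("data words must fit in 32 bits")
    return _header_words(src, dst, OP_WRITE_REQUEST, len(data), address) + data


def dma_read_request(src, dst, address, length):
    """Read request: length is the desired read size; no data words follow."""
    return _header_words(src, dst, OP_READ_REQUEST, length, address)
```

A "read data" response would have the same shape as a write request, with the opcode changed and the requested words as payload.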

RPC network

Packet format

31:24 | 23:21 | 20:0
Layer-2 header (full word)
Callnum | Type | Application-layer data
Application-layer data (full word)
Application-layer data (full word)

Protocol description

The RPC network is meant for small, low-latency control transfers and is normally register mapped when used by a CPU.

It supports fixed-length packets of four words so as to fit easily into standard register-based calling conventions.

The "callnum" field uniquely identifies the specific request / interrupt being performed. The meaning of this field is up to the application-layer protocol.

The "type" field can be one of the following:
  • Function call request
    The source host is requesting the destination host to perform some action. A response is required.
  • Function return (success)
    The source host has completed the requested action successfully. The application-layer protocol may specify a return value.
  • Function return (fail)
    The source host attempted the requested operation but could not complete it. The application-layer protocol may specify an error code.
  • Function return (retry)
    The source host is busy with a long-running operation and cannot complete the requested operation now, but might be able to in the future. The destination host may re-send the request in the future or consider this to be a failure.
  • Interrupt
    Something interesting happened at the source host, and the destination host has previously requested to be notified when this happened.
  • Host prohibited
    Sent by a router to indicate that the destination host attempted to reach a host in violation of security policy. The source address of the packet is the prohibited address.
  • Host unreachable
    Sent by a router to indicate that the destination host attempted to reach a nonexistent address. The source address of the packet is the invalid address.
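The four-word RPC packet can be sketched as a pack/unpack pair. The numeric values of the type codes below are made up for illustration, since the post lists the types but not their encodings; the layer-2 header layout (source in the upper 16 bits) is likewise an assumption.

```python
from enum import IntEnum


class RpcType(IntEnum):
    """Packet types from the list above; the numeric codes are hypothetical."""
    CALL = 0
    RETURN_SUCCESS = 1
    RETURN_FAIL = 2
    RETURN_RETRY = 3
    INTERRUPT = 4
    HOST_PROHIBITED = 5
    HOST_UNREACH = 6


def pack_rpc(src, dst, callnum, rpc_type, d0=0, d1=0, d2=0):
    """Build the four words of an RPC packet.

    d0 shares a word with callnum and type, so it is limited to 21 bits;
    d1 and d2 are full 32-bit words.
    """
    if not (0 <= src <= 0xFFFF and 0 <= dst <= 0xFFFF):
        raise ValueError("routing addresses must fit in 16 bits")
    if not 0 <= callnum <= 0xFF:
        raise ValueError("callnum must fit in 8 bits")
    if not 0 <= d0 <= 0x1FFFFF:
        raise ValueError("d0 must fit in 21 bits")
    if not (0 <= d1 <= 0xFFFFFFFF and 0 <= d2 <= 0xFFFFFFFF):
        raise ValueError("d1 and d2 must fit in 32 bits")
    word1 = (callnum << 24) | (int(RpcType(rpc_type)) << 21) | d0
    return [(src << 16) | dst, word1, d1, d2]


def unpack_rpc(words):
    """Decode a four-word RPC packet into a dict of fields."""
    if len(words) != 4:
        raise ValueError("RPC packets are exactly four words long")
    header, word1, d1, d2 = words
    return {
        "src": (header >> 16) & 0xFFFF,
        "dst": header & 0xFFFF,
        "callnum": (word1 >> 24) & 0xFF,
        "type": RpcType((word1 >> 21) & 0x7),
        "data": (word1 & 0x1FFFFF, d1, d2),
    }
```

Because the packet is fixed at four words, a CPU could pass it in four registers without touching memory, which matches the register-mapped usage described above.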

Monday, August 12, 2013

SoC framework, part 1: NoC overview and layer 1 structure

Those of you who have read my older posts may remember that I am currently pursuing a PhD in computer science at RPI. My research focus is the intersection of computer architecture and security, blurring classical distinctions between components in hopes of solving open problems in security. I'd go into more detail but I have to keep some surprises for my published papers ;)

As part of my research I am developing an FPGA-based SoC to test my theories. Existing frameworks and buses, such as AXI and Wishbone, lacked the flexibility I required so I had to create my own.

The first step was to forgo the classic shared-bus or crossbar topology in favor of a packet-switched network-on-chip (NoC). In order to keep the routing simple I elected to use a quadtree topology, with 16-bit routing addresses, for the network. This maps well to a spatially distributed system and should permit scaling to very large SoCs (up to 65536 IP cores per SoC are theoretically possible, though FPGA gate counts limit feasible designs to far fewer).

Example quadtree (from http://www.eecs.berkeley.edu/)
For the remainder of this post series I will use a slightly modified form of CIDR notation, as used with IP subnetting, to describe NoC addresses. For example, "8000/14" is the subnet with routing prefix 1000 0000 0000 00, consisting of hexadecimal addresses 0x8000, 0x8001, 0x8002, and 0x8003. (Unlike IPv4 addressing, all addresses in the NoC are usable by hosts; there are no reserved broadcast addresses since all traffic is point to point.)
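To make the notation concrete, here is a small sketch that lists the addresses in a NoC subnet. The function name is made up for illustration; the arithmetic is the usual prefix masking applied to 16-bit addresses.

```python
def subnet_hosts(prefix, prefix_len):
    """List every 16-bit address in the subnet prefix/prefix_len."""
    if not 0 <= prefix_len <= 16:
        raise ValueError("prefix length must be between 0 and 16")
    if not 0 <= prefix <= 0xFFFF:
        raise ValueError("prefix must fit in 16 bits")
    host_bits = 16 - prefix_len
    # Clear the host bits so a non-aligned prefix still gives the subnet base.
    base = prefix & ~((1 << host_bits) - 1) & 0xFFFF
    return [base + i for i in range(1 << host_bits)]


hosts = subnet_hosts(0x8000, 14)
```

Here `hosts` holds the four addresses 0x8000 through 0x8003, all of which are usable since there is no broadcast address.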

Each router has four downstream ports and one upstream port. When a packet arrives at a router, the router checks if the packet is intended for its subnet; if so, the next two bits control which downstream port it is forwarded out of. If the packet belongs to another subnet, it is sent out the upstream port.

Example NoC routing topology
As an example, if the host at 0x8001 wanted to send a message to the host at 0x8003, it would first reach the router for the 0x8000/14 subnet. The router checks the prefix, determines it to be a match, and then reads address bits 1:0 to determine that the packet should go out port 2'b11.

If 0x8001 were instead communicating with 0x8005, the router would forward the message out the upstream port. The router at 0x8000/12 would check address bits 3:2, determine that the packet is destined for port 2'b01, and forward to the destination router, which would then use bits 1:0 as the selector and forward out port 2'b01 to the final destination.
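The forwarding decision in these two examples can be written as a few lines of code. This is a behavioral sketch, not the actual router RTL; the function name and the "up" marker are made up for illustration.

```python
UPSTREAM = "up"  # hypothetical marker for the upstream port


def route(router_prefix, prefix_len, dest):
    """Return the downstream port (0-3) for dest, or UPSTREAM.

    The router owns the subnet router_prefix/prefix_len. In a quadtree each
    level consumes two address bits, so prefix_len is even and at most 14.
    """
    if prefix_len not in range(0, 15, 2):
        raise ValueError("prefix length must be an even number from 0 to 14")
    host_bits = 16 - prefix_len
    # Compare routing prefixes: a mismatch means another subnet owns dest.
    if (dest >> host_bits) != (router_prefix >> host_bits):
        return UPSTREAM
    # The two bits just below the prefix select the downstream port.
    return (dest >> (host_bits - 2)) & 0b11


# The two walkthroughs from the text, hop by hop.
same_subnet = route(0x8000, 14, 0x8003)   # leaf router, local delivery
first_hop = route(0x8000, 14, 0x8005)     # leaf router, other subnet
second_hop = route(0x8000, 12, 0x8005)    # parent router, bits 3:2
third_hop = route(0x8004, 14, 0x8005)     # destination router, bits 1:0
```

Each router only needs its own prefix and prefix length to make this decision, so no routing tables are required.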

The actual network topology is slightly more complex than the diagram above implies, because my framework uses two independent networks, one for bulk data transfer and one for control-plane traffic. Thus, each line in the above diagram is actually four independent one-way links: two upstream and two downstream. Each link consists of a 32-bit data bus plus a few status bits. The actual protocol used will be described in the next post in this series.

Status update - August 2013

Well, it's been a long time since my last post and a lot has been going on. I'm still alive and hacking :)

I've been mostly working on my research but a lot of side projects have managed to find their way into the mix. I'll try to post on several of them over the next week or two before school starts.

Here's a quick tease of what I'll be posting on soon. These are all WIP projects, some closer to completion than others.
  • Splash - open source build system borrowing many ideas from Google Blaze
  • My new "raised floor" desktop FPGA cluster
  • The custom SoC framework that I'm building my thesis project on top of
  • The SNMP-managed DC power distribution unit feeding 5V and 12V power to all of my dev boards
  • A custom FPGA+ARM SoC based JTAG ICE system (in early planning at this time) bridging 8 or 16 JTAG master ports to gigabit Ethernet