| Details                        |                                                              |
|--------------------------------|--------------------------------------------------------------|
| Product Status                 | Obsolete                                                     |
| Number of LABs/CLBs            | 1024                                                         |
| Number of Logic Elements/Cells | 2432                                                         |
| Total RAM Bits                 | 32768                                                        |
| Number of I/O                  | 129                                                          |
| Number of Gates                | 28000                                                        |
| Voltage - Supply               | 3V ~ 3.6V                                                    |
| Mounting Type                  | Surface Mount                                                |
| Operating Temperature          | -40°C ~ 100°C (TJ)                                           |
| Package / Case                 | 160-BQFP Exposed Pad                                         |
| Supplier Device Package        | 160-PQFP (28x28)                                             |
| Purchase URL                   | https://www.e-xfl.com/product-detail/xilinx/xc4028xl-1hq160i |

# Input Thresholds

The input thresholds of 5 V devices can be globally configured for either TTL (1.2 V threshold) or CMOS (2.5 V threshold), just like XC2000 and XC3000 inputs. The two global adjustments of input threshold and output level are independent of each other. The XC4000XL family has an input threshold of 1.6 V, compatible with both 3.3 V CMOS and TTL levels.

#### Global Signal Access to Logic

There is additional access from global clocks to the F and G function generator inputs.
# Configuration Pin Pull-Up Resistors

During configuration, these pins have weak pull-up resistors. For the most popular configuration mode, Slave Serial, the mode pins can thus be left unconnected. A pull-down resistor value of $4.7~\mathrm{k}\Omega$ is recommended. The three mode inputs can be individually configured with or without weak pull-up or pull-down resistors after configuration.

The PROGRAM input pin has a permanent weak pull-up.

### Soft Start-up

Like the XC3000A, XC4000 Series devices have "Soft Start-up." When the configuration process is finished and the device starts up, the first activation of the outputs is automatically slew-rate limited. This feature avoids potential ground bounce when all outputs are turned on simultaneously. Immediately after start-up, the slew rate of the individual outputs is, as in the XC4000 family, determined by the individual configuration option.

# XC4000 and XC4000A Compatibility

Existing XC4000 bitstreams can be used to configure an XC4000E device. XC4000A bitstreams must be recompiled for use with the XC4000E due to improved routing resources, although the devices are pin-for-pin compatible.

# Additional Improvements in XC4000X Only

# Increased Routing

New interconnect in the XC4000X includes twenty-two additional vertical lines in each column of CLBs and twelve new horizontal lines in each row of CLBs. The twelve "Quad Lines" in each CLB row and column include optional repowering buffers for maximum speed. Additional high-performance routing near the IOBs enhances pin flexibility.

### Faster Input and Output

A fast, dedicated early clock sourced by global clock buffers is available for the IOBs. To ensure synchronization with the regular global clocks, a Fast Capture latch driven by the early clock is available.
The input data can be initially loaded into the Fast Capture latch with the early clock, then transferred to the input flip-flop or latch with the low-skew global clock. A programmable delay on the input can be used to avoid hold-time requirements. See "IOB Input Signals" on page 20 for more information.

# Latch Capability in CLBs

Storage elements in the XC4000X CLB can be configured as either flip-flops or latches. This capability makes the FPGA highly synthesis-compatible.

### IOB Output MUX From Output Clock

A multiplexer in the IOB allows the output clock to select either the output data or the IOB clock enable as the output to the pad. Thus, two different data signals can share a single output pad, effectively doubling the number of device outputs without requiring a larger, more expensive package. This multiplexer can also be configured as an AND-gate to implement a very fast pin-to-pin path. See "IOB Output Signals" on page 23 for more information.

# Additional Address Bits

Larger devices require more bits of configuration data. A daisy chain of several large XC4000X devices may require a PROM that cannot be addressed by the eighteen address bits supported in the XC4000E. The XC4000X Series therefore extends the addressing in Master Parallel configuration mode to 22 bits.

# **Detailed Functional Description**

XC4000 Series devices achieve high speed through advanced semiconductor technology and improved architecture. The XC4000E and XC4000X support system clock rates of up to 80 MHz and internal performance in excess of 150 MHz. Compared to older Xilinx FPGA families, XC4000 Series devices are more powerful. They offer on-chip edge-triggered and dual-port RAM, clock enables on I/O flip-flops, and wide-input decoders. They are more versatile in many applications, especially those involving RAM. Design cycles are faster due to a combination of increased routing resources and more sophisticated software.
# **Basic Building Blocks**

Xilinx user-programmable gate arrays include two major configurable elements: configurable logic blocks (CLBs) and input/output blocks (IOBs).

- CLBs provide the functional elements for constructing the user's logic.
- IOBs provide the interface between the package pins and internal signal lines.

Three other types of circuits are also available:

- 3-State buffers (TBUFs) driving horizontal longlines are associated with each CLB.
- Wide edge decoders are available around the periphery of each device.
- An on-chip oscillator is provided.

Programmable interconnect resources provide routing paths to connect the inputs and outputs of these configurable elements to the appropriate networks.

The functionality of each circuit block is customized during configuration by programming internal static memory cells. The values stored in these memory cells determine the logic functions and interconnections implemented in the FPGA. Each of these available circuits is described in this section.

# **Configurable Logic Blocks (CLBs)**

Configurable Logic Blocks implement most of the logic in an FPGA. The principal CLB elements are shown in Figure 1. Two 4-input function generators (F and G) offer unrestricted versatility. Most combinatorial logic functions need four or fewer inputs. However, a third function generator (H) is provided. The H function generator has three inputs. Either zero, one, or two of these inputs can be the outputs of F and G; the other input(s) are from outside the CLB. The CLB can, therefore, implement certain functions of up to nine variables, like parity check or expandable-identity comparison of two sets of four inputs.

Each CLB contains two storage elements that can be used to store the function generator outputs. However, the storage elements and function generators can also be used independently.
These storage elements can be configured as flip-flops in both XC4000E and XC4000X devices; in the XC4000X they can optionally be configured as latches. DIN can be used as a direct input to either of the two storage elements. H1 can drive the other through the H function generator. Function generator outputs can also drive two outputs independent of the storage element outputs. This versatility increases logic capacity and simplifies routing.

Thirteen CLB inputs and four CLB outputs provide access to the function generators and storage elements. These inputs and outputs connect to the programmable interconnect resources outside the block.

#### **Function Generators**

Four independent inputs are provided to each of two function generators (F1 - F4 and G1 - G4). These function generators, with outputs labeled F' and G', are each capable of implementing any arbitrarily defined Boolean function of four inputs. The function generators are implemented as memory look-up tables. The propagation delay is therefore independent of the function implemented.

A third function generator, labeled H', can implement any Boolean function of its three inputs. Two of these inputs can optionally be the F' and G' function generator outputs. Alternatively, one or both of these inputs can come from outside the CLB (H2, H0). The third input must come from outside the block (H1).

Signals from the function generators can exit the CLB on two outputs. F' or H' can be connected to the X output. G' or H' can be connected to the Y output.

A CLB can be used to implement any of the following functions:

- any function of up to four variables, plus any second function of up to four unrelated variables, plus any third function of up to three unrelated variables<sup>1</sup>
- any single function of five variables
- any function of four variables together with some functions of six variables
- some functions of up to nine variables.
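As an illustration of the nine-variable case, the following Python sketch models the three function generators as look-up tables and wires F' and G' into H to compute 9-input parity. It is a behavioral toy, not a description of the silicon or of any Xilinx tool; the helper names (`make_lut`, `read_lut`, `clb_parity9`) and the bit ordering of the table address are assumptions made for this example.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Return the truth table of an n-input Boolean function as a list of bits."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]


def read_lut(table, *bits):
    """Look up the output; the input bits form the address (first bit is the MSB)."""
    addr = 0
    for bit in bits:
        addr = (addr << 1) | (bit & 1)
    return table[addr]


def xor4(a, b, c, d):
    return a ^ b ^ c ^ d


def xor3(f, g, h1):
    return f ^ g ^ h1


# F and G: 4-input function generators, each holding a 16-entry table.
F_LUT = make_lut(xor4, 4)
G_LUT = make_lut(xor4, 4)
# H: 3-input function generator fed by F', G', and the external H1 input.
H_LUT = make_lut(xor3, 3)


def clb_parity9(f_inputs, g_inputs, h1):
    """Parity of nine variables using one (modeled) CLB."""
    f_out = read_lut(F_LUT, *f_inputs)
    g_out = read_lut(G_LUT, *g_inputs)
    return read_lut(H_LUT, f_out, g_out, h1)
```

Because each generator is a table lookup, swapping `xor4` for any other 4-input function changes only the table contents, not the lookup path, which mirrors the statement that propagation delay is independent of the function implemented.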
Implementing wide functions in a single block reduces both the number of blocks required and the delay in the signal path, achieving both increased capacity and speed.

The versatility of the CLB function generators significantly improves system speed. In addition, the design-software tools can deal with each function generator independently. This flexibility improves cell usage.

<sup>1.</sup> When three separate functions are generated, one of the function outputs must be captured in a flip-flop internal to the CLB. Only two unregistered function generator outputs are available from the CLB.

#### Set/Reset

An asynchronous storage element input (SR) can be configured as either set or reset. This configuration option determines the state in which each flip-flop becomes operational after configuration. It also determines the effect of a Global Set/Reset pulse during normal operation, and the effect of a pulse on the SR pin of the CLB. All three set/reset functions for any single flip-flop are controlled by the same configuration data bit.

The set/reset state can be independently specified for each flip-flop. This input can also be independently disabled for either flip-flop.

The set/reset state is specified by using the INIT attribute, or by placing the appropriate set or reset flip-flop library symbol.

SR is active High. It is not invertible within the CLB.

#### Global Set/Reset

A separate Global Set/Reset line (not shown in Figure 1) sets or clears each storage element during power-up, re-configuration, or when a dedicated Reset net is driven active. This global net (GSR) does not compete with other routing resources; it uses a dedicated distribution network.

Each flip-flop is configured as either globally set or reset in the same way that the local set/reset (SR) is specified. Therefore, if a flip-flop is set by SR, it is also set by GSR. Similarly, a reset flip-flop is reset by both SR and GSR.
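The rule that one configuration bit governs the power-up state, the SR response, and the GSR response can be sketched in a few lines of Python. This is a simplified behavioral model: the class name, the `"S"`/`"R"` encoding of the INIT choice, and the decision to evaluate the asynchronous inputs only when `clock()` is called are all assumptions for illustration. It also assumes that disabling SR leaves GSR active, which the text implies but does not state outright.

```python
class ClbFlipFlop:
    """Toy model of one CLB flip-flop with a single set/reset configuration bit."""

    def __init__(self, init="R", sr_enabled=True):
        if init not in ("S", "R"):
            raise ValueError("init must be 'S' (set) or 'R' (reset)")
        # One configuration choice decides all three behaviors below.
        self.init_value = 1 if init == "S" else 0
        self.sr_enabled = sr_enabled
        # State in which the flip-flop becomes operational after configuration.
        self.q = self.init_value

    def clock(self, d, ec=1, sr=0, gsr=0):
        """Apply one active clock edge together with the current SR/GSR levels."""
        if gsr or (sr and self.sr_enabled):
            # SR and GSR force the same configured value (both active High).
            self.q = self.init_value
        elif ec:
            # Clock enable asserted: capture the data input.
            self.q = d & 1
        return self.q
```

A flip-flop created with `init="S"` therefore powers up High and returns to High on either SR or GSR, while one created with `init="R"` does the same with Low.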
Figure 2: Schematic Symbols for Global Set/Reset

GSR can be driven from any user-programmable pin as a global reset input. To use this global net, place an input pad and input buffer in the schematic or HDL code, driving the GSR pin of the STARTUP symbol. (See Figure 2.) A specific pin location can be assigned to this input using a LOC attribute or property, just as with any other user-programmable pad. An inverter can optionally be inserted after the input buffer to invert the sense of the Global Set/Reset signal.

Alternatively, GSR can be driven from any internal node.

### Data Inputs and Outputs

The source of a storage element data input is programmable. It is driven by any of the functions F', G', and H', or by the Direct In (DIN) block input. The flip-flops or latches drive the XQ and YQ CLB outputs.

Two fast feed-through paths are available, as shown in Figure 1. A two-to-one multiplexer on each of the XQ and YQ outputs selects between a storage element output and any of the control inputs. This bypass is sometimes used by the automated router to repower internal signals.

#### **Control Signals**

Multiplexers in the CLB map the four control inputs (C1 - C4 in Figure 1) into the four internal control signals (H1, DIN/H2, SR/H0, and EC). Any of these inputs can drive any of the four internal control signals.

When the logic function is enabled, the four inputs are:

- EC: Enable Clock
- SR/H0: Asynchronous Set/Reset or H function generator Input 0
- DIN/H2: Direct In or H function generator Input 2
- H1: H function generator Input 1.

When the memory function is enabled, the four inputs are:

- EC: Enable Clock
- WE: Write Enable
- D0: Data Input to F and/or G function generator
- D1: Data input to G function generator (16x1 and 16x2 modes) or 5th Address bit (32x1 mode).

#### Using FPGA Flip-Flops and Latches

The abundance of flip-flops in the XC4000 Series invites pipelined designs.
This is a powerful way of increasing performance by breaking the function into smaller subfunctions and executing them in parallel, passing on the results through pipeline flip-flops. This method should be seriously considered wherever throughput is more important than latency.

To include a CLB flip-flop, place the appropriate library symbol. For example, FDCE is a D-type flip-flop with clock enable and asynchronous clear. The corresponding latch symbol (for the XC4000X only) is called LDCE.

In XC4000 Series devices, the flip-flops can be used as registers or shift registers without blocking the function generators from performing a different, perhaps unrelated task. This ability increases the functional capacity of the devices.

The CLB setup time is specified between the function generator inputs and the clock input K. Therefore, the specified CLB flip-flop setup time includes the delay through the function generator.

# Using Function Generators as RAM

Optional modes for each CLB make the memory look-up tables in the F' and G' function generators usable as an array of Read/Write memory cells. Available modes are level-sensitive (similar to the XC4000/A/H families), edge-triggered, and dual-port edge-triggered. Depending on the selected mode, a single CLB can be configured as either a 16x2, 32x1, or 16x1 bit array.

Supported CLB memory configurations and timing modes for single- and dual-port modes are shown in Table 3.

XC4000 Series devices are the first programmable logic devices with edge-triggered (synchronous) and dual-port RAM accessible to the user. Edge-triggered RAM simplifies system timing. Dual-port RAM doubles the effective throughput of FIFO applications. These features can be individually programmed in any XC4000 Series CLB.

#### Advantages of On-Chip and Edge-Triggered RAM

The on-chip RAM is extremely fast. The read access time is the same as the logic delay. The write access time is slightly slower.
Both access times are much faster than any off-chip solution, because they avoid I/O delays.

Edge-triggered RAM, also called synchronous RAM, is a feature never before available in a Field Programmable Gate Array. The simplicity of designing with edge-triggered RAM, and the markedly higher achievable performance, add up to a significant improvement over existing devices with on-chip RAM.

Three application notes are available from Xilinx that discuss edge-triggered RAM: "XC4000E Edge-Triggered and Dual-Port RAM Capability," "Implementing FIFOs in XC4000E RAM," and "Synchronous and Asynchronous FIFO Designs." All three application notes apply to both XC4000E and XC4000X RAM.

**Table 3: Supported RAM Modes**

|             | 16 x 1 | 16 x 2 | 32 x 1 | Edge-Triggered Timing | Level-Sensitive Timing |
|-------------|--------|--------|--------|-----------------------|------------------------|
| Single-Port | √      | √      | √      | √                     | √                      |
| Dual-Port   | √      |        |        | √                     |                        |

#### **RAM Configuration Options**

The function generators in any CLB can be configured as RAM arrays in the following sizes:

- Two 16x1 RAMs: two data inputs and two data outputs with identical or, if preferred, different addressing for each RAM
- One 32x1 RAM: one data input and one data output.

One F or G function generator can be configured as a 16x1 RAM while the other function generators are used to implement any function of up to 5 inputs.

Additionally, the XC4000 Series RAM may have either of two timing modes:

- Edge-Triggered (Synchronous): data written by the designated edge of the CLB clock. WE acts as a true clock enable.
- Level-Sensitive (Asynchronous): an external WE signal acts as the write strobe.

The selected timing mode applies to both function generators within a CLB when both are configured as RAM.
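The practical difference between the two timing modes can be seen in a small Python model of a 16x1 array. This is only a sketch under stated assumptions: the class names are invented for the example, and the level-sensitive model simply treats every input state seen while WE is High as a write, which is the latch-like behavior this data sheet attributes to that mode.

```python
class EdgeTriggeredRam16:
    """16x1 RAM written only at the active clock edge; WE is a clock enable."""

    def __init__(self):
        self.cells = [0] * 16

    def clock_edge(self, addr, d, we):
        # Inputs are sampled once, at the edge.
        if we:
            self.cells[addr & 0xF] = d & 1

    def read(self, addr):
        return self.cells[addr & 0xF]


class LevelSensitiveRam16:
    """16x1 RAM that is transparent for writes whenever WE is High."""

    def __init__(self):
        self.cells = [0] * 16

    def apply(self, addr, d, we):
        # Every address/data state presented while WE is High gets written.
        if we:
            self.cells[addr & 0xF] = d & 1

    def read(self, addr):
        return self.cells[addr & 0xF]


# Address lines move from 7 (0111) to 8 (1000) while WE stays High.
# Unequal bit delays briefly expose 15 (1111) on the address inputs.
trace = [(7, 1, 1), (15, 1, 1), (8, 1, 1)]

level = LevelSensitiveRam16()
for addr, d, we in trace:
    level.apply(addr, d, we)

edge = EdgeTriggeredRam16()
edge.clock_edge(7, 1, 1)   # only the state present at the clock edge matters
edge.clock_edge(8, 1, 1)
```

After this sequence the level-sensitive model holds a spurious 1 at address 15, while the edge-triggered model has written only addresses 7 and 8.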
The number of read ports is also programmable:

- Single Port: each function generator has a common read and write port
- Dual Port: both function generators are configured together as a single 16x1 dual-port RAM with one write port and two read ports. Simultaneous read and write operations to the same or different addresses are supported.

RAM configuration options are selected by placing the appropriate library symbol.

# **Choosing a RAM Configuration Mode**

The appropriate choice of RAM mode for a given design should be based on timing and resource requirements, desired functionality, and the simplicity of the design process. Recommended usage is shown in Table 4.

The difference between level-sensitive, edge-triggered, and dual-port RAM is only in the write operation. Read operation and timing are identical for all modes of operation.

**Table 4: RAM Mode Selection**

|                         | Level-Sensitive | Edge-Triggered | Dual-Port Edge-Triggered |
|-------------------------|-----------------|----------------|--------------------------|
| Use for New Designs?    | No              | Yes            | Yes                      |
| Size (16x1, Registered) | 1/2 CLB         | 1/2 CLB        | 1 CLB                    |
| Simultaneous Read/Write | No              | No             | Yes                      |
| Relative Performance    | X               | 2X             | 2X (4X effective)        |

#### **RAM Inputs and Outputs**

The F1-F4 and G1-G4 inputs to the function generators act as address lines, selecting a particular memory cell in each look-up table.

The functionality of the CLB control signals changes when the function generators are configured as RAM. The DIN/H2, H1, and SR/H0 lines become the two data inputs (D0, D1) and the Write Enable (WE) input for the 16x2 memory. When the 32x1 configuration is selected, D1 acts as the fifth address bit and D0 is the data input.

The contents of the memory cell(s) being addressed are available at the F' and G' function-generator outputs. They can exit the CLB through its X and Y outputs, or can be captured in the CLB flip-flop(s).
Configuring the CLB function generators as Read/Write memory does not affect the functionality of the other por-

#### **Dual-Port Edge-Triggered Mode**

In dual-port mode, both the F and G function generators are used to create a single 16x1 RAM array with one write port and two read ports. The resulting RAM array can be read and written simultaneously at two independent addresses. Simultaneous read and write operations at the same address are also supported.

Dual-port mode always has edge-triggered write timing, as shown in Figure 3.

Figure 6 shows a simple model of an XC4000 Series CLB configured as dual-port RAM. One address port, labeled A[3:0], supplies both the read and write address for the F function generator. This function generator behaves the same as a 16x1 single-port edge-triggered RAM array. The RAM output, Single Port Out (SPO), appears at the F function generator output. SPO, therefore, reflects the data at address A[3:0].

The other address port, labeled DPRA[3:0] for Dual Port Read Address, supplies the read address for the G function generator. The write address for the G function generator, however, comes from the address A[3:0]. The output from this 16x1 RAM array, Dual Port Out (DPO), appears at the G function generator output. DPO, therefore, reflects the data at address DPRA[3:0].

Therefore, by using A[3:0] for the write address and DPRA[3:0] for the read address, and reading only the DPO output, a FIFO that can read and write simultaneously is easily generated. Simultaneous access doubles the effective throughput of the FIFO.

The relationships between CLB pins and RAM inputs and outputs for dual-port, edge-triggered mode are shown in Table 6. See Figure 7 on page 16 for a block diagram of a CLB configured in this mode.
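The dual-port arrangement described above can be sketched as a Python behavioral model: both tables are written at address A, SPO reads the F table at A, and DPO reads the G table at DPRA. The FIFO wrapper on top is a hypothetical illustration of the read/write-pointer scheme, with invented class and method names; it is not taken from the Xilinx application notes.

```python
class DualPortRam16x1:
    """Toy model: F and G tables written together, read independently."""

    def __init__(self):
        self.f = [0] * 16  # F function generator contents
        self.g = [0] * 16  # G function generator contents

    def clock_edge(self, a, d, we):
        """Edge-triggered write: both tables take the data at address A."""
        if we:
            self.f[a & 0xF] = d & 1
            self.g[a & 0xF] = d & 1

    def spo(self, a):
        """Single Port Out: F table read at A[3:0]."""
        return self.f[a & 0xF]

    def dpo(self, dpra):
        """Dual Port Out: G table read at DPRA[3:0]."""
        return self.g[dpra & 0xF]


class Fifo16:
    """1-bit-wide, 16-deep FIFO: A is the write pointer, DPRA the read pointer."""

    def __init__(self):
        self.ram = DualPortRam16x1()
        self.wr = 0
        self.rd = 0
        self.count = 0

    def cycle(self, push=None, pop=False):
        """One clock cycle; a push and a pop may happen in the same cycle."""
        out = None
        if pop and self.count > 0:
            out = self.ram.dpo(self.rd)          # read through the DPO port
            self.rd = (self.rd + 1) % 16
            self.count -= 1
        if push is not None and self.count < 16:
            self.ram.clock_edge(self.wr, push, 1)  # write through the A port
            self.wr = (self.wr + 1) % 16
            self.count += 1
        return out
```

Calling `cycle(push=1, pop=True)` on a non-empty FIFO returns the oldest bit and stores the new one in the same step, which is the simultaneous access that the text credits with doubling effective throughput.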
Figure 6: XC4000 Series Dual-Port RAM, Simple Model

Table 6: Dual-Port Edge-Triggered RAM Signals

| RAM Signal | CLB Pin | Function                                       |
|------------|---------|------------------------------------------------|
| D          | D0      | Data In                                        |
| A[3:0]     | F1-F4   | Read Address for F, Write Address for F and G  |
| DPRA[3:0]  | G1-G4   | Read Address for G                             |
| WE         | WE      | Write Enable                                   |
| WCLK       | K       | Clock                                          |
| SPO        | F'      | Single Port Out (addressed by A[3:0])          |
| DPO        | G'      | Dual Port Out (addressed by DPRA[3:0])         |

**Note:** The pulse following the active edge of WCLK ($T_{WPS}$ in Figure 3) must be less than one millisecond wide. For most applications, this requirement is not overly restrictive; however, it must not be forgotten. Stopping WCLK at this point in the write cycle could result in excessive current and even damage to the larger devices if many CLBs are configured as edge-triggered RAM.

#### Single-Port Level-Sensitive Timing Mode

**Note:** Edge-triggered mode is recommended for all new designs. Level-sensitive mode, also called asynchronous mode, is still supported for XC4000 Series backward-compatibility with the XC4000 family.

Level-sensitive RAM timing is simple in concept but can be complicated in execution. Data and address signals are presented, then a positive pulse on the write enable pin (WE) performs a write into the RAM at the designated address. As indicated by the "level-sensitive" label, this RAM acts like a latch. During the WE High pulse, changing the data lines results in new data written to the old address. Changing the address lines while WE is High results in spurious data written to the new address, and possibly at other addresses as well, as the address lines inevitably do not all change simultaneously.

The user must generate a carefully timed WE signal.
The delay on the WE signal and the address lines must be carefully verified to ensure that WE does not become active until after the address lines have settled, and that WE goes inactive before the address lines change again. The data must be stable before and after the falling edge of WE.

In practical terms, WE is usually generated by a 2X clock. If a 2X clock is not available, the falling edge of the system clock can be used. However, there are inherent risks in this approach, since the WE pulse must be guaranteed inactive before the next rising edge of the system clock. Several older application notes are available from Xilinx that discuss the design of level-sensitive RAMs.

However, the edge-triggered RAM available in the XC4000 Series is superior to level-sensitive RAM for almost every application.

Figure 15: Simplified Block Diagram of XC4000E IOB

Figure 16: Simplified Block Diagram of XC4000X IOB (shaded areas indicate differences from XC4000E)

Table 8: Supported Sources for XC4000 Series Device Inputs

| Source                                           | XC4000E/EX Series Inputs: 5 V, TTL | XC4000E/EX Series Inputs: 5 V, CMOS | XC4000XL Series Inputs: 3.3 V CMOS |
|--------------------------------------------------|------------------------------------|-------------------------------------|------------------------------------|
| Any device, Vcc = 3.3 V, CMOS outputs            | √                                  | Unreliable Data                     | √                                  |
| XC4000 Series, Vcc = 5 V, TTL outputs            | √                                  | Unreliable Data                     | √                                  |
| Any device, Vcc = 5 V, TTL outputs (Voh ≤ 3.7 V) | √                                  | Unreliable Data                     | √                                  |
| Any device, Vcc = 5 V, CMOS outputs              | √                                  | √                                   | √                                  |

#### XC4000XL 5-Volt Tolerant I/Os

The I/Os on the XC4000XL are fully 5-volt tolerant even though the $V_{\rm CC}$ is 3.3 volts. This allows 5 V signals to directly connect to the XC4000XL inputs without damage, as shown in Table 8. In addition, the 3.3 volt $V_{\rm CC}$ can be applied before or after 5 volt signals are applied to the I/Os. This makes the XC4000XL immune to power supply sequencing problems.
#### **Registered Inputs**

The I1 and I2 signals that exit the block can each carry either the direct or registered input signal.

The input and output storage elements in each IOB have a common clock enable input, which, through configuration, can be activated individually for the input or output flip-flop, or both. This clock enable operates exactly like the EC pin on the XC4000 Series CLB. It cannot be inverted within the IOB.

The storage element behavior is shown in Table 9.

Table 9: Input Register Functionality (active rising edge is shown)

| Mode            | Clock | Clock Enable | D | Q  |
|-----------------|-------|--------------|---|----|
| Power-Up or GSR | X     | X            | X | SR |
| Flip-Flop       | ↑     | 1\*          | D | D  |
|                 | 0     | X            | X | Q  |
| Latch           | 1     | 1\*          | X | Q  |
|                 | 0     | 1\*          | D | D  |
| Both            | X     | 0            | X | Q  |

Legend:

- X: Don't care
- ↑: Rising edge
- SR: Set or Reset value. Reset is default.
- 0\*: Input is Low or unconnected (default value)
- 1\*: Input is High or unconnected (default value)

#### **Optional Delay Guarantees Zero Hold Time**

The data input to the register can optionally be delayed by several nanoseconds. With the delay enabled, the setup time of the input flip-flop is increased so that normal clock routing does not result in a positive hold-time requirement. A positive hold time requirement can lead to unreliable, temperature- or processing-dependent operation.

The input flip-flop setup time is defined between the data measured at the device I/O pin and the clock input at the IOB (not at the clock pin). Any routing delay from the device clock pin to the clock input of the IOB must, therefore, be subtracted from this setup time to arrive at the real setup time requirement relative to the device pins. A short specified setup time might, therefore, result in a negative setup time at the device pins, i.e., a positive hold-time requirement.

When a delay is inserted on the data line, more clock delay can be tolerated without causing a positive hold-time requirement.
Sufficient delay eliminates the possibility of a data hold-time requirement at the external pin. The maximum delay is therefore inserted as the default.

The XC4000E IOB has a one-tap delay element: either the delay is inserted (default), or it is not. The delay guarantees a zero hold time with respect to clocks routed through any of the XC4000E global clock buffers. (See "Global Nets and Buffers (XC4000E only)" on page 35 for a description of the global clock buffers in the XC4000E.) For a shorter input register setup time, with non-zero hold, attach a NODELAY attribute or property to the flip-flop.

The XC4000X IOB has a two-tap delay element, with choices of a full delay, a partial delay, or no delay. The attributes or properties used to select the desired delay are shown in Table 10. The choices are no added attribute, MEDDELAY, and NODELAY. The default setting, with no added attribute, ensures no hold time with respect to any of the XC4000X clock buffers, including the Global Low-Skew buffers. MEDDELAY ensures no hold time with respect to the Global Early buffers. Inputs with NODELAY may have a positive hold time with respect to all clock buffers. For a description of each of these buffers, see "Global Nets and Buffers (XC4000X only)" on page 37.

Table 10: XC4000X IOB Input Delay Element

| Value                                    | When to Use                                                            |
|------------------------------------------|------------------------------------------------------------------------|
| full delay (default, no attribute added) | Zero Hold with respect to Global Low-Skew Buffer, Global Early Buffer  |
| MEDDELAY                                 | Zero Hold with respect to Global Early Buffer                          |
| NODELAY                                  | Short Setup, positive Hold time                                        |

Any XC4000 Series 5-Volt device with its outputs configured in TTL mode can drive the inputs of any typical 3.3-Volt device. (For a detailed discussion of how to interface between 5 V and 3.3 V devices, see the 3V Products section of *The Programmable Logic Data Book*.)

Supported destinations for XC4000 Series device outputs are shown in Table 12.
An output can be configured as open-drain (open-collector) by placing an OBUFT symbol in a schematic or HDL code, then tying the 3-state pin (T) to the output signal, and the input pin (I) to Ground. (See Figure 18.)

Table 12: Supported Destinations for XC4000 Series Outputs

| Destination                                             | XC4000 Series Outputs: 3.3 V, CMOS | XC4000 Series Outputs: 5 V, TTL | XC4000 Series Outputs: 5 V, CMOS |
|---------------------------------------------------------|------------------------------------|---------------------------------|----------------------------------|
| Any typical device, Vcc = 3.3 V, CMOS-threshold inputs  | √                                  | √                               | some<sup>1</sup>                 |
| Any device, Vcc = 5 V, TTL-threshold inputs             | √                                  | √                               | √                                |
| Any device, Vcc = 5 V, CMOS-threshold inputs            | Unreliable Data                    | Unreliable Data                 | √                                |

1. Only if destination device has 5-V tolerant inputs

Figure 18: Open-Drain Output

#### **Output Slew Rate**

The slew rate of each output buffer is, by default, reduced, to minimize power bus transients when switching non-critical signals. For critical signals, attach a FAST attribute or property to the output buffer or flip-flop.

For XC4000E devices, maximum total capacitive load for simultaneous fast mode switching in the same direction is 200 pF for all package pins between each Power/Ground pin pair. For XC4000X devices, additional internal Power/Ground pin pairs are connected to special Power and Ground planes within the packages, to reduce ground bounce. Therefore, the maximum total capacitive load is 300 pF between each external Power/Ground pin pair. Maximum loading may vary for the low-voltage devices.

For slew-rate limited outputs this total is two times larger for each device type: 400 pF for XC4000E devices and 600 pF for XC4000X devices. This maximum capacitive load should not be exceeded, as it can result in ground bounce of greater than 1.5 V amplitude and more than 5 ns duration. This level of ground bounce may cause undesired transient behavior on an output, or in the internal logic.
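The loading limits quoted above lend themselves to a quick budget check. The Python sketch below sums the capacitive loads of the outputs that sit between one Power/Ground pin pair and compares the total against the applicable limit. The function name, the `"fast"`/`"slew_limited"` mode labels, and the example load values are assumptions for illustration. The sketch also assumes every output in the group uses the same slew mode, since the text does not say how to budget a mixed group.

```python
# Maximum total load (pF) between each Power/Ground pin pair, as quoted above.
LIMITS_PF = {
    ("XC4000E", "fast"): 200,
    ("XC4000E", "slew_limited"): 400,
    ("XC4000X", "fast"): 300,
    ("XC4000X", "slew_limited"): 600,
}


def load_within_limit(family, mode, loads_pf):
    """Return True if the summed load of simultaneously switching outputs is allowed.

    loads_pf: capacitive load in pF on each output between one Power/Ground pair,
    all switching in the same direction at the same time.
    """
    return sum(loads_pf) <= LIMITS_PF[(family, mode)]


# Hypothetical group of outputs, each driving 50 pF.
four_fast_e = load_within_limit("XC4000E", "fast", [50] * 4)    # 200 pF total
five_fast_e = load_within_limit("XC4000E", "fast", [50] * 5)    # 250 pF total
five_fast_x = load_within_limit("XC4000X", "fast", [50] * 5)    # 250 pF total
```

In this example the fifth 50 pF fast output exceeds the XC4000E budget but still fits the XC4000X budget; leaving the outputs slew-rate limited would double the allowance in either family.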
This loading restriction is common to all high-speed digital ICs, and is not particular to Xilinx or the XC4000 Series.

XC4000 Series devices have a feature called "Soft Start-up," designed to reduce ground bounce when all outputs are turned on simultaneously at the end of configuration. When the configuration process is finished and the device starts up, the first activation of the outputs is automatically slew-rate limited. Immediately following the initial activation of the I/O, the slew rate of the individual outputs is determined by the individual configuration option for each IOB.

#### **Global Three-State**

A separate Global 3-State line (not shown in Figure 15 or Figure 16) forces all FPGA outputs to the high-impedance state, unless boundary scan is enabled and is executing an EXTEST instruction. This global net (GTS) does not compete with other routing resources; it uses a dedicated distribution network.

GTS can be driven from any user-programmable pin as a global 3-state input. To use this global net, place an input pad and input buffer in the schematic or HDL code, driving the GTS pin of the STARTUP symbol. A specific pin location can be assigned to this input using a LOC attribute or property, just as with any other user-programmable pad. An inverter can optionally be inserted after the input buffer to invert the sense of the Global 3-State signal. Using GTS is similar to GSR. See Figure 2 on page 11 for details.

Alternatively, GTS can be driven from any internal node.

# Output Multiplexer/2-Input Function Generator (XC4000X only)

As shown in Figure 16 on page 21, the output path in the XC4000X IOB contains an additional multiplexer not available in the XC4000E IOB. The multiplexer can also be configured as a 2-input function generator, implementing a pass-gate, AND-gate, OR-gate, or XOR-gate, with 0, 1, or 2 inverted inputs. The logic used to implement these functions is shown in the upper gray area of Figure 16.
When configured as a multiplexer, this feature allows two output signals to time-share the same output pad, effectively doubling the number of device outputs without requiring a larger, more expensive package.

When the MUX is configured as a 2-input function generator, logic can be implemented within the IOB itself. Combined with a Global Early buffer, this arrangement allows very high-speed gating of a single signal. For example, a wide decoder can be implemented in CLBs, and its output gated with a Read or Write Strobe driven by a BUFGE buffer, as shown in Figure 19. The critical-path pin-to-pin delay of this circuit is less than 6 nanoseconds.

As shown in Figure 16, the IOB input pins Out, Output Clock, and Clock Enable have different delays and different flexibilities regarding polarity. Additionally, Output Clock sources are more limited than the other inputs. Therefore, the Xilinx software does not move logic into the IOB function generators unless explicitly directed to do so.

The user can specify that the IOB function generator be used, by placing special library symbols beginning with the letter "O." For example, a 2-input AND-gate in the IOB function generator is called OAND2. Use the symbol input pin labelled "F" for the signal on the critical path. This signal is placed on the OK pin, the IOB input with the shortest delay to the function generator. Two examples are shown in Figure 20.

Figure 19: Fast Pin-to-Pin Path in XC4000X

Figure 20: AND & MUX Symbols in XC4000X IOB

# Other IOB Options

There are a number of other programmable options in the XC4000 Series IOB.

# Pull-up and Pull-down Resistors

Programmable pull-up and pull-down resistors are useful for tying unused pins to Vcc or Ground to minimize power consumption and reduce noise sensitivity. The configurable pull-up resistor is a p-channel transistor that pulls to Vcc. The configurable pull-down resistor is an n-channel transistor that pulls to Ground.
The value of these resistors is $50~\mathrm{k}\Omega$ to $100~\mathrm{k}\Omega$. This high value makes them unsuitable as wired-AND pull-up resistors.

The pull-up resistors for most user-programmable IOBs are active during the configuration process. See Table 22 on page 58 for a list of pins with pull-ups active before and during configuration.

After configuration, voltage levels of unused pads, bonded or un-bonded, must be valid logic levels, to reduce noise sensitivity and avoid excess current. Therefore, by default, unused pads are configured with the internal pull-up resistor active. Alternatively, they can be individually configured with the pull-down resistor, or as a driven output, or to be driven by an external source. To activate the internal pull-up, attach the PULLUP library component to the net attached to the pad. To activate the internal pull-down, attach the PULLDOWN library component to the net attached to the pad.

### Independent Clocks

Separate clock signals are provided for the input and output flip-flops. The clock can be independently inverted for each flip-flop within the IOB, generating either falling-edge or rising-edge triggered flip-flops. The clock inputs for each IOB are independent, except that in the XC4000X, the Fast Capture latch shares an IOB input with the output clock pin.

# Early Clock for IOBs (XC4000X only)

Special early clocks are available for IOBs. These clocks are sourced by the same sources as the Global Low-Skew buffers, but are separately buffered. They have fewer loads and therefore less delay. The early clock can drive either the IOB output clock or the IOB input clock, or both. The early clock allows fast capture of input data, and fast clock-to-output on output data. The Global Early buffers that drive these clocks are described in "Global Nets and Buffers (XC4000X only)" on page 37.
#### **Global Set/Reset** As with the CLB registers, the Global Set/Reset signal (GSR) can be used to set or clear the input and output registers, depending on the value of the INIT attribute or property. The two flip-flops can be individually configured to set or clear on reset and after configuration. Other than the global GSR net, no user-controlled set/reset signal is available to the I/O flip-flops. The choice of set or clear applies to both the initial state of the flip-flop and the response to the Global Set/Reset pulse. See "Global Set/Reset" on page 11 for a description of how to use GSR. #### **JTAG Support** Embedded logic attached to the IOBs contains test structures compatible with IEEE Standard 1149.1 for boundary scan testing, permitting easy chip and board-level testing. More information is provided in "Boundary Scan" on page 42. #### **Three-State Buffers** A pair of 3-state buffers is associated with each CLB in the array. (See Figure 27 on page 30.) These 3-state buffers can be used to drive signals onto the nearest horizontal longlines above and below the CLB. They can therefore be used to implement multiplexed or bidirectional buses on the horizontal longlines, saving logic resources. Programmable pull-up resistors attached to these longlines help to implement a wide wired-AND function. The buffer enable is an active-High 3-state (i.e. an active-Low enable), as shown in Table 13. Another 3-state buffer with similar access is located near each I/O block along the right and left edges of the array. (See Figure 33 on page 34.) The horizontal longlines driven by the 3-state buffers have a weak keeper at each end. This circuit prevents undefined floating levels. However, it is overridden by any driver, even a pull-up resistor. Special longlines running along the perimeter of the array can be used to wire-AND signals coming from nearby IOBs or from internal longlines. 
These longlines form the wide edge decoders discussed in "Wide Edge Decoders" on page 27.

# Three-State Buffer Modes

The 3-state buffers can be configured in three modes:

- Standard 3-state buffer
- Wired-AND with input on the I pin
- Wired OR-AND

#### Standard 3-State Buffer

All three pins are used. Place the library element BUFT. Connect the input to the I pin and the output to the O pin. The T pin is an active-High 3-state (i.e. an active-Low enable). Tie the T pin to Ground to implement a standard buffer.

#### Wired-AND with Input on the I Pin

The buffer can be used as a Wired-AND. Use the WAND1 library symbol, which is essentially an open-drain buffer. WAND4, WAND8, and WAND16 are also available. See the *XACT Libraries Guide* for further information.

The T pin is internally tied to the I pin. Connect the input to the I pin and the output to the O pin. Connect the outputs of all the WAND1s together and attach a PULLUP symbol.

#### **Wired OR-AND**

The buffer can be configured as a Wired OR-AND. A High level on either input turns off the output. Use the WOR2AND library symbol, which is essentially an open-drain 2-input OR gate. The two input pins are functionally equivalent. Attach the two inputs to the I0 and I1 pins and tie the output to the O pin. Tie the outputs of all the WOR2ANDs together and attach a PULLUP symbol.

# Three-State Buffer Examples

Figure 21 shows how to use the 3-state buffers to implement a wired-AND function. When all the buffer inputs are High, the pull-up resistor(s) provide the High output.

Figure 22 shows how to use the 3-state buffers to implement a multiplexer. The selection is accomplished by the buffer 3-state signal.

Pay particular attention to the polarity of the T pin when using these buffers in a design. Active-High 3-state (T) is identical to an active-Low output enable, as shown in Table 13.
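As a rough behavioral sketch of the buffer modes and T-pin polarity described in this section, the following model treats each buffer output as 0, 1, or high impedance. The function names mirror the library symbols for readability only. `resolve`, `wired_and`, and the `Z` marker are hypothetical helpers, and the model assumes a PULLUP is attached to the shared longline.

```python
Z = "Z"  # high-impedance marker used only in this sketch


def buft(i, t):
    """Standard 3-state buffer: T High gives high impedance, T Low passes I."""
    return Z if t else i


def wand1(i):
    """Open-drain buffer: T is tied to I, so only a Low input drives the line."""
    return buft(i, t=i)


def wor2and(i0, i1):
    """Open-drain 2-input OR: a High on either input turns the output off."""
    return Z if (i0 or i1) else 0


def resolve(drivers):
    """Resolve a shared longline that has a pull-up attached (assumption)."""
    active = {d for d in drivers if d != Z}
    if not active:
        return 1  # no driver enabled: the pull-up supplies the High level
    if len(active) > 1:
        raise ValueError("bus contention: enabled drivers disagree")
    return active.pop()


def wired_and(inputs):
    """Wired-AND of several WAND1 buffers sharing one longline."""
    return resolve([wand1(i) for i in inputs])
```

A two-way multiplexer in the style of Figure 22 would be `resolve([buft(a, t=sel), buft(b, t=1 - sel)])`, which selects `a` when `sel` is 0 because a Low T enables the buffer.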
**Table 13: Three-State Buffer Functionality**

| IN | T | OUT |
|----|---|-----|
| X | 1 | Z |
| IN | 0 | IN |

Figure 21: Open-Drain Buffers Implement a Wired-AND Function

The oscillator output is optionally available after configuration. Any two of four resynchronized taps of a built-in divider are also available. These taps are at the fourth, ninth, fourteenth and nineteenth bits of the divider. Therefore, if the primary oscillator output is running at the nominal 8 MHz, the user has access to an 8 MHz clock, plus any two of 500 kHz, 16 kHz, 490 Hz and 15 Hz (approximately 8 MHz divided by $2^{4}$, $2^{9}$, $2^{14}$ and $2^{19}$; up to 10% lower for low-voltage devices). These frequencies can vary by as much as -50% or +25%.

These signals can be accessed by placing the OSC4 library element in a schematic or in HDL code (see Figure 24).

The oscillator is automatically disabled after configuration if the OSC4 symbol is not used in the design.

# **Programmable Interconnect**

All internal connections are composed of metal segments with programmable switching points and switching matrices to implement the desired routing. A structured, hierarchical matrix of routing resources is provided to achieve efficient automated routing.

The XC4000E and XC4000X share a basic interconnect structure. XC4000X devices, however, have additional routing not available in the XC4000E. The extra routing resources allow high utilization in high-capacity devices. All XC4000X-specific routing resources are clearly identified throughout this section. Any resources not identified as XC4000X-specific are present in all XC4000 Series devices.

This section describes the varied routing resources available in XC4000 Series devices. The implementation software automatically assigns the appropriate resources based on the density and timing requirements of the design.

# **Interconnect Overview**

There are several types of interconnect.

- CLB routing is associated with each row and column of the CLB array.
- IOB routing forms a ring (called a VersaRing) around the outside of the CLB array. It connects the I/O with the internal logic blocks.
- Global routing consists of dedicated networks primarily designed to distribute clocks throughout the device with minimum delay and skew. Global routing can also be used for other high-fanout signals.

Five interconnect types are distinguished by the relative length of their segments: single-length lines, double-length lines, quad and octal lines (XC4000X only), and longlines. In the XC4000X, direct connects allow fast data flow between adjacent CLBs, and between IOBs and CLBs. Extra routing is included in the IOB pad ring. The XC4000X also includes a ring of octal interconnect lines near the IOBs to improve pin-swapping and routing to locked pins.

XC4000E/X devices include two types of global buffers. These global buffers have different properties, and are intended for different purposes. They are discussed in detail later in this section.

# **CLB Routing Connections**

A high-level diagram of the routing resources associated with one CLB is shown in Figure 25. The shaded arrows represent routing present only in XC4000X devices.

Table 14 shows how much routing of each type is available in XC4000E and XC4000X CLB arrays. Clearly, very large designs, or designs with a great deal of interconnect, will route more easily in the XC4000X. Smaller XC4000E designs, typically requiring significantly less interconnect, do not require the additional routing.

Figure 27 on page 30 is a detailed diagram of both the XC4000E and the XC4000X CLB, with associated routing. The shaded square is the programmable switch matrix, present in both the XC4000E and the XC4000X. The L-shaped shaded area is present only in XC4000X devices. As shown in the figure, the XC4000X block is essentially an XC4000E block with additional routing.

CLB inputs and outputs are distributed on all four sides, providing maximum routing flexibility.
In general, the entire architecture is symmetrical and regular. It is well suited to established placement and routing algorithms. Inputs, outputs, and function generators can freely swap positions within a CLB to avoid routing congestion during the placement and routing operation.

Figure 25: High-Level Routing Diagram of XC4000 Series CLB (shaded arrows indicate XC4000X only)

Table 14: Routing per CLB in XC4000 Series Devices

| | XC4000E<br>Vertical | XC4000E<br>Horizontal | XC4000X<br>Vertical | XC4000X<br>Horizontal |
|-----------------|----|----|----|----|
| Singles | 8 | 8 | 8 | 8 |
| Doubles | 4 | 4 | 4 | 4 |
| Quads | 0 | 0 | 12 | 12 |
| Longlines | 6 | 6 | 10 | 6 |
| Direct Connects | 0 | 0 | 2 | 2 |
| Globals | 4 | 0 | 8 | 0 |
| Carry Logic | 2 | 0 | 1 | 0 |
| Total | 24 | 18 | 45 | 32 |

# **Programmable Switch Matrices**

The horizontal and vertical single- and double-length lines intersect at a box called a programmable switch matrix (PSM). Each switch matrix consists of programmable pass transistors used to establish connections between the lines (see Figure 26).

For example, a single-length signal entering on the right side of the switch matrix can be routed to a single-length line on the top, left, or bottom sides, or any combination thereof, if multiple branches are required. Similarly, a double-length signal can be routed to a double-length line on any or all of the other three edges of the programmable switch matrix.

Figure 26: Programmable Switch Matrix (PSM)

### Single-Length Lines

Single-length lines provide the greatest interconnect flexibility and offer fast routing between adjacent blocks. There are eight vertical and eight horizontal single-length lines associated with each CLB. These lines connect the switching matrices that are located in every row and column of CLBs.

Single-length lines are connected by way of the programmable switch matrices, as shown in Figure 28.
Routing connectivity is shown in Figure 27.

Single-length lines incur a delay whenever they go through a switching matrix. Therefore, they are not suitable for routing signals for long distances. They are normally used to conduct signals within a localized area and to provide the branching for nets with fanout greater than one.

Figure 27: Detail of Programmable Interconnect Associated with XC4000 Series CLB

6-30 May 14, 1999 (Version 1.6)

# Global Nets and Buffers (XC4000X only)

Eight vertical longlines in each CLB column are driven by special global buffers. These longlines are in addition to the vertical longlines used for standard interconnect. The global lines are broken in the center of the array, to allow faster distribution and to minimize skew across the whole array. Each half-column global line has its own buffered multiplexer, as shown in Figure 35. The top and bottom global lines cannot be connected across the center of the device, as this connection might introduce unacceptable skew. The top and bottom halves of the global lines must be separately driven, although they can be driven by the same global buffer.

The eight global lines in each CLB column can be driven by either of two types of global buffers. They can also be driven by internal logic, because they can be accessed by single, double, and quad lines at the top, bottom, half, and quarter points. Consequently, the number of different clocks that can be used simultaneously in an XC4000X device is very large.

There are four global lines feeding the IOBs at the left edge of the device. IOBs along the right edge have eight global lines. There is a single global line along the top and bottom edges with access to the IOBs. All IOB global lines are broken at the center. They cannot be connected across the center of the device, as this connection might introduce unacceptable skew.

IOB global lines can be driven from two types of global buffers, or from local interconnect.
Alternatively, top and bottom IOBs can be clocked from the global lines in the adjacent CLB column.

Two different types of clock buffers are available in the XC4000X:

- Global Low-Skew Buffers (BUFGLS)
- Global Early Buffers (BUFGE)

Global Low-Skew Buffers are the standard clock buffers. They should be used for most internal clocking, whenever a large portion of the device must be driven.

Global Early Buffers are designed to provide a faster clock access, but CLB access is limited to one-fourth of the device. They also facilitate a faster I/O interface.

Figure 35 is a conceptual diagram of the global net structure in the XC4000X.

Global Early buffers and Global Low-Skew buffers share a single pad. Therefore, the same IPAD symbol can drive one buffer of each type, in parallel. This configuration is particularly useful when using the Fast Capture latches, as described in "IOB Input Signals" on page 20. Paired Global Early and Global Low-Skew buffers share a common input; they cannot be driven by two different signals.

#### Choosing an XC4000X Clock Buffer

The clocking structure of the XC4000X provides a large variety of features. However, it can be simple to use, without understanding all the details. The software automatically handles clocks, along with all other routing, when the appropriate clock buffer is placed in the design. In fact, if a buffer symbol called BUFG is placed, rather than a specific type of buffer, the software even chooses the buffer most appropriate for the design. The detailed information in this section is provided for those users who want a finer level of control over their designs.

If fine control is desired, use the following summary and Table 15 on page 35 to choose an appropriate clock buffer.

- The simplest thing to do is to use a Global Low-Skew buffer.
- If a faster clock path is needed, try a BUFG. The software will first try to use a Global Low-Skew Buffer.
If timing requirements are not met, a faster buffer will automatically be used.
- If a single quadrant of the chip is sufficient for the clocked logic, and the timing requires a faster clock than the Global Low-Skew buffer, use a Global Early buffer.

#### **Global Low-Skew Buffers**

Each corner of the XC4000X device has two Global Low-Skew buffers. Any of the eight Global Low-Skew buffers can drive any of the eight vertical Global lines in a column of CLBs. In addition, any of the buffers can drive any of the four vertical lines accessing the IOBs on the left edge of the device, and any of the eight vertical lines accessing the IOBs on the right edge of the device. (See Figure 36 on page 38.)

IOBs at the top and bottom edges of the device are accessed through the vertical Global lines in the CLB array, as in the XC4000E. Any Global Low-Skew buffer can, therefore, access every IOB and CLB in the device.

The Global Low-Skew buffers can be driven by either semi-dedicated pads or internal logic.

To use a Global Low-Skew buffer, instantiate a BUFGLS element in a schematic or in HDL code. If desired, attach a LOC attribute or property to direct placement to the designated location. For example, attach a LOC=T attribute or property to direct that a BUFGLS be placed in one of the two Global Low-Skew buffers on the top edge of the device, or a LOC=TR to indicate the Global Low-Skew buffer on the top edge of the device, on the right.

Figure 36: Any BUFGLS (GCK1 - GCK8) Can Drive Any or All Clock Inputs on the Device

# **Global Early Buffers**

Each corner of the XC4000X device has two Global Early buffers. The primary purpose of the Global Early buffers is to provide an earlier clock access than the potentially heavily-loaded Global Low-Skew buffers. A clock source applied to both buffers will result in the Global Early clock edge occurring several nanoseconds earlier than the Global Low-Skew buffer clock edge, due to the lighter loading.
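The guidance under "Choosing an XC4000X Clock Buffer" can be condensed into a small decision helper. This is only a sketch of that summary: the function name and its two Boolean parameters are hypothetical, and real designs should still be checked against Table 15 and the timing reports.

```python
def choose_clock_buffer(needs_faster_clock: bool, fits_one_quadrant: bool) -> str:
    """Suggest a clock buffer symbol following the datasheet's summary.

    needs_faster_clock: timing cannot be met with a Global Low-Skew buffer.
    fits_one_quadrant:  all clocked logic fits in a single quadrant.
    """
    if not needs_faster_clock:
        return "BUFGLS"  # simplest choice: Global Low-Skew buffer
    if fits_one_quadrant:
        return "BUFGE"   # Global Early buffer, limited to one quadrant
    return "BUFG"        # let the software pick the most appropriate buffer
```

For example, `choose_clock_buffer(True, False)` suggests the generic BUFG symbol, leaving the final selection to the software.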
Global Early buffers also facilitate the fast capture of device inputs, using the Fast Capture latches described in "IOB Input Signals" on page 20. For Fast Capture, take a single clock signal, and route it through both a Global Early buffer and a Global Low-Skew buffer. (The two buffers share an input pad.) Use the Global Early buffer to clock the Fast Capture latch, and the Global Low-Skew buffer to clock the normal input flip-flop or latch, as shown in Figure 17 on page 23. The Global Early buffers can also be used to provide a fast Clock-to-Out on device output pins. However, an early clock in the output flip-flop IOB must be taken into consideration when calculating the internal clock speed for the design. The Global Early buffers at the left and right edges of the chip have slightly different capabilities than the ones at the top and bottom. Refer to Figure 37, Figure 38, and Figure 35 on page 36 while reading the following explanation. Each Global Early buffer can access the eight vertical Global lines for all CLBs in the quadrant. Therefore, only one-fourth of the CLB clock pins can be accessed. This restriction is in large part responsible for the faster speed of the buffers, relative to the Global Low-Skew buffers. Figure 37: Left and Right BUFGEs Can Drive Any or All Clock Inputs in Same Quadrant or Edge (GCK1 is shown. GCK2, GCK5 and GCK6 are similar.) The left-side Global Early buffers can each drive two of the four vertical lines accessing the IOBs on the entire left edge of the device. The right-side Global Early buffers can each drive two of the eight vertical lines accessing the IOBs on the entire right edge of the device. (See Figure 37.) Each left and right Global Early buffer can also drive half of the IOBs along either the top or bottom edge of the device, using a dedicated line that can only be accessed through the Global Early buffers. 
The top and bottom Global Early buffers can drive half of the IOBs along either the left or right edge of the device, as shown in Figure 38. They can only access the top and bottom IOBs via the CLB global lines. Figure 38: Top and Bottom BUFGEs Can Drive Any or All Clock Inputs in Same Quadrant (GCK8 is shown. GCK3, GCK4 and GCK7 are similar.) Figure 41 on page 44 is a diagram of the XC4000 Series boundary scan logic. It includes three bits of Data Register per IOB, the IEEE 1149.1 Test Access Port controller, and the Instruction Register with decodes. XC4000 Series devices can also be configured through the boundary scan logic. See "Readback" on page 55. # **Data Registers** The primary data register is the boundary scan register. For each IOB pin in the FPGA, bonded or not, it includes three bits for In, Out and 3-State Control. Non-IOB pins have appropriate partial bit population for In or Out only. PROGRAM, CCLK and DONE are not included in the boundary scan register. Each EXTEST CAPTURE-DR state captures all In, Out, and 3-state pins. The data register also includes the following non-pin bits: TDO.T, and TDO.O, which are always bits 0 and 1 of the data register, respectively, and BSCANT.UPD, which is always the last bit of the data register. These three boundary scan bits are special-purpose Xilinx test signals. The other standard data register is the single flip-flop BYPASS register. It synchronizes data being passed through the FPGA to the next downstream boundary scan device. The FPGA provides two additional data registers that can be specified using the BSCAN macro. The FPGA provides two user pins (BSCAN.SEL1 and BSCAN.SEL2) which are the decodes of two user instructions. For these instructions, two corresponding pins (BSCAN.TDO1 and BSCAN.TDO2) allow user scan data to be shifted out on TDO. The data register clock (BSCAN.DRCK) is available for control of test logic which the user may wish to implement with CLBs. 
The NAND of TCK and RUN-TEST-IDLE is also provided (BSCAN.IDLE). Figure 40: Block Diagram of XC4000E IOB with Boundary Scan (some details not shown). XC4000X Boundary Scan Logic is Identical. The default option, and the most practical one, is for DONE to go High first, disconnecting the configuration data source and avoiding any contention when the I/Os become active one clock later. Reset/Set is then released another clock period later to make sure that user-operation starts from stable internal conditions. This is the most common sequence, shown with heavy lines in Figure 47, but the designer can modify it to meet particular requirements. Normally, the start-up sequence is controlled by the internal device oscillator output (CCLK), which is asynchronous to the system clock. XC4000 Series offers another start-up clocking option, UCLK\_NOSYNC. The three events described above need not be triggered by CCLK. They can, as a configuration option, be triggered by a user clock. This means that the device can wake up in synchronism with the user system. When the UCLK\_SYNC option is enabled, the user can externally hold the open-drain DONE output Low, and thus stall all further progress in the start-up sequence until DONE is released and has gone High. This option can be used to force synchronization of several FPGAs to a common user clock, or to guarantee that all devices are successfully configured before any I/Os go active. If either of these two options is selected, and no user clock is specified in the design or attached to the device, the chip could reach a point where the configuration of the device is complete and the Done pin is asserted, but the outputs do not become active. The solution is either to recreate the bitstream specifying the start-up clock as CCLK, or to supply the appropriate user clock. 
#### Start-up Sequence

The Start-up sequence begins when the configuration memory is full, and the total number of configuration clocks received since $\overline{\text{INIT}}$ went High equals the loaded value of the length count.

The next rising clock edge sets a flip-flop Q0, shown in Figure 48. Q0 is the leading bit of a 5-bit shift register. The outputs of this register can be programmed to control three events.

- The release of the open-drain DONE output
- The change of configuration-related pins to the user function, activating all IOBs.
- The termination of the global Set/Reset initialization of all CLB and IOB storage elements.

The DONE pin can also be wire-ANDed with DONE pins of other FPGAs or with other external signals, and can then be used as input to bit Q3 of the start-up register. This is called "Start-up Timing Synchronous to DONE In" and is selected by either CCLK\_SYNC or UCLK\_SYNC.

When DONE is not used as an input, the operation is called "Start-up Timing Not Synchronous to DONE In," and is selected by either CCLK\_NOSYNC or UCLK\_NOSYNC.

As a configuration option, the start-up control register beyond Q0 can be clocked either by subsequent CCLK pulses or from an on-chip user net called STARTUP.CLK. These signals can be accessed by placing the STARTUP library symbol.

#### **Start-up from CCLK**

If CCLK is used to drive the start-up, Q0 through Q3 provide the timing. Heavy lines in Figure 47 show the default timing, which is compatible with XC2000 and XC3000 devices using early DONE and late Reset. The thin lines indicate all other possible timing options.
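The four start-up option names above combine two independent choices: which clock advances the start-up register, and whether the sequence is synchronized to DONE used as an input. The sketch below only tabulates that naming. `StartupOption`, `STARTUP_OPTIONS`, and the helper functions are hypothetical names and do not correspond to any tool interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartupOption:
    clock: str             # "CCLK" or "UCLK" (user clock on STARTUP.CLK)
    sync_to_done_in: bool  # True: start-up is synchronous to DONE In


STARTUP_OPTIONS = {
    "CCLK_NOSYNC": StartupOption("CCLK", False),
    "CCLK_SYNC": StartupOption("CCLK", True),
    "UCLK_NOSYNC": StartupOption("UCLK", False),
    "UCLK_SYNC": StartupOption("UCLK", True),
}


def needs_user_clock(name: str) -> bool:
    """UCLK options stall unless a user clock is actually supplied."""
    return STARTUP_OPTIONS[name].clock == "UCLK"


def can_be_stalled_by_done(name: str) -> bool:
    """SYNC options wait for the wire-ANDed DONE line to go High."""
    return STARTUP_OPTIONS[name].sync_to_done_in
```

A check such as `needs_user_clock("UCLK_SYNC")` flags the case described earlier, where configuration completes but the outputs never become active because no user clock is present.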
**Table 22: Pin Functions During Configuration**

CONFIGURATION MODE <M2:M1:M0>

| SLAVE<br>SERIAL<br><1:1:1> | MASTER<br>SERIAL<br><0:0:0> | SYNCH.<br>PERIPHERAL<br><0:1:1> | ASYNCH.<br>PERIPHERAL<br><1:0:1> | MASTER<br>PARALLEL DOWN<br><1:1:0> | MASTER<br>PARALLEL UP<br><1:0:0> | USER<br>OPERATION |
|---|---|---|---|---|---|---|
| M2(HIGH) (I) | M2(LOW) (I) | M2(LOW) (I) | M2(HIGH) (I) | M2(HIGH) (I) | M2(HIGH) (I) | (I) |
| M1(HIGH) (I) | M1(LOW) (I) | M1(HIGH) (I) | M1(LOW) (I) | M1(HIGH) (I) | M1(LOW) (I) | (O) |
| M0(HIGH) (I) | M0(LOW) (I) | M0(HIGH) (I) | M0(HIGH) (I) | M0(LOW) (I) | M0(LOW) (I) | (I) |
| HDC (HIGH) | HDC (HIGH) | HDC (HIGH) | HDC (HIGH) | HDC (HIGH) | HDC (HIGH) | I/O |
| LDC (LOW) | LDC (LOW) | LDC (LOW) | LDC (LOW) | LDC (LOW) | LDC (LOW) | I/O |
| $\overline{\text{INIT}}$ | $\overline{\text{INIT}}$ | $\overline{\text{INIT}}$ | $\overline{\text{INIT}}$ | $\overline{\text{INIT}}$ | $\overline{\text{INIT}}$ | I/O |
| DONE | DONE | DONE | DONE | DONE | DONE | DONE |
| PROGRAM (I) | PROGRAM (I) | PROGRAM (I) | PROGRAM (I) | PROGRAM (I) | PROGRAM (I) | PROGRAM |
| CCLK (I) | CCLK (O) | CCLK (I) | CCLK (O) | CCLK (O) | CCLK (O) | CCLK (I) |
| | | RDY/BUSY (O) | RDY/BUSY (O) | RCLK (O) | RCLK (O) | I/O |
| | | | RS (I) | | | I/O |
| | | | CS0 (I) | | | I/O |
| | | DATA 7 (I) | DATA 7 (I) | DATA 7 (I) | DATA 7 (I) | I/O |
| | | DATA 6 (I) | DATA 6 (I) | DATA 6 (I) | DATA 6 (I) | I/O |
| | | DATA 5 (I) | DATA 5 (I) | DATA 5 (I) | DATA 5 (I) | I/O |
| | | DATA 4 (I) | DATA 4 (I) | DATA 4 (I) | DATA 4 (I) | I/O |
| | | DATA 3 (I) | DATA 3 (I) | DATA 3 (I) | DATA 3 (I) | I/O |
| | | DATA 2 (I) | DATA 2 (I) | DATA 2 (I) | DATA 2 (I) | I/O |
| | | DATA 1 (I) | DATA 1 (I) | DATA 1 (I) | DATA 1 (I) | I/O |
| DIN (I) | DIN (I) | DATA 0 (I) | DATA 0 (I) | DATA 0 (I) | DATA 0 (I) | I/O |
| DOUT | DOUT | DOUT | DOUT | DOUT | DOUT | SGCK4-GCK6-I/O |
| TDI | TDI | TDI | TDI | TDI | TDI | TDI-I/O |
| TCK | TCK | TCK | TCK | TCK | TCK | TCK-I/O |
| TMS | TMS | TMS | TMS | TMS | TMS | TMS-I/O |
| TDO | TDO | TDO | TDO | TDO | TDO | TDO-(O) |
| | | | WS (I) | A0 | A0 | I/O |
| | | | | A1 | A1 | PGCK4-GCK7-I/O |
| | | | CS1 | A2 | A2 | I/O |
| | | | | A3 | A3 | I/O |
| | | | | A4 | A4 | I/O |
| | | | | A5 | A5 | I/O |
| | | | | A6 | A6 | I/O |
| | | | | A7 | A7 | I/O |
| | | | | A8 | A8 | I/O |
| | | | | A9 | A9 | I/O |
| | | | | A10 | A10 | I/O |
| | | | | A11 | A11 | I/O |
| | | | | A12 | A12 | I/O |
| | | | | A13 | A13 | I/O |
| | | | | A14 | A14 | I/O |
| | | | | A15 | A15 | SGCK1-GCK8-I/O |
| | | | | A16 | A16 | PGCK1-GCK1-I/O |
| | | | | A17 | A17 | I/O |
| | | | | A18* | A18* | I/O |
| | | | | A19* | A19* | I/O |
| | | | | A20* | A20* | I/O |
| | | | | A21* | A21* | I/O |
| | | | | | | ALL OTHERS |

# Synchronous Peripheral Mode

Synchronous Peripheral mode can also be considered Slave Parallel mode. An external signal drives the CCLK input(s) of the FPGA(s). The first byte of parallel configuration data must be available at the Data inputs of the lead FPGA a short setup time before the rising CCLK edge. Subsequent data bytes are clocked in on every eighth consecutive rising CCLK edge.

The same CCLK edge that accepts data, also causes the RDY/BUSY output to go High for one CCLK period. The pin name is a misnomer. In Synchronous Peripheral mode it is really an ACKNOWLEDGE signal. Synchronous operation does not require this response, but it is a meaningful signal for test purposes. Note that RDY/BUSY is pulled High with a high-impedance pullup prior to $\overline{\text{INIT}}$ going High.

The lead FPGA serializes the data and presents the preamble data (and all data that overflows the lead device) on its DOUT pin. There is an internal delay of 1.5 CCLK periods, which means that DOUT changes on the falling CCLK edge, and the next FPGA in the daisy chain accepts data on the subsequent rising CCLK edge.
In order to complete the serial shift operation, 10 additional CCLK rising edges are required after the last data byte has been loaded, plus one more CCLK cycle for each daisy-chained device.

Synchronous Peripheral mode is selected by a <011> on the mode pins (M2, M1, M0).

Figure 56: Synchronous Peripheral Mode Circuit Diagram

# **Asynchronous Peripheral Mode**

#### Write to FPGA

Asynchronous Peripheral mode uses the trailing edge of the logic AND condition of $\overline{\text{WS}}$ and $\overline{\text{CS0}}$ being Low and $\overline{\text{RS}}$ and CS1 being High to accept byte-wide data from a microprocessor bus. In the lead FPGA, this data is loaded into a double-buffered UART-like parallel-to-serial converter and is serially shifted into the internal logic.

The lead FPGA presents the preamble data (and all data that overflows the lead device) on its DOUT pin. The RDY/BUSY output from the lead FPGA acts as a handshake signal to the microprocessor. RDY/BUSY goes Low when a byte has been received, and goes High again when the byte-wide input buffer has transferred its information into the shift register, and the buffer is ready to receive new data. A new write may be started immediately, as soon as the RDY/BUSY output has gone Low, acknowledging receipt of the previous data. Write may not be terminated until RDY/BUSY is High again for one CCLK period. Note that RDY/BUSY is pulled High with a high-impedance pull-up prior to $\overline{\text{INIT}}$ going High.

The length of the $\overline{\text{BUSY}}$ signal depends on the activity in the UART. If the shift register was empty when the new byte was received, the $\overline{\text{BUSY}}$ signal lasts for only two CCLK periods. If the shift register was still full when the new byte was received, the $\overline{\text{BUSY}}$ signal can be as long as nine CCLK periods.

Note that after the last byte has been entered, only seven of its bits are shifted out. CCLK remains High with DOUT equal to bit 6 (the next-to-last bit) of the last byte entered.
The RDY/$\overline{\text{BUSY}}$ handshake can be ignored if the delay from any one Write to the end of the next Write is guaranteed to be longer than 10 CCLK periods.

#### Status Read

The logic AND condition of the $\overline{\text{CS0}}$, CS1 and $\overline{\text{RS}}$ inputs puts the device status on the Data bus.

- D7 High indicates Ready
- D7 Low indicates Busy
- D0 through D6 go unconditionally High

It is mandatory that the whole start-up sequence be started and completed by one byte-wide input. Otherwise, the pins used as Write Strobe or Chip Enable might become active outputs and interfere with the final byte transfer. If this transfer does not occur, the start-up sequence is not completed all the way to the finish (point F in Figure 47 on page 53).

In this case, at worst, the internal reset is not released. At best, Readback and Boundary Scan are inhibited. The length-count value, as generated by the XACT*step* software, ensures that these problems never occur.

Although RDY/$\overline{\text{BUSY}}$ is brought out as a separate signal, microprocessors can more easily read this information on one of the data lines. For this purpose, D7 represents the RDY/$\overline{\text{BUSY}}$ status when $\overline{\text{RS}}$ is Low, $\overline{\text{WS}}$ is High, and the two chip select lines are both active.

Asynchronous Peripheral mode is selected by a <101> on the mode pins (M2, M1, M0).

Figure 58: Asynchronous Peripheral Mode Circuit Diagram
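The mode-pin codes in Table 22 and the Status Read convention can be captured in a few lines of host-side code. The following is a minimal sketch under stated assumptions: `CONFIG_MODES`, `decode_mode`, and `is_ready` are hypothetical names, and D7 is assumed to arrive as the most significant bit of the byte read by the microprocessor.

```python
# Mode-pin levels (M2, M1, M0) taken from the column headers of Table 22.
CONFIG_MODES = {
    (1, 1, 1): "Slave Serial",
    (0, 0, 0): "Master Serial",
    (0, 1, 1): "Synchronous Peripheral",
    (1, 0, 1): "Asynchronous Peripheral",
    (1, 1, 0): "Master Parallel Down",
    (1, 0, 0): "Master Parallel Up",
}


def decode_mode(m2: int, m1: int, m0: int) -> str:
    """Return the configuration mode name for the given mode-pin levels."""
    return CONFIG_MODES.get((m2, m1, m0), "not listed in Table 22")


def is_ready(status_byte: int) -> bool:
    """Interpret a Status Read byte: D7 High means Ready, D7 Low means Busy.

    Assumption: D7 maps to bit 7 (0x80) of the byte seen by the processor.
    """
    return bool(status_byte & 0x80)
```

Because D0 through D6 read unconditionally High, a Ready status would appear as 0xFF and a Busy status as 0x7F under the bit-ordering assumption above.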