




| Details                    |                                                                                                       |
|----------------------------|-------------------------------------------------------------------------------------------------------|
| Product Status             | Active                                                                                                |
| Architecture               | MPU, FPGA                                                                                             |
| Core Processor             | Quad ARM® Cortex®-A53 MPCore™ with CoreSight™, Dual ARM®Cortex™-R5 with CoreSight™, ARM Mali™-400 MP2 |
| Flash Size                 | -                                                                                                     |
| RAM Size                   | 1.8MB                                                                                                 |
| Peripherals                | DMA, WDT                                                                                              |
| Connectivity               | CANbus, I²C, SPI, UART/USART, USB                                                                     |
| Speed                      | 500MHz, 1.2GHz                                                                                        |
| Primary Attributes         | Zynq®UltraScale+™ FPGA, 154K+ Logic Cells                                                             |
| Operating<br>Temperature   | -40°C ~ 100°C (TJ)                                                                                    |
| Package / Case             | 784-BFBGA, FCBGA                                                                                      |
| Supplier Device<br>Package | 784-FCBGA (23x23)                                                                                     |
| Purchase URL               | https://www.e-xfl.com/product-detail/xilinx/xazu3eg-1sfvc784i                                         |




#### ARM Mali-400 Based GPU

- Supports OpenGL ES 1.1 and 2.0
- Supports OpenVG 1.1
- GPU frequency: Up to 600MHz
- Single Geometry Processor, Two Pixel Processors
- Vertex processing: 66 M Triangles/s
- Pixel processing: 1.2 G Pixels/s
- 64KB L2 Cache
- Power island gating
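The pixel-processing figure is consistent with two pixel processors running at the maximum GPU frequency. The sketch below checks that arithmetic. The one-pixel-per-clock rate is an assumption, not something the list above states.

```python
# Figures taken from the GPU feature list above.
GPU_FREQ_HZ = 600e6       # maximum GPU frequency
PIXEL_PROCESSORS = 2      # Mali-400 MP2 has two pixel processors
VERTEX_RATE = 66e6        # triangles per second

# Assumption: each pixel processor retires one pixel per clock.
PIXELS_PER_CLOCK = 1


def peak_pixel_rate(freq_hz, processors, pixels_per_clock=PIXELS_PER_CLOCK):
    """Theoretical peak fill rate in pixels per second."""
    return freq_hz * processors * pixels_per_clock


def clocks_per_triangle(freq_hz, triangles_per_s):
    """Average GPU clocks available per triangle at the quoted vertex rate."""
    return freq_hz / triangles_per_s


print(peak_pixel_rate(GPU_FREQ_HZ, PIXEL_PROCESSORS) / 1e9, "Gpixels/s")
print(round(clocks_per_triangle(GPU_FREQ_HZ, VERTEX_RATE), 2), "clocks per triangle")
```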

## **External Memory Interfaces**

- Multi-protocol dynamic memory controller
- 32-bit or 64-bit interfaces to DDR4, DDR3, DDR3L, or LPDDR3 memories, and 32-bit interface to LPDDR4 memory
- ECC support in 64-bit and 32-bit modes
- Up to 32GB of address space using single or dual rank of 8-, 16-, or 32-bit-wide memories
- Static memory interfaces
  - eMMC4.51 Managed NAND flash support
  - ONFI3.1 NAND flash with 24-bit ECC
  - 1-bit SPI, 2-bit SPI, 4-bit SPI (Quad-SPI), or two Quad-SPI (8-bit) serial NOR flash

#### **8-Channel DMA Controller**

- Two DMA controllers of 8-channels each
- Memory-to-memory, memory-to-peripheral, peripheral-to-memory, and scatter-gather transaction support

#### **Serial Transceivers**

- Four dedicated PS-GTR receivers and transmitters support data rates up to 6.0Gb/s
  - Supports SGMII tri-speed Ethernet, PCI Express® Gen2, Serial-ATA (SATA), USB3.0, and DisplayPort

# **Dedicated I/O Peripherals and Interfaces**

- PCI Express Compliant with PCIe® 2.1 base specification
  - Root complex and End Point configurations
  - x1, x2, and x4 at Gen1 or Gen2 rates
- SATA Host
  - 1.5, 3.0, and 6.0Gb/s data rates as defined by SATA Specification, revision 3.1
  - Supports up to two channels
- DisplayPort Controller
  - Up to 5.4Gb/s rate
  - Up to two TX lanes (no RX support)

- Four 10/100/1000 tri-speed Ethernet MAC peripherals with IEEE Std 802.3 and IEEE Std 1588 revision 2.0 support
  - Scatter-gather DMA capability
  - Recognition of IEEE Std 1588 rev.2 PTP frames
  - GMII, RGMII, and SGMII interfaces
  - Jumbo frames
- Two USB 3.0/2.0 Device, Host, or OTG peripherals, each supporting up to 12 endpoints
  - USB 3.0/2.0 compliant device IP core
  - Super-speed, high- speed, full-speed, and low-speed modes
  - Intel XHCI- compliant USB host
- Two full CAN 2.0B-compliant CAN bus interfaces
  - CAN 2.0-A and CAN 2.0-B and ISO 11898-1 standard compliant
- Two SD/SDIO 2.0/eMMC4.51 compliant controllers
- Two full-duplex SPI ports with three peripheral chip selects
- Two high-speed UARTs (up to 1Mb/s)
- Two master and slave I2C interfaces
- Up to 78 flexible multiplexed I/O (MIO) (up to three banks of 26 I/Os) for peripheral pin assignment
- Up to 96 EMIOs (up to three banks of 32 I/Os) connected to the PL

#### Interconnect

- High-bandwidth connectivity within PS and between PS and PL
- ARM AMBA® AXI4-based
- QoS support for latency and bandwidth control
- Cache Coherent Interconnect (CCI)

## **System Memory Management**

- System Memory Management Unit (SMMU)
- Xilinx Memory Protection Unit (XMPU)

## **Platform Management Unit**

- Power gates PS peripherals, power islands, and power domains
- Clock gates PS peripheral user firmware option

# **Configuration and Security Unit**

- Boots PS and configures PL
- Supports secure and non-secure boot modes


## **System Monitor in PS**

- On-chip voltage and temperature sensing



# **Programmable Logic (PL)**

# **Configurable Logic Blocks (CLB)**

- Look-up tables (LUT)
- Flip-flops
- Cascadable adders

#### 36Kb Block RAM

- True dual-port
- Up to 72 bits wide
- Configurable as dual 18Kb

#### **UltraRAM**

- 288Kb dual-port
- 72 bits wide
- Error checking and correction

#### **DSP Blocks**

- 27 x 18 signed multiply
- 48-bit adder/accumulator
- 27-bit pre-adder
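To show how the three DSP block elements listed above fit together, here is a minimal behavioural sketch of a pre-add, multiply, accumulate step. The function names and the assumed datapath `(a + d) * b + acc` are illustrative only. They are not taken from this document and do not model pipeline registers or the slice's other operating modes.

```python
def to_signed(value, bits):
    """Wrap an integer into a two's-complement signed range of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def dsp_mac(a, d, b, acc=0):
    """One multiply-accumulate step: 27-bit pre-adder, 27 x 18 signed
    multiply, 48-bit accumulator (assumed datapath, see lead-in)."""
    pre = to_signed(a + d, 27)           # 27-bit pre-adder
    product = pre * to_signed(b, 18)     # 27 x 18 signed multiply
    return to_signed(acc + product, 48)  # 48-bit adder/accumulator


def symmetric_fir(samples, half_taps):
    """Symmetric FIR output: the pre-adder sums mirrored samples so each
    coefficient pair costs one multiply."""
    acc = 0
    n = len(samples)
    for i, coeff in enumerate(half_taps):
        acc = dsp_mac(samples[i], samples[n - 1 - i], coeff, acc)
    return acc


print(dsp_mac(3, 5, -2, acc=10))
print(symmetric_fir([1, 2, 3, 4], [10, 20]))
```

The pre-adder is what makes symmetric filters cheap: a four-tap symmetric filter needs two multiplies instead of four.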

## Programmable I/O Blocks

- Supports LVCMOS, LVDS, and SSTL
- 1.0V to 3.3V I/O
- Programmable I/O delay and SerDes

# JTAG Boundary-Scan

- IEEE Std 1149.1 Compatible Test Interface

## **PCI Express**

- Supports Root complex and End Point configurations
- Supports up to Gen3 speeds
- Up to two integrated blocks in select devices

# Video Encoder/Decoder (VCU)

- Available in EV devices
- Accessible from either PS or PL
- Simultaneous encode and decode
- H.264 and H.265 support

## **System Monitor in PL**

- On-chip voltage and temperature sensing
- 10-bit 200KSPS ADC with up to 17 external inputs



# **Feature Summary**

Table 1: XA Zynq UltraScale+ MPSoC: EG Device Feature Summary

|                                 | XAZU2EG | XAZU3EG |
|---------------------------------|---------|---------|
| Application Processing Unit     | Quad-core ARM Cortex-A53 MPCore with CoreSight; NEON & Single/Double Precision Floating Point; 32KB/32KB L1 Cache, 1MB L2 Cache | |
| Real-Time Processing Unit       | Dual-core ARM Cortex-R5 with CoreSight; Single/Double Precision Floating Point; 32KB/32KB L1 Cache, and TCM | |
| Embedded and External<br>Memory | 256KB On-Chip Memory w/ECC; External DDR4; DDR3; DDR3L; LPDDR4; LPDDR3; External Quad-SPI; NAND; eMMC | |
| General Connectivity            | 214 PS I/O; UART; CAN; USB 2.0; I2C; SPI; 32b GPIO; Real Time Clock; WatchDog Timers; Triple Timer Counters | |
| High-Speed Connectivity         | 4 PS-GTR; PCIe Gen1/2; Serial ATA 3.1; DisplayPort 1.2a; USB 3.0; SGMII | |
| Graphic Processing Unit         | ARM Mali™-400 MP2; 64KB L2 Cache | |
| System Logic Cells              | 103,320 | 154,350 |
| CLB Flip-Flops                  | 94,464  | 141,120 |
| CLB LUTs                        | 47,232  | 70,560  |
| Distributed RAM (Mb)            | 1.2     | 1.8     |
| Block RAM Blocks                | 150     | 216     |
| Block RAM (Mb)                  | 5.3     | 7.6     |
| UltraRAM Blocks                 | 0       | 0       |
| UltraRAM (Mb)                   | 0       | 0       |
| DSP Slices                      | 240     | 360     |
| CMTs                            | 3       | 3       |
| Max. HP I/O <sup>(1)</sup>      | 156     | 156     |
| Max. HD I/O <sup>(2)</sup>      | 96      | 96      |
| System Monitor                  | 2       | 2       |
| GTH Transceiver 12.5Gb/s        | 0       | 0       |
| Transceiver Fractional PLLs     | 0       | 0       |
| PCIe Gen3 x16                   | 0       | 0       |

#### Notes:

- 1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
- 2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.

Table 2: XA Zynq UltraScale+ MPSoC: EG Device-Package Combinations and Maximum I/Os

| Package <sup>(1)(2)(3)</sup> | Package Dimensions (mm) | XAZU2EG (HD, HP, GTH) | XAZU3EG (HD, HP, GTH) |
|------------------------------|-------------------------|-----------------------|-----------------------|
| SBVA484 <sup>(4)</sup>       | 19x19                   | 24, 58, 0             | 24, 58, 0             |
| SFVA625                      | 21x21                   | 24, 156, 0            | 24, 156, 0            |
| SFVC784                      | 23x23                   | 96, 156, 0            | 96, 156, 0            |

#### Notes:

- 1. Go to Ordering Information for package designation details.
- 2. All device package combinations bond out 4 PS-GTR transceivers.
- 3. All device package combinations bond out 214 PS I/O except ZU2EG and ZU3EG in the SBVA484 and SFVA625 packages, which bond out 170 PS I/Os.
- 4. All 58 HP I/O pins are powered by the same  $V_{CCO}$  supply.



Table 3: XA Zynq UltraScale+ MPSoC: EV Device Feature Summary

|                              | XAZU4EV | XAZU5EV |
|------------------------------|---------|---------|
| Application Processing Unit  | Quad-core ARM Cortex-A53 MPCore with CoreSight; NEON & Single/Double Precision Floating Point; 32KB/32KB L1 Cache, 1MB L2 Cache | |
| Real-Time Processing Unit    | Dual-core ARM Cortex-R5 with CoreSight; Single/Double Precision Floating Point; 32KB/32KB L1 Cache, and TCM | |
| Embedded and External Memory | 256KB On-Chip Memory w/ECC; External DDR4; DDR3; DDR3L; LPDDR4; LPDDR3; External Quad-SPI; NAND; eMMC | |
| General Connectivity         | 214 PS I/O; UART; CAN; USB 2.0; I2C; SPI; 32b GPIO; Real Time Clock; WatchDog Timers; Triple Timer Counters | |
| High-Speed Connectivity      | 4 PS-GTR; PCIe Gen1/2; Serial ATA 3.1; DisplayPort 1.2a; USB 3.0; SGMII | |
| Graphic Processing Unit      | ARM Mali™-400 MP2; 64KB L2 Cache | |
| Video Codec                  | 1       | 1       |
| System Logic Cells           | 192,150 | 256,200 |
| CLB Flip-Flops               | 175,680 | 234,240 |
| CLB LUTs                     | 87,840  | 117,120 |
| Distributed RAM (Mb)         | 2.6     | 3.5     |
| Block RAM Blocks             | 128     | 144     |
| Block RAM (Mb)               | 4.5     | 5.1     |
| UltraRAM Blocks              | 48      | 64      |
| UltraRAM (Mb)                | 13.5    | 18.0    |
| DSP Slices                   | 728     | 1,248   |
| CMTs                         | 4       | 4       |
| Max. HP I/O <sup>(1)</sup>   | 156     | 156     |
| Max. HD I/O <sup>(2)</sup>   | 96      | 96      |
| System Monitor               | 2       | 2       |
| GTH Transceiver 12.5Gb/s     | 4       | 4       |
| Transceiver Fractional PLLs  | 2       | 2       |
| PCIe Gen3 x16                | 2       | 2       |

#### Notes:

- 1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
- 2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.
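The memory totals in the feature summary tables follow from the block counts and the per-block sizes given under Programmable Logic (36Kb per block RAM, 288Kb per UltraRAM). The check below is a sketch that assumes 1Mb = 1,024Kb and rounding to one decimal place, which is how the table values appear to be derived.

```python
BLOCK_RAM_KB = 36    # per block, from the "36Kb Block RAM" section
ULTRARAM_KB = 288    # per block, from the "UltraRAM" section

# (block RAM blocks, UltraRAM blocks) from Table 1 and Table 3.
DEVICES = {
    "XAZU2EG": (150, 0),
    "XAZU3EG": (216, 0),
    "XAZU4EV": (128, 48),
    "XAZU5EV": (144, 64),
}


def capacity_mb(blocks, kb_per_block):
    """Total capacity in Mb, assuming 1Mb = 1,024Kb."""
    return round(blocks * kb_per_block / 1024, 1)


for name, (bram, uram) in DEVICES.items():
    print(name,
          "block RAM:", capacity_mb(bram, BLOCK_RAM_KB), "Mb,",
          "UltraRAM:", capacity_mb(uram, ULTRARAM_KB), "Mb")
```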

Table 4: XA Zynq UltraScale+ MPSoC: EV Device-Package Combinations and Maximum I/Os

| Package <sup>(1)(2)</sup> | Package Dimensions (mm) | XAZU4EV (HD, HP, GTH) | XAZU5EV (HD, HP, GTH) |
|---------------------------|-------------------------|-----------------------|-----------------------|
| SFVC784 <sup>(3)</sup>    | 23x23                   | 96, 156, 4            | 96, 156, 4            |

#### Notes:

- 1. Go to Ordering Information for package designation details.
- 2. All device package combinations bond out 4 PS-GTR transceivers.
- 3. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s.

**Product Specification** 



# **Zynq UltraScale+ MPSoCs**

A comprehensive device family, Zynq UltraScale+ MPSoCs offer single-chip, all programmable, heterogeneous multiprocessors that provide designers with software, hardware, interconnect, power, security, and I/O programmability. The range of devices in the Zynq UltraScale+ MPSoC family allows designers to target cost-sensitive as well as high-performance applications from a single platform using industry-standard tools. While each Zynq UltraScale+ MPSoC contains the same PS, the PL, Video hard blocks, and I/O resources vary between the devices.

Table 5: XA Zynq UltraScale+ MPSoC Device Features

|     | EG Devices               | EV Devices               |
|-----|--------------------------|--------------------------|
| APU | Quad-core ARM Cortex-A53 | Quad-core ARM Cortex-A53 |
| RPU | Dual-core ARM Cortex-R5  | Dual-core ARM Cortex-R5  |
| GPU | Mali-400MP2              | Mali-400MP2              |
| VCU | -                        | H.264/H.265              |

XA Zynq UltraScale+ MPSoCs are able to serve a wide range of Automotive applications including multi-camera multi-feature driver assistance systems, high resolution and video graphic infotainment systems, and driver information.

The UltraScale MPSoC architecture provides processor scalability from 32 to 64 bits with support for virtualization, the combination of soft and hard engines for real-time control, graphics/video processing, waveform and packet processing, next-generation interconnect and memory, advanced power management, and technology enhancements that deliver multi-level security, safety, and reliability. Xilinx offers a large number of soft IP for the XA Zynq UltraScale+ MPSoC family. Stand-alone and Linux device drivers are available for the peripherals in the PS and the PL. Xilinx's Vivado® Design Suite, SDK™, and PetaLinux development environments enable rapid product development for software, hardware, and systems engineers. The ARM-based PS also brings a broad range of third-party tools and IP providers in combination with Xilinx's existing PL ecosystem.

The XA Zynq UltraScale+ MPSoC family delivers unprecedented processing, I/O, and memory bandwidth in the form of an optimized mix of heterogeneous processing engines embedded in a next-generation, high-performance, on-chip interconnect with appropriate on-chip memory subsystems. The heterogeneous processing and programmable engines, which are optimized for different application tasks, enable the XA Zynq UltraScale+ MPSoC to deliver the extensive performance and efficiency required to address next-generation smarter systems while retaining backwards compatibility with the original XA Zynq-7000 All Programmable SoC family. The UltraScale MPSoC architecture also incorporates multiple levels of security, increased safety, and advanced power management, which are critical requirements of next-generation smarter systems. Xilinx's embedded UltraFast™ design methodology fully exploits the ASIC-class capabilities afforded by the UltraScale MPSoC architecture while supporting rapid system development.

The inclusion of an application processor enables high-level operating system support (e.g., AutoSAR and Linux). Other standard operating systems used with the Cortex-A53 processor are also available for the XA Zynq UltraScale+ MPSoC family. The PS and the PL are on separate power domains, enabling users to power down the PL for power management if required. The processors in the PS always boot first, allowing a software centric approach for PL configuration. PL configuration is managed by software running on the CPU, so it boots similar to an ASSP.



# **Processing System**

# **Application Processing Unit (APU)**

The key features of the APU include:

- 64-bit quad-core ARM Cortex-A53 MPCores. Features associated with each core include:
  - ARM v8-A Architecture
  - Operating target frequency: up to 1.2GHz
  - Single and double precision floating point: 4 SP/2 DP FLOPS/MHz
  - NEON Advanced SIMD support with single and double precision floating point instructions
  - A64 instruction set in 64-bit operating mode, A32/T32 instruction set in 32-bit operating mode
  - Level 1 cache (separate instruction and data, 32KB each for each Cortex-A53 CPU)
    - 2-way set-associative Instruction Cache with parity support
    - 4-way set-associative Data Cache with ECC support
  - Integrated memory management unit (MMU) per processor core
  - TrustZone for secure mode operation
  - Virtualization support
- Ability to operate in single processor, symmetric quad processor, and asymmetric quad-processor modes
- Integrated 16-way set-associative 1MB Unified Level 2 cache with ECC support
- Interrupts and Timers
  - Generic interrupt controller (GIC-400)
  - ARM generic timers (4 timers per CPU)
  - One watchdog timer (WDT)
  - One global timer
  - Two triple timers/counters (TTC)
- CoreSight debug and trace support
  - Embedded Trace Macrocell (ETM) for instruction trace
  - Cross trigger interface (CTI) enabling hardware breakpoints and triggers
- ACP interface to PL for I/O coherency and Level 2 cache allocation
- ACE interface to PL for full coherency
- Power island gating on each processor core
- Optional eFUSE disable per core
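The FLOPS/MHz figure above can be turned into a theoretical peak throughput. The sketch below assumes the figure is per core and that throughput scales linearly with frequency and core count, which ignores memory and scheduling effects.

```python
# Per-core figures from the APU feature list above.
FLOPS_PER_MHZ = {"single": 4, "double": 2}
MAX_FREQ_MHZ = 1200   # 1.2GHz operating target frequency
CORES = 4             # quad-core Cortex-A53


def peak_mflops(precision, cores=1, freq_mhz=MAX_FREQ_MHZ):
    """Theoretical peak in MFLOPS, assuming linear scaling across cores."""
    return FLOPS_PER_MHZ[precision] * freq_mhz * cores


for precision in ("single", "double"):
    per_core = peak_mflops(precision) / 1000
    total = peak_mflops(precision, cores=CORES) / 1000
    print(f"{precision}: {per_core} GFLOPS per core, {total} GFLOPS for {CORES} cores")
```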



## Real-Time Processing Unit (RPU)

- Dual-core ARM Cortex-R5 MPCores. Features associated with each core include:
  - ARM v7-R Architecture (32-bit)
  - Operating target frequency: Up to 500MHz
  - A32/T32 instruction set support
  - 4-way set-associative Level 1 caches (separate instruction and data, 32KB each) with ECC support
  - Integrated Memory Protection Unit (MPU) per processor
  - 128KB Tightly Coupled Memory (TCM) with ECC support
  - TCMs can be combined to become 256KB in lockstep mode
- Ability to operate in single-processor or dual-processor modes (split and lock-step)
- Dedicated SWDT and two Triple Timer Counters (TTC)
- CoreSight debug and trace support
  - Embedded Trace Macrocell (ETM) for instruction and trace
  - Cross trigger interface (CTI) enabling hardware breakpoints and triggers
- Optional eFUSE disable

# Full-Power Domain DMA (FPD-DMA) and Low-Power Domain DMA (LPD-DMA)

- Two general-purpose DMA controllers: one in the full-power domain (FPD-DMA) and one in the low-power domain (LPD-DMA)
- Eight independent channels per DMA
- Multiple transfer types:
  - Memory-to-memory
  - Memory-to-peripheral
  - Peripheral-to-memory
  - Scatter-gather
- 8 peripheral interfaces per DMA
- TrustZone per DMA for optional secure operation
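As an illustration of the scatter-gather transfer type, the sketch below models a descriptor chain in software. The `Descriptor` fields and the `run_scatter_gather` helper are hypothetical and do not reflect the controller's actual descriptor format or register interface, which this document does not describe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptor:
    """Hypothetical descriptor: one contiguous piece of a larger transfer."""
    src: int      # source offset in the simulated memory
    dst: int      # destination offset in the simulated memory
    length: int   # number of bytes to copy


def run_scatter_gather(memory: bytearray, chain: List[Descriptor]) -> int:
    """Walk the chain in order, copying each piece. Returns bytes moved."""
    total = 0
    for d in chain:
        if d.src + d.length > len(memory) or d.dst + d.length > len(memory):
            raise ValueError("descriptor exceeds simulated memory")
        memory[d.dst:d.dst + d.length] = memory[d.src:d.src + d.length]
        total += d.length
    return total


# Gather two scattered source fragments into one contiguous buffer at offset 8.
memory = bytearray(b"ABCDEFGH" + bytes(8))
chain = [Descriptor(src=0, dst=8, length=2), Descriptor(src=6, dst=10, length=2)]
moved = run_scatter_gather(memory, chain)
print(moved, bytes(memory[8:12]))
```

The point of the chain is that software sets up the list once and the channel walks it without further CPU involvement per fragment.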




- Low power modes
  - Active/precharge power down
  - Self-refresh, including clean exit from self-refresh after a controller power cycle
- Enhanced DDR training by allowing software to measure read/write eye and make delay adjustments dynamically
- Independent performance monitors for read path and write path
- Integration of PHY Debug Access Port (DAP) into JTAG for testing

The DDR memory controller is multi-ported and enables the PS and the PL to have shared access to a common memory. The DDR controller features six AXI slave ports for this purpose:

- Two 128-bit AXI ports from the ARM Cortex-A53 CPU(s), RPU (ARM Cortex-R5 and LPD peripherals), GPU, high speed peripherals (USB3, PCIe & SATA), and High Performance Ports (HP0 & HP1) from the PL through the Cache Coherent Interconnect (CCI)
- One 64-bit port is dedicated for the ARM Cortex-R5 CPU(s)
- One 128-bit AXI port from the DisplayPort and HP2 port from the PL
- One 128-bit AXI port from HP3 and HP4 ports from the PL
- One 128-bit AXI port from General DMA and HP5 from the PL

# **High-Speed Connectivity Peripherals**

#### **PCIe**

- Compliant with the PCI Express Base Specification 2.1
- Fully compliant with PCI Express transaction ordering rules
- Lane width: x1, x2, or x4 at Gen1 or Gen2 rates
- 1 Virtual Channel
- Full duplex PCIe port
- End Point and single PCIe link Root Port
- Root Port supports Enhanced Configuration Access Mechanism (ECAM), Cfg Transaction generation
- Root Port support for INTx, and MSI
- Endpoint support for MSI or MSI-X
  - 1 physical function, no SR-IOV
  - No relaxed or ID ordering
  - Fully configurable BARs
  - INTx not recommended, but can be generated
  - Endpoint to support configurable target/slave apertures with address translation and Interrupt capability
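For a sense of scale, the raw link bandwidth for each supported lane width and generation can be estimated from the standard PCIe line rates (2.5GT/s for Gen1, 5.0GT/s for Gen2) and 8b/10b encoding. This is a sketch of the per-direction figure after line encoding only. It does not account for packet headers, flow control, or other protocol overhead.

```python
# Standard PCIe line rates per lane, in GT/s.
LINE_RATE_GT_S = {"Gen1": 2.5, "Gen2": 5.0}
SUPPORTED_LANES = (1, 2, 4)   # x1, x2, or x4 per the list above


def raw_bandwidth_gbps(gen, lanes):
    """Per-direction bandwidth in Gb/s after 8b/10b line encoding."""
    if lanes not in SUPPORTED_LANES:
        raise ValueError("lane width must be x1, x2, or x4")
    return LINE_RATE_GT_S[gen] * lanes * 8 / 10


for gen in LINE_RATE_GT_S:
    for lanes in SUPPORTED_LANES:
        gbps = raw_bandwidth_gbps(gen, lanes)
        print(f"{gen} x{lanes}: {gbps} Gb/s ({gbps / 8} GB/s) per direction")
```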



#### SATA

- Compliant with SATA 3.1 Specification
- SATA host port supports up to 2 external devices
- Compliant with Advanced Host Controller Interface ('AHCI') ver. 1.3
- 1.5Gb/s, 3.0Gb/s, and 6.0Gb/s data rates
- Power management features: supports partial and slumber modes

#### **USB 3.0**

- Two USB controllers (configurable as USB 2.0 or USB 3.0)
- Up to 5.0Gb/s data rate
- Host and Device modes
  - Super Speed, High Speed, Full Speed, and Low Speed
  - Up to 12 endpoints
  - The USB host controller registers and data structures are compliant with Intel xHCI specifications
  - 64-bit AXI master port with built-in DMA
  - Power management features: Hibernation mode

#### DisplayPort Controller

- 4K Display Processing with DisplayPort output
  - Maximum resolution of 4K x 2K-30 (30Hz pixel rate)
  - DisplayPort AUX channel, and Hot Plug Detect (HPD) on the output
  - RGB YCbCr, 4:2:0; 4:2:2, 4:4:4 with 6, 8, 10, and 12b/c
  - Y-only, xvYCC, RGB 4:4:4, YCbCr 4:4:4, YCbCr 4:2:2, and YCbCr 4:2:0 video format with 6, 8, 10, and 12 bits per color component
  - 256-color palette
  - Multiple frame buffer formats
  - 1, 2, 4, 8 bits per pixel (bpp) via a palette
  - 16, 24, 32bpp
  - Graphics formats such as RGBA8888, RGB555, etc.
- Accepts streaming video from the PL or dedicated DMA controller
- Enables Alpha blending of graphics and Chroma keying
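A rough bandwidth check relates the maximum resolution to the DisplayPort link described earlier (up to two TX lanes at up to 5.4Gb/s each). The sketch below assumes 8b/10b line encoding, a 3840 x 2160 interpretation of "4K x 2K", and counts active pixels only. Blanking intervals and secondary data are ignored, so real headroom is smaller.

```python
LANE_RATE_BPS = 5_400_000_000   # up to 5.4Gb/s per lane
MAX_LANES = 2                   # up to two TX lanes


def link_payload_bps(lanes=MAX_LANES, lane_rate=LANE_RATE_BPS):
    """Usable link rate in bits/s after 8b/10b encoding."""
    return lanes * lane_rate * 8 // 10


def active_video_bps(width, height, fps, bits_per_pixel):
    """Bit rate of active pixels only (blanking ignored)."""
    return width * height * fps * bits_per_pixel


link = link_payload_bps()
video = active_video_bps(3840, 2160, 30, 24)   # assumed 4K x 2K at 30Hz, 24bpp
print(f"link payload: {link / 1e9:.2f} Gb/s")
print(f"active video: {video / 1e9:.2f} Gb/s, fits: {video <= link}")
```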




- Full duplex flow control with recognition of incoming pause frames and hardware generation of transmitted pause frames
- 802.1Q VLAN tagging with recognition of incoming VLAN and priority tagged frames
- Supports IEEE Std 1588 v2

#### SD/SDIO 3.0 Controller

In addition to secure digital (SD) devices, this controller also supports eMMC 4.51.

- Host mode support only
- Built-in DMA
- 1/4-Bit SD Specification, version 3.0
- 1/4/8-Bit eMMC Specification, version 4.51
- Supports primary boot from SD Card and eMMC (Managed NAND)
- High speed, default speed, and low-speed support
- 1 and 4-bit data interface support
  - Low-speed clock 0–400KHz
  - Default speed 0–25MHz
  - High speed clock 0–50MHz
- High-speed Interface
  - SD UHS-1: 208MHz
  - eMMC HS200: 200MHz
- Memory, I/O, and SD cards
- Power control modes
- Data FIFO interface up to 512B

#### **UART**

- Programmable baud rate generator
- 6, 7, or 8 data bits
- 1, 1.5, or 2 stop bits
- Odd, even, space, mark, or no parity
- Parity, framing, and overrun error detection
- Line break generation and detection
- Automatic echo, local loopback, and remote loopback channel modes
- Modem control signals: CTS, RTS, DSR, DTR, RI, and DCD (from EMIO only)
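The frame options above determine how many bit times each character occupies on the wire, and therefore the character throughput at a given baud rate. A minimal sketch, assuming the usual asynchronous framing of one start bit followed by data, optional parity, and stop bits:

```python
def frame_bits(data_bits=8, parity=False, stop_bits=1):
    """Bit times per character: start + data + optional parity + stop."""
    if data_bits not in (6, 7, 8):
        raise ValueError("data bits must be 6, 7, or 8")
    if stop_bits not in (1, 1.5, 2):
        raise ValueError("stop bits must be 1, 1.5, or 2")
    return 1 + data_bits + (1 if parity else 0) + stop_bits


def chars_per_second(baud, data_bits=8, parity=False, stop_bits=1):
    """Maximum back-to-back character rate at the given baud rate."""
    return baud / frame_bits(data_bits, parity, stop_bits)


# 8 data bits, no parity, 1 stop bit at 1Mb/s.
print(frame_bits(), chars_per_second(1_000_000))
# 8 data bits, parity enabled, 2 stop bits at 115,200 baud.
print(frame_bits(8, True, 2), chars_per_second(115_200, 8, True, 2))
```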



- Sleep Mode with automatic wake-up
- Snoop Mode
- 16-bit time-stamping for receive messages
- Both internal generated reference clock and external reference clock input from MIO
- Guarantee clock sampling edge between 80 to 83% at 24MHz reference clock input
- Optional eFUSE disable per port

#### **USB 2.0**

- Two USB controllers (configurable as USB 2.0 or USB 3.0)
- Host, device and On-The-Go (OTG) modes
- High Speed, Full Speed, and Low Speed
- Up to 12 endpoints
- 8-bit ULPI External PHY Interface
- The USB host controller registers and data structures are compliant with Intel xHCI specifications.
- 64-bit AXI master port with built-in DMA
- Power management features: hibernation mode

# **Static Memory Interfaces**

The static memory interfaces support external static memories.

- ONFI 3.1 NAND flash support with up to 24-bit ECC
- 1-bit SPI, 2-bit SPI, 4-bit SPI (Quad-SPI), or two Quad-SPI (8-bit) serial NOR flash
- 8-bit eMMC interface supporting managed NAND flash

#### NAND ONFI 3.1 Flash Controller

- ONFI 3.1 compliant
- Supports chip select reduction per ONFI 3.1 spec
- SLC NAND for boot/configuration and data storage
- ECC options based on SLC NAND
  - 1, 4, or 8 bits per 512+spare bytes
  - 24 bits per 1,024+spare bytes
- Max bandwidth as follows
  - Asynchronous mode (SDR) at 50MHz
  - Synchronous mode (NV-DDR) at 100MHz (200Mb/s)
- 8-bit SDR NAND interface



#### Interconnect

All the blocks are connected to each other and to the PL through a multi-layered ARM Advanced Microprocessor Bus Architecture (AMBA) AXI interconnect. The interconnect is non-blocking and supports multiple simultaneous master-slave transactions.

The interconnect is designed with latency sensitive masters, such as the ARM CPU, having the shortest paths to memory, and bandwidth critical masters, such as the potential PL masters, having high throughput connections to the slaves with which they need to communicate.

Traffic through the interconnect can be regulated through the Quality of Service (QoS) block in the interconnect. The QoS feature is used to regulate traffic generated by the CPU, DMA controller, and a combined entity representing the masters in the IOP.

# **PS Interfaces**

PS interfaces include external interfaces going off-chip or signals going from PS to PL.

#### **PS External Interfaces**

The Zynq UltraScale+ MPSoC's external interfaces use dedicated pins that cannot be assigned as PL pins. These include:

- Clock, reset, boot mode, and voltage reference
- Up to 78 dedicated multiplexed I/O (MIO) pins, software-configurable to connect to any of the internal I/O peripherals and static memory controllers
- 32-bit or 64-bit DDR4/DDR3/DDR3L/LPDDR3 memories with optional ECC
- 32-bit LPDDR4 memory with optional ECC
- 4 channels (TX and RX pair) for transceivers

#### **MIO Overview**

The IOP peripherals communicate with external devices through a shared pool of up to 78 dedicated multiplexed I/O (MIO) pins. Each peripheral can be assigned one of several pre-defined groups of pins, enabling a flexible assignment of multiple devices simultaneously. Although 78 pins are not enough for simultaneous use of all the I/O peripherals, most IOP interface signals are available to the PL, allowing use of standard PL I/O pins when powered up and properly configured. Extended multiplexed I/O (EMIO) allows unmapped PS peripherals to access PL I/O.

Port mappings can appear in multiple locations. For example, there are up to 12 possible port mappings for CAN pins. The PS Configuration Wizard (PCW) tool aids in peripheral and static memory pin mapping.



Table 6: MIO Peripheral Interface Mapping

| Peripheral<br>Interface                  | MIO                                        | ЕМІО                                                                                                                                                                                                                                                      |
|------------------------------------------|--------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Quad-SPI<br>NAND                         | Yes                                        | No                                                                                                                                                                                                                                                        |
| USB2.0: 0,1                              | Yes: External PHY                          | No                                                                                                                                                                                                                                                        |
| SDIO 0,1                                 | Yes                                        | Yes                                                                                                                                                                                                                                                       |
| SPI: 0,1<br>I2C: 0,1<br>CAN: 0,1<br>GPIO | Yes<br>CAN: External PHY<br>GPIO: Up to 78 bits | Yes<br>CAN: External PHY<br>GPIO: Up to 96 bits |
| GigE: 0,1,2,3                            | RGMII v2.0:<br>External PHY                | Supports GMII, RGMII v2.0 (HSTL), RGMII v1.3, MII, SGMII, and 1000BASE-X in Programmable Logic                                                                                                                                                            |
| UART: 0,1                                | Simple UART:<br>Only two pins (TX and RX)  | <ul> <li>Full UART (TX, RX, DTR, DCD, DSR, RI, RTS, and CTS) requires either:</li> <li>Two Processing System (PS) pins (RX and TX) through MIO and six additional Programmable Logic (PL) pins, or</li> <li>Eight Programmable Logic (PL) pins</li> </ul> |
| Debug Trace Ports                        | Yes: Up to 16 trace bits                   | Yes: Up to 32 trace bits                                                                                                                                                                                                                                  |
| Processor JTAG                           | Yes                                        | Yes                                                                                                                                                                                                                                                       |
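
The pin-sharing rule described above can be sketched as a simple allocation check: a peripheral keeps its MIO pin group if none of its pins are taken, otherwise it falls back to EMIO when Table 6 lists EMIO as available. This is only an illustration. The function name, the request format, and the pin numbers in the usage note are hypothetical, the 0 to 77 pin numbering is an assumption, and real pin groups come from the PS Configuration Wizard.

```python
MIO_PIN_COUNT = 78

# EMIO availability per Table 6; Quad-SPI, NAND, and USB 2.0 are listed as "No".
EMIO_CAPABLE = {"SDIO", "SPI", "I2C", "CAN", "GPIO", "GigE", "UART"}


def plan_pins(requests):
    """Assign each request to MIO if its pins are free, else to EMIO if allowed.

    requests maps an instance name to (peripheral kind, iterable of MIO pins).
    Returns (on_mio, on_emio): a dict of name -> pin list, and a list of names.
    """
    used = {}
    on_mio, on_emio = {}, []
    for name, (kind, pins) in requests.items():
        pins = sorted(set(pins))
        if any(not 0 <= p < MIO_PIN_COUNT for p in pins):
            raise ValueError(f"{name}: pin outside 0..{MIO_PIN_COUNT - 1}")
        clash = [p for p in pins if p in used]
        if not clash:
            for p in pins:
                used[p] = name
            on_mio[name] = pins
        elif kind in EMIO_CAPABLE:
            on_emio.append(name)  # route through the PL instead
        else:
            raise ValueError(f"{name}: MIO pins {clash} are taken and {kind} has no EMIO route")
    return on_mio, on_emio
```

With the made-up request `{"qspi0": ("Quad-SPI", range(0, 6)), "uart0": ("UART", [4, 5])}`, the Quad-SPI interface keeps pins 0 to 5 and the UART is reported as an EMIO candidate.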

# Transceiver (PS-GTR)

The four PS-GTR transceivers, which reside in the full power domain (FPD), support data rates of up to 6.0Gb/s. Not all protocols can be pinned out at the same time. At any given time, four differential pairs can be pinned out using the transceivers. The selection is user programmable via the high-speed I/O multiplexer (HS-MIO).

- A Quad transceiver PS-GTR (TX/RX pair) able to support the following standards simultaneously
  - x1, x2, or x4 lane of PCIe at Gen1 (2.5Gb/s) or Gen2 (5.0Gb/s) rates
  - 1 or 2 lanes of DisplayPort (TX only) at 1.62Gb/s, 2.7Gb/s, or 5.4Gb/s
  - 1 or 2 SATA channels at 1.5Gb/s, 3.0Gb/s, or 6.0Gb/s
  - 1 or 2 USB3.0 channels at 5.0Gb/s
  - 1–4 Ethernet SGMII channels at 1.25Gb/s
- Provides flexible host-programmable multiplexing function for connecting the transceiver resources to the PS masters (DisplayPort, PCIe, Serial-ATA, USB3.0, and GigE).



#### **HS-MIO**

The function of the HS-MIO is to multiplex access from the high-speed PS peripheral to the differential pair on the PS-GTR transceiver as defined in the configuration registers. Up to 4 channels of the transceiver are available for use by the high-speed interfaces in the PS.

Table 7: HS-MIO Peripheral Interface Mapping

| Peripheral Interface   | Lane0  | Lane1  | Lane2  | Lane3  |
|------------------------|--------|--------|--------|--------|
| PCIe (x1, x2 or x4)    | PCIe0  | PCIe1  | PCIe2  | PCIe3  |
| SATA (1 or 2 channels) | SATA0  | SATA1  | SATA0  | SATA1  |
| DisplayPort (TX only)  | DP1    | DP0    | DP1    | DP0    |
| USB0                   | USB0   | USB0   | USB0   | -      |
| USB1                   | -      | -      | -      | USB1   |
| SGMII0                 | SGMII0 | -      | -      | -      |
| SGMII1                 | -      | SGMII1 | -      | -      |
| SGMII2                 | -      | -      | SGMII2 | -      |
| SGMII3                 | -      | -      | -      | SGMII3 |
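
Table 7 can be read as a per-lane menu of signals. The sketch below transcribes that menu into a lookup and checks a proposed lane plan against it. The function name is hypothetical, and the rule that one signal may occupy only one lane is an assumption made for illustration rather than a statement from the table.

```python
# Per-lane options transcribed from Table 7 (HS-MIO Peripheral Interface Mapping).
LANE_OPTIONS = {
    0: {"PCIe0", "SATA0", "DP1", "USB0", "SGMII0"},
    1: {"PCIe1", "SATA1", "DP0", "USB0", "SGMII1"},
    2: {"PCIe2", "SATA0", "DP1", "USB0", "SGMII2"},
    3: {"PCIe3", "SATA1", "DP0", "USB1", "SGMII3"},
}


def check_lane_plan(plan):
    """Return a list of problems for a {lane: signal} plan; empty means it fits Table 7."""
    problems = []
    first_lane = {}
    for lane, signal in sorted(plan.items()):
        if lane not in LANE_OPTIONS:
            problems.append(f"lane {lane} does not exist")
            continue
        if signal not in LANE_OPTIONS[lane]:
            problems.append(f"{signal} is not available on lane {lane}")
            continue
        # Assumption: a signal such as SATA0 or USB0 can be routed to one lane only.
        if signal in first_lane:
            problems.append(f"{signal} is already on lane {first_lane[signal]}")
        else:
            first_lane[signal] = lane
    return problems
```

For instance, `check_lane_plan({0: "PCIe0", 1: "PCIe1", 2: "USB0", 3: "SGMII3"})` returns an empty list, while requesting `USB0` on lane 3 is flagged because the table offers only `USB1` there.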

#### **PS-PL Interface**

The PS-PL interface includes:

- AMBA AXI4 interfaces for primary data communication
  - Six 128-bit/64-bit/32-bit High Performance (HP) Slave AXI interfaces from PL to PS.
    - Four 128-bit/64-bit/32-bit HP AXI interfaces from PL to PS DDR.
    - Two 128-bit/64-bit/32-bit high-performance coherent (HPC) ports from PL to cache coherent interconnect (CCI).
  - Two 128-bit/64-bit/32-bit HP Master AXI interfaces from PS to PL.
  - One 128-bit/64-bit/32-bit interface from PL to RPU in PS (PL\_LPD) for low latency access to OCM.
  - One 128-bit/64-bit/32-bit AXI interface from RPU in PS to PL (LPD\_PL) for low latency access to PL.
  - One 128-bit AXI interface (ACP port) for I/O coherent access from PL to Cortex-A53 cache memory. This interface provides coherency in hardware for Cortex-A53 cache memory.
  - One 128-bit AXI interface (ACE port) for fully coherent access from PL to Cortex-A53. This interface provides coherency in hardware for Cortex-A53 cache memory and the PL.
- Clocks and resets
  - Four PS clock outputs to the PL with start/stop control.
  - Four PS reset outputs to the PL.



#### 3-State Digitally Controlled Impedance and Low Power I/O Features

The 3-state Digitally Controlled Impedance (T\_DCI) can control the output drive impedance (series termination) or can provide parallel termination of an input signal to  $V_{CCO}$  or split (Thevenin) termination to  $V_{CCO}/2$ . This allows users to eliminate off-chip termination for signals using T\_DCI. In addition to board space savings, the termination automatically turns off when in output mode or when 3-stated, saving considerable power compared to off-chip termination. The I/Os also have low power modes for IBUF and IDELAY to provide further power savings, especially when used to implement memory interfaces.

# I/O Logic

#### Input and Output Delay

All inputs and outputs can be configured as either combinatorial or registered. Double data rate (DDR) is supported by all inputs and outputs. Any input or output can be individually delayed by up to 1,250ps of delay with a resolution of 5–15ps. Such delays are implemented as IDELAY and ODELAY. The number of delay steps can be set by configuration and can also be incremented or decremented while in use. The IDELAY and ODELAY can be cascaded together to double the amount of delay in a single direction.

#### **ISERDES** and **OSERDES**

Many applications combine high-speed, bit-serial I/O with slower parallel operation inside the device. This requires a serializer and deserializer (SerDes) inside the I/O logic. Each I/O pin possesses an IOSERDES (ISERDES and OSERDES) capable of performing serial-to-parallel or parallel-to-serial conversions with programmable widths of 2, 4, or 8 bits. These I/O logic features enable high-performance interfaces, such as Gigabit Ethernet/1000BaseX/SGMII, to be moved from the transceivers to the SelectIO™ interface.
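
As a behavioral illustration of the serial-to-parallel conversion described above, the following sketch groups a bit stream into words of 2, 4, or 8 bits and back again. It models only the data reordering, not the hardware primitives; the function names are hypothetical, and the choice of placing the first received bit in the least significant position is an assumption.

```python
ALLOWED_WIDTHS = (2, 4, 8)


def deserialize(bits, width):
    """Group a serial bit stream into parallel words (first bit received = LSB)."""
    if width not in ALLOWED_WIDTHS:
        raise ValueError("width must be 2, 4, or 8")
    if len(bits) % width:
        raise ValueError("bit stream length must be a multiple of the width")
    words = []
    for start in range(0, len(bits), width):
        word = 0
        for k, bit in enumerate(bits[start:start + width]):
            word |= (bit & 1) << k
        words.append(word)
    return words


def serialize(words, width):
    """Inverse of deserialize: emit each word LSB first."""
    if width not in ALLOWED_WIDTHS:
        raise ValueError("width must be 2, 4, or 8")
    return [(word >> k) & 1 for word in words for k in range(width)]


def parallel_clock_mhz(line_rate_mbps, width):
    """Word rate inside the device for a given serial line rate."""
    return line_rate_mbps / width
```

With an 8-bit width, a 1,250Mb/s serial stream is handled internally as 156.25 million words per second, which is the point of moving the fast bit-serial side into the I/O logic.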

# **High-Speed Serial Transceivers**

Ultra-fast serial data transmission between devices on the same PCB, over backplanes, and across even longer distances is becoming increasingly important for scaling to 100Gb/s and 400Gb/s line cards. Specialized dedicated on-chip circuitry and differential I/O capable of coping with the signal integrity issues are required at these high data rates.

Two types of transceivers are used in the XA Zynq UltraScale+ MPSoC: GTH and PS-GTR. Both transceivers are arranged in groups of four, known as a transceiver Quad. Each serial transceiver is a combined transmitter and receiver. Table 8 compares the available transceivers.



Three sets of programmable frequency dividers (D, M, and O) are programmable by configuration and during normal operation via the Dynamic Reconfiguration Port (DRP). The pre-divider D reduces the input frequency and feeds one input of the phase/frequency comparator. The feedback divider M acts as a multiplier because it divides the VCO output frequency before feeding the other input of the phase comparator. D and M must be chosen appropriately to keep the VCO within its specified frequency range. The VCO has eight equally-spaced output phases (0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°). Each phase can be selected to drive one of the output dividers, and each divider is programmable by configuration to divide by any integer from 1 to 128.

The MMCM has three input-jitter filter options: low bandwidth, high bandwidth, or optimized mode. Low-Bandwidth mode has the best jitter attenuation. High-Bandwidth mode has the best phase offset. Optimized mode allows the tools to find the best setting.

The MMCM can have a fractional counter in either the feedback path (acting as a multiplier) or in one output path. Fractional counters allow non-integer increments of 1/8 and can thus increase frequency synthesis capabilities by a factor of 8. The MMCM can also provide fixed or dynamic phase shift in small increments that depend on the VCO frequency. At 1,600MHz, the phase-shift timing increment is 11.2ps.
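
The divider relationships described above reduce to f_vco = f_in × M / D and f_out = f_vco / O. The sketch below applies them, with the 1/8 granularity of the fractional counters checked explicitly. The function names are hypothetical, the VCO limits are caller-supplied placeholders that must come from the device data sheet, and the phase-step helper assumes one step is 1/56 of the VCO period, which reproduces the 11.2ps figure quoted above for 1,600MHz.

```python
from fractions import Fraction


def mmcm_frequencies(f_in_mhz, d, m, o, vco_min_mhz, vco_max_mhz):
    """Return (f_vco, f_out) in MHz for pre-divider D, feedback divider M, output divider O."""
    m, o = Fraction(m), Fraction(o)
    # Fractional counters move in steps of 1/8, so 8*M and 8*O must be whole numbers.
    for name, value in (("M", m), ("O", o)):
        if (value * 8).denominator != 1:
            raise ValueError(f"{name} must be a multiple of 1/8")
    if int(d) != d or d < 1:
        raise ValueError("D must be a positive integer")
    f_vco = Fraction(f_in_mhz) * m / d
    if not vco_min_mhz <= f_vco <= vco_max_mhz:
        raise ValueError(f"VCO at {float(f_vco)} MHz is outside the allowed range")
    return f_vco, f_vco / o


def phase_step_ps(f_vco_mhz):
    """Phase-shift increment in ps, assuming one step is 1/56 of the VCO period."""
    vco_period_ps = 1_000_000 / f_vco_mhz
    return vco_period_ps / 56
```

For example, a 100MHz input with D = 1, M = 10.125, and O = 5 gives a 1,012.5MHz VCO and a 202.5MHz output, a frequency that an integer-only feedback divider could not reach from the same input.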

#### **PLL**

With fewer features than the MMCM, the two PLLs in a clock management tile are primarily present to provide the necessary clocks to the dedicated memory interface circuitry. The circuit at the center of the PLLs is similar to the MMCM, with PFD feeding a VCO and programmable M, D, and O counters. There are two divided outputs to the device fabric per PLL as well as one clock plus one enable signal to the memory interface circuitry.

Zynq UltraScale+ MPSoCs are equipped with five additional PLLs in the PS for independently configuring the four primary clock domains within the PS: the APU, the RPU, the DDR controller, and the I/O peripherals.

# **Clock Distribution**

Clocks are distributed throughout Zynq UltraScale+ MPSoCs via buffers that drive a number of vertical and horizontal tracks. There are 24 horizontal clock routes per clock region and 24 vertical clock routes per clock region with 24 additional vertical clock routes adjacent to the MMCM and PLL. Within a clock region, clock signals are routed to the device logic (CLBs, etc.) via 16 gateable leaf clocks.

Several types of clock buffers are available. The BUFGCE and BUFCE\_LEAF buffers provide clock gating at the global and leaf levels, respectively. BUFGCTRL provides glitchless clock muxing and gating capability. BUFGCE\_DIV has clock gating capability and can divide a clock by 1 to 8. BUFG\_GT performs clock division from 1 to 8 for the transceiver clocks. In MPSoCs, clocks can be transferred from the PS to the PL using dedicated buffers.



# **UltraRAM**

UltraRAM is a high-density, dual-port, synchronous memory block used in some UltraScale+ families. Both of the ports share the same clock and can address all of the 4K x 72 bits. Each port can independently read from or write to the memory array. UltraRAM supports two types of write enable schemes. The first mode is consistent with the block RAM byte write enable mode. The second mode allows gating the data and parity byte writes separately. UltraRAM blocks can be connected together to create larger memory arrays. Dedicated routing in the UltraRAM column enables the entire column height to be connected together. This makes UltraRAM an ideal solution for replacing external memories such as SRAM. Cascadable anywhere from 288Kb to 36Mb, UltraRAM provides the flexibility to fulfill many different memory requirements.
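
The capacity figures above follow directly from the 4K x 72 organization. The short sketch below works through the arithmetic and estimates how many blocks a larger array needs, assuming simple tiling in width and depth; the helper name is hypothetical, and real tools may pack memories differently.

```python
URAM_DEPTH = 4 * 1024                  # addressable words per block
URAM_WIDTH = 72                        # bits per word
URAM_BITS = URAM_DEPTH * URAM_WIDTH    # 294,912 bits, i.e. 288Kb

# The 36Mb upper bound quoted above corresponds to this many cascaded blocks.
MAX_CASCADE_BLOCKS = (36 * 1024 * 1024) // URAM_BITS


def uram_blocks_needed(depth, width):
    """Blocks required for a depth x width memory, tiling 72 bits wide and 4K deep."""
    if depth < 1 or width < 1:
        raise ValueError("depth and width must be positive")
    columns = -(-width // URAM_WIDTH)   # ceiling division
    rows = -(-depth // URAM_DEPTH)
    return columns * rows
```

A 16K x 144 memory, for example, tiles as two blocks wide by four blocks deep, for eight blocks in total.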

#### **Error Detection and Correction**

Each 64-bit-wide UltraRAM can generate, store and utilize eight additional Hamming code bits and perform single-bit error correction and double-bit error detection (ECC) during the read process.
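
To illustrate how eight check bits on a 64-bit word give single-bit correction and double-bit detection, the following sketch implements a textbook extended Hamming (72,64) code. It is a generic construction chosen for illustration; the bit ordering and parity layout used inside the UltraRAM hardware are not described here and may differ. All names are hypothetical.

```python
PARITY_POSITIONS = (1, 2, 4, 8, 16, 32, 64)
# Positions 1..71 that are not powers of two carry the 64 data bits.
DATA_POSITIONS = [p for p in range(1, 72) if p not in PARITY_POSITIONS]


def _syndrome(code):
    """XOR of the positions (1..71) of all set bits; zero for a valid codeword."""
    s = 0
    for pos in range(1, 72):
        if code[pos]:
            s ^= pos
    return s


def encode(data):
    """Encode a 64-bit integer as a list of 72 bits (index 0 = overall parity)."""
    code = [0] * 72
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (data >> i) & 1
    s = _syndrome(code)
    for p in PARITY_POSITIONS:
        code[p] = 1 if s & p else 0   # drives the syndrome to zero
    code[0] = sum(code) & 1           # makes the total number of ones even
    return code


def decode(code):
    """Return (data, status); status is 'ok', 'corrected', or an error label."""
    code = list(code)
    s = _syndrome(code)
    overall_odd = sum(code) & 1
    if s == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        if s > 71:
            return None, "uncorrectable"
        code[s] ^= 1                  # s == 0 means the overall parity bit flipped
        status = "corrected"
    else:
        return None, "double-error"   # non-zero syndrome with even overall parity
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        data |= code[pos] << i
    return data, status
```

Seven of the check bits locate a single flipped bit, and the eighth (overall parity) distinguishes a correctable single error from a detectable double error.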

# **Digital Signal Processing**

DSP applications use many binary multipliers and accumulators, best implemented in dedicated DSP slices. All UltraScale architecture-based devices have many dedicated, low-power DSP slices, combining high speed with small size while retaining system design flexibility.

Each DSP slice fundamentally consists of a dedicated 27 × 18 bit twos complement multiplier and a 48-bit accumulator. The multiplier can be dynamically bypassed, and two 48-bit inputs can feed a single-instruction-multiple-data (SIMD) arithmetic unit (dual 24-bit add/subtract/accumulate or quad 12-bit add/subtract/accumulate), or a logic unit that can generate any one of ten different logic functions of the two operands.
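
The SIMD mode mentioned above splits the 48-bit datapath into independent lanes. The sketch below models a lane-wise add on 48-bit words to show that carries do not propagate between lanes. It is a behavioral illustration only: the function name is hypothetical, and treating each lane as wrapping on overflow is an assumption, since the handling of per-lane carry outputs is not covered here.

```python
def simd_add(a, b, lanes):
    """Add two 48-bit words as 1 x 48-bit, 2 x 24-bit, or 4 x 12-bit independent lanes."""
    if lanes not in (1, 2, 4):
        raise ValueError("lanes must be 1, 2, or 4")
    width = 48 // lanes
    mask = (1 << width) - 1
    result = 0
    for i in range(lanes):
        shift = i * width
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift   # drop the carry out of this lane
    return result
```

Adding `0xFFF` and `0x001` in quad 12-bit mode yields 0 because the carry out of lane 0 is discarded, whereas the same operands in single 48-bit mode yield `0x1000`.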

The DSP includes an additional pre-adder, typically used in symmetrical filters. This pre-adder improves performance in densely packed designs and reduces the DSP slice count by up to 50%. The 96-bit-wide XOR function, programmable to 12, 24, 48, or 96-bit widths, enables performance improvements when implementing forward error correction and cyclic redundancy checking algorithms.
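
The saving from the pre-adder comes from the symmetry of the coefficients: when h[i] equals h[N-1-i], the two samples that share a coefficient can be added first and multiplied once. The following sketch shows the idea for one output sample of an even-length symmetric FIR filter; the function name and argument layout are hypothetical.

```python
def fir_symmetric(window, half_coeffs):
    """One output of a symmetric FIR filter using a pre-add before each multiply.

    window holds N samples and half_coeffs holds the first N/2 coefficients;
    the remaining coefficients are assumed to mirror them.
    """
    n = len(window)
    if n != 2 * len(half_coeffs):
        raise ValueError("window must be twice as long as half_coeffs")
    acc = 0
    for i, coeff in enumerate(half_coeffs):
        acc += (window[i] + window[n - 1 - i]) * coeff   # pre-add, then one multiply
    return acc
```

An 8-tap symmetric filter therefore needs four multiplies per output instead of eight, which corresponds to the up-to-50% reduction mentioned above.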

The DSP also includes a 48-bit-wide pattern detector that can be used for convergent or symmetric rounding. The pattern detector is also capable of implementing 96-bit-wide logic functions when used in conjunction with the logic unit.

The DSP slice provides extensive pipelining and extension capabilities that enhance the speed and efficiency of many applications beyond digital signal processing, such as wide dynamic bus shifters, memory address generators, wide bus multiplexers, and memory-mapped I/O register files. The accumulator can also be used as a synchronous up/down counter.



The Full Power Domain (FPD) consists of the following major blocks:

- Application Processing Unit (APU)
- DMA (FP-DMA)
- Graphics Processing Unit (GPU)
- Dynamic Memory Controller (DDRC)
- High-Speed I/O Peripherals

The Low Power Domain (LPD) consists of the following major blocks:

- Real-Time Processing Unit (RPU)
- DMA (LP-DMA)
- Platform Management Unit (PMU)
- Configuration Security Unit (CSU)
- Low-Speed I/O Peripherals
- Static Memory Interfaces

The Battery Power Domain (BPD) is the lowest power domain of the Zynq UltraScale+ MPSoC processing system. In this mode, the entire PS is powered off except for the Real-Time Clock (RTC) and battery-backed RAM (BBRAM).

#### **Power Examples**

Power for the Zynq UltraScale+ MPSoCs varies depending on the utilization of the PL resources, and the frequency of the PS and PL. To estimate power, use the Xilinx Power Estimator (XPE) at:

http://www.xilinx.com/products/design\_tools/logic\_design/xpe.htm

# **PS Boot and Device Configuration**

Zynq UltraScale+ MPSoCs use a multi-stage boot process that supports both a non-secure and a secure boot. The PS is the master of the boot and configuration process. For a secure boot, the AES-GCM and SHA-3/384 blocks decrypt and authenticate the images, while the 4,096-bit RSA block authenticates the image.

Upon reset, the device mode pins are read to determine the primary boot device to be used: NAND, Quad-SPI, SD, eMMC, or JTAG. JTAG can only be used as a non-secure boot source and is intended for debugging purposes. The CSU executes code out of on-chip ROM and copies the first stage boot loader (FSBL) from the boot device to the OCM.

After copying the FSBL to OCM, one of the processors, either the Cortex-A53 or Cortex-R5, executes the FSBL. Xilinx supplies example FSBLs or users can create their own. The FSBL initiates the boot of the PS and can load and configure the PL, or configuration of the PL can be deferred to a later stage. The FSBL typically loads either a user application or an optional second stage boot loader (SSBL), such as U-Boot. Users obtain example SSBL from Xilinx or a third party, or they can create their own SSBL. The SSBL continues the boot process by loading code from any of the primary boot devices or from other sources such as USB, Ethernet, etc. If the FSBL did not configure the PL, the SSBL can do so, or again, the configuration can be deferred to a later stage.




The static memory interface controller (NAND, eMMC, or Quad-SPI) is configured using default settings. To improve device configuration speed, these settings can be modified by information provided in the boot image header. The ROM boot image is not user readable or callable after boot.

## **Hardware and Software Debug Support**

The debug system used in Zynq UltraScale+ MPSoCs is based on the ARM CoreSight architecture. It uses ARM CoreSight components including an embedded trace controller (ETC), an embedded trace Macrocell (ETM) for each Cortex-A53 and Cortex-R5 processor, and a system trace Macrocell (STM). This enables advanced debug features like event trace, debug breakpoints and triggers, cross-trigger, and debug bus dump to memory. The programmable logic can be debugged with the Xilinx Vivado Logic Analyzer.

#### **Debug Ports**

Three JTAG ports are available and can be chained together or used separately. When chained together, a single port is used for chip-level JTAG functions, ARM processor code downloads and run-time control operations, PL configuration, and PL debug with the Vivado Logic Analyzer. This enables tools such as the Xilinx Software Development Kit (SDK) and Vivado Logic Analyzer to share a single download cable from Xilinx.

When the JTAG chain is split, one port is used to directly access the ARM DAP interface. This CoreSight interface enables the use of ARM-compliant debug and software development tools such as Development Studio 5 (DS-5™). The other JTAG port can then be used by the Xilinx FPGA tools for access to the PL, including configuration bitstream downloads and PL debug with the Vivado Logic Analyzer. In this mode, users can download to and debug the PL in the same manner as a stand-alone FPGA.



# **Revision History**

The following table shows the revision history for this document:

| Date       | Version | Description of Revisions                                                                 |
|------------|---------|------------------------------------------------------------------------------------------|
| 07/13/2017 | 1.2     | Updated Table 3, Application Processing Unit (APU), and Real-Time Processing Unit (RPU). |
| 03/23/2017 | 1.1     | Updated Table 3, Table 10, and Figure 3.                                                 |
| 11/09/2016 | 1.0     | Initial Xilinx release.                                                                  |

# Disclaimer

The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use of Xilinx products. To the maximum extent permitted by applicable law: (1) Materials are made available "AS IS" and with all faults, Xilinx hereby DISCLAIMS ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related to, arising under, or in connection with, the Materials (including your use of the Materials), including for any direct, indirect, special, incidental, or consequential loss or damage (including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same. Xilinx assumes no obligation to correct any errors contained in the Materials or to notify you of updates to the Materials or to product specifications. You may not reproduce, modify, distribute, or publicly display the Materials without prior written consent. Certain products are subject to the terms and conditions of Xilinx's limited warranty, please refer to Xilinx's Terms of Sale which can be viewed at <a href="http://www.xilinx.com/legal.htm#tos">http://www.xilinx.com/legal.htm#tos</a>; IP cores may be subject to warranty and support terms contained in a license issued to you by Xilinx. 
Xilinx products are not designed or intended to be fail-safe or for use in any application requiring fail-safe performance; you assume sole risk and liability for use of Xilinx products in such critical applications, please refer to Xilinx's Terms of Sale which can be viewed at <a href="http://www.xilinx.com/legal.htm#tos">http://www.xilinx.com/legal.htm#tos</a>.

This document contains preliminary information and is subject to change without notice. Information provided herein relates to products and/or services not yet available for sale, and provided solely for information purposes and are not intended, or to be construed, as an offer for sale or an attempted commercialization of the products and/or services referred to herein.

## **Automotive Applications Disclaimer**

AUTOMOTIVE PRODUCTS (IDENTIFIED AS "XA" IN THE PART NUMBER) ARE NOT WARRANTED FOR USE IN THE DEPLOYMENT OF AIRBAGS OR FOR USE IN APPLICATIONS THAT AFFECT CONTROL OF A VEHICLE ("SAFETY APPLICATION") UNLESS THERE IS A SAFETY CONCEPT OR REDUNDANCY FEATURE CONSISTENT WITH THE ISO 26262 AUTOMOTIVE SAFETY STANDARD ("SAFETY DESIGN"). CUSTOMER SHALL, PRIOR TO USING OR DISTRIBUTING ANY SYSTEMS THAT INCORPORATE PRODUCTS, THOROUGHLY TEST SUCH SYSTEMS FOR SAFETY PURPOSES. USE OF PRODUCTS IN A SAFETY APPLICATION WITHOUT A SAFETY DESIGN IS FULLY AT THE RISK OF CUSTOMER, SUBJECT ONLY TO APPLICABLE LAWS AND REGULATIONS GOVERNING LIMITATIONS ON PRODUCT LIABILITY.