Welcome to **E-XFL.COM**

**Embedded - System On Chip (SoC): The Heart of Modern Embedded Systems**

Embedded - System On Chip (SoC) refers to an integrated circuit that consolidates all the essential components of a computer system into a single chip. This includes a microprocessor, memory, and other peripherals, all packed into one compact and efficient package. SoCs are designed to provide a complete computing solution, optimizing both space and power consumption, making them ideal for a wide range of embedded applications.

**What are Embedded - System On Chip (SoC)?**

**System On Chip (SoC)** integrates multiple functions of a computer or electronic system onto a single chip.

| Details | |
|----------------------------|-------------------------------------------------------------------------------------------------------|
| Product Status | Active |
| Architecture | MCU, FPGA |
| Core Processor | Quad ARM® Cortex®-A53 MPCore™ with CoreSight™, Dual ARM® Cortex™-R5 with CoreSight™, ARM Mali™-400 MP2 |
| Flash Size | - |
| RAM Size | 256KB |
| Peripherals | DMA, WDT |
| Connectivity | CANbus, EBI/EMI, Ethernet, I<sup>2</sup>C, MMC/SD/SDIO, SPI, UART/USART, USB OTG |
| Speed | 500MHz, 600MHz, 1.2GHz |
| Primary Attributes | Zynq® UltraScale+™ FPGA, 504K+ Logic Cells |
| Operating Temperature | -40°C ~ 100°C (TJ) |
| Package / Case | 1517-BBGA, FCBGA |
| Supplier Device Package | 1517-FCBGA (40x40) |
| Purchase URL | https://www.e-xfl.com/product-detail/xilinx/xczu7eg-l1ffvf1517i |

Email: info@E-XFL.COM
Address: Room A, 16/F, Full Win Commercial Centre, 573 Nathan Road, Mongkok, Hong Kong

#### ARM Mali-400 Based GPU

- Supports OpenGL ES 1.1 and 2.0
- Supports OpenVG 1.1
- GPU frequency: Up to 667MHz
- Single Geometry Processor, Two Pixel Processors
- Pixel Fill Rate: 2 Mpixels/sec/MHz
- Triangle Rate: 0.11 Mtriangles/sec/MHz
- 64KB L2 Cache
- Power island gating

## **External Memory Interfaces**

- Multi-protocol dynamic memory controller
- 32-bit or 64-bit interfaces to DDR4, DDR3, DDR3L, or LPDDR3 memories, and 32-bit interface to LPDDR4 memory
- ECC support in 64-bit and 32-bit modes
- Up to 32GB of address space using single or dual rank of 8-, 16-, or 32-bit-wide memories
- Static memory interfaces
  - eMMC4.51 Managed NAND flash support
  - ONFI3.1 NAND flash with 24-bit ECC
  - 1-bit SPI, 2-bit SPI, 4-bit SPI (Quad-SPI), or two Quad-SPI (8-bit) serial NOR flash

### **8-Channel DMA Controller**

- Two DMA controllers of 8-channels each
- Memory-to-memory, memory-to-peripheral, peripheral-to-memory, and scatter-gather transaction support

#### **Serial Transceivers**

- Four dedicated PS-GTR receivers and transmitters support up to 6.0Gb/s data rates
  - Supports SGMII tri-speed Ethernet, PCI Express® Gen2, Serial-ATA (SATA), USB3.0, and DisplayPort

# **Dedicated I/O Peripherals and Interfaces**

- PCI Express: Compliant with PCIe® 2.1 base specification
  - Root complex and End Point configurations
  - x1, x2, and x4 at Gen1 or Gen2 rates
- SATA Host
  - 1.5, 3.0, and 6.0Gb/s data rates as defined by SATA Specification, revision 3.1
  - Supports up to two channels
- DisplayPort Controller
  - Up to 5.4Gb/s rate
  - Up to two TX lanes (no RX support)
- Four 10/100/1000 tri-speed Ethernet MAC peripherals with IEEE Std 802.3 and IEEE Std 1588 revision 2.0 support
  - Scatter-gather DMA capability
  - Recognition of IEEE Std 1588 rev.2 PTP frames
  - GMII, RGMII, and SGMII interfaces
  - Jumbo frames
- Two USB 3.0/2.0 Device, Host, or OTG peripherals, each supporting up to 12 endpoints
  - USB 3.0/2.0 compliant device IP core
  - Super-speed, high-speed, full-speed, and low-speed modes
  - Intel XHCI-compliant USB host
- Two full CAN 2.0B-compliant CAN bus interfaces
  - CAN 2.0-A and CAN 2.0-B and ISO 11898-1 standard compliant
- Two SD/SDIO 2.0/eMMC4.51 compliant controllers
- Two full-duplex SPI ports with three peripheral chip selects
- Two high-speed UARTs (up to 1Mb/s)
- Two master and slave I2C interfaces
- Up to 78 flexible multiplexed I/O (MIO) (up to three banks of 26 I/Os) for peripheral pin assignment
- Up to 96 EMIOs (up to three banks of 32 I/Os) connected to the PL

#### Interconnect

- High-bandwidth connectivity within PS and between PS and PL
- ARM AMBA® AXI4-based
- QoS support for latency and bandwidth control
- Cache Coherent Interconnect (CCI)

### **System Memory Management**

- System Memory Management Unit (SMMU)
- Xilinx Memory Protection Unit (XMPU)

### **Platform Management Unit**

- Power gates PS peripherals, power islands, and power domains
- Clock gates PS peripheral user firmware option

## **Configuration and Security Unit**

- Boots PS and configures PL
- Supports secure and non-secure boot modes

### **System Monitor in PS**

- On-chip voltage and temperature sensing

Table 4: Zynq UltraScale+ MPSoC: EG Device-Package Combinations and Maximum I/Os

| Package<sup>(1)(2)(3)(4)(5)</sup> | Package Dimensions (mm) | ZU2EG | ZU3EG | ZU4EG | ZU5EG | ZU6EG | ZU7EG | ZU9EG | ZU11EG | ZU15EG | ZU17EG | ZU19EG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY | HD, HP<br>GTH, GTY |
| SBVA484<sup>(6)</sup> | 19x19 | 24, 58<br>0, 0 | 24, 58<br>0, 0 | | | | | | | | | |
| SFVA625 | 21x21 | 24, 156<br>0, 0 | 24, 156<br>0, 0 | | | | | | | | | |
| SFVC784<sup>(7)</sup> | 23x23 | 96, 156<br>0, 0 | 96, 156<br>0, 0 | 96, 156<br>4, 0 | 96, 156<br>4, 0 | | | | | | | |
| FBVB900 | 31x31 | | | 48, 156<br>16, 0 | 48, 156<br>16, 0 | | 48, 156<br>16, 0 | | | | | |
| FFVC900 | 31x31 | | | | | 48, 156<br>16, 0 | | 48, 156<br>16, 0 | | 48, 156<br>16, 0 | | |
| FFVB1156 | 35x35 | | | | | 120, 208<br>24, 0 | | 120, 208<br>24, 0 | | 120, 208<br>24, 0 | | |
| FFVC1156 | 35x35 | | | | | | 48, 312<br>20, 0 | | 48, 312<br>20, 0 | | | |
| FFVB1517 | 40x40 | | | | | | | | 72, 416<br>16, 0 | | 72, 572<br>16, 0 | 72, 572<br>16, 0 |
| FFVF1517 | 40x40 | | | | | | 48, 416<br>24, 0 | | 48, 416<br>32, 0 | | | |
| FFVC1760 | 42.5x42.5 | | | | | | | | 96, 416<br>32, 16 | | 96, 416<br>32, 16 | 96, 416<br>32, 16 |
| FFVD1760 | 42.5x42.5 | | | | | | | | | | 48, 260<br>44, 28 | 48, 260<br>44, 28 |
| FFVE1924 | 45x45 | | | | | | | | | | 96, 572<br>44, 0 | 96, 572<br>44, 0 |

#### Notes:

1. Go to Ordering Information for package designation details.
2. FB/FF packages have 1.0mm ball pitch. SB/SF packages have 0.8mm ball pitch.
3. All device package combinations bond out 4 PS-GTR transceivers.
4. All device package combinations bond out 214 PS I/O except ZU2EG and ZU3EG in the SBVA484 and SFVA625 packages, which bond out 170 PS I/Os.
5. Packages with the same last letter and number sequence, e.g., A484, are footprint compatible with all other UltraScale devices with the same sequence. The footprint compatible devices within this family are outlined.
6. All 58 HP I/O pins are powered by the same $V_{CCO}$ supply.
7. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s.
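
As a reading aid for Table 4, the sketch below encodes the ZU7EG column as a small lookup. Each table cell lists the HD and HP I/O counts on its first line and the GTH and GTY transceiver counts on its second. The names `PackageIO`, `pl_io`, and `ZU7EG` are illustrative only and do not come from any Xilinx tool; the PS I/O figure is taken from note 4.

```python
from dataclasses import dataclass

PS_IO = 214  # note 4: PS I/Os bonded out in these packages


@dataclass(frozen=True)
class PackageIO:
    """One Table 4 cell: 'HD, HP' over 'GTH, GTY' (hypothetical helper type)."""
    hd: int
    hp: int
    gth: int
    gty: int

    @property
    def pl_io(self) -> int:
        # Sum of the HD and HP I/O counts listed in the cell.
        return self.hd + self.hp


# ZU7EG column of Table 4, keyed by package name.
ZU7EG = {
    "FBVB900": PackageIO(hd=48, hp=156, gth=16, gty=0),
    "FFVC1156": PackageIO(hd=48, hp=312, gth=20, gty=0),
    "FFVF1517": PackageIO(hd=48, hp=416, gth=24, gty=0),
}

for name, pkg in ZU7EG.items():
    print(f"{name}: {pkg.pl_io} HD+HP I/O, {PS_IO} PS I/O, "
          f"{pkg.gth} GTH, {pkg.gty} GTY")
```

The same structure extends to the other device columns by copying their cells from the table.
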
Table 5: Zynq UltraScale+ MPSoC: EV Device Feature Summary

| | ZU4EV | ZU5EV | ZU7EV |
|---|---|---|---|
| Application Processing Unit | Quad-core ARM Cortex-A53 MPCore with CoreSight; NEON & Single/Double Precision Floating Point; 32KB/32KB L1 Cache, 1MB L2 Cache | | |
| Real-Time Processing Unit | Dual-core ARM Cortex-R5 with CoreSight; Single/Double Precision Floating Point; 32KB/32KB L1 Cache, and TCM | | |
| Embedded and External Memory | 256KB On-Chip Memory w/ECC; External DDR4; DDR3; DDR3L; LPDDR4; LPDDR3; External Quad-SPI; NAND; eMMC | | |
| General Connectivity | 214 PS I/O; UART; CAN; USB 2.0; I2C; SPI; 32b GPIO; Real Time Clock; WatchDog Timers; Triple Timer Counters | | |
| High-Speed Connectivity | 4 PS-GTR; PCIe Gen1/2; Serial ATA 3.1; DisplayPort 1.2a; USB 3.0; SGMII | | |
| Graphic Processing Unit | ARM Mali™-400 MP2; 64KB L2 Cache | | |
| Video Codec | 1 | 1 | 1 |
| System Logic Cells | 192,150 | 256,200 | 504,000 |
| CLB Flip-Flops | 175,680 | 234,240 | 460,800 |
| CLB LUTs | 87,840 | 117,120 | 230,400 |
| Distributed RAM (Mb) | 2.6 | 3.5 | 6.2 |
| Block RAM Blocks | 128 | 144 | 312 |
| Block RAM (Mb) | 4.5 | 5.1 | 11.0 |
| UltraRAM Blocks | 48 | 64 | 96 |
| UltraRAM (Mb) | 14.0 | 18.0 | 27.0 |
| DSP Slices | 728 | 1,248 | 1,728 |
| CMTs | 4 | 4 | 8 |
| Max. HP I/O<sup>(1)</sup> | 156 | 156 | 416 |
| Max. HD I/O<sup>(2)</sup> | 96 | 96 | 48 |
| System Monitor | 2 | 2 | 2 |
| GTH Transceiver 16.3Gb/s<sup>(3)</sup> | 16 | 16 | 24 |
| GTY Transceivers 32.75Gb/s | 0 | 0 | 0 |
| Transceiver Fractional PLLs | 8 | 8 | 12 |
| PCIe Gen3 x16 and Gen4 x8 | 2 | 2 | 2 |
| 150G Interlaken | 0 | 0 | 0 |
| 100G Ethernet w/ RS-FEC | 0 | 0 | 0 |

#### Notes:

1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.
2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.
3. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s. See Table 6.

## **Zynq UltraScale+ MPSoCs**

A comprehensive device family, Zynq UltraScale+ MPSoCs offer single-chip, all programmable, heterogeneous multiprocessors that provide designers with software, hardware, interconnect, power, security, and I/O programmability. The range of devices in the Zynq UltraScale+ MPSoC family allows designers to target cost-sensitive as well as high-performance applications from a single platform using industry-standard tools. While each Zynq UltraScale+ MPSoC contains the same PS, the PL, Video hard blocks, and I/O resources vary between the devices.
Table 7: Zynq UltraScale+ MPSoC Device Features

| | CG Devices | EG Devices | EV Devices |
|-----|--------------------------|--------------------------|--------------------------|
| APU | Dual-core ARM Cortex-A53 | Quad-core ARM Cortex-A53 | Quad-core ARM Cortex-A53 |
| RPU | Dual-core ARM Cortex-R5 | Dual-core ARM Cortex-R5 | Dual-core ARM Cortex-R5 |
| GPU | - | Mali-400MP2 | Mali-400MP2 |
| VCU | - | - | H.264/H.265 |

The Zynq UltraScale+ MPSoCs are able to serve a wide range of applications including:

- Automotive: Driver assistance, driver information, and infotainment
- Wireless Communications: Support for multiple spectral bands and smart antennas
- Wired Communications: Multiple wired communications standards and context-aware network services
- Data Centers: Software Defined Networks (SDN), data pre-processing, and analytics
- Smarter Vision: Evolving video-processing algorithms, object detection, and analytics
- Connected Control/M2M: Flexible/adaptable manufacturing, factory throughput, quality, and safety

The UltraScale MPSoC architecture provides processor scalability from 32 to 64 bits with support for virtualization, the combination of soft and hard engines for real-time control, graphics/video processing, waveform and packet processing, next-generation interconnect and memory, advanced power management, and technology enhancements that deliver multi-level security, safety, and reliability. Xilinx offers a large number of soft IP for the Zynq UltraScale+ MPSoC family. Stand-alone and Linux device drivers are available for the peripherals in the PS and the PL. Xilinx's Vivado® Design Suite, SDK™, and PetaLinux development environments enable rapid product development for software, hardware, and systems engineers. The ARM-based PS also brings a broad range of third-party tools and IP providers in combination with Xilinx's existing PL ecosystem.
The Zynq UltraScale+ MPSoC family delivers unprecedented processing, I/O, and memory bandwidth in the form of an optimized mix of heterogeneous processing engines embedded in a next-generation, high-performance, on-chip interconnect with appropriate on-chip memory subsystems. The heterogeneous processing and programmable engines, which are optimized for different application tasks, enable the Zynq UltraScale+ MPSoCs to deliver the extensive performance and efficiency required to address next-generation smarter systems while retaining backwards compatibility with the original Zynq-7000 All Programmable SoC family. The UltraScale MPSoC architecture also incorporates multiple levels of security, increased safety, and advanced power management, which are critical requirements of next-generation smarter systems. Xilinx's embedded UltraFast™ design methodology fully exploits the

## **Processing System**

## **Application Processing Unit (APU)**

The key features of the APU include:

- 64-bit quad-core ARM Cortex-A53 MPCores. Features associated with each core include:
  - ARM v8-A Architecture
  - Operating target frequency: up to 1.5GHz
  - Single and double precision floating point: 4 SP / 2 DP FLOPs
  - NEON Advanced SIMD support with single and double precision floating point instructions
  - A64 instruction set in 64-bit operating mode, A32/T32 instruction set in 32-bit operating mode
  - Level 1 cache (separate instruction and data, 32KB each for each Cortex-A53 CPU)
    - 2-way set-associative Instruction Cache with parity support
    - 4-way set-associative Data Cache with ECC support
  - Integrated memory management unit (MMU) per processor core
  - TrustZone for secure mode operation
  - Virtualization support
- Ability to operate in single processor, symmetric quad processor, and asymmetric quad-processor modes
- Integrated 16-way set-associative 1MB Unified Level 2 cache with ECC support
- Interrupts and Timers
  - Generic interrupt controller (GIC-400)
  - ARM generic timers (4 timers per CPU)
  - One watchdog timer (WDT)
  - One global timer
  - Two triple timers/counters (TTC)
- Little and big endian support
  - Big endian support in BE8 mode
- CoreSight debug and trace support
  - Embedded Trace Macrocell (ETM) for instruction trace
  - Cross trigger interface (CTI) enabling hardware breakpoints and triggers
- ACP interface to PL for I/O coherency and Level 2 cache allocation
- ACE interface to PL for full coherency
- Power island gating on each processor core
- Optional eFUSE disable per core

## Xilinx Memory Protection Unit (XMPU)

- Region based memory protection unit
- Up to 16 regions
- Each region supports address alignment of 1MB or 4KB
- Regions can overlap; the higher region number has priority
- Each region can be independently enabled or disabled
- Each region has a start and end address

## **Graphics Processing Unit (GPU)**

- Supports OpenGL ES 1.1 & 2.0
- Supports OpenVG 1.1
- Operating target frequency: up to 667MHz
- Single Geometry Processor and two Pixel Processors
- Pixel Fill Rate: 2 Mpixel/sec/MHz
- Triangle Rate: 0.11 Mtriangles/sec/MHz
- 64KB Level 2 Cache (read-only)
- 4X and 16X Anti-aliasing Support
- ETC1 texture compression to reduce external memory bandwidth
- Extensive texture format support
  - RGBA 8888, 565, 1556
  - Mono 8, 16
  - YUV format support
- Automatic load balancing across different graphics shader engines
- 2D and 3D graphic acceleration
- Up to 4K texture input and 4K render output resolutions
- Each geometry processor and pixel processor supports 4KB page MMU
- Power island gating on each GPU engine and shared cache
- Optional eFUSE disable

## **Dynamic Memory Controller (DDRC)**

- DDR3, DDR3L, DDR4, LPDDR3, LPDDR4
- Target data rate: Up to 2400Mb/s DDR4 operation in -1 speed grade
- 32-bit and 64-bit bus width support for DDR4, DDR3, DDR3L, or LPDDR3 memories, and 32-bit bus width support for LPDDR4 memory
- ECC support (using extra bits)
- Up to a total DRAM capacity of 32GB
- Low power modes
  - Active/precharge power down
  - Self-refresh, including clean exit from self-refresh after a controller power cycle
- Enhanced DDR training by allowing software to measure read/write eye and make delay adjustments dynamically
- Independent performance monitors for read path and write path
- Integration of PHY Debug Access Port (DAP) into JTAG for testing

The DDR memory controller is multi-ported and enables the PS and the PL to have shared access to a common memory.
The DDR controller features six AXI slave ports for this purpose:

- Two 128-bit AXI ports from the ARM Cortex-A53 CPU(s), RPU (ARM Cortex-R5 and LPD peripherals), GPU, high speed peripherals (USB3, PCIe & SATA), and High Performance Ports (HP0 & HP1) from the PL through the Cache Coherent Interconnect (CCI)
- One 64-bit port is dedicated for the ARM Cortex-R5 CPU(s)
- One 128-bit AXI port from the DisplayPort and HP2 port from the PL
- One 128-bit AXI port from HP3 and HP4 ports from the PL
- One 128-bit AXI port from General DMA and HP5 from the PL

## **High-Speed Connectivity Peripherals**

#### **PCIe**

- Compliant with the PCI Express Base Specification 2.1
- Fully compliant with PCI Express transaction ordering rules
- Lane width: x1, x2, or x4 at Gen1 or Gen2 rates
- 1 Virtual Channel
- Full duplex PCIe port
- End Point and single PCIe link Root Port
- Root Port supports Enhanced Configuration Access Mechanism (ECAM), Cfg Transaction generation
- Root Port support for INTx, and MSI
- Endpoint support for MSI or MSI-X
- 1 physical function, no SR-IOV
- No relaxed or ID ordering
- Fully configurable BARs
- INTx not recommended, but can be generated
- Endpoint to support configurable target/slave apertures with address translation and Interrupt capability

#### SATA

- Compliant with SATA 3.1 Specification
- SATA host port supports up to 2 external devices
- Compliant with Advanced Host Controller Interface ('AHCI') ver. 1.3
- 1.5Gb/s, 3.0Gb/s, and 6.0Gb/s data rates
- Power management features: supports partial and slumber modes

#### **USB 3.0**

- Two USB controllers (configurable as USB 2.0 or USB 3.0)
- Up to 5.0Gb/s data rate
- Host and Device modes
  - Super Speed, High Speed, Full Speed, and Low Speed
  - Up to 12 endpoints
  - The USB host controller registers and data structures are compliant to Intel xHCI specifications
  - 64-bit AXI master port with built-in DMA
  - Power management features: Hibernation mode

### DisplayPort Controller

- 4K Display Processing with DisplayPort output
  - Maximum resolution of 4K x 2K-30 (30Hz pixel rate)
  - DisplayPort AUX channel, and Hot Plug Detect (HPD) on the output
  - RGB YCbCr, 4:2:0; 4:2:2, 4:4:4 with 6, 8, 10, and 12b/c
- Y-only, xvYCC, RGB 4:4:4, YCbCr 4:4:4, YCbCr 4:2:2, and YCbCr 4:2:0 video format with 6, 8, 10, and 12 bits per color component
- 256-color palette
- Multiple frame buffer formats
  - 1, 2, 4, 8 bits per pixel (bpp) via a palette
  - 16, 24, 32bpp
  - Graphics formats such as RGBA8888, RGB555, etc.
- Accepts streaming video from the PL or dedicated DMA controller
- Enables Alpha blending of graphics and Chroma keying
- Audio support
  - A single stream carries up to 8 LPCM channels at 192kHz with 24-bit resolution
  - Supports compressed formats including DRA, Dolby MAT, and DTS HD
  - Multi-Stream Transport can extend the number of audio channels
  - Audio copy protection
  - 2-channel streaming or input from the PL
  - Multi-channel non-streaming audio from a memory audio frame buffer
- Includes a System Time Clock (STC) compliant with ISO/IEC 13818-1
- Boot-time display using minimum resources

## Platform Management Unit (PMU)

- Performs system initialization during boot
- Acts as a delegate to the application and real-time processors during sleep state
- Initiates power-up and restart after the wake-up request
- Maintains the system power state at all times
- Manages the sequence of low-level events required for power-up, power-down, reset, clock gating, and power gating of islands and domains
- Provides error management (error handling and reporting)
- Provides safety check functions (e.g., memory scrubbing)

The PMU includes the following blocks:

- Platform management processor
- Fixed ROM for boot-up of the device
- 128KB RAM with ECC for optional user/firmware code
- Local and global registers to manage power-down, power-up, reset, clock gating, and power gating requests
- Interrupt controller with 16 interrupts from other modules and the inter-processor communication interface (IPI)
- GPI and GPO interfaces to and from PS I/O and PL
- JTAG interface for PMU debug
- Optional User-Defined Firmware

## **Configuration Security Unit (CSU)**

- Triple redundant Secure Processor Block (SPB) with built-in ECC
- Crypto Interface Block consisting of
  - 256-bit AES-GCM
  - SHA-3/384
  - 4096-bit RSA
- Key Management Unit
- Built-in DMA
- PCAP interface
- Supports ROM validation during pre-configuration stage
- Loads First Stage Boot Loader (FSBL) into OCM in either secure or non-secure boot modes
- Supports voltage, temperature, and frequency monitoring after configuration

## Xilinx Peripheral Protection Unit (XPPU)

- Provides peripheral protection support
- Up to 20 masters simultaneously
- Multiple aperture sizes
- Access control for a specified set of address apertures on a per master basis
- 64KB peripheral apertures and controls access on per peripheral basis

## I/O Peripherals

The IOP unit contains the data communication peripherals. Key features of the IOP include:

### Triple-Speed Gigabit Ethernet

- Compatible with IEEE Std 802.3 and supports 10/100/1000Mb/s transfer rates (Full and Half duplex)
- Supports jumbo frames
- Built-in Scatter-Gather DMA capability
- Statistics counter registers for RMON/MIB
- Multiple I/O types (1.8, 2.5, 3.3V) on RGMII interface with external PHY
- GMII interface to the PL to support interfaces such as TBI, SGMII, and RGMII v2.0
- Automatic pad and cyclic redundancy check (CRC) generation on transmitted frames
- Transmit and receive IP, TCP, and UDP checksum offload
- MDIO interface for physical layer management

#### SPI

- Full-duplex operation offers simultaneous receive and transmit
- 128B deep read and write FIFO
- Master or slave SPI mode
- Up to 3 chip select lines
- Multi-master environment
- Identifies an error condition if more than one master is detected
- Selectable master clock reference
- Software can poll for status or be interrupt driven

#### **I2C**

- 128-bit buffer size
- Both normal (100kHz) and fast bus data rates (400kHz)
- Master or slave mode
- Normal or extended addressing
- I2C bus hold for slow host service

#### **GPIO**

- Up to 128 GPIO bits
  - Up to 78-bits from MIO and 96-bits from EMIO
- Each GPIO bit can be dynamically programmed as input or output
- Independent reset values for each bit of all registers
- Interrupt request generation for each GPIO signal
- Single Channel (Bit) write capability for all control registers, including the data output register, direction control register, and interrupt clear register
- Read back in output mode

#### CAN

- Conforms to the ISO 11898-1, CAN 2.0A, and CAN 2.0B standards
- Both standard (11-bit identifier) and extended (29-bit identifier) frames
- Bit rates up to 1Mb/s
- Transmit and Receive message FIFO with a depth of 64 messages
- Watermark interrupts for TXFIFO and RXFIFO
- Automatic re-transmission on errors or arbitration loss in normal mode
- Acceptance filtering with 4 acceptance filters
- Sleep Mode with automatic wake-up
- Snoop Mode
- 16-bit timestamping for receive messages
- Both internal generated reference clock and external reference clock input from MIO
- Guarantees a clock sampling edge between 80% and 83% at 24MHz reference clock input
- Optional eFUSE disable per port

#### **USB 2.0**

- Two USB controllers (configurable as USB 2.0 or USB 3.0)
- Host, device and On-The-Go (OTG) modes
- High Speed, Full Speed, and Low Speed
- Up to 12 endpoints
- 8-bit ULPI External PHY Interface
- The USB host controller registers and data structures are compliant to Intel xHCI specifications.
- 64-bit AXI master port with built-in DMA
- Power management features: hibernation mode

## **Static Memory Interfaces**

The static memory interfaces support external static memories.

- ONFI 3.1 NAND flash support with up to 24-bit ECC
- 1-bit SPI, 2-bit SPI, 4-bit SPI (Quad-SPI), or two Quad-SPI (8-bit) serial NOR flash
- 8-bit eMMC interface supporting managed NAND flash

#### NAND ONFI 3.1 Flash Controller

- ONFI 3.1 compliant
- Supports chip select reduction per ONFI 3.1 spec
- SLC NAND for boot/configuration and data storage
- ECC options based on SLC NAND
  - 1, 4, or 8 bits per 512+spare bytes
  - 24 bits per 1024+spare bytes
- Maximum throughput as follows
  - Asynchronous mode (SDR) 24.3MB/s
  - Synchronous mode (NV-DDR) 112MB/s (for 100MHz flash clock)
- 8-bit SDR NAND interface

### Interconnect

All the blocks are connected to each other and to the PL through a multi-layered ARM Advanced Microprocessor Bus Architecture (AMBA) AXI interconnect. The interconnect is non-blocking and supports multiple simultaneous master-slave transactions.

The interconnect is designed with latency sensitive masters, such as the ARM CPU, having the shortest paths to memory, and bandwidth critical masters, such as the potential PL masters, having high throughput connections to the slaves with which they need to communicate.

Traffic through the interconnect can be regulated through the Quality of Service (QoS) block in the interconnect. The QoS feature is used to regulate traffic generated by the CPU, DMA controller, and a combined entity representing the masters in the IOP.

## **PS Interfaces**

PS interfaces include external interfaces going off-chip or signals going from PS to PL.

### **PS External Interfaces**

The Zynq UltraScale+ MPSoC's external interfaces use dedicated pins that cannot be assigned as PL pins.
These include:

- Clock, reset, boot mode, and voltage reference
- Up to 78 dedicated multiplexed I/O (MIO) pins, software-configurable to connect to any of the internal I/O peripherals and static memory controllers
- 32-bit or 64-bit DDR4/DDR3/DDR3L/LPDDR3 memories with optional ECC
- 32-bit LPDDR4 memory with optional ECC
- 4 channels (TX and RX pair) for transceivers

#### **MIO Overview**

The IOP peripherals communicate to external devices through a shared pool of up to 78 dedicated multiplexed I/O (MIO) pins. Each peripheral can be assigned one of several pre-defined groups of pins, enabling a flexible assignment of multiple devices simultaneously. Although 78 pins are not enough for simultaneous use of all the I/O peripherals, most IOP interface signals are available to the PL, allowing use of standard PL I/O pins when powered up and properly configured. Extended multiplexed I/O (EMIO) allows unmapped PS peripherals to access PL I/O.

Port mappings can appear in multiple locations. For example, there are up to 12 possible port mappings for CAN pins. The PS Configuration Wizard (PCW) tool aids in peripheral and static memory pin mapping.

## **Integrated Block for 100G Ethernet**

Compliant to the IEEE Std 802.3ba, the 100G Ethernet integrated blocks in the UltraScale architecture provide low latency 100Gb/s Ethernet ports with a wide range of user customization and statistics gathering. With support for 10 x 10.3125Gb/s (CAUI) and 4 x 25.78125Gb/s (CAUI-4) configurations, the integrated block includes both the 100G MAC and PCS logic with support for IEEE Std 1588v2 1-step and 2-step hardware timestamping.

In UltraScale+ devices, the 100G Ethernet blocks contain a Reed Solomon Forward Error Correction (RS-FEC) block, compliant to IEEE Std 802.3bj, that can be used with the Ethernet block or stand alone in user applications. These families also support OTN mapping mode in which the PCS can be operated without using the MAC.
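
The two lane configurations quoted above add up to the same aggregate line rate, and a short calculation shows how that relates to the 100Gb/s figure. This is a sketch under one assumption that the text does not state: that the lanes use 64b/66b line coding, so 64 of every 66 transmitted bits carry payload. The names `CONFIGS`, `CODING_EFFICIENCY`, and `rates` are illustrative.

```python
from fractions import Fraction

# Lane configurations quoted in the text: (lane count, per-lane rate in Gb/s).
CONFIGS = {
    "CAUI": (10, Fraction("10.3125")),
    "CAUI-4": (4, Fraction("25.78125")),
}

# Assumption (not stated in this document): 64b/66b line coding,
# i.e. 64 payload bits carried in every 66 transmitted bits.
CODING_EFFICIENCY = Fraction(64, 66)


def rates(lanes, lane_rate):
    """Return (aggregate line rate, payload rate) in Gb/s as exact fractions."""
    line_rate = lanes * lane_rate
    return line_rate, line_rate * CODING_EFFICIENCY


for name, (lanes, lane_rate) in CONFIGS.items():
    line_rate, payload = rates(lanes, lane_rate)
    print(f"{name}: {float(line_rate)} Gb/s line rate, "
          f"{float(payload)} Gb/s payload")
```

Exact fractions are used instead of floats so that the comparison between the two configurations is not affected by rounding.
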
## **Clock Management**

The clock generation and distribution components in UltraScale architecture-based devices are located adjacent to the columns that contain the memory interfacing and input and output circuitry. This tight coupling of clocking and I/O provides low-latency clocking to the I/O for memory interfaces and other I/O protocols. Within every clock management tile (CMT) resides one mixed-mode clock manager (MMCM), two PLLs, clock distribution buffers and routing, and dedicated circuitry for implementing external memory interfaces.

## **Mixed-Mode Clock Manager**

The mixed-mode clock manager (MMCM) can serve as a frequency synthesizer for a wide range of frequencies and as a jitter filter for incoming clocks. At the center of the MMCM is a voltage-controlled oscillator (VCO), which speeds up and slows down depending on the input voltage it receives from the phase frequency detector (PFD).

Three sets of programmable frequency dividers (D, M, and O) are programmable by configuration and during normal operation via the Dynamic Reconfiguration Port (DRP). The pre-divider D reduces the input frequency and feeds one input of the phase/frequency comparator. The feedback divider M acts as a multiplier because it divides the VCO output frequency before feeding the other input of the phase comparator. D and M must be chosen appropriately to keep the VCO within its specified frequency range. The VCO has eight equally-spaced output phases (0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°). Each phase can be selected to drive one of the output dividers, and each divider is programmable by configuration to divide by any integer from 1 to 128.

The MMCM has three input-jitter filter options: low bandwidth, high bandwidth, or optimized mode. Low-Bandwidth mode has the best jitter attenuation. High-Bandwidth mode has the best phase offset. Optimized mode allows the tools to find the best setting.
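
The divider relationships described above can be sketched numerically: the loop locks when f_in / D equals f_VCO / M, so f_VCO = f_in × M / D, and each output is f_VCO / O. The function names and the default VCO limits below are assumptions for illustration only; the actual VCO range depends on the device and speed grade and should be taken from the data sheet.

```python
def mmcm_output(f_in_mhz, d, m, o, vco_range_mhz=(800.0, 1600.0)):
    """Return (f_vco, f_out) in MHz for integer dividers D, M, and O.

    vco_range_mhz is a placeholder assumption; take the real limits
    from the device data sheet for the chosen speed grade.
    """
    if not 1 <= o <= 128:
        raise ValueError("output divider O must be an integer from 1 to 128")
    f_pfd = f_in_mhz / d   # pre-divider D lowers the frequency seen by the PFD
    f_vco = f_pfd * m      # feedback divider M makes the loop multiply by M
    lo, hi = vco_range_mhz
    if not lo <= f_vco <= hi:
        raise ValueError(f"VCO at {f_vco} MHz is outside {lo}-{hi} MHz")
    return f_vco, f_vco / o


def phase_delay_ps(f_vco_mhz, phase_deg):
    """Time offset of one of the eight 45-degree VCO phases, in picoseconds."""
    if phase_deg % 45 != 0:
        raise ValueError("VCO phases are spaced 45 degrees apart")
    period_ps = 1e6 / f_vco_mhz
    return period_ps * (phase_deg % 360) / 360


f_vco, f_out = mmcm_output(100.0, d=1, m=12, o=8)
print(f_vco, f_out)               # VCO and output frequency in MHz
print(phase_delay_ps(f_vco, 90))  # delay of the 90-degree phase in ps
```

With a 100MHz input, D = 1 and M = 12 place the VCO at 1200MHz, and an output divider of 8 yields 150MHz. Choosing M too small for the assumed range raises an error, which mirrors the requirement that D and M keep the VCO within its specified range.
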
The MMCM can have a fractional counter in either the feedback path (acting as a multiplier) or in one output path. Fractional counters allow non-integer increments of 1/8 and can thus increase frequency synthesis capabilities by a factor of 8. The MMCM can also provide fixed or dynamic phase shift in small increments that depend on the VCO frequency. At 1,600MHz, the phase-shift timing increment is 11.2ps.

#### **PLL**

With fewer features than the MMCM, the two PLLs in a clock management tile are primarily present to provide the necessary clocks to the dedicated memory interface circuitry. The circuit at the center of the PLLs is similar to the MMCM, with PFD feeding a VCO and programmable M, D, and O counters. There are two divided outputs to the device fabric per PLL as well as one clock plus one enable signal to the memory interface circuitry.

Zynq UltraScale+ MPSoCs are equipped with five additional PLLs in the PS for independently configuring the four primary clock domains with the PS: the APU, the RPU, the DDR controller, and the I/O peripherals.

## **Clock Distribution**

Clocks are distributed throughout Zynq UltraScale+ MPSoCs via buffers that drive a number of vertical and horizontal tracks. There are 24 horizontal clock routes per clock region and 24 vertical clock routes per clock region with 24 additional vertical clock routes adjacent to the MMCM and PLL. Within a clock region, clock signals are routed to the device logic (CLBs, etc.) via 16 gateable leaf clocks.

Several types of clock buffers are available. The BUFGCE and BUFCE\_LEAF buffers provide clock gating at the global and leaf levels, respectively. BUFGCTRL provides glitchless clock muxing and gating capability. BUFGCE\_DIV has clock gating capability and can divide a clock by 1 to 8. BUFG\_GT performs clock division from 1 to 8 for the transceiver clocks. In MPSoCs, clocks can be transferred from the PS to the PL using dedicated buffers.
## **Memory Interfaces**

Memory interface data rates continue to increase, driving the need for dedicated circuitry that enables high performance, reliable interfacing to current and next-generation memory technologies. Every Zynq UltraScale+ MPSoC includes dedicated physical interface (PHY) blocks located between the CMT and I/O columns that support implementation of high-performance PHY blocks to external memories such as DDR4, DDR3, QDRII+, and RLDRAM3. The PHY blocks in each I/O bank generate the address/control and data bus signaling protocols as well as the precision clock/data alignment required to reliably communicate with a variety of high-performance memory standards. Multiple I/O banks can be used to create wider memory interfaces.

As well as external parallel memory interfaces, Zynq UltraScale+ MPSoC can communicate to external serial memories, such as Hybrid Memory Cube (HMC), via the high-speed serial transceivers. All transceivers in the UltraScale architecture support the HMC protocol, up to 15Gb/s line rates. UltraScale architecture-based devices support the highest bandwidth HMC configuration of 64 lanes with a single device.

## **Configurable Logic Block**

Every Configurable Logic Block (CLB) in the UltraScale architecture contains 8 LUTs and 16 flip-flops. The LUTs can be configured as either one 6-input LUT with one output, or as two 5-input LUTs with separate outputs but common inputs. Each LUT can optionally be registered in a flip-flop. In addition to the LUTs and flip-flops, the CLB contains arithmetic carry logic and multiplexers to create wider logic functions.

Each CLB contains one slice. There are two types of slices: SLICEL and SLICEM. LUTs in the SLICEM can be configured as 64-bit RAM, as 32-bit shift registers (SRL32), or as two SRL16s. CLBs in the UltraScale architecture have increased routing and connectivity compared to CLBs in previous-generation Xilinx devices.
They also have additional control signals to enable superior register packing, resulting in overall higher device utilization.

## **Interconnect**

Vertical and horizontal routing resources of various lengths in the UltraScale architecture span 1, 2, 4, 5, 12, or 16 CLBs, so that all signals can be transported from source to destination with ease. This supports routing the next generation of wide data buses across even the highest capacity devices while simultaneously improving quality of results and software run time.

## **Block RAM**

Every UltraScale architecture-based device contains a number of 36Kb block RAMs, each with two completely independent ports that share only the stored data. Each block RAM can be configured as one 36Kb RAM or two independent 18Kb RAMs. Each memory access, read or write, is controlled by the clock. Connections in every block RAM column enable signals to be cascaded between vertically adjacent block RAMs, providing an easy method to create large, fast memory arrays, and FIFOs with greatly reduced power consumption.

All inputs (data, address, clock enables, and write enables) are registered. The input address is always clocked (unless address latching is turned off), retaining data until the next operation. An optional output data pipeline register allows higher clock rates at the cost of an extra cycle of latency. During a write operation, the data output can reflect either the previously stored data or the newly written data, or it can remain unchanged. Block RAM sites that remain unused in the user design are automatically powered down to reduce total power consumption. There is an additional pin on every block RAM to control the dynamic power gating feature.

### **Programmable Data Width**

Each port can be configured as $32K \times 1$; $16K \times 2$; $8K \times 4$; $4K \times 9$ (or 8); $2K \times 18$ (or 16); $1K \times 36$ (or 32); or $512 \times 72$ (or 64).
Whether configured as block RAM or FIFO, the two ports can have different aspect ratios without any constraints. Each block RAM can be divided into two completely independent 18Kb block RAMs that can each be configured to any aspect ratio from $16K \times 1$ to $512 \times 36$. Everything described previously for the full 36Kb block RAM also applies to each of the smaller 18Kb block RAMs.

Only in simple dual-port (SDP) mode can data widths of greater than 18 bits (18Kb RAM) or 36 bits (36Kb RAM) be accessed. In this mode, one port is dedicated to read operation, the other to write operation. In SDP mode, one side (read or write) can be variable, while the other is fixed to 32/36 or 64/72. Both sides of the dual-port 36Kb RAM can be of variable width.

### **Error Detection and Correction**

Each 64-bit-wide block RAM can generate, store, and utilize eight additional Hamming code bits and perform single-bit error correction and double-bit error detection (ECC) during the read process. The ECC logic can also be used when writing to or reading from external 64- to 72-bit-wide memories.

### **FIFO Controller**

Each block RAM can be configured as a 36Kb FIFO or an 18Kb FIFO. The built-in FIFO controller for single-clock (synchronous) or dual-clock (asynchronous or multirate) operation increments the internal addresses and provides four handshaking flags: full, empty, programmable full, and programmable empty. The programmable flags allow the user to specify the FIFO counter values that make these flags go active. The FIFO width and depth are programmable with support for different read port and write port widths on a single FIFO. A dedicated cascade path allows for easy creation of deeper FIFOs.

## **UltraRAM**

UltraRAM is a high-density, dual-port, synchronous memory block used in some UltraScale+ families. Both of the ports share the same clock and can address all of the $4K \times 72$ bits. Each port can independently read from or write to the memory array.
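The array sizes quoted above for block RAM and UltraRAM can be cross-checked with a little arithmetic. The Python sketch below multiplies depth by width for each listed 36Kb block RAM port configuration and for the 4K × 72 UltraRAM array. The constant and function names are hypothetical, and reading the larger width in each pair (9, 18, 36, 72) as data bits plus extra parity or ECC bits is an assumption rather than something this overview spells out.

```python
# (depth, wider width, narrower width) for each 36Kb block RAM port configuration.
# Assumption: the wider value includes extra parity/ECC bits, the narrower does not.
BLOCK_RAM_PORT_CONFIGS = [
    (32 * 1024, 1, 1),
    (16 * 1024, 2, 2),
    (8 * 1024, 4, 4),
    (4 * 1024, 9, 8),
    (2 * 1024, 18, 16),
    (1024, 36, 32),
    (512, 72, 64),
]

ULTRARAM_DEPTH = 4 * 1024
ULTRARAM_WIDTH = 72


def capacity_kbits(depth, width):
    """Capacity in Kb (1Kb = 1,024 bits) of a depth x width memory array."""
    return depth * width // 1024


def describe_block_ram_configs():
    """Return (depth, width, Kb at wider width, Kb at narrower width) per configuration."""
    return [
        (depth, wide, capacity_kbits(depth, wide), capacity_kbits(depth, narrow))
        for depth, wide, narrow in BLOCK_RAM_PORT_CONFIGS
    ]


if __name__ == "__main__":
    for depth, wide, full_kb, data_kb in describe_block_ram_configs():
        print(f"{depth:>6} x {wide:<2} -> {full_kb}Kb total, {data_kb}Kb without extra bits")
    print(f"UltraRAM: {capacity_kbits(ULTRARAM_DEPTH, ULTRARAM_WIDTH)}Kb per block")
```

Under this reading, only the configurations of width 9 and above reach the full 36Kb; the narrower ones address 32Kb.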
UltraRAM supports two types of write enable schemes. The first mode is consistent with the block RAM byte write enable mode. The second mode allows gating the data and parity byte writes separately.

Multiple UltraRAM blocks can be cascaded together to create larger memory arrays. Dedicated routing in the UltraRAM column enables the entire column height to be connected together. This makes UltraRAM an ideal solution for replacing external memories such as SRAM. Cascadable anywhere from 288Kb to 36Mb, UltraRAM provides the flexibility to fulfill many different memory requirements.

### **Error Detection and Correction**

Each 64-bit-wide UltraRAM can generate, store, and utilize eight additional Hamming code bits and perform single-bit error correction and double-bit error detection (ECC) during the read process.

## **Digital Signal Processing**

DSP applications use many binary multipliers and accumulators, best implemented in dedicated DSP slices. All UltraScale architecture-based devices have many dedicated, low-power DSP slices, combining high speed with small size while retaining system design flexibility.

Each DSP slice fundamentally consists of a dedicated 27 × 18 bit two's complement multiplier and a 48-bit accumulator. The multiplier can be dynamically bypassed, and two 48-bit inputs can feed a single-instruction-multiple-data (SIMD) arithmetic unit (dual 24-bit add/subtract/accumulate or quad 12-bit add/subtract/accumulate), or a logic unit that can generate any one of ten different logic functions of the two operands.

The DSP includes an additional pre-adder, typically used in symmetrical filters. This pre-adder improves performance in densely packed designs and reduces the DSP slice count by up to 50%.
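The pre-adder saving for symmetrical filters can be seen in a small software model. The Python sketch below is an illustration, not a model of the DSP slice hardware: it computes the same FIR filter output twice, once with one multiply per tap and once by adding each mirrored pair of samples before a single multiply, and counts the multiplications in each case. The function names and test data are hypothetical.

```python
def fir_direct(samples, coeffs):
    """Direct-form FIR: one multiplication per tap per output sample."""
    n_taps = len(coeffs)
    outputs, mults = [], 0
    for n in range(n_taps - 1, len(samples)):
        acc = 0
        for k in range(n_taps):
            acc += coeffs[k] * samples[n - k]
            mults += 1
        outputs.append(acc)
    return outputs, mults


def fir_preadd(samples, coeffs):
    """Symmetric FIR: add mirrored samples first, then multiply once per pair."""
    if list(coeffs) != list(coeffs)[::-1]:
        raise ValueError("coefficients must be symmetric")
    n_taps = len(coeffs)
    half = n_taps // 2
    outputs, mults = [], 0
    for n in range(n_taps - 1, len(samples)):
        acc = 0
        for k in range(half):
            # Pre-add the two samples that share the coefficient coeffs[k].
            acc += coeffs[k] * (samples[n - k] + samples[n - (n_taps - 1 - k)])
            mults += 1
        if n_taps % 2:
            # An odd tap count leaves one unpaired center tap.
            acc += coeffs[half] * samples[n - half]
            mults += 1
        outputs.append(acc)
    return outputs, mults


if __name__ == "__main__":
    samples = list(range(10))
    coeffs = [1, 3, 3, 1]  # symmetric, even number of taps
    direct_out, direct_mults = fir_direct(samples, coeffs)
    preadd_out, preadd_mults = fir_preadd(samples, coeffs)
    print(direct_out == preadd_out, direct_mults, preadd_mults)
```

For an even number of symmetric taps the multiplication count is exactly halved, which corresponds to the "up to 50%" figure; with an odd tap count the unpaired center tap makes the saving slightly smaller.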
The 96-bit-wide XOR function, programmable to 12, 24, 48, or 96-bit widths, enables performance improvements when implementing forward error correction and cyclic redundancy checking algorithms.

The DSP also includes a 48-bit-wide pattern detector that can be used for convergent or symmetric rounding. The pattern detector is also capable of implementing 96-bit-wide logic functions when used in conjunction with the logic unit.

The DSP slice provides extensive pipelining and extension capabilities that enhance the speed and efficiency of many applications beyond digital signal processing, such as wide dynamic bus shifters, memory address generators, wide bus multiplexers, and memory-mapped I/O register files. The accumulator can also be used as a synchronous up/down counter.

## **System Monitor**

The System Monitor blocks in the UltraScale architecture are used to enhance the overall safety, security, and reliability of the system by monitoring the physical environment via on-chip power supply and temperature sensors.

All UltraScale architecture-based devices contain at least one System Monitor. The System Monitor in UltraScale+ devices is similar to that in Kintex UltraScale and Virtex UltraScale devices but with the addition of a PMBus interface.

Zynq UltraScale+ MPSoCs contain one System Monitor in the PL and an additional block in the PS. The System Monitor in the PL has the same features as the block in UltraScale+ FPGAs. See Table 11.
Table 11: Key System Monitor Features

| | Zynq UltraScale+ MPSoC PL | Zynq UltraScale+ MPSoC PS |
|------------|-----------------------|--------------|
| ADC | 10-bit 200kSPS | 10-bit 1MSPS |
| Interfaces | JTAG, I2C, DRP, PMBus | APB |

Figure 3: Zynq UltraScale+ MPSoC Ordering Information

## **Revision History**

The following table shows the revision history for this document:

| Date | Version | Description of Revisions |
|------------|---------|--------------------------|
| 02/15/2017 | 1.4 | Updated DSP count in Table 1, Table 3, and Table 5. Updated I/O Electrical Characteristics. Updated Table 12 with -2E speed grade. |
| 09/23/2016 | 1.3 | Updated Table 2; Table 3; Table 4; Table 6; Graphics Processing Unit (GPU); and NAND ONFI 3.1 Flash Controller. |
| 06/03/2016 | 1.2 | Added CG devices: Updated Table 1; Table 2; Table 3; Table 4; Table 5; Table 6; and Table 12. Added Video Encoder/Decoder (VCU); Table 7; and Power Examples (removed XPE Computed Range table). Updated: General Description; ARM Cortex-A53 Based Application Processing Unit (APU); Zynq UltraScale+ MPSoCs; Dynamic Memory Controller (DDRC); and Figure 3. |
| 01/28/2016 | 1.1 | Updated Table 1 and Table 2. |
| 11/24/2015 | 1.0 | Initial Xilinx release. |

## **Disclaimer**

The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use of Xilinx products.
To the maximum extent permitted by applicable law: (1) Materials are made available "AS IS" and with all faults, Xilinx hereby DISCLAIMS ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related to, arising under, or in connection with, the Materials (including your use of the Materials), including for any direct, indirect, special, incidental, or consequential loss or damage (including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same. Xilinx assumes no obligation to correct any errors contained in the Materials or to notify you of updates to the Materials or to product specifications. You may not reproduce, modify, distribute, or publicly display the Materials without prior written consent. Certain products are subject to the terms and conditions of Xilinx's limited warranty, please refer to Xilinx's Terms of Sale which can be viewed at <http://www.xilinx.com/legal.htm#tos>; IP cores may be subject to warranty and support terms contained in a license issued to you by Xilinx. Xilinx products are not designed or intended to be fail-safe or for use in any application requiring fail-safe performance; you assume sole risk and liability for use of Xilinx products in such critical applications, please refer to Xilinx's Terms of Sale which can be viewed at <http://www.xilinx.com/legal.htm#tos>. This document contains preliminary information and is subject to change without notice.
Information provided herein relates to products and/or services not yet available for sale, and provided solely for information purposes and are not intended, or to be construed, as an offer for sale or an attempted commercialization of the products and/or services referred to herein.

## **Automotive Applications Disclaimer**

AUTOMOTIVE PRODUCTS (IDENTIFIED AS "XA" IN THE PART NUMBER) ARE NOT WARRANTED FOR USE IN THE DEPLOYMENT OF AIRBAGS OR FOR USE IN APPLICATIONS THAT AFFECT CONTROL OF A VEHICLE ("SAFETY APPLICATION") UNLESS THERE IS A SAFETY CONCEPT OR REDUNDANCY FEATURE CONSISTENT WITH THE ISO 26262 AUTOMOTIVE SAFETY STANDARD ("SAFETY DESIGN"). CUSTOMER SHALL, PRIOR TO USING OR DISTRIBUTING ANY SYSTEMS THAT INCORPORATE PRODUCTS, THOROUGHLY TEST SUCH SYSTEMS FOR SAFETY PURPOSES. USE OF PRODUCTS IN A SAFETY APPLICATION WITHOUT A SAFETY DESIGN IS FULLY AT THE RISK OF CUSTOMER, SUBJECT ONLY TO APPLICABLE LAWS AND REGULATIONS GOVERNING LIMITATIONS ON PRODUCT LIABILITY.