
BASIL Networks Blog BN'B | January 2018

12 Jan, 2018

Internet of Things (IoT) -Security, Privacy, Safety-Platform Development Project Part-8

Part 8: IoT Core Platform - System On a Chip (SoC), System In a Package (SIP)
The Core Processor of Embedded System Configurations

"We are drowning in information but starved in knowledge:" - John Naisbitt


Quick review to set the atmosphere for Part 8
From the previous Internet of Things Part-1 through Part-7:

What we want to cover in Part 8:
The high-visibility issues concerning IoT devices are security, privacy and safety; hence the primary motive for this series.  Security, privacy and safety should not be an afterthought to be implemented after a crisis.  Our focus is to design in security, privacy and safety during the development process to ensure that the safety and security needs of the consumer are met, be it commercial or industrial.

Since the beginning of this series in September 2016 many IoT devices built on COTS embedded hardware and software have been hacked, bringing high visibility to security and privacy.  The current database of breaches encouraged us to present a more detailed hardware and software presentation to assist designers and educate newcomers on the new challenges with security and privacy.  Due to the complexities of processors today we will continue to follow our technical presentation methodology, Overview → Basic → Detailed (OBD).  We will be addressing the many sections of the Core IoT Platform separately to keep the presentations at a reasonable length.  The full details will be presented during the actual hardware, firmware and software design stages.

The preliminary preview of the entire index is shown below and will be updated with links as we progress through the live series.  Comments are welcome both publicly and privately.  If you want to participate privately we will acknowledge your participation only with your permission.  We hope to share insight into hardware and software solutions to the security, privacy and safety issues of IoT devices.  Parts of the embedded processor series will apply to desktops and tablets as well as embedded IoT devices, since they all share common elements of a CPU system.

Let's Get Started:

A Brief CPU Summary:
Central Processing Units are relatively fixed-function devices that perform only a fixed set of predefined instructions; a programmed group of these predefined instructions we label a programmed sequence.  The security and privacy challenge is how to prevent unwanted instructions from being executed by the CPU and by the connected peripherals that the CPU controls.  Access to these functions generally comes from two sources: the OS (Operating System), where unwanted code has been hacked into a memory location, or the basic I/O drivers, which are hacked and used to access system functions to compromise the system; an example is the Ethernet controller driver and/or application software allowing remote access.

In order to prevent such intrusions and unwanted code from being executed we have to understand the hardware, firmware and software details being used on the core IoT Platform.  When we analyze the embedded processor marketplace we see that there are really only a handful of different categories of processors and a large variation in licensing of the same or similar cores.  It would take a book of several hundred pages to cover all the processor differences, which is really outside the scope of this series; however we will look at the major players of 32 bit CPU cores and start there, which is well within our scope.  The major players are Intel®, AMD®, ARM®, NXP®, Microchip® and ZFMicro®, a smaller x86 player that is still applicable.  Three are x86 based: Intel, AMD and ZFMicro.  NXP acquired the original Motorola/Freescale 68K CPU line through M&A; ARM Ltd stands alone and has licensed its technology to many players including all the above.  Microchip is in a unique position due to several M&As, including Atmel, SMSC and others.  The processor lines for these major players cover a broad spectrum of applications, which makes it difficult to select a processor that will allow full control to ensure security and privacy at the core level.

All the major players compete with their own versions of an IDE (Integrated Development Environment) package, and once one is selected you are basically locked into that manufacturer.  Using a third party package allows a fast turn of a product to market, however it does not guarantee privacy or security.  To ensure security and privacy, a detailed understanding and disclosure of the internals of the processor and the software packages used is a must, as is access to the core macro assembler in order to incorporate a user's integrated security methodology.  That being said, we will now present the basics of CPU architectures in order to ask the right questions when performing our selection process.

The variations of processors today range from a dollar to hundreds of dollars depending on the bit size (8/16/32/64 bits), the speed (from a few kHz to gigahertz) and the integrated peripheral functions.  The main function of the CPU is the programmability of a sequence of instructions fetched from a memory system to control a user's process.  What makes CPUs different in the industry is the Micro-Coded ROM (MCR), which identifies the unique set of machine instructions for each CPU manufacturer.  If you change the Micro-Coded ROM you change the processor instruction set; even though the processor still controls memory access and some logic functions, it has a unique Macro-Assembler assigned to it.  Open source compilers like GCC have incorporated several families of processors, allowing the user to write code in C and compile it for several types of processors.

The remaining blocks of the CPU all perform similar functions.  The memory controller and sequencer controls access to memory locations and also controls the CPU jump tables.  The Central Process Controller, or Instruction Process Decoder, performs the instructions that are fetched from memory and keeps track of the programmed instructions with a program address counter.  The Execution Control Unit performs basic arithmetic and logic functions and incorporates a set of general purpose registers.  Of the remaining interfaces, the External Memory stores the application program and other application parameters, and the Data/Address BUS & Control are for adding user peripherals.  The BUS control allows direct access to the memory controller for fast data transfers.  The final section, the Power-Up Entry, is a special one-time execution during power-on that allows the user to enter a unique memory address from which to start fetching instructions to execute.  To understand the core requirements of security one should understand how the internals of the selected processor function.  This introductory presentation will shed light on the complexities of designing an embedded system with the highest level of security possible, from the core hardware to firmware to application software.

During the CPU presentation we will be creating a Key Security Requirements (KSR) list to be used for the Embedded Processor selection process.  It is important to keep in mind that not all the security requirements may be met with a COTS (Commercial-Off-The-Shelf) embedded processor; this may shed some light on the limitations of COTS embedded systems and the compromises that are being made to put a product on the market.

CPU (Central Processor Unit)  What They All Have In Common:
Embedded Processors today range from a simple CPU to a complete system on a chip, making selection complex, as well as the implementation of security policies against unwanted access and for privacy.  To present the subject matter at the detail required for comprehensive design and security development guidelines of the Core IoT Platform, the hardware will be covered over multiple continuing parts.  Embedded processors and CPUs have many features that allow the designer to incorporate security policies in hardware/firmware to control all access to the platform.  Some System on a Chip (SoC) blocks are frozen designs and do not allow the implementation of security policies at the core CPU level.

The roadmaps for Intel®, AMD®, Microchip®, NXP® and other manufacturers of embedded processors are very well documented, leading us from a 4 bit microprocessor (for a historical read, see the Intel 4004 microprocessor) up to the 64 bit processor families on the market today.  Our intent here is to present the core functionality of all Central Processing Units in order to understand how we will implement the security policies for the Core IoT Platform.  The CPU is just a programmable block of logic gates that allows the user to program a set of instructions designed into the processor unit in order to connect real world peripherals, transfer data and perform arithmetic and logical computations on digital data.

When we perform an Internet search on Embedded Processors we get inundated with millions of hits on a variety of devices, from Single Board Computers (SBC) to general purpose MIPS, ARM, etc. type microcontrollers, CPUs and other names associated with the programmable device.  These embedded systems are finished products that the manufacturer supplies with an associated IDE (Integrated Development Environment) package to get a product on the market fast.  Putting a product on the market with a canned IDE, without knowing or understanding the software and the amount of control it gives, raises questions about the vulnerabilities the IDE opens to hackers.  The better we understand the device we are programming the more confidence we will have in securing the device for our application.  With that said, let's start at the core of the Embedded system, the Central Processing Unit.  Figure 8.0 is a functional block diagram of a typical CPU.  When we look at Figure 8.0 we see that the Instruction Process Logic Block is the central controlling core of the processing unit and everything else is an "internal peripheral" used as a support interface device for the core process block.  The real world connections for the CPU are just an interconnecting buffer block that forms a protective communications mechanism to the internal BUS & Control.  All central processing units have a few sections in common; they all have:

An Instruction Process Logic Block

     • Instruction Fetch Queue Buffer

     • An Instruction Process Decoder to execute a series of programmed instructions stored at specific memory locations.

     • A Micro-Coded ROM (MCR) - contains the Machine Instruction Set - Registered, Patented or Trademarked.

     • An Execution Control Unit - AKA - Arithmetic Logic Unit - performs data and logic manipulation

A Memory Access Controller, AKA, Memory Management Unit (MMU) - logic to fetch and store data to memory addresses.

A Memory Data/Address Interface Control BUS to attach External Memory, RAM, EEPROM, DRAM etc.

A Set of General Purpose Registers - usually eight registers plus a Stack, Status, Program & Control Register

A Small Internal RAM Buffer for general purpose registers for the Execution Control Unit

An External Data/Address Interface BUS to transfer data to/from the CPU internal logic

A Clock interface controller for CPU process timing.

A Power On Sequence (POS) unique to the CPU core to start executing instructions from a defined memory location.

Figure 8.0  Typical CPU Functional Block Diagram

Core Functions of the Central Processor Unit Architecture:
Central Processing Units have evolved over the years to a plug and play architecture in which a variety of peripherals have been incorporated into a single chip, acquiring the labels System on a Chip (SoC), System in a Package (SIP) and of course the notorious Embedded Processor.  From the CPU functional block diagram shown in Figure 8.0, the core of the CPU is the Instruction Process Logic Block, which determines the type of CPU and the registered and/or patented instruction set, like the x86 instruction set by Intel Corporation.

Today there are peripherals that include their own embedded processor to handle complex commands, reducing the number of instructions that must be exchanged with the external main CPU that controls the system.  This adds a new challenge to the security of the system, since the peripheral today does not necessarily need the CPU to communicate; it just needs to have access to the local BUS.  Some of the more advanced intelligent peripherals are easily hacked today due to the vulnerability of the BUS access and of the network controller used to communicate through the Internet; hence the Reaper IoT botnet, reported to have infected millions of networks, is just the beginning of the security and privacy challenges.  The challenge today is to incorporate security in all parts of the design, especially in the BUS protocols, in order to prevent unwanted intrusions and hidden rootkits.

From Figure 8.0, which defines the Central Processor Core functions, we may define anything outside the core as a peripheral, whether the peripheral function remains inside the actual chip or is connected to the external pins of the chip.  If outside access to the CPU is allowed then full control over the IoT platform is obtained, and so enter the Security, Privacy & Safety Policies to ensure the CPU performs its programmed sequences without interference.  It is important that we understand all vulnerabilities of the selected CPU core and how it communicates from the internal bus to the real world external bus; so enters the cliché "If you don't know where you are going any BUS will take you there" and the adventure begins.  The Memory Access Controller is considered a peripheral that connects external memory to the CPU core in order to execute instructions.  It is half of the memory interface peripheral and is one of the vulnerabilities that we enter on the KSR list.  Some embedded processor SoCs include both EEPROM and RAM to address simple controllers, however for our Core IoT Platform we would like much more control than a fixed embedded system offers.

Instruction Process Logic Block
This is the primary mechanism that allows the execution of sequential programmed instructions from memory.  The Instruction Process Block is essentially the CPU itself; the remaining blocks are actual peripherals that are attached to the Internal BUS & Control lines.  The real world sees only the Real World Interface lines, which keeps the internal execution of instructions private.  Since programmed instructions have to be fetched from memory, the security issue here is how the memory fetch is protected from interception and isolated from the real world.  The integrated system must be initialized to a known state prior to loading the OS or any other application.  Vulnerabilities are probable during this Power-On Self-Test (POST), which is prevalent in all CPUs and attached peripherals, because it is a fixed process that is well documented for all to use.  This is part of the "Secure Boot" process, which we will cover later in the series after the selection process; however it will be added to the KSR list to ensure the hardware is capable of securing this process in some way.

Instruction Process Decoder (IPD)
The Instruction Process Decoder performs the task of decoding the type and length of the instruction that has been fetched from the Instruction Queue.  Instructions generally go through a setup process in the IPD to determine the type of instruction before being sent to the Micro-Coded ROM, which selects the unique sequence of steps for the fetched instruction.  If the instruction is an immediate instruction, like jump to memory address 0x1234abcd and continue execution, then it is directed to the Immediate Control Sequencer, which performs simple instruction sequences such as Jump, Loop, Move Immediate or other immediate instructions.  If the instruction is not immediate then it is transferred to the Micro-Coded ROM for continued processing.  If the instruction requires the Execution Control Unit (ECU) to perform arithmetic or complex logic then it is directed to that unit and the ECU performs the selected sequence of operations to complete the instruction.  We will address CPU instruction sets in the programming part of the series.
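The routing described above can be pictured with a minimal Python sketch.  The mnemonic groups and block names are hypothetical labels chosen for illustration; a real decoder works on ISA-specific bit fields rather than strings.

```python
# Hypothetical instruction classes; real decoders inspect opcode bit fields.
IMMEDIATE_OPS = {"JMP", "LOOP", "MOVI"}
ECU_OPS = {"ADD", "SUB", "AND", "OR", "XOR", "MUL", "DIV"}


def route_instruction(mnemonic):
    """Return the name of the block that would handle the decoded instruction."""
    if mnemonic in IMMEDIATE_OPS:
        # Simple sequences go straight to the Immediate Control Sequencer.
        return "immediate_control_sequencer"
    if mnemonic in ECU_OPS:
        # Arithmetic and logic work is handed to the Execution Control Unit.
        return "execution_control_unit"
    # Everything else continues through the Micro-Coded ROM.
    return "micro_coded_rom"
```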

From a security point of view the IPD should not be accessible to the outside world and it should not be modifiable in any way.  If you do a bit of research you will find processors that allow micro-code updates when loading specific Operating Systems.  These patches are generally firmware patches linked to other BIOS (Basic Input/Output System) instructions to optimize the processor for the selected OS.  Older processors, before 2007, require the operating system to control all the BIOS functions as part of the OS and do not use the firmware BIOS except for loading the OS at POST; this is not the case today, which we will cover as we progress through the series.

Intel processors incorporate an SMM (System Management Mode) controller to hold the setup of system parameters for access to the core functions that the OS has privilege to.  Initially, System Management Mode was used for implementing Advanced Power Management (APM) features.  However, over time, some BIOS manufacturers have relied on SMM for other functionality, like making a USB keyboard work in Legacy BIOS mode.

Some uses of the System Management Mode are:

System Management Mode can also be abused to run high-privileged rootkits, as demonstrated at Black Hat 2008 and 2015.  The Sony BMG copy protection rootkit scandal in 2005 is just one example of a rootkit in the wild.  Robbing banks is against the law but still happens; in the same way, hackers are always tempted and will execute these kits.

Types of Processors, MIPS, RISC, CISC, Pipeline, Clocked.
There are different types of Execution Control Units that perform complex steps to complete the requested instruction.  Instruction sequences require several clock cycles to complete.  To reduce the number of clock cycles in several instructions, blocks of pipeline logic are designed into the ECU.  These processors have been given the name RISC (Reduced Instruction Set Computer).  This does not mean that there are fewer instructions, in fact there may be more instructions that perform many more functions; it does mean that the way instructions are processed has been reduced to the minimum number of clock cycles, allowing faster processing.  From Wikipedia - "RISC ISAs include ARC, Alpha, Am29000, ARM, Atmel AVR, Blackfin, i860, i960, M88000, MIPS, PA-RISC, Power ISA (including PowerPC), RISC-V, SuperH, and SPARC. In the 21st century, the use of ARM architecture processors in smartphones and tablet computers such as the iPad and Android devices provided a wide user base for RISC-based systems. RISC processors are also used in supercomputers such as the K computer, the fastest on the TOP500 list in 2011, second at the 2012 list, and fourth at the 2013 list, and Sequoia, the fastest in 2012 and third in the 2013 list."

Processors that are not RISC based handle complex functions via a series of steps in sequence and are called CISC (Complex Instruction Set Computers).  Examples are the Digital Equipment PDP-11 and VAX systems; the VAX incorporated a polynomial evaluation instruction as well as several other complex instructions.  The Intel x86 product line (IA-32 and its 64 bit extension) and the NXP 68000 are all CISC architecture processors.

The summary of RISC vs CISC is that the CISC processor performs a function in hardware, such as multiplying Reg0 by Reg1 and moving the result to Reg0, and only requires one line of code, MULT A, B, or in C code, A=A*B.  The MCR only needs one word for the entire operation.  The RISC processor performs this same multiply in several steps: load each operand into a register, multiply the two registers, then store the result back to memory.
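The load/store steps might be sketched as follows.  The mnemonics LOAD, MUL and STORE and the register names R0 and R1 are assumptions made for illustration and do not belong to any particular instruction set; the Python model simply traces four separate steps for A = A * B.

```python
def risc_multiply(memory, addr_a, addr_b):
    """Compute memory[addr_a] *= memory[addr_b] using load/store style steps."""
    regs = {"R0": 0, "R1": 0}
    trace = []

    regs["R0"] = memory[addr_a]              # step 1: load operand A
    trace.append("LOAD R0, A")
    regs["R1"] = memory[addr_b]              # step 2: load operand B
    trace.append("LOAD R1, B")
    regs["R0"] = regs["R0"] * regs["R1"]     # step 3: register to register multiply
    trace.append("MUL R0, R1")
    memory[addr_a] = regs["R0"]              # step 4: store the result back to A
    trace.append("STORE A, R0")

    return regs, trace
```

After the sequence completes, operand B is still held in R1 and can be reused without another memory access.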

It is obvious that the RISC version requires more memory and more instruction steps, as well as more compile time, for the same function.  However, even though the CISC version looks like a single operation, there are still clock cycles involved.  Also, in a CISC based system only the result is saved, and if B has to be used again it has to be reloaded, whereas in a RISC based system B is still in the register.  We will come back to this at a later time when we get into processor selection and security.  The FPGAs (Field Programmable Gate Arrays) available today allow us to apply both worlds to obtain greater performance.  We will cover FPGAs and CPLDs (Complex Programmable Logic Devices) when we present the peripheral interface section of the series; keep in mind that Intel Corporation purchased Altera Corporation, so new advancements combining processor technology and FPGAs fall in the probability arena.

The Instruction Process Decoder (IPD) is a key mechanism that determines the complexity of a processor core.  Every manufacturer of processors, be it a stand alone CPU or an embedded one surrounded by a host of peripherals, will incorporate an Instruction Process Decoder.  It is this decoder logic that makes the block of logic a processor, because the IPD allows a sequence of instructions to be performed in a unique sequence of steps.  The logic used within the IPD determines the type of Instruction Set Architecture for the processor.  Simply presented, there are generally two types of I/O architecture that apply to processors: the first is Memory Mapped I/O and the second is Dedicated I/O.  Both types incorporate a memory access architecture that may be used for I/O peripherals, however the Dedicated I/O type incorporates an independent I/O instruction set that is separate from the typical fetch and store memory functions.  In the Dedicated I/O architecture there is usually a separate set of address lines for peripherals, and a selected register set identifies the physical I/O address to which the data is sent.  In a Memory Mapped I/O architecture users may treat I/O peripherals just like a memory location, with all the memory instruction set functions of the processor available.  The Intel x86 processor line has a separate I/O instruction set and uses the lower 16 bits of the address bus for I/O addresses, shared with memory addressing, since only one of the two functions, memory or I/O, may be accessed at a time.  The NXP 68000 set of processors uses the Memory Address/Data BUS for both peripheral I/O and memory access.  Both types of architecture have their pros and cons.
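To make the difference concrete, here is a minimal Python sketch of the two I/O styles.  The class names, the `out` method and the `IO_BASE` window are hypothetical and not taken from any real part; the point is only that Dedicated I/O keeps two address spaces selected by instruction type, while Memory Mapped I/O routes part of one address space to a peripheral.

```python
class DedicatedIOBus:
    """Separate memory and I/O address spaces, selected by the instruction used."""

    def __init__(self):
        self.memory = {}
        self.ports = {}

    def store(self, address, value):
        # Ordinary memory store instruction.
        self.memory[address] = value

    def out(self, port, value):
        # Dedicated I/O instruction: only the low 16 address bits are used.
        self.ports[port & 0xFFFF] = value


class MemoryMappedBus:
    """One address space; an assumed address window is routed to a peripheral."""

    IO_BASE = 0xFFFF0000  # hypothetical window chosen for illustration

    def __init__(self):
        self.memory = {}
        self.device_regs = {}

    def store(self, address, value):
        # The same store instruction reaches memory or a device register.
        if address >= self.IO_BASE:
            self.device_regs[address - self.IO_BASE] = value
        else:
            self.memory[address] = value
```

In the dedicated model the same numeric address can name both a memory cell and a port without conflict; in the memory mapped model the address decoder alone decides where a store lands, which is why its configuration matters for security.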

All processor instructions require clock cycles to be implemented; the IPD is simply a sequencer that, when input with a specific bit pattern, will perform a specific sequence on the processor core.  Since all processor instructions that are not I/O peripheral instructions are of a fixed nature, the exact timing for each instruction may be calculated from the number of clock cycles it takes and the period of the clock executing the instruction.  This is an important part of the processor and is added to the KSR list for the processor selection.  I/O timing depends on real world inputs and may vary when waiting for responses before continuing; it is calculated based on the peripheral features and the application.
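The timing calculation is simply cycles multiplied by the clock period.  A small Python sketch follows; the cycle counts and the 100 MHz clock in the usage note are made-up values for illustration, not figures for any specific processor.

```python
def instruction_time_ns(cycles, clock_hz):
    """Execution time in nanoseconds for an instruction taking `cycles` clocks."""
    return cycles * 1e9 / clock_hz


def sequence_time_ns(cycle_counts, clock_hz):
    """Total time for a fixed sequence of non-I/O instructions."""
    return sum(instruction_time_ns(c, clock_hz) for c in cycle_counts)
```

For example, `sequence_time_ns([1, 2, 4], 100_000_000)` adds up three instructions on an assumed 100 MHz clock, where each clock period is 10 ns.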

Processors that include pipeline instruction logic still require clocks to function, as they require some type of status flag setup as well as somewhere to place results.  Pipeline instructions have a fixed execution time since it is based on two controllable parameters, the clock period and the propagation delays through the pipeline logic.  This gives them an advantage over clocked instruction sequences, especially if the clock is much slower than the pipeline execution time of the instruction.  Pipeline instructions are less likely to be interrupted during the cycle since the pipeline is a fixed hardware function, whereas a clocked instruction may be interrupted in the middle of its process: the return state is pushed on the stack (which requires more clocks), the interrupt handling process executes until the interruption is completed, and the return state is then popped off the stack so the interrupted instruction can continue.  We will address processor interrupts and the security policies required to handle them later in the series.

Micro-Coded ROM (MCR) for the unique registered, patented or trademarked instruction set.
The Micro-Coded ROM contains the processor "machine" binary coded instruction set, which is a unique set of bit patterns used to select a unique set of steps to complete an instruction process.  The MCR is only the machine instruction identifier that triggers a group of unique steps to complete the requested instruction.  This is accomplished by the "Instruction Control Sequencer", which contains all the sequences required for the Instruction Set Architecture of the CPU.  The MCR and the IPD are integrated logic blocks that function together to direct the path for the type of instruction and how it is sequenced.  Instructions are separated into categories to optimize performance and to direct the sequence of steps and data flow required for each category.

In order to have feedback on an instruction's process in real time a set of FLAG bits is used by the process block to identify the final state of the last instruction.  Flags are set during the execution of instructions to indicate the state or results of the instruction.  The assigned bit position is specific to the processor design.  Figure 8.1 shows a typical Intel Instruction Set Architecture flag bit assignment.  Generally there are three classes of FLAG bits: Status, Control and System.  Example: if we compare the data in two registers and they are equal, the ZF (Zero Flag) is set to 1.

Figure 8.1  Typical Intel Instruction Set Architecture Flag Bit Assignment.
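As a rough illustration of status flags, the Python sketch below models what a compare of two registers does to three of them.  The bit positions used (CF at bit 0, ZF at bit 6, SF at bit 7) follow the x86 FLAGS layout; the function name and the reduced flag set are simplifications assumed for this example.

```python
# Bit positions of three x86 status flags in the FLAGS register.
CF_BIT, ZF_BIT, SF_BIT = 0, 6, 7


def compare_flags(a, b, width=32):
    """Model the status flags set by a compare, which computes a - b and discards it."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask

    flags = 0
    if a < b:                      # unsigned borrow sets the Carry Flag
        flags |= 1 << CF_BIT
    if result == 0:                # equal operands set the Zero Flag to 1
        flags |= 1 << ZF_BIT
    if result >> (width - 1):      # top bit of the result is the Sign Flag
        flags |= 1 << SF_BIT
    return flags
```

A conditional branch that follows the compare only has to test the relevant bit, for example `compare_flags(r0, r1) & (1 << ZF_BIT)` for a branch-if-equal.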

It is not uncommon for a large number of instructions to be designed into the hardware processor.  The MCR and IPD, along with the Sequencer, interact to control the step sequence for all the defined instructions.  There is no magic to a processor design, just a large block of logic gates to control the 1's and 0's.  If the instruction is a pipeline type instruction then it is started with a single clock cycle and the data is pipelined through the fixed logic for the process.  There will always be a debate on which is better, pipeline logic or clocked logic blocks; regardless, there are applications that are better suited to each.  The clarity will present itself when we address security and peripheral communications in the sections that follow.  In CPUs that incorporate a Reduced Instruction Set methodology, instructions are directed to a fixed logic function, generally a pipeline logic set, that makes up the Instruction Set Architecture (ISA) and requires fewer clock cycles to complete.

Instruction Control Sequencer - Loop & Jump Control:
The Instruction Control Sequencer handles the sequence of steps for each instruction and is generally clocked step by step in order to complete the instruction.  For simple direct jump instructions the program counter is loaded with the new jump address and instruction fetch and execution continue from the new address in memory.  For conditional branches a signed offset is added to (or subtracted from) the current program counter when the branch is taken.  For other complex functions that require memory to memory and register assignments, the sequencer and the Execution Control Unit perform in unison to load / store registers and execute the instruction.  Pipeline logic and clocked logic are used in both the Sequencer and the ECU to optimize performance; this is the best of both RISC and CISC architecture.
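The program counter updates can be sketched in a few lines of Python.  This is a simplified model under stated assumptions: a 32 bit program counter, an 8 bit signed offset by default, and an offset measured from the address of the next instruction (as on x86; other architectures measure it differently).

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` of value as a two's complement signed number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def jump_direct(target):
    """Direct jump: the program counter is simply loaded with the target."""
    return target & MASK32


def branch_relative(pc, instr_len, offset, taken, offset_bits=8):
    """Conditional branch: add a signed offset when taken, else fall through."""
    fall_through = (pc + instr_len) & MASK32
    if not taken:
        return fall_through
    return (fall_through + sign_extend(offset, offset_bits)) & MASK32
```

With these assumptions a two byte branch at 0x1000 carrying the offset byte 0xFE (minus 2) branches back to itself, the classic tight loop.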

Execution Control Unit (ECU) - AKA - Arithmetic Logic Unit (ALU)
The Execution Control Unit (ECU) performs all the general purpose integer logic and arithmetic functions, such as Add, Subtract, XOR, NOT, OR, AND, Multiply, Divide and Shift Right/Left.  Many of the CPUs today incorporate integer Multiply / Divide instructions, which vary depending on the processor type as previously mentioned.  This allows the programmer to incorporate integer based cryptography for data integrity and a simple level of security.  Integer based cryptography is mainly used for data transfer integrity in packet switched networks.  The ECU has its own internal operation registers that are linked to the general purpose register set to complete the requested instruction.  We will address this as we develop the IoT Platform.  Today the CPU is complemented with programmable logic arrays that extend the functionality of the integer ECU/ALU, such that peripheral control and customized process algorithms are interfaced to the CPU's integer based Arithmetic Logic Unit.
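A toy model of an integer ECU/ALU helps show how results wrap at the register width.  The Python sketch below assumes a 32 bit unsigned data path, hypothetical mnemonics, and shift counts limited to 5 bits; it ignores flags and signed operations entirely.

```python
import operator

MASK32 = 0xFFFFFFFF

# Two-operand operations of the toy ALU, keyed by a hypothetical mnemonic.
ALU_OPS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "AND": operator.and_,
    "OR": operator.or_,
    "XOR": operator.xor,
    "MUL": operator.mul,
    "DIV": operator.floordiv,
    "SHL": operator.lshift,
    "SHR": operator.rshift,
}


def alu(op, a, b=0):
    """Apply one integer operation and truncate the result to 32 bits."""
    a &= MASK32
    b &= MASK32
    if op == "NOT":
        return ~a & MASK32
    if op in ("SHL", "SHR"):
        b &= 31                    # assumed: only the low 5 bits of the count are used
    return ALU_OPS[op](a, b) & MASK32
```

Note that `alu("DIV", x, 0)` raises `ZeroDivisionError`, standing in for the divide error exception a real processor would raise.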

Instruction Process Logic Block Summary:
The Instruction Process Logic Block is the core of the CPU, surrounded by support logic, and by nature core instructions are performed in a single thread, a single instruction at a time.  Over the evolution of processor development the individual logic blocks were designed to be independent, with their own clock source and pipeline architecture.  The core logic block independence allows faster, seamless execution of instructions by starting the next instruction while the execution of the previous instruction is finishing up.  These independent function blocks do not change the single thread limitation of the core CPU blocks, regardless of the number of Instruction Process Logic Blocks (cores) incorporated in the system that share the support logic; hence the introduction of the multi-core processor.

One of the areas we did not cover yet is the interrupt architecture of the processor core.  Interrupts are an independent function, usually integrated into the core instruction architecture, that allows an interruption of the instruction process, usually at the end of the instruction.  The architecture is such that it uses a memory pointer.  This pointer is called the Stack Pointer, and the small contiguous block of memory it references, the Stack, holds the saved state for interrupts.  When an interrupt happens the state of the core is pushed on the stack to be used for returning to the interrupted process; we will return to interrupts in the Memory section that follows.
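The push and pop of the core state can be modeled with a short Python sketch.  The `Core` class, the 4 byte stack slots, the downward growing stack and the choice to save only the flags and program counter are all assumptions for illustration; real processors differ in what they save and in what order.

```python
class Core:
    """Toy core that saves its state on a stack when an interrupt is taken."""

    def __init__(self, stack_top=0x2000):
        self.pc = 0
        self.flags = 0
        self.sp = stack_top        # stack pointer; the stack grows downward here
        self.memory = {}

    def push(self, value):
        self.sp -= 4
        self.memory[self.sp] = value

    def pop(self):
        value = self.memory[self.sp]
        self.sp += 4
        return value

    def enter_interrupt(self, handler_address):
        # Save the state needed to resume, then vector to the handler.
        self.push(self.flags)
        self.push(self.pc)
        self.pc = handler_address

    def return_from_interrupt(self):
        # Restore in the reverse order of the pushes.
        self.pc = self.pop()
        self.flags = self.pop()
```

Because the saved state lives in ordinary memory, anything that can write to the stack region can change where the core resumes, which is one reason stack protection belongs on a security requirements list.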

Memory & Memory Access Controllers (MAC):
Memory Management Unit (MMU) & Memory Protection Unit (MPU):
Nothing happens without memory, whether it be static, dynamic or Read Only memory: "no memory - no programmed instructions".  Controlling memory access is the beginning of securing the process from unwanted intrusions; "simple statement, difficult challenge to implement".  Simple processors that incorporate simple interface control logic have little or no protection to stop memory hacking from changing the process.  Memory hacking has two categories, internal and external.  Internal hacking requires the hacker to have physical access to the system; external hacking is performed remotely by uploading hacking code or acquiring remote access through a remote port on the system.  That is the easy part; the difficult part is hacking the processor to execute your code.  Regardless of the method, memory has to be loaded with valid code to be executed by the processor.

Allocating memory space is critical to a secure user process as well as a secure operating system, so before we get into memory allocation let's look at the actual hardware that is used to communicate with the memory.  Memory chips handle access by selecting an address that contains a cell of data; either that data is placed on the memory data bus (a read), or the data on the memory data bus is stored at that address (a write).  A direct read or write is the fastest memory cycle that exists in any processor block; delays enter the picture when we add address translation and protection to the simple memory access.  Enter the MMU or MPU blocks of logic, as well as some intelligence for encryption, to determine if the address being accessed is assigned to the process being executed, along with several other conditions.  I would guess that many of you reading this blog have heard of Spectre and Meltdown (not the movies of 2015 and 2004, although the conditions they create are a disaster as well).  These two vulnerabilities have to do with memory access from an unwanted source that intercepts the process instructions to gain access to the system.  This is both a hardware and a software issue, so we will cover the hardware side now and address it in detail when we get to the security software part of the series.
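The kind of check an MPU performs on each access can be sketched in Python as a region table lookup.  The `Region` fields, the `owner` process identifier and the default-deny rule are assumptions made for this example; actual MPUs define their own region formats and priority rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    owner: int                 # hypothetical process identifier
    readable: bool = True
    writable: bool = False
    executable: bool = False


def access_allowed(regions, process_id, address, kind):
    """Check an access of kind 'r', 'w' or 'x'; no matching region means denied."""
    for region in regions:
        in_range = region.base <= address < region.base + region.size
        if region.owner == process_id and in_range:
            permissions = {
                "r": region.readable,
                "w": region.writable,
                "x": region.executable,
            }
            return permissions[kind]
    return False
```

Every access pays for this lookup, which is the delay the paragraph above refers to; hardware performs the comparisons in parallel to keep that cost small.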

Regardless of the memory type, be it Static RAM or Dynamic RAM, the access still has to be protected from unwanted intrusions, a challenge that has been here from the beginning.  There are basically two categories of memory, Volatile and Non-Volatile.  Volatile memory loses its data when the power is turned off, and Non-Volatile memory retains its data when the power is turned off.  There are basically three types of volatile Random Access Memory today: Static (SRAM), Dynamic (DRAM) and Pseudostatic (PSRAM).  Static (SRAM) does not require any refresh to maintain its memory content; the cost is lower chip density and therefore smaller memory sizes.  Dynamic (DRAM) requires the cells to be refreshed (rewriting the cell data) periodically in order to maintain data storage, and requires a special DRAM controller that shares real world data access with the refresh cycle.  A new kid on the block is Pseudostatic (PSRAM), which is dynamic RAM with a built-in refresh.  It functions like SRAM at slightly slower speeds to accommodate the refresh and, as with the DRAM controller, requires a configuration process to maintain the thermals and access timing within the chip.

Within the volatile SRAM type there are some sub-types, Conventional memory and Content Addressable Memory (CAM).  Conventional RAM is arranged as a single address to a single data word: one address location, one data location.  Conventional RAM is present in all the desktops, tablets, smartphones and 98% of the servers in the cloud.  Content Addressable Memory (CAM) is an older technology that has been around for over 60 years, but very little has been discussed or applied using this category in today's processors.  There are multiple patents on CAM filed from 1970 through 2012 for various versions and applications from companies like IBM®, AMD®, IDT®, TI® and many others seeking dominance in the market.  Content Addressable Memory is primarily used with Content Addressable Parallel Processors (CAPP), however it does not have to be applied to parallel processors only.  This technology is very applicable to Artificial Intelligence, since search, compare and execute functions are a major part of AI processing and CAPP gives higher speed and functionality than conventional processor systems.  CAPP is a subject we will address at another time since it is beyond the scope of this project.
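To make the difference between the two sub-types concrete, here is a minimal Python sketch.  It is only an illustration under our own assumptions: the names ram, ram_read and cam_search are hypothetical, and the loop imitates the result of a CAM lookup, not the parallel single-cycle compare that real CAM hardware performs.

```python
# Conventional RAM: one address selects one data word.
ram = [0x00, 0x3C, 0x7F, 0x3C]  # index = address, value = stored word


def ram_read(address):
    """Return the word stored at a single address."""
    return ram[address]


def cam_search(word):
    """Content Addressable Memory lookup: return every address holding `word`.

    Real CAM hardware compares all cells in parallel; this loop only
    imitates the result, not the timing.
    """
    return [addr for addr, stored in enumerate(ram) if stored == word]


print(ram_read(2))       # 127
print(cam_search(0x3C))  # [1, 3]
print(cam_search(0x55))  # []
```

Conventional RAM answers "what is stored at this address?", while CAM answers "at which addresses is this content stored?", which is why it suits search and compare workloads.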

The memory controller handles access to the internals of the CPU; since instructions cannot be executed without memory, the CPU issues a fetch request to the Memory Access Controller to start an instruction process from a memory address.  Memory access is generally defined with the CPU specifications and features in order to incorporate security policies.  There are simple Memory Access Controllers (MAC) that just handle a large linear array of memory without any special features, and there are Memory Management Units (MMU) that incorporate memory segmentation and protection and are also integrated with other blocks of the CPU architecture.  The MAC, MMU and MPU are just controllers attached to the CPU that define memory access requirements; they require some type of physical memory attached to them to function.

Memory access is a major security issue, both in how it is connected to the Core CPU and in how it is allocated.  Figure 8.2 shows the functional blocks of a typical memory access architecture connected to the core CPU, consisting of a DRAM controller, DRAM, EEPROM and SRAM.  Since nothing happens without memory, it stands to reason that hackers want access to plant viruses, worms, root kits and other code to obtain access, which makes the memory hardware a high profile part of the security architecture.

Figure 8.2  Typical Hardware Block, DRAM, EEPROM, SRAM & I/O

The simple Memory Access Controller is just a buffer cache that allows access to the memory array in a linear form from address 0000 to address nnnn: just one big linear array of memory addresses.  This is the most vulnerable memory access, since data and program code are unprotected regardless of where they are placed in memory and any access proceeds without a fault being raised.  From Figure 8.2 we see that the MMU or MPU controls all aspects of memory access, including Direct Memory Access from I/O devices, which is a hardware security vulnerability that we will address in more detail as we progress through the series.  This is a major concern and is noted in the KSR list.
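A small Python sketch shows why the linear model is so exposed.  The class name LinearMemory, the CODE_START address and the byte values are our own assumptions for illustration only.

```python
class LinearMemory:
    """Flat array of byte-wide cells, addresses 0 .. size-1, no access checks."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def write(self, address, value):
        self.cells[address] = value & 0xFF

    def read(self, address):
        return self.cells[address]


mem = LinearMemory(16)
CODE_START = 0               # hypothetical: program code lives at address 0 upward
mem.write(CODE_START, 0x90)  # the legitimate program byte
mem.write(CODE_START, 0xCC)  # any other writer can replace it; nothing objects
print(hex(mem.read(CODE_START)))  # 0xcc
```

Nothing in the model distinguishes a program byte from a data byte, so the second write silently replaces code.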

The Memory Management Unit (MMU) is a controller that intercepts "ALL" memory access.  The common definition is that the MMU handles memory requests made by the CPU, however that is not the whole story.  For devices that incorporate DMA (Direct Memory Access) hardware, memory management also covers this function, and the unit that does so is termed the I/O MMU.  DMA does create security vulnerabilities, since a device writing directly to memory may place code as well as data there, and if the code is in a page that is legal to the current program it may be executed.  These devices are typically USB ports, Firewire or storage type peripherals.  The MMU's primary function is to control memory segmentation and paging, translate virtual memory accesses to physical memory accesses, and handle memory cache, protection and bus arbitration.  Hence the MMU is an add-on to the core that creates a complete CPU (Central Processing Unit).  Unwanted access to the MMU and memory is one of the root security vulnerabilities; it has existed since the MMU's conception and still exists today.

The pros are paging, segmentation, virtual-to-physical memory translation and security (with a question mark), all achieved by intercepting every memory transaction through a TLB (Translation Lookaside Buffer) to ensure that multiple processes are directed to the CPU accordingly.  This does slow down the access, since the translation adds clock cycles.  The MMU segmentation protection methodology allows the memory to be partitioned and controlled as Kernel (trusted segment), code or program Protected segment, Supervisory and Data segment groups that add security to the processor system.  MMUs are incorporated into the majority of 32 bit processors available to the COTS (Commercial-Off-The-Shelf) market today.  As the series progresses we will research the market place for simple processors without an MMU, where the MMU may be added as a peripheral, for comparison.  Intel has incorporated an MMU as part of the processor since the 80286 release, and since then ARM, AMD, MIPS and other COTS embedded processors have incorporated MMUs.
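The translation path can be sketched in a few lines of Python.  This is a simplified model under our own assumptions (a 4096 byte page, a one-level page table and an unbounded TLB); the names SimpleMMU, PageFault and translate are hypothetical and do not describe any particular processor.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PageFault(Exception):
    pass


class SimpleMMU:
    def __init__(self, page_table):
        # page_table: virtual page number -> (physical frame number, writable)
        self.page_table = page_table
        self.tlb = {}  # cache of recent translations
        self.tlb_hits = 0
        self.tlb_misses = 0

    def translate(self, vaddr, write=False):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.tlb_hits += 1
            frame, writable = self.tlb[vpn]
        else:
            self.tlb_misses += 1  # a miss costs extra cycles in real hardware
            if vpn not in self.page_table:
                raise PageFault(f"no mapping for virtual page {vpn}")
            frame, writable = self.page_table[vpn]
            self.tlb[vpn] = (frame, writable)
        if write and not writable:
            raise PageFault(f"write to read-only virtual page {vpn}")
        return frame * PAGE_SIZE + offset


mmu = SimpleMMU({0: (5, False), 1: (2, True)})
print(hex(mmu.translate(0x0010)))        # 0x5010
print(hex(mmu.translate(0x1004, True)))  # 0x2004
print(hex(mmu.translate(0x0020)))        # 0x5020
print(mmu.tlb_hits, mmu.tlb_misses)      # 1 2
```

The third access reuses the cached translation for page 0, which is the whole point of the TLB: the page table walk is only paid on a miss.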

What is incorporated is an interrupt function that is in all CPUs today.  The interrupt sequence allows the core CPU to share time with other processes by saving the complete state of the core CPU along with a return location, so that the previous state can be restored and processing continued.  Each manufacturer specifies how its interrupt function processes information to accomplish the switching.  Some use an interrupt controller that allows the fetching of a block of addresses, defined as the interrupt vectors, that hold pointers to the processes to be executed.  Those that do not use an interrupt controller are limited to a single interrupt, along with an interrupt handler that must poll all devices to determine which one caused the interrupt.  Time sharing is an important feature when selecting a single processor core system, since it will determine the overall program efficiency when several peripherals are attached.  This is also an issue with multi-core chips, since each processor runs independently and shares memory.  The number of processors and the interrupt capabilities will be put on the KSR list as well, since they create a major security vulnerability if not handled properly.
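The single-interrupt case, where the handler has to poll every device, can be sketched as follows.  The Device class, the device names and polled_handler are hypothetical names chosen for this illustration.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.pending = False  # set when the device requests service


def polled_handler(devices):
    """Single shared interrupt: ask every device in turn who raised it."""
    serviced = []
    for device in devices:
        if device.pending:
            device.pending = False  # acknowledge the request
            serviced.append(device.name)
    return serviced


devices = [Device("uart"), Device("timer"), Device("adc")]
devices[1].pending = True
devices[2].pending = True
print(polled_handler(devices))  # ['timer', 'adc']
print(polled_handler(devices))  # []
```

Every interrupt costs a pass over all attached devices, which is why the number of peripherals directly affects program efficiency on a core without an interrupt controller.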

Interrupts are an integral part of the CPU that allow it to multi-task by interrupting the current process, jumping to another process, then returning to the interrupted process when the interrupting process is completed.  CPU architectures generally have a single interrupt process that requires an external interrupt controller to manage several interrupt lines connected to peripherals; each line is assigned a unique memory address location that contains a pointer to its interrupt handler code.  We will cover interrupt structures as we proceed through the series.  From the Instruction Process Logic Block we can conclude that instructions are single thread and require a memory fetch request for each instruction to be executed, which points to memory as a key security vulnerability.
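As a rough illustration of the save, jump and restore sequence, here is a Python sketch of a vectored interrupt.  The CPU class, its register names and the interrupt line number are our own assumptions, not any manufacturer's design.

```python
class CPU:
    def __init__(self):
        self.pc = 0  # program counter
        self.regs = {"r0": 0, "r1": 0}
        self.saved = []    # stack of saved contexts
        self.vectors = {}  # interrupt line -> handler function
        self.log = []

    def install(self, line, handler):
        self.vectors[line] = handler

    def interrupt(self, line):
        # Save the complete state plus the return location.
        self.saved.append((self.pc, dict(self.regs)))
        self.vectors[line](self)  # vectored: jump straight to the handler
        # Restore the previous state and resume the interrupted process.
        self.pc, self.regs = self.saved.pop()


def timer_handler(cpu):
    cpu.regs["r0"] = 0xFF  # the handler is free to clobber registers
    cpu.log.append("timer serviced")


cpu = CPU()
cpu.pc, cpu.regs["r0"] = 0x100, 42
cpu.install(3, timer_handler)
cpu.interrupt(3)
print(hex(cpu.pc), cpu.regs["r0"], cpu.log)  # 0x100 42 ['timer serviced']
```

Note that the vector table is ordinary data in memory: whoever can write to it decides which code runs on the next interrupt, which is the security concern raised above.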

Each of the user defined memory segments has access control parameters to prevent any unwanted access.  Any intrusion into a protected area creates a page fault interrupt to a jump location that contains the error handling code, which sends a notification or corrects the intrusion.  We will cover interrupt control processes in the Interface section as the series progresses.  The Memory Management Unit's control registers are generally integrated into the CPU's core control for better security, as in the Intel IA32 and IA64 product lines.

Memory Protection Unit (MPU)
The MPU is an MMU with limited functionality: it only provides memory protection for parts of the physical memory array and does not have virtual memory functionality.  MPUs do not have the cache control, bank switching and bus arbitration that are necessary for systems that perform multi-tasking with multiple processors.  Many embedded processors incorporate an MPU because of logic array constraints in the design.  MPU logic has less latency because of its smaller logic design and fewer configuration parameters, but the MPU has less segmentation and paging capability than its big brother, the MMU.
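A region check of the kind an MPU performs can be sketched in Python as below.  The Region fields, the address ranges and the fault messages are assumptions made for this example; real MPUs differ in region count, alignment rules and attribute encoding.

```python
from dataclasses import dataclass


class MemoryFault(Exception):
    pass


@dataclass(frozen=True)
class Region:
    start: int
    end: int  # inclusive
    readable: bool
    writable: bool
    executable: bool


class SimpleMPU:
    def __init__(self, regions):
        self.regions = regions

    def check(self, address, access):
        """access is 'r', 'w' or 'x'; physical addresses only, no translation."""
        for region in self.regions:
            if region.start <= address <= region.end:
                allowed = {"r": region.readable,
                           "w": region.writable,
                           "x": region.executable}[access]
                if allowed:
                    return True
                raise MemoryFault(f"{access} denied at {address:#06x}")
        raise MemoryFault(f"{address:#06x} is outside every region")


mpu = SimpleMPU([
    Region(0x0000, 0x3FFF, readable=True, writable=False, executable=True),  # code
    Region(0x4000, 0x7FFF, readable=True, writable=True, executable=False),  # data
])
print(mpu.check(0x0100, "x"))  # True
try:
    mpu.check(0x0100, "w")
except MemoryFault as fault:
    print("fault:", fault)     # fault: w denied at 0x0100
```

There is no address translation anywhere in the sketch: the MPU only answers "is this access allowed?", which is why it is smaller and faster than an MMU.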

MMU, MPU Summary:
MMUs and MPUs are considered peripherals because they are directed through user controlled status & control registers and require a setup process.  If the MMU is part of the Internal BUS, then we have to look at how it is accessed via the external bus and how direct memory access (DMA) is connected, as well as the DMA protection features.

OK, now time to bring up interrupts again.  Now that we have all this memory access control, what happens if an unwanted access occurs, as it does in multi-tasking and multi-processor systems?  The answer is that a page fault interrupt is generated, and a separate stack of contiguous memory outside the normal OS points to a block of code that handles the page fault interrupt.  A few conditions are tested; some faults are fixed automatically and the process is resumed, some errors halt the system, and some errors just notify the user with some options.  Hackers are constantly playing with code injection and provoking page faults to find back doors into the OS.  In a multi-tasking application the task manager in the MMU handles the segmentation and virtual memory allocation, along with all the page/bank switching needed to save each task's position in the task scheduler.
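The three outcomes of a page fault (fix and resume, notify, halt) can be pictured with a short Python sketch.  The fault labels and the function name handle_page_fault are purely illustrative assumptions; real fault codes are defined by each processor and operating system.

```python
def handle_page_fault(kind):
    """Return the action taken for a fault; the kinds are illustrative labels."""
    actions = {
        "page_not_resident": "load page and resume",  # fixed automatically
        "write_to_read_only": "notify user",          # reported, options offered
        "kernel_segment_violation": "halt system",    # treated as unrecoverable
    }
    return actions.get(kind, "halt system")  # unknown faults fail closed


print(handle_page_fault("page_not_resident"))   # load page and resume
print(handle_page_fault("write_to_read_only"))  # notify user
print(handle_page_fault("bogus"))               # halt system
```

Defaulting unknown faults to a halt is a design choice for this sketch: a handler that guesses at a recovery for an unrecognized fault is exactly the kind of behavior an attacker probes for.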

As described earlier, the interrupt sequence allows the "single thread" core CPU to share time with other processes by saving the complete state of the core CPU, along with the task's return location, so that the previous state can be restored and processing continued.  The number of processors and the interrupt capabilities are on the KSR list because they create a major security vulnerability if not handled properly.

Task handlers require the management of all the tasks opened for processing, as well as a separate interrupt structure for each task; this is where bank/page switching and virtual segmentation are used for management both locally and globally.  CPU architectures generally have a single interrupt process that jumps to a specific location pointed to by an internal CPU register set to a memory address.  Some cores require an external interrupt controller to manage several interrupt lines connected to peripherals, each assigned a unique memory address location that contains a pointer to its interrupt handler code.  Generally, interrupt handlers are accessed indirectly from a contiguous block of memory specifically assigned by the core processor and integrated with an interrupt controller.  Because every handler is reached through a memory fetch, memory is again the key security vulnerability.

If access to the memory is permitted while the CPU is off executing tasks then all activity in the Core IoT Platform may be monitored, intercepted and controlled.  Therefore controlling access to this section of the processor is noted in the KSR list to address during the design.  Several embedded processor chips incorporate the Memory Management Unit (MMU) as part of the Memory Access Controller.  

The majority of embedded platforms work in the linear memory address mode, and segmentation is generally not implemented, probably for simplicity in applications that do not require an OS to function.  Another major concern for the MMU is the connection of a Direct Memory Access (DMA) controller, which also has direct access to memory without any CPU intervention.  This is a major security concern, since DMA controllers transfer data in both directions at very high speeds to/from the real world BUS interface.  The original 8088 PC incorporated a DMA controller and used it for disk transfers for the speed.  When the 80286 was introduced, the disk controller was redesigned and DMA was not used, since the 80286 was faster transferring the data directly.  Direct Memory Access for devices like USB, Firewire and other streaming devices presents a security risk, since they access a block of defined memory without CPU intervention.  Memory access is a critical security vulnerability since memory contains everything that is happening in the core system; therefore it is on the top level KSR list of concerns.
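The DMA risk can be made concrete with a small Python model.  The memory layout, the function names dma_write and guarded_dma_write, and the idea of a single allowed window are all assumptions for this sketch; it only shows the spirit of an I/O MMU style check, not a real controller.

```python
memory = bytearray(32)
CODE = range(0, 16)     # hypothetical region holding program code
BUFFER = range(16, 32)  # hypothetical region set aside for device data


def dma_write(address, payload):
    """Unchecked DMA: the device writes straight to memory, no CPU involved."""
    memory[address:address + len(payload)] = payload


def guarded_dma_write(address, payload, window=BUFFER):
    """DMA restricted to one window, in the spirit of an I/O MMU check."""
    last = address + len(payload) - 1
    if address not in window or last not in window:
        raise PermissionError(f"DMA outside window: {address}..{last}")
    dma_write(address, payload)


dma_write(4, b"\xde\xad")           # lands in the code region unnoticed
print(memory[4:6].hex())            # dead
guarded_dma_write(16, b"\x01\x02")  # allowed: inside the buffer window
try:
    guarded_dma_write(4, b"\xbe\xef")
except PermissionError as err:
    print(err)                      # DMA outside window: 4..5
```

The unchecked transfer overwrites the code region without the CPU ever executing an instruction, which is why the DMA path belongs on the KSR list alongside the CPU's own memory accesses.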

External BUS Interface - Data /Address  & Control for External Devices & Memory
The External BUS Interface logic allows both memory and devices to be interfaced to the real world.  The BUS Control lines separate the BUS transactions for DMA (Direct Memory Access).  For Memory Mapped I/O this is also the device I/O BUS, and the address of each I/O device is decided by the user, usually at the top of memory less the memory size it takes to boot the system up.  Memory Mapped I/O type processors require that the memory map be organized in order to prevent overlaps of program, data and I/O addressing.  If DRAM is being incorporated into the design, then a separate DRAM controller is attached as a peripheral device.  DRAM controllers are usually part of an embedded processor configuration, however many are incorporated externally.  Today there exists PSRAM (Pseudo Static Random Access Memory), which contains a self-refresh DRAM array that allows greater density than normal SRAM.  4Mx16 is a standard size, and it performs just like a real SRAM but at slightly slower speeds.  The one we use in the lab for testing and development has a 55ns asynchronous R/W access time.  Dynamic RAM is still the best choice when looking at 1Gig to 4Gig of RAM for the system.  Some Embedded Processor SoCs only allow a few address lines to be brought out to the pins, limiting what the processor can do externally; this will be part of the KSR list for review during the selection process.
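Two of the points above lend themselves to a quick check in code: how many addresses a given number of address lines can reach, and whether a proposed memory map has overlapping entries.  The map below is a hypothetical 24-bit layout of our own; the names, addresses and sizes are illustrative only.

```python
from itertools import combinations


def address_space(address_lines):
    """Number of distinct addresses reachable with the given address lines."""
    return 2 ** address_lines


def find_overlaps(memory_map):
    """memory_map: list of (name, start, size). Returns the pairs that collide."""
    overlaps = []
    for (name_a, start_a, size_a), (name_b, start_b, size_b) in combinations(memory_map, 2):
        if start_a < start_b + size_b and start_b < start_a + size_a:
            overlaps.append((name_a, name_b))
    return overlaps


memory_map = [
    ("boot ROM", 0x000000, 0x010000),
    ("PSRAM",    0x010000, 0x800000),  # 4M x 16 bits = 8 MiB
    ("I/O",      0xFF0000, 0x010000),  # device registers at the top of memory
]
print(address_space(24))          # 16777216
print(find_overlaps(memory_map))  # []
memory_map.append(("UART", 0x80F000, 0x002000))
print(find_overlaps(memory_map))  # [('PSRAM', 'UART')]
```

Running a check like this against the planned map before committing to hardware catches the case where a device register block is accidentally placed on top of RAM.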

I/O Interface Control BUS:
The I/O Control BUS is dependent on the processor type, as we discussed previously.  If the CPU is a Memory-Mapped I/O type then this block would be left out, since all I/O device communication is the same as memory I/O.  If the CPU is a Dedicated I/O type then a separate I/O address and I/O databus would be part of the chip.  Generally, Dedicated I/O type CPUs use a single databus with separate address and control lines.  The Intel processor line uses the first 16 bits of the address lines and separate control lines to load a device address, and the data lines are shared system wide.  The data lines are generally buffered if DRAM is used, since it requires a DRAM controller to refresh the memory.  The NXP 68K series are Memory-Mapped I/O CPUs; the Intel x86 series processors are Dedicated I/O type CPUs.  Memory-mapped I/O may be controlled by the MMU or MPU, allowing better segmented control of device data; however, the dedicated locations also introduce security vulnerabilities that we will cover as we progress in the series.
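The difference between the two I/O styles can be sketched in Python.  Both classes, the register address 0xF0 and the values written are hypothetical and chosen only to show that the same number means different things in the two schemes.

```python
class DedicatedIOBus:
    """Separate I/O address space selected by its own control line."""
    PORTS = 2 ** 16  # 16 address bits give 65,536 port addresses

    def __init__(self, memory_size):
        self.memory = bytearray(memory_size)
        self.ports = {}

    def mem_write(self, address, value):
        self.memory[address] = value

    def io_write(self, port, value):
        if not 0 <= port < self.PORTS:
            raise ValueError("port outside the 16-bit I/O space")
        self.ports[port] = value


class MemoryMappedBus:
    """One address space; device registers occupy ordinary addresses."""

    def __init__(self, memory_size, devices):
        self.memory = bytearray(memory_size)
        self.devices = devices  # address -> callable(value)

    def write(self, address, value):
        if address in self.devices:
            self.devices[address](value)  # the write reaches a device
        else:
            self.memory[address] = value


sent = []
mm = MemoryMappedBus(256, {0xF0: sent.append})  # hypothetical register at 0xF0
mm.write(0x10, 7)
mm.write(0xF0, 65)
dio = DedicatedIOBus(256)
dio.mem_write(0xF0, 65)  # plain memory
dio.io_write(0xF0, 65)   # same number, different address space
print(sent, mm.memory[0xF0], dio.memory[0xF0], dio.ports)  # [65] 0 65 {240: 65}
```

In the memory-mapped case any code that can write to address 0xF0 can drive the device, which is why MMU or MPU control over those locations matters.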

Clock Interface Controller (CIC) for CPU process timing
The Clock Interface Control handles all the clock requirements of the integrated system.  The CIC handles synchronization for all peripherals, internal and external, ensuring that there are unique time sequences for each instruction process and data transfer throughout the architecture.  The CIC also handles cascading multiple processors and systems by allowing external clock inputs to synchronize cascaded processors.  This is different from Hyper-Threading in the Intel x86 processor series; cascading embedded processors externally creates a fully independent multi-processor configuration.  On our core platform we will review the concept of adding a second processor for the control, security and privacy functions.  The CIC is dictated by the processor manufacturer, so we will review the specifications as we add this to the KSR list.

Power On Sequence (POS) Initialization Test unique to the CPU core to start executing instructions from a defined memory location.
The POS is one of the critical vulnerabilities of all processor controlled systems.  If one can intercept the POS, it is possible to add code and change the setup's protected area to accept this code.  This is called installing a root kit, and it can be anything from a simple keyboard monitor to a worm or virus that infects the entire system.  We will address this in more detail as we get into the software and power on security of the IoT Platform.
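One common way to detect code added before power on is to compare a hash of the boot image with a known-good value before jumping to it.  The Python sketch below is only an illustration of that general idea, not the method this series will settle on; the function names, the four-byte stand-in image and the assumption that the stored digest sits in tamper-proof memory are all ours.

```python
import hashlib
import hmac


def image_is_trusted(image, expected_digest):
    """Compare the SHA-256 of a boot image with a digest held in protected memory."""
    actual = hashlib.sha256(image).digest()
    return hmac.compare_digest(actual, expected_digest)


def power_on(image, expected_digest):
    if not image_is_trusted(image, expected_digest):
        return "halt: boot image modified"
    return "jump to reset vector"


good_image = b"\x01\x02\x03\x04"                     # stand-in for firmware bytes
stored_digest = hashlib.sha256(good_image).digest()  # assumed to be tamper-proof
print(power_on(good_image, stored_digest))            # jump to reset vector
print(power_on(good_image + b"\x90", stored_digest))  # halt: boot image modified
```

The check is only as strong as the protection on the stored digest and on the checking code itself; if the POS can be intercepted before the check runs, the check can be skipped.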

At this point we will end this part and cover the remaining part of the CPU core system in the next part of the series.

The majority of vulnerabilities are caused by poor hardware and software designs that allow unwanted code to be injected via some remote or local mechanism.  Understanding the core processor hardware and its peripherals is a good place to start designing in security during development.  The embedded processor part of the series is intended to be an overview prior to the actual development of the Core IoT Platform, in order to design in security and privacy policies.  We will cover hardware vulnerability areas during the actual hardware development.  Security cannot be an afterthought; adding it after development will always create vulnerabilities that give a back door for unwanted intrusion into the system.  Today the majority of hacks come from poor security behavior, through e-mails and attached documents that contain embedded malicious code.  Poorly written websites may carry hacked code that accesses your system remotely through scripts written in several web-based languages that are easily executed on a desktop or server.  We will cover these vulnerabilities in detail in the software sections of the series.

This part of the series is just the beginning and is meant to be an outline for reference.  The embedded processor world is expanding at such a rate that security is being bypassed in the race to be fastest to market, at the expense of the public's privacy and safety.

Reference Links for Part 8:
The majority of the Internet scheme and protocol information is from a few open public information sources on the net: the IETF (Internet Engineering Task Force) RFCs, which explain details on the application of the protocols used for both IPv4 and IPv6 as well as experimental protocols for the next generation Internet, and the Network Sorcery web site.  The remainder of this series on the IoT platform will be from BASIL Networks MDM (Modular Design Methodology) applied with the Socratic teaching method.  Thank You - expand your horizon - Sal Tuzzo

Network Sorcery:
The Internet Engineering Task Force: IETF - RFC references

Reference documents for continued reading are listed below:

[PDF] Virtual Memory and Memory Management - Angelos Stavrou, George Mason University

Part 9+  "Preliminary Outline" Embedded Processor Systems: Continued

Part 7 Network Protocols
- Network, Transport & Application -Continued - The CRC-32 and Checksums (Nov 27, 2017)

Part 9 IoT Core Platform - SoC Core Processor of Embedded Systems -Vulnerabilities (Mar 16, 2018)


Publishing this series on a website or reprinting is authorized by displaying the following, including the hyperlink to BASIL Networks, PLLC either at the beginning or end of each part.

BASIL Networks, PLLC - Internet of Things (IoT) -Security, Privacy, Safety-The Information Playground Part-8  Embedded Processor Systems: The Core Processor Of Embedded System Configurations- (January 12, 2018)

For Website Link: cut and paste this code:

<p><a href="" target="_blank"> BASIL Networks, PLLC - Internet of Things (IoT) -Security, Privacy, Safety-The Information Playground Part-8 Embedded Processor Systems: <i>(Jan 12, 2018)</i></a></p>



Sal (JT) Tuzzo - Founder CEO/CTO BASIL Networks, PLLC.
Sal may be contacted directly through this site's Contact Form or
through LinkedIn

Copyright© 1990-2019 BASIL Networks, PLLC. All rights reserved