The BASIL Networks Public Blog contains information on Product Designs, New Technologies, Manufacturing, Technology Law, Trade Secrets & IP, Cyber Security, LAN Security, Product Development Security
Internet of Things (IoT) -Security, Privacy, Safety-Platform Development Project Part-9
saltuzzo | 16 March, 2018 17:35
Part 9: Embedded Processor Systems - Embedded (SoC), (SIP)
"Firmness of purpose is one of the most necessary sinews of character and one of the best instruments of success. Without it, genius wastes its efforts in a maze of inconsistencies." - Lord Chesterfield
Part 1 Introduction - Setting the Atmosphere for the Series (September 26, 2016)
Quick review to set the atmosphere for Part 9
What we want to cover in Part 9:
Let's Get Started:
Wow, things change fast in the technology arena, with everyone writing about CPU vulnerabilities such as Meltdown and Spectre. This series is not going to disrespect or criticize the huge on-going effort to develop operating systems for the industry. So let's make a positive attempt to put this into an engineering perspective: before a problem can be solved it has to be analyzed and understood, using facts that have been vetted and a clear statement of the desired results. There are two main technical papers covering Meltdown and Spectre, but before we get to them we have to put some core facts on the table. Two vulnerability issues exist. The first is the internal control of all peripherals by a secured processor running an internal OS that is part of the multi-core processor chip. The second is the virtual page vulnerability identified by the Meltdown and Spectre publications.
OK, not to give away my age, but a little history is good to understand so as not to repeat it. I started working with designs in the 70's on the Data General NOVA and Micro-NOVA and Digital Equipment Corporation's PDP-11 series minicomputers. By the late 70's and early 80's I started designing with Intel processors in the days of the 8080 when it was first released, then moved to the 8088, and I was fortunate enough to own an IBM-PC in December 1981 that was used for developing hardware and software. No, I do not use a walker, although I do sit a lot more, so you might call me seasoned or just strong minded. So what does this have to do with the price of tea? Nothing. It does, however, have to do with the advancement of technology in the industry over time, and with the forgotten problems that were solved along the way, as seen by someone who experienced those advancements while working in development for over 35 years. There were two main developments in the computer processor world at that time: the mini-computer, mostly CISC architecture, and the microprocessor, with its RISC architecture advancements. Both had teams of talent. Keep in mind that operating system development started way back in the 1960's as well, long before MINIX was first released in 1987.
Processors in those days required a lot of external support chips to make up the system motherboard. Memory management was studied and implemented in the chips, however it required more development to become practical. As time passed and fabrication process technology improved, more of the support chips were integrated into the processor support chipsets. Figures 9.0a and 9.0b show the 19" rack mount relics of the computer age in the 70's. There are still some PDP-11 systems controlling power plants in Canada.
A Brief Historical Summary of Technology Advancements Over The Years.
IBM in 1969 solved the problems of virtual memory for commercial computers by proving their Virtual Overlay System was reliable on the IBM-370. From there storage and chip technology advanced to the point that paged memory became feasible, and Digital Equipment Corporation incorporated Virtual Address Space in their VAX/VMS (Virtual Address eXtension / Virtual Memory System) as well as their Multi-User/Multi-Tasking OS RSX-11M and MUMPS. Data General Corporation initially developed RDOS (RealTime Disk OS). Data General was acquired by EMC in 1999, and EMC went on to acquire VMware, maker of Virtual Machine software, in 2004. The main architectural difference between Digital Equipment and Data General is that Digital Equipment used a Memory Mapped I/O architecture, where the I/O devices occupy actual memory addresses, while Data General used a dedicated I/O architecture, where every I/O device has a unique address that is not part of the memory addressing and is reached through a specific I/O instruction set that is accumulator or register based. The mini-computers were all CISC architecture computers. The x86 instruction set used by Intel and AMD is also classified as CISC, although modern x86 processors break instructions down internally into simpler, RISC-like micro-operations.
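To make the Memory Mapped I/O versus dedicated I/O distinction concrete, here is a minimal toy sketch in Python. It is not a model of a real PDP-11 or NOVA; the class names, the device address `0xFF00` and the device code `0x11` are all hypothetical and chosen only for illustration.

```python
# Toy model only: contrasts memory-mapped I/O with dedicated (port) I/O.
# Class names, addresses and device codes are hypothetical.

class Device:
    """A trivial peripheral with a single data register."""
    def __init__(self):
        self.register = 0


class MemoryMappedMachine:
    """Memory-mapped style: the device register occupies a memory address."""
    def __init__(self, device, device_addr):
        self.ram = {}
        self.device = device
        self.device_addr = device_addr

    def store(self, addr, value):
        # The same store instruction reaches RAM or the device,
        # depending only on the address.
        if addr == self.device_addr:
            self.device.register = value
        else:
            self.ram[addr] = value

    def load(self, addr):
        if addr == self.device_addr:
            return self.device.register
        return self.ram.get(addr, 0)


class DedicatedIOMachine:
    """Dedicated I/O style: devices live in a separate I/O address space."""
    def __init__(self):
        self.ram = {}
        self.accumulator = 0
        self.devices = {}  # device code -> Device

    def store(self, addr, value):
        self.ram[addr] = value  # memory instructions never touch a device

    def load(self, addr):
        return self.ram.get(addr, 0)

    def io_out(self, device_code):
        # Dedicated I/O instruction: accumulator -> device register.
        self.devices[device_code].register = self.accumulator

    def io_in(self, device_code):
        # Dedicated I/O instruction: device register -> accumulator.
        self.accumulator = self.devices[device_code].register


# Memory-mapped: an ordinary store to 0xFF00 lands in the device.
console = Device()
mapped = MemoryMappedMachine(console, device_addr=0xFF00)
mapped.store(0xFF00, 65)
mapped.store(0x1000, 7)

# Dedicated I/O: memory address 0x11 and device code 0x11 are unrelated.
printer = Device()
dedicated = DedicatedIOMachine()
dedicated.devices[0x11] = printer
dedicated.store(0x11, 99)       # plain RAM write, the printer is untouched
dedicated.accumulator = 66
dedicated.io_out(0x11)          # only the I/O instruction reaches the printer
```

The security-relevant point of the sketch is that in the memory-mapped machine any code able to write to an address can drive the device, while in the dedicated machine device access is confined to a separate set of instructions.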
Over time Intel and AMD addressed Virtual Technology in steps. First came multi-tasking applications running in Virtual86 mode, which was part of the 80386 processor and added hardware instructions for task switching with a TASK handler along with Global and Local Descriptor Registers. This enhanced virtual operation and led the way to multi-core processors. Until multi-core processors were introduced, there were multi-processor motherboards, 2-way and 4-way, that carried some unique logic in order to operate in a symmetrical environment, known as Symmetrical Multi-Processors (SMP) or Multi-Processors (MP). Intel's Xeon-DP chips were used for the 2-way motherboards, which is what we still use here for testing.
SMP configurations share the BUS and memory and are controlled by the main processor, as shown in Figure 9.2, Symmetrical Multi-Processors (SMP) configuration. Each application ran in Virtual86 mode and each processor was capable of multi-tasking as well, increasing the performance of the entire system. SMP configurations ran only one OS at a time and required a reboot to run a different environment, which was typically done in those days with a multi-boot startup program.
This SMP configuration was used for several years and continues today with hyperthreads. Each hyperthread is presented as a Virtual Processor, which looks like a separate core but is still controlled by the processor core the hyperthread is attached to. Multi-core with hyperthreads took performance to the next level in the Xeon processors, which are still used today. The latest Xeon is a true multi-core Virtual Technology processor chip that allows multiple operating systems as well as virtual multi-tasking per core.
For a true Virtual Machine by definition each processor must be independent, no dependencies on other processors in the core, and be capable of sharing all the functional peripherals attached to the system on demand. Both Intel and AMD have protected IP as well as patents to protect their core technology advancements.
In 2005 both Intel (Intel® VT) and AMD developed their own Virtual Technology methodologies, ending up with the same results using multi-core processor chips, and entered the Virtual Technology market. This is when Intel and AMD took total control over the Virtual Memory and Virtual Processor core in order to implement their proprietary Virtual Technology. Figure 9.3 shows the current Virtual Technology block diagram; a portion of the technology is proprietary and protected from the public.
Prior to 2005 there were no virtual processor desktops, just SMP configurations. Intel, however, had accumulated a few years of experience developing the Memory Management Unit (MMU), which was incorporated into the i486 chip along with the Floating Point Unit (FPU). That experience, together with knowledge of Virtual Address Space technology gained partly from the VAX/VMS software, led the way to the next generation of processors.
The new Virtual Technologies started to gain momentum, and several discussions surfaced about hardware modifications, peripherals, what is essential for a desktop system, and how small it can be while maintaining functionality and performance. Keep in mind that the only peripherals in the x86 processors were the keyboard and mouse controller, the FPU, the MMU, the internal BUS arbitration logic and the easy back door for hacking, the "System Management Mode (SMM) controller". All of these incorporated peripherals were controlled by external drivers loaded when the OS was loaded, except the keyboard and mouse, which were part of the BIOS from the get-go; software driver updates were downloaded and installed externally. Windows 3.x was released in 1990, ran on top of DOS, and introduced a pseudo virtual memory driver mechanism called loadable virtual device drivers (VxD) that was capable of running applications in protected mode. Processor support advanced to include the DRAM controller, MMU and real world BUS arbitration control in separate chipsets for each processor, called the Northbridge and Southbridge chipsets. They handled different I/O functions for the specified x86 processor.
The advancements continued with Windows 95 in 1995, Windows NT, and Windows 98 in 1998, which introduced the Windows Driver Model (WDM). Along the way Windows NT also introduced a new file structure (NTFS), very different from the standard FAT32, and entered the multi-tasking/multi-user server arena. Windows NT was very easily hacked; in fact, while I was at one company one of the engineers hacked all the user passwords and sent them to upper management to prove a point. Windows XP was introduced in 2001, by which time Microsoft already had several years of experience with paged memory management and multi-tasking. Windows NT also changed names to Windows Server 200x and maintained the NT file structure for the XP environment as well.
The advancements and demand for faster, more powerful processing kept the pressure on the key players in the x86 processor marketplace, Intel®, AMD® and Microsoft®, to develop a more self-adjusting, secure, user-oriented Windows OS. There were and still are many discussions of the pros and cons of incorporating the video controller into the same chip as the CPU, along with other peripherals like disk management, USB, etc. Peripherals were of course incorporated inside the chip; however, as a first level of change the drivers still lived outside and were installed as virtual drivers when the OS was installed. Remote control of the drivers was short lived: in 2006 Intel® and AMD® incorporated a complete OS internal to their virtual processor chips. The complete control exercised by this internal OS was classified as a trade secret and was well kept until the vulnerabilities surfaced, and that is where we are today.
Vulnerabilities and Concerns for Chips Today:
The topology advance did not happen overnight. It took several years to move away from a collection of LSI (Large Scale Integration) support integrated circuits: Northbridge/Southbridge chipsets, USB, graphics and other LSI controller ICs. As the density and reliability of wafer fabrication technology increased, it became cost effective to incorporate more and more technology into a single chip. As this process was being implemented, so were the drivers. Intel motherboards shipped with a complete set of drivers for all the controllers on the board, as did those of other motherboard manufacturers, and system integrators came to expect a setup disk for selected operating systems. The practice of external software drivers still exists for third party PCIe cards and other peripherals that enhance performance or customize a system for special applications, and it has given the OS manufacturers opportunities to advance their products to better support their customers.
Incorporating peripherals into the processor chip is really not a concern. However, incorporating the drivers, which in turn require some type of OS to control them and interface them to the applications, does change the game. The "Secure Boot" process and the added hardware and software for security changed the security platform. The pushback is coming not only from the operating system software manufacturers but also from the security industry and third party software developers. Now that vulnerabilities for a multitude of processors have been published, businesses are looking at damage control and at how they can protect themselves from intruders, since it does not appear that a real solution is coming any time soon. If a hardware rollover of the silicon chip is required, you are looking at a year or more down the road. That is a long time to leave that many vulnerabilities exposed in the industry.
The incorporation of an operating system on the CPU creates a few concerns that have very little to do with the actual operating system itself. The issue is that it puts a hardware manufacturer of CPU chips in the operating system arena and lets it control how software will communicate with the chip. This concept has been a concern for many years, all the way back to the DEC and Data General days. When I consulted for DEC back then, the talk throughout the corporate levels was "We are a hardware company forced to include an OS with the product". Data General had the same point of view; however, to remain competitive they too were forced to develop operating systems. Things would have been a lot different if RSX-11 and RDOS had received the same support as the IBM-PC, spinning off with many applications built around them. As it turned out, each application was an engineering project within a specific company, and the software and hardware became proprietary and not for sale on the public market.
The next issue is how to communicate with the new All-In-One chips. Hence the UEFI (Unified Extensible Firmware Interface) specification, which lays out the specifics of how to interface with the processor, all 2899 pages of it. From the first release on January 31, 2006 to Rev 2.7 in May 2017 there have been a dozen updates each year. Alongside it are the Platform Initialization (PI) and the Advanced Configuration & Power Interface (ACPI) specifications, which address the core of security and protection for microprocessors in today's marketplace. What makes this interesting is that someone probably knew of the many vulnerabilities long before they were published, along with the need to ensure that interfacing with the UEFI is reliable for their application. For the full set of specifications go to the UEFI Forum.
We hope to conduct a few tests here to monitor the communications activity when the system is turned off and when it is in sleep mode. Updating the internal operating system is also an issue, as is how an update will affect the outside OSs that have to communicate through it. For the Intel line the internal OS has been identified as MINIX 3, a member of the MINIX family that has been around for about 30 years and has weathered the test of time as a very efficient core OS. The AMD Ryzen series uses a proprietary OS controlled by a 32 bit ARM Cortex A5, which monitors and maintains the security environment. We will get to the operating systems when we address the software development part of this series. There is no criticism towards any of the operating systems, only admiration for the many hours spent developing and maintaining them and then giving several of them to Open Source for all to use and learn from.
There have been so many vulnerabilities over the years in so many products during the introduction of evolving technologies. Many have been addressed, and some that have been ignored are still prevalent; it is a real world. Many issues have been solved with the release of updates and patches. Understanding and publishing a vulnerability purely at a technical level always leads to some type of solution that may be administered to fix it. I doubt that this methodology will change any time soon, since new code and applications are impossible to test against every type of hack. As I stated in Part 1, "Where Do We Start": all manufacturers have the right to develop anything they want, sell it and make a profit from the technology. Manufacturers also have the right to control and maintain the intellectual property rights of the technology developed within their product. Today we have a large volume of open source software under the GNU license that any manufacturer may use and alter as long as they maintain the GNU identification requirements of the source. As new products, especially processors, enter the market, protecting intellectual property and technology becomes more difficult: advancements within the Internet and the ability to hack just about any server make patents and IP hard to protect. The next best thing is a trade secret, which is what the major players and many corporations are now practicing, and practicing very well, since it took about 10 years for the public to become aware of the current processor vulnerabilities.
It is obvious that Intel and AMD kept their trade secrets well hidden until a short time ago, when the vulnerabilities were uncovered. The concern is not the peripherals on a single chip. It is all about the internal OS that controls the peripherals, which applications have to go through, and the possibility of unwanted access to the systems the processors are used in. The "Management Engine" interface has taken the lead in the pushback. Trade secrets become a problem when they cause damage to the user by not disclosing the risks, as many are experiencing, and over the last few months more and more risks have been exposed. Today there are so many processors inside everything in everyday business and private life that intrusion vulnerabilities are magnified 1000-fold. Considering the more than 1.5 billion cell phones and several hundred million tablets, desktops and laptops in use, these vulnerabilities can create tremendous losses on all fronts.
OK, the second issue, which we are not going to reiterate, is the memory paging vulnerability of Meltdown and Spectre, since there are so many articles and opinions already out there. This went viral on the Internet, with just about every publication except Teach Yourself Basket Weaving 101 rephrasing it for their readers. The technical side of this vulnerability is worth noting in order to understand what we should keep in mind and address during the development of the Core IoT Platform. The technical details about Meltdown and Spectre are linked for those who are interested. We will talk about a solution next. Remember that the MMU hardware is a peripheral, and the API that drives it is software/firmware that is part of the integrated OS the processor manufacturers incorporate into their chip, which has opened a Pandora's box of issues. The x86 architecture of the Intel x86 line and the AMD Ryzen series is now the center of a wide range of publications on the vulnerabilities.
A Proposed Solution with Facts - May Not Require A Chip Redesign - Summary:
The first argument on vulnerabilities is centered around the trade secret part of the internal operating system software on the processor chips: the Intel x86 line using MINIX 3, and the AMD Ryzen line using proprietary microcode, each taking control of all the peripherals and the Virtual Technology operations of the chip. Since the internal code has access to the internal peripherals, some microcode must exist to access them. With that being said, a reasonable solution that would require a microcode update would be as follows.
MINIX has a microkernel architecture in which the drivers are user applications; the peripherals still go through the kernel for register access, which is typical of the low level access an application driver needs. I am not sure how AMD performs the similar task, however it must have some type of microkernel or monolithic OS with drivers, considering that the latest round of Ryzen vulnerabilities has been published. Let's start with the Intel x86 processors. This would apply to AMD as well, although the OS is different.
All this leads to a relatively reasonable microcode update that will give the processor, with all of its capability and accountability, back to the programmers, developers and OS manufacturers. There has to be an additional mechanism to reset the BIOS to a default state if the microcode and OS update is interrupted or fails to initiate, and the default BIOS setup should be encrypted. The microcode and OS updates may be accomplished with one update process, or companies may be allowed to perform the distribution internally, manually or automatically through their servers. The time frame for distribution will depend on the complexity of the updates and the companies involved. The main concern here is the updates themselves and who creates the microcode, in order to assure the public that the security is traceable for all concerned.
The second issue deals with the Virtual Technology paging vulnerabilities, which may be addressed in the remaining space in the 8MiB of EEPROM. Special intercepts may be incorporated to stop the full dump and to change the way the page table is rewritten. Some scrambling may also be introduced to protect the raw data from being dumped. This is probably the smallest hit on performance, and it may not even be noticed if removing MINIX and the other support nesting brings an increase in performance.
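The fallback requirement described above, returning to a default state when an update is interrupted or fails, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FirmwareStore`, `UpdateInterrupted` and the `fail_after_chunks` switch are hypothetical names invented for the sketch, the image contents are placeholders, and a SHA-256 digest stands in for whatever signature scheme would really be used to make the creator of the update traceable.

```python
import hashlib


class UpdateInterrupted(Exception):
    """Raised by the simulation when the update stops part way through."""


class FirmwareStore:
    """Hypothetical store: a protected default image plus an active image."""

    def __init__(self, default_image):
        self._default = bytes(default_image)  # known-good default, never overwritten
        self.active = self._default

    def apply_update(self, image, expected_sha256, chunk_size=4,
                     fail_after_chunks=None):
        """Stage the update, verify it, and only then make it active.

        Any interruption or digest mismatch resets to the default image.
        """
        staged = bytearray()
        try:
            for index, start in enumerate(range(0, len(image), chunk_size)):
                if fail_after_chunks is not None and index >= fail_after_chunks:
                    raise UpdateInterrupted("simulated power loss")
                staged += image[start:start + chunk_size]
            if hashlib.sha256(bytes(staged)).hexdigest() != expected_sha256:
                raise ValueError("digest mismatch")
        except (UpdateInterrupted, ValueError):
            self.active = self._default  # fall back to the default state
            return False
        self.active = bytes(staged)
        return True


default_image = b"DEFAULT-BIOS-SETUP"
new_image = b"NEW-MICROCODE-AND-OS"
new_digest = hashlib.sha256(new_image).hexdigest()

store = FirmwareStore(default_image)
first = store.apply_update(new_image, new_digest)                        # completes
active_after_first = store.active
second = store.apply_update(new_image, new_digest, fail_after_chunks=2)  # interrupted
active_after_second = store.active
third = store.apply_update(new_image, "0" * 64)                          # wrong digest
```

The design choice worth noting is that the active image is never written in place: the update is staged and verified first, so a failure at any point leaves a bootable default.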
Why This Will Work:
Even if some unique addressing scheme were used, both Intel and AMD already have drivers that work for their implementations of the controllers. The peripherals do not have to be removed; they just have to allow programming control from the outside. This may or may not involve some hardware pin reassignment or microcode to redirect control to the external BUS architecture and to share the internal memory for graphics and other peripherals. However, since one of the cores already has access, a small amount of microcode should be able to handle it. This does not change any "Virtual Technology" capabilities of the chip, and in some cases it may even add performance, since it may free up a couple of cores that can then be used as part of the OS.
As for security, this allows the software to be secured by the OS manufacturer and not the hardware chip maker, eliminating a level of who has what access and when. This also gives OS manufacturers more flexibility to maintain the security of their platform. This solution does not change the way peripherals are required to perform, and it gives the Virtual OS broader flexibility to increase performance. Peripherals like the Network Interface Controller, the USB, etc. should be completely controlled from outside the chip, and access to these peripherals should be allowed only through the OS. There is a series of methodologies that we have been working on for a few years now and are planning to incorporate into the Core IoT Platform for this series.
The extra internal memory, probably about 5MiB, and the associated cores could easily be used to protect the processor from intrusion. Corporations could use that area for proprietary company ID encryption and other security protocols that are unique to the corporation, and OS manufacturers could add their own standard protection for consumers who purchase Windows, Linux or any other OS for retail consumer use. This gives control and accountability back to the operating system and developers. These updates allow unique markets to emerge for the security of the consumer and corporate platforms, and allow for secure code to prevent the current missed page interrupts behind Meltdown and Spectre, among others that are not mentioned here.
The Free Enterprise System:
"Those who cannot remember the past are condemned to repeat it" - George Santayana
Part 10 "Preliminary Outline" Embedded Processor Systems: Continued
Part 1X+ More To Come -"Preliminary Outline" Embedded Processor System: Continued
Reference Links for Part 9:
UEFI Specifications Version 2.7 All 2,899 pages
Server Security Advisory AMD Processors
Publishing this series on a website or reprinting is authorized by displaying the following, including the hyperlink to BASIL Networks, PLLC either at the beginning or end of each part.
BASIL Networks, PLLC - Internet of Things (IoT) -Security, Privacy, Safety-The Information Playground Part-8 Embedded Processor Systems: The Core Processor Of Embedded System Configurations- (December 24, 2017)
Internet of Things (IoT) -Security, Privacy, Safety-Platform Development Project Part-8
saltuzzo | 12 January, 2018 14:58
Part 8: Embedded Processor Systems - System On a Chip (SoC), System In a Package (SIP)
"We are drowning in information but starved in knowledge:" - John Naisbitt
Part 1 Introduction - Setting the Atmosphere for the Series (September 26, 2016)
Quick review to set the atmosphere for Part 8
What we want to cover in Part 8:
Since the beginning of this series in September 2016 there have been many hacked IoT devices using COTS embedded hardware and software, bringing high visibility to security and privacy. The current database of breaches encouraged us to present hardware and software in more detail to assist designers and educate newcomers on the new challenges of security and privacy. Due to the complexities of processors today we will continue to follow our technical presentation methodology, Overview → Basic → Detailed (OBD). We will be addressing the many sections of the Core IoT Platform separately to keep the presentations at a reasonable length. The full details will be presented during the actual hardware, firmware and software design stages.
The preliminary preview of the entire index is shown below and will be updated with links as we progress through the live series. Comments are welcome both publicly and privately. If you want to participate privately, we will acknowledge your participation only with your permission. We hope to share insight into hardware and software solutions to the security, privacy and safety issues of IoT devices. Parts of the embedded processor series will apply to desktops and tablets as well as embedded IoT devices, since they all share the common elements of a CPU system.
A Brief CPU Summary:
In order to prevent such intrusions and unwanted code from being executed we have to understand the hardware, firmware and software details being used on the Core IoT Platform. When we analyze the embedded processor marketplace we see that there are really only a handful of different categories of processors and a large variation in the licensing of the same or similar cores. It would take a book of several hundred pages to cover all the processor differences, which is really outside the scope of this series; however, we will look at the major players in 32 bit CPU cores and start there, which is well within our scope. The major players are Intel®, AMD®, ARM®, NXP®, Microchip® and a smaller but still applicable x86 player, ZFMicro®. Three of them are x86 based: Intel, AMD and ZFMicro. NXP, through M&A, acquired the original Motorola/Freescale 68K CPU line; ARM Ltd stands alone and has licensed its technology to many players, including all of the above. Microchip is in a unique position due to its several M&As, including Atmel, SMSC and others. The processor lines of these major players cover a broad spectrum of applications, which makes it difficult to select a processor that will allow full control to ensure security and privacy at the core level.
All the major players compete with their own versions of an IDE (Integrated Development Environment) package, and once one is selected you are basically locked into that manufacturer. Using a third party package allows a fast turn of a product to market, however it does not guarantee privacy or security. In order to ensure security and privacy, a detailed understanding and disclosure of the internals of the processor and software packages used is a must, as is access to the core macro assembler in order to incorporate a user's integrated security methodology. That being said, we will now present the basics of CPU architectures so that we can ask the right questions when performing our selection process.
The variations of processors today range from a dollar to hundreds of dollars depending on the bit size (8/16/32/64 bits), the speed (from a few kHz to gigahertz) and the integrated peripheral functions. The main function of the CPU is to execute a programmable sequence of instructions fetched from a memory system to control a user's process. What makes CPUs different in the industry is the Micro-Coded ROM (MCR), which identifies the unique set of machine instructions for each CPU manufacturer. If you change the Micro-Coded ROM you change the processor instruction set. Even though the processor still controls memory access and some logic functions, it then needs its own unique Macro-Assembler. Open source compilers like GCC have incorporated several families of processors into the compiler, allowing the user to write code in C and compile it for several types of processors.
The remaining blocks of the CPU all perform similar functions. The memory controller and sequencer controls access to memory locations and also controls the CPU jump tables. The Central Process Controller, or Instruction Process Decoder, is what performs the instructions that are fetched from memory and keeps track of the programmed instructions with a program address counter. The Execution Control Unit performs basic arithmetic and logic functions and incorporates a set of general purpose registers. Of the remaining interfaces, the External Memory stores the application program and other application parameters, and the Data/Address BUS & Control is for adding user peripherals. The BUS control allows direct access to the memory controller for fast data transfers. The final section, the Power-Up Entry, is a special one time execution during power-on that allows the user to enter a unique memory address from which to start fetching instructions to execute. To understand the core requirements of security one should understand how the internals of the selected processor function. This introductory presentation will shed light on the complexities of designing an embedded system with the highest level of security possible, from the core hardware to firmware to application software.
During the CPU presentation we will be creating a Key Security Requirements (KSR) list to be used for the Embedded Processor selection process. It is important to keep in mind that not all of the security requirements may be met by a COTS (Commercial-Off-The-Shelf) embedded processor, which may shed some light on the limitations of COTS embedded systems and the compromises that are being made to put a product on the market.
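The point that changing the Micro-Coded ROM changes the instruction set can be illustrated with a small Python sketch. Both "ROMs" below are hypothetical opcode tables invented for this example and do not correspond to any real processor; the execution loop is deliberately the same for both.

```python
# Toy sketch: one execution engine, two hypothetical micro-coded ROMs.
# Each ROM maps an opcode to (mnemonic, operation on an 8 bit accumulator).

def op_load(acc, operand):
    return operand & 0xFF


def op_add(acc, operand):
    return (acc + operand) & 0xFF


def op_sub(acc, operand):
    return (acc - operand) & 0xFF


ROM_VENDOR_A = {0x01: ("LOAD", op_load), 0x02: ("ADD", op_add), 0x03: ("SUB", op_sub)}
ROM_VENDOR_B = {0x01: ("ADD", op_add), 0x02: ("SUB", op_sub), 0x03: ("LOAD", op_load)}


def run(rom, program):
    """Execute (opcode, operand) pairs using whichever ROM is installed."""
    acc = 0
    trace = []
    for opcode, operand in program:
        mnemonic, operation = rom[opcode]   # the ROM decides what the opcode means
        acc = operation(acc, operand)
        trace.append(mnemonic)
    return acc, trace


# The same machine code bytes, decoded by two different ROMs.
program = [(0x01, 10), (0x02, 5), (0x03, 3)]
result_a = run(ROM_VENDOR_A, program)   # LOAD 10, ADD 5, SUB 3
result_b = run(ROM_VENDOR_B, program)   # ADD 10, SUB 5, LOAD 3
```

The identical byte sequence produces a different result under each table, which is why each instruction set needs its own assembler, and why a compiler such as GCC needs a separate back end per processor family.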
CPU (Central Processor Unit) What They All Have In Common:
The roadmaps for Intel®, AMD®, Microchip®, NXP® and other manufacturers of embedded processors are very well documented, leading us from a 4 bit microprocessor (for history, read about the Intel 4004 microprocessor) up to the 64 bit processor families on the market today. Our intent here is to present the core functionality of all Central Processing Units in order to understand how we will implement the security policies for the Core IoT Platform. The CPU is just a programmable block of logic gates that allows the user to program a set of instructions designed into the processor unit in order to connect real world peripherals, transfer data and perform arithmetic and logical computations on digital data.
When we perform an Internet search on Embedded Processors we get inundated with millions of hits on a variety of devices, from Single Board Computers (SBC) to general purpose MIPS, ARM, etc. micro controller CPUs and other names associated with the programmable device. These embedded systems are finished products that the manufacturer supplies with an associated IDE (Integrated Development Environment) package to get a product on the market fast. Putting a product on the market with a canned IDE, without knowing or understanding the software and the amount of control it gives you, raises questions about the vulnerabilities the IDE exposes to hackers. The better we understand the device we are programming, the more confidence we will have in securing the device for our application. With that said, let's start at the core of the embedded system, the Central Processing Unit. Figure 8.0 is a functional block diagram of a typical CPU. When we look at Figure 8.0 we see that the Instruction Process Logic Block is the central controlling core of the processing unit and everything else is an "internal peripheral" used as a support interface device for the core process block. The real world connections for the CPU are just an interconnecting buffer block that forms a protective communications mechanism to the internal BUS & Control. All central processing units have a few sections in common; they all have:
An Instruction Process Logic Block
• Instruction Fetch Queue Buffer
• An Instruction Process Decoder to execute a series of programmed instructions stored at specific memory locations.
• A Micro-Coded ROM (MCR) - Contains Machine Instruction Set - Registered, Patented or Trademarked.
• An Execution Control Unit - AKA - Arithmetic Logic Unit - performs data and logic manipulation
A Memory Access Controller, AKA, Memory Management Unit (MMU) - logic to fetch and store data to memory addresses.
A Memory Data/Address Interface Control BUS to attach External Memory, RAM, EEPROM, DRAM etc.
A Set of General Purpose Registers - usually eight registers plus a Stack, Status, Program & Control Register
A Small Internal RAM Buffer for general purpose registers for the Execution Control Unit
An External Data/Address Interface BUS to transfer data to/from the CPU internal logic
A Clock interface controller for CPU process timing.
A Power On Sequence (POS) unique to the CPU core to start executing instructions from a defined memory location.
Core Functions of the Central Processor Unit Architecture:
Today there are peripherals that include their own embedded processor, which accepts complex commands and so reduces the number of instructions the external main CPU controlling the system has to issue to communicate with them. This adds a new challenge to the security of the system since the peripheral today does not necessarily need the CPU to communicate, it just needs to have access to the local BUS. Some of the more advanced intelligent peripherals are easily hacked today due to the vulnerability of the BUS access and the Internet controller used to communicate through the Internet; hence, The Reaper IoT Botnets Infects Millions of Networks is just the beginning of the security and privacy challenges. The challenge today is to incorporate security in all parts of the designs, especially in the BUS protocols, in order to prevent unwanted intrusions and hidden root kits.
From Figure 8.0, which defines the Central Processor Core functions, we may then define anything outside the core as a peripheral, whether the peripheral function remains inside the actual chip or is connected to the external pins of the chip. If outside access is allowed to the CPU then full control is obtained over the IoT platform, and so enter the Security, Privacy & Safety Policies to ensure the CPU performs its programmed sequences without interference. It is important that we understand all vulnerabilities of the selected CPU core and how it communicates from the internal bus to the real world external bus, so enters the cliché "If you don't know where you are going any BUS will take you there" and the adventure begins. The Memory Access Controller is considered a peripheral that connects external memory to the CPU core in order to execute instructions. It is half of the memory interface peripheral and is one of the vulnerabilities that we enter as part of the KSR list. Some embedded processor SoCs include both EEPROM and RAM to address simple controllers, however for our Core IoT Platform we would like much more control than a fixed embedded system offers.
Instruction Process Logic Block
Instruction Process Decoder (IPD)
From a security point of view the IPD should not be accessible to the outside world and it should not be modifiable in any way. If you do a bit of research there are some processors that allow micro-code updates when loading specific Operating Systems. These patches are generally firmware patches linked to other BIOS (Basic Input/Output System) instructions to optimize the processor for the selected OS. Older processors before 2007 require the operating system to control all the BIOS functions as part of the OS and do not use the firmware BIOS except for loading the OS at POST; this is not the case today, which we will cover as we progress through the series.
Intel processors incorporate an SMM (System Management Mode) controller to hold the setup of system parameters for access to the required core functions that the OS has privilege to. Initially, System Management Mode was used for implementing Advanced Power Management (APM) features. However, over time, some BIOS manufacturers have relied on SMM for other functionality, like making a USB keyboard work in Legacy BIOS mode.
Some uses of the System Management Mode are:
System Management Mode can also be abused to run high-privileged rootkits, as demonstrated at Black Hat 2008 and 2015. The Sony BMG copy protection rootkit scandal in 2005 is just one example of a rootkit in the field. Robbing banks is against the law but still happens, and in the same way hackers are always tempted to build and execute these kits.
Types of Processors, MIPS, RISC, CISC, Pipeline, Clocked.
Processors that are not RISC based handle complex functions via a series of steps in sequence and are called CISC (Complex Instruction Set Computers), such as the Digital Equipment PDP-11 and VAX systems, which incorporated a Polynomial evaluation instruction as well as several other complex instructions in their CISC Instruction Set Architecture. The Intel IA32, IA64, x86 product line and the NXP 68000 are all CISC architecture processors.
The summary of RISC vs CISC is that the CISC processor performs a function in hardware, such as multiplying Reg0 by Reg1 and moving the result to Reg0, and only requires one line of code, MULT A, B, which in C code is A=A*B. The MCR only needs one word for the entire operation. The RISC processor performs this same Multiply in several steps such as:
MOV Reg1, B
MULT A, B
MOV Reg0, A
It is obvious that more memory and more clock cycles are required, as well as more compile time, for the same function. However, even though the CISC version looks like a single instruction, there are still clock cycles involved. Also, in a CISC based system only the result is saved, and if B has to be used again it has to be reloaded, whereas in a RISC based system B is still in the register. We will come back to this at a later time when we get into the processor selection and security. The FPGAs (Field Programmable Gate Arrays) available today allow us to apply both worlds to obtain greater performance. We will cover FPGAs and CPLDs (Complex Programmable Logic Devices) when we present the peripheral interface section of the series; keep in mind that Intel Corporation purchased Altera Corporation, so new advancements combining processor technology and FPGAs fall in the probability arena.
The Instruction Process Decoder (IPD) is a key mechanism that determines the complexity of a processor core. Every manufacturer of processors, be it a stand alone CPU or an embedded core surrounded by a host of peripherals, will incorporate an Instruction Process Decoder. It is this decoder logic that makes the block of logic a processor, because the IPD allows a sequence of instructions to be performed in a unique sequence of steps. The logic used within the IPD determines the type of Instruction Set Architecture for the processor. Simply presented, there are generally two types of I/O architecture that apply to processors: the first is Memory Mapped I/O architecture and the second is Dedicated I/O architecture. Both types incorporate a memory access architecture that may be used for I/O peripherals, however the Dedicated I/O architecture incorporates an independent I/O instruction set that is separate from the typical fetch and store memory functions. In the Dedicated I/O architecture there is usually a separate set of address lines used for peripherals, which communicate through a selected register set that identifies the I/O physical address to send the data to. In a Memory Mapped I/O architecture users may treat I/O Peripherals just like a memory location, with all the memory instruction set functions of the processor available. The Intel x86 processor line has a separate I/O instruction set and uses the first 16 bits of the address bus as shared memory and I/O address lines, since only one of the two functions, memory or I/O, may be accessed at a time. The NXP 68000 set of processors uses the Memory Address/Data BUS for both peripheral I/O and memory access. Both types of architecture have their pros and cons.
All processor instructions require clock cycles to be implemented; the IPD is simply a sequencer that, when input with a specific bit pattern, will perform a specific sequence on the processor core. Since all processor instructions that are not I/O peripheral instructions are of a fixed nature, the exact timing for each instruction may be calculated from the number of clock cycles it takes and the time period of the clock executing the instruction. This is an important part of the processor and is added to the KSR list for the processor selection. I/O timing depends on the real world inputs, may vary when waiting for responses to continue, and is calculated based on the peripheral features and the application.
Processors that include pipeline instruction logic still require clocks to function, as they require some type of status flag setup as well as somewhere to place results. Pipeline instructions have a fixed execution time since it is based on two controllable parameters, the clock period and the propagation delays through the pipeline logic. This gives them an advantage over clocked instruction sequences, especially if the clock is much slower than the pipeline execution time of the instruction. Pipeline instructions are less likely to be interrupted during the cycle since the pipeline is a fixed hardware function, whereas a clocked instruction may be interrupted in the middle of its process: the return state is pushed on the stack, which requires more clocks, the interrupt handling process executes until the interruption is completed, and the return state is then popped off the stack so the interrupted instruction can continue. We will address processor interrupts and the security policies required to handle them later in the series.
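The difference between the two I/O architectures can be sketched with a toy model. This is only an illustration, not any real processor's bus logic: the class names, the 64 KB memory size and the 0xFF00 peripheral base address are all assumptions chosen for the example.

```python
# Toy model only: class names, sizes and the io_base address are hypothetical.

class DedicatedIOBus:
    """Separate memory and I/O address spaces, in the style of x86 IN/OUT."""

    def __init__(self, mem_size=0x10000):
        self.memory = bytearray(mem_size)
        self.ports = {}  # 16-bit port address -> last byte written

    def store(self, addr, value):
        # Ordinary memory write instruction.
        self.memory[addr] = value & 0xFF

    def out(self, port, value):
        # Dedicated I/O write instruction; never touches memory.
        self.ports[port & 0xFFFF] = value & 0xFF


class MemoryMappedBus:
    """One address space; a range of addresses decodes to a peripheral."""

    def __init__(self, mem_size=0x10000, io_base=0xFF00):
        self.memory = bytearray(mem_size)
        self.io_base = io_base
        self.device_regs = {}  # address -> last byte written to the peripheral

    def store(self, addr, value):
        # The same store instruction reaches memory or a peripheral,
        # depending only on the address.
        if addr >= self.io_base:
            self.device_regs[addr] = value & 0xFF
        else:
            self.memory[addr] = value & 0xFF


dio = DedicatedIOBus()
dio.store(0x03F8, 0x41)  # memory location 0x03F8
dio.out(0x03F8, 0x42)    # I/O port 0x03F8, a different address space

mmio = MemoryMappedBus()
mmio.store(0x0100, 0x41)  # lands in memory
mmio.store(0xFF00, 0x42)  # lands in the peripheral register
```

From a security point of view the sketch shows why the distinction matters: in the memory mapped case any code that is allowed to write to that address range can drive the peripheral, while in the dedicated case access can be gated on the I/O instructions themselves.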
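As a minimal sketch of that calculation, the execution time of a fixed instruction is its clock cycle count multiplied by the clock period. The cycle counts and clock frequencies used here are made-up illustration values, not figures from any datasheet.

```python
def instruction_time_ns(clock_cycles, clock_hz):
    """Execution time in nanoseconds: cycles multiplied by the clock period."""
    period_ns = 1e9 / clock_hz
    return clock_cycles * period_ns


def sequence_time_ns(cycle_counts, clock_hz):
    """Total time for a fixed sequence of non-I/O instructions."""
    return sum(instruction_time_ns(c, clock_hz) for c in cycle_counts)


# Hypothetical example: a 4-cycle instruction on a 100 MHz clock.
single = instruction_time_ns(4, 100e6)

# Hypothetical example: three instructions of 1, 2 and 4 cycles at 50 MHz.
total = sequence_time_ns([1, 2, 4], 50e6)
```

I/O instructions are left out on purpose, since their timing depends on the peripheral's response and cannot be derived from the cycle count alone.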
Micro-Coded ROM (MCR) for the unique registered, patented or trademarked instruction set.
In order to have feedback on an instruction's process in real time, a set of FLAG bits is used by the process block to identify the final state of the last instruction. Flags are set during the execution of instructions to indicate the state or results of the instruction. The assigned bit position is specific to the processor design. Figure 8.1 shows typical Intel Instruction Set Architecture Flag Bit Assignments. Generally there are three classes of FLAG bits: Status, Control and System. Example: if we compare two registers' data and they are equal, the ZF (Zero Flag) is set to 1, because the compare is a subtraction whose result is zero.
It is not uncommon that a large number of instructions are designed into the hardware processor. The MCR and IPD, along with the Sequencer, interact to control the step sequence for all the defined instructions. There is no magic to a processor design, just a large block of logic gates to control the 1's and 0's. If the instruction is a pipeline type instruction then it is started with a single clock cycle and the data is pipelined through the fixed logic for the process. There will always be a debate on which is better, pipeline logic or clocked logic blocks; regardless, there are applications that are better suited to each. The clarity will present itself when we address security and peripheral communications in the sections that follow. CPUs that incorporate a Reduced Instruction Set methodology are directed to a fixed logic function, generally a pipeline logic set that makes up the Instruction Set Architecture (ISA) and requires fewer clock cycles to complete.
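A compare on x86 is a subtraction whose result is discarded and only the flags are kept, so equal operands give a zero result and ZF = 1. The sketch below models three of the status flags for an assumed 8-bit register width; the function name and the dictionary layout are illustrative only and do not reflect the actual bit positions in the flag register.

```python
def compare_flags(a, b, width=8):
    """Model the status flags produced by comparing a with b (computes a - b)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": int(result == 0),              # Zero Flag: 1 when the operands are equal
        "CF": int(a < b),                    # Carry Flag: 1 when an unsigned borrow occurred
        "SF": (result >> (width - 1)) & 1,   # Sign Flag: top bit of the result
    }


equal = compare_flags(5, 5)    # ZF is 1
smaller = compare_flags(3, 5)  # result wraps to 0xFE, so CF and SF are 1
```

A conditional jump that follows the compare simply tests these bits, which is how the Instruction Control Sequencer decides between the loop and jump paths.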
Instruction Control Sequencer - Loop & Jump Control:
Execution Control Unit (ECU) - AKA - Arithmetic Logic Unit (ALU)
Instruction Process Logic Block Summary:
One of the areas we have not covered yet is the interrupt architecture of the processor core. Interrupts are an independent function, usually integrated into the core instruction architecture, that allows an interruption of the instruction process, usually at the end of the instruction. The architecture is such that it uses a memory pointer. This pointer is called the Stack Pointer, and the Stack it points to is a small contiguous block of memory specifically used for interrupts. When an interrupt happens the state of the core is pushed on the stack to be used for returning to the interrupted process; we will return to interrupts in the Memory section that follows.
Memory & Memory Access Controllers (MAC):
Allocating memory space is critical to a secure user process as well as a secure operating system, so before we get into memory allocation let's look at the actual hardware that is used to communicate with the memory. Memory chips handle access by selecting an address that contains a cell of data; that data is either placed on the memory data bus (a read) or the data already on the memory data bus is stored at that address (a write). A direct read or write is the fastest memory cycle that exists in any processor block; delays enter the picture when we add address translation and protection to the simple memory access. Enter the MMU or MPU blocks of logic, as well as some intelligence for encryption, to determine if the address being accessed is assigned to the process being executed along with several other conditions. I would guess that many of you reading this blog have heard of Spectre and Meltdown (not the movies from 2015 and 2004, although those are both disaster movies, as are these conditions). These two vulnerabilities have to do with memory access from an unwanted source that intercepts the process instructions to gain access to the system. This is both a hardware and software issue, so we will cover the hardware side now and address it in detail when we get to the security software part of the series.
Regardless of the memory type, be it Static RAM or Dynamic RAM, the access still has to be protected from unwanted intrusions, a challenge that has been here from the beginning. There are basically two categories of memory, Volatile and Non-Volatile. Volatile memory loses its data when the power is turned off and Non-Volatile memory retains its data when the power is turned off. There are basically three types of volatile Random Access Memory today: Static (SRAM), Dynamic (DRAM) and Pseudostatic (PSRAM). Static (SRAM) does not require any refresh to maintain its memory content, however, the cost is chip density and reduced memory size. Dynamic (DRAM) does require the cells to be refreshed (rewriting the cell data) periodically in order to maintain data storage and requires a special DRAM controller that shares real world data access with the refresh cycle. A new kid on the block is Pseudostatic (PSRAM), which is dynamic RAM with a built-in refresh that functions like SRAM at slightly slower speeds to handle the refresh and, as with the DRAM controller, requires a configuration process to maintain the thermals and access timing within the chip.
Within the volatile SRAM type there are some sub-types: Conventional memory and Content Addressable Memory (CAM). Conventional RAM is arranged as a single address to a single data word, one address location to one data location, and is present in all the desktops, tablets, smartphones and 98% of the servers in the cloud. Content Addressable Memory (CAM) is an older technology that has been around for over 60 years, but very little has been discussed or applied using this category in today's processors. There are multiple patents on CAM and its applications, filed from 1970 through 2012 for various versions, from companies like IBM®, AMD®, IDT®, TI® and many others seeking dominance in the market. Content Addressable Memory is primarily used with Content Addressable Parallel Processors (CAPP), however it does not have to be applied to parallel processors in general. This technology is very applicable to Artificial Intelligence since search, compare and execute functions are a major part of AI processing and CAPP gives higher speed and functionality than conventional processor systems. CAPP is a subject we will address at another time since it is beyond the scope of this project.
The memory controller handles access to the internals of the CPU; since instructions cannot be executed without memory, the CPU issues an instruction fetch request to the Memory Access Controller to start an instruction process from a memory address. Memory access is generally defined with the CPU specifications and features in order to incorporate security policies. There are simple Memory Access Controllers that just handle a large linear array of memory without any special features, and then there are the Memory Management Units, which incorporate memory segmentation and protection but are also integrated with other blocks of the CPU architecture. The MAC, MMU and MPU are just controllers that define memory access requirements attached to the CPU; they require some type of physical memory attached to them to function.
Memory access, meaning how memory is connected to the Core CPU and how it is allocated, is a major security issue. Figure 8.2 shows the functional blocks of a typical memory access architecture connected to the core CPU, consisting of a Dynamic Random Access Memory Controller, DRAM, EEPROM and SRAM. Since nothing happens without memory, it stands to reason that hackers want access to it to plant viruses, worms, root kits and other code, which makes the memory hardware a high profile part of the security architecture.
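The difference between the two sub-types can be sketched in a few lines. This is a software model only, with hypothetical class names: in a hardware CAM every cell is compared against the search word in parallel in a single cycle, while the loop below walks the cells one at a time.

```python
class ConventionalRAM:
    """One address in, one data word out."""

    def __init__(self, size):
        self.cells = [0] * size

    def write(self, address, data):
        self.cells[address] = data

    def read(self, address):
        return self.cells[address]


class ContentAddressableMemory:
    """Data in, matching addresses out."""

    def __init__(self, size):
        self.cells = [None] * size

    def write(self, address, data):
        self.cells[address] = data

    def search(self, data):
        # Hardware compares all cells at once; this loop only models the result.
        return [addr for addr, cell in enumerate(self.cells) if cell == data]


ram = ConventionalRAM(8)
ram.write(3, 0xBEEF)
word = ram.read(3)            # ask by address, get the data

cam = ContentAddressableMemory(8)
cam.write(2, 0xBEEF)
cam.write(5, 0xBEEF)
matches = cam.search(0xBEEF)  # ask by data, get every address that holds it
```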
The simple Memory Access Controller is just a buffer cache that allows access to the memory array in a linear form from address 0000 to address nnnn, just one big linear array of memory addresses. This is the most vulnerable memory access since there is no protection of data or program code regardless of where it is placed in memory and accessed without any fault or protection. From Figure 8.2 we see that the MMU or MPU controls all aspects of memory access that includes Direct Memory Access from I/O devices which is a hardware security vulnerability that we will address in more detail as we progress through the series. This is a major concern and noted in the KSR list.
The Memory Management Unit (MMU) is a controller that intercepts "ALL" memory access. The common definition is that the MMU handles memory access requests made by the CPU, however that is not the whole picture. For devices that incorporate DMA (Direct Memory Access) hardware the MMU also handles this function and is termed the I/O MMU. This does create security vulnerabilities, since a device writing directly to memory may put code as well as data in memory, and if the code is in a page that is legal to the current program it may be executed. These devices are typically USB ports, Firewire, or storage type peripherals. The MMU's primary function is to control memory segmentation and paging, translate virtual memory accesses to physical memory accesses and handle memory cache, protection and bus arbitration. Hence, the MMU is just an add-on to the CPU to create a complete CPU (Central Processing Unit). Unwanted access to the MMU and Memory is one of the root security vulnerabilities that has existed since its inception and still exists today.
The pros are paging, segmentation, virtual memory to physical memory translation and security (with a question mark), achieved by intercepting all memory transactions through a TLB (Translation Lookaside Buffer) to ensure multiple processes are directed to the CPU accordingly. This does slow down the access since it adds additional clock cycles to perform the translation. The MMU segmentation protection methodology allows the memory to be partitioned and controlled as Kernel (trusted segment), code or program Protected segment, Supervisory and Data segment groups that add security to the processor system. MMUs are incorporated into the majority of 32 bit processors available to the COTS (Commercial-Off-The-Shelf) market today. We will research the market place for simple processors without an MMU, such that the MMU may be added as a peripheral, for comparison as the series progresses. Intel has incorporated the MMU as part of the processor since the 80286 release, and since then ARM, AMD, MIPS and other COTS embedded processors have incorporated MMUs.
What is incorporated in all CPUs today is an interrupt function. The interrupt sequence allows the core CPU to share time with other processes by saving the complete state of the core CPU along with a return location to restore the previous state and continue processing. Each manufacturer specifies how their interrupt function processes information to accomplish the switching. Some use an interrupt controller that allows the fetching of a block of addresses defined as the interrupt vectors, which hold pointers to the processes to be executed. Those that do not use an interrupt controller are limited to a single interrupt along with an interrupt handler that must poll all devices to determine which one caused the interrupt. Time sharing is an important feature when selecting a single processor core system since it will determine the overall program efficiency when several peripherals are attached. This is also an issue with multi-core chips since each processor runs independently and shares memory. Number of Processors and Interrupt capabilities will be put on the KSR list as well, since they create a major security vulnerability if not handled properly.
Interrupts are an integral part of the CPU that allows the CPU to multi-task by interrupting a current process and jumping to another process, then returning when the interrupting process is completed. CPU architectures generally have a single interrupt process that requires an external interrupt controller, allowing management of several interrupt lines connected to peripherals; each line is assigned a unique memory address location that contains a pointer to the interrupt handler code. We will cover interrupt structures as we proceed through the series. From the Instruction Process Logic Block we can conclude that instructions are single thread and require a memory fetch request for the instruction to be executed, which points to the memory as a key security vulnerability.
Each of the user defined memory segments has access control parameters to prevent any unwanted access. Any intrusion into protected areas creates a page fault interrupt to a jump location that contains the error handling code to be executed, sending a notification or correcting the intrusion. We will cover Interrupt control processes in the Interface section as the series progresses. The Memory Management Unit's control registers are generally integrated into the CPU's core control for better security, such as in the Intel IA32, IA64 product lines.
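The translation step and the role of the TLB can be sketched as follows. This is a simplified single-level model under stated assumptions: the 4096 byte page size, the class name and the hit/miss counters are illustrative, and a real MMU also carries permission bits and multi-level tables that are left out here.

```python
PAGE_SIZE = 4096  # assumed page size for the example


class PageFault(Exception):
    """Raised when a virtual page has no physical mapping."""


class SimpleMMU:
    def __init__(self, page_table):
        self.page_table = dict(page_table)  # virtual page -> physical frame
        self.tlb = {}                       # cache of recent translations
        self.tlb_hits = 0
        self.tlb_misses = 0

    def translate(self, virtual_addr):
        page, offset = divmod(virtual_addr, PAGE_SIZE)
        if page in self.tlb:
            # Fast path: translation already cached.
            self.tlb_hits += 1
            frame = self.tlb[page]
        else:
            # Slow path: walk the page table, which costs extra cycles.
            self.tlb_misses += 1
            if page not in self.page_table:
                raise PageFault(f"no mapping for virtual page {page}")
            frame = self.page_table[page]
            self.tlb[page] = frame
        return frame * PAGE_SIZE + offset


mmu = SimpleMMU({0: 5, 1: 2})
first = mmu.translate(0x0010)   # miss, page 0 maps to frame 5
second = mmu.translate(0x1004)  # miss, page 1 maps to frame 2
third = mmu.translate(0x0020)   # hit, page 0 is now in the TLB
```

The extra work on a miss is the slowdown described above, and the cached entries are also exactly the kind of shared state that has to be considered when one process must not learn anything about another.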
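The save, vector and restore sequence can be sketched with a toy CPU model. Everything here is hypothetical: the class, the two-field state (program counter and flags) and the handler are stand-ins, and a real core saves considerably more state and does so in hardware.

```python
class ToyCPU:
    def __init__(self):
        self.pc = 0          # program counter
        self.flags = 0       # status flags
        self.stack = []      # models the contiguous stack memory
        self.vectors = {}    # interrupt number -> handler (the vector table)
        self.log = []

    def register_handler(self, irq, handler):
        self.vectors[irq] = handler

    def interrupt(self, irq):
        # 1. Push the current state so the interrupted process can resume.
        self.stack.append((self.pc, self.flags))
        # 2. Jump indirectly through the vector table to the handler.
        self.vectors[irq](self)
        # 3. Pop the saved state and continue where we left off.
        self.pc, self.flags = self.stack.pop()


def timer_handler(cpu):
    # The handler is free to change pc and flags while it runs.
    cpu.pc = 0x8000
    cpu.flags = 0xFF
    cpu.log.append("timer")


cpu = ToyCPU()
cpu.pc, cpu.flags = 0x1234, 0x02
cpu.register_handler(0, timer_handler)
cpu.interrupt(0)
```

The sketch also shows where the exposure is: whoever can write to the vector table or to the saved state on the stack controls what runs next and where execution returns to.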
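A minimal sketch of such access control parameters is shown below. The segment layout, the ring numbering (0 for the most trusted level, 3 for user code) and the exception standing in for the fault interrupt are assumptions made for the example, not the register layout of any specific MMU.

```python
from dataclasses import dataclass


class AccessFault(Exception):
    """Stands in for the fault interrupt raised on an illegal access."""


@dataclass(frozen=True)
class Segment:
    name: str
    base: int
    limit: int      # size of the segment in bytes
    writable: bool
    max_ring: int   # highest ring number allowed in (0 = kernel only)


def check_access(segments, addr, write, ring):
    """Return the segment name if the access is legal, else raise AccessFault."""
    for seg in segments:
        if seg.base <= addr < seg.base + seg.limit:
            if ring > seg.max_ring:
                raise AccessFault(f"ring {ring} may not access {seg.name}")
            if write and not seg.writable:
                raise AccessFault(f"{seg.name} is read-only")
            return seg.name
    raise AccessFault(f"address {addr:#x} is not in any segment")


segments = [
    Segment("kernel", 0x0000, 0x1000, writable=True, max_ring=0),
    Segment("code", 0x1000, 0x1000, writable=False, max_ring=3),
    Segment("data", 0x2000, 0x1000, writable=True, max_ring=3),
]

ok = check_access(segments, 0x2010, write=True, ring=3)  # user write to data
```

A user level write into the code segment, or any user level access to the kernel segment, raises the fault instead of reaching memory, which is the point at which the error handling code takes over.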
Memory Protection Unit (MPU)
MMU, MPU Summary:
OK, now time to bring up interrupts again. Now that we have all this memory access control, what happens if an unwanted access occurs, as it does in multi-tasking and multi-processor systems? The answer is that a page fault interrupt is generated, and there is a separate stack of contiguous memory outside the normal OS that points to a block of code that handles the page fault interrupt. A few conditions are tested; some errors are fixed automatically and the process is resumed, some errors halt the system, and some errors just notify the user with some options. Hackers are constantly playing with code injection and provoking page faults to find back doors to the OS. In a multi-tasking application the task manager in the MMU handles the segmentation and virtual memory allocation along with all the page/bank switching to save each task's position in the task scheduler.
The interrupt sequence allows the "single thread" core CPU to share time with other processes by saving the complete state of the core CPU along with the task's return location to restore the previous state and continue processing.
Task handlers require the management of all the tasks opened for processing as well as a separate interrupt structure for each task; this is where bank/page switching and virtual segmentation are used for management both locally and globally. CPU architectures generally have a single interrupt process that jumps to a specific location pointed to by an internal CPU register set to a memory address. Some cores require the use of an external interrupt controller, allowing management of several interrupt lines connected to peripherals; each line is assigned a unique memory address location that contains a pointer to the interrupt handler code. We will cover interrupt structures as we proceed through the series. From the Instruction Process Logic Block we can conclude that instructions are single thread and require a memory fetch request for the instruction to be executed, which points to the memory as a key security vulnerability. Generally, interrupt handlers are accessed indirectly from a contiguous block of memory specifically assigned by the core processor and integrated with an interrupt controller.
If access to the memory is permitted while the CPU is off executing tasks then all activity in the Core IoT Platform may be monitored, intercepted and controlled. Therefore controlling access to this section of the processor is noted in the KSR list to address during the design. Several embedded processor chips incorporate the Memory Management Unit (MMU) as part of the Memory Access Controller.
The majority of embedded platforms work in the linear memory address mode, and segmentation is generally not implemented, probably for simplicity in those applications that do not require an OS to function. Another major concern for the MMU is the connection of a Direct Memory Access (DMA) Controller, which also has direct access to memory without any CPU intervention. This is a major security concern since DMA controllers transfer data in both directions at very high speeds to/from the real world BUS interface. The original 8088 PC incorporated DMA controllers and used them for all disk transfers for speed. When the 80286 was introduced the disk controller was redesigned and DMA was not used, since the 80286 was faster transferring directly. Direct Memory Access for devices like USB, Firewire and other streaming devices presents a security risk since they access a block of defined memory without CPU intervention. Memory access is a critical security vulnerability since memory contains everything that is happening in the core system; therefore it is on the top level KSR list of concerns.
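The DMA risk can be illustrated with a small model. The class and the optional "window" restriction are hypothetical: the window is a simplified stand-in for the kind of check an I/O MMU can apply, added here only to contrast an unrestricted controller with a restricted one.

```python
class DMAController:
    """Toy DMA engine that writes straight into memory without the CPU."""

    def __init__(self, memory, window=None):
        self.memory = memory   # shared bytearray standing in for system RAM
        self.window = window   # optional (start, end) range the device may touch

    def transfer(self, dest, payload):
        end = dest + len(payload)
        if end > len(self.memory):
            raise IndexError("transfer runs past the end of memory")
        if self.window is not None:
            lo, hi = self.window
            if dest < lo or end > hi:
                raise PermissionError("DMA transfer outside permitted window")
        # No CPU instruction is executed for this write.
        self.memory[dest:end] = payload


ram = bytearray(16)

# Unrestricted controller: the device can overwrite any location.
DMAController(ram).transfer(4, b"\x90\x90")

# Restricted controller: only bytes 8..15 are reachable by the device.
guarded = DMAController(ram, window=(8, 16))
guarded.transfer(8, b"\xAA")
```

With the unrestricted controller nothing stops a peripheral from placing bytes on top of code or data that the CPU will later use, which is why DMA access sits on the KSR list.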
External BUS Interface - Data /Address & Control for External Devices & Memory
I/O Interface Control BUS:
Clock Interface Controller (CIC) for CPU process timing
Power On Sequence (POS) Initialization Test unique to the CPU core to start executing instructions from a defined memory location.
At this point we will end this part and cover the remaining part of the CPU core system.
This part of the series is just the beginning and is meant to be an outline for reference. The embedded processor world is expanding at such a rate that security is being bypassed in the race to be fastest to market, at the expense of the public's privacy and safety.
Reference Links for Part 8:
Reference documents for continued reading are listed below:
Part 9+ "Preliminary Outline" Embedded Processor Systems: Continued
Publishing this series on a website or reprinting is authorized by displaying the following, including the hyperlink to BASIL Networks, PLLC either at the beginning or end of each part.
For Website Link: cut and paste this code: