Wednesday, October 20, 2021

Modifying the Acorn CLE-215+ FPGA into a PCILeech DMA attack device

PCILeech and MemProcFS allow for easy-to-use, user-friendly DMA attacks and hardware-assisted memory analysis. This is possible since PCI Express supports DMA. Unfortunately, production of compatible hardware, such as the Screamer series, has been hit hard by the global silicon shortage.

The goal with this project is to modify the Acorn CLE-215+ / Nitefury / Litefury FPGA boards, in a short time frame and on a relatively tight budget, to support PCILeech and MemProcFS at around 20-25MB/s.

PCILeech DMA with a modified Acorn CLE-215+ and FT2232H.

The Hardware

The Acorn CLE-215+ is a powerful FPGA board with a PCI Express M.2 connector, which is used for DMA hardware memory acquisition. It also has 12 additional GPIOs in a 20-pin DF52 header. It uses the most powerful Xilinx Artix-7 FPGA chip - the 200T. The Acorn CLE-215+ was used for crypto mining but has been discontinued for some time. It can sometimes be found at a nice price on eBay. The Nitefury FPGA board is basically the same board as the Acorn CLE-215+. The Litefury has a smaller, still very capable, Artix-7 100T FPGA.

Acorn CLE-215+ and LiteFury.

PCILeech traditionally connects to the FPGA device over USB3 resulting in DMA around 150MB/s. The now sold-out FTDI FT601 chip is used in synchronous FT245 mode to achieve this. The FT601 uses 40 signals between itself and the FPGA.

The goal with this project is to modify the Acorn CLE-215+ / Nitefury / Litefury to support the FTDI FT2232H USB2 chip. In FT245 mode this should allow for DMA transfer speeds around 20-25MB/s. This should allow PCILeech to work with the CLE-215+ with relatively minor software modifications. 
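To put those transfer speeds into perspective, here is a small back-of-the-envelope sketch of how long a full memory dump would take. The 16 GiB target size is a made-up example, and I'm assuming 1 MB = 10^6 bytes; the rates are the ones quoted above.

```python
def dump_seconds(ram_gib: float, rate_mb_s: float) -> float:
    """Seconds needed to read ram_gib GiB at rate_mb_s MB/s (1 MB = 10**6 bytes)."""
    return ram_gib * 2**30 / (rate_mb_s * 10**6)


RAM_GIB = 16  # hypothetical target size, chosen only for illustration

for label, rate in [("FT2232H (low)", 20), ("FT2232H (high)", 25), ("FT601", 150)]:
    minutes = dump_seconds(RAM_GIB, rate) / 60
    print(f"{label:15s} {rate:4d} MB/s -> {minutes:5.1f} min")
```

With these assumptions a full dump over the FT2232H lands somewhere in the 11-14 minute range, compared to about 2 minutes over the FT601. Slower, but still perfectly usable for most analysis work.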

Alternatives that do not involve hardware modifications would be UART (slow) or a carrier board with an additional FPGA/chips (expensive and complex to design). The goal of this project was to create something low-cost in a limited time frame.

The FT2232H is readily available and an inexpensive mini module exists for labs. In this project the FT2232H-56Q mini module which has a micro-USB2 connector will be used.

FT245 mode requires 15 signals: 8 data, 1 clock and 6 additional control signals - three more than the 12 GPIOs available on the Acorn. To make things even worse, the FT2232H runs at 3.3V while the Acorn only has 4 GPIOs at 3.3V and 8 GPIOs at 2.5V. Also, none of the GPIOs on the Acorn are clock capable - i.e. possible to use as a clock input pin.

Two hardware modifications are required:

1) Make the 2.5V GPIOs 3.3V. This is done by removing a 3.3V to 2.5V voltage regulator and connecting the 2.5V power rail to 3.3V. This has been previously discussed. The schematics also support this change and it should have few or no side effects.

2) Desolder LED2, LED3 and LED4 and use the FPGA connections as three additional GPIOs. The FPGA PIN driving LED4 is also clock capable which is great!
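The signal budget behind these two modifications can be written down as a quick sanity check. This is just the arithmetic from the numbers given above, nothing more.

```python
# Signals needed by the FT2232H in FT245 mode (counts from the text above).
FT245_SIGNALS = {"data": 8, "clock": 1, "control": 6}

# GPIOs on the unmodified Acorn DF52 header, grouped by I/O voltage.
ACORN_GPIOS = {"3.3V": 4, "2.5V": 8}

required = sum(FT245_SIGNALS.values())

usable_stock = ACORN_GPIOS["3.3V"]              # only these match the FT2232H levels
usable_after_mod1 = sum(ACORN_GPIOS.values())   # 2.5V GPIOs raised to 3.3V
usable_after_mod2 = usable_after_mod1 + 3       # LED2, LED3 and LED4 reused as GPIOs

print(f"required by FT245 mode: {required}")
print(f"usable, stock board:    {usable_stock}")
print(f"usable, after mod 1:    {usable_after_mod1}")
print(f"usable, after mod 2:    {usable_after_mod2}")
```

After both modifications there are exactly as many 3.3V signals as FT245 mode needs, with no spares. The clock-capable pin behind LED4 is what makes the clock input possible at all.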

In addition to the above a custom JTAG connector cable may have to be created.


Required Tools:

This is a hardware modification project. In addition to the hardware for the project itself it's assumed that one has access to:
  • Molex crimp tool.
  • Hot-air rework station.
  • JTAG programmer cable.
  • Soldering paste.
Also nice to have, but not required, is access to a logic analyzer or oscilloscope in case something isn't working and some signal debugging is required. A multimeter to check voltage levels and cable connections would also be nice to have at hand.


JTAG connector cable

JTAG is required to program the FPGA with a PCILeech compatible gateware.

The Acorn has a Pico EzMate JTAG header on the bottom side. The LiteFury comes with a nice connector cable which allows for easy connection to your JTAG programmer cable. I use the Digilent HS2 JTAG cable since it's directly compatible with the Xilinx Vivado development environment. The LiteFury cable, shown below in (5), is compatible with the Nitefury and Acorn as well.

If a LiteFury purchase is not desirable it's relatively easy to roll your own cable using the parts list below and a Molex crimp tool.

Custom JTAG connector cable (1-3). JTAG pinout (4). LiteFury JTAG connector cable (5).

  1. Cut the purchased Pico-EzMate cable in half.
  2. Then crimp standard molex connectors on the cable endings.
  3. Then insert the crimped connectors into the housing to get a working JTAG cable adapter. Make sure the Pico-EzMate JTAG PINs are mapped to the correct PINs on your JTAG programming cable according to the pinout in (4).

Parts list custom JTAG connector cable:

6x Molex crimp connectors (get some extra just in case).
Total: €6 + VAT + shipping.


Flashing the LiteFury with the custom made JTAG cable.



Modification #1 - 2.5V to 3.3V

It's possible to modify the Acorn 2.5V GPIOs to 3.3V, which would support the FT2232H voltage levels. This has been previously discussed. This is possible by removing the voltage regulator U11 and shorting some of the underlying pads.

When looking at the schematics for the NiteFury/LiteFury this becomes clear. The Voltage regulator U11 converts 3.3V into 2.5V. The 2.5V is only used to drive FPGA Bank34 - which drives the 2.5V LVDS/GPIOs and also the 200MHz DDR clock oscillator - which seems to be supporting 3.3V as well.

2.5V voltage regulator U11 and schematics extracts.

Now let's remove U11 using a hot-air rework station and short the 2.5V power rail to the 3.3V power rail. The removal of U11 is likely to be destructive - the part is likely to fall apart. Please take great care not to damage the Acorn / LiteFury. If you are not familiar with work like this please practice on scrap components before attempting this!

Hot-air rework station setup.

The topmost image below shows the Acorn when the U11 power regulator has been removed. In order to connect the 2.5V power rail to 3.3V a tiny amount of ChipQuik soldering paste as well as a tiny cable fragment is applied on the three "topmost" power regulator pads as shown in the bottom image below. Hot air is applied to make the modification permanent. Now the modification is complete and the Acorn 2.5V GPIOs have become 3.3V.

Removal of power regulator U11 (top). Connecting 2.5V pad to 3.3V (bottom).



Modification #2 - LEDs to GPIOs

The 2nd modification is to desolder three LEDs and solder three cables onto the innermost pads (which go to the FPGA). The pad closest to the edge goes to ground - care should be taken not to connect the two by mistake.

For cabling I used three short DF52 cables since I had them at hand. Other cables may be used as well, but please try to keep them fairly short. On the other end Molex connectors were crimped on (after the soldering to the Acorn was completed). Housings from three standard lab cables were used to shield the crimped Molex connectors.

Image (1) shows the acorn LEDs before the hardware modification.

Image (2) shows the LEDs removed using hot air and a cable being attached to the pad of LED3 with a rather messy application of soldering paste. The wires were individually soldered on using soldering paste and hot air. The wires were then routed in the space behind the DF52 header for some additional stability.

Image (3) shows the end result. The three additional GPIO wires are routed behind the DF52 header and had Molex connectors crimped on the other end after the soldering was complete. The worst ChipQuik mess was also cleaned up and the end result is decent enough and most importantly - fully functional :)


LED to GPIO hardware modification. (1) before, (2) during, (3) after.


FPGA/FT2232H custom signal cable

Creating the custom signal cable is one of the more tedious parts of the project. Care should be taken to create a good quality relatively short cable to allow for good enough signal quality.

The pinout of the DF52 connector is shown in the table below. PIN1 is at lower part (close to the SQRL logo) whilst PIN20 is on the upper part closest to LED1.

DF52   FPGA   NAME      FT2232H
1      -      3.3V      (NC)
2      -      3.3V      (NC)
3      AB8    DATA[0]   CN2-7 (AD0)
4      AA8    DATA[1]   CN2-10 (AD1)
5      -      GND       CN2-6*
6      Y9     DATA[2]   CN2-9 (AD2)
7      W9     DATA[3]   CN2-12 (AD3)
8      -      GND       CN2-6*
9      Y8     DATA[4]   CN2-14 (AD4)
10     Y7     DATA[5]   CN2-13 (AD5)
11     -      GND       CN2-4*
12     V9     DATA[6]   CN2-16 (AD6)
13     V8     DATA[7]   CN2-15 (AD7)
14     -      GND       CN2-4*
15     K2     RXF_N     CN2-18 (AC0)
16     J2     TXE_N     CN2-17 (AC1)
17     -      GND       CN2-2*
18     J5     RD_N      CN2-20 (AC2)
19     H5     WR_N      CN2-19 (AC3)
20     -      GND       CN2-2*
----
LED2   H3     OE_N      CN2-23 (AC6)
LED3   G4     SIWU_N    CN2-22 (AC4)
LED4   H4     CLK       CN2-24 (AC5)
* Two DF52 ground wires share each of these FT2232H ground pins.
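If you want to sanity check the pinout before building the cable, the table above can be turned into a small script that also prints Vivado-style pin constraints. Please treat this as a sketch only: the port names (the `ft2232h_` prefix) are made up for illustration and are not the names used in the PCILeech-FPGA gateware, which ships with its own constraints file. LVCMOS33 is assumed as the I/O standard since all signals run at 3.3V after the modifications.

```python
# FPGA package pin for each FT245 signal, copied from the pinout table.
PINS = {
    "DATA[0]": "AB8", "DATA[1]": "AA8", "DATA[2]": "Y9", "DATA[3]": "W9",
    "DATA[4]": "Y8", "DATA[5]": "Y7", "DATA[6]": "V9", "DATA[7]": "V8",
    "RXF_N": "K2", "TXE_N": "J2", "RD_N": "J5", "WR_N": "H5",
    "OE_N": "H3", "SIWU_N": "G4", "CLK": "H4",
}


def xdc_line(signal: str, pin: str, prefix: str = "ft2232h_") -> str:
    """Return one XDC constraint line; the port name prefix is hypothetical."""
    port = prefix + signal.lower()
    return (f"set_property -dict {{ PACKAGE_PIN {pin} IOSTANDARD LVCMOS33 }} "
            f"[get_ports {{ {port} }}]")


# Basic checks: 15 signals in total and no FPGA pin used twice.
assert len(PINS) == 15, "FT245 mode needs 15 signals"
assert len(set(PINS.values())) == len(PINS), "duplicate FPGA pin in table"

for signal, pin in PINS.items():
    print(xdc_line(signal, pin))
```

The two assertions catch the most likely copy mistakes: a missing signal or the same FPGA pin listed twice.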

Start by cutting the DF52 cables in half. Populate the DF52 header. Skip the 3.3V at PIN 1 and 2, or cut those two wires once the cable is completed.

I opted for a ~10cm cable length. I first populated the DF52. Then I put a 2x10 PIN Molex connector on the FT2232H CN2 PINs 5-24 - skipping PIN 1,2,3,4,25,26. Ideal would be to have a 2x13 connector but I only had some 2x10 connectors at hand. The parts list should contain a 2x13 pin connector housing though.

Making of a short custom signal cable.

I first crimped one connector and then populated the housing on the proper pin with it. Then I cut the next cable to the correct length, crimped it and populated it, iterating over all signals. For the ground I crimped two cables into the same connector (since there were 6 ground cables and only 3 slots on the FT2232H). This iterative process is a good way of making sure the cables are of an optimal length.

Upon completion I also glued the cable and taped it making sure every 2nd cable was ground like on the DF52 header. I also made some shorter (around 5cm) less impressive cables.

For the OE_N, SIWU_N and CLK cables a standard lab cable was used to connect.

The custom made signal cable.

Parts list custom signal cable:

12x DF52 pre-crimped cables (get some extra just in case).
20x Molex crimp connectors (get some extra just in case).
Total: €41 + VAT + shipping.


Flashing FT2232H and FPGA

The FT2232H must be flashed in FT245 mode prior to being usable by PCILeech. The clock signal is only enabled once the FT2232H has been flashed and PCILeech/MemProcFS has been started with the driver.

Download FT_Prog from ftdichip. Connect to the device. On Port A change Hardware to 245 FIFO and Driver to D2XX Direct. Do the same on Port B. IO Pins are set to 4mA and Schmitt Input on both Port A and Port B.

Remember to program the device after the changes have been made.

FT245 FIFO.


D2XX Direct.


To flash the FPGA first install Xilinx Vivado. It's possible to download a free WebPack edition. But beware - it's huge! After Vivado has been installed connect the JTAG cable and programmer. Open up Vivado and then Hardware Manager. Connect to the FPGA. Once connected right click the FPGA chip and select Spansion SPI memory according to the images below. Then program the bitstream / gateware. Please download the latest version from the PCILeech-FPGA project on Github.

Flashing the FPGA.


PCILeech DMA attacks and MemProcFS memory analysis


The Acorn CLE-215+ and LiteFury run stably with PCILeech and MemProcFS with these modifications. For more information about how to use PCILeech and MemProcFS please check out my Github repos!

When using PCILeech / MemProcFS with the FT2232H make sure to install the FT2232H drivers from ftdichip and put the FTD2XX.DLL alongside the PCILeech / MemProcFS binaries.

Also make sure to specify -device fpga://ft2232h in the connection string.

Examples:

pcileech.exe dump -out memdump.raw -device fpga://ft2232h=1
memprocfs.exe -device fpga://ft2232h=1 -v -vv
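If you prefer to drive the tools from a script, the same two command lines can be built and launched from Python. A minimal sketch, assuming the binaries are in the current directory or on the PATH; only the options shown in the examples above are used.

```python
import subprocess

DEVICE = "fpga://ft2232h=1"  # device string taken from the examples above


def pcileech_dump_cmd(out_file: str, exe: str = "pcileech.exe") -> list:
    """Argument list for dumping physical memory to out_file."""
    return [exe, "dump", "-out", out_file, "-device", DEVICE]


def memprocfs_cmd(exe: str = "memprocfs.exe", verbose: bool = True) -> list:
    """Argument list for starting MemProcFS against the modified Acorn."""
    cmd = [exe, "-device", DEVICE]
    if verbose:
        cmd += ["-v", "-vv"]
    return cmd


def run(cmd: list) -> subprocess.CompletedProcess:
    """Run the tool; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)


# Only print the command lines here; call run(...) with the hardware attached.
print(" ".join(pcileech_dump_cmd("memdump.raw")))
print(" ".join(memprocfs_cmd()))
```

Passing the arguments as a list avoids shell quoting issues if the output path contains spaces.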

Dumping memory using the acorn with PCILeech PCIe DMA attacks.

MemProcFS live memory analysis with the Acorn CLE-215+ works nicely.



Final Notes

This was an interesting project creating a fully usable low-cost (€50 + FPGA board) DMA attack device with decent performance.

Having FT2232H support in the code base - both in LeechCore and in the form of an FPGA IP core - may be useful for other projects that don't require the higher performance of the FT601.

Due to the relative lack of available I/O the device is, however, inferior to the USB3 variants of PCILeech compatible hardware such as the Screamer series. Latency is much higher and transfer speeds are significantly lower on the modified Acorn. But the higher-speed alternatives are currently sold out due to the global chip shortage.

The LiteFury with a short custom signal cable.

More performant alternatives would be possible by designing a carrier board - utilizing some of the FPGA GTP transceiver lanes for communication with a 2nd FPGA, which would then handle the additional GPIOs. Such an approach would however be a larger undertaking and also be more costly. The Acorn Artix-7 200T FPGA is however very powerful and is highly suitable for such an approach.

If you're interested in doing these hardware modifications yourself please do so at your own risk.

PCILeech and MemProcFS are free and open source and will continue to stay so. Many people and large corporations are using my tools for free - but few show any appreciation. If you enjoy using PCILeech and MemProcFS please consider becoming a Github Sponsor. Thank You 💖

Friday, April 5, 2019

Introducing the LeechAgent

The LeechAgent is a 100% free open source endpoint solution geared towards remote physical memory acquisition and analysis on Windows endpoints in Active Directory environments.

The LeechAgent provides an easy, yet high-performance and secure, way of accessing and querying the physical memory (RAM) of a remote system. Mount the remote memory with MemProcFS as an easy point-and-click file system - perfect for quick and easy triage. Dump the memory over the network with PCILeech. Query the physical memory using the MemProcFS Python API by submitting analysis scripts to the remote host! Do all of the above simultaneously.

Physical memory analysis has many advantages - a main one being the ability to analyze the state of a system independently from the, potentially compromised, system APIs.

The video below shows how easy it is to install the LeechAgent service on a remote computer and then use it to mount MemProcFS, dump physical memory and submit Python analysis scripts using the MemProcFS API to the remote LeechAgent.


The LeechAgent offers security and simplicity. Security is built transparently upon built-in Windows functionality. Only administrators are allowed to connect. Other authentication mechanisms do not exist. This simplicity means that there is no need to create users, provision certificates or set up authentication mechanisms to use the LeechAgent. Everything required for security is already there without any configuration!

Using the LeechAgent
The LeechAgent allows up to 10 simultaneously connected clients to remotely acquire physical memory and execute code in the form of Python analysis scripts with access to the MemProcFS API. Use command-line PCILeech or file-system based MemProcFS to dump the physical memory from the remote computer running the LeechAgent. Use the MemProcFS file system and/or API to quickly analyze the memory of the remote computer.
Using MemProcFS to mount the physical memory of the remote computer as a File System - enabling quick and easy access to remote physical memory with your favorite tools - or just for taking a quick look.
Acquiring memory from the remote computer running the LeechAgent works fairly well even over medium-bandwidth medium-latency network connections. It may also be desirable to execute a Python memory analysis script accessing the MemProcFS API directly on the remote computer. Upon receipt of a script the LeechAgent will automatically spawn an embedded Python environment and execute the script. The Python analysis script will never touch disk on the target system.

This approach has many advantages. The main advantage is that physical memory may be accessed locally on the remote system - completely eliminating bandwidth and latency issues - making it ideal for physical memory analysis even over low-bandwidth and highly laggy networks. Also, since the workload is shifted to the LeechAgent, scripts may be run simultaneously on a large number of hosts - for example in an incident response scenario.

More information about the MemProcFS Python API is available in the MemProcFS wiki.

Consider a Python script that looks for read-write-execute sections in user-mode applications by analyzing physical memory. This may be useful for finding some kinds of malware. Please note that rwx-sections may also exist in legit applications in some cases.

The script retrieves process information for all processes, then iterates over each process and retrieves its memory map by walking the CPU page tables.
Sample Python script making use of the VmmPy MemProcFS API to analyze memory.
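For readers who cannot see the image, the filtering logic of such a script can be sketched roughly like this. Note that this is not the script from the image: the data layout below (a dict of memory map entries with 'va', 'size' and a 'flags' string such as '-rwx', where a leading 's' marks supervisor/kernel pages) is an assumption for illustration. In a real script the entries would come from the MemProcFS Python API.

```python
def is_user_rwx(flags: str) -> bool:
    """True if the (assumed) flags string marks a user-mode rwx region."""
    return "s" not in flags and all(c in flags for c in "rwx")


def find_rwx(memory_maps: dict) -> list:
    """Return (pid, name, va, size) for every user-mode rwx entry.

    memory_maps maps (pid, process_name) to a list of entries. The entry
    keys 'va', 'size' and 'flags' are hypothetical stand-ins for whatever
    the memory analysis API returns.
    """
    hits = []
    for (pid, name), entries in memory_maps.items():
        for entry in entries:
            if is_user_rwx(entry["flags"]):
                hits.append((pid, name, entry["va"], entry["size"]))
    return hits


# Made-up sample data standing in for the per-process memory maps.
SAMPLE = {
    (4, "System"): [
        {"va": 0xFFFFF80000000000, "size": 0x1000, "flags": "srwx"},
    ],
    (1234, "notepad.exe"): [
        {"va": 0x00007FF600000000, "size": 0x2000, "flags": "-r-x"},
        {"va": 0x000001A000000000, "size": 0x1000, "flags": "-rwx"},
    ],
}

for pid, name, va, size in find_rwx(SAMPLE):
    print(f"{pid:6d} {name:16s} {va:016x} {size:8x}")
```

Everything printed ends up in the console output, which is what the LeechAgent captures and returns to the submitting client.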
Submit the Python memory analysis script to the remote LeechAgent with PCILeech and wait for the result. The LeechAgent will capture all output written to the console by the submitted analysis script.
Submitting the analysis script to the remote LeechAgent and waiting for the result.
If anything should go wrong with the analysis script - for example if it should happen to contain a never-ending loop - execution will automatically be aborted after two minutes. In rare cases it may also be a good idea to disconnect all clients from the remote LeechAgent and wait a few minutes for it to clean up any problematic jobs.
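To test how a script behaves under such a time limit before submitting it, something similar can be emulated locally. The sketch below is only an analogy: it runs the script in a child Python process with a timeout, whereas the LeechAgent uses its own embedded Python environment. The function name and the 120 second default are my own choices mirroring the two minutes mentioned above.

```python
import subprocess
import sys


def run_with_limit(code: str, limit_seconds: float = 120.0):
    """Run Python source in a child process.

    Returns the captured console output, or None if the time limit was
    exceeded and the child process was killed.
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=limit_seconds,
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout


# A well-behaved script returns its output; a never-ending loop is aborted.
print(run_with_limit("print('analysis done')", limit_seconds=30))
print(run_with_limit("while True: pass", limit_seconds=0.5))
```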

Installing the LeechAgent
The LeechAgent supports both 32-bit and 64-bit Windows systems. The 32-bit version will work on both 32-bit and 64-bit systems - but in limited mode without the ability to process memory analysis scripts on the remote host. The 64-bit LeechAgent is strongly recommended!

The LeechAgent may be downloaded from the LeechCore repository on Github. The 64-bit version of the LeechAgent is located in LeechCore/files/agent/x64. The LeechAgent has dependencies on Python for analysis and WinPMEM for memory dumping. DumpIt may also be used for memory dumping if running in interactive (non-service) mode.

Dependencies:
Target system requirements:
  • Windows 7 or later.
  • Bitness - it's not possible to install the 64-bit version of the LeechAgent on a 32-bit system.
  • Active Directory environment: if installing as a service. (In lab environments it's possible to execute LeechAgent in an unauthenticated insecure mode which does not rely on Active Directory for authentication).
  • Administrative access: user running the LeechAgent installation is required to be an administrator on the remote computer. If installing on localhost the user is required to be an elevated administrator.
  • File share - Installation: access to the C$ administrative file share.
  • Firewall openings - Installation: Access to the service control manager (SCM) and File sharing is required for installation only.
  • Firewall openings - Using: Access to the LeechAgent or tcp/28473 is required.
Windows firewall rules recommended for remote LeechAgent installation.
Windows Firewall rule for the LeechAgent endpoint - tcp/28473.
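A quick way to check whether the LeechAgent port is reachable through the firewall is a plain TCP connect test. This is a generic sketch and not part of the LeechAgent itself; it only tells you that something accepts connections on the port, not that authentication will succeed.

```python
import socket

LEECHAGENT_PORT = 28473  # tcp port named in the requirements above


def port_open(host: str, port: int = LEECHAGENT_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

Call it as `port_open("<remote_computer_name>")` from the machine you intend to connect from.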
Installation:

Installation is easy - run the command:

LeechAgent.exe -remoteinstall <remote_computer_name>

The LeechAgent and its dependencies will be copied to the Program Files\LeechAgent directory of the remote host. Uninstallation is possible in a similar way but with the -remoteuninstall command.

Security and Authentication
The primary design goal of the LeechAgent is to keep it simple and secure.

The LeechAgent relies exclusively on built-in Windows functionality for Kerberos authentication of connecting clients. Only remote users with administrative privileges on the computer running the LeechAgent are allowed to connect. In addition to this the connecting client is also required, by default, to verify the authenticity of the LeechAgent by supplying the Kerberos SPN of the user that runs the LeechAgent. This is usually the Active Directory computer account.
Connecting client mutually authenticates the remote LeechAgent user for additional security.

The RPC connection between the connecting client and the remote LeechAgent is secured by mutually authenticated Kerberos and is also encrypted using built-in Windows functionality also relying on Kerberos. The connection is also compressed if both client and server are running on Windows 10.

Connecting clients are logged to the Application Event Log of the computer running the LeechAgent.
The connecting user ulf@ad.frizk.net is logged to the Application Event Log by the LeechAgent.

Note! The LeechAgent allows authenticated remote administrators to both access physical memory and run arbitrary code as SYSTEM on the computer running the LeechAgent. This is by design. Since only administrators are allowed to connect this is not a security issue.

Note! It's also possible to run the LeechAgent without any form of authentication in interactive mode only. This is not recommended and should only be used in otherwise secure lab environments.

The Future
The primary design goal is to keep the LeechAgent secure, simple and easy to use. As such it's not likely that more authentication mechanisms or supported operating systems will be added in the near future. For now built-in Kerberos-based Windows authentication is sufficient.

The MemProcFS Python API, while fast and powerful, is still somewhat limited. New and extended API functionality is a priority.

Also further optimizations of memory dumping will be looked into.

The MemProcFS Python API is already fast - the underlying multi-threaded native C analysis library is amazingly fast - but things may always be improved. Additional performance optimizations are planned.

Links and Additional information
The LeechAgent, PCILeech and MemProcFS are available for free on Github and are all licensed as Open Source GPLv3. Please find the projects below:
LeechAgent and LeechCore
MemProcFS
PCILeech



Wednesday, February 6, 2019

Remote LIVE Memory Analysis with The Memory Process File System v2.0

This blog entry aims to give an introduction to The Memory Process File System and show how easy it is to do high-performance memory analysis even from live remote systems over the network.

This and much more is presented in my BlueHatIL 2019 talk on February 6th.

Connect to a remote system over the network over a Kerberos secured connection. Acquire only the live memory you require to do your analysis/forensics - even over medium latency/bandwidth connections.

An easy to understand file system user interface combined with continuous background refreshes, made possible by the multi-threaded analysis core, provides an interesting new different way of performing incident response by live memory analysis.

Analyzing and Dumping remote live memory with the Memory Process File System.
The image above shows the user starting MemProcFS.exe with a connection to the remote computer book-test.ad.frizk.net and with the DumpIt live memory acquisition method. It is then possible to analyze live memory simply by clicking around in the file system. Dumping the physical memory is done by copying the pmem file in the root folder.

Background
The Memory Process File System was released for PCILeech in March 2018, supporting 64-bit Windows, and was used to find the Total Meltdown / CVE-2018-1038 page table permission bit vulnerability in the Windows 7 kernel. People have also used it to cheat in games - primarily cs:go using it via the PCILeech API.

The Memory Process File System was released as a stand-alone project focusing exclusively on memory analysis in November 2018. The initial release included both APIs and Plugins for C/C++ and Python. Support was added soon thereafter for 32-bit memory models and Windows support was expanded as far back as Windows XP.

What is new?
Version 2.0 of The Memory Process File System is a major release published in conjunction with the BlueHatIL 2019 talk Practical Uses for Hardware-assisted Memory Visualization.

New functionality includes:
  • A new separate physical memory acquisition library - the LeechCore.
  • Live memory acquisition with DumpIt or WinPMEM.
  • Remote memory capture via a remotely running LeechService.
  • Support for Microsoft Crash Dumps and Hyper-V save files.
  • Full multi-threaded support in the memory analysis library.
  • Major performance optimizations.
The combination of live memory capture via Comae DumpIt, or WinPMEM, and secure remote access may be interesting both for convenience and incident response. It even works remarkably well over medium-latency and medium-bandwidth connections.

The LeechCore library
The LeechCore library, focusing exclusively on memory acquisition, is released as a standalone open source project as a part of The Memory Process File System v2 release. The LeechCore library abstracts memory acquisition from analysis and makes things more modular and easier to re-use. The library supports multiple memory acquisition methods - such as:
  • Hardware: USB3380, PCILeech FPGA and iLO
  • Live memory: Comae DumpIt and WinPMEM
  • Dump files: raw memory dump files, full crash dump files and Hyper-V save files.
The LeechCore library also allows for transparently connecting to a remote LeechService running on a remote system over a compressed mutually authenticated RPC connection secured by Kerberos. Once connected any of the supported memory acquisition methods may be used.

The LeechService
The LeechService may be installed as a service with the command LeechSvc.exe install. Make sure all necessary dependencies are in the folder of leechsvc.exe - i.e. leechcore.dll and att_winpmem_64.sys (if using winpmem). Once started, the LeechService will write an entry containing the Kerberos SPN to the application event log, provided that the computer is a part of an Active Directory domain.
The LeechService is installed and started with the Kerberos SPN: book-test$@AD.FRIZK.NET
Now connect to the remote LeechService with The Memory Process File System - provided that the port 28473 is open in the firewall. The connecting user must be an administrator on the system being analyzed. An event will also be logged for each successful connection. In the example below winpmem is used.
Securely connected to the remote system - acquiring and analyzing live memory.
It's also possible to start the LeechService in interactive mode. If starting it in interactive mode it can be started with DumpIt to provide more stable memory acquisition. It may also be started in insecure no-security mode - which may be useful if the computer is not joined to an Active Directory domain.
Using DumpIt to start the LeechSvc in interactive insecure mode.
If started in insecure mode everyone with access to port 28473 will be able to connect and capture live memory. No logs will be written. The insecure mode is not available in service mode. It is only recommended in secure environments in which the target computer is not domain joined. Please also note that it is also possible to start the LeechService in interactive secure mode.

To connect to the example system from a remote system specify:
MemProcFS.exe -device dumpit -remote rpc://insecure:<address_of_remote_system>
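The connection arguments can also be assembled in a small helper, which is handy when connecting to several systems. The insecure form matches the command above. The secure form, where the Kerberos SPN takes the place of the word insecure, is my assumption about the connection string layout and should be checked against the LeechCore documentation before use.

```python
def remote_url(address: str, spn: str = None) -> str:
    """Build the value for the -remote option.

    With spn=None this yields the insecure form shown above. Putting the
    SPN in place of 'insecure' is an ASSUMED format, not taken from the text.
    """
    auth = spn if spn else "insecure"
    return f"rpc://{auth}:{address}"


def memprocfs_remote_args(address: str, device: str = "dumpit", spn: str = None) -> list:
    """Argument list for MemProcFS.exe connecting to a remote LeechService."""
    return ["MemProcFS.exe", "-device", device, "-remote", remote_url(address, spn)]


# Insecure lab connection, as in the example above (address is a placeholder).
print(" ".join(memprocfs_remote_args("192.0.2.10")))

# Assumed secure form using the SPN logged by the LeechService.
print(" ".join(memprocfs_remote_args("book-test.ad.frizk.net",
                                     device="pmem",
                                     spn="book-test$@AD.FRIZK.NET")))
```

The device names passed through here are only forwarded as-is; "pmem" in the second call is a placeholder for whichever acquisition method is used on the remote side.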

How do I try it out?
Yes! - both the Memory Process File System and the LeechService are 100% open source.
  1. Download The Memory Process File System from Github - pre-built binaries are found in the files folder. Also, follow the instructions to install the open source Dokany file system.
  2. Download the LeechService from Github - pre-built binaries with no external dependencies are found in the files folder. Please also note that you may have to download Comae DumpIt or WinPMEM (download and copy .sys driver file to directory of MemProcFS.exe) to acquire live memory.

The Future
Please do keep in mind that this is a hobby project. Since I'm not working professionally with this future updates may take time and are also not guaranteed.

The Memory Process File System and the LeechCore are already somewhat mature with their focus on fast, efficient, multi-threaded live memory acquisition and analysis, even though current functionality is somewhat limited.

The plan for the near future is to add additional core functionality - such as page hashing and PFN database support. Page hashing will allow for more efficient remote memory acquisition and better forensics capabilities. PFN database support will strengthen virtual memory support in general.
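To illustrate why page hashing could make remote acquisition more efficient, here is a toy sketch of the general idea: hash each 4 KiB page and only transfer pages whose hash has changed since the last read. This is my own simplified illustration of the concept and not the planned implementation; the hash function and helper names are arbitrary choices.

```python
import hashlib

PAGE_SIZE = 4096  # x86/x64 small page size


def page_hashes(memory: bytes) -> list:
    """Return one SHA-256 digest per PAGE_SIZE chunk of memory."""
    return [hashlib.sha256(memory[i:i + PAGE_SIZE]).digest()
            for i in range(0, len(memory), PAGE_SIZE)]


def changed_pages(old_hashes: list, new_hashes: list) -> list:
    """Return the indices of pages whose hash differs between two snapshots."""
    return [i for i, (a, b) in enumerate(zip(old_hashes, new_hashes)) if a != b]


# Simulate 8 pages of memory where a single byte in page 3 changes.
mem = bytearray(PAGE_SIZE * 8)
before = page_hashes(bytes(mem))
mem[3 * PAGE_SIZE + 17] = 0xFF
after = page_hashes(bytes(mem))

dirty = changed_pages(before, after)
print(f"pages to transfer: {dirty} ({len(dirty) * PAGE_SIZE} of {len(mem)} bytes)")
```

Comparing 32-byte hashes instead of sending 4096-byte pages means unchanged memory costs very little bandwidth on a refresh.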

Additional and more efficient analysis methods - primarily in the form of new plugins - will also be added in the medium term.

Support for additional operating systems, such as Linux and macOS, is a long-term goal. It should however be noted that the LeechCore library is already supported on Linux.

Update
2019-02-18: Please also have a look at my Microsoft BlueHatIL 2019 talk in which I, among other things, talk about using the Memory Process File System v2.0 with the remote capture functionality discussed in this blog post. In the talk I also make use of the Python API and demo the "Total Meltdown/CVE-2018-1038" vulnerability.