iaTest Demo

Below are notes and example command lines for the iaTest¹ demo, for use after installing the SigSRF SDK eval. The demo has two purposes:

show how to implement an Atom-based vision² + AI server with 34 total CPU cores under 75 W
provide an example application, including source code, to measure OpenCV performance between Atom only and Atom + coCPU cores, with no ARM or GPU

In addition to OpenCV, the next iteration of this demo will include TensorFlow.

¹iaTest = image analytics test
²vision = computer vision, machine vision, vehicle vision, etc

Other Demos

mediaTest Demo (Streaming Media, Buffering, Transcoding, and Packet RFCs)

paTest Demo (Predictive Analytics)

Vision + AI Server

The demo defines the requirements for a practical, deployable vision + AI server as follows:

Small size, 8" x 9" x 3"
Low power -- target of 50 W; the current prototype shown here is 75 W
High performance -- this demo shows a 16x increase in OpenCV capacity vs. dual-core Atom
Cloud compatible programming model -- use Atom x86 + Linux, with no ARM, GPU, or FPGA, no special APIs or flow graphs, etc
All cores have direct access to network I/O
Ready to run OpenCV and TensorFlow

Low SWaP³ requirements are obvious enough; what is less obvious, yet cannot be overemphasized, is the importance of a fully cloud compatible programming model. All new vision and AI algorithms, including low SWaP applications, are tested in the cloud prior to production. No one wants to be forced to port complex, performance sensitive algorithm based code to ARM and end up in an unsupported backwater.

Below are some pictures of the demo Atom server, with 32 coCPU™ cores installed. coCPU cores are high performance CPU cores that run gcc compatible C/C++ code.

Specifics of the Atom-based vision + AI demo server include:

Mini-ITX motherboard and case
Dual core Atom (C2358, 1.74 GHz), 4x GbE interfaces, 8 GB DDR3 mem, 1333 MHz
32 coCPU cores (C6678, 1.6 GHz), GbE interface, 8 GB DDR3 mem, 1333 MHz, x8 PCIe
4x USB interfaces
IPMI (dedicated GbE)
Audio I/O interface (via USB)
VGA optional display

³SWaP = Size, Weight, and Power consumption

Architecture Diagram

Below is an architecture diagram showing network I/O and processing performed by x86 CPUs and c66x coCPUs.

Atom Only Tests

Below are example command lines to use with or without a coCPU card installed. The demo source code performs motion detection and tracking, with a rudimentary algorithm that compensates for camera motion (such as jerky hand-held mobile phone video).

./iaTest -m1 -cx86 -s0 -itest_files/hallway_352x288p_30fps_420fmt.yuv -x352 -y288 -ohallway_test.yuv -l0x01000003
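
As a rough illustration of the motion detection step, the sketch below counts luma (Y plane) pixels that change between two consecutive frames and flags motion when enough of them change. It is not taken from ia.c: the function names, the threshold, and the minimum-fraction parameter are hypothetical, and it does not attempt the camera motion compensation that the demo performs.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not from ia.c: counts luma pixels whose absolute
   difference between the previous and current frame exceeds `threshold`. */
size_t count_moving_pixels(const uint8_t *prev_y, const uint8_t *curr_y,
                           size_t width, size_t height, int threshold)
{
    size_t moving = 0;
    for (size_t i = 0; i < width * height; i++) {
        int diff = abs((int)curr_y[i] - (int)prev_y[i]);
        if (diff > threshold)
            moving++;
    }
    return moving;
}

/* Returns 1 when more than `min_fraction` of the pixels changed, else 0.
   Both parameters are illustrative tuning values, not demo settings. */
int motion_detected(const uint8_t *prev_y, const uint8_t *curr_y,
                    size_t width, size_t height,
                    int threshold, double min_fraction)
{
    size_t moving = count_moving_pixels(prev_y, curr_y, width, height, threshold);
    return (double)moving > min_fraction * (double)(width * height);
}
```

Plain frame differencing like this reports motion across the whole frame whenever the camera itself moves, which is why a compensation step is needed for hand-held video.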

Atom + coCPU Tests

To run coCPU tests, a coCPU card must be installed in the Atom server. The pictures above show a 32-core card; a 64-core card is also available (but would increase the size of the enclosure). If supported by the riser, multiple 32-core cards can be installed. Cards can be obtained from Signalogic, Advantech, or Texas Instruments.

Below are example command lines to use with coCPU cards.

./iaTest -m1 -f1600 -eia.out -cSIGC66XX-8 -s0 -itest_files/hallway_352x288p_30fps_420fmt.yuv -x352 -y288 -ohallway_test.yuv -l0x01000003

./iaTest -m1 -f1600 -eia.out -cSIGC66XX-8 -s0 -itest_files/CCTV_640x360p_30fps_420fmt.yuv -x640 -y360 -occtv_test.yuv -l0x01100003

Output Frame Grabs

Below are some example frame grabs from output .yuv files. After running the demo, this is what you should see for output yuv file results, prior to making source code modifications.
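
To pull a single frame out of an output file for inspection, it helps to know where each frame starts. The sketch below assumes the .yuv files are headerless, 8-bit, planar 4:2:0 with even dimensions (suggested by the "420fmt" filenames, but not confirmed here); the function names are hypothetical and not part of the demo source.

```c
#include <stddef.h>
#include <stdio.h>

/* Bytes in one planar YUV 4:2:0 frame: a full-size Y plane plus two
   chroma planes subsampled by 2 in each direction (even dimensions assumed). */
size_t yuv420_frame_bytes(size_t width, size_t height)
{
    return width * height + 2 * ((width / 2) * (height / 2));
}

/* Byte offset of frame `index` (0-based) in a headerless file. */
size_t yuv420_frame_offset(size_t width, size_t height, size_t index)
{
    return index * yuv420_frame_bytes(width, height);
}

/* Reads frame `index` into `buf`, which must hold at least
   yuv420_frame_bytes(width, height) bytes. Returns 0 on success, -1 on error. */
int read_yuv420_frame(FILE *fp, size_t width, size_t height,
                      size_t index, unsigned char *buf)
{
    size_t frame_bytes = yuv420_frame_bytes(width, height);
    if (fseek(fp, (long)yuv420_frame_offset(width, height, index), SEEK_SET) != 0)
        return -1;
    return fread(buf, 1, frame_bytes, fp) == frame_bytes ? 0 : -1;
}
```

Under these assumptions a 352x288 frame occupies 152064 bytes and a 640x360 frame occupies 345600 bytes, so dividing an output file's size by the frame size gives the number of frames it holds.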

coCPU™ Notes

coCPU cores must meet the following requirements:

High performance, including extensive SIMD capability, 8 or more cores per CPU, L1 and L2 cache, and advanced DMA capability
Contain onchip network I/O and packet processing and onchip PCIe
Access to 2 (two) GB or more external DDR3 mem
Able to efficiently decode camera input, e.g. H.264 streams arriving as input via onchip network I/O
CGT⁴ supports gcc compatible C/C++ build and link, mature and reliable debug tools, RTOS, and numerous libraries

The current vision + AI server demo uses TI C6678 CPUs, which meet these requirements. Over time, other suitable CPUs may become available.

Combining x86 and c66x CPUs, and running the software components necessary for AI applications, such as H.264 decode, OpenCV, and TensorFlow, is another form of "AI Accelerator". The architecture described here favors fast, reliable development: mature semiconductors and tools, open source software, standard server format, and a wide range of easy-to-use peripherals and storage.

⁴CGT = Code Generation Tools

Install Notes

For Atom operation, the demo installs two OpenCV v3.2 libraries (libopencv_core_sig.so and libopencv_imgproc_sig.so). Demo-specific filenames (with _sig suffix) are used so as not to interfere with existing OpenCV installations. For coCPU operation, the demo installs the libopencv.le66 combined library, which includes a number of OpenCV v2.4.2 modules. This link has more information on how libopencv.le66 was created.
For Atom operation, demo source code (for example ia.c and yuv.c files) can be modified, rebuilt, and linked, but if new OpenCV functions are required then the iaTest Makefile will need to be modified to reference the required libraries. For coCPU operation, demo source can also be modified; in this case you will need to download the c66x CGT tools for Linux from TI's website in order to rebuild. The Makefile installed in the mCPU_target subfolder may need to be modified.

Demo Notes

The iaTest demo is one of several SigSRF demos, including mediaTest (media streaming and transcoding), and ffmpeg_accel (accelerated ffmpeg encoding and video streaming).
iaTest source code files use #ifdef's to allow compilation for both x86 and c66x CPUs. All source code examples invoke C or C++ OpenCV function calls per OpenCV standards. All source code can be augmented with network I/O APIs.
The demo includes optimized YUV - RGB conversion routines not provided with OpenCV. These are in the yuv.c file.
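
For readers unfamiliar with what such a conversion involves, here is a minimal, unoptimized sketch of one direction (planar YUV 4:2:0 to packed RGB24). It is not the code in yuv.c: the function names are hypothetical, and it assumes 8-bit video-range BT.601 input with the commonly used fixed-point coefficients.

```c
#include <stdint.h>

/* Clamp an intermediate value to the 0..255 range of an 8-bit channel. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Convert one pixel, assuming BT.601 video range (Y 16..235, U/V 16..240).
   Writes R, G, B to rgb[0], rgb[1], rgb[2]. */
void yuv_to_rgb_pixel(uint8_t y, uint8_t u, uint8_t v, uint8_t *rgb)
{
    int c = (int)y - 16;
    int d = (int)u - 128;
    int e = (int)v - 128;
    rgb[0] = clip_u8((298 * c + 409 * e + 128) / 256);
    rgb[1] = clip_u8((298 * c - 100 * d - 208 * e + 128) / 256);
    rgb[2] = clip_u8((298 * c + 516 * d + 128) / 256);
}

/* Convert a planar 4:2:0 frame to packed RGB24. Each U/V sample is shared
   by a 2x2 block of luma pixels; even width and height are assumed. */
void yuv420_to_rgb24(const uint8_t *y, const uint8_t *u, const uint8_t *v,
                     int width, int height, uint8_t *rgb)
{
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            int ci = (row / 2) * (width / 2) + (col / 2);
            int pi = row * width + col;
            yuv_to_rgb_pixel(y[pi], u[ci], v[ci], &rgb[3 * pi]);
        }
    }
}
```

An optimized routine would typically replace the per-pixel arithmetic with lookup tables or SIMD operations; the sketch favors readability over speed.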

Power Consumption Notes

The demo server consumes up to 75 W. Ongoing R&D work aims to reduce usage to under 50 W, and also add a 35 W "minimum mode", under which reduced processing is performed until some event or trigger enables full processing.

Portability in the 30 to 75 W range requires a sizable lithium battery. A typical "generator style" battery is rated at 100 W and 41 amp-hours, with dimensions 6" x 3" x 7" and weight 3.3 lb. Such a battery increases overall solution size and weight, which must be weighed against application SWaP requirements.
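
A back-of-the-envelope runtime estimate can help when weighing battery size against load. The sketch below is only that: the 41 amp-hour and 75 W figures come from the text above, but the battery's nominal voltage and conversion efficiency are not given here, so both are parameters the reader must supply, and the function name is hypothetical.

```c
/* Hypothetical helper: estimated runtime in hours for a battery of
   `capacity_ah` amp-hours at `nominal_v` volts, delivering `load_w` watts
   through a converter with the given efficiency (0.0 to 1.0).
   Returns 0.0 if the load is not positive. */
double battery_runtime_hours(double capacity_ah, double nominal_v,
                             double efficiency, double load_w)
{
    if (load_w <= 0.0)
        return 0.0;
    /* Energy in watt-hours = amp-hours x volts, derated by efficiency. */
    return capacity_ah * nominal_v * efficiency / load_w;
}
```

For example, if the 41 amp-hour rating were quoted at an assumed 3.7 V cell voltage with an assumed 90% conversion efficiency, `battery_runtime_hours(41.0, 3.7, 0.9, 75.0)` would give roughly 1.8 hours at full load. A different nominal voltage changes the result proportionally, so the actual battery datasheet should be checked.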

Embedded AI Comparison

The advantages of a miniaturized x86 based AI server are compelling: it runs cloud software immediately with no code rewrite, offers flexible, reliable peripherals and storage like any other server, and achieves high performance using coCPU cores. The disadvantage is power consumption. Even though this approach consumes relatively low power, some embedded AI boxes, such as PerceptIn, contain numerous ARM cores, which can reduce power consumption to the 20 W to 30 W range and potentially to under 10 W.

To make an x86 based AI server completely portable requires a small Lithium battery, for example this one, but this increases overall solution weight.

SWaP tradeoffs -- as always -- depend on specific application requirements.
