3D Object Recognition Using Computer Vision

3D Object Recognition Using Computer Vision. VanGogh Imaging, Inc. Kenneth Lee, CEO/Founder, klee@vangoghimaging.com. Corporate Overview: founded in 2007, located in McLean, VA.


Presentation Transcript


  1. 3D Object Recognition Using Computer Vision VanGogh Imaging, Inc.

  2. Kenneth Lee CEO/Founder klee@vangoghimaging.com

  3. Corporate Overview
     • Founded in 2007, located in McLean, VA
     • Mission: “Provide easy to use, real-time 3D computer vision (CV) technology for embedded and mobile applications”
       • 2D to 3D for better visualization, higher reliability, and accuracy
       • Solve problems that require spatial measurements (e.g., parts inspection)
     • Target customer: Application and System Developers
       • Enhance existing products or develop new products
     • Product: ‘Starry Night’ 3D-CV Middleware (Unity Plugin)
       • Operating Systems: Android and Linux
       • 3D Sensors: Occipital Structure and Intel RealSense
       • Processors: ARM and Xilinx Zynq
     • Our focus
       • Object recognition
       • Feature detection
       • Analysis (e.g., measurements)

  4. Potential Applications
     • 3D Printing
     • Robotics
     • Parts Inspection
     • Security
     • Entertainment
     • Medical Imaging
     • Automotive Safety

  5. Challenges for Implementing Real-Time 3D Computer Vision
     • Busy, uncontrolled real-world environment
     • Limited processing power and memory
     • Noisy and uncalibrated low-cost scanners
     • Difficult-to-use libraries
     • Hard to find proficient computer vision engineers
     • Lack of standards
     • Large development investment

  6. Starry Night Unity Plugin (patent pending)
     Starry Night Video: https://www.youtube.com/watch?v=IZX-9PH7Erw&feature=youtu.be

  7. The ‘Starry Night’ Template-Based 3D Model Reconstruction
     • Reliable: The output is always a fully formed 3D model with known feature points, despite noisy or partial scans
     • Easy to use: Fully automated process
     • Powerful: Known data structure for easy analysis and measurement
     • Fast: Real-time modeling
     Input Scan (Partial) + Reference Model = Full 3D Model

  8. 3D Object Recognition Algorithm for Mobile and Embedded Devices

  9. Challenges - Scene • Busy scene, object orientation, and occlusion

  10. Challenges - Platform
      • Mobile and embedded devices
        • ARM Cortex-A9 or Cortex-A15, <2 GB RAM
      • Existing libraries were built for laptop/desktop platforms
      • GPU processing is not always available

  11. Previous Approaches
      • (2D) Texture-Based Methods
        • Color-based → depends heavily on lighting or color of the object
        • Machine learning → robust, but requires training for each object
        • Neither method provides a transform (i.e., orientation)
      • (3D) Methods
        • Hough transform and geometric hashing → slow
        • Geometric hashing → even slower
        • Tensor matching → not good for noisy and sparse scenes
        • Correspondence-based methods using rigid geometric descriptors
          • The models must have distinctive feature points, which is not true for most models (e.g., a cylinder)
      Tried

  12. General Concept for CV-Based Object Recognition
      [Diagram; labels: Reference Object, Descriptor, Distance & Normal, Compare, Fine-Tune, Orientation, Location, Transpose, Match Criteria, Scene, Distance & Normal of Random Sample Points]

  13. Block Diagram

  14. Model Descriptor (Pre-Processed)
      • Sample all point pairs in the model that are separated by the same distance D
        Note: In the bear example, D = 5 cm, which resulted in 1000 pairs
        Note: The keys are angles derived from the normals of the points:
          alpha (α) = first normal to second point
          beta (β) = second normal to first point
          omega (Ω) = angle of the plane between two points
      • Use the surface normals of the pair to group them into the hash table
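The pre-processing step on this slide can be sketched roughly as follows. This is an illustrative sketch, not VanGogh Imaging's implementation. The function names (`pair_key`, `build_model_table`), the distance tolerance `tol`, and the 10-degree bin width are all assumptions. The slide's description of omega is ambiguous, so the sketch takes it to be the angle between the two normals, following the descriptor in the Papazov and Burschka paper listed under Resources.

```python
import math
from collections import defaultdict
from itertools import combinations

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def norm(a):
    return math.sqrt(dot(a, a))

def angle(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))

def pair_key(p1, n1, p2, n2, bin_deg=10.0):
    """Quantized (alpha, beta, omega) key for an oriented point pair."""
    alpha = angle(n1, sub(p2, p1))   # first normal vs. direction to second point
    beta = angle(n2, sub(p1, p2))    # second normal vs. direction to first point
    omega = angle(n1, n2)            # assumption: angle between the two normals
    step = math.radians(bin_deg)
    return tuple(int(a // step) for a in (alpha, beta, omega))

def build_model_table(points, normals, D, tol, bin_deg=10.0):
    """Hash every model point pair whose separation is within tol of D."""
    table = defaultdict(list)
    # Brute force over all pairs; acceptable here because this runs offline.
    for i, j in combinations(range(len(points)), 2):
        if abs(norm(sub(points[j], points[i])) - D) <= tol:
            # Store both orderings so a scene pair matches whichever way it is sampled.
            table[pair_key(points[i], normals[i], points[j], normals[j], bin_deg)].append((i, j))
            table[pair_key(points[j], normals[j], points[i], normals[i], bin_deg)].append((j, i))
    return table
```

Because the key depends only on angles between normals and the connecting line, it does not change when the model is rotated or translated, which is what allows a scene pair to be looked up without knowing the object's pose.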

  15. Object Recognition Workflow
      • Grab scene
        Note: The example scene has around 16K points
      • Sample point pairs w/ distance D using RANSAC
        Note: We iterated this sampling process 100 times
      • Generate key using the same hash function
        Note: The entire process can be easily parallelized
      • Use key to retrieve similarly oriented points in the model & rough transform
        Very Important: Multiple models can be found using a single hash table (for example, from one sampled point pair in the scene)
      • Match criteria to find the best match
      • Use ICP to refine transform
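One possible way to obtain the "rough transform" once a scene pair has been matched to a model pair through the hash table is to attach a local coordinate frame to each oriented pair and align the two frames. The sketch below illustrates that idea only. The slides do not say how the transform is computed, and the names `pair_frame` and `rough_transform` are hypothetical.

```python
import math

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))

def pair_frame(p1, n1, p2, n2):
    """Origin and orthonormal axes attached to an oriented point pair."""
    origin = scale(add(p1, p2), 0.5)
    x = unit(sub(p2, p1))
    y = unit(cross(x, add(n1, n2)))  # degenerate if n1 + n2 is parallel to x
    z = cross(x, y)
    return origin, (x, y, z)

def apply_rot(R, v):
    return tuple(dot(row, v) for row in R)

def rough_transform(model_pair, scene_pair):
    """Rotation R (rows as tuples) and translation t mapping model to scene."""
    om, am = pair_frame(*model_pair)
    os_, as_ = pair_frame(*scene_pair)
    # R sends each model axis to the corresponding scene axis.
    R = tuple(tuple(sum(as_[k][i] * am[k][j] for k in range(3))
                    for j in range(3)) for i in range(3))
    t = sub(os_, apply_rot(R, om))
    return R, t

def transform(R, t, v):
    return add(apply_rot(R, v), t)
```

Each sampled scene pair yields one candidate pose this way. The match criteria then score the candidates, and ICP refines the best one.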

  16. Implementation • Result Object Recognition Video: https://www.youtube.com/watch?v=h7whfei0fTw&feature=youtu.be

  17. * CONFIDENTIAL * Object Recognition Examples

  18. Adaptive 3D Object Recognition Algorithm: Resize and Reshape

  19. Object Recognition for Different Sizes & Shapes
      • Objects in the real world are not always identical
      • Similarity Factor, S%, denotes the allowed % of shape difference
        • This allows recognition of an object that is similar to, but does not have the exact shape of, the reference model
      • Size Factor, Z%, denotes the allowed % of size difference
        • This allows recognition of an object whose size differs from that of the reference model
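As a rough illustration of how a size factor might be applied, the sketch below generates candidate scale factors within ±Z% and rescales a reference model about its centroid. The slides do not describe the actual mechanism, so the helper names (`candidate_scales`, `scale_model`) and the idea of trying a fixed number of evenly spaced scales are assumptions.

```python
def candidate_scales(z_percent, steps=5):
    """Evenly spaced scale factors from 1 - Z% to 1 + Z% (steps must be >= 2)."""
    lo = 1.0 - z_percent / 100.0
    hi = 1.0 + z_percent / 100.0
    return [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]

def scale_model(points, s):
    """Scale 3D model points about their centroid by factor s."""
    n = len(points)
    c = tuple(sum(p[i] for p in points) / n for i in range(3))
    return [tuple(c[i] + s * (p[i] - c[i]) for i in range(3)) for p in points]

# Example: with Z = 20%, try the reference model at five sizes
# between 80% and 120% of its original size.
reference = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
for s in candidate_scales(20, steps=5):
    resized = scale_model(reference, s)
```

Uniform scaling multiplies every point-to-point distance by the same factor, so a pair-distance parameter such as D would need to be scaled along with the model.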

  20. General Approach
      • Dynamically resizes the reference model
      • Dynamically reshapes the reference model
        • Uses our ‘Shape-based Registration’ technique
      • Hence, the reference model is ‘deformed’ to match the object in the scene
      • Results in very robust object recognition
      • The end reference model best represents the object in the scene in both size and shape

  21. Block Diagram – Adaptive Object Recognition with Feedback
      • The reference model is iteratively modified with every new frame until it converges to the object in the scene
      Note: Currently being implemented; will be available in Version 1.2 later this year

  22. Object Recognition Performance Numbers

  23. Reliability (w/ bear model)
      • Reliability
        • % false positives (depends on the scene)
          • Clean scene: <1%
          • Noisy scene: 5% (1 out of 20 frames)
        • % negative results (cannot find the object)
          • Clean scene: <1%
          • Noisy scene: 10% (also takes longer)
      • Effect of orientation on success ratio
        • Model facing front: >99%
        • Model facing backwards: >99%
        • Model facing sideways (narrower): 85%

  24. Performance - Mobile
      • Performance on a 2 GHz ARM Cortex-A15 (Android mobile device)
      • Time to find one object
        • Single thread: 2 seconds
        • Multi-thread & NEON: 0.3 seconds
      • Time to find two objects
        • Single thread: 2.5 seconds
        • Multi-thread & NEON: 0.5 seconds
      • Note: Effective use of NEON led to significant performance gains of 2.5x for certain functions
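The timings on this slide can be turned into overall speedups and per-object marginal costs with a few lines of arithmetic. The sketch below uses only the four numbers reported above; the variable names are illustrative.

```python
# Timings reported on the slide, in seconds, keyed by number of objects found.
single_thread = {1: 2.0, 2: 2.5}
multi_neon = {1: 0.3, 2: 0.5}

def speedup(baseline, optimized):
    """How many times faster the optimized run is than the baseline."""
    return baseline / optimized

for n in (1, 2):
    s = speedup(single_thread[n], multi_neon[n])
    print(f"{n} object(s): {s:.1f}x faster with multi-threading and NEON")

# Extra time needed to find a second object in the same scene.
extra_single = single_thread[2] - single_thread[1]
extra_neon = multi_neon[2] - multi_neon[1]
print(f"second object costs {extra_single:.1f} s single-threaded, "
      f"{extra_neon:.1f} s with multi-threading and NEON")
```

With these figures the combined multi-thread and NEON path is roughly 6.7x faster for one object and 5x faster for two. Finding a second object adds about 0.5 s single-threaded and about 0.2 s optimized, noticeably less than finding the first.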

  25. Hardware Acceleration Using FPGA
      • Xilinx Zynq SoC provides 20 to 1,000 parallel voxel processors, depending on the size of the FPGA
      [Diagram; labels: Zynq, ARM, scan, FPGA, voxel Processor 1 through voxel Processor 20+]

  26. Hardware Acceleration: FPGA (Xilinx Zynq)
      • Select functions to be implemented in Zynq
        • FPGA: Matrix operations
        • Dual-core ARM: Data management + floating point
      • Entire implementation done in C++ (Xilinx Vivado-HLS)

  27. Performance: Embedded Using FPGA
      • Note: Currently, only 30% of the computationally intensive functions are implemented on the FPGA, with the rest still running on the ARM Cortex-A9. Speed will be much improved once the remaining high-intensity functions are transferred to the FPGA.
      • Performance on Xilinx Zynq (Cortex-A9 at 800 MHz + FPGA)
      • Time to find one object
        • Zynq 7020: 0.7 seconds
        • Zynq 7045 (est.): 0.1 seconds
      • No test results for two objects, but performance should scale the same way as for the ARM
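Amdahl's law gives a feel for why moving more of the heavy functions onto the FPGA matters. The sketch below is generic. The slide's 30% counts functions, not runtime, so the runtime fractions and the 20x per-function acceleration factor used here are purely hypothetical and are not measurements from the presentation.

```python
def amdahl_speedup(accelerated_fraction, accel_factor):
    """Overall speedup when a fraction of total runtime runs accel_factor times faster."""
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / accel_factor)

# Hypothetical runtime fractions offloaded to the FPGA, each sped up 20x.
for fraction in (0.3, 0.6, 0.9):
    print(f"{fraction:.0%} of runtime offloaded: "
          f"{amdahl_speedup(fraction, 20.0):.2f}x overall")
```

The part left on the ARM core bounds the overall gain, so each additional function moved to the FPGA raises the achievable speedup more than the previous one did.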

  28. Future
      • The chosen algorithm works well in most real-world conditions
      • The chosen algorithm is tolerant to size and shape differences with respect to the reference model
      • The chosen algorithm can find multiple objects at the same time with minimal additional processing power
      • Additional improvements in performance are needed
        • Algorithm
        • Application-specific parameters (e.g., size of the model descriptor)
        • ARM - NEON
        • Optimize the use of the FPGA core

  29. Summary
      • Key implementation issues
        • Model descriptor
        • Data structure
        • Sampling technique
        • Platform
      • IMPORTANT
        • Both ARM & FPGA provide the scalability
      • Therefore
        • Real-time 3D object recognition was very difficult but successfully implemented on both mobile and embedded platforms!
      • LIVE DEMO AT THE Xilinx BOOTH!

  30. Resources
      • www.vangoghimaging.com
      • Android 3D printing: http://www.youtube.com/watch?v=7yCAVCGvvso
      • “Challenges and Techniques in Using CPUs and GPUs for Embedded Vision” by Ken Lee, VanGogh Imaging: http://www.embedded-vision.com/platinum-members/vangogh-imaging/embedded-vision-training/videos/pages/september-2012-embedded-vision-summit
      • “Using FPGAs to Accelerate Embedded Vision Applications” by Kamalina Srikant, National Instruments: http://www.embedded-vision.com/platinum-members/national-instruments/embedded-vision-training/videos/pages/september-2012-embedded-vision-summit
      • “Demonstration of Optical Flow algorithm on an FPGA”: http://www.embedded-vision.com/platinum-members/bdti/embedded-vision-training/videos/pages/demonstration-optical-flow-algorithm-fpg
      • * Reference: “An Efficient RANSAC for 3D Object Recognition in Noisy and Occluded Scenes” by Chavdar Papazov and Darius Burschka, Technische Universität München (TUM), Germany.
