This document discusses using deep learning techniques for semantic segmentation of indoor point clouds. It provides an overview of initial ideas for using deep learning models trained on 3D CAD models to classify and label points in an indoor point cloud. It also discusses preprocessing steps like denoising and simplifying the point cloud as well as potential techniques for primitive-based reconstruction from the segmented point cloud.
2. Implementation Initial ‘deep learning’ idea
The .XYZ point cloud is better suited than the
reconstructed .obj file for automatic
segmentation because of its higher resolution
Input point cloud
3D CAD MODEL
No need to have planar surfaces sampled too densely
www.outsource3dcadmodeling.com
2D CAD MODEL
Straightforward from 3D to 2D
cadcrowd.com
RECONSTRUCT 3D
“Deep Learning”
3D semantic segmentation
from point cloud / reconstructed mesh
youtube.com/watch?v=cGuoyNY54kU
arxiv.org/abs/1608.04236
Primitive-based deep learning segmentation
The order between semantic segmentation and reconstruction could be swapped
3. Sensors Architectural spaces
https://matterport.com/
Some Company
could upgrade to? http://news.mit.edu/2015/object-recognition-robots-0724
https://youtu.be/m6sStUk3UVk
http://news.mit.edu/2015/algorithms-boost-3-d-imaging-resolution-1000-times-120
http://www.forbes.com/sites/eliseackerman/2013/11/17/
4. HARDWARE Existing scanners static
Scan space eventually with a drone
https://www.youtube.com/watch?v=dVPOf-oDUOM
Introducing Cartographer: We are happy to announce the open source release of
Cartographer, a real-time simultaneous localization and mapping (SLAM) library in 2D and
3D with ROS support. SLAM algorithms combine data from various sensors (e.g. LIDAR,
IMU and cameras) to simultaneously compute the position of the sensor and a map of the
sensor’s surroundings.
We recognize the value of high quality datasets to the research community. That’s why,
thanks to cooperation with the Deutsches Museum (the largest tech museum in the world),
we are also releasing three years of LIDAR and IMU data collected using our 2D and 3D
mapping backpack platforms during the development and testing of Cartographer.
http://www.ucl.ac.uk/3dim/bim | http://www.homepages.ucl.ac.uk/~ucescph/
Indoor Mobile Mapping
Rapid Data Capture for Indoor Modelling
As part of a working group we are investigating the great potential of indoor
mobile mapping systems for providing 3D capture of the complex and unique
environment that exists inside buildings. The investigation is taking the form
of a series of trials to explore the technical capabilities of Indoor Mobile
Mapping Systems, such as the i-MMS from Viametris, with a view to
performance in Survey and BIM applications with respect to UK standards.
The working group is investigating the potential of such technology in terms
of accuracies, economic viability and its future development.
6. Implementation rough Idea
Input point cloud
CAD-Primitive based reconstruction
Trained on ModelNet.
CAD Primitives
ModelNet
modelnet.cs.princeton.edu
Possibly only simplified modelling, with only
walls, floor and openings
http://dx.doi.org/10.1016/j.cag.2015.07.008
2D CAD FLOORPLAN
→ .SVG FOR REAL ESTATE AGENTS
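The floorplan-to-SVG step could look roughly like the sketch below. It assumes the reconstruction has already produced wall segments as 2D endpoint pairs in metres; the function name `floorplan_to_svg` and the `scale` and `margin` parameters are made up for illustration and do not come from any of the tools cited here.

```python
def floorplan_to_svg(walls, scale=50.0, margin=10.0):
    """Render wall segments ((x1, y1), (x2, y2)) in metres as an SVG string."""
    xs = [x for seg in walls for x, _ in seg]
    ys = [y for seg in walls for _, y in seg]
    min_x, min_y = min(xs), min(ys)
    # Canvas size in pixels: extent of the plan times the scale, plus a margin.
    width = (max(xs) - min_x) * scale + 2 * margin
    height = (max(ys) - min_y) * scale + 2 * margin
    lines = []
    for (x1, y1), (x2, y2) in walls:
        lines.append(
            '<line x1="{:.1f}" y1="{:.1f}" x2="{:.1f}" y2="{:.1f}" '
            'stroke="black" stroke-width="3"/>'.format(
                (x1 - min_x) * scale + margin, (y1 - min_y) * scale + margin,
                (x2 - min_x) * scale + margin, (y2 - min_y) * scale + margin))
    return ('<svg xmlns="http://www.w3.org/2000/svg" width="{:.0f}" height="{:.0f}">\n'
            '{}\n</svg>\n').format(width, height, "\n".join(lines))


# A 4 m x 3 m rectangular room as four wall segments.
room = [((0, 0), (4, 0)), ((4, 0), (4, 3)), ((4, 3), (0, 3)), ((0, 3), (0, 0))]
svg_text = floorplan_to_svg(room)
```

Note that the SVG y-axis points downwards, so a real exporter would probably flip y to match the CAD convention.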
7. Point clouds to Architectural Models #1
Point-Cloud Processing with Primitive Shapes
cg.cs.uni-bonn.de/en/projects
http://discovery.ucl.ac.uk/id/eprint/1485847
Thomson, CPH; (2016) From Point Cloud to Building Information Model:
Capturing and Processing Survey Data Towards Automation for High Quality
3D Models to Aid a BIM Process. Doctoral thesis, UCL (University College
London).
8. Point clouds to Architectural Models #2
Eric Turner, May 14, 2015
Electrical Engineering and Computer Sciences, University of California at Berkeley
Technical Report No. UCB/EECS-2015-105
http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-105.html
Figure 3.2: (a) Point cloud of scanned area, viewed from above
and colored by elevation; (b) wall sample locations generated
from point cloud. Clutter such as furniture or plants does not affect
the position of wall samples.
10. First level Idea
Input point cloud
(1,039,097 vertices)
'Simplified' point cloud
By Markus Ylimäki
1) Noisy input with possible missing parts
2) Denoise, consolidate, find normals and possibly upsample the point cloud
3) Find planar surfaces with semantic labels (semantic segmentation for point clouds)
   Optimally you would like to describe a wall using just its 4 corner points → massive reduction of points
4) Remove overly complex shapes such as chairs, flowers, chandeliers, etc.
Old-school techniques, no machine learning here yet
Each color corresponds to a plane; black corresponds to no plane
(1,039,097 vertices)
This algorithm gives acceptable results, but it could be a lot faster, and
a method can never be too robust. Better ‘inpainting’
performance for missing data would also help.
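The slide does not say which plane detector produced the colored result. As a rough illustration of step 3 only, the sketch below assumes a plain sequential RANSAC plane fit (RANSAC appears elsewhere in this deck as a baseline); the function names, thresholds and iteration counts are illustrative assumptions, not the method used here.

```python
import numpy as np


def fit_plane_ransac(points, n_iters=200, threshold=0.02, seed=0):
    """Return (normal, d, inlier_mask) for the plane n.x + d = 0 with most inliers."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    best_plane = (np.array([0.0, 0.0, 1.0]), 0.0)
    for _ in range(n_iters):
        # Three random points define a candidate plane.
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:  # degenerate (collinear) sample
            continue
        normal = normal / norm
        d = -normal.dot(p0)
        # Inliers are points closer to the plane than the threshold.
        mask = np.abs(points @ normal + d) < threshold
        if mask.sum() > best_mask.sum():
            best_mask, best_plane = mask, (normal, d)
    return best_plane[0], best_plane[1], best_mask


def segment_planes(points, max_planes=3, min_inliers=50, **kwargs):
    """Label each point with a plane index, or -1 for 'no plane'."""
    labels = np.full(len(points), -1, dtype=int)
    remaining = np.arange(len(points))
    for plane_id in range(max_planes):
        if len(remaining) < 3:
            break
        _, _, mask = fit_plane_ransac(points[remaining], **kwargs)
        if mask.sum() < min_inliers:
            break
        # Assign the inliers to this plane and continue with the rest.
        labels[remaining[mask]] = plane_id
        remaining = remaining[~mask]
    return labels
```

Points labeled -1 play the role of the black "no plane" points. Once a wall plane is found, its inliers can be replaced by a bounding polygon, which is where the large reduction in point count would come from.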
PhD student at the Center for Machine Vision Research in the University of Oulu.
11. First level Pre-processing just use existing code
Point Cloud Denoising via Moving RPCA
E Mattei, A Castrodad, 2016 - Computer Graphics Forum - Wiley Online Library
Walls become much more planar in top view
Optimize consolidation
for point clouds
Screened Poisson Reconstruction
(https://github.com/mkazhdan/PoissonRecon, C++ code)
CGAL, Point Set Processing
http://doc.cgal.org/latest/Point_set_processing_3/
http://vcc.szu.edu.cn/research/2013/EAR/
Deep points consolidation - ACM Digital Library
by S Wu - 2015
EAR/WLOP CODE AVAILABLE
in CGAL as illustrated below
Consolidation of Low-quality Point
Clouds from Outdoor Scenes
12. First level Pre-processing Motivation for image-naïve people
https://www.youtube.com/watch?v=BlDl6M0go-c
Background on BM3D (and later BM4D), developed at Tampere
University of Technology: the state-of-the-art denoising algorithm,
at least prior to deep learning denoisers
Images are always estimates of the “real images”, like any measurement in general. For the computer, a photo of a black circle on a white background
is in practice not composed of only two colors: it is corrupted by noise and blur, so quantitative image analysis can be facilitated by image restoration
pre-processing algorithms. We also want to use ROBUST ALGORITHMS that perform well with low-resolution and noisy point clouds (think of Google Tango scans
or even more professional laser scanners / LIDARs)
Lu et al. (2016), https://doi.org/10.1109/TVCG.2015.2500222
“BM3D for point clouds”
Patch-Collaborative Spectral Point-Cloud Denoising
http://doi.org/10.1111/cgf.12139
You can visualize the removed noise with Hausdorff distance for example
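A minimal sketch of that visualization idea, assuming the noisy and denoised clouds are available as (N, 3) NumPy arrays: the per-point nearest-neighbour distance is what you would map to a color scale, and its maximum over both directions is the Hausdorff distance. The function names are illustrative, and the brute-force search is only practical for small clouds; a spatial index (for example a KD-tree) would be needed at the million-point scale.

```python
import numpy as np


def nearest_distances(source, target):
    """Distance from each source point to its nearest target point (brute force)."""
    diff = source[:, None, :] - target[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(nearest_distances(a, b).max(), nearest_distances(b, a).max())


# Toy example: the "denoised" cloud moved one point by 0.5 along z.
noisy = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.5]])
denoised = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
per_point_error = nearest_distances(noisy, denoised)  # values to color-code
```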
http://dx.doi.org/10.1111/cgf.12802
http://staff.ustc.edu.cn/~lgliu/Publications/Publications/2015_SMI_QualityPoint.pdf
14. First level plane segmentation in practice #1
https://tams.informatik.uni-hamburg.de/people/alumni/xiao/publications/Xiao_RAS2013.pdf
junhaoxiao/TAMS-Planar-Surface-Based-Perception
3D perception code developed at TAMS (http://tams.informatik.uni-hamburg.de/) by
Junhao Xiao and others, including point cloud plane segmentation, planar segment
area calculation, scan registration based on planar segments, etc.
The following libraries will also help if
something is missing from the
implementation above
● CGAL 4.9 - Point Set Processing: User Manual
● PCL - Point Cloud Library (PCL)
● PDAL - Point Data Abstraction Library - pdal.io
For ICC and BIM processing the
VOLVOX plugin for Rhino seemed
interesting
https://github.com/DURAARK
http://papers.cumincad.org/data/works/att/ecaade2016_171.pdf
16. First level plane segmentation in practice #2
http://dx.doi.org/10.1117/1.JEI.24.5.051008
Furthermore, various enhancements are applied to improve the segmentation quality. The GPU implementation of the proposed algorithm segments depth images into
planes at the rate of 58 fps. Our pipeline-interleaving technique increases this rate up to 100 fps. With this throughput rate improvement, the application benefit of our
algorithm may be further exploited in terms of quality and enhancing the localization
17. First level Shape representations
Data-driven shape processing and modeling provides a promising
solution to the development of “big 3D data”. Two major ways of 3D
data generation, 3D sensing and 3D content creation, populate 3D
databases with fast growing amount of 3D models. The database
models are sparsely enhanced with manual segmentation and
labeling, as well as reasonably organized, to support data-driven
shape analysis and processing, based on, e.g., machine learning
techniques. The learned knowledge can in turn support efficient 3D
reconstruction and 3D content creation, during which the
knowledge can be transferred to the newly generated data. Such 3D
data with semantic information can be included into the database to
enrich it and facilitate further data-driven applications.
https://arxiv.org/abs/1502.06686
18. Synthesis Modular blocks as cloud microservices?
POINT CLOUD
2D Floorplan
3D CAD Model
Denoising
Consolidation
Upsampling
Planar Segmentation
TAMS
Simplification WLOP
DeepPoints
Bilateral
CGAL
Metafile
‘Image restoration’ pipeline
Not every block before planar segmentation is
necessarily needed, and ‘pre-processing’
could be bypassed
Only to be run from the cloud?
Start from existing libraries and implementations?
See the details in the previous slides.
Each block has code available, so no new
code needs to be written to get to an MVP
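One way to picture the modular, bypassable blocks is a list of named stages that can be skipped individually. The sketch below is only a toy: the stage functions are stand-ins for wrappers around the existing implementations listed on the previous slides, and all names (`run_pipeline`, `STAGES`, the stage labels) are assumptions made for illustration.

```python
def run_pipeline(points, stages, skip=()):
    """Apply each (name, function) stage in order, bypassing names listed in skip."""
    applied = []
    for name, stage in stages:
        if name in skip:
            continue
        points = stage(points)
        applied.append(name)
    return points, applied


# Placeholder stages: each would wrap an existing implementation in practice.
def denoise(points):
    return [p for p in points if abs(p[2]) < 10.0]  # toy outlier rejection


def consolidate(points):
    return sorted(set(points))  # toy duplicate removal


def planar_segmentation(points):
    return [(p, "plane" if p[2] == 0.0 else "other") for p in points]


STAGES = [
    ("denoising", denoise),
    ("consolidation", consolidate),
    ("planar_segmentation", planar_segmentation),
]

cloud = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 1.0, 50.0), (1.0, 0.0, 2.0)]
full, _ = run_pipeline(cloud, STAGES)
bypassed, _ = run_pipeline(cloud, STAGES, skip=("denoising", "consolidation"))
```

Keeping every block behind the same points-in, points-out interface is what would let each one be swapped, shuffled or deployed as a separate service and benchmarked against ground truth.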
SHUFFLE
Ground truth
Benchmark performance
- Accuracy
- Computation speed
- Robustness
21. General Motivation #1
Where VR is going beyond this project
Carlos E. Perez, Software Architect - Design Patterns for Deep Learning
Architectures Written Aug 9
Yes.
(1) MagicLeap has known to be hiring Deep Learning experts for its
Augmented Reality system. They are known to use Movidius as their
chip which is a deep learning vision processor.
(2) Gesture recognition can be done via deep learning.
(3) Voice identification seems to have an importance in a VR
context.
See: Design Patterns for Deep Learning Architectures : Applications
https://techcrunch.com/2016/10/28/magic-leap-goes-to-finland-in-pursuit-of-nordic-vr-and-ar-talent/
http://www.forbes.com/sites/davidewalt/2016/11/02/inside-magic-leap-the-secretive-4-5-billion-startup-changing-computing-forever/#2f9365e5e83f
https://www.wired.com/2016/04/magic-leap-vr/
22. General Motivation #2
Where 3D is going beyond this project
http://jobsearch.scania.com/segerjoblist/presentation.aspx?presGrpId=9470&langId=1&ie=False
http://www.sensorsmag.com/seventh-sense-blog/artificial-intelligence-autonomous-driving-24333
Viorica Pătrăucean, Ph.D: "BIM for existing infrastructure"
http://www-smartinfrastructure.eng.cam.ac.uk/files/generating-bim-models-for-existing-assets
23. General Motivation #2b
Where 3D: Autonomous driving
https://www.youtube.com/watch?v=4zOqJK-_GAk
Automatic object detection and removal from 3D point clouds
by Oxbotica
Francis Engelmann, Jörg Stückler and Bastian Leibe
Computer Vision Group, RWTH Aachen University
https://www.youtube.com/watch?v=YebCdz7QsRs
24. General Motivation #2C
Where 3D: building information models (BIM)
http://www.spar3d.com/news/lidar/paracosms-new-handheld-lidar-scanner-built-construction-analytics/
GeoSLAM is playing into this trend with the release of their ZEB-CAM, an
add-on for the company’s ZEB-REVO handheld indoor mapper that
captures imagery at the same time as 3D scan data.
The data captured by the two sensors is fully synchronized, and users can
view the results side by side in GeoSLAM’s desktop software. Click a spot
in the scan, and the associated imagery is displayed. Click a spot in the
imagery, and the associated scan data is displayed.
25. General Motivation #3
Where AR is going beyond this project
http://adas.cvc.uab.es/varvai2016/
This half-day workshop will include invited talks from
researchers at the forefront of modern synthetic data
generation with VAR for VAI
● Learning Transferable Multimodal Representations in VAR, e.g., via deep learning
● Virtual World design for realistic training data generation
● Augmenting real-world training datasets with renderings of 3D virtual objects
● Active & reinforcement learning algorithms for effective training data generation and accelerated learning
Xcede’s Data Science team are collaborating with one
of the world’s foremost image recognition and
augmented reality platforms. Already working with
some of the world's top brands, including Pepsi, Coca-
Cola, Procter & Gamble, General Mills, Anheuser-
Busch, Elle, Glamour, Honda and BMW their mobile
app has been downloaded over 45 million times.
Our client is now looking for a Computer Vision
Researcher to join their Deep Learning R&D team who
can help bring their technology to the next level.
http://www.eetimes.com/author.asp?section_id=36&doc_id=1330958
https://arxiv.org/pdf/1605.09533v1.pdf
26. NIPS 2016: 3D Workshop
Deep learning is proven to be a powerful tool to build
models for language (one-dimensional) and image
(two-dimensional) understanding. Tremendous efforts
have been devoted to these areas, however, it is still
at the early stage to apply deep learning to 3D data,
despite their great research values and broad real-
world applications. In particular, existing methods
poorly serve the three-dimensional data that drives
a broad range of critical applications such as
augmented reality, autonomous driving, graphics,
robotics, medical imaging, neuroscience, and
scientific simulations. These problems have drawn
the attention of researchers in different fields such as
neuroscience, computer vision, and graphics.
The goal of this workshop is to foster interdisciplinary
communication of researchers working on 3D data
(Computer Vision and Computer Graphics) so that
more attention of broader community can be drawn
to 3D deep learning problems. Through those
studies, new ideas and discoveries are expected to
emerge, which can inspire advances in related fields.
This workshop is composed of invited talks, oral
presentations of outstanding submissions and a
poster session to showcase the state-of-the-art
results on the topic. In particular, a panel discussion
among leading researchers in the field is planned, so
as to provide a common playground for inspiring
discussions and stimulating debates.
The workshop will be held on Dec 9 at NIPS 2016 in
Barcelona, Spain. http://3ddl.cs.princeton.edu/2016/
ORGANIZERS
● Fisher Yu - Princeton University
● Joseph Lim - Stanford University
● Matthew Fisher - Stanford University
● Qixing Huang - University of Texas at Austin
● Jianxiong Xiao - AutoX Inc.
http://cvpr2017.thecvf.com/ In Honolulu, Hawaii
“I am co-organizing the 2nd Workshop on Visual Understanding for
Interaction in conjunction with CVPR 2017. Stay tuned for the details!”
“Our workshop on Large-Scale Scene Understanding Challenge is
accepted by CVPR 2017.”
27. Labeling 3d Spaces Semantic Part
Manually labeling 3D scans
→ way too time consuming!
https://arxiv.org/abs/1511.03240
SynthCam3D is a library of synthetic indoor scenes collected from
various online 3D repositories and hosted at
http://robotvault.bitbucket.org
https://arxiv.org/abs/1505.00171
SYNTHETIC DATA
The advantages of synthetic 3D
models cannot be overstated,
especially when considering scenes:
once a 3D annotated model is
available, it allows rendering as many
2D annotated views as desired,
Samples of annotated images rendered at various camera
poses for an office scene taken from SynthCam3D
youtube.com/watch?v=cGuoyNY54kU
Existing datasets
NYUv2
28. SYNTHETIC Datasets #1
SynthCam3D is a library of synthetic indoor scenes
collected from various online 3D repositories and
hosted at http://robotvault.bitbucket.org.
Large public repositories (e.g. Trimble Warehouse) of
3D CAD models have existed in the past, but they
have mainly served the graphics community. It is
only recently that we have started to see emerging
interest in synthetic data for computer vision. The
advantages of synthetic 3D models cannot be
overstated, especially when considering scenes:
once a 3D annotated model is available, it allows
rendering as many 2D annotated views as desired,
at any resolution and frame-rate. In comparison,
existing datasets of real data are fairly limited both
in the number of annotations and the amount of
data. NYUv2 provides only 795 training images for
894 classes; hence learning any meaningful features
characterising a class of objects becomes
prohibitively hard.
https://arxiv.org/abs/1505.00171
29. SYNTHETIC Datasets #2
Creating large datasets with pixelwise semantic labels is known to be very challenging
due to the amount of human effort required to trace accurate object boundaries.
High-quality semantic labeling was reported to require 60 minutes per image for the
CamVid dataset and 90 minutes per image for the Cityscapes dataset. Due to the
substantial manual effort involved in producing pixel-accurate annotations, semantic
segmentation datasets with precise and comprehensive label maps are orders of
magnitude smaller than image classification datasets. This has been referred to as the
“curse of dataset annotation”: the more detailed the semantic labeling, the smaller
the datasets.
Somewhat orthogonal to our work is the use of indoor scene models to train deep
networks for semantic understanding of indoor environments from depth images [15, 33].
These approaches compose synthetic indoor scenes from object models and
synthesize depth maps with associated semantic labels. The training data synthesized
in these works provides depth information but no appearance cues. The trained
models are thus limited to analyzing depth maps.
[15] SynthCam3D (previous slide)
[33]
30. Deep Learning Problems
Data columns: x, y, z, red, green, blue
Point clouds can be huge
• Voxelization of the scene is impossible in practice without severe downsampling / discretization
• Mesh/surface reconstruction increases the amount of data as well
How to handle massive datasets in deep learning?
Simplify (primitive-based reconstruction) before semantic segmentation?
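To make the downsampling / discretization trade-off concrete, here is a minimal voxel-grid downsampling sketch. It assumes the cloud is an (N, 6) NumPy array with the columns listed above (x, y, z, red, green, blue) and replaces all points in a voxel by their mean; the function name and the choice of averaging are illustrative assumptions.

```python
import numpy as np


def voxel_downsample(cloud, voxel_size):
    """Replace all points falling in the same voxel by their mean.

    cloud: (N, 6) array with columns x, y, z, red, green, blue.
    """
    # Integer voxel index of every point.
    keys = np.floor(cloud[:, :3] / voxel_size).astype(np.int64)
    # Group rows by voxel index.
    _, inverse, counts = np.unique(
        keys, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.reshape(-1)  # shape differs between NumPy versions
    sums = np.zeros((len(counts), cloud.shape[1]))
    np.add.at(sums, inverse, cloud)  # accumulate coordinates and colors per voxel
    return sums / counts[:, None]


cloud = np.array([
    [0.1, 0.1, 0.1, 10.0, 20.0, 30.0],
    [0.3, 0.3, 0.3, 30.0, 40.0, 50.0],
    [2.5, 0.0, 0.0, 0.0, 0.0, 0.0],
])
reduced = voxel_downsample(cloud, voxel_size=1.0)  # the first two points merge
```

A larger `voxel_size` shrinks the data faster but blurs thin structures, which is the "severe downsampling" cost mentioned above.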
https://github.com/btgraham/SparseConvNet
https://ei.is.tuebingen.mpg.de
https://arxiv.org/abs/1605.06240
This can be used to analyse 3D models, or space-time
paths. Here are some examples from a
3D object dataset. The insides are hollow, so the data is
fairly sparse. The computational complexity of
processing the models is related to the
fractal dimension of the underlying objects.
https://arxiv.org/abs/1503.04949
https://github.com/MPI-IS/bilateralNN
doi:10.1111/j.1467-8659.2009.01645.x
Can't use 3D CNNs
Try alternative schemes
no normals
31. Point clouds with deep learning: example with Normals
Eurographics Symposium on Geometry Processing 2016, Volume 35 (2016), Number 5
http://dx.doi.org/10.1111/cgf.12983
Convolutional neural networks. Work on normal estimation with CNNs focuses on
using as input RGB images, or possibly RGB-D, but not sparse data such as
unstructured 3D point clouds. CNN-based techniques have been applied to 3D
data though, but with a voxel-based perspective, which is not accurate enough
for normal estimation. Techniques to efficiently apply CNN-based methods to
sparse data have been proposed too [Gra15], but they mostly focus on efficiency
issues, to exploit sparsity; applications are 3D object recognition, again with voxel-
based granularity, and analysis of space-time objects. An older, neuron-inspired
approach [JIS03] is more relevant to normal estimation in 3D point clouds but it
actually addresses the more difficult task of meshing. It uses a stochastic
regularization based on neighbors, but the so-called “learning process” actually is
just a local iterative optimization.
CNNs can also address regression problems such as object pose estimation
[PCFG12]. These same properties seem appropriate as well for the task of learning
how to estimate normals, including in the presence of noise and when several
normal candidates are possible near sharp features of the underlying surface.
The question, however, is how to interpret the local neighborhood of a 3D point
as an image-like input that can be fed to a CNN. If the point cloud is structured, as
given by a depth sensor, the depth map is a natural choice as CNN input. But if the
point cloud is unstructured, it is not clear what to do. In this case, we propose to
associate an image-like representation to the local neighborhood of a 3D point via a
Hough transform. In this image, a pixel corresponds to a normal direction, and its
intensity measures the number of votes for that direction; besides, pixel adjacency
relates to closeness of directions. It is a planar map of the empirical probability of
the different possible directions. Then, just as a CNN for ordinary images can exploit
the local correlation of pixels to denoise the underlying information, a CNN for
these Hough-based direction maps might also be able to handle noise, identifying a
flat peak around one direction. Similarly, just as a CNN for images can learn a robust
recognizer, a CNN for direction maps might be able to make uncompromising
decisions near sharp features, when different normals are candidate, opting for one
specific direction rather than trading off for an average, smoothed normal.
Moreover, outliers can be ignored in a simple way by limiting the size of the
neighborhood, thus reducing or preventing the influence of points lying far from a
more densely sampled surface.
Makes it computationally feasible
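The image-like Hough representation described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes random point triplets vote for a normal direction, bins the (nx, ny) components of the upward-oriented normal directly, and uses made-up function names and bin counts. The paper feeds such an accumulator to a CNN; the `peak_normal` helper shows only the classic, non-learned readout.

```python
import numpy as np


def hough_accumulator(neighbors, n_triplets=1000, n_bins=33, seed=0):
    """Vote for normal directions of random point triplets in a local neighborhood.

    Returns an (n_bins, n_bins) image whose pixel (i, j) holds the share of votes
    for normals with (nx, ny) in that bin; normals are flipped so that nz >= 0.
    """
    rng = np.random.default_rng(seed)
    acc = np.zeros((n_bins, n_bins))
    for _ in range(n_triplets):
        p0, p1, p2 = neighbors[rng.choice(len(neighbors), size=3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        length = np.linalg.norm(n)
        if length < 1e-12:  # collinear triplet, no plane defined
            continue
        n = n / length
        if n[2] < 0:
            n = -n
        # Map nx, ny from [-1, 1] to integer bin indices.
        i, j = np.minimum(((n[:2] + 1.0) / 2.0 * n_bins).astype(int), n_bins - 1)
        acc[i, j] += 1
    return acc / max(acc.sum(), 1.0)


def peak_normal(acc):
    """Classic (non-learned) estimate: the centre of the most voted bin."""
    n_bins = acc.shape[0]
    i, j = np.unravel_index(np.argmax(acc), acc.shape)
    nx, ny = ((np.array([i, j]) + 0.5) / n_bins) * 2.0 - 1.0
    nz = np.sqrt(max(0.0, 1.0 - nx * nx - ny * ny))
    return np.array([nx, ny, nz])
```

Near a sharp edge the accumulator would show two peaks, one per adjacent face, which is the situation where a learned readout is expected to pick one direction instead of averaging.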
32. Literature Indoor point cloud segmentation with deep learning
http://robotvault.bitbucket.org/scenenet-rgbd.html
http://delivery.acm.org/10.1145/3020000/3014008
https://pdfs.semanticscholar.org/1ce8/1a2c8fa5731db944bfb57c9e7e8eb0fc5bd2.pdf
https://arxiv.org/pdf/1612.00593v1.pdf
33. Unsupervised deep learning segmentation
http://staff.ustc.edu.cn/~lgliu/
http://dx.doi.org/10.1016/j.cagd.2016.02.015
34. Second level deep learning in Practice
btgraham/SparseConvNet
C++ Spatially-sparse convolutional networks. Allows processing of
sparse 2, 3 and 4 dimensional data. Build CNNs on the
square/cubic/hypercubic or triangular/tetrahedral/hyper-tetrahedral
lattices
gangiman/PySparseConvNet
Python wrapper for SparseConvNet
in practice
http://3ddl.cs.princeton.edu/2016/slides/notchenko.pdf
Update the old-school machine
learning approach to modern
deep learning. Reconstruct
the planar shapes using a
database of CAD models
(ModelNet)?
Requires some work for sure
http://staff.ustc.edu.cn/~juyong/DictionaryRecon.html
MOTIVATION
3dmodel_feature Code for extracting
3dcnn features of CAD models
35. Point cloud pipeline 2nd Step, “Deeplearnify”
Denoising
Consolidation
Upsampling
Planar Segmentation
Simplification
2D Unstructured 3D
Rough correspondences from the more established 2D deep learning world
btgraham/SparseConvNet
gangiman/PySparseConvNet
Python wrapper for SparseConvNet
3dmodel_feature Code for extracting 3dcnn
features of CAD models
Sparse libraries only as starting points
https://arxiv.org/abs/1503.04949
https://github.com/MPI-IS/bilateralNN
http://arxiv.org/abs/1607.02005
Andrew Adams, Jongmin Baek, Myers Abraham Davis
May 2010, http://dx.doi.org/10.1111/j.1467-8659.2009.01645.x
36. 3D SHAPE representations #1: VRN Ensemble, ModelNet40: 95.54% Accuracy – The STATE-OF-THE-ART!
For this work, we select the Variational Autoencoder (VAE), a probabilistic
framework that learns both an inference network to map from an input space to
a set of descriptive latent variables, and a generative network that maps from
the latent space back to the input space.
Our model, implemented in Theano with Lasagne comprises an encoder
network, the latent layer, and a decoder network, as displayed in Figure 1.
https://arxiv.org/abs/1608.04236
https://github.com/ajbrock/Generative-and-Discriminative-Voxel-Modeling
37. 3D SHAPE representations #2: Probing filters
https://arxiv.org/abs/1605.06240
https://github.com/yangyanli/FPNN
Created by Yangyan Li, Soeren Pirk, Hao Su, Charles Ruizhongtai Qi,
and Leonidas J. Guibas from Stanford University.
Building discriminative representations for 3D data has been an important task
in computer graphics and computer vision research. Unfortunately, the
computational complexity of 3D CNNs grows cubically with respect to voxel
resolution. Moreover, since most 3D geometry representations are boundary
based, occupied regions do not increase proportionately with the size of the
discretization, resulting in wasted computation.
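A small back-of-the-envelope sketch of this point, assuming a sphere surface as a stand-in for a boundary-based shape: the total voxel count grows cubically with resolution while the occupied shell grows only roughly quadratically, so the occupied fraction keeps shrinking. The function name and the `radius_fraction` parameter are illustrative.

```python
def shell_occupancy(resolution, radius_fraction=0.4):
    """Count voxels of a resolution^3 grid that touch a sphere surface."""
    radius = radius_fraction * resolution
    centre = resolution / 2.0
    occupied = 0
    for x in range(resolution):
        for y in range(resolution):
            for z in range(resolution):
                # Distance from the voxel centre to the centre of the grid.
                d = ((x + 0.5 - centre) ** 2 + (y + 0.5 - centre) ** 2
                     + (z + 0.5 - centre) ** 2) ** 0.5
                # Occupied if the surface passes within half a voxel.
                if abs(d - radius) <= 0.5:
                    occupied += 1
    return occupied, resolution ** 3


for res in (16, 32, 64):
    occ, total = shell_occupancy(res)
    print(res, occ, total, round(occ / total, 4))
```

Every doubling of the resolution multiplies the dense grid by eight, but most of the added voxels are empty, which is the wasted computation a dense 3D CNN pays for.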
In this work, we represent 3D spaces as volumetric fields, and propose a novel design that employs field
probing filters to efficiently extract features from them.
Our learning algorithm optimizes not only the weights associated with the probing points, but also their
locations, which deforms the shape of the probing filters and adaptively distributes them in 3D space. The
optimized probing points sense the 3D space “intelligently”, rather than operating blindly over the entire
domain. We show that field probing is significantly more efficient than 3D CNNs, while providing
state-of-the-art performance on classification tasks for 3D object recognition benchmark datasets.
38. Point cloud pipeline in practice #1 Serial pipeline
Simplify
Planar segmentation
Object detection
Rares Ambrus
Robotics Perception and Learning (RPL), KTH
39. Point cloud pipeline in practice #2 Joint pipeline
http://dx.doi.org/10.1016/j.neucom.2015.08.127
http://ai.stanford.edu/~quocle/tutorial2.pdf
41. Point Cloud Processing #1
http://www.thomaswhelan.ie/Whelan14ras.pdf | http://dx.doi.org/10.1016/j.robot.2014.08.019
http://www.cs.nuim.ie/research/vision/data/ras2014
"Incremental and Batch Planar Simplification of Dense Point Cloud Maps" by T.
Whelan, L. Ma, E. Bondarev, P. H. N. de With, and J.B. McDonald in Robotics and
Autonomous Systems ECMR ’13 Special Issue, 2014.
https://www.youtube.com/watch?v=uF-I-xF3Rk0
3DReshaper® is a tool to process 3D point clouds wherever they come from: 3D
scanners, laser scanning, UAVs, or any other digitization device... Whatever your point
cloud processing challenges are 3DReshaper has the tools you need. You can import one
or several point clouds whatever their origin and size. Point cloud preparation is often the
most important step to handle in order to save time with the subsequent steps (i.e.
meshing).
That is why 3DReshaper provides a complete range of simple but powerful functions to
process point clouds like:
• Import without real limit of the imported number of points
• Clever reduction to keep best points and remove points only where density is the highest
• Automatic segmentation
• Automatic or manual separation and cleaning
• Extraction of best points evenly spaced, density homogenization
• Automatic noise measurement reduction
• Colors according to a given direction
• Fusion
• Registration, Alignment and Best Fit
• 3D comparison with a mesh or a CAD model
• Planar sections
• Best geometrical shapes extraction (planes, cylinders, circles, spheres, etc.)
• Several representation modes: textured, shaded, intensity (information depending on the imported data)
3dreshaper.com
42. Point Cloud Processing #2
Notice that in Fig.5a, the baseline data structure uses nearly 300MB of RAM
whereas the spatial hashing data structure never allocates more than 47MB of
RAM for the entire scene, which is a 15 meter long hallway.
Memory usage statistics (Fig. 5b) reveal that when all of the depth data is
used (including very far away data from the surrounding walls), a baseline fixed
grid data structure (FG) would use nearly 2GB of memory at a 2cm resolution,
whereas spatial hashing with 16 × 16 × 16 chunks uses only around 700MB.
When the depth frustum is cut off at 2 meters (mapping only the desk
structure without the surrounding room), spatial hashing uses only 50MB of
memory, whereas the baseline data structure would use nearly 300MB. We
also found running marching cubes on a fixed grid, rather than
incrementally on spatially-hashed chunks, to be prohibitively slow.
robotics.ccny.cuny.edu
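The memory gap between a fixed grid and spatially hashed chunks can be illustrated with a toy model in which a Python dict plays the role of the hash table. The class and function names are assumptions for illustration, and the chunk payload is a set of voxel indices instead of the dense per-chunk array a real implementation would allocate.

```python
import math

CHUNK = 16  # voxels per chunk edge, matching the 16 x 16 x 16 chunks quoted above


class ChunkedVoxelMap:
    """Sparse voxel map: chunks are allocated only where points actually fall."""

    def __init__(self, voxel_size):
        self.voxel_size = voxel_size
        self.chunks = {}  # (cx, cy, cz) -> set of occupied local voxel indices

    def insert(self, x, y, z):
        vx, vy, vz = (int(c // self.voxel_size) for c in (x, y, z))
        key = (vx // CHUNK, vy // CHUNK, vz // CHUNK)
        local = (vx % CHUNK, vy % CHUNK, vz % CHUNK)
        self.chunks.setdefault(key, set()).add(local)

    def allocated_voxels(self):
        # A real implementation would allocate a dense CHUNK^3 array per chunk.
        return len(self.chunks) * CHUNK ** 3


def fixed_grid_voxels(extent, voxel_size):
    """Voxels a dense grid needs to cover a box of the given (x, y, z) extent in metres."""
    total = 1
    for length in extent:
        total *= math.ceil(length / voxel_size)
    return total


# A single scanned line of surface points inside a 64 m cube.
vmap = ChunkedVoxelMap(voxel_size=0.25)
for i in range(256):
    vmap.insert(i * 0.25, 0.0, 0.0)
sparse_cost = vmap.allocated_voxels()
dense_cost = fixed_grid_voxels((64.0, 64.0, 64.0), 0.25)
```

The sparse cost scales with the observed surface, while the dense cost scales with the bounding volume, which is why cutting the depth frustum changes the two numbers so differently in the quoted statistics.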
43. Point cloud Segmentation Introduction
http://dx.doi.org/10.1016/j.cag.2015.11.003
There are three kinds of methods for point cloud segmentation [14]. The first
type is based on primitive fitting [3], [15] and [5]. It is hard for these methods
to deal with objects with complex shape.
The second kind of techniques is the region growing method. Nan et al. [2]
propose a controlled region growing process which searches for meaningful
objects in the scene by accumulating surface patches with high classification
likelihood. Berner et al. [16] detect symmetric regions using region growing.
Another line of methods formulates the point cloud segmentation as a Markov
Random Field (MRF) or Conditional Random Field (CRF) problem [4], [17] and
[14]. A representative random field segmentation method is the min-cut
algorithm [17]. The method extracts foreground from background through
building a KNN graph over which min-cut is performed. The shortcoming of
min-cut algorithm is that the selection of seed points relies on human
interaction. We extend the min-cut algorithm by first generating a set of object
hypotheses via multiple binary min-cuts and then selecting the most probable
ones based on a voting scheme, thus avoiding the seed selection.
Plane extraction from the point cloud
of a tabletop scene by using our
method (a) and RANSAC based
primitive fitting (b), respectively.
While our method can segment out
the supporting plane accurately,
RANSAC missed some points due to
the thin objects.
An overview of our algorithm. We first over-segment the scene and extract the supporting
plane on the patch graph, then segment the scene into segments and represent the whole
scene using a segment graph (a). To obtain the contextual information, we train a set of
classifiers for both single objects and object groups using multiple kernel learning (b). The
classifiers are used to group the segments into objects or object groups (c).
45. Point cloud Segmentation Example #2
http://dx.doi.org/10.1109/TGRS.2016.2551546
Principal component analysis (PCA)-based local saliency features,
e.g., normal and curvature, have been frequently used in many
ways for point cloud segmentation.
However, PCA is sensitive to outliers; saliency features from
PCA are non-robust and inaccurate in the presence of outliers;
consequently, segmentation results can be erroneous and
unreliable. As a remedy, robust techniques, e.g., RANdom
SAmple Consensus (RANSAC), and/or robust versions of PCA
(RPCA) have been proposed. However RANSAC is influenced by
the well-known swamping effect, and RPCA methods are
computationally intensive for point cloud processing.
We propose a region growing based robust segmentation
algorithm that uses a recently introduced maximum consistency
with minimum distance based robust diagnostic PCA (RDPCA)
approach to get robust saliency features.
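To show what the PCA-based saliency features are and why outliers hurt them, here is a minimal sketch of plain (non-robust) PCA on a local neighbourhood. It is not the RDPCA approach proposed in the paper; the function name and the surface-variation definition of curvature are common conventions assumed here for illustration.

```python
import numpy as np


def pca_saliency(neighbors):
    """Return (normal, curvature) of a local neighbourhood via plain PCA.

    normal: eigenvector of the covariance matrix with the smallest eigenvalue.
    curvature: surface variation lambda_0 / (lambda_0 + lambda_1 + lambda_2).
    """
    centred = neighbors - neighbors.mean(axis=0)
    cov = centred.T @ centred / len(neighbors)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    normal = eigvecs[:, 0]
    curvature = eigvals[0] / max(eigvals.sum(), 1e-12)
    return normal, curvature


# A flat 5 x 5 patch on z = 0, then the same patch with a single outlier.
grid = np.linspace(-1.0, 1.0, 5)
patch = np.array([[x, y, 0.0] for x in grid for y in grid])
clean_normal, clean_curvature = pca_saliency(patch)
dirty_normal, dirty_curvature = pca_saliency(np.vstack([patch, [[0.0, 0.0, 5.0]]]))
```

On the clean patch the normal is along z and the curvature is essentially zero; with one far outlier the direction of smallest variance moves into the xy-plane, so the estimated normal ends up perpendicular to the true one. This is the kind of failure that motivates robust variants.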
Many methods have been developed to improve the quality of
segmentation in PCD that can be grouped into three main
categories: 1) edge/border based; 2) region growing based; and 3)
hybrid. In edge/border based methods, points on edges/ borders are
detected, a border linkage process constructs the continuous
edge/border, and then points are grouped within the identified
boundaries and connected edges. Castillo et al. [14] stated that, due
to noise or uneven point distributions, such methods often detect
disconnected edges, which make it difficult for a filling or an
interpretation procedure to identify closed segments.
46. Point cloud Segmentation Example #3
http://dx.doi.org/10.3390/rs5020491
In the future, we will validate the method further with a large number of trunk and
branch measurements from real trees. We will also develop the method further, e.g.,
by utilizing generalized cylinder shapes. Together with the computational method
presented, laser scanning provides a fast and efficient means to collect essential
geometric and topological data from trees, thereby substantially increasing the
available data from trees.
https://www.youtube.com/watch?v=PKHJQeXJEkU
47. Point cloud Segmentation Example #4
http://dx.doi.org/10.1016/j.isprsjprs.2015.01.016
http://dx.doi.org/10.1016/j.cag.2016.01.004
48. Cad Primary fitting traditional methods #1
[SWK07] http://dx.doi.org/10.1111/j.1467-8659.2007.01016.x; Cited by 680
[SDK09] http://dx.doi.org/10.1111/j.1467-8659.2009.01389.x; Cited by 63
http://dx.doi.org/10.1111/cgf.12802
49. Cad Primary fitting traditional methods #2
http://dx.doi.org/10.2312/egsh.20151001
http://dx.doi.org/10.1016/j.cag.2014.07.005
The main phases of our algorithm: from the input model (a) we robustly
extract candidate walls (b). These are used to construct a cell complex in
the 2D floor plane. From this we obtain a partitioning into individual
rooms (c) and finally the individual room polyhedra (d). Note that in (a) the
ceiling has been removed for the sake of visual clarity.
51. Cad Primary fitting traditional methods #4
http://dx.doi.org/10.1080/17538947.2016.1143982
http://dx.doi.org/10.1007/s41095-016-0041-9
An overview of the proposed approach. Starting from an imperfect point cloud (a) of a
building, we first extract and refine planar segments (b) from the point cloud, and build a
dense mesh model (c) using existing techniques. Then, we use the extracted planar
segments to partition the space of the input point cloud into axis-aligned cells (i.e.
candidate boxes). (d) shows the overlay of the candidate boxes on the dense mesh
model. After that, appropriate boxes (e) are selected based on binary linear programming
optimization. Finally, a lightweight 3D model (f) is assembled from the chosen boxes.
We conclude by observing that state of the art methods
for quadric fitting give reasonable results on noisy point
clouds. Our algorithm provides a means to enforce a
prior, allowing the algorithm to better fit the quadric that
the points were drawn from, particularly when there is
missing data or a large amount of noise. This has a
practical use for real datasets, since it allows a user to
specify the curvature of the surface where there are few
points available.
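As a baseline for comparison, an unconstrained algebraic quadric fit can be sketched as follows. This is not the prior-based algorithm described in the conclusion above; it is the plain least-squares fit that takes the smallest right singular vector of the monomial design matrix. The function name fit_quadric and the noise-free sphere samples are illustrative assumptions.

```python
import numpy as np

def fit_quadric(points):
    """Return 10 coefficients of
    a x^2 + b y^2 + c z^2 + d xy + e xz + f yz + g x + h y + i z + j = 0."""
    x, y, z = points.T
    design = np.column_stack([x * x, y * y, z * z,
                              x * y, x * z, y * z,
                              x, y, z, np.ones_like(x)])
    # The right singular vector with the smallest singular value
    # minimizes |design @ q| subject to |q| = 1.
    _, _, vt = np.linalg.svd(design, full_matrices=False)
    return vt[-1]

rng = np.random.default_rng(0)
# Hypothetical data: 500 noise-free samples on the unit sphere.
directions = rng.normal(size=(500, 3))
sphere = directions / np.linalg.norm(directions, axis=1, keepdims=True)
coeffs = fit_quadric(sphere)
coeffs = coeffs / coeffs[0]   # normalize so the x^2 coefficient is 1
print(np.round(coeffs, 6))
```

With heavy noise or missing data this unconstrained fit becomes unreliable, which is the situation a curvature prior is meant to address.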
52. Cad Primary fitting traditional methods #5
http://dx.doi.org/10.1109/TVCG.2015.2461163
http://dx.doi.org/10.1016/j.cviu.2016.06.004
53. Cad Primary fitting traditional methods #6
URN: urn:nbn:se:kth:diva-173832
http://dx.doi.org/10.1111/cgf.12720
54. PCL Point Cloud Library
http://pointclouds.org/
What is it?
The Point Cloud Library (PCL) is a standalone, large scale, open project for
2D/3D image and point cloud processing.
PCL is released under the terms of the BSD license, and thus free for
commercial and research use. We are financially supported by a
consortium of commercial companies, with our own non-profit
organization, Open Perception. We would also like to thank individual
donors and contributors that have been helping the project.
Presentations
The following presentations describe certain aspects or modules of PCL and were
assembled by former research interns at Willow Garage.
• Ryohei Ueda's presentation on Tracking 3D objects with Point Cloud Library
• Jochen Sprickerhof's presentation on Large Scale 3D Point Cloud Mapping in PCL
• Aitor Aldoma's presentation on Clustered Viewpoint Feature Histograms (CVFH) for object recognition
• Julius Kammerl's presentation on Point Cloud Compression
• Dirk Holz's presentation on the PCL registration framework (pre 1.0)
• Rosie Li's presentation on surface reconstruction
• Kai Wurm's presentation on 3D mapping with octrees
56. Mesh Simplification
Huang et al. 2013
http://web.siat.ac.cn/~huihuang/EAR/EAR_page.html
CGAL, Point Set Processing
http://doc.cgal.org/latest/Point_set_processing_3/
Wei et al. (2015)
Borouchaki and Frey (2005)
QSlim Simplification Software, http://www.cs.cmu.edu/~./garland/quadrics/qslim.html
Monette-Theriault (2014): “The Matlab wrapper of QSlim is adapted from [19] and the same platform
is ... Using the initial mesh as a reference”
59. Mesh “re-Reconstruction” #2
This article presents a novel approach for constructing manifolds over meshes. The local geometry is represented
by a sparse representation of a dictionary consisting of redundant atom functions. A compatible sparse
representation optimization is proposed to guarantee the global compatibility of the manifold.
Future work. The Sparse-Land model generalized on the manifold is fascinating because of its universality and
flexibility, which make many of the geometric processing tasks clear and simple, and the superior performance to
which it leads in various applications. As with sparse coding in image processing, we can also apply our
framework of compatible sparse representations to various tasks in geometric processing, e.g., reconstruction,
inpainting, denoising, and compression. We believe that some of the extensions are feasible but not
straightforward.
http://dx.doi.org/10.1111/cgf.12821 | https://www.youtube.com/watch?v=jhgjiQoQxa0
First published: 27 May 2016: www.researchgate.net
60. Mesh “re-Reconstruction” #4: Sparsity
http://dx.doi.org/10.1016/j.cad.2016.05.013
https://arxiv.org/abs/1411.3230
http://arxiv.org/abs/1505.02890
Data is sparse if most sites take the value zero. For example, if a loop of string has
a knot in it, and you trace the shape of the string in a 3D lattice, most sites will not
form part of the knot (left). Applying a 2x2x2 convolution (middle), and a pooling
operation (right), the set of non-zero sites stays fairly small:
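The effect can be checked numerically with a small sketch, assuming NumPy, a 32x32x32 lattice and a trefoil knot as the "loop of string" (all illustrative choices, not taken from the cited papers). Only the set of sites that can become non-zero is tracked, not actual filter responses.

```python
import numpy as np

def active_after_conv(occupancy, size=2):
    """Sites that a size^3 'valid' convolution can make non-zero (cubic grid)."""
    n = occupancy.shape[0] - size + 1
    out = np.zeros((n, n, n), dtype=bool)
    for dx in range(size):
        for dy in range(size):
            for dz in range(size):
                out |= occupancy[dx:dx + n, dy:dy + n, dz:dz + n]
    return out

def active_after_pool(occupancy, size=2):
    """Sites left non-zero by non-overlapping size^3 max pooling (cubic grid)."""
    n = occupancy.shape[0] // size
    trimmed = occupancy[:n * size, :n * size, :n * size]
    return trimmed.reshape(n, size, n, size, n, size).any(axis=(1, 3, 5))

# Trace a trefoil knot into the lattice.
t = np.linspace(0, 2 * np.pi, 2000, endpoint=False)
curve = np.column_stack([np.sin(t) + 2 * np.sin(2 * t),
                         np.cos(t) - 2 * np.cos(2 * t),
                         -np.sin(3 * t)])
idx = np.floor((curve + 3.5) / 7.0 * 32).astype(int)  # map [-3.5, 3.5) to 0..31
grid = np.zeros((32, 32, 32), dtype=bool)
grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True

conv = active_after_conv(grid)
pooled = active_after_pool(conv)
for name, a in [("input", grid), ("conv", conv), ("pooled", pooled)]:
    print(f"{name}: {a.sum()} of {a.size} sites active ({a.mean():.1%})")
```

Sparse convolutional networks exploit this by storing and processing only the active sites instead of the full dense lattice.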
http://dx.doi.org/10.1016/j.conb.2004.07.007
http://dx.doi.org/10.1016/j.tins.2015.05.005
61. CGAL Computational Geometry Algorithms Library
CGAL is a software project that provides easy access to efficient and reliable geometric algorithms in the form of a C++ library. CGAL is
used in various areas needing geometric computation, such as geographic information systems, computer aided design, molecular
biology, medical imaging, computer graphics, and robotics.
The library offers data structures and algorithms like triangulations, Voronoi diagrams, Boolean operations on polygons and polyhedra,
point set processing, arrangements of curves, surface and volume mesh generation, geometry processing, alpha shapes,
convex hull algorithms, shape analysis, AABB and KD trees...
Learn more about CGAL by browsing through the Package Overview.
http://www.cgal.org/