Manifold Performance

4 minute read.

The primary purpose of my Manifold library is topologically-robust results, as I've already discussed. My secondary purpose is to make it fast by leveraging the power of parallel processing. There is still significant work that can be done to improve performance, but I'm proud enough of its current state to show some benchmark comparisons. 

How to benchmark? It would be nice to show some kind of big-O trend, but different shapes have wildly different results: if we take n to be the number of triangles in each input shape, the number of intersections could be anywhere from zero to n2, which will directly affect the processing time. The Menger sponge is an example that scales close to n2, but that is not common. If you think of the progression of n as a refinement function, subdividing the triangles into smaller ones, then you would generally expect to have O(n1/2) intersections since the surface is 2-dimensional while the intersection curve is 1-dimensional.

I decided upon a very simple test case: a difference of identical spheres. For each data point I subdivide each triangle into four, starting with 2,048 triangles in each sphere (and roughly the same in the output) and working my way up to over 8 million triangles. Here is an example of the third step, with 32,768 triangles. I've left the model intentionally faceted so you can see the size of the triangles. This density is obviously overkill for rendering a sphere since interpolated normals would make it look smooth with far fewer triangles than this, but it's a benchmark test case after all. 

I'm comparing Manifold to CGAL via OpenSCAD's F6 render, since OpenSCAD is near and dear to my heart and was my introduction to this computational geometry problem. It's also a reasonably fair comparison, as CGAL is one of the few libraries that does a decent job at maintaining manifoldness. I imported my library's geodesic spheres to OpenSCAD to make everything identical and because I like my spheres a lot better than OpenSCAD's.

I've recorded both the last release of OpenSCAD as well as the nightly build from March 1st, 2022, and that same build with the fast-csg and fast-csg-trust-corefinement options enabled since some significant speed-up work was done recently. I've recorded both the fair comparison of Manifold in CPU single-threaded mode (since I believe this is what CGAL uses) as well as the CUDA GPU-accelerated version. I ran everything on the same Linux box (Ubuntu 20.04, 32 GB RAM, Core™ i7-3930K CPU @ 3.20GHz × 12, GeForce GTX 1070 GPU): 

I know I've done something useful when I'm drawing a log-log plot. The recent fast-csg work is quite impressive, accelerating OpenSCAD by more than an order of magnitude. However, even the single-threaded version of my library is yet another order of magnitude faster than that. The GPU acceleration only yields an extra factor of two, though this is primarily because of the portions of the library that are not GPU-accelerated. Note this breakdown of the times for the rightmost data-point:

You can see that the only O(nlogn) parts of the algorithm (intersections and sorting) are accelerated significantly by the GPU (intersections by >20 times). The rest should be simple O(n) work and complex O(n1/2) work, like triangulating the intersected faces. The fact this step takes so much time and appears to scale as O(n) tells me there is likely some optimization I've missed. Additionally, while triangulation is not conducive to GPU-scale parallelization, it would be fairly trivial to multi-thread.

In addition to the simple benchmark above, I decided to try a complex one that tests a series of three operations where there are proportionally many more intersections and they are not in general position. I ran the same test cases on my Menger sponge (I imported the holes as a single object to OpenSCAD to avoid its implicit union operation) at various fractal depths:

The proportional performance is largely identical to the previous test, with the exception that fast-csg doesn't seem to scale as well and ends up barely faster than without for the most complex case. I also exported these OpenSCAD meshes as OFF files (STL does not preserve manifoldness) and imported them to my library to verify them. 

I was impressed that each OpenSCAD mesh was in fact manifold, had done the symbolic perturbation correctly (no thin membranes despite a total lack of extra overlap), and had the correct genus, indicating the topology was clean. However, these meshes had nearly three times the number of triangles: at depth 4, 1.15 million vs 400,000, including over 76,000 degenerate triangles. However, upon import the Manifold library managed to remove nearly all of these extra triangles.

I also checked OpenSCAD with only fast-csg and without fast-csg-trust-corefinement. Trusting corefinement seemed to improve performance in these tests by a roughly uniform 10%, so I didn't bother including another trace. I started working on Manifold years ago because of how slow OpenSCAD was and how often its output was not manifold. It has clearly improved in the meanwhile, but I believe my library is still a significant improvement over CGAL. 

I'm quite pleased with the performance I've found overall: the GPU version is 100-1,000 times faster than the current OpenSCAD release. And I think there is still significant headroom for improvement in my Manifold library, especially in increased parallelization. In addition to being faster, my library is also keeping track of the relationships to the original meshes so that any type of vertex properties like UV maps can be easily reapplied after the operations, which OpenSCAD does not yet support. 

Comments

  1. Thanks a lot for sharing these results! I've started instrumenting OpenSCAD to show what time is actually spent doing corefinement operations (as opposed to import/export and not-sure-yet-what-else) and it seems like a closer call :-)

    See details in https://github.com/openscad/openscad/pull/4163

    Re/ number of triangles in the corefinement output, this is an issue that I've tackled with https://github.com/openscad/openscad/pull/4117 (just landed in today's dev snapshot!) that remeshes coplanar patches. It makes no difference with the spheres example (makes it a bit slower for now), but magically fixes some models that were explosively slow to render (excesses of triangles were compounding with each further operation).

    ReplyDelete
    Replies
    1. Makes sense, I ran into exactly the same issue with mine (compounding extra triangles making things slower and slower) until I got my SimplifyTopology method worked out.

      I noticed this comment in your PR: "Patches that contain holes are skipped for now". In my long battle with polygon triangulation I eventually convinced myself that "with holes" and especially "degenerate with holes" is a fundamentally harder problem than simple polygons.

      Delete

Post a Comment

Popular posts from this blog

Perseverance - a history of Manifold

3D Interaction