Inside the Vision Engine: How OpenSpace Is Using Advanced Technology to Revolutionize the Construction Industry

Instead of taking photos manually, field teams strap 360 cameras to their hard hats and walk their sites as usual; it requires no additional effort and takes a fraction of the time of traditional tools. The walk track is automatically computed and photos are aligned, stitched together and mapped to the floor plans. The documentation it produces is robust enough to let team members revisit any specific location at any specific point in time, improving collaboration, reducing travel costs, and preventing rework.

But what’s behind this technology? In this post, we’ll give a bit of background on the technologies powering the Vision Engine and the team bringing it to life.

First, here’s a quick primer on three foundational technologies behind the Vision Engine platform:

1. Computer Vision: A branch of computer science that lets computers “see” and interpret the visual world. This is why OpenSpace can automatically map imagery to project plans; the system knows what it’s looking at.

2. 3D Reconstruction: The Vision Engine aligns features in overlapping images to calculate camera position and generate a sparse 3D point cloud of the project. This 3D reconstruction aids in alignment and informs deeper analysis.

3. Big Data Visualization: The practice of rendering visual representations of large, complex datasets, making them easier to digest and analyze. There’s a great example in this TED talk, where an MIT researcher (and former colleague of our three founders) describes how he recorded 90,000 hours of home video and worked the data to plot how and when his infant son learned new words.

The technology behind OpenSpace’s patent-pending algorithms—which are similar to the perception and navigation AI systems behind self-driving cars—is the culmination of nearly two decades of combined research and development using these foundational technologies.

The team first began working in computer vision and big data back visualization back at MIT, where they were earning their PhDs. Over the last three years, our team of PhDs and Masters of Science from MIT, CalTech, Stanford and Berkeley has been focused on adapting these algorithms for the construction industry.

In a few years’ time, we believe that having access to the corpus of data captured by the Vision Engine will be table stakes at construction sites, and building records will be queried and explored just as easily as people search the web.

To be clear, we have plenty of work to do, but we’re off to a tremendous start. In just over a year our customers have captured 150 million square feet of imagery data, which is equivalent to 50 Empire State Buildings! That’s a lot of data to work with, and the Vision Engine continuously learns as it’s exposed to more job sites and projects

For more information on the Vision Engine and how it gets smarter over time via machine learning, click here or download this handy overview.