
This post gives a short summary of the different methods and tools used to create AR experiences with today's technology, along with the advantages and drawbacks of specific solutions. In a follow-up post, we will show how to create your own AR experience.

Augmented Reality (AR)

Augmented Reality (AR) is the "it" thing of the moment thanks to Pokémon Go and the Microsoft HoloLens. But how can you create your own AR "thing"? The good news: you don't need to learn a new programming language, since tools are available today for most languages used in application development. The bad news: AR software is rarely packaged in a user-friendly way, so it's often tricky to get even the vendor's own samples up and running. The really bad news: AR is one of those "native-only" features. It cannot yet be achieved using web technologies (JavaScript & HTML5) alone, or at least not with real-world, production-grade performance and fidelity.

Consequently, to run your AR experience on the intended device (a smartphone, a tablet, or a Windows PC or tablet with a camera), you need to wrap it in an app, which you build yourself. Below, I include instructions for each programming language on how to run an app with your own code and AR included.

3D Models

First, there’s some more bad news: AR is primarily visual media, so you will need to have great content to create great experiences. “Content” in this case means 3D models optimized for use in real-time rendering on a mobile device. This should not keep you from trying it because you can always get great, free, or open source models in OBJ, FBX, or Collada file format on Turbosquid or the Unity 3D Asset Store. These are the most common 3D file exchange formats, which can be exported by pretty much any software. This will suffice for now. 

One more thing before we dive into the code: markers. Or triggers. Or target images. Or trackables. You will encounter these terms often; they all refer to the same thing. For a long time, to do AR at all, you had to "teach" your app a visual pattern to look for in the real world. Once found, the pattern's position in the real world provides the frame of reference for positioning the 3D content in the camera image. To make it easy, this pattern is an image, which you store inside your app either as a plain image or in a special format (depending on the AR framework you use). You can read more about what makes a good AR pattern on the Vuforia Developer Portal (be sure to check out "Natural Features and Image Ratings" and "Local Contrast Enhancement" for more theoretical background).
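The exact API differs per framework, but the flow is always the same: load the image, register it with the tracker, start the session. As one concrete illustration (this post doesn't commit to a specific framework), here is roughly how that looks with Google's ARCore Augmented Images API on Android; the asset name and helper class are placeholders of my own.

```java
import android.content.Context;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import com.google.ar.core.AugmentedImageDatabase;
import com.google.ar.core.Config;
import com.google.ar.core.Session;

public class MarkerSetup {
    // "Teach" the app a target image so the framework can look
    // for it in the camera feed.
    public static Session createSession(Context context) throws Exception {
        Session session = new Session(context);

        // The marker ships inside the app as a plain image;
        // "marker.png" is a placeholder asset name.
        Bitmap marker = BitmapFactory.decodeStream(
                context.getAssets().open("marker.png"));

        AugmentedImageDatabase imageDb = new AugmentedImageDatabase(session);
        imageDb.addImage("my_marker", marker); // the pattern to look for

        Config config = new Config(session);
        config.setAugmentedImageDatabase(imageDb);
        session.configure(config);
        return session;
    }
}
```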

Please note: the augmentation only works as long as the AR pattern is actually visible to the camera; once it moves outside the field of view, the 3D model vanishes again. That's a drawback.
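In practice, this means checking every frame whether the marker is still being tracked and hiding the model when it isn't. A rough sketch, again using ARCore (the helper name is mine; in ARCore, FULL_TRACKING means the pose comes from the marker itself rather than from extrapolation):

```java
import com.google.ar.core.AugmentedImage;
import com.google.ar.core.Session;

public class MarkerVisibility {
    // Returns true while a registered marker is actually visible
    // to the camera; the 3D model should be hidden otherwise.
    public static boolean isMarkerVisible(Session session) {
        for (AugmentedImage image :
                session.getAllTrackables(AugmentedImage.class)) {
            if (image.getTrackingMethod()
                    == AugmentedImage.TrackingMethod.FULL_TRACKING) {
                return true;
            }
        }
        return false;
    }
}
```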

Using a visual AR pattern for augmentation has one benefit, though: when you know the exact physical dimensions of the image used to set the frame of reference (e.g., a magazine cover), you can configure the AR software so that the scale of the real world and the scale of the virtual frame of reference in your app match. A 3D model of a desk that is 2 feet tall will then also appear 2 feet tall when projected into a room using the magazine as the AR pattern. This enables life-size, 1:1 scale AR projections.

Image: Life-size AR projection with a fixed-size image pattern
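In ARCore's case, this 1:1 scale comes from passing the marker's physical width when registering it, which is a one-line extension of the registration sketch above. The 0.21 m magazine width here is an assumed value for illustration.

```java
import android.graphics.Bitmap;
import com.google.ar.core.AugmentedImageDatabase;

public class ScaledMarker {
    // Registering the marker with its real-world width (in meters)
    // anchors the virtual frame of reference at life-size scale.
    public static void register(AugmentedImageDatabase imageDb, Bitmap marker) {
        imageDb.addImage("magazine_cover", marker, 0.21f);
    }
}
```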

SLAM

The opposite of using a pattern for AR is called SLAM (Simultaneous Localization and Mapping). It's a method of creating a frame of reference for AR without any known pattern in the camera's field of view. This works reasonably well with today's software and hardware and has one major benefit: no printed AR marker is needed. The AR can just start; you don't need to worry about keeping a marker in the camera's field of view. So why is this not the default way of doing it? First, the algorithm is so computationally expensive that it drains a smartphone's battery in minutes (well, half an hour, but still). Second, it loses "scale" altogether. SLAM-based AR tracking can detect the environment and its horizon (i.e., the plane perpendicular to gravity, based on the phone's gravity sensor), but it cannot detect the environment's scale. If you ran a SLAM-based augmentation on three different devices in a row, the projected 3D model would likely be a different size every time. The quality of the software you use makes this more or less noticeable.
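As one illustration of a markerless setup, here is how plane detection is switched on in ARCore; once the framework tracks a horizontal plane (e.g., the floor), its pose can serve as the frame of reference for placing content. The helper names are mine.

```java
import com.google.ar.core.Config;
import com.google.ar.core.Plane;
import com.google.ar.core.Session;
import com.google.ar.core.TrackingState;

public class MarkerlessSetup {
    // Enable markerless (SLAM-style) tracking: the framework builds
    // its own frame of reference from camera motion instead of a
    // pre-registered image.
    public static void enablePlaneFinding(Session session) {
        Config config = new Config(session);
        config.setPlaneFindingMode(Config.PlaneFindingMode.HORIZONTAL);
        session.configure(config);
    }

    // Returns a tracked horizontal plane to anchor 3D content to,
    // or null if none has been detected yet.
    public static Plane findTrackedPlane(Session session) {
        for (Plane plane : session.getAllTrackables(Plane.class)) {
            if (plane.getTrackingState() == TrackingState.TRACKING) {
                return plane;
            }
        }
        return null;
    }
}
```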

Some frameworks offer hybrid tracking methods, in which SLAM-style tracking takes over once a pre-defined pattern has been detected. This is a good compromise; look for options called "Extended Tracking" or "Instant Tracking," and if you encounter one, enable it.
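ARCore, for example, reports per image whether the pose comes from the marker itself or is being extrapolated after the marker has left the view, which is the behavior such "Extended Tracking" toggles control elsewhere. A rough sketch (helper name is mine):

```java
import com.google.ar.core.AugmentedImage;
import com.google.ar.core.Session;

public class HybridTracking {
    // Inspect how each registered image is currently being tracked.
    public static void logTrackingMethod(Session session) {
        for (AugmentedImage image :
                session.getAllTrackables(AugmentedImage.class)) {
            switch (image.getTrackingMethod()) {
                case FULL_TRACKING:
                    System.out.println("Marker in view, tracked directly");
                    break;
                case LAST_KNOWN_POSE:
                    // Hybrid behavior: the marker left the camera's
                    // view, but the pose is carried on by world tracking.
                    System.out.println("Marker lost, pose extrapolated");
                    break;
                case NOT_TRACKING:
                    System.out.println("Marker not tracked");
                    break;
            }
        }
    }
}
```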

Summary

This is pretty much all of the theory you will need to understand before you can create your first AR experience. We will tell you how to do just that in a programming language you already know in a follow-up post. 

About the Author

Andreas is the Founder and CEO of Vuframe. He’s been working with Augmented & Virtual Reality on a daily basis for the past 8 years. Vuframe’s mission is to democratize AR & VR by removing the tech barrier for everyone.
