Adding spheres to a scene using ARKit

[Image: just some spheres floating above my bed]

Boilerplate

First we’ll want to set up the basic ARKit scene. All of this is the boilerplate Xcode provides when you create a new ARKit project.


//  ViewController.swift

import UIKit
import SceneKit
import ARKit

class ViewController: UIViewController, ARSCNViewDelegate {

    @IBOutlet var sceneView: ARSCNView!

    override func viewDidLoad() {
        super.viewDidLoad()

        // Set the view's delegate
        sceneView.delegate = self

        // Show statistics such as fps and timing information
        sceneView.showsStatistics = true

        // Create a new scene
        let scene = SCNScene()

        // Set the scene to the view
        sceneView.scene = scene
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)

        // Create a session configuration
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = .horizontal

        // Run the view's session
        sceneView.session.run(configuration)
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)

        // Pause the view's session
        sceneView.session.pause()
    }
}

One nice thing to add when developing with ARKit is the feature point debug visualization, the little dots that show what ARKit is tracking:


// Goes in viewDidLoad()
sceneView.debugOptions = ARSCNDebugOptions.showFeaturePoints

Spheres

Now we’ll create a separate Sphere class to abstract away the SCNNode that contains the sphere geometry:


//  Sphere.swift

import Foundation
import ARKit

class Sphere: SCNNode {

    static let radius: CGFloat = 0.01

    let sphereGeometry: SCNSphere

    // Required but unused
    required init?(coder aDecoder: NSCoder) {
        sphereGeometry = SCNSphere(radius: Sphere.radius)
        super.init(coder: aDecoder)
    }

    // The real action happens here
    init(position: SCNVector3) {
        self.sphereGeometry = SCNSphere(radius: Sphere.radius)

        super.init()

        let sphereNode = SCNNode(geometry: self.sphereGeometry)
        sphereNode.position = position

        self.addChildNode(sphereNode)
    }

    func clear() {
        self.removeFromParentNode()
    }

}

Now, when we want to add a sphere, we can do something like this:


    func addSphere(position: SCNVector3) {
        print("adding sphere at point: \(position)")
        let sphere = Sphere(position: position)
        self.sceneView.scene.rootNode.addChildNode(sphere)
        // If we keep an array of these, calling sphere.clear() on each
        // will remove them from the scene later.
        spheres.append(sphere)
    }
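
Here’s a minimal sketch of that bookkeeping, assuming a spheres array property on the view controller (the clearSpheres name is mine):


    var spheres: [Sphere] = []

    // Remove every sphere we've added from the scene, then forget them
    func clearSpheres() {
        for sphere in spheres {
            sphere.clear() // removeFromParentNode() under the hood
        }
        spheres.removeAll()
    }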

Insert into scene

Now for the interesting bit. To calculate the position at which we’d like to place the sphere, we can take the camera’s position and translate it along the direction the camera is facing. All of this code could be executed in a button press handler, for instance.

I found this handy sceneSpacePosition function in a Stack Overflow post; it does the vector addition along the camera’s z-axis. The distance is negated because SceneKit cameras look down the negative z-axis, so negative z goes further into the scene.


    func sceneSpacePosition(inFrontOf node: SCNNode, atDistance distance: Float) -> SCNVector3 {
        let localPosition = SCNVector3(x: 0, y: 0, z: -distance)
        let scenePosition = node.convertPosition(localPosition, to: nil)
        // to: nil is automatically scene space
        return scenePosition
    }

Then, in the button press handler:


if let cameraNode = self.sceneView.pointOfView {

    let distance: Float = 0.3 // Hardcoded depth
    let pos = sceneSpacePosition(inFrontOf: cameraNode, atDistance: distance)

    addSphere(position: pos)
}

hitTest()

Another way to do this is to place the object “into” the scene, at the depth of whatever is in front of the camera. We can do that by hit testing from the center of the screen against detected planes and feature points:


if let cameraNode = self.sceneView.pointOfView {

    // Hit test at the center of the screen
    let center = CGPoint(x: sceneView.frame.size.width / 2,
                         y: sceneView.frame.size.height / 2)

    if let result = sceneView.hitTest(center, types: [.existingPlaneUsingExtent, .featurePoint]).first {
        let pos = sceneSpacePosition(inFrontOf: cameraNode, atDistance: Float(result.distance))
        addSphere(position: pos)
    }
}

And that’s all for this post!

LSTM Adventures

Following this as a guide: http://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/
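
The snippets below assume X and y have already been prepared as in that guide. Roughly, it goes like this (the filename and 100-character window are the guide’s, not mine):


import numpy as np
from keras.utils import np_utils

# Load the corpus and build a character-level vocabulary
raw_text = open("wonderland.txt").read().lower()
chars = sorted(set(raw_text))
char_to_int = {c: i for i, c in enumerate(chars)}

# Slide a 100-character window over the text; the target is the next character
seq_length = 100
dataX, dataY = [], []
for i in range(len(raw_text) - seq_length):
    dataX.append([char_to_int[c] for c in raw_text[i:i + seq_length]])
    dataY.append(char_to_int[raw_text[i + seq_length]])

# X: (samples, timesteps, features) normalized to [0, 1]; y: one-hot vectors
X = np.reshape(dataX, (len(dataX), seq_length, 1)) / float(len(chars))
y = np_utils.to_categorical(dataY)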

1: Started by training a single-layer LSTM

import keras
import keras.layers as kl

model = keras.models.Sequential()
model.add(kl.LSTM(256, input_shape=(X.shape[1], X.shape[2])))
model.add(kl.Dropout(0.2))
model.add(kl.Dense(y.shape[1], activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam")
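
Training here is presumably just a single fit call, something like this (the batch size is an assumption; 20 epochs matches the results below):


model.fit(X, y, epochs=20, batch_size=128)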

results:

  • final loss after 20 epochs: 1.9054
  • time to train (approx): 1hr
  • mostly gibberish words, but word lengths and spacing look right!
  • can handle opening and closing quotation marks
  • sometimes gets ” ‘?’, said the . “

the mort of the sorts!’ she katter wept on, ‘and toene io the doer wo thin iire.’

‘io she mo tee toete of ther ’ou ’ould ’ou ’ould toe tealet ’our majesty,’ the match hare seid to tee jury, the was aoling to an the sooeo.

‘he d crust bi iele at all,’ said the mock turtle.

‘ie doersse toer miter ’hur paae,’ she mick turtle replied, ‘in was a little soiee an in whnl she firl th the kook an in oare

the rieet hor ane the was so tea korte of the sable, bnt the hodrt was no kently and shint oo the gan on the goor, and whnn hes lene the rueen sas so aeain, and whnn she gad been to fen ana tuiee oo thin shaee th the crrr asd then the was no toeeen at the winte tabbit wat she wiite rabbit wat she mitt of the gareen, and she woole tee whst her al iere at she could,

*Note: I tried this again with a 512-unit single-layer LSTM; the results were along these lines:

‘i con’t sein mo,’ said alice, ‘ho would ba au all aare an ierse then and what io the bane af in eine th the bare aadi in the cinrsn, and she was soit ion hor horo aerir the was oo the cane and the was oo the tanl oa teing the was soie in her hand

and saed to herself, ‘oh tou dane the boomer ’fth the semeen at the soiee tf the bareee and see seat in was note go the pine afd roe banl th the grore of the grureon,

Training time was slightly longer, and the results are not meaningfully different. One observation here is that training for 20 epochs might be too short…
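
For context, the samples in this post were presumably generated with the guide’s loop: seed with a random window from the corpus, predict the next character, append it, slide the window. A sketch (int_to_char inverts the char_to_int mapping above):


import sys

int_to_char = {i: c for c, i in char_to_int.items()}

# Seed with a random window from the training data
start = np.random.randint(0, len(dataX) - 1)
pattern = list(dataX[start])

# Generate 1000 characters, one at a time
for _ in range(1000):
    x = np.reshape(pattern, (1, len(pattern), 1)) / float(len(chars))
    prediction = model.predict(x, verbose=0)
    index = int(np.argmax(prediction))
    sys.stdout.write(int_to_char[index])
    # Slide the window forward by one character
    pattern.append(index)
    pattern = pattern[1:]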

2: Added another LSTM layer

model = keras.models.Sequential()
model.add(kl.LSTM(256, input_shape=(X.shape[1], X.shape[2]), return_sequences=True))
model.add(kl.Dropout(0.2))
model.add(kl.LSTM(256))
model.add(kl.Dropout(0.2))
model.add(kl.Dense(y.shape[1], activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam")

results:

  • final loss after 20 epochs: 1.5144
  • time to train (approx): 2hrs
  • seems to fall into loops
  • closer to ‘english’ words

and mowte it as all, and i dan see the that was a little boom as the soog of the sable it as all. and the pueen was a little boowle and the thing was she was a little boowle and the that was a little boowle and the that was a little boowse and the thing was the white rabbit say off and the thing was she was a little boowle and the that was a little boowle and the that was a little boowse and the thing was the white rabbit say off and the thing was she was a little boowle and the that was a little boowle and the that was a little boowse and the thing was the white rabbit say off and the thing was she was a little boowle and the that was a little boowle

3: Trying again with 3 LSTM layers

model = keras.models.Sequential()
model.add(kl.LSTM(256, input_shape=(X.shape[1], X.shape[2]), return_sequences=True))
model.add(kl.Dropout(0.2))
model.add(kl.LSTM(256, return_sequences=True))
model.add(kl.Dropout(0.2))
model.add(kl.LSTM(256))
model.add(kl.Dropout(0.2))
model.add(kl.Dense(y.shape[1], activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam")

results:

  • final loss after 20 epochs (took much longer!):
  • time to train (approx): 2.5 hrs
  • real words, better sentence structure… but it’s also falling into pretty small loops.
  • different seed text makes the first line quite different, but it quickly devolves into the same loop of the hatter and the mock turtle going back and forth.

at the way that the was to tee the court.

‘i should like the dormouse say “ said the king as he spoke. 
‘i don’t know what it moog and she beginning of the sea,’ the hatter went on, ‘i’ve sat a little sable. 
‘i don’t know what you con’t think it,’ said the mock turtle.

‘i don’t know what it moog and she beginning of the sea,’ the hatter went on, ‘i’ve sat a little sable. 
‘i don’t know what you con’t think it,’ said the mock turtle.

‘i don’t know what it moog and she beginning of the sea,’ the hatter went on, ‘i’ve sat a little sable. 
‘i don’t know what you con’t think it,’ said the mock turtle.

4: Single-layer GRU (for comparison with the single-layer LSTM)

model = keras.models.Sequential()
model.add(kl.GRU(256, input_shape=(X.shape[1], X.shape[2])))
model.add(kl.Dropout(0.2))
model.add(kl.Dense(y.shape[1], activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam")

results:

  • final loss (20 epochs): 1.87897 (approximately the same as the single-layer LSTM)
  • time to train: 40 minutes (faster!)

tone of the hoore of the house, and the whrt huon a little crrree the sabbit say an in oish of the was soiereing to be a gooa tu and gerd an all, and she whrt huow foonnge to teyerke to herself, ‘it was the tait wuinten thi woils oottle goos,’

‘hht ser what i cen’t tamling,’ said alice, ‘ih’ le yhit herseree to teye then.’

‘i shanl sht me wiitg,’ said alice,

‘what ie y sheu wasy loteln, and teed the porers oaser droneuse toe thet, and the tored harden wo cnd too fuor of the word the was soteig of the sintle the white rabbit, and the tored hard a little shrenge the fab foott of the was sotengi th theee th the was soe tabdit and ger fee toe pame oo her loot ou teie the was soee i rher to thi kittle gorr, and the tar so tae it the lad soe th the waited to seterker

Still gibberish, though with more English words. Training for more than 20 epochs might be the key!
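
If I revisit this with longer runs, checkpointing weights whenever the loss improves (as the guide does) would keep the best epoch around; a sketch, with an illustrative filename pattern and epoch count:


from keras.callbacks import ModelCheckpoint

# Save weights whenever training loss improves, so the best epoch isn't lost
checkpoint = ModelCheckpoint("weights-{epoch:02d}-{loss:.4f}.hdf5",
                             monitor="loss", save_best_only=True, mode="min")
model.fit(X, y, epochs=60, batch_size=128, callbacks=[checkpoint])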

Here’s a graph of the loss dropoff over time

[Image: alice_loss.png]