Hunter Loftis, Notes to self

Abundant concurrency in Go

Today, I merged a Pull Request that’s so obviously the right model for my Go rendering engine that I had to sit down and figure out why I did it any other way in the first place.

It turns out I have a JavaScripter’s mindset on concurrency.

In JavaScript, you have two levels of concurrency to choose from. On a single CPU you have the single-threaded, non-blocking kind, where an event loop lets you simulate I/O concurrency. This doesn’t help with hashing passwords or processing long lists, but you can read a file without ignoring HTTP requests. Across CPUs you have workers (web workers in the browser, cluster workers in node). Each worker forks its own process, with everything that entails for startup time, memory use, and inter-process communication.

This builds two habits. First, you always write async code, because you’re sharing a single thread with the rest of the process; even iterating over a large array can cause performance problems in a JS app. Second, you think about concurrency in single digits, mapping roughly one worker process to each CPU. Spend too much time on a task and you block other functions from the event loop; spin up too many processes and you create contention and waste resources.

So, when I added concurrency to pbr’s API, I forced the user to spin up a handful of goroutines that they’d monitor over channels. It was made “easier” via this complex Monitor type that created goroutines for you:

// AddSampler creates a new worker with that sampler.
// (user's responsibility)
func (m *Monitor) AddSampler(s *Sampler) {
	m.active++
	go func() {
		for {
			frame := s.SampleFrame()
			m.samples.Lock()
			m.samples.count += frame
			total := m.samples.count
			m.samples.Unlock()
			m.Progress <- total
			select {
			case <-m.cancel:
				m.active--
				m.Results <- s.Pixels()
				return
			default:
			}
		}
	}()
}

Pushing the concurrency requirements up to the user resulted in this monstrosity, which will be familiar to anyone who’s used node’s cluster API.

As I sketched out a 100-line, 2-channel “hello, world” example, I realized my mistake. The pbr renderer could start as many goroutines as it needs to quickly render an image without ever exposing them to the user. In Go, I can build internal concurrency while exposing a simple, sequential API.

In JavaScript, it would be unthinkable to spawn several async routines that each loop over two billion pixels at once, but that’s exactly what pbr does now:

// Sample samples every pixel in the Camera's frame at least once.
// (package's responsibility)
// (yeah, I know this is a little messy and I'll clean it up later; the point is that the user doesn't deal with the mess)
func (s *Sampler) Sample() {
	length := index(len(s.samples))
	workers := index(runtime.NumCPU())
	ch := make(chan sampleStat, workers)

	for i := index(0); i < workers; i++ {
		go func(i index, adapt, max int, mean float64) {
			var stat sampleStat
			rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
			for p := i * Stride; p < length; p += Stride * workers {
				samples := adaptive(s.samples[p+Noise], adapt, max, mean)
				stat.noise += s.samplePixel(p, rnd, samples)
				stat.count += samples
			}
			ch <- stat
		}(i, s.Adapt, s.Adapt*3, s.meanNoise+Bias)
	}

	var sample sampleStat
	for i := index(0); i < workers; i++ {
		stat := <-ch
		sample.count += stat.count
		sample.noise += stat.noise
	}
	s.count += sample.count
	s.meanNoise = sample.noise / float64(sample.count)
}

The new method approaches concurrency the way a Go developer would: with an abundance mindset.

Now you can render pbr’s Hello, world scene with a 15-line main function and zero channels. Underneath, sampler.Sample() starts a goroutine per CPU, each sampling its own interleaved share of the pixels, to keep every core busy. But why would you care? You just want a pretty picture.

package main

import (
	"fmt"

	"github.com/hunterloftis/pbr" // assumed import path; point it at wherever the pbr package lives
)

func main() {
	scene := pbr.EmptyScene()
	camera := pbr.NewCamera(960, 540)
	sampler := pbr.NewSampler(camera, scene)
	renderer := pbr.NewRenderer(sampler)

	scene.SetSky(pbr.Vector3{256, 256, 256}, pbr.Vector3{})
	scene.Add(pbr.UnitSphere(pbr.Plastic(1, 0, 0, 1)))

	for sampler.PerPixel() < 200 {
		sampler.Sample()
		fmt.Printf("\r%.1f samples / pixel", sampler.PerPixel())
	}
	pbr.WritePNG("hello.png", renderer.Rgb())
}

Renamed types in Go

When I first saw this in Go, I thought, “why the hell would anyone want to do that?”

type MyType float64

Great, now I have a float that’s kind of like a float, except the compiler rejects it if I try to use it with other floats. I put this silly, clearly useless construct aside.
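To be concrete about what it refuses to do, here’s a minimal sketch (MyType and the variable names are just placeholders):

package main

import "fmt"

type MyType float64

func main() {
	var speed MyType = 1.5
	var limit float64 = 2.0

	// fmt.Println(speed + limit) // won't compile: invalid operation, mismatched types MyType and float64
	fmt.Println(float64(speed) + limit) // fine once you convert explicitly
}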

Then, working on my physically-based renderer a couple of days later, I got into a mess. PBR is nothing but Vector3s - just billions and billions of Vector3s. They’re perfect for points in space, distances, directions, light energy, pixel colors, and signal strength. Profiling shows that Vector3.Unit() is PBR’s single most-expensive operation, which makes sense because Unit requires a math.Sqrt and, again, this is done billions of times.

I wanted to eliminate unnecessary calls to Unit, but was afraid to. Many vector operations are only accurate between unit vectors; the dot product, for instance, only equals the cosine of the angle between two vectors when both have length one (cos θ = a·b / (|a||b|)). Applying those operations to non-unit vectors would create difficult-to-diagnose bugs. Liberal application of Unit would improve safety, but would also slow down the program.

Simultaneously, I was drowning in Vector3s. I’d written function after function like this:

func (m *Material) absorb(norm, inc Vector3) (bool, Vector3, Vector3)

One of those returned Vector3s represents the direction of a light ray. The other represents energy attenuation (color) when light hits a surface. Confusing the two results in nonsense values like “2 meters plus an incandescent light bulb” or “forward-left times fuchsia.”

Suddenly, the idea of named types that refuse to mix with members of different names didn’t sound so stupid.

To experiment, I implemented a Direction type based on Vector3. All of the unit-vector-only operations found a home here. Anything that might possibly result in a non-unit vector returned a Vector3, not a Direction. Similarly, the expensive Unit() operation now converted Vector3s into Directions. Several benefits fell out of this arrangement:

First, it fixed a couple of bugs in the code. I’m not even exactly sure where they were, but after the change, some subtle camera field-of-view issues disappeared.

Second, my code became shorter and more readable. I got to delete a bunch of stuff and the above function turned into:

func (m *Material) absorb(norm, inc Direction) (bool, Direction, Vector3)

Third, the compiler alerted me to a couple of unnecessary calls to Unit(). How? After the change, Unit was no longer a valid method of Direction - a Direction is already a unit vector, after all. So this readability change also improved performance.

The fixed bugs, better readability, and improved performance convinced me to introduce a new Energy type, a Vector3 that exclusively deals with light energy.

Now Energies, Directions, and Vector3s are easy to differentiate and the compiler understands that “violet, divided by up, minus two feet” makes no sense:

func (m *Material) absorb(norm, inc Direction) (bool, Direction, Energy)
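Here’s a rough sketch of how those types can fit together (my reconstruction, not pbr’s actual code; the package name and method names are illustrative):

package vector

import "math"

// Vector3 is a general-purpose 3D vector: positions, offsets, anything.
type Vector3 struct {
	X, Y, Z float64
}

// Direction is a Vector3 that is guaranteed to have length 1.
type Direction Vector3

// Energy is a Vector3 that represents RGB light energy, never geometry.
type Energy Vector3

// Len returns the vector's length, which costs a math.Sqrt.
func (a Vector3) Len() float64 {
	return math.Sqrt(a.X*a.X + a.Y*a.Y + a.Z*a.Z)
}

// Unit is the only way to turn a Vector3 into a Direction, so the
// expensive normalization happens exactly where it's needed.
func (a Vector3) Unit() Direction {
	l := a.Len()
	return Direction{a.X / l, a.Y / l, a.Z / l}
}

// Cos returns the cosine of the angle between two Directions.
// It lives on Direction because the dot-product shortcut is only
// valid for unit vectors.
func (a Direction) Cos(b Direction) float64 {
	return a.X*b.X + a.Y*b.Y + a.Z*b.Z
}

Note that Direction deliberately has no Unit method, which is how the compiler caught the redundant calls: there’s nothing left to normalize, so the call simply doesn’t compile.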

I’m still new to Go so I’m sure there are ways I could make this better. If you have recommendations, please let me know!

Useful constructors in Go

I’m learning Go (“golang”) by building a Physically-Based Renderer and today I’ve begun refactoring pbr’s constructors based on this philosophy:

Where possible, design structs that don’t require constructors and that have useful zero values by default (point := Vector3{}). But when a constructor is necessary, follow the pattern:

func NewFoo(absolutely, required int, config ...FooConfig) *Foo

That way, a minimal call like NewFoo(1, 2) points to a useful Foo. Users with more specific requirements can specify more config:

foo := NewFoo(3, 4, FooConfig{Bar: "baz"})

Internally, the Foo struct can elegantly support FooConfig via struct embedding:

type Foo struct {
	Absolutely int
	Required   int
	FooConfig
}

This enables nice default properties with easy access from the top-level Foo instance. Within the constructor:

var c FooConfig
if len(config) > 0 {
	c = config[0]
}
if c.Bar == "" {
	c.Bar = "nice default"
}

As a user:

fmt.Println(foo.Bar)  // nice default

Here’s a working example on the Go Playground.
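Putting those pieces together, a minimal runnable version of the pattern might look like this (Foo, FooConfig, and Bar are the placeholder names from the snippets above, not pbr’s real types):

package main

import "fmt"

// FooConfig holds the optional knobs, with sensible zero values.
type FooConfig struct {
	Bar string
}

// Foo embeds FooConfig so config fields read as top-level fields.
type Foo struct {
	Absolutely int
	Required   int
	FooConfig
}

// NewFoo takes the required arguments up front and an optional config.
func NewFoo(absolutely, required int, config ...FooConfig) *Foo {
	var c FooConfig
	if len(config) > 0 {
		c = config[0]
	}
	if c.Bar == "" {
		c.Bar = "nice default"
	}
	return &Foo{
		Absolutely: absolutely,
		Required:   required,
		FooConfig:  c,
	}
}

func main() {
	minimal := NewFoo(1, 2)
	custom := NewFoo(3, 4, FooConfig{Bar: "baz"})
	fmt.Println(minimal.Bar) // nice default
	fmt.Println(custom.Bar)  // baz
}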