11 July 2022

Go for C++ developers lesson - differences and insights


Some people enjoy discovering new songs, bands and genres. Others prefer listening to the songs they already know. One could say the same about programming languages.

I've mostly been listening to the song of C++, in which I've enjoyed building my expertise. Python was also a recurring beat. Recently, having started a new project, I've been asked to learn the song of Go.

This post is not supposed to be any kind of introduction to Go’s learning path. I just wanted to share some of my "Aha!" moments and analogies from the ongoing process of learning Go, from the perspective of the C++ genre.

We’ll first take a look at some similarities and differences in how runtime polymorphism is handled in C++ and Go, and the problems it might come with. Then we’ll consider how exceptional control flow affects class correctness in both languages. At the end, we’ll have some discussion and an example of how C++ and Go handle user-defined iterables. And if you would like to find out more about the differences between Go and C++, check out our article.

Where is my vtable?

Do you want to explain Go interfaces to an experienced C++ developer? Just tell them: Go uses type erasure.

C++ perspective

A small recap of C++. The oldest means of achieving runtime polymorphism with C++ is sub-typing using class hierarchies with virtual methods. Let’s take a simple C++ example of a polymorphic integer generator:

class IGenerator{
public:
    virtual ~IGenerator() = default;
    virtual int Next() = 0;
};

class AscendingGenerator : public IGenerator {
public:
    AscendingGenerator(const int start = 0) : current{start}{}
    int Next() override { return current++; }

private:
    int current;
};

void PrintNTimes(const int n, IGenerator& gen)
{
    for (int i = 0; i < n; ++i) {
        std::cout << "val " << i << " = " << gen.Next() << '\n';
    }
}

int main()
{
    AscendingGenerator gen{5};
    PrintNTimes(5, gen);
}

The exact memory layout of the IGenerator class is implementation-defined, but we can approximate it with the following:

class IGenerator{
    v_table* v_ptr; 
};

where the member v_ptr is a pointer to a table of function pointers, aka the vtable. During the construction of a concrete subclass, v_ptr is set to point to a concrete vtable. That means that the vtable and the concrete object are coupled. Refer to this blog post for some more insight into virtual tables.

An experienced eye might see that the v_ptr being a "member" of an abstract base class is purely an arbitrary design choice made by Bjarne Stroustrup, the creator of C++, and that one could come up with a different dispatch policy to achieve runtime polymorphism.

Go perspective

Go, unlike C++ and many other languages, does not offer polymorphic class hierarchies. The only runtime polymorphism we have in Go is interfaces, which is a structural typing approach. It’s like compile time duck typing. An interface is implemented implicitly by any conforming type, which is verified at compile time.

The Go implementation of the previous example would look simpler:

package main
import "fmt"

type Generator interface {
    Next() int
}

type AscendingGenerator struct {
    current int
}

func (a *AscendingGenerator) Next() int {
    a.current += 1
    return a.current - 1
}

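func PrintNTimes(n uint, gen Generator) {
    for i := uint(0); i < n; i++ {
        // Printf keeps the output identical to the C++ version;
        // Println would insert extra spaces between its operands.
        fmt.Printf("val %d = %d\n", i, gen.Next())
    }
}

func main() {
    gen := &AscendingGenerator{5}
    PrintNTimes(3, gen)
}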

In the provided example, the gen parameter of PrintNTimes is an interface value, which consists of two pointers:

type _interface struct {
	dynamicTypeInfo *_implementation
	dynamicValue    unsafe.Pointer // unsafe.Pointer means
	                               // *ArbitraryType in Go.
}
  • The first member points to the dispatch table for the implementation of the Generator interface, aka. the vtable.
  • The second member points to a concrete instance, implementing that interface.

Read this post for more information on the internal implementation of golang interfaces. What is important here is that, in the case of Go, the vtable and the concrete object are decoupled.
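
To make the decoupling more tangible, here is a minimal sketch. The Resetter interface, the Reset method and the nextAfterReset helper are hypothetical names introduced only for this illustration. One concrete object is viewed through two unrelated interface values, each carrying its own dispatch information next to a pointer to the same object:

```go
package main

import "fmt"

type Generator interface {
	Next() int
}

// Resetter is a hypothetical second interface, unrelated to Generator.
type Resetter interface {
	Reset(start int)
}

type AscendingGenerator struct {
	current int
}

func (a *AscendingGenerator) Next() int {
	v := a.current
	a.current++
	return v
}

func (a *AscendingGenerator) Reset(start int) {
	a.current = start
}

// nextAfterReset views one object through two different interface values.
func nextAfterReset(obj *AscendingGenerator, start int) int {
	var g Generator = obj // dispatch info for Generator + pointer to obj
	var r Resetter = obj  // dispatch info for Resetter + the same pointer
	r.Reset(start)
	return g.Next()
}

func main() {
	obj := &AscendingGenerator{current: 5}
	fmt.Println(obj.Next())              // 5
	fmt.Println(nextAfterReset(obj, 42)) // 42
	fmt.Println(obj.current)             // 43
}
```

The concrete type never mentions either interface. The dispatch information is attached only at the point where the pointer is converted to an interface value.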

What's the analogy?

Comparing the dynamic dispatch abstractions offered by the two languages, we see that Go and C++ have different approaches. Go might seem superior with its structural typing approach, compared to the less flexible subtyping of C++. On the other hand, C++ is feature-rich and might provide an alternative.

That's where the technique called type erasure comes to C++’s rescue. I will not elaborate on the wider theory and implementations of that technique. What is important here is that type erasure from the user's perspective allows us to choose a different dispatch policy than an abstract base class v_ptr.

Unlike C++, which keeps the vtable pointer as an inherited "member" of the abstract base class, a Go interface value stores the pointer to the dispatch table not in the concrete object but next to the pointer to that object. We could achieve something similar with C++ type erasure. The following snippet shows a type-erased Generator class using simplified pseudo-library code:

class Generator {
public:
    template <typename T>
    Generator(T&& t)
        : v_ptr{/* construct vtable based on t */},
          storage{/* construct storage based on t */}
    {}
    int Next() { /* call the vtable using the storage */ }

private:
    v_table* v_ptr;          // dispatch table, kept next to the object
    shared_storage storage;  // type-erased concrete object
};

There are a few library solutions for implementing type erasure I know of:

and probably more I haven't seen yet.

I'll present a "simplified" type-erased alternative to the abstract base class approach using [Boost::ext].TE. It's by no means a production-quality example, but that library requires the least boilerplate, so it is convenient for a blog post:

#include <iostream>
#include <boost/te.hpp>

namespace te = boost::te;

struct Generator_ {
    int Next() const {
        return te::call<int>([](auto &self) { return self.Next(); }, *this);
    }
};
using Generator = te::poly<Generator_>;

class AscendingGenerator {
public:
    AscendingGenerator(const int start = 0) : current{start}{}
    int Next() { return current++; }

private:
    int current;
};

void PrintNTimes(const int n, Generator& gen)
{
    for (int i = 0; i < n; ++i) {
        std::cout << "val " << i << " = " << gen.Next() << '\n';
    }
}

int main() {
    Generator gen = AscendingGenerator{5};
    PrintNTimes(3, gen);
}

Lessons learned

  • A runtime dispatch table is a common concept in computer science, but different programming languages utilize it in different ways.
  • Structural typing gives more flexibility to the user of an API and results in fewer lines of code produced, compared to subtyping.
  • When writing Go code, think in terms of the functionality the types provide, not the hierarchies they compose.

Where is my flawless standard library polymorphism?

Interfaces can be misused. When you see an interface type in the Go standard library, don't be fooled: it does not necessarily imply polymorphism under the hood. Sometimes it hides additional requirements.

C++ perspective

Let's focus on runtime polymorphism only (compared to template-based static polymorphism). The C++ standard library complies with the Liskov substitution principle. What does that mean in practice? If an abstract base class is required as a parameter for an API, then any properly implemented, user-provided subclass can be used.

For example, let's take a look at an outline of std::pmr::memory_resource.

class memory_resource {
    ...
    virtual void *do_allocate(size_t bytes, size_t alignment) = 0;
    virtual void do_deallocate(void* p, size_t bytes, size_t alignment) = 0;
    virtual bool do_is_equal(const memory_resource& other) const noexcept = 0;
};

There are standard implementations of different memory resources. The user is allowed to subclass and implement custom ones. All of these can be used in contexts where std::pmr::memory_resource is required, for example for creating a std::pmr::vector.

Go perspective

Lots of standard packages, for example net, produce and use not only structs, like IPAddr, but also interfaces, like Addr. The Liskov substitution principle originally applied to class hierarchies. In terms of Go, it should sound more like: a struct must fulfill the purpose of the interface.

What does that look like in practice?

Real-life example

During the early days of my Golang adventure, I was implementing a wrapper for sending UDP datagrams. I used the net.PacketConn interface to manage the connection, and the method WriteTo(p []byte, addr Addr) (n int, err error) to send the bytes to a given address. The Addr interface was defined as

type Addr interface {
    Network() string // name of the network (for example, "tcp", "udp")
    String() string  // string form of address (for example, "192.0.2.1:25", "[2001:db8::1]:80")
}

At first, I passed an instance of net.UDPAddr as the Addr parameter. After some TDD iterations I found it convenient to use my own type in place of net.UDPAddr.

It took me almost an hour to debug why my tests were failing. Let's take a look at how the WriteTo method was implemented:

// WriteTo implements the PacketConn WriteTo method.
func (c *UDPConn) WriteTo(b []byte, addr Addr) (int, error) {
    if !c.ok() {
        return 0, syscall.EINVAL
    }
    a, ok := addr.(*UDPAddr)
    if !ok {
        return 0, &OpError{Op: "write", Net: c.fd.net, Source: c.fd.laddr, Addr: addr, Err: syscall.EINVAL}
    }
    ...
}

I got you, you little liar! What seems to be the problem with that method? The function signature says Addr, which is an interface shown above, but the function itself actually requires *UDPAddr. The real problem is not that the WriteTo method has preconditions reaching beyond type safety, but that that fact is either:

  • not documented, OR
  • not documented well enough for an aspiring Go developer like me to stumble upon.
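
The trap can be reproduced without any networking. The following is a minimal sketch, not the real net package: Addr mirrors the shape of net.Addr, while udpAddr, myAddr, writeTo and errInvalidAddr are hypothetical stand-ins. A user-defined type satisfies the interface at compile time and is still rejected at run time:

```go
package main

import (
	"errors"
	"fmt"
)

// Addr mirrors the shape of net.Addr; everything below is a hypothetical
// stand-in, not the real net package.
type Addr interface {
	Network() string
	String() string
}

type udpAddr struct{ hostPort string }

func (a *udpAddr) Network() string { return "udp" }
func (a *udpAddr) String() string  { return a.hostPort }

// myAddr satisfies Addr structurally, just like a user-defined type would.
type myAddr struct{ hostPort string }

func (a myAddr) Network() string { return "udp" }
func (a myAddr) String() string  { return a.hostPort }

var errInvalidAddr = errors.New("invalid argument")

// writeTo accepts any Addr in its signature but only works with *udpAddr.
func writeTo(b []byte, addr Addr) (int, error) {
	a, ok := addr.(*udpAddr)
	if !ok {
		return 0, errInvalidAddr
	}
	_ = a // a real implementation would send b to a here
	return len(b), nil
}

func main() {
	_, err := writeTo([]byte("ping"), &udpAddr{hostPort: "192.0.2.1:25"})
	fmt.Println(err) // <nil>
	_, err = writeTo([]byte("ping"), myAddr{hostPort: "192.0.2.1:25"})
	fmt.Println(err) // invalid argument
}
```

Both calls compile, and only a test or a careful read of the implementation reveals the difference.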

Such non-polymorphic interfaces are common in the Go standard library, but some are well-documented. Look at the SameFile function from the os package:

// SameFile reports whether fi1 and fi2 describe the same file.
// For example, on Unix this means that the device and inode fields
// of the two underlying structures are identical; on other systems
// the decision may be based on the path names.
// SameFile only applies to results returned by this package's Stat.
// It returns false in other cases.
func SameFile(fi1, fi2 FileInfo) bool {
    fs1, ok1 := fi1.(*fileStat)
    fs2, ok2 := fi2.(*fileStat)
    if !ok1 || !ok2 {
        return false
    }
    return sameFile(fs1, fs2)
}

The line // SameFile only applies to results returned by this package's Stat. tells you all you need to know.

Workaround

Suppose the library author still has some strong reasons to downcast a polymorphic interface to some other type. Luckily, it’s possible to expose an interface that will never hold an instance of a type declared outside of the package of that interface.

If you want to explicitly disallow the user from implementing the interface your library provides, put some lowercase method in your interface. The technique is explained in this blog post. The example shown there is:

type Opaque interface {                   // Public
        GetNumber() int                   // Public
        implementsOpaque()                // Private
}

The user would not be able to satisfy the Opaque interface with a custom implementation because the implementsOpaque() method is not exported and not visible in the user’s package. Thus, every implementation of that interface has to come from you, as the package author. Otherwise, the code won’t compile.

Lessons learned

  • Don't trust standard library polymorphism.
  • Think twice before trying to provide your own implementation to a standard library interface.
  • Write tests. Just do it. Period. :)
  • If you really need to ban the user from implementing your interface, put a lowercase method in it, so it’s not reachable from outside your package.

Where is my exceptional control flow?

One of the very first pieces of advice I encountered when learning Go was to use error return codes whenever possible. The built-in panic function is reserved for critical errors, or other unrecoverable situations that cannot/shouldn’t be handled by return values.

What C++ exceptions and Go panics have in common is that they cause abnormal control flow termination. That awareness is crucial to properly keep the type invariants while mutating a struct instance.

Real-life example

One day, a colleague of mine submitted a merge request for a review. It was an implementation of a session storage, meant for concurrent atomic use. Simplifying the problem, it looked similar to the following:

type sessionStorage struct {
    mutex sync.RWMutex
    data  map[uint64]*session
}

func (s *sessionStorage) Update(id uint64, f func(*session)) {
    s.mutex.Lock()
    f(s.data[id])
    s.mutex.Unlock()
}

My comment was:

Please use defer s.mutex.Unlock() instead of unlocking "by hand".

The main rationale was that defer mutex.Unlock() is a well-known idiom for releasing the mutex and should be preferred - read this post for more. In this specific case, the consequence was the following: given there was no 'defer', if f(s.data[id]) ever panics, the storage would become locked forever.
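
Here is a minimal sketch of the reviewed version with defer. The session type and the survivesPanic helper are hypothetical and exist only to demonstrate the effect. The callback panics, the panic is recovered by the caller, and the mutex is still free afterwards:

```go
package main

import (
	"fmt"
	"sync"
)

type session struct{ hits int }

type sessionStorage struct {
	mutex sync.RWMutex
	data  map[uint64]*session
}

// Update runs f under the write lock; defer releases it even if f panics.
func (s *sessionStorage) Update(id uint64, f func(*session)) {
	s.mutex.Lock()
	defer s.mutex.Unlock()
	f(s.data[id])
}

// survivesPanic reports whether the storage is still usable after a
// panicking callback. TryLock requires Go 1.18 or newer.
func survivesPanic(s *sessionStorage) bool {
	func() {
		defer func() { _ = recover() }() // swallow the panic for the demo
		s.Update(1, func(*session) { panic("boom") })
	}()
	if s.mutex.TryLock() {
		s.mutex.Unlock()
		return true
	}
	return false
}

func main() {
	s := &sessionStorage{data: map[uint64]*session{1: {}}}
	fmt.Println(survivesPanic(s)) // true
}
```

With the manual Unlock() from the original snippet, the same experiment would leave the mutex locked and TryLock would fail.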

We quickly came to an agreement but there was some discussion between us: is it natural for Go to assume that a client callback might cause a panic? After all, using panics for error handling is not really the preferred Golang practice.

Later on that very same day I had a problem with a panicking parser. Our code looked similar to the following:

func Decode(binaryMsg []byte) (message.Message, error) {
    msg, err := message.Parse(binaryMsg)
    if err != nil {
        return nil, err
    }
    return msg, nil
}

message.Parse was intended to return an error in case it was impossible to parse the bytes. We quickly found out that message.Parse panics given a specific input. The solution was to recover the panic and return an error, using the named return values:

func Decode(binaryMsg []byte) (msg message.Message, err error) {
    defer func() {
        if recover() != nil {
            msg, err = nil, fmt.Errorf("panic during decoding of: %v", binaryMsg)
        }
    }()
    msg, err = message.Parse(binaryMsg)
    return // named return params
}
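
Since message.Parse is not shown here, the same pattern can be tried out with a self-contained sketch. parseLength is a hypothetical parser that panics on empty input, and safeParseLength is a hypothetical wrapper. Unlike the snippet above, it also keeps the recovered value in the error message, which tends to help with debugging:

```go
package main

import "fmt"

// parseLength is a hypothetical parser that panics on empty input,
// standing in for a third-party function such as message.Parse.
func parseLength(b []byte) (int, error) {
	return int(b[0]), nil // index out of range if b is empty
}

// safeParseLength converts a panic in parseLength into an ordinary error,
// keeping the recovered value in the message.
func safeParseLength(b []byte) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			n, err = 0, fmt.Errorf("panic during decoding of %v: %v", b, r)
		}
	}()
	return parseLength(b)
}

func main() {
	n, err := safeParseLength([]byte{7})
	fmt.Println(n, err) // 7 <nil>
	_, err = safeParseLength(nil)
	fmt.Println(err != nil) // true
}
```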

Lessons learned

The situations described above taught me some valuable lessons:

  • Go, just like C++, can surprise you with unexpected control flows.
  • Assume a user-provided callback might terminate your normal control flow.
  • Assume other functions you are using might panic as well.
  • Your clients might try to recover from your panic for good or bad reasons, so take care of your class invariants well.

Where is my iteration?

As of Go 1.18, the version current at the time of writing, there is no language-level mechanism in Go to iterate over a custom iterable with a simple zero-overhead range for loop. Let’s see how much of a problem that is.

C++ perspective

The concept of an iterator has been a fundamental part of STL since it was standardized as C++98. It's a powerful and flexible abstraction. By C++20, the language had evolved to the point that it fully acknowledged the existence of a range in the standard library, which is an abstraction over iterators.

In terms of C++, there are multiple iterator category types:

  • input iterators
  • forward iterators
  • bidirectional iterators
  • random access iterators
  • contiguous iterators
  • output iterators

For this blog post, we’ll concentrate on the category of input iterators, using a use-case of just iterating over a custom sequence once.

Generators, filters, lazy transformations - all of this and more is possible given a proper iterator model provided by the language. Other languages with an iterator concept offer that as well, for example Python.

Let’s take std::ranges::views::iota as an example. Its definition might seem quite complex, but from the user's point of view it’s rather simple: it behaves as if it was a collection of successive integer-like values. In fact, those values are generated on the fly under the hood.

Go perspective

The language offers native range for support only for:

  • arrays
  • slices
  • strings
  • maps
  • channels

Unfortunately, Go offers no general support for custom iterables and there is no standardized iteration model in Go, but some workarounds exist. I suggest you read this blog post by Krzysztof Kowalczyk to get a summary of the different ways of implementing iteration in Go. I'll just mention one of those, which is quite often used in the Go standard library. It looks like this:

for it.Next() {
    e := it.Value()
}

The Next() and Value() method names are just an example. bufio.Scanner uses Scan() and Text(). sql.Rows uses Next() and Scan(...). reflect.MapIter uses Next(), but Key() and Value() as distinct getters. dwarf.Reader uses a single func (r *Reader) Next(), which returns the entry directly, so a separate value getter is not needed.

To refer to the previous std::ranges::views::iota example from C++, in the case of Go the user would need to implement that by hand. An example might be:

type incrementable interface {
	constraints.Integer | ~rune
}

type Iota[T incrementable] struct {
	current   T
	afterLast T
}

func NewIota[T incrementable](start, afterLast T) Iota[T] {
	return Iota[T]{current: start - 1, afterLast: afterLast}
}

func (i *Iota[T]) Next() bool {
	next := i.current + 1
	if next == i.afterLast {
		return false
	} else {
		i.current = next
		return true
	}
}

func (i *Iota[T]) Value() T {
	return i.current
}

See the complete sample here.
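
For completeness, here is a self-contained usage sketch of that iterator with the Next()/Value() loop. To avoid the golang.org/x/exp/constraints dependency, it swaps in a local integer constraint. That constraint and the collect helper are assumptions made for this illustration only:

```go
package main

import "fmt"

// integer is a local stand-in for constraints.Integer, so the sketch
// needs no external module.
type integer interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64 |
		~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64
}

type Iota[T integer] struct {
	current   T
	afterLast T
}

func NewIota[T integer](start, afterLast T) Iota[T] {
	return Iota[T]{current: start - 1, afterLast: afterLast}
}

func (i *Iota[T]) Next() bool {
	next := i.current + 1
	if next == i.afterLast {
		return false
	}
	i.current = next
	return true
}

func (i *Iota[T]) Value() T { return i.current }

// collect drains the iterator with the Next()/Value() loop.
func collect[T integer](it Iota[T]) []T {
	var out []T
	for it.Next() {
		out = append(out, it.Value())
	}
	return out
}

func main() {
	fmt.Println(collect(NewIota(3, 7)))     // [3 4 5 6]
	fmt.Println(collect(NewIota('a', 'd'))) // [97 98 99]
}
```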

Small case study

In my subjective experience, I squeezed the most out of the C++ iteration model when working with composable lazy transformations of data. Let’s say a function you are implementing is supposed to:

  • accept a collection of ints and a number n
  • filter out even numbers
  • group the remaining numbers in batches of at most n elements
  • add up the numbers in each batch
  • print the sums

The task above might seem a bit artificial but is enough to prove the point. In a production scenario, instead of numbers, we might operate on client data. Instead of checking for even numbers, we might check run-time acceptance policies. Instead of printing the numbers, we might want to invoke an I/O operation, etc.

C++ approach

There are a number of C++ libraries for composable range transformations:

And most probably others I haven’t mentioned.

The standard library is not feature-rich enough at the time of writing (it is missing the chunk view), so I’ll use the well-established range-v3 library instead. We can translate the requirements almost directly into C++ code:

void print_using_range_v3(
   const std::span<const int> input,
   const uint n)
{
   namespace views = ranges::views;
   auto is_odd = [](const auto i) -> bool { return i % 2; };

   auto results = input |
          views::filter(is_odd) |
          views::chunk(n) |
          views::transform([](auto&& rng){
                      return ranges::accumulate(rng, 0);
                  });

   for(const auto& res : results){
       std::cout << res << '\n';
   }
}

Click here for a complete example.

It's perfectly reasonable to assume that the code above will not perform any unnecessary allocations, calculations, or create any temporary containers. Writing an equally efficient loop by hand would be a non-trivial task, and lots of readability and expressiveness would be lost in the process.

Go approach

I’ve spent some time googling generic-based alternatives to a composable range manipulation library in Go. I prefer the type system working to my advantage, so I’ve decided to ignore the older non-generic solutions. There are a number of Go open-source libraries providing generators, filters, transformations and other iterable utilities of the functional programming paradigm.

I’ve taken a closer look at

Having implemented the given example with the help of the abovementioned libraries, I’ve decided mtoohey31/iter seems to be the most suitable for the job:

func printUsingMtooheyIterator(input []int, n int) {
	// i%2 != 0 also treats negative odd numbers as odd; i%2 == 1 would not.
	isOdd := func(i int) bool { return i%2 != 0 }

	result :=
		iter.Map(
			chunk(
				iter.Elems(input).
					Filter(isOdd),
				n),
			func(rng iter.Iter[int]) int {
				return iter.Sum(rng)
			})

	// ok is true as long as the iterator yields a value.
	for r, ok := result(); ok; r, ok = result() {
		fmt.Println(r)
	}
}

It’s the only one implemented in terms of member functions whenever possible, so it allows readable left-to-right composition (see Elems and Filter above). The library core iter.Iter[T] type is an alias for func() (T, bool), so an iterator can be implemented as an anonymous function, whereas defining iter.Iter[T] as an interface would require more boilerplate code on implementation.

There is no chunk method provided by the library and it has to be implemented by us. Also, Map is not a member function, but a free function. I’ve asked the library author Matthew Toohey about that. He pointed me to the fact that with the current Go version (1.18), methods cannot be generic.
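
To give an idea of what such a hand-written helper can look like, here is a minimal sketch. It does not use the library: the local Iter type only mirrors the func() (T, bool) shape, and fromSlice and chunk are hypothetical helpers. As a simplification, this chunk yields slices rather than nested iterators:

```go
package main

import "fmt"

// Iter is a local type with the same shape as the library's iterator;
// it is declared here so the sketch does not depend on the library.
type Iter[T any] func() (T, bool)

// fromSlice returns an iterator over the elements of s.
func fromSlice[T any](s []T) Iter[T] {
	i := 0
	return func() (T, bool) {
		if i >= len(s) {
			var zero T
			return zero, false
		}
		v := s[i]
		i++
		return v, true
	}
}

// chunk groups the values of it into slices of at most n elements (n > 0).
func chunk[T any](it Iter[T], n int) Iter[[]T] {
	return func() ([]T, bool) {
		batch := make([]T, 0, n)
		for len(batch) < n {
			v, ok := it()
			if !ok {
				break
			}
			batch = append(batch, v)
		}
		return batch, len(batch) > 0
	}
}

func main() {
	batches := chunk(fromSlice([]int{1, 3, 5, 7, 9}), 2)
	for b, ok := batches(); ok; b, ok = batches() {
		fmt.Println(b)
	}
	// prints [1 3], [5 7] and [9] on separate lines
}
```

Because an iterator is just a closure, no extra type or interface implementation is needed for either helper.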

Despite all that, I find this library really promising. It properly leverages the features of the language. It’s simple, yet scalable. I’m just hoping we'll get generic methods in the language one day. We can’t expect the library to provide no-overhead abstractions due to the run-time overhead of using generics, but the expressiveness and correctness of transformations might be worth more for certain users than a performance overhead.

If you want to learn more, see the playground for a full example using the mentioned libraries and read about Generics in Go.

Idiomatic Go

I’m still learning how to write idiomatic Go. I’ve shown the above code to a senior colleague with vast Go experience and his feedback was:

It is far from idiomatic Go. I've never seen any large project using 3rd party libraries to do simple operations on collections in Go. Remember that “a little copying is better than a little dependency”.

That confirmed my initial impressions of the language: the concept of a general iterable range is not backed up by the language and there are no standard library utilities for manipulating iterables, so the Go community does not advocate for such an approach.
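
For comparison, a dependency-free version of the case study might look like the sketch below. The sumOddBatches name is hypothetical, and this is only my reading of what the plain-loop approach would be. It returns the sums instead of printing them, which keeps the logic easy to test:

```go
package main

import "fmt"

// sumOddBatches keeps the odd numbers, groups them into batches of at
// most n elements (n > 0), and returns the sum of each batch.
func sumOddBatches(input []int, n int) []int {
	var sums []int
	sum, count := 0, 0
	for _, v := range input {
		if v%2 == 0 {
			continue // filter out even numbers
		}
		sum += v
		count++
		if count == n {
			sums = append(sums, sum)
			sum, count = 0, 0
		}
	}
	if count > 0 {
		sums = append(sums, sum) // last, possibly shorter batch
	}
	return sums
}

func main() {
	for _, s := range sumOddBatches([]int{1, 2, 3, 4, 5, 6, 7, 9, 11}, 2) {
		fmt.Println(s) // prints 4, 12 and 20 on separate lines
	}
}
```

The loop is less declarative than the pipeline, but it needs no dependency and creates no intermediate collections besides the result.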

This case study has shown us again that learning a new programming language is not only a case of understanding the syntax, features and libraries. First and foremost, it’s about grasping the design principles of the language and diving into standard community-driven practices.

Opportunities for the future

I've come across a few proposals for adding a unified iteration strategy to Go:

  • This one, proposing to allow range iteration on any callable with a signature of func() (T, bool).
  • This one, proposing standardized Iterator and Iterable interfaces.
  • This one, proposing to allow range iteration over the Rangable interface, which works with user-provided callbacks.

There might be other proposals I haven't stumbled upon yet. I'm not here to judge which one should be accepted or not. I'm just looking forward to what the future brings, hoping we'll one day get a unified iteration model in the language.

Lessons learned

  • Don’t blindly move idioms and practices from one language to another. Always seek out well-established and mature community knowledge.
  • Don’t reinvent the wheel. In order to expose an iterable API for a type, lean towards the Next() and Value() idioms practiced by the standard library, or use func() (T, bool).

Conclusion

We’ve looked at a few lines of the song of Go that might ring a bell for a C++ developer. It’s not really about similarities but learning through analogy and the contrast to the things one might already be familiar with.

Exceptional control flow, class invariants, polymorphism, ranges - every C++ enthusiast should be proficient with those. That proficiency could definitely help with learning and understanding Go. You might ask yourself: why would I want to learn Go in the first place? Well, we’ve already written about why Golang may be a good choice for your project, so be sure to check it out.

Łukasz Drożdż

Software Engineer