Serialize using gob in Golang

Anand Rathod/ October 5, 2021/ Uncategorized

What is Gob?

  • Gob is a serialisation technique specific to Go. It is designed to encode Go data types specifically and does not at present have support for or by any other programming language.
  • The gob package manages gob streams: binary data streams exchanged between an Encoder (transmitter) and a Decoder (receiver).
  • It supports all Go data types except for channels and functions.
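To make the channel and function restriction concrete, here is a minimal sketch. The Job type and the helper names are hypothetical and exist only for this illustration. It relies on the documented behavior that encoding a channel at the top level fails, while a struct field of chan or func type is skipped like an unexported field.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// Job is a hypothetical type used only for this illustration.
type Job struct {
	Name string
	Done chan bool // chan fields are skipped, like unexported fields
}

// encodeTopLevelChan tries to encode a bare channel and returns the resulting error.
func encodeTopLevelChan() error {
	var buf bytes.Buffer
	return gob.NewEncoder(&buf).Encode(make(chan int))
}

// roundTripJob encodes j and decodes it into a fresh Job.
func roundTripJob(j Job) (Job, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(j); err != nil {
		return Job{}, err
	}
	var out Job
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	fmt.Println("top-level chan:", encodeTopLevelChan())

	out, err := roundTripJob(Job{Name: "build", Done: make(chan bool)})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out.Name, out.Done == nil) // the channel did not travel with the value
}
```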

Gob works in a similar manner in which JSON works. The sender uses an Encoder to encode the data structure. After receiving the message, the receiver uses the Decoder to change the serialized data into a local variable.
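Compared with the JSON encoding format, gob can transmit interface values whose concrete types have been registered, so the decoded value still carries its concrete type (and therefore its methods). encoding/json cannot do this on its own.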
Gob provides a way of serializing and deserializing Go data types without the need for adding string tags to structs, dealing with JSON incompatibilities, or waiting for json.
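The Encoder/Decoder flow described above can be sketched as a simple round trip. The Profile type and the roundTripProfile helper are hypothetical names chosen for this example, and a bytes.Buffer stands in for the network connection or file that would normally sit between sender and receiver.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// Profile is a hypothetical struct; note that it needs no field tags.
type Profile struct {
	Name string
	Age  int
	Tags []string
}

// roundTripProfile encodes p with gob and decodes it into a fresh value.
func roundTripProfile(p Profile) (Profile, error) {
	var buf bytes.Buffer // stand-in for a connection or file

	// Sender side: the Encoder writes the serialized form to buf.
	if err := gob.NewEncoder(&buf).Encode(p); err != nil {
		return Profile{}, err
	}

	// Receiver side: the Decoder reads the stream into a local variable.
	var out Profile
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return Profile{}, err
	}
	return out, nil
}

func main() {
	out, err := roundTripProfile(Profile{Name: "gopher", Age: 12, Tags: []string{"a", "b"}})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", out)
}
```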

The structure of the sender and the structure of the receiving party do not need to be fully consistent or identical.
Here are some examples from the official documentation:

// Given a value of this struct type:
struct { A, B int }

// It can be sent from or received into any of these Go types:
struct { A, B int }	// the same
*struct { A, B int }	// extra indirection of the struct
struct { *A, **B int }	// extra indirection of the fields
struct { A, B int64 }	// different concrete value type; see below 

// It may also be received into any of these types:
struct { A, B int }	// the same
struct { B, A int }	// ordering doesn't matter; matching is by name
struct { A, B, C int }	// extra field (C) ignored
struct { B int }	// missing field (A) ignored; data will be dropped
struct { B, C int }	// missing field (A) ignored; extra field (C) ignored.

// Attempting to receive into these types will produce a decode error:
struct { A int; B uint }	// change of signedness for B
struct { A int; B float }	// change of type for B
struct { }			// no field names in common
struct { C, D int }		// no field names in common
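The matching rules above can be tried out with a small sketch. Sender, Receiver, Mismatch and the transfer helper are hypothetical names introduced for this example. It assumes nothing beyond the rules quoted from the documentation: fields are matched by name, and a change of signedness is rejected.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// Sender is the type written to the stream.
type Sender struct{ A, B int }

// Receiver shares only field B with Sender.
type Receiver struct{ B, C int }

// Mismatch changes the signedness of B, which gob rejects.
type Mismatch struct {
	A int
	B uint
}

// transfer encodes src and decodes the resulting stream into dst (a pointer).
func transfer(src Sender, dst interface{}) error {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(src); err != nil {
		return err
	}
	return gob.NewDecoder(&buf).Decode(dst)
}

func main() {
	r := Receiver{C: 99}
	if err := transfer(Sender{A: 1, B: 2}, &r); err != nil {
		log.Fatal(err)
	}
	// B comes from the stream, A is dropped, C keeps its prior value.
	fmt.Printf("%+v\n", r)

	var m Mismatch
	fmt.Println("mismatch error:", transfer(Sender{A: 1, B: 2}, &m))
}
```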

 

Gob encodes type information into its serialised form. The description of a type is sent only once per stream, the first time a value of that type is encoded. For a struct, for example, it includes the names of the struct fields. This means we don't need to maintain separate schema files, as we do for Protocol Buffers (protobuf).

This inclusion of type information makes Gob marshalling and unmarshalling fairly robust to changes or differences between marshaller and unmarshaller.
This property means that we will always be able to decode a gob stream stored in a file, even long after we’ve forgotten what data it represents.
When we encode a value with an Encoder, gob builds an internal description of the value's type and assigns that type a unique number.
For example:

type T struct{ X, Y, Z int }
var t = T{X: 7, Y: 0, Z: 8}

Thus when we send our first type T, the gob encoder sends a description of T and tags it with a type number, say 127. All values, including the first, are then prefixed by that number, so a stream of T values looks like this:

("define type id" 127, definition of type T)(127, T value)(127, T value), ...
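One way to observe this is to measure how many bytes each Encode call adds to the stream. The sketch below uses a hypothetical helper, encodedSizes. Since the type definition travels only with the first value, the first write should be larger than the following ones. The exact byte counts are not shown because they depend on the encoding details.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

type T struct{ X, Y, Z int }

// encodedSizes encodes each value on one Encoder and reports
// how many bytes every Encode call appended to the stream.
func encodedSizes(values ...T) ([]int, error) {
	var buf bytes.Buffer
	enc := gob.NewEncoder(&buf)

	sizes := make([]int, 0, len(values))
	prev := 0
	for _, v := range values {
		if err := enc.Encode(v); err != nil {
			return nil, err
		}
		sizes = append(sizes, buf.Len()-prev)
		prev = buf.Len()
	}
	return sizes, nil
}

func main() {
	t := T{X: 7, Y: 0, Z: 8}
	sizes, err := encodedSizes(t, t, t)
	if err != nil {
		log.Fatal(err)
	}
	// The first entry includes the definition of T; the rest are values only.
	fmt.Println(sizes)
}
```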

Gob Functions

Gob is a very simple package. It has the following functions and types:

func Register(value interface{})
func RegisterName(name string, value interface{})

type CommonType

type Decoder

  • func NewDecoder(r io.Reader) *Decoder
  • func (dec *Decoder) Decode(e interface{}) error
  • func (dec *Decoder) DecodeValue(v reflect.Value) error

type Encoder

  • func NewEncoder(w io.Writer) *Encoder
  • func (enc *Encoder) Encode(e interface{}) error
  • func (enc *Encoder) EncodeValue(value reflect.Value) error

type GobDecoder
type GobEncoder

Let's look at the main functions in detail:

Register

// Register records a type, identified by a value of that type, under its internal type name.
// That name identifies the concrete type of a value sent or received as an interface value.
// Only types that are transferred as implementations of interface values need to be registered.
// Register is expected to be called only during initialization. It panics if the mapping between types and names is not one-to-one.
func Register(value interface{})

RegisterName

// RegisterName is similar to Register, except that the provided name is used instead of the default name of the type.
func RegisterName(name string, value interface{})
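As a rough illustration of RegisterName, the sketch below registers a concrete type under an explicit name and sends it through an interface value. Shape, Square, the name "example.Square" and roundTripShape are all hypothetical choices for this example. Both the encoding and the decoding side would need to register the same name for the same type.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// Shape and Square are hypothetical types for this illustration.
type Shape interface {
	Area() float64
}

type Square struct{ Side float64 }

func (s Square) Area() float64 { return s.Side * s.Side }

func init() {
	// Use an explicit, stable name instead of the default type name.
	gob.RegisterName("example.Square", Square{})
}

// roundTripShape sends s as an interface value and decodes it back.
func roundTripShape(s Shape) (Shape, error) {
	var buf bytes.Buffer
	// Pass a pointer to the interface so gob encodes an interface value.
	if err := gob.NewEncoder(&buf).Encode(&s); err != nil {
		return nil, err
	}
	var out Shape
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTripShape(Square{Side: 3})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%T %v\n", out, out.Area())
}
```

An explicit name can be useful when a type is later renamed or moved to another package, since the name on the wire then stays the same.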

Encoding
Data is encoded before transmission. There are three functions related to encoding.

NewEncoder

// NewEncoder returns a *Encoder that writes encoded data to w.
func NewEncoder(w io.Writer) *Encoder

Encode

// The Encode method encodes and sends the value e, and ensures that all necessary type information is sent first.
func (enc *Encoder) Encode(e interface{}) error

EncodeValue

// The EncodeValue method encodes and sends the data represented by value, and ensures that all type information is sent first.
func (enc *Encoder) EncodeValue(value reflect.Value) error

Decoding
After the data is received, it has to be decoded. There are three functions related to decoding.

NewDecoder

// NewDecoder returns a *Decoder that reads data from r. If r does not also implement io.ByteReader, it is wrapped in a bufio.Reader.
func NewDecoder(r io.Reader) *Decoder

Decode

// Decode reads the next value from the input stream and stores it in e.
// If e is nil, the value is discarded; otherwise e must be a pointer to a type that can receive the value.
// If the input is at EOF, Decode returns io.EOF and does not modify e.
func (dec *Decoder) Decode(e interface{}) error
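The io.EOF behavior makes it straightforward to read a stream of unknown length. The sketch below is one possible way to do it. The helper names encodeInts and decodeAll are hypothetical, and a bytes.Buffer again stands in for the real transport.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"io"
	"log"
)

// encodeInts writes every value to one gob stream.
func encodeInts(values ...int) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	enc := gob.NewEncoder(&buf)
	for _, v := range values {
		if err := enc.Encode(v); err != nil {
			return nil, err
		}
	}
	return &buf, nil
}

// decodeAll keeps calling Decode until the stream reports io.EOF.
func decodeAll(r io.Reader) ([]int, error) {
	dec := gob.NewDecoder(r)
	var out []int
	for {
		var v int
		err := dec.Decode(&v)
		if err == io.EOF {
			return out, nil // clean end of stream
		}
		if err != nil {
			return nil, err
		}
		out = append(out, v)
	}
}

func main() {
	buf, err := encodeInts(1, 2, 3)
	if err != nil {
		log.Fatal(err)
	}
	values, err := decodeAll(buf)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(values)
}
```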

DecodeValue

// DecodeValue reads the next value from the input stream. If v is the zero reflect.Value (v.Kind() == reflect.Invalid), the value is discarded; otherwise it is stored in v.
// In that case, v must represent a non-nil pointer to data or be a settable reflect.Value (v.CanSet() is true).
// If the input is at EOF, DecodeValue returns io.EOF and does not modify v.
func (dec *Decoder) DecodeValue(v reflect.Value) error
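EncodeValue and DecodeValue are mostly useful when the type is only known through reflection. Here is a minimal sketch, assuming a hypothetical Pair type and a hypothetical helper roundTripValue. It uses reflect.New to obtain the non-nil pointer that DecodeValue requires.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
	"reflect"
)

// Pair is a hypothetical type for this illustration.
type Pair struct{ X, Y int }

// roundTripValue encodes in through reflection and decodes it into
// a newly allocated value of the same type. in must not be nil.
func roundTripValue(in interface{}) (interface{}, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).EncodeValue(reflect.ValueOf(in)); err != nil {
		return nil, err
	}

	// reflect.New returns a Value holding a non-nil pointer to a zero value.
	ptr := reflect.New(reflect.TypeOf(in))
	if err := gob.NewDecoder(&buf).DecodeValue(ptr); err != nil {
		return nil, err
	}
	return ptr.Elem().Interface(), nil
}

func main() {
	out, err := roundTripValue(Pair{X: 3, Y: 4})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", out)
}
```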

Example (Encoding Interface) – 

The below example shows how to encode an interface value.

package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
	"math"
)

type Point struct {
	X, Y int
}

func (p Point) Hypotenuse() float64 {
	return math.Hypot(float64(p.X), float64(p.Y))
}

type Pythagoras interface {
	Hypotenuse() float64
}

// This example shows how to encode an interface value.
// The key distinction from regular types is that we register the concrete type that implements the interface.
func main() {
	var network bytes.Buffer // Stand-in for the network
	// We must register the concrete type for the encoder and decoder (which would
	// normally be on a separate machine from the encoder). On each end, this tells the
	// engine which concrete type is being sent that implements the interface.
	gob.Register(Point{})
	// Create an encoder and send some values
	enc := gob.NewEncoder(&network)
	for i := 1; i <= 3; i++ {
		interfaceEncode(enc, Point{3 * i, 4 * i})
	}
	// Create a decoder and receive some values
	dec := gob.NewDecoder(&network)
	for i := 1; i <= 3; i++ {
		result := interfaceDecode(dec)
		fmt.Println(result.Hypotenuse())
	}
}

// interfaceEncode encodes the interface value into the encoder
func interfaceEncode(enc *gob.Encoder, p Pythagoras) {
	// The encode will fail unless the concrete type has been
	// registered. We registered it in the calling function.
	// Pass pointer to interface so Encode sees (and hence sends) a value of
	// interface type.  If we passed p directly it would see the concrete type instead.
	// See the blog post, "The Laws of Reflection" for background.
	err := enc.Encode(&p)
	if err != nil {
		log.Fatal("encode:", err)
	}
}

// interfaceDecode decodes the next interface value from the stream and returns it
func interfaceDecode(dec *gob.Decoder) Pythagoras {
	// The decode will fail unless the concrete type on the wire has been
	// registered. We registered it in the calling function.
	var p Pythagoras
	err := dec.Decode(&p)
	if err != nil {
		log.Fatal("decode:", err)
	}
	return p
}

Limitations

  • Gob is Go-specific. If other languages are involved in a project, we can't use gob and should use a language-neutral format such as JSON instead. We therefore suggest using gob only when both the sender and the receiver are written in Golang.

Advantages

  • Gob encoding and decoding does not require the structure of the sender and the receiver to be identical.
  • More efficient than JSON encoding and decoding.
  • For a Go-specific environment, such as communicating between two servers written in Go, there’s an opportunity to build something much easier to use and possibly more efficient.

Encoding Used in Profile API

  • When we were planning to introduce caching in Profile API, we had explored different caching options like redis, memcache etc.
  • We had also explored efficient ways to serialize and deserialize data for storing and fetching from cache.
  • We tried both JSON and gob. In our benchmarks, gob was much more efficient and faster than JSON.
  • Since this cached data was only supposed to be used by Profile API, which is written in Golang, we went ahead with gob for serializing profile data structs.
  • You can see the implementation of gob in Profile API here.
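As a rough sketch of what gob-based cache serialization can look like, consider the helpers below. This is not the actual Profile API code: CachedProfile, toCacheBytes, fromCacheBytes and the in-memory map standing in for Redis or Memcached are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// CachedProfile is a hypothetical struct, not the real Profile API type.
type CachedProfile struct {
	ID        int64
	Email     string
	Interests []string
}

// toCacheBytes serializes v with gob so it can be stored as a cache value.
func toCacheBytes(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// fromCacheBytes decodes a cached value into dst, which must be a pointer.
func fromCacheBytes(data []byte, dst interface{}) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(dst)
}

func main() {
	cache := map[string][]byte{} // in-memory stand-in for the real cache

	data, err := toCacheBytes(CachedProfile{ID: 42, Email: "a@example.com", Interests: []string{"go"}})
	if err != nil {
		log.Fatal(err)
	}
	cache["profile:42"] = data

	var p CachedProfile
	if err := fromCacheBytes(cache["profile:42"], &p); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", p)
}
```

Because each entry is written with its own Encoder, every cached value carries its own type description and can be decoded independently of the others.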

References