
5 Common Pitfalls of JSON Manipulation in Golang






Published on

2024-05-25




JSON is a data format that many developers use in their daily work, typically for configuration files or network data transmission. Because it is simple, easy to understand, and readable, JSON has become one of the most commonly used formats in the IT industry. Go, like many other languages, supports it at the standard library level through encoding/json.

Just as JSON itself is easy to understand, the encoding/json library for manipulating it is also very easy to use. Still, I believe many developers run into strange problems or bugs, as I did when I first used this library. This article summarizes the problems and errors I personally encountered when manipulating JSON with Go. I hope it helps readers handle JSON more correctly and avoid unnecessary "pitfalls".

This article is based on Go 1.22. There may be minor differences between versions, so please keep that in mind when reading and applying it. It covers only the standard library's encoding/json and does not involve any third-party JSON library.

Basic Usage

Let's first look at the basic usage of encoding/json.
As a data format, JSON has only two core operations: serialization and deserialization. Serialization converts a Go object into a string (or byte sequence) in JSON format. Deserialization is the opposite: it converts JSON data into a Go object.

The object mentioned here is a broad concept. It refers not only to structs but also to slice and map values, which support JSON serialization as well.

import (
    "encoding/json"
    "fmt"
)

type Person struct {
    ID   uint
    Name string
    Age  int
}

// MarshalPerson serializes a Person into a JSON string.
func MarshalPerson() {
    p := Person{
        ID:   1,
        Name: "Bruce",
        Age:  18,
    }
    output, err := json.Marshal(p)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(output))
}

// UnmarshalPerson deserializes a JSON string into a Person.
func UnmarshalPerson() {
    str := `{"ID":1,"Name":"Bruce","Age":18}`
    var p Person
    err := json.Unmarshal([]byte(str), &p)
    if err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", p)
}

The core is two functions, json.Marshal and json.Unmarshal, for serialization and deserialization respectively. Both functions return an error; here I simply panic.
Readers who have used encoding/json may know that there is another pair of tools that are often used: NewEncoder and NewDecoder. A quick look at the source code reveals that their underlying core logic is the same as that of Marshal and Unmarshal, so I will not give examples here.
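
As a small illustration of the earlier remark that slices and maps are supported too, here is a minimal sketch. The helper name marshalAny is made up for this example and is not part of encoding/json.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// marshalAny is a hypothetical helper: it serializes any Go value
// and returns the JSON text as a string.
func marshalAny(v any) string {
    out, err := json.Marshal(v)
    if err != nil {
        panic(err)
    }
    return string(out)
}

func main() {
    // A slice becomes a JSON array.
    fmt.Println(marshalAny([]int{1, 2, 3}))

    // A map with string keys becomes a JSON object.
    // encoding/json emits map keys in sorted order.
    fmt.Println(marshalAny(map[string]int{"b": 2, "a": 1}))

    // Deserialization works the same way in the other direction.
    var scores map[string]int
    if err := json.Unmarshal([]byte(`{"a":1,"b":2}`), &scores); err != nil {
        panic(err)
    }
    fmt.Println(scores["a"], scores["b"])
}
```

Running it should print [1,2,3], then {"a":1,"b":2}, then 1 2.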

Common pitfalls

1. Public or private field processing

This may be the most common mistake made by developers who are new to Go. If we use a struct to process JSON, its fields must be public (exported), that is, the first letter must be capitalized. Private (unexported) fields are ignored.
For example:

type Person struct {
    ID   uint
    Name string
    age  int // unexported: invisible to encoding/json
}

func MarshalPerson() {
    p := Person{
        ID:   1,
        Name: "Bruce",
        age:  18,
    }
    output, err := json.Marshal(p)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(output))
}

func UnmarshalPerson() {
    str := `{"ID":1,"Name":"Bruce","age":18}`
    var p Person
    err := json.Unmarshal([]byte(str), &p)
    if err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", p)
}
// Output Marshal:
{"ID":1,"Name":"Bruce"}
// Output Unmarshal:
{ID:1 Name:Bruce age:0}

Here, age is a private field, so there is no age key in the serialized JSON string. Similarly, when the JSON string is deserialized into Person, the value of age is not read.
The reason is simple. If we dig into the source code under Marshal, we find that it uses reflect to inspect the struct dynamically:

// .../src/encoding/json/encode.go

func (e *encodeState) marshal(v any, opts encOpts) (err error) {
    // ...skip
    e.reflectValue(reflect.ValueOf(v), opts)
    return nil
}

Reflection can see that a private field exists, but Go does not allow code in another package to set it or to extract its value through reflection. encoding/json therefore skips private fields entirely, and the same is true for deserialization.
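
If the JSON key has to stay lowercase, one common way around this is to export the field and map its name with a struct tag. The sketch below is only an illustration: TaggedPerson, marshalTagged, and unmarshalTagged are names invented for this example, while the `json:"age"` tag syntax is the standard encoding/json one.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// TaggedPerson is a hypothetical variant of Person: Age is exported so
// encoding/json can access it, and the tag keeps the lowercase key "age".
type TaggedPerson struct {
    ID   uint
    Name string
    Age  int `json:"age"`
}

// marshalTagged returns the JSON text for p.
func marshalTagged(p TaggedPerson) string {
    out, err := json.Marshal(p)
    if err != nil {
        panic(err)
    }
    return string(out)
}

// unmarshalTagged parses s into a TaggedPerson.
func unmarshalTagged(s string) TaggedPerson {
    var p TaggedPerson
    if err := json.Unmarshal([]byte(s), &p); err != nil {
        panic(err)
    }
    return p
}

func main() {
    fmt.Println(marshalTagged(TaggedPerson{ID: 1, Name: "Bruce", Age: 18}))
    fmt.Printf("%+v\n", unmarshalTagged(`{"ID":1,"Name":"Bruce","age":18}`))
}
```

With the tag in place, the age value should survive in both directions: the first line prints {"ID":1,"Name":"Bruce","age":18} and the second prints {ID:1 Name:Bruce Age:18}.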

2. Be wary of structure combinations

Go is object-oriented, but it has no classes, only structs, and structs have no inheritance. Instead, Go reuses structs through composition, that is, by embedding one struct in another. In many cases this is very convenient, because we can use the members of the embedded struct as if they belonged to the outer struct itself, like this:

type Person struct {
    ID   uint
    Name string
    address
}

type address struct {
    Code   int
    Street string
}

func (a address) PrintAddr() {
    fmt.Println(a.Code, a.Street)
}

func Group() {
    p := Person{
        ID:   1,
        Name: "Bruce",
        address: address{
            Code:   100,
            Street: "Main St",
        },
    }
    // Access all of address's fields and methods directly
    fmt.Println(p.Code, p.Street)
    p.PrintAddr()
}
// Output
100 Main St
100 Main St

Struct embedding is indeed very convenient. However, there is a small problem to watch for when it is combined with JSON parsing. Please see the following code:

// The structs used here are the same as in the previous example,
// so they are not repeated. Errors are ignored to save space.

func MarshalPerson() {
    p := Person{
        ID:   1,
        Name: "Bruce",
        address: address{
            Code:   100,
            Street: "Main St",
        },
    }
    // MarshalIndent makes the output prettier
    output, _ := json.MarshalIndent(p, "", " ")
    fmt.Println(string(output))
}

func UnmarshalPerson() {
    str := `{"ID":1,"Name":"Bruce","address":{"Code":100,"Street":"Main St"}}`
    var p Person
    _ = json.Unmarshal([]byte(str), &p)
    fmt.Printf("%+v\n", p)
}
// Output MarshalPerson:
{
 "ID": 1,
 "Name": "Bruce",
 "Code": 100,
 "Street": "Main St"
}
// Output UnmarshalPerson:
{ID:1 Name:Bruce address:{Code:0 Street:}}

Here, a Person object is declared first, and the serialized result is pretty-printed with MarshalIndent. The printout shows that the entire Person object is flattened. In the Person struct definition, address still looks like a member field despite the embedding, so we sometimes take it for granted that the serialized JSON of Person looks like this:

// The JSON serialization result we might imagine
{
 "ID": 1,
 "Name": "Bruce",
 "address": {
  "Code": 100,
  "Street": "Main St"
 }
}

But it does not; it is flattened. This matches what we saw earlier when accessing the address fields directly through Person: the fields of address behave as if they were fields of Person itself. Keep in mind that embedding flattens the serialized JSON. For the same reason, UnmarshalPerson above cannot fill Code and Street from a nested "address" object.

Another somewhat counterintuitive point: address is a private struct, so shouldn't it be left out of the serialization? It is not, and this is one of the shortcomings of embedding for JSON: it exposes the public fields of the private embedded struct.
Unless there is a special need (for example, the original JSON data is already flattened, or several structs share repeated fields that you want to reuse), I personally recommend writing it like this:

type Person struct {
    ID      uint
    Name    string
    Address address
}
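
To see what the named-field version produces, here is a minimal sketch. It reuses the Person and address types from this section; marshalNested and unmarshalNested are hypothetical helper names introduced only for the example.

```go
package main

import (
    "encoding/json"
    "fmt"
)

type address struct {
    Code   int
    Street string
}

// Person holds address as a named, exported field instead of embedding it.
type Person struct {
    ID      uint
    Name    string
    Address address
}

// marshalNested returns the compact JSON text for p.
func marshalNested(p Person) string {
    out, err := json.Marshal(p)
    if err != nil {
        panic(err)
    }
    return string(out)
}

// unmarshalNested parses s into a Person.
func unmarshalNested(s string) Person {
    var p Person
    if err := json.Unmarshal([]byte(s), &p); err != nil {
        panic(err)
    }
    return p
}

func main() {
    p := Person{ID: 1, Name: "Bruce", Address: address{Code: 100, Street: "Main St"}}
    s := marshalNested(p)
    fmt.Println(s)
    // Round trip: the nested object is read back into the Address field.
    fmt.Printf("%+v\n", unmarshalNested(s))
}
```

The address now appears as a nested "Address" object rather than being flattened, and the round trip restores Code and Street.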

3. Deserialize some member fields

Let's go straight to the code:

type Person struct {
    ID   uint
    Name string
}

// PartUpdateIssue simulates parsing two different
// JSON strings into the same struct variable
func PartUpdateIssue() {
    var p Person
    // The first payload has the ID field and it is not 0
    str := `{"ID":1,"Name":"Bruce"}`
    _ = json.Unmarshal([]byte(str), &p)
    fmt.Printf("%+v\n", p)
    // The second payload has no ID field, so
    // deserializing into p again preserves the previous value
    str = `{"Name":"Jim"}`
    _ = json.Unmarshal([]byte(str), &p)
    // Notice the output ID is still 1
    fmt.Printf("%+v\n", p)
}
// Output
{ID:1 Name:Bruce}
{ID:1 Name:Jim}

As the code comments show, when we reuse the same struct variable to deserialize different JSON data and one of the payloads contains only some of the fields, the fields it omits keep the values from the previous deserialization. This pollutes the result with dirty data.
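
One simple way to avoid the leftover values, sketched here under the assumption that each payload should be decoded independently, is to decode into a fresh variable every time. decodePerson is a hypothetical helper name; resetting with p = Person{} before each call would have the same effect.

```go
package main

import (
    "encoding/json"
    "fmt"
)

type Person struct {
    ID   uint
    Name string
}

// decodePerson decodes data into a brand-new Person, so fields missing
// from the JSON end up as zero values instead of values from earlier calls.
func decodePerson(data string) (Person, error) {
    var p Person // fresh zero value on every call
    err := json.Unmarshal([]byte(data), &p)
    return p, err
}

func main() {
    for _, s := range []string{`{"ID":1,"Name":"Bruce"}`, `{"Name":"Jim"}`} {
        p, err := decodePerson(s)
        if err != nil {
            panic(err)
        }
        fmt.Printf("%+v\n", p)
    }
}
```

This time the second line should read {ID:0 Name:Jim}: Jim no longer inherits Bruce's ID.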

4. Processing pointer fields

Many developers get a headache when they hear the word pointer, although there is no need to. Pointers in Go do, however, bring developers one of the most common panics in Go programs: the nil pointer dereference. What happens when pointers are combined with JSON?

Look at the following code:

type Address struct {
    Code   int
    Street string
}

type Person struct {
    ID      uint
    Name    string
    Address *Address
}

func UnmarshalPtr() {
    str := `{"ID":1,"Name":"Bruce"}`
    var p Person
    _ = json.Unmarshal([]byte(str), &p)
    fmt.Printf("%+v\n", p)
    // The next line would panic because p.Address is nil
    // fmt.Printf("%+v\n", p.Address.Street)
}
// Output
{ID:1 Name:Bruce Address:<nil>}

We define the Address member as a pointer. When we deserialize JSON that does not contain Address, the pointer field stays nil because there is no corresponding data. If we access p.Address.xxx directly, the program panics because p.Address is nil.

So if a struct has a pointer field, remember to check whether the pointer is nil before using it. This is a bit tedious, but it can't be helped. After all, a few extra lines of code cost far less than a panic in a production environment.
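
The nil check described above might look like the following sketch. streetOf and mustDecode are hypothetical helpers written for this illustration, not functions from the original code.

```go
package main

import (
    "encoding/json"
    "fmt"
)

type Address struct {
    Code   int
    Street string
}

type Person struct {
    ID      uint
    Name    string
    Address *Address
}

// streetOf returns the street and true when an address is present,
// or "" and false when p.Address is nil.
func streetOf(p Person) (string, bool) {
    if p.Address == nil {
        return "", false
    }
    return p.Address.Street, true
}

// mustDecode parses s into a Person and panics on malformed JSON.
func mustDecode(s string) Person {
    var p Person
    if err := json.Unmarshal([]byte(s), &p); err != nil {
        panic(err)
    }
    return p
}

func main() {
    p := mustDecode(`{"ID":1,"Name":"Bruce"}`)
    if street, ok := streetOf(p); ok {
        fmt.Println("street:", street)
    } else {
        fmt.Println("no address for", p.Name)
    }
}
```

Wrapping the check in a small accessor keeps callers from dereferencing p.Address directly.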

In addition, when creating a structure with a pointer field, the assignment of the pointer field can be relatively cumbersome:

type Person struct {
    ID   int
    Name string
    Age  *int
}

func Foo() {
    p := Person{
        ID:   1,
        Name: "Bruce",
        Age:  new(int),
    }
    *p.Age = 20
    // ...
}
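
A small generic helper can take some of the tedium out of this. The sketch below assumes Go 1.18 or later for generics; ptr and newPerson are hypothetical names and ptr is not part of the standard library.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// ptr is a hypothetical helper that returns a pointer to a copy of v,
// so a pointer field can be set inline in a struct literal.
func ptr[T any](v T) *T {
    return &v
}

type Person struct {
    ID   int
    Name string
    Age  *int
}

// newPerson builds the same value as Foo above, without the extra assignment.
func newPerson() Person {
    return Person{
        ID:   1,
        Name: "Bruce",
        Age:  ptr(20),
    }
}

func main() {
    out, err := json.Marshal(newPerson())
    if err != nil {
        panic(err)
    }
    // Pointer fields are serialized as the value they point to.
    fmt.Println(string(out))
}
```

The program should print {"ID":1,"Name":"Bruce","Age":20}.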

5. Problems that may be caused by zero value (default value)

Zero values are a feature of Go variables that we can simply think of as default values. If we do not explicitly assign a value to a variable, Go gives it one. As the earlier examples showed, the zero value of int is 0, the zero value of string is the empty string, the zero value of a pointer is nil, and so on.

What are the pitfalls of processing JSON with zero values?
Take a look at the following example:

type Person struct {
    Name        string
    ChildrenCnt int
}

func ZeroValueConfusion() {
    // No ChildrenCnt in the data
    str := `{"Name":"Bruce"}`
    var p Person
    _ = json.Unmarshal([]byte(str), &p)
    fmt.Printf("%+v\n", p)
    // ChildrenCnt is explicitly 0
    str2 := `{"Name":"Jim","ChildrenCnt":0}`
    var p2 Person
    _ = json.Unmarshal([]byte(str2), &p2)
    fmt.Printf("%+v\n", p2)
}
// Output
{Name:Bruce ChildrenCnt:0}
{Name:Jim ChildrenCnt:0}

We add a ChildrenCnt field to the Person struct to record how many children a person has. Because the zero value of int is 0, the field is left at 0 when the JSON loaded into p contains no ChildrenCnt data. Bruce's count is 0 because the data is missing, while Jim's count is 0 because the JSON says so, and the two cases cannot be told apart. In fact, the number of Bruce's children should be "unknown". If we really treat it as 0, it may cause problems in the business logic.
In scenarios with strict data requirements, this confusion can be fatal. So, is there any way to avoid this zero-value interference?
Let's change the type of Person's ChildrenCnt to *int and see what happens:

type Person struct {
    Name        string
    ChildrenCnt *int
}
// Output (the pointer address varies between runs)
{Name:Bruce ChildrenCnt:<nil>}
{Name:Jim ChildrenCnt:0xc0000124c8}

The difference is that Bruce has no data, so his ChildrenCnt is nil, while Jim's ChildrenCnt is a non-nil pointer to 0. This makes it clear that the number of Bruce's children is unknown. Essentially, this approach still relies on a zero value, the zero value of the pointer, which is a bit like fighting fire with fire (laughs).
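
Reading such a field then means checking for nil before dereferencing. Here is a minimal sketch; describeChildren is a hypothetical function name used only for illustration.

```go
package main

import (
    "encoding/json"
    "fmt"
)

type Person struct {
    Name        string
    ChildrenCnt *int
}

// describeChildren decodes data and reports the number of children,
// using "unknown" when the field was absent from the JSON.
func describeChildren(data string) (string, error) {
    var p Person
    if err := json.Unmarshal([]byte(data), &p); err != nil {
        return "", err
    }
    if p.ChildrenCnt == nil {
        return p.Name + ": unknown", nil
    }
    return fmt.Sprintf("%s: %d", p.Name, *p.ChildrenCnt), nil
}

func main() {
    for _, s := range []string{`{"Name":"Bruce"}`, `{"Name":"Jim","ChildrenCnt":0}`} {
        msg, err := describeChildren(s)
        if err != nil {
            panic(err)
        }
        fmt.Println(msg)
    }
}
```

This should print "Bruce: unknown" and "Jim: 0". Note that an explicit JSON null also leaves the pointer nil, so a missing field and a null field look the same with this approach.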

Summary

In this article, I listed 5 mistakes I made when using the encoding/json library, most of which I encountered at work. If you haven't run into them yet, congratulations! Let this be a reminder to be careful when working with JSON in the future. If you have already hit these problems and been confused by them, I hope this article helps.

