Converting JSON to a struct in Go
I’m in the process of writing a little command line app in Go that uploads images to Flickr. It’s called Rodeo and it’s working well as a project with which to learn Go in 2020.
Rodeo uses ExifTool to interrogate an image file for metadata. ExifTool is wonderful and very comprehensive, and with the -j switch it can return JSON.
Today I want to note down what I learned about dealing with JSON.
$ exiftool -j -iptc:* -exif:* RKA-glorious.jpeg
[{
"SourceFile": "RKA-glorious.jpeg",
"Country-PrimaryLocationName": "United Kingdom",
"Caption-Abstract": "On a wet gala weekend, Glorious returns after its 2018-19 restoration",
"Keywords": ["diesel","50 033","Severn Valley Railway","Glorious","railway"],
"ObjectName": "Glorious Sunrise",
"ImageDescription": "On a wet gala weekend, Glorious returns after its 2018-19 restoration",
"Make": "FUJIFILM",
"Model": "X-T3",
"ISO": 500,
"DateTimeOriginal": "2019:10:04 11:50:40",
"ShutterSpeedValue": "1/1000",
"ApertureValue": 4.0,
...
}]
(For brevity, I’ve removed many of the other fields that were returned…)
I don’t need all the data, and what I do need I would like in a struct. It partially looks like this:
type ImageInfo struct {
	Title        string
	Keywords     []string
	ShutterSpeed string
	Aperture     float64
	// etc...
}
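The code to unmarshall the JSON into the ImageInfo struct (excluding error checking) looks like this. Because ExifTool returns an array with one object per file, we unmarshall into a slice and take the first element:
out, err := exec.Command(exiftool, "-j", "-exif:*", "-iptc:*", filename).Output()
infos := []ImageInfo{}
err = json.Unmarshal(out, &infos)
info := infos[0]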
info is now populated with the data from the JSON.
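To make the step above concrete, here is a minimal sketch with the error checking put back in. It uses a hard-coded sample in place of the real ExifTool call, and the helper name decodeInfos is my own invention for illustration. Note that the sample is a JSON array, as in the ExifTool output shown earlier, so the sketch decodes into a slice.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ImageInfo mirrors the untagged struct shown above.
type ImageInfo struct {
	Title        string
	Keywords     []string
	ShutterSpeed string
	Aperture     float64
}

// decodeInfos (a hypothetical helper) parses ExifTool-style JSON, which is
// an array holding one object per file, into a slice of ImageInfo.
func decodeInfos(data []byte) ([]ImageInfo, error) {
	var infos []ImageInfo
	if err := json.Unmarshal(data, &infos); err != nil {
		return nil, fmt.Errorf("decoding exiftool output: %w", err)
	}
	return infos, nil
}

func main() {
	// Sample input standing in for the bytes returned by Output().
	sample := []byte(`[{
		"SourceFile": "RKA-glorious.jpeg",
		"Keywords": ["diesel", "railway"],
		"ObjectName": "Glorious Sunrise",
		"ShutterSpeedValue": "1/1000",
		"ApertureValue": 4.0
	}]`)

	infos, err := decodeInfos(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if len(infos) == 0 {
		fmt.Println("no files in output")
		return
	}
	fmt.Printf("%+v\n", infos[0])
}
```

With this untagged struct only Keywords is filled in, because it is the only field whose name matches a JSON property name. The other fields keep their zero values.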
Mapping JSON names to struct names
By default, unmarshalling matches each JSON property name to a struct field with the same name (ignoring case). The ExifTool names rarely match the field names I want, so to provide the mapping we use a struct tag, which is a string literal placed after the field's type.
The mapping for me is:
type ImageInfo struct {
	Title        string   `json:"ObjectName"`
	Keywords     []string `json:"Keywords"`
	ShutterSpeed string   `json:"ShutterSpeedValue"`
	Aperture     float64  `json:"ApertureValue"`
	// etc...
}
One thing that caught me out for a while is that the JSON property name inside the tag has to be enclosed in double quotes. Without them the tag is silently ignored and the field is never populated.
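Here is a small sketch of the tagged struct in action, plus a look at why the quotes matter. The helpers decodeOne and tagName are names I made up for this example. tagName uses the reflect package's StructTag lookup, which is how struct tags are normally read.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// ImageInfo is the tagged struct from the post.
type ImageInfo struct {
	Title        string   `json:"ObjectName"`
	Keywords     []string `json:"Keywords"`
	ShutterSpeed string   `json:"ShutterSpeedValue"`
	Aperture     float64  `json:"ApertureValue"`
}

// decodeOne (hypothetical helper) unmarshals a single JSON object.
func decodeOne(data []byte) (ImageInfo, error) {
	var info ImageInfo
	err := json.Unmarshal(data, &info)
	return info, err
}

// tagName (hypothetical helper) reports the value stored under the "json"
// key of a raw struct tag, or "" if the tag is not in key:"value" form.
func tagName(rawTag string) string {
	return reflect.StructTag(rawTag).Get("json")
}

func main() {
	info, err := decodeOne([]byte(`{
		"ObjectName": "Glorious Sunrise",
		"ShutterSpeedValue": "1/1000",
		"ApertureValue": 4.0
	}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(info.Title, info.ShutterSpeed, info.Aperture)

	// The quoted form is found by the lookup; the unquoted form is not.
	fmt.Printf("%q\n", tagName(`json:"ObjectName"`))
	fmt.Printf("%q\n", tagName(`json:ObjectName`))
}
```

Running go vet on a struct with an unquoted tag should flag the problem, which would have saved me some time.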
JSON that can be a string or an array
At this point things started to work, but it turns out that the Keywords property in the JSON can be either a string or an array of strings.
When there are many keywords as in the example above you get this:
{
"Keywords": ["diesel","50 033","Severn Valley Railway","Glorious","railway"]
}
However, when there is one keyword, ExifTool produces:
{
"Keywords": "tree"
}
The standard unmarshalling cannot handle this, so the solution is to invent a type and attach our own unmarshalling code to it. For this case, we need to unmarshall the single string into an array of strings with one element.
We start with a new type called stringArray:
// A stringArray is an array of strings that has been unmarshalled from a JSON
// property that could be either a string or an array of string
type stringArray []string
We then need to make our new stringArray type conform to the json.Unmarshaler interface by adding an UnmarshalJSON() method:
func (sa *stringArray) UnmarshalJSON(data []byte) error {
	if len(data) > 0 {
		switch data[0] {
		case '"':
			var s string
			if err := json.Unmarshal(data, &s); err != nil {
				return err
			}
			*sa = []string{s}
		case '[':
			var s []string
			if err := json.Unmarshal(data, &s); err != nil {
				return err
			}
			*sa = s
		}
	}
	return nil
}
This code takes a slice of bytes holding the raw JSON for this property. We look at the first character: if it is a double quote, we know we have a string, which we unmarshall to a string and then convert to an array with []string{s}. If the first character is an opening square bracket, we know we have an array of strings and so can unmarshall to an array of strings and assign it directly. Otherwise (a number or null, for example), we leave the value untouched, so it stays nil if nothing was assigned before. We could even look for other types and handle them, but I’ve not needed that yet.
We then update our ImageInfo to mark keywords as a stringArray:
type ImageInfo struct {
	Title        string      `json:"ObjectName"`
	Keywords     stringArray `json:"Keywords"`
	ShutterSpeed string      `json:"ShutterSpeedValue"`
	Aperture     float64     `json:"ApertureValue"`
	// etc...
}
The very nice thing about doing it this way is that when the unmarshalling is complete, we know that we can trust ImageInfo.Keywords to always be an array of strings and the rest of our code doesn’t have to worry about it.
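Putting the pieces together, this sketch feeds both shapes of Keywords through the same struct. The sample documents are cut down to two properties, and keywordsOf is a helper name I've invented for the demonstration. The stringArray type and its UnmarshalJSON method are as shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stringArray and its UnmarshalJSON method are taken from the post.
type stringArray []string

func (sa *stringArray) UnmarshalJSON(data []byte) error {
	if len(data) > 0 {
		switch data[0] {
		case '"':
			var s string
			if err := json.Unmarshal(data, &s); err != nil {
				return err
			}
			*sa = []string{s}
		case '[':
			var s []string
			if err := json.Unmarshal(data, &s); err != nil {
				return err
			}
			*sa = s
		}
	}
	return nil
}

// ImageInfo is trimmed to two fields for this example.
type ImageInfo struct {
	Title    string      `json:"ObjectName"`
	Keywords stringArray `json:"Keywords"`
}

// keywordsOf (hypothetical helper) decodes an ExifTool-style array and
// returns the keywords of the first file, always as a slice of strings.
func keywordsOf(doc string) ([]string, error) {
	var infos []ImageInfo
	if err := json.Unmarshal([]byte(doc), &infos); err != nil {
		return nil, err
	}
	if len(infos) == 0 {
		return nil, nil
	}
	return infos[0].Keywords, nil
}

func main() {
	docs := []string{
		`[{"ObjectName": "One", "Keywords": "tree"}]`,
		`[{"ObjectName": "Many", "Keywords": ["diesel", "railway"]}]`,
	}
	for _, doc := range docs {
		kws, err := keywordsOf(doc)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// The calling code can range over kws without caring which
		// shape the JSON had.
		fmt.Println(len(kws), kws)
	}
}
```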
To conclude
Implementing an unmarshaller is an extensible way to take JSON data and put it into an arbitrarily complex struct. I’ve needed to convert a string to an array of strings, but there’s no reason why it couldn’t also handle those cases when the JSON has a number in a string such as { "value": "123" } and we want that to be held as an int.
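As a sketch of that last idea, here is one way it might look. The type flexInt and the reading struct are hypothetical names for this example. The type accepts either a bare number or a number wrapped in a string. If the value is always quoted, the built-in `json:",string"` tag option may be enough on its own, so treat this as an illustration of the technique rather than the only approach.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// flexInt is a hypothetical int type that can be unmarshalled from either
// a JSON number (123) or a JSON string containing a number ("123").
type flexInt int

func (fi *flexInt) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // by convention, null leaves the value unchanged
	}
	if len(data) > 0 && data[0] == '"' {
		// Decode the JSON string first, then parse its contents.
		var s string
		if err := json.Unmarshal(data, &s); err != nil {
			return err
		}
		n, err := strconv.Atoi(s)
		if err != nil {
			return fmt.Errorf("flexInt: %q is not an integer: %w", s, err)
		}
		*fi = flexInt(n)
		return nil
	}
	// Otherwise let the standard decoder handle a plain number.
	var n int
	if err := json.Unmarshal(data, &n); err != nil {
		return err
	}
	*fi = flexInt(n)
	return nil
}

// reading is a hypothetical struct matching { "value": "123" }.
type reading struct {
	Value flexInt `json:"value"`
}

// parseValue (hypothetical helper) returns the value as a plain int.
func parseValue(doc string) (int, error) {
	var r reading
	if err := json.Unmarshal([]byte(doc), &r); err != nil {
		return 0, err
	}
	return int(r.Value), nil
}

func main() {
	for _, doc := range []string{`{"value": "123"}`, `{"value": 45}`, `{"value": "abc"}`} {
		n, err := parseValue(doc)
		fmt.Println(doc, "->", n, err)
	}
}
```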
You write:
Couldn't that have been:
Almost certainly! Go is still very new to me…
Just a thought: what if the first character of data []byte is neither `"` nor `[`? I know this is not the case, but you could return an error for the default case and catch this just in case :-)
Good point Natan! I'm very trusting of my data source here…
Thanks, Rob for this great work!