jq to sort tags into CSV within AWS output. | welcome to my website. there is content on it. you're reading it now

jq is a command-line processing language for json! AWS has a bunch of CLI commands you can run that output in … pseudo-JSON! How do you sort that? Well … there are a lot of great posts on stackoverflow for it. I’ve gone through and am going to share the ones I liked, and my small changes, that I used in order to actually get something useful out of it.

This is what I use to actually convert to csv:

https://stackoverflow.com/questions/32960857/how-to-convert-arbitrary-simple-json-to-csv-using-jq

Yes, this means I’m converting to something valid to this string, and am /then/ converting to csv. It makes it easier for me, and really … as long as the results are valid that’s all I care about.

So what this expects is a json document like:

{
    "key":"value",
    "key":"value",
    "key":"value"
}

AWS output is usually like this. However for some things it’s different, specifically Tags, especially if you have a lot of them:

{
    "key":"value",
    "key":"value",
    "key":"value",
    "Tags": [
     { "Key": key,
       "value": value
     },
     { "Key": key,
        "value": value
     },
     { "Key": key,
       "value": value
     },
}

Why is it like this?? I don’t know. Don’t ask me. Anyways, you can clear this up by adding a few different things together. I’ve just been using with_entries. With this you can pick which entries you want. So for Tags, you can just first ignore the tags:

| jq -r 'map( [(with_entries(select(.key != "Tags"))]

and then add the Tags, which you’re processing differently:

+ [with_entries(select(.key == "Tags" and .value != null))

The select for no null values is because otherwise jq errors out. Avoid the errors, it’s good practice, woo. If it’s null all is fine. Because in AWS’s syntax it has this as an array we want to get it out of the array, so we add:

| .[]? |

And using this stackoverflow idea:

https://stackoverflow.com/questions/34226370/jq-print-key-and-value-for-each-entry-in-an-object

we can edit it to grab the values and set it up like we need to ‘fix’ the key-value pairs. So this:

keys[] as $k | "\($k), \(.[$k] | .ip)"

becomes this:

keys[] as $k | "\(.[$k] | .Key), \(.[$k] | .Value)")]|add)'

The add at the end makes sure that we add all of the values together and create our flattened JSON output for the csv line properly.

All together now, it looks like this:

(your AWS input command) | jq -r 'map( [(with_entries(select(.key != "Tags"))] + [with_entries(select(.key == "Tags" and .value != null)) | .[]? | keys[] as $k | "\(.[$k] | .Key), \(.[$k] | .Value)")]|add)' | jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv' > outputfile.csv

So if you’re outputting from AWS, or are outputting any json that has a ‘Tags’ field or other field that you need to flatten to query, you can do this to it in order to properly flatten the fields and have the tags be searchable for specific values. From here you can do whatever you want to the resulting, properly-organized code.

This has been really useful for me. I’m pretty new to jq - full disclaimer that it is definitely not exactly-syntactically-correct jq - but this is working for my usecase!

Tags: jq code tech