JSON to JSX code with Babel

A few days ago, I went through a course with Steve Kinney on building your own language. In the course, we build a basic compiler, which takes an input string (we used a LISP-ish language with a (add 3 4 5) syntax), tokenizes it, parses the tokens into an Abstract Syntax Tree (AST), and then generates code from that tree.

The steps look like:

const input = '(add 3 4)'
const tokens = tokenize(input)
// yields:
[
  {type: 'Parenthesis', value: '('},
  {type: 'Name', value: 'add'},
  {type: 'Number', value: 3},
  {type: 'Number', value: 4},
  {type: 'Parenthesis', value: ')'}
]
const tree = parse(tokens)
// yields:
{
  type: 'CallExpression',
  name: 'add',
  arguments: [
    { type: 'NumericLiteral', value: 3 },
    { type: 'NumericLiteral', value: 4 },
  ],
}
const generated = generate(tree)
// yields:
'add(3, 4)'
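
To make those three steps a bit more concrete, here is a minimal sketch of what tokenize, parse, and generate could look like for this tiny language. This is my own stripped-down version, not the code from the course: it assumes names are plain letters, numbers are non-negative integers, and every argument is either a number or another parenthesized call.

```javascript
// Sketch only: a toy tokenizer, parser, and generator for "(add 3 4)".
const tokenize = input => {
  const tokens = [];
  let i = 0;
  while (i < input.length) {
    const char = input[i];
    if (char === '(' || char === ')') {
      tokens.push({type: 'Parenthesis', value: char});
      i++;
    } else if (/\s/.test(char)) {
      i++; // whitespace only separates tokens
    } else if (/[0-9]/.test(char)) {
      let digits = '';
      while (i < input.length && /[0-9]/.test(input[i])) digits += input[i++];
      tokens.push({type: 'Number', value: Number(digits)});
    } else if (/[a-z]/i.test(char)) {
      let name = '';
      while (i < input.length && /[a-z]/i.test(input[i])) name += input[i++];
      tokens.push({type: 'Name', value: name});
    } else {
      throw new SyntaxError(`Unexpected character: ${char}`);
    }
  }
  return tokens;
};

const isClosingParen = token =>
  token.type === 'Parenthesis' && token.value === ')';

const parse = tokens => {
  let position = 0;
  const walk = () => {
    const token = tokens[position++];
    if (token === undefined) throw new SyntaxError('Unexpected end of input');
    if (token.type === 'Number') {
      return {type: 'NumericLiteral', value: token.value};
    }
    if (token.type === 'Parenthesis' && token.value === '(') {
      const nameToken = tokens[position++];
      if (!nameToken || nameToken.type !== 'Name') {
        throw new SyntaxError('Expected a function name after "("');
      }
      const node = {type: 'CallExpression', name: nameToken.value, arguments: []};
      // keep collecting arguments until we reach the matching ")"
      while (position < tokens.length && !isClosingParen(tokens[position])) {
        node.arguments.push(walk());
      }
      if (position >= tokens.length) throw new SyntaxError('Missing ")"');
      position++; // consume the ")"
      return node;
    }
    throw new SyntaxError(`Unexpected token: ${token.value}`);
  };
  return walk();
};

const generate = node => {
  if (node.type === 'NumericLiteral') return String(node.value);
  if (node.type === 'CallExpression') {
    return `${node.name}(${node.arguments.map(generate).join(', ')})`;
  }
  throw new TypeError(`Unknown node type: ${node.type}`);
};

console.log(generate(parse(tokenize('(add 3 4)')))); // add(3, 4)
```

Because parse calls walk recursively for every argument, nested calls like (add 1 (subtract 5 2)) fall out of the same code for free.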

There are many other steps involved, and we built only a super simple language that can do basic computation and define variables. By no means do I fully "understand" "literally any" of this. But! The course was super informative. And it hits the same concepts that Gary Bernhardt covers in his Compiler From Scratch webcast on Destroy All Software.

The part of building your own language that made me want to write about transforming JSON to JSX was when we got to the code generation step. Steve built the language to conform to what Babel - that tool that I am always relying on to make my JavaScript work - can read and work with.

I'd never really taken the time to peek under the hood on what Babel is doing, so having a Smart Dude show me the basics made it seem a lot less intimidating. At its core, Babel's just compiling JavaScript into JavaScript. It reads in next-generation JavaScript input, tokenizes it, parses those tokens into an abstract syntax tree, and ultimately generates older, safer, more-browsers-compliant JavaScript.

Since I now knew the steps to make a Babel-compliant abstract syntax tree (AST), I wanted to see if I could turn a JSON object that holds data about the markup structure of a page into a JSX string.

The goal

The goal is to turn an input like:

// json input
// (children is where we would specify child components, but we're not going to worry about them for now)
{
  "root": {
    "type": "div",
    "props": {
      "className": "hello",
      "numberThing": 5,
      "isTrue": true
    },
    "innerText": "this is the text",
    "children": []
  }
}

into a React string that looks like:

<div className="hello" numberThing={5} isTrue={true}>this is the text</div>

And in order to make that transformation, I need to use Babel's generate functionality on an AST that I build. The code is going to look something like:

// json-to-jsx.js
const generate = require('@babel/generator').default;
const toJSX = input => {
const parsed = JSON.parse(input) // parsing is solved!
const ast = ????
return generate(ast).code; // generation is solved!
}

The how

With JSON.parse and Babel's generate, we've got the parsing and generating steps already taken care of. And since we're starting from JSON input rather than raw source code, we don't have any tokenizing to do, either. So tokenizing, parsing, and generation are all handled for us. The only problem is that when JSON.parse turns a string into a JavaScript object, that object is not in a form that Babel sees as JSX. After the JSON parse, we have:

{
  root: {
    type: 'div',
    props: {
      className: 'hello',
      numberThing: 5,
      isTrue: true
    },
    innerText: 'this is the text',
    children: []
  },
}

Babel can certainly read that as a JS object with property names and values. But it has absolutely no idea that my type: div is something that I want to translate into a JSX <div></div> tag.

What we need to do, then, is transform this parsed JavaScript object into an AST that Babel recognizes as JSX.

// json-to-jsx.js
const generate = require('@babel/generator').default;
const transform = ({input}) => {
// ???
}
const toJSX = input => {
const parsed = JSON.parse(input) // parsing is solved!
const ast = transform({input: parsed}) // transform our parsed JS object
return generate(ast).code; // generation is solved!
}

In order to figure out what transforms we need to make, we can work backwards. We'll start with a valid JSX string and explore the AST structure that the string breaks down to. Then, once we know the shape we have to build, we can start to manipulate our JS object input into the correct structure.

Learn the AST shape we need to build

So, first up, let's see that code as an AST. To do so, there's an awesome tool called AST Explorer, built by @fkling, that allows us to look into exactly what Babel sees when it reads the above JSX code.

The AST for the JSX string above looks like this:

{
  type: "JSXElement",
  openingElement: {
    type: "JSXOpeningElement",
    name: {
      type: "JSXIdentifier",
      name: "div"
    },
    attributes: [
      {
        type: "JSXAttribute",
        name: {
          type: "JSXIdentifier",
          name: "className"
        },
        value: {
          type: "StringLiteral",
          value: "hello"
        }
      },
      {
        type: "JSXAttribute",
        name: {
          type: "JSXIdentifier",
          name: "numberThing"
        },
        value: {
          type: "JSXExpressionContainer",
          expression: {
            type: "NumericLiteral",
            value: 5
          }
        }
      },
      {
        type: "JSXAttribute",
        name: {
          type: "JSXIdentifier",
          name: "isTrue"
        },
        value: {
          type: "JSXExpressionContainer",
          expression: {
            type: "BooleanLiteral",
            value: true
          }
        }
      }
    ],
    selfClosing: false
  },
  closingElement: {
    type: "JSXClosingElement",
    name: {
      type: "JSXIdentifier",
      name: "div"
    }
  },
  children: [
    {
      type: "JSXText",
      value: "this is the text"
    }
  ]
}

Mmmkay, so not exactly what I'd call terse. The verbosity might make it look more intimidating (and this is the AST without any metadata about where bits of code start and end), but this structure allows Babel to represent every possible bit of JavaScript that we could write. Just from a few dozen different building block types with some properties!

Enumerate the types of nodes we need to build

Now that we have the AST that we're targeting, we need to look at each of the types we're trying to build. In this AST, we've got ten different types of node that we need to build up. And looking at the values inside of each of those nodes, we can hazard a guess at what each type is responsible for.

  • JSXElement: The overall parent container
  • JSXOpeningElement: The name of the element (div), along with some attributes about the element (our props)
  • JSXClosingElement: The name for the closing tag of the element (div)
  • JSXAttribute: A single prop of the element; the attributes array on the JSXOpeningElement holds one of these per prop
  • JSXText: The text markup of the element (innerText of this is the text)
  • JSXIdentifier: Looks like the way that Babel marks the prop names and the tag name of the element
  • JSXExpressionContainer: Looks like the {} in a prop declaration like numberThing={5}
  • BooleanLiteral: The true or false value of a boolean
  • NumericLiteral: A numeric value - the 5 in our numberThing prop
  • StringLiteral: A string value - the hello value of our className prop
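
If you'd rather not count node types by eye, a small recursive walk can do it. This is just a sketch: collectNodeTypes and the id shorthand are helper names I made up for this example, and the ast object is a condensed copy of the AST Explorer output above.

```javascript
// Sketch only: gather every distinct `type` found anywhere in an AST object.
const collectNodeTypes = (node, types = new Set()) => {
  if (Array.isArray(node)) {
    node.forEach(child => collectNodeTypes(child, types));
  } else if (node !== null && typeof node === 'object') {
    if (typeof node.type === 'string') types.add(node.type);
    Object.values(node).forEach(child => collectNodeTypes(child, types));
  }
  return types;
};

// shorthand so the AST below stays readable
const id = name => ({type: 'JSXIdentifier', name});

const ast = {
  type: 'JSXElement',
  openingElement: {
    type: 'JSXOpeningElement',
    name: id('div'),
    attributes: [
      {
        type: 'JSXAttribute',
        name: id('className'),
        value: {type: 'StringLiteral', value: 'hello'},
      },
      {
        type: 'JSXAttribute',
        name: id('numberThing'),
        value: {
          type: 'JSXExpressionContainer',
          expression: {type: 'NumericLiteral', value: 5},
        },
      },
      {
        type: 'JSXAttribute',
        name: id('isTrue'),
        value: {
          type: 'JSXExpressionContainer',
          expression: {type: 'BooleanLiteral', value: true},
        },
      },
    ],
    selfClosing: false,
  },
  closingElement: {type: 'JSXClosingElement', name: id('div')},
  children: [{type: 'JSXText', value: 'this is the text'}],
};

const nodeTypes = [...collectNodeTypes(ast)].sort();
console.log(nodeTypes.length); // 10
```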

Write helpers to help us build each type

When we're building up this AST, we need to know which node types are children of which types, as well as which nodes are required to build each type. For this, we can go to the source that defines all of the core and JSX types that Babel works with.

We'll start at the leaf nodes of what the AST explorer spit out and build some helpers as we go. First up are the non-JSX types.

Boolean, String, and Numeric literals

For the BooleanLiteral, we see a type definition like this:

defineType("BooleanLiteral", {
  builder: ["value"],
  fields: {
    value: {
      validate: assertValueType("boolean"),
    },
  },
  aliases: ["Expression", "Pureish", "Literal", "Immutable"],
});

In reading this, we see in both the builder and fields that the required field on the node is a value, which is of type boolean.

So, wherever we see a true or false as a value in our input JS object, we need to transform it into:

{
  type: 'BooleanLiteral',
  value: true // or false
}

The StringLiteral and NumericLiteral are very similar:

// String literals
{
type: 'StringLiteral',
value: 'any string'
}
// Numeric literals
{
type: 'NumericLiteral',
value: 5 // any number
}

We can build some helper functions for each of these:

// json-to-jsx.js
const generate = require('@babel/generator').default;
// Helpers
// NEW!
const buildStringLiteral = value => ({type: 'StringLiteral', value})
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value})
const buildNumericLiteral = value => ({type: 'NumericLiteral', value})
const transform = ({input}) => {
// ???
}
const toJSX = input => {
const parsed = JSON.parse(input) // parsing is solved!
const ast = transform({input: parsed}) // transform our parsed JS object
return generate(ast).code; // generation is solved!
}

If we call these builders for a string, number, or boolean value, we will get back Babel-ready nodes.
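
The type definitions also carry a validate rule (assertValueType("boolean") in the BooleanLiteral definition above), and our bare-bones builders skip that entirely. If we wanted a similar safety net, a hand-rolled check could look like the sketch below. To be clear, assertType is a helper I'm inventing here; it only mimics the idea and is not Babel's own validation code.

```javascript
// Sketch only: builders that refuse values of the wrong JavaScript type.
// `assertType` is a made-up helper, not part of Babel.
const assertType = (expected, value) => {
  if (typeof value !== expected) {
    throw new TypeError(`Expected a ${expected} but got ${typeof value}`);
  }
  return value;
};

const buildStringLiteral = value => ({
  type: 'StringLiteral',
  value: assertType('string', value),
});
const buildBooleanLiteral = value => ({
  type: 'BooleanLiteral',
  value: assertType('boolean', value),
});
const buildNumericLiteral = value => ({
  type: 'NumericLiteral',
  value: assertType('number', value),
});

console.log(JSON.stringify(buildBooleanLiteral(true)));
// {"type":"BooleanLiteral","value":true}

try {
  buildBooleanLiteral('true'); // a string, not a boolean
} catch (error) {
  console.log(error.message); // Expected a boolean but got string
}
```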

JSX Text, Identifier, and Closing Element

These next three types are also pretty straightforward.

The JSXText type also only requires a value of type string, so:

{
type: 'JSXText',
value: 'aString'
}

JSXIdentifier has a name property that's a string instead of a value:

{
type: `JSXIdentifier`,
name: 'aString'
}

The JSXClosingElement is similar, but its name is a JSXIdentifier node rather than a bare string (just like the closingElement in the AST above):

{
  type: 'JSXClosingElement',
  name: {
    type: 'JSXIdentifier',
    name: 'aString'
  }
}

Let's build out the helpers for these, too:

// json-to-jsx.js
const generate = require('@babel/generator').default;
// Helpers
// From above
const buildStringLiteral = value => ({type: 'StringLiteral', value})
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value})
const buildNumericLiteral = value => ({type: 'NumericLiteral', value})
// NEW
const buildJSXText = value => ({ type: 'JSXText', value})
const buildJSXIdentifier = name => ({type: 'JSXIdentifier', name})
const buildJSXClosingElement = name => ({type: 'JSXClosingElement', name: buildJSXIdentifier(name)})
const transform = ({input}) => {...}
const toJSX = input => {...}

JSXExpressionContainer

Next up, we're getting slightly more complex with the JSXExpressionContainer, which is a node that has an expression property. What's an expression, you may be wondering. And I'm wondering a bit, too, as I can't find a formal definition in the type definitions. From what I can surmise, any type that has an alias of "Expression" is something that could be deemed an expression. For the input we're using, the BooleanLiteral and NumericLiteral are the only types that are used as expression nodes in the JSXExpressionContainers. So, if we run into propName={...} with the curly braces, we know we need to build a JSXExpressionContainer whose expression is either a BooleanLiteral or a NumericLiteral node, built with the helpers we defined above. As we make this more robust, we would want our builder helpers to handle more than just these two types of literals.

{
  type: 'JSXExpressionContainer',
  expression: {
    type: 'BooleanLiteral', // or 'NumericLiteral'
    value: true // or false, or a number for a NumericLiteral
  }
}

A helper we could use here:

// json-to-jsx.js
const generate = require('@babel/generator').default;
// Helpers
// From above
const buildStringLiteral = value => ({type: 'StringLiteral', value})
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value})
const buildNumericLiteral = value => ({type: 'NumericLiteral', value})
const buildJSXText = value => ({ type: 'JSXText', value})
const buildJSXIdentifier = name => ({type: 'JSXIdentifier', name})
const buildJSXClosingElement = name => ({type: 'JSXClosingElement', name: buildJSXIdentifier(name)})
// NEW!
const buildJSXExpression = expression => ({type: 'JSXExpressionContainer', expression})
const transform = ({input}) => {...}
const toJSX = input => {...}
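
As a quick sanity check, here is a small sketch of the expression helper in use for the two curly-brace props from our input, numberThing={5} and isTrue={true}. The variable names are only for illustration.

```javascript
// Sketch only: wrap literal nodes in a JSXExpressionContainer.
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value});
const buildNumericLiteral = value => ({type: 'NumericLiteral', value});
const buildJSXExpression = expression => ({
  type: 'JSXExpressionContainer',
  expression,
});

// the value side of numberThing={5}
const numberThingValue = buildJSXExpression(buildNumericLiteral(5));
// the value side of isTrue={true}
const isTrueValue = buildJSXExpression(buildBooleanLiteral(true));

console.log(JSON.stringify(numberThingValue));
// {"type":"JSXExpressionContainer","expression":{"type":"NumericLiteral","value":5}}
```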

JSXAttribute

In the AST from the explorer, we can see that there is an array of attributes with type JSXAttribute, which represent the props of the input object (className, numberThing, and isTrue). We don't have any function or object values as props yet - just handling booleans, strings, and numbers. Again, as we build this out into something more robust, we'd want our helpers to handle the additional cases.

The JSXAttribute type is a little more complex:

defineType("JSXAttribute", {
  visitor: ["name", "value"],
  aliases: ["JSX", "Immutable"],
  fields: {
    name: {
      validate: assertNodeType("JSXIdentifier", "JSXNamespacedName"),
    },
    value: {
      optional: true,
      validate: assertNodeType(
        "JSXElement",
        "JSXFragment",
        "StringLiteral",
        "JSXExpressionContainer",
      ),
    },
  },
});

To build an attribute, we need to have a name that is of type JSXIdentifier (we aren't going to be using JSXNamespacedName in our basic version), and the value needs to be StringLiteral or JSXExpressionContainer (we're not supporting JSXElements or JSXFragments as prop values in this example, either). And above, we just decided that the expression container can be either the BooleanLiteral or the NumericLiteral. So, this is where we can start to use some of our little helpers.

// json-to-jsx.js
const generate = require('@babel/generator').default;
// Helpers
// From above
const buildStringLiteral = value => ({type: 'StringLiteral', value})
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value})
const buildNumericLiteral = value => ({type: 'NumericLiteral', value})
const buildJSXText = value => ({ type: 'JSXText', value})
const buildJSXIdentifier = name => ({type: 'JSXIdentifier', name})
const buildJSXClosingElement = name => ({type: 'JSXClosingElement', name: buildJSXIdentifier(name)})
const buildJSXExpression = expression => ({type: 'JSXExpressionContainer', expression})
// NEW!
const buildJSXAttribute = ({name, value}) => {
  // strings can be used directly as the attribute value: className="hello"
  if (typeof value === 'string') {
    return {
      type: 'JSXAttribute',
      name: buildJSXIdentifier(name),
      value: buildStringLiteral(value)
    }
  }
  // booleans and numbers need the {} wrapper: isTrue={true}, numberThing={5}
  if (typeof value === 'boolean') {
    return {
      type: 'JSXAttribute',
      name: buildJSXIdentifier(name),
      value: buildJSXExpression(buildBooleanLiteral(value))
    }
  }
  if (typeof value === 'number') {
    return {
      type: 'JSXAttribute',
      name: buildJSXIdentifier(name),
      value: buildJSXExpression(buildNumericLiteral(value))
    }
  }
  throw new TypeError(`${value} is an unsupported type: ${typeof value}`)
}
const transform = ({input}) => {...}
const toJSX = input => {...}

There's certainly some repetition here with the type and name, so we could extract a buildJSXAttributeValue that does the type checking, but I'll leave that undone here.
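
For the curious, here is one way that extraction could look. It's only a sketch: buildJSXAttributeValue is the hypothetical helper named above, and it supports the same three value types as before (strings, booleans, and numbers).

```javascript
// Sketch only: pull the type checking out of buildJSXAttribute.
const buildStringLiteral = value => ({type: 'StringLiteral', value});
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value});
const buildNumericLiteral = value => ({type: 'NumericLiteral', value});
const buildJSXIdentifier = name => ({type: 'JSXIdentifier', name});
const buildJSXExpression = expression => ({
  type: 'JSXExpressionContainer',
  expression,
});

// Hypothetical helper: decides which node represents a prop's value.
const buildJSXAttributeValue = value => {
  switch (typeof value) {
    case 'string':
      return buildStringLiteral(value); // className="hello"
    case 'boolean':
      return buildJSXExpression(buildBooleanLiteral(value)); // isTrue={true}
    case 'number':
      return buildJSXExpression(buildNumericLiteral(value)); // numberThing={5}
    default:
      throw new TypeError(
        `${String(value)} is an unsupported type: ${typeof value}`
      );
  }
};

// With the checks extracted, the attribute builder has a single return.
const buildJSXAttribute = ({name, value}) => ({
  type: 'JSXAttribute',
  name: buildJSXIdentifier(name),
  value: buildJSXAttributeValue(value),
});

console.log(buildJSXAttribute({name: 'numberThing', value: 5}).value.expression.type);
// NumericLiteral
```

Supporting a new kind of prop value later (say, objects or functions) would then only mean adding a case to buildJSXAttributeValue.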

JSXOpeningElement

Whew, we're almost done - just the JSXOpeningElement and JSXElement left.

The JSXOpeningElement has a name (which is a JSXIdentifier), an attributes array where each item is a JSXAttribute, and a selfClosing property, which describes whether the tag is self-closing. For the purposes of this input, a div, the tag is not self-closing. The tags that might be self-closing are img, br, hr, iframe, etc.

So, to build the JSXOpeningElement:

// json-to-jsx.js
const generate = require('@babel/generator').default;
// Helpers
// From above
const buildStringLiteral = value => ({type: 'StringLiteral', value})
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value})
const buildNumericLiteral = value => ({type: 'NumericLiteral', value})
const buildJSXText = value => ({ type: 'JSXText', value})
const buildJSXIdentifier = name => ({type: 'JSXIdentifier', name})
const buildJSXClosingElement = name => ({type: 'JSXClosingElement', name: buildJSXIdentifier(name)})
const buildJSXExpression = expression => ({type: 'JSXExpressionContainer', expression})
const buildJSXAttribute = ({name, value}) => {...}
// NEW!
const SELF_CLOSING_TAGS = ['img', 'br', 'hr', 'iframe'];
// attributes map to the props of the input, which look like:
// props: {
// className: 'hello',
// numberThing: 5,
// isTrue: true
// }
const buildJSXOpeningElement = ({name, attributes}) => {
  return {
    type: 'JSXOpeningElement',
    name: buildJSXIdentifier(name),
    attributes: Object.entries(attributes).map(([attributeName, attributeValue]) => {
      return buildJSXAttribute({name: attributeName, value: attributeValue})
    }),
    selfClosing: SELF_CLOSING_TAGS.includes(name)
  }
}
const transform = ({input}) => {...}
const toJSX = input => {...}

JSXElement

Alrighty! Let's finish it out with the JSXElement, which has to be the parent of all of these nodes.

Looking at the type definition, we see that we need an openingElement (JSXOpeningElement), a closingElement (JSXClosingElement), children (an array of JSXText or JSXElements), and the selfClosing attribute.

Since we've been building these helpers all along, building out this element isn't too tough. We can put this element builder code into the transform function we've been ignoring!

// json-to-jsx.js
const generate = require('@babel/generator').default;
// Helpers
// From above
const SELF_CLOSING_TAGS = ['img', 'br', 'hr', 'iframe'];
const buildStringLiteral = value => ({type: 'StringLiteral', value})
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value})
const buildNumericLiteral = value => ({type: 'NumericLiteral', value})
const buildJSXText = value => ({ type: 'JSXText', value})
const buildJSXIdentifier = name => ({type: 'JSXIdentifier', name})
const buildJSXClosingElement = name => ({type: 'JSXClosingElement', name: buildJSXIdentifier(name)})
const buildJSXExpression = expression => ({type: 'JSXExpressionContainer', expression})
const buildJSXAttribute = ({name, value}) => {...}
const buildJSXOpeningElement = ({name, attributes}) => {...}
// NEW!
const transform = ({input, target = 'root'}) => {
  const elementToTransform = input[target];
  const {children, innerText, props, type} = elementToTransform;
  let transformedChildren;
  if (innerText) {
    // no children; just the innerText,
    // This is the case for the example input we've been working from
    // The JSXElement children are a one-element array with JSXText
    transformedChildren = [buildJSXText(innerText)]
  } else {
    // we have children to consider.
    // The JSXElement's children is just a map of recursive calls to
    // transform with the child id as the target
    transformedChildren = children.map(childId => {
      return transform({input, target: childId})
    })
  }
  // return the AST tree!
  return {
    type: 'JSXElement',
    openingElement: buildJSXOpeningElement({name: type, attributes: props}),
    children: transformedChildren,
    ...(!SELF_CLOSING_TAGS.includes(type) && {closingElement: buildJSXClosingElement(type)})
  }
}
const toJSX = input => {...}
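
We skipped children in the example input, so here is a sketch of how the recursive branch behaves when an element does have one. I'm assuming, based on the input[target] lookup in transform, that each entry in children is the key of another element stored at the top level of the same input object. The title key and the h1 element are made up for this example, and buildJSXAttribute is written a little more compactly than above to keep the block short.

```javascript
// Sketch only: the helpers from this post, run on an input with one child.
const SELF_CLOSING_TAGS = ['img', 'br', 'hr', 'iframe'];
const buildStringLiteral = value => ({type: 'StringLiteral', value});
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value});
const buildNumericLiteral = value => ({type: 'NumericLiteral', value});
const buildJSXText = value => ({type: 'JSXText', value});
const buildJSXIdentifier = name => ({type: 'JSXIdentifier', name});
const buildJSXClosingElement = name => ({
  type: 'JSXClosingElement',
  name: buildJSXIdentifier(name),
});
const buildJSXExpression = expression => ({
  type: 'JSXExpressionContainer',
  expression,
});

const LITERAL_BUILDERS = {
  string: buildStringLiteral,
  boolean: buildBooleanLiteral,
  number: buildNumericLiteral,
};

const buildJSXAttribute = ({name, value}) => {
  const buildLiteral = LITERAL_BUILDERS[typeof value];
  if (!buildLiteral) {
    throw new TypeError(`${String(value)} is an unsupported type: ${typeof value}`);
  }
  const literal = buildLiteral(value);
  return {
    type: 'JSXAttribute',
    name: buildJSXIdentifier(name),
    // strings are used directly; booleans and numbers get the {} wrapper
    value: typeof value === 'string' ? literal : buildJSXExpression(literal),
  };
};

const buildJSXOpeningElement = ({name, attributes}) => ({
  type: 'JSXOpeningElement',
  name: buildJSXIdentifier(name),
  attributes: Object.entries(attributes).map(([attributeName, attributeValue]) =>
    buildJSXAttribute({name: attributeName, value: attributeValue})
  ),
  selfClosing: SELF_CLOSING_TAGS.includes(name),
});

const transform = ({input, target = 'root'}) => {
  const {children, innerText, props, type} = input[target];
  const transformedChildren = innerText
    ? [buildJSXText(innerText)]
    : children.map(childId => transform({input, target: childId}));
  return {
    type: 'JSXElement',
    openingElement: buildJSXOpeningElement({name: type, attributes: props}),
    children: transformedChildren,
    ...(!SELF_CLOSING_TAGS.includes(type) && {
      closingElement: buildJSXClosingElement(type),
    }),
  };
};

// "title" is a hypothetical child id that lives next to "root".
const input = {
  root: {type: 'div', props: {className: 'wrapper'}, innerText: null, children: ['title']},
  title: {type: 'h1', props: {}, innerText: 'Hello', children: []},
};

const ast = transform({input});
console.log(ast.children[0].openingElement.name.name); // h1
console.log(ast.children[0].children[0].value); // Hello
```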

Generating JavaScript

We did it! We run our transform method on our input object and return a Babel-ready JSX AST. So let's use it in our toJSX function and get out a JavaScript string. We can use the Babel built-in generator on the ast we've built, and we should be good to go.

// json-to-jsx.js
const generate = require('@babel/generator').default;
// Helpers
// From above
const SELF_CLOSING_TAGS = ['img', 'br', 'hr', 'iframe'];
const buildStringLiteral = value => ({type: 'StringLiteral', value})
const buildBooleanLiteral = value => ({type: 'BooleanLiteral', value})
const buildNumericLiteral = value => ({type: 'NumericLiteral', value})
const buildJSXText = value => ({ type: 'JSXText', value})
const buildJSXIdentifier = name => ({type: 'JSXIdentifier', name})
const buildJSXClosingElement = name => ({type: 'JSXClosingElement', name: buildJSXIdentifier(name)})
const buildJSXExpression = expression => ({type: 'JSXExpressionContainer', expression})
const buildJSXAttribute = ({name, value}) => {...}
const buildJSXOpeningElement = ({name, attributes}) => {...}
const transform = ({input, target = 'root'}) => {...}
// NEW!
const toJSX = input => {
  const parsed = JSON.parse(input) // JSON string in, plain JS object out
  const ast = transform({input: parsed})
  return generate(ast).code;
}

And there we have it! Once the AST is in a form that Babel can read, all we have to do is feed our AST into the generate function.

Where to go from here

Alright, so now we can take an object and convert it into JSX code, after exploring the AST and Babel's types. Building a new builder helper for each type would be a bit of a pain if we were to continue on with this and make it more robust. The @babel/types package already has a builder for every type, and we can just use those.

At this point, if we stick to the above structure of the input, we could store an entire webpage as JSON. We could use a prebuild script to take the JSON, parse it into a JS object, feed that object into our toJSX function, and write that output into a .js file. And then with a static site generator, we could build this brand-new page into a small, fast HTML page.
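
A prebuild step like that could be sketched with nothing but Node's built-in fs and path modules. Everything here is an assumption on my part: writePage is a made-up name, the file paths are placeholders, and toJSX is passed in as an argument so the sketch runs on its own (the stand-in below handles only the simplest possible input, unlike the real thing).

```javascript
// Sketch only: read a JSON file, convert it, and write the result to disk.
const fs = require('fs');
const os = require('os');
const path = require('path');

// `toJSX` is injected so this step does not care how the JSX gets generated.
const writePage = ({jsonPath, outPath, toJSX}) => {
  const json = fs.readFileSync(jsonPath, 'utf8');
  const jsx = toJSX(json);
  fs.mkdirSync(path.dirname(outPath), {recursive: true});
  fs.writeFileSync(outPath, `${jsx}\n`);
  return outPath;
};

// Try it out in a throwaway directory with a stand-in for the real toJSX.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'json-to-jsx-'));
const jsonPath = path.join(workDir, 'page.json');
fs.writeFileSync(
  jsonPath,
  JSON.stringify({root: {type: 'div', props: {}, innerText: 'hi', children: []}})
);

const standInToJSX = json => {
  const {root} = JSON.parse(json);
  return `<${root.type}>${root.innerText}</${root.type}>`;
};

const outPath = writePage({
  jsonPath,
  outPath: path.join(workDir, 'pages', 'index.js'),
  toJSX: standInToJSX,
});
console.log(fs.readFileSync(outPath, 'utf8')); // <div>hi</div>
```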

We wouldn't be the entire way there, as we'd still need to figure out how to add import React from 'react' at the top of the file, along with any other packages or custom components we want to use. We could just hard-code that into the top of the output of every top-level toJSX call. Or we could build it with @babel/template.
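
The hard-coded option is the quicker of the two to sketch. Here, wrapInModule and the default component name Page are names I've made up, and the function does plain string concatenation with no Babel involved, so it trusts that the JSX string it receives is already valid.

```javascript
// Sketch only: wrap a generated JSX string in a minimal component module.
const wrapInModule = ({jsx, componentName = 'Page'}) =>
  [
    "import React from 'react';",
    '',
    `export default function ${componentName}() {`,
    `  return ${jsx};`,
    '}',
    '',
  ].join('\n');

const moduleSource = wrapInModule({
  jsx: '<div className="hello">this is the text</div>',
});
console.log(moduleSource);
```

Going the template route instead would keep everything as AST nodes until the final generate call, which matters more once the imports depend on which components a page actually uses.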

And once we have the basic js pages with the JSX, we could use a static site generator like Gatsby to build the pages while wrapping them in a Layout component. We could go from JSON to a full, lightning-fast Gatsby site. If I build that, I'll keep y'all updated.

Thanks for following along! ✌️

©2020 Matthew Knudsen