Benoit-Vasseur/guess_json_mapping_code

GUESS JSON MAPPING CODE

GOAL

The goal of this library is to generate code (JavaScript or Groovy) that maps one JSON document onto another (target <- source). Given a source JSON and a target JSON, it produces the code that builds the target from the source.

Main use case: data migration

You have two systems, A and B, that hold some duplicated data (at least one record). The day comes when you want to automate the copying of data from A to B. Since the same data is already represented in both data structures (A and B), the library and the UI help you write a migration script, using one of your duplicated records as the starting point.

Documentation

There are two main functions in this library:

  • guessMapingRules: takes the source JSON and the target JSON as inputs and generates the mappingRules.
  • generateMappingCode: takes the mappingRules (generated by guessMapingRules) and produces JavaScript or Groovy source code.

global explanation

The starting point of the library (guessMapingRules()) is two JSON documents: a source and a target. The end result is generated source code that, when run, produces the target JSON from the source JSON.

Here is an example of how the library works: https://repl.it/@Benoit_Vasseur/guessjsonmapping

Below is the same example, explained. If the source is:

{
  "p": {
    "firstname": "Benoit"
  },
  "lastname": "Vasseur"
}

And the target is:

{
  "personne": {
    "firstName": "Benoit",
    "lastName": "Vasseur"
  }
}

The generated (JavaScript) source code is:

var target = {
  personne: {
    firstName: source.p.firstname,
    lastName: source.lastname
  }
}

The Groovy code is:

target = [
    "personne": [
        "firstName": source.p.firstname,
        "lastName": source.lastname
    ]
]

Here is a running example that uses the generated Groovy code: https://jdoodle.com/a/XkJ
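As a rough illustration of the code-generation step, here is a minimal sketch of how the JavaScript output above could be printed from a nested rules object whose leaves are source paths such as ".p.firstname" (the mappingRules structure described in the next section). The name sketchRulesToJs is hypothetical and is not the library's API; the sketch assumes every key is a valid JavaScript identifier and every leaf is a path string.

```javascript
// Hypothetical helper (not the library's generateMappingCode): print a rules
// object as a JavaScript object literal. Each leaf is a source path such as
// ".p.firstname" and is emitted as `source.p.firstname`.
// Assumption: keys are valid JS identifiers, leaves are strings.
function sketchRulesToJs(rules, indent = "  ") {
  const lines = Object.entries(rules).map(([key, rule]) => {
    const value =
      typeof rule === "string"
        ? `source${rule}`
        : sketchRulesToJs(rule, indent + "  ");
    return `${indent}${key}: ${value}`;
  });
  // The closing brace sits one level (two spaces) to the left of the entries.
  const closingIndent = indent.slice(2);
  return `{\n${lines.join(",\n")}\n${closingIndent}}`;
}

const mappingRules = {
  personne: { firstName: ".p.firstname", lastName: ".lastname" },
};

const generated = `var target = ${sketchRulesToJs(mappingRules)}`;
console.log(generated);
```

Printing the rules recursively keeps the generated code in the same shape as the target, which is why the output reads like the hand-written example.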

intermediate data structure: mappingRules

To generate the source code, the library relies on one intermediate data structure, mappingRules, which represents the mapping target <- source.

For the previous example, the generated mappingRules is:

{
  "personne": {
    "firstName": ".p.firstname",
    "lastName": ".lastname"
  }
}

This representation (same structure as the target) may change in the future. I doubt that it is well suited to relations more complex than pure equality.

Here are the other possibilities that I have in mind for now:

  • For the structure itself
    • try a flat struct (array or object); it may be easier to work with and better suited to optimising the generated source code (see 'dream' in the example below)
  • To describe more complex relations
    • try a LISP-like pseudo-language; here are some ideas:
// one prop in the source
// but this prop has to be split to populate two props of the target
const source = {
    name: "Benoit Vasseur"
}

const expectedTarget = {
    firstName: "Benoit",
    lastName: "Vasseur"
}

// --- GOAL ---
// js code would be
const target = {
    firstName: source.name.split(" ")[0],
    lastName: source.name.split(" ")[1]
}
// DREAM version (not even in the roadmap but would be very cool :)
const [firstName, lastName] = source.name.split(" ")
const targetDream = {
    firstName,
    lastName
}

// --- Mapping rules ---
// we need a way to represent the operation needed

// V1 : with a pseudo LISP lang (a la clojure)
// $0 represents the source obj
const mappingRules = {
               // nth(split(source.name, " "), 0)
    firstName: ["NTH", ["SPLIT", "$0.name", " "], 0],
    lastName: ["NTH", ["SPLIT", "$0.name", " "], 1]
}

// V2 : implicit pipe
const mappingRules2 = {
    firstName: [["SPLIT", "$0.name", " "], /* >> */ ["NTH", 0]],
    lastName: [["SPLIT", "$0.name", " "], /* >> */ ["NTH", 1]]
}

// V3 : flat struct + pipe 
const mappingRules3 = [
    /*
    {
        "path in target":[
            1: "path in source",
            2: operation on 1,
            3: operation on 2,
            etc (piping)
        ]
    } */
    {
        ".firstName": [
            ".name",
            ["SPLIT", " "],
            ["NTH", 0]
        ]
    }
]

For now, the last one (mappingRules3) seems the nicest to me, though I cannot really justify it. We will see how it turns out soon, when I work on substring detection!
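To get a feel for the flat, piped form, here is a minimal sketch of an interpreter that applies such rules directly to a source object. Everything in it is an assumption made for illustration: the operation table OPERATIONS, the helpers readPath, writePath and applyPipedRules, operation names written as strings, and a single space as the separator. The library itself generates code rather than interpreting rules, so this only shows that the representation carries enough information.

```javascript
// Hypothetical operation table: each operation receives the value produced
// by the previous step, followed by its own arguments.
const OPERATIONS = {
  SPLIT: (value, separator) => value.split(separator),
  NTH: (value, index) => value[index],
};

// Resolve a path such as ".name" or ".p.firstname" against an object.
function readPath(obj, path) {
  return path
    .split(".")
    .filter(Boolean)
    .reduce((current, key) => current[key], obj);
}

// Write a value at a path such as ".firstName", creating intermediate objects.
function writePath(obj, path, value) {
  const keys = path.split(".").filter(Boolean);
  const last = keys.pop();
  const parent = keys.reduce(
    (current, key) => (current[key] = current[key] || {}),
    obj
  );
  parent[last] = value;
}

// Apply flat, piped rules: the first step is a source path, and every
// following step is [operationName, ...args] applied to the previous result.
function applyPipedRules(rules, source) {
  const target = {};
  for (const rule of rules) {
    for (const [targetPath, [sourcePath, ...steps]] of Object.entries(rule)) {
      const value = steps.reduce(
        (current, [name, ...args]) => OPERATIONS[name](current, ...args),
        readPath(source, sourcePath)
      );
      writePath(target, targetPath, value);
    }
  }
  return target;
}

const rules = [
  { ".firstName": [".name", ["SPLIT", " "], ["NTH", 0]] },
  { ".lastName": [".name", ["SPLIT", " "], ["NTH", 1]] },
];

console.log(applyPipedRules(rules, { name: "Benoit Vasseur" }));
```

Because each rule is a plain list of steps, adding an operation only means adding an entry to the table, which may be part of why the flat form feels easier to work with.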

ROADMAP

The order of the items within sublists does not indicate priority.

  • simple json mapping : no array, strict equal (no substring), ...
  • unit tests
    • guessMapingRules()
    • [] findPath()
    • [] generateMappingCode()
  • [-] integration tests
    • code generation for Javascript
    • [] code generation for Groovy
  • add CI/CD
  • add linter and prettier
  • [-] be sure to have a nice workflow and good tools for :
    • publishing source
    • testing
    • [-] documentation: for now I just use markdown on GitHub, but I want to try a few things (generating docs from tests -> living documentation, gatsby, jsDoc / typescript, github pages, ...)
  • add basic UI
    • simple version : no linter for json, no code formatting, etc
    • add beautifier for json inputs
    • add code formatting for generated code
  • [] add documentation and test cases to support the roadmap. Especially add red tests to show where you want to go and what is not implemented yet
  • [] improve algorithm, to handle
    • [] arrays
    • [] substring
    • [] concat
    • [] multiple matches
    • [] date format conversion
  • UI ++
    • [] can make a choice if multiple solutions are returned for one property
    • [] visual mode of the mapping (arrows to represent the mapping, etc)
  • [] open for extension to support other languages

workflow and contribution

Tools

Tools used to keep the style consistent (and to check some best practices):

  • prettier: npm run format
  • eslint: npm run lint

Spirit

  • No specific workflow is enforced while coding locally (no git hooks, etc.).
  • Travis will complain if tests or linting rules are failing.
  • If a branch/PR is red, it is not merged into master or develop
  • feature branch

I do not want to get in the way of potential contributors during their development process, so I will not force anyone to use prettier and eslint (or any other tool) while coding locally. Travis will complain if the rules are broken, and I think that is enough. If you want to use prettier and eslint while coding, the tools are there (npm scripts and config files). They only have to be green for a PR to be merged, so before opening one, run the format task, then the lint task, and fix any issues.
