Introduction
Piri uses a configuration file to govern output structure and contents. This section is the introductory course to Piri.
The setup
For this introduction course we will use piri-cli, since it provides a simple command line tool for running piri without having to create any Python files.
Install with pip:
pip install piri-cli
All examples show a config, an input and an output JSON document (config.json, input.json and output.json), in that order:
{}
{}
{}
Copy the contents of config.json and input.json into your working directory.
Run every example with the following command unless otherwise stated:
piri config.json input.json
About JSON
JSON is a human-readable data format that stores data in objects consisting of attribute-value pairs and arrays. We will use the terms object and attribute quite often in this guide. Put simply, an object contains attributes that hold values. These values can sometimes be another object or even an array of objects.
{
"person": {
"name": "Bob",
"height": 180.5,
"friends": [
{
"name": "John",
"height": 170.5
}
]
}
}
person is an object, name and height are attributes, and "Bob" and 180.5 are the values of those attributes. friends is a list (array) of objects.
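To make the terms concrete, here is a minimal Python sketch that loads the document above and reads an attribute, a nested object and an array element. It uses only the standard library and is not part of piri; the variable names are chosen for illustration.

```python
import json

# The example document from this section, as a JSON string.
document = json.loads("""
{
    "person": {
        "name": "Bob",
        "height": 180.5,
        "friends": [
            {"name": "John", "height": 170.5}
        ]
    }
}
""")

person = document["person"]          # "person" is an object
name = person["name"]                # "name" is an attribute holding "Bob"
height = person["height"]            # "height" is an attribute holding 180.5
first_friend = person["friends"][0]  # "friends" is an array of objects

print(name, height, first_friend["name"])
```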
The Root
The root of all ev... piri configs looks like this
{
"name": "root",
"array": false,
"attributes": [],
"objects": []
}
Running this config on its own fails, since piri considers an empty result a failure. Still, this is the config that generates the enclosing {} brackets you can see in the example in the About JSON section.
{}
Adding Attributes to root
To actually map some data we can add attributes.
{
"name": "root",
"array": false,
"attributes": [
{
"name": "firstname",
"default": "Thomas"
}
]
}
{}
{
"firstname": "Thomas"
}
Congratulations, you've just mapped a default value to an attribute! The last block above is the resulting output.json.
Structuring with objects
{
"name": "root",
"array": false,
"objects": [
{
"name": "person",
"array": false,
"attributes": [
{
"name": "firstname",
"default": "Thomas"
}
]
}
]
}
{}
{
"person": {
"firstname": "Thomas"
}
}
What we just did is the core principle of creating the output structure. We added an object with the name person, then we moved our firstname attribute to the person object.
Time to Map some values!
We will now introduce the mappings key. It is an array of mapping objects.
The mapping object is the only place where you actually fetch data from the input. You do that by specifying a path. The path describes the steps to take to get to the value we are interested in.
Mapping.path with flat structure
{
"name": "root",
"array": false,
"objects": [
{
"name": "person",
"array": false,
"attributes": [
{
"name": "firstname",
"mappings": [
{
"path": ["name"]
}
]
}
]
}
]
}
{
"name": "Neo"
}
{
"person": {
"firstname": "Neo"
}
}
Mapping.path with nested structure
{
"name": "root",
"array": false,
"objects": [
{
"name": "actor",
"array": false,
"attributes": [
{
"name": "name",
"mappings": [
{
"path": ["the_matrix", "neo", "actor", "name"]
}
]
}
]
}
]
}
{
"the_matrix": {
"neo": {
"actor": {
"name": "Keanu Reeves"
}
}
}
}
{
"actor": {
"name": "Keanu Reeves"
}
}
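One way to picture what a path does is a loop that takes one step into the input per path element. The sketch below is only an illustration of that idea in plain Python; follow_path is a hypothetical helper, not a piri function, and piri's real implementation may differ.

```python
from typing import Any, List, Union


def follow_path(data: Any, path: List[Union[str, int]]) -> Any:
    """Hypothetical helper: walk `data` one step per path element."""
    current = data
    for step in path:
        current = current[step]  # each step is a key of the current object
    return current


# input.json from the nested example above
input_data = {"the_matrix": {"neo": {"actor": {"name": "Keanu Reeves"}}}}

actor_name = follow_path(input_data, ["the_matrix", "neo", "actor", "name"])
print(actor_name)
```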
Mapping.path with data in lists
Consider the following JSON:
{
"data": ["Keanu", "Reeves", "The Matrix"]
}
In our mapping object we supply path, which is a list describing how we get to our data. So how do we get the lastname in that data?
Easy, we reference the index of the list. The first element in the list has index 0, the second 1, the third 2 and so on. To get the last name we must use the index 1:
{
"name": "root",
"array": false,
"attributes": [
{
"name": "firstname",
"mappings": [
{
"path": ["data", 0]
}
]
},
{
"name": "lastname",
"mappings": [
{
"path": ["data", 1]
}
]
}
]
}
{
"data": ["Keanu", "Reeves", "The Matrix"]
}
{
"firstname": "Keanu",
"lastname": "Reeves"
}
Note
We still have to reference the "data" key first, so our path goes first to data and then finds the value at index 1.
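The same lookup can be traced by hand in plain Python. This is only a sketch of the idea (a key step followed by an index step), not piri's own code; the variable names are illustrative.

```python
# input.json from the example above
input_data = {"data": ["Keanu", "Reeves", "The Matrix"]}

# The two paths used in the config: a key first, then a zero-based index.
paths = {"firstname": ["data", 0], "lastname": ["data", 1]}

result = {}
for attribute_name, path in paths.items():
    value = input_data
    for step in path:
        value = value[step]  # "data" selects the list, the number selects the element
    result[attribute_name] = value

print(result)
```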
Combining values
Now let's learn how to combine values from multiple places in the input.
It's fairly common to need only name but to get firstname and lastname in the input data. Let's combine them!
{
"name": "root",
"array": false,
"objects": [
{
"name": "actor",
"array": false,
"attributes": [
{
"name": "name",
"mappings": [
{
"path": ["the_matrix", "neo", "actor", "firstname"]
},
{
"path": ["the_matrix", "neo", "actor", "lastname"]
}
],
"separator": " "
}
]
}
]
}
{
"the_matrix": {
"neo": {
"actor": {
"firstname": "Keanu",
"lastname": "Reeves"
}
}
}
}
{
"actor": {
"name": "Keanu Reeves"
}
}
To find more values and combine them, simply add another mapping object to the mappings array.
Use separator to control which characters are placed between the combined values.
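Conceptually, combining is a string join over the values found by each mapping. The snippet below sketches that in plain Python under the assumption that every mapping found a value; how piri treats missing values is not covered here.

```python
# The "actor" object from input.json in the example above
actor = {"firstname": "Keanu", "lastname": "Reeves"}

# One found value per mapping object, in the order they appear in "mappings"
found_values = [actor["firstname"], actor["lastname"]]

separator = " "
name = separator.join(str(value) for value in found_values)
print(name)
```

Changing separator to ", " would produce "Keanu, Reeves" with the same two mappings.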
Slicing
You can use slicing to cut values at value[from:to], which is very useful when you are only interested in part of a value. The value is turned into a string with str() before slicing is applied.
String slice
Let's say that we have a value like street-Santas Polar city 45. We would really like to filter away the street- part of that value, and that is exactly what slicing is for.
{
"name": "root",
"array": false,
"objects": [
{
"name": "fantasy",
"array": true,
"path_to_iterable": ["data"],
"attributes": [
{
"name": "name",
"mappings": [{"path": ["data", 0]}]
},
{
"name": "street",
"mappings": [
{
"path": ["data", 1],
"slicing": {
"from": 7
}
}
]
}
]
}
]
}
{
"data": [
["santa", "street-Santas Polar city 45"],
["unicorn", "street-Fluffy St. 40"]
]
}
{
"fantasy": [
{
"name": "santa",
"street": "Santas Polar city 45"
},
{
"name": "unicorn",
"street": "Fluffy St. 40"
}
]
}
Hint
If a database column has a max length, you can use string slicing with the to key to make sure the value does not exceed it. Some databases also have, for example, two address fields for when one is too short. In that case, map both with slicing "from": 0, "to": 50 and "from": 50, "to": null respectively and you'll solve the problem.
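Since slicing is described as str(value)[from:to], its effect can be previewed in plain Python. The helper slice_value and the sample long_address below are hypothetical and exist only for this sketch; null in the config corresponds to None here.

```python
def slice_value(value, start, end=None):
    """Hypothetical helper mirroring str(value)[from:to]."""
    return str(value)[start:end]


# "street-" is 7 characters long, so "from": 7 drops it.
street = slice_value("street-Santas Polar city 45", 7)

# Splitting a long value over two 50-character address fields.
long_address = "A" * 50 + "B" * 20
address_1 = slice_value(long_address, 0, 50)     # "from": 0, "to": 50
address_2 = slice_value(long_address, 50, None)  # "from": 50, "to": null

print(street)
print(len(address_1), len(address_2))
```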
Slicing numbers and casting
You can also slice numbers, bools and any other JSON value, since we cast the value to a string first. This means that if you, for example, get a social security number but are only interested in the date part of it, you can slice it and then even cast the value to a date.
2020123112345 -> "20201231" -> "2020-12-31"
{
"name": "root",
"array": false,
"objects": [
{
"name": "fantasy",
"array": true,
"path_to_iterable": ["data"],
"attributes": [
{
"name": "name",
"mappings": [{"path": ["data", 0]}]
},
{
"name": "birthday",
"mappings": [
{
"path": ["data", 1],
"slicing": {
"from": 0,
"to": 8
}
}
],
"casting": {
"to": "date",
"original_format": "yyyymmdd"
}
}
]
}
]
}
{
"data": [
["santa", 2020123112345],
["unicorn", 1991123012346]
]
}
{
"fantasy": [
{
"name": "santa",
"birthday": "2020-12-31"
},
{
"name": "unicorn",
"birthday": "1991-12-30"
}
]
}
Hint
If you need to take values from the end of a string, like the last 5 characters, you can use a negative from value to count from the end instead. This works just like Python's slicing.
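The Python behaviour the hint refers to looks like this. It is a plain Python illustration of slicing a number after converting it to a string, using the value from the example above.

```python
social_security_number = 2020123112345

text = str(social_security_number)  # numbers are turned into strings first
date_part = text[0:8]               # "from": 0, "to": 8
last_five = text[-5:]               # a negative "from" counts from the end

print(date_part)
print(last_five)
```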
If statements
If statements are useful when, for example, numbers in your data are supposed to represent different types.
Simple if statement
Let's check if the value equals 1 and output type_one.
{
"name": "root",
"array": false,
"attributes": [
{
"name": "readable_type",
"mappings": [
{
"path": ["type"],
"if_statements": [
{
"condition": "is",
"target": "1",
"then": "type_one"
}
]
}
]
}
]
}
{
"type": "1"
}
{
"readable_type": "type_one"
}
If statements are really useful for changing values depending on some condition. Check the list of supported conditions.
otherwise can also be used to specify what should happen if the condition is false. If otherwise is not provided, the output will be the original value.
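The then/otherwise behaviour described above can be sketched as a small Python function. apply_if is a hypothetical name, only the "is" condition is modelled, and a missing otherwise is represented by a private sentinel; none of this is piri's actual code.

```python
_MISSING = object()  # marks "otherwise was not provided"


def apply_if(value, target, then, otherwise=_MISSING):
    """Hypothetical sketch of one if statement with the "is" condition."""
    if value == target:
        return then
    if otherwise is _MISSING:
        return value  # no otherwise: keep the original value
    return otherwise


print(apply_if("1", target="1", then="type_one"))
print(apply_if("2", target="1", then="type_one"))
print(apply_if("2", target="1", then="type_one", otherwise="other_type"))
```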
Chain If Statements
if_statements is a list of if statement objects. We designed it like this so that we can chain them: the output of the first one is the input of the next one.
The mapping object is not the only one that can have if statements; the attribute can also have them. This allows for some interesting combinations.
{
"name": "root",
"array": false,
"attributes": [
{
"name": "readable_type",
"mappings": [
{
"path": ["type"],
"if_statements": [
{
"condition": "is",
"target": "1",
"then": "boring-type"
},
{
"condition": "is",
"target": "2",
"then": "boring-type-two",
"otherwise": "fun-type"
},
{
"condition": "contains",
"target": "fun",
"then": "funky_type"
}
]
}
],
"if_statements": [
{
"condition": "not",
"target": "funky_type",
"then": "junk",
"otherwise": "funk"
}
]
}
]
}
{
"type": "1"
}
{
"readable_type": "funk"
}
{
"type": "2"
}
{
"readable_type": "junk"
}
Using input.json ("type": "1"), the value changes at every if statement: the first turns it into boring-type, the second falls through to otherwise and gives fun-type, the third matches fun and gives funky_type, and the attribute-level if statement outputs funk because the value is funky_type.
For input2.json ("type": "2"), the first if statement is false, so the value does not change. The second if statement is true, so the value is changed to boring-type-two. The third if statement is false, so again the value does not change. The last if statement checks if the value is not funky_type, which is true, so the value is changed to junk.
You can even add if statements to every mapping object in mappings, so this can handle quite complicated conditions involving multiple values.
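The chaining can be traced with a short Python sketch. run_chain is a hypothetical function written for this guide, and the semantics of "is", "not" and "contains" are assumptions inferred from the walkthrough above (equality, inequality and substring match); check the list of supported conditions for the authoritative behaviour.

```python
def run_chain(value, statements):
    """Hypothetical sketch: feed the output of each if statement into the next."""
    for statement in statements:
        condition, target = statement["condition"], statement["target"]
        if condition == "is":
            matched = value == target
        elif condition == "not":
            matched = value != target
        elif condition == "contains":
            matched = str(target) in str(value)
        else:
            raise ValueError(f"condition not covered by this sketch: {condition}")
        if matched:
            value = statement["then"]
        elif "otherwise" in statement:
            value = statement["otherwise"]
    return value


# The if statements on the mapping object in the config above
mapping_ifs = [
    {"condition": "is", "target": "1", "then": "boring-type"},
    {"condition": "is", "target": "2", "then": "boring-type-two", "otherwise": "fun-type"},
    {"condition": "contains", "target": "fun", "then": "funky_type"},
]
# The if statements on the attribute in the config above
attribute_ifs = [
    {"condition": "not", "target": "funky_type", "then": "junk", "otherwise": "funk"},
]


def readable_type(raw_type):
    return run_chain(run_chain(raw_type, mapping_ifs), attribute_ifs)


print(readable_type("1"))
print(readable_type("2"))
```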
Casting values
You've learned how to structure your output with objects, find values and assign them to attributes, combine values and apply if statements. It's now time to learn how to cast values.
Casting is very useful when we get string (text) data that should be numbers, or when you get badly formatted (non-ISO) date values that you want to change to ISO dates.
Casting is straightforward. You map your value as you normally would and then add the casting object.
Casting to decimal
{
"name": "root",
"array": false,
"attributes": [
{
"name": "my_number",
"mappings": [
{
"path": ["string_number"]
}
],
"casting": {
"to": "decimal"
}
}
]
}
{
"string_number": "123.12"
}
{
"my_number": 123.12
}
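As a rough analogy in plain Python, turning the string into a number could look like the following. Using decimal.Decimal here is an assumption made for illustration; this guide does not state which type piri uses internally.

```python
from decimal import Decimal

string_number = "123.12"           # the value found by the mapping
my_number = Decimal(string_number) # now a number instead of text

print(my_number)
print(my_number + 1)  # arithmetic works once the value is numeric
```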
Casting to ISO Date
When casting to a date we always have to supply the original_format, which is the format of the input data. Without it there would be no way to know for sure in every case whether the value was dd.mm.yy or yy.mm.dd.
{
"name": "root",
"array": false,
"attributes": [
{
"name": "my_iso_date",
"mappings": [
{
"path": ["yymmdd_date"]
}
],
"casting": {
"to": "date",
"original_format": "yymmdd"
}
}
]
}
{
"yymmdd_date": "101020"
}
{
"my_iso_date": "2010-10-20"
}
Check out the configuration docs on casting for more info.
Working with lists
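The need for original_format can be demonstrated with Python's standard datetime module. The strptime patterns "%y%m%d" and "%d%m%y" are assumed equivalents of piri's "yymmdd" and "ddmmyy" for the purpose of this sketch; they are not piri syntax.

```python
from datetime import datetime

raw = "101020"

# Read as yymmdd (the original_format used in the example above)
as_yymmdd = datetime.strptime(raw, "%y%m%d").date().isoformat()

# The very same digits read as ddmmyy give a different date
as_ddmmyy = datetime.strptime(raw, "%d%m%y").date().isoformat()

print(as_yymmdd)
print(as_ddmmyy)
```

Both readings are valid dates, which is why the input format cannot be guessed from the value alone.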
Finally! Last topic and the most interesting one!
Usually the data that you are processing is not one thing, but a list of things. We want to iterate the list and transform each and every piece of data in it. This is the section that lets you do that.
Let's say we are creating a website for an RPG game that dumps its data in a flat format. Every line is a player with a name, class, money, and x, y coordinates for where the player is in the world.
{
"data": {
"something_uninteresting": [1, 2, 3],
"character_data": [
["SuperAwesomeNick", 1, 500],
["OtherAwesomeDude", 2, 300],
["PoorDude", 2, 10]
]
}
}
Now, to make the frontend dudes happy, we would like to structure this nicely... something like:
{
"players": [
{
"nickname": "SuperAwesomeNick",
"class": "warrior",
"gold": 500
},
{
"nickname": "OtherAwesomeDude",
...
}
]
}
Introducing Path to Iterable
We can use path_to_iterable on an object. It works similarly to mapping.path, but it applies the current object and all its attribute mappings and nested objects to each and every element in whatever list path_to_iterable points to.
Let's solve the above example!
{
"name": "root",
"array": false,
"objects": [
{
"name": "players",
"array": true,
"path_to_iterable": ["data", "character_data"],
"attributes": [
{
"name": "nickname",
"mappings": [
{
"path": ["character_data", 0]
}
]
},
{
"name": "class",
"mappings": [
{
"path": ["character_data", 1]
}
],
"if_statements": [
{
"condition": "is",
"target": 1,
"then": "warrior",
"otherwise": "cleric"
}
]
},
{
"name": "gold",
"mappings": [
{
"path": ["character_data", 2]
}
]
}
]
}
]
}
{
"data": {
"something_uninteresting": [1, 2, 3],
"character_data": [
["SuperAwesomeNick", 1, 500],
["OtherAwesomeDude", 2, 300],
["PoorDude", 2, 10]
]
}
}
{
"players": [
{
"nickname": "SuperAwesomeNick",
"class": "warrior",
"gold": 500
},
{
"nickname": "OtherAwesomeDude",
"class": "cleric",
"gold": 300
},
{
"nickname": "PoorDude",
"class": "cleric",
"gold": 10
}
]
}
Note that we still have to reference the key when mapping. The key name (character_data) is the last name in the list of path_to_iterable.
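The iteration can be pictured with the plain Python sketch below. It assumes, based on the note above, that each element is made available to the mappings under the last key of path_to_iterable; follow_path is a hypothetical helper and this is not piri's implementation.

```python
def follow_path(data, path):
    """Hypothetical helper: walk `data` one step per path element."""
    for step in path:
        data = data[step]
    return data


# input.json from the example above
input_data = {
    "data": {
        "something_uninteresting": [1, 2, 3],
        "character_data": [
            ["SuperAwesomeNick", 1, 500],
            ["OtherAwesomeDude", 2, 300],
            ["PoorDude", 2, 10],
        ],
    }
}

path_to_iterable = ["data", "character_data"]

players = []
for element in follow_path(input_data, path_to_iterable):
    # Assumption: the current element is reachable via the last key of the path.
    scope = {path_to_iterable[-1]: element}
    class_id = follow_path(scope, ["character_data", 1])
    players.append({
        "nickname": follow_path(scope, ["character_data", 0]),
        "class": "warrior" if class_id == 1 else "cleric",  # the if statement
        "gold": follow_path(scope, ["character_data", 2]),
    })

print(players)
```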
And that's it!
Congratulations, the introduction course is done!
Time to map some data and have fun doing it!