Turing-complete programming language to simplify data workflows.

julianjumper/yadl

Yet Another Data Language

Simplify Data Workflows by Combining Programming and Data Operations!
Beyond basic queries, SQL struggles with complex data manipulation. APIs often return data in JSON format, which requires additional parsing. YADL bridges the gap by offering built-in functionality for both - write less code, analyze more effectively. It is a programming language that parses different file types (json, csv, etc.) with a single load function and lets you apply data operations to the result.

YADL is a Turing-complete language and was created by a group at university. Thanks to everyone involved. This repository is a re-upload.

weather_data = load("./weather-data.json", "json") // open json file

bern = weather_data["bern"] // extract "bern" object

// example of a function (fat-arrow style)
has_freezing_days = (city) => {
    return check_any(city, (item) => item["temp"] < 0)
}

print3("Has Bern freezing days?: " + has_freezing_days(bern))
print3("Has Bern not freezing days?: " + check_all(bern, (item) => item["temp"] < 0)) // use function in print-statement
print3("Is Bern the best city?: idk, im a computer")

// find continuous data with a while-loop and if/else
print3("Has Bern continuous data?:")
index = 1
continuous_data = true
while (index < len(bern) and continuous_data) {
    if (bern[index-1]["day"] +1 != bern[index+0]["day"]) {
        continuous_data = false
    }
    
    index = index + 1
}
print3(continuous_data)

Table of Contents

  1. Introduction
    1. Example
    2. Anonymous Functions
    3. Standard Library/Built-ins
    4. Common Bugs
  2. How does YADL work?
  3. Quick Start/Installation
  4. Build Instructions
    1. Prerequisites
    2. Building in Terminal/Shell
    3. Building in IntelliJ IDEA
  5. Testing of Code
    1. Unit testing
    2. Testing with pytest

Introduction

Example

Let's examine the example above!
Assuming the file weather-data.json exists (you can also find it in fancy-tests/weather-data.json), the program uses functions, loops and if-statements to analyze the data in the JSON. This is of course only a small demonstration.

Functions that are not explicitly declared in this example are built-in functions. For a description of all built-in functions, click here.
Please keep in mind that this project was created in a short period of time and has not reached its full potential. The most important idea we had in mind is to load the data chunk-wise, so that not all data needs to be held in memory at once.
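The chunk-wise loading idea can be illustrated outside of YADL. The following is a minimal Python sketch, not YADL's implementation: the helper name iter_chunks, the chunk size and the sample data are all assumptions made for illustration. It processes rows in small batches so that only one batch is held in memory at a time.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator


def iter_chunks(rows: Iterable, chunk_size: int) -> Iterator[list]:
    """Yield lists of at most chunk_size rows without materializing all rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical sample data standing in for a large CSV file on disk.
sample = io.StringIO("day,temp\n1,3\n2,-1\n3,4\n4,-2\n5,0\n")
reader = csv.DictReader(sample)

# Count freezing days while holding at most two rows in memory at a time.
freezing_days = 0
for chunk in iter_chunks(reader, chunk_size=2):
    freezing_days += sum(1 for row in chunk if float(row["temp"]) < 0)

print(freezing_days)  # prints 2
```

Because csv.DictReader is itself lazy, the whole pipeline reads the input incrementally; the same pattern would work with a real file handle in place of the in-memory sample.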

Standard Library / In-built Functions

YADL comes with a lot of in-built functions - map, filter and reduce to name a few. All of them are described here.

Anonymous functions

We also have anonymous functions!
This code from the example uses one:

has_freezing_days = (city) => {
    return check_any(city, (item) => item["temp"] < 0)
}
print3("Is it freezing?", has_freezing_days(bern))

Notice this line: check_any(city, (item) => item["temp"] < 0)
Right there, we use the anonymous function (item) => item["temp"] < 0, which returns true if the attribute "temp" of the passed object "item" is below 0.
In this example, it is passed to check_any, one of the built-in functions from the standard library ("see here" for more information). check_any is a higher-order function: it takes the anonymous function as an argument.
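For readers more familiar with Python, here is a rough analogue of this pattern. It is only a sketch: it assumes that check_any and check_all behave like Python's any and all applied with a predicate, which the text above suggests but does not spell out, and the bern data is made up for illustration.

```python
from typing import Any, Callable, Iterable


def check_any(items: Iterable[Any], predicate: Callable[[Any], bool]) -> bool:
    """Return True as soon as the predicate holds for one item (assumed semantics)."""
    for item in items:
        if predicate(item):
            return True
    return False


def check_all(items: Iterable[Any], predicate: Callable[[Any], bool]) -> bool:
    """Return True only if the predicate holds for every item (assumed semantics)."""
    for item in items:
        if not predicate(item):
            return False
    return True


# Hypothetical data in the shape the example reads from weather-data.json.
bern = [{"day": 1, "temp": 4}, {"day": 2, "temp": -3}]

# The lambda plays the role of the YADL anonymous function (item) => item["temp"] < 0.
has_freezing_days = lambda city: check_any(city, lambda item: item["temp"] < 0)

print(has_freezing_days(bern))                          # prints True
print(check_all(bern, lambda item: item["temp"] < 0))   # prints False
```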

But we can also immediately call anonymous functions instead:

print3("2 + 1:", ((a,b) => a+b)(2,1))
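The same idea of defining a function and calling it in one expression exists in many languages. As a point of comparison only, this is how it looks in Python:

```python
# Define an anonymous function and call it immediately with the arguments 2 and 1.
result = (lambda a, b: a + b)(2, 1)
print("2 + 1:", result)  # prints 2 + 1: 3
```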

Common bugs:

  • In the example, you might have noticed the index+0 in the if-statement. It is needed because the parser expects the result of an operation at that position. I will fix it when I have time!
    By the way, index+0 is the easy workaround - in our group we made fun of this bug by using an anonymous identity function and passing the wanted value: ((i) => i)(index)🤪
  • After an if-statement, you have to insert a blank line - this is also an issue with the parser.

How does YADL work?

YADL is an interpreted language. That means we use a parser to read the yadl file and an interpreter to execute what the parser has parsed. We chose Scala because of its functional-first style, and to make our lives easier we use a parser framework called FastParse. FastParse is a combinator parser: every parsing rule has its own parser, and parsers can take other parsers as arguments - they are higher-order functions. The interpreter keeps track of all variables and functions and evaluates operations (like x = 2+2*3; notice the precedence of multiplication over addition), loops and conditionals. The data stream functions, as well as the interpreter and parser, are written in Scala.
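To make the combinator idea concrete, here is a small Python sketch. It does not use FastParse and is not YADL's grammar; the names token, fmap, left_chain and evaluate are invented for illustration. Each parser is a function, and combinators build larger parsers from smaller ones. Precedence falls out of the nesting: a sum is made of products, so 2*3 is grouped before the addition.

```python
import re
from typing import Any, Callable, Optional, Tuple

# A parser takes (text, position) and returns (value, new_position) or None on failure.
Result = Optional[Tuple[Any, int]]
Parser = Callable[[str, int], Result]


def token(pattern: str) -> Parser:
    """Parser that matches a regex at the current position, skipping leading spaces."""
    compiled = re.compile(r"\s*(" + pattern + r")")

    def parse(text: str, pos: int) -> Result:
        match = compiled.match(text, pos)
        return (match.group(1), match.end()) if match else None

    return parse


def fmap(parser: Parser, fn: Callable[[Any], Any]) -> Parser:
    """Combinator: run a parser and transform its value with fn."""
    def parse(text: str, pos: int) -> Result:
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos

    return parse


def left_chain(operand: Parser, operator: Parser,
               apply: Callable[[Any, Any], Any]) -> Parser:
    """Combinator: parse `operand (operator operand)*` and fold left to right."""
    def parse(text: str, pos: int) -> Result:
        result = operand(text, pos)
        if result is None:
            return None
        value, pos = result
        while True:
            op = operator(text, pos)
            if op is None:
                return value, pos
            rhs = operand(text, op[1])
            if rhs is None:
                return value, pos  # leave the dangling operator unconsumed
            value, pos = apply(value, rhs[0]), rhs[1]

    return parse


number = fmap(token(r"\d+"), int)
term = left_chain(number, token(r"\*"), lambda a, b: a * b)  # binds tighter
expr = left_chain(term, token(r"\+"), lambda a, b: a + b)    # binds looser


def evaluate(source: str) -> int:
    """Parse and evaluate a whole expression, rejecting trailing input."""
    result = expr(source, 0)
    if result is None or source[result[1]:].strip():
        raise ValueError(f"cannot parse: {source!r}")
    return result[0]


print(evaluate("2+2*3"))  # prints 8
```

This sketch evaluates while parsing to stay short. A real interpreter would more likely build a syntax tree first and evaluate it in a separate step, which matches the parser/interpreter split described above.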

Quick Start/Installation

Download JAR

  • download the JAR from this GitHub repository

Running a .yadl file

Run this command:

java -jar <path-to-jar> <path-to-yadl>

path-to-jar is the path to the downloaded jar file and path-to-yadl is the path to your yadl program. You can download and use the test file fancy-tests/wather-data.yadl.

Build Instructions

Prerequisites

  • Scala 3.X
  • Recent Java SDK (openjdk 22 for example)

Building in Terminal/Shell

Run the following commands in the project root.

Just building:

sbt compile

Running:

sbt run

Running with Program arguments:

sbt "run args..."

The quotes are necessary here because otherwise sbt would interpret the arguments as separate commands.

Building in IntelliJ IDEA

Installing Plugins

Install the Scala plugin from the JetBrains Marketplace.

Setting up build tasks

When you are in a project, go to the top-right where you select your current task and choose 'Edit Configurations...' in the drop-down menu.

In the Configuration menu, select the + to add a new task and choose 'sbt Task'.

Now you can give the task a meaningful name and pick a task to run (for example run or "run args..." with arguments) among other settings.

Once done, hit 'Apply' or 'OK' to finish the task setup.

Now you should be able to build/run/package/... the project depending on what you chose as a task.

Unit testing and Python Script testing

Unit testing

Similar to building in the terminal, execute the following to run the Scala unit tests:

sbt test

Python Script testing

These tests take a bit more work to run. The steps below assume you are at the root of the project.

Prerequisites

Install pytest

Step 1

Similar to building in the terminal, execute the assembly task added by the project/plugin.sbt build config:

sbt assembly

This will emit a jar-file which we use in the following steps.

Step 2

The Python scripts rely on the YADL_JAR environment variable pointing to the yadl interpreter.

To set the environment variable, use:

For Linux and Mac:

export YADL_JAR=target/scala-3.4.1/yadl.jar

For Windows:

set YADL_JAR=target/scala-3.4.1/yadl.jar

Step 3

Finally run pytest:

pytest
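To show how a Python test can pick up the interpreter through YADL_JAR, here is a minimal sketch. It is not taken from the project's test suite: the helper names build_command and run_yadl are hypothetical, and the project's actual tests may be organized differently.

```python
import os
import subprocess


def build_command(jar_path: str, script_path: str) -> list[str]:
    """Build the `java -jar <path-to-jar> <path-to-yadl>` command as an argument list."""
    return ["java", "-jar", jar_path, script_path]


def run_yadl(script_path: str) -> str:
    """Run a YADL script with the interpreter named by YADL_JAR and return its stdout."""
    jar_path = os.environ["YADL_JAR"]  # raises KeyError if the variable is not set
    completed = subprocess.run(
        build_command(jar_path, script_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the interpreter exits with an error
    )
    return completed.stdout


def test_build_command():
    # pytest collects functions whose names start with `test_`.
    command = build_command("target/scala-3.4.1/yadl.jar", "program.yadl")
    assert command == ["java", "-jar", "target/scala-3.4.1/yadl.jar", "program.yadl"]
```

A test that exercises the interpreter itself would call run_yadl with the path to a .yadl file and assert on the returned output. That requires Java and the assembled jar from Step 1 to be available.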
