[DOC] Add docs for formatting logic #123

Merged
merged 1 commit into from
Jan 1, 2021
72 changes: 69 additions & 3 deletions README.md
@@ -86,7 +86,7 @@ print(format_sql(example_sql))
and b.qwer = 2
and a.asdf <= 1 --comment that
or b.qwer >= 5
GROUP BY a.asdf;
GROUP BY a.asdf


It can even deal with subqueries, and it will correct my favourite careless mistake (a comma at the end of the SELECT statement, right before FROM) for you on the fly :-)
@@ -116,7 +116,7 @@ where qwer1 >= 0
FROM table2
WHERE qwer2 = 1) as b
ON a.asdf = b.asdf
WHERE qwer1 >= 0;
WHERE qwer1 >= 0


The formatter is also robust against nested subqueries
@@ -138,7 +138,7 @@ field3 from table1 where a=1 and b>=100))
field3
FROM table1
WHERE a = 1
and b >= 100));
and b >= 100))


If you do not want a particular query in your SQL file to be formatted, you can add the marker `/*skip-formatter*/` to that query to disable formatting for just that query
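
The opt-out behaviour described above can be pictured with a small sketch. The helper `format_unless_skipped` and its `formatter` argument are hypothetical names used only for illustration; they are not part of the package's API, and the real implementation may detect the marker differently.

```python
from typing import Callable

SKIP_MARKER = "/*skip-formatter*/"  # marker named in the README


def format_unless_skipped(query: str, formatter: Callable[[str], str]) -> str:
    """Apply `formatter` unless the query carries the skip marker.

    Hypothetical helper: `formatter` stands in for the real formatting
    function and is an assumption of this sketch.
    """
    if SKIP_MARKER in query:
        return query  # leave the query exactly as written
    return formatter(query)


# Usage with a stand-in formatter (str.upper is NOT how sql_formatter works)
print(format_unless_skipped("select 1", str.upper))
print(format_unless_skipped("/*skip-formatter*/ select 1", str.upper))
```

Passing the formatter in as an argument keeps the sketch independent of the package internals.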
@@ -198,6 +198,70 @@ This package is just a SQL formatter and therefore
* cannot parse your SQL queries into e.g. dictionaries
* cannot validate your SQL queries to be valid for the corresponding database system / provider

Up to now it only formats queries of the form

* `CREATE TABLE / VIEW ...`
* `SELECT ...`

All other SQL commands (e.g. `INSERT INTO`) remain unformatted.
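
As a rough illustration of this restriction, the check below decides whether a query starts with one of the supported statement types. The helper `is_formattable` and its regular expression are assumptions made for this sketch, not the package's actual code; leading comments, for instance, are not handled.

```python
import re

# Hypothetical pattern: a query qualifies if it starts with SELECT or with
# CREATE followed by optional words (e.g. OR REPLACE) and then TABLE / VIEW.
_FORMATTABLE = re.compile(
    r"^\s*(?:create(?:\s+\w+)*?\s+(?:table|view)\b|select\b)",
    flags=re.IGNORECASE,
)


def is_formattable(query: str) -> bool:
    """Return True for CREATE ... TABLE / VIEW and SELECT queries."""
    return _FORMATTABLE.match(query) is not None


print(is_formattable("select a from b"))
print(is_formattable("CREATE OR REPLACE TABLE t AS select 1"))
print(is_formattable("insert into t values (1)"))
```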

## Formatting Logic

The main goal of the `sql_formatter` is to make SQL queries easier to read and quicker to understand via proper formatting. We use **indentation** and **lowercasing / uppercasing** to put statements / clauses and their parameters into context. By **programmatically standardizing** the way SQL queries are written, we help users understand their queries faster.

As a by-product of using the `sql_formatter`, developer teams can focus on the query logic itself and save time by leaving styling decisions to the `sql_formatter`. This is similar to the goal of the [black package](https://github.com/psf/black) for the Python language, which was also an inspiration for the development of this package for SQL.

We can summarize the main steps of the formatter as follows:

1. Each query is separated from above by two newlines.
2. Everything but **main statements\* / clauses** is lowercased

\* Main statements:

* CREATE ... TABLE / VIEW table_name AS
* SELECT (DISTINCT)
* FROM
* (LEFT / INNER / RIGHT / OUTER) JOIN
* UNION
* ON
* WHERE
* GROUP BY
* ORDER BY
* OVER
* PARTITION BY

3. Indentation is used to put parameters into context. Here is an easy example:

```sql
SELECT field1,
       case when field2 > 1 and
                 field2 <= 10 and
                 field1 = 'a' then 1
            else 0 end as case_field,
       ...
FROM   table1
WHERE  field1 = 1
   and field2 <= 2
    or field3 = 5
ORDER BY field1
```
> This is a very nice, easy example but things can become more complicated if comments come into play

4. Subqueries are also properly indented, e.g.:

```sql
SELECT a.field1,
       a.field2,
       b.field3
FROM   (SELECT field1,
               field2
        FROM   table1
        WHERE  field1 = 1) as a
    LEFT JOIN (SELECT field1,
                      field3
               FROM   table2) as b
        ON a.field1 = b.field1
```

5. Everything that is not a query of the form `CREATE ... TABLE / VIEW` or `SELECT ...` is left unchanged
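
Step 2 of the list above (uppercase main statements, lowercase everything else) could be sketched roughly as follows. This is a simplified illustration, not the package's implementation: the name `standardize_case` is hypothetical, the `CREATE ... TABLE / VIEW` statement is left out for brevity, and the sketch assumes the query contains no string literals or comments (which a real formatter must leave untouched).

```python
import re

# Main statements from the list above as regex fragments. Multi-word
# variants come first so that e.g. "LEFT JOIN" wins over plain "JOIN".
MAIN_STATEMENTS = [
    r"select\s+distinct", r"select", r"from",
    r"(?:left|inner|right|outer)\s+join", r"join", r"union", r"on",
    r"where", r"group\s+by", r"order\s+by", r"over", r"partition\s+by",
]
_PATTERN = re.compile(
    r"\b(?:" + "|".join(MAIN_STATEMENTS) + r")\b", flags=re.IGNORECASE
)


def standardize_case(query: str) -> str:
    """Lowercase the whole query, then uppercase the main statements."""
    lowered = query.lower()
    return _PATTERN.sub(lambda match: match.group(0).upper(), lowered)


print(standardize_case("Select A.Field1 From Table1 as A Where A.Field1 = 1"))
```

Lowercasing first and uppercasing the keywords afterwards keeps the sketch to a single regular expression pass.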

## How to contribute

You can contribute to this project:
@@ -282,3 +346,5 @@ For development we follow these steps:
Thank you very much to Jeremy Howard and all the [nbdev](https://github.com/fastai/nbdev) team for enabling the *fast* and delightful development of this library via the `nbdev` framework.

For more details on `nbdev`, see its official [tutorial](https://nbdev.fast.ai/tutorial.html)

Thank you very much to the developers of the [black](https://github.com/psf/black) package, which was also an inspiration for the development of this package
80 changes: 77 additions & 3 deletions docs/index.html
@@ -165,7 +165,7 @@ <h3 id="Usage-in-Python">Usage in Python<a class="anchor-link" href="#Usage-in-P
and b.qwer = 2
and a.asdf &lt;= 1 --comment that
or b.qwer &gt;= 5
GROUP BY a.asdf;
GROUP BY a.asdf
</pre>
</div>
</div>
@@ -224,7 +224,7 @@ <h3 id="Usage-in-Python">Usage in Python<a class="anchor-link" href="#Usage-in-P
FROM table2
WHERE qwer2 = 1) as b
ON a.asdf = b.asdf
WHERE qwer1 &gt;= 0;
WHERE qwer1 &gt;= 0
</pre>
</div>
</div>
@@ -275,7 +275,7 @@ <h3 id="Usage-in-Python">Usage in Python<a class="anchor-link" href="#Usage-in-P
field3
FROM table1
WHERE a = 1
and b &gt;= 100));
and b &gt;= 100))
</pre>
</div>
</div>
@@ -375,6 +375,79 @@ <h3 id="What-sql_formatter-does-not-do">What <code>sql_formatter</code> does not
<li>cannot parse your SQL queries into e.g. dictionaries</li>
<li>cannot validate your SQL queries to be valid for the corresponding database system / provider</li>
</ul>
<p>Up to now it only formats queries of the form</p>
<ul>
<li><code>CREATE TABLE / VIEW ...</code></li>
<li><code>SELECT ...</code></li>
</ul>
<p>All other SQL commands (e.g. <code>INSERT INTO</code>) remain unformatted.</p>

</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h2 id="Formatting-Logic">Formatting Logic<a class="anchor-link" href="#Formatting-Logic"> </a></h2><p>The main goal of the <code>sql_formatter</code> is to make SQL queries easier to read and quicker to understand via proper formatting. We use <strong>indentation</strong> and <strong>lowercasing / uppercasing</strong> to put statements / clauses and their parameters into context. By <strong>programmatically standardizing</strong> the way SQL queries are written, we help users understand their queries faster.</p>
<p>As a by-product of using the <code>sql_formatter</code>, developer teams can focus on the query logic itself and save time by leaving styling decisions to the <code>sql_formatter</code>. This is similar to the goal of the <a href="https://github.com/psf/black">black package</a> for the Python language, which was also an inspiration for the development of this package for SQL.</p>
<p>We can summarize the main steps of the formatter as follows:</p>
<ol>
<li>Each query is separated from above by two newlines.</li>
<li>Everything but <strong>main statements* / clauses</strong> is lowercased</li>
</ol>
<p>* Main statements:</p>
<ul>
<li>CREATE ... TABLE / VIEW table_name AS</li>
<li>SELECT (DISTINCT)</li>
<li>FROM</li>
<li>(LEFT / INNER / RIGHT / OUTER) JOIN</li>
<li>UNION</li>
<li>ON</li>
<li>WHERE</li>
<li>GROUP BY</li>
<li>ORDER BY</li>
<li>OVER</li>
<li>PARTITION BY</li>
</ul>
<ol>
<li>Indentation is used to put parameters into context. Here is an easy example:</li>
</ol>
<div class="highlight"><pre><span></span><span class="k">SELECT</span> <span class="n">field1</span><span class="p">,</span>
<span class="k">case</span> <span class="k">when</span> <span class="n">field2</span> <span class="o">&gt;</span> <span class="mi">1</span> <span class="k">and</span>
<span class="n">field2</span> <span class="o">&lt;=</span> <span class="mi">10</span> <span class="k">and</span>
<span class="n">field1</span> <span class="o">=</span> <span class="s1">&#39;a&#39;</span> <span class="k">then</span> <span class="mi">1</span>
<span class="k">else</span> <span class="mi">0</span> <span class="k">end</span> <span class="k">as</span> <span class="n">case_field</span><span class="p">,</span>
<span class="p">...</span>
<span class="k">FROM</span> <span class="n">table1</span>
<span class="k">WHERE</span> <span class="n">field1</span> <span class="o">=</span> <span class="mi">1</span>
<span class="k">and</span> <span class="n">field2</span> <span class="o">&lt;=</span> <span class="mi">2</span>
<span class="k">or</span> <span class="n">field3</span> <span class="o">=</span> <span class="mi">5</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">field1</span>
</pre></div>
<blockquote><p>This is a very nice, easy example but things can become more complicated if comments come into play</p>
</blockquote>
<ol>
<li>Subqueries are also properly indented, e.g.:</li>
</ol>
<div class="highlight"><pre><span></span>SELECT a.field1,
       a.field2,
       b.field3
FROM   (SELECT field1,
               field2
        FROM   table1
        WHERE  field1 = 1) as a
    LEFT JOIN (SELECT field1,
                      field3
               FROM   table2) as b
        ON a.field1 = b.field1
</pre></div>
<ol>
<li>Everything that is not a query of the form <code>CREATE ... TABLE / VIEW</code> or <code>SELECT ...</code> is left unchanged</li>
</ol>

</div>
</div>
@@ -472,6 +545,7 @@ <h2 id="Acknowledgements">Acknowledgements<a class="anchor-link" href="#Acknowle
<div class="text_cell_render border-box-sizing rendered_html">
<p>Thank you very much to Jeremy Howard and all the <a href="https://github.com/fastai/nbdev">nbdev</a> team for enabling the <em>fast</em> and delightful development of this library via the <code>nbdev</code> framework.</p>
<p>For more details on <code>nbdev</code>, see its official <a href="https://nbdev.fast.ai/tutorial.html">tutorial</a></p>
<p>Thank you very much to the developers of the <a href="https://github.com/psf/black">black</a> package, which was also an inspiration for the development of this package</p>

</div>
</div>
85 changes: 80 additions & 5 deletions nbs/index.ipynb
@@ -152,7 +152,7 @@
" and b.qwer = 2\n",
" and a.asdf <= 1 --comment that\n",
" or b.qwer >= 5\n",
"GROUP BY a.asdf;\n"
"GROUP BY a.asdf\n"
]
}
],
@@ -189,7 +189,7 @@
" FROM table2\n",
" WHERE qwer2 = 1) as b\n",
" ON a.asdf = b.asdf\n",
"WHERE qwer1 >= 0;\n"
"WHERE qwer1 >= 0\n"
]
}
],
@@ -231,7 +231,7 @@
" field3\n",
" FROM table1\n",
" WHERE a = 1\n",
" and b >= 100));\n"
" and b >= 100))\n"
]
}
],
@@ -322,7 +322,80 @@
"This package is just a SQL formatter and therefore\n",
"\n",
"* cannot parse your SQL queries into e.g. dictionaries\n",
"* cannot validate your SQL queries to be valid for the corresponding database system / provider"
"* cannot validate your SQL queries to be valid for the corresponding database system / provider\n",
"\n",
"Up to now it only formats queries of the form\n",
"\n",
"* `CREATE TABLE / VIEW ...`\n",
"* `SELECT ...`\n",
"\n",
"All other SQL commands (e.g. `INSERT INTO`) remain unformatted."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Formatting Logic\n",
"\n",
"The main goal of the `sql_formatter` is to make SQL queries easier to read and quicker to understand via proper formatting. We use **indentation** and **lowercasing / uppercasing** to put statements / clauses and their parameters into context. By **programmatically standardizing** the way SQL queries are written, we help users understand their queries faster.\n",
"\n",
"As a by-product of using the `sql_formatter`, developer teams can focus on the query logic itself and save time by leaving styling decisions to the `sql_formatter`. This is similar to the goal of the [black package](https://github.com/psf/black) for the Python language, which was also an inspiration for the development of this package for SQL.\n",
"\n",
"We can summarize the main steps of the formatter as follows:\n",
"\n",
"1. Each query is separated from above by two newlines.\n",
"2. Everything but **main statements\\* / clauses** is lowercased\n",
"\n",
"\\* Main statements:\n",
"\n",
"* CREATE ... TABLE / VIEW table_name AS\n",
"* SELECT (DISTINCT)\n",
"* FROM\n",
"* (LEFT / INNER / RIGHT / OUTER) JOIN\n",
"* UNION\n",
"* ON\n",
"* WHERE\n",
"* GROUP BY\n",
"* ORDER BY\n",
"* OVER\n",
"* PARTITION BY\n",
"\n",
"3. Indentation is used to put parameters into context. Here is an easy example:\n",
"\n",
"```sql\n",
"SELECT field1,\n",
"       case when field2 > 1 and\n",
"                 field2 <= 10 and\n",
"                 field1 = 'a' then 1\n",
"            else 0 end as case_field,\n",
"       ...\n",
"FROM   table1\n",
"WHERE  field1 = 1\n",
"   and field2 <= 2\n",
"    or field3 = 5\n",
"ORDER BY field1\n",
"```\n",
"\n",
"> This is a very nice, easy example but things can become more complicated if comments come into play\n",
"\n",
"4. Subqueries are also properly indented, e.g.:\n",
"\n",
"```sql\n",
"SELECT a.field1,\n",
"       a.field2,\n",
"       b.field3\n",
"FROM   (SELECT field1,\n",
"               field2\n",
"        FROM   table1\n",
"        WHERE  field1 = 1) as a\n",
"    LEFT JOIN (SELECT field1,\n",
"                      field3\n",
"               FROM   table2) as b\n",
"        ON a.field1 = b.field1\n",
"```\n",
"\n",
"5. Everything that is not a query of the form `CREATE ... TABLE / VIEW` or `SELECT ...` is left unchanged"
]
},
{
@@ -427,7 +500,9 @@
"source": [
"Thank you very much to Jeremy Howard and all the [nbdev](https://github.com/fastai/nbdev) team for enabling the *fast* and delightful development of this library via the `nbdev` framework.\n",
"\n",
"For more details on `nbdev`, see its official [tutorial](https://nbdev.fast.ai/tutorial.html)"
"For more details on `nbdev`, see its official [tutorial](https://nbdev.fast.ai/tutorial.html)\n",
"\n",
"Thank you very much to the developers of the [black](https://github.com/psf/black) package, which was also an inspiration for the development of this package"
]
}
],