-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release/r8 #309
Release/r8 #309
Conversation
Adding @chuwy to review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @oguzhanunlu, we definitely need many fixes around implementation of #221. This is most important and interesting (for me personally) ticket in this release.
/** | ||
* Type representing value for `description` key | ||
* | ||
* @see http://json-schema.org/latest/json-schema-validation.html#rfc.section.7.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This link describes "Implementation Requirements", not description
.
val lintDescriptionPresent: Linter = (schema: Schema) => { | ||
(schema.description) match { | ||
case Some(_) => propertySuccess | ||
case None => s"${schema.`type`.get} Schema doesn't contain description property".failure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never ever use Option#get
. This linter will fail with runtime exception for any schema that has no description and no type
, which is extremely common.
(schema.description) match { | ||
case Some(_) => propertySuccess | ||
case None => s"${schema.`type`.get} Schema doesn't contain description property".failure | ||
case _ => propertySuccess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this case for? schema.description: Option[_]
is either Some(_)
or None
. Nothing else is possible.
List(s"Warning: File [${input.getAbsolutePath}] contains a schema whose version is NOT 1-0-0") | ||
else List.empty | ||
val schemaVerError: List[String] = | ||
if(input.isDirectory && versions.head != "1-0-0" && !force) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imagine: input
is a com.acme
directory with files:
com.acme/event/jsonschema/1-0-0
com.acme/context/jsonschema/1-0-1
com.acme/context/jsonschema/1-0-2
1-0-0
is head? Yep. Is it correct? Nope.
Scala core primitives allow you to group schemas by name
and check presence of 1-0-0
version for each particular schema. This is again benefit of "structured" (SchemaMap
) types versus "unstructured" (String
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are definitely right. I forgot to consider name.
else if(input.isDirectory && versions.head == "1-0-0" && existMissingSchemaVersion(versions) && !force){ | ||
List(s"Error: Directory [${input.getAbsolutePath}] has gaps between schema versions. You can use --force to override.") | ||
} | ||
else List.empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broken formatting. Should be } List.empty
on single line. Also, List.empty[Foo]
is useful to help compiler with type inference, but here Nil
can be used.
List(s"Warning: File [${input.getAbsolutePath}] contains a schema whose version is NOT 1-0-0") | ||
else List.empty | ||
val schemaVerError: List[String] = | ||
if(input.isDirectory && versions.head != "1-0-0" && !force) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed space: if (input...
.
@@ -92,6 +93,20 @@ case class GenerateCommand( | |||
// Parse Self-describing Schemas | |||
val (schemaErrors, schemas) = splitValidations(files.map(_.extractSelfDescribingSchema)) | |||
|
|||
// Check schema versions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, refactor this as a separate function and implement unit tests for it. Right now it has too many wrong outputs (read below) - especially we want to avoid runtime exceptions.
private[ctl] def existMissingSchemaVersion(versions: List[String]): Boolean = { | ||
val numOfVersions = versions.length | ||
|
||
if (numOfVersions == 1) return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never ever use return
in Scala. https://tpolecat.github.io/2014/05/09/return.html
|
||
if (numOfVersions == 1) return false | ||
|
||
val versionsArr = versions.map(version => version.split('-').map(_.toInt)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be List[SchemaVer]
, but we casted it to unnecessary vague List[String]
and now trying to get back structure of List[SchemaVer]
.
There's a notion of total and partial functions. If you say that it accepts List[String]
and returns Boolean
then it means that it works for any list of strings. But it will fail (with runtime exceptions - String#toInt
, List#head
!) for way too many strings. You can avoid this by switching back to List[SchemaVer]
.
val schemaVerError: List[String] = | ||
if(input.isDirectory && versions.head != "1-0-0" && !force) | ||
List(s"Error: Directory [${input.getAbsolutePath}] does NOT contain a schema whose version is 1-0-0. You can use --force to override.") | ||
else if(input.isDirectory && versions.head == "1-0-0" && existMissingSchemaVersion(versions) && !force){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, directory com.acme
with:
com.acme/event/jsonschema/1-0-0
com.acme/context/jsonschema/1-0-1
com.acme/context/jsonschema/1-0-2
has no gaps (List(1-0-0, 1-0-1, 1-0-2)
) according to this function, but obviously incorrect overall.
Think of this function working with schemas
directory from https://github.com/snowplow/iglu-central, not jsonschema
from https://github.com/snowplow/iglu-central/tree/master/schemas/com.mailchimp/cleaned_email
5f13ad5
to
4eb4f86
Compare
Hey @oguzhanunlu, I'm trying to compile local copy of Igluctl from
Do you have any idea what happened? Could you try to reproduce it after deleting cache ( By the way - really awesome work on #221! |
Hi @chuwy , I tried to reproduce by deleting |
private[ctl] def validateSchemaVersions(schemas: List[IgluSchema]): (List[String], List[String]) = { | ||
|
||
def existMissingSchemaVersion(schemaMaps: List[SchemaMap]): Boolean = { | ||
val numOfMaps = schemaMaps.length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever you're processing List
with pattern like this (if empty then X, if non-empty then Y) and call head or tail - you need to use pattern-matching:
def existMissingSchemaVersion(schemaMaps: List[SchemaMap]): Boolean =
schemaMaps match {
case Nil => false
case _ :: Nil => false
case h :: tail => {
val prevModel = h.version.model
val prevRevision = h.version.revision
val prevAddition = h.version.addition
val curModel = tail.head.version.model
val curRevision = tail.head.version.revision
val curAddition = tail.head.version.addition
??? // same implementation
}
}
This is form of Mathematical induction. You basically step-by-step reduce whole input to a single simplest step (no schemas = no error). This is a basis for recursive functions (btw, your function is tail-recursive - don't forget to add corresponding annotation).
if (numOfMaps == 1){ | ||
false | ||
} else { | ||
val prevModel = schemaMaps.head.version.model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation itself can also be little bit generalized and simplified (later statement might sound questionable for you now, but will be clearer when we'll have a chance to talk about Iglu ASTs) using two following facts:
SchemaVer
is an ordered set (someone may argue that it's a poset, but let's omit it for a moment). It means that we can build aList
ofSchemaVer
from "lowest" to "highest" by(versions: List[SchemaVer]).sorted)
(instance forOrdering
is in iglu-core)- We can define very general (so general that they can live in
scala-core
) functions:
def modelSucc(a: SchemaVer.Full, b: SchemaVer.Full): Boolean = a.model + 1 == b.model
def revisionSucc(a: SchemaVer.Full, b: SchemaVer.Full): Boolean = a.revision + 1 == b.revision
def additionSucc(a: SchemaVer.Full, b: SchemaVer.Full): Boolean = a.addition + 1 == b.addition
def succ(a: SchemaVer.Full, b: SchemaVer.Full): Boolean = modelSucc(a, b) || revisionSucc(a, b) || additionSucc(a, b)
These functions tell us if a
in some way is successor of b
(succ
originates from Haskell).
Having these two facts we can write very simple function checking if in row of ordered SchemaVer
sneaked a gap.
case class Acc(previous: Option[Full], gaps: List[SchemaVer])
object Acc {
def empty: Acc = Acc(None, Nil)
}
orderedSchemaVers.foldLeft(Acc.empty) { (acc, cur) =>
acc match {
case Acc(None, gaps) if cur == Full(1,0,0) => Acc(Some(Full(1,0,0)), gaps)
case Acc(None, gaps) => Acc(Some(cur), Full(1,0,0) :: gaps)
case Acc(Some(prev), gaps) if succ(prev, cur) => Acc(Some(cur), gaps)
case Acc(Some(prev), gaps) => Acc(Some(cur), cur :: gaps)
}
}
This is quick and dirty implementation (not even entirely correct - it saves not "gaps", but latest found vers, up to you to fix it), but idea should be clear. I want this general functions to live somewhere is upstream libraries (Scala Core or Schema DDL) because this is a basis for a consistent Iglu repositories, where each schema does not live in isolation, but has a "connection" to its successors and predecessor and therefore basis for a schema inference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not saying you need to implement it in R8, but this is a main reason why I was so excited about #221. If you're not going to implement this solution in R8 - then please create a ticket with link (or cite) to above message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
|
||
if (schemas.empty) { | ||
(List.empty[String], List.empty[String]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to use private case class MessageAccumulator(warnings: List[String], errors: List[String])
|
||
result.warnings.foreach(printMessage) | ||
val dirErr = result.warnings.filter(w => w.startsWith("Error: Directory")) | ||
if (!dirErr.isEmpty) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!x.isEmpty === x.nonEmpty
.map(_.write(force)).foreach(printMessage) | ||
|
||
result.warnings.foreach(printMessage) | ||
val dirErr = result.warnings.filter(w => w.startsWith("Error: Directory")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what this check does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Schema version errors happen due to either
1 - Input directory not containing schema version 1-0-0 or
2 - Input directory containing gap(s) between schema versions
per issue strategy which indicates to refuse to do anything if either 1 or 2 occur.
Their error messages start with "Error: Directory..." and I want to check if that happened, before outputting anything else. I didn't want to insert any exit mechanism to transformSelfDescribing hence the current implementation. It doesn't sound much clever to continue program after this line but outputResult
is only place to look at for outputting any end result. Sure it would take shorter to exit if I had inserted. Also, current check is a loose connection with #221 and error outputting which could be a hidden tiny bug in future error-related issues. Which way should we follow @chuwy ?
case (Nil, errors) => | ||
println(errors.mkString("\n")) | ||
false | ||
case (List(_), List(_)) => true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me this last match means if we have one warning and one error then everything is correct
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chuwy , actually this is the case that never gets executed but inserted here to be able to compile. validateSchemaVersions never return warnings together with errors, it is either (no warning, no error), (warning, no error) or (no warning, error) per issue strategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two points here:
- If you need a universal fallback case, you write it as
case _ => true
- It again looks like a way to trick the compiler and developer who doesn't know about context. In languages with strong type discipline like Scala, one of the most important purposes of types is to provide developer enough information to reason about code using smallest possible scope, ideally just types. Therefore it's easier to spot bugs and test the code. Here developer needs to look into different functions/modules and examine this code to understand that we have a line that never gets executed. That's okay for now, but I'd like to see this code refactored someday (of course, it was me who started this mess, but I was young back then!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a short comment for now.
Yeah, I had outdated |
8f8f4c2
to
3ceebc7
Compare
541987e
to
28e57f0
Compare
00fb521
to
0a48cdc
Compare
5eca15c
to
a55a6d2
Compare
action { (x, c) => c.copy(linters = x) } | ||
valueName "<linters>" | ||
text "Lint without specified linters, given comma separated\tDefault: None\n" + | ||
"\t\t\t All linters and their explanations are below.\n" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have been derived from allowedToBeExcluded
or other place where we have names.
1b5c612
to
a023862
Compare
------ RELEASE TO-DO BELOW ----------