Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Upserts #595

Closed
wants to merge 5 commits into from
Closed

[WIP] Upserts #595

wants to merge 5 commits into from

Conversation

hntd187
Copy link

@hntd187 hntd187 commented Oct 19, 2016

This is in reference to #16

Problem

This ideally should bring atomic upsert support to PostgreSQL 9.5+ and MySQL via the following queries

PgSQL

INSERT INTO TestEntity (s,l,o) VALUES (?, ?, ?) ON CONFLICT(i) DO UPDATE SET i = ?, l = ?, s = ?

MySQL

INSERT INTO TestEntity (s,l,o) VALUES (?, ?, ?) ON DUPLICATE KEY UPDATE i = ?, l = ?, s = ?;

Solution

I have added to the DSL an interface that I think encompasses most of the upsert syntax here. You can see in the example queries there are 3 important parts, the insert, the conflict key and then what to do about it.

Thus I have modeled the DSL to look like this

quote {
  query[TestEntity]
    .upsert(lift(e))
    .conflict(_.i)
    .conflictUpdate(_.i -> lift(2), _.l -> lift(1L), _.s -> lift("Test String"))
}

There are some problems with it right now as I've written it, but I think this is mostly consistent with how it should be, but you guys are in charge so don't let me tell you how to model your DSL :-)

The checklist in #16 was rather out of date with the current library it seemed, so I had to improvise a bit in some places. I'm not sure if the normalization is necessary at all to be honest. There is also some other bugs I've run across I'd like some insight on how to sort out.

Notes

Additional notes.

Checklist

  • Base DSL Finished
  • Able to generate a standard PostgreSQL upsert
  • Fix inability to lift a case class in to conflictUpdate()
  • Fix the ability to implement the dialect specific SQL in it's proper dialect subtrait
  • Actually implement the proper dialect specific SQL
    • PostgreSQL
    • MySQL
  • Add inability to use upserts without specifying conflicts or conflict updates
  • Add ability to do nothing when an upsert conflicts, such as INSERT INTO TestEntity (s,l,o) VALUES (?, ?, ?) ON CONFLICT(i) DO NOTHING
  • Placeholder for things I've probably forgot.

@getquill/maintainers

"upsert" in {
val e = TestEntity("", 1, 1L, Some(1))
val q = quote {
query[TestEntity].upsert(lift(e)).conflict(_.i).conflictUpdate(_.i -> lift(2), _.l -> lift(1L), _.s -> lift("Test String"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't seem that conflict can be used without conflictUpdate, I think we could simplify things and avoid misuse:

query[TestEntity].insert(lift(e)).onConflict(_.i)(_.i -> lift(2), _.l -> lift(1L), _.s -> lift("Test String"))

The AST could also be simplified to a single new type. Wdyt?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure that's a great idea, I'd must rather have one type, than the 2 I had, but my familiarity with the library led me to do what I knew how to do at the time. I can definitely trim that down back to the single type. Lemme try that

@hntd187
Copy link
Author

hntd187 commented Oct 20, 2016

I updated some of this with new commits,

// Update a conflict
val q = quote {
 query[TestEntity].insert(lift(e)).onConflict(_.i)(_.i -> lift(1), _.l -> lift(2L))
}
// Do nothing...
val q2 = quote {
  query[TestEntity].insert(lift(e)).onConflict(_.i)().returning(_.i)
}
INSERT INTO TestEntity (s,l,o) VALUES (?, ?, ?) ON CONFLICT(x1.i) UPDATE SET x2.i = ?, x3.l = ?
INSERT INTO TestEntity (s,l,o) VALUES (?, ?, ?) ON CONFLICT(x4.i) DO NOTHING

is how the queries look. When the cases are implemented in the top level SQL idiom they don't have the x1, x2, x3 so on things added to them, but when they are implemented in their proper dialect trait, in this case PostgreSQL, they add them. @fwbrasil can you explain why this is and how I can stop it?

@schmitch
Copy link
Contributor

schmitch commented Nov 10, 2016

UPDATE SET x2.i = ?, x3.l = ?wouldn't it make more sense to have: UPDATE SET x2.i = EXCLUDED.s, x3.l = EXCLUDED.l?

@hntd187
Copy link
Author

hntd187 commented Nov 10, 2016

@schmitch I don't think it really matters, you can do either way of writing a full update and binding to it or you can write it with what you had laid out, they'll do mostly the same thing I believe

@hntd187
Copy link
Author

hntd187 commented Nov 10, 2016

@fwbrasil poke poke, could you help me with some of the issues I described above?

@schmitch
Copy link
Contributor

I don't think it really matters, you can do either way of writing a full update and binding to it or you can write it with what you had laid out, they'll do mostly the same thing I believe

Actually they are quite different when it comes to resource usage. a prepared statement is not free every variable costs space in big queries that actually can matter.

@hntd187
Copy link
Author

hntd187 commented Nov 10, 2016

Well my point was you are reusing the same variables from before, which wouldn't add any cost, right? Or maybe I'm not understanding it correctly.

@schmitch
Copy link
Contributor

schmitch commented Nov 10, 2016

unfortunatly you don't. if you have two ? with the same variable you need to set them twice.
Consider the following:

def bla(implicit connection: Connection) = {
    val stmt = connection.prepareStatement("INSERT INTO hello VALUES (?, ?)")
    val id = 1
    stmt.setInt(1, id) 
    stmt.setInt(2, id)
  }

of course postgresql see's that as two variables the same happens in ON CONFLICT DO UPDATE.
a prepared statement does not reuse the variable for each parameter, especially not on the database side.

@fwbrasil
Copy link
Collaborator

@hntd187 there's a special tokenizer for properties within actions that removes the table identifier. You probably need to pass the implicit to your tokenizer.

@fwbrasil
Copy link
Collaborator

@hntd187 do you have plans to work on this feature? It'd be great if we could support upserts!

@hntd187
Copy link
Author

hntd187 commented Apr 25, 2017

@fwbrasil hey! I'm sorry, I've just been swamped and haven't gotten a chance to tie this up, but I promise I'll finish it. I just need to rebase and tie up a few things, but this is basically at the finish line I just gotta push it across, sorry for the delay.

@fwbrasil
Copy link
Collaborator

@hntd187 no worries at all :) let us know if there's something we can help with!

@letalvoj
Copy link
Contributor

Hey. Could you please give me a brief info into what's missing? I'd love to help out. I find this feature super useful.

@hntd187
Copy link
Author

hntd187 commented Jul 14, 2017

Okay, I rebased master back onto my branch, but clearly it's been awhile, I still have some work to do here, but this clearly is all kinds of screwed up now. Leave it or open a new pull @fwbrasil ?

@mxl
Copy link
Contributor

mxl commented Jul 14, 2017

@hntd187 You did not rebase, but merged master into your branch. You should reset your branch to latest commit you made with git reset --hard <commit sha>, rebase your branch on master with git rebase upstream/master where upstream is the name of the [email protected]:getquill/quill.git remote and push it with git push -f.

@hntd187
Copy link
Author

hntd187 commented Jul 14, 2017

@mxl not the same as git pull --rebase upstream master ?

@mxl
Copy link
Contributor

mxl commented Jul 14, 2017

@hntd187 No, you do git pull --rebase upstream master when you are working on master branch and want to rebase your commits in local master on upstream master. Here you work in separate branch, so you should first git fetch upstream and then git rebase upstream/master.

@hntd187
Copy link
Author

hntd187 commented Jul 15, 2017

Well TIL thanks, I'll fix this up in a bit.

Stephen Carman and others added 5 commits July 14, 2017 23:43
… on update. There is some compatibility issues with PSQL vs MySQL, which makes some of the code look a bit off. I'd like to fix this, but it'll require probably some more DSL additions.
@hntd187
Copy link
Author

hntd187 commented Jul 15, 2017

@mxl well would ya look at that, cool beans. This still needs some work, but I've sorted out at least that part of it so far. The MySQL and Postgres dialects are different enough that the syntax I have for them seems good for PSql, but horrible for MySQL, I'm not sure if I should refactor it again to be somehow agnostic to them both or if there is a way to implement a dialect specific version of each?

@letalvoj
Copy link
Contributor

letalvoj commented Jul 30, 2017

@hntd187 I mean what about both approaches? Both dialects could have a onConflict function. The only difference would be that the MySQL version would be parameterless.

And I am also for having a some implicit/explicit update all functionality which would work something like updateAll()

Examples

PgSQL

query[TestEntity].insert(lift(e)).onConflict(_.a).updateAll() 
query[TestEntity].insert(lift(e)).onConflict(_.a).updateSet(_.a -> lift(e.a), _.b -> lift(e.b))
INSERT INTO TestEntity (a, b) VALUES (?, ?) ON CONFLICT(a) DO UPDATE SET a = EXCLUDED.b, b = EXCLUDED.b
INSERT INTO TestEntity (a, b) VALUES (?, ?) ON CONFLICT(a) DO UPDATE SET a = ?, b = ?
-- or the actual values, I am not sure how does quill handle the right-hand side params now

MySQL

query[TestEntity].insert(lift(e)).onConflict().updateAll()
INSERT INTO TestEntity (a, b) VALUES (?, ?) ON DUPLICATE KEY UPDATE a = VALUES(a), b = VALUES(b);

In a case of Postgres it can be simplified by removing all CONFLICT columns from the UPDATE clause. Which does not hold for MySQL but it should not affect the functionality in the first place.

What do you think? I find it intuitive.

@letalvoj
Copy link
Contributor

letalvoj commented Jul 31, 2017

Well I had a free afternoon so I checked that out and it seems to me that this would require some changes.

  • parser for vararg of Property
    • Property will than need to have a simmilar spot under the sun as the Assignment has

I kind of fail to change the signature of Conflict to

case class Conflict(query: Ast, property: List[Ast]) extends Action
// or
case class Conflict(query: Ast, property: List[Property]) extends Action

because in BetaReduction and some other places I need Ident which I can not get since Property has only a generic Ast as an argument.

If you agree with what I described above could you please changing the

def conflict[R](f: E => R): ActionConflict[E, R] = NonQuotedException()

to

def conflict[R](f: (E => R)*): ActionConflict[E, R] = NonQuotedException()

and dealing with the problem I described above?

Also an idea. We can

  • throw a simple error from the dialect itself if there is any column specified in the MySQL case
  • or maybe force the both the IDE and compiler to "enable" the proper language specific variants of the method by introducing an implicit parameter, which would be resolvable only based on the dialect

@fwbrasil
Copy link
Collaborator

@letalvoj Thanks for looking into it. I think it'd be nicer to have a single onConflict method that takes a parameter because the query would fail if there isn't a update* after it. Regarding the promoting fields to Ast, it should be fine if justified.

@letalvoj
Copy link
Contributor

letalvoj commented Aug 1, 2017

Again just an idea but the problem which you've described has a solution. What if there were those two traits

  • PartialAction <- this is what is currently being processed internally in all the pattern matches as Action
  • Action <: PartialAction this is executable by def run(quoted: Quoted[Action[_]])

This would allow to have a builder-like structure for some more complicated and yet it would not fail during runtime. One problem I can think of is that the error would be something like Conflict is not an instance of Action which might not be such a big of a deal.

Anyway I can not proceed until someone implements the vararg parser for Properties and applies it to def conflict[R](f: (E => R)*):. There is too much implementation details which I do not fully underestand to do the change (as for example in FreeVariables).

@dsounded
Copy link

any progress ?

/*
INSERT INTO TestEntity (s, l, o) VALUES (s, l, o) ON CONFLICT(i) DO UPDATE SET i = ?, l = ?, s = ?
INSERT INTO TestEntity (s, l, o) VALUES (?, ?, ?) ON CONFLICT(i) DO NOTHING
INSERT INTO TestEntity (s, l) VALUES ('Hi', 10) ON CONFLICT(i) DO UPDATE SET s = 'Yoyo'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe Hihi instead of Yoyo ?

@mosyp
Copy link
Collaborator

mosyp commented Nov 8, 2017

@hntd187 Hello, how things are going with that? Do you mind finishing that or you need some help? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants