
Add sqlc.nembed #2472

Closed
wants to merge 2 commits into from

Conversation

@richchurcher commented Jul 18, 2023

This:

  • adds sqlc.nembed, which declares an embed to be nullable.
  • checks returned query values associated with the nembed
    • If all are NULL, the struct pointer will be nil.
    • If at least one is NOT NULL, a scan into the struct will be attempted.
  • prevents scans of non-nullable fields in the case where the RHS of the join is NULL

Obviously, this is very rough, but I'm wondering if I can get validation of the overall approach? Which is to say: "know where all the nullable embed column indices are, and check whether they're all going to receive nil values". If so, I'll flesh it out and try to get the non-pgx drivers working... I just wanted to check in before spending more time on it.

I'm not wild about the extra complexity in the generated code and the template, but I'm not sure there's an easier way around it until <future date> when sqlc knows a bit more about joins. At least this approach is opt-in, and doesn't care about things like primary keys and what kind of join we're in.

Given the query:

-- name: GetUserRelationships :one
select sqlc.embed(u), sqlc.embed(b), sqlc.nembed(i) from users u
  left join buckets b on u.bucket_id = b.id
  left join images i on u.avatar_id = i.id
  where u.deleted_at is null
    and u.id = $1
  limit 1;

Produces sample output:

const getUserRelationships = `-- name: GetUserRelationships :one
select u.id, u.auth_provider_type, u.auth_provider_identifier, u.avatar_id, u.bucket_id, u.email, u.name, u.created_at, u.updated_at, u.deleted_at, b.id, b.amount, b.created_at, b.updated_at, b.deleted_at, i.id, i.object_key, i.created_at from users u
  left join buckets b on u.bucket_id = b.id
  left join images i on u.avatar_id = i.id
  where u.deleted_at is null
    and u.id = $1
  limit 1
`

type GetUserRelationshipsRow struct {
	User   User   `db:"user" json:"user"`
	Bucket Bucket `db:"bucket" json:"bucket"`
	Image  *Image `db:"image" json:"image"`
}

func (q *Queries) GetUserRelationships(ctx context.Context, id hashid.HashID) (GetUserRelationshipsRow, error) {
	rows, err := q.db.Query(ctx, getUserRelationships, id)
	var i GetUserRelationshipsRow
	if err != nil {
		return i, err
	}
	var nImage Image
	cols := []interface{}{
		&i.User.ID,
		&i.User.AuthProviderType,
		&i.User.AuthProviderIdentifier,
		&i.User.AvatarID,
		&i.User.BucketID,
		&i.User.Email,
		&i.User.Name,
		&i.User.CreatedAt,
		&i.User.UpdatedAt,
		&i.User.DeletedAt,
		&i.Bucket.ID,
		&i.Bucket.Amount,
		&i.Bucket.CreatedAt,
		&i.Bucket.UpdatedAt,
		&i.Bucket.DeletedAt,
		&nImage.ID,
		&nImage.ObjectKey,
		&nImage.CreatedAt,
	}
	defer rows.Close()
	// This effectively duplicates the behaviour of Row.Scan, which we can't use (because it doesn't
	// provide Values).
	if !rows.Next() {
		if rows.Err() == nil {
			return i, pgx.ErrNoRows
		}
		return i, rows.Err()
	}
	vals, verr := rows.Values()
	if verr != nil {
		return i, verr
	}
	nullableIndices := [][]int{[]int{15, 16, 17}}
	setEmbedsNil(vals, cols, nullableIndices)
	if err := rows.Scan(cols...); err != nil {
		return i, err
	}
	i.Image = &nImage
	return i, err
}
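The generated code above calls a `setEmbedsNil` helper whose body isn't shown in this snippet. A minimal sketch of what it might do, based on the description at the top of the PR (this is my reconstruction, not code from the PR itself):

```go
package main

import "fmt"

// setEmbedsNil is a hypothetical sketch of the helper the generated code
// calls. For each group of column indices belonging to a nullable embed,
// it checks whether every raw value returned by rows.Values() is NULL;
// if so, it swaps the scan targets for throwaway destinations so the
// later rows.Scan never writes a NULL into a non-nullable field.
func setEmbedsNil(vals []interface{}, cols []interface{}, nullableIndices [][]int) {
	for _, group := range nullableIndices {
		allNull := true
		for _, idx := range group {
			if vals[idx] != nil {
				allNull = false
				break
			}
		}
		if !allNull {
			continue
		}
		for _, idx := range group {
			var discard interface{}
			cols[idx] = &discard // scan the NULL into a dummy destination
		}
	}
}

func main() {
	var id, key string
	vals := []interface{}{nil, nil}  // both image columns came back NULL
	cols := []interface{}{&id, &key} // original scan targets
	setEmbedsNil(vals, cols, [][]int{{0, 1}})
	fmt.Println(cols[0] == interface{}(&id)) // prints false: target was replaced
}
```

Under this reading, the caller would still need to decide whether to assign `i.Image = &nImage` based on the same all-NULL check, so the pointer ends up nil when the right-hand side of the join produced no row.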

Fixes #2348

@PatrLind

I haven't tested it, but it looks like it could work fine!
A small optimization would be to either pre-allocate the cols slice or initialize it in place.
Maybe like this:

	cols := []interface{}{
		&i.User.ID,
		&i.User.AuthProviderType,
		&i.User.AuthProviderIdentifier,
		&i.User.AvatarID,
		&i.User.BucketID,
		&i.User.Email,
		&i.User.Name,
		&i.User.CreatedAt,
		&i.User.UpdatedAt,
		&i.User.DeletedAt,
		&i.Bucket.ID,
		&i.Bucket.Amount,
		&i.Bucket.CreatedAt,
		&i.Bucket.UpdatedAt,
		&i.Bucket.DeletedAt,
		&nImage.ID,
		&nImage.ObjectKey,
		&nImage.CreatedAt,
	}

or, pre-allocated:

	cols := make([]interface{}, 0, {{ .COL_COUNT }})
	cols = append(cols,
		&i.User.ID,
		&i.User.AuthProviderType,
		&i.User.AuthProviderIdentifier,
		&i.User.AvatarID,
		&i.User.BucketID,
		&i.User.Email,
		&i.User.Name,
		&i.User.CreatedAt,
		&i.User.UpdatedAt,
		&i.User.DeletedAt,
		&i.Bucket.ID,
		&i.Bucket.Amount,
		&i.Bucket.CreatedAt,
		&i.Bucket.UpdatedAt,
		&i.Bucket.DeletedAt,
		&nImage.ID,
		&nImage.ObjectKey,
		&nImage.CreatedAt,
	)

@richchurcher (Author) commented Jul 18, 2023

Thanks, good idea! Yeah, there's bound to be a bunch of things we could do to smooth out some of those rough edges, it's a bit "thrown together" as a prototype (hence the use of an actual query instead of a mock 😆). I need to learn more about Go templates though...

cols = append(cols, {{.Ret.Scan}})
defer rows.Close()
// This effectively duplicates the behaviour of Row.Scan, which we can't use (because it doesn't
// provide Values).
Er, by which I mean Row doesn't provide Values, as it's just a wrapper around Rows that QueryRow returns. Unless I'm missing something.

@kyleconroy (Collaborator)

> Obviously, this is very rough, but I'm wondering if I can get a validation of the overall approach?

I'm not sure this is the approach we want to take with nullable embeds. I'm concerned about the re-implementation of Scan and having to create intermediate results. I'm also worried about the behavior of partial results: what does it mean to populate an embedded struct with only a subset of the fields? Without any indication on the struct itself, it will be difficult to know programmatically when this occurs.

For now I'd like to let this sit for a bit longer as we discuss the semantics around sqlc.embed and how we expect it to work when possibly joining nullable data.

@nickjackson (Contributor) commented Aug 11, 2023

Apologies for not getting back to you sooner, @richchurcher.

I wonder if sqlc.nembed() should just embed a variant of the model where all the fields are forced nullable.

-- name: GetOrder :many
SELECT sqlc.embed(orders), sqlc.nembed(products) FROM orders
LEFT JOIN products ON orders.id = products.order_id

would produce

type GetOrderRow struct {
  Order   Order
  Product NullProduct
}

Alongside the Product model, would be a NullProduct variant like...

// models.go

type Product struct {
  ID      string
  OrderID string
  Amount  decimal.Decimal
  ...
}

type NullProduct struct {
  ID      sql.NullString
  OrderID sql.NullString
  Amount  decimal.NullDecimal
  ...
}

It is not as convenient as having a single mapping for every model, but I think this still helps.

This leaves the scanning logic alone for the most part, and leaves it up the implementer on what is considered "Null". It is a lot closer to how sqlc.narg() functions and therefore more predictable.

@nickjackson (Contributor) commented Aug 12, 2023

I've thrown together a rough implementation of what I'm talking about above.

a3a37aa

The good bit is that it's really quite simple and doesn't require any code generation changes. The bad bit is that it adds a second "table" to the catalog (and therefore to the plugin CodeGenRequest) to represent the nullable version, which technically won't exist. I'm really not sure whether this should be implemented in the compiler or in the code generator.

@richchurcher (Author)

Ooo, that does look enticingly simple. Lemme give it a crack locally and see how it feels.

@richchurcher (Author) commented Aug 19, 2023

At first glance, I think it's a perfectly fine compromise. It essentially turns (for example):

	if entity.Image != nil {

into:

	if entity.NullImage.ID.Valid {

(I use a custom hashid type, but I'm sure the same goes for nil checks on pointers to id).

It's certainly a much smaller change and simpler to maintain. I'm trying to think of what the edge cases might be, but nothing springs to mind (except, as you say, the extra additions to the catalog, which after all would be opt-in for users that wanted this facility).

Edit: I wonder if there would be an expectation that the generated models provided their own version of .Valid. I suppose that might be a little too much to ask at first...

@richchurcher (Author) commented Aug 24, 2023

It does require more work on the consuming application side. Knowing that the entity will either be there or not is simpler than having to deal with two rather different types, e.g.:

type Location struct {
	ID          hashid.HashID       `db:"id" json:"id"`
	Name        string              `db:"name" json:"name"`
	DisplayName string              `db:"display_name" json:"displayName"`
	CreatedAt   pgtype.Timestamptz  `db:"created_at" json:"createdAt"`
	DeletedAt   *pgtype.Timestamptz `db:"deleted_at" json:"deletedAt"`
}

and

type NullLocation struct {
	ID          hashid.HashID       `db:"id" json:"id"`
	Name        *string             `db:"name" json:"name"`
	DisplayName *string             `db:"display_name" json:"displayName"`
	CreatedAt   *pgtype.Timestamptz `db:"created_at" json:"createdAt"`
	DeletedAt   *pgtype.Timestamptz `db:"deleted_at" json:"deletedAt"`
}

I think the other thing that's nagging at me is that the column types are no longer an accurate representation of their NOT NULL definition in the schema, so you can't depend on that in code.

I think in terms of ease of use, it'd be more ergonomic to have either a wrapped pointer type with Valid() or just a pointer. The complexity in the consuming code is proving a little cumbersome.
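The "wrapped pointer type with Valid()" idea could look something like the following generic sketch (entirely hypothetical, not part of sqlc; the `Image` type stands in for a generated model):

```go
package main

import "fmt"

// Null is a hypothetical generic wrapper: a single pointer plus a Valid
// method. Unlike a parallel NullX struct, the inner type keeps the field
// types that match the schema's NOT NULL definitions.
type Null[T any] struct {
	value *T
}

// Valid reports whether the joined row was present.
func (n Null[T]) Valid() bool { return n.value != nil }

// Get returns the wrapped value and whether it was present.
func (n Null[T]) Get() (T, bool) {
	if n.value == nil {
		var zero T
		return zero, false
	}
	return *n.value, true
}

// Image stands in for a generated model.
type Image struct{ ObjectKey string }

func main() {
	var missing Null[Image] // LEFT JOIN matched no row
	fmt.Println(missing.Valid()) // prints false

	present := Null[Image]{value: &Image{ObjectKey: "avatar.png"}}
	if img, ok := present.Get(); ok {
		fmt.Println(img.ObjectKey) // prints avatar.png
	}
}
```

The trade-off against the NullX-struct approach is that callers do one presence check per embed rather than one Valid check per field, at the cost of a pointer indirection.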

@KevenGoncalves

I'm wondering when this feature will be completed; it would make a lot of difference.

@richchurcher (Author)

I don't think this PR will be completed @KevenGoncalves, it was really just a discussion point for potential approaches to the problem. I think @kyleconroy and team are aware of the surrounding issues (and at a much deeper level than I!) so we might just have to be a bit patient.

@vamshiaruru32

Since I can't find another place to comment this, I'd like to point out that this is a feature we'd find helpful too.
In my opinion, when using sqlc.nembed, the generated struct should just be a nullable variant of that type, but not all fields should be forced nullable. Consuming a variant where all columns have their types changed would be very cumbersome, so I agree with @richchurcher here.

@richchurcher (Author)

I'll close this PR so it won't add to noise or maintenance burden for repo maintainers, and perhaps suggest #2348 as a possible location for ongoing discussion.

Successfully merging this pull request may close these issues.

sqlc.embed() generates code that cannot handle NULL items