Compute and check type of cells #860

Open
lars-reimann opened this issue Jun 24, 2024 · 3 comments
Labels
enhancement 💡 New feature or request

Comments

@lars-reimann
Member

lars-reimann commented Jun 24, 2024

Is your feature request related to a problem?

Cells are our concept for lazily executing vectorized operations on tables (lazily meaning that the data need not be loaded immediately). The laziness is essential for efficiency, since it enables query optimization and more. Its drawback is that operations on cells are also validated very late.

Desired solution

Type computation:

  • When creating a cell from a column/row, store the type of the associated column in the cell.
  • When calling a method of a cell, store the type of the operation's result in the returned cell.
  • For now, there is no need to compute the type of arbitrary Python expressions. If you don't know the result type of an operator and the operation might return multiple incompatible types, set the type to None.
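The three rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `DataType` enum, the `dtype` field, and the `from_column`, `eq`, and `add` signatures are hypothetical and not taken from the library's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DataType(Enum):
    """Hypothetical, simplified set of column types."""

    INT = "int"
    FLOAT = "float"
    STRING = "string"
    BOOLEAN = "boolean"


@dataclass(frozen=True)
class Cell:
    """Hypothetical lazy cell: it describes an expression and holds no data."""

    expression: str
    dtype: Optional[DataType]  # None means "type unknown"

    @staticmethod
    def from_column(name: str, column_type: DataType) -> Cell:
        # Rule 1: a cell created from a column stores the column's type.
        return Cell(name, column_type)

    def eq(self, other: Cell) -> Cell:
        # Rule 2: a comparison always produces a boolean cell.
        return Cell(f"({self.expression} == {other.expression})", DataType.BOOLEAN)

    def add(self, other: Cell) -> Cell:
        # Rule 2: derive the result type from the operand types.
        numeric = {DataType.INT, DataType.FLOAT}
        result: Optional[DataType]
        if self.dtype in numeric and other.dtype in numeric:
            if DataType.FLOAT in (self.dtype, other.dtype):
                result = DataType.FLOAT
            else:
                result = DataType.INT
        else:
            # Rule 3: the result type cannot be determined, so store None.
            result = None
        return Cell(f"({self.expression} + {other.expression})", result)


age = Cell.from_column("age", DataType.INT)
height = Cell.from_column("height", DataType.FLOAT)
name = Cell.from_column("name", DataType.STRING)
```

In this sketch, `age.add(height)` carries `DataType.FLOAT`, while `age.add(name)` carries `None` because no single result type can be assigned.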

Type checking:

  • At the start of the methods of a cell, check whether the operation can be applied to a cell based on the cell's type.
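A minimal sketch of such an up-front check is shown below. The string type names, the `_require` helper, and the decision to let cells of unknown type (`None`) pass the check are all assumptions made for illustration; the issue does not prescribe them.

```python
from __future__ import annotations

from typing import FrozenSet, Optional

# Hypothetical type names; the real library may model types differently.
NUMERIC: FrozenSet[str] = frozenset({"int", "float"})


class Cell:
    """Hypothetical lazy cell that records its type but holds no data."""

    def __init__(self, expression: str, dtype: Optional[str]) -> None:
        self.expression = expression
        self.dtype = dtype

    def _require(self, allowed: FrozenSet[str], operation: str) -> None:
        # Assumption: an unknown type (None) cannot be checked and is let through.
        if self.dtype is not None and self.dtype not in allowed:
            raise TypeError(
                f"Cannot apply '{operation}' to a cell of type '{self.dtype}'."
            )

    def abs(self) -> Cell:
        # The check runs first, before any expression is built or data is loaded.
        self._require(NUMERIC, "abs")
        return Cell(f"abs({self.expression})", self.dtype)


price = Cell("price", "float")
absolute_price = price.abs()
```

Calling `Cell("name", "string").abs()` raises a `TypeError` immediately in this sketch, rather than when the lazy query is eventually executed.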

Possible alternatives (optional)

No response

Screenshots (optional)

No response

Additional Context (optional)

No response

@lars-reimann lars-reimann added enhancement 💡 New feature or request lab Suitable for the lab labels Jun 24, 2024
@github-project-automation github-project-automation bot moved this to Backlog in Library Jun 24, 2024
@LIEeOoNn
Contributor

LIEeOoNn commented Jul 5, 2024

Question: I am checking the types using

```python
from typing import TypeVar

T_co = TypeVar("T_co", covariant=True)
P_contra = TypeVar("P_contra", contravariant=True)
R_co = TypeVar("R_co", covariant=True)
```

and, for example, doing

```python
def add(self: Cell[T_co], other: Cell[T_co]) -> Cell[R_co]: ...
```

the user sees something like

(method) def add(other: Cell[int]) -> Cell[int]

but what do T_co, R_co, and P_contra actually mean?

@LIEeOoNn
Contributor

LIEeOoNn commented Jul 5, 2024

Also, do we need to write new tests to verify that the checking works?

@wastedareas wastedareas moved this from Backlog to In Progress in Library Jul 5, 2024
@lars-reimann
Member Author

That's Python's type system. T/R/P are the actual, arbitrary names of type variables. The _co/_contra suffix is a convention to indicate the variance of the type variable.
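To make the variance convention concrete, here is a small sketch. The `Producer` and `Consumer` classes and the two helper functions are hypothetical names chosen for illustration. Variance only affects what a static type checker accepts; the code runs the same either way.

```python
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)
P_contra = TypeVar("P_contra", contravariant=True)


class Producer(Generic[T_co]):
    """Only returns values of its type parameter, so it can be covariant."""

    def __init__(self, value: T_co) -> None:
        self._value = value

    def get(self) -> T_co:
        return self._value


class Consumer(Generic[P_contra]):
    """Only accepts values of its type parameter, so it can be contravariant."""

    def accept(self, item: P_contra) -> str:
        return f"accepted {item!r}"


def read_int(source: Producer[int]) -> int:
    return source.get()


def feed_bool(sink: Consumer[bool]) -> str:
    return sink.accept(True)


# bool is a subclass of int.
# Covariance: a Producer[bool] is accepted where a Producer[int] is expected.
value = read_int(Producer[bool](True))

# Contravariance: a Consumer[int] is accepted where a Consumer[bool] is expected.
message = feed_bool(Consumer[int]())
```

Without the `covariant=True` and `contravariant=True` flags, a type checker would reject both calls at the bottom, because plain type variables are invariant.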

Python's type system is not capable of capturing the type of the cell at runtime, though. You need to add an actual field for that.
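The sketch below illustrates the point, assuming a hypothetical generic `Cell` class with a `dtype` field. The type argument in an annotation such as `Cell[int]` is visible only to static type checkers, so the runtime information has to live in an ordinary attribute.

```python
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class Cell(Generic[T]):
    """Hypothetical lazy cell; `T` is only meaningful to static type checkers."""

    def __init__(self, expression: str, dtype: Optional[type] = None) -> None:
        self.expression = expression
        # Explicit runtime record of the cell's type (hypothetical field).
        self.dtype = dtype


int_cell: Cell[int] = Cell("age", dtype=int)
str_cell: Cell[str] = Cell("name", dtype=str)

# Both objects are plain `Cell` instances; the annotations leave no trace on them.
same_class = type(int_cell) is type(str_cell)

# The annotation is not enforced at runtime: this line executes without error,
# and only the `dtype` field reveals what the cell actually refers to.
mislabeled: Cell[int] = Cell("name", dtype=str)
```

A runtime check at the start of a cell method can therefore only consult a field such as `dtype`, not the generic parameter.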

@LIEeOoNn LIEeOoNn moved this from In Progress to Backlog in Library Jul 12, 2024
@lars-reimann lars-reimann removed team1 lab Suitable for the lab labels Jul 12, 2024
Projects
Status: Backlog
4 participants