Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring Value (decouple Gc from Value) #465

Closed
HalidOdat opened this issue Jun 8, 2020 · 9 comments · Fixed by #498
Closed

Refactoring Value (decouple Gc from Value) #465

HalidOdat opened this issue Jun 8, 2020 · 9 comments · Fixed by #498
Assignees
Labels
builtins PRs and Issues related to builtins/intrinsics discussion Issues needing more discussion help wanted Extra attention is needed performance Performance related changes and issues technical debt
Milestone

Comments

@HalidOdat
Copy link
Member

HalidOdat commented Jun 8, 2020

Whats wrong with the current implementation?
The current implementation after #383 coupled the Gc and Value together and the operations were promoted to the gc level, the problem with this is that Gc values are stored in the heap which means a performance penalty. Furthermore we need to do some checks before accessing the underlying ValueData.

Changes

  • Move Value logic/operations => ValueData
  • Rename ValueData => Value (it makes more sense to call it Value)
  • Move Gc<...> to Value::Object field (Value::Object(Gc<GcCell<Object>>))

If you have any suggestions on how to improve it, or even how other engines to it, please comment :)

I would like to start working on this preferably after #419 because the rebase is going to be a nightmare

@HalidOdat HalidOdat added technical debt builtins PRs and Issues related to builtins/intrinsics API labels Jun 8, 2020
@HalidOdat HalidOdat self-assigned this Jun 8, 2020
@jasonwilliams
Copy link
Member

jasonwilliams commented Jun 8, 2020

This is needed for the VM also, we want to be able to store Sized Values on the stack without them being wrapped in the GC. For now we’re just storing he cloned pointer on the stack (which isn’t ideal but can change it once this is done)

I’m glad you raised this, it’s refactor I agree with

@HalidOdat HalidOdat changed the title Refactoring Value part 2 Refactoring Value (decouple Gc from Value) Jun 8, 2020
@HalidOdat HalidOdat added performance Performance related changes and issues discussion Issues needing more discussion help wanted Extra attention is needed and removed API labels Jun 8, 2020
@HalidOdat HalidOdat pinned this issue Jun 8, 2020
@HalidOdat
Copy link
Member Author

HalidOdat commented Jun 9, 2020

It seems that (correct me if I'm wrong), Value does not need to be garbage collected only the Object field,
So we would have Value::Object(Gc<GcCell<Object>>). The value it self is immutable and can be copied does not need to have shared reference, but the only part that needs to be mutated and have shared reference is object.

@jasonwilliams
Copy link
Member

I think you're right yes, all the other values are immutable (even the symbol).
So we could do this at object level.

Looking at SpiderMonkey:
They have a function which allocates an object:
https://github.com/mozilla/gecko-dev/blob/dd53506c39875f5b8109768de71d42bcd4469543/js/src/gc/Allocator.cpp#L37-L86

They use it whenever a new object is needed:
https://github.com/mozilla/gecko-dev/blob/17388ad7ea00c874d132c46331477e21dd13ee23/js/src/vm/ArrayObject-inl.h#L54

@HalidOdat HalidOdat mentioned this issue Jun 11, 2020
8 tasks
@HalidOdat
Copy link
Member Author

HalidOdat commented Jun 13, 2020

Also as suggested by @Razican in #419, the String are immutable and be can stored as a Box<str>, instead of String which should remove 8 bytes of ValueData::String

@HalidOdat
Copy link
Member Author

HalidOdat commented Jun 13, 2020

Also I think we can make it even better by making the ValueData::String field a Rc<Box<str>> which should make Value triviality to copy its just a count increment, this will also remove another 8 bytes from Value::String() which will make it 8 byte field (64-bits) and there should never be a cyclic reference because there is no use of RefCell (this will be done to Symbol primitive too, because it has an optional string description)

This should be a much better option than garbage collecting it, in terms of performance.

What do you think? @Razican @jasonwilliams

Edit: I think we can make it even better 😅 , by not storing it as a Rc<Box<str>> but storing it as a Rc<str> this will eliminate an allocation and remove the double pointer indirection, which should make string access even faster.

I think, although I'm not sure, but Rc<str> is 16 bytes but Rc<Box<str>> is 8 bytes I think Rust stores the length of the str on the stack, also Box<str> is 16 but Box<String> is 8, this may be a good thing because we don't have to deref the pointer to get the length one less indirection.

If we store them as a Rc<str> we also get another optimization for free, if a type implements PartialEq an Eq which str does when checking for equality/inequality it checks if it points to the same memory allocation if so they are equal, if not it does a normal &str == &str check, same for inequality. See Rc source code here https://doc.rust-lang.org/src/alloc/rc.rs.html#1236

This will greatly improve performance when checking strings, especially large strings

I think we should use Rc<str>, instead of String for Object property access as the key value, the above optimizations apply to this.

@HalidOdat HalidOdat added this to the v0.9.0 milestone Jun 13, 2020
@Razican
Copy link
Member

Razican commented Jun 14, 2020

Wouldn't this be almost interning them? What if we have two str with the same value but different pointers?

I think in that case we wouldn't be able to get the optimisation, right?

We currently create each string separately, so I don't think there is a case where the Rc would point to the same place.

But I like this idea, maybe we should try to intern them again (and we should have a benchmark for string comparison and concatenation).

@HalidOdat
Copy link
Member Author

HalidOdat commented Jun 14, 2020

Wouldn't this be almost interning them?

Not exactly. In string interning the strings are held in an interner structure and the interned strings live as long as the interner lives, but with Rc<str> the handling of memory is separate from the interner, so if a Value goes out of scope and there are no references then it is freed, but in the string interning the life is static because there will always be a reference to the string that the interner structure will hold, even if there are no references to the string.

What if we have two str with the same value but different pointers?
I think in that case we wouldn't be able to get the optimisation, right?

Then a normal string compare is done. in this case we would not have the optimization.

We currently create each string separately, so I don't think there is a case where the Rc would point to the same place.

I think Rc<str> is need to make the value trivially copyable so the example does not create two strings in memory "hello"

let x = "hello";
let y = x;

x == y // just a pointer check is needed

we technically already have a Rc<str>, because Value is garbage collected Value(Gc<ValueData>) although we don't get the previous mentioned optimization, when we clone Value (when it's a string it does not clone the string, it just points to the same memory location), but gc does some other things that are not performant

we should have a benchmark for string comparison and concatenation

We should have this, definitely, I would also like to see the speed up for #419

@Razican
Copy link
Member

Razican commented Jun 14, 2020

we technically already have a Rc<str>, because Value is garbage collected Value(Gc<ValueData>) although we don't get the previous mentioned optimization, when we clone Value (when it's a string it does not clone the string, it just points to the same memory location), but gc does some other things that are not performant

Ok, yes, I see it. String interning would give better performance, but this is an almost free optimisation that will probably be easy to implement, so we should do it.

we should have a benchmark for string comparison and concatenation

We should have this, definitely, I would also like to see the speed up for #419

Let me create a couple of benchmarks for strings, I will PR today.

@HalidOdat
Copy link
Member Author

#419 Has been merged, I'll start working on this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
builtins PRs and Issues related to builtins/intrinsics discussion Issues needing more discussion help wanted Extra attention is needed performance Performance related changes and issues technical debt
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants