Skip to content

Conversation

@palaviv
Copy link
Contributor

@palaviv palaviv commented Mar 29, 2019

I tried to make the code sharing between set and frozenset more nice. I created a shared trait SetProtocol that implements all the shared function. I also moved the set only functions to the new format.
I was not sure how to move the new implementation to SetProtocol if anyone has suggestions?

@coolreader18 coolreader18 mentioned this pull request Mar 31, 2019
53 tasks
@palaviv palaviv mentioned this pull request Mar 31, 2019
where
Self: Sized,
{
fn get_elements(&self) -> HashMap<u64, PyObjectRef>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we definitely don't want to copy the entire map just to eg check its size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as the current implementation. I would prefer not changing the functionality of the code in this PR to make it easier to review. This is a big diff and I think improving in a different PR will be better.

I did add __iter__ to frozenset but used the same code as already existed in set.

@OddCoincidence
Copy link
Contributor

Sharing functions between python classes with a trait doesn't really seem like the right approach to me, because now the structure of the rust code does not directly map to that of the python code it represents. Also I don't think this meshes super well with the proc-macro approach to class definition being developed in #764. I understand the desire to have them share code, but imo duplication is far cheaper than the wrong abstraction.

@adrian17
Copy link
Contributor

adrian17 commented Mar 31, 2019

Also I don't think this meshes super well with the proc-macro approach to class definition being developed in #764

Side note, I asked about it in comment #764 (review) and @coolreader18 thinks it can work.

Sharing functions between python classes with a trait doesn't really seem like the right approach to me, because now the structure of the rust code does not directly map to that of the python code it represents.

Now we're going back to my topic from that PR - I do think they don't necessarily should map anyway.
I do imagine that long-term people will want to do stuff like "here's some magic trait/macro that makes the whole iterator type and __iter__ boilerplate for you, you just need implement __next__", or having multiple Python methods bind to the same Rust function.

However, I do agree with your last statement that this deduplication may possibly complicate code in the long run, while not giving proportional advantages. We can also see that in #770 (comment) - we're again figuring out reference passing, while recent refactors let us drop get_elements and not worry about this extra layer at all - this feels like a regression in this aspect. So yeah, you made me feel less positive about these two PRs.

@palaviv
Copy link
Contributor Author

palaviv commented Apr 1, 2019

Sharing functions between python classes with a trait doesn't really seem like the right approach to me,

You are currently sharing code between the types just without the trait that allow you to use the new arg style. My opinion is that the code is more readable this way but I am biased.

while recent refactors let us drop get_elements and not worry about this extra layer at all - this feels like a regression in this aspect. So yeah, you made me feel less positive about these two PRs.

I understand your concern and will try solving the problems that were pointed about the copy.

I understand the desire to have them share code, but imo duplication is far cheaper than the wrong abstraction.

I don't think this adds a level of abstraction to the implementation. Specifically talking about set and frozenset they are not totally separate as they interact with each other. You can see in python documentation they are both under Set Types. The same goes for Binary Sequence Types. I agree that classes that are not related should not share a trait but this is not the case. Arguing the same for list and tuple as I did in #770 Is harder but I do believe we can look of them as part of the sequence family even if they do not interact with each other.

@jgirardet
Copy link
Contributor

I'd be interested in an answer about it (share a trait between frozentset and set) because I was on the point to do it for bytes and bytearray.

@cthulahoops
Copy link
Collaborator

This feels like the wrong abstraction to me too. As does the existing code, I think the new code is probably an improvement.

Everything feels a bit inside out. It feels to me like we want: a rust set implementation that provides the basic interface for sets in pure rust, and then some repetitive simple wrappers that unwrap/wrap from both types. I think it would be simpler but more repetitive. (This is hard/be sure it makes sense without trying it.)

@palaviv
Copy link
Contributor Author

palaviv commented Apr 2, 2019

Everything feels a bit inside out. It feels to me like we want: a rust set implementation that provides the basic interface for sets in pure rust, and then some repetitive simple wrappers that unwrap/wrap from both types. I think it would be simpler but more repetitive. (This is hard/be sure it makes sense without trying it.)

I am not sure I understand what your suggestion. Are the wrappers written in python?

@OddCoincidence
Copy link
Contributor

@cthulahoops is this something like what you were suggesting? I see this "Inner" pattern in a lot of rust code.

struct PySet {
    inner: RefCell<PySetInner>,
}

impl PySetRef {
    fn some_set_method(self) {
        self.inner.borrow().some_set_method()
    }
}
struct PyFrozenSet {
    inner: PySetInner,
}

impl PySetRef {
    fn some_set_method(self) {
        self.inner.some_set_method()
    }
}
struct PySetInner {
     // actual set represetation
}

impl PySetInner {
    fn some_set_method(&self) {
        // the real implementation
    }
}

@palaviv
Copy link
Contributor Author

palaviv commented Apr 2, 2019

This look like a nice way to go @OddCoincidence. I have a few points:

  1. How would you implement all the functions that are set only. update for example.
  2. When you will need to create a new object for example in union. Would you pass the function the class of the caller?
  3. This will create a lot of duplicated wrapper code that will look the same in PySetRef and PyFrozenSet impl. Using the trait remove all this duplicate code.

@OddCoincidence
Copy link
Contributor

OddCoincidence commented Apr 2, 2019

How would you implement all the functions that are set only. update for example.

impl PySetInner {
    fn update(&mut self, other: PyIterable) {
        // do it
    }
}

impl PySetRef {
    fn update(self, other: PyIterable) {
        self.inner.borrow_mut().update(other)
    }
}

impl PyFrozenSetRef {
    // no update method
}

When you will need to create a new object for example in union. Would you pass the function the class of the caller?

No, PySetInner would never reference PySet or PyFrozenSet, it would operate in terms of itself.

impl PySetInner {
    fn union(&self, other: &Self) -> PySetInner {
        // do it
    }
}

impl PySetRef {
    fn union(self, other: PySetRef) -> PySet {
        self.inner.borrow().union(&other.inner).into() // via appropriate From impl
    }
}

impl PyFrozenSetRef {
    fn union(self, other: PySetRef) -> PyFrozenSet {
        self.inner.union(&other.inner).into() // via appropriate From impl
    }
}

This will create a lot of duplicated wrapper code that will look the same in PySetRef and PyFrozenSet impl. Using the trait remove all this duplicate code.

I am 100% okay with this. They are different types, the methods should be explicitly specified for each one.

@cthulahoops
Copy link
Collaborator

Sorry for not being clearer in my last rushed post. Thanks, @OddCoincidence, that's exactly what I was thinking.

@cthulahoops
Copy link
Collaborator

(I'm not sure I like the name inner, I would have gone with something like SetElements and .elements, but this does appear to be a common rust naming scheme.)

@jgirardet
Copy link
Contributor

Why does this use HashMap instead of Hashset ?

@cthulahoops
Copy link
Collaborator

So that we can store arbitrary hashable objects in sets. It's a map of hash value to PyObjectRef.

@palaviv
Copy link
Contributor Author

palaviv commented Apr 3, 2019

I will start working on objset with inner as suggested by @OddCoincidence.

@OddCoincidence
Copy link
Contributor

So that we can store arbitrary hashable objects in sets. It's a map of hash value to PyObjectRef.

How do we handle hash collisions?

@cthulahoops
Copy link
Collaborator

How do we handle hash collisions?

We don't. (I tried to work out the answer to this myself earlier.)

>>>>> {A(43), A(4)}
{4}

I guess once we have proper dicts working we can use the same approach to fix sets.

@palaviv
Copy link
Contributor Author

palaviv commented Apr 3, 2019

impl PySetRef {
    fn union(self, other: PySetRef) -> PySet {
        self.inner.borrow().union(&other.inner).into() // via appropriate From impl
    }
}

I have started working on the implementation and I seem to have the same problems as before. The other needs to be PyObjectRef as it can be either set or frozenset. This keep me again with get_elements (or get_inner now). I think #774 macro might help here but I am not sure.

@OddCoincidence
Copy link
Contributor

The other needs to be PyObjectRef as it can be either set or frozenset.

Ah, I didn't consider this. Yeah, using PyObjectRef is probably the best thing to do right now. We could eventually add a SetLike arg extractor or something.

I think #774 macro might help here but I am not sure.

I think this will help too. @cthulahoops Are you okay with me merging #774 or do you still have concerns with it?

@jgirardet
Copy link
Contributor

I thought Pysetref was mandatory since user could call union with any type, not only set or frozenset

@OddCoincidence
Copy link
Contributor

Actually, @palaviv for union it looks like you should use PyIterable.

>>> {1, 2, 3}.union([4])
{1, 2, 3, 4}
>>> {1, 2, 3}.union(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable

@OddCoincidence
Copy link
Contributor

I guess once we have proper dicts working we can use the same approach to fix sets.

We might also be able to implement sets as dicts where the value is None, kind of like Rust's HashSet is implemented internally with HashMap<T, ()>.

@cthulahoops
Copy link
Collaborator

I think this will help too. @cthulahoops Are you okay with me merging #774 or do you still have concerns with it?

You mostly convinced me. I'm okay with you merging that.

@cthulahoops cthulahoops mentioned this pull request Apr 4, 2019
1 task
@jgirardet jgirardet mentioned this pull request Apr 7, 2019
@palaviv
Copy link
Contributor Author

palaviv commented Apr 8, 2019

Was implemented using inner in #785.

@palaviv palaviv closed this Apr 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants