Skip to content

Conversation

@saifalkatout
Copy link
Contributor

@saifalkatout saifalkatout commented Sep 28, 2023

resolves #875

What was wrong?

The new parser (v2 branch) wrongly supposes the assignment destination is a pattern, but it should be an expression.

How was it fixed?

Parse lhs in assignment statements as an expression.

cburgdorf and others added 30 commits July 17, 2023 18:46
moving data between mem and storage
fixed notes on mem to storage, compare_bud and compare still need to be fixed
Merge pull request argotorg#907 from saifalkatout/master
create an eq trait and implement ir for the various primitive types
Changed Eq for signed integers
Change Eq for signed integers 2.
Copy link
Member

@Y-Nak Y-Nak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed another one.

Comment on lines 67 to 74
pub fn expr(&self) -> Option<super::Expr> {
support::child(self.syntax())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect; please refer to ast::expr::BinExpr implementation as an example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just fixed it, I'll start working on AugAssignStmtScope, and then will squash the commits.

}
pub fn parse_lhs_aug<S: TokenStream>(parser: &mut Parser<S>) -> bool {
//include Lp and Lb ?
parse_expr_with_min_bp(parser, 146, true)
Copy link
Contributor Author

@saifalkatout saifalkatout Sep 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some work on augassign; Is this bp acceptable ? (currently it takes dot, Lparen, Lbrack, and Lt)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bp seems fine.

However, I've started considering changing the specifications of the assignment. Clearly, it'd be better to define assign and augassign as expressions rather than statements.
Because:

  1. We can implement a trait AddAssign or something like that.
  2. Emitting a good error message in the parsing phase is difficult when they are defined as statements. e.g., please consider x + 1 += 1.

This is my fault. I didn't fully consider it, even though I felt I should've when I implemented parser2. Sorry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think we could take two options.

  1. Merge this PR as is, then someone could change the specification, or
  2. You could work on the change in this PR if you are willing to help us

Anyways, I'm sorry that I didn't consider it fully and gave insufficient suggestions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbillig Any thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, I can start working on it.

Copy link
Member

@Y-Nak Y-Nak Oct 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, I can start working on it.

Ok, thanks!

the rough sketch for implementing this would be

  1. Modify the parser so that it can parse them as expressions
  2. Change the HIR definition for Assign and AugAssign
  3. Rewrite the HIR lowering and remove the unnecessary desugaring related types(e.g., hir::span::DesugaredOrigin::AugAssign)

Copy link
Member

@Y-Nak Y-Nak Oct 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please feel free to let me know if you find any trouble.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey Yoshi, 2 questions:

  1. the push_expr receives an origin so maybe we shouldn't delete the hir::span::DesugaredOrigin::AugAssign ??
    image

  2. the aug_assign and assign are going to be parsed in the parse_atom_expr correct ?? if so, is it ok to pass the checkpoint as null ?? sth along the lines of:
    image

Copy link
Member

@Y-Nak Y-Nak Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the push_expr receives an origin so maybe we shouldn't delete the hir::span::DesugaredOrigin::AugAssign ??

We don't need desugar_aug_assign and DesugaredOrigin::AugAssign because we no longer need to desugar them. Instead, you need to modify hir::expr::Expr like the one below.

pub enum Expr {
    Assign(ExprId, ExprId),
    AugAssign(ExprId, ExprId, BinOp),
    ...
}

Then you could lower ast::Assign and ast::AugAssign in lower_expr. Please refer to lower_expr to see how push_expr is used inside it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the aug_assign and assign are going to be parsed in the parse_atom_expr correct ??

I don't think you need to modify parse_expr_atom. Conceptually, what you need to do is the same that BinExprScope does. The one difficult thing is we don't have tokens to represent AugAssign operators(e.g., +=).
This would be solved in two different ways.

  1. Just modify SyntaxKind and accept them as an atom kind (Please refer to SyntaxKind).
  2. Modify infix_binding_power and bump_bin_op to handle them, then if bump_bin_op indicates that the binary operator is assign or aug assign, call set_kind in BinExprScope::parse method. Please refer to ParenScope::parse for set_kind method.

I think option 1. is easier to implement, but I prefer option 2. because it allows us to extract the span of the binary operation part in augassign op(e.g., span for + in +=)

support::child(self.syntax())
/// Returns the expression of the lhs and rhs of the assignment.
pub fn lhs_expr(&self) -> Option<super::Expr> {
support::children(self.syntax()).next()
Copy link
Contributor Author

@saifalkatout saifalkatout Sep 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I convert this to support::child(...) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result is the same. Also, it seems the performance wouldn't be different. Is there any motivation to change to support::child?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to keep the code concise. Also, with next, I feel it implies that it is pointing at the second item rather the first item, since it is not clear where the iter is starting (index -1 or 0).

Copy link
Member

@Y-Nak Y-Nak Oct 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it implies that it is pointing at the second item rather the first item, since it is not clear where the iter is starting (index -1 or 0).

This is a well-known Rust idiom. Also, It'd be pretty strange if an iterator starts with the second element in any language, right?

When I wrote support::children(...).next(), I might feel it'd be slightly readable because lhs and rhs are symmetric(but I don't remember why, actually).

Anyways, this is clearly a bikeshedding discussion. I'm perfectly fine with either.

@saifalkatout
Copy link
Contributor Author

Hey Yoshi, this is a rough sketch of what im trying to do with AugAssign. Could you check if I'm on the right path with infix_binding_power and BinExprScope ? Note that I didn't change bump_bin_op bcuz it doesn't return a value. So I just added my own function which I will populate later with -=, *=, etc...
Also when I get the ok on this approach I'll get started with the assign.

@Y-Nak
Copy link
Member

Y-Nak commented Oct 9, 2023

Sorry for the late response.
This doesn't seem a right approach.

  1. set_kind usage is wrong(You don't need to define Scope and call them if you use set_kind)
  2. The way of lowering the AugAssign is semantically wrong. Also, AugAssign(ExprId, ExprId, BinOp) would be the better definition; this means you don't need to desugar AugAssign.
  3. As 2. indicates, you can remove pub enum AugAssignDesugared.
  4. You don't need to define PlusEqScope or LShiftEqScope just to check if the binary operator is AugAssign since those scopes have no corresponding SyntaxKind. What you need to do is 1. Seeing the current token is a binary operator, then proceed parser to the next token and see if the next token is = in dry run mode.

For example,
The expected CST for the below expression

x += 1 + 1

is

  AugAssignExpr@52..62
    PathExpr@52..53
      Path@52..53
        PathSegment@52..53
          Ident@52..53 "x"
    WhiteSpace@53..54 " "
    Plus@54..55 "+"
    Eq@55..56 "="
    WhiteSpace@56..57 " "
    BinExpr@57..62
      LitExpr@57..58
        Lit@57..58
          Int@57..58 "1"
      WhiteSpace@58..59 " "
      Plus@59..60 "+"
      WhiteSpace@60..61 " "
      LitExpr@61..62
        Lit@61..62
          Int@61..62 "1

, and not the one below that this PR generates.

BinExpr@52..62
  AugAssignExpr@52..58
    PathExpr@52..53
      Path@52..53
        PathSegment@52..53
          Ident@52..53 "x"
    WhiteSpace@53..54 " "
    AugAssignExpr@54..58
      Plus@54..55 "+"
      Eq@55..56 "="
      WhiteSpace@56..57 " "
      LitExpr@57..58
        Lit@57..58
          Int@57..58 "1"
  WhiteSpace@58..59 " "
  Plus@59..60 "+"
  WhiteSpace@60..61 " "
  LitExpr@61..62
    Lit@61..62
      Int@61..62 "1"

@saifalkatout
Copy link
Contributor Author

Hey Yoshi, this is some bare bones changes, I want to see what CSTs will be generated

@saifalkatout
Copy link
Contributor Author

saifalkatout commented Oct 14, 2023

Hey Yoshi, regarding the latest push:

  1. The reason exprs like x += 1 + 1 was not working even after the changes on Oct 9th was because the infix_binding_power I had for eq was wrong (to fix it I used the following article as reference: https://rust-analyzer.github.io/blog/2020/09/16/challeging-LR-parsing.html)

2)This is the CST created from x += 1 + 1:

AugAssignExpr@0..10
      PathExpr@0..2
        Path@0..1
          PathSegment@0..1
            Ident@0..1 "x"
        WhiteSpace@1..2 " "
      Plus@2..3 "+"
      Eq@3..4 "="
      WhiteSpace@4..5 " "
      BinExpr@5..10
        LitExpr@5..6
          Lit@5..6
            Int@5..6 "1"
        WhiteSpace@6..7 " "
        Plus@7..8 "+"
        WhiteSpace@8..9 " "
        LitExpr@9..10
          Lit@9..10
            Int@9..10 "1"
  1. I wanted to finish through with the stmt approach. Now I must research which is better; aug/augassign being expr or stmt, this is the final thing we agreed on in the meeting right ?

  2. is lowering into hir and mir in the scope of this PR, for now I have this in the hir:
    image

and I will have to write tests for our new augassign expression.

  1. The lint fixes (clippy), are caching somewhere, thats why it keeps failing, I think I'm facing this issue: Cache invalidation with cargo clippy --fix --allow-dirty --allow-staged rust-lang/rust-clippy#8991, do you have any advice ?

@Y-Nak
Copy link
Member

Y-Nak commented Oct 14, 2023

  1. I wanted to finish through with the stmt approach. Now I must research which is better; aug/augassign being expr or stmt, this is the final thing we agreed on in the meeting right ?

IIRC, we decided to make them exprs.
I tried to make them stmts because I thought it might be better to restrict the side-effect location to statements. But I couldn't have enough time to consider it fully, and we decided to have mutable references. So, I can't come up with any advantages to having them as statements now.

  1. is lowering into hir and mir in the scope of this PR, for now I have this in the hir:

Almost all parts of MIR will be discarded, so I think you don't need to modify MIR.

  1. The lint fixes (clippy), are caching somewhere ...

You have two options.

  1. Make your worktree clean.
  2. Fix them manually.
    NOTE clippy-fix doesn't apply to all suggestions, please refer to https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lint_defs/enum.Applicability.html#variants. IIRC, only lints coming with Applicability::MachineApplicable are automatically applicable. This means you might need to fix some of them by hand.

@Y-Nak
Copy link
Member

Y-Nak commented Oct 14, 2023

Oh, regarding the answer about 3., I did not mean you should not try to find a better solution. I just described my thoughts and the current situation. Your research is always welcome, of course.

@saifalkatout
Copy link
Contributor Author

Hey Yoshi, just some last comments from my end:

  1. Clippy is failing on some files I didn't edit (https://rust-lang.github.io/rust-clippy/master/index.html#/filter_map_bool_then). Maybe this was a change in Clippy policy ? If you want I can go around fixing each one of the cases, I already did some: 55cb951

  2. I believe, if we want to follow Rust defns, an assign/augassign would fit more as an expr (I am sure you know this already, I just wanted to tell you what I found on this). Also, could you help me understand how mutable references would affect our decision ?

3)I would still need to squash changes into 1 commit, just need to know how to proceed with 1.

@Y-Nak
Copy link
Member

Y-Nak commented Oct 21, 2023

Clippy is failing on some files I didn't edit

Rust publishes a new version every six weeks, and Clippy as well. It often causes new lint errors that are detected by newly added lint pass of Clippy.

If you want I can go around fixing each one of the cases, I already did some: 55cb951

I fixed them in #931, so there is no need to fix them.

Also, could you help me understand how mutable references would affect our decision ?

I vaguely thought that it might be nice if we could guarantee side effects only happen in stmt. In this idea, stmt is a kind of escape hatch to allow side effect, and all expressions works as if purely functional language with linear types. But I didn't pursue the idea seriously, and I ended up thinking that having mutable references would be a better design and an easier way to achieve our goal. Assuming having mutable references, we can't avoid side effects in expression, and if so, there is no advantage to making them stmt.

would still need to squash changes into 1 commit

You can squash them into multiple commits that have reasonable granularity. I mean, you don't necessarily need to squash them to a single commit. But it's up to you.

@saifalkatout
Copy link
Contributor Author

Hey @Y-Nak, do you think the granularity is acceptable like this ? GH doesn't allow squashing merge commits, I found a solution for this by reverting all the changes then merging to the latest commit and pushing without syncing with the main branch, please let me know if this is necessary. Other than this, I think we can merge ?

Copy link
Member

@Y-Nak Y-Nak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments. After those comments are addressed, I'll revisit this PR again for a detailed review.


#[cfg(test)]
mod tests {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Yoshi, I fixed all the comments, had to rewrite this test because we stopped desugaring.

Comment on lines 171 to 177
Star | Slash | Percent => {
if is_aug(parser) {
(38, 39)
} else {
(130, 131)
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I came up with, to give accurate BPs by peeking forward, is this approach concise/fast enough ? if so I'll do the same in the Plus | Minus and Lshift | Rshift arms.

@saifalkatout
Copy link
Contributor Author

hey @Y-Nak could u check my latest comments plz

Copy link
Member

@Y-Nak Y-Nak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a suggestion about

  1. the assignment's binding power and its associativity
  2. wrong infix_binding_power usage

Also, would you add LazyAugAssignExprSpan and LazyAssignSpan in hir::span::expr? Please refer to the implementation for LazyBinExprSpan for the implementation details.

Comment on lines 172 to 173
if is_aug(parser) {
(38, 39)
Copy link
Member

@Y-Nak Y-Nak Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if is_aug(parser) {
(38, 39)
if is_aug(parser) {
(11, 10)

I think aug assign and assign operators should have a minimum binding power and right associativity.
e.g., a = b = c should be parsed as a = (b = c).

Dot => (151, 150),
Eq => {
// `Assign` and `AugAssign` have the same binding power
(40, 41)
Copy link
Member

@Y-Nak Y-Nak Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(40, 41)
(11, 10)

ditto

Comment on lines 207 to 222
if is_aug(parser) {
self.set_kind(SyntaxKind::AugAssignExpr);
bump_aug_assign_op(parser);
let (_, rbp) = infix_binding_power(parser).unwrap();
parser.bump_or_recover(SyntaxKind::Eq, "expected `=`", None);
parse_expr_with_min_bp(parser, rbp, true);
} else if is_asn(parser) {
self.set_kind(SyntaxKind::AssignExpr);
parser.set_newline_as_trivia(false);
let (_, rbp) = infix_binding_power(parser).unwrap();
parser.bump_or_recover(SyntaxKind::Eq, "expected `=`", None);
parse_expr_with_min_bp(parser, rbp, true);
} else {
bump_bin_op(parser);
parse_expr_with_min_bp(parser, rbp, false);
}
Copy link
Member

@Y-Nak Y-Nak Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if is_aug(parser) {
self.set_kind(SyntaxKind::AugAssignExpr);
bump_aug_assign_op(parser);
let (_, rbp) = infix_binding_power(parser).unwrap();
parser.bump_or_recover(SyntaxKind::Eq, "expected `=`", None);
parse_expr_with_min_bp(parser, rbp, true);
} else if is_asn(parser) {
self.set_kind(SyntaxKind::AssignExpr);
parser.set_newline_as_trivia(false);
let (_, rbp) = infix_binding_power(parser).unwrap();
parser.bump_or_recover(SyntaxKind::Eq, "expected `=`", None);
parse_expr_with_min_bp(parser, rbp, true);
} else {
bump_bin_op(parser);
parse_expr_with_min_bp(parser, rbp, false);
}
if is_aug(parser) {
self.set_kind(SyntaxKind::AugAssignExpr);
bump_aug_assign_op(parser);
parser.bump_expected(SyntaxKind::Eq);
parse_expr_with_min_bp(parser, rbp, true);
} else if is_asn(parser) {
self.set_kind(SyntaxKind::AssignExpr);
parser.set_newline_as_trivia(false);
parser.bump_expected(SyntaxKind::Eq);
parse_expr_with_min_bp(parser, rbp, true);
} else {
bump_bin_op(parser);
parse_expr_with_min_bp(parser, rbp, false);
}
  1. It's not necessary to recalculate rbp, also this would fix a bug that is introduced in if is_aug block(the infix_binding_power returned a wrong binding power).
  2. I think it'd be clearer to call parser.bump_expected(SyntaxKind:Eq) instead of parser.bump_or_recover(SyntaxKind::Eq, ...) since you already assert it inside the if conditions.

@saifalkatout saifalkatout reopened this Feb 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants