Parse actions skipped inside delimited_list #345

@kimgr

Hello,

I'm not sure if this is a bona-fide bug, or if I'm misusing pyparsing :-)

I have a huge pyparsing grammar for ASN.1 syntax over in https://github.com/kimgr/asn1ate/blob/master/asn1ate/parser.py. The repro below is not minimal, but it uses some techniques from asn1ate's parser to demonstrate what I'm trying to do:

#!/usr/bin/env python
from pyparsing import *


class AnnotatedToken(object):
    def __init__(self, kind, elements):
        self.kind = kind
        self.elements = elements

    def __str__(self):
        return 'T(%s, %r)' % (self.kind, self.elements)

    __repr__ = __str__
    

def grammar():
    def annotate(name):
        def _(t):
            return AnnotatedToken(name, t.asList())
        return _

    identifier = Word(srange('[a-z0-9]'))
    numeral = Word(nums)

    named_number_value = Suppress('(') + numeral + Suppress(')')
    named_number = identifier + named_number_value

    named_number_list = (Suppress('{') +
                         Group(Optional(delimitedList(named_number))) +
                         Suppress('}'))

    identifier.setParseAction(annotate("id"))
    named_number.setParseAction(annotate("nn"))

    # BUG(?): This parse action is never called after commit
    # 9987004c94ccf7d9b6b3adbcf06d05d2ff197737
    named_number_value.setParseAction(annotate("val"))

    g = OneOrMore(named_number_list)
    return g


g = grammar()
res = g.parseString("""
{ x1(1), x2(2) }
""")
print(res.dump())

The concrete problem I'm seeing downstream is that a parse action intended to decorate the parse result with a type name is never called, and so later stages can't use the annotation to identify what kind of element it is.

I think the issue is that delimited_list now mutates the expression passed to it by calling that expression's streamline method: 9987004#diff-daba53cec7bed1be7b180ee5e8378c772408d07afe6f2cad6d62e966993b9e45L38.

I haven't fully gotten my head around what streamline is supposed to do, but it seems to reduce the result to a literal value, skipping over any interim rules and parse actions.
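
To make that suspicion concrete, here is a toy model of how an early, in-place flattening step could drop an action that is attached later. It is only a sketch of my understanding: the Seq class and its flatten/run methods are hypothetical stand-ins, not pyparsing's actual classes or implementation.

```python
class Seq:
    """Toy sequence expression (hypothetical, not pyparsing's And)."""

    def __init__(self, *parts):
        self.parts = list(parts)
        self.action = None  # stand-in for a parse action

    def flatten(self):
        # Inline nested Seq children that have no action *yet*. After this,
        # the parent holds the child's parts, not the child object itself.
        merged = []
        for part in self.parts:
            if isinstance(part, Seq):
                part.flatten()
                if part.action is None:
                    merged.extend(part.parts)
                    continue
            merged.append(part)
        self.parts = merged
        return self

    def run(self, calls):
        # Record which actions would fire, innermost first.
        for part in self.parts:
            if isinstance(part, Seq):
                part.run(calls)
        if self.action is not None:
            calls.append(self.action)
        return calls


# Action attached after the early flatten: the child was already inlined.
value = Seq("(", "num", ")")
pair = Seq("id", value)
pair.flatten()
value.action = "val"
pair.action = "nn"
late = pair.run([])

# Action attached before the flatten: the child is kept as its own node.
value2 = Seq("(", "num", ")")
value2.action = "val"
pair2 = Seq("id", value2)
pair2.flatten()
pair2.action = "nn"
early = pair2.run([])

print(late, early)
```

In this model the late-attached "val" action never fires, because the parent no longer references the child it was set on. That would match the symptom in the repro, where only the named_number_value action goes missing.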

Is that a bug? Or is there a way to phrase the grammar in a way that it works with both old and new pyparsing?
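
One phrasing that might sidestep the problem on both old and new pyparsing is to attach every parse action before the expression is handed to delimitedList. This is a sketch that assumes the in-place streamline call really is the cause; it reuses the names from the repro above and only reorders the setParseAction calls.

```python
from pyparsing import (Group, OneOrMore, Optional, Suppress, Word,
                       delimitedList, nums, srange)


class AnnotatedToken(object):
    def __init__(self, kind, elements):
        self.kind = kind
        self.elements = elements

    def __repr__(self):
        return 'T(%s, %r)' % (self.kind, self.elements)


def annotate(name):
    def _(t):
        return AnnotatedToken(name, t.asList())
    return _


def grammar():
    # setParseAction returns the expression itself, so each action is
    # already in place when the expression is combined into a larger one.
    identifier = Word(srange('[a-z0-9]')).setParseAction(annotate("id"))
    numeral = Word(nums)

    named_number_value = (Suppress('(') + numeral + Suppress(')'))
    named_number_value.setParseAction(annotate("val"))

    named_number = (identifier + named_number_value)
    named_number.setParseAction(annotate("nn"))

    # Assumption: because named_number_value already carries an action here,
    # it is kept as a separate node rather than merged into named_number.
    named_number_list = (Suppress('{') +
                         Group(Optional(delimitedList(named_number))) +
                         Suppress('}'))
    return OneOrMore(named_number_list)


res = grammar().parseString("{ x1(1), x2(2) }")
tokens = res.asList()[0]
print(tokens)
```

With this ordering I would expect each "nn" token to contain an "id" token followed by a "val" token, which is the annotation the later stages rely on.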

Thanks!
