Backward compatibility broken for custom lexers which parse a non-textual stream of data #1560

@jmfernandez

Describe the bug

Lark 1.2.2 supported custom lexers that consume a non-textual stream of data. Lark 1.3.0 breaks that support, as running the custom lexer example from the stable documentation shows:

python parse_to_dict.py 
['alice', 1, 27, 3, 'bob', 4, 'carrie', 'dan', 8, 6]
Traceback (most recent call last):
  File "/home/jmfernandez/projects/python-groovy-parser/parse_to_dict.py", line 46, in <module>
    test()
    ~~~~^^
  File "/home/jmfernandez/projects/python-groovy-parser/parse_to_dict.py", line 38, in test
    tree = parser.parse(data)
  File "/home/jmfernandez/projects/python-groovy-parser/.full13/lib/python3.13/site-packages/lark/lark.py", line 676, in parse
    return self.parser.parse(text, start=start, on_error=on_error)
           ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jmfernandez/projects/python-groovy-parser/.full13/lib/python3.13/site-packages/lark/parser_frontends.py", line 122, in parse
    stream = self._make_lexer_thread(text)
  File "/home/jmfernandez/projects/python-groovy-parser/.full13/lib/python3.13/site-packages/lark/parser_frontends.py", line 113, in _make_lexer_thread
    return text if self.skip_lexer else cls(self.lexer, None) if text is None else cls.from_text(self.lexer, text)
                                                                                   ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/home/jmfernandez/projects/python-groovy-parser/.full13/lib/python3.13/site-packages/lark/lexer.py", line 457, in from_text
    text = TextSlice.cast_from(text_or_slice)
  File "/home/jmfernandez/projects/python-groovy-parser/.full13/lib/python3.13/site-packages/lark/utils.py", line 213, in cast_from
    return cls(text, 0, len(text))
  File "<string>", line 6, in __init__
  File "/home/jmfernandez/projects/python-groovy-parser/.full13/lib/python3.13/site-packages/lark/utils.py", line 196, in __post_init__
    raise TypeError("text must be str or bytes")
TypeError: text must be str or bytes
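
Judging only from the last frames of the traceback, the input list seems to be rejected by a type check before the custom lexer is ever called. The sketch below reproduces that failure mode without Lark installed. `TextSliceStandIn` is a hypothetical stand-in reconstructed from the `cast_from` and `__post_init__` lines shown in the traceback; its field names are assumptions, and it is not Lark's actual implementation.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TextSliceStandIn:
    """Hypothetical stand-in modelled on the traceback; NOT Lark's real class."""
    text: Any
    start: int  # assumed field name
    end: int    # assumed field name

    def __post_init__(self):
        # Same check and message as the final traceback frame.
        if not isinstance(self.text, (str, bytes)):
            raise TypeError("text must be str or bytes")

    @classmethod
    def cast_from(cls, text):
        # Mirrors `return cls(text, 0, len(text))` from the traceback.
        return cls(text, 0, len(text))


# The non-textual input from the documentation example.
data = ['alice', 1, 27, 3, 'bob', 4, 'carrie', 'dan', 8, 6]

try:
    TextSliceStandIn.cast_from(data)
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

If this reading is right, any custom lexer fed a list, a token stream, or another non-`str`/`bytes` object would hit the same check, regardless of what its `lex` method does.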

To Reproduce

The example at https://lark-parser.readthedocs.io/en/stable/examples/advanced/custom_lexer.html#sphx-glr-examples-advanced-custom-lexer-py worked in Lark 1.2.2, but it fails with the newest release, 1.3.0.
