itamarst has quit [Quit: Connection closed for inactivity]
derpydoo has joined #pypy
derpydoo has quit [Ping timeout: 256 seconds]
<cfbolz>
korvo: more context needed
<cfbolz>
korvo: we have a peg generator for pypy3, based on the cpython stuff
<cfbolz>
but it's even more ad hoc
lritter has joined #pypy
Dejan has joined #pypy
Dejan has quit [Quit: Leaving]
xcm has joined #pypy
xcm_ has quit [Remote host closed the connection]
derpydoo has joined #pypy
jcea has joined #pypy
derpydoo has quit [Quit: derpydoo]
itamarst has joined #pypy
<korvo>
Yeah, it'd be nice if PyPy were already nicely factored into generic RPython tools.
<korvo>
cfbolz: Concretely, I don't know how to convince RPLY's lexer that a '}' can occur at the start of two different tokens, but which token to match depends on the current parser state. I could do it if the lexer had a stack of states.
<korvo>
Maybe I should just hack RPLY.
<cfbolz>
korvo: which language is that?
<cfbolz>
korvo: "nicely factored into generic rpython tools" hah (rpython has like five users? maybe ten?)
<korvo>
Nix. Here are two legal Nix strings: "{ x = 2; }" and '"x + ${ y } + z"'
<cfbolz>
korvo: and the other implementations have a stateful lexer?
<cfbolz>
is that like f-strings?
<nikolar>
can't you introduce a token for ${
<korvo>
Yeah, it's like f-strings; it's a quasiliteral syntax. The reference implementation is Flex + Bison and uses both lexer and parser states.
lritter has quit [Quit: Leaving]
<nikolar>
korvo: wouldn't creating a separate token for "${" solve that problem
<korvo>
nikolar: Yes. That doesn't help with recognizing the end of the splice, though. RPLY doesn't have stateful lexing, so the lexer doesn't know whether it's currently in a splice or a string literal.
<nikolar>
ah right
<cfbolz>
korvo: so what I would do is probably the following: hack rply to allow writing your own lexer, then write a stateful lexer by hand
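A minimal sketch of the hand-written stateful lexer suggested above, assuming a toy Nix-like subset (identifiers, integers, `=`, `;`, `+`, braces, and double-quoted strings with `${ ... }` splices, no escape sequences). The token names and the `lex` function are hypothetical and are not part of RPLY's API; the point is only that a stack of lexer states is enough to decide whether a `}` closes a brace or a splice.

```python
import re

# Tokens recognised outside of string literals (hypothetical names).
CODE_TOKEN = re.compile(r"""
    (?P<WS>\s+)
  | (?P<LBRACE>\{)
  | (?P<RBRACE>\})
  | (?P<STRING_START>")
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<INT>[0-9]+)
  | (?P<OP>[=;+])
""", re.VERBOSE)

# Literal text inside a string: anything up to a closing quote or a "${".
STRING_PART = re.compile(r'(?:[^"$]|\$(?!\{))+')


def lex(src):
    tokens = []
    # Each stack entry records what the lexer is currently inside:
    # "top", "brace", "splice" or "string".
    stack = ["top"]
    pos = 0
    while pos < len(src):
        if stack[-1] == "string":
            if src.startswith('"', pos):
                stack.pop()
                tokens.append(("STRING_END", '"'))
                pos += 1
            elif src.startswith("${", pos):
                stack.append("splice")
                tokens.append(("SPLICE_START", "${"))
                pos += 2
            else:
                m = STRING_PART.match(src, pos)
                tokens.append(("STRING_PART", m.group()))
                pos = m.end()
            continue

        m = CODE_TOKEN.match(src, pos)
        if m is None:
            raise ValueError("unexpected character at offset %d" % pos)
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "WS":
            continue
        if kind == "STRING_START":
            stack.append("string")
        elif kind == "LBRACE":
            stack.append("brace")
        elif kind == "RBRACE":
            if stack[-1] == "top":
                raise ValueError("unbalanced '}'")
            # The state being closed decides which token this '}' is.
            if stack.pop() == "splice":
                kind = "SPLICE_END"
        tokens.append((kind, text))
    return tokens
```

With this, the `}` in `{ x = 2; }` comes out as `RBRACE`, while the `}` in `"x + ${ y } + z"` comes out as `SPLICE_END`, so the parser never has to disambiguate them.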
<korvo>
Yeah, I guess so. I still want a better version of this grammar, but I should duplicate the already-working grammar first.
<cfbolz>
the other option could be what pypy does for f-strings: always parse the string as a full token, then later (during bytecode compilation/ast construction) parse the insides of the quasi quotes
<korvo>
Hm. I wonder if that actually forms a regular language (and thus could be lexed); I'm thinking nasty cases like '"x + ${ { y = "z + ${ w }"; } }"' might make it hard to defer parsing. Fun idea though.
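A rough sketch of what deferring would require for the nested case: to grab the whole string as one token, the lexer has to find where the outer literal ends, and that means tracking how deeply braces and strings are nested. Unbounded nesting of that kind cannot be matched by a single regular expression, but a small recursive scanner handles it. The helper names `string_end` and `splice_end` are hypothetical, and escape sequences are ignored for brevity.

```python
def string_end(src, start):
    """Return the index just past the string literal that opens at src[start]."""
    assert src[start] == '"'
    pos = start + 1
    while pos < len(src):
        if src[pos] == '"':
            return pos + 1
        if src.startswith("${", pos):
            # Skip the whole splice, including any strings nested inside it.
            pos = splice_end(src, pos + 2)
        else:
            pos += 1
    raise ValueError("unterminated string")


def splice_end(src, pos):
    """Return the index just past the '}' closing a splice whose body starts at pos."""
    depth = 0  # number of currently open plain '{' braces inside the splice
    while pos < len(src):
        ch = src[pos]
        if ch == '"':
            pos = string_end(src, pos)
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                return pos + 1
            depth -= 1
        pos += 1
    raise ValueError("unterminated splice")
```

Once the extent of the outer literal is known, its contents can be handed to a later pass, in the spirit of the f-string approach mentioned above.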
<cfbolz>
korvo: that kind of construction used to be forbidden in python
<nikolar>
but not anymore kek
<cfbolz>
thankfully we don't support that version yet