User talk:Tengstrand

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Hi Tengstrand! It is okay when you do Laja-advertisement here, but please use the format used by all the other parser generators. I am interested, which algorithm is exactly used? It really handles ambiguous grammars? When reading the website I think it is for unambiguous grammars and not for natural language parsing etc. --Chricho (talk) 23:21, 9 December 2010 (UTC)[reply]


Hi Chricho! Thank's for your help to get this right! I'm an autodidakt in programming and parser generator techniques and I am not sure to which type my algorithm belongs to. I read about Ambiguous grammar and tried to understand if Laja can handle ambiguous grammars. If I understand right, a parser can handle ambigous grammars if the parsed text can match several "paths" in the grammar rules. And yes, you have probably right there, the parsed text will normally follow one way along the grammar rules. There is one exception here and that is the support for variables in the grammar. You can have variables in the grammar rules like:

 grammar hello {
   hello = $myflag "hello" $name;
 
   Hello hello;
   $myflag boolean hello.isThisTrue();
   $name String hello.getName();
 }
 

When generating the parser you need to implement the methods isThisTrue() and getName(). The parser will continue the path only if the result from isThisTrue returns the boolean value true and if the source is matching against the string returned from getName(). With these two variants you can make decisions at runtime based on the content in the grammar (or other external "data").

I will try to explain how the algorithm works. The grammar contains of one or more grammar rules. The grammar alway have a top/base definition/rule that represents the grammar that the parser will match against. The top node has the same name as the grammar (mygrammar in this example). The parser matches the source against the definition "mygrammar", starting from the left most definition in definition mygrammar. Example:

 grammar mygrammar {
   a = "a";
   operator = "+"|"-";
   mygrammar = a [operator a]+;
 }


If parsing the following text:

 a + a - a

It will be matched agains: a, operator, a, operator, a.

The parsing is performed in two steps. In the first step, every element/expression (like "a" or operator) is matched against the current possition in the source (starting with index 0) and the result will be a list of boolean values (stored as bits). If a failing path is found, then Laja will start from the last correct branch in the "list of booleans" and overwrite the result. When phase one is finished, the correct path through the "desicion tree" is ready to be used in phase two. When phase two begins, we know that the source was fully matching the grammar, and we start to create the "object structure" that is the output from the parser, by instantiating objects via the factory which is is a memory consuming activity, that's why we don't want to perform it in phase one, only when we traverse the "happy path" tree.


After all this, which section on your great page do you think my parser generator belongs to? And what do you mean with "use the format used by all the other parser generators"?

BR, Joakim Tengstrand

Hi, thank you for the long reply, could you paste the generated source for an example-grammar (e.g. the mentioned mygrammar)? In the few grammars I have seen the top-definition has always been the last definition, is there any reason for that pattern? Usually people put the most important thing at the top. With the “format used by alle the others” I mean the placement of links etc. in the table, I have fixed that, btw, it is not “my” table. ;) --Chricho (talk) 19:07, 10 December 2010 (UTC)[reply]

PS: Could you paste generated source for hello, too? — Preceding unsigned comment added by Chricho (talkcontribs) 19:08, 10 December 2010 (UTC)[reply]

Okay, I had a look at it, it seems to have these properties:

  • The parser is “generated” at runtime, the code generator just translates the grammar directly into Java-expressions holding the structure of the grammar (I mean OrList, AndList, Element etc.), which get evaluated at runtime.
  • The parsing routine is performed top-down (starting at the top-element, then evaluating the children).
  • Boolean grammars are supported (with and and not).
  • It probably cannot handle left-recursion (a = a "+" b; etc.)
  • It has probably most similarities with the listed PEG-parser-generators, the parsers listed in the last category are (though I am not 100% sure) exen suitable for natural language evaluation where multiple parse-trees can exist.
  • The algorithm may be described as 2-phase-top-down-backtracking, maybe the runtime-flexibility could be mentioned, do you have any nice formulation?

So I think you do not have to post any generated source-code, anyway the generated source is not very interesting, the real parsing routine is implemented in parser/engine2. --Chricho (talk) 21:10, 10 December 2010 (UTC)[reply]


Hi again!

  • The grammar rules can be defined in arbitrary order, it has support for both bottom-up and top-down (or a mix). You have probably right about that it should be better to define the top-element first! If all rules are based on previously defined rules (bottom-up) the generated code will be a litte more readable (just slightly) by holding together the declarations and the assignments. This is the only reason why i have used bottom-up instead of top-down.
  • I think you have understand right, but just to clearify: the code that performs the parsing is not generated at runtime, it has to be in place before the parsing begins. You have right about that the method that performs the actual parsing instantiates an object structure that has a one-to-one relation with the grammar. This object structure is code generated from the grammar before the parsing begins.

And the rest you say is also correct:

  • The parsing routine is performed top-down.
  • Boolean grammars are supported.
  • It can not handle left-recursion.

I moved it to the PEG section and changed the algorithm desciption to: 2-phase scannerless top-down backtracking + runtime support

Thanks for your input! BR, Joakim Tengstrand

Okay, I think it is okay now, thanks. :-)
„I think you have understand right, but just to clearify: the code that performs the parsing is not generated at runtime, it has to be in place before the parsing begins.“
Yes, my formulation was quite imperfect, however, it would be easy to write a parser reading its grammar at runtime without further compilation-steps and without worse performance, except of the overhead for parsing the grammar-file etc. --Chricho (talk) 16:41, 12 December 2010 (UTC)[reply]

November 2023[edit]

Information icon Please do not add promotional material to Wikipedia, as you did to Component-based software engineering. While objective prose about beliefs, organisations, people, products or services is acceptable, Wikipedia is not a vehicle for soapboxing, advertising or promotion. Thank you. —Bruce1eetalk 07:41, 26 November 2023 (UTC)[reply]

Hi,
What about this instead:
In 2018, [polylith.gitbook.io Polylith] was released as an open source component-based architecture.
It may be good for you to know also, that Bit has been heavily inspired by Polylith, and that we reach out to them in April 2021 when we realised that they had copied a lot of their documentation from our Polylith documentation. Tengstrand (talk) 13:59, 26 November 2023 (UTC)[reply]

Managing a conflict of interest[edit]

Information icon Hello, Tengstrand. We welcome your contributions, but if you have an external relationship with the people, places or things you have written about on Wikipedia, you may have a conflict of interest (COI). Editors with a conflict of interest may be unduly influenced by their connection to the topic. See the conflict of interest guideline and FAQ for organizations for more information. We ask that you:

In addition, you are required by the Wikimedia Foundation's terms of use to disclose your employer, client, and affiliation with respect to any contribution which forms all or part of work for which you receive, or expect to receive, compensation. See Wikipedia:Paid-contribution disclosure.

Also, editing for the purpose of advertising, publicising, or promoting anyone or anything is not permitted. Thank you. MrOllie (talk) 03:17, 27 November 2023 (UTC)[reply]