Re: [BLACKBOX] EBNF question from [at]} on 2012-02-17 (BB-FullArchive.mbox)

From: [at]} <Bob>
Date: Fri, 17 Feb 2012 15:02:57 -0000

I'm probably missing something, but it doesn't strike me as being much different from {}. The difference is that the parser would need to count each symbol type within the brackets, then check at the end of the loop that there were at least n of each symbol type. Is that what you mean by "use a context condition implemented in the semantics checker of the parser, but not within the syntax checker."? (I am not a compiler writer, merely an admirer).

first sym

until stopSymbol do

inc count[sym.type]

next sym

end

for each type do

if count[type] < N then

Error

end

end

This doesn't seem dramatically different from extensions to {} which impose a limit on the cardinality

> The question remains if this construct is required at all. In typical

> programming languages it is not. In your application it may be, but

> I would recommend to think carefully about it and to avoid it if possible.

Well, I have no control over the xml schema that I am trying to simplify, and the schema includes precisely the syntax that I have described, and programs have to parse it, although the specifiers have had to revert to natural language to state that each type must appear at least once.

In my simplified non-XML version I could certainly impose an order, but then I would have to justify it and I'd find it quite difficult to justify on technical grounds.

But that is not really why I raised the question here. It has always been my understanding that any sequence could be expressed in terms of sequence, selection and iteration - this is the basis of go-to-less programming, and EBNF relies on this theorem, but here is a sequence that it appears to be unable to define. I am just a humble programmer, so I'm probably wrong, and I would quite like to know why. Ergo my question here.

All the best,

Bob

From: BlackBox [mailto:BLACKBOX{([at]})nowhere.xy
Sent: 17 February 2012 13:04
To: BLACKBOX{([at]})nowhere.xy
Subject: Re: [BLACKBOX] EBNF question

EBNF does not have any kind of support for permutations.

Why not? Because there is no straight forward way of

implementing such a rule in a parser. Note that all EBNF constructs

have a direct counterpart in the parser code needed to implement the rule,

for example {} corresponds to a WHILE loop, | corresponds to an IF etc.

You would either have to expand all permutations by alternatives,

which can become fairly long, or you would have to use a context condition

implemented in the semantics checker of the parser, but not within the

syntax checker.

The question remains if this construct is required at all. In typical

programming languages it is not. In your application it may be, but

I would recommend to think carefully about it and to avoid it if possible.

- JT

----- Original Message -----

From: Bob Walkden <mailto:bob{([at]})nowhere.xy

To: BLACKBOX{([at]})nowhere.xy

Sent: Friday, February 17, 2012 12:20 PM

Subject: Re: [BLACKBOX] EBNF question

Hi Doug,

I'm not sure I understand that. I would like to be able to write

production = A { B } C .

A = whatever

C = whatever

B = X | Y | Z | … | ad inf .

X = x { x } .

Y = y { y } .

Z = z { z } .

but within B there must occur at least one x and at least one y and at least one z, … ad inf.

So if we have a set B = { a, b, c, … }

and the bag G whose range is elements from B, I would like to be able say in EBNF that rng G = B, as opposed to rng G IN B.

Bob

Bob,
So you want the longest sequence of symbols such that when the sequence is converted to a set, the set is equal to your specified symbol set?
-Doug

On 2/16/2012 3:46 PM, Bob Walkden wrote:

Hi Doug,

thanks for your interest. In the syntax that I want to specify, "baa" should succeed. Remember that each of the symbols must appear at least once, not once only. I think this makes it a combination rather than a permutation:

<http://en.wikipedia.org/wiki/Combination>

What I want to state is that

Pre: #a = 0 ^ #b = 0 ^ #c = 0 ^ ... ^ #z = 0

Parse

Post: #a > 0 ^ #b > 0 ^ #c > 0 ^ … ^ #z > 0

I think you can sort of specify a permutation, but only by, in essence, listing all the permutations. In the original email I suggested this:

mandatoryParts = ( A B C | A C B | B C A | B A C | C A B | C B A ) { a | b | c } .

A = a {a} .

B = b {b} .

C = c {c} .

...

But if you have to list all the permutations, like enumerating a very large set, it kind of defeats the purpose. It should be possible to define a comprehensive specification.

B

From: BlackBox [mailto:BLACKBOX{([at]})nowhere.xy
Sent: 16 February 2012 21:18
To: BLACKBOX{([at]})nowhere.xy
Subject: Re: [BLACKBOX] EBNF question

Bob,
Ok, got it.
You want to specify that any permutation of a set of symbols is accepted for a production.
There are n! (n factorial) permutations of n things. So bear with me as I walk through the
thoughts on this.

How would one write a parser to say yay, or nay to a string of symbols as a permutation?
Suppose our hypothetical notation states a permutation production as
x = P(a, b, c)
So there is a set S = {a, b, c} that our parser must keep track of. There is also another set
E={} which is the examined symbols so far in the parse.
Let's say our input is I=bca then the parser would need to read '"b" and firstly determine if it
is in the set S. It then needs to determine it that symbol has been previously examined for
this specific production so it check "b" against E. It is not there so all is well we hence progress
as
E S I
{} {abc} bca
{b} {abc} ca
{bc} {abc} a
{bca} {abc}
and the parse succeeds. If the input were "baa" or "bxc" the parse would fail.

The problem is that there is a 'state' E which is changing as the production is processed.
But surely EBNF has complex states so what is the difference? If the input is "a...a" where
there are n-2 symbols between the "a"s for our set S of n symbols then the parse path has
to look back n-1 symbols to determine if it has seen the last "a" before. So, do any of the
EBNF constructs look back to previously encountered symbols?

O = [a] (optional)
S = {a} (zero or more)
F = a b (followed by)
A = a | b (alternatives)
L = "string" (literal)
G = (a) (grouping)

I don't see any of those rules looking back. The parse certainly can 'backtrack' to an earlier state if it fails but the failure does
not seem to be dependent upon the symbols in the parse path.

So, tentatively, I conclude that no, you can't do a permutation in EBNF.
I am not happy with that since a more formal specification is warranted.

(the reason for my interest is that I am in the process or writing a parser for arbitrary EBNF
grammars)

-Doug

On 2/15/2012 11:36 PM, Bob Walkden wrote:

From: BlackBox [mailto:BLACKBOX{([at]})nowhere.xy

Danforth

Sent: 16 February 2012 01:53
To: BLACKBOX{([at]})nowhere.xy
Subject: Re: [BLACKBOX] EBNF question

On 2/15/2012 3:23 PM, Bob Walkden wrote:
I can't write

mandatoryParts = { a {a} | b {b} | c {c} | ... | z {z} } .

Bob,
Why not just write

mandatoryParts = ( a {a} | b {b} | c {c} | ... | z {z} ) .
Just parenthesize the list. That would allow any order to occur and
one of the options must be chosen.

Thanks, but that would mean only one of them is mandatory. I am trying to
state that each of them is mandatory, in any order.

B

----
To unsubscribe, send a message with body "SIGNOFF BLACKBOX" to LISTSERV{([at]})nowhere.xy 
---- To unsubscribe, send a message with body "SIGNOFF BLACKBOX" to LISTSERV{([at]})nowhere.xy---- To unsubscribe, send a message with body "SIGNOFF BLACKBOX" to LISTSERV{([at]})nowhere.xy---- To unsubscribe, send a message with body "SIGNOFF BLACKBOX" to LISTSERV{([at]})nowhere.xy---- To unsubscribe, send a message with body "SIGNOFF BLACKBOX" to LISTSERV{([at]})nowhere.xy---- To unsubscribe, send a message with body "SIGNOFF BLACKBOX" to LISTSERV{([at]})nowhere.xy---- To unsubscribe, send a message with body "SIGNOFF BLACKBOX" to LISTSERV{([at]})nowhere.xy

Received on Fri Feb 17 2012 - 16:02:57 UTC

This archive was generated by hypermail 2.3.0 : Thu Sep 26 2013 - 06:30:07 UTC