[egenix-users] How to create nested search pattern with mxTextTools ?

Mike C. Fletcher mcfletch at rogers.com
Sun Jun 9 16:21:51 CEST 2002


Pekka, are you sure you're not optimising this app too early?  I mean, 
you're only dealing with 2MB files.  A simple loop over the results 
table isn't likely to be a time problem compared to your original 
solution with line-by-line regex runs.  If you code the whole grammar as 
an EBNF so you can process the whole file in a single call to 
TextTools.tag, the parsing time is unlikely to be noticeable, and 
looping over the tagging results to pull out the matches is pretty 
fast for most uses.
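As a rough sketch of that loop: the function below walks a tag list recursively and collects the text of every entry tagged 'match'. It assumes the usual mxTextTools entry shape of (tagobj, left, right, subtags), where subtags may be None. The name collect_matches and the sample tag list are made up for illustration; the sample is written by hand for Pekka's test string rather than produced by running TextTools.tag.

```python
def collect_matches(text, taglist, name="match"):
    """Return the substrings tagged `name`, outermost first."""
    found = []
    # A leaf entry may carry None instead of an empty sub-taglist.
    for tagobj, left, right, subtags in taglist or ():
        if tagobj == name:
            found.append(text[left:right])
        # Nested matches live in the entry's own sub-taglist.
        found.extend(collect_matches(text, subtags, name))
    return found


text = "aa?BB?CC!DD!ee?FF!gg"
# Hand-written stand-in for a tagging result (an assumption, not real output).
sample_taglist = [
    ("match", 2, 12, [("match", 5, 9, None)]),
    ("match", 14, 18, None),
]
print(collect_matches(text, sample_taglist))
# ['?BB?CC!DD!', '?CC!', '?FF!']
```

The walk is pre-order, so an enclosing match is listed before the matches nested inside it, which is the ordering asked for in the original question.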

As for the mxTextTools equivalent of the EBNF:

# Command codes: 207 = SubTable, 204 = TableInList, 21 = Word, 11 = AllIn.
# mxDecl[0] is declaration, mxDecl[1] is match, mxDecl[2] is a.
mxDecl = []
mxDecl.extend(
[
     (
         (None, 207, ((None, 204, (mxDecl, 2)),
                      (None, 207, (('match', 204, (mxDecl, 1)),
                                   (None, 204, (mxDecl, 2))), 1, 0))),),
     (
         (None, 207, (
             (None, 21, '?'),
             (None, 204,(mxDecl, 2)),
             (None, 207, (('match', 204, (mxDecl, 1)),
                          (None, 204, (mxDecl, 2))),1, 0),
             (None, 21, '!'))),),
     (
         (None, 207, (
             (None, 11,
              '-_abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ:. ',
              1),)),)
]
)

You can find it in the generator's tupleset attribute:

	table = generator.buildParser(
		declaration
	).tupleset
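For anyone who would rather read the grammar than decode the tuples, here is a plain-Python sketch of the same three productions. It does not use mxTextTools or SimpleParse at all; the function names are invented for illustration, and the tuples it returns only imitate the (tagobj, left, right, subtags) shape of tag list entries.

```python
A_CHARS = set(
    "-_abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ:. "
)


def skip_a(text, pos):
    """<a> := [-_a-z0-9A-Z:. ]*  (consume zero or more plain characters)."""
    while pos < len(text) and text[pos] in A_CHARS:
        pos += 1
    return pos


def parse_body(text, pos):
    """a, (match, a)*  (the part shared by `declaration` and `match`)."""
    entries = []
    pos = skip_a(text, pos)
    while True:
        result = parse_match(text, pos)
        if result is None:
            # No further match here; leave pos where the attempt started.
            return entries, pos
        entry, pos = result
        entries.append(entry)
        pos = skip_a(text, pos)


def parse_match(text, pos):
    """match := '?', a, (match, a)*, '!'  (returns (entry, next_pos) or None)."""
    if not text.startswith("?", pos):
        return None
    start = pos
    children, pos = parse_body(text, pos + 1)
    if not text.startswith("!", pos):
        return None
    return ("match", start, pos + 1, children), pos + 1


def parse_declaration(text):
    """declaration := a, (match, a)*  (returns (taglist, next_pos))."""
    return parse_body(text, 0)


taglist, end = parse_declaration("aa?BB?CC!DD!ee?FF!gg")
print(taglist)
# [('match', 2, 12, [('match', 5, 9, [])]), ('match', 14, 18, [])]
```

Only `match` produces an entry, mirroring the table, where `a` is called with a None tag object and so leaves nothing in the results.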

I've never felt the need to add AppendMatch, but the code is available 
if someone wants to add the syntax to the EBNF grammar and the 
objectgenerator code.

Enjoy,
Mike

Pekka Niiranen wrote:
> I am searching nested strings that are limited (and include)
> with ?- and ! -signs.
> 
> I have created a nested EBNF-search pattern with Simpleparse-1.0 module
> for mxTextTools. The problem is: I am interested only on matched strings
> 
> and cannot use flag (true or not ?) "AppendMatch" with SimpleParse.
> 
> What is the mxTextTool equivalent of the EBNF -notation below:
> 
> declaration := a,(match,a)*
> match := '?',a,(match,a)*,'!'
> <a> := [-_a-z0-9A-Z:. ]*        #Not returning this line
> 
> in case of a string "aa?BB?CC!DD!ee?FF!gg"
> it should return:
> 
> [?BB?CC!DD!, ?CC!, ?FF!]
> 
> Any help appreciated,
> 
>     -pekka-
> 
> 
> 
> _______________________________________________________________________
> eGenix.com User Mailing List                     http://www.egenix.com/
> http://lists.egenix.com/mailman/listinfo/egenix-users
> 


-- 
_______________________________________
   Mike C. Fletcher
   http://members.rogers.com/mcfletch/




