[egenix-users] Continuous searching of text thru all characters on line: Does not work, vol 1

Pekka Niiranen krissepu at vip.fi
Mon Jun 24 17:17:08 CEST 2002


Nope,

this code works the same:

---- code starts ---
import os
import pprint
from simpleparse import generator
from mx.TextTools import *

letter_set = set(alpha)
linput = "aa?BB!aa?DD!cc"
head_pos = None


def pr(taglist,txt,l,r,subtag):
    """Print matched string"""
    print txt[l:r]

matchtable = ((pr, AllIn+CallTag, '?', +1),
              (pr, AllInSet+CallTag, letter_set, +1),
              (pr, AllIn+CallTag, '!', MatchFail, MatchOk))


tagtable = ((None, AllInSet, letter_set, +1),
            ('Match', Table+AppendMatch, matchtable),
            (None, Table, ThisTable)) # Continue searching after first match on line.


result,taglist,next = tag(linput, tagtable)
print taglist
print "-------"

---- code stops ---
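
For comparison, the result the tag table is meant to produce for the sample line can be computed with a small hand-written scanner in plain Python. This is only a sketch under stated assumptions, not mx.TextTools code: `find_marked` and `LETTERS` are hypothetical names, `alpha` is assumed to mean the ASCII letters, and exactly one '?' is accepted where the AllIn entry would accept a run of them.

```python
import string

# Assumption: mx.TextTools' `alpha` corresponds to the ASCII letters.
LETTERS = set(string.ascii_letters)


def find_marked(text):
    """Return every non-overlapping '?letters!' run in text, left to right."""
    found = []
    i = 0
    n = len(text)
    while i < n:
        if text[i] != "?":
            i += 1
            continue
        # Consume the run of letters that follows the '?'.
        j = i + 1
        while j < n and text[j] in LETTERS:
            j += 1
        # Accept only if at least one letter was seen and a '!' closes the run.
        if j > i + 1 and j < n and text[j] == "!":
            found.append(text[i:j + 1])
            i = j + 1
        else:
            i += 1
    return found


print(find_marked("aa?BB!aa?DD!cc"))
```

For the sample input this collects '?BB!' and '?DD!' into one flat list, which is the shape the tag list is expected to have.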

Since parsers are needed mostly because nested structures cannot be
searched with regular expressions, please add examples of nested searches to
your future documentation. Once I get this one running, I will send it to you.

-pekka-


"M.-A. Lemburg" wrote:

> Pekka Niiranen wrote:
> > I am using mxTextTools from mxBase 2.1.0b2.
> >
> > I am parsing a line that may contain multiple non-overlapping matches:
> >
> > ---- code starts ---
> > import os
> > import pprint
> > from mx.TextTools import *
> >
> > letter_set = set(alpha)
> > linput = "aa?BB!aa?DD!aa"
> > head_pos = None
> >
> >
> > def pr(taglist,txt,l,r,subtag):
> >     """Print matched string"""
> >     print txt[l:r]
> >
> > matchtable = ((pr, AllIn+CallTag, '?', +1),
> >               (pr, AllInSet+CallTag, letter_set, +1),
> >               (pr, AllIn+CallTag, '!', +1, MatchOk),
> >               (None, Fail, Here)) # This is needed in order to avoid infinite loop
> >
> >
> > tagtable = ((None, AllInSet, letter_set, +1),
> >             ('m', Table+AppendMatch, matchtable),
> >             (None, Table, ThisTable)) # Continue searching after first match on line.
> >
> >
> > result,taglist,next = tag(linput, tagtable)
> > print taglist
> > print "-------"
> >
> > ---- code ends ---
> >
> > The problem is that "print taglist" returns only ['?BB!'] instead of
> > ['?BB!', '?DD!'], i.e. the match from the recursive call of tagtable is
> > not added to taglist. However, as the function pr reveals, ?DD! is
> > found by mxTextTools.
>
> The reason is that failing sub table matches restore the tag list
> to what it was before recursion. You should remove the (None, Fail, Here)
> and replace (pr, AllIn+CallTag, '!', +1, MatchOk) with
> (pr, AllIn+CallTag, '!', MatchFail, MatchOk).
>
> > Is it possible to add all the matched strings into a single table that
> > does not contain subtables?
> > (not ['?BB!', ['?DD!']])
>
> Yes. The command SubTable does this for you.
>
> --
> Marc-Andre Lemburg
> CEO eGenix.com Software GmbH
> ______________________________________________________________________
> Company & Consulting:                           http://www.egenix.com/
> Python Software:                   http://www.egenix.com/files/python/
> Meet us at EuroPython 2002:                 http://www.europython.org/
>
> _______________________________________________________________________
> eGenix.com User Mailing List                     http://www.egenix.com/
> http://lists.egenix.com/mailman/listinfo/egenix-users
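
The restore-on-failure rule described in the quoted reply can be mimicked in a few lines of plain Python. This is a toy model only, not mx.TextTools itself: `scan`, `TOKEN`, `seen` and the `optional_tail` switch are hypothetical names, and a regular expression stands in for the match table. It suggests why the callback can still report ?DD! while the returned tag list keeps only ?BB!.

```python
import re

# Hypothetical stand-in for the '?letters!' match table from the thread.
TOKEN = re.compile(r"\?[A-Za-z]+!")


def scan(text, pos, taglist, seen, optional_tail=False):
    """Skip letters, match one token, then recurse on the rest of the line.

    Returns True on success. If the recursive call fails, the shared
    taglist is cut back to the length it had before that call.
    """
    # Skip a run of letters, like the leading AllInSet entry.
    while pos < len(text) and text[pos].isalpha():
        pos += 1
    m = TOKEN.match(text, pos)
    if m is None:
        return False
    seen.append(m.group())     # callback side effect: never undone
    taglist.append(m.group())
    mark = len(taglist)        # tag list state "before recursion"
    ok = scan(text, m.end(), taglist, seen, optional_tail)
    if not ok and not optional_tail:
        del taglist[mark:]     # restore the tag list, then propagate failure
        return False
    return True


strict_tags, strict_seen = [], []
strict_ok = scan("aa?BB!aa?DD!aa", 0, strict_tags, strict_seen)
print(strict_ok, strict_tags, strict_seen)

loose_tags, loose_seen = [], []
loose_ok = scan("aa?BB!aa?DD!aa", 0, loose_tags, loose_seen, optional_tail=True)
print(loose_ok, loose_tags, loose_seen)
```

In the strict run the innermost call finds no further token and fails, the failure propagates outwards, and the outermost caller cuts the list back to ['?BB!'], although `seen` has recorded both matches. With `optional_tail=True` a failing tail is tolerated and both matches stay in the list. Whether the real tagging engine takes exactly this path has not been verified here.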



