External

Since: Dec 31, 2007 Posts: 12
|
(Msg. 1) Posted: Sat Jan 05, 2008 11:07 pm
Post subject: Help with RegEx? Archived from groups: comp>sys>mac>programmer>help, others
I want to build a regular expression that will find certain characters in a
field. For example:
i,n,t,u,o,n
all need to be present (at least once) for the RegEx interpreter to label
this search True. The order is not important, and case should be ignored.
I tried
[Ii][Nn][Tt][Uu][Oo][Nn]
and
[Ii].*[Nn].*[Tt].*[Uu].*[Oo].*[Nn]
No joy.
Also tried use of ^ and $ but I'm not sure how to implement them, and whether
or not they are required.
So basically I'm stumbling around in the dark. But I want to learn. I've
viewed several tutorials on-line, but this subject is so obtuse to me that
it's difficult even getting started.
Any suggestions would be greatly appreciated.
Thanks!
External

Since: Dec 31, 2007 Posts: 12
|
(Msg. 2) Posted: Sun Jan 06, 2008 1:43 am
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
> [Ii].*[Nn].*[Tt].*[Uu].*[Oo].*[Nn]
It turns out that this does work to find terms with all these letters in this
order even if there are other characters interspersed between them. Such as:
intuition
in3t5u7ition
in tuitio9n
inBBtuCCon
Now I want to find terms that have all these characters in any order. Such
as:
noiiitunt
tn8no9uitii
otAAnuiBiit
unt ioitni
I guess this has something to do with the ^ and $ parsing metasymbols but I'm
not knowledgeable enough on this topic to know how, exactly.
Any help would be greatly appreciated.
Thanks!
External

Since: Jul 06, 2004 Posts: 7
|
(Msg. 3) Posted: Sun Jan 06, 2008 1:43 am
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
SparkyGuy wrote:
# I want to build a regular expression that will find certain characters in a
# field. For example:
#
# i,n,t,u,o,n
#
# all need to be present (at least once) for the RegEx interpreter to label
# this search True. The order is not important, and case should be ignored.
#
# I tried
#
# [Ii][Nn][Tt][Uu][Oo][Nn]
#
# and
#
# [Ii].*[Nn].*[Tt].*[Uu].*[Oo].*[Nn]
#
# No joy.
The software really isn't intended to be used this way. It would
be simpler to conjoin a number of searches. Assuming an interface
like
numberofmatches = regexp(pattern,string)
you can do something like
regexp("[Ii]",string)>=1
&& regexp("[Nn]",string)>=2
&& regexp("[Tt]",string)>=1
&& regexp("[Uu]",string)>=1
&& regexp("[Oo]",string)>=1
Some interfaces also allow a flag to ignore character case.
regexpnocase("i",string)>=1
&& regexpnocase("n",string)>=2
&& regexpnocase("t",string)>=1
&& regexpnocase("u",string)>=1
&& regexpnocase("o",string)>=1
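The conjoined searches can be sketched in Python, assuming an engine like Python's `re` where `len(re.findall(...))` stands in for the hypothetical `regexp(pattern,string)` match counter. The names `REQUIRED` and `has_required_letters` are illustrative only and do not come from the thread.

```python
import re

# Minimum number of occurrences required for each letter (assumed from the
# original request "i,n,t,u,o,n": two n's and one of each other letter).
REQUIRED = {"i": 1, "n": 2, "t": 1, "u": 1, "o": 1}

def has_required_letters(text: str) -> bool:
    """Return True if every letter occurs at least its required count."""
    return all(
        len(re.findall(letter, text, flags=re.IGNORECASE)) >= count
        for letter, count in REQUIRED.items()
    )

print(has_required_letters("unt ioitni"))  # letters in scrambled order
print(has_required_letters("intuit"))      # only one n and no o
```

Counting with "at least" rather than "exactly" matters here, because a word such as "intuition" contains three i's.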
To do this in a single RE, you have to list all 360 orderings of the six
letters (6!/2!, since the two n's are interchangeable),
[Ii].*[Nn].*[Tt]...
|[Ii].*[Uu].*[Nn]...
|...
--
SM Ryan http://www.rawbw.com/~wyrmwif/
GERBILS
GERBILS
GERBILS
External

Since: Jul 18, 2004 Posts: 548
|
(Msg. 4) Posted: Sun Jan 06, 2008 3:06 pm
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
SparkyGuy wrote:
> > [Ii].*[Nn].*[Tt].*[Uu].*[Oo].*[Nn]
>
> It turns out that this does work to find terms with all these letters in this
> order even if there are other characters interspersed between them. Such as:
>
> intuition
> in3t5u7ition
> in tuitio9n
> inBBtuCCon
>
> Now I want to find terms that have all these characters in any order. Such
> as:
>
> noiiitunt
> tn8no9uitii
> otAAnuiBiit
> unt ioitni
To clarify: do you want at least one each of "I", "T", "U" and "O", and
at least two "N"s, in any order and mixed with any other characters (or
more of the same ones), ignoring case?
Tricky. You can't do that conveniently with a simple regular expression.
A Perl-compatible engine could do it with lookahead assertions, but a basic
engine has nothing comparable.
It would be best done in parallel, testing the string against five
different regular expressions:
[Ii]
[Nn].*[Nn]
[Tt]
[Uu]
[Oo]
The string has to match all of these to pass.
If you really don't need two "N"s then you can simplify the second test
to be like the others. If you want a certain number of each letter, but
in any position, then use the same general syntax as the second line.
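A minimal Python sketch of the five parallel tests, assuming Python's `re` module as the engine. The names `TESTS`, `passes_all`, `LOOKAHEAD` and `passes_lookahead` are made up for illustration. The second half is an assumption beyond the thread: an engine with lookahead assertions (Python's has them) can fold the same five tests into one pattern.

```python
import re

# The five independent tests; the second one demands two n's.
TESTS = [re.compile(p) for p in ("[Ii]", "[Nn].*[Nn]", "[Tt]", "[Uu]", "[Oo]")]

def passes_all(text: str) -> bool:
    """True only if every pattern is found somewhere in the text."""
    return all(t.search(text) for t in TESTS)

# Assumption: lookahead support. Each (?=...) checks one requirement from
# the start of the string without consuming any characters.
LOOKAHEAD = re.compile(r"^(?=.*i)(?=.*n.*n)(?=.*t)(?=.*u)(?=.*o)", re.IGNORECASE)

def passes_lookahead(text: str) -> bool:
    """Single-pattern equivalent of passes_all for single-line input."""
    return LOOKAHEAD.search(text) is not None

for sample in ("noiiitunt", "tn8no9uitii", "otAAnuiBiit", "unt ioitni"):
    print(sample, passes_all(sample), passes_lookahead(sample))
```

Note that "otAAnuiBiit" contains only one n, so it passes only if the second test is relaxed to a plain `[Nn]`.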
> I guess this has something to do with the ^ and $ parsing metasymbols but I'm
> not knowledgeable enough on this topic to know how, exactly.
Those just mean "start of string" and "end of string" respectively. For
example, if you want to only match a string which starts with "I" or "i"
then your regular expression is "^[Ii]".
--
David Empson
dempson RemoveThis @actrix.gen.nz
External

Since: Feb 18, 2004 Posts: 137
|
(Msg. 5) Posted: Sun Jan 06, 2008 3:25 pm
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
In article ,
SparkyGuy wrote:
> I want to build a regular expression that will find certain characters in a
> field.
>
> For example:
>
> i,n,t,u,o,n
>
> all need to be present (at least once) for the RegEx interpreter to label
> this search True. The order is not important, and case should be ignored.
The 'order is not important' part makes a regular expression a 'less
than the ideal' solution for your problem.
> I tried
>
> [Ii][Nn][Tt][Uu][Oo][Nn]
>
> and
>
> [Ii].*[Nn].*[Tt].*[Uu].*[Oo].*[Nn]
>
> No joy.
>
> Also tried use of ^ and $ but I'm not sure how to implement them, and whether
> or not they are required.
>
> So basically I'm stumbling around in the dark.
The 'I tried', and 'I am not sure' parts already carried that message,
but it is good to hear that you know that you do not really know what
you are doing.
> But I want to learn. I've viewed several tutorials on-line, but this
> subject is so obtuse to me that it's difficult even getting started.
I guess that you are at the stage where 'every thing looks like a nail,
even your thumb'. Regular expressions are powerful, but not suited for
every job. This is one of those jobs. You can create a regular
expression of this, but it would have to sum up all 360 (that would be
720 if the six letters were different) permutations of the six letters
used, for a regular expression of length around 34 * 360 + 2 * 359.
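The figure of 360 can be checked mechanically. This is a rough Python sketch, assuming Python's `itertools` and `re`; the names `LETTERS`, `char_class`, `orderings`, `alternatives` and `pattern` are illustrative. It joins the alternatives with a bare `|`, so its total length differs slightly from the estimate above, which appears to allow two characters per separator.

```python
import re
from itertools import permutations

LETTERS = "intuon"  # the six required letters; n appears twice

def char_class(letter: str) -> str:
    """Case-insensitive class for one letter, e.g. 'i' -> '[Ii]'."""
    return f"[{letter.upper()}{letter}]"

# A set removes orderings that differ only by swapping the two n's.
orderings = sorted(set(permutations(LETTERS)))
alternatives = [".*".join(char_class(c) for c in order) for order in orderings]
pattern = "|".join(alternatives)

print(len(orderings))        # number of distinct orderings
print(len(alternatives[0]))  # length of one alternative
print(len(pattern))          # total length with bare '|' separators
print(bool(re.search(pattern, "unt ioitni")))
```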
> Any suggestions would be greatly appreciated.
For me, the #1 rule when building regular expressions is: when your
regular expression does not do what you think it should do, shorten it,
and check (in a simple test program) that the shorter one does what you
think it should do.
In your case, you might want to start with "[Ii].*[Nn]" and work from
there.
<http://www.regular-expressions.info/> might help you.
If you have access to a Windows machine: have you seen
<http://www.regexbuddy.com/test.html>? I have not used it myself, but
heard positive comments about it.
> Thanks!
In article ,
SparkyGuy wrote:
> > [Ii].*[Nn].*[Tt].*[Uu].*[Oo].*[Nn]
>
> It turns out that this does work to find terms with all these letters in this
> order even if there are other characters interspersed between them. Such as:
>
> intuition
> in3t5u7ition
> in tuitio9n
> inBBtuCCon
That should work with most, if not all, regular expression libraries.
See for example <http://www.regextester.com/>. Which one are you using?
> I guess this has something to do with the ^ and $ parsing metasymbols
What makes you think that?
Reinder
External

Since: Oct 26, 2007 Posts: 19
|
(Msg. 6) Posted: Sun Jan 06, 2008 3:25 pm
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
On Sat, 05 Jan 2008 23:07:58 -0500, SparkyGuy wrote:
> I want to build a regular expression that will find certain characters in a
> field. For example:
>
> i,n,t,u,o,n
Exactly which language/library/regex engine are you using?
In any case, this isn't easy in a single regexp.
See you soon,
Paul
--
Paul Floyd http://paulf.free.fr
External

Since: Jun 20, 2005 Posts: 30
|
(Msg. 7) Posted: Tue Jan 08, 2008 9:27 pm
Post subject: Re: Help with RegEx? Archived from groups: comp>sys>mac>programmer>help, others
In article ,
SparkyGuy wrote:
> I want to build a regular expression that will find certain characters in a
> field. For example:
>
> i,n,t,u,o,n
>
> all need to be present (at least once) for the RegEx interpreter to label
> this search True. The order is not important, and case should be ignored.
REs are not the right tool for this job; while it is possible to build such an
RE, it's annoying. For example, this is a RE that would match every string
containing all of 'a', 'b', and 'c' in any order, case-insensitively:
([aA].*[bB].*[cC])|([aA].*[cC].*[bB])|([bB].*[aA].*[cC])|([bB].*[cC].*[aA])|([cC].*[aA].*[bB])|([cC].*[bB].*[aA])
If you are using a RE engine that allows RE options, then you can use /i to
indicate you want a case-insensitive match, and then you can simplify this to
/(a.*b.*c)|(a.*c.*b)|(b.*a.*c)|(b.*c.*a)|(c.*a.*b)|(c.*b.*a)/i
and you can further simplify this to
/(a.*((b.*c)|(c.*b)))|(b.*((a.*c)|(c.*a)))|(c.*((a.*b)|(b.*a)))/i
but as you can see, this is pretty painful and it gets factorially more
painful the more characters you want to match.
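A quick way to gain confidence in a six-ordering pattern like this is to compare it with a check that uses no regular expression at all. The Python sketch below assumes Python's `re` with `re.IGNORECASE` in place of the `/i` option, and writes each ordering with `.*` between the letters; `ALL_ABC`, `contains_all` and `mismatches` are illustrative names.

```python
import re
from itertools import product

# Six orderings of a, b, c with ".*" between letters, case-insensitive.
ALL_ABC = re.compile(
    r"(a.*b.*c)|(a.*c.*b)|(b.*a.*c)|(b.*c.*a)|(c.*a.*b)|(c.*b.*a)",
    re.IGNORECASE,
)

def contains_all(text: str) -> bool:
    """Reference check without regular expressions."""
    return {"a", "b", "c"} <= set(text.lower())

# Compare both checks on every string of length 0..5 over a small alphabet.
mismatches = [
    s
    for n in range(6)
    for s in map("".join, product("aBcx", repeat=n))
    if bool(ALL_ABC.search(s)) != contains_all(s)
]
print(mismatches)  # an empty list means the two checks agree
```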
Ben
--
If this message helped you, consider buying an item
from my wish list: <http://artins.org/ben/wishlist>
I changed my name: <http://periodic-kingdom.org/People/NameChange.php>
External

Since: Dec 31, 2007 Posts: 12
|
(Msg. 8) Posted: Wed Jan 09, 2008 1:15 am
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
I have reduced my requirements to just matching input that contains the
letters "i n t u o" in that order even if interspersed with other characters.
The RE I worked out is:
[Ii].*[Nn].*[Tt].*[Uu].*[Oo]
Thanks to all who helped me come to this result.
I also need to match input that exceeds 50 characters. I tried several REs,
paring it down, eventually, to:
.{50}
It doesn't work.
This seems like it should be pretty simple, but it's not working. It works in
regextester.com's tester, but it isn't working for me. Is there a simpler (or
different) way to check for input length? (I presume that my "flavor" of
regex interpreter isn't accepting this form...)
Thanks for your help.
External

Since: Jul 18, 2004 Posts: 548
|
(Msg. 9) Posted: Wed Jan 09, 2008 5:58 pm
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
SparkyGuy wrote:
> I have reduced my requirements to just matching input that contains the
> letters "i n t u o" in that order even if interspersed with other characters.
>
>
> The RE I worked out is:
>
> [Ii].*[Nn].*[Tt].*[Uu].*[Oo]
>
> Thanks to all who helped me come to this result.
>
> I also need to match input that exceeds 50 characters. I tried several REs,
> paring it down, eventually, to:
>
> .{50}
>
> It doesn't work.
That regular expression matches exactly 50 of any combination of
characters. Assuming your input is being processed on a line at a time
basis, it should match any line which contains a minimum of 50
characters.
> This seems like it should be pretty simple, but it's not working. It works in
> regextester.com's tester, but it isn't working for me. Is there a simpler (or
> different) way to check for input length? (I presume that my "flavor" of
> regex interpreter isn't accepting this form...)
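The bare {} syntax is only available if your regular expression engine
supports extended or Perl-compatible regular expressions ("PCRE"); POSIX
basic regular expressions spell it \{50\}, and some older engines lack it
entirely.
An engine without it has no way to represent quantity, other than
"zero or more" (*) or, in extended syntax, "one or more" (+).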
To match at least 50 characters in a simple regular expression you would
need to enter 50 periods:
...................................................
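The two spellings of the length test can be compared side by side. This Python sketch assumes Python's `re`, which accepts the `{n}` interval; `at_least` and `at_least_dots` are hypothetical helper names. It also shows that "exceeds 50" needs a count of 51, not 50.

```python
import re

def at_least(n: int) -> str:
    """Interval form, for engines that support {n}."""
    return ".{%d}" % n

def at_least_dots(n: int) -> str:
    """Fallback for engines without intervals: n literal dots."""
    return "." * n

short, exact, longer = "x" * 49, "x" * 50, "x" * 51

# Unanchored, both forms succeed once the input holds 50 or more characters.
for pat in (at_least(50), at_least_dots(50)):
    print([bool(re.search(pat, s)) for s in (short, exact, longer)])

# "Exceeds 50" means 51 or more, so the count has to be one higher.
print(bool(re.search(at_least(51), exact)), bool(re.search(at_least(51), longer)))
```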
--
David Empson
dempson RemoveThis @actrix.gen.nz
External

Since: Dec 31, 2007 Posts: 12
|
(Msg. 10) Posted: Wed Jan 09, 2008 5:58 pm
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
> The {} syntax is only available if your regular expression engine
> supports Perl-compatible regular expressions ("PCRE").
>
> A simple regular expression has no way to represent quantity, other than
> "zero or more" (*) or "one or more" (+).
>
> To match at least 50 characters in a simple regular expression you would
> need to enter 50 periods:
>
> ..................................................
The author of the application within which I'm using RegEx to develop filters
tells me that the app uses this RegEx library:
<http://arglist.com/regex/>
This is what I found through testing:
Using "." (a single period) matches all input, regardless of length.
Apparently it is being interpreted equivalent to ".*"
Multiple uses of "." (ie, "......") are redundant.
This is interesting:
"Matches RE '.{1,50}' " matches nothing
"Does not match RE '.{1,50}' " matches all input
More tests:
[Ee] matches all input with at least 1 E or e.
[Ee]{1,3} matches nothing, although there's plenty of valid input.
[Ee][Ee][Ee] also fails to match anything.
Is my "flavor" of RE interpreter broken? I'd think that at least some of the
basic forms should be supported...
Ideas?
Thanks,
External

Since: Dec 31, 2007 Posts: 12
|
(Msg. 11) Posted: Thu Jan 10, 2008 10:25 am
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
> The author of the application within which I'm using RegEx to develop filters
> tells me that the app uses this RegEx library:
He further says that it is the regex(3) library that he has implemented.
"Basic" expressions.
It seems that the basic (referred to as "obsolete") REs are a subset of
"extended" REs.
Of all places, I found a list of basic expressions on Wikipedia:
<http://en.wikipedia.org/wiki/Regular_expression>
(scroll down to the heading "POSIX".)
My question is about the metacharacter ".". Using a single ".", shouldn't it
match input that consists of a single character, and not match anything with
more than one character?
When I use this metacharacter I'm getting matches for all input, regardless
of length.
Thanks.
External

Since: Jan 30, 2005 Posts: 6
|
(Msg. 12) Posted: Fri Jan 11, 2008 6:20 am
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
SparkyGuy wrote:
> My question is about the metacharacter ".". Using a single ".", shouldn't it
> match input that consists of a single character, and not match anything with
> more than one character?
Only if you anchor it, i.e. "^.$" will match lines containing only a
single character. Read this as "At the start of the input, match a
single character, which must be followed by the end of the line"
(newline or end-of-string). A "." alone will match anything, other than
an empty line. Some regex matchers have options to implicitly anchor
the regex but others don't.
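The difference anchoring makes is easy to see with a few inputs. A small Python sketch, assuming Python's `re.search`, which (like the unanchored matchers described here) looks for the pattern anywhere in the input; `samples` and `hits` are illustrative names.

```python
import re

samples = ["", "a", "ab", "abcde"]

def hits(pattern: str) -> list:
    """Report, for each sample, whether the pattern is found anywhere in it."""
    return [bool(re.search(pattern, s)) for s in samples]

print(hits("."))        # [False, True, True, True]
print(hits("^.$"))      # [False, True, False, False]
print(hits("^.....$"))  # [False, False, False, True]
```

The unanchored "." succeeds as soon as any one character is found, which is why it appears to match input of every length.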
Don't know if anyone has suggested this, but one way to achieve the
match you want (in the original post) is to sort the characters of the
field prior to matching. Then match against a simpler regex. The
sorting eliminates the complications of specifying a regex that can cope
with the arbitrary ordering of the input characters. Of course it may
not be the most efficient thing to do depending upon the nature of the
input (quantity, likelihood of match etc...) and a complex regex may be
better.
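The sort-then-match idea can be sketched as follows, assuming a setting where the field can be preprocessed in code before the regex is applied (here Python). `SORTED_PATTERN` and `matches_any_order` are hypothetical names, and the pattern assumes the two-n requirement from the original post.

```python
import re

# Letters in alphabetical order, because the input is sorted first.
SORTED_PATTERN = re.compile("i.*n.*n.*o.*t.*u")

def matches_any_order(field: str) -> bool:
    """Sort the lowercased characters, then apply one in-order regex."""
    normalized = "".join(sorted(field.lower()))
    return SORTED_PATTERN.search(normalized) is not None

print(matches_any_order("unt ioitni"))
print(matches_any_order("intuit"))
```

After sorting, every input presents its letters in the same order, so a single in-order pattern replaces the long list of alternatives.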
External

Since: Dec 31, 2007 Posts: 12
|
(Msg. 13) Posted: Fri Jan 11, 2008 6:20 am
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
> Only if you anchor it, i.e. "^.$" will match lines containing only a
> single character. Read this as "At the start of the input, match a
> single character, which must be followed by the end of the line"
> (newline or end-of-string). A "." alone will match anything, other than
> an empty line. Some regex matchers have options to implicitly anchor
> the regex but others don't.
Ah. Thanks! It now works to identify specific numbers of characters, such as:
^.....$ five characters
^..........$ ten characters, etc.
My goal is to select a range of numbers of characters, the general form of
which would be:
.{5,10}
In this limited set of supported expressions, however, the range
metacharacters must be escaped:
\{5,10\}
So how do I incorporate this with "."? I tried
^.\{5,10\}$
to no avail. Other permutations I can think of don't work either.
Ideas?
> Don't know if anyone has suggested this, but one way to achieve the
> match you want (in the original post) is to sort the characters of the
> field prior to matching. Then match against a simpler regex. The
> sorting eliminates the complications of specifying a regex that can cope
> with the arbitrary ordering of the input characters. Of course it may
> not be the most efficient thing to do depending upon the nature of the
> input (quantity, likelihood of match etc...) and a complex regex may be
> better.
I may have stated earlier (?) that I'm not working in a programming language,
but simply using RegEx to set up filters in an application that supports the
basic set of RegEx expressions. The application presents a single field
within which a single RegEx can be entered. In a single RegEx, can I sort and
match?
Thanks for your help. It is very much appreciated.
External

Since: Dec 31, 2007 Posts: 12
|
(Msg. 14) Posted: Fri Jan 11, 2008 6:20 am
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
It turns out that simple range expressions are not supported in this (very)
limited set of RegEx:
"\{m,n\} Matches the preceding element at least m and not more than n times.
For example, a\{3,5\} matches only "aaa", "aaaa", and "aaaaa". ***This is not
found in a few, older instances of regular expressions.***"
(Emphasis mine.)
<http://en.wikipedia.org/wiki/Regular_expression#POSIX>
External

Since: Jul 18, 2004 Posts: 548
|
(Msg. 15) Posted: Fri Jan 11, 2008 6:20 am
Post subject: Re: Help with RegEx? Archived from groups: per prev. post
SparkyGuy wrote:
> It turns out that simple range expressions are not supported in this (very)
> limited set of RegEx:
>
> "\{m,n\} Matches the preceding element at least m and not more than n times.
> For example, a\{3,5\} matches only "aaa", "aaaa", and "aaaaa". ***This is not
> found in a few, older instances of regular expressions.***"
> (Emphasis mine.)
That notation is specific to POSIX basic regular expressions. For extended
regular expressions (and "Perl Compatible Regular Expressions"), the curly
braces should NOT be preceded by a backslash. There, the backslash means
"ignore the special meaning of the next character and treat it as a normal
character" (or treat a normal character as a special character, such as \s
for whitespace).
The correct extended syntax for "match at least m but no more than n of any
character" is
.{m,n}
If you used this in an extended or Perl-compatible engine:
.\{m,n\}
it would mean "match any character, then a "{", then m, then a comma,
then n, then a "}".
If your regular expression engine only supports basic regular
expressions then the roles are reversed: a bare "{" or "}" is treated as a
normal character, and POSIX defines the escaped form \{m,n\} as the
repetition count, although some older implementations lack it.
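The Perl-style half of this can be tried directly. The sketch below assumes Python's `re`, which follows the Perl conventions; it does not implement POSIX basic syntax, so it says nothing about how a basic-only engine treats the escaped braces.

```python
import re

# Perl-style: bare braces are a repetition count.
print(bool(re.fullmatch(r".{5,10}", "abcdefg")))    # 7 characters, in range
print(bool(re.fullmatch(r".{5,10}", "abc")))        # 3 characters, too short

# Perl-style: escaped braces are literal characters.
print(bool(re.fullmatch(r".\{5,10\}", "x{5,10}")))  # one character, then "{5,10}"
print(bool(re.fullmatch(r".\{5,10\}", "abcdefg")))  # no literal braces present
```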
--
David Empson
dempson.TakeThisOut@actrix.gen.nz