Python: Regex to extract part of URL found between parentheses
I have this weirdly formatted URL. I have to extract the content in '()'.
Sample URL: http://sampleurl.com/(K(ThinkCode))/profile/view.aspx
If I can get ThinkCode out of it, I'll be a happy person! I'm having a hard time with regexing special characters like '(' and '/'.
>>> import re
>>> foo = re.compile(r"(?<=\(K\()[^\)]*")
>>> foo.findall(r"http://sampleurl.com/(K(ThinkCode))/profile/view.aspx")
['ThinkCode']
Explanation
In regex-world, a lookbehind is a way of saying "I want to match ham, but only if it's preceded by spam". We write this as (?<=spam)ham. So in this case, we want to match [^\)]*, but only when it's preceded by \(K\(.
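To see the lookbehind on its own, here is a minimal sketch using the ham/spam pattern from the explanation. The test strings are made up for illustration.

```python
import re

# (?<=spam)ham matches "ham" only when "spam" sits immediately before it.
pattern = re.compile(r"(?<=spam)ham")

with_spam = pattern.findall("spamham")
without_spam = pattern.findall("eggsham")
print(with_spam)     # ['ham']
print(without_spam)  # []

# The lookbehind text is checked but is not part of the match itself.
match = pattern.search("spamham")
print(match.group(0))  # ham
```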
Now, \(K\( is a nice, easy regex, because it's plain text! It means: match exactly the string (K(. Note that we have to escape the brackets (by putting \ in front of them), otherwise the regex parser would think they were part of the regex instead of characters to match!
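A short sketch of why the escaping matters, using the sample URL from the question. The variable names are only illustrative.

```python
import re

url = "http://sampleurl.com/(K(ThinkCode))/profile/view.aspx"

# Escaped: \( matches a literal "(" character.
literal = re.search(r"\(K\(", url)
print(literal.group(0))  # (K(

# Unescaped: each "(" opens a group, and here the groups are never closed,
# so the pattern does not even compile.
try:
    re.compile(r"(K(")
    compiled_ok = True
except re.error as exc:
    compiled_ok = False
    print("invalid pattern:", exc)

# re.escape adds the backslashes for you.
escaped = re.escape("(K(")
print(escaped)
```

If you build a pattern from text you don't control, re.escape is usually safer than escaping by hand.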
Finally, when you put something in square brackets in regex-world, it means "any of the characters in here is fine". If you put something inside square brackets where the first character is ^, it means "any character not in here is fine". So [^\)] means "any character that is not a right-bracket", and [^\)]* means "as many characters as possible that are not right-brackets". Putting it all together, (?<=\(K\()[^\)]* means "match as many characters as you can that are not right-brackets, preceded by the string (K(".
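The pieces can be tried one at a time. This is a small sketch where the short test strings are invented and the URL is the one from the question.

```python
import re

url = "http://sampleurl.com/(K(ThinkCode))/profile/view.aspx"

# [abc] accepts any one of the listed characters.
listed = re.findall(r"[abc]", "a-b-c-d")
print(listed)  # ['a', 'b', 'c']

# [^abc] accepts any one character that is NOT listed.
not_listed = re.findall(r"[^abc]", "abcd")
print(not_listed)  # ['d']

# [^\)]* grabs as many non-")" characters as it can, then stops.
prefix = re.match(r"[^\)]*", "ThinkCode))/profile").group(0)
print(prefix)  # ThinkCode

# Lookbehind plus negated class: the full pattern from the answer.
result = re.findall(r"(?<=\(K\()[^\)]*", url)
print(result)  # ['ThinkCode']
```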
Oh, one last thing. Because \ means something inside strings in Python as well as inside regexes, we use raw strings: r"spam" instead of just "spam". That tells Python to leave the \'s alone and pass them through to the regex engine.
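A quick sketch of the difference a raw string makes. The \b example is my own illustration, not something from the question.

```python
import re

# In a normal string "\n" is one newline character; in a raw string it is
# two characters, a backslash followed by "n".
lengths = (len("\n"), len(r"\n"))
print(lengths)  # (1, 2)

# Without the r prefix, every backslash has to be doubled.
same = r"\(K\(" == "\\(K\\("
print(same)  # True

# "\b" is a backspace character to Python, but a word boundary to the regex
# engine, so only the raw-string version finds the word.
raw_hits = re.findall(r"\bcat\b", "cat concat")
plain_hits = re.findall("\bcat\b", "cat concat")
print(raw_hits)    # ['cat']
print(plain_hits)  # []
```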
Another method
If lookbehind is a bit complicated for you, you can also use capturing groups. The idea behind them is that the regex matches a pattern, but can also remember subpatterns. This means that you don't have to worry about lookaround, because you can match the whole pattern and then just extract the subpattern inside it! To capture a group, just put it in parentheses: (foo) will capture foo as the first group. Then, use .groups() to spit out all the groups that you matched! This second method works as well.
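Here is a rough sketch of the capturing-group version for the sample URL. The exact pattern is my assumption of how one might write it; other spellings would work too.

```python
import re

url = "http://sampleurl.com/(K(ThinkCode))/profile/view.aspx"

# Match the whole "(K(...))" chunk, and capture only the inner part by
# wrapping it in unescaped parentheses.
match = re.search(r"\(K\(([^\)]*)\)\)", url)

if match:
    print(match.groups())  # ('ThinkCode',)
    print(match.group(1))  # ThinkCode
    print(match.group(0))  # (K(ThinkCode))
```

group(0) is always the whole match, and numbered groups start at 1, in the order their opening parentheses appear.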