CodingExperiments.com

Linux, PHP, and the blogosphere

Sir, Please Step Away from the Regex. When to Use Regular Expressions in Code.

June 4th, 2008 by Rishabh Mishra

UPDATE: As mentioned in the comments, the example regex was not written properly. This has now been corrected.

Introduction

Regular expressions are several things. They are:

  • powerful.
  • useful.
  • sometimes overkill.

For beginning programmers, a difficult part is knowing when to use a regular expression in code. Sometimes, whipping up something with string functions is a better idea. You’re a smart coder if you can tell when you should use regular expressions and when you should not.

Signs that a regular expression is too much

1) Your regular expression is small and easy to understand

If you just made a really small, simple regular expression, it is possible that something similar could be achieved by other means. Below is an (almost extreme) example of this symptom.

/^3.*/

The above regular expression detects when a string starts with “3”. If you aren’t groaning at that regular expression, I would suggest that you read the documentation for the string functions in whatever languages you program in.

2) Diverse types of strings match your regular expression

The above example of “/^3.*/” fits this one too. The strings that start with “3” form a very diverse group. Regular expressions were designed to match specific types of patterns. Now, a very general regular expression has its place in non-programming areas. A great example of this is find and replace.

3) You only need to check the start of a string

There are plenty of ways in programming languages to check the first few characters of a string. Our example of “/^3.*/” fits this again. You could simply write code that looks for the “3”.
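To make the comparison concrete, here is a small sketch in Python. The choice of language and the two function names are assumptions made for illustration; the same idea applies to the string functions of whatever language you use.

```python
import re


def starts_with_three_regex(text):
    # The regex from this post: anchor at the start, then "3", then anything.
    return re.match(r"^3.*", text) is not None


def starts_with_three_plain(text):
    # The same check with a plain string method; there is no pattern to decode.
    return text.startswith("3")


# Both functions agree on every sample, so the regex buys nothing here.
for sample in ["3 blind mice", "3", "mice 3", ""]:
    assert starts_with_three_regex(sample) == starts_with_three_plain(sample)
```

The plain version says what it does in its own name, which is the whole point of this section.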

Signs that a regular expression is not enough

1) Regular expressions aren’t robust enough for the task

Ever try to use regular expressions for XML parsing? Have you ever wanted to use regular expressions for XML parsing?
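Here is a rough sketch of how that goes wrong, assuming Python and its standard xml.etree.ElementTree module. The sample document and the variable names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; the second item contains a nested element.
document = "<menu><item>Soup</item><item>Salad <b>(new)</b></item></menu>"

# Naive regex: assumes an item never contains another tag.
naive = re.findall(r"<item>([^<]*)</item>", document)

# A real parser understands nesting, so it sees both items in full.
root = ET.fromstring(document)
parsed = ["".join(item.itertext()) for item in root.iter("item")]

print(naive)   # the regex silently skips the second item
print(parsed)  # the parser returns the text of both items
```

The regex could be patched to handle this one case, but every new wrinkle (attributes, comments, deeper nesting) needs another patch, while the parser already handles them.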

2) The regular expression is more complex than any manual parsing

I remember once looking through all sorts of regex tutorials, books, and so forth to find out how to do a certain task. That task could have been done more easily with manual parsing.

Signs that a regular expression is perfect

1) You have to match something very specific.

Two great examples of this are using regular expressions to parse URLs and email addresses. For a URL, you can’t just write parsing code that looks for “http://” or “https://” at the beginning of a string. That would allow “http://$^%” to get through.
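As a minimal sketch of the difference, assuming Python, the pattern below is deliberately simplified: it only accepts a scheme, dot-separated host labels, an optional port, and an optional path. Real URLs allow far more than this, and the names URL_PATTERN, looks_like_url, and prefix_check are hypothetical.

```python
import re

# Simplified on purpose; this is not a complete URL grammar.
URL_PATTERN = re.compile(
    r"https?://"                                             # scheme
    r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"              # first host label
    r"(?:\.[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)*"       # further labels
    r"(?::\d+)?"                                             # optional port
    r"(?:/\S*)?"                                             # optional path
)


def looks_like_url(text):
    # fullmatch requires the whole string to fit the pattern.
    return URL_PATTERN.fullmatch(text) is not None


def prefix_check(text):
    # The "just look at the beginning" approach described above.
    return text.startswith(("http://", "https://"))


print(prefix_check("http://$^%"))    # the prefix check lets this through
print(looks_like_url("http://$^%"))  # the pattern rejects it
```

Writing the same host and path rules by hand with string functions would take noticeably more code than the pattern does.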

2) You would have to write, like, a billion lines of parsing code otherwise.

If regular expressions look like they will save you major development time and headaches, go for it.

3) Everybody else is doing it.

Email addresses and URLs are examples for this one too. Everybody uses regular expressions in this situation because it’s the best way to do it. If you’re unsure whether to use a regular expression, see if an experienced developer would do so.

Conclusion

Regular expressions have their place, but you have to make sure that you aren’t using them for tasks that are far too simple, or when there’s a smarter way to do the job.

Credits: Thanks to i80and for some of the tips.


Posted in Uncategorized

  • steve
    If you are going to write an article criticizing regex overuse, you might want a regex that matches your description.

    /^3.+/ does not simply mean 'start with 3'. It means 'start with 3 and one or more additional characters'.

    /^3.*/ is probably what you wanted.

    + means 'match the preceding element one or more times' whereas * means 'match the preceding element zero or more times'
  • Rishabh Mishra (possible248)
    Oh dear, you are correct.

    My apologies for the obvious sloppiness in writing the post. Much thanks given to you. I have corrected the post to reflect this.