Strip email addresses from HTML or Text with C#.

This code shows how to extract email addresses from text and HTML documents. Some day you may need to pull email addresses out of text; why you would want to do that I leave to you, but hopefully not for spam. With some simple modifications to the pattern, the same code can also find phone numbers, web addresses and other patterned text. Spammers use methods like this to mine data and harvest email addresses from websites, which goes to show that you should obfuscate your email address on any website where you post it. Writing it as something like myName-at-isp.com can help deter data miners.

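using System;
using System.Net;
using System.Text.RegularExpressions;

namespace StripEmailAddresses
{
   /// <summary>
   /// Extracts email addresses from text or HTML.
   /// </summary>
   public class StripEmail
   {
      public StripEmail()
      {
      }

      public string stripEmails(string url)
      {
         // To scan a live page, uncomment these two lines and pass htmlData to Matches() below.
         //WebClient wc1 = new WebClient();
         //string htmlData = wc1.DownloadString(url);
         string emails = "";
         // Verbatim string (@"...") so that \b and \. reach the regex engine unchanged.
         // IgnoreCase is needed because the character classes only list A-Z.
         Regex re = new Regex(@"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", RegexOptions.IgnoreCase);
         // Hardcoded sample input for testing.
         MatchCollection mc = re.Matches("jim@whatever.com\r\nxxx@1234.com\r\n\r\n");
         foreach (Match m in mc)
         {
            // Put each address on its own line so they do not run together.
            emails += m.Value + Environment.NewLine;
         }

         return emails;
      }
   }
}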
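
The "simple modifications" mentioned above mostly come down to swapping the pattern. Here is a rough sketch of that idea, not part of the original code: the PatternExtractor class, its Extract method, and the URL and phone patterns are made up for illustration and are far looser than real-world data needs. The sample text also includes an address written as myName-at-isp.com to show that the email pattern skips it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names and the URL/phone patterns are assumptions.
public static class PatternExtractor
{
    // Same email pattern as in the post.
    public const string EmailPattern = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";
    // Simplified: http/https followed by anything up to whitespace, a quote, or an angle bracket.
    public const string UrlPattern = @"\bhttps?://[^\s""'<>]+";
    // Simplified: US-style numbers such as 555-123-4567 or 555.123.4567.
    public const string UsPhonePattern = @"\b\d{3}[-.]\d{3}[-.]\d{4}\b";

    // Returns every match of the pattern in the text, in order of appearance.
    public static List<string> Extract(string text, string pattern)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
        {
            results.Add(m.Value);
        }
        return results;
    }
}

public static class Program
{
    public static void Main()
    {
        string sample = "Mail jim@whatever.com or myName-at-isp.com, " +
                        "call 555-123-4567, see http://example.com/contact";

        // Only jim@whatever.com is found; the "-at-" form has no @ and is skipped.
        Console.WriteLine(string.Join(", ", PatternExtractor.Extract(sample, PatternExtractor.EmailPattern)));
        Console.WriteLine(string.Join(", ", PatternExtractor.Extract(sample, PatternExtractor.UsPhonePattern)));
        Console.WriteLine(string.Join(", ", PatternExtractor.Extract(sample, PatternExtractor.UrlPattern)));
    }
}
```

Returning a list instead of one concatenated string leaves it up to the caller how to separate or de-duplicate the results.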



3 Comments »

  1. hey!!! any emails??

    Comment by dex — January 11, 2010 @ 6:13 am

  2. It should pick up any properly linked email address or regular email address in text.

    Comment by Jimmy — January 11, 2010 @ 1:02 pm

  3. Pretty interesting code. I think it will be helpful for my juniors who are still getting their programming concepts cleared.

    Comment by Rohit Sane — February 7, 2010 @ 10:46 am
