Jack J. Garzella

Email: jgarzellaucsd.edu but replace the "a" with an "@"

Why don't you just put my email on your website so I can click on it and/or copy-paste it?

Email-harvesting bots roam the web, looking for email addresses of real people (probably so the bot owners can send lots of spam emails). Rewriting one's email address so that a human can still read it but a bot cannot is quite common on academic websites. The fancy word for this is address munging.

What follows is a short demonstration of why it works:

How to make a simple email bot

You can make your own email bot! All the knowledge you need is contained in Automate the Boring Stuff with Python.

Here's what you need to do:

  1. Learn the Basics of Python (Chapters 1-6).

  2. Learn about Regular Expressions, and write one that matches valid email addresses (Chapter 7).

  3. Write a web scraper using Beautiful Soup that searches through web pages looking for emails (Chapter 12).

  4. Use an email sending module (Chapter 18) to send any email you want to the email addresses you found!

Now, you have a bot that, given a web page, searches for correctly formatted email addresses like "johndoe@example.com" and can send messages to those addresses.
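To see why munging defeats such a bot, here is a minimal sketch of the matching step only (no scraping, no sending). The pattern `EMAIL_RE`, the function `find_emails`, and the address "jdoeaexample.edu" are all made up for illustration; the pattern is deliberately simple and does not cover every valid address.

```python
import re

# A deliberately simple pattern: local part, "@", domain, dot, top-level domain.
# Real-world address syntax is more permissive than this (assumption: good
# enough for a demonstration).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text: str) -> list[str]:
    """Return every substring of text that looks like a plain email address."""
    return EMAIL_RE.findall(text)


plain = "Contact: johndoe@example.com"
# Hypothetical munged address, written in the same style as the one on this page.
munged = 'Email: jdoeaexample.edu but replace the "a" with an "@"'

print(find_emails(plain))   # ['johndoe@example.com']
print(find_emails(munged))  # []
```

The munged line does contain an "@", but it sits between quotation marks rather than between a name and a domain, so the pattern finds nothing.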

Fighting the bots

Ok, now let's try to modify the bot to scrape academic websites for emails.

First, modify the bot to search for both valid email addresses and emails that are formatted like my email.
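One way that first special case might look, as a rough sketch: take a token with no "@" in it, try each "a" as the hidden "@", and keep the guesses that look like addresses. The helper `unmunge_a_for_at` and the sample token "jdoeaexample.edu" are hypothetical, and the sketch assumes the bot has somehow already worked out which token on the page is the munged address.

```python
import re

# Same simplified address pattern as a basic harvesting bot might use (assumption).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def unmunge_a_for_at(token: str) -> list[str]:
    """Try every 'a' in token as the hidden '@'; keep guesses that look like emails."""
    candidates = []
    for i, ch in enumerate(token):
        if ch == "a":
            guess = token[:i] + "@" + token[i + 1:]
            if EMAIL_RE.fullmatch(guess):
                candidates.append(guess)
    return candidates


# Hypothetical munged token, not a real address.
print(unmunge_a_for_at("jdoeaexample.edu"))
# ['jdoe@example.edu', 'jdoeaex@mple.edu']
```

Even this single rule returns more than one plausible address, and it handles exactly one munging style; every other style needs its own hand-written rule.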

Now, make the bot search for email addresses formatted correctly, like mine, and like Sam Spiro's.

Now, make the bot search for email addresses formatted correctly, like mine, like Sam Spiro's, and like Aaron Bertram's.

Now, make the bot search for email addresses formatted correctly, like mine, like Sam Spiro's, like Aaron Bertram's, and like Jennifer Balakrishnan's.

Now, make the bot search for email addresses formatted correctly, like mine, like Sam Spiro's, like Aaron Bertram's, like Jennifer Balakrishnan's, and like Chenyang Xu's.

...and hopefully at this point you see how hard it would be to make a bot that just gets any researcher's email.