See Regular Expressions for more information about Perl's pattern matching abilities.
This originally came about because I needed to replace a chunk of HTML with another, smaller chunk. Instead of hardcoding a menu into each page, we could use PHP includes to pull the menu code in from a separate file. That gives us one place to edit the menu for future changes, instead of editing each individual HTML page.
So I wanted to find this piece of code:
<!-- left column begins -->
<div id="leftcol">
<ul>
<li><a href="../life/services.html">Campus Services</a></li>
<li><a href="../main/calendar.html">Graduate Calendar</a></li>
<li><a href="../gradcatalog/">Catalog</a></li>
<li><a href="../funding/">Costs/Funding</a></li>
<li><a href="../about/diversity.html">Diversity Initiatives</a></li>
<li><a href="../procedures/forms.html">Forms</a></li>
<li><a href="../contact/staff.html">Graduate School Staff</a></li>
<li><a href="../contact/gpd.html">Graduate Program Directors/Coordinators</a></li>
<li><a href="../life/">Grad Life</a></li>
<li><a href="../requirements/">Graduation Requirements</a></li>
<li><a href="http://www.umbc.edu/gsa/">GSA</a></li>
<li><a href="../life/faqs.html">FAQs</a></li>
<li><a href="http://www.umbc.edu/oir/cgi-bin/rws3.pl?FORM=UMBC_IncomingGraduateSurvey">New Student Survey</a></li>
</ul>
</div>
<!-- left column ends -->
And replace it with this code:
<!-- left column begins -->
<?php include("../includes/left_menu_life.php"); ?>
<!-- left column ends -->
Note: ideas originated from http://www.noctilucent.org/blog/archives/2003/12/replacing_large.html
generateRegEx.pl
#! /usr/bin/perl -w
use strict;

print "s%\n";
while (<>) {
    # escape any regex meta-chars
    s/([].[\\^#|\$%*+?(){}])/\\$1/g;
    # match trailing whitespace (incl. newlines) on non-empty lines
    s/(.)$/$1\\s+/;
    # match any internal whitespace
    s/(\S)[ \t]+/$1\\s+/g;
    print $_;
}
print <<EOT;
%PUT REPLACEMENT TEXT HERE
%six
EOT
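To see what the generator does to a single line, here is a minimal sketch that wraps its three substitutions in a helper. The name line_to_pattern is made up for this illustration and is not part of the original script. The sample line is taken from the menu above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in generateRegEx.pl): applies the generator's
# three substitutions to one input line and returns the pattern fragment.
sub line_to_pattern {
    my ($line) = @_;
    # escape regex meta-chars, plus '#' (comment char under /x) and '%' (the delimiter)
    $line =~ s/([].[\\^#|\$%*+?(){}])/\\$1/g;
    # require whitespace after the last character of a non-empty line
    $line =~ s/(.)$/$1\\s+/;
    # let a run of spaces or tabs inside the line match any whitespace
    $line =~ s/(\S)[ \t]+/$1\\s+/g;
    return $line;
}

# Sample line from the menu being replaced.
my $sample = qq{<li><a href="../life/">Grad Life</a></li>\n};
print line_to_pattern($sample);
# prints: <li><a\s+href="\.\./life/">Grad\s+Life</a></li>\s+
```

Because the finished substitution uses the x flag, the literal newlines between generated lines are ignored. Only the \s+ pieces decide how much whitespace must be present, which is why differences in indentation or line endings between pages do not break the match.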
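1. Save the code you want to find in a file (find.html in this example).
2. Run generateRegEx.pl to create a script called substitution.pl:
$ perl generateRegEx.pl < find.html > substitution.pl
3. Edit substitution.pl to put in the replacement text, and optionally to allow for wild-card character matching via (.*) or (.*?).
4. Test the substitution on a single file. The -0777 switch makes Perl read the whole file as one string, so the pattern can span lines:
$ perl -p -0777 substitution.pl < file01.html | less
5. Run it on all the files. The -i.bak switch edits each file in place and keeps the original as a .bak copy:
$ perl -p -0777 -i.bak substitution.pl file*.html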
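As a rough illustration of what an edited substitution.pl does, the sketch below applies a shortened, hand-written pattern to a page held in a string, which mimics what -0777 gives you on the command line. The helper name replace_menu and the sample page are assumptions made for this example. The (.*?) wildcard stands in for the long run of menu items.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: swaps the hard-coded menu for the PHP include.
# The pattern is a shortened stand-in for a generated substitution.pl.
sub replace_menu {
    my ($html) = @_;
    # s = '.' matches newlines, i = ignore case, x = ignore whitespace in the pattern
    $html =~ s%
<!--\s+left\s+column\s+begins\s+-->\s+
<div\s+id="leftcol">\s+
<ul>\s+
(.*?)
</ul>\s+
</div>\s+
<!--\s+left\s+column\s+ends\s+-->\s+
%<!-- left column begins -->
<?php include("../includes/left_menu_life.php"); ?>
<!-- left column ends -->
%six;
    return $html;
}

# Made-up sample page with a two-item menu.
my $page = <<'HTML';
<body>
<!-- left column begins -->
<div id="leftcol">
<ul>
<li><a href="../life/">Grad Life</a></li>
<li><a href="../life/faqs.html">FAQs</a></li>
</ul>
</div>
<!-- left column ends -->
<p>Page content</p>
</body>
HTML

print replace_menu($page);
```

The non-greedy (.*?) stops at the first </ul> it can, so it is the safer choice when a page contains more than one list. A greedy (.*) would run to the last </ul> that still lets the rest of the pattern match.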