Build Your Own Database Driven Web Site Using PHP & MySQL

Chapter 8
Content Formatting with Regular Expressions

We’re almost there! We’ve designed a database to store jokes, organized them into categories, and tracked their authors. We’ve learned how to create a web page that displays this library of jokes to site visitors. We’ve even developed a set of web pages that a site administrator can use to manage the joke library without having to know anything about databases.

In so doing, we’ve built a site that frees the resident webmaster from continually having to plug new content into tired HTML page templates, and from maintaining an unmanageable mass of HTML files. The HTML is now kept completely separate from the data it displays. If you want to redesign the site, you simply have to make the changes to the HTML contained in the PHP templates that you’ve constructed. A change to one file (for example, modifying the footer) is immediately reflected in the page layouts of all pages in the site. Only one task still requires the knowledge of HTML: content formatting.

On any but the simplest of web sites, it will be necessary to allow content (in our case study, jokes) to include some sort of formatting. In a simple case, this might merely be the ability to break text into paragraphs. Often, however, content providers will expect facilities such as bold or italic text, hyperlinks, and so on.

Supporting these requirements with our current code is deceptively easy. In the past couple of chapters, we’ve used htmlout to output user-submitted content, as in chapter6/jokes-helpers/jokes.html.php. If, instead, we just echo out the raw content pulled from the database, we can enable administrators to include formatting in the form of HTML code in the joke text. Following this simple change, a site administrator could include HTML tags that would have their usual effect on the joke text when inserted into a page.

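
To make the difference between the two output styles concrete, here is a minimal sketch. It assumes that htmlout is a thin wrapper around PHP’s htmlspecialchars function, and the $joke array is a hypothetical stand-in for a row fetched from the database; neither is taken from the chapter’s own listings.

```php
<?php
// Assumed helper, modeled on the htmlout function the text mentions:
// it escapes HTML special characters before printing.
function htmlout($text)
{
    echo htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Hypothetical joke row, standing in for a record fetched from the database.
$joke = array('text' => 'Why did the <b>chicken</b> cross the road?');

// Escaped output: the browser would display the tags as literal text.
ob_start();
htmlout($joke['text']);
$escaped = ob_get_clean();

// Raw output: the browser would interpret the tags as markup.
ob_start();
echo $joke['text'];
$raw = ob_get_clean();

echo $escaped, "\n";
echo $raw, "\n";
```

The escaped version turns each angle bracket into an HTML entity, so the tags are shown rather than obeyed; the raw version passes them straight through to the page.
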
But is this really what we want? Left unchecked, content providers can do a lot of damage by including HTML code in the content they add to your site’s database. Particularly if your system will be enabling non-technical users to submit content, you’ll find that invalid, obsolete, and otherwise inappropriate code will gradually infest the pristine web site you set out to build. With one stray tag, a well-meaning user could tear apart the layout of your site.

In this chapter, you’ll learn about several new PHP functions that specialize in finding and replacing patterns of text in your site’s content. I’ll show you how to use these capabilities to provide for your users a simpler markup language that’s better suited to content formatting. By the time we’ve finished, we’ll have completed a content management system that anyone with a web browser can use, with no knowledge of HTML required.

Regular Expressions

To implement our own markup language, we’ll have to write some PHP code to spot our custom tags in the text of jokes and replace them with their HTML equivalents. For tackling this sort of task, PHP includes extensive support for regular expressions. A regular expression is a string of text that describes a pattern that may occur in text content like our jokes.

The language of regular expressions is cryptic enough that, once you master it, you may feel as if you’re able to weave magical incantations with the code that you write. To begin with, however, let’s start with some very simple regular expressions.

This is a regular expression that searches for the text “PHP” (without the quotes):

/PHP/

Fairly simple, you would say? It’s the text for which you want to search surrounded by a pair of matching delimiters. Traditionally, slashes (/) are used as regular expression delimiters, but another common choice is the hash character (#). You can actually use any character as a delimiter except letters, numbers, backslashes (\), or whitespace.
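
The choice of delimiter can be tried out with a short script. This is only a sketch: the sample string and variable names are made up for illustration, and the point about escaping slashes inside a slash-delimited pattern goes slightly beyond what the text above states.

```php
<?php
// Hypothetical sample text for this sketch.
$text = 'Visit http://php.net/ to learn PHP.';

// The same pattern written with two different delimiters.
$withSlashes = preg_match('/PHP/', $text);
$withHashes  = preg_match('#PHP#', $text);

// When the pattern itself contains slashes, hash delimiters avoid
// having to escape each slash with a backslash.
$escapedSlashes = preg_match('/http:\/\//', $text);
$hashDelimited  = preg_match('#http://#', $text);

var_dump($withSlashes, $withHashes, $escapedSlashes, $hashDelimited);
```

All four calls describe patterns that occur in the sample text, so each preg_match call returns 1. The hash-delimited form is simply easier to read when the pattern is full of slashes, as with URLs.
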
I’ll use slashes for all the regular expressions in this chapter.

To use a regular expression, you must be familiar with the regular expression functions available in PHP. preg_match is the most basic, and can be used to determine whether a regular expression is matched by a particular text string. Consider the code in chapter8/preg_match1/index.php.

In this example, the regular expression finds a match because the string stored in the variable $text contains “PHP”. This example will therefore output the message shown in Figure 8.1 (note that the single quotes around the strings in the code prevent PHP from filling in the value of the variable $text).

Figure 8.1. The regular expression finds a match

By default, regular expressions are case sensitive; that is, lowercase characters in the expression only match lowercase characters in the string, and uppercase characters only match uppercase characters. If you want to perform a case-insensitive search instead, you can use a pattern modifier to make the regular expression ignore case.

Pattern modifiers are single-character flags following the ending delimiter of the expression. The modifier for performing a case-insensitive match is i. So while /PHP/ will only match strings that contain “PHP”, /PHP/i will match strings that contain “PHP”, “php”, or even “pHp”.

The example in chapter8/preg_match2/index.php illustrates this. Again, as shown in Figure 8.2, this outputs the same message, despite the string actually containing “Php”.

Figure 8.2. No need to be picky …

Regular expressions are almost a programming language unto themselves. A dazzling variety of characters have a special significance when they appear in a regular expression. Using these special characters, you can describe in great detail the pattern of characters for which a PHP function like preg_match will search.

When you first encounter it, regular expression syntax can be downright confusing and difficult to remember, so if you intend to make extensive use of it, a good reference might come in handy. The PHP Manual includes a very decent regular expression reference (http://php.net/manual/en/regexp.reference.php).

Let’s work our way through a few examples to learn the basic regular expression syntax. First of all, a caret (^) may be used to indicate the start of the string, while a dollar sign ($) is used to indicate its end:

/PHP/      Matches “PHP rules!” and “What is PHP?”
/^PHP/     Matches “PHP rules!” but not “What is PHP?”
/PHP$/     Matches “I love PHP” but not “What is PHP?”
/^PHP$/    Matches “PHP” but nothing else.

Obviously, you may sometimes want to use ^, $, or other special characters to represent the corresponding character in the search string, rather than the special meaning ascribed to these characters in regular expression syntax. To remove the special meaning of a character, prefix it with a backslash:

/\$\$\$/   Matches “Show me the $$$!” but not “$10”.

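
The patterns covered so far can be checked with a short script. Treat it as a rough sketch rather than one of the chapter’s own listings: the matches helper, the $results array, and the exact message strings are hypothetical names and values introduced here for illustration.

```php
<?php
// Hypothetical helper for this sketch: true if $pattern matches $subject.
function matches($pattern, $subject)
{
    return preg_match($pattern, $subject) === 1;
}

// Basic test with preg_match. The single-quoted message keeps the
// literal characters $text instead of inserting the variable's value.
$text = 'PHP rules!';
if (preg_match('/PHP/', $text)) {
    $output = '$text contains the string "PHP".';
} else {
    $output = '$text does not contain the string "PHP".';
}
echo $output, "\n";

$results = array(
    // Case sensitivity and the i modifier.
    'case-sensitive'   => matches('/PHP/', 'Php rules!'),
    'case-insensitive' => matches('/PHP/i', 'Php rules!'),
    // Anchors: ^ marks the start of the string, $ marks its end.
    'start, present'   => matches('/^PHP/', 'PHP rules!'),
    'start, absent'    => matches('/^PHP/', 'What is PHP?'),
    'end, present'     => matches('/PHP$/', 'I love PHP'),
    'end, absent'      => matches('/PHP$/', 'What is PHP?'),
    'whole string'     => matches('/^PHP$/', 'PHP'),
    // Escaping: \$ stands for a literal dollar sign.
    'three dollars'    => matches('/\$\$\$/', 'Show me the $$$!'),
    'one dollar'       => matches('/\$\$\$/', '$10'),
);

foreach ($results as $label => $matched) {
    echo $label, ': ', $matched ? 'match' : 'no match', "\n";
}
```

Run from the command line, the script should report a match for exactly those cases the text describes as matching. Note that the patterns are written in single quotes, so PHP passes the backslashes and dollar signs through to the regular expression untouched.
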