A PHP script that searches for key words in a chunk of text?

Question:

Josh Davis

2011-03-01 13:37:07 UTC

I'm new to PHP, so bear with me. I want to know how I'd go about writing a script that searches for keywords in a chunk of text if I told it what to search for. All I'm asking is whether this is possible and how I'd do it. I know basic PHP, HTML, CSS. Please help.

Three answers:

D.F.

2011-03-01 21:26:56 UTC

Yes, it is possible and there are several approaches you could follow.

As was already suggested, you could try to explode the chunk of text into an array, each element containing a single word, but I advise against it, as your array would be very large if you had a very long chunk of text.

I would use regular expressions, which are a very powerful tool and would make your life much easier. To give you an idea of what you could do, the following snippet of code looks for each keyword in the array within the first paragraph of "Lorem Ipsum":

$text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc quam dui, mattis consequat posuere non, egestas sit amet ante. Vivamus vel ante vitae ipsum fringilla malesuada quis vitae est. Nullam condimentum augue at mauris congue ac scelerisque justo posuere. Curabitur sed dui id sem dignissim tempor. Ut dolor libero, volutpat nec ornare nec, tempor non sem. Phasellus a tortor in diam volutpat euismod. Morbi lacinia pharetra libero, sed aliquam urna volutpat sit amet. Fusce felis augue, dictum sit amet tincidunt at, venenatis eget nunc. Praesent auctor lacinia dolor, sit amet viverra tortor suscipit ac. In hac habitasse platea dictumst. Donec mi magna, sodales blandit condimentum sit amet, iaculis tristique nisi.";

$keywords = array('Google', 'adipiscing', 'viva', 'curabitur', 'dictumst');

foreach ($keywords as $keyword)
{
    preg_match_all("/\b".$keyword."\b/i", $text, $matches);
    echo "Matches for $keyword:\n";
    print_r($matches);
    echo "\n\n";
}

If you run the code exactly as I listed it, you will get the following output:

Matches for Google:

Array ( [0] => Array ( ) )

Matches for adipiscing:

Array ( [0] => Array ( [0] => adipiscing ) )

Matches for viva:

Array ( [0] => Array ( ) )

Matches for curabitur:

Array ( [0] => Array ( [0] => Curabitur ) )

Matches for dictumst:

Array ( [0] => Array ( [0] => dictumst ) )

The essential part of this code is this line:

preg_match_all("/\b".$keyword."\b/i", $text, $matches);

If you have never used regular expressions before, you will probably want to do some reading on the subject. Basically, that line looks in $text for the pattern defined by "/\b".$keyword."\b/i". The / is one of the delimiters commonly used to enclose a pattern, whereas the i following the last / means that the search should be case insensitive. The \b before and after $keyword stands for "word boundary", which means that only a whole-word match is found, regardless of whether any punctuation immediately follows or precedes the keyword. In light of that, the output of my example can be explained as follows:

- keyword "Google": no matches, since the word doesn't appear in the text

- keyword "adipiscing": one match

- keyword "viva": no matches, even if the text contains "Vivamus" (if \b were removed, you'd obtain one match)

- keyword "curabitur": one match, even if it is spelled with an uppercase "C" in the text

- keyword "dictumst": one match, even if in the text it is immediately followed by a "." and thus appears as "dictumst."
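The same idea can be wrapped in a small function that returns a match count per keyword. This is only a sketch: the function name count_keyword_matches() is made up for illustration, and the call to preg_quote() is an addition that is not in the snippet above. It escapes regex metacharacters so that a keyword such as "a.b" or "a/b" is matched literally.

```php
<?php
// Hypothetical helper (name chosen for illustration): counts whole-word,
// case-insensitive occurrences of each keyword in $text.
function count_keyword_matches(string $text, array $keywords): array
{
    $counts = [];
    foreach ($keywords as $keyword) {
        // preg_quote() escapes regex metacharacters and the "/" delimiter,
        // so the keyword is treated as literal text inside the pattern.
        $pattern = '/\b' . preg_quote($keyword, '/') . '\b/i';
        // With the $matches argument omitted, preg_match_all() just
        // returns the number of full-pattern matches.
        $counts[$keyword] = preg_match_all($pattern, $text);
    }
    return $counts;
}

$sample = "Lorem ipsum dolor sit amet. Vivamus vel ante. Curabitur sed dui. In hac habitasse platea dictumst.";
$counts = count_keyword_matches($sample, ['Google', 'viva', 'curabitur', 'dictumst']);
print_r($counts);
```

With this shortened sample text, "Google" and "viva" should each count 0, while "curabitur" and "dictumst" should each count 1.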

If you decide to follow the approach of splitting the original text into an array of single words, please note that PHP strings have no split() method, so a call like $string.split("is") will not work.

You could use something like this:

$words = str_word_count($text, 2);

str_word_count() is a PHP function that would pretty much do the job: with 2 as its second argument, it gives you an array where each key is the numeric position of a word in the original text and each value is the word itself.
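As a rough illustration of what that array looks like and how it could be searched, here is a minimal sketch. The short sample text and the variable names are assumptions made for the example; lowercasing both sides is just one simple way to get a case-insensitive lookup.

```php
<?php
$sample = "Lorem ipsum dolor sit amet";

// Format 2 returns an associative array: character offset => word.
$words = str_word_count($sample, 2);
print_r($words);
// Array ( [0] => Lorem [6] => ipsum [12] => dolor [18] => sit [22] => amet )

// array_map() with a single array preserves the keys, so the offsets survive.
$lowerWords = array_map('strtolower', $words);

// array_keys() with a search value returns every key whose value matches,
// i.e. every offset at which the keyword occurs.
$positions = array_keys($lowerWords, 'dolor');
print_r($positions);
```

Here $positions should contain the single offset 12, the place where "dolor" starts in the sample text.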

Alternatively, you could also use something like this:

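$words = preg_split("/[\s,;:.]+/", $text, -1, PREG_SPLIT_NO_EMPTY);

This would split a chunk of text into words, using space characters and punctuation marks as delimiters. The PREG_SPLIT_NO_EMPTY flag drops the empty element you would otherwise get at the end, since the text finishes with a ".".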
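Once the text is an array of words, checking which keywords occur is a set intersection. The following is a minimal sketch under a few assumptions: the sample text is shortened, the keywords are written in lowercase, and PREG_SPLIT_NO_EMPTY is passed so that the trailing "." does not leave an empty element.

```php
<?php
$sample = "Fusce felis augue, dictum sit amet; In hac habitasse platea dictumst.";

// Split on runs of whitespace and punctuation, discarding empty pieces.
$words = preg_split("/[\s,;:.]+/", $sample, -1, PREG_SPLIT_NO_EMPTY);

// Lowercase every word so the comparison is case insensitive.
$lowerWords = array_map('strtolower', $words);

$keywords = ['google', 'dictumst', 'felis'];

// array_intersect() keeps the keywords that also appear among the words;
// array_values() renumbers the result from 0.
$found = array_values(array_intersect($keywords, $lowerWords));
print_r($found);
```

For this sample, $found should hold "dictumst" and "felis". Note that this only finds whole words, much like the \b version above, and it holds the entire word list in memory at once.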

2016-09-09 10:36:28 UTC

If you're talking about the Yahoo toolbar: click on the pencil and go to clear search history! To delete recent searches from popping up in the drop-down menu on a search page: go to the search page, click where you would type, then push the down arrow on your keyboard. As you move your pointer over a word it is highlighted. Highlight the word in the list, then hold down the Delete key on your keyboard. To turn this feature off in the Yahoo toolbar: click the pencil icon, go to Toolbar Options, and under Search Box uncheck the first two. To turn this feature off in IE: open an IE page, go to Tools at the top, then Internet Options, click the Content tab, click the AutoComplete button, and uncheck everything there. Be sure to come back and vote for the best answer!

2011-03-01 13:45:43 UTC

A simple and stupid way of doing so is splitting the string with explode().

$string = "this is a string";
// explode() cuts the string at every occurrence of "is"; the delimiter itself is dropped
$tmp = explode("is", $string);
// so:
// $tmp[0] = "th"
// $tmp[1] = " "
// $tmp[2] = " a string"
// you can also replace easily like so:
$string = implode("IS", explode("is", $string));
// $string is now "thIS IS a string"

Useful little feature, arrays are the best when you know how to use them!!!

P.S. My syntax might be wrong; it's been a while since I have done PHP. If you are looking at making a homemade search engine, make sure you lowercase everything and simplify the characters: é=e, ê=e, E=e, etc.

Good luck!
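The normalisation advice in the P.S. (lowercase everything, fold accented letters to plain ones) could be sketched roughly like this. The function name normalize_for_search() and the mapping table are made up for the example, the table covers only a handful of characters, and the code assumes both the script file and the input text are UTF-8.

```php
<?php
// Hypothetical helper: folds a few accented letters to plain ASCII and
// lowercases the result. The mapping is a small illustrative sample only.
function normalize_for_search(string $text): string
{
    $map = [
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ë' => 'e',
        'É' => 'e', 'È' => 'e', 'Ê' => 'e', 'Ë' => 'e',
        'à' => 'a', 'â' => 'a', 'À' => 'a', 'Â' => 'a',
    ];
    // strtr() with an array replaces every key by its value in one pass.
    $folded = strtr($text, $map);
    // The letters that remain to be lowercased are plain ASCII.
    return strtolower($folded);
}

echo normalize_for_search("Élève ÊTRE Café"), "\n"; // prints "eleve etre cafe"
```

Running both the text and the keywords through the same function before comparing them means "Café", "CAFE" and "cafe" would all be treated as the same word.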


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.
