English Deutsch Français Italiano Español Português 繁體中文 Bahasa Indonesia Tiếng Việt ภาษาไทย
All categories

To quickly find words in large volumes of text, we just enter the keyword in the "Find" feature of Notepad or a similar text editor. The function finds in the desired word, if its in text, in a fraction of a second.

It cant be a sequential word by word search..it would have takem too long. It cant be binary search. Then what is it?

Can anybody out there please tell me what is the algorithm that the function this "Find" feature uses?

2007-07-10 07:47:32 · 5 answers · asked by AJ 1 in Computers & Internet Programming & Design

5 answers

Notepad, MS Word, etc use what is called a Regular expression engine. There could be a book made about the algorithm used. Notepad uses a dumbed dowm version of the Regex engine but it allows you to search for simple word patterns: i.e. "Hello"

Try using MS word and open the find dialog box. You will see a checkbox for wildcards: check it.

type this into find:
[Hello] Notice the brackets.

This will find any instance of h, any instance of e, any instance of L, Etc....

Using regular expressions in MS word is a little limited but still it is there. Do a search in MS word for help on the find dialog box. You will be amazed at some of the simple things that you can do.

Remember the engine is actually looking for a pattern, not a word or character even though it seems like it is only trying to find simple words.

A little background:

Stephen Kleene invented Regular expressions in the 1950s.

Google regular expressions also to get a little background.
Dont dive too deep into how the patterns are used unless you are a programmer.

Hope this helps you

2007-07-10 10:31:36 · answer #1 · answered by ebred 3 · 0 1

It isn't character by character...the algorithms are smarter then that by now.

Look up the Knutt-Morris-Pratt algorithm for more details.

2007-07-10 08:02:15 · answer #2 · answered by aguynamedgeoff 1 · 0 0

It's a plain old character-by-character scan of the text. It doesn't take long -- computers are pretty fast these days.

2007-07-10 07:58:49 · answer #3 · answered by Anonymous · 0 0

I'm sure it is a Boyer-Moore algorithm or something derived from it.

2007-07-10 08:03:49 · answer #4 · answered by SWDude 2 · 0 0

Take a look at this link.

2007-07-10 08:02:52 · answer #5 · answered by AnalProgrammer 7 · 0 0

fedest.com, questions and answers