eDesign.nl

Textual difference detector

Posted by Jurgen in Algorithms, Text processing on May 7th, 2009

Today I uploaded my textual difference detector to the eDesign examples. It is an example application that demonstrates how the Levenshtein algorithm can be applied to detect differences between two versions of the same text. The ‘Find the differences’ post has also been updated with a link to this example.

This example takes two texts as input and outputs one merged text in which deletions and additions are marked. Take a look and feel free to download the source code. The download also includes the Levenshtein algorithm source code.
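A minimal sketch of the idea, not the source code of the example itself: it compares the two texts word by word, fills a Levenshtein distance table and walks back through it to produce one merged text. The function name `diff_words` and the `[-deleted-]` and `{+added+}` markers are assumptions chosen for illustration.

```python
def diff_words(old: str, new: str) -> str:
    """Merge two texts into one, marking deleted and added words."""
    a, b = old.split(), new.split()

    # dist[i][j] = Levenshtein distance between a[:i] and b[:j], counted in words.
    dist = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dist[i][0] = i
    for j in range(len(b) + 1):
        dist[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a word from the old text
                dist[i][j - 1] + 1,         # add a word from the new text
                dist[i - 1][j - 1] + cost,  # keep or replace a word
            )

    # Walk back from the bottom-right corner; the output is built in reverse.
    out = []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            # A replacement is shown as a deletion followed by an addition.
            out.append("{+" + b[j - 1] + "+}")
            out.append("[-" + a[i - 1] + "-]")
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            out.append("[-" + a[i - 1] + "-]")
            i -= 1
        else:
            out.append("{+" + b[j - 1] + "+}")
            j -= 1
    return " ".join(reversed(out))


print(diff_words("the quick brown fox", "the quick red fox jumps"))
# the quick [-brown-] {+red+} fox {+jumps+}
```

Working on words instead of characters keeps the merged text readable. The table needs memory proportional to the product of both word counts, so this sketch suits short texts only.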

No Comments

Challenge Hash

Posted by Jurgen in Algorithms, Security on May 5th, 2009

The Internet is a crowd and everybody in it can potentially hear what you say. Methods have been developed to prevent this and to ensure identity, integrity and authenticity. Often these three can be seen as properties of encryption. Encryption implies the possibility of decryption, and passwords are precious things you don’t want others to decrypt and read. With a technique called challenge hashing you don’t need to worry about that. Challenge hashing is used to verify, on site B, a password sent from site A without sending the password in plain text. This article covers how. Read the rest of this entry »
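The general challenge-response idea can be sketched as follows. This is an assumption about the shape of such a scheme, not necessarily the one the full article describes: site B issues a random challenge, site A answers with a keyed hash (HMAC-SHA256 here) of that challenge, and site B recomputes the answer from the password it already knows. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Site B: create a fresh random challenge for this login attempt."""
    return secrets.token_bytes(16)


def respond(password: str, challenge: bytes) -> str:
    """Site A: hash the challenge with the password as key; only this is sent."""
    return hmac.new(password.encode("utf-8"), challenge, hashlib.sha256).hexdigest()


def verify(known_password: str, challenge: bytes, response: str) -> bool:
    """Site B: recompute the expected response and compare in constant time."""
    expected = respond(known_password, challenge)
    return hmac.compare_digest(expected, response)


challenge = make_challenge()              # B -> A
response = respond("s3cret", challenge)   # A -> B, the password never travels
print(verify("s3cret", challenge, response))
```

Because every attempt gets a new challenge, an overheard response cannot simply be replayed later. The sketch assumes site B can reproduce the secret that site A hashes with.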

1 Comment

Character entities

Posted by Jurgen in Character Encoding, Web standards on May 4th, 2009

As in real life, the characters that make up written language differ from system to system. Ελληνικά characters differ from Русский, 汉语 and Latin characters. Fortunately these character sets have been standardized and called alphabets. The same goes for character sets in the digital world. As computers can only process binary data, every character is mapped to a number. In the early days such a mapping of the Latin alphabet, along with digits, some other graphical ‘characters’ and control characters (e.g. escape, tab, line feed, carriage return), was standardized. This standard is known as the American Standard Code for Information Interchange (ASCII) and was developed by the American Standards Association (currently: ANSI). This 7-bit encoding lacked representations for many characters, for example those of the Greek, Russian and Chinese scripts mentioned above, and accented letters such as å, è, ï, ó and û were not in the set either. But as you can see in this paragraph, improvements have been made to accommodate such ‘special’ characters. Read the rest of this entry »
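The mapping from characters to numbers, and the limit of a 7-bit set, can be illustrated with a short sketch. The helper names are hypothetical, and UTF-8 is used here only as one example of a later encoding; the excerpt itself does not name it.

```python
def code_points(text: str) -> list:
    """Map every character to the number it is stored as (its code point)."""
    return [ord(ch) for ch in text]


def fits_in_ascii(ch: str) -> bool:
    """7 bits give 2**7 = 128 values, so only code points 0..127 fit."""
    return ord(ch) < 2 ** 7


def utf8_hex(ch: str) -> str:
    """Show the bytes a character occupies in UTF-8 as hexadecimal."""
    return " ".join(f"{byte:02x}" for byte in ch.encode("utf-8"))


for ch in "Aè":
    print(ch, code_points(ch), fits_in_ascii(ch), utf8_hex(ch))
# A [65] True 41
# è [232] False c3 a8
```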

No Comments

Sudoku Logic – part I

Posted by Jurgen in Algorithms on May 1st, 2009

If you haven’t heard of Sudoku puzzles (数独, sūdoku) you’ve either been sleeping under a rock or been space traveling for quite a while. These 9×9 square puzzles, originating from around 1900, became an international hit in 2005. Sudokus appear in newspapers, online and in special sudoku puzzle books all over the world. And as if that were not enough, Sudoku TV shows and all kinds of variants of the puzzle have been made. One can solve a sudoku using logic only. Because of this, computational algorithms to solve every possible Sudoku must exist. This is part one in the series on such algorithms. Read the rest of this entry »

1 Comment

Security basics

Posted by Jurgen in Security on April 24th, 2009

Security is an issue on every level of communication. If you order a loaf of bread at the bakery, you pay and receive your bread. This face-to-face approach doesn’t really need any security. What does it matter if your neighbor, in line next to you, overhears you ordering bread and sees you pay, as long as you get your bread? But what if this were done online and it involved not bread but a loan or a transfer from your savings account? You wouldn’t want a John Doe messing with the data you need to communicate with your bank, would you? Read the rest of this entry »

1 Comment

Find the differences

Posted by Jurgen in Algorithms, Text processing on April 12th, 2009

Comparing files is something developers do every once in a while: comparing configuration files to see what is different in another environment, for example, or comparing source files to see what has changed in the code. Implementations of text comparison algorithms are therefore widespread and used in several fields. In blogs and content management systems one might need to know what was altered in an update of a text, and a programmer in a team would like to see what changed in the source code (svn). A lot of search, spell checking, speech recognition and plagiarism detection software also compares texts (strings) in one way or another. This article covers the Levenshtein distance algorithm and how to use it to indicate alterations to texts. Read the rest of this entry »
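As a minimal sketch of the Levenshtein distance itself (the standard dynamic-programming formulation, not necessarily the exact code of the article), the function below counts the fewest single-character insertions, deletions and substitutions needed to turn one string into the other. It keeps only two rows of the table in memory.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution, or a free match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```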

1 Comment

Ratings, Ranks and Scores

Posted by Jurgen in Prototype, Widgets on April 9th, 2009

Recovered from the ancient spelunks of about two years ago, the somewhat popular Rater is back. When this rating widget was first released, it soon got attention from all over the world. Now, after the eDesign server crash, this JavaScript rating control is reinstated.

Rater 2.0 can be used to have your website visitors assign scores to subjects in a nice, intuitive way. You will then be able to rank subjects by popularity, quality or anything else you make the rater about. You have probably seen more of these rating widgets. When watching a video on YouTube, for instance, you can assign “stars” to it to rate the video. You can check out the example page of Rater 2.0 to see the diversity in its appearance and functional possibilities. Custom configurations can be made using the Rater 2.0 API page.

Currently Rater 2.0 is a Prototype based control, optionally with Scriptaculous. As the web is evolving, so should eDesign. Therefore MooTools and jQuery versions of Rater are expected by the end of May 2009.

No Comments

Regular expression tester

Posted by Jurgen in Regular Expressions, Text processing on March 26th, 2009

First one to be back is the simple but very useful regular expression tester. What it does is simply dump the contents of a pattern match and its subpattern matches. Developers might find this a useful tool not only to test their regular expressions, but also to see the way subpatterns are counted to use backreferences.
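What such a dump looks like can be sketched with Python's `re` module, which is an assumption for illustration only: the tester itself targets PCRE, whose syntax is similar but not identical. The helper name `dump_match` is hypothetical. Subpatterns are numbered by the position of their opening parenthesis, and that number is what a backreference such as `\1` refers to.

```python
import re
from typing import List, Optional


def dump_match(pattern: str, subject: str) -> List[Optional[str]]:
    """Return the whole match (index 0) followed by every subpattern match."""
    match = re.search(pattern, subject)
    if match is None:
        return []
    # match.re.groups is the number of capturing subpatterns in the pattern.
    return [match.group(i) for i in range(match.re.groups + 1)]


# Nested subpatterns: the outer group opens first, so it gets number 1.
print(dump_match(r"((\d{4})-(\d{2}))-(\d{2})", "Posted on 2009-03-26"))
# ['2009-03-26', '2009-03', '2009', '03', '26']

# \1 refers back to subpattern 1, here matching a doubled letter.
print(dump_match(r"(\w)\1", "cheat sheet"))
# ['ee', 'e']
```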

The tester links to the PCRE documentation (Perl Compatible Regular Expressions). I would also like to point to a handy cheat sheet on general regular expressions by Dave Child and another cheat sheet on JavaScript regular expressions from Visibone.

This tool is accessible again at regex.edesign.nl and can be downloaded there as well (use the src link).

No Comments

