Yesterday's test at my university covered codes as one of its topics. As an example, EAN13 codes or barcodes as they are commonly known respectively were brought up. While preparing for the test we practiced verifying them and calculating missing digits. After some time I thought about how to automate these tasks. Of course, the first thing that comes to mind is the practically unmatched calculation performance of excel.
How do barcodes work?
The principle is pretty simple. Every digit is made up of seven lines of equal width. If a symbol is made up of multiple lines with the same color next to each other they appear as one thick line. This is also why they are a pain for humans to read. The digits zero to nine have each three different code symbols associated with them, called A, B, and C. The first half might be coded with a mix of symbols from A and B, while the second half only consists of symbols from C. Both halves make up the rear 12 digits of the code.
The very first digit is called implied-digit, as it is derived from the pattern of A and B symbols used to code the first half. If the first half is only made up of symbols from the A category a zero is implied. The whole table looks like this:
Implied Digit | Pattern |
0 | A A A A A A |
1 | A A B A B B |
2 | A A B B A B |
3 | A A B B B A |
4 | A B A A B B |
5 | A B B A A B |
6 | A B B B A A |
7 | A B A B A B |
8 | A B A B B A |
9 | A B B A B A |
Checksum
The last digit can be retrieved either by decoding it from its symbol or by calculating it from the other digits. Only if both match, the barcode is correct. The way the checksum is computed is fortunately pretty simple: All twelve digits are added with every other digit multiplied by three. The sums last digit has to match.
Detecting if a code is correct is not really a challenge. Summing things up is excels biggest strengths, besides having way too many options for what is basically the same chart. Extracting the sums last digit for comparison is done by diving it by ten and taking the divisions rest aka modulo.
The more interesting part is to find a missing digit, provided the known ones are correct. In the case a digit needs to be found, that isn’t multiplied by three in the checksum, subtracting the rest from ten yields the correct value. The other digits are a little bit more tricky, as a number has to be found, which extends the sum to be divisible by ten while being divisible by three. When doing it by hand, I usually just did multiple guesses.
Turns out that there is a nice solution to this problem. Like in the first case, subtracting the rest from ten returns the number needed to extend the sum to be divisible by ten. Now you only need to find a multiple of three that ends with this digit. This can be looked up from a table of numbers.
Rest | Required digit | Mulitple | ÷3 |
0 | 0 | 0 | 0 |
1 | 9 | 9 | 3 |
2 | 8 | 18 | 6 |
3 | 7 | 27 | 9 |
4 | 6 | 6 | 2 |
5 | 5 | 15 | 5 |
6 | 4 | 24 | 8 |
7 | 3 | 3 | 1 |
8 | 2 | 12 | 4 |
9 | 1 | 21 | 7 |
As you can see above, a digit required to extend the sum can be mapped to a number divisible by three that does that. It doesn’t matter that it might be larger than ten, as the sum will still be divisible. The last digit of all multiples of three have from zero to 27 are the numbers from zero to nine, which makes this whole thing work. There are multiple numbers this works with, for example, 7, 9, and 11.
Excel Formulas
Checking for the correctness of the barcode is simply done by taking the rest of the sum divided by ten and checking whether it is zero. Missing digits are calculated via multiple formulas. One of them checks which type of digit is unknown and then selects the correct result, as both are calculated simultaneously.
Digits on odd positions are looked up via a table similar to the one shown above.
Finally, we can see the table in action, computing a missing digit.