Tuesday, December 11, 2018

Robust image recognition

In recent years researchers have found numerous ways to fool image classification software. An image of, say, a face can be tweaked slightly so that the software detects something completely different, perhaps a rifle, even though to a human observer the face is still clearly visible and no rifle can be seen. In one respect, when AI is being fooled it is becoming more human: there are optical illusions that trick a human but not AI, and we have now found the reverse.

Let's now look at the mathematics of what is going on. We'll represent the image as a vector \(\underline{x}\), which can of course contain a value for red, green and blue at each pixel. The neural net estimates the likelihood of a rifle being in the image; we'll call this \(R(\underline{x})\). For the original input image this likelihood is low. The hacker can then work out the gradient, i.e. the partial derivative of R with respect to each element. This can be done numerically using a standard central finite difference approach: \[\frac{\partial R}{\partial x_i} \approx \frac{R(\underline{x} + \delta \underline{e}_i) - R(\underline{x} - \delta \underline{e}_i)}{2 \delta} \] where \(\underline{e}_i\) is the unit vector along the i-th component and \(\delta\) is a small step.
The hacker can then use an iterative gradient ascent method to tweak the image so that the net reports a higher likelihood that a rifle is present. After each step, the gradient would be recalculated. One of the interesting features of this hack is that the hacker doesn't need to know the details of the architecture of the neural net. He just needs access to the output R(x). It is sometimes found that a small tweak to the original image results in the net reporting that a completely different object is contained, even though to a human the tweaked image appears almost identical to the original.
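The attack just described can be sketched in a few lines. Here `R` is a stand-in for the classifier's black-box rifle-likelihood output; the function names, step size and iteration count are my own illustrative choices, not any particular published attack:

```python
import numpy as np

def numerical_gradient(R, x, delta=1e-3):
    """Central-difference estimate of dR/dx_i for each element of x.

    R is treated as a black box: we only query its output, never
    inspecting weights or architecture.
    """
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = delta
        grad.flat[i] = (R(x + e) - R(x - e)) / (2 * delta)
    return grad

def gradient_ascent_attack(R, x, step=0.01, iterations=50):
    """Iteratively nudge the image x to raise the reported likelihood R(x)."""
    x = x.copy()
    for _ in range(iterations):
        x += step * numerical_gradient(R, x)  # re-estimate gradient each step
    return x
```

Note that each gradient estimate costs two queries per pixel, so against a real classifier the hacker would typically sample a subset of pixels or batch the queries.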

But what can we do to improve the robustness of image classification software? One option would be to use multiple neural networks with different architectures and different training data. If one of the nets detects a rifle but all the others detect a face, then we would suspect that the odd one out is being fooled. One of the nets could even be trained only on grey-scale images, i.e. with the red, green and blue replaced by shades of grey. Before the combined nets could be deployed, an algorithm would need to be written to combine the results from the various nets. We may have one net that is the best, in which case it would only be overruled if all the other nets gave a different result.

However, if the hacker were to return and attempt the same hack against our combined net, he may still be somewhat successful. Remember, we showed above that the hacker doesn't need to know any details of the neural net architecture, so replacing one net with a set of nets may not improve matters much. We would deem our defense against the hack a success if the only way the hacker could convince our system that a rifle is in the image would be to change the image so much that we would all agree it does appear as though a rifle is in it. Simply combining the results from multiple nets does not guarantee this.
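One possible combining rule of the kind described above, assuming each net simply reports its top label, might look like this (the function and its unanimity rule are my own sketch, not a standard scheme):

```python
def combine_labels(primary_label, other_labels):
    """Return the ensemble's verdict on an image.

    The primary net's label stands unless every other net agrees on a
    single different label, in which case the primary is overruled.
    """
    if (other_labels
            and all(l == other_labels[0] for l in other_labels)
            and other_labels[0] != primary_label):
        return other_labels[0]
    return primary_label
```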

Here is the outline of a suggested algorithm that attempts to defeat the hacker:
We start with our primary, lovingly trained neural net, which is fairly reliable. We also construct a set of perhaps 10 alternative nets, with different architectures and trained on different data. When we want to recognize an image, we first use a hash of the image as the seed for a random number generator, and then use that generator to pick, say, 5 of the 10 alternative nets. The image is then passed through our primary net along with the 5 chosen alternatives. The final result is some non-linear combination of the primary net's output and the others'. When the hacker comes along and uses his gradient ascent method, he would find that our random switching of the alternative nets adds so much noise that the method doesn't work well. And so we would have a fairly robust defense against the hack.
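The hash-seeded selection step can be sketched as follows (the non-linear combination of the nets' outputs is left out; the names and sizes are illustrative):

```python
import hashlib
import random

def pick_alternative_nets(image_bytes, pool_size=10, subset_size=5):
    """Deterministically pick which alternative nets evaluate this image.

    The image's hash seeds the RNG, so the exact same image always
    selects the same subset of nets, while changing even one pixel
    reshuffles the selection.
    """
    seed = int.from_bytes(hashlib.sha256(image_bytes).digest()[:8], "big")
    rng = random.Random(seed)
    return sorted(rng.sample(range(pool_size), subset_size))
```

This is what makes the hacker's finite differences unreliable: the two images \(\underline{x} + \delta \underline{e}_i\) and \(\underline{x} - \delta \underline{e}_i\) hash differently and so are generally scored by different subsets of nets.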

There are some drawbacks to this approach. Instead of constructing 1 neural net, we would construct an extra 10, and at run time 5 of the alternative nets would also need to be evaluated. Also, while presenting the exact same image twice would return the same result, tweaking even one pixel could produce quite a different result. Some users may deem this instability undesirable.

Monday, December 3, 2018

Encryption and debit card numbers

Online retailers would like to make it as straightforward as possible for existing customers to make multiple purchases. Clearly they would like to store the credit and debit card numbers of their clientele, so that returning customers don't need to re-enter their card details. However, when sensitive data is stored it might be leaked, and the consequences would be severe. It would of course make sense to encrypt the data before saving it. But to use the card number it must be decrypted, and so there is a risk that the unencrypted data will be obtained by a criminal through staff incompetence or corruption. So what should a retailer do?
One option would be to use public key encryption, but rather than use the retailer's own public key, they could use the card issuer's public key. The card number could be encrypted in the browser of the client and then sent to the retailer's server to be stored. In this case if there were some dishonest employees working for the retailer, they still wouldn't be able to decrypt the card number. When the retailer wants to request a payment from the card issuer, it can send the encrypted number and the card issuer will be able to decrypt it.
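The scheme can be illustrated with textbook RSA and deliberately tiny primes. This is purely a sketch of who holds which key; a real deployment would use a vetted cryptography library with proper padding, and a full card number would not fit in so small a modulus:

```python
# Toy RSA key pair for the card issuer -- tiny primes, wildly insecure,
# for illustration only.
p, q = 61, 53
n = p * q                            # public modulus (3233)
e = 17                               # issuer's public key is (n, e)
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent, held only by the issuer

def browser_encrypt(card_digits):
    """Runs in the customer's browser; needs only the public key."""
    return pow(card_digits, e, n)

def issuer_decrypt(ciphertext):
    """Runs at the card issuer, the sole holder of d."""
    return pow(ciphertext, d, n)
```

The retailer stores only `browser_encrypt(...)` output, forwards it with each payment request, and at no point possesses anything that its dishonest employees could decrypt.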

However, if it were to become standard practice for firms to use the encrypted number, then that encrypted number would be like a proxy for the original card number. In that case if it were leaked, it would once again be problematic. One way round that would be to tag the card number with the retailer ID first, before encrypting it. That encrypted data would be useless to any other retailer, corrupt or otherwise. In that case the head of information security at the retailer would sleep well at night.

All that assumes the card issuer's public key encryption hasn't been compromised. Just in case it has been, one option would be for the retailer to add an extra layer of encryption: on the retailer's website, first encrypt with the card issuer's public key, then encrypt again with the retailer's public key. The doubly encrypted data would be transmitted and then stored. Before a payment request is sent to the card issuer, the outer layer of encryption would need to be removed (decrypted) by the retailer.
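Continuing the toy RSA sketch from above (tiny primes, illustration only), the two layers look like this; the retailer's modulus is chosen larger than the issuer's so that the inner ciphertext fits as a message:

```python
# Issuer's toy key pair (as before) and a second, larger toy key pair
# for the retailer. Real systems would use a proper library and padding.
n1, e1 = 61 * 53, 17                 # issuer's public key (n1 = 3233)
d1 = pow(e1, -1, 60 * 52)            # issuer's private exponent
n2, e2 = 89 * 97, 5                  # retailer's public key (n2 = 8633 > n1)
d2 = pow(e2, -1, 88 * 96)            # retailer's private exponent

def browser_double_encrypt(card_digits):
    """In the customer's browser: issuer's layer first, retailer's on top."""
    inner = pow(card_digits, e1, n1)
    return pow(inner, e2, n2)

def retailer_strip_outer(stored):
    """On the retailer's server, before the payment request is sent.

    Only the outer layer comes off here; the inner layer can still
    only be removed by the card issuer.
    """
    return pow(stored, d2, n2)
```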