Protecting strings in JVM Memory
Our applications occasionally process sensitive user information, such as passwords, private keys or financial information, and I would guess that many people don’t think twice about using a String for these.
The case against String actually comes down to its most valuable feature: immutability. There is no way to wipe a String before you release the last reference to it and leave the garbage collector to do its job. This means the contents of that String can remain in the heap long after you are done with it.
Malware running alongside your Java process need only run jmap (https://stackoverflow.com/a/3042463/360211) to dump the heap of your process and pull String-like data from it.
"Strings are immutable. That means once you’ve created the String, if another process can dump memory, there's no way (aside from reflection) you can get rid of the data before garbage collection kicks in." (Jon Skeet)
This is why char[] is a common solution. You can clear a char[] once you are done with it, which minimizes the time window of an attack (https://stackoverflow.com/a/8881376/360211), but the attack is still feasible.
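As a minimal sketch of that idea, the class below wipes a char[] in a finally block so the array is overwritten even if the code using it throws. The class name and the authenticate method are hypothetical placeholders for whatever consumes the secret; only Arrays.fill is a standard library call.

```java
import java.util.Arrays;

final class CharArrayWipeExample {

    // Hypothetical stand-in for whatever consumes the secret (e.g. a login call).
    static boolean authenticate(char[] password) {
        return password.length > 0;
    }

    // Uses the password, then overwrites the array even if authenticate throws.
    static boolean authenticateAndWipe(char[] password) {
        try {
            return authenticate(password);
        } finally {
            // Overwrite every element so the plain text no longer sits in this array.
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', '3', 't'};
        boolean ok = authenticateAndWipe(password);
        System.out.println("authenticated: " + ok);
        System.out.println("first char wiped: " + (password[0] == '\0'));
    }
}
```

This only shortens the window in which the characters are readable. It does nothing about copies the garbage collector may already have made.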
Another potential problem with both String and char[] (and indeed any managed object solution) is that they can be moved around in the heap by the garbage collector as it compacts memory and promotes objects between generations. As Jon Skeet points out in his answer, linked earlier, it is implementation-specific whether the old copy will be zeroed out by the GC or whether our sensitive data will be left in the heap despite our best efforts with a char[].
Another attack is for a program to allocate a large amount of RAM and sniff it for strings, trying to find data left over from a previous program that ran at those memory addresses.
For these reasons we should have a way of minimizing both the effectiveness of these attack vectors and the time window in which they can occur. It should also be easy, or at least not much harder than using a String, so that people are likely to use it.
Ideal solution
- Allow clearing of string data once we are done
- Prevent GC moving/copying data
- Obfuscate that data so that it is not readily visible in a memory dump
Allow clearing of string data once we are done
Any kind of read/write buffer will allow this. We just need to steer clear of String.
Prevent GC moving/copying data
We require a buffer that’s outside of the GC’s control. This ensures that multiple copies cannot be left behind after we are done with it.
For this we can use ByteBuffer.allocateDirect. The documentation for ByteBuffer (https://docs.oracle.com/javase/7/docs/api/java/nio/ByteBuffer.html) only says that the contents of a direct buffer may reside outside of the managed heap, but the memory is at least pinned, because direct buffers are safe for I/O with non-JVM code. So the GC won’t be moving this buffer and making copies.
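A minimal sketch of holding characters in a direct buffer and wiping it afterwards might look like the following. The class and method names are made up for illustration; ByteBuffer.allocateDirect, asCharBuffer and the absolute put/get calls are standard java.nio API.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

final class DirectBufferExample {

    // Allocates a direct buffer big enough for `capacity` chars (2 bytes each)
    // and exposes it as a CharBuffer view over the same memory.
    static CharBuffer allocate(int capacity) {
        return ByteBuffer.allocateDirect(capacity * Character.BYTES).asCharBuffer();
    }

    // Overwrites every slot with zero.
    static void wipe(CharBuffer buffer) {
        buffer.clear(); // resets position and limit only; it does not erase contents
        for (int i = 0; i < buffer.capacity(); i++) {
            buffer.put(i, '\0');
        }
    }

    public static void main(String[] args) {
        CharBuffer secret = allocate(16);
        secret.put(0, 'h');
        secret.put(1, 'i');
        System.out.println("direct: " + secret.isDirect());
        System.out.println("first char: " + secret.get(0));
        wipe(secret);
        System.out.println("wiped: " + (secret.get(0) == '\0'));
    }
}
```

Note that the characters are still stored in plain text here; this only addresses the clearing and the GC-copying points.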
Obfuscate the data
You can use any kind of obfuscation you wish, but ideally it should be possible to deobfuscate one character at a time. If your chosen technique requires deobfuscating all chars at once, then the memory will contain all chars in plain text at the same time, undoing our good work.
One simple way of achieving this is to XOR the data with a large enough key. See https://en.wikipedia.org/wiki/XOR_cipher.
If you choose a random key at least as long as the data, the attacker must locate both the key and the obfuscated data in order to read the message.
If we could keep the key private, this would be an uncrackable encryption technique (a one-time pad), but remember that both the key and the data are in memory, so it’s fair to say this is just obfuscation.
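To make the XOR idea concrete, here is a rough sketch that stores a random key and the XORed data in two separate direct buffers and deobfuscates one character at a time. XorObfuscatedChars is a hypothetical class written for this illustration, not the API of any real library, and it makes no attempt to be thread-safe.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.security.SecureRandom;

// Hypothetical illustration only; names and design are assumptions.
final class XorObfuscatedChars implements AutoCloseable {

    private static final SecureRandom RANDOM = new SecureRandom();

    private final CharBuffer key;  // random pad, one key char per data char
    private final CharBuffer data; // holds (plain char XOR key char)
    private int length;

    XorObfuscatedChars(int capacity) {
        key = ByteBuffer.allocateDirect(capacity * Character.BYTES).asCharBuffer();
        data = ByteBuffer.allocateDirect(capacity * Character.BYTES).asCharBuffer();
    }

    // Stores one character, obfuscated with a fresh random key character.
    void append(char c) {
        if (length == key.capacity()) {
            throw new IllegalStateException("buffer full");
        }
        char k = (char) RANDOM.nextInt(1 << 16);
        key.put(length, k);
        data.put(length, (char) (c ^ k));
        length++;
    }

    // Deobfuscates a single character; the others stay obfuscated.
    char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (char) (data.get(index) ^ key.get(index));
    }

    int length() {
        return length;
    }

    // Zeroes both buffers so neither key nor data survives after use.
    @Override
    public void close() {
        for (int i = 0; i < key.capacity(); i++) {
            key.put(i, '\0');
            data.put(i, '\0');
        }
        length = 0;
    }

    public static void main(String[] args) {
        try (XorObfuscatedChars secret = new XorObfuscatedChars(8)) {
            secret.append('o');
            secret.append('k');
            System.out.println(secret.charAt(0) + "" + secret.charAt(1));
        }
    }
}
```

Because the key is as long as the data and generated per character, a memory dump has to yield both buffers, and pair them up correctly, before the text can be recovered.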
SecureCharBuffer
So, luckily, I’ve done this basic obfuscation for you. SecureCharBuffer allows easy character-by-character population and reading. It splits the data into two unmanaged buffers and implements Closeable, so it can be used in a try-with-resources block.
https://github.com/novacrypto/SecureString
It’s an early version, so all feedback is appreciated.