Preface
The Unsafe class is used by many classes in the JDK source code. It exposes low-level operations that bypass the JVM's usual safeguards, and using it can improve efficiency. It is a double-edged sword, however: as its name suggests, it is unsafe, and the memory it allocates must be freed manually (it is not reclaimed by the GC). Unsafe offers a simpler alternative to some features of JNI, keeping the efficiency while making things easier.
The class is part of the sun.* API and is not an official part of J2SE, so you may not find any official documentation for it; sadly, its code documentation is not much better.
This article is largely a compilation and translation of the following article:
http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/
1. Most methods of the Unsafe API are native. There are 105 methods in total, falling mainly into the following categories:
(1) Info. Return low-level memory information: addressSize(), pageSize()
(2) Objects. Manipulate objects and their fields: allocateInstance(), objectFieldOffset()
(3) Classes. Manipulate classes and their static fields: staticFieldOffset(), defineClass(), defineAnonymousClass(), ensureClassInitialized()
(4) Arrays. Manipulate arrays: arrayBaseOffset(), arrayIndexScale()
(5) Synchronization. Low-level synchronization primitives (such as CPU-based CAS (Compare-And-Swap) primitives): monitorEnter(), tryMonitorEnter(), monitorExit(), compareAndSwapInt(), putOrderedInt()
(6) Memory. Direct memory access (bypassing the JVM heap and manipulating native memory directly): allocateMemory(), copyMemory(), freeMemory(), getAddress(), getInt(), putInt()
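To make the Info and Arrays categories concrete, here is a minimal sketch that queries addressSize() and pageSize() and reads an array element through arrayBaseOffset() and arrayIndexScale(). The class name UnsafeInfoDemo and the helper elementOffset() are hypothetical names chosen for illustration; the instance is obtained by reflection on the theUnsafe field, and the sketch assumes a JDK on which sun.misc.Unsafe is still accessible.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeInfoDemo {
    // Reads the private static singleton field "theUnsafe" by reflection.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Offset of element `index` = base offset of the array + index * element scale.
    static long elementOffset(Unsafe u, Class<?> arrayClass, int index) {
        return u.arrayBaseOffset(arrayClass) + (long) index * u.arrayIndexScale(arrayClass);
    }

    public static void main(String[] args) {
        Unsafe u = unsafe();
        System.out.println("address size: " + u.addressSize() + " bytes");
        System.out.println("page size:    " + u.pageSize() + " bytes");

        int[] data = {10, 20, 30};
        // Read data[2] through its raw offset instead of the index operator.
        int third = u.getInt(data, elementOffset(u, int[].class, 2));
        System.out.println("data[2] via offset: " + third); // 30
    }
}
```

The address and page sizes depend on the platform, so the sketch prints them rather than assuming particular values.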
2. Getting an instance of the Unsafe class
Unsafe is designed to be available only to classes loaded by the JVM's trusted bootstrap class loader, and it is a typical singleton. Its instance accessor looks like this:
    public static Unsafe getUnsafe() {
        Class cc = sun.reflect.Reflection.getCallerClass(2);
        // classes loaded by the bootstrap class loader report a null class loader
        if (cc.getClassLoader() != null)
            throw new SecurityException("Unsafe");
        return theUnsafe;
    }
If a class loaded by any other class loader calls Unsafe.getUnsafe() directly, a SecurityException is thrown (the underlying reason involves the JVM's parent-delegation class loading model).
There are two workarounds. One is to have the calling class loaded by the bootstrap class loader by adding it to the boot class path with the JVM option -Xbootclasspath. The other is Java reflection:
    Field f = Unsafe.class.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    Unsafe unsafe = (Unsafe) f.get(null);
This forcibly makes the private singleton field accessible and then reads it through Field.get(), casting the returned Object to Unsafe. An IDE may flag these references as errors (the menu path below is from Eclipse); this can be resolved with the following setting:
Preferences -> Java -> Compiler -> Errors/Warnings -> Deprecated and restricted API -> Forbidden reference -> Warning
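The later snippets in this article call a getUnsafe() helper without defining it. One possible shape for that helper, wrapping the reflection approach and caching the result, is sketched below. The class name UnsafeHolder is a hypothetical choice, and the sketch assumes the theUnsafe field is still reachable by reflection on the JDK in use.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeHolder {
    // Looked up once; every caller shares the same singleton instance.
    private static final Unsafe UNSAFE = load();

    private UnsafeHolder() {}

    private static Unsafe load() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);           // the field is private
            return (Unsafe) f.get(null);     // static field, so no receiver object
        } catch (ReflectiveOperationException | RuntimeException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible on this JVM", e);
        }
    }

    static Unsafe getUnsafe() {
        return UNSAFE;
    }

    public static void main(String[] args) {
        System.out.println(getUnsafe() == getUnsafe()); // true: same instance every time
    }
}
```

Wrapping the checked reflection exceptions in an unchecked one keeps call sites short, at the cost of failing at class-initialization time if the field is unavailable.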
3. "Interesting" application scenarios of Unsafe class
(1) Bypassing constructors. allocateInstance() is useful when you want to skip an object's constructor, skip security checks performed in a constructor, or instantiate a class that has no public constructor.
    class A {
        private long a; // not initialized value

        public A() {
            this.a = 1; // initialization
        }

        public long a() { return this.a; }
    }
The following compares instantiation via the constructor, via reflection, and via allocateInstance():
    A o1 = new A(); // constructor
    o1.a(); // prints 1

    A o2 = A.class.newInstance(); // reflection
    o2.a(); // prints 1

    A o3 = (A) unsafe.allocateInstance(A.class); // unsafe
    o3.a(); // prints 0
allocateInstance() never enters the constructor at all, which spells trouble for the singleton pattern.
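The singleton concern can be illustrated with a small sketch: a class with a private constructor and a single shared instance still yields a second, uninitialized object through allocateInstance(). Registry and SingletonBypassDemo are hypothetical names used only for this illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// A conventional singleton: private constructor, one shared instance.
final class Registry {
    private static final Registry INSTANCE = new Registry();
    private final String name;

    private Registry() {
        this.name = "primary"; // only runs when the constructor runs
    }

    static Registry getInstance() { return INSTANCE; }

    String name() { return name; }
}

final class SingletonBypassDemo {
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Creates a Registry without ever calling its private constructor.
    static Registry secondInstance() {
        try {
            return (Registry) unsafe().allocateInstance(Registry.class);
        } catch (InstantiationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Registry real = Registry.getInstance();
        Registry rogue = secondInstance();
        System.out.println(real == rogue);  // false: a second instance exists
        System.out.println(real.name());    // primary
        System.out.println(rogue.name());   // null: the constructor never ran
    }
}
```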
(2) Memory modification
Memory modification is common in C. In Java, it can be used to bypass security checks.
Consider the following simple access check:
    class Guard {
        private int ACCESS_ALLOWED = 1;

        public boolean giveAccess() {
            return 42 == ACCESS_ALLOWED;
        }
    }
Under normal circumstances giveAccess() always returns false, but that is not guaranteed:
    Guard guard = new Guard();
    guard.giveAccess(); // false, no access

    // bypass
    Unsafe unsafe = getUnsafe();
    Field f = guard.getClass().getDeclaredField("ACCESS_ALLOWED");
    unsafe.putInt(guard, unsafe.objectFieldOffset(f), 42); // memory corruption

    guard.giveAccess(); // true, access granted
By computing the field's memory offset and calling putInt(), the ACCESS_ALLOWED field of the object is overwritten. Once the layout of a class is known, the offset of its data can always be computed (much like computing member offsets within a class in C++).
(3) Implementing a C-like sizeOf() function
A C-like sizeOf() function can be implemented by combining Java reflection with objectFieldOffset().
    // needs java.lang.reflect.Field, java.lang.reflect.Modifier, java.util.HashSet
    public static long sizeOf(Object o) {
        Unsafe u = getUnsafe();
        HashSet<Field> fields = new HashSet<Field>();
        Class c = o.getClass();
        while (c != Object.class) {
            for (Field f : c.getDeclaredFields()) {
                if ((f.getModifiers() & Modifier.STATIC) == 0) {
                    fields.add(f);
                }
            }
            c = c.getSuperclass();
        }

        // get offset
        long maxSize = 0;
        for (Field f : fields) {
            long offset = u.objectFieldOffset(f);
            if (offset > maxSize) {
                maxSize = offset;
            }
        }

        return ((maxSize/8) + 1) * 8; // padding
    }
The idea of the algorithm is straightforward: starting from the most-derived class, collect the non-static fields of the class and of all its superclasses into a HashSet (each field is counted only once; Java has single inheritance), then use objectFieldOffset() to find the largest offset, and finally round up for alignment.
On a 32-bit JVM, the size can also be read directly from the class structure that the object header points to, where it is stored as a long at offset 12. This relies on JVM internals and is not portable.
    public static long sizeOf(Object object) {
        return getUnsafe().getAddress(
            normalize(getUnsafe().getInt(object, 4L)) + 12L);
    }
normalize() converts a signed int to an unsigned long:
    private static long normalize(int value) {
        if (value >= 0) return value;
        return (~0L >>> 32) & value; // mask 0xFFFFFFFF keeps the low 32 bits
    }
Both sizeOf() implementations return the same result. The standard way to measure object size is Instrumentation.getObjectSize() from java.lang.instrument, but it requires running with the -javaagent command-line option.
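The signed-to-unsigned conversion that normalize() is meant to perform can be checked in isolation. The sketch below assumes the intended mask is ~0L >>> 32 (that is, 0xFFFFFFFFL, which keeps only the low 32 bits) and compares the result with the standard library method Integer.toUnsignedLong(), available since Java 8. The class name UnsignedIntDemo is hypothetical.

```java
final class UnsignedIntDemo {
    // Mask-based conversion: a negative int is sign-extended to long,
    // then the upper 32 bits are cleared by the mask 0x00000000FFFFFFFF.
    static long normalize(int value) {
        if (value >= 0) return value;
        return (~0L >>> 32) & value;
    }

    public static void main(String[] args) {
        int raw = -1; // bit pattern 0xFFFFFFFF
        System.out.println(normalize(raw));              // 4294967295
        System.out.println(Integer.toUnsignedLong(raw)); // 4294967295
    }
}
```

On Java 8 and later, Integer.toUnsignedLong() can simply replace the hand-written mask.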
(4) Implementing a shallow copy
The standard ways to make a shallow copy are to implement the Cloneable interface or to write a custom copy function, and neither is general-purpose. Combined with the sizeOf() method above, a generic shallow copy can be implemented:
    static Object shallowCopy(Object obj) {
        long size = sizeOf(obj);
        long start = toAddress(obj);
        long address = getUnsafe().allocateMemory(size);
        getUnsafe().copyMemory(start, address, size);
        return fromAddress(address);
    }
toAddress() converts an object to its memory address, and fromAddress() does the reverse:
    static long toAddress(Object obj) {
        Object[] array = new Object[] {obj};
        long baseOffset = getUnsafe().arrayBaseOffset(Object[].class);
        return normalize(getUnsafe().getInt(array, baseOffset));
    }

    static Object fromAddress(long address) {
        Object[] array = new Object[] {null};
        long baseOffset = getUnsafe().arrayBaseOffset(Object[].class);
        getUnsafe().putLong(array, baseOffset, address);
        return array[0];
    }
This shallow copy function can be applied to any Java object, and the object's size is computed dynamically.
(5) Erasing passwords from memory
Passwords are often held in String objects, but a String is immutable and its reclamation is left to the JVM, so the characters may linger in memory. The safest approach is to overwrite the password's contents once it is no longer needed.
    // password is the String holding the secret
    Field stringValue = String.class.getDeclaredField("value");
    stringValue.setAccessible(true);
    char[] mem = (char[]) stringValue.get(password);
    for (int i = 0; i < mem.length; i++) {
        mem[i] = '?';
    }
This relies on String being backed by a char[] field named value, which holds for JDK 8 and earlier.
(6) Dynamic loading of classes
The standard way to load a class dynamically is Class.forName() (I remember it vividly from writing JDBC programs). Unsafe can also load Java class files dynamically.
    byte[] classContents = getClassContent();
    Class c = getUnsafe().defineClass(
                  null, classContents, 0, classContents.length);
    c.getMethod("a").invoke(c.newInstance(), null); // 1
The getClassContent() method reads a class file into a byte array.
    private static byte[] getClassContent() throws Exception {
        File f = new File("/home/mishadoff/tmp/A.class");
        FileInputStream input = new FileInputStream(f);
        byte[] content = new byte[(int) f.length()];
        input.read(content);
        input.close();
        return content;
    }
This can be applied to dynamic loading, proxies, aspects (AOP), and similar features.
(7) Throwing a checked exception as if it were a runtime exception
    getUnsafe().throwException(new IOException());
This lets a method throw a checked exception without declaring or catching it (not recommended).
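A minimal sketch of what this looks like in practice: the method below has no throws clause, yet an IOException propagates out of it. SneakyThrowDemo, readConfig() and capture() are hypothetical names for illustration. Note that the caller cannot write catch (IOException e) here, because the compiler sees no code that declares it; catching a broader type is the only option.

```java
import java.io.IOException;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class SneakyThrowDemo {
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // No "throws IOException" clause, yet an IOException leaves this method.
    static void readConfig() {
        unsafe().throwException(new IOException("disk unavailable"));
    }

    // Runs readConfig() and returns whatever it threw (or null if nothing).
    static Throwable capture() {
        try {
            readConfig();
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        System.out.println(capture()); // java.io.IOException: disk unavailable
    }
}
```

This is exactly why the technique is discouraged: callers get no compile-time hint that a checked exception can arrive.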
(8) Fast serialization
The standard Java Serializable mechanism is slow, and it places a constraint on constructors: the nearest non-serializable superclass must have an accessible no-argument constructor. Externalizable is faster, but it requires a public no-argument constructor and a hand-written schema for the class being serialized. Popular high-performance serialization libraries such as kryo bring in third-party dependencies, which can be a problem when memory is tight. With Unsafe, you can read the actual values of an object's fields through getInt(), getLong(), getObject() and similar methods, and persist them to a file together with metadata such as the class name. kryo has experimented with Unsafe, but no concrete performance figures are available. (http://code.google.com/p/kryo/issues/detail?id=75)
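As a rough sketch of the idea (not a complete serializer), the code below writes the int and long fields of an object into a byte array using getInt()/getLong() at offsets obtained from objectFieldOffset(), and restores them into an instance created with allocateInstance(). Point and RawFieldCodec are hypothetical names; the sketch handles only int and long fields, ignores superclasses, and passes the class explicitly instead of persisting its name. Fields are sorted by name because getDeclaredFields() does not guarantee an order.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import sun.misc.Unsafe;

final class Point {
    private int x;
    private long id;
    Point(int x, long id) { this.x = x; this.id = id; }
    int x() { return x; }
    long id() { return id; }
}

final class RawFieldCodec {
    private static final Unsafe U = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Non-static int/long fields of the class itself, in name-sorted order.
    private static List<Field> fieldsOf(Class<?> c) {
        List<Field> result = new ArrayList<>();
        for (Field f : c.getDeclaredFields()) {
            boolean supported = f.getType() == int.class || f.getType() == long.class;
            if (supported && !Modifier.isStatic(f.getModifiers())) result.add(f);
        }
        result.sort(Comparator.comparing(Field::getName));
        return result;
    }

    // Copies each field value straight out of the object's memory.
    static byte[] write(Object o) {
        List<Field> fields = fieldsOf(o.getClass());
        ByteBuffer out = ByteBuffer.allocate(8 * fields.size());
        for (Field f : fields) {
            long offset = U.objectFieldOffset(f);
            if (f.getType() == int.class) out.putInt(U.getInt(o, offset));
            else out.putLong(U.getLong(o, offset));
        }
        return Arrays.copyOf(out.array(), out.position());
    }

    // Rebuilds an object without calling any constructor.
    static <T> T read(Class<T> type, byte[] data) {
        try {
            T o = type.cast(U.allocateInstance(type));
            ByteBuffer in = ByteBuffer.wrap(data);
            for (Field f : fieldsOf(type)) {
                long offset = U.objectFieldOffset(f);
                if (f.getType() == int.class) U.putInt(o, offset, in.getInt());
                else U.putLong(o, offset, in.getLong());
            }
            return o;
        } catch (InstantiationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A call such as RawFieldCodec.read(Point.class, RawFieldCodec.write(new Point(7, 42L))) round-trips the two fields; a real serializer would also need to handle object references, arrays, inheritance, and versioning.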
(9) Allocating memory outside the Java heap
Objects created with new are allocated on the heap, and their life cycle is managed by the JVM's GC. Unsafe can instead allocate memory outside the heap:
    class SuperArray {
        private final static int BYTE = 1;

        private long size;
        private long address;

        public SuperArray(long size) {
            this.size = size;
            address = getUnsafe().allocateMemory(size * BYTE);
        }

        public void set(long i, byte value) {
            getUnsafe().putByte(address + i * BYTE, value);
        }

        public int get(long idx) {
            return getUnsafe().getByte(address + idx * BYTE);
        }

        public long size() {
            return size;
        }
    }
Memory allocated through Unsafe is not limited by Integer.MAX_VALUE and lives outside the heap. Use it with great care: if you forget to free it manually, the memory leaks; if you access an invalid address, the JVM crashes. It is useful when you need large contiguous regions, or for real-time programming that cannot tolerate JVM (GC) latency. java.nio uses this technique.
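Since the two risks named above are leaks and invalid addresses, one defensive variation is sketched below: the buffer implements AutoCloseable so that freeMemory() is tied to try-with-resources, and every access is bounds-checked before touching native memory. OffHeapBytes is a hypothetical name, and this is only one possible design, not how java.nio implements its buffers.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class OffHeapBytes implements AutoCloseable {
    private static final Unsafe U = loadUnsafe();

    private final long size;
    private long address; // 0 once the memory has been freed

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    OffHeapBytes(long size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        this.size = size;
        this.address = U.allocateMemory(size);
        U.setMemory(address, size, (byte) 0); // allocateMemory returns uninitialized memory
    }

    void set(long index, byte value) {
        check(index);
        U.putByte(address + index, value);
    }

    byte get(long index) {
        check(index);
        return U.getByte(address + index);
    }

    // Rejects use-after-free and out-of-range indexes before any raw access.
    private void check(long index) {
        if (address == 0) throw new IllegalStateException("memory already freed");
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
    }

    // Idempotent: a second close() is a no-op rather than a double free.
    @Override
    public void close() {
        if (address != 0) {
            U.freeMemory(address);
            address = 0;
        }
    }

    public static void main(String[] args) {
        try (OffHeapBytes bytes = new OffHeapBytes(1024)) {
            bytes.set(10, (byte) 42);
            System.out.println(bytes.get(10)); // 42
        } // freeMemory() runs here, even if the body throws
    }
}
```

The bounds check costs a little on every access, which may or may not be acceptable depending on why off-heap memory was chosen in the first place.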
(10) Applications in Java concurrency
The compareAndSwap family of methods in Unsafe (compareAndSwapInt(), compareAndSwapLong(), compareAndSwapObject()) can be used to implement efficient lock-free data structures.
    // Counter is assumed to declare increment() and getCounter()
    class CASCounter implements Counter {
        private volatile long counter = 0;
        private Unsafe unsafe;
        private long offset;

        public CASCounter() throws Exception {
            unsafe = getUnsafe();
            offset = unsafe.objectFieldOffset(CASCounter.class.getDeclaredField("counter"));
        }

        @Override
        public void increment() {
            long before = counter;
            // retry until no other thread changed the value in between
            while (!unsafe.compareAndSwapLong(this, offset, before, before + 1)) {
                before = counter;
            }
        }

        @Override
        public long getCounter() {
            return counter;
        }
    }
In tests, this counter performs about as well as Java's atomic variables. The atomic variables also use Unsafe's compareAndSwap methods, which ultimately map to the corresponding CPU primitives, so they are very efficient. A lock-free HashMap design is described at http://www.azulsystems.com/about_us/presentations/lock-free-hash; its idea is to analyze each state, create copies, modify the copies, and publish them with CAS primitives, spinning on failure. On ordinary servers (fewer than 32 cores), ConcurrentHashMap is clearly sufficient (before JDK 8 it used lock striping with 16 segments by default; since JDK 8 it has been reimplemented to rely largely on CAS operations).
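For comparison, the same retry loop can be written against the public java.util.concurrent.atomic API, without touching Unsafe directly. The sketch below uses AtomicLong.compareAndSet() and a small multi-threaded driver; CasLoopCounter and runConcurrently() are hypothetical names for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CasLoopCounter {
    private final AtomicLong counter = new AtomicLong();

    // Same shape as the Unsafe version: read, attempt the swap, retry on conflict.
    void increment() {
        long before;
        do {
            before = counter.get();
        } while (!counter.compareAndSet(before, before + 1));
    }

    long get() {
        return counter.get();
    }

    // Runs `threads` workers that each increment a shared counter `perThread` times.
    static long runConcurrently(int threads, int perThread) {
        CasLoopCounter shared = new CasLoopCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) shared.increment();
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(4, 100_000)); // 400000: no update is lost
    }
}
```

In application code, AtomicLong.incrementAndGet() does the same job in one call; the explicit loop is shown only to mirror the structure of the Unsafe-based counter.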