1. The upper limit of array allocation
Java arrays are indexed with int, so an array cannot hold more than Integer.MAX_VALUE (2^31-1) elements. That does not mean a single allocation is capped at 2 GB: you can allocate an array with a wider element type. For example:
final long[] ar = new long[ Integer.MAX_VALUE ];
This allocates 16 GB minus 8 bytes, provided the -Xmx setting is large enough. As a rule of thumb the heap should keep at least 50% of its space free, so allocating 16 GB calls for roughly -Xmx24G. This is only a general rule; the actual requirement depends on the situation. Also note that some VMs reserve a few header words in an array and may reject a length this close to Integer.MAX_VALUE.
Unfortunately, because Java arrays are strictly typed by element, working with raw memory is awkward. For operating on arrays, ByteBuffer is probably the most useful class: it provides methods to read and write the different Java primitive types. Its drawback is that the backing array must be a byte[], so a single buffer can hold at most 2 GB.
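As a minimal sketch of what ByteBuffer offers here, the following wraps a byte[] and reads and writes a long and an int through it. The class name ByteBufferDemo and the helper view are hypothetical names chosen for illustration, and the little-endian byte order is an assumption made so that the byte layout is predictable.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteBufferDemo {
    /** Wraps a byte[] so typed values can be read and written at byte offsets. */
    static ByteBuffer view(byte[] backing) {
        // Assumption: little-endian order, so the lowest byte of a value comes first.
        return ByteBuffer.wrap(backing).order(ByteOrder.LITTLE_ENDIAN);
    }

    public static void main(String[] args) {
        byte[] backing = new byte[16];
        ByteBuffer buf = view(backing);

        buf.putLong(0, 0x0102030405060708L); // occupies bytes 0..7
        buf.putInt(8, 42);                   // occupies bytes 8..11

        System.out.println(Long.toHexString(buf.getLong(0)));
        System.out.println(buf.getInt(8));
        // The writes went straight into the backing array.
        System.out.println(backing[0]);
    }
}
```

Because the buffer's capacity is an int, this approach cannot address more than Integer.MAX_VALUE bytes in one buffer.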
2. Treating any array as a byte array
Suppose 2 GB is nowhere near enough for us, but 16 GB would do. We have allocated a long[], but we want to operate on it as a byte array. In Java that means asking the C programmers' friend for help: sun.misc.Unsafe. This class has two families of methods. getN(Object, offset) reads a value of the given type from the specified offset within the object and returns it; N stands for the type of the returned value. putN(Object, offset, value) writes a value at the specified offset within the object.
Unfortunately, these methods can only get or set a single value of a given type. To copy data out of an array you need another Unsafe method, copyMemory(srcObject, srcOffset, destObject, destOffset, count). It works much like System.arraycopy, but it copies bytes rather than array elements.
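A minimal sketch of copyMemory in use, copying the raw bytes of a long[] into a byte[]. The class name CopyMemoryDemo and the helpers loadUnsafe and toBytes are hypothetical; the sketch assumes a JDK on which sun.misc.Unsafe is still reachable through reflection (recent JDKs may print deprecation warnings).

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class CopyMemoryDemo {
    private static final Unsafe UNSAFE = loadUnsafe();

    // Reads the singleton held in the private static field "theUnsafe".
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Copies the raw bytes of src into a new byte[] of length 8 * src.length. */
    static byte[] toBytes(long[] src) {
        byte[] dest = new byte[src.length * Long.BYTES];
        UNSAFE.copyMemory(
                src, UNSAFE.arrayBaseOffset(long[].class),   // start of the long data
                dest, UNSAFE.arrayBaseOffset(byte[].class),  // start of the byte data
                dest.length);                                // count is in bytes, not elements
        return dest;
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new long[] { -1L, 0L });
        System.out.println(bytes.length + " bytes copied");
    }
}
```

Note that the count argument is a number of bytes; nothing checks it against the array bounds, so a wrong value can corrupt the heap.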
To access the data of an array through sun.misc.Unsafe, you need two things:
1. The offset of the data within the array object
2. The offset of the element you want within the array data
Like other Java objects, arrays have an object header, which is stored in front of the actual data. The length of this header can be obtained with unsafe.arrayBaseOffset(T[].class), where T is the array's element type. The size of one array element can be obtained with unsafe.arrayIndexScale(T[].class). This means that to access the Nth element of a T[] array, the offset should be arrayBaseOffset + N * arrayIndexScale.
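The offset formula can be sketched as a pair of helpers that read and write the Nth element of a long[] through Unsafe. The names ArrayOffsetDemo, elementOffset, read and write are hypothetical, and the sketch deliberately performs no bounds checking, which is exactly what makes this kind of access dangerous.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class ArrayOffsetDemo {
    private static final Unsafe UNSAFE = loadUnsafe();
    // Size of the array header, i.e. where element 0 starts.
    private static final long BASE = UNSAFE.arrayBaseOffset(long[].class);
    // Size of one element in bytes (8 for long[]).
    private static final long SCALE = UNSAFE.arrayIndexScale(long[].class);

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Offset of element n: arrayBaseOffset + n * arrayIndexScale. */
    static long elementOffset(int n) {
        return BASE + n * SCALE;
    }

    // No bounds checks: the caller must keep n inside the array.
    static long read(long[] a, int n) {
        return UNSAFE.getLong(a, elementOffset(n));
    }

    static void write(long[] a, int n, long value) {
        UNSAFE.putLong(a, elementOffset(n), value);
    }

    public static void main(String[] args) {
        long[] a = new long[10];
        write(a, 3, 12345L);
        System.out.println(a[3] + " " + read(a, 3));
    }
}
```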
Let's write a simple example. We allocate a long array and update a few bytes inside it. We set the last element to -1 (0xFFFFFFFFFFFFFFFF) and then clear the bytes of this element one by one.
final long[] ar = new long[ 1000 ];
final int index = ar.length - 1;
ar[ index ] = -1; // 0xFFFFFFFFFFFFFFFF
System.out.println( "Before change = " + Long.toHexString( ar[ index ] ));
for ( long i = 0; i < 8; ++i )
{
    // byte i of the last element: header size + 8 bytes per preceding element + i
    unsafe.putByte( ar, longArrayOffset + 8L * index + i, (byte) 0 );
    System.out.println( "After change: i = " + i + ", val = " + Long.toHexString( ar[ index ] ));
}
To run the example above, add the following static fields and static initializer to your test class:
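// Requires at the top of the file:
//   import java.lang.reflect.Field;
//   import sun.misc.Unsafe;
private static final Unsafe unsafe;
static
{
    try
    {
        // the singleton instance is held in the private static field "theUnsafe"
        Field field = Unsafe.class.getDeclaredField( "theUnsafe" );
        field.setAccessible( true );
        unsafe = (Unsafe) field.get( null );
    }
    catch ( Exception e )
    {
        throw new RuntimeException( e );
    }
}
// offset of the first element of a long[] from the start of the array object
private static final long longArrayOffset = unsafe.arrayBaseOffset( long[].class );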
The output is:
Before change = ffffffffffffffff
After change: i = 0, val = ffffffffffffff00
After change: i = 1, val = ffffffffffff0000
After change: i = 2, val = ffffffffff000000
After change: i = 3, val = ffffffff00000000
After change: i = 4, val = ffffff0000000000
After change: i = 5, val = ffff000000000000
After change: i = 6, val = ff00000000000000
After change: i = 7, val = 0
3. Allocating memory with sun.misc.Unsafe
As mentioned above, in pure Java the amount of memory we can allocate in one piece is limited. This restriction dates back to the first version of Java, when nobody imagined allocating several gigabytes of memory at once. But this is the era of big data, and we need more memory. In Java, there are two ways to get it:
1. Allocate many small chunks of memory and then logically use them as a continuous large chunk of memory.
2. Use sun.misc.Unsafe.allocateMemory(long) to allocate memory.
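For reference, the first approach might look roughly like the sketch below: a long-indexed array built from fixed-size long[] chunks. The class name ChunkedLongArray and the chunk size of 2^20 elements are arbitrary choices made for illustration, not something prescribed by the original text.

```java
final class ChunkedLongArray {
    // Assumption: 2^20 longs (8 MB) per chunk; any power of two works.
    private static final int CHUNK_BITS = 20;
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final long[][] chunks;
    private final long length;

    ChunkedLongArray(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        this.length = length;
        int chunkCount = (int) ((length + CHUNK_SIZE - 1) >>> CHUNK_BITS);
        chunks = new long[chunkCount][];
        long remaining = length;
        for (int c = 0; c < chunkCount; c++) {
            // Every chunk is full except possibly the last one.
            chunks[c] = new long[(int) Math.min(CHUNK_SIZE, remaining)];
            remaining -= chunks[c].length;
        }
    }

    long length() {
        return length;
    }

    // High bits of the index select the chunk, low bits the slot inside it.
    long get(long index) {
        return chunks[(int) (index >>> CHUNK_BITS)][(int) (index & CHUNK_MASK)];
    }

    void set(long index, long value) {
        chunks[(int) (index >>> CHUNK_BITS)][(int) (index & CHUNK_MASK)] = value;
    }

    public static void main(String[] args) {
        ChunkedLongArray big = new ChunkedLongArray(3_000_000L);
        big.set(2_500_000L, 7L);
        System.out.println(big.get(2_500_000L));
    }
}
```

This keeps everything on the Java heap and under the garbage collector's control, at the cost of an extra indirection on every access.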
The first method is mostly interesting from an algorithmic point of view, so let's take a look at the second one.
sun.misc.Unsafe provides a set of methods to allocate, reallocate, and release memory. They are very similar to C's malloc/realloc/free functions:
1. long Unsafe.allocateMemory(long size) - allocates a block of memory. The block may contain junk data (it is not cleared automatically). If the allocation fails, a java.lang.OutOfMemoryError is thrown. It returns a non-zero memory address (see the description below).
2. long Unsafe.reallocateMemory(long address, long size) - reallocates a block of memory, copying the data from the old memory buffer (the one address points to) into the newly allocated block. If address is 0, this method behaves like allocateMemory. It returns the address of the new memory buffer.
3. void Unsafe.freeMemory(long address) - frees a memory buffer obtained from either of the previous two methods. If address is 0, it does nothing.
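The three calls can be sketched together as follows: allocate a small block, write to it, grow it, check that the old contents survived, and free it. The class name OffHeapLifecycleDemo and the method growAndRead are hypothetical names used only for this illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class OffHeapLifecycleDemo {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes two longs, grows the block, and returns the values read back. */
    static long[] growAndRead() {
        long addr = UNSAFE.allocateMemory(16); // contents are undefined at this point
        try {
            UNSAFE.putLong(addr, 0x1122334455667788L);
            UNSAFE.putLong(addr + 8, 42L);

            // The block may move, so always continue with the returned address.
            addr = UNSAFE.reallocateMemory(addr, 1024);

            return new long[] { UNSAFE.getLong(addr), UNSAFE.getLong(addr + 8) };
        } finally {
            UNSAFE.freeMemory(addr); // nothing else will release this block
        }
    }

    public static void main(String[] args) {
        long[] values = growAndRead();
        System.out.println(Long.toHexString(values[0]) + " " + values[1]);
    }
}
```

As with realloc in C, the old address must not be used after a successful reallocateMemory call.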
Memory allocated by these methods is meant to be accessed in what is called single-register addressing mode: Unsafe provides a set of methods that take just one address parameter (unlike double-register mode, which takes an Object and an offset). This memory lives outside the Java heap, so it can be larger than what you configure with the -Xmx Java parameter.
Note: Memory allocated through Unsafe is not garbage collected. Treat it like any other manually managed resource and free it yourself.
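One way to manage such memory as an ordinary resource is to wrap it in an AutoCloseable and use try-with-resources. The sketch below is an assumption about how that could be done, not part of the Unsafe API; the class OffHeapBuffer and its methods are hypothetical, and it is not thread-safe.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class OffHeapBuffer implements AutoCloseable {
    private static final Unsafe UNSAFE = loadUnsafe();

    private final long size;
    private long address; // 0 once the buffer has been freed

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    OffHeapBuffer(long size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive: " + size);
        }
        this.size = size;
        this.address = UNSAFE.allocateMemory(size);
        UNSAFE.setMemory(address, size, (byte) 0); // allocateMemory does not clear the block
    }

    byte get(long offset) {
        check(offset);
        return UNSAFE.getByte(address + offset);
    }

    void put(long offset, byte value) {
        check(offset);
        UNSAFE.putByte(address + offset, value);
    }

    // Unsafe itself checks nothing, so guard against freed buffers and bad offsets here.
    private void check(long offset) {
        if (address == 0) {
            throw new IllegalStateException("buffer already freed");
        }
        if (offset < 0 || offset >= size) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
    }

    @Override
    public void close() {
        if (address != 0) {
            UNSAFE.freeMemory(address);
            address = 0; // makes close() safe to call twice
        }
    }
}
```

A caller would then write `try (OffHeapBuffer buf = new OffHeapBuffer(64)) { buf.put(0, (byte) 1); }` and the memory is released when the block exits, even if an exception is thrown.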
Here is an example of using Unsafe.allocateMemory to allocate memory, and it also checks whether the entire memory buffer is readable and writeable:
final int size = Integer.MAX_VALUE / 2;
final long addr = unsafe.allocateMemory( size );
try
{
    System.out.println( "Unsafe address = " + addr );
    for ( int i = 0; i < size; ++i )
    {
        // write a byte, then read it back to verify the location is usable
        unsafe.putByte( addr + i, (byte) 123 );
        if ( unsafe.getByte( addr + i ) != 123 )
            System.out.println( "Failed at offset = " + i );
    }
}
finally
{
    // this memory is not garbage collected, so release it explicitly
    unsafe.freeMemory( addr );
}
As you can see, sun.misc.Unsafe lets you write very general memory access code: no matter how the memory was allocated, on the Java heap or outside it, you can read and write data of any type at will.
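To make that generality concrete, here is a minimal sketch of one routine that sums longs from either a heap array or a block obtained from allocateMemory. It relies on the double-register methods treating a null object as "the offset is an absolute address". The names GenericAccessDemo, sum, sumHeap and sumOffHeap are hypothetical.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class GenericAccessDemo {
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Sums count longs starting at (base, offset).
     * base == null means offset is an absolute native address.
     */
    static long sum(Object base, long offset, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += UNSAFE.getLong(base, offset + (long) i * Long.BYTES);
        }
        return total;
    }

    static long sumHeap() {
        long[] heap = { 1, 2, 3, 4 };
        return sum(heap, UNSAFE.arrayBaseOffset(long[].class), heap.length);
    }

    static long sumOffHeap() {
        long addr = UNSAFE.allocateMemory(4 * Long.BYTES);
        try {
            for (int i = 0; i < 4; i++) {
                UNSAFE.putLong(addr + (long) i * Long.BYTES, i + 1);
            }
            return sum(null, addr, 4);
        } finally {
            UNSAFE.freeMemory(addr);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumHeap() + " " + sumOffHeap());
    }
}
```

The same sum method serves both cases; only the (base, offset) pair passed in differs.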