I assume everyone knows what JSON is. If you don't, you're seriously out of the loop; go Google it. I won't introduce it here.
I'd guess most people have rarely heard of Protocol Buffers, but since it comes from Google, I believe you'll want to give it a try; after all, what Google puts out is mostly high quality.
Protocol Buffers (protobuf) is a transmission format similar to JSON. Strictly speaking it isn't really a protocol at all; it's just a way of serializing data for transmission.
So what is the difference between it and JSON?
Cross-language support is one of its advantages. It ships with a compiler, protoc: you feed it a definition file and it generates Java, Python, or C++ code (only those three for now, so don't count on other languages yet), and you can use the generated classes directly without writing anything else; even the parsing code is generated for you. JSON is of course cross-language too, but there the cross-language part depends on code you write yourself.
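For example, the same proto file can be compiled for each of the three languages just by switching the output flag (the paths and file name here are placeholders, not from the project below):

protoc --java_out=./gen-java student.proto
protoc --python_out=./gen-python student.proto
protoc --cpp_out=./gen-cpp student.proto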
If you want to know more, you can check it out:
https://developers.google.com/protocol-buffers/docs/overview
Okay, without further ado, let's look at why we would compare Protocol Buffers (hereinafter GPB) with JSON at all.
1. JSON is text with a fixed format, and field names are repeated in every record, so there is plenty of room to shrink the data; with large payloads, GPB output is much smaller than JSON. A rough byte-level illustration follows this list, and the tests below show the same thing.
2. The efficiency gap between JSON libraries is considerable; in a quick test, Jackson and Gson differed by roughly a factor of 5-10 (I only ran that test once, so take it with a grain of salt and correct me gently if it's wrong). GPB only needs one library, so there is no such spread. Honestly this second point is just padding and can be ignored.
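As a rough back-of-the-envelope illustration of point 1 (my own arithmetic, not one of the measurements below): a record with id 11, name "shun" and age 25 is the 32-character string {"id":11,"name":"shun","age":25} in JSON, while the protobuf wire encoding of the same data is about 10 bytes: 2 bytes for the id (tag + varint), 6 bytes for the name (tag + length + 4 UTF-8 bytes), and 2 bytes for the age (tag + varint). Field names are never transmitted in GPB; the numeric tags declared in the proto file stand in for them.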
Talk is cheap. Show me the code.
In the programming world, code is king, so let's get straight to it.
Before we get to the code, you first need to download Protocol Buffers, from here:
https://github.com/google/protobuf
1. First of all, GPB needs a file containing class-like definitions, called a proto file.
Let's use students and teachers as the example.
We have the following two files. First, student.proto:
option java_package = "com.shun";
option java_outer_classname = "StudentProto";

message Student {
    required int32 id = 1;
    optional string name = 2;
    optional int32 age = 3;
}

And teacher.proto:
import "student.proto"; option java_package = "com.shun"; option java_outer_classname = "TeacherProto"; message Teacher { required int32 id = 1; optional string name = 2; repeated Student student_list = 3; }</span> Here we encountered some strange things:
import, int32, repeated, required, optional, option, etc.
1) import pulls in definitions from other proto files.
2) required and optional indicate whether a field must be present, which determines how protobuf handles a missing value. If a field is marked required but no value is set when the message is built, an error is thrown; if it is marked optional, leaving it unset is fine. (A short sketch of this behavior follows this list.)
3) repeated should be self-explanatory: the field can occur multiple times, similar to a List in Java.
4) message is the equivalent of a class.
5) option sets options. Here java_package is the package name used for the generated Java code, and java_outer_classname is the name of the generated outer class; note that it must not clash with the name of any message defined inside the file.
As for other options and related types, please visit the official documentation.
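To make the required/optional difference from point 2 concrete, here is a minimal sketch (it assumes the generated classes from the steps below and the standard protobuf-java runtime behavior):

package com.shun.test;

import com.shun.StudentProto.Student;

public class RequiredFieldDemo {
    public static void main(String[] args) {
        Student.Builder builder = Student.newBuilder();
        builder.setName("shun");           // optional fields may simply be left unset
        // Calling builder.build() here would throw UninitializedMessageException,
        // because the required field "id" has no value yet.
        builder.setId(1);
        Student stu = builder.build();     // fine once every required field is set
        System.out.println(stu.hasAge());  // prints false: unset optional fields report hasXxx() == false
    }
}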
2. With these files in hand, what can we do?
Remember the compiler we downloaded above? Unzip it and you get protoc.exe. That's for Windows, of course; I haven't tried other systems, but feel free to.
Add it to your PATH (optional, but things are inconvenient without it), and then you can generate the class files we need from the proto files above.
protoc --java_out=<path to store the generated source> --proto_path=<path to the proto files> <proto files to compile>
--proto_path specifies the folder containing the proto files (not a single file); it is mainly used to resolve imports and can be omitted.
Say I want the generated source under D:/protobufferVsJson/src and my proto files live in D:/protoFiles.
Then my compilation command is:
protoc --java_out=D:/protobufferVsJson/src D:/protoFiles/teacher.proto D:/protoFiles/student.proto
Note that all files that need to be compiled must be listed at the end of the command.
After compilation you can see the generated files.
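Given the java_package and java_outer_classname values in the proto files above, the output should end up roughly here (my assumption; protoc creates the package directories for you):

D:/protobufferVsJson/src/com/shun/StudentProto.java
D:/protobufferVsJson/src/com/shun/TeacherProto.java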
I won't post the generated code here; it's far too long. Have a look yourself: it's full of Builder classes, and you'll recognize the Builder pattern at a glance.
Now you can paste the generated code into your project, and of course it will be full of errors.
Remember the source code we downloaded earlier? Go ahead and unzip it, then find src/main/java and copy the com.google.protobuf sources in there into your project. You could also build the library with ant or maven, but I'm not familiar with either, so I won't embarrass myself; I'm used to just copying the sources into the project.
The code still has errors. Ha, that's normal; for some reason Google insists on leaving this pit for us.
Go back to the /java directory of the protobuf source and open readme.txt; in it you'll find a sentence telling you to run protoc on descriptor.proto to generate the missing descriptor classes.
The command given there looks a bit strange to me, as if something is off; in any case I didn't run it as written, and my command is:
protoc --java_out=<the directory where the copied source code lives> <path to the descriptor.proto file>
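As a concrete (assumed) example: if the protobuf source archive was unzipped to D:/protobuf and the copied sources live in D:/protobufferVsJson/src, the command might look like this; adjust the paths to wherever your descriptor.proto actually is:

protoc --java_out=D:/protobufferVsJson/src --proto_path=D:/protobuf/src D:/protobuf/src/google/protobuf/descriptor.proto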
After execution, we can see that there are no errors in the code.
3. The next step is of course testing.
Let's run the GPB write test first:
package com.shun.test;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.shun.StudentProto.Student;
import com.shun.TeacherProto.Teacher;

public class ProtoWriteTest {
    public static void main(String[] args) throws IOException {
        Student.Builder stuBuilder = Student.newBuilder();
        stuBuilder.setAge(25);
        stuBuilder.setId(11);
        stuBuilder.setName("shun");

        // Construct the list
        List<Student> stuBuilderList = new ArrayList<Student>();
        stuBuilderList.add(stuBuilder.build());

        Teacher.Builder teaBuilder = Teacher.newBuilder();
        teaBuilder.setId(1);
        teaBuilder.setName("testTea");
        teaBuilder.addAllStudentList(stuBuilderList);

        // Write gpb to file
        FileOutputStream fos = new FileOutputStream("C://Users//shun//Desktop//test//test.protoout");
        teaBuilder.build().writeTo(fos);
        fos.close();
    }
}

Let's look at the file; if nothing unexpected happens, it should have been generated.
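For the record, if I haven't slipped in the wire-format arithmetic, this particular file should be around 23 bytes: 2 bytes for the teacher's id, 9 for the name "testTea", and 12 for the single embedded Student message (which is itself 10 bytes plus a 2-byte tag/length wrapper).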
After it is generated, we must read it back.
package com.shun.test;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import com.shun.StudentProto.Student;
import com.shun.TeacherProto.Teacher;

public class ProtoReadTest {
    public static void main(String[] args) throws FileNotFoundException, IOException {
        Teacher teacher = Teacher.parseFrom(new FileInputStream("C://Users//shun//Desktop//test//test.protoout"));
        System.out.println("Teacher ID:" + teacher.getId() + ",Name:" + teacher.getName());
        for (Student stu : teacher.getStudentListList()) {
            System.out.println("Student ID:" + stu.getId() + ",Name:" + stu.getName() + ",Age:" + stu.getAge());
        }
    }
}

The code is very simple, because the code GPB generates does everything for us.
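If everything works, the console output should look roughly like this (derived from the values set in the write test):

Teacher ID:1,Name:testTea
Student ID:11,Name:shun,Age:25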
That covers the basic usage. Now let's focus on the difference between the file sizes GPB and JSON produce. I won't post the full JSON code here; an example follows, and you can download the project if you're interested.
Here we use Gson to handle the JSON. The following is only the code that converts the object to JSON and writes it to a file.
I won't spell out the Student and Teacher classes; plain POJOs with the fields and setters used below are all you need.
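For reference, here is a minimal sketch of what I assume those two classes look like: two separate files in package com.shun with exactly the fields and setters used below (getters omitted for brevity; Gson reads fields via reflection when serializing, so it doesn't need them):

package com.shun;

public class Student {
    private int id;
    private String name;
    private int age;

    public void setId(int id) { this.id = id; }
    public void setName(String name) { this.name = name; }
    public void setAge(int age) { this.age = age; }
}

package com.shun;

import java.util.List;

public class Teacher {
    private int id;
    private String name;
    private List<Student> stuList;

    public void setId(int id) { this.id = id; }
    public void setName(String name) { this.name = name; }
    public void setStuList(List<Student> stuList) { this.stuList = stuList; }
}

With those in place, the Gson write test is as follows: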
package com.shun.test;

import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.google.gson.Gson;
import com.shun.Student;
import com.shun.Teacher;

public class GsonWriteTest {
    public static void main(String[] args) throws IOException {
        Student stu = new Student();
        stu.setAge(25);
        stu.setId(22);
        stu.setName("shun");

        List<Student> stuList = new ArrayList<Student>();
        stuList.add(stu);

        Teacher teacher = new Teacher();
        teacher.setId(22);
        teacher.setName("shun");
        teacher.setStuList(stuList);

        String result = new Gson().toJson(teacher);
        FileWriter fw = new FileWriter("C://Users//shun/Desktop//test//json");
        fw.write(result);
        fw.close();
    }
}

Next, we officially enter the real test code. So far we only put a single object in the list; now we test the file sizes generated by GPB and JSON in turn.
Let's improve the previous GPB code so that it generates lists of different sizes and writes a file for each:
package com.shun.test;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.shun.StudentProto.Student;
import com.shun.TeacherProto.Teacher;

public class ProtoWriteTest {

    public static final int SIZE = 100;

    public static void main(String[] args) throws IOException {
        // Construct the list
        List<Student> stuBuilderList = new ArrayList<Student>();
        for (int i = 0; i < SIZE; i++) {
            Student.Builder stuBuilder = Student.newBuilder();
            stuBuilder.setAge(25);
            stuBuilder.setId(11);
            stuBuilder.setName("shun");
            stuBuilderList.add(stuBuilder.build());
        }

        Teacher.Builder teaBuilder = Teacher.newBuilder();
        teaBuilder.setId(1);
        teaBuilder.setName("testTea");
        teaBuilder.addAllStudentList(stuBuilderList);

        // Write gpb to file
        FileOutputStream fos = new FileOutputStream("C://Users//shun//Desktop//test//proto-" + SIZE);
        teaBuilder.build().writeTo(fos);
        fos.close();
    }
}

Change SIZE to each of the test sizes mentioned above in turn, and you get the following:
Then let's take a look at the JSON test code:
package com.shun.test;

import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.google.gson.Gson;
import com.shun.Student;
import com.shun.Teacher;

public class GsonWriteTest {

    public static final int SIZE = 100;

    public static void main(String[] args) throws IOException {
        List<Student> stuList = new ArrayList<Student>();
        for (int i = 0; i < SIZE; i++) {
            Student stu = new Student();
            stu.setAge(25);
            stu.setId(22);
            stu.setName("shun");
            stuList.add(stu);
        }

        Teacher teacher = new Teacher();
        teacher.setId(22);
        teacher.setName("shun");
        teacher.setStuList(stuList);

        String result = new Gson().toJson(teacher);
        FileWriter fw = new FileWriter("C://Users//shun//Desktop//test//json" + SIZE);
        fw.write(result);
        fw.close();
    }
}

Modify SIZE in the same way and run the corresponding tests.
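The comparison here just reads the sizes off the file system; if you'd rather print them from code, a small helper like this works (my own addition; the directory and file names are the ones assumed in the tests above, so adjust them to whatever you generated):

package com.shun.test;

import java.io.File;

public class FileSizeCheck {
    public static void main(String[] args) {
        String dir = "C://Users//shun//Desktop//test//";
        // List whichever output files you generated for the sizes you tested.
        for (String name : new String[] {"proto-100", "json100"}) {
            File f = new File(dir + name);
            System.out.println(name + ": " + f.length() + " bytes");
        }
    }
}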
It is clear that as the data volume grows, the file sizes of JSON and GPB diverge considerably, with JSON being much larger.
The table above should make it clearer: with large data GPB wins by a wide margin. In practice, though, a client and server rarely exchange data that big; large payloads mostly occur in server-to-server transfers. If your requirement is, say, to ship hundreds of MB of log files to another server every day, GPB could be a big help.
This is billed as an in-depth comparison, but the main comparison here is size; I didn't compare time in any depth, and the difference there isn't large either.
I chose Gson as the JSON parser for this article; if you're interested you can try Jackson, fastjson, or others. The generated file size will be the same; only the parsing time differs.