Serialization and Deserialization
Serialization, in simple terms, means converting an object into a sequence of bytes; deserialization is exactly the opposite: the object is reconstructed from that sequence of bytes. In Java, serialization and deserialization play a major role in transferring data and saving it to a database or disk. Let's dive in and figure out how it happens and what these terms mean in Java.
Knowledge of basic Java constructs (basic syntax, packages, methods, etc.) and basic OOP concepts will be really helpful moving forward.
Before diving into the depths of the topic, let's consider a real-life example that shall help you appreciate the concept of serialization and deserialization in Java.
Assume you are talking to a friend on a phone call, i.e., over a cellular network. Your voice is first converted to electrical signals, which are then propagated to a mobile tower as radio signals. The tower routes these signals to the particular phone number. So before your voice (basically data) reaches the other end, which can be miles away, it goes through a number of conversions.
Relating this to Java: when data (or Java objects, as it is an OOP language) travels from one Java Virtual Machine to another, several conversions occur. These are mainly serialization and deserialization. Let's learn more about these!
What is Serialization in Java?
An object in Java has three characteristic properties:
- State: It represents the value or the data of the object.
- Behavior: Behavior tells us about the functionality of an object.
- Identity: As the name suggests, it helps to uniquely identify an object.
Serialization is, by definition, the mechanism for converting an object's state into a stream of bytes so that the object can be stored in or transmitted to memory, a database, or a file.
The objects we create in Java live in memory and are removed by the garbage collector once they are no longer in use. If an object has to be transferred, for example sent over a network, it must first be put into a format that both ends understand. For this purpose, the object is transformed into a byte stream. Note that this byte stream is not encrypted: anyone who can read it can recover the object's data.
One thing to note here is that the byte stream is platform-independent, which means after serializing the object on one platform, it can be converted back to the original state on any other platform.
Now let's talk about how we achieve serialization in Java. Java has a concept called a marker interface: an interface that has no fields or methods. Its only job is to tag a class as having some special capability. So, to make our class serializable, we use a marker interface called Serializable, which is part of the java.io package. The following example shows how a class implements the Serializable interface.
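A minimal sketch of such a class is given here. The fields id and name follow the later discussion of the Employee example; the constructor is an assumption added for illustration.

```java
import java.io.Serializable;

// Implementing the marker interface is all that is needed to opt in;
// Serializable declares no methods to override.
class Employee implements Serializable {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }
}
```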
In the above code snippet, we have imported the Serializable interface, which is implemented by class Employee. Thus, all the objects of this class can be converted to a byte stream.
We have added serializable behavior to the class. Let's see how we can serialize its objects. For writing objects into a stream, Java provides the ObjectOutputStream class. Writing succeeds only if the class implements the Serializable interface; only then can its objects be passed as arguments to the methods of ObjectOutputStream.
Methods of ObjectOutputStream Class
- writeObject() method: Serializes the object, and writes it to ObjectOutputStream.
- close(): Closes the current stream.
- flush(): Flushes the current output stream.
Continuing with the example, let's see how the object of class Employee can be serialized.
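The original listing is not reproduced here, so the following is a sketch of what it might look like. The class name Persist comes from the text; the helper method save, the sample values, and the file name f.txt are assumptions. It uses try-with-resources, which calls close() implicitly.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Employee implements Serializable {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class Persist {
    // Serializes one Employee to the file at the given path.
    static void save(Employee employee, String path) throws IOException {
        // try-with-resources flushes and closes the stream, even on failure
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(employee);
        }
    }

    public static void main(String[] args) throws IOException {
        save(new Employee(101, "Asha"), "f.txt");
        System.out.println("Employee written to f.txt");
    }
}
```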
Inside the class Persist, we create an object of class Employee. This object is then serialized by the writeObject() method, and its state is saved to the file f.txt. Finally, the close() method closes the stream.
Example of Serialization in Java
Let's consider a few examples with some conditions.
Example 1: Serialization in Java with Inheritance
In OOP, the capability of a class to derive properties and characteristics from another class is called inheritance. We have a parent (super) class and a child (derived) class. Let's take an example:
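A sketch of such a pair of classes, assuming the names A and B used in the discussion; the fields and the roundTrip helper are illustrative additions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class A implements Serializable {
    int i;
}

// B does not mention Serializable, but inherits it from A.
class B extends A {
    int j;
}

class InheritanceDemo {
    // Serializes obj into memory and reads it straight back.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        B b = new B();
        b.i = 1;
        b.j = 2;
        B copy = (B) roundTrip(b);
        System.out.println(copy.i + " " + copy.j); // prints "1 2"
    }
}
```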
So in the example above, class A is the parent class, and class B inherits from class A. As you can see, class B does not implement Serializable but is still eligible for serialization. This is because, in Java, a child class doesn't have to implement Serializable if its parent class has already done so.
Example 2: Serialization in Java With Aggregation
Aggregation in OOP is a way to reuse a class: one class holds a reference to an object of another class as one of its fields.
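The following sketch illustrates the situation. The class names Address and Student and the helper canSerialize are hypothetical; Address deliberately does not implement Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Not Serializable on purpose.
class Address {
    String city;

    Address(String city) {
        this.city = city;
    }
}

class Student implements Serializable {
    int id;
    Address address; // reference to a non-serializable class

    Student(int id, Address address) {
        this.id = id;
        this.address = address;
    }
}

class AggregationDemo {
    // Returns false if serializing obj hits a non-serializable object.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Student s = new Student(1, new Address("Pune"));
        System.out.println(canSerialize(s)); // prints "false"
    }
}
```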
The code results in an error (a NotSerializableException at runtime) because, when a class holds a reference to an object of another class, that other class must implement the Serializable interface too.
Example 3: Serialization in Java With Array or Collection
If any field of a serializable object is an array or a collection of objects, all of the contained objects must be serializable as well. Otherwise, a NotSerializableException is thrown.
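As a rough illustration, assume a hypothetical Team class holding an Object[] array. Strings are serializable, while a plain java.lang.Object is not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Team implements Serializable {
    Object[] members;

    Team(Object... members) {
        this.members = members;
    }
}

class ArrayDemo {
    // Returns false if any element reached during serialization
    // is not serializable.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize(new Team("Ann", "Bob")));        // prints "true"
        System.out.println(canSerialize(new Team("Ann", new Object()))); // prints "false"
    }
}
```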
How does Java Serialization Work?
So far, we have understood the Serializable interface and ObjectOutputStream. Let's now understand how the serialization algorithm works. The whole process of serialization occurs recursively.
- Java makes use of a feature called reflection to examine and read the data from the fields of the object being serialized at runtime.
- If an object is present inside the field of the previous object, it is serialized in a recursive manner.
Why do We Need Java Serialization?
Let's list out a few use cases and understand why we need serialization in Java.
- Communication: Converting objects into byte streams allows them to be transferred over a network, so that multiple systems can share and work with the same objects.
- Persistence: The state of any object can be stored in a database easily and can be retrieved at any point in time.
- Cloning: It makes the process of cloning really simple, as an exact copy of an object can be created by serializing it into a byte stream and then converting it back to the original form.
- Time Saving: Restoring an object from a byte stream can take less time than rebuilding it from scratch when its state is expensive to compute.
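The cloning use case can be sketched as below. DeepCopy and copy are hypothetical names; the approach works only if everything reachable from the object is serializable, and it is usually slower than a hand-written copy constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class DeepCopy {
    // Creates an independent copy by serializing to memory and reading back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> names = new ArrayList<>(List.of("a", "b"));
        ArrayList<String> clone = copy(names);
        clone.add("c"); // changes the copy only
        System.out.println(names.size() + " " + clone.size()); // prints "2 3"
    }
}
```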
Why Is Serializable Not Implemented by Object?
A natural question is: since any object might need to be serialized, why doesn't the Object class itself implement Serializable?
If it did, various security loopholes would arise. What if you don't want certain fields to be serializable, for example the ones holding credentials or other sensitive information?
When a class explicitly implements Serializable, you are made aware that every field that must not be serialized has to be marked transient. (Transient sounds like jargon, right? Don't worry, it is covered in a later section.) If Object implemented Serializable, every class would be serializable by default, and forgetting to mark a sensitive field transient would silently expose it. That would be unsafe.
Thus, it is better to opt in the specific classes you want serialized than to examine the whole codebase looking for fields that shouldn't be included during serialization.
Serial Version UID in Java
During the serialization process, the Java runtime associates a version number with each serializable class, called the serial version UID (serialVersionUID). This attribute acts as an identifier during serialization and deserialization: it is used to verify that the class loaded during deserialization is compatible with the class that was used during serialization. The sender and receiver of a serialized object must load compatible classes. If the UID of the class loaded by the receiver does not match the UID of the corresponding sender's class, deserialization results in an InvalidClassException.
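How is the serial version UID declared? A serializable class can declare its own UID explicitly by declaring a field named serialVersionUID, which must be static, final, and of type long. The general form is: ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;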
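Applied to the Employee class, the declaration might look like this sketch. The value 1L is arbitrary, and private is a common choice of access modifier because the field is only meaningful to the declaring class.

```java
import java.io.Serializable;

class Employee implements Serializable {
    // Explicit version identifier; change it only when the serialized
    // form of the class changes incompatibly.
    private static final long serialVersionUID = 1L;

    int id;
    String name;
}
```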
A serial version UID is declared for the Employee class.
But what if we don't explicitly declare a UID for a class? Does that mean there won't be any identifier associated with the class? The serialization runtime takes care of this situation: it computes a default UID for the class based on several aspects of the class, as described in the Java Object Serialization Specification.
Serialver Command in Java
In order to get the serial version UID for a class, the Java Development Kit has a built-in command, serialver. The command is written as: serialver [-classpath classpath] [-show] [classname...]
So we have two options here:
- -classpath: This option helps us specify the path of the class or where to look for a particular class.
- -show: show option displays a simple user interface, where you can enter the full class name and press ENTER or SHOW to get the serial version UID.
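The same value can also be read from inside a program through java.io.ObjectStreamClass. The sketch below uses hypothetical classes Declared and Defaulted; the default UID printed for Defaulted depends on the class's structure, so no particular value is shown for it.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class UidLookup {
    static class Declared implements Serializable {
        private static final long serialVersionUID = 42L;
    }

    static class Defaulted implements Serializable {
        int x; // no explicit UID: the runtime computes one
    }

    // Returns the serial version UID the runtime associates with the class.
    static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) { // lookup returns null for non-serializable classes
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Declared.class));  // prints "42"
        System.out.println(uidOf(Defaulted.class)); // computed default value
    }
}
```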
Points to Remember while Implementing the Serializable Interface in Java
There are a few points you need to remember while implementing serialization.
- Serializable Interface must be implemented by all the associated objects.
- If the parent class has already implemented the Serializable interface, in that case, a child class doesn't have to implement it.
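- During the serialization process, only non-static, non-transient data members are saved.
- While converting the byte stream back to the original object, the constructor of the serializable class is not called (only the no-argument constructor of its nearest non-serializable superclass runs).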
- Serializable Interface has no methods or fields of its own.
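Two of these points, that static members are not saved and that the constructor is not called, can be observed with a small sketch. Counter and PointsDemo are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Counter implements Serializable {
    static int created = 0; // static: belongs to the class, not written to the stream
    int value;

    Counter(int value) {
        this.value = value;
        created++; // counts constructor calls
    }
}

class PointsDemo {
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Counter original = new Counter(5);
        Counter copy = (Counter) roundTrip(original);
        System.out.println(copy.value);      // prints "5"
        System.out.println(Counter.created); // prints "1": the constructor did not run again
    }
}
```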
Advantages of Serialization
Mostly we have covered the merits and use cases of Serialization in the section where we discussed its need. But to summarize, the main advantages of serialization are:
- It helps to preserve the state or the data of the object.
- It is platform-independent.
- It enables efficient transfer of objects between two platforms.
- It helps in creating replicas of objects, i.e., cloning them.
- It is easy to understand and customizable.
- Its byte stream can be wrapped in other streams, for example to encrypt or compress it; serialization by itself does not encrypt anything.
What is Deserialization in Java?
Put precisely, deserialization is the exact reverse of serialization: it is the process of converting a stream of bytes back into the original state of the object. To perform deserialization, Java provides the ObjectInputStream class, the counterpart of the ObjectOutputStream we studied for serialization. Let's talk about the methods of this class.
Methods of ObjectInputStream Class
- readObject(): This method converts the stream of bytes to the state of the object. In other words, it reads an object from the input stream.
- close(): As the name suggests, this method basically closes the Input stream.
The above image shows the complete process of serialization and deserialization.
Example of Deserialization in Java
Now that we have understood how to implement deserialization, let's take an example. In the section explaining serialization, we took a serializable class Employee and serialized its object. Let's deserialize it.
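The original listing is not reproduced here; a sketch of it could look as follows. The class name Persist and the file f.txt come from the text, while the helper methods save and load and the sample values are assumptions.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Employee implements Serializable {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class Persist {
    static void save(Employee employee, String path) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(employee);
        }
    }

    // Reads one Employee back from the file at the given path.
    static Employee load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(path))) {
            return (Employee) in.readObject(); // readObject() returns Object
        }
    }

    public static void main(String[] args) throws Exception {
        save(new Employee(101, "Asha"), "f.txt");
        Employee restored = load("f.txt");
        System.out.println(restored.id + " " + restored.name); // prints "101 Asha"
    }
}
```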
So here in the class Persist, after the object has been serialized, we move on to deserialization. With the help of ObjectInputStream, the object is read from the file f.txt; the readObject() method performs the deserialization. To verify the process, we print the fields of the object. The name and id are printed correctly, so deserialization is successful.
How does Java Deserialization Work?
One thing to note about the deserialization algorithm is that the constructor of the serializable class is not called. An empty object is created, and, again with the help of reflection, the data is written to its fields. As in serialization, private and final fields are included as well.
Advantages of Deserialization in Java
- It reconstructs the object from the byte stream instead of rebuilding its state from scratch, which can save time when that state is expensive to compute.
- It is simple to customize.
- Built-in feature of Java, no third-party tool is required.
Explaining Java Deserialize Vulnerabilities
In simple words, Java deserialization vulnerabilities are security vulnerabilities that arise when unexpected or tampered objects are injected into the byte stream that an application deserializes. Suppose your Java application reconstructs an object from a byte stream and expects a previously serialized object, say obj1. If an attacker is able to modify or replace the stream, the application receives a different object, obj2, instead. Untrusted and malicious byte streams can easily exploit vulnerable deserialization code in this way.
How to Prevent a Java Deserialize Vulnerability?
The following approaches help to prevent Java deserialization vulnerabilities:
- The most basic approach is performing inspection of the objects from a deserialized object stream or in other words, basic filtration of the ObjectInputStream. There are several libraries to perform these validation actions.
- The other way is to forbid objects of some classes from being deserialized, the blacklisting approach.
- We can also allow only objects of a set of approved classes to be deserialized, the whitelist approach. Deserialization then occurs in a restrictive manner, which reduces the chances of deserialization vulnerabilities.
- Keep the open source libraries up to date.
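One way to sketch the whitelist approach is with java.io.ObjectInputFilter, available since Java 9. The classes Employee, Unexpected and FilteredRead are hypothetical, and the allow-list here is only an illustration, not a complete defense.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Set;

class Employee implements Serializable {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// A serializable class that is NOT on the allow-list.
class Unexpected implements Serializable {
}

class FilteredRead {
    private static final Set<Class<?>> ALLOWED = Set.of(Employee.class, String.class);

    // Deserializes data, rejecting any class outside the allow-list.
    static Object readAllowed(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            // The filter must be set before the first object is read.
            in.setObjectInputFilter(info -> {
                Class<?> type = info.serialClass();
                if (type == null) {
                    return ObjectInputFilter.Status.UNDECIDED; // not a class check
                }
                return ALLOWED.contains(type)
                        ? ObjectInputFilter.Status.ALLOWED
                        : ObjectInputFilter.Status.REJECTED;
            });
            return in.readObject();
        }
    }

    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Employee e = (Employee) readAllowed(toBytes(new Employee(101, "Asha")));
        System.out.println(e.name); // prints "Asha"
        try {
            readAllowed(toBytes(new Unexpected()));
        } catch (InvalidClassException rejected) {
            System.out.println("rejected by filter");
        }
    }
}
```

A rejected class causes readObject() to throw an InvalidClassException, so the unexpected object never reaches application code.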
Java Transient Keyword
Let's consider a case where you are serializing an object and want a certain field of the object not to be serialized. To achieve this, we can use the transient keyword: the Java transient keyword prevents a particular field from being serialized. Let's see how it is declared:
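A sketch of such a declaration, assuming an Employee class with the fields id, name and age; the roundTrip helper is an illustrative addition that shows the effect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Employee implements Serializable {
    int id;
    String name;
    transient int age; // skipped during serialization

    Employee(int id, String name, int age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }
}

class TransientDemo {
    static Employee roundTrip(Employee e) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(e);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Employee) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Employee copy = roundTrip(new Employee(101, "Asha", 30));
        // The transient field comes back with its default value, 0.
        System.out.println(copy.id + " " + copy.name + " " + copy.age); // prints "101 Asha 0"
    }
}
```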
In the above code snippet, age will not be serialized since it is declared transient, whereas name and id, being non-static and non-transient, will be serialized.
What is the Difference Between Serialization and Deserialization in Java
|Serialization|Deserialization|
|---|---|
|Serialization is the mechanism of converting an object to a stream of bytes.|Deserialization converts the stream of bytes back to the original state of the object.|
|It helps to write a byte stream to a file, database, etc.|It helps to read a byte stream from a file, database, etc.|
|It is performed with the help of the ObjectOutputStream class.|It is performed with the help of the ObjectInputStream class.|
- Serialization is the process of converting the state of an object to a byte stream.
- Deserialization is the reverse of serialization and converts the byte stream back to the original object.
- A class must implement the Serializable interface to be eligible for serialization.
- ObjectOutputStream and ObjectInputStream classes help to serialize and deserialize an object respectively.
- Only non-static, non-transient data members get serialized.