On serialization and deserialization

Introduction to serialization

Java serialization is the process of converting a Java object into a byte stream, and deserialization is the process of reconstructing a Java object from that byte stream. Serialization serves two purposes:

  • Objects in memory disappear when the program exits; serialization saves them so that they can be restored and used later;
  • The serialized byte stream can be transmitted over the network to a remote JVM (this is the basis of RPC and RMI).

Any object that may be transferred over the network should be serializable, such as the parameters and return values of RMI calls. Any object that needs to be saved to disk should also be serializable, such as objects stored in an HttpSession or ServletContext. To be serializable, a class must implement one of the following two interfaces:

  • Serializable
  • Externalizable

The differences between the two interfaces are described later.

Object streams

Objects are serialized with ObjectOutputStream and deserialized with ObjectInputStream.

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

//try-with-resources closes the stream automatically
try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("object.txt"))) {
    Person p = new Person();
    oos.writeObject(p);
} catch (IOException e) {
    //handle the exception
}

The code above serializes the object into a local file. To deserialize it, use ObjectInputStream in a similar way. The stream contains only the object's data, not the bytecode of its class, so the class file for the object must be available during deserialization; otherwise a ClassNotFoundException is thrown. In addition, deserialization does not call the constructor of the serializable class to initialize the object.
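As a minimal sketch of the reading side, the example below writes an object and reads it back with ObjectInputStream. It swaps the file for an in-memory byte array so that it is self-contained, and the Person class with its constructorCalls counter is a hypothetical stand-in, not the Person used above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ReadBackDemo {
    // Hypothetical stand-in for a Person class.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        static int constructorCalls = 0; // static, so not part of the serialized data
        final String name;

        Person(String name) {
            this.name = name;
            constructorCalls++;
        }
    }

    // Writes a Person to memory, reads it back, and reports what happened.
    public static String roundTrip(String name) {
        int callsBefore = Person.constructorCalls;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(new Person(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Person copy = (Person) ois.readObject();
            int calls = Person.constructorCalls - callsBefore;
            return copy.name + " restored, constructor calls: " + calls;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Only the explicit "new Person" runs the constructor; readObject does not.
        System.out.println(roundTrip("Ann")); // Ann restored, constructor calls: 1
    }
}
```

The counter goes up once, for the object created with new; the copy produced by readObject is built without running the Person constructor.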

When a serializable class has several superclasses (direct and indirect), each of them must either be serializable or, for the nearest superclass that is not serializable, provide a parameterless constructor that the subclass can access. Otherwise, an InvalidClassException is thrown when the subclass is deserialized. If a superclass is not serializable but has a parameterless constructor, the fields defined in that superclass are not written to the byte stream; on deserialization they are initialized by running that parameterless constructor.
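The following sketch illustrates what happens to fields inherited from a non-serializable parent. Base and Child are hypothetical names chosen for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ParentFieldDemo {
    // Not Serializable, but has a no-arg constructor the subclass can access.
    static class Base {
        String origin = "set by Base()";

        Base() { }
    }

    static class Child extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        String label;
    }

    // Returns {origin, label} as seen on the deserialized copy.
    public static String[] roundTrip() {
        Child child = new Child();
        child.origin = "changed before writing"; // inherited field: not written
        child.label = "child state";             // own field: written

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(child);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Child copy = (Child) ois.readObject();
            return new String[] { copy.origin, copy.label };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = roundTrip();
        System.out.println("origin = " + result[0]); // origin = set by Base()
        System.out.println("label  = " + result[1]); // label  = child state
    }
}
```

The value assigned to origin before writing is lost, because the field is re-initialized by Base() during deserialization, while label survives the round trip.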

Serialization of reference type member variables

If a class has a field of a reference type, an instance can only be serialized if the object that the field refers to is itself serializable (or the field is null or transient). Otherwise, a NotSerializableException is thrown during serialization, even though the class itself implements Serializable.
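A small sketch of this rule, using hypothetical Car and Engine classes: the Car is Serializable, but the Engine it refers to is not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class NonSerializableFieldDemo {
    // Does not implement Serializable.
    public static class Engine { }

    public static class Car implements Serializable {
        private static final long serialVersionUID = 1L;
        final Engine engine;

        public Car(Engine engine) {
            this.engine = engine;
        }
    }

    // Returns true if the object graph can be written, false if some
    // object in it is not serializable.
    public static boolean canSerialize(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Car(new Engine()))); // false
        System.out.println(canSerialize(new Car(null)));         // true
    }
}
```

The check happens at write time and depends on the actual object referenced, which is why a Car whose engine field is null can still be written.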

Each object is serialized only once per stream. Otherwise a problem arises: suppose objects A and B both hold a reference to object C. If C were serialized twice, once with A and once with B, then deserialization would produce two separate C objects, which does not match the original object graph. So Java uses the following algorithm when serializing:

  • Every object written to the stream is assigned a serialization number;
  • When the program attempts to serialize an object, it first checks whether that object has already been written to the same stream. Only if it has not is the object converted into a byte sequence and output;
  • If the object has already been written, the program outputs only its serialization number and does not serialize the object again.
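The effect of this algorithm can be observed with a short sketch. Holder plays the role of A and B, and Shared plays the role of C; both class names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SharedReferenceDemo {
    static class Shared implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        Shared(String name) {
            this.name = name;
        }
    }

    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final Shared shared;

        Holder(Shared shared) {
            this.shared = shared;
        }
    }

    // Writes A and B (both referring to C) to one stream and checks whether
    // the deserialized A and B still refer to one and the same C.
    public static boolean sharingSurvivesRoundTrip() {
        Shared c = new Shared("C");
        Holder a = new Holder(c);
        Holder b = new Holder(c);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(a); // C is written in full here
            oos.writeObject(b); // C is written only as a back-reference here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Holder a2 = (Holder) ois.readObject();
            Holder b2 = (Holder) ois.readObject();
            return a2.shared == b2.shared;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sharingSurvivesRoundTrip()); // true
    }
}
```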

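One consequence concerns mutable objects: if we change the content of an object after serializing it and then write the same object to the same stream again, the change does not take effect, because only the serialization number is written the second time.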
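The sketch below shows this pitfall with a hypothetical Counter class. It also shows one possible way around it, which the text above does not cover: calling ObjectOutputStream.reset() between the two writes makes the stream forget the objects it has already written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class StaleWriteDemo {
    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    // Writes the same Counter twice, changing it in between, and returns
    // the values seen on the two objects read back from the stream.
    public static int[] writeTwice(boolean resetBetweenWrites) {
        Counter counter = new Counter();
        counter.value = 1;

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(counter);
            counter.value = 2;
            if (resetBetweenWrites) {
                oos.reset(); // forget previously written objects
            }
            oos.writeObject(counter);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Counter first = (Counter) ois.readObject();
            Counter second = (Counter) ois.readObject();
            return new int[] { first.value, second.value };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] plain = writeTwice(false);
        System.out.println(plain[0] + ", " + plain[1]);         // 1, 1
        int[] withReset = writeTwice(true);
        System.out.println(withReset[0] + ", " + withReset[1]); // 1, 2
    }
}
```

Without reset(), the second read returns the very same object as the first, so the updated value never reaches the stream.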

Custom Serialization

When an object is serialized, the system automatically serializes all of its instance variables in turn. If an instance variable references another object, the referenced object is serialized as well.

If we don't want a field of an object to be serialized, we can declare it with the transient keyword. The transient keyword can only be applied to fields, and it is only meaningful on instance variables, because static variables are never serialized anyway.
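A minimal sketch of transient in action, using a hypothetical Account class whose password should not end up in the byte stream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class TransientDemo {
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        final String user;
        final transient String password; // skipped by default serialization

        Account(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    // Round-trips an Account and returns "user/password" of the copy.
    public static String roundTrip(String user, String password) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(new Account(user, password));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Account copy = (Account) ois.readObject();
            return copy.user + "/" + copy.password;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The transient field comes back with its default value, null.
        System.out.println(roundTrip("alice", "secret")); // alice/null
    }
}
```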

The mechanism provided by transient is very simple. What should developers do if an instance variable needs more complex serialization logic? If objects need special handling during serialization and deserialization, their class should provide the following methods:

//Called first during serialization; the object it returns is written to the stream in place of this one
private Object writeReplace() throws ObjectStreamException;
//Called to write this object's state in place of the default serialization
private void writeObject(java.io.ObjectOutputStream out) throws IOException;
//Inside writeObject, out.defaultWriteObject() performs the default serialization; other operations can be added after it, such as writing additional objects to the output: out.writeObject("XX")

//Called to read this object's state in place of the default deserialization
private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException;
//Called last during deserialization; the object it returns is handed to the caller in place of the deserialized one
private Object readResolve() throws ObjectStreamException;

Here is a singleton example of serialization and deserialization:

import java.io.IOException;
import java.io.ObjectStreamException;
import java.io.Serializable;

public class PersonSingleton implements Serializable {
    private static final long serialVersionUID = 1L;
    private String name;
    private static PersonSingleton person = null;

    private PersonSingleton(String name) {
        this.name = name;
    }

    //Lazily creates the single instance
    public static synchronized PersonSingleton getInstance() {
        if (person == null) {
            person = new PersonSingleton("cgl");
        }
        return person;
    }

    private Object writeReplace() throws ObjectStreamException {
        System.out.println("1 write replace start");
        return this; //Another object could be returned here instead
    }

    private void writeObject(java.io.ObjectOutputStream out) throws IOException {
        System.out.println("2 write object start");
        out.defaultWriteObject(); //out.writeInt(1);
    }

    private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException {
        System.out.println("3 read object start");
        in.defaultReadObject(); //int i=in.readInt();
    }

    private Object readResolve() throws ObjectStreamException {
        System.out.println("4 read resolve start");
        return PersonSingleton.getInstance(); //Whatever was read from the stream, the caller gets the local singleton
    }
}
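To observe the order in which the four methods are called, and to check that readResolve really preserves the singleton, a compact variant can record the calls in a list instead of printing them. This is only a sketch: the Config class and the calls list are hypothetical and replace PersonSingleton so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class HookOrderDemo {
    static final List<String> calls = new ArrayList<>();

    // Hypothetical singleton with all four hook methods.
    static class Config implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Config INSTANCE = new Config();
        private String name = "default";

        private Config() { }

        private Object writeReplace() {
            calls.add("writeReplace");
            return this;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            calls.add("writeObject");
            out.defaultWriteObject();
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            calls.add("readObject");
            in.defaultReadObject();
        }

        private Object readResolve() {
            calls.add("readResolve");
            return INSTANCE; // discard the freshly deserialized copy
        }
    }

    // Round-trips the singleton and reports the hook order and identity.
    public static String roundTrip() {
        calls.clear();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(Config.INSTANCE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Object copy = ois.readObject();
            return String.join(" -> ", calls)
                    + " | same instance: " + (copy == Config.INSTANCE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
        // writeReplace -> writeObject -> readObject -> readResolve | same instance: true
    }
}
```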

In addition, Java provides another custom serialization mechanism: implementing the Externalizable interface. A class that does so must implement the writeExternal and readExternal methods itself, which gives programmers greater control over the serialization process.
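As a rough sketch of the Externalizable approach, the hypothetical Point class below writes and reads its two fields by hand. Unlike Serializable, an Externalizable class needs a public parameterless constructor, because deserialization creates the object through that constructor before calling readExternal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class ExternalizableDemo {
    public static class Point implements Externalizable {
        private int x;
        private int y;

        // Required by Externalizable: called during deserialization.
        public Point() { }

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Fields must be read in the order they were written.
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Round-trips a Point and returns "x,y" of the copy.
    public static String roundTrip(int x, int y) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(new Point(x, y));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Point copy = (Point) ois.readObject();
            return copy.x + "," + copy.y;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(3, 4)); // 3,4
    }
}
```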

The role of serialVersionUID

How does the JVM determine whether the serialized and deserialized class files are the same?

The two class files do not have to be identical. Compatibility is judged by a static field of the class, serialVersionUID. If we do not declare this field explicitly, the serialization runtime computes a default value from the details of the class (such as its name, interfaces, fields, and methods). In that case, changing the class (for example, adding or deleting a field) changes the serialVersionUID, and deserializing data written by the old version fails with an InvalidClassException. Therefore, for a class that will be serialized, we generally declare this field explicitly. Then, even if we make some changes to the class, as long as the value stays the same, the JVM still treats the class as compatible.
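The value that the serialization runtime uses for a class can be inspected with ObjectStreamClass. The sketch below compares a class with an explicit serialVersionUID against one that relies on the computed default; Pinned and Computed are hypothetical names.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionDemo {
    // Declares the version explicitly, so it stays 42 however the class changes.
    public static class Pinned implements Serializable {
        private static final long serialVersionUID = 42L;
        String name;
    }

    // No explicit value: the runtime derives one from the class details.
    public static class Computed implements Serializable {
        String name;
    }

    // Returns the serialVersionUID the serialization runtime uses for a class.
    public static long uidOf(Class<? extends Serializable> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Pinned.class));   // 42
        // The computed value depends on the class details, so adding or
        // removing a field in Computed would usually change what is printed.
        System.out.println(uidOf(Computed.class));
    }
}
```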

Posted on Mon, 16 Mar 2020 00:09:32 -0700 by davard