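Serialization in Java - Java Serialization
Serialization in Java was introduced in JDK 1.1 and it is one of the important features of Core Java.
Serialization in Java
Serialization in Java allows us to convert an object to a stream that we can send over the network, save as a file, or store in a database for later use. Deserialization is the process of converting the object stream back into an actual Java object to be used in our program. Serialization in Java seems very easy to use at first, but it comes with some security and integrity issues that we will look at in the later part of this article. This tutorial covers the Serializable interface, class refactoring with serialVersionUID, the Externalizable interface, the serialization methods, serialization with inheritance, and the Serialization Proxy pattern.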
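Serializable in Java
If you want a class object to be serializable, all you need to do is implement the java.io.Serializable interface. Serializable in Java is a marker interface and has no fields or methods to implement. It's like an opt-in process through which we make our classes serializable. Serialization in Java is implemented by ObjectInputStream and ObjectOutputStream, so all we need is a wrapper over them to either save the object to a file or send it over the network. Let's see a simple serialization in Java program example.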
package com.journaldev.serialization;

import java.io.Serializable;

public class Employee implements Serializable {

    // private static final long serialVersionUID = -6470090944414208496L;

    private String name;
    private int id;
    transient private int salary;
    // private String password;

    @Override
    public String toString() {
        return "Employee{name=" + name + ",id=" + id + ",salary=" + salary + "}";
    }

    // getter and setter methods
    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public int getSalary() {
        return salary;
    }

    public void setSalary(int salary) {
        this.salary = salary;
    }

    // public String getPassword() {
    //     return password;
    // }
    //
    // public void setPassword(String password) {
    //     this.password = password;
    // }
}
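Notice that it's a simple Java bean with some properties and getter-setter methods. If you want an object property not to be serialized to the stream, you can use the transient keyword, as I have done with the salary variable. Now suppose we want to write our objects to a file and then deserialize them from the same file. For that, we need utility methods that use ObjectInputStream and ObjectOutputStream for serialization purposes.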
package com.journaldev.serialization;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/**
 * A simple class with generic serialize and deserialize method implementations
 *
 * @author pankaj
 *
 */
public class SerializationUtil {

    // deserialize to Object from given file
    public static Object deserialize(String fileName) throws IOException,
            ClassNotFoundException {
        // try-with-resources closes the streams even if readObject() fails
        try (FileInputStream fis = new FileInputStream(fileName);
                ObjectInputStream ois = new ObjectInputStream(fis)) {
            return ois.readObject();
        }
    }

    // serialize the given object and save it to file
    public static void serialize(Object obj, String fileName)
            throws IOException {
        // closing the ObjectOutputStream (not only the file stream) flushes
        // any buffered data before the file is closed
        try (FileOutputStream fos = new FileOutputStream(fileName);
                ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(obj);
        }
    }
}
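Notice that the method arguments work with Object, which is the base class of any Java object. The methods are written this way to be generic in nature. Now let's write a test program to see Java serialization in action.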
package com.journaldev.serialization;

import java.io.IOException;

public class SerializationTest {

    public static void main(String[] args) {
        String fileName = "employee.ser";
        Employee emp = new Employee();
        emp.setId(100);
        emp.setName("Pankaj");
        emp.setSalary(5000);

        // serialize to file
        try {
            SerializationUtil.serialize(emp, fileName);
        } catch (IOException e) {
            e.printStackTrace();
            return;
        }

        // deserialize from the same file
        Employee empNew = null;
        try {
            empNew = (Employee) SerializationUtil.deserialize(fileName);
        } catch (ClassNotFoundException | IOException e) {
            e.printStackTrace();
        }

        System.out.println("emp Object::" + emp);
        System.out.println("empNew Object::" + empNew);
    }
}
When we run the above test program for serialization in Java, we get the following output.
emp Object::Employee{name=Pankaj,id=100,salary=5000}
empNew Object::Employee{name=Pankaj,id=100,salary=0}
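Since salary is a transient variable, its value was not saved to the file and hence was not retrieved in the new object. Similarly, static variable values are not serialized either, since they belong to the class and not to the object.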
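The behavior of transient and static fields can also be checked without touching the file system. The following is a minimal sketch, assuming a hypothetical Counter class that is not part of the tutorial project; it serializes to an in-memory byte array, changes the static field, and then deserializes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the tutorial's project.
class TransientStaticDemo {

    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        static int instances;   // belongs to the class, never written to the stream
        int value;              // ordinary field, written to the stream
        transient int cache;    // skipped by default serialization
    }

    // Serializes the object into an in-memory byte array.
    static byte[] toBytes(Counter c) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds a Counter from bytes produced by toBytes().
    static Counter fromBytes(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Counter) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.value = 7;
        c.cache = 99;
        Counter.instances = 1;
        byte[] data = toBytes(c);

        Counter.instances = 5;           // change the static after serializing
        Counter copy = fromBytes(data);

        System.out.println(copy.value);        // 7: restored from the stream
        System.out.println(copy.cache);        // 0: transient, reset to the default
        System.out.println(Counter.instances); // 5: the stream did not overwrite it
    }
}
```

The static field keeps whatever value the running JVM currently holds, which is why it still reads 5 after deserialization.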
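Class Refactoring with Serialization and serialVersionUID
Serialization in Java permits some changes in the Java class if they can be ignored. Some of the changes in a class that will not affect the deserialization process are:
- Adding new variables to the class.
- Changing a variable from transient to non-transient; for serialization it's like having a new field.
- Changing a variable from static to non-static; for serialization it's like having a new field.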
But for all these changes to work, the Java class should have a serialVersionUID defined. Let's write a test class just for deserialization of the file that was already serialized by the previous test class.
package com.journaldev.serialization;

import java.io.IOException;

public class DeserializationTest {

    public static void main(String[] args) {
        String fileName = "employee.ser";
        Employee empNew = null;

        try {
            empNew = (Employee) SerializationUtil.deserialize(fileName);
        } catch (ClassNotFoundException | IOException e) {
            e.printStackTrace();
        }

        System.out.println("empNew Object::" + empNew);
    }
}
Now uncomment the "password" variable and its getter-setter methods in the Employee class and run it. You will get the exception below:
java.io.InvalidClassException: com.journaldev.serialization.Employee; local class incompatible: stream classdesc serialVersionUID = -6470090944414208496, local class serialVersionUID = -6234198221249432383
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1601)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
at com.journaldev.serialization.SerializationUtil.deserialize(SerializationUtil.java:22)
at com.journaldev.serialization.DeserializationTest.main(DeserializationTest.java:13)
empNew Object::null
The reason is clear: the serialVersionUID of the previous class and that of the new class are different. If a class doesn't define serialVersionUID, the value is calculated automatically and assigned to the class. Java uses class variables, methods, class name, package, etc. to generate this unique long number. If you are working with an IDE, you will automatically get a warning such as "The serializable class Employee does not declare a static final serialVersionUID field of type long". We can use the Java utility "serialver" to generate the class serialVersionUID; for the Employee class we can run it with the command below.
SerializationExample/bin$ serialver -classpath . com.journaldev.serialization.Employee
Note that it's not required that the serial version is generated by this program; we can assign this value as we want. It just needs to be there to let the deserialization process know that the new class is a new version of the same class and should be deserialized if possible. For example, uncomment only the serialVersionUID field in the Employee class and run the SerializationTest program. Now uncomment the password field in the Employee class and run the DeserializationTest program, and you will see that the object stream is deserialized successfully because the change in the Employee class is compatible with the serialization process.
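The value the runtime associates with a class can also be inspected from code through java.io.ObjectStreamClass. The sketch below uses two hypothetical classes (Versioned and Unversioned, not part of the tutorial project) to show that a declared serialVersionUID is used as-is, while an undeclared one is computed from the class details.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo class, not part of the tutorial's project.
class SerialVersionDemo {

    // Declares its own version, so the value stays stable across refactoring.
    static class Versioned implements Serializable {
        private static final long serialVersionUID = 42L;
        String name;
    }

    // No declaration: the runtime derives a value from the class structure.
    static class Unversioned implements Serializable {
        String name;
    }

    // Returns the serialVersionUID the serialization runtime uses for the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Versioned.class));   // 42, the declared value
        System.out.println(uidOf(Unversioned.class)); // computed, changes when the class changes
    }
}
```

Adding a field to Unversioned and recompiling would typically change the second value, which is the situation that produced the InvalidClassException above.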
Java Externalizable Interface
If you look at the Java serialization process, it's done automatically. Sometimes we want to obscure the object data to maintain its integrity. We can do this by implementing the java.io.Externalizable interface and providing implementations of the writeExternal() and readExternal() methods to be used in the serialization process.
package com.journaldev.externalization;

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

public class Person implements Externalizable {

    private int id;
    private String name;
    private String gender;

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(id);
        // append/prepend markers so that readExternal() can check them
        out.writeObject(name + "xyz");
        out.writeObject("abc" + gender);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException,
            ClassNotFoundException {
        id = in.readInt();
        // read in the same order as written
        name = (String) in.readObject();
        if (!name.endsWith("xyz")) throw new IOException("corrupted data");
        name = name.substring(0, name.length() - 3);
        gender = (String) in.readObject();
        if (!gender.startsWith("abc")) throw new IOException("corrupted data");
        gender = gender.substring(3);
    }

    @Override
    public String toString() {
        return "Person{id=" + id + ",name=" + name + ",gender=" + gender + "}";
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getGender() {
        return gender;
    }

    public void setGender(String gender) {
        this.gender = gender;
    }
}
Notice that I have changed the field values before writing them to the stream and then reversed the changes while reading. In this way, we can maintain data integrity of a sort: if the integrity checks fail after reading the stream data, we can throw an exception. Let's write a test program to see it in action.
package com.journaldev.externalization;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class ExternalizationTest {

    public static void main(String[] args) {
        String fileName = "person.ser";
        Person person = new Person();
        person.setId(1);
        person.setName("Pankaj");
        person.setGender("Male");

        // write the object; try-with-resources closes the streams
        try (FileOutputStream fos = new FileOutputStream(fileName);
                ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(person);
        } catch (IOException e) {
            e.printStackTrace();
        }

        // read the object back from the same file
        try (FileInputStream fis = new FileInputStream(fileName);
                ObjectInputStream ois = new ObjectInputStream(fis)) {
            Person p = (Person) ois.readObject();
            System.out.println("Person Object Read=" + p);
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}
When we run the above program, we get the following output.
Person Object Read=Person{id=1,name=Pankaj,gender=Male}
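One detail the Person class relies on implicitly is its public no-argument constructor: when reading an Externalizable object, the runtime first creates an instance through that constructor and then calls readExternal() on it. The following sketch, using two hypothetical classes (Good and Bad) that are not part of the tutorial project, shows what happens when the constructor is missing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the tutorial's project.
class ExternalizableCtorDemo {

    // Has a public no-arg constructor, so it can be read back.
    static class Good implements Externalizable {
        String name;
        public Good() { }
        Good(String name) { this.name = name; }
        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeObject(name);
        }
        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            name = (String) in.readObject();
        }
    }

    // Only a constructor with arguments: writing works, reading is rejected.
    static class Bad implements Externalizable {
        String name;
        Bad(String name) { this.name = name; }
        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeObject(name);
        }
        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            name = (String) in.readObject();
        }
    }

    // Serializes the object into an in-memory byte array.
    static byte[] toBytes(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns true if the bytes deserialize, false if the class is rejected.
    static boolean canRead(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            ois.readObject();
            return true;
        } catch (InvalidClassException e) {
            return false; // thrown when no public no-arg constructor exists
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canRead(toBytes(new Good("Pankaj")))); // true
        System.out.println(canRead(toBytes(new Bad("Pankaj"))));  // false
    }
}
```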
So which one is better to use for serialization in Java? It's actually better to use the Serializable interface, and by the time we reach the end of the article, you will know why.
Java Serialization Methods
We have seen that serialization in Java is automatic and all we need to do is implement the Serializable interface. The implementation is present in the ObjectInputStream and ObjectOutputStream classes. But what if we want to change the way we save data? For example, we may have some sensitive information in the object that we want to encrypt before saving and decrypt after retrieving. That's why there are four methods that we can provide in the class to change the serialization behavior. If these methods are present in the class, they are used for serialization purposes.
- readObject(ObjectInputStream ois): If this method is present in the class, the ObjectInputStream readObject() method will use it for reading the object from the stream.
- writeObject(ObjectOutputStream oos): If this method is present in the class, the ObjectOutputStream writeObject() method will use it for writing the object to the stream. One common usage is to obscure the object variables to maintain data integrity.
- Object writeReplace(): If this method is present, it is called when the object is about to be serialized, and the object it returns is written to the stream in its place.
- Object readResolve(): If this method is present, it is called after the deserialization process to return the final object to the caller program. One use of this method is to implement the Singleton pattern with serialized classes. Read more at Serialization and Singleton.
These methods are usually kept private so that subclasses can't override them; readObject() and writeObject() in fact have to be private for the serialization mechanism to call them. They are meant for serialization purposes only, and keeping them private avoids security issues.
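As an illustration of readResolve(), here is a minimal sketch of a serializable singleton. The Registry class is hypothetical and not part of the tutorial project. Without readResolve(), deserialization would create a second instance; with it, the freshly read copy is discarded in favor of the existing one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the tutorial's project.
class SingletonDemo {

    static final class Registry implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Registry INSTANCE = new Registry();

        private Registry() { }

        // Called by the runtime after the object has been read from the stream;
        // whatever it returns is handed to the caller of readObject().
        private Object readResolve() {
            return INSTANCE;
        }
    }

    // Serializes the object to memory and immediately reads it back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(o);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return ois.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(Registry.INSTANCE);
        System.out.println(copy == Registry.INSTANCE); // true: same instance
    }
}
```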
Serialization with Inheritance
Sometimes we need to extend a class that doesn't implement the Serializable interface. If we rely on the automatic serialization behavior and the superclass has some state, that state will not be written to the stream and hence will not be retrieved later on. This is one place where the readObject() and writeObject() methods really help. By providing their implementations, we can save the superclass state to the stream and then retrieve it later on. Let's see this in action.
package com.journaldev.serialization.inheritance;

public class SuperClass {

    private int id;
    private String value;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getValue() {
        return value;
    }

    public void setValue(String value) {
        this.value = value;
    }
}
SuperClass is a simple Java bean, but it doesn't implement the Serializable interface.
package com.journaldev.serialization.inheritance;

import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SubClass extends SuperClass implements Serializable, ObjectInputValidation {

    private static final long serialVersionUID = -1322322139926390329L;

    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "SubClass{id=" + getId() + ",value=" + getValue() + ",name=" + getName() + "}";
    }

    // adding helper methods for serialization to save/initialize super class state
    private void readObject(ObjectInputStream ois) throws ClassNotFoundException, IOException {
        ois.defaultReadObject();
        // register this object so that validateObject() is called once the whole
        // object graph has been read; without this call it is never invoked
        ois.registerValidation(this, 0);
        // notice the order of read and write should be same
        setId(ois.readInt());
        setValue((String) ois.readObject());
    }

    private void writeObject(ObjectOutputStream oos) throws IOException {
        oos.defaultWriteObject();
        oos.writeInt(getId());
        oos.writeObject(getValue());
    }

    @Override
    public void validateObject() throws InvalidObjectException {
        // validate the object here
        if (name == null || "".equals(name)) throw new InvalidObjectException("name can't be null or empty");
        if (getId() <= 0) throw new InvalidObjectException("ID can't be negative or zero");
    }
}
Notice that the order of writing and reading the extra data to the stream should be the same. We can put some logic in reading and writing data to make it secure. Also notice that the class implements the ObjectInputValidation interface. By implementing the validateObject() method, we can add business validations to make sure that data integrity is not harmed. Note that validateObject() is only invoked for objects that register themselves by calling registerValidation() on the ObjectInputStream from within readObject(). Let's write a test class and see if we can retrieve the superclass state from the serialized data.
package com.journaldev.serialization.inheritance;

import java.io.IOException;

import com.journaldev.serialization.SerializationUtil;

public class InheritanceSerializationTest {

    public static void main(String[] args) {
        String fileName = "subclass.ser";

        SubClass subClass = new SubClass();
        subClass.setId(10);
        subClass.setValue("Data");
        subClass.setName("Pankaj");

        try {
            SerializationUtil.serialize(subClass, fileName);
        } catch (IOException e) {
            e.printStackTrace();
            return;
        }

        try {
            SubClass subNew = (SubClass) SerializationUtil.deserialize(fileName);
            System.out.println("SubClass read = " + subNew);
        } catch (ClassNotFoundException | IOException e) {
            e.printStackTrace();
        }
    }
}
When we run the above class, we get the following output.
SubClass read = SubClass{id=10,value=Data,name=Pankaj}
So in this way, we can serialize superclass state even though the superclass doesn't implement the Serializable interface. This strategy comes in handy when the superclass is a third-party class that we can't change.
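For contrast, the sketch below shows what happens when the subclass relies on default serialization only. Base and Child are hypothetical classes, not part of the tutorial project. The state of the non-serializable superclass is not written to the stream; on deserialization the runtime runs the superclass no-argument constructor, so its fields come back with their initial values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the tutorial's project.
class InheritanceDefaultsDemo {

    // Not Serializable; needs an accessible no-arg constructor (implicit here).
    static class Base {
        int id = -1; // initial value assigned whenever the constructor runs
    }

    // Serializable, but without custom readObject()/writeObject().
    static class Child extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // Serializes the object to memory and immediately reads it back.
    static Child roundTrip(Child c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(c);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Child) ois.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Child c = new Child();
        c.id = 10;
        c.name = "Pankaj";

        Child copy = roundTrip(c);
        System.out.println(copy.name); // Pankaj: subclass field was serialized
        System.out.println(copy.id);   // -1: superclass state was re-initialized
    }
}
```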
Serialization Proxy Pattern
Serialization in Java comes with some serious pitfalls, such as:
- The class structure can't be changed much without breaking the Java serialization process. So even though we don't need some variables later on, we need to keep them just for backward compatibility.
- Serialization causes huge security risks: an attacker can change the stream sequence and cause harm to the system. For example, if a user role is serialized, an attacker can change the stream value to make it admin and run malicious code.
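The second pitfall is easy to reproduce. The sketch below uses a hypothetical Account class (not part of the tutorial project) whose role field is serialized as plain text. Overwriting the bytes of "guest" with "admin" in the stream is enough to change the deserialized object; the two values are deliberately the same length so that the length prefix stored in the stream stays valid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the tutorial's project.
class TamperDemo {

    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String role;
        Account(String role) { this.role = role; }
    }

    // Serializes the object into an in-memory byte array.
    static byte[] toBytes(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Deserializes an object from the given bytes.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns a copy of data in which the first occurrence of 'from' is
    // overwritten with 'to'. Both must be ASCII strings of equal length.
    static byte[] replaceFirst(byte[] data, String from, String to) {
        byte[] f = from.getBytes(StandardCharsets.US_ASCII);
        byte[] t = to.getBytes(StandardCharsets.US_ASCII);
        if (f.length != t.length) {
            throw new IllegalArgumentException("values must have the same length");
        }
        byte[] copy = data.clone();
        for (int i = 0; i + f.length <= copy.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(copy, i, i + f.length), f)) {
                System.arraycopy(t, 0, copy, i, t.length);
                return copy;
            }
        }
        throw new IllegalArgumentException("value not found in stream");
    }

    public static void main(String[] args) {
        byte[] original = toBytes(new Account("guest"));
        byte[] forged = replaceFirst(original, "guest", "admin");

        Account account = (Account) fromBytes(forged);
        System.out.println(account.role); // admin
    }
}
```

Nothing in default serialization detects the change, which is why deserialized data from untrusted sources needs validation of its own.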
The Java Serialization Proxy pattern is a way to achieve greater security with serialization. In this pattern, an inner private static class is used as a proxy class for serialization purposes. This class is designed to maintain the state of the main class. The pattern is implemented by properly implementing the readResolve() and writeReplace() methods. Let us first write a class that implements the serialization proxy pattern, and then we will analyze it for better understanding.
package com.journaldev.serialization.proxy;

import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.Serializable;

public class Data implements Serializable {

    private static final long serialVersionUID = 2087368867376448459L;

    private String data;

    public Data(String d) {
        this.data = d;
    }

    public String getData() {
        return data;
    }

    public void setData(String data) {
        this.data = data;
    }

    @Override
    public String toString() {
        return "Data{data=" + data + "}";
    }

    // serialization proxy class
    private static class DataProxy implements Serializable {

        private static final long serialVersionUID = 8333905273185436744L;

        private String dataProxy;
        private static final String PREFIX = "ABC";
        private static final String SUFFIX = "DEFG";

        public DataProxy(Data d) {
            // obscuring data for security
            this.dataProxy = PREFIX + d.data + SUFFIX;
        }

        // called after the proxy is deserialized; returns the real Data object
        private Object readResolve() throws InvalidObjectException {
            if (dataProxy.startsWith(PREFIX) && dataProxy.endsWith(SUFFIX)) {
                return new Data(dataProxy.substring(PREFIX.length(),
                        dataProxy.length() - SUFFIX.length()));
            } else throw new InvalidObjectException("data corrupted");
        }
    }

    // replacing serialized object to DataProxy object
    private Object writeReplace() {
        return new DataProxy(this);
    }

    // reject any stream that contains a Data object directly
    private void readObject(ObjectInputStream ois) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy is not used, something fishy");
    }
}
- Both the Data and DataProxy classes should implement the Serializable interface.
- DataProxy should be able to maintain the state of the Data object.
- DataProxy is an inner private static class, so that other classes can't access it.
- DataProxy should have a single constructor that takes Data as its argument.
- The Data class should provide a writeReplace() method returning a DataProxy instance. So when a Data object is serialized, the returned stream is of the DataProxy class. However, the DataProxy class is not visible outside, so it can't be used directly.
- The DataProxy class should implement a readResolve() method returning a Data object. So when the Data class is deserialized, internally DataProxy is deserialized, and when its readResolve() method is called, we get a Data object.
- Finally, implement the readObject() method in the Data class and throw InvalidObjectException to block attacks that try to fabricate a Data object stream and parse it.
Let's write a small test to check whether the implementation works.
package com.journaldev.serialization.proxy;

import java.io.IOException;

import com.journaldev.serialization.SerializationUtil;

public class SerializationProxyTest {

    public static void main(String[] args) {
        String fileName = "data.ser";

        Data data = new Data("Pankaj");

        try {
            SerializationUtil.serialize(data, fileName);
        } catch (IOException e) {
            e.printStackTrace();
        }

        try {
            Data newData = (Data) SerializationUtil.deserialize(fileName);
            System.out.println(newData);
        } catch (ClassNotFoundException | IOException e) {
            e.printStackTrace();
        }
    }
}
When we run the above class, we get the output below in the console.
Data{data=Pankaj}
If you open the data.ser file, you can see that a DataProxy object is saved as the stream in the file.
That's all for Serialization in Java. It looks simple, but we should use it judiciously, and it's always better not to rely on the default implementation. Try out the examples above and play around with them to learn more.