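The JDK version examined here is openjdk-7u40-fcs-src-b43-26_aug_2013.
In Java, the code that computes an object's default hashcode lives mainly in synchronizer.cpp. The hashcode is not computed when the object is created. It is computed on first use, that is, the first time hashCode is called, and is then stored in the object's mark word.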
java.lang.Object#hashCode() is a native method. It ends up calling ObjectSynchronizer::FastHashCode to obtain the hashcode. The basic flow is shown in the figure below:
The actual computation involves two methods: get_next_hash (in synchronizer.cpp), which generates a new hashcode, and hash (in markOop.hpp), which reads an existing one:
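The hash method takes the object's mark word, shifts it right, and applies a bitwise AND, returning the result as the hashcode (mask_bits is defined in globalDefinitions.hpp). Note that value() simply casts this to an integer: a markOop is not a pointer to a real object, the pointer value itself is the mark word. The shift is needed because the hashcode occupies only part of the mark word (25 bits on 32-bit machines, 31 bits on 64-bit machines; the full bit layout is documented in the comments of markOop.hpp).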
intptr_t hash() const {
  return mask_bits(value() >> hash_shift, hash_mask);
}

uintptr_t value() const {
  return (uintptr_t) this;
}

inline intptr_t mask_bits (intptr_t x, intptr_t m) {
  return x & m;
}
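The shift-and-mask step can be tried outside the JVM. The following is a minimal sketch, assuming the 64-bit layout in which the hash field is 31 bits wide and starts at bit 8 (above the lock, biased-lock, age and CMS bits). The names kHashShift, kHashBits, kHashMask, hash_of and example are illustrative and do not appear in HotSpot.

```cpp
#include <cassert>
#include <cstdint>

// Assumed 64-bit layout: 2 lock bits, 1 biased-lock bit, 4 age bits and
// 1 CMS bit sit below the hash, so the hash field starts at bit 8.
constexpr int      kHashShift = 8;
constexpr int      kHashBits  = 31;
constexpr uint64_t kHashMask  = (uint64_t{1} << kHashBits) - 1;  // 0x7FFFFFFF

// Same shape as mask_bits(): keep only the bits selected by m.
constexpr uint64_t mask_bits(uint64_t x, uint64_t m) { return x & m; }

// Same shape as markOopDesc::hash(): shift the hash field down, then mask it.
constexpr uint64_t hash_of(uint64_t mark_word) {
    return mask_bits(mark_word >> kHashShift, kHashMask);
}

// Usage: an unlocked mark word (lock bits 01) carrying hash 0x1234.
inline void example() {
    const uint64_t mark = (uint64_t{0x1234} << kHashShift) | 0x01;
    assert(hash_of(mark) == 0x1234);
}
```

A mark word whose hash field is all zero yields 0 here, which is how the real code recognises an object that has not been hashed yet.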
get_next_hash chooses how to generate the object's hashcode according to the JVM option -XX:hashCode=n:
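- hashCode=0: use a random number from os::random() (a global Park-Miller generator) as the hashcode
- hashCode=1: derive the hashcode from the object's address by XORing shifted address bits with GVars.stwRandom
- hashCode=2: every hashcode is 1 (used for sensitivity testing)
- hashCode=3: assign hashcodes from an incrementing global sequence
- hashCode=4: use the object's address as the hashcode
- any other value: Marsaglia's xor-shift generator with thread-local state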
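The low bits of markOopDesc::hash_mask are all 1, one per hashcode bit, so ANDing with it truncates the result to the width available in the mark word. If the masked value is 0, it is replaced with 0xBAD, because 0 is markOopDesc::no_hash and means that no hashcode has been computed yet. The assertion that follows only checks this invariant.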
static inline intptr_t get_next_hash(Thread * Self, oop obj) {
  intptr_t value = 0 ;
  if (hashCode == 0) {
     // This form uses an unguarded global Park-Miller RNG,
     // so it's possible for two threads to race and generate the same RNG.
     // On MP system we'll have lots of RW access to a global, so the
     // mechanism induces lots of coherency traffic.
     value = os::random() ;
  } else
  if (hashCode == 1) {
     // This variation has the property of being stable (idempotent)
     // between STW operations. This can be useful in some of the 1-0
     // synchronization schemes.
     intptr_t addrBits = intptr_t(obj) >> 3 ;
     value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
  } else
  if (hashCode == 2) {
     value = 1 ;            // for sensitivity testing
  } else
  if (hashCode == 3) {
     value = ++GVars.hcSequence ;
  } else
  if (hashCode == 4) {
     value = intptr_t(obj) ;
  } else {
     // Marsaglia's xor-shift scheme with thread-specific state
     // This is probably the best overall implementation -- we'll
     // likely make this the default in future releases.
     unsigned t = Self->_hashStateX ;
     t ^= (t << 11) ;
     Self->_hashStateX = Self->_hashStateY ;
     Self->_hashStateY = Self->_hashStateZ ;
     Self->_hashStateZ = Self->_hashStateW ;
     unsigned v = Self->_hashStateW ;
     v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
     Self->_hashStateW = v ;
     value = v ;
  }

  value &= markOopDesc::hash_mask;
  if (value == 0) value = 0xBAD ;
  assert (value != markOopDesc::no_hash, "invariant") ;
  TEVENT (hashCode: GENERATE) ;
  return value;
}
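The xor-shift branch and the final masking step can be reproduced in isolation. The sketch below is not HotSpot code: HashState is a hypothetical stand-in for the four per-thread fields _hashStateX through _hashStateW, next_hash is an illustrative name, and the 31-bit mask assumes a 64-bit JVM.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kHashMask = 0x7FFFFFFF;  // assumed: 31 hash bits (64-bit JVM)
constexpr uint32_t kNoHash   = 0;           // 0 means "not hashed yet"

// Hypothetical stand-in for the per-thread fields _hashStateX.._hashStateW.
struct HashState {
    uint32_t x, y, z, w;
};

// One step of the xor-shift branch of get_next_hash(), followed by the
// mask and the 0xBAD substitution that keeps the result non-zero.
inline uint32_t next_hash(HashState& s) {
    uint32_t t = s.x;
    t ^= t << 11;
    s.x = s.y;
    s.y = s.z;
    s.z = s.w;
    uint32_t v = s.w;
    v = (v ^ (v >> 19)) ^ (t ^ (t >> 8));
    s.w = v;

    uint32_t value = v & kHashMask;
    if (value == kNoHash) value = 0xBAD;  // never hand out the "no hash" value
    assert(value != kNoHash);
    return value;
}
```

Because the state belongs to one thread, this variant needs no shared global, which is the contention problem the comment on the hashCode == 0 branch describes.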
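get_next_hash only produces the hashcode value. That value is not written into the mark word directly. It is first merged into a copy of the mark word, which is then installed with an atomic compare-and-swap, as shown below: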
temp = mark->copy_set_hash(hash); // merge the hash code into header
// use (machine word version) atomic operation to install the hash
test = (markOop) Atomic::cmpxchg_ptr(temp, obj->mark_addr(), mark);
if (test == mark) {
  return hash;
}
// If atomic operation failed, we must inflate the header
// into heavy weight monitor. We could add more code here
// for fast path, but it does not worth the complexity.
Here, copy_set_hash copies the other bits of the original mark word and combines them with the new hashcode to form the new mark word:
markOop copy_set_hash(intptr_t hash) const {
  intptr_t tmp = value() & (~hash_mask_in_place);
  tmp |= ((hash & hash_mask) << hash_shift);
  return (markOop)tmp;
}
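The merge and the fast-path install can be sketched with std::atomic in place of Atomic::cmpxchg_ptr. This is only an approximation under the assumed 64-bit layout (hash at bit 8, 31 bits wide). try_install_hash is a hypothetical helper, and a plain atomic word stands in for the object header.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr int      kHashShift       = 8;  // assumed 64-bit layout
constexpr uint64_t kHashMask        = (uint64_t{1} << 31) - 1;
constexpr uint64_t kHashMaskInPlace = kHashMask << kHashShift;

// Same shape as copy_set_hash(): clear the hash field, keep every other bit,
// then OR the new hash into place.
constexpr uint64_t copy_set_hash(uint64_t mark, uint64_t hash) {
    return (mark & ~kHashMaskInPlace) | ((hash & kHashMask) << kHashShift);
}

// Fast path: publish the merged word only if the header still equals the
// snapshot `mark`. Returns false when another thread changed it first.
inline bool try_install_hash(std::atomic<uint64_t>& header, uint64_t mark,
                             uint64_t hash) {
    const uint64_t merged = copy_set_hash(mark, hash);
    // Lock, age and other non-hash bits must survive the merge untouched.
    assert((merged & ~kHashMaskInPlace) == (mark & ~kHashMaskInPlace));
    return header.compare_exchange_strong(mark, merged);
}
```

For example, a header of 0x19 (age 3, unlocked) combined with hash 0x1234 becomes 0x123419: only the hash field changes.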
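In addition, as the comment says, if the atomic operation fails, the lock is inflated into a heavyweight monitor and the hashcode is set in the header saved in that monitor, as shown below: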
// Inflate the monitor to set hash code
monitor = ObjectSynchronizer::inflate(Self, obj);
// Load displaced header and check it has hash code
mark = monitor->header();
assert (mark->is_neutral(), "invariant") ;
hash = mark->hash();
if (hash == 0) {
  hash = get_next_hash(Self, obj);
  temp = mark->copy_set_hash(hash); // merge hash code into header
  assert (temp->is_neutral(), "invariant") ;
  test = (markOop) Atomic::cmpxchg_ptr(temp, monitor, mark);
  if (test != mark) {
    // The only update to the header in the monitor (outside GC)
    // is install the hash code. If someone add new usage of
    // displaced header, please update this code
    hash = test->hash();
    assert (test->is_neutral(), "invariant") ;
    assert (hash != 0, "Trivial unexpected object/monitor header usage.");
  }
}
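The key point of this slow path is that a thread which loses the compare-and-swap does not retry with its own value. It adopts the hashcode that the winner installed, so every caller sees the same hashcode. A rough sketch of that rule follows, with a single atomic word standing in for the header stored in the monitor. install_or_adopt is a hypothetical helper, the 64-bit layout is assumed, and the candidate is assumed to be non-zero, as get_next_hash guarantees.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr int      kHashShift = 8;  // assumed 64-bit layout
constexpr uint64_t kHashMask  = (uint64_t{1} << 31) - 1;

constexpr uint64_t hash_of(uint64_t mark) {
    return (mark >> kHashShift) & kHashMask;
}

// `snapshot` is the header value this thread read earlier and `candidate`
// is the non-zero hash it generated. Returns the hash that ends up stored.
inline uint64_t install_or_adopt(std::atomic<uint64_t>& header,
                                 uint64_t snapshot, uint64_t candidate) {
    uint64_t hash = hash_of(snapshot);
    if (hash != 0) return hash;  // already hashed, nothing to do

    const uint64_t merged = (snapshot & ~(kHashMask << kHashShift)) |
                            ((candidate & kHashMask) << kHashShift);
    uint64_t expected = snapshot;
    if (!header.compare_exchange_strong(expected, merged)) {
        // Lost the race: `expected` now holds the winner's header, and
        // installing a hash is assumed to be the only possible update.
        hash = hash_of(expected);
        assert(hash != 0);
        return hash;
    }
    return candidate & kHashMask;
}
```

If two threads both read an unhashed header of 0x01 and generate 0x1234 and 0x5678, whichever compare-and-swap lands first decides the result, and both calls return that value.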