
JVM Series: Safepoints in Java Revisited

Date: 2022-08-24 16:58:25

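What is a safepoint

A Java program runs many Java threads. Each thread has its own stack, and all of them share the heap. As they run, these threads keep modifying both the stack and the heap.

What happens when the JVM itself needs to operate on the stack and the heap?

Suppose the JVM wants to run a GC or take a heap dump. If threads are modifying the stack or the heap at that moment, the data is not in a stable state, and a GC that works on it directly would cause those threads to misbehave.

How is this handled?

This is where the safepoint comes in.

A safepoint is a point in execution at which a thread checks whether a safepoint operation has been requested. If one has, the thread waits there until every thread has reached a safepoint.

Once the JVM has finished the operation, all threads resume execution.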
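The protocol just described (request, wait for every thread to arrive, operate, resume) can be sketched in plain Java. This is only an illustration under stated assumptions: the class SafepointSketch and the methods poll() and stopTheWorld() are hypothetical names, and a volatile flag with a monitor stands in for whatever mechanism a real JVM uses.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SafepointSketch {
    // Hypothetical global "safepoint requested" flag (an assumption of this sketch).
    private static volatile boolean safepointRequested = false;
    private static volatile boolean running = true;
    private static final Object LOCK = new Object();
    private static int parked = 0; // number of workers waiting; guarded by LOCK
    private static final AtomicLong work = new AtomicLong();

    // Called by worker threads at "safe" places, here once per loop iteration.
    private static void poll() {
        if (!safepointRequested) {
            return; // fast path: no safepoint requested
        }
        synchronized (LOCK) {
            parked++;
            LOCK.notifyAll(); // tell the coordinating thread that we have arrived
            while (safepointRequested) {
                try {
                    LOCK.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            parked--;
        }
    }

    // Plays the role of the VM thread. Returns how many workers were parked
    // at the moment the "VM operation" would have run.
    static int stopTheWorld(int workers) {
        running = true;
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                while (running) {
                    work.incrementAndGet(); // stand-in for real work
                    poll();
                }
            });
            threads[i].start();
        }
        int observed;
        synchronized (LOCK) {
            safepointRequested = true;
            while (parked < workers) { // wait until every worker has arrived
                try {
                    LOCK.wait();
                } catch (InterruptedException e) {
                    throw new IllegalStateException(e);
                }
            }
            observed = parked; // the VM operation (GC, heap dump) would run here
            safepointRequested = false;
            LOCK.notifyAll(); // let all workers resume
        }
        running = false;
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("workers parked at the safepoint: " + stopTheWorld(4));
    }
}
```

Note that a worker only notices the request when it next calls poll(), which is why a thread that polls rarely delays everyone else.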

A safepoint example

Here is an example. Safepoints commonly show up around loops, so we reuse the example from our earlier null-check test:

import java.util.ArrayList;
import java.util.List;

public class TestNull {

    public static void main(String[] args) throws InterruptedException {
        List<String> list = new ArrayList<>();
        list.add("www.flydean.com");
        // Repeated calls make testMethod hot, so the JIT compiler is likely to compile it
        for (int i = 0; i < 10000; i++) {
            testMethod(list);
        }
        Thread.sleep(1000);
    }

    private static void testMethod(List<String> list) {
        list.get(0);
    }
}

The output is as follows:

[Figure: output of running the example, with the safepoint highlighted in red]

The part highlighted in red is the safepoint.

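When does a thread enter a safepoint

So when does a thread enter a safepoint?

Generally, a thread is considered to be at a safepoint while it is blocked contending for a lock, blocked on I/O, or waiting to acquire a monitor.

A Java thread is also considered to be at a safepoint for the whole time it is executing JNI code, because before running native code the thread must save its stack state and only then hand control to the native method.

While a thread is executing Java bytecode, we cannot tell whether it is at a safepoint.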

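How safepoints work

On the HotSpot JVM, a safepoint is global: performing a safepoint operation requires pausing all threads.

On Zing, safepoints can be used at the level of individual threads.

In the generated assembly, the safepoint turns out to be a test instruction.

The test instruction reads from the address of a special memory page. When the JVM needs all threads to reach a safepoint, it marks that page, and that is how every thread is notified.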

Let us go through it in more detail with a diagram:

[Figure: timeline of thread1 to thread4 before, during, and after a safepoint]

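thread1 runs continuously before the safepoint is requested. After the request it keeps running for a while, then reaches a safepoint and pauses.

thread2 runs for a while and is then idle for a period because it was preempted. During that idle period the safepoint request is issued. When thread2 is scheduled again it continues running and eventually reaches a safepoint.

thread3 is in a native method and keeps running until the safepoint ends.

thread4 is also in a native method. Unlike thread3, its native method finishes between the start and the end of the safepoint and has to hand control back to ordinary Java code. Because the JVM is still performing the safepoint operation at that moment, thread4 must still be paused.

In the HotSpot VM, safepoints appear in the assembly in two forms: '{poll}' and '{poll return}'.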

Summary

This article has explained the role of safepoints in the JVM. I hope you find it useful.

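Supplementary material: JVM source analysis of safepoints

Last week I was lucky enough to attend a small sharing session on the JVM. After hearing R大's talk on the VM's C2 compiler I was in awe. I could follow only part of it and remember even less:

1. When C2 compilation happens and how it is performed (this one is really complicated)

2. C2 compiles a whole method body rather than a fragment of a method

3. Safepoints in the JVM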

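I have always known that when a GC occurs, every thread executing Java code has to stop before garbage collection can proceed. This is the familiar STW (stop the world). What puzzled me was how STW is implemented underneath: how are these threads paused, and how are they resumed?

All of this comes down to one concept, the safepoint. The OpenJDK implementation is in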

openjdk/hotspot/src/share/vm/runtime/safepoint.cpp

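What is a safepoint

Safepoints serve several purposes, such as GC and deoptimization. In the HotSpot VM the GC safepoint is the most common kind. It requires a data structure that records where in each thread's call stack, registers, and other important data areas GC-managed pointers are held.

From a thread's point of view, a safepoint is a special position in the executing code. When a thread reaches such a position, the VM state is known to be safe, and the thread can be paused there if necessary. For example, when a GC occurs all active threads must be paused. A thread that is not at a safepoint at that moment keeps running until it reaches the next safepoint, pauses there, and waits for the GC to finish.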

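Where safepoints can be placed

Taking HotSpot as the example, here is a brief description of where safepoints are placed.

1. In theory, the interpreter could place a safepoint at every bytecode boundary. However, the debug symbol information attached to a safepoint takes up memory, and putting a safepoint after every machine instruction would require keeping a large amount of runtime data, so safepoints should be placed sparingly. Each safepoint also generates polling code that asks the VM whether to "enter the safepoint", and polling has a cost of its own. Polling is explained later.

2. In JIT-compiled code, a safepoint is placed before every method return and before the back-edge of every loop that is not a counted loop (that is, an unbounded loop), so that a thread cannot run indefinitely without pausing when a GC needs STW. In addition, while generating machine code the JIT compiler emits some "debug symbol information" for each safepoint. The information generated for GC is the OopMap, which indicates where on the stack and in the registers GC-managed pointers are held.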
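To make the distinction between counted and non-counted loops concrete, here is a small sketch. It rests on an assumption that is not stated in this article: that HotSpot's C2 compiler typically recognizes a loop with an int induction variable, a constant stride, and a loop-invariant bound as a counted loop, while a loop driven by a long index is usually not recognized as one. The exact behavior depends on the JDK version and flags, so treat the comments as a rule of thumb. The class name LoopShapes is hypothetical.

```java
final class LoopShapes {

    // int index, stride 1, loop-invariant bound: the typical shape of a
    // counted loop. Under the assumption above, the JIT may omit the
    // safepoint poll on the back-edge of this loop.
    static long sumCounted(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // long index: usually not treated as a counted loop, so a safepoint
    // poll is expected to remain on the back-edge.
    static long sumNonCounted(int[] data) {
        long sum = 0;
        for (long i = 0; i < data.length; i++) {
            sum += data[(int) i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(sumCounted(data) + " " + sumNonCounted(data));
    }
}
```

Both methods compute the same result. The only difference is the loop shape the compiler sees, which is what decides whether a thread can be stopped in the middle of the loop.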

How threads are suspended

When a GC is triggered, the VM thread calls SafepointSynchronize::begin() from VMThread::loop(), which eventually brings all threads to a safepoint.

// Roll all threads forward to a safepoint and suspend them all
void SafepointSynchronize::begin() {

  Thread* myThread = Thread::current();
  assert(myThread->is_VM_thread(), "Only VM thread may execute a safepoint");

  if (PrintSafepointStatistics || PrintSafepointStatisticsTimeout > 0) {
    _safepoint_begin_time = os::javaTimeNanos();
    _ts_of_current_safepoint = tty->time_stamp().seconds();
  }
  // ...

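The safepoint implementation contains a comment explaining that Java threads can be in several different states, so the suspension mechanism differs for each. It lists five cases:

1. Running Java code (interpreted)

The safepoint state is checked while bytecode is executed: begin() calls Interpreter::notice_safepoints(), which tells the interpreter to switch its dispatch table. The implementation is as follows: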

void TemplateInterpreter::notice_safepoints() {
  if (!_notice_safepoints) {
    // switch to safepoint dispatch table
    _notice_safepoints = true;
    copy_table((address*)&_safept_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
  }
}

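2. Running native code

If the VM thread finds a Java thread running native code, it does not wait for that thread to block. When the Java thread returns from native code, however, it must check the safepoint state to see whether it needs to block.

Two pieces of state are involved here, the Java thread state and the safepoint state, and reads and writes of the two must be strictly ordered. Memory barriers would normally provide this, but they are relatively expensive. HotSpot takes a different approach: it calls os::serialize_thread_states() so that every thread's state is written in turn to the same memory page. The implementation is as follows: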

// Serialize all thread state variables
void os::serialize_thread_states() {
  // On some platforms such as Solaris & Linux, the time duration of the page
  // permission restoration is observed to be much longer than expected due to
  // scheduler starvation problem etc. To avoid the long synchronization
  // time and expensive page trap spinning, 'SerializePageLock' is used to block
  // the mutator thread if such case is encountered. See bug 6546278 for details.
  Thread::muxAcquire(&SerializePageLock, "serialize_thread_states");
  os::protect_memory((char *)os::get_memory_serialize_page(),
                     os::vm_page_size(), MEM_PROT_READ);
  os::protect_memory((char *)os::get_memory_serialize_page(),
                     os::vm_page_size(), MEM_PROT_RW);
  Thread::muxRelease(&SerializePageLock);
}

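The VM thread issues a series of mprotect OS calls, which guarantees that all earlier writes of thread state are performed in order, and this is more efficient.

3. Running compiled code

To bring threads to a safepoint, the polling page is made unreadable. When a Java thread finds that the page cannot be read, it ends up blocked and suspended. In SafepointSynchronize::begin(), the polling page is made unreadable through os::make_polling_page_unreadable().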

if (UseCompilerSafepoints && DeferPollingPageLoopCount < 0) {
  // Make polling safepoint aware
  guarantee (PageArmed == 0, "invariant") ;
  PageArmed = 1 ;
  os::make_polling_page_unreadable();
}

The implementation of make_polling_page_unreadable() differs from one operating system to another.

Linux implementation

// Mark the polling page as unreadable
void os::make_polling_page_unreadable(void) {
  if( !guard_memory((char*)_polling_page, Linux::page_size()) )
    fatal("Could not disable polling page");
};

Solaris implementation

// Mark the polling page as unreadable
void os::make_polling_page_unreadable(void) {
  if( mprotect((char *)_polling_page, page_size, PROT_NONE) != 0 )
    fatal("Could not disable polling page");
};

During JIT compilation, the compiler inserts the safepoint check into the machine code, as in the following instructions:

0x01b6d627: call   0x01b2b210         ; OopMap{[60]=Oop off=460}
                                      ;*invokeinterface size
                                      ; - Client1::main@113 (line 23)
                                      ;   {virtual_call}
0x01b6d62c: nop                       ; OopMap{[60]=Oop off=461}
                                      ;*if_icmplt
                                      ; - Client1::main@118 (line 23)
0x01b6d62d: test   %eax,0x160100      ;   {poll}
0x01b6d633: mov    0x50(%esp),%esi
0x01b6d637: cmp    %eax,%esi

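test %eax,0x160100 is the operation that checks whether the polling page is readable. If it is not, the thread is suspended and waits.

4. Thread is blocked

Even if the thread's block condition has been satisfied, it cannot return until the safepoint operation, such as a GC, has completed.

5. Thread is transitioning between states

The thread checks the safepoint state and suspends itself if it needs to block.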
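The five cases above can be summarized as a small lookup, sketched here in Java. The enum, the class name SafepointCases, and both methods are hypothetical and exist only for this summary. The vmThreadWaitsFor() result is my reading of the descriptions above (the text states it explicitly only for native code), not something taken from the HotSpot source.

```java
final class SafepointCases {

    // Hypothetical names for the five thread situations listed above.
    enum State { INTERPRETED, NATIVE, COMPILED, BLOCKED, IN_TRANSITION }

    // How a thread in each state is stopped, as described in the text.
    static String mechanism(State s) {
        switch (s) {
            case INTERPRETED:
                return "interpreter switches to the safepoint dispatch table";
            case NATIVE:
                return "thread checks the safepoint state when returning from native code";
            case COMPILED:
                return "thread reads the polling page, which has been made unreadable";
            case BLOCKED:
                return "thread may not return from blocking until the operation completes";
            case IN_TRANSITION:
                return "thread checks the safepoint state and suspends itself";
            default:
                throw new IllegalArgumentException("unknown state: " + s);
        }
    }

    // Assumption: the VM thread has to wait only for threads that are
    // actively running Java code or changing state.
    static boolean vmThreadWaitsFor(State s) {
        return s == State.INTERPRETED || s == State.COMPILED || s == State.IN_TRANSITION;
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + ": " + mechanism(s) + " (VM thread waits: " + vmThreadWaitsFor(s) + ")");
        }
    }
}
```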

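Final implementation

When a thread accesses the protected memory address, a SIGSEGV signal is raised, which in turn invokes the JVM's signal handler to block the thread: "The GC thread can protect some memory to which all threads in the process can write (using the mprotect system call) so they no longer can. Upon accessing this temporarily forbidden memory, a signal handler kicks in."

Now let us look at how this SIGSEGV signal is handled at the lower level. The implementation is in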

hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp

// Check to see if we caught the safepoint code in the
// process of write protecting the memory serialization page.
// It write enables the page immediately after protecting it
// so we can just return to retry the write.
if ((sig == SIGSEGV) &&
    os::is_memory_serialize_page(thread, (address) info->si_addr)) {
  // Block current thread until the memory serialize page permission restored.
  os::block_on_serialize_page_trap();
  return true;
}

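os::block_on_serialize_page_trap() blocks and suspends the current thread.

How threads are resumed

Where there is a begin method there is naturally a matching end method. SafepointSynchronize::end() eventually wakes up all the suspended, waiting threads. Roughly, it does the following:

1. Make the polling page readable again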

if (PageArmed) {
  // Make polling safepoint aware
  os::make_polling_page_readable();
  PageArmed = 0 ;
}

2. Switch the interpreter to ignore_safepoints. The implementation is as follows:

// switch from the dispatch table which notices safepoints back to the
// normal dispatch table. So that we can notice single stepping points,
// keep the safepoint dispatch table if we are single stepping in JVMTI.
// Note that the should_post_single_step test is exactly as fast as the
// JvmtiExport::_enabled test and covers both cases.
void TemplateInterpreter::ignore_safepoints() {
  if (_notice_safepoints) {
    if (!JvmtiExport::should_post_single_step()) {
      // switch to normal dispatch table
      _notice_safepoints = false;
      copy_table((address*)&_normal_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
    }
  }
}

3. Wake up all suspended, waiting threads

// Start suspended threads
for(JavaThread *current = Threads::first(); current; current = current->next()) {
  // A problem occurring on Solaris is when attempting to restart threads
  // the first #cpus - 1 go well, but then the VMThread is preempted when we get
  // to the next one (since it has been running the longest). We then have
  // to wait for a cpu to become available before we can continue restarting
  // threads.
  // FIXME: This causes the performance of the VM to degrade when active and with
  // large numbers of threads. Apparently this is due to the synchronous nature
  // of suspending threads.
  //
  // TODO-FIXME: the comments above are vestigial and no longer apply.
  // Furthermore, using solaris' schedctl in this particular context confers no benefit
  if (VMThreadHintNoPreempt) {
    os::hint_no_preempt();
  }
  ThreadSafepointState* cur_state = current->safepoint_state();
  assert(cur_state->type() != ThreadSafepointState::_running, "Thread not suspended at safepoint");
  cur_state->restart();
  assert(cur_state->is_running(), "safepoint state has not been reset");
}
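The dispatch-table switch performed by notice_safepoints() and undone by ignore_safepoints() can be illustrated with a toy interpreter in Java. Everything in this sketch is hypothetical (the two opcodes, the tables, and the safepointChecks counter). It only shows the idea of copying a different handler table over the active one, so that the normal handlers carry no safepoint check at all.

```java
import java.util.function.IntUnaryOperator;

final class DispatchTableSketch {
    static final int OP_INC = 0;
    static final int OP_DOUBLE = 1;

    // Counts how often a safepoint-aware handler ran (stand-in for a real check).
    static int safepointChecks = 0;

    // Normal handlers: no safepoint check.
    static final IntUnaryOperator[] NORMAL = { x -> x + 1, x -> x * 2 };
    // Safepoint handlers: same behavior plus a check before each bytecode.
    static final IntUnaryOperator[] SAFEPT = new IntUnaryOperator[NORMAL.length];
    // The table the interpreter loop actually dispatches through.
    static final IntUnaryOperator[] active = NORMAL.clone();
    static boolean noticing = false;

    static {
        for (int i = 0; i < NORMAL.length; i++) {
            IntUnaryOperator plain = NORMAL[i];
            SAFEPT[i] = x -> {
                safepointChecks++; // a real interpreter would block here if required
                return plain.applyAsInt(x);
            };
        }
    }

    // Mirrors the shape of notice_safepoints(): copy the safepoint table over the active one.
    static void noticeSafepoints() {
        if (!noticing) {
            noticing = true;
            System.arraycopy(SAFEPT, 0, active, 0, active.length);
        }
    }

    // Mirrors the shape of ignore_safepoints(): restore the normal table.
    static void ignoreSafepoints() {
        if (noticing) {
            noticing = false;
            System.arraycopy(NORMAL, 0, active, 0, active.length);
        }
    }

    // Toy interpreter loop: one table lookup per opcode.
    static int run(int[] code, int acc) {
        for (int op : code) {
            acc = active[op].applyAsInt(acc);
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] code = { OP_INC, OP_DOUBLE };
        noticeSafepoints();
        System.out.println(run(code, 1) + " with " + safepointChecks + " checks");
        ignoreSafepoints();
    }
}
```

The design point is that the interpreter loop itself never tests a flag. The cost of checking is paid only while the safepoint table is installed.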

Impact on JVM performance

Setting the JVM flag -XX:+PrintGCApplicationStoppedTime prints how long the application was stopped, roughly like this:

Total time for which application threads were stopped: 0.0051000 seconds
Total time for which application threads were stopped: 0.0041930 seconds
Total time for which application threads were stopped: 0.0051210 seconds
Total time for which application threads were stopped: 0.0050940 seconds
Total time for which application threads were stopped: 0.0058720 seconds
Total time for which application threads were stopped: 5.1298200 seconds
Total time for which application threads were stopped: 0.0197290 seconds
Total time for which application threads were stopped: 0.0087590 seconds

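The data above shows one pause that is exceptionally long, more than 5 seconds, which is clearly unacceptable in a production environment. What caused it?

A likely cause is that when the GC was triggered, some thread took a long time to reach a safepoint and block. The threads that had already stopped had to keep waiting, and the VM thread could not start the GC until all Java threads were suspended. It is worth checking whether the application code contains large bounded loops: during JIT optimization, no safepoint check may have been inserted into such loops.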
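To spot outliers such as the 5-second pause in a long log, the stopped-time lines can be scanned with a few lines of Java. This is a minimal sketch that assumes the log uses exactly the line format shown above. The class StoppedTimeScanner and its methods are hypothetical helpers, not part of any JDK tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StoppedTimeScanner {

    // Matches the line format printed with -XX:+PrintGCApplicationStoppedTime
    // as shown above, and captures the number of seconds.
    private static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads were stopped: ([0-9]+\\.[0-9]+) seconds");

    // Returns every stopped time found in the log text, in order of appearance.
    static List<Double> parse(String log) {
        List<Double> seconds = new ArrayList<>();
        Matcher m = STOPPED.matcher(log);
        while (m.find()) {
            seconds.add(Double.parseDouble(m.group(1)));
        }
        return seconds;
    }

    // Returns only the pauses longer than the given threshold.
    static List<Double> longerThan(String log, double thresholdSeconds) {
        List<Double> result = new ArrayList<>();
        for (double s : parse(log)) {
            if (s > thresholdSeconds) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String log =
            "Total time for which application threads were stopped: 0.0051000 seconds\n"
          + "Total time for which application threads were stopped: 5.1298200 seconds\n";
        System.out.println("pauses over 1s: " + longerThan(log, 1.0));
    }
}
```

Because the scanner searches the whole text with find() instead of splitting on line breaks, it also copes with entries that were written without a newline between them.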
