Thursday, February 24, 2011

Key to Successful Automation III - Quality







The objective of test automation is to ensure the quality of the product.
However, when production code is verified by automated tests, who is responsible for the quality of the test code itself?
I bet nobody tells the boss, "Hey boss, we need more test engineers to test our testing code."

What is the impact of test automation with poor quality?
If the results are unstable, full of false positives and false negatives, test engineers lose confidence in the automation. They either spend lots of time double-checking every "possible" issue in the reports, or, as the release date closes in, give up and fall back on manual testing.
In the end, the goal of test automation, saving regression-testing effort, is not achieved, and precious time has been wasted on top of it. Bad automation is worse than no automation.

Well, now we understand how serious bad test automation is.
How can we improve it?

1. Treat Test Automation as a Project
For test automation, we should define the goal, the strategy, the schedule, the design, and the resource plan. It should be as formal as product development.
The automation schedule should always align with the product schedule. When defining the goal, quality is more important than quantity: 20 stable test cases are worth more than 200 unstable ones.
Test automation is code development, too, so a source control system is absolutely essential, and a bug tracking system is strongly recommended.
Besides, there should be a way to track development progress. My team holds a 15-minute daily sync-up meeting every morning, where we share what we accomplished yesterday, what we plan to do today, and what obstacles we have hit. We also maintain a WBS (work breakdown structure) to track the completion of every task.

2. Code Inspection
Formal code inspection involves multiple participants reviewing code together; it is intended to find defects and improve code quality. Everyone has blind spots, but collectively they can be covered.
Our practice: the scope is about 1,000 lines of code per review, and the whole test team is involved (four to six people; a bigger team can be split into feature teams). A few days before the inspection meeting, the author gives an overview of the code under review, which helps everyone read it more efficiently. Every participant must actually read the code beforehand and take notes on the defects they find.
Inspection should focus on structure, logic, and maintainability (copy-paste and hard-coded values should be fixed); insufficient comments can be raised too, but there is no need to dwell on minor typos. One common defect deserves special attention: timing issues. Test code that waits for a UI transition or a product response is often implemented with a fixed-length sleep, but the actual transition or response time varies with machine specs and environment, so fixed waits make the automation flaky: sometimes it passes, sometimes it fails. A better approach is to wait on system events or on log entries written by the product.
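The log-based wait described above can be sketched as a small polling helper. Everything here is illustrative: `wait_for`, `export_finished`, and the `product.log` sentinel string are hypothetical names, not part of any particular framework.

```python
import time

def wait_for(condition, timeout=30.0, interval=0.5):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Replaces a fixed `time.sleep(N)`: the wait ends as soon as the
    product signals completion, and fails loudly on a slow machine
    instead of silently proceeding.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    raise TimeoutError("condition not met within %.1fs" % timeout)

# Hypothetical usage: wait until the product writes "export done" to its log.
def export_finished(log_path="product.log"):
    try:
        with open(log_path) as log:
            return "export done" in log.read()
    except FileNotFoundError:
        return False
```

A test would then call `wait_for(export_finished)` instead of sleeping for a fixed period, so the same script stays stable on both fast and slow machines.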
The bar for an inspection should be: if this code were handed over to you to maintain, would you be comfortable accepting it?
During the inspection meeting, a moderator leads, drawing attention to each section of code, and inspectors raise the issues from their preparation notes. Each issue is evaluated to decide whether it is a real defect, and a recorder documents everything for follow-up. Discussion of each issue should take less than three minutes, and detailed solutions are not worked out in the meeting; anything that cannot be settled in three minutes gets its own follow-up meeting. At the end, the moderator has the reviewers vote: either the inspection is finished and someone is assigned to follow up the issues, or, if the quality is unacceptable, another inspection meeting is scheduled. The whole meeting should not run much past two hours; beyond that, attention wanders and efficiency drops.
Code inspection not only improves code quality but also brings side benefits. First, because people know their code will be reviewed, they develop tests more carefully and write more comments. Second, reading code written by experienced engineers helps junior engineers grow and reveals which functions can be reused. And when there are personnel changes, handover becomes easier.
It takes time to inspect automation code, but it is worth it.

3. Keep It Simple
Keep the test code, the debugging, and the test reports simple.
Keeping things simple frees precious time from routine tasks so people can focus on more important things, and it lets the automation deliver its maximum value.
If the test code is simple, there are fewer bugs, debugging is intuitive, and code inspection is efficient. Unit tests are a typical example of simple tests. In general, single-threaded code is also much easier to debug than multi-threaded code.
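As a minimal illustration of how small a simple, single-threaded unit test can be, here is a sketch using Python's standard `unittest` module; `normalize_version` is an invented product helper, not a real API.

```python
import unittest

def normalize_version(text):
    """Hypothetical product helper: "v1.02.3" -> (1, 2, 3)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))

class NormalizeVersionTest(unittest.TestCase):
    # No setup, no shared state, no threads: easy to read, easy to
    # debug, and quick to cover in a code inspection.
    def test_strips_prefix(self):
        self.assertEqual(normalize_version("v1.02.3"), (1, 2, 3))

    def test_plain_version(self):
        self.assertEqual(normalize_version("2.0.0"), (2, 0, 0))
```

Run with `python -m unittest`; each case exercises exactly one behavior, so a failure points straight at the bug.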
Whether debugging is simple can be checked with a few quick questions. Does the test framework support stepping through a run? Does the test collect enough product debug logs? Does the test code itself log adequately? Is the environment and system state recorded when an error occurs, for example a screenshot at that moment, or, if disk space allows, a snapshot of the virtual machine? The more information you collect, the easier debugging becomes.
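One way to collect that failure context automatically is a decorator that, when a test raises, writes the host, OS, time, and traceback to a per-test log before re-raising. This is a sketch under assumptions: the log file name is arbitrary, and the screenshot call is left as a commented-out placeholder because the right API depends on your test infrastructure.

```python
import datetime
import functools
import platform
import traceback

def capture_on_failure(test_func):
    """Record debugging context whenever the wrapped test fails."""
    @functools.wraps(test_func)
    def wrapper(*args, **kwargs):
        try:
            return test_func(*args, **kwargs)
        except Exception:
            # Capture the environment at the moment of failure.
            context = {
                "test": test_func.__name__,
                "time": datetime.datetime.now().isoformat(),
                "host": platform.node(),
                "os": platform.platform(),
                "traceback": traceback.format_exc(),
            }
            with open("failure_%s.log" % test_func.__name__, "w") as log:
                for key, value in context.items():
                    log.write("%s: %s\n" % (key, value))
            # take_screenshot(...)  # hypothetical: depends on your GUI driver
            raise  # keep the original failure visible to the test runner
    return wrapper
```

Decorating every test with such a wrapper means that even a failure at 3 a.m. on a remote machine leaves behind enough evidence to debug the next morning.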
Keeping test reports simple means that when a report comes out, it is easy to tell the current quality of the product. Auto-generated analysis of the results speeds up both reading and debugging. For instance, if ten machines run the automation, each produces its own report, but nobody wants to check them one by one; there should also be a consolidated report showing the status of all ten machines. Another example: newly written test cases are not yet stable, so can they be distinguished from the stable ones? A colleague of mine made a suggestion I like: label cases in the report as "stable" or "developing". New cases are marked "developing", so the report is not confusing to read.
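The consolidated report with stable/developing labels could be rolled up along these lines; the `results` tuples and machine names are made up for illustration.

```python
from collections import Counter

# Hypothetical per-machine results: each entry is
# (machine, test name, "stable" or "developing", passed?).
results = [
    ("host-01", "test_login",  "stable",     True),
    ("host-01", "test_export", "developing", False),
    ("host-02", "test_login",  "stable",     False),
    ("host-02", "test_export", "developing", True),
]

def summarize(results):
    """Roll per-machine reports into one summary, keeping stable and
    developing cases separate so a failure in a brand-new test does
    not shake confidence in the stable suite."""
    summary = {}
    for machine, name, maturity, passed in results:
        bucket = summary.setdefault(maturity, Counter())
        bucket["passed" if passed else "failed"] += 1
    return summary

for maturity, counts in sorted(summarize(results).items()):
    print("%-10s passed=%d failed=%d"
          % (maturity, counts["passed"], counts["failed"]))
```

A reader then sees at a glance that the stable suite has one real failure to chase, while the developing failure is expected churn.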

Test automation should be run like a luxury brand, such as PRADA.
Only boutique-quality test code can truly ensure the quality of the product.

