Protecting Mozilla Firefox users on the web

I have followed Pwn2Own ever since its inception in 2007. For those of you who do not know what Pwn2Own is, it is a competition in which hackers try to take advantage of software weaknesses in browsers (Internet Explorer, Firefox, Chrome, Safari etc.), put up specially crafted webpages and click on them to try and launch another application, usually calc.exe. They then gain a monetary reward in return. It usually happens on the sidelines of CanSecWest, a yearly security conference held in Vancouver.

During my university days in Singapore on the other side of the world, I always followed this competition with anticipation. I told myself, one day, just one day, I will be at the frontline helping to decipher the problem and help to get the fix out to Firefox users around the world as soon as possible.

Over the years, a security researcher by the name of Nils took down Firefox in 2009 (bug 484320) and in 2010 (bug 555109), whereas in 2011, nobody took down Firefox.

Last year in 2012, I was on-site in Vancouver and I witnessed Willem Pinckaers and Vincenzo Iozzo take down Firefox. However, the bug (720079) was already identified and fixed through internal processes.

This year, Pwn2Own became the venue for many exploits against major browsers, including Firefox (bug 848644), as well as other plugins which are more often used in browsers, such as Flash and Java. The team that took down Firefox this year was VUPEN Security, who also punched holes through Internet Explorer 10, Java and Flash.

Some of my colleagues / co-workers were present at the conference and were relaying us information live, while I stayed back at the office preparing my machines to diagnose the issue.

===

The following timeline (all times PST) describes my role behind the scenes with respect to the Firefox exploit by VUPEN, on March 6, 2013:

~3pm: Rumblings heard on IRC channels that Firefox has been moved from its scheduled slot to 5.30pm.

5.30pm: VUPEN gets ready.

~5.54pm: VUPEN takes down Firefox. On-site team gets to work getting details of the exploit.

~7pm: Bug 848644 gets filed.

Looking at the innings of the testcase, together with confirmation with team members over IRC that there is no malicious code present (Proof of Concept (PoC) code just crashes), I manage to reproduce the crash on a fully-patched Windows 7 system.

More analysis from early responders flow in; information such as the attack vector (Editor), Asan stack trace showing the implicated functions (possibly nsHTMLEditRules::GetPromotedPoint).

I did a quick stab at the regression range here. Using the bisection technique described here, I found that early January 2012 builds did not crash, whereas early January 2013 builds did crash.

The testcase seemed initially tricky; until it was eventually found (quite awhile later) that one could reliably trigger this with one tab that somehow caused the “pop-up blocked” info bar to show, I had to try the testcase repeatedly, sometimes reloading, sometimes closing then opening the browser again to trigger the crash.

Using mozregression here might have been a good idea – however due to an incorrect decision whether a particular build was crashing or not, one would bisect down to an incorrect regression window and waste precious time.

Time was of the essence here – the sooner one gets an accurate regression window, the faster a developer can potentially pinpoint the cause of the crash.

I found myself repeatedly downloading and checking builds to see if they did crash or not. Sometimes the crash happened immediately on load (with the initial PoC). Other times it happened only after a few minutes, or only after a restart.

I eventually settled on the following regression window: crash happens on the October 6, 2012 nightly, but not on the previous day’s (October 5), and I posted a comment, so this could get confirmation from other people. I then immediately looked through the hgweb regression window to see if anything stood out – bug 796839 seemed to be a likely cause, but everything else was still a possibility.

in that regression window, more clues emerge. The Asan stack trace pointed to nsHTMLEditRules::GetPromotedPoint being part of the bigger picture here, and some detective work showed that in this changeset from bug 796839, the file editor/libeditor/html/nsHTMLEditRules.cpp was changed, and this was the file that nsHTMLEditRules::GetPromotedPoint was located in.

Coincidence? Probably. However, this made everything more likely. At this point in time, it was 8pm, approximately one hour from the point in which the testcase was obtained.

I began to consider (and possibly discount) other possibilities, including bug 795610. Thanks to great work by Nicolas Pierron and his git wizardry, we found that nsHTMLInputElement::SetValueInternal (also implicated in the Asan stack trace), existed in nsHTMLInputElement.cpp which was modified in that bug. However, this possibility was quickly discounted.

At this point, I was able to get independent verification that the regression window (Oct 5 – Oct 6) was indeed correct. Further checking showed that our Extended Support Releases (ESR) builds on version 17 was also affected.

This made bug 796839 extremely likely to be the root cause, because it was landed on mozilla-central during the version 18 nightly window, but was backported to mozilla-aurora at that time, which was the version 17 branch. Bug 796839 would encompass the patch landing that inadvertently opened up a vulnerability in Firefox.

Independent confirmation of this regressor came at 9pm.

Within 2 hours, we had gotten from having a PoC testcase with no idea what was affected, to knowing which patch caused the issue. I thus nominated for the fix to be landed on all affected branches.

By about 10pm, the fix was put up for review. After that, lots of great work by various people/teams went towards quick approvals, landing of the fix, along with QA verification.

Overnight, builds were created and by late morning the next day, the advisory was prepared, with QA about to sign-off on the new builds.

At 4pm, a new version of Firefox (19.0.2) was shipped with the fix.

===

Credit must be given to the other Mozilla folks in this effort, who have, outside of normal day working hours, worked till late night to make this possible. I am proud to be part of this fabulous team effort.

It certainly has been my honour to have helped keep Mozilla users safe on the web.

Advertisements

MozCamp Asia 2012 (Singapore) – My experience in 5 languages: English, Mandarin, Cantonese, Singlish, Korean

MozCamp Asia just finished a few hours ago. ~200 of us gathered in this small rainy and humid city-state, at the Scape (at Somerset) and the Hub just across, and raining consistently every afternoon at approximately 3-4pm was probably an interesting experience to some folks unaccustomed to it. Welcome to a tropical climate!

Anyway, I just thought to blog my experience in the 5 parts, each containing 1 part of a spoken language that I actually spoke. I apologize in advance if parts of the upcoming multilingual paragraphs are incorrectly expressed, but plowing through 5 spoken languages at MozCamp to different communities was an incredibly extraordinary experience that I wanted to share with everyone. The regions in parentheses were regions where people I personally spoke to, actually came from.

Here goes:

  1. English (for Westerners/Others): When I first arrived at MozCamp, it had a homely feeling. I studied in Singapore for over 20 years, and after moving to the States for work, coming back was a surreal experience. Was I a local? Was I a foreigner? I just had to adapt.
  2. Mandarin (for Chinese/Taipei friends): 很快的,我又遇见了好多旧朋友,也很幸运能够遇到很多新朋友。能够说普通话/国语/中文的朋友不只来自中国大陆或台北市,我也碰到法国和澳洲朋友,能够相当流利地说中文。好神奇的世界啊。
  3. Cantonese (for Hong Kongers): 有d活動都幾得意。我特別鐘意我美國寫字樓o既一位同事o既一個活動,session名叫做 “Help the UX Team Understand Security and Privacy Concerns in Asia“. 佢個名係 “Larissa Co”. 依個活動都係幾有趣,好好玩。之後我好幸福能夠認識Sammy Fung,佢係我來自香港o既第一位Mozilla朋友。幸會,幸會!
  4. Singlish (for Singaporeans/Southeast Asian friends): And then after that I was vely vely lucky to meet people from Southeast Asia, some again and again these few years. They all very very friendly, make me sometimes miss the times when I was around here. I super enjoyed my time leh, got good local food, got many friend, all vely vely happy.
  5. Korean (for Koreans): 저는 한국친구하고 저녁식사를 먹었어요. 한국친구가 싱가포르 동네식사하고 싸다하고 좋다음식을 좋아해요. 저도 좋아해요. Night Safari 에 택시로 갔어요. Night Safari 가 재밌어요.

Note: All of the phrases, including the translations, are off the top of my head, with virtually zero references from anywhere else. I make no guarantee to their grammar correctness / colloquial updated-ness at all. Once again, I apologize if I had inadvertently made any errors.

Note 2: Cantonese and Singlish are largely spoken languages, and as such may make absolutely no sense when written down. Also, Singlish is not exactly a new language of its own, but it’s unique enough to be understandable by folks from Southeast Asia and relatively not to someone from anywhere else / the Western world, so I’ve included it in.

===

English version/translation (may not be 100% accurate):

  1. ENGLISH: When I first arrived at MozCamp, it had a homely feeling. I studied in Singapore for over 20 years, and after moving to the States for work, coming back was a surreal experience. Was I a local? Was I a foreigner? I just had to adapt.
  2. MANDARIN: Very quickly, I met up with a lot of old friends again, and was very fortunate to be able to meet a lot of new ones. Our Mandarin-speaking friends not only came from mainland China or Taipei, I also met folks from France and Australia who were able to speak somewhat decent Mandarin. What a interesting/mysterious world.
  3. CANTONESE: There were interesting and unique activities. I especially enjoyed the session titled “Help the UX Team Understand Security and Privacy Concerns in Asia“, by my co-worker also from our American office. She is Larissa Co. This activity was very interesting and fun. After the activity, I was deeply honoured to be able to meet Sammy Fung, one of our community members from Hong Kong, and the first I’ve met in person. Pleased to meet you!
  4. SINGLISH: After that, I was very lucky to be able to meet people hailing from Southeast Asia. For some, it was a case of meeting up again these few years. Everyone was very friendly, indirectly causing me to miss those days when I was in Singapore. I definitely enjoyed these few days, with lots of local foods and many happy friends.
  5. KOREAN: I went to dinner with Korean friends. Korean friends like local, cheap and good Singapore food. I like it too. Went to Night Safari in a taxi. Night Safari was interesting.

Addendum: There were times when I would get confused and mix languages up. e.g. Speaking Singlish to a community not accustomed to it, or Chinese to others. Rectification usually took a few seconds/minutes.

Valgrind builds are now green on TBPL

Valgrind builds are now green on TBPL as of this morning!

I filed bug 800435 to get the build unhidden – previously it was hidden on tbpl.mozilla.org (TBPL) because it was always fiercely burning.

Note that the some of the multiple builds you see in the screenshot were manually triggered, otherwise only one per day is automatically scheduled.

What are Valgrind builds?

Running Firefox binaries with Valgrind helps to detect run-time memory management bugs, so it finds problems like use-after-frees (invalid reads/writes), uninitialized variables as well as memory leaks.

Note that the run-time speed of the application will take a substantial hit – it can take a long time to start up a Valgrind build. Moreover, a fairly powerful computer running Linux / Mac (preferably Linux) with about 4 GB memory is recommended. Tests are thus run once a day due to the slowdown, and we currently run only PGO tests (a small subset of our tests, note that our Valgrind builds are not PGO though).

How was this accomplished?

It is important to note that a lot of work by other folks was put in a year or two ago to get Valgrind showing up before it was hidden by default on TBPL for being perma-red. I then stepped up to help turn it green since I had some experience in running JavaScript binaries with Valgrind, and we all love greenery. 🙂

When I first embarked on this about 3-4 weeks ago, I found and helped to fixed 3 harness bugs, assisted in upgrading Valgrind twice, detected 35 potential issues at the time of writing (some of which were intended leaks), with 3 non-sensitive ones being fixed, some being recent regressions and one other being potentially security-sensitive. With the issues now known and filed as bugs, they were added to suppression files which also live in the mozilla-central tree. I also accidentally stumbled on a supposed TBPL selfserve bug that turned out to be a Firefox regression.

How can we help?

This is just a small step forward. In the future, ideally we should:

and we should also incorporate AddressSanitizer (Asan) builds into TBPL. (Asan is a faster memory error detector than Valgrind, and sometimes finds a different class of bugs, but it does not detect uninitialized values the way Valgrind does)

Christian Holler [:decoder] has some regular Asan builds but they are only run through the Try servers.

Anything else?

Shout-outs go out to the following people: Julian Seward, Nicholas Nethercote, Jesse Ruderman, Ted Mielczarek, Releng folks Chris AtLee, Nick Thomas and Rail Aliiev, our sheriffs edmorley, philor, RyanVM, and all others whom I have inadvertently left out. Without any of your collaboration and hard work we would be unable to have this set of Valgrind greenery. You folks rock!

Edit: Bug 800435 has been fixed, Valgrind builds now appear by default, thanks Ehsan! See the following screenshot:

Edit 2: It’s been re-hidden again because it’s “not a tier-1 platform“.

Bonus: Here’s a video showing the Endeavour Space Shuttle flypast in Mountain View. Just in case you haven’t seen it. 🙂