Vulnerability found in Bash

2014 must be a really bad year for the open source community when it comes to security.

Less than 6 months after Heartbleed was found in OpenSSL, Bash has now been found vulnerable to remote code execution. This time I’m not sure whether it’s because of poor funding or something else.

Maybe now is a good time to look back at how the Heartbleed bug came about. Bruce Schneier posted a very good article on this.

Time for ConnectedFlight?

Black Thursday for airlines? Last Thursday, MH-17, yes, Malaysia Airlines again, was shot down. This Thursday, AH-5017, Air Algérie, crashed in Mali.

These incidents happened only 4 months after the disappearance of MH-370, about which no one has a clue yet.

Only after MH-370 did people realize that modern airplanes are not so modern after all. Intercontinental airplanes cruising over oceans are more like ballistic rockets than guided missiles.

Actually, not all airlines are that primitive. I’ve tried the Wi-Fi service on an Emirates A380, and it’s rather decent. So technically this shouldn’t be an issue. For tracking the position of an airplane, something like 16 kbps per plane should be enough. Let’s say there are 50,000 airplanes in the sky at any time: we would need an 800 Mbps satellite link. Hmm, that shouldn’t be so difficult to get. If I remember correctly, one satellite can typically provide 1 Gbps.
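The back-of-the-envelope bandwidth estimate can be checked in a couple of lines. This is only a sketch of the arithmetic; the two constants are the rough guesses from the paragraph above, not measured figures.

```python
# Rough guesses taken from the text, not measured values.
PLANES_IN_THE_AIR = 50_000
KBPS_PER_PLANE = 16

total_kbps = PLANES_IN_THE_AIR * KBPS_PER_PLANE
total_mbps = total_kbps / 1000  # 1 Mbps = 1000 kbps

print(f"{total_mbps:.0f} Mbps")  # prints "800 Mbps"
```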

With this “ConnectedFlight”, at least I’ll know the position if an airplane disappears. Should I register this trademark now? 🙂

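Trying out Office Open XML

I tried out Microsoft’s Office Open XML and found it quite full-featured. Many automation tasks that I used to implement with Word macros can now be done with Office Open XML. Microsoft Office has finally taken a big step forward in openness.

While I was at it, I took a close look at how docx is implemented. The XML is really complex; I don’t know why Microsoft designed it this way :-(.

Even with Office Open XML, inserting footnote-style (superscript) formatting requires adding three levels of XML elements: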

// Assumes "using System.Xml;", an XmlDocument xd1, and Class1.namespaceURI holding the "w" namespace URI.
XmlElement element1 = xd1.CreateElement("w", "r", Class1.namespaceURI);          // run
XmlElement element2 = xd1.CreateElement("w", "rPr", Class1.namespaceURI);        // run properties
XmlElement element3 = xd1.CreateElement("w", "vertAlign", Class1.namespaceURI);  // vertical alignment
element3.SetAttribute("val", Class1.namespaceURI, "superscript");
element2.AppendChild(element3);
element1.AppendChild(element2);

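The three elements are: run, a stretch of text with uniform formatting; rPr, the run properties, i.e. the set of formatting attributes for that text; and vertAlign, the vertical alignment property.

When I have time, I’ll consider processing Word files with Linux + Mono, which would be much more nimble than Windows.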
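For comparison, the same three nested elements can be built without Windows or .NET at all. The following is a minimal sketch using Python’s standard xml.etree.ElementTree module. The function name superscript_run is hypothetical, and the extra w:t text element is my addition so that the run carries some visible text; neither comes from the C# snippet above.

```python
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace, conventionally bound to the "w" prefix.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Return the namespace-qualified name for a w: element or attribute."""
    return f"{{{W_NS}}}{tag}"


def superscript_run(text):
    """Build <w:r><w:rPr><w:vertAlign w:val="superscript"/></w:rPr><w:t>...</w:t></w:r>."""
    run = ET.Element(w("r"))                     # run: text with uniform formatting
    props = ET.SubElement(run, w("rPr"))         # run properties
    align = ET.SubElement(props, w("vertAlign"))
    align.set(w("val"), "superscript")
    text_node = ET.SubElement(run, w("t"))       # added here: the text itself
    text_node.text = text
    return run


print(ET.tostring(superscript_run("1"), encoding="unicode"))
```

The resulting element still has to be placed inside a paragraph in the document part of the docx package, which is outside the scope of this sketch.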

Wow, what happened to TrueCrypt?

I heard about this last week but didn’t have time to check it out. I can’t believe it’s so dramatic!

I’m a regular user of TrueCrypt, but I don’t visit its website. I’ve never had to ask for support. It just works.

Now this is what is on their website:

 

truecrypt_website_after_may

REALLY?!

The only explanation that makes sense to me is that the developers were forced (by threats or beatings) to abandon this great software. Why? Because it’s so good that some secret agency simply cannot break into a TrueCrypt-encrypted disk.

Now I have all the more reason to continue using TrueCrypt!

Building a self-aware device – part 1

The canonical way to tell whether an animal (or anything) is self-aware is the mirror test. Put it in front of a mirror: if the animal recognizes itself, then it’s self-aware; otherwise it’s not.

But then what exactly is self-awareness? What exactly does the mirror test test? Can we make self-aware machines? I was pondering this because I recently realized that vehicles (or in general any device) are considered dumb not only because they have no intelligence, but also because they are not self-aware.

Take a car for example. It’s considered dumb not only because it cannot make any intelligent decision on its own, but also because it will do things that are obviously against its own best interest as long as that’s the command from a human being. Wouldn’t it be interesting if we could build a car that cares about itself and avoids crashes and collisions out of its own interest?

Back to the mirror test. Essentially, it tests the ability of an animal to recognize itself through a visual signal, an optical reflection. Let’s try to break it down by replacing the non-essential parts.

First of all, there seems to be no obvious reason to limit ourselves to visual signals. Vision is just one form of signal that a lot of animals can sense easily. A lot of other animals rely on other senses. For example, bats are known to be able to tell their own ultrasound signals from those of others. If we’re not limited to visual reflection, then recognizing oneself through a reflected or echoed signal is not that difficult. For example, a device could simply broadcast its own identity through ultrasound, like a bat.

We can build a device that broadcasts its own identity through ultrasound; let’s say the identity takes the form of a GUID. Now our device will be able to tell its own signal apart from other signals. Is that enough to be self-aware?
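To make the idea concrete, here is a toy sketch of such a device. Everything in it is hypothetical: the Device class and its methods are invented for illustration, and the ultrasound channel is reduced to passing a string around.

```python
import uuid


class Device:
    """Toy device that tags its broadcasts with a fixed GUID (hypothetical design)."""

    def __init__(self):
        self.identity = uuid.uuid4()

    def broadcast(self):
        # Stand-in for emitting the identity over ultrasound.
        return str(self.identity)

    def is_own_signal(self, signal):
        # "Recognition" here is a plain equality check against a hard-coded identity.
        return signal == str(self.identity)


car = Device()
other_car = Device()
print(car.is_own_signal(car.broadcast()))        # own echo
print(car.is_own_signal(other_car.broadcast()))  # someone else's signal
```

Note that nothing is learned here: the check only works because the identity is fixed in advance.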

Most cars manufactured nowadays have more than one ultrasonic sensor built in. They beep when they sense the danger of crashing into something. It seems that they are kind of self-aware, but not quite, right?

We have to take a closer look at the mirror test. When a self-aware animal first sees itself in a mirror, it has no prior knowledge about its own appearance in the mirror. Then how exactly does it come to the conclusion that the object inside the mirror is a visual representation of itself? The only possibility is that the animal actually learns it by experimenting.

Wikipedia actually has a full description of the mirror test being conducted the first time:

In 1970, Gordon Gallup, Jr., experimentally investigated the possibility of
self-recognition with two male and two female wild pre-adolescent chimpanzees (Pan
troglodytes), none of which had presumably seen a mirror previously. Each chimpanzee
was put into a room by itself for two days. Next, a full-length mirror was placed in
the room for a total of 80 hours at periodically decreasing distances. A multitude
of behaviors were recorded upon introducing the mirrors to the chimpanzees. Initially,
the chimpanzees made threatening gestures at their own images, ostensibly seeing their
own reflections as threatening. Eventually, the chimps used their own reflections for
self-directed responding behaviors, such as grooming parts of their body previously
not observed without a mirror, picking their noses, making faces, and blowing bubbles
at their own reflections.

From this description it’s obvious that the recognition is a learning process. This observation has several implications:

First of all, because it’s a learning process, it’s very flexible and very adaptive. The animal doesn’t have to stand still in front of a mirror to recognize itself. Even if its physical appearance later changes dramatically, the animal will be able to recognize itself again very quickly.

In contrast, if a device just broadcasts its own identity, it may have difficulties in a noisy environment. Or, if for some reason we have to change the identity, then we also have to change the verification logic.

Secondly, we indeed don’t have to limit ourselves to visual reflection. Voice will also do. Touch will also do. In fact, people who are born blind are able to recognize themselves through other senses, and we don’t think they are not self-aware.

Last, a very subtle prerequisite of such a learning process is that the animal has to know its own properties and boundary. Otherwise, the animal won’t be able to know whether a waving arm is its own or not (with or without a mirror). Every animal knows its own properties and boundary (this is my fur, this is my claw, etc.). If we’re trying to design devices to be self-aware, we have to build in this capability as well.

So by now I think we can define self-awareness as:

  1. Knows one’s own properties and boundary;
  2. Able to learn one’s own identity from experiments.

Heartbleed vulnerability

Just saw some friends sharing this on WeChat. Seems I will have a lot of work to do: patching my servers, replacing certificates, regenerating keys, etc. :(

It’s really unbelievable. This will be a big blow to the open source community. I already saw people saying “see? Open source is no more secure”. Well, they are right.

To me, this has nothing to do with being open source or not. M$ may have something even worse, but it would take longer to find out. I just searched the web and found that OpenSSL, a library that is so widely used, has only one full-time developer and receives on average $2,000 in donations per year. What do you expect?

But that simply won’t justify such a disaster, and it will give the open source community in general a bad image. I can only hope leaders in this industry see this differently and start to support these great open source projects. They deserve better!

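How to use SPF records

I have muddled through using them several times, but never studied them seriously. Forum posts disagree on how to use them, perhaps because each applies to a different scenario, and a complete introduction is rare. I just came across this DigitalOcean article, which explains it fairly clearly.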

How To use an SPF Record to Prevent Spoofing & Improve E-mail Reliability

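As a TXT record, an SPF entry has the following structure:

<spf preamble> <address list>

v=spf1 is the marker that starts an SPF record. Version 1 is currently the only version of SPF, so this is the only form.

It is followed by a list of addresses or names, with items separated by spaces.

Each item in the address list can be preceded by one of four qualifiers:

  1. + means the address is explicitly allowed to send mail from the domain. This qualifier can be omitted.
  2. - means the address is explicitly forbidden from sending mail from the domain.
  3. ? means there is no policy for the address for now.
  4. ~ means a soft policy should be applied to the address: the receiver should accept the mail but may mark it specially.

The list items themselves take the following forms:

  1. all matches any address;
  2. a matches any A record address of the domain;
  3. ip4: matches the IPv4 address or address range that follows it;
  4. ip6: matches the IPv6 address or address range that follows it;
  5. mx matches any MX record address of the domain.

So, suppose the domain domainabc.com has the following SPF record:

v=spf1 ip4:123.23.4.5 -all

This record means that mail claiming to come from domainabc.com should be rejected if it is sent from any address other than 123.23.4.5.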

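For high availability, a domain usually has at least two SMTP servers running, so the record is more likely to look like this:

v=spf1 ip4:123.23.4.5 ip4:123.23.4.6 -all

If the domain already has MX records pointing to exactly 123.23.4.5 and 123.23.4.6, the record can be simplified to:

v=spf1 mx -all

If one day domainabc.com needs to allow an additional network range to send mail, the record can be changed to:

v=spf1 mx ip4:107.8.9.0/24 -all

Suppose domainabc.com needs to migrate its mail server: the MX records have already been switched, but the old mail server has to be phased out gradually. The record can first be changed like this:

v=spf1 mx ip4:107.8.9.0/24 ?ip4:123.23.4.5 -all

Here 123.23.4.5 is the mail server that is about to be retired. After a while, the record can be changed to:

v=spf1 mx ip4:107.8.9.0/24 ~ip4:123.23.4.5 -all

Finally, once all internal applications are confirmed to be using the new server, drop the old server’s entry so that its address falls under the final -all and is rejected:

v=spf1 mx ip4:107.8.9.0/24 -all

Note that the final “-all” matches any address, so it rejects every sender that was not matched by an earlier item. That’s it.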
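To see how a receiver might walk through such a record, here is a minimal sketch in Python. It is not a full SPF implementation: the function name check_spf is hypothetical, only the all, ip4: and ip6: items are evaluated, and items that need DNS lookups (a, mx) are simply skipped. Items are checked from left to right and the first match decides the result.

```python
import ipaddress

# Qualifier characters and the result each one produces on a match.
QUALIFIERS = {"+": "pass", "-": "fail", "?": "neutral", "~": "softfail"}


def check_spf(record, sender_ip):
    """Evaluate a simplified SPF record against a sender IP (sketch only)."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default when no qualifier is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term == "all":
            return QUALIFIERS[qualifier]
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        # "a" and "mx" would need DNS lookups; this sketch skips them.
    return "neutral"  # nothing matched


print(check_spf("v=spf1 ip4:123.23.4.5 -all", "123.23.4.5"))
print(check_spf("v=spf1 ip4:123.23.4.5 -all", "123.23.4.6"))
```

With the first record above, 123.23.4.5 matches the ip4: item and passes, while any other address falls through to -all and fails.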

Linear regression in Excel – continued

Doing linear regression in Excel is simple. Now I’d like to show you how exactly this is done.

Let’s use the same data set:

x 9.42 2.78 5.87 5.49 6.48 4.91 7.82 2.70 2.65 8.42 5.85 0.93
y 52.24 4.15 13.72 7.00 5.28 6.98 42.00 7.15 3.41 31.00 12.59 1.09

Let’s say we want to do an order-2 linear regression. That means we want a function:

f(x)=c*x^2 + b*x+a

a, b and c should be chosen such that, when we put each x into f(x), the difference (error) between f(x) and y is minimized in the least-squares sense.

So we have the difference R as a function of a,b,c:

R(a,b,c)=(c*9.42^2+b*9.42+a-52.24)^2+(c*2.78^2+b*2.78+a-4.15)^2+…+(c*0.93^2+b*0.93+a-1.09)^2

R is smooth everywhere in space. So R is minimal where:

D(R)/D(a)=0; D(R)/D(b)=0; D(R)/D(c)=0

Now we’ll have to employ the compact form to avoid tedious typing:

equ1

In order for R to be minimal, we need:

equ2

Inserting R into the equations and expanding them, we get:

equ3

That is:

equ4

Or in matrix form:

equ5

Now this is a very simple equation, so we can simply solve it using Cramer’s rule:

equ6
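For reference, the derivation above can be written out as follows. This is a reconstruction from the definitions of f(x) and R given earlier, not a copy of the original formulas: n denotes the number of data points (12 here), all sums run over i = 1..n, and the names A, A_a, A_b and A_c are labels introduced here for illustration.

```latex
\begin{align*}
R(a,b,c) &= \sum_{i=1}^{n} \left( c x_i^2 + b x_i + a - y_i \right)^2 \\[4pt]
\frac{\partial R}{\partial a} &= 2 \sum \left( c x_i^2 + b x_i + a - y_i \right) = 0 \\
\frac{\partial R}{\partial b} &= 2 \sum \left( c x_i^2 + b x_i + a - y_i \right) x_i = 0 \\
\frac{\partial R}{\partial c} &= 2 \sum \left( c x_i^2 + b x_i + a - y_i \right) x_i^2 = 0 \\[4pt]
n\,a + b \sum x_i + c \sum x_i^2 &= \sum y_i \\
a \sum x_i + b \sum x_i^2 + c \sum x_i^3 &= \sum x_i y_i \\
a \sum x_i^2 + b \sum x_i^3 + c \sum x_i^4 &= \sum x_i^2 y_i
\end{align*}

\[
\underbrace{\begin{pmatrix}
n & \sum x_i & \sum x_i^2 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{pmatrix}}_{A}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}
=
\begin{pmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{pmatrix}
\]

\[
a = \frac{\det A_a}{\det A}, \qquad
b = \frac{\det A_b}{\det A}, \qquad
c = \frac{\det A_c}{\det A}
\]
```

Here A_a, A_b and A_c are the matrix A with its first, second and third column, respectively, replaced by the right-hand-side vector.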

It looks scary but is actually not too complex. We need the sums of x, x^2, x^3, x^4, y, x*y and x^2*y. We can actually verify this in Excel:

     x      y       x^2       x^3       x^4       x*y       x^2*y
     9.42   52.24   88.69578  835.323   7866.942  491.974   4633.334
     2.78   4.15    7.721688  21.45697  59.62447  11.53932  32.06537
     5.87   13.72   34.43254  202.0476  1185.6    80.52797  472.5321
     5.49   7.00    30.18894  165.8715  911.3719  38.47401  211.3934
     6.48   5.28    41.99053  272.099   1763.204  34.19838  221.6059
     4.91   6.98    24.08858  118.227   580.2595  34.23907  168.0457
     7.82   42.00   61.22442  479.0568  3748.43   328.6334  2571.426
     2.70   7.15    7.269068  19.59829  52.83935  19.27266  51.96143
     2.65   3.41    7.003793  18.53532  49.05312  9.028243  23.89296
     8.42   31.00   70.90802  597.0945  5027.948  261.0414  2198.149
     5.85   12.59   34.27254  200.6409  1174.607  73.71933  431.5732
     0.93   1.09    0.869064  0.810173  0.755272  1.014426  0.945685
Sum  63.32  186.61  408.665   2930.761  22420.63  1383.662  11016.92

So we have:

calculation_0

calculation_1calculation_2

calculation_3

So in the end we get:

cal_result

This is exactly what Excel displayed on the chart:

trend_line_res

I actually put all this in an Excel file so you can try it yourself. Please note that the Excel file uses a data set of 12 points. If your data set has a different size, you may have to adapt the formulas a little bit. 🙂
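If you prefer to check the numbers outside Excel, the same steps (sums, 3x3 determinants, Cramer’s rule) can be sketched in plain Python. The function names det3 and fit_quadratic are made up for this example. Also note that the x and y values below are the two-decimal values shown in this post, so the coefficients may differ slightly from the ones Excel computes from the unrounded data.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_quadratic(xs, ys):
    """Least-squares fit of f(x) = c*x^2 + b*x + a; returns (a, b, c)."""
    n = len(xs)
    sx = sum(xs)
    sx2 = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x ** 2 * y for x, y in zip(xs, ys))

    # Normal equations: matrix * (a, b, c) = rhs
    matrix = [[n, sx, sx2], [sx, sx2, sx3], [sx2, sx3, sx4]]
    rhs = [sy, sxy, sx2y]
    d = det3(matrix)

    def with_column(col):
        # Copy of the matrix with one column replaced by rhs (Cramer's rule).
        return [[rhs[r] if k == col else matrix[r][k] for k in range(3)]
                for r in range(3)]

    return tuple(det3(with_column(col)) / d for col in range(3))


xs = [9.42, 2.78, 5.87, 5.49, 6.48, 4.91, 7.82, 2.70, 2.65, 8.42, 5.85, 0.93]
ys = [52.24, 4.15, 13.72, 7.00, 5.28, 6.98, 42.00, 7.15, 3.41, 31.00, 12.59, 1.09]

a, b, c = fit_quadratic(xs, ys)
print(f"f(x) = {c:.4f}*x^2 + {b:.4f}*x + {a:.4f}")
```

Unlike the spreadsheet formulas, this version works for a data set of any size, because n is taken from the length of the input.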

Linear regression in Excel

Excel has a very handy feature that does linear regression for you very intuitively.

Let’s say you have a data set like this:

x 9.42 2.78 5.87 5.49 6.48 4.91 7.82 2.70 2.65 8.42 5.85 0.93
y 52.24 4.15 13.72 7.00 5.28 6.98 42.00 7.15 3.41 31.00 12.59 1.09

First, you’ll have to insert a scatter chart:

insert_scatter_chart

You’ll get a chart like this (Excel is smart enough to figure out what should be the x-axis and what should be the y-axis):

scatter_chart_ex1

Now, right click on the dots and then you can select “Add Trendline…” from the pop-up menu.

add_trendline

Then you’ll be presented with a dialog like this:

trend_line_options

Note that Excel has different options for “Linear” and “Polynomial”. Here the option describes the relationship between the variables. A polynomial fit is still a linear regression, because the fitted function is linear in its coefficients.

Let’s try an order-2 polynomial regression and check the options “Display Equation on chart” and “Display R-squared value on chart”. This is what you’ll get:

order_2_regression

The good thing about doing this in Excel is that you can always change the data set, and your chart and equation will be updated automatically.

Depending on the nature of your data set, you may want to try different models to get a better fit. The procedure will be the same.