## Some thoughts on Vehicle Telematics & Big Data application in CRM

From an OEM’s perspective, every new technology has to serve the ultimate goal: selling more cars.

Traditionally, just like other business departments, CRM could only base its decisions on sales-related data. In comparison, data collected directly from the vehicle via telematics has the following advantages:

• much better data: OEMs get data from customers directly instead of through third parties, which improves both the accuracy and the completeness of the data;
• much higher frequency: instead of once in a while, we get updates from customers/vehicles constantly;
• higher flexibility: OEMs are able to change the data collection policy and setup with minimal management overhead.

The same can be said for the other direction – delivering information from OEMs to customers. The advantages of precise targeting are so well established that I don’t even need to list them here.

The only side effect of doing CRM with vehicle telematics is that it accumulates a huge amount of data: Big Data.

Interestingly, for OEMs, the motivation to employ Big Data hasn’t been strong enough until recently. That’s because OEMs already have established practices in traditional business intelligence, and the added value of changing to a new technology hasn’t been obvious or significant.

It’s true that Big Data, as a field of special expertise, stemmed from the real-world need to store and process an ever-increasing amount of data, and a whole set of technologies was developed to deal with the challenges posed by that volume. Along with this development, however, the focus of Big Data has shifted from volume and velocity towards data mining, artificial intelligence and machine learning. It’s in these aspects that Big Data may provide unique value to CRM compared to traditional BI.

Take the sales pipeline for example:

1. Campaign
3. Opportunities
4. Sales
5. Client
6. Retention

Traditional wisdom dictates that the data in each stage has to be complete, reliable and continuous, otherwise the model won’t make much sense. However, that is actually a limitation of the tools and the model. Traditional BI copes poorly with incomplete data; the model it is based upon cannot handle fuzziness.

In the real world, however, incomplete data is the norm, and fuzziness is simply the nature of humans as opposed to machines. Luckily, techniques have been developed in Big Data to deal with incomplete data and fuzziness. With these techniques, a CRM system will behave more like a human, making predictions based on incomplete data with probabilities in mind.
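As one illustration of predicting from incomplete records (not necessarily what any OEM actually deploys), a naive Bayes classifier can simply skip the fields that are missing and still return a probability. The sketch below is a toy version in Python; the feature names `visited_dealer` and `long_commute`, the labels and the records are all hypothetical.

```python
from collections import defaultdict

def train(records):
    """records: list of (features, label); a feature value of None means 'missing'."""
    label_counts = defaultdict(int)
    # value_counts[label][feature][value] counts observed (non-missing) values only
    value_counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for features, label in records:
        label_counts[label] += 1
        for name, value in features.items():
            if value is not None:
                value_counts[label][name][value] += 1
    return label_counts, value_counts

def predict_proba(model, features, n_values=2):
    """Return {label: probability}; missing features contribute no evidence."""
    label_counts, value_counts = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = count / total  # prior
        for name, value in features.items():
            if value is None:
                continue  # skip missing fields instead of rejecting the record
            seen = value_counts[label][name]
            observed = sum(seen.values())
            # Laplace smoothing, assuming each feature has n_values possible values
            score *= (seen[value] + 1) / (observed + n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

# Hypothetical toy records with gaps in them
records = [
    ({"visited_dealer": True, "long_commute": True}, "buy"),
    ({"visited_dealer": True, "long_commute": None}, "buy"),
    ({"visited_dealer": None, "long_commute": True}, "buy"),
    ({"visited_dealer": False, "long_commute": False}, "no"),
    ({"visited_dealer": False, "long_commute": None}, "no"),
    ({"visited_dealer": None, "long_commute": False}, "no"),
]
model = train(records)
print(predict_proba(model, {"visited_dealer": True, "long_commute": None}))
# {'buy': 0.75, 'no': 0.25}
```

When every field is missing, the prediction falls back to the prior, which is the behaviour one would expect from a person with no information.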

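## Creating a DVD Video disc

DVD Video uses a standard DVD disc as its storage medium: 12 cm in diameter, read with a 650 nm laser.

DVD Video uses the UDF Bridge file system, which is compatible with the ISO 9660 file system.

A DVD Video disc has the following directory structure: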

## A brief history of cracking the Chinese map coordinate (GCJ-02) offset algorithm

In January 2010, a user named wuyongzheng discovered:

> I accidentally found the Chinese version of Google Map ditu.google.com to be able to correlate satellite image with map, and it gives the amount of deviation for any location in China. This URL queries the deviation of 34.29273N,108.94695E (Xi’an): http://ditu.google.com/maps/vp?spn=0.001,0.001&t=h&z=18&vp=$34.29273,108.94695 (seems it doesn’t work now)

With enough data available, wuyongzheng suggested using regression to approximate the offset algorithm: https://wuyongzheng.wordpress.com/2010/01/22/china-map-deviation-as-a-regression-problem/

Earlier attempts had been sporadic and limited to individual cities. wuyongzheng’s suggestion was the first step toward solving the problem comprehensively and systematically.

In May 2013, Maxime Guilbot followed this suggestion and obtained an approximation accurate to 4-5 meters: https://github.com/maxime/ChinaMapDeviation

In October 2013, wuyongzheng ran the regression himself and obtained the following results: http://wuyongzheng.github.io/china-map-deviation/paper.html

The regression results of Maxime Guilbot and wuyongzheng are roughly the best that could be achieved by groping in the dark, so they received wide attention and use.

On another track, in April 2010 the emq project added a file, Converter.java: http://emq.googlecode.com/svn/emq/src/Algorithm/Coords/Converter.java

This code converts WGS-84 coordinates to GCJ-02 coordinates with very high accuracy.

In February 2013, a user named coolypf noticed this code, cleaned it up, and used it in his own project: https://on4wp7.codeplex.com/SourceControl/changeset/view/21483#353936

The key part of the code is worth quoting here:

```csharp
const double pi = 3.14159265358979324;

//
// Krasovsky 1940
//
// a = 6378245.0, 1/f = 298.3
// b = a * (1 - f)
// ee = (a^2 - b^2) / a^2;
const double a = 6378245.0;
const double ee = 0.00669342162296594323;

//
// World Geodetic System ==> Mars Geodetic System
// (outOfChina, transformLat and transformLon are not part of this excerpt)
public static void transform(double wgLat, double wgLon, out double mgLat, out double mgLon)
{
    if (outOfChina(wgLat, wgLon))
    {
        mgLat = wgLat;
        mgLon = wgLon;
        return;
    }
    double dLat = transformLat(wgLon - 105.0, wgLat - 35.0);
    double dLon = transformLon(wgLon - 105.0, wgLat - 35.0);
    double radLat = wgLat / 180.0 * pi;
    double magic = Math.Sin(radLat);
    magic = 1 - ee * magic * magic;
    double sqrtMagic = Math.Sqrt(magic);
    dLat = (dLat * 180.0) / ((a * (1 - ee)) / (magic * sqrtMagic) * pi);
    dLon = (dLon * 180.0) / (a / sqrtMagic * Math.Cos(radLat) * pi);
    mgLat = wgLat + dLat;
    mgLon = wgLon + dLon;
}
```

In March 2013, coolypf described this code on his blog: http://blog.csdn.net/coolypf/article/details/8686588

In September 2014, wuyongzheng noticed coolypf’s project. At this point the two tracks merged, and the coordinate offset problem was essentially solved.

As the code above shows, compared with WGS-84, GCJ-02 on the one hand uses a different reference ellipsoid (SK-42, Krasovsky; presumably a legacy of Soviet influence) and on the other hand introduces a high-frequency nonlinear offset.

## RSA illustration with not-so-small numbers – part 2

Let’s have a closer look at the encryption. During the communication, what has been exposed is Alice’s public key (n=2627, e=13) and the encrypted message.

For anyone who has entered the world of modern cryptography from the era of classical ciphers, it’s tempting to try to decrypt the encrypted message using the encrypting key, the public key. For these people, I have the chart below, which shows the mapping between the plain text and the encrypted data: the x-axis is the plain-text data (sorted from 1 to 2627) and the y-axis is the encrypted data (from 0 to 2626). I did the calculation using this line of script:

```
~$ for i in $(seq 1 2627); do echo "$i^13 % 2627" | bc; done > /tmp/encryption.mapping
```

Below is part of this chart zoomed-in:

So you know the encrypted data, let’s say 2144, and you know the public key (n=2627, e=13). How do you find the number x such that x^13 % 2627 = 2144?

You cannot, unless you compute every possible 0<x<2627 and then pick the correct one. That’s brute force. This is one of the basic assumptions behind the security of RSA: there’s no efficient way to find x. This is known as the RSA problem (taking an e-th root modulo n); it is related to, but not the same as, the discrete logarithm problem, where the unknown is the exponent.
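With n this small, the exhaustive search is of course instantaneous, which makes it easy to try out. A minimal Python sketch, assuming Python 3 and using the built-in three-argument `pow` for modular exponentiation (the function name `brute_force_plaintext` is made up for this example):

```python
N, E = 2627, 13  # Alice's public key from the example

def brute_force_plaintext(ciphertext, n=N, e=E):
    """Encrypt every candidate x and keep those that match the ciphertext."""
    return [x for x in range(n) if pow(x, e, n) == ciphertext]

matches = brute_force_plaintext(2144)
print(matches)  # exactly one candidate survives
```

The search costs about n modular exponentiations, so it grows with the size of n itself rather than with the number of digits in n.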

In real-world scenarios, the two prime numbers are so large that brute force is simply impractical.

To decrypt the message, then, one needs the private key. The private exponent d is the modular inverse of e modulo phi(n). However, in order to get phi(n), an attacker has to know the prime factors of n, and factoring a large number is believed to be computationally hard. That is the other assumption behind the security of RSA: there’s no efficient way to factor a large number.
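For the toy key used here, factoring is trivial, and once n is factored the private exponent follows immediately. A short Python sketch of that attack, assuming Python 3.8+ (for `pow(e, -1, phi)`); `factor_semiprime` is a name invented for this example and uses plain trial division:

```python
def factor_semiprime(n):
    """Trial division: return (p, q) with p <= q and p * q == n."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 1
    raise ValueError("n has no nontrivial factor")

n, e = 2627, 13                # the public key is all the attacker knows
p, q = factor_semiprime(n)
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)            # modular inverse of e modulo phi(n)
print(p, q, phi, d)            # 37 71 2520 1357
```

Trial division needs on the order of sqrt(n) steps, which is why it stops being an option once n has hundreds of digits.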

As you will see in other places, hardness assumptions like these two are the cornerstones of modern public-key cryptography.

## RSA illustration with not-so-small numbers

Modern cryptography is difficult to understand without illustrations. One of the reasons is that modern cryptography involves very large numbers that easily exceed the capacity of a standard calculator, let alone human comprehension. There are some illustrations out there using small numbers. The problem is, the numbers are too small to be convincing. So I’d like to try some not-so-small numbers here. Most of the necessary calculations can be done with GNU bc, so you can try them yourself on just about any GNU/Linux distribution.

Let’s say Bob wants to send the below number to Alice (and make sure only Alice can decrypt the message):

520

Here’s what Alice will do first:

1. Pick two distinct prime numbers. They should be large enough that brute force is difficult. Here we choose p=37 and q=71.
2. Calculate n=pq=37*71=2627.
3. Calculate the totient of n: phi(n)=(p-1)*(q-1)=36*70=2520.
4. Pick a number e between 1 and phi(n) that is co-prime with phi(n). Here we choose 13.
5. Find the number d such that e*d mod phi(n) = 1. Here we get 1357. bc has no built-in function for this step. Instead, you can try this online calculator: just put “modinv(13,2520)” in the text field and press “go” to get the result.
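If you would rather not depend on an online calculator for step 5, the modular inverse can be computed with the extended Euclidean algorithm. The following is a small Python sketch of the five steps, not a production key generator; `extended_gcd` and `modinv` are helper names chosen for this example.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y

def modinv(e, m):
    """Return d in [0, m) with (e * d) % m == 1, or raise if none exists."""
    g, x, _ = extended_gcd(e, m)
    if g != 1:
        raise ValueError("e and m are not coprime, so no inverse exists")
    return x % m

p, q, e = 37, 71, 13          # steps 1 and 4: the choices made above
n = p * q                     # step 2
phi = (p - 1) * (q - 1)       # step 3
d = modinv(e, phi)            # step 5
print(n, phi, d)              # 2627 2520 1357
```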

Now Alice has a public key (n=2627, e=13) and a private key (n=2627, d=1357). She can simply distribute her public key to everyone, including Bob.

To send the message 520 to Alice, Bob encrypts it using Alice’s public key:

520^13 % 2627 = 2235

Now Alice receives the number 2235 from Bob. In order to decrypt this message, she does the following calculation:

2235^1357 % 2627 = 520

In fact, Bob can encrypt any non-negative number smaller than n in this way.

Bob:

1^13 % 2627 = 1

Alice:

1^1357 % 2627 = 1

Bob:

2^13 % 2627 = 311

Alice:

311^1357 % 2627 = 2

Bob:

3^13 % 2627 = 2361

Alice:

2361^1357 % 2627 = 3

Bob:

4^13 % 2627 = 2149

Alice:

2149^1357 % 2627 = 4

Bob:

137^13 % 2627 = 2431

Alice:

2431^1357 % 2627 = 137
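The same round trips can be reproduced in a few lines of Python, again as a sketch that relies only on the built-in three-argument `pow`; `encrypt` and `decrypt` are names picked for this example. The last line checks that every value below n survives the round trip.

```python
n, e, d = 2627, 13, 1357  # Alice's key pair from the example

def encrypt(m):
    return pow(m, e, n)   # what Bob computes with the public key

def decrypt(c):
    return pow(c, d, n)   # what Alice computes with the private key

for m in (520, 1, 2, 3, 4, 137):
    c = encrypt(m)
    print(m, "->", c, "->", decrypt(c))

# Every message below n decrypts back to itself
print(all(decrypt(encrypt(m)) == m for m in range(n)))
```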

If his message is large, then he has to split his message into chunks that are smaller than n and encrypt them one by one.
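One simple way to do the splitting, shown here only as an illustration, is to treat the message as a big integer and write it in base n, so that every chunk is automatically smaller than n. The helper names `to_chunks` and `from_chunks` and the sample message are assumptions of this sketch; real systems use padding schemes and hybrid encryption rather than encrypting raw chunks like this.

```python
N, E, D = 2627, 13, 1357  # key pair from the example

def to_chunks(message, n=N):
    """Split a non-negative integer into base-n digits, least significant first."""
    chunks = []
    while True:
        message, digit = divmod(message, n)
        chunks.append(digit)
        if message == 0:
            return chunks

def from_chunks(chunks, n=N):
    """Inverse of to_chunks."""
    value = 0
    for digit in reversed(chunks):
        value = value * n + digit
    return value

message = 123456789  # hypothetical message, larger than n
cipher_chunks = [pow(c, E, N) for c in to_chunks(message)]          # Bob
recovered = from_chunks([pow(c, D, N) for c in cipher_chunks])      # Alice
print(recovered == message)  # True
```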

Note that this only illustrates how Bob can send secret messages to Alice. If Alice wants to send secret messages to Bob, then she has to have Bob do the same first:

1. Pick 2 sufficiently large prime numbers;
2. Get the product of these 2 prime numbers. This is part of both keys;
3. Get the totient of this product;
4. Pick a number that is smaller than this totient and co-prime with it. This, combined with the product, is the public key;
5. Find the modular multiplicative inverse of this number modulo the totient. This, combined with the product, is the private key.

Then Bob sends his public key to Alice and Alice can encrypt the messages using Bob’s public key. Upon receiving the messages, Bob can decrypt the messages using his private key.

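## About the Chinese map coordinate offset

• What is the map coordinate offset?

• Are the maps offered by different vendors in China consistent with each other?

• What about GPS devices?

GPS devices usually return WGS-84 coordinates, so plotting them directly on a GCJ-02 map gives inaccurate positions. There is no evidence that GPS signals or GPS chips have been modified. GPS devices made in China can return GCJ-02 coordinates, but it is unclear whether this conversion is implemented in hardware or can be done in software.

• How can a map be offset without anyone noticing?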

## How to dodge “the Great Cannon”

I don’t want to go into details and risk my own blog. Basically, one of the scripts that is very common among websites was targeted, and redirection code was injected into it.

Using Adblock, you can simply block this script:

And then you won’t get redirected. It’s that simple. 🙂

There might be other scripts I haven’t encountered yet, but you should be able to use the same technique to block them as well.

## Stereotyping and its costs

Recently I watched this

And this:

I’ve been watching TED videos for years now, but these two still felt like eye-openers.

People may say, “Oh come on, these are TED videos, right? They are meant to impress people.” I’m actually not that easily impressed. I’m not talking about the technology or the plasticity of the human brain. I’m talking about the very fact that a disabled person could become an MIT professor and lead a world-class research team, or could be so sharp, so articulate and appear so *normal*.

Despite all the pride of being Chinese, we have to admit that this would not happen in modern China.

If Mr. Hugh Herr had been born in China, he would probably, at best, have dropped out of school very early on and attended a special school, or even worse, simply stayed at home, completely isolated. If Mr. Daniel Kish were in China, he would not have had the chance to share his personal experience with others. Instead, with his outstanding ability, he would probably have ended up making a living by showing it off in a circus (or in the Beijing subway, if circuses fall out of favor completely).

The reason behind the differences, I believe, lies primarily in everyone’s mind.

I happen to know the concept of “stereotype threat”. For those who don’t, according to Wikipedia it is “one of the most widely studied topics in the field of social psychology”, and it concerns the impact that stereotyping has on the people being stereotyped. As it turns out, a lot of performance gaps between groups can be explained by stereotype threat. I personally believe that stereotype threat is the key reason behind the performance gap between people with disabilities in China and people with disabilities in the US.

Let’s face it: China is still a country full of biased stereotypes. It’s true that stereotyping is part of human nature and that stereotypes exist in every society. However, China stands out in allowing stereotypes to go unchecked in every corner of everyday life: TV programs, newspapers, magazines, even textbooks for children. As a consequence, people are so used to all sorts of stereotypes that no one even bothers to stand up against them, even though everyone has been a victim of one form of stereotype or another.

I have to admit that I only started to pay attention to this topic after my wife and I had a child. We are lucky: our daughter is healthy in every respect. However, as new and inexperienced parents, whenever our daughter was sick we sometimes became scared and couldn’t help but think about all kinds of what-if scenarios.

Out of this kind of reasoning, I became a person who is conscious of stereotypes. Bit by bit I recalled how I had struggled against all sorts of stereotypes about myself when I was young. I started to realize how I had stereotyped others and how destructive that could be. Everyone is a victim of this inescapable net of stereotyping.

So, on this special day, I propose one thing we could do to bring positive changes to China, without disturbing the government: reflect on ourselves and stop stereotyping.