March 16, 2010

Companion.JS rocks in IE JS debugging

Human & Machine

I was having a hard time debugging JavaScript in Internet Explorer (IE). Luckily I found the Companion.JS add-on: it works great in at least IE 7 and is very easy to install on a plain Windows box :D

The following article gives a detailed explanation on this handy tool's motivation and installation:

    http://www.grepsedia.com/development/debugging-javascript-in-ie-without-visual-studios

With this add-on, yesterday I quickly located a nasty issue in jQuery's $.html method, which failed to construct DOM subtrees in IE. By setting the node's innerHTML property directly, I finally got my web app running happily in both IE 7 and IE 6 :D

No pain any more.

March 16, 2010 01:57 AM

March 01, 2010

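Repairing an LG KS360 phone

rogerz

The LG phone my wife bought a year ago suddenly would not power on. Sliding the cover made an odd noise, and after a few more slides a bit of wire poked out: the flex cable had snapped. I searched a whole street of shops in my hometown without finding the part, so the repair had to wait until I was back in Shanghai.

With nothing but a crude household multi-bit screwdriver, I eventually managed to get the phone open. One trick for disassembly is to look at where the warranty sticker sits: if it is stuck directly across a seam, that seam can be pried open; if it is stuck on the casing, there is usually a screw underneath. Also, for the sake of appearance, screws exposed on the casing are sometimes covered by a plastic cap that you can pry off to reveal them. Never use brute force when taking a device apart: when something is hard to pry open, always check for hidden screws. For example, in the picture below there are three more screws at the positions symmetric to the three pink discs, and they only become visible once the keyboard is removed.

After that it was easy: I swapped in a flex cable bought on Taobao and the phone worked normally again. Total cost: 23 RMB plus half an hour of labor. The flex cable in a slider phone gets twisted back and forth every day, so it really is a part that fails easily.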

March 01, 2010 12:49 PM

February 26, 2010

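We need outstanding interns to help push our nginx C development forward

Human & Machine

As I described on my personal blog, we are building a new generation of OpenResty on top of nginx in Taobao's SDS department. We now have headcount for a few intern positions. There is no degree requirement; we hire on ability, heh.

Ideally you are familiar with Linux C programming and non-blocking socket programming, know what C10K is, and can read English technical material without difficulty. Experience developing Apache modules or Nginx modules would make it perfect. Knowledge of compiler techniques and of streaming parsing of protocols over TCP is a plus, as is knowledge of automated testing and test-driven development. But what we value more is the ability to learn and to embrace new things, and passion for technology. After all, experience can be gained quickly through real work, heh.

The work mainly consists of writing all kinds of nginx C modules to round out the OpenResty platform, which in turn supports online applications including Taobao's Quantum shop statistics (淘宝量子店铺统计). We care about high-performance, high-concurrency, and highly reliable web application frameworks, as well as rich-client applications such as pure AJAX and Flash. Here, clusters, architecture, and similar design topics are perennial themes ;)

Most of the nginx module components we have already developed and open-sourced can be found on the following page:

   http://wiki.nginx.org/Nginx3rdPartyModules

The modules whose authors include chaoslawful or agentzh are ours, for example the ngx_echo module, the ngx_drizzle module, the ngx_rds_json module, and so on.

Two more nginx C modules that are fairly complete but not yet released are here:

    http://github.com/agentzh/set-misc-nginx-module

    http://github.com/agentzh/array-var-nginx-module

If you or your friends are interested in this internship, please contact us by sending a résumé to chunlai at taobao dot com or agentzh at gmail dot com. The job is located near the East Third Ring Road in Chaoyang District, Beijing.

Because this is an internship, we cannot offer a high salary, but we can offer the best learning and hands-on environment we can. After all, even worldwide, advanced nginx development positions and environments like ours are rare, heh.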

February 26, 2010 03:01 AM

February 25, 2010

Weave sync

wanglianghome

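I recently switched back from Chrome to Firefox, solely because of Weave Sync.

Weave Sync can synchronize five kinds of data: bookmarks, passwords, preferences, history, and tabs. I enabled only the last two. I don't sync bookmarks because I don't keep any in the browser; they all go to delicious.com. I don't sync passwords because I'm still a little uneasy: although everything is said to be encrypted before it is uploaded to Mozilla's servers, for something at version 1.0.1 I'd rather be cautious, since a single bug could be disastrous. And I have no special preference settings, so I don't sync those either.

Commendably, Firefox and Weave Sync also work on Nokia's Internet tablets such as the N800 and N810, as well as on the new-generation N900 smartphone. Syncing passwords saves you the trouble of typing them on a mobile device, and syncing history and tabs lets you quickly pick up reading where you left off on another device.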

February 25, 2010 01:51 PM

February 22, 2010

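Notes on Nantong long cards (长牌)

rogerz

I spent half of this Chinese New Year in my hometown and the other half in my wife's hometown. Besides visiting relatives and friends, one more "gain" was learning the long cards (长牌) game that is popular in the Nantong area.

There are few useful introductions online. This illustrated article on Nantong double-general long cards (南通双将长牌) is relatively comprehensive. The same blog has some other related posts, such as a Nantong long cards primer (with pictures of the cards). Most other search results are reposts of one and the same article, which duplicates the Nantong long cards entry on Baidu Baike.

If you know mahjong, learning long cards is not hard. The ways to complete a hand are basically the same as winning hands in mahjong; only the scoring rules differ. Also, recognizing the patterns that stand for numbers is laborious at first: there is really no regularity to them, so you just have to memorize them.

Playing cards is like life: you cannot control the hand you are dealt, but you can choose how to play it.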

February 22, 2010 01:05 PM

February 05, 2010

Free but secure login mechanism

wanglianghome

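blogger.com is about to stop supporting publishing via FTP, and I am one of the unlucky 0.5%. Installing a blog platform is not particularly hard. The problem is that if I don't want my password flying around this scary network in plain text, I need SSL, and on DreamHost that means paying extra for a unique IP.

I don't want to pay.

After searching Google for a while, I finally found a decent solution: logging in with OpenID. Google, Yahoo, and the like all work as providers.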

February 05, 2010 01:51 PM

Start tweeting

Human & Machine

I've just registered a twitter account and here's my page

   http://twitter.com/agentzh

Feel free to follow me. And I'll try sticking to English in my messages ;)

February 05, 2010 06:25 AM

February 03, 2010

Download podcast with miro and privoxy

wanglianghome

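I like using Miro to download and play podcasts, and even video. Awkwardly, though, some feeds cannot be reached normally. Miro does support an HTTP proxy, but going through a proxy, especially downloading via Tor, is very slow, so I had to subscribe with Google Reader instead. The drawback of listening online is that it doesn't suit long content, like the Stack Overflow podcast, which often runs over an hour.

Then I finally discovered, and learned to use, Privoxy.

Privoxy is an HTTP proxy with functionality similar to the Firefox extension FoxyProxy: it can choose a different proxy for different URLs, including using no proxy at all. On Ubuntu 9.10, Privoxy's configuration file is /etc/privoxy/config. Adding a single line makes Miro go through the Tor proxy when fetching FeedBurner feeds, while file downloads use no proxy.

forward-socks5 feeds.feedburner.com 127.0.0.1:9050 .

Note the trailing dot.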

February 03, 2010 12:44 PM

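A tale of thousands of rps on a single core for dynamic applications

Human & Machine

Over the past four months we have developed 8 nginx C modules (the 8th, named ngx_set_misc, went up on GitHub just a couple of days ago), and more, even more exciting modules and even applications are on the way. These modules may look small and scattered, but they are actually building blocks of a grander project called ngx_openresty.

One of our directions is this: a few client-side .js files + a few .html/.css files + nginx.conf + a mysql/oracle/pgsql database should be enough to easily build a complete interactive web application, or at least a part of a very complex one.

Unfortunately, some peers in the industry have questioned this style of application development, arguing that our application-level extensions make nginx itself bloated and inefficient. This post is an informal rebuttal of those doubts. The goal is not to embarrass anyone or to show off, but to win more friends over to our efforts in the nginx space.

On top of the ngx_openresty prototype, we have developed a fairly complete blog application using nginx.conf and pure client-side JavaScript. The source of the blog application is here: http://github.com/agentzh/ngx_openresty/tree/master/demo/Blog/ and the nginx.conf file used on the server side is here: http://agentzh.org/misc/nginx.conf

In our load tests (run on my ThinkPad), the AJAX interface of this blog application handles rps in the thousands on a single core, and small requests under keepalive with a few hundred concurrent connections reach ~ 6000 rps, all with the nginx-level mysql result-set cache disabled. Memory, CPU, and other resource usage is only a fraction of that of traditional applications such as PHP, heh. The main reasons are:

  1. Carefully written, highly reusable C code does the work that PHP and similar code used to do.
  2. The vast majority of applications are I/O-bound, so letting nginx's efficient event model uniformly schedule all of the application's network I/O (including I/O to backends such as the DB) can only improve performance further, not reduce it. By contrast, accessing mysql from a scripting language such as PHP always blocks the current PHP process or thread, and the cost of a blocked PHP interpreter is very high.
  3. We can always scale linearly with LVS + nginx on multiple machines. Total resources are a constant and do not grow just because you start a few more processes or threads, as long as nothing is wasted. Blocking on I/O operations, pointless switching and contention among many processes and threads, copying request and response data over sockets between the backend application and nginx, and so on: those are the waste.
  4. With protocols such as fastcgi, scripting languages like PHP inevitably have to parse the HTTP headers, URL arguments, and so on all over again, and do their own URL dispatching, whereas nginx has usually already parsed and processed them.
  5. Rendering HTML from templates is now done by the client's browser: client-side templates such as Jemplate generate the final HTML from JSON data, which further saves computation on the server. With a few thousand requests per second and cache misses, that saves a few thousand HTML renderings per second; compared with generating JSON, this adds up to a considerable amount, heh.
  6. Existing FastCGI backend solutions (PHP in particular) unavoidably buffer all the data of incoming requests and outgoing responses (whether in RAM or in temporary files on disk) and cannot do streaming processing efficiently, so the overhead is huge for applications with large amounts of data.
If there are reasons I haven't thought of, please write in to add them, heh.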
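
To make reason 2 a bit more concrete, here is a minimal Python sketch of the idea (it is not nginx code, and every name in it is made up for illustration): a single-threaded event loop overlaps several slow "backend queries" instead of blocking on each one in turn. asyncio.sleep stands in for a non-blocking round trip to a database.

```python
import asyncio


async def fake_query(name, delay, log):
    # Hypothetical stand-in for a non-blocking backend call (e.g. a DB query).
    log.append(("start", name))
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking it
    log.append(("done", name))
    return name.upper()


async def handle_requests(names, delay=0.01):
    # One thread, one event loop: all queries are in flight at the same time.
    log = []
    results = await asyncio.gather(*(fake_query(n, delay, log) for n in names))
    return results, log


def run_demo():
    return asyncio.run(handle_requests(["a", "b", "c"]))
```

In the log returned by run_demo(), every query is started before any of them finishes, which is the overlap a blocking, process-per-request model cannot provide without extra processes or threads.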

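In fact our direction is not especially novel. Apache 2.2 provided, years ago, the powerful apr-dbd API and the mod_dbd module to offer direct connections from the server to database backends such as mysql/pgsql/oracle/odbc.

When we worked at Yahoo!, we also built online services handling several thousand rps on a single machine with Apache 2.2 mod_dbd + a PostgreSQL backend; that system served Yahoo! Finance. Even in that setup, where DB I/O blocked Apache worker threads, we did not find Apache's production performance to be bad at all, heh.

Of course, we have the chance to do better in the nginx world than Apache dbd ;) nginx's internal architecture is far superior to Apache2's. Really, heh.

So let's face reality and embrace simplicity and high performance together! Heh!

Update: In the near term we will promote programming purely in the nginx.conf configuration file. The long-term goal is to integrate coco lua and concurrent lua into the nginx core, so that Lua can be used directly in nginx.conf to script the nginx core, while providing erlang-like transparent non-blocking I/O, transparent message passing across workers and across machines, and JIT acceleration :D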

February 03, 2010 04:20 AM

February 01, 2010

ngx_drizzle v0.0.7: now running on *BSD

Human & Machine

We're proud to announce the v0.0.7 release of the ngx_drizzle module.

ngx_drizzle is an nginx upstream module that can help nginx talk directly to mysql and other RDBMS backends that support the mysql or drizzle TCP protocol.

This release includes patches from our new contributor, Piotr Sikora, that make this module work with kqueue and other event models on *BSD systems. (select, poll, epoll, and kqueue have now all been tested; with rtsig, however, the server hangs.)

This release also fixes a long-standing bug caught by Piotr, in which the nginx process could crash when DB queries completed in a single run of upstream function calls under high traffic.

We have also tested this module with the recent new release of libdrizzle, v0.7.

Release tarballs can be downloaded below

   http://github.com/chaoslawful/drizzle-nginx-module/downloads

Enjoy!

February 01, 2010 09:17 AM

January 29, 2010

ngx_memc v0.06: new directive memc_flags_to_last_modified

Human & Machine

I'm pleased to announce the v0.06 release of the ngx_memc module:

   http://wiki.nginx.org/NginxHttpMemcModule

This release has the highlight of a new directive named "memc_flags_to_last_modified".

If this directive is turned on, then for memcached get operations, ngx_memc will read the memcached flags as epoch seconds and use them as the value of the Last-Modified header.

For conditional GET requests, it will signal nginx's not_modified_filter module to return the "304 Not Modified" response to save bandwidth.

Here's a small sample config that I've tested with ngx_memc v0.06:

  # read the memcached flags into the Last-Modified header
  # to respond 304 to conditional GET
  location /memc {
      set $memc_key $arg_key;

      memc_pass 127.0.0.1:11984;

      memc_flags_to_last_modified on;
  }


My GET request sets the following header:

   If-Modified-Since: Thu, 28 Jan 2010 12:09:23 GMT

And the key stored in memcached has the flags 1264680563. Then we get

 $ curl -H 'If-Modified-Since: Thu, 28 Jan 2010 12:09:23 GMT' \
                   -I 'http://localhost:1984/memc?key=foo'
 HTTP/1.1 304 Not Modified
 Server: nginx/0.8.32
 Date: Fri, 29 Jan 2010 07:10:52 GMT
 Last-Modified: Thu, 28 Jan 2010 12:09:23 GMT
 Connection: keep-alive
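
For readers who want to check the arithmetic, here is a small Python sketch of the conversion involved. It is not ngx_memc's code, and the function names are made up for illustration; it only shows how the flags value 1264680563 maps to the HTTP date above and how a conditional GET can then be answered with 304.

```python
from email.utils import formatdate, parsedate_to_datetime


def last_modified_from_flags(flags):
    # Interpret the memcached flags as epoch seconds and format an HTTP date.
    return formatdate(flags, usegmt=True)


def is_not_modified(flags, if_modified_since):
    # True when the stored timestamp is not newer than the client's copy,
    # i.e. when a "304 Not Modified" response is appropriate.
    since = parsedate_to_datetime(if_modified_since)
    return flags <= since.timestamp()
```

With the values from the example, last_modified_from_flags(1264680563) yields the Last-Modified value shown in the curl output, and is_not_modified() is true for the If-Modified-Since header sent.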

How about setting the flag with ngx_memc too? Well, I'm going to implement the $echo_time and $echo_http_time variables in our ngx_echo module so that we can have

     set $memc_flags $echo_time;
     add_header Last-Modified $echo_http_time;


for memcached storage operations set/add/etc. That is, $echo_time will return the epoch seconds while $echo_http_time returns the textual representation in the HTTP date format. Patches welcome! Volunteers welcome!

Have fun!

January 29, 2010 07:28 AM

January 28, 2010

Skip svn branch for git svn clone

wanglianghome

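Normally, git svn clone does the job just fine. For example, to convert Google v8:

git svn clone -s http://v8.googlecode.com/svn/ v8

Recently, however, the experimental branch changed and git svn fetch could no longer proceed. We can simply ignore that branch by enumerating every branch we do want to fetch. The original configuration was: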

[svn-remote "svn"]
        url = http://v8.googlecode.com/svn
        fetch = trunk:refs/remotes/trunk
        branches = branches/*:refs/remotes/*
        tags = tags/*:refs/remotes/tags/*

After the change:

[svn-remote "svn"]
        url = http://v8.googlecode.com/svn
        fetch = trunk:refs/remotes/trunk
        fetch = branches/bleeding_edge:refs/remotes/bleeding_edge
        tags = tags/*:refs/remotes/tags/*

January 28, 2010 11:02 AM

Commenting re-enabled on this blog site :)

Human & Machine

I've just turned on the commenting permission on this site for I no longer have any fear of spam ;)

January 28, 2010 08:23 AM

January 26, 2010

ngx_xss: Native support for cross-site scripting in nginx

Human & Machine

I'm delighted to announce the first release of our new module, ngx_xss. This output filter module adds native support for simple cross-site AJAX to the nginx server. Currently only cross-site GET is implemented, but cross-site POST support is on our TODO list.

Here's a small example that uses it together with our ngx_echo module:

   location /foo {
       default_type "application/json";
       echo '{"errcode":400,"errstr":"Bad Request"}';

       xss_get on; # enable cross-site GET support
       xss_callback_arg callback; # use $arg_callback
   }

Then accessing /foo?callback=blah gives the following response:

  blah({"errcode":400,"errstr":"Bad Request"}
  );

And the ultimate response Content-Type is set to "application/x-javascript", which can be overridden by the "xss_output_type" directive like this

   xss_output_type text/javascript;

By default, the ngx_xss module filter will skip responses with Content-Type set to anything other than "application/json". If that's not what you want, you can use the "xss_input_types" directive to override that:

   xss_input_types text/plain text/css;

This module can also be chained with other output filters like ngx_rds_json:

  xss_get             on;
  xss_callback_arg    _callback;

  location /query {
     drizzle_query "select name from products limit 0, 10";
     drizzle_backend my_mysql;

     rds_json on;
  }

Then you can expect something like this when doing GET /query?_callback=OpenResty.callback[32]

   OpenResty.callback[32]([{"name":"Bike"},{"name":"Book"}]);

Be careful with the order of output filters when building nginx. The ngx_rds_json filter expects a valid binary stream in the RDS format, while the ngx_xss filter expects JSON text. If you get the order wrong, your ngx_xss settings will be completely ignored in the final responses, because the ngx_xss filter then sees the RDS data first and skips it due to its "application/x-resty-dbd-stream" content type.

Below is the correct nginx configure command if you want to use ngx_xss and ngx_rds_json together:

 ./configure \
   --add-module=/path/to/xss-nginx-module \
   --add-module=/path/to/rds-json-nginx-module \
   # more options omitted here...

You see, the order in which output filters are added at configure time is exactly the reverse of the order in which they are applied at runtime. Generally speaking, the nginx output filter chain is a stack, not a queue ;)
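
The stack behaviour can be modelled in a few lines of Python. This is a toy model under stated assumptions, not nginx internals: the two filter functions are hypothetical stand-ins, a Python list plays the role of the binary RDS stream, and the only point is that the runtime order is the reverse of the configure order.

```python
import json

RDS_TYPE = "application/x-resty-dbd-stream"


def rds_json_filter(ctype, body):
    # Stand-in for ngx_rds_json: only touches RDS responses.
    if ctype != RDS_TYPE:
        return ctype, body
    return "application/json", json.dumps(body, separators=(",", ":"))


def make_xss_filter(callback):
    # Stand-in for ngx_xss: only wraps JSON responses in the callback.
    def xss_filter(ctype, body):
        if ctype != "application/json":
            return ctype, body
        return "application/x-javascript", f"{callback}({body});"
    return xss_filter


def run_filters(configure_order, ctype, body):
    # Filters added later at configure time run earlier at runtime (a stack).
    for output_filter in reversed(configure_order):
        ctype, body = output_filter(ctype, body)
    return ctype, body
```

With the configure order [xss, rds_json] the RDS data is converted to JSON first and then wrapped; with [rds_json, xss] the xss stand-in sees RDS, skips it, and the callback never appears in the response.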

Only a very limited set of callback values is allowed to prevent JavaScript injection. Valid callback values can be expressed using the following (Ragel) grammar:

  identifier = [$A-Za-z] [$A-Za-z0-9_]*;

  index = [0-9]* '.' [0-9]+
        | [0-9]+
        ;

  main := identifier ( '.' identifier )*

              ('[' index ']')?

This is exactly the Ragel grammar used to generate the C validator used by the ngx_xss module itself.
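
For experimenting outside nginx, the same grammar can be approximated with a regular expression. The following Python sketch is my own transcription of the grammar above, not the module's generated C validator, and is_valid_callback is a made-up name.

```python
import re

# identifier = [$A-Za-z] [$A-Za-z0-9_]*
IDENT = r"[$A-Za-z][$A-Za-z0-9_]*"
# index = [0-9]* '.' [0-9]+ | [0-9]+
INDEX = r"(?:[0-9]*\.[0-9]+|[0-9]+)"
# main := identifier ( '.' identifier )* ( '[' index ']' )?
CALLBACK_RE = re.compile(rf"{IDENT}(?:\.{IDENT})*(?:\[{INDEX}\])?")


def is_valid_callback(value):
    # The whole value must match; anything else is rejected.
    return CALLBACK_RE.fullmatch(value) is not None
```

Note that an identifier may not start with an underscore under this grammar, so a callback value such as _cb is rejected even though _callback is fine as the name of the URL argument.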

Here goes the project home page & code repository:

  http://github.com/agentzh/xss-nginx-module

as well as the download page for release tarballs:

  http://github.com/agentzh/xss-nginx-module/downloads

Enjoy!

January 26, 2010 10:34 AM

January 25, 2010

Find the latest file change time (ctime) across several directories, honoring the time zone

purl in your heart

TZ=Asia/Shanghai perl -MPOSIX -le 'print map {$_=strftime(q(%Y-%m-%d %H:%M:%S),localtime($_))} (sort( map { (stat($_))[10] } map {glob(qq($_/*))} @ARGV))[-1]' /tmp .
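
The one-liner is dense, so here is a rough Python equivalent as a sketch (the function names are made up). Like the Perl version, it globs the non-hidden entries directly under each directory, reads field 10 of stat (the ctime, i.e. the inode change time), and formats the largest value in local time; set TZ in the environment to pick the time zone.

```python
import glob
import os
import time


def latest_ctime(directories):
    # Collect every non-hidden entry directly under each directory.
    paths = [p for d in directories for p in glob.glob(os.path.join(d, "*"))]
    # st_ctime corresponds to (stat($_))[10] in the Perl one-liner.
    return max(os.stat(p).st_ctime for p in paths)


def format_local(timestamp):
    # Same format string as the strftime call in the one-liner.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
```

Usage would be something like format_local(latest_ctime(["/tmp", "."])); max() raises ValueError when the directories contain no entries.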

January 25, 2010 11:02 PM

time format

TZ=Asia/Shanghai perl -MPOSIX -le 'print strftime("%Y-%m-%d %H:%M:%S %Z",(localtime(time)))'

# This long line just simulates the date command: % TZ=Asia/Shanghai date '+%Y-%m-%d %H:%M:%S %Z'

January 25, 2010 10:25 PM

January 22, 2010

Genesis 49:22-26. Blessing to Joseph

purl in your heart

22 Joseph is a fruitful bough, a fruitful bough by a spring; his
branches run over the wall. 23 The archers bitterly attacked him, shot
at him, and harassed him severely, 24 yet his bow remained unmoved;
his arms were made agile by the hands of the Mighty One of Jacob (from
there is the Shepherd, the Stone of Israel), 25 by the God of your
father who will help you, by the Almighty who will bless you with
blessings of heaven above, blessings of the deep that crouches
beneath, blessings of the breasts and of the womb. 26 The blessings of
your father are mighty beyond the blessings of my parents, up to the
bounties of the everlasting hills. May they be on the head of Joseph,
and on the brow of him who was set apart from his brothers.

January 22, 2010 07:49 AM

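Enumeration support in Simulink

rogerz

Matlab Simulink has supported enumerated types since release R2008b, and Real-Time Workshop can generate the corresponding C code for them.

Syntactically, support for enumerated types takes a roundabout route: an enumeration is a class derived from Simulink.IntEnumType. The methods section is optional.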

classdef(Enumeration) BasicColors < Simulink.IntEnumType
  enumeration
    Red(0)
    Yellow(1)
    Blue(2)
  end
  methods (Static = true)
    function retVal = getDefaultValue()
      retVal = BasicColors.Blue;
    end
  end
end

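Enumerations have two benefits. First, during Simulink simulation you can use and display the names you defined rather than numbers. Second, the generated C code is more readable.

In use, you must create a separate definition file for each enumerated type and put it on the path. When using an enumerated value as a parameter, you must give its full name, i.e. BasicColors.Blue, which differs from how C does it. When displayed, however, only Blue is shown.

The C code generated from this example looks like this: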

typedef enum {
  Red,
  Yellow,
  Blue
}BasicColors;

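From this mechanism we can infer that you can define identically named values in two different enumerated types, because in Simulink they are always used with the enumeration class prefix, but that doing so will cause a conflict during code generation, because enumerators in C carry no prefix.

Experiments confirm exactly this. One negative consequence is that enumerated values used in Simulink end up needing very long names.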

January 22, 2010 07:39 AM

January 19, 2010

ngx_rds_json: help ngx_drizzle and other DBD modules emit JSON data

Human & Machine

I'm happy to announce the first release of our ngx_rds_json module that can convert Resty DBD Streams (RDS) to JSON.

As some of you might have noticed, the mysql/drizzle DBD driver module ngx_drizzle generates a specific binary stream in a format known as RDS, which we defined ourselves. We introduced RDS because we didn't want to be bound to a specific textual format, which would make internal data exchange and conversion unnecessarily hard.

As web app developers, we're certainly more interested in popular textual formats like JSON, YAML, CSV, or even HTML. This module does the job of formatting RDS as JSON in a truly streaming fashion.

The project is hosted on GitHub, like our other nginx modules:

    http://github.com/agentzh/rds-json-nginx-module/

Release tarballs can be downloaded from here

   http://github.com/agentzh/rds-json-nginx-module/downloads

Just like the ngx_drizzle module, this module is currently considered highly experimental, but it's maturing very rapidly because it's part of our $project at Taobao.com. If you find any bugs or have any wishes, please send us mail or create tickets on GitHub.

Here are some typical use cases drawn from ngx_rds_json's test suite ( http://github.com/agentzh/rds-json-nginx-module/blob/master/test/t/sanity.t ):


   mysql db init

    create table cats (id integer, name text);
    insert into cats (id) values (2);
    insert into cats (id, name) values (3, 'bob');

   nginx.conf

    upstream backend {
        drizzle_server 127.0.0.1:3306 dbname=test
             password=some_pass user=monty
             protocol=mysql;
    }
    server {
        ...
        location /mysql {
            drizzle_pass backend;
            drizzle_query 'select * from cats';
            rds_json on;
        }
    }

   request

       GET /mysql

   response

   [{"id":2,"name":null},{"id":3,"name":"bob"}]


   mysql db init

       (ditto)

   nginx.conf

    upstream backend {
        # ditto
    }
    server {
        ...
        location /mysql {
            if ($arg_name ~ '[^A-Za-z0-9]') {
                return 400;
            }
            drizzle_pass backend;
            drizzle_query "update cats set name='$arg_name' where name='$arg_name'";
            rds_json on;
        }
    }

   request

       GET /mysql?name=bob

   response

    {"errcode":0,"errstr":"Rows matched: 1  Changed: 0  Warnings: 0"}


   mysql db init

    create table foo (id serial, flag bit);

   nginx.conf

    upstream backend {
        # ditto
    }
    server {
        ...
        location /mysql {
            if ($arg_bit !~ '^[01]$') {
                return 400;
            }
            drizzle_pass backend;
            drizzle_query "insert into foo (flag) values ($arg_bit);";
            rds_json on;
        }
    }

   request

       GET /mysql?bit=1

   response

   {"errcode":0,"insert_id":1,"affected_rows":1}

You'll see fancier (working) use cases in the test suite, by combining ngx_echo module's echo_location or echo_location_async directive. Parallel SQL queries can be very useful for certain applications.

As a side note, chaoslawful++ is working on the ngx_rds_tt2 module, which will allow us to use Perl TT2's template language to specify custom output formatters for RDS. Here's a quick example that will work very soon:

  location /myxml {
     drizzle_query 'select * from products';
     drizzle_pass my_mysql;

     echo_before_body '<?xml version="1.0"?>';
     echo_before_body '<pie>';

     rds_tt2_line_template
        '<slice title="[% title | xml %]" color="[% color | xml %]">'
           '[% count %]'
        '</slice>';

     echo_after_body '</pie>';
  }

And you will get streaming output as well, just buffered at the data line level. (In contrast, ngx_rds_json does not buffer data for large fields like BLOBs.)

And there will be ngx_srcache that can allow you to cache database output by ngx_memc + memcached as well ;)

Stay tuned!

January 19, 2010 08:18 AM

January 15, 2010

More interesting ideas about our new nginx modules...

Human & Machine

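Today Marcus Clyne and I have been discussing the interface of the ngx_list_var module. The latest version I proposed looks like this:

  list_map "name='$it'" $arg_names --to $names --sep ",";
  list_join " or " $names --sep "," --to $condition;

This way, for arg_names=dog,cat,tiger, we can assemble the following SQL condition in $condition: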

  name='dog' or name='cat' or name='tiger'
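
Since these directives are only a proposal, here is a small Python sketch of the intended semantics as I understand them from the example above. The function names simply mirror the proposed directive names; nothing here is an existing nginx API.

```python
def list_map(template, value, sep=","):
    # Apply the template to every item; "$it" marks the current item.
    items = value.split(sep)
    return sep.join(template.replace("$it", item) for item in items)


def list_join(glue, value, sep=","):
    # Re-join the separated items with a different glue string.
    return glue.join(value.split(sep))


def build_condition(arg_names):
    names = list_map("name='$it'", arg_names)
    return list_join(" or ", names)
```

Calling build_condition("dog,cat,tiger") reproduces the condition shown above. The sketch does no quoting or validation of the items, which is exactly the gap the operators discussed next are meant to fill.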

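Other modules can also register new operators for use with list_map. For example, the ngx_drizzle module could provide a "drizzle_quote" operator that follows the quoting rules for mysql constants, so users could write:

  list_map_op drizzle_quote $names --args type=string;

Here --args specifies extra arguments for the map operator. So we have effectively implemented a simple form of currying; in Haskell notation it would be

  map (drizzle_quote string) names

This way nginx variables can be used as lists and can even play with lambdas, which further reduces the need for ngx_lua and php. Of course, we don't want to end up building another lua or another php ourselves; balance is important ;)

I have always felt that we are also missing an ngx_urlencode module; for example, the values in $arg_xxx are all left unescaped. The nginx core does provide a C API for this, but no interface is exposed in the config file. I hope that in the future we can write this in nginx.conf:

  set_url_unescape $names $arg_names;

or the inverse operation:

  set_url_escape $escaped_url $url;

The ngx_urlencode module could also register a map operator with ngx_list_var, which would be sweet:

  list_map_op url_unescape $arg_names --to $names;

This way it works even when $arg_names contains %xx sequences, for instance when the values contain Chinese characters.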
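
As an illustration of what such a url_unescape map operator would compute, here is a Python sketch using the standard library. The directive does not exist yet, so the function name and the choice to split before unescaping are my assumptions.

```python
from urllib.parse import quote, unquote


def url_unescape_list(value, sep=","):
    # Split first, then unescape each item, so that an escaped separator
    # (%2C) inside an item does not create a spurious extra item.
    return [unquote(item) for item in value.split(sep)]


def url_escape(value):
    # The inverse operation, percent-encoding non-ASCII bytes as UTF-8.
    return quote(value, safe="")
```

For example, the item %E7%8C%AB decodes to the Chinese character for "cat", and url_escape() turns it back into the same %xx sequence.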

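Mr. chaoslawful also came up with an nginx macro construct that combines a recurring sequence of nginx configuration directives into a parameterized macro, which is then expanded wherever it is needed in the configuration file. For example

  macro_define drizzle {
    drizzle_query $query;
    drizzle_pass mysql_cluster;
  };

Then in each location that needs it you write drizzle; and it expands into those two directives. This could be implemented by our future ngx_macro module. The implementation could follow the approach of the ngx_eval module: macro_define is simply a directive that takes a block argument. The standard map and if directives, and geo as well, work the same way, for example

  geo $var { ... }

Going further, the "macros" defined by define could also take parameters, just like C macros...

The ngx_macro idea was in fact just driven by a real need. Marcus thinks that list_split, list_map, and list_join are often used in combination, so he proposed a single directive that wraps all three operations. Azhe (chaoslawful) and I felt that this sacrifices the flexibility of lambdas, and so we figured that macro definitions should give us the best of both worlds.

The macro_define and macro_apply directives are processed at config time anyway, so there is no request-time overhead; the request-time effect is equivalent to hand-writing the directives in the macro body.

Another very common use case: because of the current limitations of ngx_rewrite's if, one often has to repeat identical or similar handler configuration in several if branches. Macros can reduce such repeated fragments in the configuration file. Although if is considered ugly, it does satisfy quite a few very specific business-logic needs and saves writing a good deal of C or PHP. The expressive power of if + regex + return is already really strong. The major limitation of the current if is that "once you go in, you can't come out", which is why the handler configuration has to be written repeatedly. This is another place where the ngx_macro module would come in handy.

If we keep going like this, nginx configuration files will look very different. The complexity has to go somewhere: if it can't be written in client-side JS, and we don't want to write it in PHP or C or Lua or Perl, then it has to be written in nginx.conf.

In the coming days we will start implementing these modules. Along the way we will make the most of the convenient interfaces and features that Marcus provides in his ngx_devel_kit (NDK) module. (Coding directly against the nginx core API is quite painful, haha!)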

January 15, 2010 08:26 AM

January 14, 2010

Dropbox

wanglianghome

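I've started using Dropbox, currently mainly to sync my renovation progress photos and my plumbing and electrical inspection photos.

My referral link is https://www.dropbox.com/referrals/NTQwMTc3NjI5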

January 14, 2010 03:29 PM

January 12, 2010

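The snippetsEmu plugin for VIM

rogerz

snippetsEmu is a VIM plugin that emulates the snippets feature of TextMate on Mac OS. It quickly inserts customizable pieces of text, and it is powerful and easy to use.

My need was this: I had to insert a series of text blocks in the following format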

DR001 = ModelAdvisor.Task('DR001');
DR001.DisplayName = 'DR001 - ar_0001: Filenames';
DR001.Description = 'DR001 - ar_0001: Filenames';
setCheck(DR001, 'StyleGuide: ar_0001');
mdladvRoot.register(DR001);

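Here DR00X is the number of one rule in a series; their DisplayName and Description naturally differ, but the format is fixed. Doing this is not hard. The basic idea is to read in a template with placeholders, then jump to each placeholder and replace it with the actual text needed, as in the examples mentioned in Vim File Templates, Using Templates, and similar articles.

With snippetsEmu this is more user-friendly, and the whole process feels very natural.

In insert mode, typing trigger_name<Tab> automatically expands 'trigger_name' into the preset text and moves the cursor to the insertion point, i.e. the middle of '<{}>'. This step feels natural, just like code completion.

This mode is better suited to creating simple snippets, especially single-line ones; if you need a line break, use <CR>.

When this snippet is inserted, the cursor first stops at <{forename}>. After typing, press <TAB> to jump to <{surname}>. The second <{forename}> does not need to be typed again; it is filled in automatically. The effect is shown in the figure below:

You can select some text in visual mode, then press : and enter CreateSnippet. You will be prompted for a trigger name, and a draft is then created in a buffer. After manual editing, it can be pasted into the command line for use. This step is a bit cumbersome and requires a good understanding of vim's registers; if all else fails, just use the system clipboard...

snippetsEmu also supports some complex tag substitution operations; see the help file for details.

Finally, snippets created this way are temporary. If you need to reuse them, save the commands to a configuration file and load it when needed.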

January 12, 2010 09:42 AM

January 06, 2010

Tracking down an ngx_drizzle bug with etcproxy, a TCP proxy tool written in Erlang

Human & Machine

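etcproxy, the Erlang TCP proxy tool by Azhe (chaoslawful), just helped me track down a well-hidden bug in the ngx_drizzle module, haha!

It turned out that I had forgotten to initialize the size local variables to 0 in a group of functions in ngx_http_drizzle_output.c. The uninitialized value only matters when a field's packet gets split. With etcproxy I don't need to prepare huge blob data in the test DB to test this situation, and it is very reliable: the scenario is reproduced exactly every time.

Here is how the bug was tracked down. I started etcproxy with its upstream set to port 3307, where the local mysql server listens, while etcproxy itself listened on port 3306; then I had nginx connect to 3306. etcproxy then relays the connections and data streams. It offers some options (currently macro definitions inside the code). I used them as follows: first I set it to split packet data per byte, so every packet sent by mysql was forcibly split into one-byte packets. I also set the delay between consecutive packets to 1ms, so that these small packets would not be automatically merged by the kernel when sent over localhost. As a result, nginx also received the data one byte at a time.

This triggered the field-splitting logic of libdrizzle, the streaming mysql client: a single field in the result set is returned to my ngx_drizzle module in several pieces. My module's code for receiving multi-piece field values (offset + len) had exactly the bug mentioned above, size not being initialized to 0, so size ended up with random values such as 4. This eventually caused incorrect buffer usage, which was caught by one of my own buffer pointer assertions and reported in nginx's error.log. At the same time, my Test::Nginx test harness noticed that nginx's output was truncated and did not match expectations, and reported an error on the terminal, so I saw it right away.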
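
The failure mode is easy to model. The Python sketch below is not the ngx_drizzle C code and not libdrizzle's API; FieldAssembler and split_bytes are hypothetical names. It shows a field value arriving as (offset, chunk) pieces, a running size counter that has to start at 0, and a byte-by-byte splitter that forces the multi-piece path the way etcproxy did.

```python
class FieldAssembler:
    """Reassemble one field value delivered in several pieces."""

    def __init__(self, total):
        self.total = total
        self.size = 0  # bytes received so far; the real bug was leaving this unset
        self.buf = bytearray(total)

    def feed(self, offset, chunk):
        # Sanity checks in the spirit of the buffer assertion mentioned above.
        assert offset == self.size, "piece arrived at an unexpected offset"
        assert offset + len(chunk) <= self.total, "buffer overrun"
        self.buf[offset:offset + len(chunk)] = chunk
        self.size += len(chunk)
        return self.size == self.total  # True once the field is complete


def split_bytes(data, n=1):
    # Emulate the proxy's splitting: yield (offset, piece) pairs of n bytes.
    return [(i, data[i:i + n]) for i in range(0, len(data), n)]
```

Feeding the pieces from split_bytes(data) into a FieldAssembler exercises the same code path for tiny values that would otherwise need a large blob; if size started at a stray value such as 4, the first assertion would fire on the very first piece.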

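etcproxy actually has another feature that I haven't had time to use yet: triggering a timeout or closing the upstream/downstream TCP connection at a specified offset in the data being read or written. This too can precisely test particular aspects of my non-blocking client's state machine, mainly whether the timeout and connection-error handling code in specific states runs as expected, hahaha. For example, whether the timeout timer works as expected while receiving the columns part of a SQL query result, and whether the timeout timer also works as expected while receiving rows.

Tonight I've decided to cook myself something nice to celebrate finding and fixing this bug, heh!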

January 06, 2010 10:38 AM

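Some random thoughts on ngx_drizzle and ngx_rds...

I finally came up with the idea of adding a drizzle_type directive to my ngx_drizzle module, so that users can define their own types and regex constraints for parameters embedded in SQL, for example: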

  drizzle_type email '^[-A-Za-z0-9._]+\@\w+\.(?:com|cn|net)' quote=on;
  drizzle_quote $arg_email email;
  drizzle_query "select * from users where email=$arg_email";

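This way even the built-in types such as column, table, int, and bool can be overridden by users, haha! ngx_drizzle actually already has the main functionality of the OpenResty View API, hahaha, and is more flexible and more efficient!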
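
Here is how I picture the validate-then-quote behaviour of the proposed drizzle_type and drizzle_quote directives, as a Python sketch. Everything in it is an assumption for illustration: the TYPES table, the function names, and in particular the naive quote_sql_string helper, which is not a substitute for a database driver's own escaping.

```python
import re

# Hypothetical type registry: name -> (regex constraint, quote flag).
TYPES = {
    "email": (re.compile(r"^[-A-Za-z0-9._]+@\w+\.(?:com|cn|net)"), True),
}


def quote_sql_string(value):
    # Naive illustration only: escape backslashes and double single quotes.
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def drizzle_quote(value, type_name):
    pattern, quote = TYPES[type_name]
    if not pattern.search(value):
        raise ValueError(f"value does not match type {type_name!r}")
    return quote_sql_string(value) if quote else value


def build_query(arg_email):
    return "select * from users where email=" + drizzle_quote(arg_email, "email")
```

A value that fails the regex never reaches the SQL string at all, and a value that passes is quoted before it is embedded, which mirrors the three directives in the example above.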

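While eating breakfast just now, I also thought that we could introduce two more unstructured query result types into our binary Resty DBD Stream (RDS) format.

This would allow Excel-style pick operations on a "cell" or a "cell range" of an RDS result set via an output filter in the ngx_rds module. As long as our data formatters such as ngx_rds_json also support these two new types, they can generate a more direct JSON representation, which simplifies if + regex checks in subrequest-in-memory contexts such as ngx_eval.

I also thought of merging multiple RDS streams through a higher-level merger. For example, each stream comes from a select + order by query on a different mysql machine, and the merger then performs a streaming global order by over these RDS results. At the cost of giving up stream processing, and with the resource usage of each RDS sub-stream bounded, global join/group by operations across multiple RDS sub-streams are also possible.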
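
The streaming global order by is essentially a k-way merge of already-sorted inputs. A minimal Python sketch of that idea, with plain iterables of dict rows standing in for RDS streams and a made-up function name:

```python
import heapq


def merge_sorted_streams(streams, key):
    # Each input must already be sorted by `key` (the per-server order by).
    # heapq.merge pulls rows lazily, so only one pending row per stream
    # is held in memory at any time.
    return heapq.merge(*streams, key=key)
```

For example, merging rows sorted by id from two servers yields one globally sorted sequence without first collecting all rows, which is what makes the merger compatible with stream processing.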

All of these will be features of the ngx_rds module, which will support both the RDS picker and the merger :D

January 06, 2010 04:39 AM

January 04, 2010

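In defense of nginx, and justice for Igor

Human & Machine

I haven't written a Chinese post here for a long time; this one exists mainly because most of its text comes from a chat log ;)

Just now, while chatting with cnhackTNT about several extremely cool future uses of the ngx_drizzle module, I could not help lamenting that most users in the nginx community have not really understood the true essence of nginx's design. Even the ways of using nginx that are popular on the web today directly go against the original design intent of its author, Igor Sysoev.

In that sense, Igor is lonely. That is why Igor sometimes finally cannot help saying, almost hysterically, on the nginx mailing list that he never wanted to support fastcgi in the first place and never wanted to turn nginx's proxy module into a second squid. Of course, the ngx_perl module that Igor himself plays with also violates nginx's design principles, as does the ngx_v8 module he says he will write. But I believe these two modules are just for fun and for showing off. The scripting language engines that are truly meaningful for nginx are interpreters that support C-level coroutines; neither perl nor v8 meets this basic requirement. (coco lua does!)

The core of nginx is non-blocking network I/O. Yes, let me say it again: non-blocking. From the network I/O point of view, the other side of the fastcgi protocol is always blocking, script code running in ngx_perl and ngx_v8 is always blocking, and the other end of ngx_proxy is often blocking too. So they have no fundamental significance or value and will not bring a real leap in performance. We should never block on web I/O, never! Whether it is TCP communication with memcached, TCP communication with RDBMSs such as mysql, Oracle, and PostgreSQL, or secondary requests to other upstream HTTP web services, nothing should block!

Igor designed an extremely elegant non-blocking programming model in the nginx core, for both subrequests and upstreams, but very few people know about it, very few appreciate it, and even fewer know how to make good use of it. In the context of non-blocking and concurrent I/O, the programming model is very different from traditional code, just as many JS idioms are very different from those in perl. The principle itself is not hard to understand, but appreciating the "elegance" requires knowing more about the core ;) That is why I say Igor is lonely, and why I say many friends in the nginx community do not really understand high concurrency, high performance, or C10K, heh.

Of course, I only came to this realization after months of patient daily lessons from Mr. Xiaozhe (chaoslawful) and after copying out a large amount of the C source in the nginx core by hand. We need to write more tutorials and reveal this power, this magic, to the world. Igor's powers of expression seem to be rather limited, at least in English, which is a real pity. Fortunately, he can still convey and practice his excellent ideas in beautiful C code. I must say that these ideas are a more advanced form of some of the ideas in OpenResty that I am proud of. So the first time I saw them I felt a strong resonance, even though I did not yet know enough of the details, heh.

To sum up in one sentence and tell nginx's little-known secret: "nginx is a web application framework."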

January 04, 2010 03:48 AM

January 02, 2010

WWW::Google::Contacts

Fayland And Programming

This is my first module of 2010: WWW::Google::Contacts.

It's nothing big; it just implements the Google Contacts Data API.

So enjoy!

Thanks.

January 02, 2010 07:32 PM

December 31, 2009

ngx_drizzle: make nginx talk directly to mysql, drizzle, and sqlite3 by libdrizzle

Human & Machine

This is the last day of 2009 and I'm too impatient to hold back the first release (version v0.0.1) of our ngx_drizzle module, an upstream module that talks directly to RDBMS backends like mysql, drizzle, and the drizzle server shipped with libdrizzle for sqlite3.

This module was initially started by my friend and colleague, chaoslawful. His Chinese name is 王晓哲 :) He did the initial (and most difficult) work all in his own time :) I also shamelessly borrowed a lot of code from Igor's ngx_http_upstream.c and ngx_http_memcached_module.c in the nginx 0.8.30 core, as well as Maxim Dounin's excellent upstream_keepalive module. These parts of code are copyrighted by these authors, respectively.

This module is still at its very early phase of development and considered highly experimental. But you're encouraged to test it out on your side and report any quirks that you experience :)

Here are some sample configurations:

   http {
       ...

       upstream cluster {
           # simple round-robin
           drizzle_server 127.0.0.1:3306 dbname=test
                password=some_pass user=monty protocol=mysql;
           drizzle_server 127.0.0.1:1234 dbname=test2
                password=pass user=bob protocol=drizzle;
       }

       upstream backend {
           drizzle_server 127.0.0.1:3306 dbname=test
                password=some_pass user=monty protocol=mysql;
       }

       server {
           location /mysql {
               set $my_sql 'select * from cats';
               drizzle_query $my_sql;

               drizzle_pass backend;
           }
           ...
       }
   }

Essentially it provides a very efficient and flexible way for nginx internals to access mysql, drizzle, sqlite3, as well as other RDBMS's that support the drizzle protocol or mysql protocol. Also it can serve as a direct REST interface to those RDBMS backends.

It also has a builtin per-worker connection pool mechanism borrowed from Maxim Dounin's upstream_keepalive module.

Here's a sample configuration:

  upstream backend {
    drizzle_server 127.0.0.1:3306 dbname=test
          password=some_pass user=monty protocol=mysql;
    drizzle_keepalive max=100 mode=single overflow=reject;
  }

You may wonder how this can be useful for your PHP/Python/Perl/Java applications fastcgi'd or proxied by nginx. Well, we'll work out an ngx_accel_subrequest module some time in the future to allow these backend apps to issue subrequests directly by means of the special X-Accel-Subrequest header and continuation-passing style.

Unlike the current X-Accel-Redirect trick we're already familiar with, X-Accel-Subrequest is more like a function invocation: it issues one subrequest (or multiple parallel subrequests) and eventually *returns* when the data obtained by the subrequests is ready, giving control back to your backend app so that it can go on processing the data.

Sadly this module does not have a wiki page yet, just the source repository on GitHub:

   http://github.com/chaoslawful/drizzle-nginx-module

I've done some work in the README file there. I promise I'll work on the wiki doc in another day ;)

Release tarballs can be downloaded from the page below

   http://github.com/chaoslawful/drizzle-nginx-module/downloads

I've also cc'd the nginx-devel mailing list in the hope of getting interested developers to join this project's development. __YOU__ are always welcome!

Happily, we've already got a working framework for integrating third-party libraries like libdrizzle and libpq into nginx's upstream system as long as the libraries meet the following prerequisites:

libdrizzle and libpq meet these conditions, while Oracle's OCI library needs some hacking to satisfy requirement #1. So you can expect ngx_oracle and even ngx_pgsql to be announced here in the next few weeks or so, and you'll see even more! If you'd like to join the fun, please don't hesitate to drop us a line :)

Happy new year!

P.S. I was hoping to release v0.0.1 by this Christmas, but missed the deadline because I had spent too much time on fixing weird bugs in the ngx_chunkin module :P

The year 2009 was my first year in the nginx community. It's really lively and a lot of fun. I'm going to do more nginx C hacking next year! :D

December 31, 2009 11:14 AM

December 29, 2009

64bit Virtual Machine builder for Oracle 10g

purl in your heart

sudo ubuntu-vm-builder vmw6 jaunty --addpkg libstdc++5 --addpkg build-essential --addpkg g++-multilib --addpkg openssh-server --addpkg xauth --addpkg gawk --addpkg libaio1 --addpkg xterm --addpkg ia32-libs --addpkg rlwrap --mem 512 --hostname omx --user oracle --pass oracle --part diskpart.txt --mirror http://debian.nctu.edu.tw/ubuntu

Then create these links with sudo:

sudo ln /usr/bin/basename /bin/basename
sudo ln /usr/bin/awk /bin/awk
sudo ln /usr/bin/ /bin/

A machine built this way avoids a lot of strange errors when installing 10g Release 2 on Ubuntu.

December 29, 2009 10:19 PM

Ignore whitespace

wanglianghome

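The first time I received a patch, I read it through, it looked fine, and so I ran

patch -p1 -i fix-test-in-nested-expression.patch

The first hunk applied cleanly, but the second one failed. I was baffled. After staring at it for a long time, it suddenly occurred to me that indentation might be the problem, so I added the option that ignores whitespace differences, as follows

patch -p1 -l -i fix-test-in-nested-expression.patch

Problem solved!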

December 29, 2009 04:48 PM

December 27, 2009

For those who have trouble with free space on the root directory

purl in your heart

sudo find / -mount -maxdepth 2 -type d | egrep -v '^/($|proc|var|usr|home)' | sudo xargs du -sh

You can add more patterns to the egrep -v expression to narrow the space analysis (to the directories local to the root file system).

December 27, 2009 08:00 PM

December 26, 2009

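Notes after a Christmas movie

rogerz

A few words about the movie I watched yesterday, Bodyguards and Assassins (《十月围城》). I went because a holiday calls for a movie, and the ratings on Douban looked decent. I skimmed a few reviews, which said it was quite moving.

I'm not easily moved, and I often watch classics without feeling much... But I admit I was moved yesterday. When Li Yutang repeated Chen Shaobai's grand speech word by word, I felt my eyes well up. Revolution trades the blood of one generation for the happiness of the next, and some people sacrificed even their own next generation.

Many details in the film are handled with care, such as the hairstyles of the teachers and students at the Furen Literary Society in the opening scenes, and everyone's manners at the dinner table in Li Yutang's home. I think the film can move people like me because it touches the sensitive nerve of demos.

Finally, here is an unrelated photo I took today on the way to get a tooth filled.

@ near South Xizang Road metro station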

December 26, 2009 05:31 AM

December 25, 2009

The End of 2009 CN Perl Advent Calendar

Fayland And Programming

I'm really very happy that we got it done today. The last article is perlthanks, from me, and we didn't miss a single day: 25 tips in 25 days.

I had 18 articles published in total, really wow! They include ack, autodie, dzil, local::lib, Devel::NYTProf, Padre, pip, Plack, REPL, perlthanks and more.
Check them out if you missed them. :)

Thanks.

December 25, 2009 09:52 PM

December 23, 2009

GPRS opened

purl in your heart

A reference for all the GPRS SMS instructions in Beijing. This proved to be the quietest way to make the system work. Looking forward to the coming new year with 150 MB of monthly usage.

http://www.bj.chinamobile.com/index/products/inter/63025/

I still need to wait 2 weeks for this package to take effect.

December 23, 2009 01:13 AM

December 21, 2009

Welcome to Christmas Party

purl in your heart

We are having a Christmas party in Beijing. You are welcome to join us to celebrate the birth of Christ; this invitation is also open to Perl people in Beijing and those visiting here.

The address is Nanxincang, Dongsi Shitiao (东四十条,南新仓). It is a buffet at about 50 yuan per person, starting at 7 PM on the 25th (please dress formally if possible). Please contact me for details.

December 21, 2009 08:28 PM

December 15, 2009

This entry is the reason why I wrote the tool in the last entry

purl in your heart

Maybe I missed one hidden feature, but to my knowledge there's currently no way of getting the URL of the current page other than copying it from the address bar.

This is annoying in a situation where you are on a page that opened with the address bar "hidden" (strangely, at least in OS X, choosing View > Toolbars > Navigation Toolbar from the menu has no effect on the current window. One of the rare moments when I miss the old IE5, where there was even a keyboard shortcut for this). I agree this is a rare situation, but it's annoying.

IMO there should be a ctrl-click option "Copy Current Page Location".

The simplest way for now is to select "Send Link", and then copy the URL, but this is impractical if your email client isn't open.

Another plausible option would be to select "View Page Info", which displays the URL, and you can even select it.... but you CAN'T COPY IT! This seems profoundly illogical.

Small annoyances, but fixing them could make Firefox more perfect...

December 15, 2009 08:14 PM

replace evolution with a paste URL tool, for firefox to "send link"

sudo pp -e 'use URI::Escape; use Gtk2; Gtk2->init; $c=Gtk2::Clipboard->get(Gtk2::Gdk->SELECTION_CLIPBOARD); $c->set_text(uri_unescape(join(qq(\n), map {s/mailto:\?body=//;s/&subject.*//;$_} @ARGV))); $c->store;' -o /usr/bin/evolution
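
For readers who don't speak Perl one-liners, here is a hedged Python sketch of just the string handling in the command above: strip the "mailto:?body=" prefix, drop everything from "&subject" on, and percent-decode the rest. The function name and the sample argument are my own assumptions for illustration; the clipboard part (Gtk2) is left out.

```python
from urllib.parse import unquote


def link_from_mailto(arg: str) -> str:
    """Recover the page URL from a mailto: argument like the one Firefox passes."""
    # Same as s/mailto:\?body=// : remove the first occurrence of the prefix.
    body = arg.replace("mailto:?body=", "", 1)
    # Same as s/&subject.*// : cut from the first "&subject" to the end.
    cut = body.find("&subject")
    if cut != -1:
        body = body[:cut]
    # Same idea as uri_unescape: turn %XX escapes back into characters.
    return unquote(body)
```

With a hypothetical argument such as mailto:?body=http%3A%2F%2Fexample.com%2F&subject=Hi, this returns the plain URL, which is what ends up on the clipboard in the original tool.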

December 15, 2009 05:25 PM

December 14, 2009

POE echo server wrapped with PAR::Packer(pp)

purl in your heart

pp -e 'use POE; require POE::Filter; require POE::Component::Server::TCP; POE::Component::Server::TCP->new(Port=>3240, ClientInput=>sub {$_[HEAP]{client}->put($_[ARG0])});POE::Kernel->run;' -o bin/echo_slave

Many years have passed and this combination has changed somewhat: now you need to replace 'use' with 'require', and that's why -M stopped working.

December 14, 2009 06:33 PM

December 12, 2009

perl unicode dump 2nd version (with charnames)

purl in your heart

PERL_UNICODE=ADSL perl -Mcharnames=:full -e 'print $a=sprintf(qq(\\x{%04X}),$_),qq(\t@{[eval q(").$a.q(")]}\t@{[charnames::viacode($_)]}\n) for 0x0 .. 0xa00' | less
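
For comparison, a minimal Python sketch that produces the same three columns (escape, character, name) using the standard unicodedata module. The function name is made up, and unnamed code points (such as control characters) get an empty name here, which may differ from what charnames::viacode prints.

```python
import unicodedata


def dump_row(cp: int) -> str:
    """Format one code point as: \\x{XXXX} <TAB> character <TAB> Unicode name."""
    ch = chr(cp)
    name = unicodedata.name(ch, "")  # empty string when the code point has no name
    return f"\\x{{{cp:04X}}}\t{ch}\t{name}"


if __name__ == "__main__":
    # Same range as the one-liner above: 0x0 .. 0xa00 inclusive.
    for cp in range(0x0, 0xA01):
        print(dump_row(cp))
```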

December 12, 2009 12:22 AM

December 11, 2009

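The road to a Shanghai hukou

rogerz

The day before yesterday I took another day of annual leave and got the hukou (household registration) matter settled, and received my collective hukou page. I had meant to summarize the whole process, but looking back, the paperwork really is tangled. Let me first list the documents involved; I'll fill in the procedure later.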

December 11, 2009 03:44 PM

December 08, 2009

unicode list in one-liner

purl in your heart

pp -e 'print eval qq(qq(\\x{@{[sprintf(q(%04x),$_)]}})),qq(\t),sprintf(q(%04x),$_),qq(\n) for 0x0000..0xffff' -o bin/unicodes

December 08, 2009 09:07 PM

Test::Nginx::LWP and Test::Nginx::Socket are now on CPAN

Human & Machine

I've just released the Perl modules Test::Nginx::LWP and Test::Nginx::Socket as a single Test-Nginx distribution to CPAN (it may still require some more time to reach the CPAN mirror near you):

   http://search.cpan.org/perldoc?Test::Nginx::LWP

   http://search.cpan.org/perldoc?Test::Nginx::Socket

The latter is still a hack using while (1) with non-blocking sysread/syswrite. I'll rewrite it using IO::Select at some point in the future.
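
To show the difference between spinning in while (1) and waiting in select, here is a small sketch in Python rather than Perl (its selectors module plays the role IO::Select would). This is not the Test::Nginx::Socket code; the function name and the timeout default are assumptions for the example.

```python
import selectors
import socket


def read_until_closed(sock: socket.socket, timeout: float = 2.0) -> bytes:
    """Read everything the peer sends, sleeping in select() instead of busy-looping."""
    sock.setblocking(False)
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    chunks = []
    try:
        while True:
            # Blocks until the socket is readable or the timeout expires,
            # so the loop burns no CPU while waiting.
            if not sel.select(timeout):
                raise TimeoutError("peer sent nothing within the timeout")
            try:
                data = sock.recv(4096)
            except BlockingIOError:
                continue  # spurious wakeup: go back to waiting
            if not data:  # empty read means the peer closed the connection
                break
            chunks.append(data)
    finally:
        sel.unregister(sock)
        sel.close()
    return b"".join(chunks)
```

The point of the design is that the process only wakes up when the kernel reports the socket as readable, whereas a bare non-blocking read loop keeps retrying even when nothing has arrived.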

These two Test::Base-style modules are driving the test suites of the following Nginx C modules:

ngx_echo

http://wiki.nginx.org/NginxHttpEchoModule

ngx_headers_more

http://wiki.nginx.org/NginxHttpHeadersMoreModule

ngx_chunkin

http://wiki.nginx.org/NginxHttpChunkinModule

ngx_memc

http://wiki.nginx.org/NginxHttpMemcModule


And our ngx_drizzle module, started by my friend and colleague, chaoslawful++, will soon be on that list as well ;)

Happy testing nginx modules with Perl!

December 08, 2009 05:36 AM

December 07, 2009

tools for encode & decode of HTML entities

purl in your heart

pp -e 'use HTML::Entities; print decode_entities(<>)' -o bin/html-decode-entities
pp -e 'use HTML::Entities; print encode_entities(<>)' -o bin/html-encode-entities

December 07, 2009 08:39 PM

December 06, 2009

ngx_memc: an extended version of ngx_memcached that supports set, add, delete, and many more commands

Human & Machine

I'm happy to announce the first release of the ngx_memc module, an extended version of the standard "memcached" module that supports almost the whole memcached TCP protocol:

   http://wiki.nginx.org/NginxHttpMemcModule

I know the state of the art in the nginx community is to manipulate contents in memcached from within a backend (fastcgi) application while just using the nginx memcached module as a frontend that simply "serves" the content to the outside world. But read on...

Our motivation here, however, is to build an efficient and flexible memcached TCP client component for nginx itself so that we can reuse it in a non-blocking way by means of an nginx subrequest or a "fakerequest" [1].
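
For readers who haven't looked at the memcached TCP (text) protocol, here is a tiny Python sketch of what a client puts on the wire for the storage and retrieval commands. It is only an illustration of the protocol, not code from ngx_memc, and the function names are made up.

```python
def memc_set_request(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a `set` request: a command line, then the data block, both CRLF-terminated."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def memc_get_request(key: str) -> bytes:
    """Build a `get` request for a single key."""
    return f"get {key}\r\n".encode("ascii")


def memc_delete_request(key: str) -> bytes:
    """Build a `delete` request for a single key."""
    return f"delete {key}\r\n".encode("ascii")
```

Storing the 3-byte value bar under the key foo sends "set foo 0 0 3", CRLF, "bar", CRLF; a server that accepts it answers with a STORED line.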

As always, special thanks go to Igor Sysoev for all his heavy lifting work already done in ngx_http_upstream and ngx_http_memcached_module in the nginx core.

Last but not least, thanks to my current employer, Taobao.com, for allowing me to open-source this work, as well as other nginx modules I have already announced or am going to announce here ;)

Enjoy!

References
[1] See http://github.com/srlindsay/nginx-independent-subrequest

December 06, 2009 10:27 AM

December 04, 2009

Major updates to ngx_chunkin: lots of bug fixes and beginning of keep-alive support

Human & Machine

Prompted by the bug reports from one of my users, the J guy, I've made lots of improvements to the ngx_chunkin module:

  http://wiki.nginx.org/NginxHttpChunkinModule

Please see the change log for the new v0.07 release:

  http://wiki.nginx.org/NginxHttpChunkinModule#v0.07

It's worth mentioning in particular that, to my shame, older versions do not work with chunked data containing non-ASCII octets. This was caused by an incorrect alphtype definition in my Ragel spec. Thanks to J for reporting it.

Also, I've introduced a new directive named "chunkin_keepalive", which makes ngx_chunkin work in the non-pipelined keep-alive context. Preliminary support for HTTP 1.1 pipelining has also been introduced but has not been tested very well. (For technical details, see this nginx-devel thread.)

As before, this module is still considered experimental but you're encouraged to try it out and report any issues that you encounter. I promise I'll fix bugs as fast as I can :)

Happy chunking nginx!

December 04, 2009 10:14 AM

December 01, 2009

2009 CN Perl Advent Calendar

Fayland And Programming

We just got the first article for our first advent calendar, from me. :)

http://perlchina.org/advent/

Besides:
http://www.catalystframework.org/calendar/
http://advent.rjbs.manxome.org/

Enjoy! Thanks

December 01, 2009 04:39 PM

November 28, 2009

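Got a new computer

rogerz

I once heard an argument for buying a laptop rather than a desktop. Roughly: housing already costs over ten thousand yuan per square meter, so how can you bear to buy a big metal box that takes up several thousand yuan worth of floor space? So I decided to buy a mini PC, and today I finally got one. Now my LCD monitor can be put to use again.

IMG_2001

The case may be small, but not a single cable can be spared...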

November 28, 2009 12:39 PM

Managed to set up the Plone login with SQLPASPlugin

purl in your heart

Thanks for the links at http://blog.fourdigits.nl/using-sqlpasplugin-for-mysql-user-authenticating

This helped me authenticate against crypted passwords stored in MySQL instead of putting them inside the ZODB file.
One point to note: the hack of the valid function can be reduced to the single line below:

return reference == crypt.crypt(attempt, reference)

I also added import crypt at the head of the file encrypt.py, just to be safe :)

November 28, 2009 07:40 AM

November 23, 2009

Links for CJK compatibility characters

purl in your heart

Unicode compatibility characters (統一碼相容字符): http://zh.wikipedia.org/zh-cn/%E7%B5%B1%E4%B8%80%E7%A2%BC%E7%9B%B8%E5%AE%B9%E5%AD%97%E7%AC%A6

This is what I talked about at the Beijing Perl Workshop 2009.
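
The talk itself isn't reproduced here, but as a small illustration of why these characters matter, here is a Python sketch (the function name is made up): a CJK compatibility ideograph looks the same as its unified counterpart yet compares unequal until the text is normalized.

```python
import unicodedata


def unify(text: str) -> str:
    """Map CJK compatibility ideographs to their unified forms via NFC normalization."""
    return unicodedata.normalize("NFC", text)


compat = "\uF900"   # CJK COMPATIBILITY IDEOGRAPH-F900
unified = "\u8C48"  # the unified ideograph it decomposes to

# The two strings render alike but are different code points...
looks_equal_but_is_not = compat != unified
# ...and compare equal once both sides are normalized.
equal_after_normalizing = unify(compat) == unify(unified)
```

This is why a search for the unified character can miss text that contains the compatibility form unless both sides are normalized first.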

November 23, 2009 02:10 AM

November 22, 2009

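The range hood cleaning trap

rogerz

Society is still too complicated; drop your guard for a moment and you get cheated.

At the beginning of the month we moved into a newly rented apartment. Everything else was fine, but the kitchen had a thick layer of grease, so I wanted to hire someone to clean it. Every day while waiting for the shuttle bus I would see some women who looked like laborers by the roadside, with signs hanging on their bicycles offering range hood cleaning. I asked around: cleaning a range hood plus a gas stove costs about 40 yuan. I figured the job would take about an hour, so the price seemed reasonable, and I took a business card, a rather shoddy one.

I happened to be free on the weekend, so I called. A man answered and said he was busy and it would be an hour and a half. I said I'd find someone else then; he got anxious and said he would send someone over right away. In less than ten minutes, the woman I saw by the road every day was at the door.

She first tried the range hood and told me the light was broken and that the hood was a bit noisy; I checked, and it was true. Then she started taking it apart. After removing the cover and the exhaust pipe, she dug some aged rubber crumbs out of the joint and told me the oil seal was broken, which was why it kept leaking oil. I didn't think much of it; only later did I realize this was the setup.

I never give money to beggars, but I have always respected people who earn a living by their own labor, so when she asked me to fetch a chair or the like I was happy to do it. Unexpectedly she pushed her luck: first she had me boil a pot of water, then she had me carry the bucket with the soaking fan blades upstairs. I wasn't too pleased about paying someone to work and still getting ordered around, but I figured it might speed things up, so I didn't refuse.

About an hour later, the man who had answered the phone showed up; I don't know whether they were a family. Everything had been washed and was ready to be reassembled, and then he told me the hood's oil seal was unusable and had to be replaced. As soon as I heard something had to be replaced, I knew the shakedown was coming.

He asked whether I wanted the premium one or the ordinary one: the premium at 58 a meter, the ordinary at 38 a meter. A real master of sales: he doesn't ask whether you want to replace it, but whether you want the good one or the cheap one, boxing you into replacing it before you've decided anything. My wife didn't fall for it and said firmly that we didn't want it, just put it back together. He then kept on saying that without a new seal the fan would leak air and oil would leak out too, so the cleaning would be wasted.

What he said wasn't impossible, so I asked how many meters it would take. He said it was hard to say and we'd know once it was measured. I ignored that and told him to estimate first. After several more rounds of hedging he finally said probably two or three meters.

I felt this wasn't small change, so I phoned the landlord and had him negotiate the price. Hearing it would be over a hundred, the landlord naturally refused. The man again stressed that without the sealing ring the hood wouldn't work properly, and honestly I was worried about that too. The landlord told me on the phone that this is how they operate: every time they clean, they tell you the sealing ring is broken and must be replaced, and if you have it cleaned again a few months later they'll say the same thing. After a lot of haggling, we settled on 70 yuan.

After fitting it, the man muttered to himself that he had used 4 meters, putting on a show of having lost money, and I was glad I had insisted on a fixed price rather than paying by length. Still, I felt uneasy, so after he left I searched online. Sure enough, many people had run into the same thing, and the process was identical: show you some crumbs during disassembly, ask you premium or ordinary during reassembly, and quote the price by length. If you don't pay attention, the bill after installation comes to over a hundred yuan, and as much as three hundred in the worst cases.

Haggling over the cleaning price only ever saves ten or twenty yuan, yet this trick took nearly a hundred off me in one go, which really stings. If they get four to six jobs a day, they can swindle over ten thousand a month; these are hardly the working poor. It reminded me of our move from Nanjing to Shanghai, when the knock-off "Tietong" movers we hired also fleeced us badly. These people all look simple and honest, but they are seasoned at ripping people off and follow fixed routines. The spread of such people also cuts off the livelihood of those who really are honest, and gives whole trades a bad name.

The lesson this time, besides not being wary enough of people, was my lack of any sound sense of the price; I was thrown off by his 58 for premium and 38 for ordinary. Had I looked closely at the thing, or checked online right away, I would have known it doesn't cost even 10 yuan.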

November 22, 2009 08:36 AM

November 18, 2009

The "headers more" module: scripting input and output filters in your Nginx config file

Human & Machine

I've been working madly on the "headers more" module:

   http://github.com/agentzh/headers-more-nginx-module

And I've got everything I want working now. It also has a nice wiki page (which includes a brief explanation of the underlying implementation):

   http://wiki.nginx.org/NginxHttpHeadersMoreModule

Our buzzword is that it can rewrite the "Server" output header dynamically! See this:

   location /foo {
        more_set_headers   "Server: $arg_server";
   }


Then GET /foo?server=Foo will get a response with the "Server: Foo" header set ;)
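
To make the variable substitution concrete, here is a deliberately simplified Python model of what the config above does with the request: take the raw value of the "server" query argument and put it into the header. This is only an illustration with made-up function names; it does not reproduce nginx's actual argument parsing.

```python
def arg_value(query_string: str, name: str) -> str:
    """Return the raw (not URL-decoded) value of a query argument, or '' if absent."""
    for pair in query_string.split("&"):
        key, _, value = pair.partition("=")
        if key == name:
            return value
    return ""


def server_header(request_uri: str) -> str:
    """Model of: more_set_headers "Server: $arg_server";"""
    _, _, query = request_uri.partition("?")
    return "Server: " + arg_value(query, "server")
```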

Input headers can be trivially rewritten as well, including the "Host" header:

    more_set_input_headers   "Host: some-other-host";

Well, the full practical power of this module is beyond my current imagination. If you have some crazy uses, please drop me a line ;)

Happy Nginx hacking!

November 18, 2009 10:09 AM

November 16, 2009

Git mirror of all GCC SVN branches and tags

wanglianghome

I finally found a way to git clone the GCC source code over HTTP. The GCC gitweb link is

http://gcc.gnu.org/git/?p=gcc.git;a=summary

It lists links for three ways of cloning.

November 16, 2009 12:58 PM

November 15, 2009

The "chunkin" module: Experimental chunked input support for Nginx

Human & Machine

Pushed by those cutting-edge users on the Nginx mailing list, I've quickly worked out the "chunkin" module which adds HTTP 1.1 chunked input support for Nginx without the need of patching the core:

    http://github.com/agentzh/chunkin-nginx-module

This module registers an access-phase handler that will eagerly read and decode incoming request bodies when a "Transfer-Encoding: chunked" header triggers a 411 error page in Nginx (hey, that's what you have to pay for avoiding patching the core ;)). For requests that are not in the "chunked" transfer encoding, this module is a "no-op".
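
For anyone unfamiliar with the "chunked" transfer encoding itself, here is a minimal Python sketch of the decoding step. It is not the module's Ragel-generated parser: it works on a complete in-memory buffer, ignores chunk extensions and trailers, and the function name is made up.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body held entirely in memory."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hex, optionally followed by
        # ";extensions", and ends with CRLF.
        eol = raw.index(b"\r\n", pos)
        size = int(raw[pos:eol].split(b";", 1)[0].strip(), 16)
        pos = eol + 2
        if size == 0:
            break  # last-chunk; any trailer headers after it are ignored here
        chunk = raw[pos:pos + size]
        if len(chunk) != size or raw[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        body += chunk
        pos += size + 2
    return bytes(body)
```

Chunk data is treated as opaque octets, so non-ASCII bytes pass through untouched; only the size line is parsed as text.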

To enable the magic, just turn on the "chunkin" config option like this:

    chunkin on;
    location /foo { ... }
    ....

No other modification is required in your nginx.conf file. (The "chunkin" directive is not allowed in the location block BTW.)

This module is still considered highly experimental and there must be some serious bugs lurking somewhere. But you're encouraged to play and test it in your non-production environment and report any quirks to me :)

Efforts have been made to reduce data copying and dynamic memory allocation, thus unfortunately raising the risk of potential buffer handling bugs caused by premature optimizations :P

This module is not supposed to be merged into the Nginx core because I've used Ragel to generate the chunked encoding parser for joy :)

The following Nginx versions have been successfully tested by this module's (very limited) test suite:

   0.8.0 ~ 0.8.24
   0.7.21 ~ 0.7.63

The test suite definitely needs more test cases and the code is hacky in various places. If you're willing to contribute, feel free to ask me for a commit bit in a private email :)

Update: I've also added a wiki page for it: http://wiki.nginx.org/NginxHttpChunkinModule
Update 2: This module is now considered production ready and some of the users have already put it into production. Thanks J for reporting lots of bugs for real phones in the real world.

November 15, 2009 10:25 AM

November 14, 2009

PW Workshop

purl in your heart


This is for the WFL session demonstration.


November 14, 2009 01:47 AM

November 12, 2009

The songs I love very much

purl in your heart

http://www.google.cn/music/album?id=B94a511c728d17c1a

November 12, 2009 06:54 AM

November 05, 2009

Why ADSL is better than DSL for PERL_UNICODE

purl in your heart

% echo 调查资料 | perl -e 's{@{[shift @ARGV]}}{INFO} && print $_ while (<STDIN>)' 资料
调查INFO

% echo 调查资料 | perl -e 's{资料}{INFO} && print $_ while (<STDIN>)'
调查INFO

% echo 调查资料 | PERL_UNICODE=DSL perl -e 's{资料}{INFO} && print $_ while (<STDIN>)'

% echo 调查资料 | PERL_UNICODE=DSL perl -e 's{@{[shift @ARGV]}}{INFO} && print $_ while (<STDIN>)' 资料

% echo 调查资料 | PERL_UNICODE=ADSL perl -e 's{@{[shift @ARGV]}}{INFO} && print $_ while (<STDIN>)' 资料
调查INFO
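
What the letters mean, per perlrun: S covers the three standard streams, D makes UTF-8 the default layer for other input and output streams, L makes it all conditional on the locale environment variables, and A tells perl to treat the @ARGV elements as UTF-8. With DSL, STDIN is decoded into characters but the pattern (from the -e source or from @ARGV) is still raw bytes, so nothing matches; adding A decodes the argument as well. The Python sketch below imitates that mismatch; the function name is made up, and reading bytes as Latin-1 stands in for "left undecoded".

```python
def substitute(line: str, raw_arg: bytes, decode_argv: bool) -> str:
    """Replace the pattern given on the command line with INFO, if it matches."""
    if decode_argv:
        pattern = raw_arg.decode("utf-8")    # like PERL_UNICODE=ADSL: @ARGV is decoded
    else:
        pattern = raw_arg.decode("latin-1")  # like DSL: each byte stays one "character"
    return line.replace(pattern, "INFO")


line = "调查资料"                 # already decoded, as STDIN is under S
raw_arg = "资料".encode("utf-8")  # what the shell actually passes in @ARGV

without_a = substitute(line, raw_arg, decode_argv=False)  # pattern cannot match
with_a = substitute(line, raw_arg, decode_argv=True)      # pattern matches
```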

November 05, 2009 12:03 PM

November 02, 2009

orgmode for mail

wanglianghome

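I'm getting more and more comfortable with orgmode, so I've replaced markdown with it as my first choice for writing HTML mail. The configuration is as follows: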

(defun wl-message-goto-body-end ()
  "Go to the end of message body.  Before attachment part."
  (or (save-excursion
        (when (re-search-forward
               "^#part .+ filename=.+ disposition=attachment>$"
               nil
               t)
          (forward-line -1)
          (end-of-line)
          (point)))
      (point-max)))

(defun wl-org-export-region-as-html-string (beg end)
  (interactive "r")
  (save-excursion
    (org-export-region-as-html beg end t 'string)))

(defun wl-mail-org2html-region (beg end)
  (interactive "r")
  (save-excursion
    (let ((html-txt (wl-org-export-region-as-html-string beg end)))
      (goto-char end)
      (message "%s" end)
      (insert "<#part type=text/html>\n<html>\n<head>\n<title>HTML version of email</title>\n</head>\n<body>")
      (insert html-txt)
      (insert "\n</body>\n</html>\n<#/multipart>\n")
      (goto-char beg)
      (insert "<#multipart type=alternative>\n"))))

(defun wl-mail-org2html-message-body ()
  (interactive)
  (save-excursion
    (message-goto-body)
    (wl-mail-org2html-region (point) (wl-message-goto-body-end))))

(add-hook 'message-send-hook 'wl-mail-org2html-message-body)

November 02, 2009 04:52 PM

November 01, 2009

Big Snow at Beijing

purl in your heart

Today we experienced the first snow in Beijing this year, in early November. It wasn't cold at all, so we managed to play in the snow with our son Daniel. Thank God for this special birthday gift; it's a sign that we have served 4 seasons in this city :)

November 01, 2009 05:34 AM

October 31, 2009

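About the Network on GitHub

rogerz

The other day I unexpectedly received an invitation from chunzi and joined the Chinese translation project for progit. progit is a book introducing Git; I had read it before and benefited a lot. The book's source files and its internationalization work are hosted on GitHub.

A distinctive feature of the GitHub site is the Social Coding it advocates. One of the more interesting features is the Network:

network

In it you can see where every branch came from and where it went. While doing your own work you can follow the progress of other branches, which both avoids needless duplicated effort and helps you merge updates from other branches in time.

Each node in this network graph corresponds to one commit in a Git repository. Git repositories are distributed, so the same commit may exist in several repositories at once, but the graph shows it in only one of them. The system prefers to show it in the current repository rather than in the repository it was originally committed to. In the graph you can see a great many nodes on the rogerz row, but none of them were actually committed by me; they were cloned straight from the official branch. A commit doesn't record its birthplace, so this is hard to trace. And for this graph, the meaningful part is which branches have not yet been merged into your own repository; origin doesn't matter.

There is an article on the site devoted to the Network, but it focuses on what the graph is for and doesn't explain how it is generated, which left me puzzling for quite a while.

Since a Git commit contains pointers to its parents, tracing ancestors isn't hard, but what about the nodes contained in other branches? Since the graph only shows nodes stored on GitHub, and divides the swim lanes by registered GitHub user, my guess is that this network relationship is read in combination with the site's database.

As for whether it follows the vine of parents and the refs in remotes to find the melon, or whether the whole site caches one muddy pool of commits and fishes in it, I can't tell. Considering that you cannot operate directly on a repository on GitHub to add a remote, and can only clone it locally and then push branches up, I think fishing in muddy water is more likely.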
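
The "tracing ancestors through parent pointers" part can be sketched in a few lines of Python. This is a toy model with made-up names and commit ids, not how GitHub builds its graph (which, as said above, I can only guess at): commits are a dict from id to parent ids, and the interesting set is what another fork has that mine doesn't.

```python
from typing import Dict, List, Set


def ancestors(commits: Dict[str, List[str]], tip: str) -> Set[str]:
    """All commits reachable from `tip` by following parent pointers (tip included)."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(commits.get(commit, []))
    return seen


def not_yet_merged(commits: Dict[str, List[str]], mine: str, theirs: str) -> Set[str]:
    """Commits reachable from their branch tip but not from mine."""
    return ancestors(commits, theirs) - ancestors(commits, mine)


# Toy history: a <- b, then two forks c and d on top of b.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}
pending = not_yet_merged(history, mine="c", theirs="d")
```

Note that nothing in this model says which repository a commit "belongs" to, which matches the observation that a commit doesn't record its birthplace.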

October 31, 2009 02:19 AM

October 30, 2009

Re: Waiting for Lord's Grace

purl in your heart

I love this song a lot; I hope you enjoy it too.

October 30, 2009 11:52 PM

October 26, 2009

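Multiple signatures in Gmail without plugins

rogerz

Because I use the same Gmail mailbox across several groups, I sometimes need different signatures, while Gmail itself only supports one. Searching Google for multiple signatures gmail mostly turns up solutions that use a Firefox add-on or GreaseMonkey, which don't work in the Google Chrome I use every day.

Still, one article describes a general solution: How To: Have Multiple Signatures in Gmail with No Extensions …, which uses canned responses from the Gmail Labs features. This effectively borrows reply templates to implement multiple signatures, and it is simple and easy to use.

Steps

  1. Enable canned responses in Labs
  2. Compose a new empty message and write the signature
  3. Save the signature to canned responses
  4. When replying to a message, pick the signature to insert from canned responses

canned responses

How it works internally

Open the All mail label and you can see that the signature is stored in a Draft message; if this draft is discarded, the signature you set up stops working. But this Draft is a bit special: it doesn't live under the system's Draft label, and a draft that you create and save directly will not be indexed by canned responses.

cr internal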

October 26, 2009 02:38 PM