September 02, 2010

ngx_set_misc v0.14: extending ngx_rewrite's "set" directive

Human & Machine

I'm happy to announce the first public release of our Taobao.com ngx_set_misc module, v0.14.

ngx_set_misc is an nginx module that extends the standard ngx_rewrite module's "set" directive to support various advanced functionalities like MD5, SHA1, json/mysql/postgresql string literal quoting, URI escaping/unescaping, default variable value assignment, upstream hashing based on a custom key, base32 encoding/decoding, and more :)
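To give a feel for what a few of these transformations compute, here is a minimal Python sketch. It is not the module's implementation and does not use its directive names; the helper names are made up for illustration, and only the standard MD5/SHA-1 hex digests, URI escaping, and "default value" behaviour are reproduced with Python's standard library.

```python
import hashlib
from urllib.parse import quote, unquote


def md5_hex(value: str) -> str:
    """Hex-encoded MD5 digest of the UTF-8 bytes of value."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def sha1_hex(value: str) -> str:
    """Hex-encoded SHA-1 digest of the UTF-8 bytes of value."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def escape_uri(value: str) -> str:
    """Percent-encode everything except unreserved characters."""
    return quote(value, safe="")


def unescape_uri(value: str) -> str:
    """Reverse of escape_uri."""
    return unquote(value)


def value_or_default(value: str, fallback: str) -> str:
    """Use fallback when the variable is empty (hypothetical helper)."""
    return value if value else fallback
```

In nginx.conf the same results would land in variables that later directives can interpolate; the sketch only shows the values involved.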

Please see the project homepage for more details:

    http://github.com/agentzh/set-misc-nginx-module

And the release tarball can be downloaded from

    http://github.com/agentzh/set-misc-nginx-module/tarball/v0.14

Various (funny) use cases can be found in my "nginx.conf scripting" talk's slides:

    http://agentzh.org/misc/slides/nginx-conf-scripting/  (use the arrow keys on your keyboard to switch pages)

I must thank my colleagues shrimp and calio for their work on polishing this module in the last few months.

This module would not have been possible if Marcus Clyne had not published his crazy Nginx Development Kit (NDK) project:

    http://github.com/simpl-it/ngx_devel_kit

NDK is also a prerequisite for this module :)

I know that this module has a really terrible name, but it's been there for months already :P

We've been using it extensively in our products at Taobao.com, and Qunar.com is also using it heavily in their production environment.

Enjoy!

September 02, 2010 08:15 AM

ngx_drizzle v0.0.12: better timeout control

I'm delighted to announce the v0.0.12 release of ngx_drizzle, a non-blocking upstream module that helps nginx talk directly to mysql, drizzle, and sqlite3 servers (with an optional connection pool). The project's source repository and homepage are on GitHub:

   http://github.com/chaoslawful/drizzle-nginx-module

and the release tarball can be downloaded from

   http://github.com/chaoslawful/drizzle-nginx-module/tarball/v0.0.12

The highlight of this release is a set of new config directives that control the various timeout settings used by ngx_drizzle:

  drizzle_connect_timeout <time>
  drizzle_send_query_timeout <time>
  drizzle_recv_cols_timeout <time>
  drizzle_recv_rows_timeout <time>

The default timeout values for them are all "60 s", i.e., 60 seconds, which may be too long for many real world applications.

Thanks to my colleague shrimp++ for testing them and fixing bugs in the original (undocumented) implementation :)

Thanks also to Piotr Sikora for tirelessly improving the test suite to fit our grand test build farm, and for his fixes for the recent nginx 0.8.47+ releases.

As always, special thanks go to my colleague and closest friend, chaoslawful++, for creating such an excellent module in the first place ;)

FWIW, we're already using this module (combined with ngx_rds_json as well as many other modules) in production, in particular, the Taobao.com LiangZi Shop Stats website:

   http://lz.taobao.com/

Have fun!

 

September 02, 2010 07:51 AM

August 30, 2010

MIME mail builder

purl in your heart

Here you can see the difference between these two invocations. mime-construct is
a nice client for building Chinese mail. Remove the --output option to
actually send the message.

$ mime-construct --to me@mail.com --subject 汉字 --string test --output
To: me@mail.com
Subject: 汉字
MIME-Version: 1.0 (mime-construct 1.10)
Content-Transfer-Encoding: base64

dGVzdA==
---
$ mime-construct --to me@mail.com \
    --subject $(echo 汉字 | perl -MEncode=encode -lne 'print encode(q(MIME-Header),$_)') \
    --string test --output
To: me@mail.com
Subject: =?UTF-8?B?w6bCscKJw6XCrcKX?=
MIME-Version: 1.0 (mime-construct 1.10)
Content-Transfer-Encoding: base64

dGVzdA==
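
For comparison, here is a small Python sketch of the same RFC 2047 "B" encoding. The function names are made up for illustration. Decoding the Subject value from the second run suggests that the text was UTF-8 encoded twice, presumably because the one-liner hands Encode raw bytes rather than decoded characters; the last two lines of the sketch undo that to recover the original subject.

```python
import base64


def encode_mime_header(text: str, charset: str = "UTF-8") -> str:
    """Build an RFC 2047 "B" encoded-word from already-decoded text."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


def decode_mime_header(encoded_word: str) -> str:
    """Decode a single "B" encoded-word back into text."""
    _, charset, encoding, payload, _ = encoded_word.split("?")
    if encoding.upper() != "B":
        raise ValueError("only base64 encoded-words are handled here")
    return base64.b64decode(payload).decode(charset)


# Encoding the characters directly.
subject = encode_mime_header("汉字")

# The header from the transcript decodes to Latin-1 lookalikes of the
# UTF-8 bytes; re-interpreting them as bytes recovers the real subject.
garbled = decode_mime_header("=?UTF-8?B?w6bCscKJw6XCrcKX?=")
repaired = garbled.encode("latin-1").decode("utf-8")
```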

Posted via email from purl's posterous

August 30, 2010 07:58 AM

August 24, 2010

Hope deferred makes the heart sick, but a longing fulfilled is a tree of life.

purl in your heart

August 24, 2010 08:59 AM

August 19, 2010

cheater: yet another rule-driven tool to generate random databases

Human & Machine

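cheater is a rule-driven random database generator that I developed a while ago. It is part of the LiangZi development and testing toolchain.

cheater has already been used extensively in the front-end development of LiangZi Shop Stats, where it supplies large amounts of fake data in specified patterns. This greatly reduces the front end's dependence on the back end and on real data, and it also gives fairly good data coverage.

The source repository of the cheater tool is here:

    http://github.com/agentzh/cheater/

Compared with similar tools such as our QA department's xdata, faker from the Ruby world, and Data::Faker from the Perl world, cheater has the following advantages:

  1. It handles associations and foreign-key constraints between tables automatically, so it is a true "database instance generator".
  2. It defines a small SQL-like language for describing the data model to be generated.
  3. It supports powerful ways of expressing the value domain of a field: discrete sets {a, b, c}, numeric/time/date ranges a..b, Perl regex patterns /regex/, and constant values such as 'string' and 1.32.
  4. It can directly emit JSON or SQL insert statements, which are easy to import into databases like mysql/Pg or into other systems like hive.

Below is a very simple example that demonstrates the basic usage.

First we create a .cht file in a working directory (say, ~/work/) to describe the data model we want to generate. Suppose we have a company.cht file: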

    # Employee table
    table employees (
        id serial;
        name text /[A-Z][a-z]{2,5} [A-Z][a-z]{2,7}/ not null unique;
        age integer 18..60 not null;
        tel text /1[35]8\d{8}/;
        birthday date;
        height real 1.50 .. 1.90 not null;
        grades text {'A','B','C','D','E'} not null;
        department references departments.id;
    )

    # Department table
    table departments (
        id serial;
        name text /\w{2,10}/ not null;
    )

    10 employees;
    2 departments;

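Here we use cheater's own little language, whose meaning is clear almost at a glance. In particular, the last two lines ask for 10 random records for the employees table and 2 for the departments table, all conforming to the rules.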
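
As a rough illustration of what such a rule set asks for, the following Python sketch generates rows for two tables shaped like the ones above while honouring the foreign key. It is not how cheater works internally; the function names, the id range, and the subset of columns are assumptions made for the example.

```python
import random
import string


def random_word(rng, min_lower, max_lower):
    """One capital letter followed by min_lower..max_lower lowercase letters."""
    count = rng.randint(min_lower, max_lower)
    tail = "".join(rng.choice(string.ascii_lowercase) for _ in range(count))
    return rng.choice(string.ascii_uppercase) + tail


def generate(num_employees, num_departments, seed=None):
    """Return (departments, employees) as lists of dicts."""
    rng = random.Random(seed)

    # Generate the parent table first so child rows can reference real keys.
    departments = [
        {"id": dept_id, "name": random_word(rng, 1, 9)}
        for dept_id in rng.sample(range(1, 1_000_000), num_departments)
    ]
    dept_ids = [dept["id"] for dept in departments]

    employees, seen_names = [], set()
    for emp_id in sorted(rng.sample(range(1, 1_000_000), num_employees)):
        # Retry until the name satisfies the "unique" constraint.
        name = f"{random_word(rng, 2, 5)} {random_word(rng, 2, 7)}"
        while name in seen_names:
            name = f"{random_word(rng, 2, 5)} {random_word(rng, 2, 7)}"
        seen_names.add(name)
        employees.append({
            "id": emp_id,
            "name": name,
            "age": rng.randint(18, 60),                    # 18..60
            "height": round(rng.uniform(1.50, 1.90), 5),   # 1.50 .. 1.90
            "grades": rng.choice("ABCDE"),                 # {'A',...,'E'}
            "department": rng.choice(dept_ids),            # foreign key
        })
    return departments, employees
```

Calling generate(10, 2) mirrors the "10 employees; 2 departments;" lines of the rule file.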

Then we use the cht-compile command to compile our company.cht into a random database instance:

    $ cht-compile company.cht
    Wrote ./data/departments.rows.json
    Wrote ./data/employees.rows.json

We see that it generated two JSON data files, one each for the departments and employees tables. On this particular run on my machine, data/employees.rows.json looks like this:

    $ cat data/employees.rows.json
    [["id","name","age","tel","birthday","height","grades","department"],
    ["7606","Kxhwcn Cflub",54,"15872171866","2011-04-01","1.67276","D","408862"],
    ["63649","Whf Iajgw",55,"13850771916",null,"1.65297","E","844615"],
    ["348161","Nnwe Obfkln",27,"15801601215","2011-03-06","1.69275","D","408862"],
    ["353404","Shgpak Xvqxw",28,"15816453097",null,"1.67796","A","408862"],
    ["445500","Bdt Mhepht",47,"13855517847",null,"1.89943","C","844615"],
    ["513515","Ipsa Mcbtk",25,"13874017694","2011-01-06","1.79534","A","844615"],
    ["658009","Lboe Etqo",27,null,"2011-04-14","1.85162","E","408862"],
    ["716899","Gey Elacflr",18,"15804516095","2011-02-27","1.75681","A","844615"],
    ["945911","Hsuz Qcmky",39,"13862516775","2011-05-31","1.75947","B","408862"],
    ["960643","Qbmbe Ijnbqsb",24,"15872418765","2011-04-11","1.78864","B","844615"]]

Finally, to import the data into a relational database, we can use the cht-rows2sql command to convert the resulting .json data files into .sql files:

    $ cht-rows2sql data/*.rows.json
    Wrote ./sql/departments.rows.sql
    Wrote ./sql/employees.rows.sql

Here sql/departments.rows.sql looks like this on my side:

    $ cat sql/departments.rows.sql
    insert into departments (id,name) values
    (408862,'dJRq7LCXL'),
    (844615,'G_m9Nkh3q');
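
The conversion from the rows format (a header row followed by data rows) to an insert statement can be sketched in a few lines of Python. This is only an illustration, not the cht-rows2sql source: the function names are hypothetical, and it assumes that numeric columns arrive as JSON numbers, whereas the real tool presumably consults the column types.

```python
import json


def sql_literal(value):
    """Render one JSON value as an SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    # Quote strings, doubling any embedded single quotes.
    return "'" + str(value).replace("'", "''") + "'"


def rows_to_insert(table, rows):
    """Turn [header, row, row, ...] into one multi-row insert statement."""
    header, *data = rows
    columns = ",".join(header)
    values = ",\n".join(
        "(" + ",".join(sql_literal(v) for v in row) + ")" for row in data
    )
    return f"insert into {table} ({columns}) values\n{values};"


rows = json.loads('[["id","name"],[408862,"dJRq7LCXL"],[844615,"G_m9Nkh3q"]]')
statement = rows_to_insert("departments", rows)
```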

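Now we can import the data directly into a database such as mysql:

    $ mysql -u monty test -p < sql/departments.rows.sql

cheater is still under fairly active development and lacks comprehensive documentation. The most complete documentation is its automated test suite:

     http://github.com/agentzh/cheater/tree/master/t/

Open the .t files there and you will see the declarative test cases one by one ;)

If you find any bugs while using it, or have any feature proposals, please create a ticket on GitHub:

     http://github.com/agentzh/cheater/issues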

Enjoy!


August 19, 2010 03:24 AM

August 18, 2010

png icon builder with perl, imagemagick and xpm2wico

purl in your heart

jjiang@flatpan:~$ perl -e 'print qx{convert -resize 32x32\! @{[($a=$_=shift)=~s{.png}{.ico} && $_]} /tmp/a.xpm; xpm2wico -f /tmp/a.xpm $a}' home.png
jjiang@flatpan:~$ ls -l home.ico
-rw-r--r-- 1 jjiang jjiang 2238 2010-08-18 15:09 home.ico

August 18, 2010 12:12 AM

August 11, 2010

make single-quoted list for SQL in list

purl in your heart

perl -lne 'END{print join q(,), @p} s{(^|$)}{chr(0x27)}eg; push @p, $_'

August 11, 2010 01:07 AM

August 10, 2010

A patch for libdrizzle to fix issues on Mac OS X

Human & Machine

libdrizzle is an excellent piece of software but we've noticed that it does not compile on Mac OS X due to its use of the new bool type in C:

  /usr/local/include/libdrizzle/result.h:69: error: syntax error before ‘drizzle_result_eof’

The following small patch for libdrizzle 0.8 fixes this (as well as another bug regarding streaming parsing on the TCP protocol level, as reported by my colleague chaoslawful++ months ago):

   http://agentzh.org/misc/nginx/libdrizzle-0.8-parsebug_and_mac_fixes.patch

I'm looking forward to the next libdrizzle release; otherwise I'll have to keep maintaining this patch for my ngx_drizzle module's users ;)

August 10, 2010 03:30 AM

August 05, 2010

Verse is your friend

purl in your heart

August 05, 2010 04:32 PM

August 03, 2010

ngx_chunkin v0.20: fixed some memory bugs and added support for chunked PUT

Human & Machine

I've just pushed a new release (v0.20) for my ngx_chunkin module:

  http://github.com/agentzh/chunkin-nginx-module/tarball/v0.20

Here's the changes included in this version:

This module adds HTTP 1.1 chunked input support for Nginx without the need of patching the Nginx core.

Behind the scene, it registers an access-phase handler that will eagerly read and decode incoming request bodies when a "Transfer-Encoding: chunked" header triggers a 411 error page in Nginx. For requests that are not in the chunked transfer encoding, this module is a "no-op".
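
For readers unfamiliar with the wire format being decoded, here is a minimal Python sketch of chunked transfer decoding on a complete in-memory body. It illustrates the encoding only; ngx_chunkin itself is written in C, works incrementally on nginx buffers, and is not structured like this. The function name is made up for the example, and trailers are ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body that is fully available in memory."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with "<hex size>[;extensions]\r\n".
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; any trailers after it are ignored here
        chunk = body[pos:pos + size]
        terminator = body[pos + size:pos + size + 2]
        if len(chunk) != size or terminator != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        decoded += chunk
        pos += size + 2
    return bytes(decoded)
```

A body of b"5\r\nhello\r\n7\r\n, world\r\n0\r\n\r\n" decodes to b"hello, world".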

To enable the magic, just turn on the chunkin config option and define a custom 411 error_page using chunkin_resume, like this:

     server {
       chunkin on;

       error_page 411 = @my_411_error;
       location @my_411_error {
           chunkin_resume;
       }

       ...
     }

See its wiki page for the full documentation:

  http://wiki.nginx.org/NginxHttpChunkinModule

Note that nginx 0.8.41 and 0.7.67 are confirmed to work with this version. Nginx releases newer than 0.8.41 will not work due to this issue:

  http://forum.nginx.org/read.php?29,103078

There's no reply from the nginx author Igor Sysoev yet :(

Enjoy!

Update: I've just kicked ngx_chunkin v0.21 out of the door, which applies a patch from Gong Kaihui that fixed a small bug:

   http://wiki.nginx.org/NginxHttpChunkinModule#v0.21

   http://github.com/agentzh/chunkin-nginx-module/tarball/v0.21

August 03, 2010 03:24 AM

July 31, 2010

Pie chart of accumulated pcpu per user

purl in your heart

pgrep -f .|xargs -i{} ps -o user=,pcpu= {}|perl -MGD::Graph::pie -lane 'END{$p=GD::Graph::pie->new(); binmode(STDOUT); print $p->plot([[keys(%pcpu)], [values(%pcpu)]])->png;} $pcpu{$F[0]}+=$F[1]' > pcpu-pie.png

July 31, 2010 06:21 AM

July 30, 2010

tab-separated mail sender that works with pipes

purl in your heart

perl -F\\t -lane 'BEGIN{$a=shift; $s=shift} END{use MIME::Lite; use HTML::Entities qw(encode_entities); eval {local $\; $m = MIME::Lite ->new(To => $a, Subject => $s, Type => q(multipart/related)); $m->attach(Encoding =>q(base64), Type => q(text/html), Data => qq(<head><style>td {border:1px inset black;}</style></head><body><table style="border:1px outset black; border-collapse:collapse;">$t</table></body>)); print $m->send()}} $t.=qq(<tr>\t@{[map {s{(.*)}{<td>$1</td>};$_} map {encode_entities($_)} @F]}\t</tr>\n)' $*

July 30, 2010 12:14 AM

July 27, 2010

putty duplicate window with autohotkey

purl in your heart

Put the lines below into AutoHotkey.ahk and you will be able to mimic gnome-terminal's Ctrl-Shift-N with putty:

^+n::
IfWinExist,ahk_class PuTTY
{
  WinActivate
  Send,{ALTDOWN}{SPACE}{ALTUP}d
}
return

July 27, 2010 07:24 PM

July 23, 2010

diff of Pod::DocBook.pm

purl in your heart

11a12
> use HTML::Entities;
404c405,406
<       $argument =~ s/\s(?![^<]*>)/&nbsp;/g;
---
>       #$argument =~ s/\s(?![^<]*>)/&nbsp;/g;
>       $argument =~ s/\s(?![^<]*>)/ /g;
409c411
<       $string = "<indexterm><primary>$argument</primary></indexterm>";
---
>       $string = "<indexterm><primary>@{[encode_entities($argument)]}</primary></indexterm>";
680,682c682
<                               "<variablelist>\n",
<                               $parser->_indent (),
<                               "<varlistentry>\n",
---
>                               "<itemizedlist>\n",
684c684
<                               qq!<term><anchor id="$id">$paragraph</term>\n!,
---
>                               qq!<term><anchor id="$id"/>$paragraph</term>\n!,
705,708d704
<                           $parser->_outdent (),
<                           "</varlistentry>\n",
<                           $parser->_indent (),
<                           "<varlistentry>\n",
710c706
<                           qq!<term><anchor id="$id">$paragraph</term>\n!,
---
>                           qq!<term><anchor id="$id"/>$paragraph</term>\n!,
829,831c825
<                               "</varlistentry>\n",
<                               $parser->_outdent (),
<                               "</variablelist>\n",
---
>                               "</itemizedlist>\n",

July 23, 2010 09:05 PM

July 10, 2010

DBI Usage

purl in your heart

July 10, 2010 03:48 AM

July 05, 2010

one-liner to find the GBK *.txt files in /tmp

purl in your heart

perl -MFile::Find::Rule -le 'print for File::Find::Rule->file()->name(q(*.txt))->grep(q(((?:[\xB0-\xF7][\xA1-\xFE]){1,})))->in(q(/tmp))'
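
One caveat with a byte-range test like this is that UTF-8 encoded Chinese text can also contain byte pairs in the same ranges. The Python sketch below shows a slightly stricter heuristic under that assumption: treat data as GBK only if it is not plain ASCII, is not valid UTF-8, and does decode as GBK. The function name is hypothetical, and this remains a heuristic rather than a reliable detector.

```python
import re

# The same byte ranges as the one-liner above (a GB2312-style hanzi pair).
GB2312_PAIR = re.compile(rb"[\xB0-\xF7][\xA1-\xFE]")


def looks_like_gbk(data: bytes) -> bool:
    """Heuristically decide whether data is GBK-encoded text."""
    if data.isascii():
        return False  # nothing to distinguish
    try:
        data.decode("utf-8")
        return False  # valid UTF-8 is far more likely to be UTF-8
    except UnicodeDecodeError:
        pass
    try:
        data.decode("gbk")
    except UnicodeDecodeError:
        return False
    return True


utf8_sample = "汉字".encode("utf-8")
gbk_sample = "汉字".encode("gbk")
# The raw byte-range test fires on both samples ...
range_hits = [bool(GB2312_PAIR.search(s)) for s in (utf8_sample, gbk_sample)]
# ... while the decode-based check separates them.
decode_hits = [looks_like_gbk(s) for s in (utf8_sample, gbk_sample)]
```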

July 05, 2010 01:10 AM

June 30, 2010

File::Find::Rule wrapper

purl in your heart

pfind ()
{
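    # Usage: pfind NAME_GLOB CONTENT_REGEX HOURS DIR
    # "$@" keeps each argument intact, so a glob such as '*.txt' reaches perl unexpanded.
    perl -MFile::Find::Rule -le 'print for File::Find::Rule->file()->name(shift)->grep(shift)->ctime(qq(>=@{[time-3600*shift]}))->in(shift)' "$@"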
}

June 30, 2010 12:32 AM

June 28, 2010

Perl job opening in Beijing

purl in your heart

JD-Local-Developer.pdf (91 KB)

The Beijing office of ISI (Internet Securities, Inc.) is currently hiring Perl developers. You are welcome to join; see http://www.securities.com

June 28, 2010 01:42 AM

June 24, 2010

ngx_lua now has (basic) subrequest support

Human & Machine

Last night's ngx_lua hackathon proved extremely fruitful. chaoslawful and I didn't stop coding until midnight, and we successfully finished the first draft of the trickiest bit in ngx_lua, that is, the transparent non-blocking I/O interface (or nginx subrequest interface) in Lua land.

The following test case is now passing:

  location /other {
      echo "hello, world";
  }

  # transparent non-blocking I/O in Lua
  location /lua {
      content_by_lua '
          local res = ngx.location.capture("/other")
          if res.status == 200 then
              ngx.print(res.body)
          end';
  }


And on the client side:

   $ curl 'http://localhost/lua'
   hello, world

In the /other location, we can actually have drizzle_pass, postgres_pass, memcached_pass,  proxy_pass, or any other content handler configuration.

Here's a more amusing example to do "recursive subrequest":

   location /recur {
       content_by_lua '
           local num = tonumber(ngx.var.arg_num) or 0
           ngx.say("num is: ", num)

           if num > 0 then
               local res = ngx.location.capture("/recur?num=" .. tostring(num - 1))
               ngx.print("status=", res.status, " ")
               ngx.print("body=", res.body)
           else
               ngx.say("end")
           end                            
           ';
   }

Here's the output on the client side:

    $ curl 'http://localhost/recur?num=3'
    num is: 3
    status=200 body=num is: 2
    status=200 body=num is: 1
    status=200 body=num is: 0
    end 
          

You can check out the git HEAD of ngx_lua to try the examples above yourself:

   http://github.com/chaoslawful/lua-nginx-module

So...time to replace our PHP code in the business with nginx.conf + Lua scripts!

We'll make the first public release of ngx_lua when its implementation and API become solid enough ;)

June 24, 2010 04:52 AM

Backup of /etc/passwd with DBM::Deep

purl in your heart

perl -MDBM::Deep -F: -lane 'BEGIN{$db = DBM::Deep->new(q(passwd.db))} $db->{$F[0]}=[@F]' /etc/passwd

June 24, 2010 02:42 AM

zip and mail in one shot

zip -q - * | mime-construct --to abc@gmail.com --subject 'zipped' --attachment data.zip --type application/zip --file -

June 24, 2010 02:20 AM

June 22, 2010

ngx_xss v0.02: fixed a bug that prevents responses from being gzipped

Human & Machine

I'm glad to announce the v0.02 release of the ngx_xss module:

   http://github.com/agentzh/xss-nginx-module/tarball/v0.02

This module provides native cross-site scripting (XSS) support in nginx, and cross-site GET via JSONP in particular. Please visit the project homepage for more details:

   http://github.com/agentzh/xss-nginx-module
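
As background, JSONP simply wraps a JSON body in a call to a client-supplied callback function so that a script tag on another site can consume it. The Python sketch below shows the idea; it is not the module's code, the function name and the trailing semicolon are assumptions, and the callback validation pattern is just one conservative choice.

```python
import json
import re

# Accept dotted JavaScript identifiers only, e.g. "handle" or "app.onData".
_CALLBACK = re.compile(
    r"[A-Za-z_$][A-Za-z0-9_$]*(?:\.[A-Za-z_$][A-Za-z0-9_$]*)*"
)


def wrap_jsonp(callback: str, payload) -> str:
    """Wrap a JSON-serializable payload in a JSONP callback invocation."""
    if not _CALLBACK.fullmatch(callback):
        # Rejecting odd names keeps attacker-controlled script out of the body.
        raise ValueError("invalid callback name")
    return f"{callback}({json.dumps(payload)});"
```

Validating the callback name matters because it is taken from the query string and echoed into executable JavaScript.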

This release fixes a nasty bug in Content-Type header handling. The previous version did not clear r->headers_out.content_type_lowcase, which sadly prevented responses from being compressed by ngx_http_gzip_filter_module when it was configured.

Thanks to my teammate kindy++ for catching it in our production environment :P

June 22, 2010 09:49 AM

June 21, 2010

ngx_drizzle v0.0.11 and ngx_rds_json v0.09: significant performance boost

Human & Machine

I'm happy to announce that ngx_drizzle v0.0.11 and ngx_rds_json v0.09 are finally out:

   http://github.com/chaoslawful/drizzle-nginx-module/tarball/v0.0.11

   http://github.com/agentzh/rds-json-nginx-module/tarball/v0.09

ngx_drizzle is an upstream module that talks to mysql, drizzle, and sqlite3 by libdrizzle, and generates output in a binary format known as RDS (Resty DBD Stream), just like Piotr Sikora's ngx_postgres module.

ngx_rds_json is an output filter module that converts the RDS outputs of ngx_drizzle and ngx_postgres, to plain JSON text.

The highlight of these releases is a significant performance boost due to the recent extensive refactoring and optimizations in these two modules.

One can observe improvements of hundreds of times for big responses above 300KB, for both the ngx_drizzle + ngx_rds_json and the ngx_postgres + ngx_rds_json combinations.

We even observed that ngx_drizzle + ngx_rds_json achieved 128MB/sec throughput rate (Yes, 1024Mbit/sec!) for a 380KB data-set query while connecting to a simple mysql server, about 3 ~ 4 times as fast as php + libmysql in an identical setting.

Technically, we (partially) adapted the "fixed-size bufs" model used in ngx_http_gzip_filter_module (and elsewhere in the nginx core) in both of our modules, effectively eliminating lots of unnecessary packet splitting and buffer allocations, which contributes most of the performance boost. And we'd like ngx_postgres to apply this technique in the near future too (for a typical 470 KB data-set query, ngx_drizzle + ngx_rds_json is now more than 50% faster than ngx_postgres + ngx_rds_json).
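
To make the effect concrete, here is a toy Python model of the idea: small output pieces are coalesced into fixed-size buffers, and the downstream write happens only when a buffer fills up. It is a sketch of the general technique only; the class name is made up, the numbers are illustrative, and nginx's actual buf/chain recycling is not modelled.

```python
class FixedSizeEmitter:
    """Coalesce small pieces into fixed-size buffers before writing."""

    def __init__(self, buf_size, sink):
        self.buf_size = buf_size
        self.sink = sink            # called once per emitted buffer
        self.pending = bytearray()

    def write(self, piece: bytes) -> None:
        self.pending += piece
        # Emit only full buffers; keep the remainder for later pieces.
        while len(self.pending) >= self.buf_size:
            self.sink(bytes(self.pending[:self.buf_size]))
            del self.pending[:self.buf_size]

    def close(self) -> None:
        # Flush whatever is left at the end of the response.
        if self.pending:
            self.sink(bytes(self.pending))
            self.pending.clear()


# 10000 tiny pieces (380000 bytes in total) pushed through 4k buffers.
pieces = [b"x" * 38] * 10000
writes = []
emitter = FixedSizeEmitter(4096, writes.append)
for piece in pieces:
    emitter.write(piece)
emitter.close()
# len(writes) is 93 here, versus 10000 writes if every piece were flushed.
```

The trade-off is the one mentioned for the buffer size directives: larger buffers mean fewer writes but less incremental output.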

We also introduce the drizzle_buffer_size and rds_json_buffer_size directives to allow the user to adjust the size of each buf that is used in the output emitter:

    drizzle_buffer_size 4k;
    rds_json_buffer_size 4k;

The default setting is the page size, usually 4k ~ 8k. The bigger the buffer size, the less streaming the output will be.

Last but not least, this release of ngx_drizzle also includes various new features ported from ngx_postgres, like the "method-specific queries" support in the drizzle_query directive.

Here's an example that implements a full RESTful interface to a mysql backend using a single nginx location, and no "if hacks":

  location ~ '^/cat/(\d+)' {
      set $id $1;
      set_form_input $name;
      set_quote_sql_str $quoted_name $name;

      drizzle_query HEAD GET "select * from cats where id=$id";
      drizzle_query DELETE "delete from cats where id=$id";
      drizzle_query POST "insert into cats (name) values($quoted_name)";

      drizzle_pass my_mysql_backend;
  }

There's also an example for ngx_postgres in my slides:

   http://agentzh.org/misc/slides/recent-dev-nginx-conf/#18

Have fun!


June 21, 2010 09:48 AM

Recent developments in nginx.conf scripting

Last Saturday I gave a talk at Beijing OpenParty's monthly meetup about our recent developments in nginx.conf scripting, highlighting new modules like ngx_postgres, ngx_form_input, ngx_srcache, ngx_encrypted_session, and ngx_lua.

Here are the slides that I used at this event:

    http://agentzh.org/misc/slides/recent-dev-nginx-conf/

Please use the arrow keys or page-up/page-down keys on your keyboard to navigate through them.

Special thanks go to Piotr Sikora for his preview of these slides before the talk and lots of his good suggestions.

Feedback and comments are very welcome as usual :)

Enjoy!

June 21, 2010 03:33 AM

June 18, 2010

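ngx_openresty: big optimizations over the Dragon Boat Festival

Human & Machine

Around the Dragon Boat Festival I did some deep refactoring and optimization of ngx_drizzle and ngx_rds_json. For queries with big result sets of several hundred KB, performance is now hundreds of times better than a week ago. For a typical mysql query with a 380 KB result set and 10 concurrent connections, a single machine now reaches 430+ q/s, a 128 MB/s transfer rate, and a 20 ms average response time [1].

The nginx side does not use any caching of the result sets, though the mysql side may well be using its own in-memory cache, heh. Our performance is now finally more than 3 times that of php + libmysql.

The main change is a redesigned buffer management model that uses a chain of fixed-size bufs with a recycling mechanism, similar to ngx_http_gzip_filter_module. I'll describe it in more detail when I get the chance, heh.

It's worth mentioning that compiling nginx with gcc -O1 gives a 50% improvement over -O0, while -O2 gains only a few percent over -O1.

The optimizations in ngx_rds_json also make ngx_postgres one to two hundred times faster when the two work together :)

In addition, git HEAD adds one new config directive to each of the two modules: drizzle_buffer_size and rds_json_buffer_size. Both default to the page size, which is usually 4k or 8k. For big result sets, increasing it appropriately helps performance further, but the bigger the buffer size, the less streaming the output will be.

I'll release new versions of these two modules after tomorrow's OpenParty talk :)

Stay tuned~

[1] The test machine has a 4-core Intel(R) Xeon(R) CPU 5130 @ 2.00GHz and 4 GB of RAM.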




June 18, 2010 07:41 AM

June 15, 2010

passing away

purl in your heart

"And God shall wipe away all tears from their eyes; and there shall be no more death, neither sorrow, nor
 crying, neither shall there be any more pain: for the former things are passed away." -- Revelation 21:4

June 15, 2010 06:00 PM

June 14, 2010

select aggregate login group by shell from /etc/passwd

purl in your heart

perl -F: -lane 'END{print $_,q( => ), join q(, ), @{$sh{$_}} for keys %sh} push @{$sh{$F[6]}},$F[0]' /etc/passwd
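
The same grouping can be spelled out in Python for readers who do not parse Perl one-liners at a glance. This is an equivalent sketch, not a translation of every Perl detail; the function name is made up, and it takes the file contents as a string so it can be tried on sample data.

```python
from collections import defaultdict


def logins_by_shell(passwd_text: str) -> dict:
    """Map each login shell (field 7) to the list of logins (field 1)."""
    groups = defaultdict(list)
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        fields = line.split(":")
        if len(fields) < 7:
            continue  # not a well-formed passwd entry
        groups[fields[6]].append(fields[0])
    return dict(groups)


sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
    "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n"
)
grouped = logins_by_shell(sample)
```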

June 14, 2010 07:49 AM

June 13, 2010

uhex & uchr in firefox bookmark keyword

purl in your heart

uhex => javascript:'%s'.charCodeAt(0).toString(16)
uchr => javascript:String.fromCharCode(parseInt('%s',16))

June 13, 2010 08:55 PM

June 06, 2010

ngx_headers_more v0.10: ability to remove a header completely

Human & Machine

I'm happy to announce that the v0.10 release of ngx_headers_more has just landed:

   http://github.com/agentzh/headers-more-nginx-module/tarball/v0.10

This module allows you to add, set, or clear any output or input header that you specify.  Please see the full documentation for more details.

From this release, the more_clear_headers and more_clear_input_headers directives can remove a specified header completely.

In earlier releases, clearing a header usually just cleared the header's value, not its key. For instance,

    more_clear_headers 'Server';

yields

   Server:

in the response headers. Now, it'll be erased completely.

Non-standard headers like X-Foo can be totally erased as well.

The same change applies to more_clear_input_headers, where we can erase an input header in the request object before forwarding it to a content handler like proxy_pass or fastcgi_pass.

You're very welcome to send bug reports (if any!) or wishlist items to us at GitHub Issues.

Enjoy!

June 06, 2010 12:46 PM

June 04, 2010

ngx_echo v0.32: various memory issue fixes inspired by valgrind

Human & Machine

I'm happy to announce the v0.32 release of the ngx_echo module:

   http://github.com/agentzh/echo-nginx-module/tarball/v0.32

This module wraps lots of Nginx internal APIs for streaming input and output, parallel/sequential subrequests, timers and sleeping, as well as various meta data accessing. Basically it provides various utilities that help testing and debugging of other modules by trivially emulating different kinds of faked subrequest locations.

This release fixes several memory issues reported by valgrind's memcheck tool and below is the complete change log for this version:

   http://wiki.nginx.org/NginxHttpEchoModule#v0.32

We've just integrated valgrind support into our test scaffold Test::Nginx which is used by all of our nginx module projects. Now running a particular module's test suite with valgrind's memcheck is as simple as

    cd test
    TEST_NGINX_USE_VALGRIND=1 prove -r t

This facility has also helped spot quite a few memory-related issues in several other modules that we develop :)

Enjoy!

June 04, 2010 04:08 AM

June 03, 2010

The upcoming nginx.conf scripting talk at OpenParty

Human & Machine

I'm going to talk about nginx.conf scripting in this month's OpenParty meetup:

    http://app.beijing-open-party.org/topic/10

You're very welcome to attend my talk:

    http://app.beijing-open-party.org/

See you there ;)

June 03, 2010 03:00 AM

June 02, 2010

ngx_headers_more v0.09: wildcard support in more_clear_headers

Human & Machine

I'm delighted to announce the v0.09 release of the ngx_headers_more module:

   http://github.com/agentzh/headers-more-nginx-module/tarball/v0.09

This module allows you to add, set, or clear any output or input headers that you specify. This is an enhanced version of the standard headers module because it provides more utilities like resetting or clearing "builtin headers" like Content-Type, Content-Length, and Server.

This release features wildcard (*) support in the more_clear_headers directive. For example, the following directive effectively clears any output headers starting with X-Hidden-:

   more_clear_headers 'X-Hidden-*';
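
The matching behaviour can be pictured with a few lines of Python. This is an illustration of wildcard header clearing in general, not the module's algorithm: the function name is hypothetical, header names are compared case-insensitively as HTTP requires, and Python's fnmatch also gives meaning to "?" and "[...]", which the directive may not.

```python
from fnmatch import fnmatchcase


def clear_headers(headers: dict, pattern: str) -> dict:
    """Return a copy of headers without the names matching pattern."""
    lowered = pattern.lower()
    return {
        name: value
        for name, value in headers.items()
        if not fnmatchcase(name.lower(), lowered)
    }


response_headers = {
    "Server": "nginx",
    "X-Hidden-One": "1",
    "x-hidden-two": "2",
    "X-Visible": "3",
}
cleaned = clear_headers(response_headers, "X-Hidden-*")
```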

Thanks to our new contributor Bernd Dorn for implementing this :)

See the complete change log for this version if you're interested:

   http://wiki.nginx.org/NginxHttpHeadersMoreModule#v0.09

In the previous version, v0.08, Bernd Dorn also introduced the -r option to the more_set_input_headers directive, which replaces the value of the specified input header if and only if that header actually exists.

You can find the complete documentation of this module on the nginx wiki site:

   http://wiki.nginx.org/NginxHttpHeadersMoreModule

Enjoy!

June 02, 2010 12:10 PM

June 01, 2010

Optimizing ngx_openresty for big responses

Human & Machine

(Piotr Sikora blames me for posting my previous article in Chinese. Hence the following English transcript.)

I've been spotting bottlenecks in ngx_drizzle and ngx_rds_json for responses with big data sets. Now it's finally 100% faster than php + libmysql for huge mysql responses of 380+ KB. The former can reach 250+ q/s using ab -c50 while the latter merely gets 120+ q/s.

Earlier last week, for requests with such big responses, ngx_drizzle + ngx_rds_json merely got 12 ~ 14 q/s, just 1/10 of that of php + libmysql, which drove me crazy.

This serious performance issue mainly comes from the "strict context-free streaming output model" that I introduced into ngx_drizzle and ngx_rds_json around the beginning of this year out of laziness. It has proven to be a really bad idea because it splits the data buffer into many, many little pieces and, even worse, it sets buf->flush to true, which effectively enables auto-flushing. Sigh. For the specific 380 KB result set query mentioned above, it results in 10000+ calls to writev, which makes it slow even compared to php :)

I've already turned off buf->flush in ngx_drizzle v0.0.11rc1 and ngx_rds_json v0.08.

calio++ will refactor the bufs and chains model used by these two modules this month, so as to achieve the best performance here.


For little data sets (less than 500 bytes), ngx_drizzle's qps is always at least an order of magnitude higher than php + libmysql, without doubt. Thousands of q/s is not uncommon at all.

The good news is that qunar.com is already using ngx_postgres + ngx_rds_json to access PostgreSQL in production for several online services. According to their report, they can achieve 7k ~ 8k q/s per machine without caching the result sets. Their nginx worker processes have run flawlessly for 12 days and have never crashed.

We are also preparing to push ngx_drizzle online for our own data product.

Stay tuned~~



June 01, 2010 06:04 AM

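ngx_openresty optimization in progress

I've been optimizing ngx_drizzle and ngx_rds_json for the past few days. For big result sets of around 380 KB, ngx_drizzle is finally more than twice as fast as php + libmysql: the former does 250+ q/s on a single machine, the latter 120+ q/s.

Last week, for single queries with such big result sets, ngx_drizzle + ngx_rds_json reached only 1/10 of the q/s of php + libmysql, a mere 12 ~ 14 q/s. It nearly drove me crazy, haha!

The main cause of the bottleneck for big result sets was that, when I wrote ngx_drizzle and ngx_rds_json at the beginning of this year, I got lazy and implemented a context-free, strictly streaming output model. It sliced the data too finely and also set buf->flush, which led to far too many writev system calls (in the 380 KB test case above, there were initially 10000+ calls). Indeed, the size of the TCP packets sent downstream ought to depend on this "context", namely the TCP packets coming from the upstream mysql :) I have already asked calio to refactor the memory management model of the output components of ngx_drizzle and ngx_rds_json this month.

For queries with small result sets (less than 500 bytes), it's no surprise that ngx_drizzle is one to two orders of magnitude faster than php + libmysql, heh. The former is always in the thousands of qps.

The good news is that qunar.com is already using ngx_postgres + ngx_rds_json in production to access their PostgreSQL database back end. According to their report, a single machine can reach 7k ~ 8k q/s, and it has been running stably online for 12 days; not a single nginx worker process has died, heh, and no memory leaks have been observed either :D

We are also busily preparing to push ngx_drizzle online for our own data product ;)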



June 01, 2010 03:28 AM

May 27, 2010

some other way to do funny translation

purl in your heart

$ echo \'\' | perl -pne "s{@{[chr(0x27)]}}{chr(0x22)}eg"
""
$ echo -ne "''\n'\n''\n" | perl -pne "s{@{[chr(0x27)]}}{chr(0x22)}eg"
""
"
""
$

May 27, 2010 08:44 AM

May 26, 2010

ngx_echo v0.31: more sequential subrequest fixes inspired by ngx_srcache

Human & Machine

I'm glad to release ngx_echo v0.31, which includes various fixes for sequential subrequests inspired by the ngx_srcache development:

   http://github.com/agentzh/echo-nginx-module/tarball/v0.31

Here's the (boring) technical change log:

BTW, the first public release of ngx_srcache is just around the corner:

   http://github.com/agentzh/srcache-nginx-module

calio++ is currently working on the last missing bit, i.e., response headers caching :)

May 26, 2010 02:33 AM

ngx_rds_json v0.07: fixed boolean values for PostgreSQL

I'm pleased to announce that ngx_rds_json v0.07 has been released:

    http://github.com/agentzh/rds-json-nginx-module/tarball/v0.07

ngx_rds_json is an output filter module that can format the RDS (Resty DBD Stream) outputs of the ngx_drizzle and ngx_postgres modules to JSON text.

This release includes a patch from Tom Tuling that fixes the boolean values in the result sets emitted by PostgreSQL, i.e., values like "f" and "t". Thanks, Tom Tuling :)

FWIW, my friend xunxin++ has already put ngx_postgres + ngx_rds_json into production :)

May 26, 2010 02:09 AM

May 25, 2010

putty mouse setting

purl in your heart

http://dag.wieers.com/blog/content/improving-putty-settings-on-windows

I prefer to do an implicit copy when selecting and using the middle mouse button for pasting. So I go to Category: Window > Selection and set the Action of mouse buttons to xterm (Right extends, Middle pastes)

May 25, 2010 11:40 PM

May 18, 2010

A simple ngx_lua example for the future

Human & Machine

One of my nginx module users is asking for a "get_once" command support for the ngx_memc module, that is, read a key and then delete it immediately. But I'd rather do such specific things on the ngx_lua level. Here's a quick example for this task:

   location /get_once {
       content_by_lua "
            -- read the memcached key from the nginx variable $uri
            local key = ngx.var.uri;

            -- do memcached get
            local res = ngx.location.capture('/memc?cmd=get&key=' .. key);

            -- forward the memcached get response to the downstream
            ngx.res.status = res.status;
            ngx.res.out(res.body);

            if res.status ~= 404 then
                -- delete the key and discard the response
                ngx.location.capture('/memc?cmd=delete&key=' .. key);
            end
            ";
   }

  location /memc {
        internal;
        set $memc_key $arg_key;
        set $memc_cmd $arg_cmd;
        memc_pass 127.0.0.1:11211;
  }

Furthermore, we can use the content_by_lua_file directive if we'd put the lua code into a separate .lua file ;)

It's worth mentioning that the ngx.location.capture method runs in completely non-blocking fashion. Thanks to Lua's coroutine support :D
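
The coroutine trick can be illustrated with Python generators: the handler is written as straight-line code, but every capture is a point where it yields control and is resumed later with the response. The sketch below is only a model of that control flow under simplifying assumptions. All names are hypothetical, the fake backend is an in-memory dict, and the driver runs each subrequest immediately, whereas a real event loop would park the handler until the response arrives.

```python
from urllib.parse import parse_qs, urlsplit


def get_once_handler(key):
    """Read a key, then delete it if it existed (coroutine style)."""
    res = yield ("capture", f"/memc?cmd=get&key={key}")
    if res["status"] != 404:
        # The delete response is discarded.
        yield ("capture", f"/memc?cmd=delete&key={key}")
    return res


def make_backend(store):
    """Fake memcached location backed by a dict."""
    def backend(uri):
        args = parse_qs(urlsplit(uri).query)
        cmd, key = args["cmd"][0], args["key"][0]
        if cmd == "get":
            if key in store:
                return {"status": 200, "body": store[key]}
            return {"status": 404, "body": ""}
        if cmd == "delete":
            existed = store.pop(key, None) is not None
            return {"status": 200 if existed else 404, "body": ""}
        return {"status": 400, "body": ""}
    return backend


def run(handler, backend):
    """Drive a handler: resume it with each subrequest's response."""
    try:
        request = next(handler)
        while True:
            _, uri = request
            request = handler.send(backend(uri))
    except StopIteration as stop:
        return stop.value


store = {"foo": "bar"}
first = run(get_once_handler("foo"), make_backend(store))
second = run(get_once_handler("foo"), make_backend(store))
```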

And yes! At the moment, ngx_lua does not have content_by_lua and content_by_lua_file yet. But we're already working on them and they'll come soon.

Because Lua can run pretty fast especially when JIT is enabled (even compared to plain C), I think we do not have much performance penalty here :)

Do you like it?

Update

Here's a quick example using ngx_echo and ngx_eval that implements exactly the same feature and more importantly, runs now:

    location = /get_once {
        echo_location /get;
        echo_location /del?foo;
    }
    location = /get {
        set $memc_cmd get;
        set $memc_key foo;
        memc_pass mc;
    }
    location = /del {
        eval $res {
            set $memc_cmd delete;
            set $memc_key $query_string;
            memc_pass mc;
        }
        return 200;
    }

assuming you define the "mc" backend like this:

    upstream mc {
        server localhost:11211;
    }

 
The only drawback is that when the key is not found at /get, its 404 status code won't affect the main request (i.e., the /get_once location).

BTW, echo_location is a directive provided by my echo module:

    http://wiki.nginx.org/NginxHttpEchoModule

and the eval directive is provided by my fork of the ngx_eval module. The official ngx_eval module contains some bugs which may crash the worker when working together with the echo module. Sigh.


May 18, 2010 02:51 AM

May 17, 2010

something strange in many online bibles

purl in your heart

"惹缰皎" <=> "基路伯"

Posted via email from purl's posterous

May 17, 2010 12:23 AM

May 08, 2010

ncv is a function

purl in your heart

ncv ()
{
    GET http://ncvs.holybible.com.cn/$*.htm | perl -plne 's{.*<span class="reftext">(.*)</span>&nbsp;(.*)}{\1 \2}' | perl -pnle 's{(.*)<p><hr.*}{\1}' | w3m -T text/html
}

Posted via email from purl's posterous

May 08, 2010 05:05 AM

May 05, 2010

ngx_echo v0.29: major core refactoring and more robust sequential subrequests

Human & Machine

I'm happy to announce the v0.29 release of the ngx_echo module:

  http://github.com/agentzh/echo-nginx-module/tarball/v0.29

As mentioned in some other threads on the nginx mailing list, I've completed the big refactoring of the ngx_echo module's core in this version to reflect my latest understanding (hopefully quite complete and correct already) of the nginx internals.

The implementations of the echo_subrequest, echo_location, echo_sleep, and echo_read_request_body directives have been massively rewritten. I'm trying to set up a design pattern for nginx content handlers that need to do various kinds of non-blocking I/O during their lifetime (similar to upstream modules but for different tasks like subrequests and others).

For sequential subrequests issued by the echo module's content handler, we now use a totally different approach.

Instead of issuing subrequests directly in our post_subrequest callback (as fed into the ngx_http_subrequest call), we now defer firing them to a custom write event handler, which is called automatically once the current subrequest has been completely finalized (the post_subrequest callback itself is called by ngx_http_finalize_request for that subrequest).

This works because the parent request is always woken up once its subrequest completes. (ngx_http_finalize_request does this by posting the parent request into the "posted requests" queue, and the posted requests are always run at the end of the top-level read/write event handler, i.e., ngx_http_request_handler.)

This solves a lot of issues like a mangled r->main->count and the following alert message when nginx is configured with the --with-debug option, for instance,

    2010/05/05 16:46:14 [alert] 23853#0: *1 http finalize non-active request: "/main?" ...
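
The deferred-firing idea described above can be modelled in a few lines of Python. This is only a toy sketch of the control flow, not nginx code: ParentRequest, run_subrequest and the other names are made up for illustration, and subrequests complete synchronously here, whereas real ones finish later via the event loop.

```python
from collections import deque

# Models nginx's "posted requests" queue.
posted_requests = deque()


class ParentRequest:
    def __init__(self, locations):
        self.pending = deque(locations)  # subrequests still to be issued
        self.output = []                 # subrequest bodies, in order
        self.depth = 0                   # subrequests currently on the call stack
        self.max_depth = 0               # deepest nesting ever observed


def post_subrequest(parent, body):
    # Completion callback: only record the result.
    # Crucially, it does NOT fire the next subrequest from here.
    parent.output.append(body)


def run_subrequest(parent, location):
    parent.depth += 1
    parent.max_depth = max(parent.max_depth, parent.depth)
    post_subrequest(parent, "body of " + location)
    # "Finalize": post the parent so that it gets woken up later.
    posted_requests.append(parent)
    parent.depth -= 1


def write_event_handler(parent):
    # The custom write event handler fires the next subrequest, if any.
    if parent.pending:
        run_subrequest(parent, parent.pending.popleft())


def request_handler(parent):
    # Top-level handler: do the work, then run all posted requests.
    write_event_handler(parent)
    while posted_requests:
        write_event_handler(posted_requests.popleft())
```

Running request_handler on a parent with three locations collects the three bodies in order while max_depth stays at 1, i.e., no subrequest is ever started from inside another subrequest's completion callback.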

We'll apply this technique to the upcoming ngx_lua and ngx_srcache modules soon :)

P.S. See the ngx_echo module's wiki page for more information: http://wiki.nginx.org/NginxHttpEchoModule

Update: ngx_echo v0.30 is just out:

   http://github.com/agentzh/echo-nginx-module/tarball/v0.30

I didn't get the r->main->count right for the echo_exec directive in the previous version. Use of the echo_exec directive in v0.29 may result in random server hangs for nginx >= 0.8.11.

Please consider upgrading to v0.30 immediately if you're using ngx_echo. And please feel free to report any issues on your side.

May 05, 2010 09:25 AM

April 30, 2010

Jesus loves you in zipped Chinese filename version

purl in your heart

$ echo Ò®öÕ°®Äã | iconv -f utf8 -t iso88591 | iconv -f gbk -t utf8
耶稣爱你
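
The same repair can be sketched in Python, assuming (as in the command above) that the garbled text is GBK bytes that were wrongly displayed as ISO-8859-1. The function name fix_gbk_mojibake is made up for this sketch.

```python
def fix_gbk_mojibake(garbled: str) -> str:
    """Recover text whose GBK bytes were mistakenly decoded as ISO-8859-1.

    Encoding back to Latin-1 restores the original bytes (this mirrors
    `iconv -f utf8 -t iso88591`), and decoding them as GBK yields the
    intended characters (this mirrors `iconv -f gbk -t utf8`).
    """
    return garbled.encode("latin-1").decode("gbk")


if __name__ == "__main__":
    original = "耶稣爱你"
    # Reproduce the damage: GBK bytes misread as Latin-1 characters.
    garbled = original.encode("gbk").decode("latin-1")
    print(garbled)
    print(fix_gbk_mojibake(garbled))
```

This only works when every byte survived the wrong decoding step. If the file name passed through a lossy conversion along the way, the original bytes are gone and no re-decoding will bring them back.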

Posted via email from purl's posterous

April 30, 2010 07:10 AM

April 26, 2010

The end

wanglianghome

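blogger.com will stop supporting FTP publishing on May 1, and this blog will be shut down at the same time.

http://feeds2.feedburner.com/casper will stay in use. I'm currently trying out WordPress, and once I've decided, this feed will point to the new blog.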

April 26, 2010 12:14 PM

April 25, 2010

The longest gap

rogerz

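It has been almost two months since March 1, probably the longest gap since I started blogging. First, I've been busy: with the daily commute to and from work, I've grown too lazy to ponder life. Second, I've become a little afraid: the power of "human flesh search" makes people wary of saying whatever they want online. I've gradually come to feel conflicted, wanting to share with people yet feeling a trace of unease about the unfamiliar eyes behind the network. My once-crazy enthusiasm for chasing new things is not what it used to be either. Time really does make people old.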

April 25, 2010 01:11 AM

April 21, 2010

mediabox size of first page for the PDF file

purl in your heart

pp -e 'use PerlIO; use PDF::API2; print join q( ), map {$_ = $_->{val}} @{${PDF::API2->open(shift)->openpage(0)}{MediaBox}->{" val"}}' -o ~/bin/mediabox

Posted via email from purl's posterous

April 21, 2010 01:25 AM

April 20, 2010

CUV mailer

purl in your heart

#!/usr/bin/perl
use WWW::Mechanize;
use MIME::Lite;
use Encode;
use Encode::HanConvert qw(simple);

$a = WWW::Mechanize ->new();
eval { $m = MIME::Lite ->new(To => q(joejiang@securities.com), Subject => q(HYT), Type => q(multipart/related)); $m->attach(Type => q(text/html), Data => $_); $m->send()} for grep {$_ = encode(q(utf8),simple(decode(q(utf8),$a->get($_)->content())));
s{}{}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{甚么}{什么}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{锁炼}{锁链}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{}{}sg; s{}{}sg;
s{<a href="http://m.youversion.com/bible/more/[\w\d.]+">\[more\]</a>}{}sg; s{.*(<h2 class="reader_location top"><span class="reference">[\w ]+</span></h2>).*<div id="reader" class="content">(.*)<div class="link" style="border-bottom:1px solid #ccc">.*}{<body>\1\2</div></body>}s; $_} grep { $_ = qq(http://m.youversion.com/bible/cuv/$_) } @ARGV

Posted via email from purl's posterous

April 20, 2010 07:36 PM

April 18, 2010

Patch is limited

purl in your heart

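Luke 5:36 Jesus also told them a parable: No one tears a piece from a new garment and patches it onto an old one; otherwise he will have torn the new one, and the piece torn from the new will not match the old. 37 And no one pours new wine into old wineskins; otherwise the new wine will burst the skins, the wine will spill out, and the skins will be ruined. 38 But new wine must be put into new wineskins. 39 And no one who has drunk old wine wants the new, for he says, "The old is good."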

Posted via email from purl's posterous

April 18, 2010 08:49 PM

April 08, 2010

CUV Daily mailer from youversion.com

purl in your heart

#!/usr/bin/perl
use WWW::Mechanize;
use MIME::Lite;
use Encode;
use Encode::HanConvert qw(simple);

$a = WWW::Mechanize ->new();
$a->get('http://m.youversion.com/sign-in?redirect=home');
$a->form_number(2);
$a->field('password', '...');
$a->field('username', '...');
$a->click('submit');
$c = ($a->get('http://m.youversion.com/home')->content());
eval { $m = MIME::Lite ->new(To => q(my@mail.com), Subject => q(OYT),
Type => q(multipart/related)); $m->attach(Type => q(text/html), Data
=> $_); $m->send()} for grep {$_ =
simple(decode(q(utf8),$a->get($_)->content())); s{豫备}{预备}g;
s{豫言}{预言}g; s{\[more\]}{}sg;
s{.*?(
.*).*}{\1
}s; $_} grep { $_ =~
s{/asv/}{/cuv/}; $_ } $c =~
m{"(http://m.youversion.com/bible/asv/.*)"}g

Posted via email from purl's posterous

April 08, 2010 03:04 AM

April 02, 2010

Use iswitchb for orgmode completion

wanglianghome

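It's simple: just (setq org-completion-use-iswitchb t). You can also use ido, i.e., (setq org-completion-use-ido t).

Even so, Chinese input is still a hassle. I'd rather cycle through matches with C-s C-s than switch input methods.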

April 02, 2010 03:33 PM

March 17, 2010

We are having breakfast at 11AM :)

purl in your heart

March 17, 2010 10:32 PM

March 02, 2010

有权势的必如麻瓤、他的工作、好像火星、都要一同焚毁、无人扑灭

purl in your heart

And the strong shall become tinder, and his work a spark, and both of
them shall burn together, with none to quench them.

Posted via email from purl's posterous

March 02, 2010 01:43 PM

March 01, 2010

Repairing an LG KS360 phone

rogerz

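The LG phone my wife bought a year ago suddenly would not power on. I heard some odd noises when moving the slider, and after sliding it a few more times a bit of wire end poked out: the flex cable had snapped. I searched a whole street back in my hometown without finding the part, so it had to wait until I was back in Shanghai.

With the help of a crude household multi-bit screwdriver, I managed to open the phone with some effort. One trick for disassembly is to look at where the warranty sticker is placed: if it is stuck directly across a seam, that seam can be pried open; if it is stuck on the case, there is usually a screw underneath. Also, for the sake of appearance, screws exposed on the casing are sometimes covered with plastic pieces, which you can pry off to reveal them. Don't use brute force when taking a device apart; when something is hard to pry open, be sure to check for hidden screws. For example, in the picture below, there are three more screws at the positions symmetric to the three pink discs, and you have to remove the keypad to see them.

After opening it, the rest was easy: I swapped in the flex cable I bought on Taobao and the phone went back to working normally. Total cost: 23 RMB plus half an hour of labor. The flex cable in a slider phone gets twisted back and forth every day, so it really is a part that breaks easily.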

March 01, 2010 12:49 PM

February 25, 2010

Weave sync

wanglianghome

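I recently switched back from Chrome to Firefox, solely because of Weave Sync.

Weave Sync can synchronize five kinds of things: bookmarks, passwords, preferences, history, and tabs. I chose only the last two. I don't sync bookmarks because I don't keep any locally; they all go to delicious.com. I don't sync passwords because I'm still a little uneasy: although everything is said to be encrypted before being uploaded to Mozilla's servers, with something at version 1.0.1 I'd rather be cautious, since a single bug could be disastrous. I don't have any special preference settings, so I don't sync those either.

What deserves praise is that Firefox and Weave Sync both work on Nokia's Internet tablets, such as the N800 and N810, as well as on the new-generation N900 smartphone. Syncing passwords saves you the trouble of typing them on a mobile device, and syncing history and tabs lets you quickly pick up reading where you left off on another device.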

February 25, 2010 01:51 PM

February 24, 2010

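O LORD, your lovingkindness reaches to the heavens; your faithfulness reaches to the skies. Your righteousness is like the high mountains; your judgments are like the great deep. O LORD, you preserve both people and beasts.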

purl in your heart

February 24, 2010 11:49 PM

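Remember your Creator in the days of your youth, before the days of decay come and the years draw near of which you will say, "I have no pleasure in them."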

February 24, 2010 11:35 PM

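Rejoice, young man, in your youth, and let your heart cheer you in the days of your youth; walk in the ways of your heart and in the sight of your eyes, but know that for all these things God will bring you into judgment.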

February 24, 2010 11:03 PM

February 22, 2010

Notes on Nantong changpai (long cards)

rogerz

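I spent half of this New Year holiday in my hometown and half in my wife's. Besides visiting relatives and friends, another "gain" was learning changpai (long cards), a card game popular in the Nantong area.

There aren't many useful introductions to be found online. This illustrated article on Nantong "shuangjiang" changpai is relatively comprehensive. That blog also has some other related posts, such as a beginner's guide to Nantong changpai (an introduction to the card faces). Most other search results are reposts of one and the same article, nearly identical to the Nantong changpai entry on Baidu Baike.

If you know mahjong, learning changpai isn't hard: the way to complete a hand is basically the same as winning in mahjong, and only the scoring rules differ. Also, at first it takes effort to recognize the patterns that stand for numbers; there's really no rule to them, so you just have to memorize them.

Playing cards is like life: you can't control the cards you're dealt, but you can choose how to play them.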

February 22, 2010 01:05 PM

February 18, 2010

As for me, I shall behold your face in righteousness; when I awake, I shall be satisfied with your likeness.

purl in your heart

February 18, 2010 11:27 AM

We are fishing

So happy when we are young, with pure heart ...

Posted via email from purl's posterous

February 18, 2010 01:01 AM