Monday, April 30, 2012

UTF-8 Regular Expressions in PHP

While PHP itself doesn't know about different character sets and treats all characters as being one byte long, the PCRE engine understands UTF-8. There's also mb_ereg_match(), but I prefer the PCRE functions (preg_...). Here's a piece of code to see if your PHP was compiled with PCRE UTF-8 support.

$str = 'ありがとう';
echo "strlen('$str') = " . strlen($str) . "\n";
echo "preg_match_all('/./', '$str', \$matches) = " .
  preg_match_all('/./', $str, $matches) . "\n";
echo "preg_match_all('/(*UTF8)./u', '$str', \$matches) = " .
  preg_match_all('/(*UTF8)./u', $str, $matches) . "\n";

This outputs the correct length of 5 characters when you start your regular expression with (*UTF8) and use the /u modifier.

strlen('ありがとう') = 15
preg_match_all('/./', 'ありがとう', $matches) = 15
preg_match_all('/(*UTF8)./u', 'ありがとう', $matches) = 5

You can also use Unicode character properties, for example to match only letters (in any language):

// The WRONG way to do it, only works for ASCII:
preg_match_all('/[a-zA-Z]/', $str, $matches);

// This way it works with any language:
preg_match_all('/(*UTF8)\p{L}/u', $str, $matches);

You can see other Unicode character properties in the PHP Manual.

Wednesday, April 25, 2012

Logging fatal PHP errors

If you turned off the display_errors setting in your php.ini in production (as you should), then when your code dies with a fatal error, you can't see the message anywhere. It would be better to log these errors to the Apache error log. (This is useful even if you didn't disable display_errors, for debugging errors that other users report.) PHP has a log_errors directive in php.ini, but it doesn't seem to log anything for me. Instead, I used register_shutdown_function() to make PHP log the errors:

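register_shutdown_function(function() {
  // error_get_last() returns the most recent error of any severity,
  // so only log the types that actually stop the script.
  $fatal = E_ERROR | E_PARSE | E_CORE_ERROR | E_COMPILE_ERROR;
  $error = error_get_last();
  if ($error !== NULL && ($error['type'] & $fatal)) {
    error_log('PHP Fatal: file:' . $error['file'] . ' line:' . $error['line'] .
              ' type:' . $error['type'] . ' message:' . $error['message']);
  }
});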

This causes PHP to log the error on shutdown.

Saturday, March 31, 2012

Compiling PHP 5.4 on Mac OS X Lion

PHP 5.4 came out, but Apple hasn't updated Mac OS X Lion with it yet:

$ php --version
PHP 5.3.8 with Suhosin-Patch (cli) (built: Nov 15 2011 15:33:15) 
Copyright (c) 1997-2011 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2011 Zend Technologies

To compile PHP 5.4, I had to download and install libjpeg first. I downloaded it from http://www.ijg.org/files/, specifically jpegsrc.v8d.tar.gz. Compilation and installation were easy:

./configure --prefix=/usr/local/libjpeg8d
make
sudo make install

After that I configured PHP with the following options:

./configure \
  --prefix=/usr/local/php540 \
  --with-apxs2=/usr/sbin/apxs \
  --with-openssl \
  --with-pcre-regex \
  --with-zlib \
  --enable-bcmath \
  --with-bz2 \
  --enable-calendar \
  --with-curl \
  --enable-exif \
  --enable-ftp \
  --with-gd \
  --with-jpeg-dir=/usr/local/libjpeg8d \
  --with-png-dir=/usr/X11 \
  --with-freetype-dir=/usr/X11 \
  --enable-gd-native-ttf \
  --with-ldap \
  --with-ldap-sasl \
  --enable-mbstring \
  --with-mysql \
  --with-pdo-mysql \
  --with-libedit \
  --enable-pcntl \
  --enable-shmop \
  --with-snmp \
  --enable-soap \
  --enable-sockets \
  --enable-sysvmsg \
  --enable-sysvsem \
  --enable-sysvshm \
  --with-tidy \
  --enable-wddx \
  --with-xmlrpc \
  --with-xsl \
  --enable-zip

Followed by:

make
make test

Some tests failed, a few of which already had bugs created for them. I sent the failed test report to PHP so they can debug the rest of them. The CLI binary works:

$ sapi/cli/php --version
PHP 5.4.0 (cli) (built: Mar 31 2012 14:49:14) 
Copyright (c) 1997-2012 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2012 Zend Technologies

Finally, I installed PHP with:

sudo make install

Monday, February 13, 2012

Compiling Ruby MRI on Mac OS X

EDIT: include libyaml

Mac OS X Lion comes with a pretty old Ruby:

$ ruby --version
ruby 1.8.7 (2010-01-10 patchlevel 249) [universal-darwin11.0]

You can download and compile the latest Ruby from source. You'll need to have Xcode and libyaml installed first. Download libyaml from http://pyyaml.org/wiki/LibYAML and extract it in your Source directory:

$ cd Source
$ tar xzf ~/Downloads/yaml-0.1.4.tar.gz
$ cd yaml-0.1.4/
$ CC=clang ./configure --prefix=/usr/local/yaml-0.1.4
... output omitted ...
$ make
... output omitted ...
$ sudo make install
... output omitted ...

Download the latest Ruby from http://www.ruby-lang.org/en/downloads/ and then extract it in your Source directory:

$ cd Source/
$ tar xzf ~/Downloads/ruby-1.9.3-p0.tar.gz
$ cd ruby-1.9.3-p0/

Configure for compilation with clang and installation in /usr/local/:

$ CC=clang LDFLAGS=-L/usr/local/yaml-0.1.4/lib CPPFLAGS=-I/usr/local/yaml-0.1.4/include ./configure --prefix=/usr/local/ruby-1.9.3-p0
... output omitted ...

Compile:

$ make
... output omitted ...
$ make test
... output omitted ...
PASS all 943 tests
... output omitted ...
PASS all 1 tests

(I tried that with the latest stable, ruby-1.9.2-p290, and some tests failed there, so I didn't use it.)

Install:

$ sudo make install
... output omitted ...
$ /usr/local/ruby-1.9.3-p0/bin/ruby --version
ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-darwin11.2.0]

You can either always specify the path to it or you can add it to your PATH variable.

Monday, February 14, 2011

Making an executable Ruby archive distribution

As a follow-up to my previous post, you can even make a single executable file of your Ruby archive. It relies on the fact that zip files are pretty resilient to extra garbage data at the beginning. Let's say I have two files, lib/greetings.rb and bin/hello.rb. greetings.rb defines a make_hello() function that is used by hello.rb. Here's hello.rb:

require "lib/greetings"

puts "Enter your name:"
name = gets.strip

puts "\n" + make_hello(name)

You can run it like this:

$ ruby bin/hello.rb 
Enter your name:
Steve

Hello, Steve!
$
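
The post doesn't show lib/greetings.rb itself. A minimal sketch that would produce the output above could look like this; the body of make_hello() is an assumption, and only its name and the "Hello, Steve!" output come from the post:

```ruby
# lib/greetings.rb (hypothetical contents; the original file is not shown)

# Builds the greeting line that bin/hello.rb prints.
def make_hello(name)
  "Hello, #{name}!"
end
```

Because it only defines a plain top-level method, hello.rb can call make_hello() directly after the require.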

First, we make a zip file that includes all the required source files:

$ zip -r hello.zip bin lib
  adding: bin/ (stored 0%)
  adding: bin/hello.rb (deflated 11%)
  adding: lib/ (stored 0%)
  adding: lib/greetings.rb (deflated 7%)
$ unzip -l hello.zip 
Archive:  hello.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        0  02-14-11 21:01   bin/
       98  02-14-11 21:01   bin/hello.rb
        0  02-14-11 20:59   lib/
       46  02-14-11 20:59   lib/greetings.rb
 --------                   -------
      144                   4 files

Now we need a ruby header for the zip file, which we'll put into header.rb:

#!/usr/bin/ruby -rubygems -x

require "zip/ziprequire"
$:.push $0
require "bin/hello.rb"
__END__

This runs the Ruby interpreter, which loads zip/ziprequire, adds the script itself (which will be the zip file) to the load path, and then requires our main file, bin/hello.rb. We now prepend this header to the zip file using cat header.rb hello.zip > hello_tmp.zip. If you now try to run unzip -l on this file, you'll get:

$ unzip -l hello_tmp.zip 
Archive:  hello_tmp.zip
warning [hello_tmp.zip]:  97 extra bytes at beginning or within zipfile
  (attempting to process anyway)
  Length     Date   Time    Name
 --------    ----   ----    ----
        0  02-14-11 21:01   bin/
       98  02-14-11 21:01   bin/hello.rb
        0  02-14-11 20:59   lib/
       46  02-14-11 20:59   lib/greetings.rb
 --------                   -------
      144                   4 files
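
If you'd rather do the prepend step from Ruby instead of cat, something like the following sketch should be equivalent. The prepend_header name is made up for illustration, and it assumes a Ruby new enough to have File.binread and File.binwrite. It returns the header size, which should match the "extra bytes" count that unzip reports:

```ruby
# Sketch only: same effect as `cat header.rb hello.zip > hello_tmp.zip`.
# prepend_header is a hypothetical helper, not part of rubyzip.
def prepend_header(header_path, zip_path, out_path)
  # Read both files in binary mode so the zip data is not re-encoded.
  header  = File.binread(header_path)
  archive = File.binread(zip_path)

  # Write the header first, followed by the untouched archive bytes.
  File.binwrite(out_path, header + archive)

  # Number of bytes now sitting in front of the zip data.
  header.bytesize
end

# Example call (file names taken from the post):
#   prepend_header("header.rb", "hello.zip", "hello_tmp.zip")
```

The resulting file still needs the zip --fix step, since the offsets stored in the archive don't account for the bytes added in front.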

We need to fix this zip file to make it valid. Thankfully zip has an option for this:

$ zip --fix hello_tmp.zip --out hellox.zip
Fix archive (-F) - assume mostly intact archive
Zip entry offsets appear off by 97 bytes - correcting...
 copying: bin/
 copying: bin/hello.rb
 copying: lib/
 copying: lib/greetings.rb

We now have a valid zip archive in hellox.zip; we just need to make it executable by running chmod +x hellox.zip. Now your whole application is a single executable file that you can run:

$ ./hellox.zip 
Enter your name:
Mike

Hello, Mike!

Tuesday, February 8, 2011

Distributing a Ruby application as an archive

Let's say you have a Ruby application you've written, and it consists of multiple files that you require inside your code. You want to run this application on some remote machines. To make it easier to deploy, you want to distribute it as a single file (an archive). This is possible with the rubyzip gem. Let's say your main application file (myapp.rb) looks like this:

require "lib/mylib"
require "lib/otherlib"
require "vendorlib/something"

# Do stuff.

Normally you might run your application with ruby ./myapp.rb. To run it on another machine, we can use the zip/ziprequire library in rubyzip. First, make a zip file containing all your application files:

zip -r myapp.zip myapp.rb lib vendorlib

Copy myapp.zip to the remote machine, and you can run it like this:

ruby -rubygems -Imyapp.zip -e 'require "zip/ziprequire"' -e 'require "myapp"'

See the rubyzip documentation, specifically ziprequire.rb, for more information.

Tuesday, January 11, 2011

Using FileMerge with Mercurial

Mac OS X's developer tools (Xcode) include a GUI merge tool called FileMerge. It can be used for merges in Mercurial; Mercurial will open it if its internal merge fails. The binary for FileMerge (opendiff) cannot be used as is, so we need to create a small shell script. You can put this script anywhere in your path, and call it hopendiff:

#!/bin/sh
`/usr/bin/opendiff "$@"`

Then modify your ~/.hgrc and add:

[extensions]
hgext.extdiff =

[extdiff]
cmd.opendiff = hopendiff

[merge-tools]
filemerge.executable = hopendiff
filemerge.args = $local $other -ancestor $base -merge $output 

If you already have some of those sections (like [extensions]), then just add the corresponding lines to those sections.

That's it. When you do a merge in Mercurial, it'll open FileMerge if the internal merge fails. You can also use FileMerge for normal diffs by using hg opendiff instead of hg diff.