Some random bits of experience

Here you will find a few stories of the pitfalls I stumbled upon (or into) while writing the program for the "Sonnentaler" web site. Take them with a large grain of salt - I'm not sure everything I write here is correct. These are more like notes to myself about things I learned in the process (usually the hard way :-), and I am putting them here in the hope that they might be useful to others experiencing similar problems. If you find anything to be wrong, please let me know...

2008/5/27

Surprises with scalar references and `DBIx::Class`

Sometimes, even when using DBIx::Class, you still may want to pass a SQL command (or parts of one) directly through to the database, e.g. to be able to use a database function. An example is

$obj->update( { date => \'NOW()' } );

To do so you pass a scalar reference to the part of the SQL statement you want to be executed. And that works fine.
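
To illustrate the difference a scalar reference makes, here is a minimal sketch in plain Perl. Note that build_set_clause() is a hypothetical helper written just for this note, not the code DBIx::Class actually uses; it only mimics the general idea that a scalar reference is copied into the SQL statement as-is while an ordinary value is passed via a placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: build the SET part of an
# UPDATE statement from a hash reference of column => value pairs.
sub build_set_clause {
    my ($values) = @_;
    my ( @parts, @bind );

    for my $column ( sort keys %$values ) {
        my $value = $values->{$column};

        if ( ref $value eq 'SCALAR' ) {
            # Scalar reference: copy the referenced string into the SQL as-is
            push @parts, "$column = ${$value}";
        }
        else {
            # Ordinary value: use a placeholder and bind the value later
            push @parts, "$column = ?";
            push @bind,  $value;
        }
    }

    return ( join( ', ', @parts ), @bind );
}

my ( $sql, @bind ) = build_set_clause( { date => \'NOW()', title => 'Foo' } );
print "SET $sql\n";       # SET date = NOW(), title = ?
print "bind: @bind\n";    # bind: Foo
```

The database function ends up verbatim in the statement, while the title is only ever sent as a bound value.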

But it gets "interesting" when you then try to use the value of date of the object. It's not the value stored in the database (which is what you probably expect it to be, i.e. the exact date and time at which update() was called) but still the scalar reference to the string 'NOW()'.

If you want to make sure that not only the database but also the object gets updated to the new value, you have to enforce that explicitly. The simplest way to do that is

$obj->update( { date => \'NOW()' } )->discard_changes;

The discard_changes() method re-reads the object's values from the database, thus getting rid of the scalar reference that was kept as the value of date. Just be aware that discard_changes() throws away all changes made to the object's values between the last call of update() and the call of discard_changes() (thus it's best to call it immediately afterwards, as shown above)!

2008/6/13 (last changed 2012/1/5)

Catalyst, `DBIx::Class`, Template Toolkit and UTF-8

At least some of the following information is wrong (as of 2012/1/5). Please read the corrections at the end concerning the UTF8Columns package!

The program that creates the "Sonnentaler" web pages is supposed to work properly with UTF-8. That means that the content, the templates, the data in the database etc. are all UTF-8 encoded. And, of course, user input also has to be dealt with as UTF-8. That seemed to work as expected most of the time. But sometimes there were some glitches – for some strange reason some of the (correctly UTF-8 encoded) strings showed up garbled when they reached the web browser.

After quite a bit of time spent reading mailing lists and web sites I finally found out that I had gotten everything wrong. It only seemed to work because two mistakes canceled each other out - most of the time. So here's what I learned is needed to get it right - at least I hope ;-)

Let's start with the database part. Of course, I assume that the encoding of the database (we're using PostgreSQL) has been set to UTF-8. When you look into the Catalyst documentation you will find examples like this for creating a DBIC "Result Source" file (this is directly taken from the Catalyst manual):

package MyAppDB::Book;
    
use base qw/ DBIx::Class /;
    
# Load required DBIC stuff
__PACKAGE__->load_components( qw/ PK::Auto Core / );

# Set the table name
__PACKAGE__->table( 'books' );

# Set columns in table
__PACKAGE__->add_columns( qw/ id title rating / );

The problem with this is that, even when the database encoding is correctly set to UTF-8, the strings you receive from the auto-generated accessors, like the one returning the title of the book (e.g. $book->title()), aren't recognized by Perl as UTF-8 character strings but get treated as "byte strings". To change that you have to add another component, UTF8Columns, to the call of __PACKAGE__->load_components() and specify the columns that are UTF-8 strings by calling __PACKAGE__->utf8_columns() with a list of these columns. So the above code needs to be changed to

package MyAppDB::Book;

use base qw/ DBIx::Class /;

# Load required DBIC stuff
__PACKAGE__->load_components( qw/ PK::Auto Core UTF8Columns / );

# Set the table name
__PACKAGE__->table( 'books' );

# Set columns in table
__PACKAGE__->add_columns( qw/ id title rating / );

# Mark column 'title' as returning a UTF-8 string
__PACKAGE__->utf8_columns( qw/ title / );

# You can also simply specify all columns to be UTF-8 using
# __PACKAGE__->utf8_columns( __PACKAGE__->columns );

But we're not done yet. If you only get the strings returned from the database to be recognized as UTF-8 it still won't work. Two more changes are required. One thing you should do is load Catalyst::Plugin::Unicode. It makes sure that all input you receive (e.g. via $c->req->params) is treated as UTF-8 and that everything you output is UTF-8 as well. So in your MyApp.pm you need

use strict;
use warnings;

use Catalyst::Runtime '5.70';

use Catalyst qw/ -Debug ConfigLoader Static::Simple Unicode /;

__PACKAGE__->config( name => 'MyApp' );

And even more important, you need to make the Template Toolkit aware that all template files are in UTF-8. You also do that in MyApp.pm by adding a line like

__PACKAGE__->config( 'View::TT'  => { ENCODING => 'UTF-8' } );

or

__PACKAGE__->config( 'View::TT'  => { ENCODING => 'UTF8' } );

See the documentation of the Encode package for the gory details of the difference between UTF8 and UTF-8 (and their synonyms utf8 and utf-8). Basically, with the hyphen you enforce strict UTF-8 conformance, while without the hyphen the somewhat more liberal interpretation that Perl uses internally is applied. Since the data most likely will end up being sent to a browser it probably makes sense to use UTF-8 here – the browser is unlikely to understand the more relaxed Perlish notion of a UTF8 character.

Please note that you need at least version 2.15 of the Template Toolkit (if you're using Debian etch like we do, you will have to use the backport of the libtemplate-perl package). In older versions all templates containing UTF-8 encoded text needed to start with a BOM (byte order mark), at least that's what I've read in one of the mailing lists.

With this you're set for using UTF-8 throughout the program. Of course, if the program reads in files, make sure you open them in UTF-8 mode, e.g.

open my $file, "<:encoding(utf8)", $filename
    or die "Can't open $filename: $!";

And if you read from another program that outputs UTF-8 encoded data you also have to make sure the input is opened in UTF-8 mode with

open my $handle, "-|:encoding(utf8)", $cmd
    or die "Can't run $cmd: $!";

Or set the encoding for open() at the start of the file and be done with it by using

use open ':encoding(utf8)';

Update from 2012/1/5

The use of the DBIx::Class::UTF8Columns package is now strongly discouraged - it has been found to have a bug that can wreak havoc on the data in the database. If you have a newer version of that package installed

perldoc DBIx::Class::UTF8Columns

will give you the gory details.

I'm in no position to make recommendations for what will work with all kinds of databases, but at least for some (like PostgreSQL and MySQL) there's a recommended work-around. Instead of using the DBIx::Class::UTF8Columns package and the utf8_columns() function, the way to go seems to be to set a flag (to a true value) when connecting to the database. It tells the database driver that all data returned from the database is to be treated as UTF-8 and that all data sent to it is UTF-8 as well. This flag is pg_enable_utf8 for PostgreSQL and mysql_enable_utf8 for MySQL. As far as I have seen this works nicely (at least with PostgreSQL), and it actually looks more "natural" to me.
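
As a rough sketch of what setting that flag looks like, here is a plain DBI connection to PostgreSQL. The database name, user name and password are placeholders I made up, and RaiseError is just a common DBI attribute I added so that errors don't go unnoticed. Since DBIx::Class passes its connection attributes on to DBI, the same attribute hash should be usable there as well, but take that as an assumption to check against your own setup.

```perl
use strict;
use warnings;
use DBI;

# Placeholder connection data - replace with your own
my $dsn      = 'dbi:Pg:dbname=myapp';
my $user     = 'myuser';
my $password = 'secret';

my $dbh = DBI->connect(
    $dsn, $user, $password,
    {
        RaiseError     => 1,    # die on database errors
        pg_enable_utf8 => 1,    # for MySQL: mysql_enable_utf8 => 1
    },
);
```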

2010/1/4

Checks for read/write permissions of files and ACLs

If you want to know if you have read or write permissions for a file you do something like

if ( -r $filename ) {
    ...
}

and

if ( -w $filename ) {
    ....
}

– don't you?

Normally, that works quite well. But if you use ACLs (access control lists) you may be in for a surprise. By default '-r' and '-w' just test for the normal permissions, not for extra permissions granted via ACLs. Thus, if you use ACLs these tests can fail even though everything looks fine, i.e. getfacl tells you that you can access the files, and if you try to read or change them it works perfectly well etc. – only the Perl program that you carefully wrote to check access permissions before doing something with a file fails miserably.

What you need to do in that kind of situation is to tell Perl that it shouldn't do just the standard tests (i.e. just look for the traditional permissions) but also take ACLs into account. You do that with an extra pragma:

use filetest 'access';

If this pragma is used, -r and -w etc. also take ACLs into account. But take care, there still are a few pitfalls: first of all, this only works with file names, not file handles! And the stat() result cache accessed by "_" is not set. For a more comprehensive (and perhaps comprehensible) explanation see perldoc.perl.org/filetest.
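
Here is a small sketch of how this could look in practice. check_access() is a hypothetical helper of my own, and the temporary file is only there to make the example self-contained. Since the pragma is lexically scoped, it can be restricted to the function that does the checks.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Hypothetical helper: returns two flags telling if the file with the
# given name can be read and written (must be a name, not a file handle!)
sub check_access {
    my ($path) = @_;

    # Lexically scoped, only affects the tests within this function
    use filetest 'access';

    my $readable = ( -r $path ) ? 1 : 0;
    my $writable = ( -w $path ) ? 1 : 0;

    return ( $readable, $writable );
}

# Temporary file, just to have something to test
my ( $fh, $filename ) = tempfile( UNLINK => 1 );

my ( $can_read, $can_write ) = check_access($filename);
print "can read\n"  if $can_read;
print "can write\n" if $can_write;
```

Because the "_" cache isn't set, each test is done on the file name itself instead of following the first one with a test on "_".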

Last updated: 14.10.2024
