Monthly Archives: August 2020

PHP validation of UTF-8 input

For the last few weeks I have been doing some PHP programming (my web hotel, where I run WordPress, supports PHP, and it is trickier to run Node.js on a simple web hotel). I like to do input validation:

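function err($status,$msg) {
  http_response_code($status);
  // escape the message: it may contain untrusted input
  echo htmlspecialchars($msg);
}

// a missing parameter is treated as an empty (invalid) value
$configval = $_REQUEST['configval'] ?? '';
// the D modifier stops $ from also matching before a trailing newline
if ( 1 !== preg_match('/^[a-z_]+$/D',$configval) ) {
  return err(400,'invalid param value: configval=' . $configval);
}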

Well, that worked fine until I wanted to accept a name of something (like Düsseldorf, which becomes D%C3%BCsseldorf when sent from the browser to PHP). It turned out that such international characters, encoded as UTF-8, cannot be matched or tested in a nice way with plain PHP regular expressions.

PHP strings are plain sequences of bytes, and the core string functions and default regular expressions know nothing about UTF-8. So ü in this case becomes two bytes, and neither of them matches [A-Za-z] or [[:alpha:]]. However, PHP can process it as text, use it in array keys, and output valid JSON without corrupting it, so not all is lost. Just validation is hard.
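To make the byte-versus-character problem concrete, here is a small sketch (the variable names are only for illustration). It decodes the percent-encoded name and shows what PHP sees: a longer byte string that an ASCII-only pattern rejects.

```php
<?php
// What the browser sends for "Düsseldorf"
$raw  = 'D%C3%BCsseldorf';
$name = rawurldecode($raw);

// strlen() counts bytes: 10 characters, but ü takes two bytes
$bytes = strlen($name);

// The two bytes that make up ü, shown as hex
$u_hex = bin2hex(substr($name, 1, 2));

// An ASCII-only character class does not accept those bytes
$ascii_ok = preg_match('/^[A-Za-z]+$/', $name);

echo "$bytes bytes, ü = 0x$u_hex, ascii match = $ascii_ok\n";
```

The same pattern accepts "Dusseldorf" without complaint, so the failure comes only from the two bytes of ü.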

I needed to come up with something good enough for my purposes.

  • I can consider ALL such unicode characters (first byte 128+) valid (even though there may be strange characters, like extra long spaces and stuff, I don’t expect them to cause me problems if anyone bothers to enter them)
  • I don’t need to consider case of Ü/ü and Å/å
  • I don’t need full regexp support
  • It is nice to be able to check length correctly, since international characters like ü and å count as two bytes in PHP.
  • I don’t need to match specific characters in the ranges A-Z, a-z or 0-9, but when it comes to special characters: .,:,#"!@$, I want to be able to include them explicitly
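The length requirement can be illustrated with a short sketch. It assumes the input is well-formed UTF-8, and utf8_char_count is a hypothetical helper name, not part of PHP. The idea is that every character has exactly one byte that is not a continuation byte, so counting those bytes counts characters.

```php
<?php
// Count UTF-8 characters by counting every byte that is NOT a
// continuation byte (continuation bytes are 10xxxxxx, i.e. 128-191).
// Assumes $s is well-formed UTF-8; utf8_char_count is a made-up name.
function utf8_char_count($s) {
  $n = 0;
  foreach ( unpack('C*', $s) as $byte ) {
    if ( $byte < 128 || 191 < $byte ) $n++;
  }
  return $n;
}

$name = "D\xC3\xBCsseldorf";   // "Düsseldorf" written as explicit bytes
echo strlen($name), ' bytes, ', utf8_char_count($name), " characters\n";
```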

So I wrote a simple (well) validation function in PHP that accepts arguments for

  • minimum length
  • maximum length
  • valid characters for first position (optional)
  • valid characters
  • valid characters for last position (optional)

When it comes to valid characters it is simply a string where characters mean:

  • u: any unicode character
  • 0: any digit 0-9
  • A: any capital A-Z
  • a: any a-z
  • anything else matches only itself

So to match all letters, & and space: “Aau &”.

Some full examples:

utf8validate(2,10,'Aau','Aau 0','',$str)

This would match $str starting with any letter, containing letters, spaces and digits, and with a length of 2-10. It allows $str to end with a space. If you don't like that, you can do:

utf8validate(2,10,'Aau','Aau -&0','Aau0',$str)

Now the last character cannot be a space anymore, but we have also allowed - and & inside $str.

utf8validate_error

The utf8validate function returns true on success and false on failure. Sometimes you want to know why it failed to match. That is when utf8validate_error can be used instead, returning a string on error, and false on success.

Code

I am not an experienced PHP programmer, but here we go.

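// Returns true if $lbl (a UTF-8 string) is valid, false otherwise
function utf8validate($minlen, $maxlen, $first, $middle, $last, $lbl) {
  return false === utf8validate_error($minlen, $maxlen,
                                      $first, $middle, $last, $lbl);
}

// Returns false if $lbl is valid, otherwise a string describing the problem
function utf8validate_error($minlen, $maxlen, $first, $middle, $last, $lbl) {
  // unpack gives an array of byte values indexed from 1, hence $pos = 1
  $lbl_array = unpack('C*', $lbl);
  return utf8validate_a(1, 0, $minlen, $maxlen,
                        $first, $middle, $last, $lbl_array);
}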

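// Returns the number of continuation bytes (0-3) following the byte at $pos,
// -1 if the string ends in the middle of a character,
// -2 if a continuation byte is malformed,
// -3 if the byte at $pos cannot start a character.
function utf8validate_utfwidth($pos,$lbl) {
  $w = 0;
  $c = $lbl[$pos];
  // 128-191 is a stray continuation byte; 248+ never occurs in UTF-8
  if ( ( 128 <= $c && $c < 192 ) || 248 <= $c ) return -3;
  if ( 240 <= $c ) $w++;
  if ( 224 <= $c ) $w++;
  if ( 192 <= $c ) $w++;
  if ( count($lbl) < $pos + $w ) return -1;
  for ( $i=1 ;$i<=$w ; $i++ ) {
    $c = $lbl[$pos+$i];
    if ( $c < 128 || 191 < $c ) return -2;
  }
  return $w;
}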

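// Recursive worker: $pos is the 1-based byte position of the current
// character, $len the number of characters accepted so far.
function utf8validate_a($pos,$len,$minlen,$maxlen,$first,$middle,$last,$lbl) {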
  $rem = 1 + count($lbl) - $pos;
  if ( $rem + $len < $minlen )
    return 'Too short';
  if ( $rem < 0 )
    return 'Rem negative - internal error';
  if ( $rem === 0 )
    return false;
  if ( $maxlen <= $len )
    return 'Too long';

  $type = NULL;
  $utfwidth = utf8validate_utfwidth($pos,$lbl);
  if ( $utfwidth < 0 ) {
    return 'UTF-8 error: ' . $utfwidth;
  } else if ( 0 < $utfwidth ) {
    $type = 'u';
  } else {
    $cv = $lbl[$pos];
    if ( 48 <= $cv && $cv <= 57 ) $type = '0';
    else if ( 65 <= $cv && $cv <= 90 ) $type = 'A';
    else if ( 97 <= $cv && $cv <= 122 ) $type = 'a';
    else $type = pack('C',$cv);
  }

  // type is u=unicode, 0=number, a=small, A=capital, or another character

  $validstr = NULL;
  if ( 1 === $pos && '' !== $first ) {
    $validstr = $first;
  } else if ( '' === $last || $pos+$utfwidth < count($lbl) ) {
    $validstr = $middle;
  } else {
    $validstr = $last;
  }

  if ( false === strpos($validstr,$type) ) {
    return 'Pos ' . $pos . ' ('
         . ( 'u'===$type ? 'utf8-char' : pack('C',$lbl[$pos]) )
         . ') not found in [' . $validstr . ']';
  }
  return utf8validate_a(1+$pos+$utfwidth,1+$len,$minlen,$maxlen,
                        $first,$middle,$last,$lbl);
}

That is all.

Tests

I wrote some tests as well.

$err = false;
if (false!==($err=utf8validate_error(1,1,'','a','','g')))
  throw new Exception('g failed: ' . $err);
if (false===($err=utf8validate_error(1,1,'','a','','H'))) 
  throw new Exception('H should have failed');
if (false!==($err=utf8validate_error(3,20,'Aau','Aau -','Aau','Edmund')))
  throw new Exception('Edmund failed: ' . $err);
if (false!==($err=utf8validate_error(3,20,'Aau','Aau -','Aau','Kött')))
  throw new Exception('Kött failed: ' . $err);
if (false!==($err=utf8validate_error(3,20,'Aau','Aau -','Aau','Kött-Jan')))
  throw new Exception('Kött-Jan failed: ' . $err);
if (false!==($err=utf8validate_error(3,3,'A','a0','0','X10')))
  throw new Exception('X10 failed: ' . $err);
if (false!==($err=utf8validate_error(3,3,'A','a0','0','Yx1')))
  throw new Exception('Yx1 failed: ' . $err);
if (false===($err=utf8validate_error(3,3,'A','a0','0','a10')))
  throw new Exception('a10 should have failed');
if (false===($err=utf8validate_error(3,3,'A','a0','0','Aaa')))
  throw new Exception('Aaa should have failed');
if (false===($err=utf8validate_error(3,3,'A','a0','0','Ax10')))
  throw new Exception('Ax10 should have failed');
if (false===($err=utf8validate_error(3,3,'A','a0','0','B0')))
  throw new Exception('B0 should have failed');
if (false!==($err=utf8validate_error(3,3,'u','u','u','äää')))
  throw new Exception('äää failed: ' . $err);
if (false===($err=utf8validate_error(3,3,'','u','','abc'))) 
  throw new Exception('abc should have failed');
if (false!==($err=utf8validate_error(2,5,'Aau','u','Aau','XY')))
  throw new Exception('XY failed: ' . $err);
if (false===($err=utf8validate_error(2,5,'Aau','u','Aau','XxY')))
  throw new Exception('XxY should have failed');
if (false!==($err=utf8validate_error(0,5,'','0','',''))) 
  throw new Exception('"" failed: ' . $err);
if (false!==($err=utf8validate_error(0,5,'','0','','123'))) 
  throw new Exception('123 failed: ' . $err);
if (false===($err=utf8validate_error(0,5,'','0','','123456')))
  throw new Exception('123456 should have failed');
if (false===($err=utf8validate_error(2,3,'','0','','1'))) 
  throw new Exception('1 should have failed');
if (false===($err=utf8validate_error(2,3,'','0','','1234'))) 
  throw new Exception('1234 should have failed');

Conclusions

I think input validation should be taken seriously, also in PHP. And I think limiting input to ASCII is not quite enough in 2020.

There are obviously ways to work with regular expressions and UTF-8 too, but I do not find them pretty.
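For comparison, this is roughly what the regular expression route looks like. It is a sketch, assuming PCRE was built with Unicode support (the usual case): the u modifier makes preg_match treat pattern and subject as UTF-8, \p{L} matches any letter, and D keeps $ from matching before a trailing newline. The helper name name_ok and the chosen character set are only examples. Note that \p{L} is stricter than my "u" class, which accepts any non-ASCII character.

```php
<?php
// Hypothetical helper: 2-10 characters, letters/digits/space/hyphen,
// starting and ending with a letter or digit.
function name_ok($s) {
  return 1 === preg_match('/^[\p{L}0-9][\p{L}0-9 -]{0,8}[\p{L}0-9]$/uD', $s);
}

var_dump(name_ok("D\xC3\xBCsseldorf"));  // a valid name ("Düsseldorf")
var_dump(name_ok('Kiel '));               // ends with a space
var_dump(name_ok("A\xC3B"));              // broken UTF-8
```

With the u modifier, preg_match returns false (not 0) for input that is not valid UTF-8, which is why the helper compares strictly against 1.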

My code/strategy above should obviously only be used for labels and names where international characters make sense and where the form of the input is relatively free. For other parameters, use a more accurate validation method.

Air Coolers: Arctic Air & Evapolar

A few weeks every summer it gets uncomfortably hot indoors (where I live). I have no air conditioning in my home and I cannot easily install it either.

There are devices called air coolers. I have two from Arctic Air and two from Evapolar. Would I recommend them?

First, let's quickly discuss the concept of creating cooling in a warm room.

A FAN creates an air flow without cooling the air. As long as the air is significantly cooler than your body (37C) you will experience a cooling effect (perhaps even when the air is warmer – I have little experience). If the air is 25C it will tend to make your skin 25C. If a lot of air passes by your skin that effect will be significant, but if the air is completely still it is a much slower process.

An AIR CONDITIONER works like a refrigerator. It is a machine that takes in air of one temperature and outputs air in two different streams: one cooler and one warmer. The warmer stream must be removed, and the cooler stream is sent into the room (or refrigerator). This costs energy (which ends up as heat), so the warm stream gets warmer by more than the cold stream gets colder. It is important to understand that a refrigerator (or air conditioner) in a closed space heats that space. A refrigerator left open makes your room warmer. An AC that does not send the warm air out of your home makes your home warmer. Properly installed, and at a high energy cost, an AC can truly cool a room or a home.

An AIR COOLER turns water into water vapour. That comes at an energy cost, but that energy is drawn from the air, effectively making the air cooler. To speed up this process an air cooler has thin wet membranes (large water-air contact area) and a fan (so the air around the membranes is constantly warm and dry). In theory this is very smart. In practice the effect is limited (but real). Apart from the fan, an air cooler does nothing different from just hanging up your wet laundry to dry.

Humidity

An air cooler raises the humidity of the air. With my understanding of physics I would say that the energy content of the air is the same, but the temperature is lower and the humidity higher. The human body cools itself by sweating (evaporating liquid on the skin) and this process is more effective in dry air. So when considering (evaporating) air coolers I think it is important to understand that the higher humidity can make it feel warmer.

If you live in a dry place (relative humidity below 50%) you may find that the Air Cooler is good. Dry air is not nice for your skin, nose and throat. If the humidity is already very high (above 75%) the air cooler may be of no use whatsoever.

Typical Effect

It is hard to make exact and scientific experiments in your own home. Some days are warmer, sunnier, more windy or drier. I wish I could tell you that “in the week without air coolers the average temperature was 23C, and in the week with air coolers the average temperature was 21C”, but that would require large scale and controlled tests.

However, a typical air cooler (Arctic Air or Evapolar) uses a few liters of water every day (so it needs to be refilled). They have different speeds, which produce different noise and different effect. On medium speed you could expect:

  • Air in: 24C, 60% humidity
  • Air out: 21C, 75% humidity (measured just in front of the machine)
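
As a rough sanity check of the idea that evaporation trades temperature for humidity at roughly constant energy content, the sketch below estimates the enthalpy of the air going in and coming out. It relies on assumptions that are not part of the measurements above: the Magnus approximation for saturation vapour pressure, standard sea-level pressure (1013.25 hPa), and textbook constants for moist air. The function names are made up for this example.

```php
<?php
// Saturation vapour pressure in hPa (Magnus approximation), $t in degrees C
function sat_vapour_pressure($t) {
  return 6.112 * exp(17.62 * $t / (243.12 + $t));
}

// Specific enthalpy of moist air in kJ per kg of dry air.
// $t in degrees C, $rh relative humidity as a fraction (0-1),
// $p total pressure in hPa (sea level assumed).
function moist_air_enthalpy($t, $rh, $p = 1013.25) {
  $pv = $rh * sat_vapour_pressure($t);   // partial pressure of water vapour
  $w  = 0.622 * $pv / ($p - $pv);        // kg of water per kg of dry air
  return 1.006 * $t + $w * (2501 + 1.86 * $t);
}

$h_in  = moist_air_enthalpy(24, 0.60);
$h_out = moist_air_enthalpy(21, 0.75);
printf("in: %.1f kJ/kg, out: %.1f kJ/kg\n", $h_in, $h_out);
```

Under these assumptions both values should land around 50 kJ/kg and differ by only a few percent, which is less than a home measurement can resolve.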

If you use it as a fan, directing the cool air towards you, you feel much cooler than with no fan. If you leave it on 24/7 in a room, I would guess the effect is not insignificant, but it is nothing like an AC.

Arctic Air vs Evapolar

I first bought two Arctic Air units. Later I bought two Evapolar units (one evaCHILL and one evaLIGHT).

Evapolar units are more expensive. The build quality feels good, and the evaLIGHT also has thermometers for in/out air and a few other features.

What I can say is that Evapolar units are significantly quieter. So whenever noise is a problem (like when you sleep or don’t want a loud fan noise) they give you more cooling.

Evapolar indicates on their web page that they use some high-tech membrane materials that give advantages. My impression is that the Evapolar units have a stronger air flow and drink more water during a day (at similar power consumption, ~5VA).

So even though an Evapolar costs 2-5x more than the (cheap) competition, if noise matters to you, I can imagine you can get 2-5x more cooling from it on a typical day.

Power Consumption

All the units I have run on 5V USB and draw from 0.3A to a little more than 1A depending on speed (roughly 1.5W to 5W). This means you CAN run them from a power bank, a computer, or almost any USB charger.

My Arctic Air units came without power supplies. The evaCHILL comes with a USB-C connector and runs on 5V. That is the default USB-C voltage: a compliant USB-C charger only delivers higher voltages (up to 20V) after the device asks for them via USB Power Delivery. So a standards-compliant laptop charger should work, but I would not trust a cheap or non-compliant charger that puts out 20V unconditionally.

Replacing membranes

The air cooler has membranes that absorb the water to be evaporated. Only the water evaporates, and impurities remain in the membranes. So you will need to replace the membranes (at some cost) eventually.

In my case these units will be in use only a few weeks per year so I expect them to last a few years without changing membranes. I also have access to very pure tap water.

Conclusion

If air conditioning is not a possibility and humidity is reasonably low, an air cooler is probably the best you can do. If you prefer a quieter unit, go for quality (Evapolar) rather than the much cheaper alternatives.

If all you want is a cool experience at your workplace (desk), a fan might be good enough, and in that case it will be cheaper, quieter and require less maintenance. However, an air cooler will give you a slightly cooler air flow.

Goodbye One.com – Hi Inleed.se

I have been running this site on One.com for a few years. Yesterday I moved it to Inleed.se. Details about moving WordPress are found here.

On One.com

One.com has served me well. For a hobby non-profit site with a couple of dozen visitors per day the smallest package possible was fine (at about 50 Euros per year, including a domain).

A few weeks ago One.com informed me about new packages, and my little plan was being converted to the new little plan. At first it seemed fine: same price and more storage space (I use very little).

However, the day after the upgrade, I could not SSH into the server to edit a few files. It turned out SSH was no longer included in the most basic plans. In a market economy, One.com can package their services the way they want. But:

  • I had had a feature (SSH) for years, and it was taken away from me
  • To get this feature back, my costs were doubled
  • No long term fix for an old customer was offered
  • SSH costs nothing to offer, SFTP was still available, so One.com had put effort into making my little plan less useful.

I use SSH (and Linux shells) for everything. Production, test, development, professional servers, hobby servers, workstations, laptops, macOS, Linux, Windows, configuration, programming, and other work. It is just unproductive to not use SSH to

  • edit text files (.html, .js, .php, .htaccess, and so on)
  • check and fix file permissions
  • pack/unpack files
  • manage folders and files

So, as a customer, after expressing my dissatisfaction and getting no long term solution, I vote with my wallet and find another hosting company.

On Inleed.se

I have been a customer of Inleed.se for a few days so I can’t really write a review. But what I immediately notice is that compared to One.com

  • cheapest package is half price
  • …and includes much less storage
  • I get a cgi-bin folder (I don’t think One.com offered that, but not completely sure)
  • I get more than one database if I need
  • I can use SSH 🙂
  • … and there is some kind of Node.js support: very interesting, I need to look into it!

So far so good!

Move WordPress to new Domain

I gave up my old web hotel (one.com) and moved to a new one (inleed.se) (read more about why here). As a WordPress blog owner who is not very familiar with MariaDB, Apache and PHP, this can seem a bit scary.

However, it went quite fine. With a new web hotel and a database ready, these were basically the tools and steps required:

  1. Use WordPress Plugin Duplicator to produce a complete backup (a downloadable zip file) and a downloadable installer (installer.php)
  2. Configure .htaccess on new server to forward requests to a wordpress folder
  3. Upload backup-zip-file and installer.php to wordpress folder on new server
  4. Run installer (go to http://newsite/installer.php), follow instructions
  5. Use WordPress Plugin Velvet Blues Update Urls to make sure all my links point to newsite rather than oldsite.
  6. Create a .htaccess on old server to permanently forward traffic to new domain

I ended up doing this thing twice, learning the first time and perfecting it the second time.

.htaccess on new server

The purpose of this is to place WordPress in its own directory, while still not needing to expose that directory in the URLs.

<IfModule mod_rewrite.c>
RewriteEngine on
# Send requests for this host into /techfindings/, unless they already
# point there or match an existing file or directory
RewriteCond %{HTTP_HOST} ^(www\.)?techfindings\.net$
RewriteCond %{REQUEST_URI} !^/techfindings/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /techfindings/$1
# Serve the site root from WordPress' index.php
RewriteCond %{HTTP_HOST} ^(www\.)?techfindings\.net$
RewriteRule ^(/)?$ techfindings/index.php [L]
</IfModule>

So, WordPress is entirely installed in a directory named techfindings, but behaves as if it were in my root. Any other page is served normally.

.htaccess on old server

I don’t want people to access the old site when the new site is up. This was a pretty simple and effective .htaccess file:

RedirectPermanent / https://techfindings.net/

This will be in place as long as my old domain is valid and hosted on the old web hotel.

Conclusion

Moving WordPress from one domain and server to another domain and server is perfectly possible, with a good result.