UTF-8 and other encoding improvements

Hyvätti
Posts: 8
Joined: Thu Jan 05, 2012 12:03 pm

Post by Hyvätti »

Hi,

To get translations, email piping and email sending working with UTF-8, I'm in the process of improving the encoding support in HESK 2.3. Getting all email headers right will take a little more effort, but here's what I have so far. The outgoing Subject: line is fixed: a subject containing non-ASCII characters is now sent as an RFC 2047 encoded-word. Decoding a person name with non-ASCII characters in the incoming From: header is not done yet. The HTTP header changes send the charset in the Content-Type response header instead of relying only on the <meta> tag, which should let UTF-8 work across PHP environments, servers and browsers.
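
For readers who want to see the idea behind the Subject: fix in isolation, here is a minimal sketch of RFC 2047 "Q" encoding. It is not the patch itself: `encode_header_q()` is a hypothetical helper name, the UTF-8 default is an assumption (HESK takes the charset from `$hesklang['ENCODING']`), and the sketch does not split long subjects into several encoded-words, although RFC 2047 limits each one to 75 characters.

```php
<?php
/**
 * Encode a header value as an RFC 2047 "Q" encoded-word when needed.
 * Hypothetical helper for illustration; not part of HESK.
 */
function encode_header_q(string $text, string $charset = 'UTF-8'): string
{
    // Printable ASCII only: the header can be sent as it is.
    if (preg_match('/^[\x20-\x7E]*\z/', $text)) {
        return $text;
    }

    // Hex-encode every byte except letters, digits and spaces.
    // This also covers "=", "?" and "_", which are special in an encoded-word.
    $encoded = preg_replace_callback(
        '/[^A-Za-z0-9 ]/',
        function (array $m): string {
            return sprintf('=%02X', ord($m[0]));
        },
        $text
    );

    // In "Q" encoding a space is written as an underscore.
    return '=?' . $charset . '?Q?' . str_replace(' ', '_', $encoded) . '?=';
}
```

For example, `encode_header_q("Hyv\xC3\xA4tti")` returns `=?UTF-8?Q?Hyv=C3=A4tti?=`, while a plain ASCII subject comes back unchanged.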

I also plan to create a Finnish translation. It is about 70% done, and I will post it later this spring.

The diff below will probably not copy cleanly from the forum, so get the patch from http://www.iki.fi/hyvatti/sw/hesk23-a.diff

diff -ru /tmp/h/inc/email_functions.inc.php ./inc/email_functions.inc.php
--- /tmp/h/inc/email_functions.inc.php	2011-10-19 19:08:58.000000000 +0300
+++ ./inc/email_functions.inc.php	2012-01-05 08:42:37.320663615 +0200
@@ -48,6 +48,12 @@
 function hesk_mail($to,$subject,$message) {
 	global $hesk_settings, $hesklang;
 
+    $subject2 = quoted_printable_encode($subject);
+    if ($subject2 != $subject) {
+      $subject2 = str_replace (array("=\r\n", '?', '_', ' '), array('', '=3F', '=5F', '_'), $subject2);
+      $subject = "=?" . $hesklang['ENCODING'] . "?Q?" . $subject2 . "?=";
+    }
+
     /* Use PHP's mail function */
 	if ( ! $hesk_settings['smtp'])
     {
diff -ru /tmp/h/inc/header.inc.php ./inc/header.inc.php
--- /tmp/h/inc/header.inc.php	2011-09-15 21:36:06.000000000 +0300
+++ ./inc/header.inc.php	2011-12-19 07:07:28.107662879 +0200
@@ -35,9 +35,10 @@
 /* Check if this is a valid include */
 if (!defined('IN_SCRIPT')) {die('Invalid attempt');}
 
+header('Content-Type: text/html; charset='.$hesklang['ENCODING']);
 ?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="<?php echo $hesk_settings['languages'][$hesk_settings['language']]['folder'] ?>" lang="<?php echo $hesk_settings['languages'][$hesk_settings['language']]['folder'] ?>">
 <head>
 	<title><?php echo (isset($hesk_settings['tmp_title']) ? $hesk_settings['tmp_title'] : $hesk_settings['hesk_title']); ?></title>
 	<meta http-equiv="Content-Type" content="text/html;charset=<?php echo $hesklang['ENCODING']; ?>" />
diff -ru /tmp/h/print.php ./print.php
--- /tmp/h/print.php	2011-09-15 21:36:08.000000000 +0300
+++ ./print.php	2011-12-19 06:58:53.475662759 +0200
@@ -72,6 +72,7 @@
 $sql = "SELECT * FROM `".hesk_dbEscape($hesk_settings['db_pfix'])."replies` WHERE `replyto`='".hesk_dbEscape($ticket['id'])."' ORDER BY `id` ASC";
 $res  = hesk_dbQuery($sql);
 $replies = hesk_dbNumRows($res);
+header('Content-Type: text/html; charset='.$hesklang['ENCODING']);
 ?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
 <html>

I hope this helps the developers or someone else. If you can figure out how to decode the sender's name correctly from incoming encoded email headers, I would appreciate it.

Regards,
Jaakko
Tel. +358 40 5011222
Klemen
Site Admin
Posts: 10145
Joined: Fri Feb 11, 2005 4:04 pm

Re: UTF-8 and other encoding improvements

Post by Klemen »

Thanks for sharing your work.

I plan to move HESK to UTF-8 exclusively for all languages in the future (I should have done this from the start; it would have saved a lot of trouble). I'm not sure whether version 2.4 will already have this, but it will definitely be done.

Incoming messages are a delicate matter: they can arrive in any encoding, which you then have to convert to UTF-8 (for example with mb_convert_encoding()). The "mime_parser.php" class (in inc/mail) is very powerful, so you may want to dig through it, as it may help you.
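
A rough sketch of that decoding step, in case it helps: PHP's iconv extension provides `iconv_mime_decode()`, which decodes RFC 2047 encoded-words and converts the result to a target charset in one call. The helper name `decode_sender_name()` and the UTF-8 default are assumptions made for illustration. This is not HESK code and it is not wired into mime_parser.php.

```php
<?php
/**
 * Decode an RFC 2047 encoded display name (as found in a From: header)
 * into the target charset. Hypothetical helper; assumes ext/iconv is loaded.
 */
function decode_sender_name(string $raw, string $target = 'UTF-8'): string
{
    // CONTINUE_ON_ERROR asks iconv to keep going past malformed encoded-words
    // instead of giving up on the whole header.
    $decoded = iconv_mime_decode($raw, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, $target);

    // Fall back to the raw value if decoding failed outright.
    return $decoded === false ? $raw : $decoded;
}
```

A name that is already plain ASCII passes through unchanged, so the helper can be applied to every incoming From: name without checking first.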

Sorry I can't give any specific help at the moment, but when I get to tackle this, it will surely be included in HESK.
Klemen, creator of HESK and PHPJunkyard

Hyvätti
Posts: 8
Joined: Thu Jan 05, 2012 12:03 pm

Re: UTF-8 and other encoding improvements

Post by Hyvätti »

Hi,

I finally fixed the last thing that did not support non-ASCII characters: the sender name in incoming email. Customer names no longer get mangled. The diff is below. The Finnish translation is also done, as you can see in the AddOns section. The diff also adds some date formats to the calendar input, most notably ISO 8601 (yyyy-mm-dd), alongside the dotted dd.mm.yyyy form.
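
To make the accepted date formats explicit, here is the same matching rule as a small PHP sketch. The patch itself does this in JavaScript in calendar_js.php; `parse_ticket_date()` is a hypothetical name, and for simplicity the sketch requires four-digit years, so it leaves out the two-digit year expansion that the calendar script performs.

```php
<?php
/**
 * Parse a date typed as m/d/yyyy, d.m.yyyy or yyyy-m-d.
 * Returns [year, month, day], or null when the input is not a valid date.
 * Hypothetical helper for illustration; not part of HESK.
 */
function parse_ticket_date(string $input): ?array
{
    if (preg_match('#^\s*(\d{1,2})/(\d{1,2})/(\d{4})\s*\z#', $input, $m)) {
        // Original HESK format: month first.
        [$month, $day, $year] = [(int) $m[1], (int) $m[2], (int) $m[3]];
    } elseif (preg_match('#^\s*(\d{1,2})\.\s*(\d{1,2})\.\s*(\d{4})\s*\z#', $input, $m)) {
        // Dotted format as used in Finland: day first.
        [$day, $month, $year] = [(int) $m[1], (int) $m[2], (int) $m[3]];
    } elseif (preg_match('#^\s*(\d{4})-(\d{1,2})-(\d{1,2})\s*\z#', $input, $m)) {
        // ISO 8601: year first.
        [$year, $month, $day] = [(int) $m[1], (int) $m[2], (int) $m[3]];
    } else {
        return null;
    }

    // Reject impossible dates such as 31 February.
    return checkdate($month, $day, $year) ? [$year, $month, $day] : null;
}
```

The separator decides the field order, so `1/2/2012` is read as 2 January and `1.2.2012` as 1 February.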

http://www.iki.fi/hyvatti/sw/hesk23-fi.diff

diff -ru /tmp/h/inc/calendar/calendar_js.php ./inc/calendar/calendar_js.php
--- /tmp/h/inc/calendar/calendar_js.php	2011-09-15 21:36:06.000000000 +0300
+++ ./inc/calendar/calendar_js.php	2011-12-20 03:56:19.100948433 +0200
@@ -61,11 +61,23 @@
 function f_tcalParseDate (s_date) {
 
         var re_date = /^\s*(\d{1,2})\/(\d{1,2})\/(\d{2,4})\s*$/;
-        if (!re_date.exec(s_date))
+        var re_date1 = /^\s*(\d{1,2})\.\s*(\d{1,2})\.\s*(\d{2,4})\s*$/;
+        var re_date2 = /^\s*(\d{2,4})-(\d{1,2})-(\d{1,2})\s*$/;
+        var n_day, n_month, n_year;
+        if (re_date.exec(s_date)) {
+	  n_day = Number(RegExp.$2);
+	  n_month = Number(RegExp.$1);
+	  n_year = Number(RegExp.$3);
+	} else if (re_date1.exec(s_date)) {
+	  n_day = Number(RegExp.$1);
+	  n_month = Number(RegExp.$2);
+	  n_year = Number(RegExp.$3);
+	} else if (re_date2.exec(s_date)) {
+	  n_day = Number(RegExp.$3);
+	  n_month = Number(RegExp.$2);
+	  n_year = Number(RegExp.$1);
+	} else
                 return alert ("<?php echo $hesklang['cinv']; ?>: '" + s_date + "'.\n<?php echo $hesklang['cinv2']; ?>.")
-        var n_day = Number(RegExp.$2),
-                n_month = Number(RegExp.$1),
-                n_year = Number(RegExp.$3);
 
         if (n_year < 100)
                 n_year += (n_year < this.a_tpl.centyear ? 2000 : 1900);
diff -ru /tmp/h/inc/email_functions.inc.php ./inc/email_functions.inc.php
--- /tmp/h/inc/email_functions.inc.php	2011-10-19 19:08:58.000000000 +0300
+++ ./inc/email_functions.inc.php	2012-01-11 12:57:52.440440463 +0200
@@ -48,6 +48,12 @@
 function hesk_mail($to,$subject,$message) {
 	global $hesk_settings, $hesklang;
 
+    $subject2 = quoted_printable_encode($subject);
+    if ($subject2 != $subject) {
+      $subject2 = str_replace (array("=\r\n", '?', '_', ' '), array('', '=3F', '=5F', '_'), $subject2);
+      $subject = "=?" . $hesklang['ENCODING'] . "?Q?" . $subject2 . "?=";
+    }
+
     /* Use PHP's mail function */
 	if ( ! $hesk_settings['smtp'])
     {
@@ -82,7 +88,7 @@
                 "Reply-To: $hesk_settings[noreply_mail]",
                 "Return-Path: $hesk_settings[webmaster_mail]",
 				"Subject: " . $subject,
-				"Date: ".strftime("%a, %d %b %Y %H:%M:%S %Z"),
+				"Date: ".date(DATE_RFC2822),
                 "Content-Type: text/plain; charset=".$hesklang['ENCODING']
 			), $message))
     {
diff -ru /tmp/h/inc/header.inc.php ./inc/header.inc.php
--- /tmp/h/inc/header.inc.php	2011-09-15 21:36:06.000000000 +0300
+++ ./inc/header.inc.php	2011-12-19 07:07:28.107662879 +0200
@@ -35,9 +35,10 @@
 /* Check if this is a valid include */
 if (!defined('IN_SCRIPT')) {die('Invalid attempt');}
 
+header('Content-Type: text/html; charset='.$hesklang['ENCODING']);
 ?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="<?php echo $hesk_settings['languages'][$hesk_settings['language']]['folder'] ?>" lang="<?php echo $hesk_settings['languages'][$hesk_settings['language']]['folder'] ?>">
 <head>
 	<title><?php echo (isset($hesk_settings['tmp_title']) ? $hesk_settings['tmp_title'] : $hesk_settings['hesk_title']); ?></title>
 	<meta http-equiv="Content-Type" content="text/html;charset=<?php echo $hesklang['ENCODING']; ?>" />
diff -ru /tmp/h/print.php ./print.php
--- /tmp/h/print.php	2011-09-15 21:36:08.000000000 +0300
+++ ./print.php	2011-12-19 06:58:53.475662759 +0200
@@ -72,6 +72,7 @@
 $sql = "SELECT * FROM `".hesk_dbEscape($hesk_settings['db_pfix'])."replies` WHERE `replyto`='".hesk_dbEscape($ticket['id'])."' ORDER BY `id` ASC";
 $res  = hesk_dbQuery($sql);
 $replies = hesk_dbNumRows($res);
+header('Content-Type: text/html; charset='.$hesklang['ENCODING']);
 ?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
 <html>
diff -ru /tmp/h/inc/mail/email_parser.php ./inc/mail/email_parser.php
--- /tmp/h/inc/mail/email_parser.php	2011-09-15 21:36:06.000000000 +0300
+++ ./inc/mail/email_parser.php	2012-01-15 00:07:27.809378260 +0200
@@ -171,14 +171,17 @@
   foreach($email_info as $info){
     $address = "";
     $name = "";
+    $encoding = "";
     if ( array_key_exists("address", $info) ){
       $address = $info["address"];
     }
     if ( array_key_exists("name", $info) ){
       $name = $info["name"];
     }
-    
-    $result[] = array("address"=>$address,"name"=>$name);
+    if ( array_key_exists("encoding", $info) ){
+      $encoding = $info["encoding"];
+    }
+    $result[] = array("address"=>$address,"name"=>$name,"encoding"=>$encoding);
   }
 
   return $result;
diff -ru /tmp/h/inc/mail/hesk_pipe.php ./inc/mail/hesk_pipe.php
--- /tmp/h/inc/mail/hesk_pipe.php	2012-01-05 14:16:22.529670777 +0200
+++ ./inc/mail/hesk_pipe.php	2012-01-15 00:07:22.335410549 +0200
@@ -58,6 +58,13 @@
 
 /* Variables */
 $tmpvar['name']	    = hesk_input($results['from'][0]['name']) or $tmpvar['name'] = $hesklang['unknown'];
+if (!empty($results['from'][0]['encoding']))
+{
+	if (strtolower($results['from'][0]['encoding']) != strtolower($hesklang['ENCODING']))
+    {
+		$tmpvar['name']=mb_convert_encoding($tmpvar['name'],$hesklang['ENCODING'],$results['from'][0]['encoding']);
+    }
+}
 $tmpvar['email']	= hesk_validateEmail($results['from'][0]['address'],'ERR',0);
 $tmpvar['category'] = 1;
 $tmpvar['priority'] = 3;
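
One thing to watch with the hesk_pipe.php change: the charset name comes from the incoming email, so it may be one that mbstring does not know. Below is a minimal sketch of a more defensive conversion. `convert_sender_name()` is a hypothetical helper, not part of HESK or of the diff above, and it assumes that falling back to the unconverted name is acceptable.

```php
<?php
/**
 * Convert a sender name from the charset announced in the email to the
 * helpdesk charset, keeping the original name if the charset is unusable.
 * Hypothetical helper; assumes ext/mbstring is loaded.
 */
function convert_sender_name(string $name, string $from, string $to = 'UTF-8'): string
{
    // Nothing to do when the charset is missing or already the target.
    if ($from === '' || strcasecmp($from, $to) === 0) {
        return $name;
    }

    try {
        // For an unknown charset PHP 7 returns false with a warning,
        // while PHP 8 throws ValueError. Both cases are handled here.
        $converted = @mb_convert_encoding($name, $to, $from);
    } catch (\ValueError $e) {
        return $name;
    }

    return is_string($converted) ? $converted : $name;
}
```

With this in place, a From: header that announces a misspelled or exotic charset yields the raw name instead of an error while the email is being piped.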