Matching Chinese Characters with PHP Regular Expressions (phpemail regex) - Technical Tutorials - 四时宝库

The regular expression for matching Chinese characters differs slightly depending on the page encoding:

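GBK/GB2312 and UTF-8: [\x80-\xff]+ works as a byte-level pattern under either encoding (recommended for GBK). [\xa1-\xff]+ is only reliable for GB2312, where both bytes of a character lie in 0xA1-0xFE; UTF-8 continuation bytes can be as low as 0x80. Both patterns match any run of non-ASCII bytes, not Chinese characters specifically.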

UTF-8 only: /[\x{4e00}-\x{9fa5}]+/u
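The difference between the two patterns is easier to see on a string that mixes Chinese with another non-ASCII character. The following is a minimal sketch, assuming the source file is saved as UTF-8; `highByteRuns` and `cjkRuns` are hypothetical helper names introduced only for illustration.

```php
<?php
// Sketch only: compares the byte-range pattern with the Unicode-range pattern.
// Assumes this file and its string literals are UTF-8 encoded.

// Every byte of a multi-byte character is >= 0x80, so this pattern
// matches ANY run of non-ASCII bytes, whatever the script.
function highByteRuns(string $s): array
{
    preg_match_all('/[\x80-\xff]+/', $s, $m);
    return $m[0];
}

// With the /u modifier the class is a range of code points,
// so only U+4E00..U+9FA5 is matched.
function cjkRuns(string $s): array
{
    preg_match_all('/[\x{4e00}-\x{9fa5}]+/u', $s, $m);
    return $m[0];
}

$sample = "café 中文";
print_r(array_map('bin2hex', highByteRuns($sample))); // hex of "é" and of "中文"
print_r(cjkRuns($sample));                            // only "中文"
```

The byte-range version also picks up "é", which is why it should be read as "non-ASCII" matching and not as Chinese matching.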

Examples:

With GBK:

<?php
$str = "学习php是一件快乐的事。";
// Byte-range pattern: the result is the same whether the file is saved as GBK or UTF-8
preg_match_all('/[\x80-\xff]+/', $str, $match);
print_r($match);
?>

Array ( [0] => Array ( [0] => 学习 [1] => 是一件快乐的事。 ) )

With UTF-8:

<?php
$str = "学习php是一件快乐的事。";
// Only works when the string is UTF-8 encoded
preg_match_all('/[\x{4e00}-\x{9fa5}]+/u', $str, $match);
print_r($match);
?>

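Output:

Array
(
    [0] => Array
        (
            [0] => 学习
            [1] => 是一件快乐的事
        )

)

The full stop 。 (U+3002) lies outside U+4E00-U+9FA5, so unlike the byte-range pattern this one does not include it in the match.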
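The fixed range U+4E00-U+9FA5 covers the common CJK Unified Ideographs block but not characters in the extension blocks. As an alternative (a swapped-in technique, not the pattern used in this article), PCRE's Unicode script property `\p{Han}` can be used with the `/u` modifier. The sketch below assumes UTF-8 input and a PCRE build with Unicode support; `hanRuns` is a hypothetical helper name.

```php
<?php
// Sketch only: match Han-script characters by Unicode property instead of a fixed range.
// Assumes UTF-8 input and PCRE compiled with Unicode support.

function hanRuns(string $s): array
{
    // \p{Han} matches characters whose Unicode script is Han,
    // including ideographs outside U+4E00..U+9FA5 such as U+3400.
    preg_match_all('/\p{Han}+/u', $s, $m);
    return $m[0];
}

print_r(hanRuns("学习php是一件快乐的事。"));
// The full stop "。" (U+3002) has script Common, so it is not part of any match.

print_r(hanRuns("㐀"));
// U+3400 is in CJK Extension A: matched here, missed by [\x{4e00}-\x{9fa5}].
```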

Example 1: removing Chinese characters (works with both GB2312 and UTF-8)

<?php
$string = "中华教具网www.cnjiaju.com";
// Strip every run of non-ASCII bytes, which removes the Chinese characters
$str = preg_replace('/[\x80-\xff]+/', '', $string);
echo $str; // www.cnjiaju.com
?>
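Because the byte-range pattern removes every non-ASCII byte, it also strips accented letters and other non-Chinese text from a UTF-8 string. A minimal sketch of a UTF-8-only variant that removes just the Chinese range follows; `stripChinese` is a hypothetical helper name, and the file is assumed to be saved as UTF-8.

```php
<?php
// Sketch only: remove Chinese characters from a UTF-8 string, leaving other text intact.

function stripChinese(string $s): string
{
    // With /u, preg_replace returns null if $s is not valid UTF-8;
    // in that case fall back to returning the input unchanged.
    $result = preg_replace('/[\x{4e00}-\x{9fa5}]+/u', '', $s);
    return $result ?? $s;
}

echo stripChinese("中华教具网www.cnjiaju.com"), "\n"; // www.cnjiaju.com
echo stripChinese("café 中文"), "\n";                  // "café " (the é survives)

// For comparison, the byte-range pattern also deletes the two bytes of "é":
echo preg_replace('/[\x80-\xff]+/', '', "café 中文"), "\n"; // "caf "
```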

Example 2 (UTF-8 only):

<?php
$string = "中华教具网www.cnjiaju.com";
echo preg_match('/([\x{4e00}-\x{9fa5}])+/u', $string, $match); // 1 (a match was found)
var_export($match); // $match is an array, so echo cannot print its contents
/*
array (
  0 => '中华教具网',
  1 => '网',
)
*/
?>
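In the result above, group 1 holds only '网' because the quantifier sits outside the parentheses, so the group is overwritten on each repetition and keeps the last character. The sketch below moves the quantifier inside the group and also shows a boolean check built on the return value of `preg_match` (1 for a match, 0 for none, false on error). `hasChinese` and `firstChineseRun` are hypothetical helper names, and UTF-8 input is assumed.

```php
<?php
// Sketch only: two small helpers around preg_match for UTF-8 strings.

// True only when at least one character in U+4E00..U+9FA5 is present.
// Invalid UTF-8 makes preg_match return false, which is reported as "no Chinese" here.
function hasChinese(string $s): bool
{
    return preg_match('/[\x{4e00}-\x{9fa5}]/u', $s) === 1;
}

// Returns the first run of Chinese characters, or null if there is none.
function firstChineseRun(string $s): ?string
{
    // The + is inside the parentheses, so group 1 captures the whole run.
    if (preg_match('/([\x{4e00}-\x{9fa5}]+)/u', $s, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(hasChinese("中华教具网www.cnjiaju.com"));      // bool(true)
var_dump(hasChinese("www.cnjiaju.com"));                // bool(false)
var_dump(firstChineseRun("中华教具网www.cnjiaju.com")); // string "中华教具网"
```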

Example 3: matching Chinese characters (UTF-8)

<?php
$platform_name = '<li>赶集网 <a href="http://www.ganji.com" target="_blank" rel="nofollow"><font color="#0033CC">http://www.ganji.com</font></a></li>';
$count = preg_match_all('/[\x{4e00}-\x{9fa5}]+/u', $platform_name, $matches); // regular expression that matches Chinese
echo $count."<br>"; // number of matches
var_export($matches); // the matched text
?>

JavaScript version

var reg=/[\u4e00-\u9fa5]+/;

<script>
    var str = "ftgfg风缘择敏hjkhj";
    var reg = /[\u4e00-\u9fa5]+/;
    if (reg.exec(str)) {
        alert('Contains Chinese');
    }
    else {
        alert('No Chinese');
    }
</script>

Summary of patterns for matching Chinese:

JavaScript: [\u4e00-\u9fa5]
PHP: [\x80-\xff] works with both GBK and UTF-8 (recommended for GBK)
PHP: [\x{4e00}-\x{9fa5}] with the /u modifier, UTF-8 only
