Detecting garbled Chinese text (mojibake) in a Java string
1. Detect whether a string is garbled
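// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static boolean isMessyCode(String strName) {
    // Strip all whitespace (spaces, tabs, carriage returns, newlines).
    Pattern p = Pattern.compile("\\s*|\t*|\r*|\n*");
    Matcher m = p.matcher(strName);
    String after = m.replaceAll("");
    // Strip punctuation so that ordinary punctuation is not counted as garbage.
    String temp = after.replaceAll("\\p{P}", "");
    char[] ch = temp.trim().toCharArray();
    float chLength = 0; // characters that are neither letters nor digits
    float count = 0;    // how many of those are also not Chinese
    for (int i = 0; i < ch.length; i++) {
        char c = ch[i];
        if (!Character.isLetterOrDigit(c)) {
            if (!isChinese(c)) {
                count = count + 1;
            }
            chLength++;
        }
    }
    // Nothing suspicious was found; this also avoids dividing 0 by 0.
    if (chLength == 0) {
        return false;
    }
    // Treat the string as garbled when more than 40% of the
    // non-letter, non-digit characters are not Chinese.
    float result = count / chLength;
    return result > 0.4;
}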
2. Check whether a character is Chinese
private static boolean isChinese(char c) {
    // Look up the Unicode block that contains the character.
    Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
    // CJK ideographs, plus the punctuation and full-width forms
    // that commonly appear in Chinese text, all count as Chinese.
    if (ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
            || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
            || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
            || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS) {
        return true;
    }
    return false;
}
3. Convert the encoding of Chinese text
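// Requires: import java.nio.charset.StandardCharsets;
public static String toChinese(String msg) {
    if (isMessyCode(msg)) {
        // The text was decoded as ISO-8859-1 by mistake: recover the
        // original bytes and decode them as UTF-8 instead.
        return new String(msg.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }
    return msg;
}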
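The conversion in section 3 works because decoding bytes as ISO-8859-1 maps every byte to exactly one character, so the original bytes can be recovered without loss. The following is a minimal sketch of that round trip; the class name MojibakeRoundTrip and the helper names garble and repair are made up for illustration and are not part of the original code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how UTF-8 text turns into mojibake
// when decoded as ISO-8859-1, and how the mistake can be reversed.
class MojibakeRoundTrip {

    // Simulates the mistake: UTF-8 bytes are decoded as ISO-8859-1.
    static String garble(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recover the bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two characters of "Chinese"
        String garbled = garble(original);
        // Each Chinese character takes 3 bytes in UTF-8, so the garbled
        // string has one character per byte.
        System.out.println("garbled length: " + garbled.length());
        System.out.println("repaired equals original: " + repair(garbled).equals(original));
    }
}
```

The repair only makes sense when the text really was UTF-8 decoded as ISO-8859-1; applied to a string that was decoded correctly, it would corrupt the text, which is why toChinese calls isMessyCode first.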
Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS: 4E00-9FFF: CJK Unified Ideographs
Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS: F900-FAFF: CJK Compatibility Ideographs
Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A: 3400-4DBF: CJK Unified Ideographs Extension A
CJK is short for "Chinese, Japanese, Korean"; in practice it refers to the Unicode encoding of the ideographic characters used in those three languages.
Character.UnicodeBlock.GENERAL_PUNCTUATION: 2000-206F: General Punctuation
Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION: 3000-303F: CJK Symbols and Punctuation
Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS: FF00-FFEF: Halfwidth and Fullwidth Forms
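To see how characters map onto the blocks listed above, here is a small sketch that looks up the block of a few sample characters with Character.UnicodeBlock.of. The class name UnicodeBlockLookup, the helper blockOf, and the sample characters are chosen only for illustration.

```java
// Hypothetical demo class: prints the Unicode block of a few characters.
class UnicodeBlockLookup {

    // Thin wrapper around the standard lookup, kept only for readability.
    static Character.UnicodeBlock blockOf(char c) {
        return Character.UnicodeBlock.of(c);
    }

    public static void main(String[] args) {
        char[] samples = {
            '\u4e2d', // a common Chinese ideograph
            '\u3400', // first code point of Extension A
            '\u3002', // ideographic full stop
            '\uff0c', // full-width comma
            '\u2014', // a dash from General Punctuation
            'A'       // plain ASCII letter, outside all listed blocks
        };
        for (char c : samples) {
            System.out.println(String.format("U+%04X", (int) c) + " -> " + blockOf(c));
        }
    }
}
```

Comparing the result with the constants using ==, as isChinese does, is safe because each block is a single shared instance.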
Character.isLetter(c): checks whether the character is a letter
Character.isDigit(c): checks whether the character is a digit
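One detail worth noting is that Java treats CJK ideographs as letters, so in the loop of section 1 only symbols and other non-letter, non-digit characters reach the Chinese check. The sketch below illustrates this; the class name LetterDigitCheck and the helper isSymbolCandidate are hypothetical names used only here.

```java
// Hypothetical demo class: shows which characters count as letters or digits.
class LetterDigitCheck {

    // Mirrors the condition in the detection loop: true when the character
    // is neither a letter nor a digit and therefore gets inspected further.
    static boolean isSymbolCandidate(char c) {
        return !Character.isLetterOrDigit(c);
    }

    public static void main(String[] args) {
        // A Chinese ideograph is a letter as far as Character is concerned.
        System.out.println("ideograph is letter: " + Character.isLetter('\u4e2d'));
        System.out.println("'7' is digit: " + Character.isDigit('7'));
        // The currency sign U+00A4 is a symbol, a typical mojibake leftover.
        System.out.println("U+00A4 is candidate: " + isSymbolCandidate('\u00a4'));
        System.out.println("'a' is candidate: " + isSymbolCandidate('a'));
    }
}
```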