Problem description
I am parsing XML with simplexml_load_string() and using the data within it to update Active Directory (AD) objects via LDAP.
Example XML (simplified):
<?xml version="1.0" encoding="UTF-8"?>
<users>
<user>Bìlbö Bággįnš</user>
<user>Gãńdåłf Thê Gręât</user>
<user>Śām Wīšë</user>
</users>
I first run ldap_search() to find a single user and then change their attributes. Writing the above values straight into AD over LDAP results in mangled characters.
For example: Bìlbö Bággįnš
I've tried the following functions, to no avail:
utf8_encode($str);
utf8_decode($str);
iconv("UTF-8", "ISO-8859-1//TRANSLIT", $str);
iconv("UTF-8", "ASCII//TRANSLIT", $str);
iconv("UTF-8", "T.61", $str);
Ideally, I don't want to do any of these string conversions. UTF-8 should be fine, right?!
I've also noticed the following: I have printed out the values to see how they come out. curl-ing the script in CLI will show the correct characters, but web browsers show the same as AD.
What's going on? Should I be looking at something else, e.g. URL encoding? I'm hoping this is down to a simple mistake on my end.
EDIT:
I entered these characters using the AD admin GUI to see how they would come out. I can read them via LDAP fine, and the correct characters are displayed in a browser. curl-ing via CLI shows question marks instead of the accented characters. Passing one of these returned values into mb_detect_encoding() returns UTF-8.
I decided to immediately modify the same object by not writing in a new string, but just reversing the existing value and saving the object. This works fine - I see the correct value (reversed) in AD.
- Developing on Mac OS X 10.7 Lion - PHP 5.4.3
- Running production on: Red Hat 6 - PHP 5.4.3
- AD server: Windows 2003
UPDATE: After a few months, I had still not found a solution to this problem. In the end, I went with replacing the characters with their non-accented equivalents (NOT ideal, I know).
Recommended answer
Are you using LDAP v3?
ldap_set_option($ldap, LDAP_OPT_PROTOCOL_VERSION, 3);
LDAPv3 uses UTF-8 by default and expects both requests and responses to be UTF-8 encoded. See here: http://technet.microsoft.com/en-us/library/cc961766.aspx
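For illustration, here is a minimal sketch of how that option might fit into the write path. It assumes the values from SimpleXML are already UTF-8 and so are passed through without utf8_encode() or iconv(). The helper names (isValidUtf8, connectLdapV3, updateDisplayName), the connection parameters and the displayName attribute are all hypothetical and not taken from the question.

```php
<?php
// Hypothetical helper: true when $value is already valid UTF-8,
// i.e. it can be handed to LDAPv3 without any conversion.
function isValidUtf8($value)
{
    return mb_check_encoding($value, 'UTF-8');
}

// Hypothetical helper: create a handle, switch to LDAPv3, then bind.
// $host, $bindDn and $password are placeholders for your own settings.
function connectLdapV3($host, $bindDn, $password)
{
    $ldap = ldap_connect($host);
    if ($ldap === false) {
        throw new RuntimeException('Could not create LDAP handle');
    }
    // Set the protocol version before binding, as suggested above.
    ldap_set_option($ldap, LDAP_OPT_PROTOCOL_VERSION, 3);
    if (!ldap_bind($ldap, $bindDn, $password)) {
        throw new RuntimeException('LDAP bind failed: ' . ldap_error($ldap));
    }
    return $ldap;
}

// Hypothetical helper: write a name to AD without re-encoding it.
// The attribute name "displayName" is an assumption for this sketch.
function updateDisplayName($ldap, $dn, $name)
{
    if (!isValidUtf8($name)) {
        throw new InvalidArgumentException('Value is not valid UTF-8');
    }
    return ldap_mod_replace($ldap, $dn, array('displayName' => array($name)));
}
```

If isValidUtf8() returns true for a value and it still ends up mangled in AD, the conversion step is probably not the culprit. The protocol version, or the way the result is being displayed, would be the next things to check.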