Fetching UTF-8 text from MySQL in R returns "????"

Question

I'm stuck trying to fetch UTF-8 text in a MySQL database from R. I'm running R on OS X (tried both via the GUI and command line), where the default locale is en_US.UTF-8, and no matter what I try, the query result shows "?" for all non-ASCII characters.

I've tried setting options(encoding='UTF-8'), DBMSencoding='UTF-8' when connecting via ODBC, setting Encoding(res$str) <- 'UTF-8' after fetching the results, as well as 'utf8' variants of each of those, all to no avail. Running the query from the command line mysql client shows the results correctly.

I'm totally stumped. Any ideas why it's not working, or other things I should try?

Here's a fairly minimal test case:

$ mysql -u root
mysql> CREATE DATABASE rtest;
mysql> USE rtest;
mysql> CREATE TABLE test (str VARCHAR(10)) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.02 sec)

mysql> INSERT INTO test (str) VALUES ('こんにちは');
Query OK, 1 row affected (0.00 sec)

mysql> select * from test;
+-----------------+
| str             |
+-----------------+
| こんにちは      |
+-----------------+
1 row in set (0.00 sec)

Querying the table in R using both RODBC and RMySQL shows "?????" for the str column:

> library(RODBC)
> con <- odbcDriverConnect('DRIVER=mysql;user=root', DBMSencoding='UTF-8')
> sqlQuery(con, 'SELECT * FROM rtest.test')
    str
1 ?????
> library(RMySQL)
Loading required package: DBI
> con <- dbConnect(MySQL(), user='root')
> dbGetQuery(con, 'SELECT * FROM rtest.test')
    str
1 ?????

For completeness, here's my sessionInfo:

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RMySQL_0.9-3 DBI_0.2-5    RODBC_1.3-6 

Recommended answer

Thanks to @chooban I found out the connection session was using latin1 instead of utf8. Here are two solutions I found:

  • For RMySQL, after connecting run the query SET NAMES utf8 to change the connection character set.
  • For RODBC, connect using CharSet=utf8 in the DSN string. I was not able to run SET NAMES via ODBC.
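The two fixes above could look roughly like this in R. This is a minimal sketch, not a tested recipe: the helper names `build_odbc_string`, `fetch_rmysql`, and `fetch_rodbc` are hypothetical, and the driver name, user, and table are taken from the question's test case and will likely differ on other setups.

```r
# Hypothetical helpers; driver name and user follow the question's test case.

# Assemble an ODBC connection string that includes CharSet.
build_odbc_string <- function(driver = "mysql", user = "root", charset = "utf8") {
  paste0("DRIVER=", driver, ";user=", user, ";CharSet=", charset)
}

# RMySQL: switch the session character set right after connecting.
fetch_rmysql <- function(query, user = "root") {
  con <- DBI::dbConnect(RMySQL::MySQL(), user = user)
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  DBI::dbGetQuery(con, "SET NAMES utf8")  # changes the connection charset
  DBI::dbGetQuery(con, query)
}

# RODBC: request utf8 through the connection string instead of SET NAMES.
fetch_rodbc <- function(query) {
  con <- RODBC::odbcDriverConnect(build_odbc_string(), DBMSencoding = "UTF-8")
  on.exit(RODBC::odbcClose(con), add = TRUE)
  RODBC::sqlQuery(con, query)
}

# Usage (needs a running MySQL server and the question's rtest.test table):
# fetch_rmysql("SELECT * FROM rtest.test")
# fetch_rodbc("SELECT * FROM rtest.test")
```

Keeping the connection string in its own function makes it easy to inspect what is actually sent to the driver before opening a connection.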

This question pointed me in the right direction.
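To confirm that a session really is using latin1, as described in the answer, one option is to inspect the session's character set variables from R. The sketch below assumes an open DBI connection `con`; `non_utf8_vars` and `check_session_charset` are hypothetical helper names, not part of RMySQL or DBI.

```r
# The three session variables that SET NAMES changes.
session_vars <- c("character_set_client",
                  "character_set_connection",
                  "character_set_results")

# Given the data frame returned by SHOW VARIABLES (columns Variable_name
# and Value), return the session variables whose value is not a utf8 variant.
non_utf8_vars <- function(vars) {
  relevant <- vars[vars$Variable_name %in% session_vars, , drop = FALSE]
  as.character(relevant$Variable_name)[!grepl("^utf8", relevant$Value)]
}

# Query the server through an existing DBI connection (hypothetical wrapper).
check_session_charset <- function(con) {
  vars <- DBI::dbGetQuery(con, "SHOW VARIABLES LIKE 'character_set_%'")
  non_utf8_vars(vars)
}

# Usage, given a connection from dbConnect(MySQL(), ...):
# check_session_charset(con)
```

An empty result suggests the session is already using utf8. Any names returned are the variables that SET NAMES utf8 would change.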
