How to use `charset` and `encoding` in `create_engine` of SQLAlchemy (to create a pandas dataframe)?

Question

I am very confused by the way charset and encoding work in SQLAlchemy. I understand (and have read about) the difference between charsets and encodings, and I have a good picture of the history of encodings.

I have a table in MySQL in latin1_swedish_ci (Why? Possibly because of this). I need to create a pandas dataframe in which I get the proper characters (and not weird symbols). Initially, this was in the code:

import pandas
from sqlalchemy import create_engine

connect_engine = create_engine('mysql://user:password@1.1.1.1/db')
sql_query = "select * from table1"
df = pandas.read_sql(sql_query, connect_engine)

We started having trouble with the Š character (which corresponds to the unicode u'\u0160', but instead we get '\x8a'). I expected this to work:

connect_engine = create_engine('mysql://user:password@1.1.1.1/db', encoding='utf8')

But I continued getting '\x8a', which, I realized, makes sense given that the default of the encoding parameter is utf8. So then I tried encoding='latin1' to tackle the problem:

connect_engine = create_engine('mysql://user:password@1.1.1.1/db', encoding='latin1')

But I still get the same '\x8a'. To be clear, in both cases (encoding='utf8' and encoding='latin1'), I can do mystring.decode('latin1') but not mystring.decode('utf8').
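
The decoding behaviour described above can be reproduced without a database. The following is a minimal sketch in Python 3 (the question itself uses Python 2 idioms), assuming the column value arrives as the single raw byte 0x8A; the variable names are illustrative only. It also tries the cp1252 codec, on the assumption that MySQL's latin1 corresponds to Windows cp1252 rather than ISO 8859-1.

```python
# Raw byte as it would arrive when the driver does no decoding (assumed input).
raw = b"\x8a"

# Python's 'latin1' codec is ISO 8859-1: every byte maps to the code point
# with the same number, so decoding never fails, but 0x8A becomes a C1
# control character rather than a letter.
as_latin1 = raw.decode("latin1")

# In cp1252, byte 0x8A is the letter S with caron (U+0160).
as_cp1252 = raw.decode("cp1252")

# The same byte is not valid UTF-8: 0x8A is a continuation byte with no
# lead byte in front of it.
try:
    raw.decode("utf8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_latin1), repr(as_cp1252), utf8_ok)
```

This would explain why decode('latin1') succeeds yet still does not produce the expected character, while decode('utf8') fails outright.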

And then I rediscovered the charset parameter in the connection string, i.e. 'mysql://user:password@1.1.1.1/db?charset=latin1'. After trying all possible combinations of charset and encoding, I found that this one works:

connect_engine = create_engine('mysql://user:password@1.1.1.1/db?charset=utf8')

I would appreciate it if somebody could explain to me how to correctly use the charset in the connection string, and the encoding parameter in create_engine.

Recommended answer

encoding is the codec used for encoding/decoding within SQLAlchemy. From the documentation:

  For those scenarios where the DBAPI is detected as not supporting a
  Python unicode object, this encoding is used to determine the
  source/destination encoding. It is not used for those cases where the
  DBAPI handles unicode directly.

  [...]

  To properly configure a system to accommodate Python unicode objects,
  the DBAPI should be configured to handle unicode to the greatest
  degree as is appropriate [...]

mysql-python handles unicode directly, so there's no need to use this setting.

charset is a setting specific to the mysql-python driver. From the documentation:

  This charset is the client character set for the connection.

This setting controls three variables on the server, specifically character_set_results, which is what you are interested in. When set, strings are returned as unicode objects.

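As an illustration of where the charset setting goes, here is a small sketch that adds or replaces the charset option on a connection URL using only the standard library. The helper name with_charset is hypothetical and not part of SQLAlchemy or mysql-python; the resulting string would then be passed to create_engine as in the question. The three server variables are presumably character_set_client, character_set_connection, and character_set_results (the ones MySQL's SET NAMES statement sets), which could be checked on a live connection with SHOW VARIABLES LIKE 'character_set_%'.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_charset(db_url, charset):
    """Return db_url with its charset query option set to the given value.

    Hypothetical helper for illustration; it only manipulates the URL string.
    """
    parts = urlsplit(db_url)
    # Keep any other query options, dropping a previous charset value.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "charset"]
    query.append(("charset", charset))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = with_charset("mysql://user:password@1.1.1.1/db", "utf8")
print(url)

# With a reachable server, the URL would be used as in the question:
# connect_engine = create_engine(url)
```
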
Note that this applies only if you have latin1 encoded data in the database. If you've stored utf-8 bytes as latin1, you may have better luck using encoding instead.
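
To make the second case concrete, the sketch below shows what UTF-8 bytes look like once they have been read back through a single-byte codec, and how the text can be recovered on the Python side. This is a client-side illustration in Python 3, not the encoding parameter route mentioned above; the function name repair_mojibake and the choice of cp1252 as the wrongly applied codec are assumptions.

```python
def repair_mojibake(text, wrong="cp1252", right="utf8"):
    """Undo text that was encoded as `right` but decoded as `wrong`.

    Raises UnicodeError if the text was not garbled in this particular way.
    """
    return text.encode(wrong).decode(right)


# U+0160 (S with caron) stored as UTF-8 is the two bytes C5 A0.
stored = "\u0160".encode("utf8")

# Read back through a single-byte codec, each byte becomes its own character.
garbled = stored.decode("cp1252")

fixed = repair_mojibake(garbled)
print(repr(garbled), repr(fixed))
```

On a dataframe, such a function could be applied to the affected string columns after loading, for example with Series.map.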
