[Reprint] 50 things to know before migrating Oracle to MySQL

发表: 2015-11-19 浏览: 2470

MYSQL入门

A while back I was at a seminar on migrating database applications to MySQL. A lot of the attendees were Oracle users. Based on their questions, comments and conversations during lunch, I made the following list of things Oracle users need to know about migrating to MySQL. Most of these are “gotchas” that would be contraindications or require some thought about a workaround.

Note: this is not meant to be MySQL-bashing. Some of these limitations are going to be fixed in future versions of MySQL, but they generally apply to current GA version 5.1. Some things are possible to achieve by choosing one particular way to use the server, at the exclusion of other things (e.g. Cluster rules out foreign keys, spatial data types rules out transactions). I am posting this list at the request of some Oracle DBAs who asked me to produce it for them.

Subqueries are poorly optimized.
Complex queries are a weak point.
The query executioner (aka query optimizer / planner) is less sophisticated.
Performance tuning and metrics capabilities are limited.
There is limited ability to audit.
Security is unsophisticated, even crude. There are no groups or roles, no ability to deny a privilege (you can only grant privileges). A user who logs in with the same username and password from different network addresses may be treated as a completely separate user. There is no built-in encryption comparable to Oracle.
Authentication is built-in. There is no LDAP, Active Directory, or other external authentication capability.
Clustering is not what you think it is.
Stored procedures and triggers are limited.
Vertical scalability is poor. No longer true; with Percona XtraDB or MySQL 5.5 (unreleased) you get excellent scalability
There is zero MPP support.
SMP is supported, but MySQL doesn’t scale well to more than 4 or 8 cores/CPUs. No longer true; with Percona XtraDB or MySQL 5.5 (unreleased) you get excellent scalability
There is no fractional-second storage type for times, dates, or intervals.
The language used to write stored procedures, triggers, scheduled events, and stored functions is very limited.
There is no roll-back recovery. There is only roll-forward recovery.
There is no support for snapshots.
There is no support for database links. There is something called the Federated storage engine that acts as a relay by passing queries along to a table on a remote server, but it is crude and buggy.
Data integrity checking is very weak, and even basic integrity constraints cannot always be enforced.
There are very few optimizer hints to tune query execution plans.
There is only one type of join plan: nested-loop. There are no sort-merge joins or hash joins.
Most queries can use only a single index per table; some multi-index query plans exist in certain cases, but the cost is usually underestimated by the query optimizer, and they are often slower than a table scan.
There are no bitmap indexes. Each storage engine supports different types of indexes. Most engines support B-Tree indexes.
There are fewer and less sophisticated tools for administration.
There is no IDE and debugger that approaches the level of sophistication you may be accustomed to. You’ll probably be writing your stored procedures in a text editor and debugging them by adding statements that insert rows into a table called debug_log.
Each table can have a different storage backend (“storage engine”).
Each storage engine can have widely varying behavior, features, and properties.
Foreign keys are not supported in most storage engines.
The default storage engine is non-transactional and corrupts easily.
Oracle owns InnoDB, the most advanced and popular storage engine.
Certain types of execution plans are only supported in some storage engines. Certain types of COUNT() queries execute instantly in some storage engines and slowly in others.
Execution plans are not cached globally, only per-connection.
Full-text search is limited and only available for non-transactional storage backends. Ditto for GIS/spatial types and queries.
There are no resource controls. A completely unprivileged user can effortlessly run the server out of memory and crash it, or use up all CPU resources.
There are no integrated or add-on business intelligence, OLAP cube, etc packages.
There is nothing analogous to Grid Control.
There is nothing even remotely like RAC. If you are asking “How do I build RAC with MySQL,” you are asking the wrong question.
There are no user-defined types or domains.
The number of joins per query is limited to 61.
MySQL supports a smaller subset of SQL syntax. There are no recursive queries, common table expressions, or windowing functions. There are a few extensions to SQL that are somewhat analogous to MERGE and similar features, but are very simplistic in comparison.
There are no functional columns (e.g. a column whose value is calculated as an expression).
You cannot create an index on an expression, you can only index columns.
There are no materialized views.
The statistics vary between storage engines and regardless of the storage engine, are limited to simple cardinality and rows-in-a-range. In other words, statistics on data distribution are limited. There is not much control over updating of statistics.
There is no built-in promotion or failover mechanism.
Replication is asynchronous and has many limitations and edge cases. For example, it is single-threaded, so a powerful slave can find it hard to replicate fast enough to keep up with a less powerful master.
Cluster is not what you think it is. Maybe I already said that, but it bears repeating.
The data dictionary (INFORMATION_SCHEMA) is limited and very slow (it can easily crash a busy server).
There is no online ALTER TABLE.
There are no sequences.
DDL such as ALTER TABLE or CREATE TABLE is non-transactional. It commits open transactions and cannot be rolled back or crash-recovered. Schema is stored in the filesystem independently of the storage engine.

I hope this is helpful.

About The Author

Baron is the founder and CEO of VividCortex. He is the author of High Performance MySQL and many open-source tools for performance analysis, monitoring, and system administration. Baron contributes to various database communities such as Oracle, PostgreSQL, Redis and MongoDB.

转自：http://www.xaprb.com/blog/2009/03/13/50-things-to-know-before-migrating-oracle-to-mysql/

1、对子查询的优化表现不佳。

2、对复杂查询的处理较弱

3、查询优化器不够成熟

4、性能优化工具与度量信息不足

5、审计功能相对较弱

6、安全功能不成熟，甚至可以说很粗糙。没有用户组与角色的概念，没有回收权限的功能(仅仅可以授予权限)。当一个用户从不同的主机/网络以同样地用户名/密码登录之后，可能被当作完全不同的用户来处理。没有类似于Oracle的内置的加密功能。

7、身份验证功能是完全内置的。不支持LDAP，Active Directory以及其它类似的外部身份验证功能。

8、Mysql Cluster可能与你的想象有较大差异。

9、存储过程与触发器的功能有限。

10、垂直扩展性较弱。

11、不支持MPP(大规模并行处理)。

12、支持SMP(对称多处理器)，但是如果每个处理器超过4或8个核(core)时，Mysql的扩展性表现较差。

13、对于时间、日期、间隔等时间类型没有秒以下级别的存储类型。

14、可用来编写存储过程、触发器、计划事件以及存储函数的语言功能较弱。

15、没有基于回滚(roll-back)的恢复功能，只有前滚(roll-forward)的恢复功能。

16、不支持快照功能。

17、不支持数据库链(database link)。有一种叫做Federated的存储引擎可以作为一个中转将查询语句传递到远程服务器的一个表上，不过，它功能很粗糙并且漏洞很多。

18、数据完整性检查非常薄弱，即使是基本的完整性约束，也往往不能执行。

19、优化查询语句执行计划的优化器提示非常少。

20、只有一种表连接类型：嵌套循环连接(nested-loop)，不支持排序-合并连接(sort-merge join)与散列连接(hash join)。

21、大部分查询只能使用表上的单一索引;在某些情况下，会存在使用多个索引的查询，但是查询优化器通常会低估其成本，它们常常比表扫描还要慢。

22、不支持位图索引(bitmap index)。每种存储引擎都支持不同类型的索引。大部分存储引擎都支持B-Tree索引。

23、管理工具较少，功能也不够成熟。

24、没有成熟能够令人满意的IDE工具与调试程序。可能不得不在文本编辑器中编写存储过程，并且通过往表(调试日志表)中插入记录的方式来做调试。

25、每个表都可以使用一种不同的存储引擎。

26、每个存储引擎在行为表现、特性以及功能上都可能有很大差异。

27、大部分存储引擎都不支持外键。

28、默认的存储引擎(MyISAM)不支持事务，并且很容易损坏。

29、最先进最流行的存储引擎InnoDB由Oracle拥有。

30、有些执行计划只支持特定的存储引擎。特定类型的Count查询，在这种存储引擎中执行很快，在另外一种存储引擎中可能会很慢。

31、执行计划并不是全局共享的，，仅仅在连接内部是共享的。

32、全文搜索功能有限，只适用于非事务性存储引擎。 Ditto用于地理信息系统/空间类型和查询。

33、没有资源控制。一个完全未经授权的用户可以毫不费力地耗尽服务器的所有内存并使其崩溃，或者可以耗尽所有CPU资源。

34、没有集成商业智能(business intelligence)， OLAP 多维数据集等软件包。

35、没有与Grid Control类似的工具(http://solutions.mysql.com/go.php?id=1296&t=s)

36、没有类似于RAC的功能。如果你问”如何使用Mysql来构造RAC”，只能说你问错了问题。

37、不支持用户自定义类型或域(domain)。

38、每个查询支持的连接的数量最大为61。

39、MySQL支持的SQL语法(ANSI SQL标准)的很小一部分。不支持递归查询、通用表表达式（Oracle的with 语句）或者窗口函数（分析函数）。支持部分类似于Merge或者类似特性的SQL语法扩展，不过相对于Oracle来讲功能非常简单。

40、不支持功能列(基于计算或者表达式的列，Oracle11g 开始支持计算列，以及早期版本就支持虚列(rownum，rowid))。

41、不支持函数索引，只能在创建基于具体列的索引。

42、不支持物化视图。

43、不同的存储引擎之间，统计信息差别很大，并且所有的存储引擎支持的统计信息都只支持简单的基数(cardinality)与一定范围内的记录数(rows-in-a-range)。换句话说，数据分布统计信息是有限的。更新统计信息的机制也不多。

44、没有内置的负载均衡与故障切换机制。

45、复制(Replication)功能是异步的，并且有很大的局限性。例如，它是单线程的(single-threaded)，因此一个处理能力更强的Slave的恢复速度也很难跟上处理能力相对较慢的Master。

46、 Cluster并不如想象的那么完美。或许我已经提过这一点，但是这一点值得再说一遍。

47、数据字典(INFORMATION_SCHEMA)功能很有限，并且访问速度很慢(在繁忙的系统上还很容易发生崩溃)。

48、不支持在线的Alter Table操作。

49、不支持Sequence。

50、类似于ALTER TABLE或CREATE TABLE一类的操作都是非事务性的。它们会提交未提交的事务，并且不能回滚也不能做灾难恢复。Schame被保存在文件系统上，这一点与它使用的存储引擎无关。

1 个评论

老头子专注是唯一的捷径

看完才发现这是老版本的说明。5.6已经支持hash、事物等，InnoDB也作为了默认存储引擎。

要回复文章请先登录或注册