Q&A Webinar Part 4 - MySQL Cluster(转贴)

原文转自:http://onlinesolutionsmysql.blogspot.com/2007/05/q-webinar-part-4-mysql-cluster.html

Q from Olivier - Are the data nodes MySQL servers too ?
No, the data node is handled by a separate process, ndbd, that only manages data.

Q from Olivier: So, what is MySQL Cluster? A MySQL AB product ?
Technically speaking, MySQL Cluster is a storage engine, based on a network distributed database. From a commercial point of view, we refer at MySQL Cluster as a product. We also provide APIs to access to the Cluster database directly, bypassing the MySQL Server and the storage engine architecture.

Q from Ludovico: Are there load balancing mechanisms to balance sql requests between active sql server nodes?
Yes, absolutely. The SQL nodes all see the same data, and load may well be balanced between them, if necessary.
The only caveat is that transactions must be locked into the same SQL server for the duration of the transaction.
In real life, load balancing can be achieved through hardware and software load balancers, or using the capabilities provided by some connectors such as JDBC, or just handling the load balancing at http level to the web servers, assuming that on average they will generate pretty much the same workload.

Q from Ludovico: What about embedded solutions, HW load balancers or round robin dns?
Any of them should work. Keep in mind the transaction caveat though. The load balancer must be activated when the client must initiate the connection, then the transaction must be related to the same connection.

Q from Roberto: Using the API and the native access, do we lose whole or part of the HA features ?
No, not at all, Cluster, even without the SQL nodes, is fully reduntant. The HA features are at the Cluster level, below the SQL nodes.

Q from Vasilij: Does Linux support MySQL Cluster?
Absolutely, various Linux and Unices are fully supported

Q from Manuel: Is the management server is a Single Point of Failure?
No, it's not. First of all, you may have as many as you want. Secondly, it is important that the management server is on when a new node is added to the cluster or in case of failure in a small cluster with only two data nodes. During normal operations, it could be switched off.
Typically, the management node is a lightweight process and it can be colocated with other servers.

Q from Manuel: You said that the minimum number of nodes is 3. Can't we have only two nodes and the management server running on both nodes and so you have absolute redundancy with only 2 boxes?
No, having 2 nodes is not a good idea, as that might cause a split-brain situation. If you implement such configuration, MySQL Cluster will issue a warning

Q from Olivier: Where are the data stored? On MySQL servers? On the NDBD nodes ? In memory or on disk ?
Data is In the data nodes (NDB nodes) and data is in memory. With version 5.1, data can be optionally on disk, although indexes must still reside in memory.

Q from Manuel: May the management servers run on the same machine where the nodes or the SQL Servers run?
Yes, they can. There is no need for an extra physical server, provided the cluster has at least 3 physical servers.

Q from Colin: What is the simplest cluster config if all access is via C++ API?
3 nodes. 2 datanodes, one for mgm, which may be colocated with the application node, for example.

Q from John: All the examples of hostnames in the cinfig file are ipv4 addresses - does cluster support ipv6?
No, not at the moment.

Q from John: If mysql nodes and data nodes are on the same boxes can they share IP's?
Sure, they can share the same address, as they respond to different ports

Q from Manuel: What is a split-brain and how does it happen?
A split brain is a particular situation that may happen in some HA infrastructures. It may happen that in an HA solution with only two servers, in case of network fault between the servers, the two nodes cannot talk each other, but they are both active and can provide the service to their clients. In this case we have a split brain situation. MySQL cluster does not allow a split brain, since it requires a third node as arbitrator.

Q from Danilo: Is is possible to have multiple INSERT coming from different MySQL Server that require auto_increment fields?
Yes, that is absolutely possible and fully supported.

Q from Olivier: Is all the datas on every node? What happens, for example, if I have 4 nodes with 20GB of memory for each server?
First of all, you have to specify the exact number of replicas that you want. You can have only one copy of data (NoOfReplica=1), but in this scenario the system will not be redundant. You can have up to 4 replicas. WIth a classic replica of 2, you will have only two copies of the same data. In the example that you have mentioned, the total memory available will be 4 (nodes) x 20 (memory) / 2 (replica) = 40 GB in total.

Q Danilo: How many data nodes are supported?
The current limit is 48.

Q from Domenico: Can MySql Server and Data Node reside on the same physical server?
Yes, they can. For better scalability and performance, you should consider separate nodes though.

Q from Colin: If the data is not on disk which safety/backup policy is advised in case of total system failure?
There is a checkpointing and redo logging system that assure that the data are safely stored on disk. You can also used the online backup provided for disaster recovery.

Q from Ludovico: What happens if a power shortage happens and all my data nodes are turned off?
In this case you have to recover from disk up to the latest checkpoint and then you can apply everything up to the latest position in the redo log. Bear in mind you should configure the whole infrastructure in a way that is suitable for high availability.

Q from Hilmar: So in the worst case scenario, when a checkpoint was not fired after an updated, you loose power, then cluster recovers from loss of power and after the restart of the servers it has ' rolledback' to its previous checkpoint, it that correct?
First it recovers up to the latest checkpoint, then it applies the redo log.

Q from David: Does the Transaction Coordinator ensure referential integrity?
No, it does not. Foreign keys are not supported in the current release and not in 5.1

Q from Guy: Is the whole database held in memory or just the tables in use?
The whole database is in-memory. That is the attribute of an In-Memory database, to achieve deterministic performance.

Q from Ludovico: Il the FK support at SQL level coming also for NDB?
YeS, as far as the SQL interface is used.

Q from Colin: Is the FK support planned for all storage engines in future versions?
We are working on this. We are still unsure about global support for all the storage engines, but this is definitely one of the possible solutions.

Q from Manuel: If you have 2 management servers on different machines and you lose network connection between them, don't you have a split-brain situation?
Not necessarily. As long as there is an odd # of arbitration capable nodes, you don't.

Q from Jonas: Is there any estimate in time on when the 'next' MySQL version, with FK at SQL level will be available?
Not at the moment, but this is one of the hottest topics within MySQL.

Q from Manuel: Can you explain or give a URL where the use of redundant management servers is explained?
In the reference manual for version 5.0 - section 15.4.4.2 - there are examples of connection strings with multiple management nodes

Q from Guy: What happens if the DB drows significantly in a short period therefore size increases above the memory limit?
The size is fixed and preallocated in this point in time. When available memory is full it's full. 5.1 will be able to have non-indexed data on disk, which will change this.

Q from Danilo: When will you plan to introduce multiple index use in mysql select?
It is in there already, to an extent. I assume you ask about multiple indexes per table, and again, that is to an extent already implemented. The usefulness of this depends on the stoarge engine though. Falcon will support this in a very effective manner.

Q from Hilmar: How do I know for sure that a certain transaction is sent to disk asap? is possible to trigger a checkpoint using an SQL statement?
You can setup the system in order to have continuous checkpoints, although this is not the answer to the problem. The checkpoint will not be related to a single commit in any case. MySQL Cluster architecture has been designed to be an in-memory database, therefore the infrastructure underneath must be able to support the architecture. Bear in mind that MySQL cluster achieves HA through redundancy; since you can have up to 4 replicas, in the worst case scenario you have to have 4 faults on 4 servers and you may lose data executed after the last position available in the REDO LOG on disk - I am not aware of any other HA architecture that may achieve more than this, whether it is in-memory or on disk.

Q from Hilmar: But what happen if you have a power failure for the whole data center?
First of all, you should have UPS systems in place that should at least allow you to shut down the server gracefully, in order to not lose any data at all. Secondly, with version 5.1 you may use geographic replication and activate the cluster in another data center.

Q from Jonas: Is MySQL Cluster Replication is done on a node-by-node basis, or the cluster as a whole?
In 5.1, it is Cluster to Cluster.

Q from Hilmar: Has mysql cluster already been tested on sparc T1's for the SQL nodes? And are these a better option than a regular AMD box ?
SUN HW and Solaris and well supported, you have only to take care of having the same hardware type (SPARC, Intel or AMD) between nodes.

技术相关: