Re: [eclipselink-dev] Code review: initial partitioning support, bug#328937

From: James Sutherland <JAMES.SUTHERLAND@xxxxxxxxxx>
Date: Wed, 17 Nov 2010 07:48:11 -0800 (PST)
Delivered-to: eclipselink-dev@xxxxxxxxxxx

Hello Samba,

Thank you for your comments. Please also submit them to the design document discussion page so they can be tracked.

Database clustering is obviously a big, complex area, and this is our first entry into this space. We will not support everything imaginable in our first release. I think the policies outlined in the design doc cover the common use cases, and are probably even a little too ambitious for a first release.

What you seem to be requesting is support for partitioning, replication, and load balancing of the same data. As you can understand, this would be pretty complex. The design of the partitioning framework is capable of supporting scenarios like this, but given the complexity, it is not something we will directly support in our first release. To implement this you can define your own PartitioningPolicy subclass (or subclass the policy that most closely matches your requirements). Then for a given query you can define, in your own code, which databases to send the request to. You would need to define which set of servers each range should go to. For a read request, you could load balance across these servers. For a write request, you could write to each of the servers. If you detect an error in one of the servers, you can fail over to another.
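
As a rough illustration of the routing described above, here is a minimal, self-contained Java sketch. It does not use the actual EclipseLink classes: `Server`, `RangeReplicatedRouter`, and all of their methods are hypothetical names made up for this example, standing in for the logic a custom PartitioningPolicy subclass would carry in its own code. It assumes each id range maps to a set of servers, reads rotate round robin and skip servers marked as down, and writes go to every live server in the set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a connection pool / database node. */
class Server {
    final String name;
    volatile boolean up = true; // set to false when an error is detected

    Server(String name) { this.name = name; }
}

/** Hypothetical router: the kind of logic a custom policy subclass might contain. */
class RangeReplicatedRouter {
    // Maps the lower bound of each id range to the servers holding that range.
    private final NavigableMap<Long, List<Server>> ranges = new TreeMap<>();
    private final AtomicInteger next = new AtomicInteger();

    void addRange(long lowerBound, List<Server> servers) {
        ranges.put(lowerBound, new ArrayList<>(servers));
    }

    private List<Server> serversFor(long id) {
        Map.Entry<Long, List<Server>> entry = ranges.floorEntry(id);
        if (entry == null) {
            throw new IllegalArgumentException("No range covers id " + id);
        }
        return entry.getValue();
    }

    /** Read: round robin across the range's servers, failing over past any that are down. */
    Server serverForRead(long id) {
        List<Server> servers = serversFor(id);
        for (int i = 0; i < servers.size(); i++) {
            Server s = servers.get(Math.floorMod(next.getAndIncrement(), servers.size()));
            if (s.up) {
                return s;
            }
        }
        throw new IllegalStateException("All servers for id " + id + " are down");
    }

    /** Write: replicate to every live server in the range. */
    List<Server> serversForWrite(long id) {
        List<Server> live = new ArrayList<>();
        for (Server s : serversFor(id)) {
            if (s.up) {
                live.add(s);
            }
        }
        if (live.isEmpty()) {
            throw new IllegalStateException("All servers for id " + id + " are down");
        }
        return live;
    }
}
```

A real policy would return connections rather than `Server` objects, and would need its own way of detecting a failed server and of resynchronizing it before marking it as up again; neither is shown here.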

I understand that the root class name PartitioningPolicy does not adequately describe everything it can do. But naming the class PartitioningLoadBalancingReplicationAndFailoverPolicy would be a little too wordy; I think partitioning is the main use case, hence the name.

-----Original Message-----
From: Samba [mailto:saasira@xxxxxxxxx]
Sent: Wednesday, November 17, 2010 10:33 AM
To: Dev mailing list for Eclipse Persistence Services
Subject: Re: [eclipselink-dev] Code review: initial partitioning support, bug#328937

Hi James,

I have a few comments on the implementation; please read these as constructive criticism.

A PartitioningPolicy is supposed to be used for partitioning both read queries and write statements, such that queries get distributed and end up at different database instances.

A ReplicationPolicy, as noted in the comments above the class, is instrumented to duplicate all writes across all the configured nodes of replication.

A LoadBalancingPolicy can be an implementation of either of the above or a combination of both the partitioning and the replication features. So, we need to create a base class for LoadBalancingPolicy that can be extended to support various ways of load balancing by using replication, partitioning, or both.

The difference I'm trying to bring out is that a ReplicationPolicy cannot be an extension of PartitioningPolicy.

Similarly, a LoadBalancingPolicy can have an instance each of PartitioningPolicy and ReplicationPolicy but it is not an extension of either of these; instead it can only be a composition of these two features.

How we provide load balancing can be dealt with in implementations such as:

1. RoundRobinLoadBalancingPolicy, ClusterLoadBalancingPolicy, etc. are possible if the data is only replicated and not partitioned; they support active/passive fail-over. Their primary purpose is to support fail-over and, optionally, to reduce load on a single server. These implementations rely on ReplicationPolicy implementations and cannot have a PartitioningPolicy instance.

2. RangeLoadBalancingPolicy, HashLoadBalancingPolicy, etc. are possible when the data is partitioned across several nodes; their primary purpose is to provide scalability and performance. However, we can also replicate each partitioned node and thus support passive fail-over. These policies will primarily rely on a PartitioningPolicy implementation but can optionally include ReplicationPolicy features as well, in order to support fail-over in addition to scale and performance.
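
To make the composition idea concrete, here is a minimal, self-contained Java sketch. None of these types exist in EclipseLink: `PartitionSelector`, `ReplicaSelector`, `HashPartitionSelector`, `ActivePassiveReplicaSelector`, and `ComposedLoadBalancingPolicy` are hypothetical names, and nodes are plain strings for brevity. It assumes hash partitioning on a key, with each partition backed by a list of replicas whose first entry is the active node.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical: picks the partition that owns a key. */
interface PartitionSelector {
    int partitionFor(Object key, int partitionCount);
}

/** Hypothetical: picks which replicas of a partition serve a request. */
interface ReplicaSelector {
    List<String> nodesForRead(List<String> replicas);
    List<String> nodesForWrite(List<String> replicas);
}

/** Hash-based selection, in the spirit of the HashLoadBalancingPolicy idea. */
class HashPartitionSelector implements PartitionSelector {
    public int partitionFor(Object key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }
}

/** Reads go to the first (active) replica; writes go to every replica. */
class ActivePassiveReplicaSelector implements ReplicaSelector {
    public List<String> nodesForRead(List<String> replicas) {
        return new ArrayList<>(replicas.subList(0, 1));
    }

    public List<String> nodesForWrite(List<String> replicas) {
        return new ArrayList<>(replicas);
    }
}

/** Holds one partitioning and one replication strategy instead of extending either. */
class ComposedLoadBalancingPolicy {
    private final List<List<String>> partitions = new ArrayList<>();
    private final PartitionSelector partitioning;
    private final ReplicaSelector replication;

    ComposedLoadBalancingPolicy(List<List<String>> partitions,
                                PartitionSelector partitioning,
                                ReplicaSelector replication) {
        if (partitions.isEmpty()) {
            throw new IllegalArgumentException("At least one partition is required");
        }
        for (List<String> replicas : partitions) {
            if (replicas.isEmpty()) {
                throw new IllegalArgumentException("Each partition needs at least one node");
            }
            this.partitions.add(new ArrayList<>(replicas));
        }
        this.partitioning = partitioning;
        this.replication = replication;
    }

    private List<String> replicasFor(Object key) {
        return partitions.get(partitioning.partitionFor(key, partitions.size()));
    }

    List<String> nodesForRead(Object key) {
        return replication.nodesForRead(replicasFor(key));
    }

    List<String> nodesForWrite(Object key) {
        return replication.nodesForWrite(replicasFor(key));
    }
}
```

Swapping in a range-based `PartitionSelector` or a round-robin `ReplicaSelector` would give the other combinations listed above without changing the composing class.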

I hope I'm making some sense here :)

Thanks and Regards,
Samba

On Mon, Nov 8, 2010 at 7:57 AM, James Sutherland <JAMES.SUTHERLAND@xxxxxxxxxx> wrote:

Code review: initial partitioning support, bug#328937

https://bugs.eclipse.org/bugs/show_bug.cgi?id=328937

design doc,

http://wiki.eclipse.org/EclipseLink/DesignDocs/328937

Changes:

- added partitioningPolicy to ClassDescriptor

- added null check to FetchGroupManager to avoid null-pointer on failed deploy

- added PartitionPolicy abstract class, defines getConnections API

- added ReplicationPolicy, replicates writes to multiple pools

- added pool reference to Accessor, so it knows where it came from

- added acquire/release connection logging

- added partitioningPolicy to AbstractSession

- changed AbstractSession accessor to Collection accessors (and updated references)

- changed transaction to work with multiple accessors (2 stage commit)

- changed call execution to work with multiple accessors

- made client sessions (isolated, exclusive) execute calls consistently, added support for partitioning

- added @Overrides to sessions, some micro

- fixed finally connection release in ReferenceMapping

- changed DatabaseQuery accessor to Collection accessors

- changed SessionBroker getAccessor API to use same getAccessor for partitioning, only a single call, pass query

- added setURL to DatabaseLogin

- changed ClientSession writeAccessor to Map writeAccessors keyed on pool name

- changed ClientSession connection to be lazily assigned

- changed getAccessor on ClientSession to assign a connection if in a transaction, to support backward compatibility and internal usage

- changed ServerSession call execution to support partitioning

- added JPA partitioned model

- changed JPA test framework methods to be instance methods and use the inherited getPersistenceUnitName defined in the test, to avoid common mistakes in non-default unit tests

- added JPA partitioning test switch using a Derby "cluster" with round robin and replication; the tests only run on Derby as they need to create multiple databases

- added batch-fetch example

- added partitioned, and isolated partitioned version of UnitOfWork test model, uses "virtual rack" (multiple connection pools to the same database)
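
The change list mentions a transaction working with multiple accessors ("2 stage commit") without further detail. One plausible reading, offered here only as an assumption, is that all writes are first sent on every connection, and the connections are committed only once every write has succeeded. The self-contained Java sketch below shows that pattern; `TxParticipant`, `InMemoryParticipant`, and `MultiConnectionTransaction` are hypothetical names and are not EclipseLink classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical view of one connection (accessor) taking part in a transaction. */
interface TxParticipant {
    void writeChanges() throws Exception; // stage 1: send pending writes, do not commit
    void commit() throws Exception;       // stage 2: make the writes permanent
    void rollback();                      // undo uncommitted writes
}

/** In-memory participant used only to exercise the sketch. */
class InMemoryParticipant implements TxParticipant {
    final List<String> log = new ArrayList<>();
    private final boolean failOnWrite;

    InMemoryParticipant(boolean failOnWrite) { this.failOnWrite = failOnWrite; }

    public void writeChanges() throws Exception {
        if (failOnWrite) {
            throw new Exception("write failed");
        }
        log.add("write");
    }

    public void commit() { log.add("commit"); }

    public void rollback() { log.add("rollback"); }
}

/** Hypothetical coordinator for a transaction that spans several connections. */
class MultiConnectionTransaction {
    private final List<TxParticipant> participants;

    MultiConnectionTransaction(List<? extends TxParticipant> participants) {
        this.participants = new ArrayList<>(participants);
    }

    /** Returns true if all participants committed, false if all were rolled back. */
    boolean commit() {
        // Stage 1: write on every connection; any failure rolls everything back.
        try {
            for (TxParticipant p : participants) {
                p.writeChanges();
            }
        } catch (Exception e) {
            for (TxParticipant p : participants) {
                p.rollback();
            }
            return false;
        }
        // Stage 2: commit each connection in turn. Without a transaction manager,
        // a failure here can leave earlier connections committed, so it is
        // reported as an error rather than silently rolled back.
        for (TxParticipant p : participants) {
            try {
                p.commit();
            } catch (Exception e) {
                throw new IllegalStateException("Partial commit; data may be inconsistent", e);
            }
        }
        return true;
    }
}
```

This ordering narrows the window for inconsistency but does not remove it; a failure during the second stage would still need XA or manual repair.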

Code review: Andrei (pending)
