
How to achieve 100% availability on WSO2 product deployments

WSO2 products come with several components, each of which is configured through its own set of configurations. Once a system moves into production, it will inevitably go through various updates and upgrades during its lifetime. There are 3 main configuration areas in WSO2 products:
  • Database configurations
  • Server configurations
  • Implementation code
Any or all of the above can change during an update/upgrade of the production system. To keep the system 100% available, we need to make sure that the update/upgrade process does not impact the availability of the production system. We can identify several scenarios that challenge the availability of the system. In these situations, users can follow the guidelines below so that the system keeps running smoothly without interruption.

During outage of server(s)

  • We need redundancy (HA) in the system in active/active mode. In a 2-node setup, if 1 node goes down, the remaining node must be able to hold the traffic of both nodes for some time. Users may experience some slowness, but the system will be available. During capacity planning, we must make sure that 1 active node can handle at least 75% of the overall load.
  • If we have active/passive mode in a 2-node setup, each node should be capable of handling the full load on its own, and the passive node should be in hot-standby mode. This means the passive node must keep running even though it does not receive traffic.
  • If an entire data center goes down, we should have a Disaster Recovery (DR) site in a separate data center with the same setup. This can be in cold-standby mode since these types of outages are very rare. But if we go with cold standby, there will be a time window of service unavailability.
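
The 75% rule in the first point is simple arithmetic. A minimal sketch, assuming hypothetical figures for peak load and single-node capacity in requests per second (these numbers are illustrative and do not come from WSO2; use your own measurements):

```python
def survives_single_node_outage(peak_load_rps, node_capacity_rps, min_fraction=0.75):
    """Return True if one surviving node can carry at least `min_fraction`
    of the overall peak load (the 75% guideline from the text)."""
    required_rps = peak_load_rps * min_fraction
    return node_capacity_rps >= required_rps


# Hypothetical figures, for illustration only.
peak_load_rps = 1000      # combined peak load across both active nodes
node_capacity_rps = 800   # measured capacity of a single node

print(survives_single_node_outage(peak_load_rps, node_capacity_rps))
```

If the check fails, either add capacity to each node or add more nodes before relying on the setup for high availability.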

Adding a new service (API)

  • Database sharing needs to be properly configured through the master-datasources.xml file and through registry sharing
  • File system sharing needs to be set up so that the deployment is done once and the other nodes get the artifacts through file sharing
  • Service deployments need to be done from one node (the manager node), and the other nodes need to be configured in read-only mode (to avoid conflicts)
  • Use the passive node as the manager node (if you have active/passive mode)
  • Once the services are deployed on all the nodes, test them and expose the service (API) to the consumers
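
The last step, exposing the service only after every node has it, can be written as a simple gate. This is a rough sketch, not a WSO2 API: the node names and the `is_deployed` check are hypothetical, and in practice the check would be whatever test call you run against each node.

```python
from typing import Callable, Iterable, List


def nodes_missing_service(nodes: Iterable[str],
                          is_deployed: Callable[[str], bool]) -> List[str]:
    """Return the nodes on which the new service is not yet available."""
    return [node for node in nodes if not is_deployed(node)]


def ready_to_expose(nodes: Iterable[str],
                    is_deployed: Callable[[str], bool]) -> bool:
    """The service may be exposed only when no node is missing it."""
    return not nodes_missing_service(nodes, is_deployed)


# Hypothetical deployment status, standing in for a real per-node test.
status = {"manager": True, "worker-1": True, "worker-2": False}

print(nodes_missing_service(status, status.get))
print(ready_to_expose(status, status.get))
```

Passing the check in as a function keeps the gate independent of how each node is actually tested.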

Updating an existing service (fixing a bug)

  • Bring one additional passive node into the system with the existing version of the services. This is in case the active node goes down while the service is being updated on the first passive node (the system will be 1 active/2 passive)
  • Disable file sharing (rsync) on the first passive node.
  • Deploy the patched version to this passive node and carry out testing
  • Once testing has passed, allow traffic into the passive node and stop traffic to the active node.
  • Enable file sharing and allow the active node to sync up with the patched version. If you don’t have file sharing, you need to deploy the service manually.
  • Carry out testing on the other node and, once it has passed, allow traffic into that node (if required)
  • Remove the secondary passive node from the system (the system will be 1 active/1 passive)
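
The sequence above can be simulated to confirm that at least one node is serving traffic after every step. This is only a sketch under assumptions: the node names, version strings, and the in-memory `cluster` dictionary are hypothetical stand-ins for real servers and load balancer settings.

```python
def serving_nodes(cluster):
    """Names of the nodes that currently receive traffic."""
    return [name for name, node in cluster.items() if node["traffic"]]


def rolling_update(cluster, new_version):
    """Walk through the update steps, checking availability after each one."""
    def check():
        assert serving_nodes(cluster), "no node is serving traffic"

    # 1. Add a second passive node on the existing version as a safety net.
    cluster["passive-2"] = {"version": cluster["active"]["version"], "traffic": False}
    check()
    # 2. Deploy and test the patched version on the first passive node.
    cluster["passive-1"]["version"] = new_version
    check()
    # 3. Send traffic to the patched node first, then drain the active node.
    cluster["passive-1"]["traffic"] = True
    cluster["active"]["traffic"] = False
    check()
    # 4. Let the former active node sync up with the patched version.
    cluster["active"]["version"] = new_version
    check()
    # 5. Remove the temporary passive node (back to a 2-node system).
    del cluster["passive-2"]
    check()
    return cluster


cluster = {
    "active": {"version": "1.0.0", "traffic": True},
    "passive-1": {"version": "1.0.0", "traffic": False},
}
result = rolling_update(cluster, "1.0.1")
print(serving_nodes(result), sorted(result))
```

In step 3 traffic is enabled on the patched node before it is stopped on the old one, so there is no moment at which zero nodes are serving.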

Applying a patch to the server (needs a restart)

  • Bring one additional passive node into the system with the existing version of the services. This is in case the active node goes down while the patch is being applied on the first passive node (the system will be 1 active/2 passive)
  • Apply the patch on the first passive node and carry out testing
  • Once testing is done, enable traffic into this node and remove traffic from the active node
  • Apply the patch on the active node and carry out testing
  • Once testing is done, enable traffic into this node and remove traffic from the previous node (or you can keep the previous node as active)
  • Remove the secondary passive node from the system (the system will be 1 active/1 passive)

Doing a version upgrade to the server

  • Bring one additional passive node into the system with the existing version of the services. This is in case the active node goes down while the first passive node is being upgraded (the system will be 1 active/2 passive)
  • Execute the migration scripts provided in the WSO2 documentation to move the databases to the new version on the passive node
  • Deploy the artifacts to the new version on the passive node
  • Test this passive node and, once testing has passed, expose traffic to this node
  • Follow the same steps on the active node
  • Once testing is done, direct traffic to this node (if required)
Instead of maintaining the production system through manual processes, you can use the artifacts WSO2 provides to automate the deployment and scaling of the production system with Docker and Kubernetes.

Deployment automation
